---
title: "HDFS Remote"
description: "Remote for Hadoop Distributed Filesystem"
versionIntroduced: "v1.54"
---
# {{< icon "fa fa-globe" >}} HDFS
[HDFS](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)
is a distributed file-system, part of the [Apache Hadoop](https://hadoop.apache.org/)
framework.

Paths are specified as `remote:` or `remote:path/to/dir`.
## Configuration
Here is an example of how to make a remote called `remote`. First run:
```console
rclone config
```
This will guide you through an interactive setup process:
```text
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> remote
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
[skip]
XX / Hadoop distributed file system
\ "hdfs"
[skip]
Storage> hdfs
** See help for hdfs backend at: https://rclone.org/hdfs/ **
hadoop name node and port
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / Connect to host namenode at port 8020
\ "namenode:8020"
namenode> namenode.hadoop:8020
hadoop user name
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / Connect to hdfs as root
\ "root"
username> root
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
Configuration complete.
Options:
- type: hdfs
- namenode: namenode.hadoop:8020
- username: root
Keep this "remote" remote?
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:
Name                 Type
====                 ====
remote               hdfs
e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q
```
This remote is called `remote` and can now be used like this:

See all the top level directories
```console
rclone lsd remote:
```
List the contents of a directory
```console
rclone ls remote:directory
```
Sync the remote `directory` to `/home/local/directory`, deleting any excess files.
```console
rclone sync --interactive remote:directory /home/local/directory
```
### Setting up your own HDFS instance for testing
You may start with a [manual setup](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html)
or use the docker image from the tests.

If you want to build the docker image:
```console
git clone https://github.com/rclone/rclone.git
cd rclone/fstest/testserver/images/test-hdfs
docker build --rm -t rclone/test-hdfs .
```
Or you can just use the latest one pushed:
```console
docker run --rm --name "rclone-hdfs" -p 127.0.0.1:9866:9866 -p 127.0.0.1:8020:8020 --hostname "rclone-hdfs" rclone/test-hdfs
```
**NB** it needs a few seconds to start up.

For this docker image the remote needs to be configured like this:
```ini
[remote]
type = hdfs
namenode = 127.0.0.1:8020
username = root
```
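As an alternative to the config file entry above, the same remote can be sketched with environment variables. This assumes rclone's general `RCLONE_CONFIG_<NAME>_<OPTION>` convention for defining a remote, with the remote name `remote` upper-cased. The values mirror the docker example and will differ for a real cluster.

```sh
# Define the remote "remote" for the test container without a config file.
# Values mirror the [remote] section above; adjust them for a real cluster.
export RCLONE_CONFIG_REMOTE_TYPE=hdfs
export RCLONE_CONFIG_REMOTE_NAMENODE=127.0.0.1:8020
export RCLONE_CONFIG_REMOTE_USERNAME=root

# With the container running, the remote could then be listed with:
#   rclone lsd remote:
```

This keeps throwaway test settings out of your regular rclone config file.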
You can stop this image with `docker kill rclone-hdfs` (**NB** it does not use
volumes, so all data uploaded will be lost.)
### Modification times
Modification times are stored with an accuracy of 1 second.
### Checksum
No checksums are implemented.
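Because the backend exposes no hashes, transfers cannot be verified by checksum. One way to compare contents anyway is `rclone check` with its `--download` flag, which downloads the files and compares the data. The helper name `verify_against_hdfs` below is hypothetical, and the paths in the usage comment are placeholders taken from the examples earlier on this page.

```sh
# Hypothetical helper: compare a local directory with an HDFS directory
# by content, since no hashes are available to compare instead.
verify_against_hdfs() {
    # $1 = local source directory, $2 = remote directory (e.g. remote:directory)
    rclone check --download "$1" "$2"
}

# Usage (requires a configured remote and a reachable name node):
#   verify_against_hdfs /home/local/directory remote:directory
```

Note that this reads every file back from HDFS, so it can be slow and bandwidth-heavy on large trees.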
### Usage information
You can use the `rclone about remote:` command which will display filesystem
size and current usage.
### Restricted filename characters
In addition to the [default restricted characters set](/overview/#restricted-characters)
the following characters are also replaced:
| Character | Value | Replacement |
| --------- |:-----:|:-----------:|
| :         | 0x3A  | ：           |
Invalid UTF-8 bytes will also be [replaced](/overview/#invalid-utf8).
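As an illustration of the colon replacement, the following sketch mimics the mapping with `sed`, replacing the ASCII colon with the fullwidth colon (U+FF1A). It is only a demonstration: rclone performs this encoding internally, and the file name `report:2020.txt` is a made-up example.

```sh
# Made-up file name containing an ASCII colon (0x3A).
name='report:2020.txt'

# Mimic the mapping: replace each ':' with the fullwidth colon (U+FF1A).
encoded=$(printf '%s\n' "$name" | sed 's/:/：/g')

printf '%s\n' "$encoded"
```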
<!-- autogenerated options start - DO NOT EDIT - instead edit fs.RegInfo in backend/hdfs/hdfs.go and run make backenddocs to verify --> <!-- markdownlint-disable-line line-length -->
### Standard options
Here are the Standard options specific to hdfs (Hadoop distributed file system).
#### --hdfs-namenode
Hadoop name nodes and ports.
E.g. "namenode-1:8020,namenode-2:8020,..." to connect to host namenodes at port 8020.
Properties:
- Config: namenode
- Env Var: RCLONE_HDFS_NAMENODE
- Type: CommaSepList
- Default:
#### --hdfs-username
Hadoop user name.
Properties:
- Config: username
- Env Var: RCLONE_HDFS_USERNAME
- Type: string
- Required: false
- Examples:
  - "root"
    - Connect to hdfs as root.
### Advanced options
Here are the Advanced options specific to hdfs (Hadoop distributed file system).
#### --hdfs-service-principal-name
Kerberos service principal name for the namenode.
Enables KERBEROS authentication. Specifies the Service Principal Name
(SERVICE/FQDN) for the namenode. E.g. \"hdfs/namenode.hadoop.docker\"
for namenode running as service 'hdfs' with FQDN 'namenode.hadoop.docker'.
Properties:
- Config: service_principal_name
- Env Var: RCLONE_HDFS_SERVICE_PRINCIPAL_NAME
- Type: string
- Required: false
#### --hdfs-data-transfer-protection
Kerberos data transfer protection: authentication|integrity|privacy.
Specifies whether or not authentication, data signature integrity
checks, and wire encryption are required when communicating with
the datanodes. Possible values are 'authentication', 'integrity'
and 'privacy'. Used only with KERBEROS enabled.
Properties:
- Config: data_transfer_protection
- Env Var: RCLONE_HDFS_DATA_TRANSFER_PROTECTION
- Type: string
- Required: false
- Examples:
  - "privacy"
    - Ensure authentication, integrity and encryption enabled.
#### --hdfs-encoding
The encoding for the backend.
See the [encoding section in the overview](/overview/#encoding) for more info.
Properties:
- Config: encoding
- Env Var: RCLONE_HDFS_ENCODING
- Type: Encoding
- Default: Slash,Colon,Del,Ctl,InvalidUtf8,Dot
#### --hdfs-description
Description of the remote.
Properties:
- Config: description
- Env Var: RCLONE_HDFS_DESCRIPTION
- Type: string
- Required: false
<!-- autogenerated options stop -->
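The options above can also be set through the environment variables listed under their Properties. A minimal sketch, assuming a hypothetical pair of name nodes (`namenode-1` and `namenode-2`) and reusing the example service principal from the option description:

```sh
# Hypothetical host names; list every name node, comma separated.
export RCLONE_HDFS_NAMENODE="namenode-1:8020,namenode-2:8020"
export RCLONE_HDFS_USERNAME="root"

# Setting a service principal name enables Kerberos authentication.
export RCLONE_HDFS_SERVICE_PRINCIPAL_NAME="hdfs/namenode.hadoop.docker"

# Only used when Kerberos is enabled: require authentication,
# integrity checks and wire encryption towards the datanodes.
export RCLONE_HDFS_DATA_TRANSFER_PROTECTION="privacy"
```

Leave the two Kerberos variables unset for clusters that do not use Kerberos.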
## Limitations
- Erasure coding not supported, see [issue #8808](https://github.com/rclone/rclone/issues/8808).
- No server-side `Move` or `DirMove` .
- Checksums not implemented.