mirror of https://github.com/rclone/rclone.git synced 2025-09-16 08:36:38 +02:00

cluster: add docs

Nick Craig-Wood
2025-09-12 17:35:20 +01:00
parent c0ab36f8e4
commit 78e91796b0
4 changed files with 203 additions and 1 deletions


@@ -291,7 +291,7 @@ jobs:
README.md
RELEASE.md
CODE_OF_CONDUCT.md
docs/content/{authors,bugs,changelog,docs,downloads,faq,filtering,gui,install,licence,overview,privacy}.md
docs/content/{authors,bugs,changelog,cluster,docs,downloads,faq,filtering,gui,install,licence,overview,privacy}.md
- name: Scan edits of autogenerated files
run: bin/check_autogenerated_edits.py 'origin/${{ github.base_ref }}'

docs/content/cluster.md Normal file

@@ -0,0 +1,192 @@
---
title: "Cluster"
description: "Clustering rclone"
versionIntroduced: "v1.72"
---
# Cluster
Rclone has a cluster mode which lets a group of rclone instances work
together on a sync. It is enabled with the `--cluster` flag and
controlled by a group of flags starting with `--cluster-`.
```text
--cluster string Enable cluster mode with remote to use as shared storage
--cluster-batch-files int Max number of files for a cluster batch (default 1000)
--cluster-batch-size SizeSuffix Max size of files for a cluster batch (default 1Ti)
--cluster-cleanup ClusterCleanup Control which cluster files get cleaned up (default full)
--cluster-id string Set to an ID for the cluster. An ID of 0 or empty becomes the controller
--cluster-quit-workers Set to cause the controller to quit the workers when it finished
```
A cluster command looks like a normal rclone command with an
additional `--cluster` flag pointing at the rclone remote used as the
cluster storage. This flag is the signal to rclone that it should
engage cluster mode with a controller and workers.
```sh
rclone copy source: destination: --flags --cluster /work
rclone copy source: destination: --flags --cluster s3:bucket
```
This works only with the `rclone sync`, `copy` and `move` commands.
If the remote specified by the `--cluster` flag is inside the
`source:` or `destination:` it must be excluded with the filter flags.
Any rclone remotes used in the transfer must be defined on all cluster
nodes. Defining the remotes with connection strings gets around that
problem.
## Terminology
The cluster has two logical groups, the controller and the workers.
There is one controller and many workers.
The controller and the workers will communicate with each other by
creating files in the remote pointed to by the `--cluster` flag. This
could be for example an S3 bucket or a Kubernetes PVC.
The files are JSON serialized rc commands. Multiple commands are sent
using `rc/batch`. The commands flow `pending` → `processing` → `done` →
`finished`.
```text
└── queue
├── pending ← pending task files created by the controller
├── processing ← claimed tasks being executed by a worker
├── done ← finished tasks awaiting the controller to read the result
└── finished ← completed task files
```
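The flow through these directories can be sketched in a few lines of Python. This is only an illustration on a local directory, not rclone's implementation: the helper names (`make_queue`, `advance`, `locate`), the task file name and the JSON payload are all hypothetical, and the real task files hold JSON serialized rc commands.

```python
import json
import tempfile
from pathlib import Path

# State names taken from the queue layout above.
STATES = ["pending", "processing", "done", "finished"]


def make_queue(root: Path) -> Path:
    """Create the queue directory tree under the cluster work directory."""
    queue = root / "queue"
    for state in STATES:
        (queue / state).mkdir(parents=True, exist_ok=True)
    return queue


def advance(queue: Path, name: str, src: str, dst: str) -> Path:
    """Move a task file from one state directory to the next one."""
    if STATES.index(dst) != STATES.index(src) + 1:
        raise ValueError(f"cannot move a task from {src} to {dst}")
    target = queue / dst / name
    (queue / src / name).rename(target)
    return target


def locate(queue: Path, name: str) -> str:
    """Return the name of the state directory currently holding the task."""
    for state in STATES:
        if (queue / state / name).exists():
            return state
    raise FileNotFoundError(name)


with tempfile.TemporaryDirectory() as tmp:
    queue = make_queue(Path(tmp))
    # Hypothetical payload, not the real task schema.
    task = {"note": "placeholder for a JSON serialized rc command"}
    (queue / "pending" / "task-0001.json").write_text(json.dumps(task))

    history = [locate(queue, "task-0001.json")]
    for src, dst in zip(STATES, STATES[1:]):
        advance(queue, "task-0001.json", src, dst)
        history.append(locate(queue, "task-0001.json"))
    print(" -> ".join(history))
```

In a real cluster the moves are made by different processes: the controller creates the pending file and the workers claim it, as described in the sections below.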
The cluster can be set up in two ways: as a persistent cluster or as a
transient cluster.
### Persistent cluster
Run a cluster of workers using:
```sh
rclone rcd --cluster /work
```
Then run rclone commands when required on the cluster:
```sh
rclone copy source: destination: --flags --cluster /work
```
In this mode there can be many rclone commands executing at once.
### Transient cluster
Run many copies of rclone simultaneously, for example in a Kubernetes
indexed job.
The rclone with `--cluster-id 0` becomes the controller and the others
become the workers. For a Kubernetes indexed job, setting
`--cluster-id $(JOB_COMPLETION_INDEX)` would work well.
Add the `--cluster-quit-workers` flag - this will cause the controller
to make sure the workers exit when it has finished.
All instances of rclone run the same command, like the one below, so a
transient cluster can only run one rclone command:
```sh
rclone copy source: destination: --flags --cluster /work --cluster-id $(JOB_COMPLETION_INDEX) --cluster-quit-workers
```
## Controller
The controller runs the sync and work distribution.
- It lists the source and destination directories, comparing files in
order to find the files which need to be transferred.
- Files which need to be transferred are batched into jobs limited by
`--cluster-batch-files` files or `--cluster-batch-size` total size,
and written to `queue/pending` for the workers to pick up.
- It watches `queue/done` for finished jobs, updates the transfer
statistics, logs any errors, and then moves the job to
`queue/finished`.
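The batching step can be illustrated with a short Python sketch. It is not rclone's code: the `Batch` class and `batch_files` function are hypothetical, and the flush rule (close a batch once it holds the maximum number of files or its total size exceeds the size limit) is an assumption based on the flag descriptions. The defaults mirror the documented defaults of 1000 files and 1 TiB.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    files: list = field(default_factory=list)
    size: int = 0


def batch_files(files, max_files=1000, max_size=1 << 40):
    """Group (name, size) pairs into batches.

    Assumed rule: a batch is closed as soon as it holds max_files files
    or its total size exceeds max_size bytes.
    """
    batches = []
    current = Batch()
    for name, size in files:
        current.files.append(name)
        current.size += size
        if len(current.files) >= max_files or current.size > max_size:
            batches.append(current)
            current = Batch()
    if current.files:
        # Flush the final, partially filled batch.
        batches.append(current)
    return batches


files = [("a", 10), ("b", 200), ("c", 5), ("d", 5), ("e", 5), ("f", 5)]
batches = batch_files(files, max_files=3, max_size=100)
print([b.files for b in batches])
```

With these small limits the first batch is closed by the size limit, the second by the file count, and the last one holds whatever is left over.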
Once the sync is complete, if `--cluster-quit-workers` is set, then it
sends the workers a special command which causes them all to exit.
The controller only sends transfer jobs to the workers. All the other
tasks (eg listing, comparing) are done by the controller. The
controller does not execute any transfer tasks itself.
## Workers
The workers' job is entirely to act as API endpoints that receive their
work via files in the cluster work directory (`/work` in the examples
above). Each worker then does the following:
- Read work in `queue/pending`
- Attempt to rename into `queue/processing`
- If the cluster work directory supports atomic renames, then use
those, otherwise read the file, write the copy, delete the original.
If the delete fails then the rename was not successful (possible on
s3 backends).
- If the rename succeeded, do that item of work. If it failed, another
worker got there first, so sleep for a bit and then retry.
- After the copy is complete, remove the `queue/processing` file
or rename it into `queue/finished` if the `--cluster-cleanup` flag
allows it.
- Repeat
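The claim step above relies on a rename succeeding for exactly one worker. Here is a minimal Python sketch of that idea, assuming a local POSIX directory where `os.rename` is atomic. The function names `try_claim` and `claim_next` are hypothetical, and the fallback of copy then delete for backends without atomic renames is not shown.

```python
import os
import tempfile
from pathlib import Path


def try_claim(queue: Path, name: str) -> bool:
    """Try to claim a pending task by renaming it into processing.

    Returns True if this worker now owns the task, or False if the file
    had already been taken by another worker.
    """
    src = queue / "pending" / name
    dst = queue / "processing" / name
    try:
        os.rename(src, dst)  # atomic on a local POSIX file system
    except FileNotFoundError:
        return False
    return True


def claim_next(queue: Path):
    """Claim the first available pending task, or return None if there is none."""
    for name in sorted(os.listdir(queue / "pending")):
        if try_claim(queue, name):
            return name
    return None


with tempfile.TemporaryDirectory() as tmp:
    queue = Path(tmp) / "queue"
    for state in ("pending", "processing"):
        (queue / state).mkdir(parents=True)
    (queue / "pending" / "task-0001.json").write_text("{}")

    first = try_claim(queue, "task-0001.json")   # this attempt wins
    second = try_claim(queue, "task-0001.json")  # a later attempt loses
    remaining = claim_next(queue)                # nothing left to claim
    print(first, second, remaining)
```

A worker loop would call something like `claim_next`, do the work when it returns a name, and sleep briefly before retrying when it returns `None`.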
## Flags
### --cluster string
This enables the cluster mode. Without this flag, all the other
cluster flags are ignored. This should be given a remote which can be
a local directory, eg `/work` or a remote directory, eg `s3:bucket`.
### --cluster-batch-files int
This controls the number of files copied in a cluster batch. Setting
this larger may be more efficient but it means the statistics will be
less accurate on the controller (default 1000).
### --cluster-batch-size SizeSuffix
This controls the total size of files in a cluster batch. If the size
of the files in a batch exceeds this number then the batch will be
sent to the workers. Setting this larger may be more efficient but it
means the statistics will be less accurate on the controller. (default
1TiB)
### --cluster-cleanup ClusterCleanup
Controls which cluster files get cleaned up.
- `full` - clean all work files (default)
- `completed` - clean completed work files but leave the errors and status
- `none` - leave all the files (useful for debugging)
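One way to read these three modes is as a decision about whether a given work file is removed. The Python sketch below encodes that reading. The function `should_remove` and its `failed` parameter are hypothetical, and treating "leave the errors" as "keep the files of failed work" is an assumption; status files are not modelled.

```python
def should_remove(mode: str, failed: bool) -> bool:
    """Decide whether a work file is removed under a cleanup mode.

    Assumed reading of the modes: "full" removes everything, "completed"
    removes only work that finished without error, "none" keeps everything.
    """
    if mode == "full":
        return True
    if mode == "completed":
        return not failed
    if mode == "none":
        return False
    raise ValueError(f"unknown cleanup mode: {mode!r}")


# Tabulate the decision for every mode and outcome.
decisions = {
    (mode, failed): should_remove(mode, failed)
    for mode in ("full", "completed", "none")
    for failed in (False, True)
}
for (mode, failed), remove in decisions.items():
    print(f"{mode:<9} failed={failed!s:<5} remove={remove}")
```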
### --cluster-id string
Set an ID for the rclone instance. This can be a string or a number.
An instance with an ID of 0 becomes the controller; with any other ID
the instance becomes a worker. If this flag isn't supplied or the
value is empty, then a random string will be used instead.
### --cluster-quit-workers
If this flag is set, then when the controller finishes its sync task
it will quit all the workers before it exits.
## Not implemented
Here are some features from the original design which are not
implemented yet:
- the controller will not notice if workers die or fail to complete
their tasks
- the controller does not re-assign the workers' work when necessary
- the controller does not restart the sync
- the workers do not write any status files (but the stats are
correctly accounted)


@@ -3312,6 +3312,15 @@ For the remote control options and for instructions on how to remote control rcl
See [the remote control section](/rc/).
## Cluster
For the cluster options and for instructions on how to cluster rclone:
- `--cluster`
- Anything starting with `--cluster-`
See the [cluster section](/cluster/).
## Logging
rclone has 4 levels of logging, `ERROR`, `NOTICE`, `INFO` and `DEBUG`.


@@ -19,6 +19,7 @@
<a class="dropdown-item" href="/filtering/"><i class="fa fa-book fa-fw"></i> Filtering</a>
<a class="dropdown-item" href="/gui/"><i class="fa fa-book fa-fw"></i> GUI</a>
<a class="dropdown-item" href="/rc/"><i class="fa fa-book fa-fw"></i> Remote Control</a>
<a class="dropdown-item" href="/cluster/"><i class="fa fa-book fa-fw"></i> Cluster</a>
<a class="dropdown-item" href="/changelog/"><i class="fa fa-book fa-fw"></i> Changelog</a>
<a class="dropdown-item" href="/bugs/"><i class="fa fa-book fa-fw"></i> Bugs</a>
<a class="dropdown-item" href="/faq/"><i class="fa fa-book fa-fw"></i> FAQ</a>