Data Transfer/Rsync
Rsync is a command-line tool for single-threaded, one-directional synchronization. It transfers only files that have changed or were newly created on the source side. Rsync can even transfer only the changed parts of a file instead of the whole file, because its delta-transfer algorithm compares files block by block rather than file by file. (For purely local copies, Rsync copies whole files by default.)
Usage Examples
rsync -av dir1/ dir2/
Local synchronization of all files from dir1 to dir2.
rsync -av --delete dir1/ dir2/
Local synchronization of all files from dir1 to dir2. Files in dir2 that do not exist in dir1 are deleted.
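Because --delete removes files on the destination side, it can be worth previewing a run first. The following is a minimal sketch, assuming a hypothetical helper named mirror_dir (not part of Rsync): by default it only calls Rsync with -n (--dry-run), which lists what would be copied or deleted without changing anything, and it synchronizes for real only when "apply" is passed as the third argument.

```shell
# mirror_dir is a hypothetical helper name used for illustration only.
# Arguments: source directory, destination directory, optional mode ("apply").
mirror_dir() {
    local src="$1" dst="$2" mode="${3:-preview}"
    if [ "$mode" = "apply" ]; then
        # Real run: copies changes and deletes files missing from the source.
        rsync -av --delete "$src/" "$dst/"
    else
        # Preview: -n (--dry-run) only reports what would be done.
        rsync -avn --delete "$src/" "$dst/"
    fi
}

# Usage:
#   mirror_dir dir1 dir2          # preview only
#   mirror_dir dir1 dir2 apply    # actually synchronize
```

Check the preview output for unexpected "deleting" lines before running the command with "apply".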
rsync -av dir1/ <user>@<remotehost>:/home/kit/ifkm/ej4555/dir2/
Local-to-remote synchronization of all files from the local dir1 folder to the remote dir2 folder, which is accessed over the network using SSH.
rsync -av <user>@<remotehost>:/home/kit/ifkm/ej4555/dir1/ dir2/
Remote-to-local synchronization of all files from the remote dir1 folder, which is accessed over the network using SSH, to the local dir2 folder.
The --partial option makes Rsync keep partially transferred files. An interrupted Rsync session can then be resumed where it left off by repeating the same command. The -P option used in the example below is shorthand for --partial --progress.
The --rsh=ssh option tells Rsync to use ssh as the remote shell, so that the data is transferred over a secure connection.
# Execute in the local folder where bigdata.tgz should be placed.
rsync -P --rsh=ssh <username>@<remotehost>:bigdata.tgz ./bigdata.tgz
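Since an interrupted transfer is resumed by repeating the same command, the repetition can be automated. The following is a minimal sketch, assuming a hypothetical helper named retry_transfer (not part of Rsync): it re-runs the given command until it succeeds or a maximum number of attempts is reached, waiting a fixed number of seconds between attempts.

```shell
# retry_transfer is a hypothetical helper name used for illustration only.
# Arguments: maximum attempts, delay in seconds, then the command to run.
retry_transfer() {
    local max_attempts="$1" delay="$2" attempt=1
    shift 2
    # Repeat the command until it exits with status 0.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage (placeholders as in the example above):
#   retry_transfer 5 30 rsync -P --rsh=ssh <username>@<remotehost>:bigdata.tgz ./bigdata.tgz
```

With -P (and therefore --partial) in the command, each new attempt can reuse the part of the file that has already arrived instead of starting from scratch.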
Best Practices
If Rsync is not found on the remote host:
You can pass the path to Rsync on the remote host as an additional option:
--rsync-path=/usr/bin/rsync
You can find the path by running which rsync on the remote host.
When the connection is slow:
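Compress the data during transfer with the -z option to make the transfer faster. Files that are already compressed, such as .tgz archives, usually benefit little from this.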