How to Synchronize Directories with Rsync
Synchronizing directories is a simple task with rsync. It lets you make exact copies of directories easily while saving bandwidth. Forget about copying and deleting directories by hand: with rsync you can keep the directories you want perfectly synchronized with each other.
Keep reading and I'll show you the first steps to start using rsync...
WHAT IS RSYNC?
rsync is a free application for Unix/Linux and Microsoft Windows systems that transfers incremental data efficiently and can also work with compressed and encrypted data. Using a delta encoding technique, it synchronizes files and directories between two machines on a network, or between two locations on the same machine, while minimizing the volume of data transferred. An important feature of rsync not found in most programs or protocols is that the copy takes place with only one transmission in each direction. rsync can copy or display directory contents and copy files, optionally using compression and recursion.
rsync is distributed under the GNU General Public License.
rsync allows remote directory synchronization while saving bandwidth
1 - REQUIREMENTS FOR THIS TUTORIAL
A PC with any GNU/Linux distribution
Internet connection (recommended)
2 - INSTALL RSYNC
How rsync is installed depends on the distribution you use. For example, on distributions based on Debian GNU/Linux and on Red Hat, the installation is done as follows:
Debian-based distributions:
sudo apt-get update
sudo apt-get install rsync
Red Hat-based distributions:
sudo yum -y update
sudo yum install rsync
If you use another distribution, find out how to install the package with its package manager; it will probably be very similar to the examples above.
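Whichever package manager you use, a quick check like the following can confirm whether rsync ended up on your PATH. This is only a small sketch; the variable name rsync_status is made up for the example.

```shell
#!/bin/sh
# Check whether rsync is on the PATH and, if so, print the first line
# of its version banner.
if command -v rsync >/dev/null 2>&1; then
    rsync_status="installed"
    rsync --version | head -n 1
else
    rsync_status="missing"
    echo "rsync was not found; install it with your package manager" >&2
fi
echo "rsync status: $rsync_status"
```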
3 - STARTING DIRECTORY
To begin, we will start from a directory called "dir1" with a simple file and directory structure. To see the whole structure we can use commands like "tree" or "find":
find
.
./dir1
./dir1/file1.txt
./dir1/file2.txt
./dir1/dir11
./dir1/dir11/file11.txt
./dir1/dir11/file12.txt
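If you want to follow along, the structure above can be recreated with a few commands. This sketch builds it inside a temporary directory (an assumption made here so that nothing in your real files is touched) and uses empty files.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory that is removed when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Recreate the example tree: two files in dir1 and two in dir1/dir11.
mkdir -p dir1/dir11
touch dir1/file1.txt dir1/file2.txt
touch dir1/dir11/file11.txt dir1/dir11/file12.txt

# List everything, sorted so the output order is stable.
find . | sort
```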
4 - COPY FILES AND FOLDERS WITH RSYNC
To copy the contents of one directory to another, overwriting the files and subdirectories of the destination directory, use:
rsync -rtv source_directory/ destination_directory/
Pay attention to the trailing slash / added to source_directory/. This slash prevents a new directory from being created: if we leave it off the source directory, then source_directory itself will be created inside destination_directory.
For example, if we want to copy all the content of a directory called Photos into a Photos directory that already exists, we need to add the trailing slash /. If we don't, a directory called Photos will be created inside the destination directory called Photos.
For example, to copy all the files and subdirectories of the directory "dir1" into another directory called "dir2" we will use the following command:
rsync -rtv dir1/ dir2/
- The -r parameter makes the copy recursive: all directories, subdirectories, and the files stored in them are copied.
- The -t parameter preserves modification times, so each file has the same modification time in the source_directory and in the destination_directory.
- The -v parameter displays more information during the synchronization process so you can follow the progress of the command.
To check that everything has been copied correctly, we can use the "find" command:
find
.
./dir1
./dir1/file1.txt
./dir1/file2.txt
./dir1/dir11
./dir1/dir11/file11.txt
./dir1/dir11/file12.txt
./dir2
./dir2/dir11
./dir2/dir11/file11.txt
./dir2/dir11/file12.txt
./dir2/file1.txt
./dir2/file2.txt
We can also use the "diff" command to find the differences between directories:
diff -r dir1/ dir2/
5 - SYNCHRONIZE TWO DIRECTORIES (ONLY UPDATE)
If we need to update the content of the destination_directory, that is, copy over the new files found in the source_directory, we can add the -u parameter. With -u, rsync skips any file that already exists in the destination with a newer modification time than the source file, so more recent changes made in the destination are not overwritten. Files that are new or have changed in the source are transferred as usual. To send only the differences, rsync uses an algorithm called delta-transfer (when both paths are local, rsync defaults to copying whole files instead).
In this example we are going to see how this works. Inside the directory ./dir1/ a new file called file3.txt has been created. When "synchronizing" the directories, only the new file file3.txt will be transferred:
The structure of the directories is as follows:
find
.
./dir1
./dir1/file1.txt
./dir1/file2.txt
./dir1/dir11
./dir1/dir11/file11.txt
./dir1/dir11/file12.txt
./dir1/file3.txt
./dir2
./dir2/dir11
./dir2/dir11/file11.txt
./dir2/dir11/file12.txt
./dir2/file1.txt
./dir2/file2.txt
To update the directory content we will use the following command:
rsync -rtvu source_directory/ destination_directory/
- The -r parameter makes the copy recursive: all directories, subdirectories, and the files stored in them are copied.
- The -t parameter preserves modification times, so each file has the same modification time in the source_directory and in the destination_directory.
- The -v parameter displays more information during the synchronization process so you can follow the progress of the command.
- The -u parameter skips any file that already exists in the destination_directory with a newer modification time than its counterpart in the source_directory.
To update ./dir2/ with the content of ./dir1/ we will use the following command, whose output shows that only file3.txt is transferred:
rsync -rtvu dir1/ dir2/
sending incremental file list
./
file3.txt
sent 248 bytes received 39 bytes 574.00 bytes/sec
total size is 0 speedup is 0.00
The new directory structure is now as follows:
find
.
./dir1
./dir1/file1.txt
./dir1/file2.txt
./dir1/dir11
./dir1/dir11/file11.txt
./dir1/dir11/file12.txt
./dir1/file3.txt
./dir2
./dir2/dir11
./dir2/dir11/file11.txt
./dir2/dir11/file12.txt
./dir2/file1.txt
./dir2/file2.txt
./dir2/file3.txt
Note that, by default, rsync decides whether a file differs between the source_directory and the destination_directory by comparing its size and its modification time. To compare the file contents with a hash instead, use the -c parameter. With -c, a file whose checksum is the same in the source and in the destination is left untouched. The command to use in this case would be the following:
rsync -rvuc source_directory/ destination_directory/
6 - COMPLETELY SYNCHRONIZE TWO DIRECTORIES
If what you want is 100% synchronization between two directories, that is, for them to be identical, copying files and directories from the source_directory to the destination_directory is not enough: files and directories that were deleted in the source_directory must also be deleted in the destination_directory.
rsync does all of this with the --delete parameter. Combined with the parameters used so far, --delete keeps the two directories synchronized: modified files are updated and files removed from the source are removed from the destination. Keep in mind that -u leaves alone any destination file that is newer than its source counterpart; drop -u if the destination must be an exact mirror of the source.
rsync -rtvu --delete source_directory/ destination_directory/
- The -r parameter makes the copy recursive: all directories, subdirectories, and the files stored in them are copied.
- The -t parameter preserves modification times, so each file has the same modification time in the source_directory and in the destination_directory.
- The -v parameter displays more information during the synchronization process so you can follow the progress of the command.
- The -u parameter skips any file that already exists in the destination_directory with a newer modification time than its counterpart in the source_directory.
- The --delete parameter deletes from the destination_directory any files that do not exist in the source_directory.
The directory structure is as follows, where you can see that the file3.txt file exists in the destination_directory but not in the source:
find
.
./dir1
./dir1/file1.txt
./dir1/file2.txt
./dir1/dir11
./dir1/dir11/file11.txt
./dir1/dir11/file12.txt
./dir2
./dir2/dir11
./dir2/dir11/file11.txt
./dir2/dir11/file12.txt
./dir2/file1.txt
./dir2/file2.txt
./dir2/file3.txt
If we synchronize the directories 100%, we can see how file3.txt is deleted from the destination_directory, since that file does not exist in the source_directory:
rsync -rtvu --delete dir1/ dir2/
sending incremental file list
deleting file3.txt
./
sent 177 bytes received 29 bytes 412.00 bytes/sec
total size is 0 speedup is 0.00
The new directory structure is as follows. The file file3.txt does not exist in either directory:
find
.
./dir1
./dir1/file1.txt
./dir1/file2.txt
./dir1/dir11
./dir1/dir11/file11.txt
./dir1/dir11/file12.txt
./dir2
./dir2/dir11
./dir2/dir11/file11.txt
./dir2/dir11/file12.txt
./dir2/file1.txt
./dir2/file2.txt
The file deletion process can be carried out at different points of the synchronization:
- Before transfer: rsync can search for missing files and delete them before the transfer starts, with the --delete-before parameter
- After transfer: rsync can search for missing files and delete them after the transfer is complete, with the --delete-after parameter
- During transfer: rsync can delete files while the transfer is in progress, with the --delete-during parameter (this is the default since rsync 3.0.0 when --delete is given without a timing option; older versions default to --delete-before)
- Delayed deletion: rsync can find the missing files during the transfer, wait until it has finished, and then delete the files it found, with the --delete-delay parameter
7 - SYNCHRONIZE TWO REMOTE DIRECTORIES
With rsync it is also possible to copy files and synchronize a local directory with a remote directory located on any other computer on the network, as long as you have SSH or RSH access, or rsync is running as a service on the remote server.
To synchronize two directories when one of them is on a remote computer, you need the DNS name or IP address of the remote computer, as well as a user with access privileges.
For example, to synchronize a local_directory with a remote_directory, either of the following forms can be used:
rsync -rtvz source_directory/ user@domain:/path/destination_directory/
rsync -rtvz source_directory/ user@xxx.xxx.xxx.xxx:/path/destination_directory/
In the other direction, to synchronize a remote_directory with a local_directory, either of the following forms can be used:
rsync -rtvz user@domain:/path/source_directory/ destination_directory/
rsync -rtvz user@xxx.xxx.xxx.xxx:/path/source_directory/ destination_directory/
- The -r parameter makes the copy recursive: all directories, subdirectories, and the files stored in them are copied.
- The -t parameter preserves modification times, so each file has the same modification time in the source_directory and in the destination_directory.
- The -v parameter displays more information during the synchronization process so you can follow the progress of the command.
- The -z parameter compresses the file data while it is being transferred.
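A remote path always has the form user@host:path, with a colon between the host and the path. The sketch below builds such a target from its parts and prints the resulting command instead of running it, since it needs a reachable server. All connection details (user, example.com, port 2222) and the helper name remote_target are hypothetical; -e is rsync's option for choosing the remote shell command, used here to pass a non-default SSH port.

```shell
#!/bin/sh
set -eu

# Hypothetical connection details; replace them with your own.
remote_user="user"
remote_host="example.com"
remote_path="/path/destination_directory"
ssh_port=2222

# Build "user@host:/path/". A trailing slash on the path is stripped
# first so that exactly one slash is appended.
remote_target() {
    printf '%s@%s:%s/' "$1" "$2" "${3%/}"
}

target=$(remote_target "$remote_user" "$remote_host" "$remote_path")

# Print the command rather than executing it.
echo "rsync -rtvz -e \"ssh -p $ssh_port\" source_directory/ $target"
```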
8 - COMPRESS FILES BEFORE SYNCHRONIZING
When making transfers between different computers, we may want to save some bandwidth by compressing the data as it is sent. To do this, we add the -z parameter. For example:
rsync -rtvz source_directory/ user@domain:/destination_directory/
- The -r parameter makes the copy recursive: all directories, subdirectories, and the files stored in them are copied.
- The -t parameter preserves modification times, so each file has the same modification time in the source_directory and in the destination_directory.
- The -v parameter displays more information during the synchronization process so you can follow the progress of the command.
- The -z parameter compresses the file data while it is being transferred.
Keep in mind that compression uses a lot of CPU. When the files to transfer are very large, or are already in a compressed format, the -z parameter is usually not worthwhile.
9 - EXCLUDE FILES AND DIRECTORIES
Sometimes we need to exclude some files or directories from the synchronization. To do this, we can use the --exclude parameter followed by the file or directory that we want to exclude. Paths are given relative to the source_directory. For example:
Exclude a directory:
rsync -rtvz --exclude 'directory' source_directory/ destination_directory/
Exclude a file:
rsync -rtvz --exclude 'file.doc' source_directory/ destination_directory/
Exclude a subdirectory:
rsync -rtvz --exclude 'path/directory' source_directory/ destination_directory/
Exclude a file in a subdirectory:
rsync -rtvz --exclude 'path/directory/file.doc' source_directory/ destination_directory/
If we want to exclude many files in different paths, we can create a file listing all of them (one per line) and pass it with the --exclude-from parameter, as in the following example:
rsync -rtv --exclude-from 'excluded_list.txt' source_directory/ destination_directory/