Using rsync for 'cp -rl' is copying endlessly

I am using cp -rl to copy a folder tree and would like to get some information about the copy while it runs (such as progress).
I was thinking about using rsync instead.
This is what I have tried so far, without any success; it seems to copy endlessly:
rsync -ah --info=progress2 -rl src_dir dst_dir
cp finishes in less than a minute, while rsync was still running after 10 minutes.
What am I doing wrong?
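One likely explanation, offered here as an assumption rather than a confirmed diagnosis: `-l` means different things to the two tools. For cp it is `--link` (create hard links instead of copying file data), while for rsync it is `--links` (copy symlinks as symlinks), so the rsync command above transfers every byte. A minimal sketch of a closer equivalent uses rsync's `--link-dest` option. The scratch paths below are made up for illustration, and the rsync step is skipped if rsync is not installed.

```shell
#!/bin/bash
set -eu

# Scratch tree for the demonstration (hypothetical paths; remove it when done).
work=$(mktemp -d)
mkdir -p "$work/src_dir/sub"
echo "hello" > "$work/src_dir/sub/file.txt"

# cp -l hard-links each file instead of copying its data, which is why it is fast.
cp -rl "$work/src_dir" "$work/dst_cp"

if command -v rsync >/dev/null 2>&1; then
    # --link-dest hard-links files that are unchanged relative to the given
    # directory; an absolute path avoids surprises with relative resolution.
    rsync -a --link-dest="$work/src_dir" "$work/src_dir/" "$work/dst_rsync/"
fi

# "a -ef b" is true when both names refer to the same device and inode.
if [ "$work/src_dir/sub/file.txt" -ef "$work/dst_cp/sub/file.txt" ]; then
    echo "cp -rl produced a hard link"
fi
```

On rsync 3.1 or newer, `--info=progress2` can be added to the rsync line to get the overall progress display the question asks for.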

Related

inotify and rsync on large number of files

I am using inotify to watch a directory and sync files between servers using rsync. Syncing works perfectly, and memory usage is mostly not an issue. However, a large number of files (350k) were recently added, and this has hurt performance, specifically CPU usage. Now when rsync runs, CPU usage spikes to 90-100% and rsync takes a long time to complete; 650k files are being watched and synced.
Is there any way to speed up rsync and sync only the directory that has changed? Or, alternatively, to set up multiple inotifywait instances on separate directories? The script being used is below.
UPDATE: I have added the --update flag and usage seems mostly unchanged
#!/bin/bash
EVENTS="CREATE,DELETE,MODIFY,MOVED_FROM,MOVED_TO"
inotifywait -e "$EVENTS" -m -r --format '%:e %f' /var/www/ --exclude '/var/www/.*cache.*' | (
    WAITING="";
    while true; do
        LINE="";
        read -t 1 LINE;
        if test -z "$LINE"; then
            if test ! -z "$WAITING"; then
                echo "CHANGE";
                WAITING="";
                rsync --update -alvzr --exclude '*cache*' --exclude '*.git*' /var/www/* root@secondwebserver:/var/www/
            fi;
        else
            WAITING=1;
        fi;
    done
)
I ended up removing the compression option (z) and upping the WAITING var to 10 (seconds). This seems to have helped: rsync still spikes the CPU load, but the spike is shorter lived. Credit goes to an answer on Unix Stack Exchange.
You're using rsync to synchronize the root directory of a large tree, so I'm not surprised at the performance loss.
One possible solution is to synchronize only the changed files/directories, instead of the whole root directory.
For instance, suppose file1, file2 and file3 lie under from/dir. When changes are made to these 3 files, use
rsync --update -alvzr from/dir/file1 from/dir/file2 from/dir/file3 to/dir
rather than
rsync --update -alvzr from/dir/* to/dir
But this has a potential pitfall: rsync won't automatically create missing parent directories on the target. However, you can use ssh to run a remote command and create the directories yourself.
You may need to set up SSH public-key authentication as well, but judging by the rsync command line you pasted, I assume you've already done this.
reference:
rsync - create all missing parent directories?
rsync: how can I configure it to create target directory on server?
How to use SSH to run a shell script on a remote machine?
SSH error when executing a remote command: "stdin: is not a tty"
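The idea of syncing only the changed directory can be sketched as follows. This is a hedged illustration, not the answerer's script: the function name `sync_cmd_for_dir` and the variables `WATCH_ROOT` and `REMOTE` are hypothetical. It assumes inotifywait is run with `--format '%w'`, which prints the directory in which an event occurred, and it uses rsync's `--relative` (`-R`) option with a `/./` marker as an alternative to creating remote directories over ssh, since `-R` recreates the path after the marker on the receiving side. The function only prints the command so the logic can be checked without touching a server.

```shell
#!/bin/bash

# Hypothetical settings, chosen to match the question's layout.
WATCH_ROOT="/var/www"
REMOTE="root@secondwebserver"

# Print the rsync command that would sync a single changed directory.
# $1 is a directory as reported by inotifywait's %w (trailing slash included).
sync_cmd_for_dir() {
    local dir="${1%/}"                 # drop the trailing slash
    local rel="${dir#"$WATCH_ROOT"}"   # path below the watch root, e.g. /site/img
    rel="${rel#/}"                     # drop the leading slash
    # Everything after "/./" is recreated on the receiver because of -R.
    local src="$WATCH_ROOT/./${rel:+$rel/}"
    printf 'rsync -aR --update %s %s:%s/\n' "$src" "$REMOTE" "$WATCH_ROOT"
}

# Example: an event was reported in /var/www/site/img/
sync_cmd_for_dir "/var/www/site/img/"
```

In a real watcher loop the printed command would be executed instead, once per distinct directory collected during the quiet period, so rsync only scans the subtree that changed.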

rsync: --delete-during --backup-dir=PATH --backup doesn't backup directories that are deleted to make way for files

When running rsync with the --backup, --delete-during and --backup-dir=PATH options, files that are deleted are backed up, but directories are not if they were empty at the time they were deleted. I can't see an option that specifies that directories should not be pruned from the backup when they are deleted.
Example:
mkdir /tmp/test_rsync_delete
cd /tmp/test_rsync_delete
mkdir -p a/a/a/a/a
ln -s . a/b
mkdir -p b/a/a
ln -s a/a b/a
touch b/a/a/a
mkdir c
mkdir backup
rsync -avi --delete-during --backup --backup-dir=backup a/ c/
find backup/ -exec ls -ldi {} \;
# Should be empty
rsync -avi --delete-during --backup --backup-dir=backup b/ c/
find backup/ -exec ls -ldi {} \;
# Will be missing the directory that was deleted to make way for the file.
Update
When you run the above example, you will notice that the empty directories were pruned/removed by the --delete option. However, the same directories were not backed up in the directory specified by the --backup-dir option. It's not necessarily the directories themselves that are important, but their permissions and ownership. If rsync fails when running in batch mode (--read-batch), you need to be able to roll back by restoring the system to its previous state. If directories are not being backed up, the backup is not a reliable point to restore from, since it may be missing some directories.
So why does the --backup family of options not back up empty directories when they are going to be pruned by the --delete family of options?
This is not an answer to the specific question, but it is probably what others who end up here were searching for. It is what I was searching for when I found this question:
rsync -av --delete-after src dest
-av: The "-a" means archive. This preserves symlinks, permissions, timestamps and group/owner, and makes the copy recursive. The "v" makes the job verbose. It isn't necessary, but it lets you see what rsync is doing, so you know if you've done something wrong.
--delete-after: Tells rsync to compare the destination against the source and delete any extraneous files from the destination after the transfer has completed. This is a dangerous option, so use it with caution.
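Because `--delete-after` is destructive, a preview with `-n` (`--dry-run`) is a cheap safeguard. The following is a minimal sketch on a throwaway tree; the directory and file names are invented for the demonstration, and the rsync steps are skipped if rsync is not installed. Note that it uses trailing slashes on both paths, so the contents of `src/` are mirrored into `dest/` rather than creating `dest/src`.

```shell
#!/bin/bash
set -eu

# Throwaway directories for the demonstration (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
echo "keep"  > "$work/src/keep.txt"
echo "stale" > "$work/dest/stale.txt"   # exists only on the destination

if command -v rsync >/dev/null 2>&1; then
    # Preview: -n reports what would be copied and deleted, changing nothing.
    rsync -avn --delete-after "$work/src/" "$work/dest/"

    # Real run: keep.txt is copied, then stale.txt is deleted from dest.
    rsync -av --delete-after "$work/src/" "$work/dest/"
fi
```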

read input from a file and sync accordingly

I have a text file which contains the list of files and directories that I want to copy (one per line). Now I want rsync to take this input from my text file and sync it to the destination that I provide.
I've tried playing around with the "--include-from=FILE" and "--file-from=FILE" options of rsync, but it is just not working.
I also tried prefixing each line in my file with "+", but it is still not working.
I have tried coming up with various filter PATTERNs as outlined in the rsync man page, but it is still not working.
Could someone provide the correct syntax for this use case? I've tried the above on Fedora 15, RHEL 6.2 and Ubuntu 10.04 and none worked, so I am definitely missing something.
Many thanks.
There is more than one way to answer this question, depending on how you want to copy these files. If your intent is to copy the file list with absolute paths, then it might look something like:
rsync -av --files-from=/path/to/files.txt / /destination/path/
This expects the paths in the list to be relative to the source location of / and retains the entire absolute structure under the destination.
If your goal is to copy all of the files in the list to the destination without preserving any kind of path hierarchy (just a collection of files), then you could try one of the following:
# note this method might break if your file is too long and
# exceeds the maximum argument limit
rsync -av `cat /path/to/file` /destination/
# or get fancy with xargs to batch 200 of your items at a time
# with multiple calls to rsync (-J is a BSD/macOS xargs option)
cat /path/to/file | xargs -n 200 -J % rsync -av % /destination/
Or a loop and copy:
# bash shell; reads one path per line, so names with spaces survive
while IFS= read -r f; do cp -- "$f" /dest/; done < /path/to/files.txt
Given a file list $HOME/GET/bringemback containing
need/A
alsoneed/B
shouldget/C
run
cd $HOME/GET
rsync -av --files-from=./bringemback me@theremote:. $HOME/GET/collect
This would get the files and drop them into $HOME/GET/collect:
$HOME/GET/
    collect/
        need/A
        alsoneed/B
        shouldget/C
or so I believe.
rsync supports this natively:
rsync --recursive -av --files-from=/path/to/files.txt / /destination/path/
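A small local sketch of `--files-from`, under the assumption that the list holds paths relative to the source directory given on the command line. The file and directory names are invented for the demonstration, and the rsync step is skipped if rsync is not installed. The explicit `-r` matters: when `--files-from` is used, the recursion normally implied by `-a` is switched off, so a directory named in the list would otherwise be created without its contents.

```shell
#!/bin/bash
set -eu

# Throwaway source tree (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/need" "$work/src/docs" "$work/src/skip" "$work/dest"
echo "a"      > "$work/src/need/A"
echo "readme" > "$work/src/docs/readme.txt"
echo "x"      > "$work/src/skip/X"

# One path per line, relative to the source directory passed to rsync.
printf '%s\n' "need/A" "docs" > "$work/files.txt"

if command -v rsync >/dev/null 2>&1; then
    # Only the listed entries are transferred; -r makes "docs" recursive.
    rsync -av -r --files-from="$work/files.txt" "$work/src/" "$work/dest/"
fi
```

The relative path of each entry is preserved under the destination, so `need/A` lands at `dest/need/A`, and anything not listed (here `skip/`) is left alone.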

How do you get rsync to exclude any directory named cache?

I'm new to rsync and have read a bit about excluding files and directories, but I don't fully understand it and can't seem to get it working.
I'm simply trying to run a backup of all the websites in a server's webroot, but I don't want any of the CMS cache files.
Is there a way to exclude any directory named cache?
I've tried a lot of things over the weeks (that I don't remember), but more recently I've been trying these sorts of things:
sudo rsync -avzO -e --exclude *cache ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
and this:
sudo rsync -avzO -e --exclude cache/ ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
and this:
sudo rsync -avzO -e --exclude */cache/ ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
and this:
sudo rsync -avzO -e --exclude *cache/ ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
Sorry if this is easy; I just haven't been able to find information that I understand, because it all talks about a path to exclude.
I don't have a specific path I want to exclude, just a directory name, if that makes sense.
rsync --exclude cache/ ....
should work like peaches. I think you might be confusing some things, since -e requires an argument (like -e "ssh -l ssh-user"). Edit: on looking at your command lines a little closer, it turns out this is exactly your problem. In your commands, -e takes --exclude as its argument. You should have said
--exclude cache/ -e ssh
although you could just drop -e ssh, since ssh is the default.
I'd also recommend that you look at the filter rules:
rsync -FF ....
That way you can include .rsync-filter files throughout your directory tree, containing things like
- cache/
This makes things far more flexible, makes command lines more readable, and lets you make exceptions inside specific subtrees.
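To illustrate the pattern itself: an exclude of `cache/` with no leading slash matches a directory named cache at any depth, and quoting it keeps the shell from expanding wildcards in other patterns. Below is a minimal local sketch; the site names and files are invented for the demonstration, and the rsync step is skipped if rsync is not installed.

```shell
#!/bin/bash
set -eu

# Throwaway webroot with cache directories at two different depths
# (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/home/site1/cache" "$work/home/site1/pages" \
         "$work/home/site2/app/cache" "$work/backup"
echo "tmp"  > "$work/home/site1/cache/blob"
echo "page" > "$work/home/site1/pages/index.html"
echo "tmp"  > "$work/home/site2/app/cache/blob"
echo "code" > "$work/home/site2/app/main.php"

if command -v rsync >/dev/null 2>&1; then
    # The trailing slash restricts the match to directories; the missing
    # leading slash lets it match at any level of the tree.
    rsync -av --exclude 'cache/' "$work/home/" "$work/backup/"
fi
```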

Using rsync to delete a single file

File foo.txt exists on the remote machine at: /home/user/foo.txt
It doesn't exist on the local machine.
I want to delete foo.txt using rsync.
I do not know (and assume for the purposes of this question that I cannot find out) what other files are in /home/user on either the local or remote machines, so I can't just sync the whole directory.
What rsync command can I use to delete foo.txt on the remote machine?
Try this:
rsync -rv --delete --include=foo.txt '--exclude=*' /home/user/ user@remote:/home/user/
(I highly recommend running it with --dry-run first to test it.) Although it seems like it would be easier to use ssh:
ssh user@remote "rm /home/user/foo.txt"
That's a bit trivial, but if, like me, you came to this page looking for a way to delete the contents of a directory on a remote server using rsync, this is how I did it:
Create an empty mock folder:
mkdir ~/mock
Sync with it:
rsync -arv --delete --dry-run ~/mock/ remote_server:~/dir_to_clean/
Remove --dry-run from the line above to actually do the thing.
As suggested above, use --dry-run to test first. Per the rsync man page, --delete deletes files on the remote location.
rsync -rv --delete user@hostname.local:full/path/to/foo.txt
The comment below stating that this will only list is incorrect. To list only, use --list-only and remove --delete.
I just came across the same problem: I needed to use rsync to delete a remote file, as only rsync and no other SSH commands were allowed. The --remove-source-files option (formerly known as --remove-sent-files) did exactly that:
rsync -avPn --remove-source-files remote:/home/user/foo.txt .
rm foo.txt
As always, remove the -n option to really execute this.
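The include/exclude approach from the first answer can be tried safely on local directories before pointing it at a real server. This is a minimal sketch with invented directory names standing in for the local and remote sides, and the rsync step is skipped if rsync is not installed. The `--exclude='*'` rule hides everything else from rsync, which also protects those files from `--delete`; only foo.txt is considered, and since it is absent on the sending side it is removed from the receiving side.

```shell
#!/bin/bash
set -eu

# "local" stands in for the sending side, "remote" for the receiving side
# (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/local" "$work/remote"
echo "delete me" > "$work/remote/foo.txt"
echo "keep me"   > "$work/remote/other.txt"

if command -v rsync >/dev/null 2>&1; then
    # Add -n first on a real system to preview the deletion.
    rsync -rv --delete --include=foo.txt --exclude='*' \
        "$work/local/" "$work/remote/"
fi
```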
