Rsync: provide a list of unsent files - rsync

The Rsync -u flag prevents the overwriting of modified destination files. How can I get a list of files that were not sent due to this flag? The -v flag will let me know which files were sent, but I would like to know which ones weren't.

From the rsync man page:
-i, --itemize-changes
Requests a simple itemized list of the changes that are being
made to each file, including attribute changes. This is exactly
the same as specifying --out-format='%i %n%L'. If you repeat
the option, unchanged files will also be output, but only if the
receiving rsync is at least version 2.6.7 (you can use -vv with
older versions of rsync, but that also turns on the output of
other verbose messages).
In my testing, the -ii option isn't working with rsync 3.0.8, but -vv is. Your mileage may vary.
You could also get substantially the same information by invoking rsync with --dry-run and --existing in the opposite direction. So if your regular transfer looked like this:
rsync --update --recursive local:/directory/ remote:/directory/
You would use:
rsync --dry-run --existing --recursive remote:/directory/ local:/directory/
but -vv or -ii is safer and less prone to misinterpretation.
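Another way to isolate exactly the files held back by --update is to compare two dry runs, one without --update and one with it. The following is only a sketch under assumptions: the function name list_skipped_by_update is made up for illustration, both directories are reachable from where you run it, and file names contain no newlines.

```shell
#!/bin/sh
# List files that --update would skip because the destination copy is newer.
# Usage: list_skipped_by_update SRC/ DEST/   (hypothetical helper name)
list_skipped_by_update() {
    src=$1
    dest=$2
    all=$(mktemp) || return 1
    upd=$(mktemp) || return 1
    # Dry run WITHOUT --update: every name rsync considers out of date.
    rsync --dry-run --recursive --times --out-format='%n' "$src" "$dest" | sort > "$all"
    # Dry run WITH --update: the same list minus files that are newer on DEST.
    rsync --dry-run --recursive --times --update --out-format='%n' "$src" "$dest" | sort > "$upd"
    # Names present only in the first list were held back by --update.
    comm -23 "$all" "$upd"
    rm -f "$all" "$upd"
}
```

Since both runs use --dry-run, nothing is transferred; call it with the same source and destination as the real transfer, for example list_skipped_by_update local:/directory/ remote:/directory/.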

Related

Rsync takes longer time “receiving incremental file list”

I am using rsync to copy files from a remote host to my local machine in a cron job. Each run should copy only the new files from the remote host, but it gets stuck at the line "receiving incremental file list" for a very long time. Below is the command I am using. Is there any way to speed up this rsync process?
rsync -avz --inplace --progress --delete -ahe ssh remoteuser@remotehost:/home/bin/dir1/data /home/bin/dir1
Have you tried with --delete-before, --delete-after or --delay-updates?
Some options require rsync to know the full file list, so these options disable the incremental recursion mode. These include: --delete-before, --delete-after, --prune-empty-dirs, and --delay-updates. Because of this, the default delete mode when you specify --delete is now --delete-during when both ends of the connection are at least 3.0.0 (use --del or --delete-during to request this improved deletion mode explicitly). See also the --delete-delay option that is a better choice than using --delete-after.
(from: http://linux.die.net/man/1/rsync)

inotify and rsync on large number of files

I am using inotify to watch a directory and sync files between servers using rsync. Syncing works perfectly, and memory usage is mostly not an issue. However, a large number of files (350k) were recently added, and this has hurt performance, specifically CPU. There are now 650k files being watched and synced; when rsync runs, CPU usage spikes to 90-100% and rsync takes a long time to complete.
Is there any way to speed up rsync and only rsync the directory that has been changed? Or alternatively to set up multiple inotifywaits on separate directories. Script being used is below.
UPDATE: I have added the --update flag and CPU usage seems mostly unchanged.
#! /bin/bash
EVENTS="CREATE,DELETE,MODIFY,MOVED_FROM,MOVED_TO"
inotifywait -e "$EVENTS" -m -r --format '%:e %f' /var/www/ --exclude '/var/www/.*cache.*' | (
WAITING="";
while true; do
LINE="";
read -t 1 LINE;
if test -z "$LINE"; then
if test ! -z "$WAITING"; then
echo "CHANGE";
WAITING="";
rsync --update -alvzr --exclude '*cache*' --exclude '*.git*' /var/www/* root@secondwebserver:/var/www/
fi;
else
WAITING=1;
fi;
done)
I ended up removing the compression option (z) and raising the read timeout (read -t) from 1 to 10 seconds so that changes are batched over a longer window. This seems to have helped: rsync still spikes CPU load, but the spike is shorter lived. Credit goes to an answer on Unix Stack Exchange.
You're using rsync to synchronize the root directory of a large tree, so I'm not surprised at the performance loss.
One possible solution is to only synchronize the changed files/directories, instead of the whole root directory.
For instance, suppose file1, file2 and file3 lie under from/dir. When changes are made to these 3 files, use
rsync --update -alvzr from/dir/file1 from/dir/file2 from/dir/file3 to/dir
rather than
rsync --update -alvzr from/dir/* to/dir
One caveat: rsync won't create missing parent directories on the target automatically. You can, however, use ssh to run a remote command and create them yourself.
You may need to set up SSH public-key authentication as well, but judging by the rsync command line you pasted, I assume you've already done this.
reference:
rsync - create all missing parent directories?
rsync: how can I configure it to create target directory on server?
How to use SSH to run a shell script on a remote machine?
SSH error when executing a remote command: "stdin: is not a tty"
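An alternative to creating the directories over ssh is to let rsync do it with --files-from, which implies --relative and so recreates the parent directories of each listed path. This is a sketch of that swapped-in technique, not the approach from the answer above; the helper name sync_changed and the list file are hypothetical, and deleted paths are not handled.

```shell
#!/bin/sh
# Sync only the paths named in LIST (one per line, relative to SRC_ROOT).
# --files-from implies --relative, so missing parent directories are created
# under DEST instead of having to be made by hand over ssh.
# Usage: sync_changed LIST SRC_ROOT/ DEST/   (hypothetical helper name)
sync_changed() {
    list=$1
    src_root=$2
    dest=$3
    rsync --archive --update --files-from="$list" "$src_root" "$dest"
}
```

For the script in the question this might be called as sync_changed /tmp/changed.txt /var/www/ root@secondwebserver:/var/www/, where /tmp/changed.txt is an assumed file collected from inotifywait output. That would require a format that includes the directory (for example '%w%f') with the /var/www/ prefix stripped, because '%f' alone gives only the file name.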

How to restrict Rsync update timestamp

rsync -av --size-only --include="*/" --include="*.jpeg" --exclude="*" ~/alg/temperature/ ~/alg/tmp/
I use the command above to sync some files, and I don't want anything updated, not even the timestamp, if the file size is the same.
The --size-only option makes rsync transfer only the files whose size has changed,
but files whose size is unchanged still get "touched" and have their timestamp updated, which is what I don't want.
How can I prevent this?
The -a option is equivalent to -rlptgoD. You need to remove the -t, which tells rsync to transfer modification times along with the files and update them on the destination.
You may also want to try -c, which decides what to skip based on a checksum rather than mod-time and size. This is slower, but should work for what you want.
So your line could be (expanding -a and replacing t with c):
rsync -rlpcgoDv --include="*/" --include="*.jpeg" --exclude="*" ~/alg/temperature/ ~/alg/tmp/
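If you would rather keep the --size-only comparison from the question than switch to checksums, dropping -t on its own should be enough to leave the timestamps of same-size files alone. A minimal sketch, assuming that variant is what you want; the function name sync_by_size_keep_times is made up for illustration.

```shell
#!/bin/sh
# Copy *.jpeg files whose size differs; leave same-size files (and their
# modification times) on the destination untouched.
# -rlpgoD is -a with the -t (preserve times) flag removed.
# Usage: sync_by_size_keep_times SRC/ DEST/   (hypothetical helper name)
sync_by_size_keep_times() {
    src=$1
    dest=$2
    rsync -rlpgoD --size-only \
        --include='*/' --include='*.jpeg' --exclude='*' \
        "$src" "$dest"
}
```

With the paths from the question this would be sync_by_size_keep_times ~/alg/temperature/ ~/alg/tmp/. Newly copied files get the time of the transfer rather than the source's modification time, since -t is off.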

How do you get rsync to exclude any directory named cache?

I'm new to rsync and have read a bit about excluding files and directories but I don't fully understand and can't seem to get it working.
I'm simply trying to run a backup of all the websites in a server's webroot but don't want any of the CMS's cache files.
Is there a way to exclude any directory named cache?
I've tried a lot of things over the weeks (that I don't remember), but more recently I've been trying these sorts of things:
sudo rsync -avzO -e --exclude *cache ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
and this:
sudo rsync -avzO -e --exclude cache/ ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
and this:
sudo rsync -avzO -e --exclude */cache/ ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
and this:
sudo rsync -avzO -e --exclude *cache/ ssh username@11.22.33.44:/home/ /Users/username/webserver-backups/DEV/home/
Sorry if this is easy, I just haven't been able to find info that I understand because they all talk about a path to exclude.
It's just that I don't have a specific path I want to exclude - just a directory name if that makes sense.
rsync --exclude cache/ ....
should work like peaches. I think you might be confusing some things, since -e requires an argument (like -e "ssh -l ssh-user"). Edit: on looking at your command lines a little closer, it turns out this is exactly your problem. In your commands, -e swallows --exclude as its argument. You should have said
--exclude cache/ -e ssh
although you could just drop -e ssh since ssh is the default.
I'd also recommend that you look at the filter rules:
rsync -FF ....
That way you can include .rsync-filter files throughout your directory tree, containing things like
- cache/
This makes things far more flexible, makes command lines more readable, and lets you make exceptions inside specific subtrees.
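To see the pattern's effect without touching the server, you can try it on two local directories first. A small sketch, assuming local paths; the function name backup_without_cache is hypothetical.

```shell
#!/bin/sh
# Copy SRC to DEST, skipping every directory named "cache" at any depth.
# The trailing slash makes the pattern match directories only, so a regular
# file such as cachefile.txt is still copied.
# Usage: backup_without_cache SRC/ DEST/   (hypothetical helper name)
backup_without_cache() {
    src=$1
    dest=$2
    rsync --archive --exclude='cache/' "$src" "$dest"
}
```

Quoting the pattern keeps the shell from expanding any wildcards before rsync sees them. For the remote case the same options apply, with -e ssh (if used at all) placed so that ssh directly follows -e.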

Using rsync to delete a single file

File foo.txt exists on the remote machine at: /home/user/foo.txt
It doesn't exist on the local machine.
I want to delete foo.txt using rsync.
I do not know (and assume for the purposes of this question that I cannot find out) what other files are in /home/user on either the local or remote machines, so I can't just sync the whole directory.
What rsync command can I use to delete foo.txt on the remote machine?
Try this:
rsync -rv --delete --include=foo.txt '--exclude=*' /home/user/ user@remote:/home/user/
(highly recommend running with --dry-run first to test it) Although it seems like it would be easier to use ssh...
ssh user@remote "rm /home/user/foo.txt"
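The include/exclude trick works because excluded files on the receiving side are protected from --delete, so only the one included name is eligible for removal. Here is a sketch wrapping it in a function so extra options such as --dry-run can be passed through; the name delete_in_dest is made up for illustration.

```shell
#!/bin/sh
# Delete NAME from DEST if it is absent from SRC, leaving every other file
# on DEST alone: the '*' exclude protects all other names from --delete.
# Usage: delete_in_dest NAME SRC/ DEST/ [extra rsync options]
# (hypothetical helper name)
delete_in_dest() {
    name=$1
    src=$2
    dest=$3
    shift 3
    rsync --recursive --delete --include="$name" --exclude='*' "$@" "$src" "$dest"
}
```

For example, delete_in_dest foo.txt /home/user/ user@remote:/home/user/ --dry-run -v previews the deletion; drop --dry-run to perform it.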
That's a bit trivial, but if, like me, you came to this page looking for a way to delete the content of a directory from remote server using rsync, this is how I did it:
Create an empty mock folder:
mkdir mock
Sync with it:
rsync -arv --delete --dry-run ~/mock/ remote_server:~/dir_to_clean/
Remove --dry-run from the line above to actually do the thing.
As suggested above, use --dry-run to test prior. --delete deletes files on the remote location per the rsync man page.
rsync -rv --delete user@hostname.local:full/path/to/foo.txt
Comment below stating this will list only is incorrect. To list only use --list-only and remove --delete.
Just came across the same problem, needed to use rsync to delete a remote file, as only rsync and no other SSH commands were allowed. The --remove-source-files option (formerly known as --remove-sender-files) did exactly that:
rsync -avPn --remove-source-files remote:/home/user/foo.txt .
rm foo.txt
As always, remove the -n option to really execute this.
