How to prevent rsync from updating timestamps when file size is unchanged - rsync

rsync -av --size-only --include="*/" --include="*.jpeg" --exclude="*" ~/alg/temperature/ ~/alg/tmp/
I use the command above to sync some files, and I don't want anything to be updated, not even the timestamp, if the file size is the same.
The --size-only option does limit the transfer to files that changed in size,
but files with no change in size still get "touched" and have their timestamp updated, which is what I don't want.
How can I achieve this?

The -a option is equivalent to -rlptgoD. You need to remove the -t. -t tells rsync to transfer modification times along with the files and update them on the remote system.
You may want to try the -c option, which skips based on checksum, not mod-time & size. This is slower, but should do what you want.
So your line could be (by expanding a and replacing t with c):
rsync -rlpcgoDv --include="*/" --include="*.jpeg" --exclude="*" ~/alg/temperature/ ~/alg/tmp/

Related

Issues using rsync to migrate files to new server

I am trying to copy a directory full of directories and small files to a new server for an app migration. rsync is always my go-to tool for this type of migration, but this time it is not working as expected.
The directory has 174,412 files and is 136G in size. Based on this I created a 256G disk for them on the new server.
The issue is that when I rsync'd the files over to the new server, the new partition ran out of space before all files were copied.
I did some tests with a bigger destination disk on my test machine, and when it finishes the total size on the new disk is 272G.
time sudo rsync -avh /mnt/dotcms/* /data2/
sent 291.61G bytes received 2.85M bytes 51.75M bytes/sec
total size is 291.52G speedup is 1.00
df -h /data2
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/data2vg-data2lv 425G 272G 154G 64% /data2
The source is on a NAS and the new target is an XFS file system, so first I thought it might be a block size issue. But then I used the cp command and it copied the exact same size.
time sudo cp -av /mnt/dotcms/* /data
df -h /data2
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/data2vg-data2lv 425G 136G 290G 32% /data2
Why is rsync increasing the space used?
According to the documentation, dotcms makes use of hard links. So, you need to give rsync the -H option to preserve them. Note that GNU's cp -av preserves hard links, so it doesn't have this problem.
Other rsync options you should consider using include:
-H, --hard-links : preserve hard links
-A, --acls : preserve ACLs (implies --perms)
-X, --xattrs : preserve extended attributes
-S, --sparse : turn sequences of nulls into sparse blocks
--delete : delete extraneous files from destination dirs
This assumes you are running as root and that the destination is supposed to have the same users/groups as the source. If the users and groups are not the same, then @Cyrus' alternative command line using --numeric-ids may be more appropriate.

Rsync takes longer time “receiving incremental file list”

I am using rsync to copy files from a remote host to the local machine using a cron job. Every time, I need rsync to copy only new files from the remote host. But it is getting stuck at the line "receiving incremental file list" for a very long time. Below is the command I am using. Is there any other way I can speed up this rsync process?
rsync -avz --inplace --progress --delete -ahe ssh remoteuser@remotehost:/home/bin/dir1/data /home/bin/dir1
Have you tried with --delete-before, --delete-after or --delay-updates?
Some options require rsync to know the full file list, so these options
disable the incremental recursion mode. These include: --delete-before, --delete-after, --prune-empty-dirs, and --delay-updates. Because of this, the default delete mode when you specify --delete is now --delete-during when both ends of the connection are at least 3.0.0 (use --del or --delete-during to request this improved deletion mode explicitly). See also the --delete-delay option that is a better choice than using --delete-after.
(from: http://linux.die.net/man/1/rsync)

inotify and rsync on large number of files

I am using inotify to watch a directory and sync files between servers using rsync. Syncing works perfectly, and memory usage is mostly not an issue. However, recently a large number of files were added (350k) and this has impacted performance, specifically on CPU. Now when rsync runs, CPU usage spikes to 90%/100% and rsync takes long to complete, there are 650k files being watched/synced.
Is there any way to speed up rsync and only rsync the directory that has been changed? Or alternatively to set up multiple inotifywaits on separate directories. Script being used is below.
UPDATE: I have added the --update flag and usage seems mostly unchanged
#!/bin/bash
EVENTS="CREATE,DELETE,MODIFY,MOVED_FROM,MOVED_TO"
inotifywait -e "$EVENTS" -m -r --format '%:e %f' /var/www/ --exclude '/var/www/.*cache.*' | (
  WAITING="";
  while true; do
    LINE="";
    read -t 1 LINE;
    if test -z "$LINE"; then
      if test ! -z "$WAITING"; then
        echo "CHANGE";
        WAITING="";
        rsync --update -alvzr --exclude '*cache*' --exclude '*.git*' /var/www/* root@secondwebserver:/var/www/
      fi;
    else
      WAITING=1;
    fi;
  done)
I ended up removing the compression option (z) and upping the WAITING var to 10 (seconds). This seems to have helped, rsync still spikes CPU load but it is shorter lived. Credit goes to an answer on unix stackexchange
You're using rsync to synchronize the root directory of a large tree, so I'm not surprised at the performance loss.
One possible solution is to only synchronize the changed files/directories, instead of the whole root directory.
For instance, file1, file2 and file3 live under from/dir. When changes are made to these 3 files, use
rsync --update -alvzr from/dir/file1 from/dir/file2 from/dir/file3 to/dir
rather than
rsync --update -alvzr from/dir/* to/dir
But this has a potential pitfall: rsync won't create directories automatically if the target folders don't exist. However, you can use ssh to execute a remote command and create the directories yourself.
You may need to set up SSH public-key authentication as well, but judging by the rsync command line you pasted, I assume you've already done this.
reference:
rsync - create all missing parent directories?
rsync: how can I configure it to create target directory on server?
How to use SSH to run a shell script on a remote machine?
SSH error when executing a remote command: "stdin: is not a tty"

read input from a file and sync accordingly

I have a text file which contains the list of files and directories that I want to copy (one on a line). Now I want rsync to take this input from my text file and sync it to the destination that I provide.
I've tried playing around with the "--include-from=FILE" and "--files-from=FILE" options of rsync, but it is just not working.
I also tried prefixing "+" to each line in my file, but it is still not working.
I have tried coming up with various filter PATTERNs as outlined in the rsync man page, but it is still not working.
Could someone provide the correct syntax for this use case? I've tried the above on Fedora 15, RHEL 6.2 and Ubuntu 10.04 and none worked, so I am definitely missing something.
Many thanks.
There is more than one way to answer this question depending on how you want to copy these files. If your intent is to copy the file list with absolute paths, then it might look something like:
rsync -av --files-from=/path/to/files.txt / /destination/path/
...This would expect the paths to be relative to the source location of / and would retain the entire absolute structure under that destination.
If your goal is to copy all of those files in the list to the destination, without preserving any kind of path hierarchy (just a collection of files), then you could try one of the following:
# note this method might break if your file is too long and
# exceeds the maximum arg limit
rsync -av `cat /path/to/file` /destination/
# or get fancy with xargs to batch 200 of your items at a time
# with multiple calls to rsync
cat /path/to/file | xargs -n 200 -J % rsync -av % /destination/
Or a for-loop and copy:
# bash shell
for f in `cat /path/to/files.txt`; do cp $f /dest/; done
Given a file listing $HOME/GET/bringemback containing
need/A
alsoneed/B
shouldget/C
cd $HOME/GET
run
rsync -av --files-from=./bringemback me@theremote:. $HOME/GET/collect
would get the files and drop them into $HOME/GET/collect
$HOME/GET/
collect/
need/A
alsoneed/B
shouldget/C
or so I believe.
rsync supports this natively:
rsync --recursive -av --files-from=/path/to/files.txt / /destination/path/

Rsync: provide a list of unsent files

The Rsync -u flag prevents the overwriting of modified destination files. How can I get a list of files that were not sent due to this flag? The -v flag will let me know which files were sent, but I would like to know which ones weren't.
From the rsync man page:
-i, --itemize-changes
Requests a simple itemized list of the changes that are being
made to each file, including attribute changes. This is exactly
the same as specifying --out-format='%i %n%L'. If you repeat
the option, unchanged files will also be output, but only if the
receiving rsync is at least version 2.6.7 (you can use -vv with
older versions of rsync, but that also turns on the output of
other verbose messages).
In my testing, the -ii option isn't working with rsync 3.0.8, but -vv is. Your mileage may vary.
You could also get substantially the same information by invoking rsync with --dry-run and --existing in the opposite direction. So if your regular transfer looked like this:
rsync --update --recursive local:/directory/ remote:/directory/
You would use:
rsync --dry-run --existing --recursive remote:/directory/ local:/directory/
but -vv or -ii is safer and less prone to misinterpretation.