how to pick up new updated file from rsync log - rsync

I always use the rsync command below for backups.
$ rsync -auvvz source_dir destination_dir
I use the v option to increase verbosity.
In this case, however, the console output is so long that it is hard to pick out the lines for newly updated files.
For example:
hgoe/foo/foo/test1 is uptodate
hgoe/foo/foo/test2 <- I want to pick out lines like this.
hgoe/foo/foo/test3 is uptodate
How can I pick out just those lines?

Ugly but working hack,
rsync -auvvz source_dir destination_dir| grep -vE '(sending incremental file list|delta-transmission disabled| is uptodate|total: matches=| bytes/sec| speedup is |^$)'
grep -v corresponds to the --invert-match option,
while -E, to the --extended-regexp one
^$ removes a blank line,
the other patterns are all included in the default output for rsync -auvvz
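As a rough illustration of the filter above, the sketch below runs the same grep pattern over a few sample lines instead of a live rsync run. The function name filter_rsync_log and the sample text are made up for the example; the sample lines are modeled on the output shown in the question, not captured from a real transfer.

```shell
# Hypothetical helper: drop the rsync -vv status lines, keep everything else.
filter_rsync_log() {
    grep -vE '(sending incremental file list|delta-transmission disabled| is uptodate|total: matches=| bytes/sec| speedup is |^$)'
}

# Sample input modeled on the question (an assumption, not real rsync output).
sample_log='sending incremental file list
hgoe/foo/foo/test1 is uptodate
hgoe/foo/foo/test2
hgoe/foo/foo/test3 is uptodate
'

# Only the line for the transferred file should survive the filter.
changed=$(printf '%s\n' "$sample_log" | filter_rsync_log)
printf '%s\n' "$changed"
```

In real use the function would sit at the end of the pipeline: rsync -auvvz source_dir destination_dir | filter_rsync_log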

SCP issue with multiple files - UNIX

Getting error in copying multiple files. Below command is copying only first file and giving error for rest of the files. Can someone please help me out.
Command:
scp $host:$(ssh -n $host "find /incoming -mmin -120 -name 2018*") /incoming/
Result:
user@host:~/scripts/OTA$ scp $host:$(ssh -n $host "find /incoming -mmin -120 -name 2018*") /incoming/
Password:
Password:
2018084session_event 100% |**********************************************************************************************************| 9765 KB 00:00
cp: cannot access /incoming/2018084session_event_log.195-10.45.40.9
cp: cannot access /incoming/2018084session_event_log.195-10.45.40.9_2_3
Your command uses Command Substitution to generate a list of files. Your assumption is that there is some magic in the "source" notation for scp that would cause multiple members of the list generated by your find command to be assumed to live on $host, when in fact your command might expand into something like:
scp remotehost:/incoming/someoldfile anotheroldfile /incoming
Only the first file is being copied from $host, because none of the rest include $host: at the beginning of the path. They're not found in your local /incoming directory, hence the error.
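The expansion described above can be reproduced without scp or ssh. This is a minimal sketch in which the variable found stands in for the output of the remote find; both file names are placeholders.

```shell
host=remotehost
# Stand-in for the output of the remote find: two paths, one per line.
found='/incoming/someoldfile
/incoming/anotheroldfile'

# The unquoted expansion is word-split just as on the scp command line,
# so the "host:" prefix ends up attached to the first path only.
set -- $host:$found /incoming/
first=$1
second=$2

# Show each resulting argument on its own line.
printf '<%s>\n' "$@"
```

Only the first argument carries the remotehost: prefix; the second is a bare path that scp would look for locally.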
Oh, and in addition, you haven't escaped the asterisk in the find command, so 2018* may expand to multiple files that are in the login directory for the user in question. I can't tell from here; it depends on your OS and shell configuration.
I should point out that you are providing yet another example of the classic Parsing LS problem. Special characters WILL break your command. The "better" solution usually offered for this problem tends to be to use a for loop, but that's not really what you're looking for. Instead, I'd recommend making a tar of the files you're looking for. Something like this might do:
ssh "$host" "find /incoming -mmin -120 -name 2018\* -exec tar -cf - {} \+" |
tar -xvf - -C /incoming
What does this do?
ssh runs a remote find command with your criteria.
find feeds the list of filenames (regardless of special characters) to a tar command as arguments.
The tar command sends its result to stdout (-f -).
That output is then piped into another tar running on your local machine, which extracts the stream.
If your tar doesn't support -C, you can either remove it and run a cd /incoming before the ssh, or you might be able to replace that pipe segment with a curly-braced command: { cd /incoming && tar -xvf -; }
The curly brace notation assumes a POSIX-like shell (bash, zsh, etc). The rest of this should probably work equally well in csh if that's what you're stuck with.
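The same find-into-tar pipeline can be tried locally before pointing it at a remote host. The sketch below is an assumption-laden stand-in: two temporary directories play the roles of the remote and local /incoming, there is no ssh involved, and the file names are invented for the test.

```shell
# Local stand-ins for the remote and local /incoming directories.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/2018 with space.log" "$src/2018b.log" "$src/2017old.log"

# Same shape as the ssh pipeline: find hands the matching names to tar,
# tar writes the archive to stdout, and a second tar unpacks the stream
# in the destination directory.
( cd "$src" && find . -name '2018*' -exec tar -cf - {} + ) | tar -xf - -C "$dest"

ls "$dest"
```

The file name containing a space is there on purpose: it arrives intact because the names never pass through shell word splitting.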
Limited warranty: Best Effort Only. Untested on animals or computers. Your mileage may vary. May contain nuts.
If this doesn't work for you, poke at it until it does.

inotify and rsync on large number of files

I am using inotify to watch a directory and sync files between servers using rsync. Syncing works perfectly, and memory usage is mostly not an issue. However, a large number of files (350k) were recently added and this has impacted performance, specifically CPU. Now when rsync runs, CPU usage spikes to 90-100% and rsync takes a long time to complete; there are now 650k files being watched/synced.
Is there any way to speed up rsync and only rsync the directory that has been changed? Or alternatively to set up multiple inotifywaits on separate directories. Script being used is below.
UPDATE: I have added the --update flag and usage seems mostly unchanged
#! /bin/bash
EVENTS="CREATE,DELETE,MODIFY,MOVED_FROM,MOVED_TO"
inotifywait -e "$EVENTS" -m -r --format '%:e %f' /var/www/ --exclude '/var/www/.*cache.*' | (
    WAITING="";
    while true; do
        LINE="";
        read -t 1 LINE;
        if test -z "$LINE"; then
            if test ! -z "$WAITING"; then
                echo "CHANGE";
                WAITING="";
                rsync --update -alvzr --exclude '*cache*' --exclude '*.git*' /var/www/* root@secondwebserver:/var/www/
            fi;
        else
            WAITING=1;
        fi;
    done)
I ended up removing the compression option (z) and upping the WAITING var to 10 (seconds). This seems to have helped, rsync still spikes CPU load but it is shorter lived. Credit goes to an answer on unix stackexchange
You're using rsync to synchronize the root directory of a large tree, so I'm not surprised at the performance loss.
One possible solution is to only synchronize the changed files/directories, instead of the whole root directory.
For instance, suppose file1, file2 and file3 lie under from/dir. When changes are made to these 3 files, use
rsync --update -alvzr from/dir/file1 from/dir/file2 from/dir/file3 to/dir
rather than
rsync --update -alvzr from/dir/* to/dir
One caveat: rsync will not automatically create missing parent directories on the target. You can, however, use ssh to run a remote command that creates them first.
You may need to set up SSH public-key authentication as well, but judging from the rsync command line you pasted, I assume you've already done this.
reference:
rsync - create all missing parent directories?
rsync: how can I configure it to create target directory on server?
How to use SSH to run a shell script on a remote machine?
SSH error when executing a remote command: "stdin: is not a tty"
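One way to act on the suggestion above (sync only what changed) is to collect the changed paths into a list and hand that list to rsync. The sketch below is only an outline under several assumptions: the helper names to_relative_list and sync_changed are hypothetical, the sample events are invented, and the event lines are assumed to be full paths such as inotifywait prints with --format '%w%f'. Only the list-building part is exercised here; sync_changed is defined but not called.

```shell
# Hypothetical helper: turn absolute event paths (one per line on stdin)
# into a sorted, de-duplicated list relative to the watched root ($1).
# Note: the root is used as a sed pattern, so unusual characters need care.
to_relative_list() {
    sed "s|^$1/||" | LC_ALL=C sort -u
}

# Hypothetical helper: sync only the listed paths. --files-from implies
# --relative, so the listed path components are recreated on the target.
# $1 = list file, $2 = local root, $3 = rsync destination
sync_changed() {
    rsync -a --files-from="$1" "$2/" "$3"
}

# Sample events (invented for the example).
events='/var/www/site/index.php
/var/www/site/css/main.css
/var/www/site/index.php'

relative=$(printf '%s\n' "$events" | to_relative_list /var/www)
printf '%s\n' "$relative"
```

Deleted files would need separate handling, since a path that no longer exists on the source cannot be transferred from a list.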

How to rsync only a specific list of files?

I have about 50 or so files in various sub-directories that I'd like to push to a remote server. I figured rsync would be able to do this for me using the --include-from option. Without the --exclude="*" option, all the files in the directory are synced; with the option, none are.
rsync -avP -e ssh --include-from=deploy/rsync_include.txt --exclude=* ./ root@0.0.0.0:/var/www/ --dry-run
I'm running it as dry initially and 0.0.0.0 is obviously replaced by the IP of the remote server. The contents of rsync_include.txt is a new line separated list of relative paths to the files I want to upload.
Is there a better way of doing this that is escaping me on a Monday morning?
There is a flag --files-from that does exactly what you want. From man rsync:
--files-from=FILE
Using this option allows you to specify the exact list of files to transfer (as read from the specified FILE or - for standard input). It also tweaks the default behavior of rsync to make transferring just the specified files and directories easier:
The --relative (-R) option is implied, which preserves the path information that is specified for each item in the file (use --no-relative or --no-R if you want to turn that off).
The --dirs (-d) option is implied, which will create directories specified in the list on the destination rather than noisily skipping them (use --no-dirs or --no-d if you want to turn that off).
The --archive (-a) option’s behavior does not imply --recursive (-r), so specify it explicitly, if you want it.
These side-effects change the default state of rsync, so the position of the --files-from option on the command-line has no bearing on how other options are parsed (e.g. -a works the same before or after --files-from, as does --no-R and all other options).
The filenames that are read from the FILE are all relative to the source dir -- any leading slashes are removed and no ".." references are allowed to go higher than the source dir. For example, take this command:
rsync -a --files-from=/tmp/foo /usr remote:/backup
If /tmp/foo contains the string "bin" (or even "/bin"), the /usr/bin directory will be created as /backup/bin on the remote host. If it contains "bin/" (note the trailing slash), the immediate contents of the directory would also be sent (without needing to be explicitly mentioned in the file -- this began in version 2.6.4). In both cases, if the -r option was enabled, that dir’s entire hierarchy would also be transferred (keep in mind that -r needs to be specified explicitly with --files-from, since it is not implied by -a). Also note that the effect of the (enabled by default) --relative option is to duplicate only the path info that is read from the file -- it does not force the duplication of the source-spec path (/usr in this case).
In addition, the --files-from file can be read from the remote host instead of the local host if you specify a "host:" in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of ":" to mean "use the remote end of the transfer". For example:
rsync -a --files-from=:/path/file-list src:/ /tmp/copy
This would copy all the files specified in the /path/file-list file that was located on the remote "src" host.
If the --iconv and --protect-args options are specified and the --files-from filenames are being sent from one host to another, the filenames will be translated from the sending host’s charset to the receiving host’s charset.
NOTE: sorting the list of files in the --files-from input helps rsync to be more efficient, as it will avoid re-visiting the path elements that are shared between adjacent entries. If the input is not sorted, some path elements (implied directories) may end up being scanned multiple times, and rsync will eventually unduplicate them after they get turned into file-list elements.
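To make the man page excerpt more concrete, here is a small sketch that builds a sorted list of relative paths and passes it to --files-from. Everything in it is an assumption made for illustration: the directory layout, the file names, and the use of temporary directories. The rsync call is skipped when rsync is not installed.

```shell
# Throwaway source tree, destination, and list file.
src=$(mktemp -d)
dest=$(mktemp -d)
list=$(mktemp)
mkdir -p "$src/docs/sub" "$src/img"
touch "$src/docs/a.txt" "$src/docs/sub/b.txt" "$src/img/c.png" "$src/skip.me"

# Paths in the list are relative to the source directory; sorting them
# follows the NOTE in the man page excerpt above.
( cd "$src" && find docs img -type f | LC_ALL=C sort ) > "$list"
cat "$list"

# Transfer only the listed files (skipped if rsync is unavailable).
if command -v rsync >/dev/null 2>&1; then
    rsync -a --files-from="$list" "$src/" "$dest/"
fi
```

Because the list names individual files, -a alone is enough here; a list of directories would also need an explicit -r, as the excerpt points out.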
For the record, none of the answers above helped except for one. To summarize, you can do the backup operation using --files-from= by using either:
rsync -aSvuc `cat rsync-src-files` /mnt/d/rsync_test/
OR
rsync -aSvuc --recursive --files-from=rsync-src-files . /mnt/d/rsync_test/
The former command is self-explanatory, apart from the content of the file rsync-src-files, which I describe below. If you want to use the latter version, keep the following four remarks in mind:
You need to specify both --files-from and the source directory.
You need to specify --recursive explicitly.
The file rsync-src-files is a user-created file; for this test it was placed within the source directory.
rsync-src-files contains the files and folders to copy, taken relative to the source directory. IMPORTANT: make sure there are no trailing spaces or blank lines in the file. In the example below there are only two lines, not three (I figured this out by chance). The content of rsync-src-files is:
folderName1
folderName2
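Since stray whitespace in the list file caused trouble above, a list can be cleaned before it is handed to rsync. The helper name clean_list is hypothetical, and the sketch assumes that no listed name legitimately ends in whitespace.

```shell
# Hypothetical helper: strip trailing whitespace and drop blank lines.
clean_list() {
    sed -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Sample list with trailing spaces on the first entry and a blank line.
cleaned=$(printf 'folderName1  \n\nfolderName2\n' | clean_list)
printf '%s\n' "$cleaned"
```

A cleaned copy could be produced with something like clean_list < rsync-src-files > rsync-src-files.clean, where the output file name is again just an example.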
When using --files-from=, give / as the source directory if you want to keep absolute paths intact. Your command would then look something like this:
rsync -av --files-from=/path/to/file / /tmp/
This is useful when, for example, there are a large number of files and you want to copy them all to some path. You would find the files and write the output to a file, like this:
find /var/* -name '*.log' > file
$ date
Wed 24 Apr 2019 09:54:53 AM PDT
$ rsync --version
rsync version 3.1.3 protocol version 31
...
Syntax: rsync <args> <file_and_or_folder_list> <source_dir> <destination_dir/>
Folder names - WITH a trailing /; e.g. Cancer - Evolution/ - are provided in a file (e.g. my_folder_list):
# comment: /mnt/Vancouver/my_folder_list
# comment: 2019-04-24
some_file
another_file
Cancer/
Cancer - Evolution/
Cancer - Genomic Variants/
Cancer - Metastasis (EMT Transition ...)/
Cancer Pathways, Networks/
Catabolism - Autophagy; Phagosomes; Mitophagy/
so those are the "source" (files and/or) folders, to be rsync'd.
Note that if you don't include the trailing / shown above, rsync creates the target folders, but they are empty.
Those folder names provided in the <file_and_or_folder_list> are appended to the rest of their path: <src_dir> = /home/victoria/RESEARCH - NEWS (here, on a different partition), thus providing the complete folder path to rsync; e.g.: ... /home/victoria/RESEARCH - NEWS/Cancer - Evolution/ ...
[ I'm editing this answer some time later (2022-07), and I can't recall if the path provided to <src_dir> is /home/victoria/RESEARCH - NEWS or /home/victoria/RESEARCH - NEWS/ - providing the correct concatenated path. I believe it's the former; if it doesn't work, use the latter. ]
Note that you also need to use --files-from= ..., NOT --include-from= ...
Again the rsync syntax is:
rsync <args> <file_and_or_folder_list> <source_dir> <destination_dir/>
so,
rsync -aqP --delete --files-from=/mnt/Vancouver/my_folder_list "/home/victoria/RESEARCH - NEWS" $DEST_DIR/
where
<args> is -aqP --delete
<file_and_or_folder_list> is --files-from=/mnt/Vancouver/my_folder_list
<source_dir> is "/home/victoria/RESEARCH - NEWS"
<destination_dir/> is $DEST_DIR/ (note the trailing / added to the variable name)
In my BASH script, for coding flexibility I defined variable $DEST_DIR in two parts as follows.
BASEDIR="/mnt/Vancouver"
DEST_DIR=$BASEDIR/data
echo $DEST_DIR ## /mnt/Vancouver/data
## To clarify, here is $DEST_DIR with / appended to the variable name:
echo $DEST_DIR/ ## /mnt/Vancouver/data/
echo $DEST_DIR/apple/banana ## /mnt/Vancouver/data/apple/banana
However, you can more simply specify the destination path:
via a BASH variable: DEST_DIR=/mnt/Vancouver/data
note that in the rsync expression above, / is appended to $DEST_DIR (i.e. $DEST_DIR/ is actually $DEST_DIR + /), giving the destination directory path /mnt/Vancouver/data/
explicitly state the destination path: /mnt/Vancouver/data/
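The quoting and trailing-slash points above can be checked without running rsync at all. In this sketch the argument list is built with set -- and printed one argument per line; SRC_DIR is a variable name introduced only for the example.

```shell
BASEDIR="/mnt/Vancouver"
DEST_DIR="$BASEDIR/data"                    # no $ on the left of an assignment
SRC_DIR="/home/victoria/RESEARCH - NEWS"    # hypothetical variable name

# Build the argument list without running rsync. The quotes keep the
# source path, spaces included, as a single argument.
set -- rsync -aqP --delete --files-from=/mnt/Vancouver/my_folder_list "$SRC_DIR" "$DEST_DIR/"
argc=$#
last=$6

# Show each argument in brackets, one per line.
printf '[%s]\n' "$@"
```

Without the quotes around the source path, the shell would split it at the spaces and rsync would receive three separate arguments in its place.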
rsync options used: ## man rsync or rsync -h
-a : archive: equals -rlptgoD (no -H,-A,-X)
-r : recursive
-l : copy symlinks as symlinks
-p : preserve permissions
-t : preserve modification times
-g : preserve group
-o : preserve owner (super-user only)
-D : same as --devices --specials
-P : same as --partial --progress
-q : quiet (https://serverfault.com/questions/547106/run-totally-silent-rsync)
--delete
This tells rsync to delete extraneous files from the RECEIVING SIDE (ones
that AREN’T ON THE SENDING SIDE), but only for the directories that are
being synchronized. You must have asked rsync to send the whole directory
(e.g. "dir" or "dir/") without using a wildcard for the directory’s contents
(e.g. "dir/*") since the wildcard is expanded by the shell and rsync thus
gets a request to transfer individual files, not the files’ parent directory.
Files that are excluded from the transfer are also excluded from being
deleted unless you use the --delete-excluded option or mark the rules as
only matching on the sending side (see the include/exclude modifiers in the
FILTER RULES section). ...
Edit: atp's answer below is better. Please use that one!
You might have an easier time, if you're looking for a specific list of files, putting them directly on the command line instead:
# rsync -avP -e ssh `cat deploy/rsync_include.txt` root@0.0.0.0:/var/www/
This is assuming, however, that your list isn't so long that the command line length will be a problem and that the rsync_include.txt file contains just real paths (i.e. no comments, and no regexps).
None of these answers worked for me, when all I had was a list of directories. Then I stumbled upon the solution! You have to add -r to --files-from because -a will not be recursive in this scenario (who knew?!).
rsync -aruRP --files-from=directory.list . ../new/location
I had a similar task: rsync all files modified after a given date, but excluding some directories. It was difficult to build an all-in-one one-liner, so I divided the problem into smaller pieces.
Final solution:
find ~/sourceDIR -type f -newermt "DD MMM YYYY HH:MM:SS" | egrep -v "/\..|Downloads|FOO" > FileList.txt
rsync -v --files-from=FileList.txt ~/sourceDIR /Destination
First I use find -L ~/sourceDIR -type f -newermt "DD MMM YYYY HH:MM:SS". I tried adding a regex to the find command to exclude name patterns, but my flavor of Linux (Mint) seems not to understand negated regexes in find. I tried a number of regex flavors; none worked as desired.
So I ended up with egrep -v, which excludes patterns the easy way. My rsync does not copy directories like /.cache or /.config, plus some others I explicitly named.
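A self-contained variant of the find and egrep -v steps is sketched below. It departs from the commands above in two labeled ways: it uses the POSIX -newer test against a reference file as a portable stand-in for -newermt, and it runs find from inside the source directory so that the list holds paths relative to it, which is how the man page describes --files-from reading them. The directory layout is invented for the test.

```shell
# Throwaway source tree with two directories that should be filtered out.
src=$(mktemp -d)
mkdir -p "$src/.cache" "$src/Downloads" "$src/docs"
touch "$src/.cache/tmp" "$src/Downloads/big.iso" "$src/docs/report.txt"

# Reference file carrying the cut-off date (stand-in for -newermt).
ref=$(mktemp)
touch -t 200001010000 "$ref"

# Find newer files, drop dot-directories and named patterns, and strip
# the leading "./" so the list holds plain relative paths.
list=$(mktemp)
( cd "$src" && find . -type f -newer "$ref" ) |
    grep -Ev '/\..|Downloads|FOO' |
    sed 's|^\./||' |
    LC_ALL=C sort > "$list"

cat "$list"
# The list could then be passed on, for example:
#   rsync -v --files-from="$list" "$src/" /Destination
```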
This answer is not the direct answer for the question.
But it should help you figure out which solution fits best for your problem.
When analysing the problem you should activate the debug option -vv
Then rsync will output which files are included or excluded by which pattern:
building file list ...
[sender] hiding file FILE1 because of pattern FILE1*
[sender] showing file FILE2 because of pattern *

read input from a file and sync accordingly

I have a text file which contains the list of files and directories that I want to copy (one on a line). Now I want rsync to take this input from my text file and sync it to the destination that I provide.
I've tried playing around with "--include-from=FILE" and "--file-from=FILE" options of rsync but it is just not working
I also tried pre-fixing "+" on each line in my file but still it is not working.
I have tried coming with various filter PATTERNs as outlined in the rsync man page but it is still not working.
Could someone provide the correct syntax for this use case? I've tried the above on Fedora 15, RHEL 6.2 and Ubuntu 10.04 and none worked, so I am definitely missing something.
Many thanks.
There is more than one way to answer this question depending on how you want to copy these files. If your intent is to copy the file list with absolute paths, then it might look something like:
rsync -av --files-from=/path/to/files.txt / /destination/path/
...This would expect the paths to be relative to the source location of / and would retain the entire absolute structure under that destination.
If your goal is to copy all of those files in the list to the destination, without preserving any kind of path hierarchy (just a collection of files), then you could try one of the following:
# note this method might break if your file is too long and
# exceeds the maximum arg limit
rsync -av `cat /path/to/file` /destination/
# or get fancy with xargs to batch 200 of your items at a time
# with multiple calls to rsync
cat /path/to/file | xargs -n 200 -J % rsync -av % /destination/
Or a loop and copy:
# bash shell; reading line by line keeps names with spaces intact
while IFS= read -r f; do cp "$f" /dest/; done < /path/to/files.txt
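The xargs variant above relies on -J, which is a BSD xargs option. On systems without it, a similar batching effect can be had by passing the batch to a small sh -c wrapper. The sketch below is a dry run under stated assumptions: the helper name batch_rsync is hypothetical, it only prints the rsync commands it would run, and like any plain xargs use it assumes names without spaces or quotes.

```shell
# Hypothetical helper: print (not run) one rsync command per batch.
# $1 = list file, $2 = destination, $3 = batch size (default 200)
batch_rsync() {
    xargs -n "${3:-200}" sh -c 'dest=$1; shift; echo rsync -av "$@" "$dest"' _ "$2" < "$1"
}

# Sample list with three entries, batched two at a time.
list=$(mktemp)
printf '%s\n' a.txt b.txt c.txt > "$list"
batches=$(batch_rsync "$list" /destination/ 2)
printf '%s\n' "$batches"
```

Removing the echo inside the wrapper would turn the dry run into real transfers.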
Given a file listing $HOME/GET/bringemback containing
need/A
alsoneed/B
shouldget/C
cd $HOME/GET
run
rsync -av --files-from=./bringemback me@theremote:. $HOME/GET/collect
would get the files and drop them into $HOME/GET/collect
$HOME/GET/
    collect/
        need/A
        alsoneed/B
        shouldget/C
or so I believe.
rsync supports this natively:
rsync --recursive -av --files-from=/path/to/files.txt / /destination/path/

Rsync: provide a list of unsent files

The Rsync -u flag prevents the overwriting of modified destination files. How can I get a list of files that were not sent due to this flag? The -v flag will let me know which files were sent, but I would like to know which ones weren't.
From the rsync man page:
-i, --itemize-changes
Requests a simple itemized list of the changes that are being
made to each file, including attribute changes. This is exactly
the same as specifying --out-format='%i %n%L'. If you repeat
the option, unchanged files will also be output, but only if the
receiving rsync is at least version 2.6.7 (you can use -vv with
older versions of rsync, but that also turns on the output of
other verbose messages).
In my testing, the -ii option isn't working with rsync 3.0.8, but -vv is. Your mileage may vary.
You could also get substantially the same information by invoking rsync with --dry-run and --existing in the opposite direction. So if your regular transfer looked like this:
rsync --update --recursive local:/directory/ remote:/directory/
You would use:
rsync --dry-run --existing --recursive remote:/directory/ local:/directory/
but -vv or -ii is safer and less prone to misinterpretation.
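If the -vv route is taken, the skipped files still have to be picked out of the verbose output. The sketch below assumes that rsync reports a file skipped by --update with a line ending in " is newer"; that wording is an assumption to verify against your rsync version. The sample log is invented and the helper name skipped_by_update is hypothetical.

```shell
# Assumption: with -vv, files skipped by --update appear as "<path> is newer".
# Hypothetical helper: print just the paths from such lines.
skipped_by_update() {
    sed -n 's/ is newer$//p'
}

# Sample log (invented for the example, not captured from a real run).
sample_log='sending incremental file list
docs/a.txt is newer
docs/b.txt
img/c.png is uptodate'

skipped=$(printf '%s\n' "$sample_log" | skipped_by_update)
printf '%s\n' "$skipped"
```

In real use this would follow the transfer, along the lines of rsync -uvv ... | skipped_by_update.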
