rsync - Why is there an --include-from and an --exclude-from option?

The rsync man page says that...
--include-from=FILE: "specifies a FILE that contains include patterns.."
Likewise that...
--exclude-from=FILE: "specifies a FILE that contains exclude patterns.."
But I have also read examples on the internet where the same FILE can contain both patterns to include (if preceded by a +) and patterns to exclude (if preceded by a -).
Example:
+ testfile.txt
- testfile*
I know that order is important so here, if testfile.txt is matched then it will be copied to a destination dir but testfile[anything else] will not be copied.
If FILE can contain patterns to include and to exclude then why does rsync have both options --include-from=FILE and --exclude-from=FILE? It seems to me they both do the same thing. It's the order of matches (preceded by a + or a -) that determines if the file is copied or not.
Can someone give me an example of why you would use the --include-from option instead of the --exclude-from option? Can they both be used together? How would you?
Cheers,
Flex
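As far as I understand the rsync man page, the two options differ only in the default applied to a line that has no "+ " or "- " prefix: such a line is an include rule when read via --include-from and an exclude rule when read via --exclude-from. Both options add to the same ordered rule list, so they can be combined. A minimal sketch, where the file names and the source/destination paths are placeholders I made up:

```shell
# Sketch with placeholder file names; nothing here comes from a real setup.
workdir=$(mktemp -d)

# No prefix needed: lines read via --include-from default to include rules.
printf '%s\n' 'testfile.txt' > "$workdir/include.txt"

# No prefix needed: lines read via --exclude-from default to exclude rules.
printf '%s\n' 'testfile*' > "$workdir/exclude.txt"

# The include file is named first on the command line, so its rule is
# checked before the exclude rule (placeholder paths, left commented out):
# rsync -av --include-from="$workdir/include.txt" \
#           --exclude-from="$workdir/exclude.txt" /path/to/src/ /path/to/dest/
```

With that ordering, testfile.txt should be kept and other testfile* names skipped, the same outcome as the single-file "+ / -" example in the question.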

Related

Zsh glob: Get everything except stuff in a certain folder

Trying to find all files except those inside vendor/ folders, but why is this failing?
setopt extendedglob
for file in **/*~vendor/; do
done
See if this does what you're looking for:
setopt extendedglob
print -l ^vendor/**/*(.)
The *~ negation syntax usually needs parentheses in order to determine where the expression after the tilde ends. Your pattern is requesting all files and folders except those where the glob result name ends with vendor/. The glob result never includes the trailing slash, so you end up with all of the files and folders.
Adding parens will change the behavior of that pattern, but probably not in a useful way. This will result in a list of all of the directories where the last component is not vendor:
print -l **/(*~vendor)/
so x/y, x/y/vendor/z, and vendor/a will be included, but x/y/vendor will not.
The parentheses limit the 'not' pattern to just one piece of the path. In order to exclude matches at the top-level, the tested component needs to be at the front of the pattern:
print -l (*~vendor)/**/*
The very first pattern above uses the ^ syntax to produce the same results. The (.) glob qualifier in that pattern limits the globbing to plain files, so directories are not included.
Another variation that may be useful - this will exclude directories that have any component named vendor. It is similar to find -prune:
print -l (^vendor/)#*(.)
This will produce a list of all files except those in subdirectories with names like vendor/x, a/vendor and a/vendor/b.
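For comparison, here is roughly what the find -prune approach mentioned above might look like outside of zsh. The function name list_non_vendor is made up for illustration, and this is only a sketch of the idea:

```shell
# List regular files under a start directory (default: .), skipping any
# directory named "vendor" and everything below it.
list_non_vendor() {
    find "${1:-.}" -type d -name vendor -prune -o -type f -print
}

# Usage: list_non_vendor .
```

The -prune action stops find from descending into a matching directory, which is why files under a/vendor/b never show up.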

Make rsync exclude all directories that contain a file with a specific name

I would like rsync to exclude all directories that contain a file with a specific name, say ".rsync-exclude", independent of the contents of the ".rsync-exclude" file.
If the file ".rsync-exclude" contained just "*", I could use rsync -r SRC DEST --filter='dir-merge,- .rsync-exclude'.
However, the directory should be excluded independent of the contents of the ".rsync-exclude" file (it should at least be possible to leave the ".rsync-exclude" file empty).
Any ideas?
rsync does not support this (at least the manpage does not mention anything), but you can do it in two steps:
1. Run find to locate the .rsync-exclude files and turn each hit into the directory that contains it.
2. Pipe this list to --exclude-from (or use a temporary file).
--exclude-from=FILE
This option is related to the --exclude option, but it specifies a FILE that contains exclude patterns
(one per line). Blank lines in the file and lines starting with ';' or '#' are ignored. If FILE is -,
the list will be read from standard input.
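The two steps above could be sketched like this. The helper name marker_dirs and the paths are hypothetical, -mindepth is a GNU/BSD find extension rather than POSIX, and the sketch assumes directory names contain no newlines or rsync wildcard characters (*, ?, [):

```shell
# Print one anchored exclude pattern (e.g. "/a/b/") for every directory
# below the source root that contains the marker file.
# $1 = source root, $2 = marker file name
marker_dirs() {
    ( cd "$1" && find . -mindepth 2 -type f -name "$2" ) |
        sed -e 's|^\.||' -e 's|/[^/]*$|/|'
}

# Usage with placeholder paths (left commented out):
# marker_dirs /path/to/src .rsync-exclude |
#     rsync -r --exclude-from=- /path/to/src/ /path/to/dest/
```

The trailing slash on the source matters here: the patterns are anchored to the transfer root, so they line up with paths relative to /path/to/src/.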
Alternatively, if you do not mind putting something in the files, you can use:
-F The -F option is a shorthand for adding two --filter rules to your command. The first time it is used
is a shorthand for this rule:
--filter='dir-merge /.rsync-filter'
This tells rsync to look for per-directory .rsync-filter files that have been sprinkled through the
hierarchy and use their rules to filter the files in the transfer. If -F is repeated, it is a
shorthand for this rule:
--filter='exclude .rsync-filter'
This filters out the .rsync-filter files themselves from the transfer.
See the FILTER RULES section for detailed information on how these options work.
Old question, but I had the same one..
You can add the following filter:
--filter="dir-merge,n- .rsync-exclude"
Now you can place a .rsync-exclude file in any folder and list the names of the files and folders you want to exclude, one per line. For example:
#.rsync-exclude file
folderYouWantToExclude
allFilesThatStartWithXY*
someSpecialImage.png
So you can use patterns in there too.
What you can't do is:
#.rsync-exclude file
folder/someFileYouWantToExclude
Hope it helps! Cheers
rsync -avz --exclude 'dir' /source /destination

With RSYNC, how do includes and excludes combine?

I want to rsync everything in /Volumes/B/, except for Cache directories, which I want to exclude globally. Also, I don't want to rsync any other directory under /Volumes/.
I have the following exclusion file:
+ /Volumes/B/***
- Cache/
- /Volumes/*
The first and third lines seem to work correctly, except that rsync also picks up all Cache dirs under /Volumes/B/... ( /Volumes/B/***/Cache/ )
What am I missing?
rsync reads the exclude file from top to bottom as it traverses the directories.
When it visited the Cache dirs, rsync acted on the first matching pattern.
The first matching pattern was "+ /Volumes/B/***", so Cache was included.
The rule is:
Put the rules for particular subdirectories first.
Here's a simple step-by-step explanation.
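Applying that rule to the file from the question, a reordered version might look like the following. This is only a sketch: the temporary file and the commented-out invocation are placeholders, and I have not tested it against the asker's actual source layout.

```shell
# Write the reordered rules to a temporary file: the Cache exclude now
# comes before the broad include, so it is the first rule that matches.
filter_file=$(mktemp)
cat > "$filter_file" <<'EOF'
- Cache/
+ /Volumes/B/***
- /Volumes/*
EOF

# Hypothetical invocation with placeholder source and destination:
# rsync -a --exclude-from="$filter_file" / /path/to/dest/
```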

Unix wildcard selectors? (Asterisks)

In Ryan Bates' Railscast about git, his .gitignore file contains the following line:
tmp/**/*
What is the purpose of using the double asterisks followed by an asterisk as such: **/*?
Would using simply tmp/* instead of tmp/**/* not achieve the exact same result?
Googling the issue, I found an unclear IBM article about it, and I was wondering if someone could clarify the issue.
It says to go into all the subdirectories below tmp, as well as just the content of tmp.
e.g. I have the following:
$ find tmp
tmp
tmp/a
tmp/a/b
tmp/a/b/file1
tmp/b
tmp/b/c
tmp/b/c/file2
matched output:
$ echo tmp/*
tmp/a tmp/b
matched output:
$ echo tmp/**/*
tmp/a tmp/a/b tmp/a/b/file1 tmp/b tmp/b/c tmp/b/c/file2
It is a default feature of zsh; to get it to work in bash 4, run:
shopt -s globstar
From http://blog.privateergroup.com/2010/03/gitignore-file-for-android-development/:
(kwoods)
"The double asterisk (**) is not a git thing per se, it’s really a Linux / Mac shell thing.
It would match on everything including any sub folders that had been created.
You can see the effect in the shell like so:
# ls ./tmp/* = should show you the contents of ./tmp (files and folders)
# ls ./tmp/** = same as above, but it would also go into each sub-folder and show the contents there as well."
According to the documentation of gitignore, this syntax is supported since git version 1.8.2.
Here is the relevant section:
Two consecutive asterisks (**) in patterns matched against full pathname may have special meaning:
A leading ** followed by a slash means match in all directories. For example, **/foo matches file or directory foo anywhere, the same as pattern foo. **/foo/bar matches file or directory bar anywhere that is directly under directory foo.
A trailing /** matches everything inside. For example, abc/** matches all files inside directory abc, relative to the location of the .gitignore file, with infinite depth.
A slash followed by two consecutive asterisks then a slash matches zero or more directories. For example, a/**/b matches a/b, a/x/b, a/x/y/b and so on.
Other consecutive asterisks are considered invalid.

To restrict a node from `tree` by Tree or Git

How can you restrict a node from the command tree?
#1
I need to give a tree of my project files to my supervisor regularly.
These files contain some third-party components which I do not want to show in the tree.
So far I have solved this problem by copying the project files to tmp, removing the third-party files, and then running tree.
However, this procedure is becoming cumbersome.
I would like a better way to give a tree of my files to my supervisor.
#2
The files I want to show are in Git, so Git may solve this problem.
I unsuccessfully ran
git ls-files --with-tree
You can specify the files you want to match and avoid using general patterns. From the tree manpage:
-P pattern
List only those files that match the wild-card pattern. Note: you must use the -a option to also consider those files beginning with a dot '.' for matching. Valid wildcard operators are '*' (any zero or more characters), '?' (any single character), '[...]' (any single character listed between brackets (optional - (dash) for character range may be used: ex: [A-Z]), and '[^...]' (any single character not listed in brackets) and '|' separates alternate patterns.
-I pattern
Do not list those files that match the wild-card pattern.
In your specific case, running
tree -I '3rd*'
should hide a directory called '3rd_party', including subdirs and files, while still allowing matches like 'party_3rd'. Obviously, other files and directories not containing '3rd' in the name will also display as normal. I've verified this behaviour with tree v1.5.2.1 on Linux.
You can put the third-party tools in a separate subdirectory.
Then you only have to eliminate one node.
Instead of changing the tree command it might be better to place the 3rd-party files in a sibling folder of, not in a child folder of, your own source.
