Rsync all files (recursively) from one dir to another, maintaining only a portion of the original dir structure

I have two directories:
Directory #1, 'C'
C's absolute path:
/A/B/C
Directory #2, 'T'
T's absolute path:
/Q/R/T
I want to use rsync to copy all files, recursively, from C into T, while maintaining the original directory structure, but only from B onwards.
Example to make it clearer: suppose 'B' has only 3 files nested within it:
/A/B/f1.txt
/A/B/C/f2.txt
/A/B/C/D/f3.txt
Then I want to end up with only f2.txt and f3.txt being copied over, with the final filepaths as follows (notice how I keep the directory structure, only from B onwards):
/Q/R/T/B/C/f2.txt
/Q/R/T/B/C/D/f3.txt
Here is the catch: I must execute the rsync cmd from within /Q/R/. So when I execute this command, my pwd must be /Q/R/.
Can anyone help me figure out how to do this?
[If I did not have this constraint on where my cwd must be, I could cd to /A/B and then execute: rsync . /Q/R/T/ --recursive --relative. Unfortunately, I cannot do that, for reasons that would take a lot of pointless explaining here. And when I try to execute rsync /A/. /Q/R/T/ --recursive --relative, I end up not only with everything within A, but also with the first part of the dir structure (/A/), which I don't want. (Note - in the real life scenario the dir structure is much more complex than this; this is just the general problem.)]

The rsync command includes a couple of options which are suitable for this scenario. They are:
--include=PATTERN - Don't exclude files matching PATTERN
--exclude=PATTERN - Exclude files matching PATTERN
An excellent description and examples of the --exclude flag can be found here.
Solution
Given the directory structure in your question, and with your pwd set to /Q/R/, running the following command will meet your requirement:
rsync ../../A/ T/ --recursive --include 'A/B/**' --exclude 'B/*.*'
Edit:
If you do want /A/B/f1.txt to be copied to /Q/R/T/B/f1.txt (your question is unclear on this, as the file does not appear in the "I want to end up with" example), omit the --exclude B/*.* part, so that the complete command is reduced to:
rsync ../../A/ T/ --recursive --include 'A/B/**'
or reduced even further in complexity to just:
rsync ../../A/** T/ --recursive
Explanation of the command
../../A/
The first argument provides the path to the source directory, i.e. its relative position within the hierarchical tree of names (based on your pwd being /Q/R).
T/
The second argument provides the path to the destination directory. Again this is a relative position within the hierarchical tree of names (and is also based on the pwd being /Q/R).
--recursive
The first option is to recurse into the directories.
--include A/B/**
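This says that you want to include all the assets (files/folders), however many levels deep, from within the folder named B which resides inside folder A. Note that rsync matches filter patterns against paths relative to the transfer root, which here is the contents of A (for example B/C/f2.txt), so this pattern may not actually match anything. rsync includes whatever no rule matches, so the files under B are copied either way.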
--exclude B/*.*
This says that you want to exclude any assets (files/folders), whose name includes a dot [.] plus extension, which reside inside folder B (at the top level). This will prevent the file named f1.txt from being copied. You could be even more specific here and use --exclude B/f1.txt instead, however I'm assuming in real life you perhaps have additional files you want to exclude here too.
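As an alternative to the include/exclude approach explained above, rsync's --relative option honours a /./ marker in the source path: only the part of the path after the marker is recreated at the destination. The sketch below is not the command from this answer. It rebuilds the example tree from the question in a temporary directory (the demo variable and the temporary location are assumptions for illustration) and runs rsync from inside Q/R.

```shell
#!/bin/sh
# Recreate the example layout from the question under a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/A/B/C/D" "$demo/Q/R/T"
touch "$demo/A/B/f1.txt" "$demo/A/B/C/f2.txt" "$demo/A/B/C/D/f3.txt"

# The command must be run from Q/R, as the question requires.
cd "$demo/Q/R" || exit 1

# With --relative, only the path after the /./ marker (B/C/...) is
# recreated under T/. f1.txt is not below B/C, so it is not transferred.
rsync --recursive --relative "$demo/A/./B/C/" T/

# List what arrived in T.
find T -type f | sort
```

If this behaves as expected, the listing should show only f2.txt and f3.txt, under T/B/C and T/B/C/D respectively.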
Additional notes
Both the --include and --exclude options can be utilized multiple times. This can be very useful for some scenarios too as it enables you to be specific about what to include and/or exclude during the copy process.
For example, let's assume that your source directory /A/B/ (as described in your question) also contains a folder named X, so its path is /A/B/X.
Let's say that we also do not want to copy this folder named X (in the same way that you currently do not want to copy /A/B/f1.txt).
For this scenario we add another --exclude option as follows:
rsync ../../A/ T/ --recursive --include 'A/B/**' --exclude 'B/*.*' --exclude 'X/'
Note the additional --exclude X/ at the end.
You mention...
(Note - in the real life scenario the dir structure is much more complex than this, this is just the general problem.)
... in your question, so you may find it necessary to add additional --exclude=PATTERN to truly meet your requirements.
Grunt
As you have tagged your question with gruntjs, you may want to consider plug-ins that can run shell commands such as rsync, for example:
grunt-shell
grunt-exec

Related

How to make a single makefile that applies the same command to sub-directories?

For clarity, I am running this on windows with GnuWin32 make.
I have a set of directories with markdown files in at several different levels - theoretically they could be in the branch nodes, but I think currently they are only in the leaf nodes. I have a set of pandoc/LaTeX commands to run to turn the markdown files into PDFs - and obviously only want to recreate the PDFs if the markdown file has been updated, so a makefile seems appropriate.
What I would like is a single makefile in the root, which iterates over any and all sub-directories (to any depth) and applies the make rule I'll specify for running pandoc.
From what I've been able to find, recursive makefiles require you to have a makefile in each sub-directory (which seems like an administrative overhead that I would like to avoid) and/or require you to list out all the sub-directories at the start of the makefile (again, would prefer to avoid this).
Theoretical folder structure:
root
|-make
|-Folder AB
| |-File1.md
| \-File2.md
|-Folder C
| \-File3.md
\-Folder D
  |-Folder E
  | \-File4.md
  \-Folder F
    \-File5.md
How do I write a makefile to deal with this situation?

Here is a small set of Makefile rules that will hopefully get you going:
%.pdf : %.md
	pandoc -o $@ --pdf-engine=xelatex $^
PDF_FILES=FolderA/File1.pdf FolderA/File2.pdf \
FolderC/File3.pdf FolderD/FolderE/File4.pdf FolderD/FolderF/File5.pdf
all: ${PDF_FILES}
Let me explain what is going on here. First we have a pattern rule that tells make how to convert a Markdown file to a PDF file. The --pdf-engine=xelatex option is here just for the purpose of illustration.
Then we need to tell Make which files to consider. We put the names together in a single variable, PDF_FILES. The value of this variable can be built by a separate script that scans all subdirectories for .md files.
Note that one has to be extra careful if filenames or directory names contain spaces.
Then we ask Make to check if any of the PDF_FILES should be updated.
If you have other targets in your makefile, make sure that all is the first non-pattern target, or call make as make all.
Updating the Makefile
If the shell function works for you and basic utilities such as sed and find are available, you could make your makefile dynamic with a single line:
%.pdf : %.md
	pandoc -o $@ --pdf-engine=xelatex $^
PDF_FILES:=$(shell find . -name "*.md" | xargs echo | sed 's/\.md/\.pdf/g' )
all: ${PDF_FILES}
MadScientist suggested just that in the comments.
Otherwise you could implement a script using the tools available on your operating system and add an additional target update: that would compute the list of files and replace the line starting with PDF_FILES with an updated list of files.
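The "update" idea could be sketched roughly as the shell function below. The function name update_pdf_files is hypothetical, and the sketch assumes that the PDF_FILES assignment sits on a single line (no backslash continuation) and that no file or directory name contains whitespace.

```shell
#!/bin/sh
# update_pdf_files MAKEFILE ROOT  (hypothetical helper)
# Rewrites the PDF_FILES line of MAKEFILE so that it lists one .pdf
# target for every .md file found below ROOT.
update_pdf_files() {
    makefile=$1
    root=$2

    # Collect the .md files relative to ROOT and swap the extension.
    list=$(cd "$root" && find . -name '*.md' | sort |
           sed -e 's|^\./||' -e 's/\.md$/.pdf/' | tr '\n' ' ')

    # Replace only the line that assigns PDF_FILES; copy every other
    # line through unchanged, then move the result into place.
    awk -v list="$list" '
        /^PDF_FILES *:?=/ { print "PDF_FILES=" list; next }
        { print }
    ' "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}

# Example call, from the directory that holds the makefile:
#   update_pdf_files Makefile .
```

An update: target in the makefile could then simply call this script before all is rebuilt.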

The final version of the code that worked on Windows, based on @DmitiChubarov's and @MadScientist's suggestions, is as follows:
%.pdf: %.md
	pandoc $^ -o $@
PDF_FILES:=$(shell dir /s /b *.md | sed "s/\.md/\.pdf/g")
all: ${PDF_FILES}

rsync only folders and files in special sub directory

I have the following structure
I want to copy only folders, subfolders and files, which are located in "_bearbeitet".
I am trying it with the following options
--archive --hard-links --ignore-errors --force --exclude=* --include=/_bearbeitet

You have your rules in the wrong order, and your glob is too general.
Try this:
--include=/_bearbeitet --exclude='/*'
So altogether:
rsync -aH --ignore-errors --force --include=/_bearbeitet --exclude='/*' "$src" "$dest"
The rule is that, for each file, rsync will use the first include/exclude rule that matches, and will include anything that matches no rule.
So, first list what you want to include: /_bearbeitet matches the named directory at the top level only.
Then list what you want to exclude: /* matches all files and directories at the top level only. Note that * on its own would exclude all files and directories anywhere, including files and directories inside an explicitly included directory.
You should also take care to put quotes around * in patterns or else the shell will expand them before calling rsync, which is not what you want.
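To see the first-match rule in action, here is a small sketch that builds a throwaway source tree and applies the two filter rules from this answer. The directory and file names other than _bearbeitet are made up for the demonstration, and the source is given with a trailing slash so that the patterns are anchored at its contents.

```shell
#!/bin/sh
# Throwaway source and destination directories.
src=$(mktemp -d)
dest=$(mktemp -d)

# One directory we want, plus a directory and a file we do not want.
mkdir -p "$src/_bearbeitet/sub" "$src/other"
touch "$src/_bearbeitet/sub/keep.txt" "$src/other/skip.txt" "$src/top.txt"

# The include rule comes first, so /_bearbeitet is matched before the
# catch-all exclude. Everything inside _bearbeitet matches no rule at
# all and is therefore included by default.
rsync -aH --include='/_bearbeitet' --exclude='/*' "$src/" "$dest/"

# Show what was transferred.
find "$dest" -type f
```

If the rules work as described, only the file below _bearbeitet should appear in the listing.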

Makefile rule depend on directory content changes

Using Make, is there a nice way to depend on a directory's contents?
Essentially I have some generated code which the application code depends on. The generated code only needs to change if the contents of a directory changes, not necessarily if the files within change their content. So if a file is removed or added or renamed I need the rule to run.
My first thought is generate a text file listing of the directory and diff that with the last listing. A change means rerun the build. I think I will have to pass off the generate and diff part to a bash script.
I am hoping someone, in their infinite intelligence, might have an easier solution.

Kudos to gjulianm, who got me on the right track. His solution works perfectly for a single directory.
To get it working recursively I did the following.
ASSET_DIRS = $(shell find ../../assets/ -type d)
ASSET_FILES = $(shell find ../../assets/ -type f -name '*')
codegen: ../../assets/ $(ASSET_DIRS) $(ASSET_FILES)
	generate-my-code
It appears now any changes to the directory or files (add, delete, rename, modify) will cause this rule to run. There is likely some issue with file names here (spaces might cause issues).

Let's say your directory is called dir, then this makefile will do what you want:
FILES = $(wildcard dir/*)
codegen: dir # Add $(FILES) here if you want the rule to run on file changes too.
	generate-my-code
As the comment says, you can also add the FILES variable if you want the code to depend on file contents too.

A disadvantage of having the rule depend on a directory is that any change to that directory will cause the rule to be out-of-date — including creating generated files in that directory. So unless you segregate source and target files into different directories, the rule will trigger on every make.
Here is an alternative approach that allows you to specify a subset of files for which additions, deletions, and changes are relevant. Suppose for example that only *.foo files are relevant.
# replace indentation with tabs if copy-pasting
.PHONY: codegen
codegen:
	find . -name '*.foo' |sort >.filelist.new
	diff .filelist.current .filelist.new || cp -f .filelist.new .filelist.current
	rm -f .filelist.new
	$(MAKE) generate

generate: .filelist.current $(shell cat .filelist.current)
	generate-my-code

.PHONY: clean
clean:
	rm -f .filelist.*
The second line in the codegen rule ensures that .filelist.current is only modified when the list of relevant files changes, avoiding false-positive triggering of the generate rule.
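The same "only touch the list when it changes" step can be written as a standalone shell function, which may be easier to test outside of make. This is a sketch under assumptions: the function name refresh_filelist is made up, and it uses cmp where the makefile above uses diff.

```shell
#!/bin/sh
# refresh_filelist DIR PATTERN  (hypothetical helper)
# Updates DIR/.filelist.current only when the set of file names matching
# PATTERN has changed, so its timestamp stays untouched otherwise.
refresh_filelist() {
    dir=$1
    pattern=$2
    (
        cd "$dir" || exit 1
        find . -name "$pattern" | sort > .filelist.new
        if cmp -s .filelist.current .filelist.new 2>/dev/null; then
            # Same list as before: discard the new copy.
            rm -f .filelist.new
        else
            # First run, or a file was added, removed or renamed.
            mv -f .filelist.new .filelist.current
        fi
    )
}

# Example call:
#   refresh_filelist . '*.foo'
```

Because .filelist.current keeps its old modification time when nothing was added, removed or renamed, a rule that depends on it is not retriggered needlessly.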

Make rsync exclude all directories that contain a file with a specific name

I would like rsync to exclude all directories that contain a file with a specific name, say ".rsync-exclude", independent of the contents of the ".rsync-exclude" file.
If the file ".rsync-exclude" contained just "*", I could use rsync -r SRC DEST --filter='dir-merge,- .rsync-exclude'.
However, the directory should be excluded independent of the contents of the ".rsync-exclude" file (it should at least be possible to leave the ".rsync-exclude" file empty).
Any ideas?

rsync does not support this (at least the manpage does not mention anything), but you can do it in two steps:
run find to find the .rsync-exclude files
pipe this list to --exclude-from (or use a temporary file)
--exclude-from=FILE
This option is related to the --exclude option, but it specifies a FILE that contains exclude patterns
(one per line). Blank lines in the file and lines starting with ';' or '#' are ignored. If FILE is -,
the list will be read from standard input.
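The two steps could be combined roughly as follows. This is a sketch, not a tested production script: the function name sync_without_marked is hypothetical, -mindepth is assumed to be available (GNU and BSD find support it), and directory names are assumed to contain no newlines or rsync wildcard characters such as * ? [.

```shell
#!/bin/sh
# sync_without_marked SRC DEST  (hypothetical helper)
# Copies SRC to DEST, skipping every directory below SRC that contains
# a file named .rsync-exclude, whatever that file's contents are.
sync_without_marked() {
    src=$1
    dest=$2
    # Step 1: find the marker files (ignoring one in SRC itself) and turn
    #         "./some/dir/.rsync-exclude" into the anchored directory
    #         pattern "/some/dir/".
    # Step 2: feed the patterns to rsync on standard input.
    ( cd "$src" && find . -mindepth 2 -type f -name .rsync-exclude ) |
        sed -e 's|^\.||' -e 's|/\.rsync-exclude$|/|' |
        rsync -r --exclude-from=- "$src/" "$dest/"
}

# Example call:
#   sync_without_marked /path/to/SRC /path/to/DEST
```

The leading slash anchors each pattern at the top of the transfer, so a directory with the same name elsewhere in the tree is not excluded by accident.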
Alternatively, if you do not mind putting something in the files, you can use:
-F
The -F option is a shorthand for adding two --filter rules to your command. The first time it is used is a shorthand for this rule:
--filter='dir-merge /.rsync-filter'
This tells rsync to look for per-directory .rsync-filter files that have been sprinkled through the hierarchy and use their rules to filter the files in the transfer. If -F is repeated, it is a shorthand for this rule:
--filter='exclude .rsync-filter'
This filters out the .rsync-filter files themselves from the transfer.
See the FILTER RULES section for detailed information on how these options work.

Old question, but I had the same one..
You can add the following filter:
--filter="dir-merge,n- .rsync-exclude"
Now you can place a .rsync-exclude file in any folder and write the names of the files and folders you want to exclude, one per line. For example:
#.rsync-exclude file
folderYouWantToExclude
allFilesThatStartWithXY*
someSpecialImage.png
So you can use patterns in there too.
What you can't do is:
#.rsync-exclude file
folder/someFileYouWantToExlude
Hope it helps! Cheers
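A quick way to try this filter is on a scratch directory. The sketch below invents a small tree (the names project, cache, docs and XYnotes.txt are purely illustrative) and puts two patterns into a .rsync-exclude file.

```shell
#!/bin/sh
# Scratch source and destination.
src=$(mktemp -d)
dest=$(mktemp -d)

mkdir -p "$src/project/cache" "$src/project/docs"
touch "$src/project/cache/tmp.bin" \
      "$src/project/docs/readme.txt" \
      "$src/project/XYnotes.txt"

# Per-directory exclude list: one name or pattern per line.
printf '%s\n' 'cache' 'XY*' > "$src/project/.rsync-exclude"

# dir-merge reads .rsync-exclude in every directory it visits; the "-"
# modifier treats each line as an exclude, and "n" keeps the rules from
# being inherited by subdirectories.
rsync -r --filter='dir-merge,n- .rsync-exclude' "$src/" "$dest/"

find "$dest" | sort
```

Note that the .rsync-exclude file itself is still copied with this filter.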

rsync -avz --exclude 'dir' /source /destination

Can I symlink multiple directories into one?

I have a feeling that I already know the answer to this one, but I thought I'd check.
I have a number of different folders:
images_a/
images_b/
images_c/
Can I create some sort of symlink such that this new directory has the contents of all those directories? That is this new "images_all" would contain all the files in images_a, images_b and images_c?

No. You would have to symbolically link all the individual files.
What you could do is create a job to run periodically which removes all of the existing symbolic links in images_all, then re-creates the links for all files from the three other directories. It's a bit of a kludge, but something like this:
rm -f images_all/*
for i in images_[abc]/* ; do ln -s "../$i" "images_all/$(basename "$i")" ; done
Note that, while this job is running, it may appear to other processes that the files have temporarily disappeared.
You will also need to watch out for the case where a single file name exists in two or more of the directories.
Having come back to this question after a while, it also occurs to me that you can minimise the time during which the files are not available.
If you create the links in a different directory first and then do relatively fast mv operations, that would minimise the time. Something like:
mkdir images_new
for i in images_[abc]/* ; do
ln -s "../$i" "images_new/$(basename "$i")"
done
# These next two commands are the minimal-time switchover.
mv images_all images_old
mv images_new images_all
rm -rf images_old
I haven't tested that so anyone implementing it will have to confirm the suitability or otherwise.
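For the duplicate-name caveat mentioned above, a variant that skips (and reports) names that already exist in the merged directory might look like the sketch below. The function name merge_links is made up, and the "first directory wins" policy is an assumption. Absolute link targets are used so that the links keep working wherever the merged directory is moved.

```shell
#!/bin/sh
# merge_links TARGET DIR...  (hypothetical helper)
# Creates TARGET and fills it with symlinks to every entry of each DIR.
# If a name occurs in more than one DIR, the first one wins and the
# later ones are reported on stderr.
merge_links() {
    target=$1
    shift
    mkdir -p "$target"
    for dir in "$@"; do
        abs=$(cd "$dir" && pwd) || continue
        for f in "$dir"/*; do
            [ -e "$f" ] || continue            # empty directory: nothing to link
            name=$(basename "$f")
            if [ -e "$target/$name" ] || [ -L "$target/$name" ]; then
                echo "skipping duplicate name: $f" >&2
                continue
            fi
            ln -s "$abs/$name" "$target/$name"
        done
    done
}

# Example call:
#   merge_links images_all images_a images_b images_c
```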

You could try a unioning file system like unionfs!
http://www.filesystems.org/project-unionfs.html
http://aufs.sourceforge.net/

To add to paxdiablo's great answer, I think you could use cp -s
(-s or --symbolic-link),
which makes symbolic links instead of copying,
to maybe speed up or simplify the bulk adding of symlinks to the "merge" folder A of the files from folders B and C.
(I have not tested this, though.)
I can't recall off the top of my head, but I'm sure there is some option for cp to NOT overwrite existing files, so that only symlinks for new files would be created by cp -s.
