How to exclude hidden file ".htaccess" in rsync? - rsync

I want to exclude one special hidden file in just one special folder.
The command I used is:
rsync -a --delete \
--exclude='/absolute/path/to/webpage/folder1' \
--exclude='/absolute/path/to/webpage/backups' \
--exclude='/absolute/path/to/webpage/.htaccess' \
/absolute/path/to/webpage/ \
/absolute/path/to/copy_of_webpage &>/dev/null
rsync always overwrites my .htaccess.
Also I want to keep my .htpasswd and I thought about using wildcards like:
rsync -a --delete \
--exclude='/absolute/path/to/webpage/folder1' \
--exclude='/absolute/path/to/webpage/backups' \
--exclude='/absolute/path/to/webpage/.ht*' \
/absolute/path/to/webpage/ \
/absolute/path/to/copy_of_webpage &>/dev/null
But that doesn't work either.

You could exclude all .htaccess with --exclude '.htaccess'

Exclude the path relative to the source folder, not the absolute path.
If your root (as above) is:
/absolute/path/to/webpage/
and you wish to exclude:
/absolute/path/to/webpage/.htaccess
/absolute/path/to/webpage/backups
then you'll need to say:
--exclude='/.htaccess' --exclude='/backups'
Per the docs:
"/foo" would match a file called "foo" at... the "root of the transfer"

Related

How to download static website with WGET including its CSS, JS, Images in separate folders

The website loads its assets (JS, CSS, images, etc.) from another domain, and I am not able to download those assets at all.
Say the website is example.com and it includes assets from, say, assets.orange.com.
How do I tell wget to download those assets, save them into different folders (js, css, images), and convert the links in the downloaded HTML files?
I don't know what I am doing wrong or where to specify assets.orange.com in this command.
wget \
--mirror \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains example.com \
--no-parent \
example.com
Regarding "where to specify assets.orange.com in this command":
The wget manual gives the usage of --domains as:
-D domain-list
--domains=domain-list
where domain-list is a comma-separated list of domains, so to specify more than one you should write:
--domains=example.com,assets.orange.com
According to the wget manual, if you aim to download all the files that are necessary to properly display a given HTML page, you can use:
-p
--page-requisites
The manual notes that this "includes such things as inlined images, sounds, and referenced stylesheets."
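Putting the pieces together: as far as I know, --domains only restricts which hosts may be followed and does not by itself let wget leave the starting host, so --span-hosts (-H) is needed as well. The sketch below only assembles and prints the command instead of running it. Dropping --recursive (already implied by --mirror) and --no-clobber (which conflicts with the timestamping that --mirror turns on), and using --adjust-extension (the current name of --html-extension), are my own adjustments rather than part of the answer above.

```shell
#!/bin/sh
set -eu

# Host names are the placeholders used in the question.
site="example.com"
assets="assets.orange.com"

# Build the argument list. --span-hosts allows following links to $assets;
# --domains keeps wget from wandering to any other host.
set -- \
  --mirror \
  --page-requisites \
  --adjust-extension \
  --convert-links \
  --restrict-file-names=windows \
  --span-hosts \
  --domains="$site,$assets" \
  --no-parent \
  "$site"

# Print the command for review; run wget "$@" instead once it looks right.
cmd="wget $*"
echo "$cmd"
```

When spanning hosts, wget by default stores each host's files under a directory named after that host, not in per-type folders such as js or css, so sorting by file type would have to be a separate step.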

Rsync exclude correct syntax

I want to copy a directory from remote machine to local using rsync, but without some inner folders.
I'm using this command:
rsync -rave --exclude 'js' --exclude 'css' --exclude 'fonts' root@{IP}:/rem_dir1/rem_dir2/public /local_dir1/local_dir2/public
But the result is:
Unexpected remote arg: root@{IP}:/rem_dir1/rem_dir2/public
rsync error: syntax or usage error (code 1) at main.c(1361) [sender=3.1.2]
I'm sure remote root is correct. So the problem is in rsync command syntax.
What is the correct way to exclude several folders using rsync?
For example we have /public folder which contains dir1, dir2, dir3, dir4 and dir5. How to copy only dir1 and dir2 from /public?
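As with your other question, the -rave makes no sense: -e expects the remote shell command as its next argument, so here it swallows the first --exclude and the remaining arguments are misparsed. You want just -av.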
You can get fancy with include and exclude commands, but the easiest way to copy just two directories is just to list them:
rsync -av \
root@{IP}:/rem_dir1/rem_dir2/public/dir1 \
root@{IP}:/rem_dir1/rem_dir2/public/dir2 \
/local_dir1/local_dir2/public/
where \ is just line-continuation (so I can wrap the long line), and I deliberately only added / to the end of the destination path, not the source paths.

rsync - create all missing parent directories?

I'm looking for an rsync-like program which will create any missing parent directories on the remote side.
For example, if I have /top/a/b/c/d on one server and only /top/a exists on the remote server, I want to copy d to the remote server and have the b and c directories created as well.
The command:
rsync /top/a/b/c/d remote:/top/a/b/c
won't work because /top/a/b doesn't exist on the remote server. And if it did exist then the file d would get copied to the path /top/a/b/c.
This is possible to do with rsync using --include and --exclude switches, but it is very involved, e.g.:
rsync -v -r a dest:dir \
--include 'a/b' \
--include 'a/b/c' \
--include 'a/b/c/d' \
--include 'a/b/c/d/e' \
--exclude 'a/*' \
--exclude 'a/b/*' \
--exclude 'a/b/c/*' \
--exclude 'a/b/c/d/*'
will only copy a/b/c/d/e to dest:dir/a/b/c/d/e even if the intermediate directories have files. (Note - the includes must precede the excludes.)
Are there any other options?
You may be looking for
rsync -aR
for example:
rsync -a --relative /top/a/b/c/d remote:/
See also this trick from another question:
rsync -aq --rsync-path='mkdir -p /tmp/imaginary/ && rsync' file user@remote:/tmp/imaginary/
From http://www.schwertly.com/2013/07/forcing-rsync-to-create-a-remote-path-using-rsync-path/, but don't copy and paste from there; the syntax on that page is butchered.
The --rsync-path option names the command that is run on the remote side to start rsync, so it lets you run an arbitrary command (here mkdir -p) before the rsync executable itself.
As of version 3.2.3 (6 Aug 2020), rsync has a flag for this purpose.
From the rsync manual page (man rsync):
--mkpath create the destination's path component
I suggest that you enforce the existence of the directory manually:
ssh user@remote mkdir -p /top/a/b/c
rsync /top/a/b/c/d remote:/top/a/b/c
This creates the target folder if it does not already exist.
According to https://unix.stackexchange.com/a/496181/5783, since rsync 2.6.7, --relative works if you use . to anchor the starting parent directory to create at the destination:
derek@DESKTOP-2F2F59O:~/projects/rsync$ mkdir --parents top1/a/b/c/d
derek@DESKTOP-2F2F59O:~/projects/rsync$ mkdir --parents top2/a
derek@DESKTOP-2F2F59O:~/projects/rsync$ rsync --recursive --relative --verbose top1/a/./b/c/d top2/a/
sending incremental file list
b/
b/c/
b/c/d/
sent 99 bytes received 28 bytes 254.00 bytes/sec
total size is 0 speedup is 0.00
--relative did not work for me because I had a different setup.
Maybe I just didn't understand how --relative works, but I found that
ssh remote mkdir -p /top/a/b/c
rsync /top/a/b/c/d remote:/top/a/b/c
is easy to understand and does the job.
I was looking for a better solution, but mine seems to be better suited when you have too many sub-directories to create them manually.
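Simply use cp with the --parents option as an intermediate step:
mkdir -p /tmp/localcopy
cp -r --parents /your/path/sub/dir/ /tmp/localcopy
rsync [options] /tmp/localcopy/* remote:/destination/path/
cp --parents (a GNU cp option) recreates the source directory structure under the target, which must already exist; -r is needed because the source is a directory.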
You can call it from any subfolder if you want only one subset of the parent folders to be copied.
A shorter way in Linux to create rsync destination paths is to use the '$_' Special Variable. (I think, but cannot confirm, that it is also the same in OSX).
'$_' holds the value of the last argument of the previous command executed. So the question could be answered with:
ssh remote mkdir -p /top/a/b/c/ && rsync -avz /top/a/b/c/d remote:$_

rsync multiple remote directories to local machine preserving directory paths

Would I be able to use rsync as such:
rsync -e ssh root@remote.com:/path/to/file:/path/to/second/file/ /local/directory/
or would I have to do something else?
Directly from the rsync man page:
The syntax for requesting multiple files from a remote host is done
by specifying additional remote-host args in the same style as the
first, or with the hostname omitted. For instance, all these work:
rsync -av host:file1 :file2 host:file{3,4} /dest/
rsync -av host::modname/file{1,2} host::modname/file3 /dest/
rsync -av host::modname/file1 ::modname/file{3,4}
This means your example should have a space added before the second path:
rsync -e ssh root@remote.com:/path/to/file :/path/to/second/file/ /local/directory/
I'd suggest you first try it with the -n or --dry-run option, so you see what will be done, before the copy (and possible deletions) are actually performed.
Just a concrete example of @tonin's answer: downloading specific directories and files from a live server.
rsync -av root@123.124.137.147:/var/www/html/cls \
:/var/www/html/index.php \
:/var/www/html/header.inc \
:/var/www/html/version.inc.php \
:/var/www/html/style.css \
:/var/www/html/accounts \
:/var/www/html/admin \
:/var/www/html/api \
:/var/www/html/config \
:/var/www/html/main \
:/var/www/html/reports .

rsync only *.php files

How can I rsync mirror only *.php files? This gives me a bunch of empty dirs too and I don't want those.
rsync -v -aze 'ssh ' \
--numeric-ids \
--delete \
--include '*/' \
--exclude '*' \
--include '*.php' \
user@site.com:/home/www/domain.com \
/Volumes/Servers/
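The empty directories come from the
--include '*/'
A pattern with a trailing forward slash matches only directories, so this rule tells rsync to include every directory, whether or not it contains a .php file.
There is a second problem: rsync acts on the first rule that matches, so --include '*.php' has to come before --exclude '*'. In the original order every file is excluded before the include is ever consulted.
Simply dropping --include '*/' does not help, because --exclude '*' would then exclude the directories too and rsync would never descend into them. Keep it, fix the order, and add --prune-empty-dirs (-m) so that directories left without any .php file are not created:
rsync -v -aze 'ssh ' \
--numeric-ids \
--delete \
--prune-empty-dirs \
--include '*/' \
--include '*.php' \
--exclude '*' \
user@site.com:/home/www/domain.com \
/Volumes/Servers/
No ** wildcard is needed to find .php files recursively: a pattern without a slash, such as '*.php', is matched against the last path component at any depth.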
Another way ( http://www.commandlinefu.com/commands/view/1481/rsync-find ) is pre-finding the target files and then using rsync,
find source -name "*.php" -print0 | rsync -av --files-from=- --from0 ./ ./destination/
