rsync will not read from include file - rsync

I am trying to use rsync to do backups. I have an include file called /etc/daily.rsync and it contains the following:
+ /home/demo
- *
Then I run the command below:
$ sudo rsync -acvv --delete --include-from=/etc/daily.rsync /mnt/offsite_backup/home/
sending incremental file list
delta-transmission disabled for local transfer or --whole-file
drwxrwxr-x 6 2021/02/22 14:09:13 .
total: matches=0 hash_hits=0 false_alarms=0 data=0
sent 52 bytes received 131 bytes 366.00 bytes/sec
total size is 0 speedup is 0.00
When I go look in the directory I see nothing. What I think is that it is trying to rsync from the current directory which btw is empty. So this leaves me to believe that it is not getting the data form the include file.
This command runs as expected:
sudo rsync -acvv --delete /home/demo /mnt/offsite_backup/home/
The different posts made many suggestions, and I have tried them. I am just stuck. Any thoughts would be very welcome.

I think you're misunderstanding what a filter file (like the one you specified with --include-from) does. It does not specify where to sync files from; it specifies which files within the source directory to sync.
You need to specify both the source and destination as part of the command line. In the command:
sudo rsync -acvv --delete --include-from=/etc/daily.rsync /mnt/offsite_backup/home/
You only specified one directory, /mnt/offsite_backup/home/, so rsync has assumed it's the source, and there is no destination. According to the rsync man page:
As a special case, if a single source arg is specified without a
destination, the files are listed in an output format similar to "ls -l".
So, basically, it's listing the contents of /mnt/offsite_backup/home/, and apparently that's empty.
The second command you gave specifies both the source and destination, which is why it works correctly. If you want to add a filter file to, be aware that the paths in the filter will be relative to the source. So if you used
sudo rsync -acvv --delete --include-from=/etc/daily.rsync /home/demo /mnt/offsite_backup/home/
...it's going to try to include the file/directory /home/demo/home/demo, which probably doesn't exist. Except it actually won't do that, because the - * line will exclude /home/demo/home, so if it did exist, it and its contents will be excluded. You need to include the parent directories of anything you want to include in the sync operation. Again, from the man page:
The concept path exclusion is particularly important when using a
trailing '*' rule. For instance, this won't work:
+ /some/path/this-file-will-not-be-found
+ /file-is-included
- *
This fails because the parent directory "some" is excluded by the '*' rule, so rsync never visits any of the files in the "some" or
"some/path" directories. One solution is to ask for all directories in
the hierarchy to be included by using a single rule: "+ */" (put it
somewhere before the "- *" rule), and perhaps use the
--prune-empty-dirs option. Another solution is to add specific include rules for all the parent dirs that need to be visited. For instance,
this set of rules works fine:
+ /some/
+ /some/path/
+ /some/path/this-file-is-found
+ /file-also-included
- *

ok, so after walking away from the problem I realized that, I never specified what actual directory I wanted to sync. The include can't work from thin air. so the command is:
sudo rsync -acv --delete /home/ --include-from=/etc/weekly.rsync /mnt/offline_backup/home/
The include file had to change as well.
+ demo/***
+ truenorth/***
- *
To have it decend into the directory structure, I needed the ***. I hope this can help someone else out.

Related

Folderstructure with rsync in bash

I looked up the forum but didn't find an article which matches my problem. Maybe there is some, and you can help me out with it.
My problem is I want to sync an folder with the command rsync -a -v. The point is I got 5 different Maschinen. On every maschine is a scratch folder I want to sync into the folder: ~/work_dir/scratch_maschines and inside the /scratch_maschines folder should be a folder for maschine_a, maschine_b and so on.
On the maschines it is always the same path: /scratch/my_name. So when I use now this command for the first two maschines:
rsync -a -v --exclude='*.chk' --exclude='*.rwf' --exclude='*.fchk' --delete sp02:/scratch/my_name ~/work_dir/scratch_maschine01; rsync -a -v --exclude='*.chk' --exclude='*.rwf' --exclude='*.fchk' --delete maschine02:/scratch/my_name ~/work_dir/scratch_maschine02
I got a folders for scratch_maschine01 and scratch_maschine02 in my working directory but inside these folders are not direct my data there is first a folder inside with my_name and this folder contains the data. So my question is how can I use the rsync command and get the files from the scratch directorys straight to the folders for each machine?
You might want to consider reformulating your commands similar to the following:
START=`pwd`
EXCLUDES="--exclude='*.chk' --exclude='*.rwf' --exclude='*.fchk'"
{ SOURCE="sp02:/scratch/my_name"
REMOTE="${HOME}/work_dir/scratch_maschine01"
cd "${SOURCE}"
rsync --recursive -v --delete ${EXCLUDES} "./" "${REMOTE}/"
}>${START}/job.log 2>${START}/job.err
The key elements there are
the --recursive which will rsync will expand to include all content and subdirs of the SOURCE directory.
the / behind the ${SOURCE} notifies rsync to limit itself to content of the SOURCE directory, but not the directory itself.
the / behind the ${REMOTE} notifies rsync to limit itself to depositing content into that directory and expect it to already exist, to specifically fail if that does not already exist at REMOTE; this ensures that the remote site doesn't attempt a failsafe PWD and deposit files elsewhere than expected.
The above approach lends itself to a function form that could be placed into a loop with pre-attempt condition checks, along with having a complementary case for all variable assignments grouped under a destination heading (i.e. case statements).
Using such an approach with meaningful labels for variables lends itself to a type of implicit documentation, making the code more meaningful to someone not familiar with the code, as well as a refresher for yourself after a long period of not working or using the code.
I try to avoid the "~" because I prefer to always enclose definitions for variables in double quotes, to avoid issues that might arise from paths that may include unexpected characters or spaces. That way, you are sure to have your defined paths correctly interpreted by commands in scripts.
Lastly, I prefer to use the long form for the rsync options (and almost every other command) so that I don't have to refer to the manual every time to translate the single-character options when trying to understand what is coded, if the need arises for troubleshooting unexpected errors (I have always had poor memory).
My own backup command is as follows. The only reason why the
${PathMirror}${dirC}/
is not encapsulated in single quotes within the double quotes for COM is because I know those variables all evaluate to non-complex strings which cannot be misinterpreted.

Backup using rsync apparently does not remove files and folders that does not exist anymore on source

I'm using very handy rsync command which allows me to have a backup of particular folders on a specific volumes.
I call rsync with following parameters:
rsync -avzP
To be explicit, when I want to do a backup of all pictures and Lightroom catalogs I call:
rsync -avzP /Volumes/SLICK-2TB/Pictures /Volumes/SLICK-PICTURES-BACKUP
So SLICK-2TB is my source drive and SLICK-PICTURES-BACKUP is my destination drive.
My problem is, whenever I delete / remove a file on source, the change is not reflected on destination. In other words, all new stuff will be always archived on backup volume, but things that don't exist on the source, will be left intact on destination.
Is there a particular attribute that I could add to -avzP that will help me achieve / solve the problem?
Thanks.
You need to use the --delete argument to delete the files on the destination.
Also consider using --dry-run first to make sure you aren't deleting the wrong files. I guess the reason --delete is not a default argument (or part of -a) is that it can wipe out the destination if it's not set correctly.

Grep command stopped working

Suddenly grep command stopped working. When I did the ls -l ~/grep showing the one file in my home directory.But this file has been present for ages. If I give command which grep --> pointing to /bin/grep and with /bin/grep it is working fine. Can anyone please suggest.
Thanks,
Regards,
Shiv
You can delete the zero-byte file in your home directory. It's not doing anything. (I don't know how it got there.) The problem is that the first entry in PATH, ".", points to whatever directory you're in. So when you're in your home directory, the shell (bash, I assume) looks for grep in the current directory, and finds the file that's there, which can't do anything.
I consider it a bad idea to have "." in your path. It's convenient, and natural if you're coming from the Windows world, but it means that what gets executed can change depending on what directory you're in (as you have now seen). It also means that if you're on a multiuser system, someone can put an executable in one of their directories, and then when you cd into their directory, all of a sudden you're executing their code, which might not be what you want, and could be dangerous.
Instead, remove ".:" (dot colon) from your PATH. When you need to run a script in the current directory, add "./" to its name to execute it. "/bin" and "/usr/bin" should usually be at the front of the list. Some people prefer to put "/usr/local/bin" at the front of the list, or something else.
You can change your PATH by editing .profile or .bash_profile or .bashrc. It depends on how you have your shell set up. Be careful to separate each directory path in PATH with one ":" character.

mkdir's "-p" option

So this doesn't seem like a terribly complicated question I have, but it's one I can't find the answer to. I'm confused about what the -p option does in Unix. I used it for a lab assignment while creating a subdirectory and then another subdirectory within that one. It looked like this:
mkdir -p cmps012m/lab1
This is in a private directory with normal rights (rlidwka). Oh, and would someone mind giving a little explanation of what rlidwka means? I'm not a total noob to Unix, but I'm not really familiar with what this means. Hopefully that's not too vague of a question.
The man pages is the best source of information you can find... and is at your fingertips: man mkdir yields this about -p switch:
-p, --parents
no error if existing, make parent directories as needed
Use case example: Assume I want to create directories hello/goodbye but none exist:
$mkdir hello/goodbye
mkdir:cannot create directory 'hello/goodbye': No such file or directory
$mkdir -p hello/goodbye
$
-p created both, hello and goodbye
This means that the command will create all the directories necessaries to fulfill your request, not returning any error in case that directory exists.
About rlidwka, Google has a very good memory for acronyms :). My search returned this for example: http://www.cs.cmu.edu/~help/afs/afs_acls.html
Directory permissions
l (lookup)
Allows one to list the contents of a directory. It does not allow the reading of files.
i (insert)
Allows one to create new files in a directory or copy new files to a directory.
d (delete)
Allows one to remove files and sub-directories from a directory.
a (administer)
Allows one to change a directory's ACL. The owner of a directory can always change the ACL of a directory that s/he owns, along with the ACLs of any subdirectories in that directory.
File permissions
r (read)
Allows one to read the contents of file in the directory.
w (write)
Allows one to modify the contents of files in a directory and use chmod on them.
k (lock)
Allows programs to lock files in a directory.
Hence rlidwka means: All permissions on.
It's worth mentioning, as #KeithThompson pointed out in the comments, that not all Unix systems support ACL. So probably the rlidwka concept doesn't apply here.
-p|--parent will be used if you are trying to create a directory with top-down approach. That will create the parent directory then child and so on iff none exists.
-p, --parents
no error if existing, make parent directories as needed
About rlidwka it means giving full or administrative access. Found it here https://itservices.stanford.edu/service/afs/intro/permissions/unix.
mkdir [-switch] foldername
-p is a switch, which is optional. It will create a subfolder and a parent folder as well, even if parent folder doesn't exist.
From the man page:
-p, --parents no error if existing, make parent directories as needed
Example:
mkdir -p storage/framework/{sessions,views,cache}
This will create subfolder sessions,views,cache inside framework folder irrespective of whether 'framework' was available earlier or not.
PATH: Answered long ago, however, it maybe more helpful to think of -p as "Path" (easier to remember), as in this causes mkdir to create every part of the path that isn't already there.
mkdir -p /usr/bin/comm/diff/er/fence
if /usr/bin/comm already exists, it acts like:
mkdir /usr/bin/comm/diff
mkdir /usr/bin/comm/diff/er
mkdir /usr/bin/comm/diff/er/fence
As you can see, it saves you a bit of typing, and thinking, since you don't have to figure out what's already there and what isn't.
Note that -p is an argument to the mkdir command specifically, not the whole of Unix. Every command can have whatever arguments it needs.
In this case it means "parents", meaning mkdir will create a directory and any parents that don't already exist.

Add last n lines of files to tar/zip

I need to regularly send a collection of log files that can grow quite large, so I would like to only send the last n lines of the each of the files.
for example:
/usr/local/data_store1/file.txt (500 lines)
/usr/local/data_store2/file.txt (800 lines)
Given a file with a list of needed files named files.txt, I would like to create an archive (tar or zip) with the last 100 lines of each of those files.
I can do this by creating a separate directory structure with the tail-ed files, but that seems like a waste of resources when there's probably some piping magic that can happen to accomplish it. Full directory structure also must be preserved since files can have the same names in different directories.
I would like the solution to be a shell script if possible, but perl (without added modules) is also acceptable (this is for Solaris machines that don't have ruby/python/etc.. installed on them.)
You could try
tail -n 10 your_file.txt | while read line; do zip /tmp/a.zip $line; done
where a.zip is the zip file and 10 is n or
tail -n 10 your_file.txt | xargs tar -czvf test.tar.gz --
for tar.gz
You are focusing in an specific implementation instead of looking at the bigger picture.
If the final goal is to have an exact copy of the files on the target machine while minimizing the amount of data transfered, what you should use is rsync, which automatically sends only the parts of the files that have changed and also can automatically compress while sending and decompress while receiving.
Running rsync doesn't need any more daemons on the target machine that the standard sshd one, and to setup automatic transfers without passwords you just need to use public key authentication.
There is no piping magic for that, you will have to create the folder structure you want and zip that.
mkdir tmp
for i in /usr/local/*/file.txt; do
mkdir -p "`dirname tmp/${i:1}`"
tail -n 100 "$i" > "tmp/${i:1}"
done
zip -r zipfile tmp/*
Use logrotate.
Have a look inside /etc/logrotate.d for examples.
Why not put your log files in SCM?
Your receiver creates a repository on his machine from where he retrieves the files by checking them out.
You send the files just by commiting them. Only the diff will be transmitted.

Resources