Excluding directory when using tar

I have quite a simple bash script that's running every night via crontab.
The issue I am having is that one of the directories I want to exclude when archiving my site with tar still seems to be included.
Any thoughts?
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%H%M")
DB_USER=""
DB_PASS=""
DB_NAME=""
DB_HOST=""
TREE_FILE="$NOW.tar.gz"
DB_FILE="$DB_NAME.$NOW.sql"
BACKUP_DIR="/var/www/html/site/backups/"
WWW_DIR="/var/www/html/site/"
mkdir -p $BACKUP_DIR
tar -czvf $BACKUP_DIR/$TREE_FILE --exclude=/var/www/html/site/backups/ $WWW_DIR
mysqldump -h$DB_HOST -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE
find $BACKUP_DIR -type f -mtime +7 -delete

I believe tar strips any trailing slashes from directory paths, so I think you simply want to leave the trailing slash off your pattern:
tar -czvf $BACKUP_DIR/$TREE_FILE --exclude=/var/www/html/site/backups $WWW_DIR
This will exclude the directory backups and everything below it, but not (for example) a file named backupsthing.
You could also do something like this:
tar -czvf $BACKUP_DIR/$TREE_FILE --exclude="/var/www/html/site/backups/*" $WWW_DIR
This would include the backups dir itself, but nothing under it. (I.e., you'd have an empty dir in the tar.)
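If you want to confirm which variant you ended up with, listing the archive after the run should show whether the backups directory made it in (using the same variables as the script above):
tar -tzf "$BACKUP_DIR/$TREE_FILE" | grep backups
With the first form nothing should be printed; with the second form you should see only the backups/ directory entry itself.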

Related

ignore subdirectories timestamps when syncing from shell

I want to write a shell command to sync the current directory to a backup directory, with some requirements. The command I'm using is:
rsync -ptvHS --progress --delete-after --exclude /backup $pwd ~/backup
I want the directory timestamps to be ignored, even though I use -t to preserve the file timestamps.
Any idea?
Thank you in advance.
From the man page:
-t, --times preserve modification times
-O, --omit-dir-times omit directories from --times
-J, --omit-link-times omit symlinks from --times
Seems like you need to add -O to your command.
This is from rsync 3.1.2; you might find your version is too old.
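For example, adding -O to the original command would look something like this (note that bash's variable for the current directory is $PWD in upper case; a lower-case $pwd is normally unset):
rsync -ptvHSO --progress --delete-after --exclude /backup "$PWD" ~/backup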

Find and tar files on Solaris

I've got a little problem with my bash script. I'm a newbie in the Unix world, so I find it difficult to deal with an exercise. What I have to do is find files on a Solaris server with a specific name, modified within a specific time, and archive them in one .tar file. The first two points are easy, but I'm having a nightmare trying to archive them. The thing is, I keep archiving the whole directory tree leading to the file (with the file at the end) into the .tar file, but I need just the file. My code looks like this:
find ~ -name "$maska" -mtime -$dni | xargs -t -L 1 tar -cvf $3 -C
where $maska is the name of the file, $dni refers to the modification time, and $3 is just an archive name. I found out about the -C switch, which lets me jump into the folder where the desired file is, but when I use it with xargs, it seems to just jump there and do nothing else.
So my question is:
Is there any possibility of achieving my goal this way?
Please remember, I'm not working with GNU tar, and I HAVE TO use the commands tar and find.
Edit: I'd like to clarify my problem. When I use the script for, say, a file named a, it should search for it starting from the point shown in the script (which is ~), and everything it finds should end up in one tar file.
What I get right now is (I'm in /home/me/Scripts):
-bash-3.2$ ./Script.sh a 1000 backup
a /home/me/Program/Test/a/ 0K
a /home/me/Program/Test/a/a.c 1K
a /home/me/Program/Test/a/a.out 8K
So the script has done some packing. Next I want to see my packed file, so:
-bash-3.2$ tar -tf backup
/home/me/Program/Test/a/
/home/me/Program/Test/a/a.c
/home/me/Program/Test/a/a.out
And that's the problem. The tar file has all the paths in it, so if I untar it, instead of getting just the files I wanted to archive, they get put back in their old locations. For visualisation:
-bash-3.2$ ls
Script.sh* Script.sh~* backup
-bash-3.2$ tar -xvf backup
x /home/me/Program/Test/a, 0 bytes, 0 tape blocks
x /home/me/Program/Test/a/a.c, 39 bytes, 1 tape blocks
x /home/me/Program/Test/a/a.out, 7928 bytes, 16 tape blocks
-bash-3.2$ ls
Script.sh* Script.sh~* backup
That's the problem.
So all I want is to pack all the desired files (a in the example above) into one tar file without those paths, so that it simply untars into the directory where I run Script.sh.
I'm not sure I understand what you want, but this might be it:
find ~ -name "$maska" -mtime -$dni -exec tar cvf $3 {} +
Edit: second attempt, after you wrote that the main issue is the absolute paths:
( cd ~; find . -name "$maska" -type f -mtime -$dni -exec tar cvf $3 {} + )
Edit: third attempt, after you wrote that you want no paths at all in the archive, that maska is a directory name, and that $3 needs to be in the current directory:
mkdir ~/foo && \
find ~ -name "$maska" -type d -mtime -$dni -exec sh -c 'ln -s $1/* ~/foo/' sh {} \; && \
( cd ~/foo ; tar chf - * ) > $3 && \
rm -rf ~/foo
Replace ~/foo by ~/somethingElse if ~/foo already exists for some reason.
Maybe you can do something like this:
#!/bin/bash
find ~ -name "$maska" -mtime -$dni -print0 | while read -d $'\0' file; do
d=$(dirname "$file")
f=$(basename "$file")
echo $d: $f # Show directory and file for debug purposes
tar -rvf tarball.tar -C"$d" "$f"
done
I don't have a Solaris box at hand for testing :-)
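One caveat worth noting: tar -r appends to an existing archive, so if you re-run the loop you will probably want to start from a clean file first, e.g.:
rm -f tarball.tar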
First of all, my assumptions:
1. "one tar file", like you said, and
2. no absolute paths, i.e., if you back up ~/dir/file, you should be able to test-extract it in /tmp and obtain /tmp/dir/file.
If the problem is the full paths, you should replace
find ~ # etc
with
cd ~ || exit
find . # etc
If the tar archive isn't an absolute name, instead, it should be something like
(
cd ~ || exit
find . etc etc | xargs tar cf - etc etc
) > $3
Explanation
"(...)" runs a subshell, meaning some of the tings you change in there have no effects outside of the parens; the current directory is one of them, so "(cd whatever; foo)" means you run another shell, change its current directory, run foo from there, and then you're back in your script which never changed directory.
"cd ~ || exit" is paranoia, it means "cd ~; if that fails, exit".
"." is an alias meaning "the current directory, whatever that is"; play with "find ." vs "find ~" if you don't know what it means, you'll understand it better than if I explained it here.
"tar cf -" means that you create the tar archive on standard output; I think the syntax is portable enough, you may have to replace "-" with "/dev/stdout" or whatever works on solaris (the simplest solution is simply "tar", without the "c" command, but it's ugly to read).
The final "> $3", outside of the parens, is output redirection: rather than writing the output to the terminal, you save it into a file.
So the whole script reads like this:
- open a subshell
- change the subshell's current directory to ~
- in the subshell, find the files newer than requested, archive them, and write the contents of the resulting tar archive to standard output
- the subshell's stdout is saved to $3; because the redirection is outside the parens, relative paths are resolved relatively to your script's $PWD, meaning that eg if you run the script from the /tmp directory you'll get a tar archive in the /tmp directory (it would be in ~ if the redirection happened in the subshell).
If I misunderstood your question, the solution doesn't work, or the explanation isn't clear, let me know (the answer is too long, but I already know that :).
The pax command will output tar-compatible archives and has the flexibility you need to rewrite pathnames.
find ~ -name "$maska" -mtime -$dni | pax -w -x ustar -f "$3" -s '!.*/!!'
Here are what the options mean, paraphrasing from the man page:
-w write the contents of the file operands to the standard output (or to the pathname specified by the -f option) in an archive format.
-x ustar the output archive format is the extended tar interchange format specified in the IEEE POSIX standard.
-s '!.*/!!' Modifies file operands according to the substitution expression, using regular expression syntax. Here, it deletes all characters in each file name from the beginning to the final /.
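Since the output is a plain ustar archive, listing it with tar afterwards should be a reasonable sanity check; the names should come out with no leading directories:
tar -tf "$3"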

rsync: --delete-during --backup-dir=PATH --backup doesn't backup directories that are deleted to make way for files

When running rsync with the --backup --delete-during and --backup-dir=PATH options, only files that are deleted are backed up, but directories are not if those directories were empty at the time they were deleted. I can't see an option that specifies directories should not be pruned from backup when being deleted.
Example:
mkdir /tmp/test_rsync_delete
cd /tmp/test_rsync_delete
mkdir -p a/a/a/a/a
ln -s . a/b
mkdir -p b/a/a
ln -s a/a b/a
touch b/a/a/a
mkdir c
mkdir backup
rsync -avi --delete-during --backup --backup-dir=backup a/ c/
find backup/ -exec ls -ldi {} \;
# Should be empty
rsync -avi --delete-during --backup --backup-dir=backup b/ c/
find backup/ -exec ls -ldi {} \;
# Will be missing the directory that was deleted to make way for the file.
Update
As per the above example, when you run it, you will notice that the empty directories were pruned/removed by the --delete option. However, the same directories were not backed up in the directory specified by the --backup-dir option. It's not necessarily the directories that are important, but the permissions and ownership that are important. If rsync fails when running in batch mode (--read-batch) then you need to be able to roll back by restoring the system to its previous state. If directories are not being backed up, then it's not really creating a reliable point from which to restore to - it will potentially be missing some directories.
So why does the --backup family of options not backup empty directories when they are going to be pruned by the --delete family of options?
This is not an answer to the specific question, but it is probably what others who end up here are searching for. For reference, this is what I was looking for when I found this question:
rsync -av --delete-after src dest
-av The "-a" means archive. This will preserve symlinks, permissions, timestamps, group/owners, and will be recursive. The "v" makes the job verbose. This won't be necessary, but you can see what's happening with the rsync so you know if you've done something wrong.
--delete-after Will tell rsync to compare the destination against the source and delete any extraneous files after the rsync has completed. This is a dangerous option, so use with caution.
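Because --delete-after can remove a lot if you get the paths wrong, it may be worth doing a dry run first with rsync's -n/--dry-run flag, which only reports what would happen without changing anything:
rsync -avn --delete-after src dest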

Using grep to find a file that contains a string

My .htaccess file in my htdocs folder does not work. I tried to redirect to Google when accessing a filename. I want to find out where the settings for my httpd.conf are, so I can enable mod_rewrite. I did the following UNIX command to find out if a httpd.conf file existed on my hard drive:
find * -name "httpd.conf"
The file does not exist. I am thinking that maybe there is another file that controls mod_rewrite. I want to see if "AllowOverride" exists in any directory. I entered the following UNIX command:
grep -r "AllowOverride" *
But it's hard to read because it prints out so many folders. The messages that accompany the folders are "Permission denied" or "No such file or directory". How do I get only the paths of files that contain AllowOverride?
Many Unix and similar systems provide a locate(1) command that uses a database to speed finding individual files. Try this:
locate httpd.conf
Note, of course, that Apache configurations are stored in files of all sorts of names; I've seen apache.conf, httpd.conf, httpd2.conf, and then there's the giant pile of /etc/apache2/conf.d/ -- entire directory structures set aside for configuring Apache. Your distribution may vary.
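If locate prints nothing, the database may simply be out of date; on most systems that ship locate you can refresh it (the exact command is distribution-dependent, but this is the usual form):
sudo updatedb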
Perhaps apachectl configtest will show the paths? (currently not installed on my machine, so I can't easily test.)
Try this command:
find / -name "httpd.conf" 2>1 | grep -v "Permission denied"
the 2>1 funnels stderr to stdout so that both can be piped into the grep utility. grep in turn will print anyline that doesn't have the string "Permission denied" in it (the -v negates/inverts the matching of the search string)
If you don't redirect stderr to stdout, the output of stderr to the console would bypass the rest of the command line.
You could extend the above command line by appending this:
| grep -v "No such file or directory"
if that string was coming up and you wanted to suppress it too.
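Alternatively, if you don't need to see any of the error messages at all, you can discard stderr entirely instead of filtering it (the answer below does the same):
find / -name "httpd.conf" 2>/dev/null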
Use the following:
find / -type f -exec grep -n "AllowOverride" {} \; -print 2>/dev/null
To scan files containing the "AllowOverride" string from the root, if you want to run the search in a particular directory, use the following instead:
find /path/to/directory -type f -exec grep -n "AllowOverride" {} \; -print 2>/dev/null
The output will print only the files containing the specified string, along with the line number of each match.
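As a side note, if your grep supports -l (print only the names of matching files, which both GNU and BSD grep do), you can get just the paths without find; a minimal sketch, assuming you want to search under /etc:
grep -rl "AllowOverride" /etc 2>/dev/null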

Unix shell file copy flattening folder structure

On the UNIX bash shell (specifically Mac OS X Leopard) what would be the simplest way to copy every file having a specific extension from a folder hierarchy (including subdirectories) to the same destination folder (without subfolders)?
Obviously there is the problem of having duplicates in the source hierarchy. I wouldn't mind if they are overwritten.
Example: I need to copy every .txt file in the following hierarchy
/foo/a.txt
/foo/x.jpg
/foo/bar/a.txt
/foo/bar/c.jpg
/foo/bar/b.txt
To a folder named 'dest' and get:
/dest/a.txt
/dest/b.txt
In bash:
find /foo -iname '*.txt' -exec cp \{\} /dest/ \;
find will find all the files under the path /foo matching the wildcard *.txt, case insensitively (That's what -iname means). For each file, find will execute cp {} /dest/, with the found file in place of {}.
The only problem with Magnus' solution is that it forks off a new "cp" process for every file, which is not terribly efficient especially if there is a large number of files.
On Linux (or other systems with GNU coreutils) you can do:
find . -name "*.xml" -print0 | xargs -0 echo cp -t a
(The -0 allows it to work when your filenames have weird characters -- like spaces -- in them.)
Unfortunately I think Macs come with BSD-style tools. Anyone know a "standard" equivalent to the "-t" switch?
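On BSD-derived systems (including macOS), xargs has a -J flag that substitutes the collected arguments in the middle of the command line, which gets you roughly the same effect; a sketch, untested on Leopard:
find /foo -iname '*.txt' -print0 | xargs -0 -J {} cp {} /dest/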
The answers above don't handle name collisions, since the asker didn't mind files being overwritten.
I do mind files being overwritten, so I came up with a different approach. Replacing each / in the path with - keeps the hierarchy in the names and puts all the files in one flat folder.
We use find to get the list of all files, then awk to build a mv command with the original filename and the modified filename, and then pass those commands to bash to be executed.
find ./from -type f | awk '{ str=$0; sub(/\.\//, "", str); gsub(/\//, "-", str); print "mv " $0 " ./to/" str }' | bash
where ./from and ./to are the directories to mv from and to. For example, ./from/bar/a.txt ends up as ./to/from-bar-a.txt.
If you really want to run just one command, why not cons one up and run it? Like so:
$ find /foo -name '*.txt' | xargs echo | sed -e 's/^/cp /' -e 's|$| /dest|' | bash -sx
But that won't matter too much performance-wise unless you do this a lot or have a ton of files. Be careful of name collisions, however. I noticed in testing that GNU cp at least warns of collisions:
cp: will not overwrite just-created `/dest/tubguide.tex' with `./texmf/tex/plain/tugboat/tubguide.tex'
I think the cleanest is:
$ find /foo -name '*.txt' | xargs -I{} cp {} /dest
Less syntax to remember than the -exec option.
As far as the man page for cp on a FreeBSD box goes, there's no need for a -t switch. cp will assume the last argument on the command line to be the target directory if more than two names are passed.
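For example, with the paths from the question:
cp /foo/a.txt /foo/bar/b.txt /dest/
copies both files into /dest because the last argument, /dest/, is a directory.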
