Exclude files from tar gzipping a directory in unix - unix

I have a directory (dir) (with files and subdirectories):
ls -1 dir
plot.pdf
subdir.1
subdir.2
obj.RDS
And then ls -1 for either subdir.1 or subdir.2:
plot.pdf
PC.pdf
results.csv
de.pdf
de.csv
de.RDS
I would like to tar and gzip dir (in unix) and I'd like to exclude all RDS files (those at the level right below dir and the ones in its subdirectories).
What's the easiest way to achieve that? Perhaps in a one-liner.

Something like:
find dir -type f -not -name '*.RDS' -print0 |
tar --null -T- -czf TARGET.tgz
should do it.
First, find finds the files, and then tar accepts the list via -T- (= --files-from /dev/stdin).
-print0 on find combined with --null on tar protects against weird filenames (spaces, newlines and the like).
-czf == Create gZipped File
You can add v to get verbose output.
To later inspect the contents, you can do:
tar tf TARGET.tgz
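To see the pipeline above on a small example, here is a minimal sketch. The scratch directory and the file names in it are made up for illustration, and GNU find and tar are assumed.

```shell
#!/bin/sh
# Sketch: build a throwaway tree, archive it without *.RDS files, list the result.
work=$(mktemp -d)                 # scratch directory (hypothetical test data)
cd "$work" || exit 1
mkdir -p dir/subdir.1 dir/subdir.2
touch dir/plot.pdf dir/obj.RDS \
      "dir/subdir.1/de results.csv" dir/subdir.1/de.RDS \
      dir/subdir.2/PC.pdf dir/subdir.2/de.RDS

# Same pipeline as in the answer: find lists the files, tar reads the list from stdin.
find dir -type f -not -name '*.RDS' -print0 |
  tar --null -T - -czf TARGET.tgz

listing=$(tar -tzf TARGET.tgz)
printf '%s\n' "$listing"
# rm -rf "$work"   # clean up when done
```

Because only regular files are passed to tar, a directory that contains nothing but RDS files would not appear in the archive at all.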

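tar --exclude='*.RDS' -czf outputball.tar.gz dir_to_compress
This will ignore *.RDS files in dir_to_compress and in any of its subdirectories. The pattern is quoted so the shell passes it to tar unexpanded, and -z selects gzip compression as asked (-J would select xz).
Decompress using
tar -xvf outputball.tar.gz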

Related

Linux one line command to gzip and move

I have some .txt files in a particular path (/path/doc.txt) and I wish to gzip all of them and move the new file containing all the zipped .txt files to another path. How can I achieve that in one line of code?
Maybe something like:
find /path/doc/ -type f -name '*.txt' -print0 | tar --null -T - -czf save.tar.gz && mv save.tar.gz other/path
use
tar -vtf save.tar.gz
to check archive content

Excluding directory when using tar

I have quite a simple bash script that's running every night via crontab.
The issue I am having is ignoring one of the directories when archiving up my site using tar. It still seems to include it.
Any thoughts?
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%H%M")
DB_USER=""
DB_PASS=""
DB_NAME=""
DB_HOST=""
TREE_FILE="$NOW.tar.gz"
DB_FILE="$DB_NAME.$NOW.sql"
BACKUP_DIR="/var/www/html/site/backups/"
WWW_DIR="/var/www/html/site/"
mkdir -p $BACKUP_DIR
tar -czvf $BACKUP_DIR/$TREE_FILE --exclude=/var/www/html/site/backups/ $WWW_DIR
mysqldump -h$DB_HOST -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE
find $BACKUP_DIR -type f -mtime +7 -delete
I believe tar strips any trailing slashes from directory paths, so I think you simply want to leave the trailing slash off your pattern:
tar -czvf $BACKUP_DIR/$TREE_FILE --exclude=/var/www/html/site/backups $WWW_DIR
This will exclude the directory backups and everything below it, but not (for example) a file named backupsthing.
You could also do something like this:
tar -czvf $BACKUP_DIR/$TREE_FILE --exclude="/var/www/html/site/backups/*" $WWW_DIR
This would include the backups dir itself, but nothing under it. (I.e., you'd have an empty dir in the tar.)
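A quick way to check what an exclude pattern really does is to try it on a throwaway tree. The sketch below assumes GNU tar. It uses -C with a relative pattern (./backups) rather than the absolute paths from the question, and the directory and file names are invented for the test.

```shell
#!/bin/sh
# Sketch: exclude one directory, then list the archive to confirm it.
site=$(mktemp -d)     # stands in for /var/www/html/site (hypothetical)
out=$(mktemp -d)      # the archive is written outside the tree being archived
mkdir -p "$site/backups" "$site/public"
touch "$site/backups/old.tar.gz" "$site/public/index.html" "$site/backupsthing"

# -C changes into the site directory, so member names start with "./"
# and the exclude pattern can be written relative to it.
tar -czf "$out/site.tar.gz" -C "$site" --exclude=./backups .

listing=$(tar -tzf "$out/site.tar.gz")
printf '%s\n' "$listing"
```

Listing the archive before relying on a nightly cron job makes it obvious whether the directory was skipped.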

Unix : how to tar only N first files of each folder?

I have a folder containing 2Gb of images, with sub-folders several levels deep.
I'd like to archive only N files of each (sub) folder in a tar file. I tried to use find then tail then tar but couldn't manage to get it to work. Here is what I tried (assuming N = 10):
find . | tail -n 10 | tar -czvf backup.tar.gz
… which outputs this error:
Cannot stat: File name too long
What's wrong here? Thinking about it, even if it works I think it will tar only the first 10 files of all folders, not the first 10 files of each folder.
How can I get the first N files of each folder?
A proposal with some quirks: order is only determined by the order out of find, so "first" isn't well-defined.
find . -type f |
awk -v N=10 -F / 'match($0, /.*\//, m) && a[m[0]]++ < N' |
xargs -r -d '\n' tar -rvf /tmp/backup.tar
gzip /tmp/backup.tar
Comments:
use find . -type f to ensure that files have a leading directory-name prefix, so the next step can work
the awk command (which relies on the three-argument form of match, a gawk extension) tracks such leading directory names, and emits full path names until N (10, here) files with the same leading directory have been emitted
use xargs to invoke tar - we're gathering regular file names, and they need to be arguments to that archiving command
xargs may invoke tar more than once, so we'll append (-r option) to a plain archive, then compress it after it's all written
Also, you may not want to write a backup file into the current directory, since you're scanning that - that's why this suggestion writes into /tmp.
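For systems without gawk, the same idea can be sketched with plain awk. This variant differs from the answer in two ways: it sorts the file list so that "first" is at least reproducible, and it replaces the xargs/-r step with tar -T -, so tar runs only once. The image tree is made up for the test, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Sketch: keep at most N files per directory, using only POSIX awk features.
work=$(mktemp -d)                     # scratch area (hypothetical test data)
mkdir -p "$work/img/a" "$work/img/b/c"
for i in 1 2 3 4 5; do
  touch "$work/img/a/a$i.jpg" "$work/img/b/b$i.jpg" "$work/img/b/c/c$i.jpg"
done

N=2
cd "$work/img" || exit 1
find . -type f | sort |
  # d is the path with its last component removed, i.e. the containing directory;
  # a line is printed only while fewer than N files from that directory were seen.
  awk -v N="$N" '{ d = $0; sub("/[^/]*$", "", d); if (seen[d]++ < N) print }' |
  tar -cf "$work/backup.tar" -T -
gzip "$work/backup.tar"

listing=$(tar -tzf "$work/backup.tar.gz")
printf '%s\n' "$listing"
```

With three directories and N=2, the archive should hold six files, two from each directory.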

How to create zip/gz/tar files for if the files are older than particular days in UNIX or Linux

I need a script to back up (zip, tar or gz) old log files on our unix server, which are causing a space problem. Could you please help me create a zip or gz file for each log file in the current directory and its sub-directories?
I found one command that creates a gz file from the older files, but it creates only one gz file for all of them. I need an individual gz file for each log file.
find /tmp/log/ -mtime +180 | xargs tar -czvPf /tmp/older_log_$(date +%F).tar.gz
Thanking you in advance.
Best way is
find . -type f -mtime +3 -print -exec gzip {} \;
Here +3 means gzip every file that was last modified more than 3 days ago; -type f keeps directories out of the list.
Thanks a lot for your reply.
I got it.
# $days is assumed to be set earlier, e.g. days=180
find /tmp/mallik3/ -type f -mtime +"$days" -print0 |
while IFS= read -r -d '' file
do
echo "$file"
zip "${file}-$(date --date="-${days} days" +%F)_.zip" "$file"
# tar cvfz "${file}_$(date --date='-6months' +%F).tar.gz" "$file"
# rm "$file"
done
First, the -mtime argument does not get you files that are "older" than a certain amount. Rather, it checks the last time the file was modified. The creation date of files is not kept in most file systems. Often, the last modified time is sufficient, but it is not the same as the age of the file.
If you just want to create a separate tar file for each log file, use -exec instead of passing the data to xargs:
find /tmp/log/ -mtime +180 -type f -exec sh -c \
'tar -czvPf "/tmp/older_log_$(basename "$0")_$(date +%F).tar.gz" "$0"' {} \;
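Since -mtime looks at the modification time, its effect can be tried out by back-dating a few files. This is a small sketch with invented log names; touch -d with a relative date is a GNU extension and is assumed here.

```shell
#!/bin/sh
# Sketch: gzip each log last modified more than 3 days ago, one .gz per file.
logs=$(mktemp -d)                     # scratch log directory (hypothetical)
mkdir -p "$logs/app"
touch -d '10 days ago' "$logs/old.log" "$logs/app/older.log"   # back-dated files
touch "$logs/fresh.log"                                         # modified just now

# gzip replaces each matching file with <name>.gz; the fresh file is left alone.
find "$logs" -type f -name '*.log' -mtime +3 -exec gzip {} \;

ls -R "$logs"
```

Each old log ends up as its own .gz file, which is what the question asked for, while recently modified files are untouched.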

Tar only the Directory structure

I want to copy my directory structure excluding the files. Is there any option in tar to ignore all files and copy only the directories recursively?
You can use find to get the directories and then tar them:
find . -type d -print0 | xargs -0 tar cf dirstructure.tar --no-recursion
If you have more than about 10000 directories use the following to work around xargs limits:
find . -type d -print0 | tar cf dirstructure.tar --no-recursion --null --files-from -
Directory names that contain spaces or other special characters may require extra attention. For example:
$ mkdir -p "backup/My Documents/stuff"
$ find backup/ -type d | xargs tar cf directory-structure.tar --no-recursion
tar: backup/My: Cannot stat: No such file or directory
tar: Documents: Cannot stat: No such file or directory
tar: backup/My: Cannot stat: No such file or directory
tar: Documents/stuff: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
Here are some variations to handle these cases of "unusual" directory names:
$ find backup/ -type d -print0 | xargs -0 tar cf directory-structure.tar --no-recursion
Using -print0 with find will emit filenames as null-terminated strings; with -0 xargs will interpret arguments that same way. Using null as a terminator helps ensure that even filenames with spaces and newlines will be interpreted correctly.
It's also possible to pipe results straight from find to tar:
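$ find backup/ -type d | tar cf directory-structure.tar --no-recursion -T -
Invoking tar with -T - (or --files-from -) will cause it to read filenames from stdin, expecting each filename to be separated by a line break. --no-recursion is given before -T because recent GNU tar releases treat it as position-sensitive, applying it only to the names that follow.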
For maximum effect this can be combined with options for null-terminated strings:
$ find . -type d -print0 | tar cf directory-structure.tar --no-recursion --null --files-from -
Of these I consider this last version to be the most robust, because it supports both unusual filenames and (unlike xargs) is not inherently limited by system command-line sizes. (see xargs --show-limits)
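The null-terminated variant can be checked on a small tree that includes a directory name with a space. This is a sketch assuming GNU find and tar; the tree itself is invented for the test.

```shell
#!/bin/sh
# Sketch: archive only the directory skeleton, then confirm no files slipped in.
work=$(mktemp -d)                     # scratch area (hypothetical test data)
cd "$work" || exit 1
mkdir -p "backup/My Documents/stuff" backup/empty
touch "backup/My Documents/stuff/report.txt" backup/notes.txt

# --no-recursion and --null come before -T so that they apply to the names read from stdin.
find backup -type d -print0 |
  tar -cf directory-structure.tar --no-recursion --null -T -

listing=$(tar -tf directory-structure.tar)
printf '%s\n' "$listing"
```

Every entry in the listing should end in a slash, which is how tar marks directories.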
find . -type d -exec sh -c 'mkdir -p "/tmp/tar_root/$1"' sh {} \;
pushd /tmp/tar_root
tar cf tarfile.tar *
popd
# rm -fr /tmp/tar_root
Go into the folder you want to start at (that's why we use find .).
Save the tar file somewhere else; I think I got an error leaving it right there.
Use tar with r, not c. I think with cf each xargs invocation creates a new archive, so you only get the last set of subdirectories; tar r appends to the tar file.
Use --no-recursion because find is already giving you the whole list of directories, so you don't want tar to recurse.
find . -type d |xargs tar rf /somewhereelse/whatever-dirsonly.tar --no-recursion
tar tvf /somewhereelse/whatever-dirsonly.tar |more to check what you got.
For AIX:
tar cvfD some-tarball.tar `find /dir_to_start_from -type d -print`
