Script Issues with find -> tar/gzip - unix

I am currently working on a script, to store/backup our old files, so that we have more space on our server. This script will be used as a cronjob to backup the stuff every week. My script currently looks like this:
#!/bin/bash
currentDate=$(date '+%Y%m%d%T' | sed -e 's/://g')
find /Directory1/ -type f -mtime +90 | xargs tar cvf - | gzip > /Directory2/Backup$currentDate.tar.gz
find /Directory1/ -type f -mtime +90 -exec rm {} \;
The script is at first saving the current Date + Timestamp(without ":") as a variable. Afterwards it searches for files older than 90 days, tars them and finally makes a gzip out of them, which has the name "Backup$currentDate.tar.gz".
Then it's supposed to find the files again and remove them.
I do however have some issues here:
Directory1 consists of multiple Directories. It does find the files and creates the gz file, but while some files are zipped properly(for instance /DirName1/DirName2/DirName3/File), others appear directly in the "root" Dir. What could be the issue here?
Is there a way to tell the Script, to only create the gz file, if files are found? Because currently, we get gz files, even if there was nothing found, leading to empty directories.
Can I somehow reuse the find output later on (store it in a variable?), so that the remove at the end really only targets the files found in the step before? If the archive step takes, let's say, an hour and the remove only runs after it has finished, it could delete files that weren't older than 90 days before but are now, so they are never backed up but still deleted (highly unlikely, but not impossible).
If there's anything else you need to know, feel free to ask ^^
Best regards

I've "rephrased" your original code a bit. I don't have an AIX machine to test anything, so DO NOT cut and paste this. Using this code, you should be able to address your issues. To wit:
It makes a record of what files it intends to operate on ($BFILES).
This record can be used to check for empty tar files.
This record can be used to see why your find is producing "funny" output. It wouldn't surprise me to find that xargs hit a space character.
This record can be used to delete exactly the files archived.
As a child, I had a serious accident with xargs and have avoided it ever since. Maybe there is a safe version out there.
#!/bin/bash
# I don't have an AIX machine to test this, so exit immediately until
# someone can proof this code.
exit 1
currentDate=$(date '+%Y%m%d%T' | sed -e 's/://g')
BFILES=/tmp/Backup$currentDate.files
find /Directory1 -type f -mtime +90 -print > "$BFILES"
# Here is the time to proofread the file list, $BFILES
# The AIX page I read lists the '-L' option to take filenames from an
# input file.
#tar -c -v -L "$BFILES" -f - | gzip -9 > "/Directory2/Backup$currentDate.tar.gz"
# I've found xargs to be sketchy unless you are very careful about
# quoting. I would rather loop over the input file one well quoted
# line at a time rather than use the faster, less safe xargs. But
# here it is.
#xargs rm < "$BFILES"
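For comparison, here is a minimal sketch of the same record-the-list idea for a machine that has GNU find, tar and xargs. That is an assumption: the AIX versions may not support -print0, --null -T or -0. The function name archive_old_files and its three arguments are made up for illustration. It skips the archive when the list is empty and deletes only the files that were recorded, and only if tar reported success.

```shell
#!/bin/bash
# Sketch only. Assumes GNU find (-print0), GNU tar (--null -T) and GNU xargs (-0).
# archive_old_files is a hypothetical helper: archive_old_files SRC DEST DAYS
archive_old_files() {
    local src=$1 dest=$2 days=$3
    local stamp list status=0
    stamp=$(date '+%Y%m%d%H%M%S')
    list=$(mktemp) || return 1

    # Record the candidates once, NUL-separated so odd file names survive.
    find "$src" -type f -mtime +"$days" -print0 > "$list"

    if [ -s "$list" ]; then
        # Archive exactly the recorded files; delete them only if tar succeeded.
        if tar --null -T "$list" -czf "$dest/Backup$stamp.tar.gz"; then
            xargs -0 rm -f -- < "$list"
        else
            status=1
        fi
    fi
    # An empty list falls through: no archive is created, nothing is deleted.
    rm -f "$list"
    return "$status"
}
```

It would be called as archive_old_files /Directory1 /Directory2 90. Because tar is run once with the whole list, the paths inside the archive all keep their directory structure.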

Related

unix command to change directory name

Hi this is a simple question but the solution eludes me at the moment..
I can find out the folder name that I want to change the name of, and I know the command to change the name of a folder is mv
so from the current directory if i go
ls ~/relevant.directory.containing.directory.name.i.want.to.change
to which i get the name of the directory is called say lorem-ipsum-v1-3
but the directory name may change in the future but it is the only directory in the directory:
~/relevant.directory.containing.directory.name.i.want.to.change
how do I programmatically change it to a specific name like correct-files
i can do it normally by just doing something like
mv lorem-ipsum-v1-3 correct-files
but I want to start automating this so that I don't need to keep copying and pasting the directory name....
any help would be appreciated...
Something like:
find . -depth -maxdepth 1 -type d | head -n 1 | xargs -I '{}' mv '{}' correct-files
should work fine as long as only one directory should be moved.
If you are absolutely certain that relevant.directory.containing.directory.name.i.want.to.change only contains the directory you want to rename, then you can simply use a wildcard:
mv ~/relevant.directory.containing.directory.name.i.want.to.change/*/ ~/relevant.directory.containing.directory.name.i.want.to.change/correct-files
This can also be simplified further, using bash brace expansion, to:
mv ~/relevant.directory.containing.directory.name.i.want.to.change/{*/,correct-files}
cd ~/relevant.directory.containing.directory.name.i.want.to.change
# Skip "." itself, and quote the name so spaces survive
find . -type d ! -name . -print | while IFS= read -r a ;
do
    mv "$a" correct-files ;
done
Caveats:
No error handling
There may be a way of reversing the parameters to mv so you can use xargs instead of a while loop, but that's not standard (as far as I'm aware)
Not parameterised
If there are any subdirectories it won't work. The depth parameters on the find command are (again, AFAIK) not standard. They do exist on GNU versions but seem to be missing on Solaris
Probably others...
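To cover the "no error handling" and "not parameterised" caveats, here is a minimal bash sketch. The function name rename_only_dir and its two arguments are made up for illustration. It refuses to do anything unless the parent holds exactly one (non-hidden) directory, and it does not descend into subdirectories.

```shell
#!/bin/bash
# Sketch only. rename_only_dir is a hypothetical helper: rename_only_dir PARENT NEWNAME
rename_only_dir() {
    local parent=$1 newname=$2
    local dirs=() d

    # The trailing slash makes the glob match directories only.
    # Hidden directories (names starting with a dot) are not matched.
    for d in "$parent"/*/; do
        [ -d "$d" ] && dirs+=("${d%/}")
    done

    if [ "${#dirs[@]}" -ne 1 ]; then
        echo "expected exactly one directory in $parent, found ${#dirs[@]}" >&2
        return 1
    fi

    # Nothing to do if it already has the wanted name.
    [ "${dirs[0]}" = "$parent/$newname" ] && return 0
    mv -- "${dirs[0]}" "$parent/$newname"
}
```

For the question above it would be called as rename_only_dir ~/relevant.directory.containing.directory.name.i.want.to.change correct-files.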

Program fails to move file

I'm trying to move file from one place to another directory...So my program will read Log_Deleter, use parameters given in each line to delete the file.
When I execute the file, it seems to run fine (no errors), but none of the files are moved... I'm not sure why it doesn't move the files or display any error...
Can someone please identify the error?
my attempt:
#!/bin/ksh
while read -r line ; do
v=$line
set -- $v
cd /
$(find "$1" -type f -name "$2" -mtime +"$3" -exec mv {} "$4" \;)
done < Log_Deleter.txt
Log_Deleter.txt
/usr/IBM/WebSphere/AppServer/profiles/AppSrvSIT1/logs/Server1 'SystemOut_*' 5 /backup/Abackuptest1
/usr/IBM/WebSphere/AppServer/profiles/AppSrvSIT1/logs/Server2 'SystemOut_*' 5 /backup/Abackuptest2
Thanks for your help!
Find is looking for files that have a literal ' in the name. You need to remove the single quotes from $2 before invoking find. Try:
#!/bin/ksh
while read -r path name mtime dest ; do
name=$( printf '%s\n' "$name" | tr -d "'" )
find "$path" -type f -name "$name" -mtime +"$mtime" -exec mv {} "$dest" \;
done < Log_Deleter.txt
The problem is that you are trying to match a file whose name actually has the single quotes in it.
Barring other problems, I think your script will probably work once you take the quotes out of Log_Deleter.txt.
The quotes are only meaningful when the shell is parsing command input. This is not what the read builtin does. And even when reading command input, once the quotes get into a variable they stay there forever unless reread at the shell's CLI layer via eval.
The shell is not exactly a macro processor. It's a complicated hybrid that is a little bit CLI, a little bit programming language, and a little bit macro processor.
And, speaking of eval, it's not necessary to wrap the find in an eval-like construct. Simplify your script to run find directly and you will find it easier to debug and understand.
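A small demonstration of the point about quotes, using a made-up input line in the same shape as Log_Deleter.txt. The helper name strip_single_quotes is invented for this sketch. read hands the quote characters through untouched, so they have to be removed explicitly before the pattern reaches find.

```shell
#!/bin/sh
# Sketch only. strip_single_quotes is a hypothetical helper name.
strip_single_quotes() {
    printf '%s\n' "$1" | tr -d "'"
}

# Made-up sample line; the paths are placeholders.
printf '%s\n' "/some/logs 'SystemOut_*' 5 /some/backup" |
while read -r path name mtime dest; do
    printf 'as read:  %s\n' "$name"
    printf 'stripped: %s\n' "$(strip_single_quotes "$name")"
done
# prints:
# as read:  'SystemOut_*'
# stripped: SystemOut_*
```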

Efficient way of getting listing of files in large filesystem

What is the most efficient way to get a "ls"-like output of the most recently created files in a very large unix file system (100 thousand files +)?
I have tried ls -a and some other variants.
You can also use less to search and scroll it easily.
ls -la | less
If I'm understanding your question correctly try
ls -a | tail
More information here
If the files are in a single directory, then you can use:
ls -lt | less
the -t option to ls will sort the files by modification time and less will let you scroll through them
If you want recent files across an entire file system, i.e., in different directories, then you can use the find command:
find dir -mtime -1 -print | xargs ls -ld
Substitute the directory where you want to start the search for "dir". The find command will print the names of all of the files that have been modified in the last day (-mtime -1 means modified less than one day ago; a bare -mtime 1 would match only files modified between one and two days ago) and the xargs command will take that list of files and feed it to ls, giving you the ls-like output you want
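If what is wanted is "the N newest files anywhere under a directory", sorted newest first, a sketch along these lines may help. It assumes GNU find, since -printf is not in POSIX, and the function name newest_files is made up. File names containing newlines would confuse it.

```shell
#!/bin/sh
# Sketch only: -printf is a GNU find extension.
# newest_files is a hypothetical helper: newest_files DIR COUNT
newest_files() {
    # %T@ = modification time in seconds since the epoch, %p = path.
    find "$1" -type f -printf '%T@ %p\n' |
        sort -rn |
        head -n "$2" |
        cut -d ' ' -f 2-
}
```

For example, newest_files /var/log 20 would print the 20 most recently modified files, one path per line.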

Locating most recently updated file recursively in UNIX

For a website I'm working on I want to be able to automatically update the "This page was last modified:" section in the footer as I'm doing my nightly git commit. Essentially I plan on writing a shell script to run at midnight each night which will do all of my general server maintenance. Most of these tasks I already know how to automate, but I have a file (footer.php) which is included in every page and displays the date the site was last updated. I want to be able to recursively look through my website and check the timestamp on every file, then if any of these were edited after the date in footer.php I want to update this date.
All I need is a UNIX command that will recursively iterate through my files and return ONLY the date of the last modification. I don't need file names or what changes were made, I just need to know a single day (and hopefully time) that the most recently updated file was changed.
I know using "ls -l" and "cut" I could iterate through every folder to do this, but I was hoping for a quicker-running and easier command. Preferably a single-line shell command (possibly with a -R parameter)
find outputs all the modification times in Unix epoch format (GNU find's %T@), one per line; then sort them and take the biggest.
Converting into whatever date format is wanted is left as an exercise for the reader:
find /path -type f -iname "*.php" -printf "%T@\n" | sort -n | tail -1
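As one possible take on that exercise, here is a sketch that assumes GNU find and GNU date. The function name latest_mtime is made up. It cuts the fractional seconds off the newest timestamp so that date can format it.

```shell
#!/bin/sh
# Sketch only. Assumes GNU find (-printf) and, for the example call, GNU date (-d @N).
# latest_mtime is a hypothetical helper: latest_mtime DIR PATTERN
latest_mtime() {
    find "$1" -type f -iname "$2" -printf '%T@\n' |
        sort -n |
        tail -n 1 |
        cut -d . -f 1
}

# Example call:
#   date -d "@$(latest_mtime /path '*.php')" '+%Y-%m-%d %H:%M:%S'
```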
GNU find
find /path -type f -iname "*.php" -printf "%T+\n"
check the find man page to play with other -printf specifiers.
You might want to look at an inotify script that updates the footer every time any other file is modified, instead of looking all through the file system for new updates.

Why did my use of the read command not do what I expected?

I caused some havoc on my computer when I played with the commands suggested by vezult [1]. I expected the one-liner to ask which file names to remove. However, it immediately removed the files in a folder:
> find ./ -type f | while read x; do rm "$x"; done
I expected it to wait for my typing of stdin:s [2]. I cannot understand its action. How does the read command work, and where do you use it?
What happened there is that read reads from stdin. When you put it at the end of a pipe, it reads from that pipe.
So your find becomes
file1
file2
and so on; read reads that and replaces x successively with file1 then file2, and so your loop becomes
rm "file1"
rm "file2"
and sure enough, that rm's every file starting at the current directory ".".
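A harmless way to watch this happen is to replace find with a fixed list and rm with echo. The function name below is made up for the sketch.

```shell
#!/bin/sh
# Sketch only: printf stands in for find, echo stands in for rm.
show_what_read_sees() {
    printf 'file1\nfile2\n' |
    while IFS= read -r x; do
        echo "would run: rm \"$x\""
    done
}

show_what_read_sees
# prints:
# would run: rm "file1"
# would run: rm "file2"
```

read never touches the keyboard here, because its standard input is the pipe.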
A couple hints.
You didn't need the "/".
It's better and safer to say
find . -type f
because should you happen to type ". /" (ie, dot SPACE slash) find will start at the current directory and then go look starting at the root directory. That trick, given the right privileges, would delete every file in the computer. "." is already the name of a directory; you don't need to add the slash.
The find or rm commands will do this
It sounds like what you wanted to do was go through all the files in all the directories starting at the current directory ".", and have it ASK if you want to delete it. You could do that with
find . -type f -exec rm -i {} \;
or
find . -type f -ok rm {} \;
and not need a loop at all. You can also do
rm -r -i *
and get nearly the same effect, except that it will try to delete directories too. If the directory is empty, that'll even work.
Another thought
Come to think of it, unless you have a LOT of files, you could also do
rm -i `find . -type f`
Now the find in backquotes will become a bunch of file names on the command line, and the '-i' interactive flag on rm will ask the yes or no question.
Charlie Martin gives you a good dissection and explanation of what went wrong with your specific example, but doesn't address the general question of:
When should you use the read command?
The answer to that is - when you want to read successive lines from some file (quite possibly the standard output of some previous sequence of commands in a pipeline), possibly splitting the lines into several separate variables. The splitting is done using the current value of '$IFS', which normally means on blanks and tabs (newlines don't count in this context; they separate lines). If there are multiple variables in the read command, then the first word goes into the first variable, the second into the second, ..., and the residue of the line into the last variable. If there's only one variable, the whole line goes into that variable.
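The first-word/residue rule can be seen in a tiny sketch with made-up input lines (the function name split_demo is invented here):

```shell
#!/bin/sh
# Sketch only: two variables, so the first word goes to $first
# and everything else on the line goes to $rest.
split_demo() {
    printf '%s\n' 'mode_ansi with log mode ansi' 'unlogged' |
    while read -r first rest; do
        printf '[%s] [%s]\n' "$first" "$rest"
    done
}

split_demo
# prints:
# [mode_ansi] [with log mode ansi]
# [unlogged] []
```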
There are many uses. This is one of the simpler scripts I have that uses the split option:
#!/bin/ksh
#
# @(#)$Id: mkdbs.sh,v 1.4 2008/10/12 02:41:42 jleffler Exp $
#
# Create basic set of databases
MKDUAL=$HOME/bin/mkdual.sql
ELEMENTS=$HOME/src/sqltools/SQL/elements.sql
cat <<! |
mode_ansi with log mode ansi
logged with buffered log
unlogged
stores with buffered log
!
while read dbs logging
do
    if [ "$dbs" = "unlogged" ]
    then bw=""; cw=""
    else bw="-ebegin"; cw="-ecommit"
    fi
    sqlcmd -xe "create database $dbs $logging" \
        $bw -e "grant resource to public" -f $MKDUAL -f $ELEMENTS $cw
done
The cat command with a here-document has its output sent to a pipe, so the output goes into the while read dbs logging loop. The first word goes into $dbs and is the name of the (Informix) database I want to create. The remainder of the line is placed into $logging. The body of the loop deals with unlogged databases (where begin and commit do not work), then runs a program sqlcmd (completely separate from the Microsoft newcomer of the same name; it's been around since about 1990) to create a database and populate it with some standard tables and data - a simulation of the Oracle 'dual' table, and a set of tables related to the 'table of elements'.
Other scripts that use the read command are bigger (by far), but generally read lines containing one or more file names and some other attributes of relevance, and then apply an appropriate transform to the files using the attributes.
Osiris JL: file * | grep 'sh.*script' | sed 's/:.*//' | xargs wgrep read
esqlcver:read version letter
jlss: while read directory
jlss: read x || exit
jlss: read x || exit
jlss: while read file type link owner group perms
jlss: read x || exit
jlss: while read file type link owner group perms
kb: while read size name
mkbod: while read directory
mkbod:while read dist comp
mkdbs:while read dbs logging
mkmsd:while read msdfile master
mknmd:while read gfile sfile version notes
publictimestamp:while read name type title
publictimestamp:while read name type title
Osiris JL:
'Osiris JL: ' is my command line prompt; I ran this in my 'bin' directory. 'wgrep' is a variant of grep that only matches entire words (to avoid words like 'already'). This gives some indication of how I've used it.
The 'read x || exit' lines are for an interactive script that reads a response from standard input, but exits if the command gets EOF (for example, if standard input comes from /dev/null).
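The EOF behaviour can be sketched like this. The function name confirm_or_quit is made up, and it uses return rather than exit so that it can be tried in an interactive shell without closing it.

```shell
#!/bin/sh
# Sketch only: read returns non-zero at end of input, so "||" catches EOF.
confirm_or_quit() {
    printf 'Continue? ' >&2
    read -r x || { echo "EOF, giving up"; return 1; }
    echo "answer: $x"
}
```

Run as confirm_or_quit < /dev/null it prints "EOF, giving up"; run as echo yes | confirm_or_quit it prints "answer: yes".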

Resources