Delete files with a specific pattern using a script in UNIX

I have some folders in Unix, let's say aa, ab, ac and so on, with subfolders inside them numbered 100, 200 and so on. I want to delete some subfolders in each of these main folders: the subfolders to be deleted are those greater than a specific number (say, anything above 700). How can I do this using a script? Please help.

I would use the find command. You can do something like this:
find . -name '[7-9][0-9][0-9]' -execdir echo rm -vr {} +
Of course, you may need to tweak the pattern to hit the right names, but I would need more information to help with that.
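For instance, if "above 700" should exclude 700 itself, here is a sketch (assuming the subfolder names are exactly three digits) that restricts the match to directories and spells out the 701-999 range:
find . -type d \( -name '70[1-9]' -o -name '7[1-9][0-9]' -o -name '[89][0-9][0-9]' \) -execdir echo rm -vr {} +
As above, echo makes this a dry run; drop it once the listed directories look right.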

#!/bin/bash
if [ $# -ne 2 ]
then
    echo "Usage: $0 searchdir limit"
    exit 1
fi
searchdir="$1"
limit="$2"
find "$searchdir" -type d |
egrep "/[0-9]+$" |
while read -r dirname
do
    num=$(basename "$dirname")
    if [ "$num" -ge "$limit" ]
    then
        echo rm -rf "$dirname"
    fi
done
Run with:
./script.sh dirtosearch thresholdfordelete
When you're sure it's ok, remove the echo before rm -rf

You can do it all using find.
In the following command, find passes the files to sh which checks if they are >700 and if so echoes out a delete. (You can obviously remove the echo if you really want to delete.)
find . -type d -regex ".*/[0-9]+$" -exec sh -c '[ "$(basename "$1")" -gt 700 ] && echo rm -rf "$1"' sh {} \;
(The directory name is passed to sh as a positional parameter rather than substituted into the script text, which avoids quoting problems and command injection on unusual names.)

Sorted list of folders (by size) of a specific user

I need a command to get a list of all folders owned by a specific user.
Is there a way to get them sorted by size and in addition with the total sum of all sizes?
My recent try was:
#!/bin/bash
read -p "Enter user: " username
for folder in $(find /path/ -maxdepth 4 -user $username) ; do
du -sh $folder | sort -hr
done
Thanks in advance!
Try this (version 1, incomplete, see version 2 below!)
#!/bin/bash
tmp_du=$(mktemp)
path="/path"
find "$path" -maxdepth 4 -type d -print0 | while IFS= read -r -d '' directory
do
du -sh "$directory" >>"$tmp_du"
done
sort -hr "$tmp_du"
echo ""
echo "TOTAL"
du -sh "$path"
rm -f "$tmp_du"
The find with -print0 is explained here: https://mywiki.wooledge.org/BashFAQ/001
Since you want the sizes of all the directories sorted together, the results are first collected in a file and then sorted at the end.
Version 2, with the -user test I forgot in my first answer, plus a total that takes into account only that user's directories:
#!/bin/bash
read -rp "Enter user: " username
tmp_du=$(mktemp)
path="/path"
# Get a list of directories owned by "$username", and their size, store it in a file
find "$path" -maxdepth 4 -type d -user "$username" -exec du -s {} \; 2>/dev/null | sort -n >"$tmp_du"
# Add the first column of that file
sum=$( awk '{ sum+=$1 } END { print sum; }' "$tmp_du" )
# Output, converting the first column (du's 1 KiB blocks) to human-readable format
numfmt --field=1 --from-unit=1024 --to=iec <"$tmp_du"
# Output total (same 1 KiB input unit)
echo "Total: $(echo "$sum" | numfmt --from-unit=1024 --to=iec)"
rm -f "$tmp_du"
Here all the directory sizes are stored in a file; the total is the sum of the first column; and numfmt renders the numbers in a format similar to du -h.
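A caveat on units, since it is easy to get wrong: GNU du -s reports 1 KiB blocks by default, while numfmt assumes plain units unless told otherwise, which is why the script passes --from-unit=1024. A quick sanity check of that conversion:
echo 2048 | numfmt --from-unit=1024 --to=iec    # prints 2.0M (2048 KiB = 2 MiB)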

How to clean up Graphite's whisper data?

I want to delete Graphite's stored whisper data, but there isn't anything about it in the Graphite docs.
One way I did it is deleting the files at /opt/graphite...../whispers/stats... manually.
But this is tedious, so how do I do it?
Currently, deleting files from /opt/graphite/storage/whisper/ is the correct way to clean up whisper data.
As for the tedious side of the process, you could use the find command if there is a certain pattern that you're trying to remove.
find /opt/graphite/storage/whisper -name loadavg.wsp -delete
Similar Question on answers.launchpad.net/graphite
I suppose that this is going into Server Fault territory, but I added
the following cron job to delete old metrics of ours that haven't been
written to for over 30 days (e.g. of cloud instances that have been
disposed):
find /mnt/graphite/storage -mtime +30 | grep -E \
"/mnt/graphite/storage/whisper/collectd/app_name/[^/]*" -o \
| uniq | xargs rm -rf
Beware that this can delete directories which still have valid data.
First:
find whisperDir -mtime +30 -type f | xargs rm
And then delete empty dirs
find . -type d -empty | xargs rmdir
This last step should be repeated, because removing one level of empty directories may leave their parents newly empty.
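With GNU find the repetition can be avoided: the -delete action implies -depth, so children are visited before their parents and directories that become empty during the run are removed in the same pass:
find whisperDir -mtime +30 -type f -delete
find whisperDir -type d -empty -delete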
As people have pointed out, removing the files is the way to go. Expanding on previous answers, I made this script that removes any file that has exceeded its max retention age. Run it as a cronjob fairly regularly.
#!/bin/bash
d=$1
now=$(date +%s)
MINRET=86400
if [ -z "$d" ]; then
    echo "Must specify a directory to clean" >&2
    exit 1
fi
find "$d" -name '*.wsp' | while read -r w; do
    age=$((now - $(stat -c '%Y' "$w")))
    if [ "$age" -gt "$MINRET" ]; then
        retention=$(whisper-info.py "$w" maxRetention)
        if [ "$age" -gt "$retention" ]; then
            echo "Removing $w ($age > $retention)"
            rm "$w"
        fi
    fi
done
find "$d" -empty -type d -delete
A couple of bits to be aware of - the whisper-info call is quite heavyweight. To reduce the number of calls to it I've put the MINRET constant in, so that no file will be considered for deletion until it is 1 day old (24*60*60 seconds) - adjust to fit your needs. There are probably other things that can be done to shard the job or generally improve its efficiency, but I haven't had need to as yet.
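One such improvement, offered only as a sketch under an assumption the answer above does not make: storage-schemas.conf rules typically apply uniformly to all the .wsp files under a given directory, so you could cache one whisper-info.py result per directory instead of calling it once per file (requires bash 4+ for the associative array):
#!/bin/bash
# Sketch: cache maxRetention per directory to reduce whisper-info.py calls.
# ASSUMPTION: all .wsp files in one directory share the same retention rule.
d=$1
now=$(date +%s)
declare -A cache
find "$d" -name '*.wsp' | while read -r w; do
    dir=$(dirname "$w")
    if [ -z "${cache[$dir]}" ]; then
        cache[$dir]=$(whisper-info.py "$w" maxRetention)
    fi
    age=$((now - $(stat -c '%Y' "$w")))
    if [ "$age" -gt "${cache[$dir]}" ]; then
        echo "Would remove $w (age $age > retention ${cache[$dir]})"
    fi
done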

Shell script to process files

I need to write a shell script to process a huge folder nearly 20 levels deep. I have to process each and every file and check which files contain lines like
select
insert
update
By a line I mean the text up to the first semicolon in that file.
I should get a result like this:
C:/test.java select * from dual
C:/test.java select * from test
C:/test1.java select * from tester
C:/test1.java select * from dual
and so on. Right now I have a script that reads all the files:
#!/bin/ksh
FILE=<FILEPATH to be traversed>
TEMPFILE=<Location of Temp file>
cd $FILE
for f in `find . ! -type d`;
do
    cat $FILE/addedText.txt >> $TEMPFILE/newFile.txt
    cat $f >> $TEMPFILE/newFile.txt
    rm $f
    cat $TEMPFILE/newFile.txt >> $f
    rm $TEMPFILE/newFile.txt
done
I have very little knowledge of awk and sed, which I would need to proceed further in reading each file and achieving what I want. Can anyone help me with this?
If you have GNU find/gawk:
find /path -type f -name "*.java" | while read -r FILE
do
    awk -v file="$FILE" 'BEGIN{RS=";"}
    /select|update|insert/{
        b=gensub(/(.*)(select|update|insert)(.*)/,"\\2\\3","g",$0)
        gsub(/\n+/,"",b)
        print file,b
    }
    ' "$FILE"
done
If you are on Solaris, use nawk:
find /path -type f -name "test*file" | while read -r FILE
do
    nawk -v file="$FILE" 'BEGIN{RS=";"}
    /select/{ gsub(/.*select/,"select"); gsub(/\n+/,""); print file,$0; }
    /update/{ gsub(/.*update/,"update"); gsub(/\n+/,""); print file,$0; }
    /insert/{ gsub(/.*insert/,"insert"); gsub(/\n+/,""); print file,$0; }
    ' "$FILE"
done
Note this is a simplistic case; your SQL statements might be more complicated.
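As a rough illustration of the GNU version's output, suppose a hypothetical Test.java under /path contains:
String q1 = "select * from dual;";
String q2 = "insert into test values (1);";
With RS=";" each statement ends its own record, so the script would print something like:
/path/Test.java select * from dual
/path/Test.java insert into test values (1
(The second line stops at the semicolon inside the string literal, which is exactly the simplistic-case limitation noted above.)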

Unix Find Replace Special Characters in Multiple Files

I've got a set of files in a web root that all contain special characters that I'd like to remove (Â, €, â, etc.).
My command
find . -type f -name '*.*' -exec grep -il "Â" {} \;
finds & lists out the files just fine, but my command
find . -type f -name '*.*' -exec tr -d 'Â' '' \;
doesn't produce the results I'm looking for.
Any thoughts?
To replace all non-ASCII characters in all files inside the current directory you could use:
find . -type f | xargs perl -pi.bak -e 's,[^[:ascii:]],,g'
Afterwards you will have to find and remove all the '.bak' files:
find . -type f -a -name \*.bak | xargs rm
I would recommend looking into sed. It can be used to replace the contents of the file.
So you could use the command:
find . -type f -name '*.*' -exec sed -i "s/Â//g" {} \;
I have tested this with a simple example and it seems to work. The -exec should handle files with whitespace in their name, but there may be other vulnerabilities I'm not aware of.
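One portability note: the -i flag shown here works as-is on GNU sed, but BSD/macOS sed requires an explicit (possibly empty) backup suffix after -i:
find . -type f -name '*.*' -exec sed -i '' "s/Â//g" {} \;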
Use
tr -d 'Â'
What does the '' stand for? On my system, using your command produces this error:
tr: extra operand `'
Only one string may be given when deleting without squeezing repeats.
Try `tr --help' for more information.
sed 's/ø//' file.txt
That should do the trick for replacing a special char with an empty string.
find . -name "*.*" -exec sed 's/ø//' {} \
It would be helpful to know what "doesn't produce the results I'm looking for" means. However, in your command tr is never given the file contents: tr reads only standard input and accepts no filename operands, so each file has to be redirected into it, for example:
find . -type f -name '*.*' -exec sh -c 'tr -d "Â" < "$1"' sh {} \;
Which is going to output everything to stdout. You probably want to modify the files instead. You can use Grundlefleck's answer, but one of the issues alluded to in that answer is if there are large numbers of files. You can do this:
find . -type f -name '*.*' -print0 | xargs -0 sed -i "s/Â//g"
which should handle files with spaces in their names as well as large numbers of files.
With the bash shell (note this strips the characters from the file names, not the file contents):
for file in *.*
do
    case "$file" in
        *[^[:ascii:]]* )
            mv "$file" "${file//[^[:ascii:]]/}"
            ;;
    esac
done
I would use something like this.
for file in `find . -type f`
do
    # Search for the chars and remove them. Save the result as file.new
    sed -e 's/[Â€â]//g' $file > $file.new
    # mv file.new to file. DON'T RUN THIS IF YOU DO NOT WANT TO OVERWRITE THE ORIGINAL FILE
    mv $file.new $file
done
The above script will fail, as levislevis85 mentioned, with spaces in filenames. That would not be the case if you use the following code.
find . -type f | while read -r file
do
    # Search for the chars and remove them. Save the result as file.new
    sed -e 's/[Â€â]//g' "$file" > "$file".new
    # mv file.new to file. DON'T RUN THIS IF YOU DO NOT WANT TO OVERWRITE THE ORIGINAL FILE
    mv "$file".new "$file"
done

With Unix find(1), how do I find files in one tree newer than counterparts in another tree?

Let's say I have two directory structures:
/var/www/site1.prod
/var/www/site1.test
I want to use find(1) to see the files that are newer in /var/www/site1.test than their counterparts in /var/www/site1.prod.
Can I do this with find(1), and if so, how?
You could also use rsync with -n (dry run):
rsync -avn /var/www/site1.test/ /var/www/site1.prod/
should do it. Note the trailing slashes: without them, rsync treats site1.test as a directory to place inside site1.prod, so it would compare against /var/www/site1.prod/site1.test rather than against the prod tree itself.
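Two hedged refinements on the rsync route: -u (--update) skips files that are newer on the receiving (prod) side, so the dry run roughly matches the "newer in test" requirement, and -i (--itemize-changes) prints a per-file change summary. A sketch:
rsync -avnui /var/www/site1.test/ /var/www/site1.prod/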
Using find,
cd /var/www/site1.test
find . -type f -print | while read -r file ; do
    [ "$file" -nt /var/www/site1.prod/"$file" ] && echo "File '$file' changed"
done
This will work with filenames containing blanks, as well as work for a large volume of files.
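If the filenames may also contain leading whitespace or backslashes, a more defensive variant of the same loop (assuming bash) uses NUL-delimited names:
cd /var/www/site1.test
find . -type f -print0 | while IFS= read -r -d '' file ; do
    [ "$file" -nt /var/www/site1.prod/"$file" ] && echo "File '$file' changed"
done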
To distinguish between modified and missing files, as per Eddie's comments,
cd /var/www/site1.test
find . -type f -print | while read -r file ; do
    reason=""
    [ "$file" -nt /var/www/site1.prod/"$file" ] && reason=changed
    [ ! -e /var/www/site1.prod/"$file" ] && reason=created
    [ -n "$reason" ] && echo "$file $reason"
done
I think you can't do it with find alone, but you can do something like
$ cd /var/www/site1.test
$ files=`find .`
$ for f in $files; do
      if [ "$f" -nt /var/www/site1.prod/"$f" ]; then echo "$f changed!"; fi
  done
If you look at the -fprintf option you should be able to create two finds that output into two files that list the files name and modification time. You should then be able to just diff the two files to see what has changed.
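A sketch of that idea with GNU find (using -printf piped through sort so the two listings line up; -fprintf would write each file directly instead):
find /var/www/site1.prod -type f -printf '%P %T@\n' | sort > /tmp/prod.list
find /var/www/site1.test -type f -printf '%P %T@\n' | sort > /tmp/test.list
diff /tmp/prod.list /tmp/test.list
%P prints the path relative to the starting directory and %T@ the modification time in seconds, so diff shows files that exist in only one tree or whose timestamps differ.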
I understand that you specifically asked for newer, but I think it's always good to know all your options.
I would use diff -ur dir1 dir2.
Maybe the timestamp changed, maybe it didn't, but that doesn't necessarily tell you whether the contents are the same. diff will tell you if the contents changed. If you really don't want to see the contents, use diff -rq to just see the files that changed.
