Shell script to process files - unix

I need to write a shell script to process a huge directory tree, nearly 20 levels deep. I have to process each and every file and check which files contain lines like
select
insert
update
By "line" I mean it should take the text up to the next semicolon in the file.
I should get a result like this:
C:/test.java select * from dual
C:/test.java select * from test
C:/test1.java select * from tester
C:/test1.java select * from dual
and so on. Right now I have a script that reads all the files:
#!/bin/ksh
FILE=<FILEPATH to be traversed>
TEMPFILE=<Location of Temp file>
cd "$FILE"
for f in $(find . ! -type d); do
    # Prepend addedText.txt to each file by way of a temp file
    cat "$FILE/addedText.txt" >> "$TEMPFILE/newFile.txt"
    cat "$f" >> "$TEMPFILE/newFile.txt"
    rm "$f"
    cat "$TEMPFILE/newFile.txt" >> "$f"
    rm "$TEMPFILE/newFile.txt"
done
I have very little knowledge of awk and sed, so I don't know how to proceed with reading each file and extracting what I want. Can anyone help me with this?

If you have GNU find/gawk:
find /path -type f -name "*.java" | while read -r FILE
do
awk -vfile="$FILE" 'BEGIN{RS=";"}
/select|update|insert/{
b=gensub(/(.*)(select|update|insert)(.*)/,"\\2\\3","g",$0)
gsub(/\n+/,"",b)
print file,b
}
' "$FILE"
done
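As a quick sanity check, here is the same awk program run against a hypothetical throwaway file (the path /tmp/sample.java and its contents are made up for illustration):
printf 'x; select * from\ndual; insert into t values (1);\n' > /tmp/sample.java
gawk -v file=/tmp/sample.java 'BEGIN{RS=";"}
/select|update|insert/{
    b=gensub(/(.*)(select|update|insert)(.*)/,"\\2\\3","g",$0)
    gsub(/\n+/," ",b)
    print file,b
}' /tmp/sample.java
# prints:
# /tmp/sample.java select * from dual
# /tmp/sample.java insert into t values (1)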
If you are on Solaris, use nawk:
find /path -type f -name "test*file" | while read -r FILE
do
nawk -v file="$FILE" 'BEGIN{RS=";"}
/select/{ gsub(/.*select/,"select");gsub(/\n+/,"");print file,$0; }
/update/{ gsub(/.*update/,"update");gsub(/\n+/,"");print file,$0; }
/insert/{ gsub(/.*insert/,"insert");gsub(/\n+/,"");print file,$0; }
' "$FILE"
done
Note this is a simplistic case; your SQL statements might be more complicated.

Related

Trouble listing directories that contain files with specific file extensions

How do I list only directories that contain certain files? I am running on a Solaris box. For example, I want to list the sub-directories of directory ABC that contain files ending in .out, .dat and .log.
Thanks
Something along these lines might work out for you:
find ABC/ \( -name "*.out" -o -name "*.dat" -o -name "*.log" \) -print | while read f
do
    echo "${f%/*}"    # strip the last path component, leaving the directory
done | sort -u
The sort -u bit could be just uniq instead when find lists each directory's files consecutively, but sort -u is the safer bet.
Should work on bash or ksh. Probably not so much on /bin/sh - you'd have to replace the variable expansion with something like echo "${f}" | sed -e 's;/[^/]*$;;' or something else that would strip off the last component of the path. dirname "${f}" would be good for that, but I don't recall if Solaris includes that utility...
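For what it's worth, a minimal sketch of a /bin/sh-friendly version of the same loop using dirname (a standard POSIX utility, so it should be available on Solaris):
find ABC/ \( -name "*.out" -o -name "*.dat" -o -name "*.log" \) -print | while read f
do
    dirname "$f"    # prints the containing directory of each match
done | sort -u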

How do I recursively insert two lines in all files of my directory where they are not present?

I have a directory customer with many customers in it.
Now I want to add two lines to each process_config file within the customer directory where they are not already present.
For example:
/home/sam/customer/a1/na/process_config.txt
/home/sam/customer/p1/emea/process_config.txt
and so on.
Is this possible with a single command like find and sed?
With a simple for loop:
for file in /home/sam/customer/*/*/process_config.txt; do
    printf "one line\nanother line\n" >> "$file"
done
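Note that this appends unconditionally. To honour the "where it was not present" part of the question, one could guard the append with grep — a sketch that assumes checking for the first of the two lines is enough to detect both:
for file in /home/sam/customer/*/*/process_config.txt; do
    # -F: match a fixed string, -q: quiet; append only when the marker line is absent
    grep -qF "one line" "$file" || printf "one line\nanother line\n" >> "$file"
done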
find /home/sam/customer -name 'process_config.txt' -exec DoYourAddWithSedAwkEchoOrWhatever {} \;
find gives you the possibility to select each wanted (selected) file.
The -exec option runs your command on each such file.
{} is the file name (full path) in this case.
Use \; to mark the end of the command for each iteration. Note that a redirection like >> inside -exec (e.g. -exec echo 'line1' >> {} \;) is interpreted by your current shell, not by find, so chaining commands that way needs an explicit shell, as in the sketch below.
sed, awk or echo, as in the sample, can modify the file.
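A find-based sketch combining the same grep guard with -exec (the explicit sh -c is what makes the >> redirection apply per file rather than being grabbed by your interactive shell):
find /home/sam/customer -name 'process_config.txt' -exec sh -c '
    # $1 is the file name handed over by find
    grep -qF "one line" "$1" || printf "one line\nanother line\n" >> "$1"
' sh {} \;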

Search files and run a script on every result - Cont:

I would like to know how to search for files matching a certain pattern (gzip files) in all sub-directories (sub-directories are created month-wise/date-wise).
Then I need to execute a script on the files found, and also include the FILENAME in the output for tracking purposes and further analysis of those particular files.
Step 1: For example, I am currently searching for files matching the pattern TT_DETAIL*.gz:
find /cygdrive/c/Test/ -name "TT_DETAIL*.gz"
Output #1:
/cygdrive/c/Test/Feb2014/TT_DETAIL_20141115.csv.gz
/cygdrive/c/Test/Jan2014/TT_DETAIL_20141110.csv.gz
/cygdrive/c//Test/Mar2014/TT_DETAIL_20141120.csv.gz
Step 2:
zcat TT_DETAIL*.gz | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >Op_TT_Detail.txt
cat Op_TT_Detail.txt
ZZZ,AAA,ECH,1,10,XXX
ZZZ,BBB,ECH,1,10,XXX
ZZZ,CCC,ECH,1,10,XXX
ZZZ,DDD,ECH,1,10,XXX
Thanks fedorqui, the script below works fine, but without the FILENAME:
while IFS= read -r file
do
    awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' <(zcat "$file") >>Op_TT_Detail.txt
done < <(find /cygdrive/c/Test/ -name "TT_DETAIL*.gz")
I have tried the command below to include the FILENAME in the output for tracking purposes:
while IFS= read -r file
do
    awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17,FILENAME}' <(zcat "$file") >>Op_TT_Detail.txt
done < <(find /cygdrive/c/Test/ -name "TT_DETAIL*.gz")
Desired Output:
ZZZ,AAA,ECH,1,10,XXX,/cygdrive/c/Test/Feb2014/TT_DETAIL_20141115.csv.gz
ZZZ,BBB,ECH,1,10,XXX,/cygdrive/c/Test/Feb2014/TT_DETAIL_20141115.csv.gz
ZZZ,CCC,ECH,1,10,XXX,/cygdrive/c//Test/Mar2014/TT_DETAIL_20141120.csv.gz
ZZZ,DDD,ECH,1,10,XXX,/cygdrive/c//Test/Mar2014/TT_DETAIL_20141120.csv.gz
Since FILENAME does not work for *.gz files, should I write the output of find /cygdrive/c/Test/ -name "TT_DETAIL*.gz" into another file and then feed that file into the script? I don't have write access to the source files on the server.
Looking forward to your suggestions!
Nice to see you are using the snippet I wrote in the previous question!
I would use this:
while IFS= read -r file
do
    awk -v file="$file" 'BEGIN { FS=OFS=","} \
    { if ($11=="10") print $2,$3,$6,$10,$11,$17, file}' \
    <(zcat "$file") >>Op_TT_Detail.txt
done < <(find /cygdrive/c/Test/ -name "TT_DETAIL*.gz")
That is, with -v file="$file" you give the file name to awk as a variable, and then you use it in your print command. (FILENAME itself cannot work here: awk is reading from the process substitution, so FILENAME holds something like /dev/fd/63 rather than the .gz path.)
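If any of the paths could ever contain spaces, here is a sketch of an equivalent loop that pipes zcat straight into awk and walks NUL-delimited file names instead (assumes GNU find and bash):
find /cygdrive/c/Test/ -name "TT_DETAIL*.gz" -print0 |
while IFS= read -r -d '' file
do
    # zcat decompresses to stdout; awk tags each matching row with the file name
    zcat "$file" | awk -v file="$file" 'BEGIN{FS=OFS=","} $11=="10"{print $2,$3,$6,$10,$11,$17,file}'
done >> Op_TT_Detail.txt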

Delete files with a specific pattern using script in UNIX

I have some folders in Unix, let's say aa, ab, ac and so on. I have subfolders inside these folders, numbered like 100, 200 and so on. I want to delete some subfolders in each of these main folders. The subfolders to be deleted must be greater than a specific number (say anything above 700). How can I do this using a script? Please help.
I would use the find command. You can do something like this:
find . -type d -name '[7-9][0-9][0-9]' -execdir echo 'rm -vr' {} +
Of course, you may need to tweak the pattern to hit the right names, but I would need more information to help with that.
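For instance, one way to tweak it so it matches strictly above 700 — a sketch, with the globs decomposing into 701-709, 710-799, 800-999, and 1000 and up:
find . -type d \( -name '70[1-9]' -o -name '7[1-9][0-9]' \
    -o -name '[89][0-9][0-9]' -o -name '[1-9][0-9][0-9][0-9]*' \) \
    -execdir echo 'rm -vr' {} +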
#!/bin/bash
if [ $# -ne 2 ]
then
    echo "Usage: $0 searchdir limit"
    exit 1
fi
searchdir="$1"
limit="$2"
find "$searchdir" -type d |
egrep "/[0-9]+$" |
while read dirname
do
    num=$(basename "$dirname")
    if [ "$num" -ge "$limit" ]
    then
        echo rm -rf "$dirname"
    fi
done
Run with:
./script.sh dirtosearch thresholdfordelete
When you're sure it's OK, remove the echo before rm -rf.
You can do it all using find.
In the following command, find passes the directories to sh, which checks whether the numeric name is greater than 700 and if so echoes out a delete. (You can obviously remove the echo if you really want to delete.)
find . -type d -regex "^.*/[0-9]+$" -exec sh -c '[ "$(basename "$1")" -gt 700 ] && echo rm -rf "$1"' sh {} \;

Unix Find Replace Special Characters in Multiple Files

I've got a set of files in a web root that all contain special characters that I'd like to remove (Â, €, â, etc.).
My command
find . -type f -name '*.*' -exec grep -il "Â" {} \;
finds & lists out the files just fine, but my command
find . -type f -name '*.*' -exec tr -d 'Â' '' \;
doesn't produce the results I'm looking for.
Any thoughts?
To replace all non-ASCII characters in all files under the current directory you could use:
find . -type f | xargs perl -pi.bak -e 's,[^[:ascii:]],,g'
Afterwards you will have to find and remove all the .bak backup files:
find . -type f -a -name \*.bak | xargs rm
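Both pipelines will mangle paths that contain whitespace; where GNU find and xargs are available, NUL-delimited variants are safer — a sketch:
find . -type f -print0 | xargs -0 perl -pi.bak -e 's,[^[:ascii:]],,g'
find . -type f -name '*.bak' -print0 | xargs -0 rm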
I would recommend looking into sed. It can be used to replace the contents of the file.
So you could use the command:
find . -type f -name '*.*' -exec sed -i "s/Â//g" {} \;
I have tested this with a simple example and it seems to work. The -exec should handle files with whitespace in their name, but there may be other vulnerabilities I'm not aware of.
Use
tr -d 'Â'
What does the '' stand for? On my system, using your command produces this error:
tr: extra operand `'
Only one string may be given when deleting without squeezing repeats.
Try `tr --help' for more information.
sed 's/ø//g' file.txt
That should do the trick for replacing a special char with an empty string (the g flag removes every occurrence, not just the first per line).
find . -name "*.*" -exec sed -i 's/ø//g' {} \;
It would be helpful to know what "doesn't produce the results I'm looking for" means. However, in your command tr never sees the file contents: tr reads only standard input, so each file has to be redirected into it, for example:
find . -type f -name '*.*' -exec sh -c 'tr -d "Â" < "$1"' sh {} \;
Which is going to output everything to stdout. You probably want to modify the files instead. You can use Grundlefleck's answer, but one of the issues alluded to in that answer is handling large numbers of files. You can do this:
find . -type f -name '*.*' -print0 | xargs -0 sed -i "s/Â//g"
which should handle files with spaces in their names as well as large numbers of files.
With the bash shell (note this strips non-ASCII characters from file names, not file contents):
for file in *.*
do
    case "$file" in
        *[^[:ascii:]]* )
            mv "$file" "${file//[^[:ascii:]]/}"
            ;;
    esac
done
I would use something like this.
for file in `find . -type f`
do
    # Search for the special char and remove it; save the result as file.new
    sed -e 's/[Â]//g' $file > $file.new
    # Move file.new over the original. DON'T RUN THIS UNLESS YOU WANT TO OVERWRITE THE ORIGINAL FILE
    mv $file.new $file
done
As levislevis85 has mentioned, the above script will fail on filenames containing spaces. That would not be the case if you use the following code:
find . -type f | while read file
do
    # Search for the special char and remove it; save the result as file.new
    sed -e 's/[Â]//g' "$file" > "$file".new
    # Move file.new over the original. DON'T RUN THIS UNLESS YOU WANT TO OVERWRITE THE ORIGINAL FILE
    mv "$file".new "$file"
done
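For completeness, a sketch that also survives file names with leading whitespace or embedded backslashes, using NUL delimiters (assumes GNU find and bash):
find . -type f -print0 | while IFS= read -r -d '' file
do
    sed -e 's/[Â]//g' "$file" > "$file".new
    mv "$file".new "$file"
done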
