find and then grep and then iterate through list of files - unix

I have the following script to replace text.
grep -l -r "originaltext" . |
while read fname
do
sed 's/originaltext/replacementText/g' $fname > tmp.tmp
mv tmp.tmp $fname
done
Now, in the first statement of this script, I want to do something like this:
find . -name '*.properties' -exec grep "originaltext" {} \;
How do I do that?
I work on AIX, so grep's --include option wouldn't work.

In general, I prefer to use find to FIND files rather than grep. It looks obvious : )
Using process substitution you can feed the while loop with the result of find:
while IFS= read -r fname
do
sed 's/originaltext/replacementText/g' "$fname" > tmp.tmp
mv tmp.tmp "$fname"
done < <(find . -name '*.properties' -exec grep -l "originaltext" {} \;)
Note I use grep -l (lowercase L) so that grep just returns the names of the files matching the pattern.
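Process substitution needs bash, ksh93 or zsh. Where it is not available, a plain pipe into the loop does the same job. The following is only a sketch: the function name replace_in_properties is made up for illustration, and it assumes the search and replacement strings contain no / and no regex metacharacters.

```shell
# Hypothetical helper: replace $2 with $3 in every *.properties file under $1.
# Assumes $2 and $3 are plain strings (no "/" and no regex metacharacters).
replace_in_properties() {
    dir=$1 old=$2 new=$3
    # "{} +" hands many files to each grep call; -l prints only matching names.
    find "$dir" -type f -name '*.properties' -exec grep -l -e "$old" {} + |
    while IFS= read -r fname
    do
        tmp="$fname.tmp.$$"
        # Write to a temporary file next to the original, then move it into place.
        sed "s/$old/$new/g" "$fname" > "$tmp" && mv "$tmp" "$fname"
    done
}
```

Called as replace_in_properties . originaltext replacementText, it should behave like the loop above. File names containing newlines are still not handled.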

You could go the other way round and give the list of '*.properties' files to grep. For example
grep -l "originaltext" `find . -name '*.properties'`
Oh, and if you have GNU grep (as on most Linux distributions), it has an option that does this without building that long list of files as arguments:
grep -l "originaltext" --include='*.properties' -r .

Related

Combine find, grep and xargs with printf

I have a find command combined with exec grep and a printf option :
find -L /home/blast/dirtest -maxdepth 3 -exec grep -q "pattern" {} \; -printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
Result :
f/#/2018-01-01 10:00:00/#/191/#/filee.xml/#//#//home/blast/dirtest/01/05
I need the printf to get all the desired file information at once (date, type, size, etc.).
The above command works fine, but the -exec option is too slow compared to xargs.
I tried to do the same with xargs but did not succeed.
Any idea how to achieve that using xargs while keeping the desired printf or something similar?
Thanks
Your code is:
find -L /home/blast/dirtest -maxdepth 3 \
-exec grep -q "pattern" {} \; \
-printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
This invokes a new grep process for each file.
If you are using GNU utilities, you can reduce the number of grep processes by something like:
(
format=\''%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n'\'
find -L /home/blast/dirtest -maxdepth 3 -print0 |\
xargs -0 grep -l -Z "pattern" |\
xargs -0 sh -c 'find "$@" -printf '"$format" --
) 2>/dev/null
for clarity, store the formatstring in a variable
use -print0 / -0 / -Z options to enable null-delimited data
generate initial filelist with find
filter on "pattern" with grep (use of xargs minimises the number of times grep gets called)
feed the filtered filelist into another xargs to run a minimal number of find -printf
in second xargs, call a subshell so that extra arguments can be appended (find requires the paths to precede the operators)
dummy second argument (--) to the sh -c invocation prevents the first filename being lost due to assignment to $0
To do it exactly how you want:
find -L /home/blast/dirtest/ -maxdepth 3 \
-printf '%p#%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' \
> tmp.out
cut -d# -f1 tmp.out \
| xargs grep -l "pattern" 2>/dev/null \
| sed 's/^/^/; s/$/#/' \
| grep -f /dev/stdin tmp.out \
| sed 's/^.*#//'
This operates under the assumption that you have no character # in your file names.
What it does is avoid the grep at first and just dump all the files with the requested metadata to a temporary file.
But it also prefixes each line with the full path (%p#).
Then we extract (cut) the full paths out of this list and list the files which contains the pattern (xargs grep).
We then use sed to prefix each such file name with ^ and suffix it with #, which makes it a greppable pattern in our tmp.out file.
Then we use this pattern (grep -f /dev/stdin) to extract only those paths from the big list in tmp.out.
Now all that's left is to remove the artificial full path we prefixed using the last sed command.
Seeing how you used /home, there's a good chance you're on Linux, which, if you're willing to accept some output format changes, allows you to do it somewhat more elegantly:
find -L /home/blast/dirtest/ -maxdepth 3 \
| xargs grep -l "pattern" 2>/dev/null \
| xargs stat --printf '%F/#/%y/#/%s/#/%n\n'
The output of stat --printf is different from that of find -printf (and from that of MacOS' stat -f), but it's the same information.
Do note, however, that because you passed -L to find, and you're grepping the result:
The results are limited to file types which can be grepped, so they will never be directories, links, etc..
If you stumble upon a broken link, it will not be in the output because it cannot be grepped.
I've found an interesting thing about the -exec option.
We can run grep far fewer times by ending -exec with a plus sign (+):
-exec command {} +
This variant of the -exec option runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of '{}' is allowed within the command. The command is executed in the starting directory.
That means changing this:
-exec grep -l 'pattern' {} \;
to this (replacing the semicolon with a plus sign):
-exec grep -l 'pattern' {} +
improves performance significantly.
Then I need only a single xargs stage, for the formatted printing.
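Put together, the idea might look like the sketch below. It assumes GNU find, grep and xargs (-Z, -0, -r and -printf are GNU extensions). The function name list_matching, the added -type f, and the shortened format string are illustrative choices, not part of the original command.

```shell
# Hypothetical helper: print metadata for files under $1 whose content matches $2.
# Assumes GNU find, grep and xargs.
list_matching() {
    dir=$1 pattern=$2
    # Stage 1: find batches the files into as few grep calls as possible;
    #          grep -lZ prints the matching names separated by NUL bytes.
    # Stage 2: xargs passes those names to a second find, which only formats
    #          them (-maxdepth 0 stops it from descending into anything).
    find -L "$dir" -maxdepth 3 -type f -exec grep -lZ -e "$pattern" {} + |
    xargs -0 -r sh -c 'find -L "$@" -maxdepth 0 -printf "%y/#/%s/#/%f/#/%h\n"' sh
}
```

The trailing sh fills $0 of the inner shell so that no file name is lost, the same trick as the -- placeholder in the earlier answer.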

How to grep for files containing a specific word and pass the list of files as argument to second command?

grep rli "stringName" * | xargs <second_command> <list_of files>
will the above code work for the functionality mentioned?
I am a beginner, so I am not sure how to use it.
You are just missing the hyphen before grep's options. The following should work:
grep -rli "stringName" * | xargs <second_command>
Since the above command cannot handle whitespace or other unusual characters in file names, a more robust solution is to use find. Note the \; terminator: -exec with + always evaluates as true, so -print0 would then print every file.
find . -type f -exec grep -qi "stringName" {} \; -print0 | xargs -0 <second_command>
Or use grep's -Z option together with xargs -0:
grep -rliZ "stringName" * | xargs -0 <second_command>
Extending on jkshah's answer, which is already quite good.
find . -type f -exec grep -qi "regex" {} \; -exec "second_command" {} \;
This has the advantage of being more portable (-print0 and -0 are GNU extensions).
It executes the second command for each matching file in turn. If you want to execute it once with a list of all matching files at the end instead, change the last \; to +.
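As a sketch of that last variant, the wrapper below runs grep once per file as the test and then hands all matching files to the second command in batches. The name run_on_matches is hypothetical, and the function assumes at least one command word is passed after the pattern.

```shell
# Hypothetical helper: run_on_matches DIR PATTERN COMMAND [ARGS...]
# Runs COMMAND on all files under DIR whose content matches PATTERN,
# ignoring case.
run_on_matches() {
    dir=$1 pattern=$2
    shift 2
    # First -exec: a per-file test (grep -q exits 0 only on a match).
    # Second -exec: "{} +" appends as many matching files as fit per call.
    find "$dir" -type f -exec grep -qi -e "$pattern" {} \; -exec "$@" {} +
}
```

For example, run_on_matches . stringName ls -l would list the matching files. Only POSIX options are used, and file names with spaces are passed through intact.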

Remove underscores from all filenames within a directory

I have a folder "model" with files named like:
a_EmployeeData
a_TableData
b_TestData
b_TestModel
I basically need to drop the underscore and make them:
aEmployeeData
aTableData
bTestData
bTestModel
Is there a way to do this on the Unix command line?
This will correctly process file names containing odd characters such as spaces or even newlines, and should work on any Unix or Linux system because it uses only POSIX syntax.
find model -type f -name "*_*" -exec sh -c 'd=$(dirname "$1"); mv "$1" "$d/$(basename "$1" | tr -d _)"' sh {} \;
Here is what it does:
For each file (not directory) containing an underscore in its name under the model directory and its subdirectories, rename the file in place with all the underscores stripped out.
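A possible variation, sketched below, starts one shell per batch of files instead of one per file and refuses to overwrite an existing target. The function name strip_underscores is made up for illustration; the skip-if-exists check is an addition, not part of the command above.

```shell
# Hypothetical helper: remove underscores from file names under directory $1.
strip_underscores() {
    find "$1" -type f -name '*_*' -exec sh -c '
        for path do
            dir=$(dirname "$path")
            base=$(basename "$path")
            new=$(printf "%s" "$base" | tr -d _)
            # Skip rather than overwrite an existing file.
            [ -e "$dir/$new" ] || mv "$path" "$dir/$new"
        done' sh {} +
}
```

Running strip_underscores model should turn a_EmployeeData into aEmployeeData, and it also copes with names that contain spaces.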
You can do this simply with bash. This removes only the first underscore in each path; use ${file//_/} to remove them all.
for file in /path/to/model/*; do
mv "$file" "${file/_/}"
done
If you have the Perl rename command available, then simply do:
rename 's/_//' /path/to/model/*
for f in model/* ; do mv "$f" "$(echo "$f" | sed 's/_//g')" ; done
Edit: modified a few things thanks to suggestions by others, but I'm afraid my code is still bad for strange filenames.
Maybe this:
find model -maxdepth 1 -type f -name "*_*" -print | sed -e 'p;s/_//g' | xargs -n2 echo mv
Decomposition:
find all plain files in the directory model whose names contain at least one underscore, without searching subdirectories
with sed, print the old name first
then adjust the file name, replacing every _ with nothing
feed the two file names to xargs, which renames the file with mv
The above is a dry run. When satisfied, remove the echo before mv to do the actual rename.
Warning: this will not work if a file name contains spaces. If you have GNU sed you can use
find . -maxdepth 1 -name "*_*" -print0 | sed -z 'p;s/_//g' | xargs -0 -n2 echo mv
which works with file names containing spaces too.
In zsh:
autoload zmv # in ~/.zshrc
cd model && zmv '(**/)(*)' '$1${2//_}'
marc@panic:~$ echo 'a_EmployeeData' | tr -d '_'
aEmployeeData
I had the same problem on my machine, but the filenames had more than one underscore. I used rename with the g option so that all underscores get removed:
find model/ -maxdepth 1 -type f | rename 's/_//g'
Or, if there are no subdirectories, just:
rename 's/_//g' model/*
If you don't have rename, see Jaypal Singh's answer.
Use the global flag /g with your replace pattern to replace all occurrences within the filename.
find . -type f -print0 | xargs -0 rename 's/_//g'
Or if you want underscores replaced with spaces then use this:
find . -type f -print0 | xargs -0 rename 's/_/ /g'
If you like to live dangerously, add the force flag -f in front of your replace pattern: rename -f 's/_//g'

search and replace a string

Is there a way to search and replace a string recursively in multiple directories using only the grep command?
I know it can be done by combining find with other utilities such as sed or perl, but is there a way to do it with grep alone on the Unix command line?
I don't think grep alone would work here; grep only searches and cannot modify files. Involving sed or other utilities is much easier.
One way, if you have GNU find and the bash shell:
find /path -type f -iname "*.txt" | while IFS= read -r FILE
do
    # Rewrite the file line by line into a fresh temporary file
    while IFS= read -r LINE
    do
        case "$LINE" in
            *WORD_TO_SEARCH* ) LINE=${LINE//WORD_TO_SEARCH/REPLACE};;
        esac
        printf '%s\n' "$LINE"
    done < "$FILE" > temp
    mv temp "$FILE"
done

Unix Find Replace Special Characters in Multiple Files

I've got a set of files in a web root that all contain special characters that I'd like to remove (Â,€,â,etc).
My command
find . -type f -name '*.*' -exec grep -il "Â" {} \;
finds & lists out the files just fine, but my command
find . -type f -name '*.*' -exec tr -d 'Â' '' \;
doesn't produce the results I'm looking for.
Any thoughts?
To replace all non-ASCII characters in all files under the current directory, you could use:
find . -type f | xargs perl -pi.bak -e 's,[^[:ascii:]],,g'
Afterwards you will have to find and remove all the '.bak' files:
find . -type f -a -name \*.bak | xargs rm
I would recommend looking into sed. It can be used to replace the contents of the file.
So you could use the command:
find . -type f -name '*.*' -exec sed -i "s/Â//" {} \;
I have tested this with a simple example and it seems to work. The -exec should handle files with whitespace in their name, but there may be other vulnerabilities I'm not aware of.
Use
tr -d 'Â'
What does the '' stand for? On my system, your command produces this error:
tr: extra operand `'
Only one string may be given when deleting without squeezing repeats.
Try `tr --help' for more information.
sed 's/ø//' file.txt
That should do the trick for replacing a special char with an empty string.
find . -name "*.*" -exec sed 's/ø//' {} \;
It would be helpful to know what "doesn't produce the results I'm looking for" means. However, in your command tr is never given the files' contents: tr reads only standard input and does not accept file names as arguments. You could redirect each file into it like this:
find . -type f -name '*.*' -exec sh -c 'tr -d "Â" < "$1"' sh {} \;
This writes everything to stdout. You probably want to modify the files instead. You can use Grundlefleck's answer, but one of the issues alluded to in that answer is handling large numbers of files. You can do this:
find . -type f -name '*.*' -print0 | xargs -0 -I{} sed -i "s/Â//" \{\}
which should handle files with spaces in their names as well as large numbers of files.
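If the goal is to drop every non-ASCII byte rather than one specific character, the tr approach can be made to modify files in place by going through a temporary file. This is only a sketch: strip_non_ascii is a made-up name, and it assumes that deleting all bytes in the range 0x80 to 0xFF is really what you want, since that removes every multi-byte UTF-8 character.

```shell
# Hypothetical helper: delete all non-ASCII bytes from every file under $1.
strip_non_ascii() {
    find "$1" -type f -exec sh -c '
        for f do
            tmp="$f.tmp.$$"
            # In the C locale tr works on single bytes; \200-\377 is 0x80-0xFF.
            LC_ALL=C tr -d "\200-\377" < "$f" > "$tmp" && mv "$tmp" "$f"
        done' sh {} +
}
```

Because find passes the names as arguments and each one is quoted, file names with spaces are handled, and the + terminator keeps the number of shell processes low.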
with bash shell
for file in *.*
do
case "$file" in
*[^[:ascii:]]* )
mv "$file" "${file//[^[:ascii:]]/}"
;;
esac
done
I would use something like this.
for file in `find . -type f`
do
# Search for the character and remove it. Save the result as file.new
sed -e 's/[ۉ]//g' $file > $file.new
# Move file.new over file. DON'T RUN THIS UNLESS YOU WANT TO OVERWRITE THE ORIGINAL FILE
mv $file.new $file
done
The above script will fail on file names containing spaces, as levislevis85 has mentioned. The following code does not have that problem.
find . -type f | while IFS= read -r file
do
# Search for the character and remove it. Save the result as file.new
sed -e 's/[ۉ]//g' "$file" > "$file".new
# Move file.new over file. DON'T RUN THIS UNLESS YOU WANT TO OVERWRITE THE ORIGINAL FILE
mv "$file".new "$file"
done

Resources