search and replace a string - unix

is there a way to search and replace a string using single unix command grep recusrsively in multiple directories?
i know it can be done by using the combination of find with other utilities like sed perl etc.but is there a way where we can use only grep for doing this on unix command line?

I don't think that only grep would work here; involving sed and other utilities will be much more easier, than just grep

one way, if you have GNU find and bash shell
find /path -type f -iname "*.txt" | while read -r FILE
do
while read -r LINE
do
case "$LINE" in
*WORD_TO_SEARCH* ) LINE=${LINE//WORD_TO_SEARCH/REPLACE};;
esac
echo "$LINE" >> temp
done < "$FILE"
mv temp "$FILE"
done

Related

Combine find, grep and xargs with printf

I have a find command combined with exec grep and a printf option :
find -L /home/blast/dirtest -maxdepth 3 **-exec grep -q "pattern" {} \;** -printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
Result :
f/#/2018-01-01 10:00:00/#/191/#/filee.xml/#//#//home/blast/dirtest/01/05
I need the printf to get all the desired file informations at once (date, type size etc)
The above command works fine. But the exec option is too slow comparing to xargs.
I tryed to do the same with xarg but I did not succeed.
Any Idea on how to acheive that ? using the xargs command keeping the desired printf or similar .
Thanks
Your code is:
find -L /home/blast/dirtest -maxdepth 3 \
-exec grep -q "pattern" {} \; \
-printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
This invokes a new grep process for each file.
If you are using GNU utilities, you can reduce the number of grep processes by something like:
(
format=\''%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n'\'
find -L /home/blast/dirtest -maxdepth 3 -print0 |\
xargs -0 grep -l -Z "pattern" |\
xargs -0 sh -c 'find "$#" -printf '"$format" --
) 2>/dev/null
for clarity, store the formatstring in a variable
use -print0 / -0 / -Z options to enable null-delimited data
generate initial filelist with find
filter on "pattern" with grep (use of xargs minimises the number of times grep gets called)
feed the filtered filelist into another xargs to run a minimal number of find -printf
in second xargs, call a subshell so that extra arguments can be appended (find requires the paths to precede the operators)
dummy second argument (--) to the sh -c invocation prevents the first filename being lost due to assignment to $0
To do it exactly how you want:
find -L /home/blast/dirtest/ -maxdepth 3 \
-printf '%p#%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' \
> tmp.out
cut -d# -f1 tmp.out \
| xargs grep -l "pattern" 2>/dev/null \
| sed 's/^/^/; s/$/#/' \
| grep -f /dev/stdin tmp.out \
| sed 's/^.*#//'
This operates under the assumption that you have no character # in your file names.
What it does is avoid the grep at first and just dump all the files with the requested metadata to a temporary file.
But it also prefixes each line with the full path (%p#).
Then we extract (cut) the full paths out of this list and list the files which contains the pattern (xargs grep).
We then use sed to prefix each such file name with ^ and suffix it with #, which makes it a greppable pattern in our tmp.out file.
Then we use this pattern (grep -f /dev/stdin) to extract only those paths from the big list in tmp.out.
Now all that's left is to remove the artificial full path we prefixed using the last sed command.
Seeing how you used /home, there's a good chance you're on Linux, which, if you're willing to accept some output format changes, allows you to do it somewhat more elegantly:
find -L /home/blast/dirtest/ -maxdepth 3 \
| xargs grep -l "pattern" 2>/dev/null \
| xargs stat --printf '%F/#/%y/#/%s/#/%n\n'
The output of stat --printf is different from that of find -printf (and from that of MacOS' stat -f), but it's the same information.
Do note, however, that because you passed -L to find, and you're grepping the result:
The results are limited to file types which can be grepped, so they will never be directories, links, etc..
If you stumble upon a broken link, it will not be in the output because it cannot be grepped.
I'v found an intresting thing about the -exec option.
We could run the grep once using the exec with the plus-sign (+)
-exec command {} +
This variant of the -exec option runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total
number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command
lines. Only one instance of ’{}’ is allowed within the command. The command is executed in the starting directory.
That means if I change this :
-exec grep -l 'pattern' {} \;
By this ( replace the semicolon with the plus signe ):
-exec grep -l 'pattern' {} \+
Will improve the performance significantly.
Then I can pipe only one xargs for the format printing needs only.

Append "/" to end of directory

Completely noob question but, using ls piped to grep, I need to find files or directories that have all capitals in their name, and directories need to have "/" appended to indicate that it is a directory. Trying to append the "/" is the only part I am stuck on. Again, I apologize for the amateur question. I currently have ls | grep [A-Z] and the example out should be: BIRD, DOG, DOGDIR/
It's an interesting question because it's a somewhat difficult thing to accomplish with a bash one-liner.
Here's what I came up with. It doesn't seem very elegant, but I'm not sure how to improve.
find /animals -type d -or -type f \
| grep '/[A-Z]*$' \
| xargs -I + bash -c 'echo -n $(basename +)$( test -d + && echo -n /),\\ ' \
| sed -e 's/, *$//'; echo
I'll break that down for you
find /animals -type d -or -type f writes out, once per line, the directories and files it found in /animals (see below for my test environment dockerfile - I created /animals to match your desired output). Find can't do a regex match as far as I know on the name, so...
grep '/[A-Z]*$' filter's find's output so that only paths are shown where the last part of the file or directory name, after the final /, is all uppercase
xargs -I + bash -c '...' when you're in a shell and you want to use a "for" loop, chances are what you should be using is xargs. Learn it, know it, love it. xargs takes its input, separated by default by $IFS, and runs the command you give it for each piece of input . So this is going to run a bash shell for each path. that passed the grep filter. In my case, -I + will make xargs replace the literal '+' character with its current input filename. -I also makes it pass one at a time through xargs. For more information, see the xargs manual page.
'echo -n $(basename +)$( test -d + && echo -n /),\\ ' this is the inner bash script that will be run by xargs for each path that got through grep.
basename + cuts the directory component off the path; from your example output you don't want eg /animals/DOGDIR/, you want DOGDIR/. basename is the program that trims the directories for us.
test -d + && echo -n / checks to see whether + (remember xargs will replace it with filename) is a directory ,and if so, runs echo -n /. the -n argument to echo suppresses the newline, important to get the output in the CSV format you specified.
now we can put it all together to see that we're echo -n the output of basename + , with / appended, if it's a directory, and then , appended to that. All the echos run with -n to suppress newlines to keep output CSV looking.
| sed -e 's/, *$//'; echo is purely for formatting. Adding , to each individual output was an easy way to get the CSV, but it leaves us with a final , at the end of the list. The sed invocation removes , followed by any number of spaces at the end of the output so far - eg the entire output from all the xargs invocations. And since we never did output a newline at the end of that output, the final echo is adding that.
Usually in unix shells, you probably wouldn't want a CSV style output. You'd probably instead want a newline-separated output in most cases, one matching file per line, and that would be somewhat simpler to do because you wouldn't need all that faffing with -n and , to make it CSV style. But, valid requirement if the need is there.
FROM debian
RUN mkdir -p /animals
WORKDIR /animals
RUN mkdir -p DOGDIR lowerdir && touch DOGDIR/DOG DOGDIR/lowerDOG2 lowerdir/BIRD
ENTRYPOINT [ "/bin/bash" ]
CMD [ "-c" , "find /animals -type d -or -type f | grep '/[A-Z]*$'| xargs -I + bash -c 'echo -n $(basename +)$( test -d + && echo -n /),\\ ' | sed -e 's/, *$//'; echo"]
$ docker run --rm test
BIRD, DOGDIR/, DOG
You can start looking at
ls -F | grep -v "[[:lower:]]"
I did not add something for a comma-seperated line, because this is the wrong method: Parsing ls should be avoided ! It will go wrong for filenames like
I am a terribble filename,
with newlines inside me,
and the ls command combined with grep
will only show the last line
BECAUSE THIS LINE HAS NO LOWERCASE CHARACTERS
To get the files without a pipe, you can use
shopt -s extglob
ls -dp +([[:upper:]])
shopt -u extglob
An explanation of the extglob and uppercase can be found at https://unix.stackexchange.com/a/389071/57293
When you want the output in one line, you can get troubles with filenames that have newlines or commas in its name. You might want something like
# parsing ls, yes wrong and failing for some files
ls -dp +([[:upper:]]) | tr "\n" "," | sed 's/,$/\n/'

find and then grep and then iterate through list of files

I have following script to replace text.
grep -l -r "originaltext" . |
while read fname
do
sed 's/originaltext/replacementText/g' $fname > tmp.tmp
mv tmp.tmp $fname
done
Now in the first statement of this script , I want to do something like this.
find . -name '*.properties' -exec grep "originaltext" {} \;
How do I do that?
I work on AIX, So --include-file wouldn't work .
In general, I prefer to use find to FIND files rather than grep. It looks obvious : )
Using process substitution you can feed the while loop with the result of find:
while IFS= read -r fname
do
sed 's/originaltext/replacementText/g' $fname > tmp.tmp
mv tmp.tmp $fname
done < <(find . -name '*.properties' -exec grep -l "originaltext" {} \;)
Note I use grep -l (big L) so that grep just returns the name of the file matching the pattern.
You could go the other way round and give the list of '*.properties' files to grep. For example
grep -l "originaltext" `find -name '*.properties'`
Oh, and if you're on a recent linux distribution, there is an option in grep to achieve that without having to create that long list of files as argument
grep -l "originaltext" --include='*.properties' -r .

for i in `ls |grep` question

This is the code i'm using to untar a file grep on the contents of the files within the tar and then delete the untared files. I dont have enough space to untar all files at once.
the issue i'm having is with the for f in `ls | grep -v *.gz line this is supposed to find the files that have come out of the tar and can be identified by not having a .tar.gz extension but it doesnt seem to pick them up?
Any help would be much appreciated
M
for i in *.tar.gz;
do echo $i >>outtput1;
tar -xvvzf $i; mv $i ./processed/;
for f in `ls | grep -v *.gz`; ----- this is the line that isn't working
do echo $f >> outtput1;
grep 93149249194 $f >>
outtput1; grep 788 $f >> outtput1;
rm -f $f;
done;
done
Try ls -1 | grep -v "\\.gz$". The -1 will make ls output one result per line. I've also fixed your regex for you in a few ways.
Although a better way to solve this whole thing is with find -exec.
Change it to ls | grep -v "*.gz", you have to quote *.gz because otherwise it will just glob the files in the working directory and grep them.
Never use ls in scripts, and don't use grep to match file patterns. Use globbing and tests instead:
for f in *
do
if [[ $f != *.gz ]]
then
...
fi
done

Run a script over multiple files in unix

Started to learn some shell scripting. I have a perl script that runs on one file. How would I write a shell script to run the perl script a bunch of times on all files with keyword "filename" in it?
So in English,
for /filename/ in filenames
perl myscript.pl completefilename
Thanks.
find . -name "filename*" -exec perl myscript.pl '{}' \;
for i in $(\ls -d filenames)
do
perl myscript.pl $i
done
The backslash in front of the 'ls' command is to temporarily disable any aliases.
HTH
find . -path "\*filename\*" -exec perl myscript.pl {} \;
edit: escaped stars, didn't want the markup here
And if you have spaces in your filenames, use the old standby
find . -maxdepth 1 -print0 | xargs -0 perl myscript.pl
In bash:
files=`ls -1 *`
for $file in $files;
do
perl myscript.pl $file;
done
One liner:
$ for file in filename1 filename2 filename3; do perl myscript $file; done
Instead of the space separated list of filenames you can also use wildcards, for instance:
$ for file in *.txt *.csv; do perl myscript $file; done
FILES="keyword"
for f in "$FILES"
do
perl myscript.pl $f
done
From http://www.cyberciti.biz/faq/bash-loop-over-file/
I personally use the zsh shell, which gives you a very nice way to run a command recursively on a set of subdirectories. It also allows you to change the suffix of the file, which is handly when using lame to create MP3 files of .wav files:
for i in **/*.wav; lame $i $i:r.mp3
You can also pipe the output of one command to another, which I something I use frequently when I'm downloading a number of bittorrent files and want to see the percentage that each download has completed:
for i in **/*.txt; grep -H percent $i | tail -1

Resources