How do I perform a recursive directory search for strings within files in a UNIX Tru64 environment?

Unfortunately, due to the limitations of our UNIX Tru64 environment, I am unable to use the grep -r switch to perform my search for strings within files across multiple directories and subdirectories.
Ideally, I would like to pass two parameters. The first will be the directory my search is to start in. The second is a file containing a list of all the strings to be searched. This list will consist of various directory path names and will include special characters:
e.g.:
/aaa/bbb/ccc
/eee/dddd/ggggggg/
etc..
The purpose of this exercise is to identify all shell scripts that may have specific hard coded path names identified in my list.
There was one example I found during my investigations that perhaps comes close, but I am not sure how to customize this to accept a file of string arguments:
eg: find etb -exec grep test {} \;
where 'etb' is the directory and 'test', a hard coded string to be searched.

This should do it:
find dir -type f -exec grep -F -f strings.txt {} \;
dir is the directory from which searching will commence
strings.txt is the file of strings to match, one per line
-F means treat search strings as literal rather than regular expressions
-f strings.txt means use the strings in strings.txt for matching
You can add -l to the grep switches if you just want filenames that match.
Footnote:
Some people prefer a solution involving xargs, e.g.
find dir -type f -print0 | xargs -0 grep -F -f strings.txt
which is perhaps a little more robust/efficient in some cases.
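For the specific goal in the question (flagging shell scripts that contain any of the hard-coded paths), adding -l prints just the names of the offending files. A small sketch, where /appl/scripts is only a placeholder for the starting directory:
find /appl/scripts -type f -name '*.sh' -exec grep -l -F -f strings.txt {} \;
Each file name printed is a script containing at least one of the strings listed in strings.txt.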

Reading the question, I assume we cannot use the GNU coreutils, and that egrep is not available.
I assume (for some reason) the system is broken and escapes do not work as expected.
Under normal circumstances, grep -rf patternfile.txt /some/dir/ is the way to go.
"a file containing a list of all the strings to be searched"
Assumptions: GNU coreutils not available; grep -r does not work; handling of special characters is broken.
Now, do you have a working awk? No? It would make life so much easier, but let's be on the safe side.
Assume: a working sed, and that one of od, hexdump, or xxd (from the vim package) is available.
Let's call the list of strings patternfile.txt.
1. Convert the list into a regexp that grep likes
Example patternfile.txt contains
/foo/
/bar/doe/
/root/
(The example does not show the special characters, but they are there.) We must turn it into something like
(/foo/|/bar/doe/|/root/)
Assuming the echo -en command is not broken, and xxd, od, or hexdump is available:
Using hexdump
cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n'
Using od
cat patternfile.txt |od -A none -t x1|tr -d '\n'
and pipe it into (common for both hexdump and od)
|sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'
then pipe result into
|sed 's:^:\\(:g' |sed 's:$:\\):g'
and you have a regexp pattern that is escaped.
2. Feed the escaped pattern to the broken grep
Assuming the bare minimum of shell escaping is available,
we use grep "$(echo -en "ESCAPED_PATTERN" )" to do our job.
3. To sum it up
Building an escaped regexp pattern (using hexdump as the example):
grep "$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"
will escape all the characters and enclose them in \(...\) with \| separators, so a regexp OR match will be performed.
4. Recursive directory lookup
Under normal circumstances, even when grep -r is broken, find /dir/ -exec grep PATTERN {} \; should work.
Some may prefer xargs instead (unless you happen to have a buggy xargs).
We would prefer the find /somedir/ -type f -print0 |xargs -0 grep -f 'patternfile.txt' approach, but since
this is not available (for whatever valid reason),
we need to exec grep for each file, and this is normally the wrong way.
But let's do it.
Assume: find -type f works.
Assume: xargs is broken OR not available.
First, if you have a buggy pipe, it might not handle a large number of files.
So we avoid xargs on such systems (I know, I know, let's just pretend it is broken).
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
If your shell handles large lists nicely,
for file in $(cat list-of-all-file-to-search-for.txt) ; do grep REGEXP_PATTERN "$file" ;
done ; is a nice way to get by. Unfortunately, some systems do not like that,
and in that case, you may require
cat list-of-all-file-to-search-for.txt | split -a 4 -d -l 2000 - file-smaller-chunk.part.
to turn it into smaller chunks. Now this is for a seriously broken system.
Then
for file in file-smaller-chunk.part.* ; do for single_line in $(cat "$file") ; do grep REGEXP_PATTERN "$single_line" ; done ; done ;
should work.
A
cat filelist.txt | while read file ; do grep REGEXP_PATTERN "$file" ; done ;
may be used as a workaround on some systems.
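Putting the pieces together, a minimal sketch (using the hexdump variant, and assuming a shell whose $(...) substitution and while/read loop behave; the file names are the placeholders already used above):
PATTERN="$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
while read file ; do grep -l "$PATTERN" "$file" ; done < list-of-all-file-to-search-for.txt
The -l simply prints the names of matching files, which is what the original question is after.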
What if my shell does not handle quotes?
You may have to escape the file list beforehand.
It can be done much more nicely in awk, perl, whatever, but since we restrict ourselves to
sed, let's do it.
We assume 0x27, the ' character, will actually work.
cat list-of-all-file-to-search-for.txt |sed 's#['\'']#'\''\\'\'\''#g'|sed 's:^:'\'':g'|sed 's:$:'\'':g'
The only time I had to use this was when feeding output into bash again.
What if my shell does not handle that?
xargs fails, grep -r fails, the shell's for loop fails.
Do we have other options? YES.
Escape all input suitably for your shell, and make a script.
But you know what, I got bored, and writing automated scripts for csh just seems
wrong. So I am going to stop here.
Take-home note
Use the right tool for the job. Writing an interpreter in bc is perfectly
possible, but it is just plain wrong. Install coreutils, perl, a better grep,
whatever. It makes life better.

Related

Unix Pipes for Command Argument [duplicate]

I am looking for insight as to how pipes can be used to pass standard output as the arguments for other commands.
For example, consider this case:
ls | grep Hello
The structure of grep follows the pattern: grep SearchTerm PathOfFileToBeSearched. In the case I have illustrated, the word Hello is taken as the SearchTerm and the result of ls is used as the file to be searched. But what if I want to switch it around? What if I want the standard output of ls to be the SearchTerm, with the argument following grep being PathOfFileToBeSearched? In a general sense, I want to have control over which argument the pipe fills with the standard output of the previous command. Is this possible, or does it depend on how the script for the command (e.g., grep) was written?
Thank you so much for your help!
grep itself will be built such that if you've not specified a file name, it will open stdin (and thus get the output of ls). There's no real generic mechanism here - merely convention.
If you want the output of ls to be the search term, you can do this via the shell. Make use of a subshell and substitution thus:
$ grep $(ls) filename.txt
In this scenario ls is run in a subshell, and its stdout is captured and inserted in the command line as an argument for grep. Note that if the ls output contains spaces, this will cause confusion for grep.
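If the ls output may contain more than one name, another way (assuming a bash-like shell with process substitution) is to hand the names to grep as a pattern list via -f, one pattern per line, instead of pasting them into the command line:
grep -F -f <(ls) filename.txt
Here -F makes grep treat each name as a literal string rather than a regular expression.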
There are basically two options for this: shell command substitution and xargs. Brian Agnew has just written about the former. xargs is a utility which takes its stdin and turns it into arguments of a command to execute. So you could run
ls | xargs -n1 -J % grep -- % PathOfFileToBeSearched
and it would, for each file name output by ls, run grep -- filename PathOfFileToBeSearched to grep for the filename output by ls within the other file you specify. This is an unusual xargs invocation; usually it's used to add one or more arguments at the end of a command, while here it should add exactly one argument in a specific place, so I've used -n and -J arguments to arrange that. The more common usage would be something like
ls | xargs grep -- term
to search all of the files output by ls for term. Although of course if you just want files in the current directory, you can do this more simply without a pipeline:
grep -- term *
and likewise in your reversed arrangement,
for filename in *; do
grep -- "$filename" PathOfFileToBeSearched
done
There's one important xargs caveat: whitespace characters in the filenames generated by ls won't be handled too well. To handle that, provided you have GNU utilities, you can use find instead:
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 -n1 -J % grep -- % PathOfFileToBeSearched
to use NUL characters to separate filenames instead of whitespace.
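If your xargs is the GNU one, which has no -J, roughly the same effect can be had with its -I option, still running one grep per file name; a sketch under that assumption:
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 -I % grep -- % PathOfFileToBeSearched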

Program fails to move file

I'm trying to move files from one place to another directory... My program reads Log_Deleter and uses the parameters given in each line to move the files.
When I execute the script, it seems to run fine (no errors), but none of the files are moved... I'm not sure why it's not moving the files nor displaying any error.
Can someone please identify the error?
my attempt:
#!/bin/ksh
while read -r line ; do
v=$line
set -- $v
cd /
$(find "$1" -type f -name "$2" -mtime +"$3" -exec mv {} "$4" \;)
done < Log_Deleter.txt
Log_Deleter.txt
/usr/IBM/WebSphere/AppServer/profiles/AppSrvSIT1/logs/Server1 'SystemOut_*' 5 /backup/Abackuptest1
/usr/IBM/WebSphere/AppServer/profiles/AppSrvSIT1/logs/Server2 'SystemOut_*' 5 /backup/Abackuptest2
Thanks for your help!
Find is looking for files that have a literal ' in the name. You need to remove the single quotes from $2 before invoking find. Try:
#!/bin/ksh
while read -r path name mtime dest ; do
name=$( echo $name | tr -d "'" )
find "$path" -type f -name "$name" -mtime +"$mtime" -exec mv {} "$dest" \;
done < Log_Deleter.txt
The problem is that you are trying to match a file whose name actually has the single quotes in it.
Barring other problems, I think your script will probably work once you take the quotes out of Log_Deleter.txt.
The quotes are only meaningful when the shell is parsing command input. This is not what the read builtin does. And even when reading command input, once the quotes get into a variable they stay there forever unless reread at the shell's CLI layer via eval.
The shell is not exactly a macro processor. It's a complicated hybrid that is a little bit CLI, a little bit programming language, and a little bit macro processor.
And, speaking of eval, it's not necessary to wrap the find in an eval-like construct. Simplify your script to run find directly and you will find it easier to debug and understand.
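A minimal sketch of that simplification, assuming the quotes have been removed from Log_Deleter.txt as discussed above:
#!/bin/ksh
# run find directly instead of wrapping it in $(...)
while read -r line ; do
  set -- $line
  find "$1" -type f -name "$2" -mtime +"$3" -exec mv {} "$4" \;
done < Log_Deleter.txt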

How to copy files in shell that do not end with a certain file extension

For example copy all files that do not end with .txt
Bash will accept a "not" pattern, provided extended globbing is enabled (shopt -s extglob):
cp !(*.txt) /some/other/place
You can use ls with grep's -v option:
for i in `ls | grep -v '\.txt$'`
do
cp "$i" "$dest_dir"
done
Depending on how many assumptions you can afford to make about the characters in the file names, it might be as simple as:
cp $(ls | grep -v '\.txt$') /some/other/place
If that won't work for you, then maybe find ... -print0 | xargs -0 cp ... can be used instead (though that has issues - because the destination goes at the end of the argument list).
On MacOS X, xargs has an option -J that supports what is needed:
-J replstr
If this option is specified, xargs will use the data read from standard input to replace the first occurrence of replstr instead of appending that data after all other arguments. This option will not affect how many arguments will be read from input (-n), or the size of the command(s) xargs will generate (-s). The option just moves where those arguments will be placed in the command(s) that are executed. The replstr must show up as a distinct argument to xargs. It will not be recognized if, for instance, it is in the middle of a quoted string. Furthermore, only the first occurrence of the replstr will be replaced. For example, the following command will copy the list of files and directories which start with an uppercase letter in the current directory to destdir:
/bin/ls -1d [A-Z]* | xargs -J % cp -rp % destdir
It appears that GNU xargs does not have -J but does have the related but slightly more restrictive -I option (which is also present on MacOS X):
-I replace-str
Replace occurrences of replace-str in the initial-arguments with
names read from standard input. Also, unquoted blanks do not
terminate input items; instead the separator is the newline
character. Implies -x and -L 1.
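So with GNU xargs the copy example above would presumably become:
/bin/ls -1d [A-Z]* | xargs -I % cp -rp % destdir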
You can rely on:
find . -not -name "*.txt"
By using:
find -x . -not -name "*.txt" -d 1 -exec cp '{}' toto/ \;
which copies all files that are not .txt from the current directory to a subdirectory toto/. The -d 1 is used to prevent recursion here.
Either do:
for f in $(ls | grep -v "\.txt$")
do
cp -- "$f" ⟨destination-directory⟩
done
or if you have a huge amount of files:
find -prune \! -name "*.txt" -exec cp -- "{}" ⟨destination-directory⟩ .. \;
Two things here to comment on: one is the use of the double hyphen in the invocation of cp, the other is the quoting of $f. The first guards against "wacky" filenames that begin with a hyphen and might be interpreted as options. The second guards against filenames with spaces (or whatever is in IFS) in them.
In zsh:
setopt extendedglob
cp *^.txt /some/folder
(if you just want files)...
cp *.^txt(.) /some/folder
More information on zsh globbing here and here.
I would do it like this, where destination is the destination directory:
ls | grep -v "\.txt$" | xargs cp -t destination
Edit: added "-t" thanks to the comments

What's the best way to convert Windows/DOS files to Unix in batch?

Basically we need to change the end of line characters for a group of files.
Is there a way to accomplish this with a batch file? Is there a freeware utility?
dos2unix
It could be done with a somewhat shorter command.
find ./ -type f | xargs -I {} dos2unix {}
You should be able to use tr in combination with xargs to do this.
On the Unix side at least, this should be the simplest way. However, I tried doing it that way once on a Windows box over a decade ago, and discovered that the Windows version of tr was translating my terminators right back to Windows format for me. :-( That said, I think in the intervening decade the tools have gotten smarter.
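A rough sketch of that approach (assumptions: a GNU or BSD userland, files matching *.txt as an illustration, and that stripping carriage returns is all that is needed; tr cannot edit in place, so each file goes through a temporary copy):
find . -type f -name '*.txt' -print0 |
  xargs -0 -I{} sh -c 'tr -d "\r" < "$1" > "$1.tmp" && mv "$1.tmp" "$1"' sh {}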
Combine find with dos2unix/fromdos to convert a directory of files (excluding binary files).
Just add this to your .bashrc:
DOS2UNIX=$(which fromdos || which dos2unix) \
|| echo "*** Please install fromdos or dos2unix"
function finddos2unix {
# Usage: finddos2unix Directory
find $1 -type f -exec file {} \; | grep " text" | cut -d ':' -f1 | xargs $DOS2UNIX
}
First, the DOS2UNIX line checks which of the two utilities you have installed, and picks one to use.
Then find makes a list of all files, and file appends something like ": ASCII text" after each text file's name.
Finally, grep picks out the text files, cut removes everything after the ':', and xargs turns this into one big command line for $DOS2UNIX.
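Once the function is in your .bashrc (and the file has been re-sourced or a new shell started), usage is just, for example:
finddos2unix ~/projects/legacy-tree
where the directory name is of course only an illustration.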

Shell script - search and replace text in multiple files using a list of strings

I have a file "changesDictionary.txt" containing (a variable number of) pairs of key-value strings.
e.g.
"textToSearchFor" = "theReplacementText"
(The format of the dictionary is unimportant, and can be changed as required.)
I need to iterate through the contents of a given directory, including sub-directories. For each file encountered with the extension ".txt", we search for each of the keys in changesDictionary.txt, replacing each found instance with the replacement string value.
i.e. a search and replace over multiple files, but using a list of search/replace terms rather than a single search/replace term.
How could I do this? (I have studied single search/replace examples, but do not understand how to do multiple searches within a file.)
The implementation (bash, perl, whatever) is not important as long as I can run it from the command line in Mac OS X. Thanks for any help.
I'd convert your changesDictionary.txt file to a sed script, with... sed:
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' \
changesDictionary.txt > changesDictionary.sed
Note, any special characters for either regular expressions or sed expressions in your dictionary will be falsely interpreted by sed, so your dictionary can either only have only the most primitive search-and-replacements, or you'll need to maintain the sed file with valid expressions. Unfortunately, there's no easy way in sed to either shut off regular expression and use only string matching or quote your searches and replacements as "literals".
With the resulting sed script, use find and xargs -- rather than find -exec -- to convert your files with the sed script as quickly as possible, by processing them more than one at a time.
$ find somedir -type f -print0 \
| xargs -0 sed -i -f changesDictionary.sed
Note, the -i option of sed edits files "in-place", so be sure to make backups for safety, or use -i~ to create tilde-backups.
Final note, using search and replaces can have unintended consequences. Will you have searches that are substrings of other searches? Here's an example.
$ cat changesDictionary.txt
"fix" = "broken"
"fixThat" = "Fixed"
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' changesDictionary.txt \
| tee changesDictionary.sed
s/fix/broken/g
s/fixThat/Fixed/g
$ mkdir subdir
$ echo fixThat > subdir/target.txt
$ find subdir -type f -name '*.txt' -print0 \
| xargs -0 sed -i -f changesDictionary.sed
$ cat subdir/target.txt
brokenThat
Should "fixThat" have become "Fixed" or "brokenThat"? Order matters for sed script. Similarly, a search and replace can be search and replaced more than once -- changing "a" to "b", may be changed by another search-and-replace later from "b" to "c".
Perhaps you've already considered both of these, but I mention them because I've tried what you were doing before and didn't think of them. I don't know of anything that simply does the right thing for doing multiple search and replacements at once. So, you need to program it to do the right thing yourself.
Here are the basic steps I would do
Copy the changesDictionary.txt file
In it, replace each "a" = "b" pair with the equivalent sed line, e.g. (using $1 for the file name):
sed -e 's/a/b/g' $1
(you could write a script to do this or just do it by hand, if you just need to do this once and it's not too big).
If the files are all in one directory, then you can do something like:
ls *.txt | xargs -n1 scriptFromStep2.sh
If they are in subdirs, use a find to call that script on all of the files, something like
find . -name '*.txt' -exec scriptFromStep2.sh {} \;
These aren't exact, do some experiments to make sure you get it right -- it's just the approach I would use.
(but, if you can, just use perl, it would be a lot simpler)
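For what it's worth, a minimal Perl-from-the-shell sketch (assuming the "key" = "value" dictionary format shown in the question; somedir is a placeholder directory, and -i.bak leaves backup copies):
find somedir -type f -name '*.txt' -print0 |
  xargs -0 perl -i.bak -pe '
    BEGIN {
      # read the dictionary once; \Q...\E below keeps the keys literal
      open(my $fh, "<", "changesDictionary.txt") or die "cannot open dictionary: $!";
      while (my $line = <$fh>) {
        $map{$1} = $2 if $line =~ /^"(.*)" = "(.*)"$/;
      }
      close $fh;
    }
    foreach my $k (keys %map) { s/\Q$k\E/$map{$k}/g }'
The same ordering and substring caveats discussed above apply here as well.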
Use this tool, which is written in Perl - with quite a lot of bells and whistles - oldie, but goodie:
http://unixgods.org/~tilo/replace_string/
Features:
do multiple search-replace or query-search-replace operations
search-replace expressions can be given on the command line or read from a file
processes multiple input files
recursively descend into directory and do multiple search/replace operations on all files
user defined perl expressions are applied to each line of each input file
optionally run in paragraph mode (for multi-line search/replace)
interactive mode
batch mode
optionally backup files and backup numbering
preserve modes/owner when run as root
ignore symbolic links, empty files, write protected files, sockets, named pipes, and directory names
optionally replace lines only matching / not matching a given regular expression
This script has been used quite extensively over the years with large data sets.
#!/bin/bash
f="changesDictionary.tx"
find /path -type f -name "*.txt" | while read FILE
do
awk 'BEGIN{ FS="=" }
FNR==NR{ s[$1]=$2; next }
{
for(i in s){
if( $0 ~ i ){ gsub(i,s[i]) }
}
print $0
}' "$f" "$FILE" > temp
mv temp "$FILE"
done
for i in /script/arq*.sh
do
echo -e "ARQUIVO ${i}"
sed -i 's|/$file_path1|/file_path2|g' "${i}"
done
