trouble listing directories that contain files with specific file extensions - unix

How do I list only directories that contain certain files? I am running on a Solaris box. For example, I want to list the sub-directories of directory ABC that contain files ending with .out, .dat and .log.
Thanks

Something along these lines might work out for you:
find ABC/ \( -name "*.out" -o -name "*.dat" -o -name "*.log" \) -print | while read f
do
echo "${f%/*}"
done | sort -u
The sort -u removes duplicate directory names. Plain uniq would only work if the duplicates happened to be adjacent in the output, so sort -u is the safer choice.
Should work on bash or ksh. Probably not so much on the old Solaris /bin/sh, where you'd have to replace the variable expansion with something like echo "${f}" | sed -e 's;/[^/]*$;;' or something else that strips off the last component of the path. dirname "${f}" would be good for that; it is a standard POSIX utility, so Solaris should include it.
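For what it's worth, here is a minimal sketch of the dirname variant mentioned above, written for a plain POSIX sh. The function name dirs_with_ext is made up for illustration, and it assumes "contain" means "contain at least one file with any of the three extensions". Directory names containing newlines would confuse the sort step.

```shell
# dirs_with_ext ROOT
# Hypothetical helper: prints each directory under ROOT that holds at least
# one regular file ending in .out, .dat or .log, one directory per line.
dirs_with_ext() {
    find "$1" -type f \( -name '*.out' -o -name '*.dat' -o -name '*.log' \) \
        -exec dirname {} \; | sort -u
}
```

Called as dirs_with_ext ABC, it prints paths relative to wherever you run it from.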

Related

Remove underscores from all filenames within a directory

I have a folder "model" with files named like:
a_EmployeeData
a_TableData
b_TestData
b_TestModel
I basically need to drop the underscore and make them:
aEmployeeData
aTableData
bTestData
bTestModel
Is there a way to do this from the Unix command line?
This correctly handles file names containing odd characters such as spaces or even newlines, and it should work on any Unix or Linux system because it uses only POSIX syntax.
find model -type f -name "*_*" -exec sh -c 'd=$(dirname "$1"); mv "$1" "$d/$(basename "$1" | tr -d _)"' sh {} \;
Here is what it does:
For each file (not directory) containing an underscore in its name under the model directory and its subdirectories, rename the file in place with all the underscores stripped out.
You can do this simply with bash.
for file in /path/to/model/*; do
mv "$file" "${file/_/}"
done
If you have rename command available then simply do
rename 's/_//' /path/to/model/*
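Note that ${file/_/} and s/_// remove only the first underscore, and in the loop form that first underscore could even sit in the directory part of the path. A rough sketch that strips every underscore from the file name only, and refuses to overwrite an existing file, might look like this. The function name strip_underscores is hypothetical, and it sticks to POSIX sh so it is not tied to bash.

```shell
# strip_underscores DIR
# Hypothetical helper: removes every underscore from the names of regular
# files directly inside DIR. The body runs in a subshell so its variables
# do not leak into the caller.
strip_underscores() (
    dir=$1
    for file in "$dir"/*_*; do
        [ -f "$file" ] || continue      # skip directories and an unmatched glob
        name=${file##*/}                # file name without the directory part
        new=$dir/$(printf '%s' "$name" | tr -d '_')
        if [ -e "$new" ]; then
            echo "skipping $file: $new already exists" >&2
        else
            mv -- "$file" "$new"
        fi
    done
)
```

Run it as strip_underscores /path/to/model. Files whose new name collides with an existing one are left alone and reported on stderr.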
for f in model/* ; do mv "$f" "$(echo "$f" | sed 's/_//g')" ; done
Edit: modified a few things thanks to suggestions by others, but I'm afraid my code is still bad for strange filenames.
maybe this:
find model -name "*_*" -type f -maxdepth 1 -print | sed -e 'p;s/_//g' | xargs -n2 echo mv
Decomposition:
find all plain files in the directory model whose names contain at least one underscore, without searching subdirectories
with sed, adjust the file name by replacing every _ with nothing
also print the old name first
feed the two file names to xargs, which renames the files with mv
The above is for a dry-run. When satisfied, remove the echo before mv for actual rename.
Warning: this will not work if a filename contains spaces. If you have GNU sed you can use
find . -name "*_*" -maxdepth 1 -print0 | sed -z 'p;s/_//g' | xargs -0 -n2 echo mv
which works with filenames containing spaces too.
In zsh:
autoload zmv # in ~/.zshrc
cd model && zmv '(**/)(*)' '$1${2//_}'
marc@panic:~$ echo 'a_EmployeeData' | tr -d '_'
aEmployeeData
I had the same problem on my machine, but the filenames had more than one underscore. I used rename with the g option so that all underscores get removed:
find model/ -maxdepth 1 -type f | rename 's/_//g'
Or if there are no subdirectories, just
rename 's/_//g' model/*
If you don't have rename, see Jaypal Singh's answer.
Use the global flag /g with your replace pattern to replace all occurrences within the filename.
find . -type f -print0 | xargs -0 rename 's/_//g'
Or if you want underscores replaced with spaces then use this:
find . -type f -print0 | xargs -0 rename 's/_/ /g'
If you like to live dangerously add the force flag -f in front of your replace pattern rename -f 's/_//g'

Recursively search files named string.xml for certain text

This command will search all directories and subdirectories for files containing "text"
grep -r "text" *
How do I specify to search only in files that are named 'strings.xml'?
You'll want to use find for this, since grep won't work that way recursively (as far as I know). Something like this should work:
find . -name "strings.xml" -exec grep "text" "{}" \;
The find command searches starting in the current directory (.) for a file with the name strings.xml (-name "strings.xml"), and then for each found file, execute the grep command specified. The curly braces ("{}") are a placeholder that find uses to specify the name of the file it found. More detail can be found in man find.
Also note that the -r option to grep is no longer necessary, since find works recursively.
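One thing the command above does not show is which file a match came from when grep is handed a single file. A small sketch that addresses that, assuming a find that supports -exec ... {} + (it is in POSIX); the function name grep_named is made up for illustration:

```shell
# grep_named DIR FILENAME TEXT
# Hypothetical helper: searches for TEXT only in files called FILENAME
# anywhere under DIR.
grep_named() {
    # /dev/null as an extra operand makes grep print the file name even when
    # find passes only one file; {} + batches many files into each grep call.
    find "$1" -type f -name "$2" -exec grep -n -- "$3" /dev/null {} +
}
```

For the question as asked, that would be grep_named . strings.xml "text", which prints path:line-number:matching-line for each hit.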
If you already know where the file is, you can pass its path to grep directly (the -r is then unnecessary):
grep -r "text" /path/to/dir/strings.xml
grep (GNU grep, at least) supports an --include option that restricts a recursive search to files whose names match a pattern. So try something like this:
grep -R --include 'strings.xml' text .
I also tried using find, which seems to be noticeably faster than grep:
find ./ -name "strings.xml" -exec grep "text" '{}' \; -print
These links speak about the same issue, might help you:
'grep -R string *.txt' even when top dir doesn't have a .txt file
http://www.linuxquestions.org/questions/linux-newbie-8/run-grep-only-on-certain-files-using-wildcard-919822/
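Try the command below:
find . -type f | xargs grep "strings\.xml"
This runs grep "strings\.xml" on every file returned by find, so it reports files whose contents mention strings.xml rather than searching inside files with that name.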

How to copy files in shell that do not end with a certain file extension

For example copy all files that do not end with .txt
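Bash will accept a not pattern once extended globbing is enabled with shopt -s extglob (add the destination directory as the last argument):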
cp !(*.txt)
You can use ls with grep -v option:
for i in `ls | grep -v '\.txt$'`
do
cp "$i" "$dest_dir"
done
Depending on how many assumptions you can afford to make about the characters in the file names, it might be as simple as:
cp $(ls | grep -v '\.txt$') /some/other/place
If that won't work for you, then maybe find ... -print0 | xargs -0 cp ... can be used instead (though that has issues - because the destination goes at the end of the argument list).
On MacOS X, xargs has an option -J that supports what is needed:
-J replstr
If this option is specified, xargs will use the data read from standard input to replace the first occurrence of replstr instead of appending that data after all other arguments. This option will not affect how many arguments will be read from input (-n), or the size of the command(s) xargs will generate (-s). The option just moves where those arguments will be placed in the command(s) that are executed. The replstr must show up as a distinct argument to xargs. It will not be recognized if, for instance, it is in the middle of a quoted string. Furthermore, only the first occurrence of the replstr will be replaced. For example, the following command will copy the list of files and directories which start with an uppercase letter in the current directory to destdir:
/bin/ls -1d [A-Z]* | xargs -J % cp -rp % destdir
It appears that GNU xargs does not have -J but does have the related, slightly more restrictive -I option (which is also present in MacOS X):
-I replace-str
Replace occurrences of replace-str in the initial-arguments with
names read from standard input. Also, unquoted blanks do not
terminate input items; instead the separator is the newline
character. Implies -x and -L 1.
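If neither -J nor -I is convenient, one way around the "destination goes at the end" problem is to let find call a tiny inline sh script that moves the destination to the end itself. This is only a sketch: the function name copy_except_txt is made up, DEST is assumed to be an absolute path (the function changes directory first), and it copies regular files from the top level of SRC only.

```shell
# copy_except_txt SRC DEST
# Hypothetical helper: copies every regular file directly inside SRC whose
# name does not end in .txt into DEST. DEST is assumed to be absolute.
copy_except_txt() (
    src=$1
    dest=$2
    cd "$src" || exit 1
    # "! -name . -prune" stops find from descending into subdirectories.
    # The inline script takes DEST as its first argument, shifts it away,
    # and then hands it to cp as the last operand.
    find . ! -name . -prune -type f ! -name '*.txt' \
        -exec sh -c 'dest=$1; shift; cp -- "$@" "$dest"' sh "$dest" {} +
)
```

Because find passes the names straight to cp as arguments, spaces and other odd characters in file names are not a problem here.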
You can rely on:
find . -not -name "*.txt"
By using:
find -x . -not -name "*.txt" -d 1 -exec cp '{}' toto/ \;
This copies all files in the current directory that are not .txt to the subdirectory toto/. The -d 1 is used to prevent recursion here.
Either do:
for f in $(ls | grep -v "\.txt$")
do
cp -- "$f" ⟨destination-directory⟩
done
or if you have a huge amount of files:
find . ! -name . -prune ! -name "*.txt" -type f -exec cp -- "{}" ⟨destination-directory⟩ \;
Two things here deserve comment: the double hyphen in the invocation of cp, and the quoting of $f. The first guards against "wacky" filenames that begin with a hyphen and might be interpreted as options. The second guards against filenames with spaces (or whatever is in IFS) in them.
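To get the full benefit of the quoting, you can drop the ls | grep pipeline and let the shell glob do the listing, since word splitting of the ls output is what breaks names with spaces. A minimal sketch, with a made-up function name copy_non_txt, meant to be run from inside the source directory (note that * does not match dot files):

```shell
# copy_non_txt DEST
# Hypothetical helper: copies every regular file in the current directory
# whose name does not end in .txt into DEST.
copy_non_txt() {
    for f in *; do
        case $f in
            *.txt) continue ;;      # skip the excluded extension
        esac
        [ -f "$f" ] || continue     # regular files only, no directories
        cp -- "$f" "$1"             # -- protects names that start with a hyphen
    done
}
```

With this form a file called "-n.dat" or "b c.dat" is copied like any other.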
In zsh:
setopt extendedglob
cp ^*.txt /some/folder
(if you just want files)...
cp ^*.txt(.) /some/folder
More information is in the zsh documentation on filename generation (globbing).
I would do it like this, where destination is the destination directory:
ls | grep -v "\.txt$" | xargs cp -t destination
Edit: added "-t" thanks to the comments

How do I perform a recursive directory search for strings within files in a UNIX TRU64 environment?

Unfortunately, due to the limitations of our Unix Tru64 environment, I am unable to use the grep -r switch to search for strings within files across multiple directories and subdirectories.
Ideally, I would like to pass two parameters. The first is the directory where the search should start. The second is a file containing a list of all the strings to search for. This list will consist of various directory path names and will include special characters:
e.g.:
/aaa/bbb/ccc
/eee/dddd/ggggggg/
etc..
The purpose of this exercise is to identify all shell scripts that may have specific hard coded path names identified in my list.
There was one example I found during my investigations that perhaps comes close, but I am not sure how to customize this to accept a file of string arguments:
eg: find etb -exec grep test {} \;
where 'etb' is the directory and 'test', a hard coded string to be searched.
This should do it:
find dir -type f -exec grep -F -f strings.txt {} \;
dir is the directory from which searching will commence
strings.txt is the file of strings to match, one per line
-F means treat search strings as literal rather than regular expressions
-f strings.txt means use the strings in strings.txt for matching
You can add -l to the grep switches if you just want filenames that match.
Footnote:
Some people prefer a solution involving xargs, e.g.
find dir -type f -print0 | xargs -0 grep -F -f strings.txt
which is perhaps a little more robust/efficient in some cases, although -print0 and -0 are extensions that an older Unix such as Tru64 may not provide.
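Since the question asks for something that takes the two parameters, the first command can be wrapped up roughly like this. The function name find_hardcoded is made up, and it deliberately sticks to -exec ... \; so that nothing beyond a basic find and a grep with -F, -f and -l is assumed:

```shell
# find_hardcoded DIR STRINGS_FILE
# Hypothetical wrapper: prints the name of every file under DIR that
# contains at least one of the strings listed in STRINGS_FILE.
find_hardcoded() {
    # -l prints only the names of matching files; -F treats each line of the
    # strings file as a literal, so slashes and dots in paths need no escaping.
    find "$1" -type f -exec grep -l -F -f "$2" {} \;
}
```

Keep the strings file outside the directory being searched, otherwise it will match itself. Also watch for blank lines in it: an empty pattern matches every line.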
From the question, I assume we cannot use GNU coreutils and that egrep is not available.
I assume (for some reason) the system is broken, and escapes do not work as expected.
Under normal situations, grep -rf patternfile.txt /some/dir/ is the way to go.
a file containing a list of all the strings to be searched
Assumptions: GNU coreutils are not available, grep -r does not work, and handling of special characters is broken.
Do you have a working awk? It makes life so much easier. But let's be on the safe side.
Assume: a working sed, and one of od, hexdump or xxd (from the vim package), is available.
Let's call the list patternfile.txt.
1. Convert list into a regexp that grep likes
Example patternfile.txt contains
/foo/
/bar/doe/
/root/
(The example does not show the special characters, but they are there.) We must turn it into something like
(/foo/|/bar/doe/|/root/)
Assuming echo -en command is not broken, and xxd , or od, or hexdump is available,
Using hexdump
cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n'
Using od
cat patternfile.txt |od -A none -t x1|tr -d '\n'
and pipe it into (common for both hexdump and od)
|sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'
then pipe result into
|sed 's:^:\\(:g' |sed 's:$:\\):g'
and you have a regexp pattern that is escaped.
2. Feed the escaped pattern into broken regexp
Assuming the bare minimum shell escape is available,
we use grep "$(echo -en "ESCAPED_PATTERN" )" to do our job.
3. To sum it up
Building an escaped regexp pattern (using hexdump as the example):
grep "$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"
This escapes every character and encloses the alternatives in ( | ) so that a regexp OR match is performed.
4. Recursive directory lookup
Under normal situations, even when grep -r is broken, find /dir/ -exec grep REGEXP_PATTERN {} \; should work.
Some may prefer xargs instead (unless you happen to have a buggy xargs).
We prefer the find /somedir/ -type f -print0 |xargs -0 grep -f 'patternfile.txt' approach, but since
this is not available (for whatever valid reason),
we need to exec grep for each file, and this is normally the wrong way.
But let's do it.
Assume : find -type f works.
Assume : xargs is broken OR not available.
First, if you have a buggy pipe, it might not handle a large number of files.
So we avoid xargs on such systems (I know, I know, let's just pretend it is broken).
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
If your shell handles large lists nicely,
for file in $(cat list-of-all-file-to-search-for.txt) ; do grep REGEXP_PATTERN "$file" ;
done ; is a nice way to get by. Unfortunately, some systems do not like that,
and in that case, you may require
cat list-of-all-file-to-search-for.txt | split -a 4 -d -l 2000 - file-smaller-chunk.part.
to turn it into smaller chunks (split --help lists the options). Now this is for a seriously broken system.
Then a for file in file-smaller-chunk.part.* ; do for single_line in $(cat "$file") ; do grep REGEXP_PATTERN "$single_line" ; done ; done ;
should work.
A
cat filelist.txt |while read file ; do grep REGEXP_PATTERN "$file" ; done ;
may be used as workaround on some systems.
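A slightly more careful version of that while read loop, as a sketch only, assuming the shell's read understands -r (POSIX shells do; a very old /bin/sh may not). The function name grep_listed is made up:

```shell
# grep_listed PATTERN LISTFILE
# Hypothetical helper: runs grep PATTERN on every path listed in LISTFILE,
# one path per line.
grep_listed() {
    # IFS= and -r keep leading blanks and backslashes in each path intact.
    while IFS= read -r file; do
        # /dev/null forces grep to prefix each match with the file name.
        grep -- "$1" /dev/null "$file"
    done < "$2"
}
```

Reading the list with a redirect instead of cat | while also avoids the extra process. Paths that themselves contain newlines are still out of reach for a line-based list.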
What if my shell does not handle quotes?
You may have to escape the file list beforehand.
It can be done much more nicely in awk, perl, whatever, but since we restrict ourselves to
sed, let's do it.
We assume 0x27, the ' code, will actually work.
cat list-of-all-file-to-search-for.txt |sed 's#['\'']#'\''\\'\'\''#g'|sed 's:^:'\'':g'|sed 's:$:'\'':g'
The only time I had to use this was when feeding output into bash again.
What if my shell does not handle that?
xargs fails, grep -r fails, the shell's for loop fails.
Do we have other options? Yes.
Escape all input suitably for your shell, and make a script.
But you know what, I got bored, and writing automated scripts for csh just seems
wrong. So I am going to stop here.
Take-home note
Use the right tool for the job. Writing an interpreter in bc is perfectly
possible, but it is just plain wrong. Install coreutils, perl, a better grep,
whatever. It makes life better.

batch rename to change only single character

How can I rename all the files in one directory using the mv command? The directory has thousands of files, and the requirement is to change the last character of each file name (before the extension) to some specific character. Example: the files are
abc.txt
asdf.txt
zxc.txt
...
ab_.txt
asd.txt
it should change to
ab_.txt
asd_.txt
zx_.txt
...
ab_.txt
as_.txt
You have to watch out for name collisions but this should work okay:
for i in *.txt ; do
j=$(echo "$i" | sed 's/.\.txt$/_.txt/')
echo mv \"$i\" \"$j\"
#mv "$i" "$j"
done
after you uncomment the mv (I left it commented so you could see what it does safely). The quotes are for handling files with spaces (evil, vile things in my opinion :-).
If all files end in ".txt", you can use mmv (Multiple Move) for that:
mmv "*[a-z].txt" "#1_.txt"
Plus: mmv will tell you when this generates a collision (in your example: abc.txt becomes ab_.txt which already exists) before any file is renamed.
Note that you must quote the file names, else the shell will expand the list before mmv sees it (but mmv will usually catch this mistake, too).
If your files all have a .txt suffix, I suggest the following script:
for i in *.txt
do
r=`basename "$i" .txt | sed 's/.$//'`
mv "$i" "${r}_.txt"
done
Is it a definite requirement that you use the mv command?
The perl rename utility was written for this sort of thing. It's standard for debian-based linux distributions, but according to this page it can be added really easily to any other.
If it's already there (or if you install it) you can do:
rename -v 's/.\.txt$/_\.txt/' *.txt
The page included above has some basic info on regex and things if it's needed.
Find should be more efficient than for file in *.txt, which expands all of your 1000 files into a long list of command line parameters. Example (updated to use bash replacement approach):
find . \( -type d ! -name . -prune \) -o \( -name "*.txt" -print \) | while IFS= read -r file
do
mv "$file" "${file%?.txt}_.txt"
done
I'm not sure if this will work with thousands of files, but in bash:
for i in *.txt; do
j=`echo "$i" | sed 's/.\.txt$/_.txt/'`
mv "$i" "$j"
done
You can use bash's ${parameter%%word} operator thusly:
for FILE in *.txt; do
mv "$FILE" "${FILE%%?.txt}_.txt"
done
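None of the loops above check for the name collisions mentioned earlier (abc.txt would land on an existing ab_.txt). A rough sketch of the same suffix-removal idea with a collision check added; the function name rename_last_char is made up, and the ${f%?.txt} expansion is plain POSIX sh, so it is not limited to bash:

```shell
# rename_last_char DIR
# Hypothetical helper: in DIR, replaces the character just before ".txt"
# with "_" for every *.txt file, skipping any rename that would overwrite
# an existing file. The body runs in a subshell so the cd stays local.
rename_last_char() (
    cd "$1" || exit 1
    for f in *.txt; do
        [ -f "$f" ] || continue
        new=${f%?.txt}_.txt             # drop one character plus .txt, append _.txt
        [ "$new" = "$f" ] && continue   # already ends in _.txt
        if [ -e "$new" ]; then
            echo "skipping $f: $new exists" >&2
        else
            mv -- "$f" "$new"
        fi
    done
)
```

With the files from the question, abc.txt is reported and left untouched because ab_.txt already exists, while asdf.txt and zxc.txt become asd_.txt and zx_.txt.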
