Regex & ls or find - unix

I need to select only directories between the periods of 20140729 - 20140921.
The directories look like this.
20140729_154208 20140814_221350 20140829_215623
What is the best method to do this?
Thanks

Using find
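This approach works only if each directory's modification time (which changes whenever a file is created or removed in it) still matches the date in its name. In that case, the easiest way to select a range is to create two marker files at the boundaries and use the -newer predicate. Note that the stop file below is stamped at 00:00 on 21 September, so directories modified later that day are excluded; stamp it 201409220000 to include them.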
touch -t 201407290000 start
touch -t 201409210000 stop
find . -newer start \! -newer stop -type d
(I don't know how to handle date ranges within a regex, but I hope to find time to learn.)
Using awk
Why not use awk instead of building a static regex for this one case?
Pass the find or ls output to awk with a small program that checks whether each name falls between start and stop. In find output the name begins after the leading ./, so substr($0, 3, 8) extracts the eight-digit date (taking ten characters would pull in part of the time and wrongly exclude the last day of the range):
find . -type d | awk -v start=20140729 -v stop=20140921 \
'{ curr = substr($0, 3, 8); if (curr >= start && curr <= stop) print $0 }'
(It worked for me on AIX and Linux)
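Because the directory names themselves carry the date, the selection can also be done on names alone, without relying on modification times. The following is a minimal sketch, assuming the directories are named YYYYMMDD_HHMMSS and sit directly under one parent directory; the function name dirs_in_range is hypothetical and not part of any answer above.

```shell
# List directories directly under "$1" whose name starts with a date
# between "$2" and "$3" (both YYYYMMDD, inclusive).
dirs_in_range() {
    parent=$1 start=$2 stop=$3
    for d in "$parent"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_*/; do
        [ -d "$d" ] || continue          # the glob matched nothing
        name=${d%/}                      # drop the trailing slash
        name=${name##*/}                 # keep the last path component
        day=${name%%_*}                  # text before the first underscore
        if [ "$day" -ge "$start" ] && [ "$day" -le "$stop" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

It could be called as dirs_in_range . 20140729 20140921. The trailing slash in the glob restricts matches to directories, and the numeric test makes both boundaries inclusive.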

Related

How to print the longest line number for each file in a directory?

I'm trying to list the max line length for files in the current directory, but I'm having trouble with my command working. I believe it's an issue with escaping the curly brackets {} in my exec command. After googling through a ton of find exec escape answers I wasn't able to locate anything about how to escape brackets {} in the exec command. What am I missing?
find . -iname *.page -exec awk '{if(length($0) > L) { LINE=$0;L = length($0)}}
END {print LINE"|"L}' {}\; | sort

There are multiple issues with the original command, none of which involve escaping {}. The first is that there must be a space between {} and \;. The second is how the shell expands the wildcard in find's -iname parameter, *.page.
From the FreeBSD Forums:
"*" is expanded by the shell before the command-line is passed to find(1). If there's only 1 item in the directory, then it works. If there's more than one item in the directory, then it fails as the command-line options are no longer correct.
Wrapping *.page in quotes solves the issue. The final version is
find . -iname '*.page' -exec awk '{if(length($0) > L)
{ LINE=NR;L = length($0)}} END {print L"|"FILENAME":"LINE}' {} \; | sort -n
This outputs a sorted list of the longest line length for each file, together with its line number:
220|./Example1.page:157
206|./Example2.page:203
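The \; form starts one awk process per file. As a variation, find's {} + form hands many files to a single awk, in which case the per-file state has to be reset whenever FNR returns to 1. This is a sketch under that assumption; the function name longest_lines and the awk helper report are hypothetical.

```shell
# For every *.page file under "$1", print "length|file:line" for its longest
# line, sorted by length. Empty files produce no output.
longest_lines() {
    find "$1" -type f -iname '*.page' -exec awk '
        function report() { if (file != "") print max "|" file ":" line }
        FNR == 1 { report(); file = FILENAME; max = -1; line = 0 }
        length($0) > max { max = length($0); line = FNR }
        END { report() }
    ' {} + | sort -n
}
```

Calling longest_lines . gives output in the same shape as the listing above, with far fewer processes when there are many files.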

You want to run awk on each file, right?
Create an executable script, t.sh, in your home directory:
#!/bin/sh
awk '{if(length($0) > L) { LINE=$0;L = length($0)}}
END {print LINE"|"L}' "$1"
Then, on the command line (note the quoted pattern and the terminating \;):
find . -iname '*.page' -exec ~/t.sh {} \; | sort
I'm not too sure about your awk script, but since you think it is what you need, let's leave it as is for now.

Program fails to move file

I'm trying to move files from one place to another directory. My program reads Log_Deleter.txt and uses the parameters given on each line to move the matching files.
When I execute it, it seems to run fine (no errors), but none of the files are moved. I'm not sure why it neither moves the files nor displays any error.
Can someone please identify the error?
my attempt:
#!/bin/ksh
while read -r line ; do
v=$line
set -- $v
cd /
$(find "$1" -type f -name "$2" -mtime +"$3" -exec mv {} "$4" \;)
done < Log_Deleter.txt
Log_Deleter.txt
/usr/IBM/WebSphere/AppServer/profiles/AppSrvSIT1/logs/Server1 'SystemOut_*' 5 /backup/Abackuptest1
/usr/IBM/WebSphere/AppServer/profiles/AppSrvSIT1/logs/Server2 'SystemOut_*' 5 /backup/Abackuptest2
Thanks for your help!

find is looking for files that have a literal ' in the name. You need to remove the single quotes from $2 before invoking find. Quoting "$name" when stripping them keeps the shell from expanding the * before tr sees it. Try:
#!/bin/ksh
while read -r path name mtime dest ; do
name=$( printf '%s\n' "$name" | tr -d "'" )
find "$path" -type f -name "$name" -mtime +"$mtime" -exec mv {} "$dest" \;
done < Log_Deleter.txt

The problem is that you are trying to match a file whose name actually has the single quotes in it.
Barring other problems, I think your script will probably work once you take the quotes out of Log_Deleter.txt.
The quotes are only meaningful when the shell is parsing command input, which is not what the read builtin does. And even when reading command input, once the quotes get into a variable they stay there unless the value is reparsed by the shell via eval.
The shell is not exactly a macro processor. It's a complicated hybrid that is a little bit CLI, a little bit programming language, and a little bit macro processor.
And, speaking of eval, it's not necessary to wrap the find in an eval-like construct. Simplify your script to run find directly and you will find it easier to debug and understand.
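The point that quotes survive read untouched can be checked directly. The snippet below is a small illustration, assuming a line shaped like those in Log_Deleter.txt; the paths in it are hypothetical placeholders.

```shell
# A line as it might appear in Log_Deleter.txt, with shell-style quotes
# around the pattern (the paths are made up for illustration).
line="/some/logs 'SystemOut_*' 5 /backup/dest"

# read splits on whitespace only; the quote characters stay in the value.
read -r path name mtime dest <<EOF
$line
EOF

raw=$name            # still contains both quote characters
name=${name#\'}      # strip one leading single quote, if present
name=${name%\'}      # strip one trailing single quote, if present
printf 'raw=%s cleaned=%s\n' "$raw" "$name"
```

Stripping with parameter expansion avoids an extra tr process and never exposes the pattern to filename expansion.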

UNIX find: opposite of -newer option exists?

I know there is this option for unix's find command:
find -version
GNU find version 4.1
-newer file Compares the modification date of the found file with that of
the file given. This matches if someone has modified the found
file more recently than file.
Is there an option that will let me find files that are older than a certain file? I would like to delete all such files from a directory for cleanup. An alternative that finds all files older than N days would do the job too.

You can use a ! to negate the -newer operation like this:
find . \! -newer filename
If you want to find files that were last modified more than 7 days ago, use:
find . -mtime +7
UPDATE:
To avoid matching on the file you are comparing against use the following:
find . \! -newer filename \! -samefile filename
UPDATE2 (several years later):
The following is more complicated, but does do a strictly older than match. It uses -exec and test -ot to test each file against the comparison file. The second -exec is only executed if the first one (the test) succeeds. Remove the echo to actually remove the files.
find . -type f -exec test '{}' -ot filename \; -a -exec echo rm -f '{}' +

You can just use negation:
find ... \! -newer <reference>
You might also try the -mtime/-atime/-ctime family of options (plus -Btime on BSD find), which match on a file's age in days; for example, -mtime +7 matches files last modified more than 7 days ago.
Beware of deleting files from a find operation, especially one running as root; there are a whole bunch of ways an unprivileged, malicious process on the same system can trick it into deleting things you didn't want deleted. I strongly recommend you read the entire "Deleting Files" section of the GNU find manual.

If you only need files that are older than file "foo" and not foo itself, exclude the file by name using negation:
find . ! -newer foo ! -name foo

Please note that the negation of -newer means "older or same timestamp".
As this example shows, the reference file itself is also returned:
thomas@vm1112:/home/thomas/tmp/ touch test
thomas@vm1112:/home/thomas/tmp/ find ./ ! -newer test
./test

Unfortunately, find doesn't support this directly.
! -newer doesn't mean older. It only means not newer, so it also matches files that have an equal modification time. So I would rather use
for f in path/files/etc/*; do
[ "$f" -ot reference_file ] && {
echo "$f is older"
# do something
}
done

find dir \! -newer fencefile -exec \
sh -c '
for f in "$@"; do
[ "$f" -ot fencefile ] && printf "%s\n" "$f"
done
' sh {} +
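The two ideas in these answers (pre-filter with ! -newer, then confirm with test -ot) can be wrapped in a small function and tried on files with known timestamps. This is a sketch only: the function name older_than is hypothetical, and it assumes the system's sh supports the -ot test, which is common but not required by POSIX.

```shell
# Print regular files under "$1" that are strictly older than the
# reference file "$2" (files with an equal timestamp are not printed).
older_than() {
    find "$1" -type f ! -newer "$2" -exec sh -c '
        ref=$1; shift
        for f in "$@"; do
            if [ "$f" -ot "$ref" ]; then
                printf "%s\n" "$f"
            fi
        done
    ' sh "$2" {} +
}
```

For a cleanup, the output of older_than dir dir/reference can be reviewed first and only then passed on to rm.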

How do I perform a recursive directory search for strings within files in a UNIX TRU64 environment?

Unfortunately, due to the limitations of our Tru64 UNIX environment, I am unable to use grep's -r switch to search for strings within files across multiple directories and subdirectories.
Ideally, I would like to pass two parameters. The first is the directory where the search should start. The second is a file containing a list of all the strings to be searched for. This list will consist of various directory path names and will include special characters:
ie:
/aaa/bbb/ccc
/eee/dddd/ggggggg/
etc..
The purpose of this exercise is to identify all shell scripts that may have specific hard coded path names identified in my list.
There was one example I found during my investigations that perhaps comes close, but I am not sure how to customize this to accept a file of string arguments:
eg: find etb -exec grep test {} \;
where 'etb' is the directory and 'test', a hard coded string to be searched.

This should do it:
find dir -type f -exec grep -F -f strings.txt {} \;
dir is the directory from which searching will commence
strings.txt is the file of strings to match, one per line
-F means treat search strings as literal rather than regular expressions
-f strings.txt means use the strings in strings.txt for matching
You can add -l to the grep switches if you just want filenames that match.
Footnote:
Some people prefer a solution involving xargs, e.g.
find dir -type f -print0 | xargs -0 grep -F -f strings.txt
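which is perhaps a little more robust/efficient in some cases, provided your find and xargs support -print0 and -0 (these originated as GNU extensions and may be missing on Tru64).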
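Since the question asks for something that takes two parameters, the first command can be wrapped in a function. This is a minimal sketch: the name search_strings is hypothetical, it adds -l to print only file names, and it uses the {} + form of -exec, which batches files much like xargs. Whether Tru64's find accepts {} + is an assumption; if it does not, \; can be used instead.

```shell
# Usage: search_strings DIR STRINGS_FILE
# Prints the names of files under DIR that contain any line of
# STRINGS_FILE as a literal (non-regex) substring.
search_strings() {
    find "$1" -type f -exec grep -l -F -f "$2" {} +
}
```

For example, search_strings /some/dir strings.txt. Keep the strings file outside the searched directory, otherwise it will match itself.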

From the question, I assume we cannot use GNU coreutils, and that egrep is not available.
I also assume (for some reason) that the system is broken and escapes do not work as expected.
Under normal circumstances, grep -rf patternfile.txt /some/dir/ is the way to go.
"a file containing a list of all the strings to be searched"
Assumptions: GNU coreutils are not available, grep -r does not work, and handling of special characters is broken.
Do you have a working awk? It would make life much easier, but let's be on the safe side.
Assume: a working sed and one of od, hexdump, or xxd (from the vim package) are available.
Let's call the list patternfile.txt.
1. Convert list into a regexp that grep likes
Example patternfile.txt contains
/foo/
/bar/doe/
/root/
(The example shows no special characters, but assume they are there.) We must turn it into something like
(/foo/|/bar/doe/|/root/)
Assuming the echo -en command is not broken, and xxd, od, or hexdump is available:
Using hexdump
cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n'
Using od
cat patternfile.txt |od -A none -t x1|tr -d '\n'
and pipe it into (common for both hexdump and od)
|sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'
then pipe result into
|sed 's:^:\\(:g' |sed 's:$:\\):g'
and you have a regexp pattern that is escaped.
2. Feed the escaped pattern into the broken regexp engine
Assuming the bare minimum of shell escaping is available,
we use grep "$(echo -en "ESCAPED_PATTERN" )" to do our job.
3. To sum it up
Building an escaped regexp pattern (using hexdump as the example):
grep "$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"
This escapes all characters and encloses them in (|) brackets so that a regexp OR match is performed.
4. Recursive directory lookup
Under normal circumstances, even when grep -r is broken, find /dir/ -exec grep REGEXP_PATTERN {} \; should work.
Some may prefer xargs instead (unless you happen to have a buggy xargs).
We prefer the find /somedir/ -type f -print0 |xargs -0 grep -f 'patternfile.txt' approach, but since
this is not available (for whatever valid reason),
we need to exec grep for each file, which is normally the wrong way.
But let's do it.
Assume: find -type f works.
Assume: xargs is broken OR not available.
First, if you have a buggy pipe, it might not handle a large number of files.
So we avoid xargs on such systems (I know, I know, let's just pretend it is broken).
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
If your shell handles large lists nicely,
for file in $(cat list-of-all-file-to-search-for.txt) ; do grep REGEXP_PATTERN "$file" ;
done ; is a nice way to get by. Unfortunately, some systems do not like that,
and in that case, you may require
cat list-of-all-file-to-search-for.txt | split -a 4 -d -l 2000 - file-smaller-chunk.part.
to turn it into smaller chunks. Now this is for a seriously broken system.
Then
for file in file-smaller-chunk.part.* ; do for single_line in $(cat "$file") ; do grep REGEXP_PATTERN "$single_line" ; done ; done ;
should work.
A
cat filelist.txt |while read -r file ; do grep REGEXP_PATTERN "$file" ; done ;
may be used as a workaround on some systems.
What if my shell does not handle quotes?
You may have to escape the file list beforehand.
It can be done much more nicely in awk, perl, whatever, but since we restrict ourselves to
sed, let's do it.
We assume 0x27, the ' code, will actually work.
cat list-of-all-file-to-search-for.txt |sed 's#['\'']#'\''\\'\'\''#g'|sed 's:^:'\'':g'|sed 's:$:'\'':g'
The only time I had to use this was when feeding output into bash again.
What if my shell does not handle that?
xargs fails, grep -r fails, the shell's for loop fails.
Do we have other options? Yes.
Escape all input suitably for your shell, and make a script.
But you know what, I got bored, and writing automated scripts for csh just seems
wrong. So I am going to stop here.
Take-home note
Use the right tool for the job. Writing an interpreter in bc is perfectly
possible, but it is just plain wrong. Install coreutils, perl, a better grep,
whatever. It makes life better.

Diff files present in two different directories

I have two directories with the same list of files. I need to compare all the files present in both the directories using the diff command. Is there a simple command line option to do it, or do I have to write a shell script to get the file listing and then iterate through them?

You can use the diff command for that:
diff -bur folder1/ folder2/
This will output a recursive diff that ignores whitespace changes, with a unified context:
b flag means ignoring changes in the amount of whitespace
u flag means a unified context (3 lines before and after)
r flag means recursive

If you are only interested in seeing which files differ, you may use:
diff -qr dir_one dir_two | sort
Option "q" reports only which files differ, not how their contents differ, and "sort" arranges the output alphabetically.

Diff has an option -r which is meant to do just that.
diff -r dir1 dir2

diff can not only compare two files, it can, by using the -r option, walk entire directory trees, recursively checking differences between subdirectories and files that occur at comparable points in each tree.
$ man diff
...
-r --recursive
Recursively compare any subdirectories found.
...
Another nice option is the über-diff-tool diffoscope:
$ diffoscope a b
It can also emit diffs as JSON, html, markdown, ...

If you specifically don't want to compare the contents of files and only want to check which ones are not present in both directories, you can compare lists of files generated by another command.
diff <(find DIR1 -printf '%P\n' | sort) <(find DIR2 -printf '%P\n' | sort) | grep '^[<>]'
-printf '%P\n' tells find to not prefix output paths with the root directory.
I've also added sort to make sure the order of files will be the same in both calls of find.
The grep at the end removes information about identical input lines.
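The command above depends on GNU find's -printf and on process substitution. Where either is unavailable, the same comparison of file lists can be sketched with temporary files and comm. The function name only_in_one is hypothetical, and mktemp is assumed to exist on the system.

```shell
# Print paths that exist under only one of the two trees:
# "< path" for entries only under "$1", "> path" for entries only under "$2".
only_in_one() {
    list_a=$(mktemp) list_b=$(mktemp)
    (cd "$1" && find . | sort) > "$list_a"   # paths relative to the first tree
    (cd "$2" && find . | sort) > "$list_b"   # paths relative to the second tree
    comm -23 "$list_a" "$list_b" | sed 's/^/< /'
    comm -13 "$list_a" "$list_b" | sed 's/^/> /'
    rm -f "$list_a" "$list_b"
}
```

Running find from inside each tree plays the role of %P: both lists hold paths relative to their own root, so they can be compared line by line.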

If it's GNU diff then you should just be able to point it at the two directories and use the -r option.
Otherwise, try using
for i in $(\ls -d ./dir1/*); do diff ${i} dir2; done
N.B. As pointed out by Dennis in the comments section, you don't actually need to do the command substitution on the ls. I've been doing this for so long that I'm pretty much doing this on autopilot and substituting the command I need to get my list of files for comparison.
Also I forgot to add that I do '\ls' to temporarily disable my alias of ls to GNU ls so that I lose the colour formatting info from the listing returned by GNU ls.

When working with git/svn, or with multiple git/svn checkouts on disk, this has been one of the most useful things for me over the past 5-10 years; somebody else might find it useful too:
diff -burN /path/to/directory1 /path/to/directory2 | grep +++
or:
git diff /path/to/directory1 | grep +++
It gives you a snapshot of the different files that were touched without having to "less" or "more" the output. Then you just diff on the individual files.

In practice, the question often comes with some constraints, such as comparing only certain file types. In that case the following template may come in handy (it assumes dir2 is a sibling of dir1).
cd dir1
find . \( -name '*.txt' -o -iname '*.md' \) | xargs -I{} diff -u {} ../dir2/{}

Here is a script to show differences between files in two folders. It works recursively. Change dir1 and dir2.
(search() { for i in $1/*; do [ -f "$i" ] && (diff "$1/${i##*/}" "$2/${i##*/}" || echo "files: $1/${i##*/} $2/${i##*/}"); [ -d "$i" ] && search "$1/${i##*/}" "$2/${i##*/}"; done }; search "dir1" "dir2" )

Try this:
diff -rq /path/to/folder1 /path/to/folder2
