find command characteristics on unix - unix

How can I delete some files from directory2 and directory3 at the same time using the find command?
Some files in directory1 share the same characteristics as files in the two other directories, and directories 1, 2, and 3 are all located in directory0.

I suppose you meant:
(cd directory1 && find . -iname 'somename' -print0) |
tee >(cd directory2 && xargs -0 rm -fv) |
tee >(cd directory3 && xargs -0 rm -fv)
and so on, ad libitum? I'll make the answer more specific once you clarify your question.
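If the goal is to remove from the other directories every file whose name matches something in directory1, the same idea can be written as a loop instead of a chain of tee stages. This is only a sketch under that assumption: delete_matching is a made-up helper name, and it relies on -iname, -print0 and xargs -0, which GNU and BSD tools provide but POSIX does not guarantee.

```shell
# Hypothetical helper: for every file in SRC_DIR whose name matches PATTERN,
# delete the file with the same relative path from each TARGET_DIR.
# Usage: delete_matching SRC_DIR PATTERN TARGET_DIR...
delete_matching() {
    src=$1 pattern=$2
    shift 2
    for target in "$@"; do
        # Paths are printed relative to SRC_DIR, so they can be resolved
        # again from inside each target directory.
        (cd "$src" && find . -type f -iname "$pattern" -print0) |
            (cd "$target" && xargs -0 rm -f --)
    done
}
```

Called as delete_matching directory1 'somename' directory2 directory3 from inside directory0, it leaves directory1 untouched.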

To find (from the man page examples):
find / -newerct '1 minute ago' -print
Print out a list of all the files whose inode change
time is more recent than the current time minus one minute.
To remove them:
find / -newerct '1 minute ago' -print0 | xargs -0 rm

Related

Finding and sorting files by size in Unix

I want to create a function in shell programming that gets 2 parameters, directory-name and file-name and that does the following: it searches starting in the given directory-name for the file-name and then goes in all subdirectories of the directory-name to continue the search. I want the output to be every parent-directory where the file-name has been found, sorted using the file-name size.
Help would be much appreciated, thanks.
Not sure which Unix you are asking about, but for Linux and probably most common Unix systems:
find <directory> -name "<filename>" -ls | sort -k 7 -n -r | awk '{print $NF}' | xargs -n 1 dirname
sort => sort by file size, largest first (the 7th column of the find -ls output is the file size)
awk => print the full path of the file (the last column)
dirname => get the parent directory of the matched file
Example:
# Find parent directory of all types.h under /usr/include, sorted by file size in desc order
$ find /usr/include/ -name "types.h" -ls | sort -k 7 -n -r | awk '{print $NF}' | xargs -n 1 dirname
/usr/include/x86_64-linux-gnu/bits
/usr/include/x86_64-linux-gnu/sys
/usr/include/c++/7/parallel
/usr/include/rpc
/usr/include/linux/sched
/usr/include/linux/iio
/usr/include/linux
/usr/include/asm-generic
/usr/include/x86_64-linux-gnu/asm
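Since the question asked for a shell function taking a directory name and a file name, the pipeline above could be wrapped roughly like this. It is a sketch that assumes GNU find (for -printf); dirs_by_size is a made-up name. Printing the size and the parent directory directly avoids parsing the -ls columns.

```shell
# Hypothetical function: print the parent directory of every file named
# FILE_NAME below DIR, largest file first. Requires GNU find for -printf.
# Usage: dirs_by_size DIR FILE_NAME
dirs_by_size() {
    # %s is the size in bytes, %h the directory containing the file.
    find "$1" -type f -name "$2" -printf '%s\t%h\n' |
        sort -nr |
        cut -f2-
}
```

For example, dirs_by_size /usr/include types.h should list the same directories as the example above, provided no path contains a newline.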

Combine find, grep and xargs with printf

I have a find command combined with -exec grep and a -printf option:
find -L /home/blast/dirtest -maxdepth 3 -exec grep -q "pattern" {} \; -printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
Result :
f/#/2018-01-01 10:00:00/#/191/#/filee.xml/#//#//home/blast/dirtest/01/05
I need the printf to get all the desired file information at once (date, type, size, etc.).
The above command works fine, but the -exec option is too slow compared to xargs.
I tried to do the same with xargs but did not succeed.
Any idea how to achieve that using xargs while keeping the desired printf or something similar?
Thanks
Your code is:
find -L /home/blast/dirtest -maxdepth 3 \
-exec grep -q "pattern" {} \; \
-printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
This invokes a new grep process for each file.
If you are using GNU utilities, you can reduce the number of grep processes by something like:
(
format=\''%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n'\'
find -L /home/blast/dirtest -maxdepth 3 -print0 |\
xargs -0 grep -l -Z "pattern" |\
xargs -0 sh -c 'find "$@" -printf '"$format" --
) 2>/dev/null
for clarity, store the formatstring in a variable
use -print0 / -0 / -Z options to enable null-delimited data
generate initial filelist with find
filter on "pattern" with grep (use of xargs minimises the number of times grep gets called)
feed the filtered filelist into another xargs to run a minimal number of find -printf
in second xargs, call a subshell so that extra arguments can be appended (find requires the paths to precede the operators)
dummy second argument (--) to the sh -c invocation prevents the first filename being lost due to assignment to $0
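The last point is easy to check in isolation. The two functions below (names made up for this illustration) feed the same three NUL-delimited words through xargs into sh -c, once without and once with the placeholder argument.

```shell
# No placeholder: the first word is assigned to $0 and is missing from "$@".
args_without_placeholder() {
    printf '%s\0' one two three | xargs -0 sh -c 'printf "%s\n" "$@"'
}

# With -- occupying the $0 slot, all three words reach "$@".
args_with_placeholder() {
    printf '%s\0' one two three | xargs -0 sh -c 'printf "%s\n" "$@"' --
}
```

The first function prints two and three; the second prints one, two and three.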
To do it exactly how you want:
find -L /home/blast/dirtest/ -maxdepth 3 \
-printf '%p#%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' \
> tmp.out
cut -d# -f1 tmp.out \
| xargs grep -l "pattern" 2>/dev/null \
| sed 's/^/^/; s/$/#/' \
| grep -f /dev/stdin tmp.out \
| sed 's/^.*#//'
This operates under the assumption that you have no character # in your file names.
What it does is avoid the grep at first and just dump all the files with the requested metadata to a temporary file.
But it also prefixes each line with the full path (%p#).
Then we extract (cut) the full paths out of this list and list the files which contains the pattern (xargs grep).
We then use sed to prefix each such file name with ^ and suffix it with #, which makes it a greppable pattern in our tmp.out file.
Then we use this pattern (grep -f /dev/stdin) to extract only those paths from the big list in tmp.out.
Now all that's left is to remove the artificial full path we prefixed using the last sed command.
Seeing how you used /home, there's a good chance you're on Linux, which, if you're willing to accept some output format changes, allows you to do it somewhat more elegantly:
find -L /home/blast/dirtest/ -maxdepth 3 \
| xargs grep -l "pattern" 2>/dev/null \
| xargs stat --printf '%F/#/%y/#/%s/#/%n\n'
The output of stat --printf is different from that of find -printf (and from that of MacOS' stat -f), but it's the same information.
Do note, however, that because you passed -L to find, and you're grepping the result:
The results are limited to file types which can be grepped, so they will never be directories, links, etc..
If you stumble upon a broken link, it will not be in the output because it cannot be grepped.
I've found an interesting thing about the -exec option.
We can run grep far fewer times by terminating -exec with a plus sign (+):
-exec command {} +
This variant of the -exec option runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total
number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command
lines. Only one instance of ’{}’ is allowed within the command. The command is executed in the starting directory.
That means that if I change this:
-exec grep -l 'pattern' {} \;
to this (replacing the semicolon with the plus sign):
-exec grep -l 'pattern' {} \+
performance improves significantly.
Then I only need a single xargs stage for the formatted output.
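Putting the two ideas together might look like the sketch below. It assumes GNU find, grep and xargs (for -printf, -Z and -r), and list_matching is a made-up name. The first find batches files into as few grep calls as possible; the second find only formats the paths grep reported.

```shell
# Hypothetical function: print the formatted record for every regular file
# below DIR (at most 3 levels deep) that contains PATTERN.
# Usage: list_matching DIR PATTERN
list_matching() {
    dir=$1 pattern=$2
    # grep -lZ prints matching file names NUL-terminated; {} + batches files.
    find -L "$dir" -maxdepth 3 -type f -exec grep -lZ -e "$pattern" {} + 2>/dev/null |
        # -maxdepth 0 makes the inner find report only the paths it is given.
        xargs -0 -r sh -c 'find -L "$@" -maxdepth 0 -printf "%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n"' sh
}
```

Unlike the original -exec grep -q version, this only ever looks at regular files (after following links), since directories cannot be grepped anyway.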

Sort Matched Files by Last Modified and Timestamp

I need to look for files that match a certain pattern of characters, then find the most recent file and display it. The below code isn't quite getting me there but, I think I'm close.
Code:
find /home/weather/data/blend/ -type f -name "*.ctl" -printf '%Ts\t%p\n' | sort -nr | cut -f2
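Here's a working solution (note the + terminator, so that a single ls -t sees all the files and can sort them by modification time):
find . -mmin -720 -type f -name "*.ctl" -exec ls -t {} + | cut -c 3-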
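The pipeline from the question is only missing a step that keeps the first line after sorting. A possible completion, assuming GNU find for -printf (newest_match is a made-up name, and %T@ is used here instead of %Ts to keep sub-second precision):

```shell
# Hypothetical function: print the most recently modified file below DIR
# whose name matches the glob PATTERN.
# Usage: newest_match DIR PATTERN
newest_match() {
    # %T@ is the modification time in seconds since the epoch.
    find "$1" -type f -name "$2" -printf '%T@\t%p\n' |
        sort -nr |
        head -n 1 |
        cut -f2-
}
```

For the case in the question this would be called as newest_match /home/weather/data/blend/ '*.ctl'.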

UNIX find for finding file names not paired by

Is there a simple way to recursively find all files in a directory hierarchy, that do not have a matching file with a different extension?
For example the directory has a bunch of files ending in .dat
I want to find the .dat files that do not have an accompanying .out file.
I have a while loop that checks each entry, but that is slow for long lists...
I am using GNU find.
Perhaps something like this?
find . -name "*.dat" -print | sort > column1.txt
find . -name "*.out" -print | sort > column2.txt
diff column1.txt column2.txt
I haven't tested it, but I think it's probably close to what you're asking for.
find . -name '*.dat' -printf "[ -f %p ] || echo %p\n" | sed 's/\.dat/.out/' | sh
I had to add a bunch of bells and whistles to the 1st solution, but that was a good start, thanks...
find . -print | grep -Fi '.dat' | grep -vFi '.dat.' | sort | sed -e 's/.dat//g' > column1.txt
find . -print | grep -Fi '.out' | grep -vFi '.out.' | sort | sed -e 's/.out//g' > column2.txt
sdiff -s column1.txt column2.txt | grep -F '<' | cut -f1 -d"<" > c12diff.txt
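Another way to avoid both the temporary files and one process per file is to let find hand the .dat files in batches to a small inline shell loop. This is a sketch using only POSIX features; unpaired_dat is a made-up name.

```shell
# Hypothetical function: print every .dat file below DIR that has no
# sibling file with the same base name and an .out extension.
# Usage: unpaired_dat DIR
unpaired_dat() {
    find "$1" -type f -name '*.dat' -exec sh -c '
        for f do
            # ${f%.dat} strips the extension so .out can be appended.
            [ -e "${f%.dat}.out" ] || printf "%s\n" "$f"
        done' sh {} +
}
```

Because the existence test is a shell builtin, the cost per file is small even for long lists.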

Shell Script — Get all files modified after <date>

I'd rather not do this in PHP, so I'm hoping someone decent at shell scripting can help.
I need a script that runs through a directory recursively and finds all files whose last-modified date is later than some given date. It should then tar and zip the file(s), keeping the path information.
As simple as:
find . -mtime -1 | xargs tar --no-recursion -czf myfile.tgz
where find . -mtime -1 selects all files under the current directory (recursively) that were modified within the last day. With GNU find you can also use fractions, for example:
find . -mtime -1.5 | xargs tar --no-recursion -czf myfile.tgz
If you have GNU find, then there are a legion of relevant options. The only snag is that the interface to them is less than stellar:
-mmin n (modification time in minutes)
-mtime n (modification time in days)
-newer file (modification time newer than modification time of file)
-daystart (adjust start time from current time to start of day)
Plus alternatives for access time and 'change' or 'create' time.
The hard part is determining the number of minutes since a time.
One option worth considering: use touch to create a file with the required modification time stamp; then use find with -newer.
touch -t 200901031231.43 /tmp/wotsit
find . -newer /tmp/wotsit -print
rm -f /tmp/wotsit
This looks for files newer than 2009-01-03T12:31:43. Clearly, in a script, /tmp/wotsit would be a name with the PID or other value to make it unique; and there'd be a trap to ensure it gets removed even if the user interrupts, and so on and so forth.
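A version with the unique name and the cleanup trap might look like this. It is a sketch: files_newer_than is a made-up name, and it assumes mktemp is available. The function body runs in a subshell so that the trap does not leak into the calling shell.

```shell
# Hypothetical function: list files below DIR modified after STAMP, where
# STAMP uses the touch -t format [[CC]YY]MMDDhhmm[.SS].
# Usage: files_newer_than STAMP [DIR]
files_newer_than() (
    stamp=$1 dir=${2:-.}
    ref=$(mktemp) || exit 1
    # Remove the reference file on normal exit and on interruption.
    trap 'rm -f "$ref"' EXIT
    trap 'exit 1' HUP INT TERM
    touch -t "$stamp" "$ref" || exit 1
    find "$dir" -type f -newer "$ref" -print
)
```

For the timestamp above, the call would be files_newer_than 200901031231.43 .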
You can do this directly with tar and even better:
tar -N '2014-02-01 18:00:00' -jcvf archive.tar.bz2 files
This instructs tar to archive files newer than 1 February 2014, 18:00:00.
This will work for some number of files. You want to include "-print0" and "xargs -0" in case any of the paths have spaces in them. This example looks for files modified in the last 7 days. To find those modified before the last 7 days, use "+7".
find . -mtime -7 -print0 | xargs -0 tar -cjf /foo/archive.tar.bz2
As this page warns, xargs can cause the tar command to be executed multiple times if there are a lot of arguments, and the "-c" flag could cause problems. In that case, you would want this:
find . -mtime -7 -print0 | xargs -0 tar -rf /foo/archive.tar
You can't update a zipped tar archive with tar, so you would have to bzip2 or gzip it in a second step.
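With GNU tar (bsdtar accepts the same options), another way around the multiple-invocation problem is to skip xargs and let tar read the NUL-delimited list itself, so there is only ever one tar process and compression can stay in the same step. A sketch, with archive_recent as a made-up name:

```shell
# Hypothetical function: archive regular files below DIR that were modified
# within the last DAYS days into a gzip-compressed tarball.
# Usage: archive_recent DAYS ARCHIVE [DIR]
archive_recent() {
    days=$1 archive=$2 dir=${3:-.}
    # --null must come before -T so the list is read as NUL-delimited.
    find "$dir" -type f -mtime "-$days" -print0 |
        tar --null -T - -czf "$archive"
}
```

Keep the archive outside DIR, otherwise find may pick up the archive while it is being written.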
This should show all files modified within the last 7 days.
find . -type f -mtime -7 -print
Pipe that into tar/zip, and you should be good.
I would simply do the following to backup all new files from 7 days ago
tar --newer $(date -d'7 days ago' +"%d-%b") -zcf thisweek.tgz .
note you can also replace '7 days ago' with anything that suits your need
Can be : date -d'yesterday' +"%d-%b"
Or even : date -d'first Sunday last month' +"%d-%b"
Well, under Linux, try reading the man page of the find command:
man find
Something like this should work (GNU tar reads the NUL-delimited file list from standard input):
find . -type f -mtime -7 -print0 | tar --null -T - -cf - | gzip -9 > recent.tar.gz
and you have it.
You can get a list of files last modified later than x days ago with:
find . -mtime -x
Then you just have to tar and zip files in the resulting list, e.g.:
tar czvf mytarfile.tgz `find . -mtime -30`
for all files modified during last month.
This script will find files having a modification date of two minutes before and after the given date (and you can change the values in the conditions as per your requirement)
PATH_SRC="/home/celvas/Documents/Imp_Task/"
PATH_DST="/home/celvas/Downloads/zeeshan/"
cd "$PATH_SRC" || exit 1
# Epoch seconds for today's midnight and for the current time
TODAY=$(date -d "$(date +%F)" +%s)
TODAY_TIME=$(date -d "$(date +%T)" +%s)
for f in *
do
    # Modification time with the trailing timezone field stripped
    MOD_DATE=$(stat -c %y "$f")
    MOD_DATE=${MOD_DATE% *}
    MOD_DATE1=$(date -d "$MOD_DATE" +%s)
    DIFF_IN_DATE=$(( MOD_DATE1 - TODAY ))
    DIFF_IN_DATE1=$(( MOD_DATE1 - TODAY_TIME ))
    #echo DIFF: $DIFF_IN_DATE
    #echo DIFF1: $DIFF_IN_DATE1
    if [[ ($DIFF_IN_DATE -ge -120) && ($DIFF_IN_DATE1 -le 120) && ($DIFF_IN_DATE1 -ge -120) ]]
    then
        echo "File lies in Next Hour = $f"
        echo "MOD_DATE: $MOD_DATE"
        #mv "$PATH_SRC/$f" "$PATH_DST/$f"
    fi
done
For example, if you want only files modified before the given date, change 120 to 0 in the $DIFF_IN_DATE condition and drop the $DIFF_IN_DATE1 conditions.
Similarly, if you want files modified within 1 hour before or after the given date,
just replace 120 with 3600 in the if condition.
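With GNU find and GNU date, the same kind of window around an arbitrary date can probably be expressed without a loop, using -newermt for both edges. This is a sketch, not a drop-in replacement for the script above: files_near is a made-up name, and it takes the reference date as an argument rather than using the current time.

```shell
# Hypothetical function: list regular files directly inside DIR whose
# modification time lies within WINDOW seconds of the date WHEN.
# Usage: files_near WHEN WINDOW [DIR]
files_near() {
    when=$1 window=$2 dir=${3:-.}
    center=$(date -d "$when" +%s) || return 1
    # Convert both edges of the window back into timestamps find can parse.
    lo=$(date -d "@$((center - window))" '+%Y-%m-%d %H:%M:%S')
    hi=$(date -d "@$((center + window))" '+%Y-%m-%d %H:%M:%S')
    find "$dir" -maxdepth 1 -type f -newermt "$lo" ! -newermt "$hi" -print
}
```

For a two-minute window this would be files_near '2021-06-01 12:01:00' 120 /some/dir; use 3600 for one hour.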
