Can't seem to search just first line of text - unix

I'm using the following to search only the first line of a file for the report name. It's searching the whole file instead. I thought NR==1 would limit the search to the first line. I think I just have bad syntax.
find /SYM/SYM000/REPORT/ -type f -mmin -480 \
-name '[0-9][0-9][0-9][0-9][0-9][0-9]' \
-exec awk '/My Report Title/,NR==1 {print FILENAME; exit}' {} \;
Any help is appreciated.
I just want to return the filename. The find looks for files modified in the past eight hours whose names match a 6-digit mask.

hek2gml's answer contains the crucial pointer - you must use && for logical AND rather than a range - but the command can be made more efficient in two respects:
Short-circuit processing of a given input file so that processing stops after the first line.
Passing (typically) all files to a single awk call, by terminating the -exec primary with + rather than \;
find /SYM/SYM000/REPORT/ -type f -mmin -480 \
-name '[0-9][0-9][0-9][0-9][0-9][0-9]' \
-exec awk '/My Report Title/ { print FILENAME } { nextfile }' {} +
This command only ever looks at the 1st line of each input file.
nextfile is not strictly POSIX-compliant, so if your awk doesn't have it (GNU Awk, Mawk, and BSD/OSX Awk do - not sure about AIX), use (less efficient, because it must read all lines of each file):
find /SYM/SYM000/REPORT/ -type f -mmin -480 \
-name '[0-9][0-9][0-9][0-9][0-9][0-9]' \
-exec awk 'FNR == 1 && /My Report Title/ { print FILENAME }' {} +
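Note the switch from NR to FNR: when one awk call receives many files, NR counts records cumulatively across all of them, while FNR restarts at 1 for each file. A quick demonstration, using two hypothetical sample files:
printf 'one\ntwo\n' > /tmp/a.txt
printf 'three\n' > /tmp/b.txt
awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' /tmp/a.txt /tmp/b.txt
# /tmp/a.txt NR=1 FNR=1
# /tmp/a.txt NR=2 FNR=2
# /tmp/b.txt NR=3 FNR=1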
If, in the absence of nextfile, you'd rather call awk once per file (-exec terminated with \;, as in the original solution attempt), the following reads only the first line of each file but spawns a separate awk process per file:
find /SYM/SYM000/REPORT/ -type f -mmin -480 \
-name '[0-9][0-9][0-9][0-9][0-9][0-9]' \
-exec awk '/My Report Title/ { print FILENAME } { exit }' {} \;

It looks like you assume that /My Report Title/,NR==1 is a list of conditions separated by a comma. That assumption is wrong: in awk, pattern1,pattern2 is a range pattern, which matches every line from one satisfying pattern1 through the next satisfying pattern2.
The right approach in this case is to combine the conditions with the logical AND operator &&:
find /SYM/SYM000/REPORT/ -type f -mmin -480 \
-name '[0-9][0-9][0-9][0-9][0-9][0-9]' \
-exec awk '/My Report Title/ && NR==1 {print FILENAME; exit}' {} \;
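To see concretely why the range form misbehaves, here is a minimal demonstration with a hypothetical sample file. When the title sits below line 1, the range /My Report Title/,NR==1 starts there and, because NR==1 can never be true again, runs to the end of the file, so the filename is printed for any file that matches anywhere:
printf 'header line\nMy Report Title\nmore text\n' > /tmp/demo.rpt
awk '/My Report Title/,NR==1 { print FILENAME; exit }' /tmp/demo.rpt  # prints /tmp/demo.rpt
awk 'NR==1 && /My Report Title/ { print FILENAME }' /tmp/demo.rpt     # prints nothing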

Related

Combine find, grep and xargs with printf

I have a find command combined with exec grep and a printf option :
find -L /home/blast/dirtest -maxdepth 3 -exec grep -q "pattern" {} \; -printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
Result :
f/#/2018-01-01 10:00:00/#/191/#/filee.xml/#//#//home/blast/dirtest/01/05
I need the printf to get all the desired file information at once (date, type, size, etc.).
The above command works fine, but the -exec option is too slow compared to xargs.
I tried to do the same with xargs but did not succeed.
Any idea on how to achieve that, using xargs while keeping the desired printf (or similar)?
Thanks
Your code is:
find -L /home/blast/dirtest -maxdepth 3 \
-exec grep -q "pattern" {} \; \
-printf '%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' 2> /dev/null
This invokes a new grep process for each file.
If you are using GNU utilities, you can reduce the number of grep processes by something like:
(
format=\''%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n'\'
find -L /home/blast/dirtest -maxdepth 3 -print0 |\
xargs -0 grep -l -Z "pattern" |\
xargs -0 sh -c 'find "$@" -printf '"$format" --
) 2>/dev/null
for clarity, store the format string in a variable
use the -print0 / -0 / -Z options to enable null-delimited data
generate the initial file list with find
filter on "pattern" with grep (use of xargs minimises the number of times grep gets called)
feed the filtered file list into another xargs to run a minimal number of find -printf calls
in the second xargs, call a subshell so that extra arguments can be appended (find requires the paths to precede the operators)
the dummy second argument (--) to the sh -c invocation prevents the first filename from being lost by assignment to $0
To do it exactly how you want:
find -L /home/blast/dirtest/ -maxdepth 3 \
-printf '%p#%y/#/%TY-%Tm-%Td %TX/#/%s/#/%f/#/%l/#/%h\n' \
> tmp.out
cut -d# -f1 tmp.out \
| xargs grep -l "pattern" 2>/dev/null \
| sed 's/^/^/; s/$/#/' \
| grep -f /dev/stdin tmp.out \
| sed 's/^.*#//'
This operates under the assumption that you have no character # in your file names.
What it does is avoid the grep at first and just dump all the files with the requested metadata to a temporary file.
But it also prefixes each line with the full path (%p#).
Then we extract (cut) the full paths out of this list and list the files which contain the pattern (xargs grep).
We then use sed to prefix each such file name with ^ and suffix it with #, which makes it a greppable pattern in our tmp.out file.
Then we use this pattern (grep -f /dev/stdin) to extract only those paths from the big list in tmp.out.
Now all that's left is to remove the artificial full path we prefixed using the last sed command.
Seeing how you used /home, there's a good chance you're on Linux, which, if you're willing to accept some output format changes, allows you to do it somewhat more elegantly:
find -L /home/blast/dirtest/ -maxdepth 3 \
| xargs grep -l "pattern" 2>/dev/null \
| xargs stat --printf '%F/#/%y/#/%s/#/%n\n'
The output of stat --printf is different from that of find -printf (and from that of MacOS' stat -f), but it's the same information.
Do note, however, that because you passed -L to find, and you're grepping the result:
The results are limited to file types which can be grepped, so they will never be directories, links, etc.
If you stumble upon a broken link, it will not be in the output because it cannot be grepped.
I've found an interesting thing about the -exec option.
We can run grep just once by using -exec with the plus sign (+):
-exec command {} +
This variant of the -exec option runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of '{}' is allowed within the command. The command is executed in the starting directory.
That means if I change this :
-exec grep -l 'pattern' {} \;
by this (replacing the semicolon with a plus sign):
-exec grep -l 'pattern' {} \+
This will improve the performance significantly. Then I only need to pipe into a single xargs for the printf formatting.
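Putting the two pieces together, here is a minimal sketch of the combined pipeline (GNU stat assumed, as in the earlier answer; note that file names containing spaces or newlines would still confuse the plain pipe into xargs):
find -L /home/blast/dirtest -maxdepth 3 -type f \
-exec grep -l "pattern" {} + 2>/dev/null |
xargs stat --printf '%F/#/%y/#/%s/#/%n\n'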

Unix to find exact match of string

I'm looking to find a way to match an exact string
for example:
I have these commands that I run on a unix server:
1.) find ./ -name "*.jsp" -type f -exec grep -m1 -l '50.000' {} + >> 50dotcol.txt
2.) find ./ -name "*.jsp" -type f -exec grep -m1 -l '\<50.000>' {} + >> 50dotcol.txt
Edited after Georges' response:
find ./ -name "*.jsp" -type f -exec grep -m1 -l '(50.000)' {} + >> 50dotcol.txt
Still didn't pull in any results
The first one will find any string containing "50"; the second will omit double-digit strings but will still pull in $50,000 and $50.000. I'm just looking to pull in "50.000" and that's it, no other variations.
Am I missing something in my find cmd?
Use
grep -m1 -l '50\.000'
instead. In a regular expression an unescaped . matches any single character, which is why strings like $50,000 were also pulled in; escaping it as \. makes it match a literal dot only. (Parentheses won't help here: in grep's default BRE syntax, ( and ) match literal parenthesis characters, which is why the edited command returned no results.)
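A quick way to see the difference between the unescaped and the escaped dot (the test strings are made up):
printf '$50.000\n$50,000\n50x000\n' | grep '50.000'   # matches all three lines: . is a wildcard
printf '$50.000\n$50,000\n50x000\n' | grep '50\.000'  # matches only $50.000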

Unix script for loop find error --- find: paths must precede expression

I am writing a script in which a for loop builds up the arguments of a find command.
I am getting the error "find: paths must precede expression"
++ find alogic/batch/Instrument b/Instrument/Bank b/Instrument/container \
b/Instrument/Authorize b/Instrument/Common b/Instrument/Confirm \
-type d -type d '\(' -path alogic/batch/Instrument/BuyerCredit \
-o -path alogic/batch/Instrument/DebitCard '\)' -prune -o -name \
'*.cpp' -print
find: paths must precede expression
e.g.:
inclusive_directories = alogic/batch/Instrument b/Instrument/Bank b/Instrument/container b/Instrument/Authorize b/Instrument/Common b/Instrument/Confirm
exclusive_unix_notation=
-type d \( -path alogic/batch/Instrument/BuyerCredit -o -path alogic/batch/Instrument/DebitCard \)
Script:
for directory in `echo "$APPLOGIC_EXCLUSIVE" "$BIZ_EXCLUSIVE" "$PACKAGE_EXCLUSIVE" "$PIMP_EXCLUSIVE" "$OTHER_EXCLUSIVE"`
do
if [[ -d "$directory" ]]; then
# initially
if [[ "$exclusive_unix_notation" == "" ]]; then
exclusive_unix_notation=" -type d \( -path $directory"
else
exclusive_unix_notation="`echo $exclusive_unix_notation` -o -path $directory"
fi
fi
done
# if processed successfully, add the closing brace
if [[ "$exclusive_unix_notation" != "" ]]; then
exclusive_unix_notation="`echo $exclusive_unix_notation` \) "
fi
# generate the list of cpp files, with the excluded directories pruned
for files in `find $inclusive_directories $exclusive_unix_notation -prune -o -name "*.cpp" -print`
do
if [[ -f "$files" ]]; then
echo "$files"
fi
done | sed 's#^\./##' | sed 's/.cpp/.o/' | sort > $OBJ_LIST
exit;
You are likely breaking your script with constructs like
"`echo $something` another thing"
For what you seem to be doing, you can simply:
myvar="$somevar another thing"
That way you avoid all problems associated with multiple expansion of your commands. And your for loop is suboptimal. Why don't you just have:
find some options -print | sed some other options | ...
To see only regular files you can add -type f to the find command.
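For instance, a concrete pipeline of that shape, reusing the sed/sort step from the original script (the paths are illustrative):
find alogic/batch/Instrument -type f -name '*.cpp' -print |
sed 's#^\./##; s/\.cpp$/.o/' | sort > "$OBJ_LIST"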
Keep simple things simple and try to understand what you are doing. Doing more than needed usually leads to troubles.
UPDATE: besides the general recommendation above, you need to use eval find $.... Otherwise your command-line options are not passed to find as the separate arguments it expects, but as a single argument with spaces inside it. The quotes you see in the trace are inserted by bash to show that \( is being passed literally, when it should really be just (.
Using eval has its own challenges though, because it removes a layer of escaping and quoting. So in your case you may need to additionally escape the * symbol.
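An alternative that avoids eval altogether is to build the find arguments in a bash array, so that every option remains its own word. A sketch, with the directory names taken from the question:
exclusive=()
for directory in alogic/batch/Instrument/BuyerCredit alogic/batch/Instrument/DebitCard; do
  if [[ -d "$directory" ]]; then
    if (( ${#exclusive[@]} == 0 )); then
      exclusive=(-type d \( -path "$directory")
    else
      exclusive+=(-o -path "$directory")
    fi
  fi
done
# close the group and add -prune only if there is something to exclude
if (( ${#exclusive[@]} > 0 )); then
  exclusive+=(\) -prune -o)
fi
find alogic/batch/Instrument "${exclusive[@]}" -name '*.cpp' -print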

Adding data line by line in file in Unix

I am extracting file names from one command; it returns many file names and I am putting them into one file.
Code:
echo `find ${FILE_SYSTEM}/${dir_name}/${sub_dir_name} -type f -size +${BADFILES_SIZE} -exec ls -1lutr {} \; | sort -rn | awk '{print $9}'` >> Somefile.txt
The problem here is that I am not getting each file name on its own line: it's giving two filenames on one line, but I want each filename on one line.
E.g.:
/informatica/ETD/PC9/scripts/kamil/temp/temp1.txt /informatica/ETD/PC9/scripts/kamil/temp/temp2.txt
I am getting filenames as shown above and i want as shown below.
/informatica/ETD/PC9/scripts/kamil/temp/temp1.txt
/informatica/ETD/PC9/scripts/kamil/temp/temp2.txt
Please give your suggestions.
The problem is that you're using echo and backticks. Don't! The echo flattens all its arguments (a list of two files, it seems) into a single line of output.
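A two-line illustration of the flattening (the file names are made up):
echo `printf 'temp1.txt\ntemp2.txt\n'`  # prints: temp1.txt temp2.txt
printf 'temp1.txt\ntemp2.txt\n'         # prints each name on its own line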
Wrong:
echo `find ${FILE_SYSTEM}/${dir_name}/${sub_dir_name} -type f -size +${BADFILES_SIZE} -exec ls -1lutr {} \; | sort -rn | awk '{print $9}'` >> Somefile.txt
Right:
find ${FILE_SYSTEM}/${dir_name}/${sub_dir_name} -type f \
-size +${BADFILES_SIZE} -exec ls -1lutr {} + |
sort -rn |
awk '{print $9}' >> Somefile.txt

Unix Find Replace Special Characters in Multiple Files

I've got a set of files in a web root that all contain special characters that I'd like to remove (Â,€,â,etc).
My command
find . -type f -name '*.*' -exec grep -il "Â" {} \;
finds & lists out the files just fine, but my command
find . -type f -name '*.*' -exec tr -d 'Â' '' \;
doesn't produce the results I'm looking for.
Any thoughts?
To replace all non-ASCII characters in all files inside the current directory you could use:
find . -type f | xargs perl -pi.bak -e 's,[^[:ascii:]],,g'
Afterwards you will have to find and remove all the '.bak' files:
find . -type f -a -name \*.bak | xargs rm
I would recommend looking into sed. It can be used to replace the contents of the file.
So you could use the command:
find . -type f -name '*.*' -exec sed -i "s/Â//" {} \;
I have tested this with a simple example and it seems to work. The -exec should handle files with whitespace in their name, but there may be other vulnerabilities I'm not aware of.
Use
tr -d 'Â'
What does the '' stand for? On my system, using your command produces this error:
tr: extra operand `'
Only one string may be given when deleting without squeezing repeats.
Try `tr --help' for more information.
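For context: tr reads standard input and writes standard output only, and never takes filename operands, so deleting a character from a file needs redirections, roughly like this (a sketch, not atomic):
tr -d 'Â' < file.txt > file.txt.new && mv file.txt.new file.txt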
sed 's/ø//' file.txt
That should do the trick for replacing a special char with an empty string.
find . -name "*.*" -exec sed 's/ø//' {} \;
It would be helpful to know what "doesn't produce the results I'm looking for" means. However, in your command tr is not provided with the file contents to process: tr reads standard input only and takes no filename operands, so you need a shell per file to set up the redirection:
find . -type f -name '*.*' -exec sh -c 'tr -d "Â" < "$1"' _ {} \;
Which is going to output everything to stdout. You probably want to modify the files instead. You can use Grundlefleck's answer, but one of the issues alluded to in that answer is if there are large numbers of files. You can do this:
find . -type f -name '*.*' -print0 | xargs -0 -I{} sed -i "s/Â//" \{\}
which should handle files with spaces in their names as well as large numbers of files.
with bash shell
for file in *.*
do
case "$file" in
*[^[:ascii:]]* )
mv "$file" "${file//[^[:ascii:]]/}"
;;
esac
done
I would use something like this.
for file in `find . -type f`
do
# Search for the chars and remove them. Save the result as file.new
sed -e 's/[Â€â]//g' $file > $file.new
# mv file.new over file. DON'T RUN THIS IF YOU DON'T WANT TO OVERWRITE THE ORIGINAL FILE
mv $file.new $file
done
The above script will fail, as levislevis85 has mentioned, with spaces in filenames. This would not be the case if you use the following code:
find . -type f | while read file
do
# Search for the chars and remove them. Save the result as file.new
sed -e 's/[Â€â]//g' "$file" > "$file".new
# mv file.new over file. DON'T RUN THIS IF YOU DON'T WANT TO OVERWRITE THE ORIGINAL FILE
mv "$file".new "$file"
done
