How to grep for the whole word - unix

I am using the following command to grep stuff in subdirs
find . | xargs grep -s 's:text'
However, this also finds things like <s:textfield name="sdfsf"...../>
What can I do so that it only finds things like <s:text name="sdfsdf"/>
or, for that matter, <s:text somethingElse="lkjkj" name="lkkj"
Basically, s:text and name should be on the same line.

You want the -w option, which restricts matches to whole words, so s:text no longer matches inside s:textfield.
find . | xargs grep -sw 's:text'

Use \b to match on "word boundaries", which will make your search match on whole words only.
So your grep would look something like
grep -r "\bSTRING\b"
adding color and line numbers might help too
grep --color -rn "\bSTRING\b"
From http://www.regular-expressions.info/wordboundaries.html:
There are three different positions that qualify as word boundaries:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
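As a rough illustration of those three positions, here is a small sketch. It assumes GNU grep (\b is an extension, not POSIX), and the sample strings are made up for the demo:

```shell
# Each command prints the number of input lines matching \btext\b.
# "|| true" only keeps the script going when grep finds nothing (exit status 1).
at_start=$(printf 'text here\n' | grep -c '\btext\b' || true)     # boundary before the first character
at_end=$(printf 'some text\n' | grep -c '\btext\b' || true)       # boundary after the last character
in_middle=$(printf 'a text b\n' | grep -c '\btext\b' || true)     # boundaries between word and non-word characters
inside_word=$(printf 'textfield\n' | grep -c '\btext\b' || true)  # no boundary between "t" and "f"

echo "$at_start $at_end $in_middle $inside_word"   # prints: 1 1 1 0
```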

You can drop the xargs command by making grep search recursively. And you normally don't need the 's' flag. Hence:
grep -wr 's:text'

you could try rg, https://github.com/BurntSushi/ripgrep :
rg -w 's:text' .
should do it

Use -w option for whole word match. Sample given below:
[binita#ubuntu ~]# a="abcd efg"
[binita#ubuntu ~]# echo $a
abcd efg
[binita#ubuntu ~]# echo $a | grep ab
abcd efg
[binita#ubuntu ~]# echo $a | grep -w ab
[binita#ubuntu ~]# echo $a | grep -w abcd
abcd efg

Another way to mark the boundaries of the word is \< and \>. Note that it doesn't work without the quotes around the pattern:
grep -r '\<s:text\>' .

If you just want to exclude matches where more text follows directly after s:text, you can do this:
find . | xargs grep -s 's:text '
This finds only s:text instances with a space after the last t. If you need only the s:text instances that also have a name attribute, either pipe the results to another grep expression or use a regex that matches only the elements you need.
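For the "s:text and name on the same line" part of the question, one possible sketch is to chain two greps. The sample file and its contents are invented for the demo; only the two tag shapes come from the question:

```shell
# Hypothetical sample file, created only for this demonstration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.jsp" <<'EOF'
<s:textfield name="sdfsf"/>
<s:text name="sdfsdf"/>
<s:text somethingElse="lkjkj" name="lkkj"/>
<s:text key="no-name-here"/>
EOF

# First grep: whole-word s:text only (drops s:textfield).
# Second grep: keep only lines that also carry a name= attribute.
matches=$(grep -w 's:text' "$tmpdir/sample.jsp" | grep 'name=')
printf '%s\n' "$matches"

rm -rf "$tmpdir"
```

With the sample above this should leave the two s:text lines that have a name attribute. For a real tree, replace the file argument with -r and a directory.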

Related

how to list a specific string or number in a file in Unix

For example, if your file has the following lines
1=10200|2=2343i|3=otit|5=89898|54=9546i96i|10=2459
1=10200|54=9546i96i|10=2459|2=2343i|3=otit|5=8
1=10200|5=IGY|14=897|459=122|132=1|54=9546i96i|10=2459
1=10200|2=2343i|5=0|54=9546i96i
The output should be
5=89898
5=8
5=IGY
5=0
You could use grep with the -o flag to return only the regexp matches.
Assuming you have a file.txt that you want to parse:
cat file.txt | grep -o -E "(\||^)5=[^|]*" | grep -o "5=[^|]*"
This will match anything that starts with 5= up until the first |.
By running this command on the input you provided I get:
5=89898
5=8
5=IGY
5=0
Cheers
Edit: as Walter A suggested, my previous solution did not cover all cases.
I have added an extra parsing step: first, you get all strings that match 5=... at the start of a line, or |5=..., and then you remove the |.
Use (^|[|]) for matching start of field (start of line or |) and remember/match string until next | or end-of-line.
sed -nr 's/.*(^|[|])(5=[^|]*).*/\2/p' file
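A different technique from the two answers above, offered only as a sketch: split each line on | with awk and print the fields that start with 5=. The temporary file just holds the sample lines from the question:

```shell
# Sample data copied from the question into a throwaway file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
1=10200|2=2343i|3=otit|5=89898|54=9546i96i|10=2459
1=10200|54=9546i96i|10=2459|2=2343i|3=otit|5=8
1=10200|5=IGY|14=897|459=122|132=1|54=9546i96i|10=2459
1=10200|2=2343i|5=0|54=9546i96i
EOF

# -F'|' makes | the field separator; /^5=/ only matches fields whose
# tag is exactly 5, so 54=... is skipped.
fields=$(awk -F'|' '{ for (i = 1; i <= NF; i++) if ($i ~ /^5=/) print $i }' "$tmpfile")
printf '%s\n' "$fields"

rm -f "$tmpfile"
```

Because the match is anchored per field, no second pass is needed to strip a leading |.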

How to grep the output from first grep of the date?

I would like to search within the output of a first grep. The first grep finds all the files from yesterday that contain the word "abc". I would also like to know which of those files contain "def".
This is what I have
find /tttt/aaaa/bbb -type f -mtime -1 -mtime 0 -print0| xargs -0 grep -l "abc"
I would like to find the output that also contains "def".
It sounds like you are wanting to search for multiple strings in a list. This can be accomplished with the -E switch on grep. This allows you to provide search strings separated with a pipe.
grep -E 'abc|def'
Another option is to use a backslash on the pipe:
grep 'abc\|def'
EDIT: Oops. I thought you wanted an OR, but you want an AND.
You just need to add another pipe to grep at the end of your one-liner:
| xargs grep -l 'def'
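To see the chained form on a small scale, here is a sketch with three throwaway files. The file names are invented for the demo and the -mtime filter is left out:

```shell
# Three hypothetical files: one with both words, two with only one of them.
tmpdir=$(mktemp -d)
printf 'abc\ndef\n' > "$tmpdir/both.log"
printf 'abc\n' > "$tmpdir/only_abc.log"
printf 'def\n' > "$tmpdir/only_def.log"

# The first grep -l lists files containing abc; the second grep -l keeps
# those that also contain def. The second xargs reads newline-separated
# names, which is fine here because these paths contain no whitespace.
both=$(find "$tmpdir" -type f -print0 | xargs -0 grep -l 'abc' | xargs grep -l 'def')
result=$(basename "$both")
echo "$result"   # prints: both.log

rm -rf "$tmpdir"
```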

Divide the result of two grep and word count

I have a log file and I would like to divide the result of one grep and count by another grep and count.
$ echo $((cat log2.txt | grep timed\|error\|Error | wc -l)/(cat log2.txt | grep Duration | wc -l))
zsh: bad math expression: operator expected at `log2.txt |...'
It's ugly, doesn't work and I can probably do it in a better way but I don't know how.
Also, I would like to know if it is possible to do this incrementally on a log stream read by tail, for example.
First of all, you should know that grep | wc -l counts the number of matching lines, not the number of occurrences. I hope this is what you really want.
Regarding your requirement, indeed, your approach is ugly (7 processes), apart from the mistakes. The job can be done by a single awk line:
awk '/timed|[Ee]rror/{a++}/Duration/{b++}END{printf "%.2f\n",a/b}' log2.txt
The above line calculates the result based on matched number of lines, same as your grep|wc -l.
You have several problems:
You are trying to run shell commands directly inside an arithmetic expression.
You aren't passing the correct regular expression to grep.
You need to make sure at least one of the operands is a floating-point value to trigger zsh's floating-point division.
Each pipeline can also be reduced to a single command; use input redirection instead of cat, and use the -c option to get the number of lines that match the regular expression.
echo $(( 1.0 * $(grep -c 'timed\|error\|Error' log2.txt) / $(grep -c Duration log2.txt) ))
Basic regular expressions treat unescaped | as a literal character, not an alternation operator.
$ echo foo | grep foo\|bar
$ echo foo | grep foo\\\|bar # Pass a literal backslash as part of the regex
foo
$ echo foo | grep 'foo\|bar' # Use '...' instead of explicitly escaping \ and |
foo
$ echo foo | grep -E 'foo|bar' # Use extended regular expressions instead
foo
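In shells without floating-point arithmetic (bash, plain sh), the same ratio can be sketched by counting with grep -c and letting awk do the division. The log content below is invented for the demo, and the zero check is an added precaution, not part of the answers above:

```shell
# Hypothetical log: 2 failure lines and 4 Duration lines.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
Duration: 12ms
request timed out
Duration: 15ms
Error: connection reset
Duration: 9ms
Duration: 11ms
EOF

failures=$(grep -c -E 'timed|[Ee]rror' "$tmpfile" || true)
durations=$(grep -c 'Duration' "$tmpfile" || true)

# awk performs the floating-point division; guard against dividing by zero.
ratio=$(awk -v a="$failures" -v b="$durations" \
    'BEGIN { if (b == 0) print "n/a"; else printf "%.2f\n", a / b }')
echo "$ratio"   # prints: 0.50

rm -f "$tmpfile"
```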

grep not giving expected result

I am having trouble understanding following grep operation
a=jQuery.Uno
echo $a | grep -i "jquerya*"
why is above query returning jQuery.Uno?
The * quantifier matches 0 (zero) or more occurrences of the preceding character.
In the string jQuery.Uno there are zero a characters after the y, so the regex jquerya* still matches the string.
If you wanted one or more of a, then instead say:
grep -i "jquerya\{1,\}"
or, if your version of grep supports extended regular expressions:
grep -iE "jquerya+"
Moreover, instead of echo "$var" | grep ..., it is better to use a here-string if your shell supports it:
grep -iE "jquerya+" <<< "$a"
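A quick way to check the difference between the quantifiers is to count matching lines. This sketch uses printf and a pipe rather than a here-string so it also runs in plain sh; the second test string (jQuerya.Uno) is made up for the demo:

```shell
a=jQuery.Uno

# "|| true" only keeps the script going when grep finds nothing (exit status 1).
star=$(printf '%s\n' "$a" | grep -ci 'jquerya*' || true)                  # zero or more a: matches
plus=$(printf '%s\n' "$a" | grep -ciE 'jquerya+' || true)                 # one or more a: no match
brace=$(printf '%s\n' 'jQuerya.Uno' | grep -ci 'jquerya\{1,\}' || true)   # one or more a, basic regex syntax

echo "$star $plus $brace"   # prints: 1 0 1
```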

grepping lines from a document using xargs

Let's say I have queries.txt.
queries.txt:
cat
dog
123
Now I want to use them as queries to find lines in myDocument.txt using grep.
cat queries.txt | xargs grep -f myDocument.txt
myDocument has lines like
cat
i have a dog
123
mouse
It should return the first 3 lines, but it doesn't. Instead, grep tries to find them as file names. What am I doing wrong?
Here, you just need:
grep -f queries.txt myDocument.txt
This causes grep to read the regular expressions from the file queries.txt and then apply them to myDocument.txt.
In the xargs version, you were effectively writing:
grep -f myDocument.txt cat dog 123
If you absolutely must use xargs, then you'll need to write:
xargs -I % grep -e % myDocument.txt < queries.txt
This avoids a UUOC (Useless Use of cat) award by redirecting standard input from queries.txt. It uses the -I % option to specify where the replacement text should go in the command line. Using the -e option means that if the pattern is, say, --help, you won't run into problems with (GNU) grep treating it as an option (and therefore printing its help message).
The grep -e option will take a pattern string as an argument. -f treats the argument as a file name of a file with patterns in it.
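Both forms can be tried side by side on the files from the question. This is only a sketch; the temporary directory is an assumption made for the demo:

```shell
# Recreate the two files from the question in a throwaway directory.
tmpdir=$(mktemp -d)
printf 'cat\ndog\n123\n' > "$tmpdir/queries.txt"
printf 'cat\ni have a dog\n123\nmouse\n' > "$tmpdir/myDocument.txt"

# One grep process: patterns are read from queries.txt.
hits_f=$(grep -f "$tmpdir/queries.txt" "$tmpdir/myDocument.txt")

# One grep process per query line: each line replaces % as the -e pattern.
hits_xargs=$(xargs -I % grep -e % "$tmpdir/myDocument.txt" < "$tmpdir/queries.txt")

printf '%s\n' "$hits_f"

rm -rf "$tmpdir"
```

On this input both variables should hold the same three lines. In general the xargs form prints results grouped by query rather than in document order, and it starts one grep per pattern.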
