grep -wF "example1#domain1.org" filename.txt and special characters - unix

filename.txt contains:
example1#domain1.org example1#domain1.org
example2#domain1.org example2#domain1.org
example3#domain1.org newexample3#domain1.org
example4#domain1.org oldexample4#domain1.org
example5#domain1.org otherexample5#domain1.org
I need to search it for an exact match. The result of:
grep -wF "example1#domain1.org" filename.txt
is correct.
My problem is that grep also returns a result if I run:
grep -wF "example1" filename.txt
Maybe the # (a special character) is the problem.
-w, --word-regexp
-F, --fixed-strings

Word characters in grep consist of letters, digits, and underscores. So # will end a word.
http://www.gnu.org/software/grep/manual/html_node/Matching-Control.html#Matching-Control
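As a rough illustration of the point above, the sketch below contrasts grep -wF with a whole-field comparison in awk. The sample data mirrors the question; the temporary file, the helper count_field_matches, and the variable names are assumptions made for this example, and the awk loop is only one possible way to require that an entire whitespace-separated field equals the search string.

```shell
# Sample data modelled on the question (written to a temporary file).
tmp=$(mktemp)
printf '%s\n' \
  'example1#domain1.org example1#domain1.org' \
  'example2#domain1.org example2#domain1.org' \
  'example3#domain1.org newexample3#domain1.org' > "$tmp"

# -w treats '#' as a non-word character, so the bare prefix still matches line 1.
prefix_hits=$(grep -cwF 'example1' "$tmp")

# Hypothetical helper: count lines where some whole field equals the string.
count_field_matches() {
  awk -v want="$1" '{
    for (i = 1; i <= NF; i++)
      if ($i == want) { n++; break }
  } END { print n + 0 }' "$2"
}

field_hits_prefix=$(count_field_matches 'example1' "$tmp")
field_hits_full=$(count_field_matches 'example3#domain1.org' "$tmp")

echo "grep -wF example1: $prefix_hits line(s)"
echo "awk field == example1: $field_hits_prefix line(s)"
echo "awk field == example3#domain1.org: $field_hits_full line(s)"
rm -f "$tmp"
```

The awk version compares complete fields, so a prefix such as example1 no longer counts as a match.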

Related

how to grep nth string

How can I use the grep shell command to show a specific word from a line that starts with a specific word?
Example:
I want to print the string "myFTPpath/folderName/" from the line starting with searchStr, shown below.
searchStr:somestring:myFTPpath/folderName/:somestring
Something like this with awk:
awk -F: '/^searchStr/{print $3}' File
From all the lines starting with searchStr, print the 3rd field (field separator set to :).
Sample:
AMD$ cat File
someStr:somestring:myFTPpath/folderName/:somestring
someStr:somestring:myFTPpath/folderName/:somestring
searchStr:somestring:myFTPpath/folderName/:somestring
someStr:somestring:myFTPpath/folderName/:somestring
AMD$ awk -F: '/^searchStr/{print $3}' File
myFTPpath/folderName/
Remember that grep isn't the only tool that can usefully do searches.
In this particular case, where the lines are naturally broken into fields, awk is probably the best solution, as A.M.D's answer suggests.
For more general case edits, however, remember sed's -n option, which suppresses printing out a line after edits:
sed -n 's/searchStr:[^:]*:\([^:]*\):.*/\1/p' input-file
The -n suppresses automatic printing of the line, and the trailing /p flag explicitly prints out lines on which there is a substitution.
This matching pattern is fiddly – use awk in this fielded case – but don't forget sed -n.
You can get the desired output with grep itself, but you need the -P and -o options (-P is a GNU grep option and requires PCRE support).
$ echo 'searchStr:somestring:myFTPpath/folderName/:somestring' | grep -oP '^searchStr:[^:]*:\K[^:]*'
myFTPpath/folderName/
\K discards everything matched so far from the reported match, so only the text matched by the part of the pattern after \K is printed. Here \K is used instead of a variable-length positive lookbehind assertion.
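Where grep has no -P support, a similar result can be sketched with plain grep and cut. This is only an alternative under the assumption that the fields are separated by single colons, as in the sample line; the variable names and the extra non-matching input line are made up for the example.

```shell
# Sample line from the question, plus one line that should be ignored.
line='searchStr:somestring:myFTPpath/folderName/:somestring'

# Keep only lines starting with searchStr, then print the third ':'-separated field.
third=$(printf '%s\n' 'someStr:a:b:c' "$line" | grep '^searchStr:' | cut -d: -f3)

echo "$third"
```

grep selects the lines and cut does the field extraction, so no lookbehind or \K is needed.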

How to use grep to get special character

I want to grep for \" in a file (abc) that contains text like the line below. I tried:
$ egrep -n "^\"$" abc
"CO_FA_SC_600212","2","\"HSE 48\" 48 CHIVALRY AVE"
But nothing is printed. How can I use egrep or grep to get the line?
grep -F 'special char' filename prints the lines that contain the given special characters, matched literally.
grep -Fn 'special char' filename prints the line numbers too.
man grep says,
-F, --fixed-strings: Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched
Here are two alternatives:
grep -n "\\\\\"" filename
grep -n '\\"' filename
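For the first one, inside double quotes the shell turns each \\ pair into a single \ and \" into ", so \\" is passed to grep.
For the second one, inside single quotes every character is taken literally, so \\" is passed to grep.
grep then reads \\ as an escaped literal backslash, so the pattern matches \".
Both can be verified with echo "\\\\\"" and echo '\\"'.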
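The two quoting forms above can be checked in a small sketch. It uses printf '%s' rather than echo, because some echo implementations interpret backslashes themselves; the variable names and the second, non-matching input line are assumptions for the example.

```shell
# What the shell actually hands to grep for each quoting style.
dq=$(printf '%s' "\\\\\"")   # double quotes: \\ -> \ and \" -> "
sq=$(printf '%s' '\\"')      # single quotes: taken literally

echo "double-quoted form passes: $dq"
echo "single-quoted form passes: $sq"

# The sample line from the question contains a backslash followed by a quote.
sample='"CO_FA_SC_600212","2","\"HSE 48\" 48 CHIVALRY AVE"'
hits=$(printf '%s\n' "$sample" 'no backslash here "x"' | grep -c '\\"')
echo "matching lines: $hits"
```

Both forms deliver the same three characters to grep, which is why either command finds the line.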

grep for special characters in Unix

I have a log file (application.log) which might contain the following string of normal & special characters on multiple lines:
*^%Q&$*&^#$&*!^#$*&^&^*&^&
I want to search for the line number(s) which contains this special character string.
grep '*^%Q&$*&^#$&*!^#$*&^&^*&^&' application.log
The above command doesn't return any results.
What would be the correct syntax to get the line numbers?
Tell grep to treat the pattern as a fixed string with the -F option.
grep -F '*^%Q&$*&^#$&*!^#$*&^&^*&^&' application.log
The -n option is required to get the line number:
grep -Fn '*^%Q&$*&^#$&*!^#$*&^&^*&^&' application.log
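A small sketch of the difference, using the string from the question in a temporary file. The file contents and variable names are assumptions for the example. Without -F, characters such as * are treated as regex operators, which is presumably why the original command found nothing.

```shell
needle='*^%Q&$*&^#$&*!^#$*&^&^*&^&'

tmp=$(mktemp)
printf '%s\n' 'plain line' "prefix $needle suffix" 'another line' > "$tmp"

# As a regular expression (grep -c exits non-zero when the count is 0, hence || true).
regex_hits=$(grep -c -- "$needle" "$tmp" || true)

# As a fixed string, with line numbers; keep only the number before the first ':'.
fixed_lines=$(grep -Fn -- "$needle" "$tmp" | cut -d: -f1)

echo "regex match count: $regex_hits"
echo "fixed-string match on line(s): $fixed_lines"
rm -f "$tmp"
```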
The one that worked for me is:
grep -e '->'
The -e marks the next argument as the pattern, so a pattern that begins with - is not interpreted as an option.
From: http://www.linuxquestions.org/questions/programming-9/how-to-grep-for-string-769460/
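A minimal sketch of two ways to pass a pattern that starts with a dash: -e, as in the answer above, and the conventional -- end-of-options marker. The sample input and variable names are made up for the example.

```shell
input=$(printf '%s\n' 'a->b' 'a=b')

# -e says "the next argument is the pattern".
with_e=$(printf '%s\n' "$input" | grep -c -e '->')

# -- says "no more options follow", so '->' is read as the pattern.
with_dashes=$(printf '%s\n' "$input" | grep -c -- '->')

echo "with -e: $with_e, with --: $with_dashes"
```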
A related note
To grep for carriage return, namely the \r character, or 0x0d, we can do this:
grep -F $'\r' application.log
Alternatively, use printf, or echo, for POSIX compatibility
grep -F "$(printf '\r')" application.log
And we can use hexdump, or less to see the result:
$ printf "a\rb" | grep -F $'\r' | hexdump -c
0000000 a \r b \n
Regarding the use of $'\r' and other supported characters, see Bash Manual > ANSI-C Quoting:
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard
grep -n "\*\^\%\Q\&\$\&\^\#\$\&\!\^\#\$\&\^\&\^\&\^\&" test.log
1:*^%Q&$&^#$&!^#$&^&^&^&
8:*^%Q&$&^#$&!^#$&^&^&^&
14:*^%Q&$&^#$&!^#$&^&^&^&
You could list the lines that contain anything other than alphanumeric characters and spaces; adding -n gives you the line numbers. Try the following:
grep -vn "^[a-zA-Z0-9 ]*$" application.log
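To see what that command reports, here is a sketch on a small temporary file. The file contents and variable names are assumptions for the example. Note that the approach flags any character outside letters, digits, and spaces, including underscores, and that empty lines are not flagged.

```shell
tmp=$(mktemp)
printf '%s\n' 'plain text 123' 'has *^% symbols' '' 'under_score' > "$tmp"

# -v inverts the match: print lines that are NOT purely alphanumerics and spaces.
flagged=$(grep -vn '^[a-zA-Z0-9 ]*$' "$tmp")
count=$(grep -vc '^[a-zA-Z0-9 ]*$' "$tmp")

printf '%s\n' "$flagged"
echo "flagged lines: $count"
rm -f "$tmp"
```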
Try vi with the -b option; this will show special end-of-line characters.
(I typically use it to see Windows line endings in a text file on a Unix OS.)
But if you want a scripted solution, vi obviously won't work, so you can try the -F or -E options with grep and pipe the result into sed or awk.
From grep man page:
Matcher Selection
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by POSIX.)

grep -w with only space as delimiter

grep -w uses punctuation and whitespace as delimiters.
How can I make grep use only whitespace as a word delimiter?
If you only want spaces as delimiters, use grep " foo " instead of grep -w foo. If you also want to match at the start or end of a line, you can do something like grep '\(^\| \)foo\($\| \)' (\| is a GNU extension in basic regular expressions), but you're probably better off with perl -ne 'print if /(^|\s)foo(\s|$)/', which also handles tabs.
You cannot change the way grep -w works. However, you can replace punctuation with, say, an X character using tr or sed and then use grep -w; that will do the trick.
The --word-regexp flag is useful, but limited. The grep man page says:
-w, --word-regexp
Select only those lines containing matches that form whole
words. The test is that the matching substring must either be
at the beginning of the line, or preceded by a non-word
constituent character. Similarly, it must be either at the end
of the line or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the
underscore.
If you want to use custom field separators, awk may be a better fit for you. Or you could just write an extended regular expression with egrep or grep --extended-regexp that gives you more control over your search pattern.
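As a sketch of the extended-regular-expression route, the pattern below accepts only whitespace or a line boundary on either side of the word. The sample lines and variable names are assumptions for the example; the -w count is included for comparison.

```shell
# Five sample lines; the last one uses tabs around foo.
input=$(printf 'foo bar\na foo\nx-foo y\nfoo.bar\ntab\tfoo\tend\n')

# Whitespace (or start/end of line) is the only accepted delimiter here.
ws_only=$(printf '%s\n' "$input" | grep -cE '(^|[[:space:]])foo([[:space:]]|$)')

# -w also accepts punctuation such as '-' and '.' as delimiters.
word_flag=$(printf '%s\n' "$input" | grep -cw 'foo')

echo "whitespace-delimited matches: $ws_only"
echo "grep -w matches: $word_flag"
```

The lines x-foo y and foo.bar match with -w but not with the whitespace-only pattern.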
Use tr to replace spaces with newlines, then grep for your string. The contiguous string I needed was being split up by grep -w because it has colons in it. Furthermore, I only knew the first part, and the second part was the unknown data I needed to pull. Therefore, the following helped me.
echo "$your_content" | tr ' ' '\n' | grep 'string'
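A worked sketch of that approach with made-up data: the content string, the key:abc: prefix, and the variable names are all hypothetical. Each space-separated token lands on its own line, so grep returns the whole token, colons included.

```shell
# Hypothetical content: the known part is "key:abc:", the rest is unknown.
your_content='id=7 key:abc:12345 status=done'

# One token per line, then keep the token that starts with the known prefix.
token=$(printf '%s\n' "$your_content" | tr ' ' '\n' | grep '^key:abc:')

echo "$token"
```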

How to grep for the whole word

I am using the following command to grep stuff in subdirs
find . | xargs grep -s 's:text'
However, this also finds things like <s:textfield name="sdfsf"...../>
What can I do so that it only finds things like <s:text name="sdfsdf"/>
or, for that matter, <s:text somethingElse="lkjkj" name="lkkj"
Basically, s:text and name should be on the same line.
You want the -w option, which makes the pattern match only whole words.
find . | xargs grep -sw 's:text'
Use \b to match on "word boundaries", which will make your search match on whole words only.
So your grep would look something like
grep -r "\bSTRING\b"
Adding color and line numbers might help too:
grep --color -rn "\bSTRING\b"
From http://www.regular-expressions.info/wordboundaries.html:
There are three different positions that qualify as word boundaries:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
You can drop find and xargs by making grep search recursively, and you normally don't need the -s flag. Hence:
grep -wr 's:text'
You could try rg (https://github.com/BurntSushi/ripgrep):
rg -w 's:text' .
That should do it.
Use the -w option for a whole-word match. A sample is given below:
[binita#ubuntu ~]# a="abcd efg"
[binita#ubuntu ~]# echo $a
abcd efg
[binita#ubuntu ~]# echo $a | grep ab
abcd efg
[binita#ubuntu ~]# echo $a | grep -w ab
[binita#ubuntu ~]# echo $a | grep -w abcd
abcd efg
This is another way of setting the boundaries of the word, note that it doesn't work without the quotes around it:
grep -r '\<s:text\>' .
If you just want to exclude the longer tag names, you can do this:
find . | xargs grep -s 's:text '
This should find only s:text instances with a space after the last t. If you need to find s:text instances that also have a name attribute, either pipe the results to another grep, or use a regex that matches only the elements you need.
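One way to read the "pipe to another grep" suggestion is sketched below. The temporary directory, the file name page.jsp, and the sample tags are assumptions for the example. The second grep is a plain substring test, so it would also accept attributes such as filename=.

```shell
dir=$(mktemp -d)
printf '%s\n' \
  '<s:text name="sdfsdf"/>' \
  '<s:textfield name="sdfsf"/>' \
  '<s:text somethingElse="lkjkj" name="lkkj"/>' \
  '<s:text key="x"/>' > "$dir/page.jsp"

# First grep: the tag must be exactly s:text (whitespace right after it).
# Second grep: the same line must also mention name=.
hits=$(grep -rn '<s:text[[:space:]]' "$dir" | grep -c 'name=')

echo "lines with s:text and name: $hits"
rm -rf "$dir"
```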

Resources