I'm trying to extract an address from a file.
grep keyword /path/to/file
is how I'm finding the line of code I want. The output is something like
var=http://address
Is there a way I can get only the part directly after the =, i.e. http://address, considering the keyword I'm grepping for appears in both the var and http://address parts?
Just pipe to cut, taking everything from the second =-separated field onward (so any later = inside the URL survives):
grep keyword /path/to/file | cut -d= -f2-
Or, if there is only one = on the line:
grep keyword /path/to/file | cut -d '=' -f 2
You can avoid the needless pipes:
awk -F= '/keyword/{print $2}' /path/to/file
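If you are already in a shell script, bash parameter expansion can do the same job without spawning cut or awk (a small sketch, assuming the var=http://address shape shown above):
line=$(grep keyword /path/to/file)
echo "${line#*=}"
${line#*=} strips the shortest prefix matching *=, so any later = inside the URL is kept.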
I'm trying to figure out a command to parse the following file content:
Operation=GET
Type=HOME
Counters=CacheHit=0,Exception=1,Validated=0
I need to extract Exception=1 into its own line. I'm fiddling with awk, sed, and grep but not making much progress. Does anyone have any tips on using a Unix command to do this?
Thanks
Since your file is close to bash syntax, there is a fun little trick you can do to make bash itself parse the file. First, use a program like tr to transform the input into something bash can parse, and then "source" that, which will create shell variables you can expand later to get the values.
source <(tr , $'\n' < file_name_goes_here)
echo $Exception
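End to end it looks like this (this needs bash, since <( ) is process substitution; sample.txt is a hypothetical file holding the content above):
$ printf 'Operation=GET\nType=HOME\nCounters=CacheHit=0,Exception=1,Validated=0\n' > sample.txt
$ source <(tr , $'\n' < sample.txt)
$ echo "Exception=$Exception"
Exception=1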
Many ways to do this. Here is one assuming the file is called "file.txt". Grab the line you want, replace everything from the start of the line up to Except with just Except, then pull out the first field using comma as the delimiter.
$ grep Exception file.txt | sed 's/.*Except/Except/g' | cut -d, -f 1
Exception=1
If you wanted to use gawk:
$ grep Exception file.txt | sed 's/.*Except/Except/g' | gawk -F, '{print $1}'
Exception=1
or just using grep and sed:
$ grep Exception file.txt | sed 's/.*\(Exception=[0-9]*\).*/\1/g'
Exception=1
or, as @sheltter reminded me:
$ egrep -o "Exception=[0-9]+" file.txt
Exception=1
No need to use a mix of commands.
awk -F, 'NR==2 {print RS$1}' RS="Exception" file
Exception=1
Here we split the input on the keyword we are looking for by setting the record separator: RS="Exception".
If a second record exists (which happens only when the keyword is found), then
we print the record separator followed by the first comma-separated field of that record.
PS: This only works if there is exactly one Exception field.
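If there may be several comma-separated pairs you care about, a more general sketch is to split each line on commas and test every field (the pattern here matches the sample file above):
awk -F, '{ for (i = 1; i <= NF; i++) if ($i ~ /^Exception=/) print $i }' file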
pdftk file.pdf dump_data output | grep NumberOfPages:
gives me:
NumberOfPages: 5
I don't want it to output NumberOfPages. I want to get, in this case, just 5. Is there a flag I can pass to grep to get just that? I did a man grep and nothing seemed to do the trick.
I don't think grep knows how to parse the strings it matches, but other utilities like awk will help you:
pdftk file.pdf dump_data output | grep NumberOfPages: | awk '{print $2}'
pdftk file.pdf dump_data output | grep NumberOfPages: | sed 's/NumberOfPages: //'
Yes, in GNU grep you can use the -o flag to get "only" the matching portion of your expression. So something like:
pdftk file.pdf dump_data output | grep -o ' .*'
That could work for you. As other answers have pointed out, if you want only the number you'd be better off using something in addition to grep.
For example:
$ echo 'NumberOfPages: 5' | grep -o ' .*'
5
Notice that the leading space before the 5 is included in the match.
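To keep the -o approach but drop the space, you can match only the digits (both GNU and BSD grep support -o and -E):
$ pdftk file.pdf dump_data output | grep NumberOfPages: | grep -oE '[0-9]+'
5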
I need a grep one-liner [without a pipe] that checks multiple expressions in a single command.
cat FILE | egrep OH|OI
is not working.
What you're looking for is a simple:
egrep 'OH|OI' FILE
The command you have (without the quotes):
cat FILE | egrep OH|OI
will attempt to cat the file through egrep looking for OH then pipe the results of that through an executable OI (which probably doesn't exist).
The quotes will fix that for you, so that OH|OI is a single argument to egrep rather than something the shell processes.
Just try:
cat FILE | egrep 'OH|OI'
Eliminate the pipe by supplying the filename.
egrep 'OH|OI' filename
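Another option: give each pattern its own -e flag, which avoids the alternation and its quoting entirely:
grep -e OH -e OI filename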
I use Terminal in Mac with the following command:
df -lak | grep File||disk02
What I want is the header of the df output (disk space) plus only the line containing disk02. I know '|' acts as OR logic in grep, but since I am typing this in Terminal, '|' also means pipe to the shell, so I tried '||' to avoid the piping. That does not get what I want; only the header line containing "File" comes back.
How can I write this command correctly in Terminal?
df -lak | grep "File\|disk02"
Or use awk, which prints the first line (the header) or any line matching disk02:
df -lak | awk 'NR==1 || /disk02/'
Or
df -lak | grep -E "File|disk02"
df -lak | grep -E '(^File|disk02)'
You can shorten grep -E to egrep.
df -lak | grep -e File -e disk02
Just display the mount points you're interested in!
df -lak /
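If you would rather not rely on regex alternation at all, a simple two-step sketch prints the header and the filtered row separately (note it runs df twice):
df -lak | head -n 1
df -lak | grep disk02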
In order to use the uniq command, you have to sort your file first.
But in my file the order of the information is important, so how can I keep the original order of the file but still get rid of the duplicate content?
Another awk version:
awk '!_[$0]++' infile
This awk keeps the first occurrence. Same algorithm as other answers use:
awk '!($0 in lines) { print $0; lines[$0]; }'
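Spelled out, the !_[$0]++ idiom is equivalent to this more verbose sketch:
awk '{
    if (!($0 in seen)) {   # first time this exact line appears
        print
        seen[$0] = 1       # remember it so later copies are skipped
    }
}' infile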
Here's one that only needs to store duplicated lines (as opposed to all lines) using awk:
sort file | uniq -d | awk '
   # first input (stdin): remember the set of duplicated lines
   FNR == NR { dups[$0] }
   # second input (the file): print lines that are not duplicated,
   # plus only the first occurrence of each duplicated line
   FNR != NR && (!($0 in dups) || !lines[$0]++)
' - file
There's also the "line-number, double-sort" method.
nl -n ln file | sort -u -k 2 | sort -k 1n | cut -f 2-
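For example (GNU coreutils; here sort -u keeps the first of each run of equal keys):
$ printf 'b\na\nb\nc\na\n' | nl -n ln | sort -u -k 2 | sort -k 1n | cut -f 2-
b
a
c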
You can run uniq -d on the sorted version of the file to find the duplicate lines, then run some script that says:
if this_line is in duplicate_lines {
if not i_have_seen[this_line] {
output this_line
i_have_seen[this_line] = true
}
} else {
output this_line
}
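In awk, that pseudo-code comes out very close to the two-input answer above (dups.txt is just a scratch file name):
sort file | uniq -d > dups.txt
awk '
    FNR == NR { dup[$0]; next }    # pass 1: load the duplicate set
    !($0 in dup) { print; next }   # unique lines always print
    !seen[$0]++                    # duplicated lines print only once
' dups.txt file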
Using only sort, uniq and grep:
Create d.sh:
#!/bin/sh
# build the set of distinct lines
sort "$1" | uniq > "$1_uniq"
# walk the file in its original order
while IFS= read -r line; do
    # emit the line if it is still in the set...
    grep -m1 -Fx -e "$line" "$1_uniq" >> "$1_out"
    # ...then drop it so later duplicates are skipped
    grep -v -Fx -e "$line" "$1_uniq" > "$1_uniq2"
    mv "$1_uniq2" "$1_uniq"
done < "$1"
rm "$1_uniq"
Example:
./d.sh infile
You could use some horrible O(n^2) thing, like this (Pseudo-code):
file2 = EMPTY_FILE
for each line in file1:
if not line in file2:
file2.append(line)
This is potentially rather slow, especially if implemented at the Bash level. But if your files are reasonably short, it will probably work just fine, and it is quick to implement (not line in file2 is then just a grep test, and so on).
Otherwise you could of course code up a dedicated program, using some more advanced data structure in memory to speed it up.
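A direct (and equally O(n^2)) shell translation of the pseudo-code, assuming the input file is called file1:
file2=$(mktemp)
while IFS= read -r line; do
    # "not line in file2" as an exact, fixed-string grep test
    grep -qxF -e "$line" "$file2" || printf '%s\n' "$line" >> "$file2"
done < file1
cat "$file2"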
sort file1 | uniq | while IFS= read -r line; do
    grep -n -m1 -Fx -e "$line" file1 >> out
done
sort -n out
First do the sort,
then for each unique value grep for the first match (-m1),
preserving the line numbers,
and finally sort the output numerically (-n) by line number.
You could then remove the line numbers with sed or awk, or with cut as shown below.
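Since grep -n prefixes each line with NUM:, cut can strip the numbers back off:
sort -n out | cut -d: -f2-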