unexpected EOF while looking for matching `"' - unix

I am writing a script which has command to execute as below:
cat /abc | grep -v ^# | grep -i root | sed -e '\''s/"//g'\'' | awk '\''{print $2}'\''
When running the script on SunOS, I am getting the error below:
test: line 1: unexpected EOF while looking for matching `"'
test: line 3: syntax error: unexpected end of file
I tried different options, but no luck.
Can somebody help me identify what is missing in the above command?

What are those escapes?!
cat /abc | grep -v '^#' | grep -i root | sed -e '\''s/"//g'\'' | awk '\''{print $2}'\''
(note the four '\'' sequences)
Your problem is there:
sed -e '\''s/"//g'\''
(the unmatched character is the double quote in the middle)

The quoting is all wrong. Why do you use single quote, backslash, single quote, single quote, and always in that order? Regardless, you have an unquoted double quote, so the shell expects you to add a closing quote for the string which starts at that opening double quote.
As a matter of style, you should also lose the Useless Use of Cat, and think about how to simplify your script. At least:
grep -v ^# /abc | grep -i root | sed -e 's/"//g' | awk '{print $2}'
... but in practice
awk '/^#/ { next } /[Rr][Oo][Oo][Tt]/ { gsub ("\"",""); print $2 }' /abc
Because some of the characters in the awk and sed scripts have a special meaning to the shell, we put them in single quotes. If you need a single quote inside such a script, you have to double-quote it; a frequent pattern is a string in single quotes adjacent to a string in double quotes, like this: echo '"'"'". This echoes " (quoted in single quotes) immediately followed by ' (quoted in double quotes).
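For instance, here is a minimal sketch of that pattern (sample.txt is a placeholder file name): it deletes every single quote from a file by splicing a double-quoted single quote into the middle of a single-quoted sed script.
# 's/' + "'" + '//g' concatenate into the sed script s/'//g
sed -e 's/'"'"'//g' sample.txt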

Related

Unix command to parse string

I'm trying to figure out a command to parse the following file content:
Operation=GET
Type=HOME
Counters=CacheHit=0,Exception=1,Validated=0
I need to extract Exception=1 into its own line. I'm fiddling with awk, sed and grep but not making much progress. Does anyone have any tips on using any unix command to perform this?
Thanks
Since your file is close to bash syntax, there is a fun little trick you can do to make bash itself parse the file. First, use some program like tr to transform the input into something bash can parse, then "source" that, which will create shell variables you can expand later to get the values.
source <(tr , $'\n' < file_name_goes_here)
echo $Exception
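For example, if the sample lines above are saved as counters.txt (a made-up name), the trick plays out like this:
# commas become newlines, so each KEY=VALUE pair lands on its own line
source <(tr , $'\n' < counters.txt)
echo "Exception=$Exception"
Exception=1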
Many ways to do this. Here is one assuming the file is called "file.txt". Grab the line you want, replace everything from the start of the line up to Except with just Except, then pull out the first field using comma as the delimiter.
$ grep Exception file.txt | sed 's/.*Except/Except/g' | cut -d, -f 1
Exception=1
If you wanted to use gawk:
$ grep Exception file.txt | sed 's/.*Except/Except/g' | gawk -F, '{print $1}'
Exception=1
or just using grep and sed:
$ grep Exception file.txt | sed 's/.*\(Exception=[0-9]*\).*/\1/g'
Exception=1
or, as @sheltter reminded me:
$ egrep -o "Exception=[0-9]+" file.txt
Exception=1
No need to use a mix of commands.
awk -F, 'NR==2 {print RS$1}' RS="Exception" file
Exception=1
Here we split the line using the keyword as the record separator, RS="Exception".
If we are on the second record (which only happens when the keyword was found), then
print the first comma-separated field, prefixed with the record separator.
PS: This only works if you have one Exception field.
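If there could be more than one line containing the field, a sketch using awk's standard match() function handles each line independently (one match per line; match() sets RSTART and RLENGTH in any POSIX awk):
awk 'match($0, /Exception=[0-9]+/) { print substr($0, RSTART, RLENGTH) }' file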

How to grep a log and print multiple strings or text from a log

ERROR|2017-04-04 06:27:20|ID=15098|ST=2018-04-0406:27:21|TYPE=Log|LOG=6|OBJECT=NoticeBService|T_TIME=10|REQUEST_MSG=<Envelope><Header><ns11:messageId>184745460</ns11:messageId><ns11:messageDateTime>2018-04-13T11:27:21Z</ns11:messageDateTime></Header></Envelope>|RESPONSE_MSG=<Envelope><Header><m:messageId>184460</m:messageId><m:messageDateTimeStamp>2018-04-04T06:27:21-05:00</m:messageDateTimeStamp></m:trackingMessageHeader></m:wsMessageHeader></Header><Body><Fault><faultcode>Server.704</faultcode><detail><ns1:providerError><ns1:providerErrorCode>704</ns1:providerErrorCode><ns1:providerErrorText>business_rule_exception-Server.704: 'Active'status.</ns1:providerErrorText></ns1:providerError></detail></Fault></Body></Envelope>
I want to print from a test.log:
OBJECT=NoticeBService
business_rule_exception-Server.704: 'Active'status.
I used: sed -n '/providerErrorText/,/providerErrorText/p' | cut -d '|' -f 7 test.log
I am getting this output:
OBJECT=NoticeBService
sed: -e expression #1, char 31: extra characters after command
with grep
$ grep -oP 'OBJECT[^|]+|(?<=providerErrorText>)[^<]+' file
OBJECT=NoticeBService
business_rule_exception-Server.704: 'Active'status.
Explanation
OBJECT[^|]+ matches the literal "OBJECT" and everything up to a pipe symbol
(?<=providerErrorText>) is a look-behind: the pattern must precede the match but is not part of it
[^<]+ matches everything up to the < sign
-oP: o outputs only the matched part of the line; P enables Perl-compatible regexes (needed for the look-behind)
pattern1|pattern2 matches either pattern1 or pattern2 (it can be both)

Unix - Need to cut a file which has multiple blanks as delimiter - awk or cut?

I need to get the records from a text file in Unix. The delimiter is multiple blanks. For example:
2U2133 1239
1290fsdsf 3234
From this, I need to extract
1239
3234
The delimiter for all records will be always 3 blanks.
I need to do this in a Unix script (.scr) and write the output to another file, or use it as input to a do-while loop. I tried the below:
while read readline
do
read_int=`echo "$readline"`
cnt_exc=`grep "$read_int" ${Directory path}/file1.txt| wc -l`
if [ $cnt_exc -gt 0 ]
then
int_1=0
else
int_2=0
fi
done < awk -F' ' '{ print $2 }' ${Directoty path}/test_file.txt
test_file.txt is the input file and file1.txt is a lookup file. But the above way is not working and giving me syntax errors near awk -F
I tried writing the output to a file. The following worked in command line:
more test_file.txt | awk -F' ' '{ print $2 }' > output.txt
This is working and writing the records to output.txt in the command line. But the same command does not work in the Unix script (it is a .scr file).
Please let me know where I am going wrong and how I can resolve this.
Thanks,
Visakh
The job of replacing multiple delimiters with just one is left to tr:
cat <file_name> | tr -s ' ' | cut -d ' ' -f 2
tr translates or deletes characters, and is perfectly suited to prepare your data for cut to work properly.
The manual states:
-s, --squeeze-repeats
replace each sequence of a repeated character that is
listed in the last specified SET, with a single occurrence
of that character
It depends on the version or implementation of cut on your machine. Some versions support an option, usually -i, that means 'ignore blank fields' or, equivalently, allow multiple separators between fields. If that's supported, use:
cut -i -d' ' -f 2 data.file
If not (and it is not universal — and maybe not even widespread, since neither GNU nor MacOS X have the option), then using awk is better and more portable.
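For comparison, a small sketch under that assumption: awk's default field splitting already treats any run of blanks or tabs as a single separator, so no -F option is needed at all.
# default FS splits on whitespace runs; $2 is the second column
awk '{print $2}' data.file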
You need to pipe the output of awk into your loop, though:
awk -F' ' '{print $2}' ${Directory_path}/test_file.txt |
while read readline
do
    read_int=`echo "$readline"`
    cnt_exc=`grep "$read_int" ${Directory_path}/file1.txt | wc -l`
    if [ $cnt_exc -gt 0 ]
    then int_1=0
    else int_2=0
    fi
done
The only residual issue is whether the while loop runs in a sub-shell and therefore does not modify your main shell script's variables, just its own copy of those variables.
With bash, you can use process substitution:
while read readline
do
    read_int=`echo "$readline"`
    cnt_exc=`grep "$read_int" ${Directory_path}/file1.txt | wc -l`
    if [ $cnt_exc -gt 0 ]
    then int_1=0
    else int_2=0
    fi
done < <(awk -F' ' '{print $2}' ${Directory_path}/test_file.txt)
This leaves the while loop in the current shell, but arranges for the output of the command to appear as if from a file.
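A quick way to convince yourself (a throwaway sketch, no files needed): the counter survives the loop precisely because no subshell is involved.
count=0
while read -r line
do
    count=$((count + 1))
done < <(printf 'a\nb\nc\n')
echo $count    # prints 3; with a pipe into the loop it would print 0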
The blank in ${Directory path} is not normally legal — unless it is another Bash feature I've missed out on; you also had a typo (Directoty) in one place.
Other ways of doing the same thing aside, the error in your program is this: You cannot redirect from (<) the output of another program. Turn your script around and use a pipe like this:
awk -F' ' '{ print $2 }' ${Directory path}/test_file.txt | while read readline
etc.
Besides, "readline" is a legal variable name (it is not reserved by the shell), but it is an easily confused choice; a more descriptive name avoids trouble.
In this particular case, you can use the following line to get your second column; each run of blanks is squeezed into one tab so that cut sees a single delimiter:
sed 's/  */\t/g' <file_name> | cut -f 2
In bash you can start from something like this:
for n in `cat "${Directory_path}/test_file.txt" | cut -d " " -f 4`
do
    grep -c "$n" ${Directory_path}/file*.txt
done
This should have been a comment, but since I cannot comment yet, I am adding this here.
This is from an excellent answer here: https://stackoverflow.com/a/4483833/3138875
tr -s ' ' <text.txt | cut -d ' ' -f4
tr -s '<character>' squeezes multiple repeated instances of <character> into one.
It's not working in the script because of the typo "Directoty path" in the last line of your script.
Cut isn't flexible enough. I usually use Perl for that:
cat file.txt | perl -F'   ' -lane 'print $F[1]'
Instead of the triple space after -F you can put any Perl regular expression; -a turns on autosplit, -n loops over the input, and -l handles the line endings. You access fields as $F[n], where n is the field number (counting starts at zero). This way there is no need for sed or tr.
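Applied to the sample data from the question (assuming the three-blank delimiter), it picks out the second column:
printf '2U2133   1239\n1290fsdsf   3234\n' | perl -F'   ' -lane 'print $F[1]'
1239
3234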

output of one command is argument of another

Is there any way to fit the following into one line using pipes:
output of
sha1sum $(xpi) | grep -Eow '^[^ ]+'
goes in place of 456 in
sed 's/#version#/456/' input.txt > output.txt
Um, I think you can nest $(command arg arg) occurrences, so if you really need just one line, try
sed "s/#version#/$(sha1sum $(xpi) | grep -Eow '^[^ ]+')/" input.txt \
> output.txt
But I like Trey's solution putting it on two lines; it's less confusing. A sketch of that two-line variant follows.
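Something like this (a sketch; $(xpi) is taken from the question and assumed to expand to the file name):
# capture the checksum first, then substitute it; sha1sum output is
# hex, so it cannot clash with the / delimiter
sum=$(sha1sum $(xpi) | grep -Eow '^[^ ]+')
sed "s/#version#/$sum/" input.txt > output.txt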
This is not possible using pipes. Command nesting works though:
sed 's/#version#/'$(sha1sum $(xpi) | grep -Eow '^[^ ]+')'/' input.txt > output.txt
Also note that if the result of the nested command contains the / character you will need to use a different delimiter (#, |, $, and _ are popular ones) or somehow escape the forward slashes in your string. This StackOverflow question also has a solution to the escaping problem: pipe the command through sed and escape all backslashes (the escape character itself) and forward slashes (which would otherwise conflict with using / as the outer sed delimiter).
The following regular expression will escape all \ characters and all / characters in the command:
sha1sum $(xpi) | grep -Eow '^[^ ]+' | sed -e 's/\(\/\|\\\|&\)/\\&/g'
Nesting this as we did above we get this solution which should properly escape slashes where needed:
sed 's/#version#/'$(sha1sum $(xpi) | grep -Eow '^[^ ]+' | sed -e 's/\(\/\|\\\|&\)/\\&/g')'/' input.txt > output.txt
Personally I think that looks like a mess as one line, but it works.

Can you grep a file using a regular expression and only output the matching part of a line?

I have a log file which contains a number of error lines, such as:
Failed to add email@test.com to database
I can filter these lines with a single grep call:
grep -E 'Failed to add (.*) to database'
This works fine, but what I'd really like to do is have grep (or another Unix command I pass the output into) only output the email address part of the matched line.
Is this possible?
sed is fine without grep:
sed -n 's/Failed to add \(.*\) to database/\1/p' filename
You can also just pipe grep to itself :)
grep -E 'Failed to add (.*) to database' | grep -Eo "[^ ]+@[^ ]+"
Or, if "lines in interest" are the only ones with emails, just use the last grep command without the first one.
You can use sed:
grep -E 'Failed to add (.*) to database' | sed 's/Failed to add \(.*\) to database/\1/'
Recent versions of GNU grep have a -o option which does exactly what you want. (-o is for --only-matching).
This should do the job:
grep -Po '(?<=Failed to add ).+?(?= to database)'
It uses a positive look-behind assertion, followed by the match for the email address, followed by a positive look-ahead assertion. This ensures the surrounding text is present, but only the email address part is actually consumed (and thus returned).
The -P option enables Perl-compatible regular expressions (required for the look-around assertions), and -o prints only the matched part of each line.
or Python:
cat file | python3 -c "import re, sys; print('\r\n'.join(re.findall('add (.*?) to', sys.stdin.read())))"
The -r option for sed enables extended regular expressions, so groups do not need backslashes:
sed -n -r 's/Failed to add (.*) to database/\1/p' filename
If you just want to use grep and output only the matching part of the line:
grep -E -o 'Failed to add (.*) to database'
Then, if you want to write it to a file:
cat yourlogfile | grep -E -o 'Failed to add (.*) to database' >> outputfile
From the grep manual: -o, --only-matching: print only the non-empty parts of matching lines.
If you want to use grep, it would be more appropriate to use egrep:
About egrep
Search a file for a pattern using full regular expressions.
Plain grep does not always support the full (extended) regular-expression functionality; egrep is equivalent to grep -E.
