I'm trying to extract headers from emails and create a JSON fragment from them. I'm using sed to pull out the keys and values, but it's failing to put the trailing quote on each of the lines:
$ cat email1 | grep -i -e "^subject:" -e "^from:" -e "^to:" | \
sed -n 's/^\([^:]*\):[ ]*\(.*\)$/"\1":"\2"/gp'
"From":"Blah Blech <blah.blech@blahblech.com>
"To":"foo@bar.com
"Subject":"Yeah
I don't understand why the replacement pattern isn't working.
awk to the rescue!
$ awk -F': *' -v OFS=':' -v q='"' 'tolower($0) ~ /^(from|to|subject):/ {
print q $1 q, q $2 q}' email1
which also replaces the cat and grep steps.
Stripping the carriage returns, as @tripleee suggested, fixed the issue with sed (type ctrl-v ctrl-m to enter the literal carriage return shown as ^M; tr -d '\r' is equivalent):
$ cat email1 | tr -d '^M' | grep -i -e "^subject:" -e "^from:" -e "^to:" | \
sed -n 's/^\([^:]*\):[ ]*\(.*\)$/"\1":"\2"/gp'
"From":"Blah Blech <blah.blech@blahblech.com>"
"To":"foo@bar.com"
"Subject":"Yeah"
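To see why the carriage returns matter, here is a minimal sketch. It builds a hypothetical message with CRLF line endings (the addresses and the variable names sample and headers are illustrative, not from the question) and runs the same pipeline with tr -d '\r', which is equivalent to typing the literal ^M:

```shell
# Hypothetical message with CRLF line endings, as mail files often have.
sample=$(printf 'From: Alice Example <alice@example.com>\r\nTo: bob@example.com\r\nSubject: Yeah\r\n\r\nbody\r\n')

# tr -d '\r' removes every carriage return before grep and sed see the lines.
headers=$(printf '%s\n' "$sample" | tr -d '\r' \
  | grep -i -e '^subject:' -e '^from:' -e '^to:' \
  | sed -n 's/^\([^:]*\):[ ]*\(.*\)$/"\1":"\2"/p')

printf '%s\n' "$headers"
```

Without the tr step, each captured value ends in a carriage return, so the terminal moves the cursor back to the first column before printing the closing quote, which makes the quote look as if it were missing.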
I'm trying to grep the word that starts with the keyword group and ends with -wx in the given line. I also need to ignore the following words:
Starts with default:group and ends with -wx
group::-wx
My Findings
echo "# file: /test/test123 # owner: own # group: acct user::r-- group::r-x mask::rwx other::r-x default:user::r-- default:user:an:--x default:group::r-x default:group:fin:-wx default:mask::rwx default:other::r-x" | grep -o "group:[^ ]*-wx" | sed '/group::-wx/d';'/default:[^ ]*:[^ ]*-wx/d'
Actual result
fin:-wx
Expected result
<null>
You already have a grep to select what you want, simply add grep statements to remove those you do not want.
Like so:
LINE="# file: /test/test123 # owner: own # group: acct user::r-- group::r-x mask::rwx other::r-x default:user::r-- default:user:an:--x default:group::r-x default:group:fin:-wx default:mask::rwx default:other::r-x"
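echo "$LINE" | tr ' ' '\n' \
| grep "group:[^ ]*-wx" \
| grep -v "^default:group:[^ ]*-wx" \
| grep -v "^group::-wx$"
On my Linux system this returns nothing, which is what you expected. The line is split into one word per line first because the exclusion filters need to see whole words, and because GNU grep prints nothing at all when -v is combined with -o. I do not have other test samples, but I think this is OK.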
Because grep first extracts the substring group:fin:-wx out of
default:group:fin:-wx, the following sed filter
/default:[^ ]*:[^ ]*-wx/d no longer has a default: prefix to match.
A workaround is to change the order of filtering:
str="# file: /test/test123 # owner: own # group: acct user::r-- group::r-x mask::rwx other::r-x default:user::r-- default:user:an:--x default:group::r-x default:group:fin:-wx default:mask::rwx default:other::r-x"
echo "$str" | sed -e 's/default:group:[^ ]*-wx//' -e 's/group::-wx//' | grep -o 'group:[^ ]*-wx'
As an alternative, if your grep supports the -P option, you can use a positive lookbehind:
echo "$str" | grep -Po '(?<= )group:[^ ]*-wx' | sed -e '/group::-wx/d' -e '/default:[^ ]*:[^ ]*-wx/d'
The pattern (?<= ) requires the match to be preceded by a space, without
including that space in the output.
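The sample line in the question has no entry that should actually be reported, so here is a small sketch against a hypothetical line that has one. It uses awk rather than grep -P, purely as an illustration of checking each word separately; the str contents and the matches variable are assumptions:

```shell
# Hypothetical ACL line: unlike the question's sample, it contains one entry
# (group:fin:-wx) that should be reported and two that should be ignored.
str="user::r-- group::-wx group:fin:-wx default:group:adm:-wx other::r-x"

# Examine each space-separated word on its own, so the ^ anchor can tell
# group:... apart from default:group:...
matches=$(printf '%s\n' "$str" | awk '{
  for (i = 1; i <= NF; i++)
    if ($i ~ /^group:.*-wx$/ && $i != "group::-wx")
      print $i
}')

printf '%s\n' "$matches"
```

Anchoring on whole words plays the same role as the lookbehind: a word that begins with default: can never match ^group:.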
I have the following list in a text file:
10.1.2.200
10.1.2.201
10.1.2.202
10.1.2.203
I want to wrap each value in "double quotes" and join them into one comma-separated string.
Can this be done in sed or awk?
Expected output:
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203","10.1.2.204"
The easiest is something like this (in pseudo code):
Read a line;
Put the line in quotes;
Keep that quoted line in a stack or string;
At the end (or while constructing the string), join the lines together with a comma.
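The steps above can be sketched in plain shell, with no external tool doing the join. This is only an illustration: the temporary file stands in for the question's text file, and the variable names tmpfile and joined are made up for the example:

```shell
# Hypothetical input file holding the sample addresses from the question.
tmpfile=$(mktemp)
printf '%s\n' 10.1.2.200 10.1.2.201 10.1.2.202 10.1.2.203 > "$tmpfile"

joined=""
while IFS= read -r line; do
  # ${joined:+$joined,} expands to "<previous items>," only when joined is
  # non-empty, so no comma is placed before the first element.
  joined="${joined:+$joined,}\"$line\""
done < "$tmpfile"
rm -f "$tmpfile"

printf '%s\n' "$joined"
```

Building the string while reading avoids a second pass, at the cost of being slower than awk or sed on large files.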
Depending on the language, that is fairly straightforward to do:
With awk:
$ awk 'BEGIN{OFS=","}{s=s ? s OFS "\"" $1 "\"" : "\"" $1 "\""} END{print s}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Or, to reduce the 'wall of quotes', define a quote character:
$ awk 'BEGIN{OFS=",";q="\""}{s=s ? s OFS q$1q : q$1q} END{print s}' file
With sed:
$ sed -E 's/^(.*)$/"\1"/' file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
(With Perl and Ruby, with a join function, it is easiest to push the elements onto a stack and then join that.)
Perl:
$ perl -lne 'push @a, "\"$_\""; END{print join(",", @a)}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Ruby:
$ ruby -ne 'BEGIN{@arr=[]}; @arr.push "\"#{$_.chomp}\""; END{puts @arr.join(",")}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Here is another alternative:
sed 's/.*/"&"/' file | paste -sd,
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
awk -F'\n' -v RS="\0" -v OFS='","' -v q='"' '{NF--}$0=q$0q' file
should work for the given example.
Tested with gawk:
kent$ cat f
10.1.2.200
10.1.2.201
10.1.2.202
10.1.2.203
kent$ awk -F'\n' -v RS="\0" -v OFS='","' -v q='"' '{NF--}$0=q$0q' f
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
$ awk '{o=o (NR>1?",":"") "\""$0"\""} END{print o}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
I want to keep only the 10.100.52.111 and delete everything else. The address keeps changing, so I don't want to hard-code it.
The original output was as below
"PrivateIpAddress": "10.100.52.111",
I tried the below command and removed "PrivateIpAddress": "
sudo aws ec2 describe-instances --filter Name=tag:Name,Values=bip-spark-es-worker1 |grep PrivateIpAddress |head -1|sed 's/^[ ^t]*\"PrivateIpAddress\"[:]* \"//g'
so the output for the above command now is
10.100.52.111",
I also want to delete the trailing quote and comma.
I tried ["].$ and also \{2\}.$, but neither worked.
Please help.
Let sed do all the work. You don't need grep or head:
sed -n '/"PrivateIpAddress": /{s///; s/[",]//g; p; q}'
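As a variation on the same idea, the prefix, the quotes and the comma can be dropped in a single substitution with a capture group. The sketch below runs against a hypothetical two-line excerpt (the second address and the variable names sample and ip are assumptions, not real output):

```shell
# Hypothetical excerpt of the describe-instances output (two matching lines).
sample='            "PrivateIpAddress": "10.100.52.111",
            "PrivateIpAddress": "10.100.52.222",'

# On the first matching line, keep only the text between the second pair of
# quotes, print it, and quit so that later matches are ignored.
ip=$(printf '%s\n' "$sample" \
  | sed -n '/"PrivateIpAddress":/{s/^[[:space:]]*"PrivateIpAddress":[[:space:]]*"\([^"]*\)".*$/\1/p;q;}')

printf '%s\n' "$ip"
```

Because the capture group takes only what lies between the quotes, the leading indentation disappears as well.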
If the content within the " characters does not itself contain a ",
grep PrivateIpAddress |head -1|sed 's/^[ ^t]*\"PrivateIpAddress\"[:]* \"//g'
can be replaced with
awk -F\" '/PrivateIpAddress/{print $4; exit}'
-F\" use " as field separator
/PrivateIpAddress/ if line matches this string
print $4 print 4th field which is 10.100.52.111 for given sample
exit will quit as only first match is required
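To check the field numbering, this short sketch prints every field that the " separator produces for a hypothetical indented line, then extracts the fourth one (the variable names line, fields and ip are illustrative):

```shell
# Hypothetical input line, indented as in JSON output.
line='    "PrivateIpAddress": "10.100.52.111",'

# Show each field produced by splitting on the double quote.
fields=$(printf '%s\n' "$line" \
  | awk -F'"' '{for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i}')
printf '%s\n' "$fields"

# Field 4 is the text between the second pair of quotes.
ip=$(printf '%s\n' "$line" | awk -F'"' '/PrivateIpAddress/ {print $4; exit}')
printf '%s\n' "$ip"
```

Field 1 is whatever precedes the first quote (here the indentation), which is why the address ends up in field 4 and not field 3.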
Some awk proposals:
echo '"PrivateIpAddress": "10.100.52.111",'| awk -F: '{print substr($2,3,13)}'
10.100.52.111
echo '"PrivateIpAddress": "10.100.52.111",'| awk -F\" '{print $4}'
10.100.52.111
Alternative:
$ echo "\"PrivateIpAddress\": \"10.100.52.111\", "
"PrivateIpAddress": "10.100.52.111",
$ echo "\"PrivateIpAddress\": \"10.100.52.111\", " |grep -Po '(\d+[.]){3}\d+'
10.100.52.111
$ echo "\"PrivateIpAddress\": \"10.100.52.111\", " |grep -Eo '([[:digit:]]+[.]){3}[[:digit:]]+'
10.100.52.111
I need to read a file and manipulate it: wherever a double quote appears inside a field, it must be replaced with two double quotes. Fields are separated by |.
The example below should make this clearer.
Input:
1234567|9393874|"Hi"|"How are "you""
98647489|20370483|"i am "good""|"what about "you""
output :
1234567|9393874|"Hi"|"How are ""you"""
98647489|20370483|"i am ""good"""|"what about ""you"""
I would replace all the "edge" quotes with another character and then replace the "inner" ones:
sed -e 's/|"/|_/g' -e 's/"|/_|/g' -e 's/"$/_/' file | sed 's/"/""/g' | sed 's/_/"/g'
It returns:
1234567|9393874|"Hi"|"How are ""you"""
98647489|20370483|"i am ""good"""|"what about ""you"""
Step by step:
$ sed -e 's/|"/|_/g' -e 's/"|/_|/g' -e 's/"$/_/' a
1234567|9393874|_Hi_|_How are "you"_
98647489|20370483|_i am "good"_|_what about "you"_
$ sed -e 's/|"/|_/g' -e 's/"|/_|/g' -e 's/"$/_/' a | sed 's/"/""/g'
1234567|9393874|_Hi_|_How are ""you""_
98647489|20370483|_i am ""good""_|_what about ""you""_
$ sed -e 's/|"/|_/g' -e 's/"|/_|/g' -e 's/"$/_/' a | sed 's/"/""/g' | sed 's/_/"/g'
1234567|9393874|"Hi"|"How are ""you"""
98647489|20370483|"i am ""good"""|"what about ""you"""
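The placeholder approach would also rewrite any underscore that is already in the data. As a sketch of one way around that, assuming fields never contain a | themselves, awk can handle each field separately: strip the outer quotes, double the inner ones, and wrap the field again. The variable names input and doubled are illustrative:

```shell
# The sample input from the question.
input='1234567|9393874|"Hi"|"How are "you""
98647489|20370483|"i am "good""|"what about "you""'

doubled=$(printf '%s\n' "$input" | awk 'BEGIN { FS = OFS = "|" }
{
  for (i = 1; i <= NF; i++) {
    n = length($i)
    # Only touch fields that are wrapped in double quotes.
    if (n >= 2 && substr($i, 1, 1) == "\"" && substr($i, n, 1) == "\"") {
      inner = substr($i, 2, n - 2)   # text between the outer quotes
      gsub(/"/, "\"\"", inner)       # double every inner quote
      $i = "\"" inner "\""
    }
  }
  print
}')

printf '%s\n' "$doubled"
```

Unquoted fields such as 1234567 pass through unchanged.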
I have a file which contains about 30000 Records delimited by '|'. I need to get a distinct list of special characters only from the file.
For Eg:
123|fasdf|%df&|pap,came|!
234|%^&asdf|34|'":|
My output should be:
|%&,!^'":
Any help would be greatly appreciated.
Thanks,
Velraj.
grep -o '[|%&,!^":]' input | sort -u
You have to list all your special characters inside brackets.
This will return each unique special character on its own line. If you really need a string with these characters you have to remove newlines afterwards, e.g.:
grep -o '[|%&,!^":]' input | sort -u | tr -d '\n'
UPDATE:
If you want every character that is not in the 'a-zA-Z0-9' set, without listing the special characters yourself, you can use this one:
grep -o '[^a-zA-Z0-9]' input | sort -u | tr -d '\n'
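Here is a small sketch of that last command run against the two sample records, written to a temporary file. Setting LC_ALL=C is an addition of mine, so that the bracket ranges and the sort order are plain ASCII; tmpfile and specials are illustrative names:

```shell
# Write the sample records from the question to a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
123|fasdf|%df&|pap,came|!
234|%^&asdf|34|'":|
EOF

# One non-alphanumeric character per line, deduplicated, then joined.
specials=$(LC_ALL=C grep -o '[^a-zA-Z0-9]' "$tmpfile" | LC_ALL=C sort -u | tr -d '\n')
rm -f "$tmpfile"

printf '%s\n' "$specials"
```

The characters come out in sorted order, not in the order of first appearance shown in the question.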
echo "123|fasdf|%df&|pap,came|! 234|%^&asdf|34|'\":|" \
| { tr -d '[:alnum:]'; printf "\n"; } \
| sed 's/\(.\)/\1_/g' \
| awk -v 'RS=_' '{print $0}' \
| sort -u \
| awk '{printf "%s", $0}END{printf "\n"}'
output
!"%&',:^||
You can replace the first line echo .... with cat fileName
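Since that pipeline uses _ as a temporary record separator, an underscore in the data would be lost. A possible variation, sketched here with fold -w 1 to put one character per line, avoids the placeholder. The sample line (with an underscore added) and the variable names are assumptions, and fold counts bytes on some systems, so this is meant for single-byte characters:

```shell
# Hypothetical record containing an underscore among the special characters.
sample='123|fasdf|%df&|pap,came|!_x'

# Delete alphanumerics and newlines, split into one character per line,
# deduplicate, and join the survivors back into a single string.
specials=$(printf '%s\n' "$sample" \
  | tr -d '[:alnum:]\n' \
  | fold -w 1 \
  | LC_ALL=C sort -u \
  | tr -d '\n')

printf '%s\n' "$specials"
```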