I want to remove unwanted strings using a Unix sed regular expression.
Input string
echo ',"wanted1":"value1","unwanted";"unwanted";"wanted2":"value2",'
Required string
"wanted1":"value1","wanted2":"value2"
Try this one
Script
echo ',"wanted1":"value1","unwanted";"unwanted";"wanted2":"value2",' | sed 's/,//g; s/"unwanted"//g; s/;//g; s/""/","/g'
Output
"wanted1":"value1","wanted2":"value2"
Related
I am new to shell scripting. I have a text file with multiple records, where the first record ends and the second record starts on the same line, as below:
"-}{"
So I want to break the chain as
"-} #line1
{ #line2"
I tried like below:
Method 1
sed 's/\-\}\{//\-\} \n \{' file.txt
Method 2
tr '-}{' '\n'
Can anyone please help me with this?
With your shown samples, please try the following awk code. It simply substitutes -}{ with -}, a newline and {, then prints the line.
echo '"-}{"' | awk '{sub(/-}{/,"-}\n{")} 1'
You are escaping too much: none of -, } or { needs a backslash here.
Also, the syntax is s/<pattern>/<replacement>/. There are three slashes, and the last one goes at the very end.
$ echo '"-}{"' | sed 's/-}{/-}\n{/'
"-}
{"
This is not possible with tr, because tr translates single characters, not strings. tr -- '-}{' '\n' would replace each of -, } and { with a newline wherever it occurs.
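A small demonstration of that tr behaviour, as a sketch: the input a-b}c{d is invented, the helper name split_on_any is hypothetical, and the second set is written out as three newlines so that it does not rely on tr padding a shorter set.

```shell
# Each character of the first set is translated on its own,
# so '-', '}' and '{' all turn into newlines independently.
split_on_any() {
    tr -- '-}{' '\n\n\n'
}

printf '%s\n' 'a-b}c{d' | split_on_any
```

The four letters end up on four separate lines, which shows why tr cannot treat -}{ as one three-character delimiter.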
This might work for you (GNU sed):
sed 'G;:a;s/-}\({.*\(.\)\)/-}\2\1/;ta;s/.$//' file
Append a newline to the current line.
Use pattern matching to insert the newline between -} and { repeatedly.
When all is done, remove the introduced newline.
You can use () to capture the delimiters and do this:
echo '"-}{" -} -}{' | sed -E 's/(-})({)/\1\n\2/g'
"-}
{" -} -}
{
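The sed commands above rely on \n in the replacement, which GNU sed understands but which other sed implementations may not. A more portable sketch uses a backslash followed by a real line break in the replacement (the helper name split_records is made up for illustration):

```shell
# Hypothetical helper: split every -}{ into -} and { on separate lines.
split_records() {
    # Backslash + literal newline is the portable way to insert a newline
    sed 's/-}{/-}\
{/g' "$@"
}

echo '"-}{" -} -}{' | split_records
```

With no arguments the function reads standard input; with a file name it reads that file.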
Each record coming with column names. It is pipe delimited. I have to replace them in each record as shown below:
Input:
COMPILES=1|PROPS=inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US|SCPU=30828
Output:
1|inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US|30828
I tried sed 's/[^|]*=//g' to remove every run of non-| characters followed by =, but because the match is greedy, the 2nd field keeps only its last value. Is there a way to replace only the 1st instance in each field? This is what I get:
1|en-US|30828
Using sed:
$ sed 's/\(^\||\)[^=]\+=/\1/g' file
1|inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US|30828
Explained:
s/ replace
\(^\||\)[^=]\+= beginning (^) or (\|) separator (|) and all non-=s and a =
/\1/g with beginning or separator (\1) globally (g)
ie. replace ^THIS= with ^ and |THIS= with |.
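The same idea can be written with extended regular expressions, which need fewer backslashes. This is a sketch, not the answer's own command: it assumes a sed that supports -E, it uses [^=|]+ instead of [^=]+ so that a field without any = cannot be swallowed together with the next one, and the helper name strip_keys and the short sample record are invented.

```shell
# Hypothetical helper: drop the leading KEY= of each pipe-delimited field.
strip_keys() {
    # (^|\|)   start of line or a literal pipe, kept via \1
    # [^=|]+=  the key and its equals sign, never crossing a pipe
    sed -E 's/(^|\|)[^=|]+=/\1/g' "$@"
}

printf '%s\n' 'A=1|B=x=2;y=3|C=4' | strip_keys
```

Only the first key of each field is removed; the later x= and y= inside the second field are left alone because they do not follow a pipe or the start of the line.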
Try with this:
awk -v RS='|' -v ORS='|' '{sub("[^.]*=","")}1' input | sed "s|\|$||g"
RS, record separator, usually is newline, in this case it changes to |, so a record would be COMPILES=1 or PROPS=inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US
ORS, output record separator, is also newline, changes to |, so when print, the output would be separated by |
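sub("[^.]*=","") removes the first key and its =. It is not truly a lazy regex (awk has no lazy quantifiers): [^.]* is greedy but cannot cross a dot, so it stops at the first = only when a dot appears before the second = of the record, as it does in this input. More about greediness in awk in https://unix.stackexchange.com/questions/49601/how-to-reduce-the-greediness-of-a-regular-expression-in-awk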
sed "s|\|$||g" to delete the last |
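To see what [^.]*= actually matches, it can be compared with an anchored pattern on a made-up record. The record KEY=a=b.c=d and both helper names are invented for illustration.

```shell
# Pattern from the answer above: greedy, but unable to cross a dot.
strip_until_dot() { awk '{ sub(/[^.]*=/, "") } 1'; }

# Anchored alternative: removes only the text up to the first =.
strip_first_key() { awk '{ sub(/^[^=]*=/, "") } 1'; }

printf '%s\n' 'KEY=a=b.c=d' | strip_until_dot   # prints b.c=d
printf '%s\n' 'KEY=a=b.c=d' | strip_first_key   # prints a=b.c=d
```

The first pattern eats two keys here because no dot stands between them, so the anchored form is probably the safer choice when the values are not known to contain a dot.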
another awk
$ awk 'BEGIN{FS=OFS="|"} {for(i=1;i<=NF;i++) sub(/[^=]+=/,"",$i)}1' file
results with
1|inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US|30828
Using Perl
$ cat mullapudi.log
COMPILES=1|PROPS=inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US|SCPU=30828
$ perl -F"\|" -ane ' s/^.+?=//g for @F; print join("|",@F) ' mullapudi.log
1|inet.timeoutDownload=5000;inet.timeoutIO=5000;inet.timeoutOpen=5000;inet.urlBase=vxml3-elr:7000/CVP/;swirec_language=en-US|30828
Hi, I have a file that looks like the one below. I need to check the 3rd field: if it is W, replace it with A; if it is M, replace it with X. How can I do this in Unix? Can anyone help?
Input file:
CRM~ABC~M~124
CRM~CDF~W~875
Output expected :
CRM~ABC~X~124
CRM~CDF~A~875
Thanks in advance..
With awk this is done easily, taking care of having the tilde both as input field separator (FS) and as output field separator (OFS). You then just replace the third field according to your needs:
awk 'BEGIN {FS=OFS="~"} $3=="M" {$3="X"} $3=="W" {$3="A"} {print}' yourfile
The sed answer is longer:
sed 's/\(^[^~]*~[^~]*~\)M~/\1X~/;s/\(^[^~]*~[^~]*~\)W~/\1A~/' yourfile
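If more codes need translating later, a lookup table keeps the awk version short. A minimal sketch, assuming the same ~ separator and the two mappings from the question; the helper name remap_third_field is hypothetical.

```shell
# Hypothetical helper: translate the 3rd ~-separated field through a table.
remap_third_field() {
    awk 'BEGIN {
             FS = OFS = "~"
             map["M"] = "X"
             map["W"] = "A"
         }
         ($3 in map) { $3 = map[$3] }   # rewrite only known codes
         { print }' "$@"
}

printf '%s\n' 'CRM~ABC~M~124' 'CRM~CDF~W~875' | remap_third_field
```

Lines whose third field is neither M nor W pass through unchanged.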
* Each line consists of two fields, separated by a pipe '|', where
* the first field is a comma-separated list of items, and
* the second field is a tag.
This is my INPUT:
100,210,354,462|acct
331,746,50|mis
90,263,47,14|sales
and required OUTPUT:
100acct
210acct
354acct
462acct
331mis
746mis
50mis
90sales
263sales
47sales
14sales
sed '{s/^\([^a-z].*\),\([^a-z].*\),\([^a-z].*\),\([^a-z].*\)|\([^0-9].*\)$/\1\5\n\2\5\n\3\5\n\4\5/;s/^\([^a-z].*\),\([^a-z].*\),\([^a-z].*\)|\([^0-9].*\)$/\1\4\n\2\4\n\3\4/}' filename
One way using GNU awk:
awk -F "[,|]" '{ for (i=1; i<NF; i++) print $i$NF }' file.txt
Results:
100acct
210acct
354acct
462acct
331mis
746mis
50mis
90sales
263sales
47sales
14sales
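A variation on the awk approach, as a sketch: split each line on the pipe only and then break the first field apart with split(), so the tag is always taken from the second field even if it contains a comma. The helper name explode_items is made up for illustration.

```shell
# Hypothetical helper: print every item of field 1 followed by the tag in field 2.
explode_items() {
    awk -F'|' '{
        n = split($1, items, ",")          # n = number of comma-separated items
        for (i = 1; i <= n; i++) print items[i] $2
    }' "$@"
}

printf '%s\n' '100,210,354,462|acct' '331,746,50|mis' | explode_items
```

This works for any number of items per line, unlike patterns that spell out three or four capture groups.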
Use the following (it only handles lines with exactly three or four items):
sed 's/^\([^a-z].*\),\([^a-z].*\),\([^a-z].*\),\([^a-z].*\)|\([^0-9].*\)$/\1\5\n\2\5\n\3\5\n\4\5/g;s/^\([^a-z].*\),\([^a-z].*\),\([^a-z].*\)|\([^0-9].*\)$/\1\4\n\2\4\n\3\4/g'
This might work for you (GNU sed):
sed 's/\s*//;:a;s/,\(.*|\(.*\)\)/\2\n\1/;ta;s/|//' file
Explanation:
s/\s*// remove whitespace at the front of the record.
:a;s/,\(.*|\(.*\)\)/\2\n\1/;ta replace each , by the last field and a newline
s/|// remove the |
To preserve whitespace use:
sed -r 's/(\s*)(.*\|)/\2\1/;:a;s/,(.*\|(.*))/\2\n\1/;ta;s/\|//;s/(\S+)(\s+)(\S+)/\2\1\3/g' file
sed 's/\([0-9]\),\([0-9]*\),\([0-9]*\),*\([0-9]*\)\([,|]\)\(.*\)/\1\6\n\2\6\n\3\6\n\4\6/' input | sed '/^[a-z]*$/d'
This expression gives the correct output for the sample input.
I have a blob of text like this:
abcd,def,geff,hij,klmn,nop,qrs,tuv,wxyz,....
Can you guys help me in replacing the 4th comma (,) with a newline using awk or any unix (mac) magic!
To replace only the 4th occurrence of , you can use:
echo "abcd,def,geff,hij,klmn,nop,qrs,tuv,wxyz,...." | sed 's/,/\n/4'
To replace every 4th occurrence, use:
echo "abcd,def,geff,hij,klmn,nop,qrs,tuv,wxyz,...." | sed 's/\(\([^,]*,\)\{3\}[^,]*\),/\1\n/g'
To change only the 4th comma:
sed 's/\(\([^,]*,\)\{3\}[^,]*\),/\1\n/'
(Note: rush's answer shows a much neater way to do this: s/,/\n/4)
To change every 4th comma, add the g flag:
$ echo 'abcd,def,geff,hij,klmn,nop,qrs,tuv,wxyz,....' |\
> sed 's/\(\([^,]*,\)\{3\}[^,]*\),/\1\n/g'
abcd,def,geff,hij
klmn,nop,qrs,tuv
wxyz,....
Here's a sed reference.
In a nutshell, the command finds the pattern
(( non-commas - comma ) (3 times) - (non-commas)) comma
and changes it to
"whatever is in outer brackets" + newline.
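This works because xargs runs /bin/echo by default: -d, (a GNU xargs option) splits the input on commas, -n4 passes four items to each echo call, and tr then turns the separating spaces back into commas.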
http://unixhelp.ed.ac.uk/CGI/man-cgi?xargs
echo 'abcd,def,geff,hij,klmn,nop,qrs,tuv,wxyz,....' | xargs -d, -n4 | tr ' ' ','
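On systems without GNU sed or GNU xargs (macOS, for example), \n in a sed replacement and xargs -d may not be available. A plain awk sketch of the "every 4th comma" case avoids both; the helper name break_every_nth and its optional count argument are invented for illustration.

```shell
# Hypothetical helper: replace every n-th comma (default 4) with a newline.
break_every_nth() {
    awk -F, -v n="${1:-4}" '{
        for (i = 1; i <= NF; i++)
            # after every n-th field, and after the last one, end the line
            printf "%s%s", $i, ((i % n == 0 || i == NF) ? "\n" : ",")
    }'
}

echo 'abcd,def,geff,hij,klmn,nop,qrs,tuv,wxyz,....' | break_every_nth 4
```

Unlike the xargs and tr pipeline, this leaves any spaces inside the items untouched.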