sed: remove digits after word - unix

I have a simple sed question.
I have data like this:
2600,Sale,"Approved 911973",244.72
2601,Sale,"Approved 04735C",490.51
2602,Sale,"Approved 581068",52.82
2603,Sale,"Approved 009275",88.10
How do I make it like this:
2600,Sale,Approved,244.72
2601,Sale,Approved,490.51
2602,Sale,Approved,52.82
2603,Sale,Approved,88.10
Notice that the numbers after Approved are gone, as well as the quotes. I can remove the quotes with:
sed 's/"//g' file
but I don't know how to remove the spaces and digits.
Thanks!

sed "s/\"Approved[^,]*/Approved/g" file
It finds the opening quote and Approved, followed by any run of non-comma characters up to the first comma encountered, and replaces all of it with Approved (no quotes):
2600,Sale,Approved,244.72
2601,Sale,Approved,490.51
2602,Sale,Approved,52.82
2603,Sale,Approved,88.10

Using extended regex with sed:
sed -r 's/"([^[:space:]]*)[^"]*"/\1/g' file
The regex above targets any quoted string. If you want to target only the string Approved, then:
sed -r 's/"(Approved)[^"]*"/\1/g' file
With basic regex:
sed 's/"\(Approved\)[^"]*"/\1/g' file
To target any quoted string, change Approved to [^[:space:]]*
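The difference between stopping at the next comma and stopping at the closing quote can be seen on a line whose quoted field itself contains a comma. This is a minimal sketch; the input line is hypothetical and does not come from the question's data.

```shell
# Hypothetical input line: the quoted field itself contains a comma.
line='2604,Sale,"Approved 12,345",10.00'

# Stops at the first comma, so part of the quoted field is left behind.
up_to_comma=$(printf '%s\n' "$line" | sed 's/"Approved[^,]*/Approved/')

# Stops at the closing quote, so the whole quoted field is replaced.
up_to_quote=$(printf '%s\n' "$line" | sed 's/"\(Approved\)[^"]*"/\1/')

printf '%s\n' "$up_to_comma" "$up_to_quote"
```

For the sample data in the question both forms give the same result, since no quoted field there contains a comma.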

One way using awk (only if the other columns do not contain multiple words, as in your sample):
awk -F'[ ,]' -v OFS=, '{gsub(/"/,""); print $1,$2,$3,$5}' file

awk -F'[," ]' '{OFS=","; print $1,$2,$4,$7}' file
Output:
2600,Sale,Approved,244.72
2601,Sale,Approved,490.51
2602,Sale,Approved,52.82
2603,Sale,Approved,88.10
This assumes there is no other whitespace in the line.
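To see why the fields of interest are $1, $2, $4 and $7, it may help to print every field that the separator [," ] produces. A small sketch using the first sample line from the question:

```shell
line='2600,Sale,"Approved 911973",244.72'

# Each single comma, quote or space ends a field, so two adjacent
# separators (," and ",) leave an empty field between them.
fields=$(printf '%s\n' "$line" |
  awk -F'[," ]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

Fields 3 and 6 come out empty and field 5 holds the approval code, which is why they are skipped in the print statement.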

Related

Replace double consonant letters with one using sed command

How do I replace double consonants with a single letter using the sed Linux command? Example: WILLIAM -> WILIAM. grep -E '(.)\1+' finds the words that contain the same character repeated in a row, but how do I replace the repeat with only one occurrence of the letter?
I tried
cat test.txt | head | tr -s '[^AEUIO\n]' '?'
tr is all or nothing; it will replace all occurrences of the selected characters, regardless of context. For regex replacement, look at sed - you even included this in your question's tags, but you don't seem to have explored how it might be useful?
sed 's/\(.\)\1/\1/g' test.txt
The dot matches any character; to restrict to only consonants, change it to [b-df-hj-np-tv-xz] or whatever makes sense (maybe extend to include upper case; perhaps include accented characters?)
The regex dialect understood by sed is more like the one understood by grep without -E (hence all the backslashes); though some sed implementations also support this option to select the POSIX extended regular expression dialect.
Neither sed nor tr needs cat to read standard input for it (though tr, obscurely, does not accept a file name argument). See, tangentially, also Useless use of cat?
Match one consonant, remember it in \( \), then match it again with \1 and replace the pair with the single remembered character.
sed 's/\([bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ]\)\1/\1/g' test.txt
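As a quick check of the back-reference approach, here is a sketch wrapped in a shell function. The function name dedupe_consonants is made up for illustration, and it assumes ASCII letters only, with y/Y counted as a consonant.

```shell
# Collapse a doubled consonant into one; doubled vowels are left alone.
# Assumption: ASCII letters only, and y/Y treated as a consonant.
dedupe_consonants() {
  # LC_ALL=C keeps the bracket ranges limited to plain ASCII ordering.
  LC_ALL=C sed 's/\([b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]\)\1/\1/g'
}

william=$(printf '%s\n' 'WILLIAM' | dedupe_consonants)
bookkeeper=$(printf '%s\n' 'bookkeeper' | dedupe_consonants)

printf '%s\n' "$william" "$bookkeeper"
```

Because \1 must repeat exactly the character that was captured, a mixed-case pair such as Ll is not treated as a double letter.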

Replace two characters with sed dynamically

I have the following strings
C:/data
D:/backups
C:/Users/Guest/old_data
F:/files/new
How can I replace the first two characters with /cygdrive/LOWERCASE_DRIVE_LETTER?
RESULT
/cygdrive/c/data
/cygdrive/d/backups
/cygdrive/c/Users/Guest/old_data
/cygdrive/f/files/new
awk -F':' 'sub(/../,"/cygdrive/"tolower($1))' file
Brief explanation:
-F':': set ':' as the field separator.
tolower($1): return the lower case of $1.
sub(/../,"/cygdrive/"tolower($1)): replace the first 2 characters with "/cygdrive/"tolower($1). The sub() call also serves as the pattern, so a line is printed when the substitution succeeds.
This might work for you (GNU sed):
sed 's/\(.\):/\/cygdrive\/\l\1/' file
Capture the first character, which is followed by a :, in a group. Then insert /cygdrive/ and lowercase the group, i.e. the first character (\l lowercases the next character of the replacement).
Could you please try the following.
awk 'BEGIN{FS=OFS="/"}{sub(/:/,"",$1);$1=tolower($1);print "/cygdrive/" $0}' Input_file
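For a variant that does not rely on GNU sed's \l and leaves lines without a drive prefix untouched, something along these lines might work. The function name to_cygdrive and the pass-through behaviour are assumptions added for illustration, not part of the answers above.

```shell
to_cygdrive() {
  awk '
    /^[A-Za-z]:/ {
      # Lowercase the drive letter and drop the colon after it.
      print "/cygdrive/" tolower(substr($0, 1, 1)) substr($0, 3)
      next
    }
    { print }   # no drive prefix: pass the line through unchanged
  '
}

converted=$(printf '%s\n' 'C:/data' 'F:/files/new' 'relative/path' | to_cygdrive)
printf '%s\n' "$converted"
```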

Replace characters in a delimited part of a file

I have the file teste.txt with the following content:
02183101399205000 GBTD9VBYMBQ 04455927964
02183101409310000 XBQMPL1C93B 27699484827
54183101003651000 1WFG3SNVDG9 71530894204
I execute the command
sed -e 's/^\(.\{18\}\)[0-9]/\1#/g' teste.txt
The result is:
02183101399205000 GBTD9VBYMBQ 04455927964
02183101409310000 XBQMPL1C93B 27699484827
54183101003651000 #WFG3SNVDG9 71530894204
Only the 19th position in line 3 is changed from 1 to #.
I would like to know how I can change all numeric characters from the 19th to the 30th position to #.
The expected result is:
02183101399205000 GBTD#VBYMBQ 04455927964
02183101409310000 XBQMPL#C##B 27699484827
54183101003651000 #WFG#SNVDG# 71530894204
An awk command to accomplish your goal:
awk '{ gsub(/[0-9]/,"#",$2); print }' teste.txt
This might work for you (GNU sed):
sed -r 's/./&\n/30;s//\n&/19;h;s/[0-9]/#/g;H;x;s/\n.*\n(.*)\n.*\n(.*)\n.*/\2\1/' file
Surround the string, which is from the 19th to the 30th character, by newlines and make a copy. Replace all digits by #'s. Append this string to the original and use pattern matching to rearrange the strings to make a new string with the unchanged parts either side of the changed part, at the same time discarding the introduced newlines.
An alternative method, utilising the fact that the fields are space separated:
sed -r ':a;s/( \S*)[0-9](\S* )/\1#\2/;ta' file
In fact the two methods can be combined:
sed -r 's/./&\n/30;s//\n&/19;:a;s/(\n.*)[0-9](.*\n)/\1#\2/;ta;s/\n//g' file
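If the requirement really is positional (characters 19 to 30) rather than "the second field", a sketch with awk's substr may be easier to read than the sed versions. The function name mask_digits_19_30 is made up for illustration.

```shell
mask_digits_19_30() {
  awk '{
    head = substr($0, 1, 18)    # positions 1-18, left unchanged
    mid  = substr($0, 19, 12)   # positions 19-30, digits masked below
    tail = substr($0, 31)       # position 31 onward, left unchanged
    gsub(/[0-9]/, "#", mid)
    print head mid tail
  }'
}

masked=$(printf '%s\n' \
  '02183101409310000 XBQMPL1C93B 27699484827' \
  '54183101003651000 1WFG3SNVDG9 71530894204' | mask_digits_19_30)
printf '%s\n' "$masked"
```

Unlike the field-based commands, this does not depend on where the spaces fall, only on the character positions.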

Using sed to replace text with curly braces

I am trying to find the following text
get_pins {
and replace it with
get_pins -hierarchical {proc_top_*/
I've tried using sed but I'm not sure what I'm doing wrong. I know that you need # in front of curly braces but I still can't get the command to work properly.
The closest I've come is to this:
sed 's/get_pins #{#/get_pins -hierarchical #{#proc_top_*\//g' filename.txt > output
but it doesn't do the replacement I wanted above.
@merlin2011's answer shows you how to do it with alternative delimiters, but as for why your command didn't work:
It's actually perfectly fine if you just remove all # chars from your statement:
sed 's/get_pins {/get_pins -hierarchical {proc_top_*\//g' filename.txt > output
There are two distinct escaping requirements involved here:
Escaping literal use of the regex delimiter: this is what you did correctly, by escaping the / as \/.
Escaping characters with special meaning inside a regex in general: this is always done by prefixing \, but in your case no such escaping is needed. Since you're not using -E or -r to select extended regexes, you are using a basic regex, in which { is not a special character and need not be escaped. If, by contrast, you had used -E (-r), you should have escaped { as \{.
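The two dialects can be compared side by side. This is a sketch only: the input line is hypothetical, and it assumes a sed that accepts -E (GNU and BSD sed both do).

```shell
line='set p [get_pins {a/b}]'   # hypothetical input line

# Basic regex: { is literal, only the delimiter / needs escaping.
bre=$(printf '%s\n' "$line" |
  sed 's/get_pins {/get_pins -hierarchical {proc_top_*\//')

# Extended regex: { must be escaped in the pattern, but not in the replacement.
ere=$(printf '%s\n' "$line" |
  sed -E 's/get_pins \{/get_pins -hierarchical {proc_top_*\//')

printf '%s\n' "$bre" "$ere"
```

In both cases the * and { in the replacement text are taken literally; only & and backslash sequences are special there.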
The problem is not in the curly braces, it's in the /.
This is exactly why sed lets you do alternate delimiters.
The line below uses ! as a delimiter instead, and works correctly for a simple file with get_pins { in it.
sed 's!get_pins {!get_pins -hierarchical {proc_top_*/!g' Input.txt
Output:
get_pins -hierarchical {proc_top_*/
Update: Based on mklement0's comment, and testing with the csh shell, the following should work in csh.
sed 's#get_pins {#get_pins -hierarchical {proc_top_*/#g' Input.txt
This awk should do the replace:
awk '{sub(/get_pins [{]/,"get_pins -hierarchical {proc_top_*/")}1' filename.txt

Sed replace only exact match

I want to replace a string like Europe12 with Europe12_yesterday in a file, without changing the Europe12-36 strings that also exist in the file.
I tried:
basename=Europe12
sed -i "s/\b$basename\b/${basename}_yesterday/g" file.txt
but this also changed the Europe12-36 strings.
Require a space or the end of the line after the name:
sed -E 's/Europe12( |$)/Europe12_yesterday\1/g' input
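Manually construct the delimiter list you want instead of using \b, \W or \<. The - character is not a word character (word characters are alphanumerics and underscore), so \b matches between Europe12 and -36; that is why your other string is matched too. Try a bracket expression instead, expanding the list as needed: require that the character after the name is not in [-a-zA-Z0-9], that is, that it matches [^-a-zA-Z0-9].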
You can do it in two steps:
sed -e 's/Europe12/Europe12_yesterday/g' -e 's/Europe12_yesterday-36/Europe12-36/g' file.txt
sed 's/\(Europe12\)\([[:blank:]]\)/\1_yesterday\2/g;s/Europe12$/&_yesterday/' YourFile
[[:blank:]] could be extended with any other boundary characters you accept, such as .,;:/]) etc. (be careful about the regex meaning of these characters inside a bracket expression).
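It is a little late to reply.
GNU sed's "word boundary" notation (\<..\>) restricts the match to whole words, so a longer name such as Europe123 is left alone. Use double quotes so that the shell expands $basename:
sed -i "s/\<$basename\>/${basename}_yesterday/g" file.txt
Because - is not a word character, \> still matches before -36, so Europe12-36 needs an explicit delimiter check as well.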
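Another way to get an exact match is to compare whole whitespace-separated fields in awk rather than rely on regex word boundaries. The sketch below makes two assumptions: names are separated by spaces or tabs, and it is acceptable for awk to rebuild changed lines with single spaces. The function name rename_exact is made up for illustration.

```shell
basename=Europe12   # name taken from the question

rename_exact() {
  # $1 is the exact name to rename; it is passed to awk with -v so that
  # no shell quoting or regex escaping is involved.
  awk -v name="$1" '{
    for (i = 1; i <= NF; i++)
      if ($i == name) $i = name "_yesterday"
    print
  }'
}

renamed=$(printf '%s\n' 'Europe12 Europe12-36' 'x Europe12' | rename_exact "$basename")
printf '%s\n' "$renamed"
```

Since the comparison is a plain string equality, Europe12-36 is never a candidate, whatever characters follow the name.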
