Replace consecutive delimiters but not single delimiter occurrence in unix

I have a unix file whose delimiter is #!
I need to replace the delimiter '#!' with '~'.
But I have '#' as data in some columns, and I don't want to replace those.
I want to replace only # and ! when they occur together. I don't want to replace either of them when it occurs alone (only # or only !).
Please help me with a unix command.

You can use the sed command to do the replacement.
For example, if your file is data.txt:
sed 's/#!/~/g' data.txt > data_replaced.txt
If you want to edit the file in place, you can use this (the -i option as written here is GNU sed syntax):
sed -i 's/#!/~/g' data.txt
Hope this helps!
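A quick way to convince yourself that lone # and ! characters are left alone is to run the same substitution on a single made-up line. The sample value below is hypothetical, not taken from the asker's file:

```shell
# Hypothetical sample line: '#!' is the delimiter, while the lone '#' in
# "na#me" and the lone '!' after "wow" are ordinary data.
line='id#!na#me#!wow!#!end'

# Only the two-character sequence '#!' is replaced with '~'.
result=$(printf '%s\n' "$line" | sed 's/#!/~/g')

printf '%s\n' "$result"
```

The pattern is a literal two-character string, so sed only matches a # that is immediately followed by a !.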

Related

Unix: multi and single character delimiter in cut or awk commands

This is the string I have:
my_file1.txt-myfile2.txt_my_file3.txt
I want to remove all the characters after the first "_" that follows the first ".txt".
From the above example, I want the output to be my_file1.txt-myfile2.txt. I have to search for first occurrence of ".txt" and continue parsing until I find the underscore character, and remove everything from there on.
Is it possible to do it in sed/awk/cut etc commands?

You can't do this job with cut, but you can with sed and awk (the sed version relies on GNU sed treating \n in the replacement as a newline):
$ sed 's/\.txt/\n/g; s/\([^\n]*\n[^_]*\)_.*/\1/; s/\n/.txt/g' file
my_file1.txt-myfile2.txt
$ awk 'match($0,/\.txt[^_]*_/){print substr($0,1,RSTART+RLENGTH-2)}' file
my_file1.txt-myfile2.txt

Could you please try the following, written based on your shown samples:
awk '{sub(/\.txt_.*/,".txt")} 1' Input_file
This substitutes everything from the first .txt_ to the end of the line with .txt and then prints the line.
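The two awk answers above agree on the sample from the question, but they are not equivalent. A small sketch with a made-up input where the underscore does not come directly after a ".txt" shows the difference (the input string is an assumption for illustration):

```shell
# Hypothetical input: the first "_" after the first ".txt" is not
# immediately preceded by ".txt".
input='a.txt-b_c.txt'

# match(): cut at the first "_" that follows the first ".txt".
by_match=$(printf '%s\n' "$input" |
  awk 'match($0,/\.txt[^_]*_/){print substr($0,1,RSTART+RLENGTH-2)}')

# sub(): only acts when the literal sequence ".txt_" is present.
by_sub=$(printf '%s\n' "$input" | awk '{sub(/\.txt_.*/,".txt")} 1')

printf 'match: %s\nsub:   %s\n' "$by_match" "$by_sub"
```

Here the match() version cuts the line down to a.txt-b, while the sub() version leaves it unchanged. Note also that the match() version prints nothing at all for lines that do not match.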

Change multiple filenames unix

I had to download 15GB of data and for some reason during the downloading process the filenames were messed up in a way so that instead of
test_file.txt
the filenames are doubled, so it's
test_file.txttest_file.txt
instead. My only idea is to count the characters and then rename each file by deleting the first or second half of the filename. Is there a way to do that? The filenames are not consistent, so for example in the same folder there might also be files named
files_are_great.txtfiles_are_great.txt
so I'm struggling to find a way to loop over them.
Thanks a lot!

The command sed 's/^\(.*\)\1$/\1/' replaces a name that consists of a string repeated twice with a single copy of that string, without relying on a particular part of the file name such as .txt. Spaces in the name are allowed.
Example:
echo 'abc defabc def' | sed 's/^\(.*\)\1$/\1/'
prints
abc def
Explanation of the sed command:
^ anchors the pattern to the beginning of the line
.* is 0 or more occurrences of any character
\(...\) captures what matches the pattern in between
\1 is a reference to the first capture group, i.e. the text that was found before
$ anchors the search pattern to the end of the line
This results in a search pattern that matches a whole line that consists of any text followed by the same text.
\1 in the replacement is the same reference to the matched text, i.e. a single occurrence of the duplicated text.
Any input that does not match the pattern will remain unchanged.
Assuming you want to rename all files in the current directory, you can use it like this:
for file in *
do
new=$(printf '%s\n' "$file" | sed 's/^\(.*\)\1$/\1/')
[ "$file" = "$new" ] || mv -- "$file" "$new"
done
As the sed command does not change non-matching input, $new will be the same as $file for file names that don't consist of a duplicated string. This would result in an error message from mv. That's why the renaming will be skipped in this case.
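Before renaming 15GB worth of files it may be worth doing a dry run first. The sketch below wraps the anchored sed command in a helper (dedupe_name is a hypothetical name, not part of any tool) and only prints what would be renamed:

```shell
# dedupe_name is a hypothetical helper: it prints the name with a doubled
# string collapsed to one copy, or the name unchanged if it is not doubled.
dedupe_name() {
  printf '%s\n' "$1" | sed 's/^\(.*\)\1$/\1/'
}

# Dry run: report the planned renames without touching any file.
for file in *; do
  [ -e "$file" ] || continue            # skip the literal '*' in an empty directory
  new=$(dedupe_name "$file")
  [ "$file" = "$new" ] || printf 'would rename: %s -> %s\n' "$file" "$new"
done
```

Because of the ^ and $ anchors, a name like aab.txt is left alone; without them the leading "aa" would be collapsed. Once the printed list looks right, the printf line can be swapped for the mv command.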

Using sed
sed 's#\(\.txt\)#& #g'
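Explanation: & in the replacement stands for the whole matched text, so each .txt is written back followed by a space. The \( \) grouping is not needed for &; a group would be referenced as \1.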
Demo:
echo "files_are_great.txtfiles_are_great.txt" | sed 's#\(\.txt\)#& #g'
files_are_great.txt files_are_great.txt
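To see how & and a numbered group differ, here is a tiny sketch on a made-up string where the group covers only part of the match:

```shell
# The pattern matches ".txt" as a whole, but the group \(\.t\) captures only ".t".
# & expands to the whole match, \1 to the captured group.
shown=$(echo 'a.txt' | sed 's/\(\.t\)xt/[&][\1]/')

printf '%s\n' "$shown"
```

This prints a[.txt][.t], so & is the full match and \1 is just the grouped part.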
For renaming (this assumes the file names contain no spaces, because cut splits on the first space):
for file_name in *txt*txt
do
new_file_name=$(echo "$file_name" | sed 's#\(\.txt\)#& #g' | cut -d' ' -f1)
mv "$file_name" "$new_file_name"
done

replace all commas except last one between two equal sign

I have a file, and for each line I need to replace all commas except the last one between the two equal signs. Can anyone help with this?
(I would prefer a sed command without a loop.)
File's data-->>
STREET:1:1=Zwaneweg 23, Box 0001, PIN002,TOWN.COUNTRY:1:1=BE/Schilde
Should be-->>
STREET:1:1=Zwaneweg 23? Box 0001? PIN002,TOWN.COUNTRY:1:1=BE/Schilde

Try something like this:
mayankp@mayank:~/Documents$ cat tt.txt
STREET:1:1=Zwaneweg 23, Box 0001, PIN002,TOWN.COUNTRY:1:1=BE/Schilde
mayankp@mayank:~/Documents$ cat tt.txt| grep -o -P '(?<==).*(?==)'| rev |sed 's/,/?/2g' |rev > out.txt
mayankp@mayank:~/Documents$ cat out.txt
Zwaneweg 23? Box 0001? PIN002,TOWN.COUNTRY:1:1
Now merge out.txt with tt.txt to retain the data that was dropped.
mayankp@mayank:~/Documents$ perl -0777 -i -pe "s/(=).*(=)/\$1`cat out.txt`\$2/s" tt.txt
mayankp@mayank:~/Documents$ cat tt.txt
STREET:1:1=Zwaneweg 23? Box 0001? PIN002,TOWN.COUNTRY:1:1=BE/Schilde

With sed you can remember matches and restore them.
When you only want to replace the second-last comma, you can use
sed -r 's/(=.*),(.*,.*=)/\1?\2/' inputfile
Because the wildcard is greedy, when you have 8 commas between the equal signs, the seventh will be replaced.
You can tell sed to repeat its instruction until it no longer finds a match by using a label.
The label :a is inserted in front of the substitution, and the jump back is done with ta, which branches to the label whenever the substitution succeeded. The command becomes
sed -r ':a;s/(=.*),(.*,.*=)/\1?\2/;ta' inputfile
When you have more than 2 equal signs, you must know where to look. This command works between the first and last equal sign:
echo '1,a=2,b,b,b,=3,c=Only, this part, should have, the commas, except this one, replaced=5,e,e'|
sed -r ':a;s/(=.*),(.*,.*=)/\1?\2/;ta'
1,a=2?b?b?b?=3?c=Only? this part? should have? the commas? except this one, replaced=5,e,e
When you only want the replacements done between the last 2 equal signs, you need to replace the wildcard . with [^=] (anything except an equal sign), which gives an even harder to read command:
echo '1,a=2,b,b,b,=3,c=Only, this part, should have, the commas, except this one, replaced=5,e,e'|
sed -r ':a;s/(=[^=]*),([^=]*,[^=]*=)([^=]*)$/\1?\2\3/;ta'
1,a=2,b,b,b,=3,c=Only? this part? should have? the commas? except this one, replaced=5,e,e
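As a check, the looping command can be run against the exact line from the question. This sketch writes the label, the substitution and the branch as separate -e expressions and uses -E instead of -r; that spelling is my assumption for portability, while the logic is the same as in the one-liner above:

```shell
# The line from the question.
line='STREET:1:1=Zwaneweg 23, Box 0001, PIN002,TOWN.COUNTRY:1:1=BE/Schilde'

# :a  defines a label; the s command replaces the second-to-last comma
# between the equal signs; ta jumps back to :a after each successful
# substitution, so the loop ends when only one comma is left.
result=$(printf '%s\n' "$line" |
  sed -E -e ':a' -e 's/(=.*),(.*,.*=)/\1?\2/' -e 'ta')

printf '%s\n' "$result"
```

The first pass turns the comma after "Box 0001" into ?, the second pass handles the one after "Zwaneweg 23", and the third pass finds only one comma left and stops.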

sed: remove digits after word

I have a simple sed question.
I have data like this:
2600,Sale,"Approved 911973",244.72
2601,Sale,"Approved 04735C",490.51
2602,Sale,"Approved 581068",52.82
2603,Sale,"Approved 009275",88.10
How do I make it like this:
2600,Sale,Approved,244.72
2601,Sale,Approved,490.51
2602,Sale,Approved,52.82
2603,Sale,Approved,88.10
Notice the numbers after approved are gone as well as the quotes. I can remove quotes with:
sed 's/,$//gn' file
but I don't know how to remove the spaces and digits.
Thanks!

sed "s/\"Approved[^,]*/Approved/g"
It finds the opening quote and Approved, followed by any run of non-comma characters up to the next comma (which includes the closing quote), and replaces all of that with Approved (no quotes).
2600,Sale,Approved,244.72
2601,Sale,Approved,490.51
2602,Sale,Approved,52.82
2603,Sale,Approved,88.10

Using extended regex with sed:
sed -r 's/"([^[:space:]]*)[^"]*"/\1/g' file
The above regex targets any quoted string. If you want to target only the string Approved, then:
sed -r 's/"(Approved)[^"]*"/\1/g' file
With basic regex:
sed 's/"\(Approved\)[^"]*"/\1/g' file
To target any quoted string, change Approved to [^[:space:]]*
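A short run of the basic-regex variant against two of the rows from the question, as a sanity check (the variable names are just for this sketch):

```shell
# Two rows copied from the question's sample data.
sample='2600,Sale,"Approved 911973",244.72
2601,Sale,"Approved 04735C",490.51'

# Keep the captured word Approved and drop the quotes plus everything
# between the word and the closing quote.
cleaned=$(printf '%s\n' "$sample" | sed 's/"\(Approved\)[^"]*"/\1/g')

printf '%s\n' "$cleaned"
```

The second row shows why matching on [^"]* is safer than matching digits only: the code 04735C contains a letter.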

One way using awk (only if the other columns do not contain multiple words, as in your sample):
awk -F'[ ,]' '{gsub(/"/,""); print $1,$2,$3,$5}' OFS=, file

awk -F'[," ]' '{OFS=","; print $1,$2,$4,$7}' file
Output:
2600,Sale,Approved,244.72
2601,Sale,Approved,490.51
2602,Sale,Approved,52.82
2603,Sale,Approved,88.10
This assumes the data contains no other whitespace.

Unix replace a particular type of string in file through unix

I have a situation that I need to replace a particular type of string in a file.
Scenario is:
user input like this:
abc = 21
xyz=32;34;35
The user can enter any number of values for xyz, but they must be separated by ";".
Now I need to substitute these values into a particular file, say test.txt.
This file has a format like this:
test.txt
cond0=abc
cond1=xyz
cond2=abcxyz%
hence output should be like this
cond0=21
cond1=32;34;35
cond2=2132%;2134%;2135%
I am using the commands below, but they do not produce the right output for cond2:
sed "s/abc/${abc}/g" "$TEST_DIR/$file" > "$TEST_DIR/$file.bak" && mv "$TEST_DIR/$file.bak" "$TEST_DIR/$file"
sed "s/xyz/${xyz}/g" "$TEST_DIR/$file" > "$TEST_DIR/$file.bak" && mv "$TEST_DIR/$file.bak" "$TEST_DIR/$file"
Can anyone have a look at this?

Why not pipe the file through all of your substitutions?
sed "s/abc/$abc/g" <"$TEST_DIR/$file" | sed "s/xyz/$xyz/g" >"$TEST_DIR/newfile"
mv "$TEST_DIR/newfile" "$TEST_DIR/$file"
Note that this has to be done in two steps: write to a temporary file, then rename it. Otherwise you will end up wiping out the file.
The input "<" and output ">" redirections are handled by the shell, and the moment it sees ">somefile", "somefile" is truncated. So you can never do cat <file >file successfully.
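The write-then-rename step can be wrapped in a small function. This is only a sketch: replace_in_file is a hypothetical helper name, and it assumes mktemp is available (it is not required by POSIX, though most systems ship it) and that neither argument contains a "/" or other characters special to sed:

```shell
# replace_in_file FILE FROM TO  (hypothetical helper)
# Writes the sed output to a temporary file first and only then renames it
# over the original, so the original is never truncated while being read.
replace_in_file() {
  file=$1 from=$2 to=$3
  tmp=$(mktemp) || return 1
  if sed "s/$from/$to/g" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

It could then be called as replace_in_file "$TEST_DIR/$file" abc "$abc". Like the piped version, it does not by itself produce the cond2 line the asker wants.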

Using bash,
abc=21
xyz='32;34;35'
abcxyz=$(sed -r "s/^|;/\0${abc}/g;s/;|$/%\0/g" <<< "${xyz}")
sed -i~ "s/abcxyz%/${abcxyz}/;s/abc/${abc}/;s/xyz/${xyz}/" inputFile
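To make the first sed line easier to follow, here is the same computation split into its two steps. This sketch uses & instead of \0 and a pipe instead of the <<< here-string; both are my substitutions, the substitutions themselves are unchanged:

```shell
abc=21
xyz='32;34;35'

# Step 1: put the abc value at the start of the string and after every ";".
step1=$(printf '%s\n' "$xyz" | sed -E "s/^|;/&${abc}/g")

# Step 2: put a "%" before every ";" and at the end of the string.
abcxyz=$(printf '%s\n' "$step1" | sed -E 's/;|$/%&/g')

printf '%s\n%s\n' "$step1" "$abcxyz"
```

The intermediate value is 2132;2134;2135 and the final one is 2132%;2134%;2135%, which is what the second sed line substitutes for abcxyz% before it handles the plain abc and xyz placeholders.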
