Combine two sed commands in one line - unix

I'm looking for an in place command to change all file lines which end with :.:.
From
chr01 1453173 . C T 655.85 PASS . GT:AD:DP:PGT:PID 0/1:25,29:54:.:.
To
chr01 1453173 . C T 655.85 PASS . GT:AD:DP 0/1:25,29:54
In words, I'm basically deleting :PGT:PID and :.:. from any line ending with :.:.

With GNU sed and Solaris sed:
sed '/:\.:\.$/{s///;s/:PGT:PID//;}' file
If you want to edit your file with GNU sed "in place" use option -i.
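As a quick check of the one-liner above, here is a minimal sketch that runs it on two sample lines without -i. The second input line and the temporary file are made up for illustration. The pattern includes the leading colon (:PGT:PID) so that no stray colon is left behind.

```shell
# Hypothetical sample: one line ending in ":.:." and one that does not.
tmp=$(mktemp)
printf '%s\n' \
  'chr01 1453173 . C T 655.85 PASS . GT:AD:DP:PGT:PID 0/1:25,29:54:.:.' \
  'chr01 1453200 . G A 100.00 PASS . GT:AD:DP:PGT:PID 0/1:10,12:22:0|1:1453173_C_T' > "$tmp"

# Only lines ending in ":.:." are edited; the empty s/// reuses the address regex.
result=$(sed '/:\.:\.$/{s///;s/:PGT:PID//;}' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The first line should come out as in the question's "To" example, and the second line should be printed unchanged.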

With awk that'd be:
awk 'sub(/:\.:\.$/,""){sub(/:PGT:PID/,"")} 1' file
chr01 1453173 . C T 655.85 PASS . GT:AD:DP 0/1:25,29:54
and for inplace editing with gawk you could add the -i inplace option while with any awk you can just add > tmp && mv tmp file.

You are looking for something like:
sed -i.bak "/^.*:\.:\.$/ {s/:PGT:PID//g; s/:\.:\.//g;}" file
It edits file in place and creates a backup as file.bak.
The /^.*:\.:\.$/ address restricts the s commands to lines ending in :.:. The dots need escaping because . is a special character in regexes.
The s commands replace the matched strings with the empty string.

As a one-liner:
sed -i -e '/:\.:\.$/!b' -e 's///' -e 's/:PGT:PID//g' "$file"
Expanded:
/:\.:\.$/!b # branch to the end of the script (line left untouched) unless it ends in ":.:."
s/// # replace the matched ":.:." with nothing
s/:PGT:PID//g # replace ":PGT:PID" with nothing, everywhere
We use the -i flag to perform in-place edits. We pass each line as a separate -e expression for portability: some sed implementations allow several commands to be concatenated with ;, but that is not required by the POSIX standard.
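When guarding a multi-expression script like this, the choice between b and n matters. A bare b jumps to the end of the script, so the remaining commands are skipped for that line. n prints the line, loads the next one, and then carries on with the remaining commands on that next line. The sketch below compares the two on made-up input (the lines a, b, c are hypothetical).

```shell
tmp=$(mktemp)
printf '%s\n' \
  'a GT:AD:DP:PGT:PID 0/1:1,2:3:0|1:x' \
  'b GT:AD:DP:PGT:PID 0/1:1,2:3:0|1:y' \
  'c GT:AD:DP:PGT:PID 0/1:25,29:54:.:.' > "$tmp"

# With !b, lines that do not end in ":.:." skip the rest of the script.
with_b=$(sed -e '/:\.:\.$/!b' -e 's///' -e 's/:PGT:PID//g' "$tmp")

# With !n, line "a" is printed, line "b" is loaded, and the later s commands run on it.
with_n=$(sed -e '/:\.:\.$/!n' -e 's///' -e 's/:PGT:PID//g' "$tmp")

printf 'with b:\n%s\n\nwith n:\n%s\n' "$with_b" "$with_n"
rm -f "$tmp"
```

In the n variant, line "b" loses its :PGT:PID field even though it does not end in :.:.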

Related

Unix: Remove the filename, get only the extension and rename the extension using sed

I have a lot of filenames in this format:
118-edorf.sum.fil
118-edorf.sum.fil_1
118-edorf.sum.fil_11
I want to remove 118-edorf.sum. from each filename, keep only the extension (fil, fil_1 and fil_11), and rename it to asc, asc_1 and asc_11.
So far, I can only remove 118-edorf.sum. using
sed 's/.*\.//'
The result will be
fil
fil_1
fil_11
So, how to rename it to
asc
asc_1
asc_11
Just add one more substitution to your solution:
sed 's/^.*\.//; s/fil/asc/'
To rename all files in the directory using that rule:
rename 's/^.*\.//; s/fil/asc/' *
note: this rename command is untested
You can try this zsh foreach loop. Foreach can be somewhat slow for many files. You can also remove the echo statements for less noisy output.
foreach C (118-edorf.sum.fil*)
# f2 is the part after the underscore, or empty if there is none
f2=`echo $C | cut -d "." -f 3 | cut -s -d "_" -f 2`
if [ -z "${f2}" ]; then
echo "no underscore"
mv "$C" asc
else
echo "_${f2}"
mv "$C" "asc_${f2}"
fi
end
This might work for you (GNU sed):
sed -r 's/.*\.([^_]*(.*))/mv \1 asc\2/e' file
Use an evaluated substitution command. The substitution removes everything up to the last . and retains everything thereafter as the first backreference. Everything from the first _ onwards within that backreference is also retained as the second backreference. The RHS of the substitution command then forms a mv command using the parts from the LHS.
If your sed does not have the e command/flag, use:
sed 's/.*\.\([^_]*\(.*\)\)/mv \1 asc\2/' file | shell
It might be safer to use:
sed -r 's/.*\.(fil(.*))/mv \1 asc\2/e' file
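If you would rather not generate mv commands with sed, a plain shell loop with parameter expansion is another option. This is only a sketch: it works in a scratch directory created with mktemp -d, and the three file names are the ones from the question.

```shell
# Scratch directory with the three example names from the question.
dir=$(mktemp -d)
touch "$dir/118-edorf.sum.fil" "$dir/118-edorf.sum.fil_1" "$dir/118-edorf.sum.fil_11"

for f in "$dir"/*.fil*; do
  ext=${f##*.}        # strip everything through the last dot: fil, fil_1, fil_11
  new=asc${ext#fil}   # swap the leading "fil" for "asc"
  mv -- "$f" "$dir/$new"
done

# List the resulting names on one line.
renamed=$(cd "$dir" && echo *)
echo "$renamed"
rm -rf "$dir"
```

Because every file ends up with the bare name asc, asc_1 and so on, two source files with the same extension would overwrite each other, so check for collisions before running this on real data.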

Replacing all occurrences of a string within a file

I'm producing many files with numeric fields. All the fields end with a '.', like 1234.
I need to replace all occurrences of '.,' with ',' in all files.
Assuming all the files are in the same directory, and you're using a new-ish GNU sed that supports the -i (in-place) option, you can do
cd /path/to/dataDir
for f in * ; do
sed -i 's/\([0-9]\)\.,/\1,/g' "$f"
done
If you're using Mac OS X, you can either supply a file extension to the -i option like
sed -i".bak" ....
or indicate "overwrite existing" by passing an empty extension as a separate argument:
sed -i "" ....
If you're in a vendor Unix environment, you may need to manage the output yourself. Then you can replace the body of the loop with
sed 's/\([0-9]\)\.,/\1,/g' "$f" > "$f".new && /bin/mv "$f".new "$f"
IHTH
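To see the substitution on its own, here is a small sketch using the temporary-file approach, which does not depend on -i. The sample line is invented for illustration. Note that a trailing "." with no comma after it is left alone.

```shell
f=$(mktemp)
printf '%s\n' '1234.,56.,78.' > "$f"

# Replace "<digit>.," with "<digit>," and write back via a temporary copy.
sed 's/\([0-9]\)\.,/\1,/g' "$f" > "$f.new" && mv "$f.new" "$f"

fixed=$(cat "$f")
echo "$fixed"
rm -f "$f"
```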

How to remove blank lines from a Unix file

I need to remove all the blank lines from an input file and write into an output file. Here is my data as below.
11216,33,1032747,64310,1,0,0,1.878,0,0,0,1,1,1.087,5,1,1,18-JAN-13,000603221321
11216,33,1033196,31300,1,0,0,1.5391,0,0,0,1,1,1.054,5,1,1,18-JAN-13,059762153003
11216,33,1033246,31300,1,0,0,1.5391,0,0,0,1,1,1.054,5,1,1,18-JAN-13,000603211032
11216,33,1033280,31118,1,0,0,1.5513,0,0,0,1,1,1.115,5,1,1,18-JAN-13,055111034001
11216,33,1033287,31118,1,0,0,1.5513,0,0,0,1,1,1.115,5,1,1,18-JAN-13,000378689701
11216,33,1033358,31118,1,0,0,1.5513,0,0,0,1,1,1.115,5,1,1,18-JAN-13,000093737301
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802041926
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802041954
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802049326
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802049383
11216,33,1036985,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000093415580
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781202001
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781261305
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781603955
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781615746
sed -i '/^$/d' foo
This tells sed to delete every line matching the regex ^$ i.e. every empty line. The -i flag edits the file in-place, if your sed doesn't support that you can write the output to a temporary file and replace the original:
sed '/^$/d' foo > foo.tmp
mv foo.tmp foo
If you also want to remove lines consisting only of whitespace (not just empty lines) then use:
sed -i '/^[[:space:]]*$/d' foo
Edit: also remove whitespace at the end of lines, because apparently you've decided you need that too:
sed -i '/^[[:space:]]*$/d;s/[[:space:]]*$//' foo
awk 'NF' filename
awk 'NF > 0' filename
sed -i '/^$/d' filename
awk '!/^$/' filename
awk '/./' filename
The NF also removes lines containing only blanks or tabs, the regex /^$/ does not.
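The difference is easy to see on a small made-up input with one truly empty line and one line holding only a space and a tab. The sketch below counts the lines that survive each filter.

```shell
tmp=$(mktemp)
# Four lines: "a", an empty line, a line with a space and a tab, "b".
printf 'a\n\n \t\nb\n' > "$tmp"

# /^$/ deletes only the truly empty line.
strict=$(sed '/^$/d' "$tmp" | awk 'END { print NR }')

# NF is 0 for both the empty line and the whitespace-only line.
loose=$(awk 'NF' "$tmp" | awk 'END { print NR }')

echo "sed /^\$/d keeps $strict lines, awk NF keeps $loose lines"
rm -f "$tmp"
```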
Use grep to match any line that has nothing between the start anchor (^) and the end anchor ($):
grep -v '^$' infile.txt > outfile.txt
If you want to remove lines with only whitespace, you can still use grep. I am using Perl regular expressions in this example, but there are other ways:
grep -P -v '^\s*$' infile.txt > outfile.txt
or, without Perl regular expressions:
grep -v '^[[:space:]]*$' infile.txt > outfile.txt
sed -e '/^ *$/d' input > output
Deletes all lines which consist only of blanks (or are completely empty). You can change the blank to [ \t] where the \t is a representation for tab. Whether your shell or your sed will do the expansion varies, but you can probably type the tab character directly. And if you're using GNU or BSD sed, you can do the edit in-place, if that's what you want, with the -i option.
If I execute the above command still I have blank lines in my output file. What could be the reason?
There could be several reasons. It might be that you don't have blank lines but you have lots of spaces at the end of a line so it looks like you have blank lines when you cat the file to the screen. If that's the problem, then:
sed -e 's/ *$//' -e '/^ *$/d' input > output
The new regex removes repeated blanks at the end of the line; see previous discussion for blanks or tabs.
Another possibility is that your data file came from Windows and has CRLF line endings. Unix sees the carriage return at the end of the line; it isn't a blank, so the line is not removed. There are multiple ways to deal with that. A reliable one is tr to delete (-d) character code octal 15, aka control-M or \r or carriage return:
tr -d '\015' < input | sed -e 's/ *$//' -e '/^ *$/d' > output
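A small sketch of the CRLF case, on a made-up three-line file: the "blank" middle line actually contains a carriage return, so the plain sed leaves it in place until tr strips the \r characters.

```shell
tmp=$(mktemp)
# Windows-style line endings; the middle line holds only a carriage return.
printf 'a\r\n\r\nb\r\n' > "$tmp"

# Without tr, no line matches /^ *$/ because of the trailing \r.
before=$(sed -e '/^ *$/d' "$tmp" | awk 'END { print NR }')

# After deleting \r, the middle line is truly empty and gets removed.
after=$(tr -d '\015' < "$tmp" | sed -e 's/ *$//' -e '/^ *$/d' | awk 'END { print NR }')

echo "lines kept without tr: $before, with tr: $after"
rm -f "$tmp"
```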
If neither of those works, then you need to show a hex dump or octal dump (od -c) of the first two lines of the file, so we can see what we're up against:
head -n 2 input | od -c
Judging from the comments that sed -i does not work for you, you are not working on Linux or Mac OS X or BSD — which platform are you working on? (AIX, Solaris, HP-UX spring to mind as relatively plausible possibilities, but there are plenty of other less plausible ones too.)
You can try the POSIX named character classes such as sed -e '/^[[:space:]]*$/d'; it will probably work, but is not guaranteed. You can try it with:
echo "Hello World" | sed 's/[[:space:]][[:space:]]*/   /'
If it works, there'll be three spaces between the 'Hello' and the 'World'. If not, you'll probably get an error from sed. That might save you grief over getting tabs typed on the command line.
grep . file
grep looks at your file line-by-line; the dot . matches anything except a newline character. The output from grep is therefore all the lines that consist of something other than a single newline.
with awk
awk 'NF > 0' filename
To be thorough and remove lines even if they include spaces or tabs something like this in perl will do it:
cat file.txt | perl -lane "print if /\S/"
Of course there are the awk and sed equivalents. Best not to assume the lines are totally blank as ^$ would do.
Cheers
You can use sed's -i option to edit in-place without using a temporary file:
sed -i '/^$/d' file

find and replace from command line unix

I have a multi line text file where each line has the format
..... Game #29832: ......
I want to append the character '1' to each number on each line (which is different on every line), does anyone know of a way to do this from the command line?
Thanks
sed -i -e 's/Game #[0-9]*/&1/' file
-i is for in-place editing, and & means whatever matched from the pattern. If you don't want to overwrite the file, omit the -i flag.
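For a quick dry run without touching any file, the same substitution can be fed a single line on stdin. The sample line is the one from the question.

```shell
line='..... Game #29832: ......'

# & stands for the whole match ("Game #29832"), so "&1" appends a 1 to it.
out=$(printf '%s\n' "$line" | sed 's/Game #[0-9]*/&1/')
echo "$out"
```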
Using sed:
cat file | sed -e 's/\(Game #[0-9]*\)/\11/'
sed 's/ Game #\([0-9]*\):/ Game #\11:/' yourfile.txt
GNU awk
awk '{b=gensub(/(Game #[0-9]+)/ ,"\\11","g",$0); print b }' file

How do I replace a token with the result of `pwd` in sed?

I'm trying to do something like this:
sed 's/#REPLACE-WITH-PATH/'`pwd`'/'
Unfortunately, that errors out:
sed: -e expression #1, char 23: unknown option to `s'
Why does this happen?
You need to use a different character instead of /, eg.:
sed 's?#REPLACE-WITH-PATH?'`pwd`'?'
because / appears in the pwd output.
In sed, you can't use / directly inside the expression when it is also the delimiter; you must escape it as \/.
#!/bin/bash
dir=$(pwd)/
ls -1 | sed "s/^/${dir//\//\\/}/g"
sed 's:#REPLACE-WITH-PATH:'`pwd`':' config.ini
The problem is one of escaping the output of pwd correctly. Fortunately, as in vim, sed supports using a different delimiter character. In this case, using the colon instead of slash as a delimiter avoids the escaping problem.
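Changing the delimiter still leaves two corner cases: a path that contains the new delimiter, or an & (which sed treats as "the whole match" in the replacement). A hedged sketch that escapes those characters first is shown below. The path and the input line are made up, and | is used as the delimiter purely as an example.

```shell
# Hypothetical path with characters that are special in a sed replacement.
path='/tmp/a&b|c'

# Backslash-escape \, & and the chosen delimiter (|) in the replacement text.
esc=$(printf '%s\n' "$path" | sed 's/[\\&|]/\\&/g')

out=$(printf '%s\n' 'root=#REPLACE-WITH-PATH' | sed "s|#REPLACE-WITH-PATH|$esc|")
echo "$out"
```

For the real use case, replace the assignment to path with path=$(pwd). A path containing a newline would still need separate handling.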
Instead of fumbling around with quotes like that, you can do it like this:
#!/bin/bash
p=`pwd`
# pass the variable p to awk
awk -v p="$p" '{ gsub("#REPLACE-WITH-PATH",p) }1' file >temp
mv temp file
or just bash
p=`pwd`
while IFS= read -r line
do
line=${line/REPLACE-WITH-PATH/$p}
printf '%s\n' "$line"
done < file > temp
mv temp file
