I have this line inside a file:
ULNET-PA,client_sgcib,broker_keplersecurities
,KEPLER
I tried to get rid of that ^M (carriage return) character, so I used:
sed 's/^M//g'
However, this removes everything after the ^M:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities^M,KEPLER
[root@localhost tmp]# sed 's/^M//g' test
ULNET-PA,client_sgcib,broker_keplersecurities
What I want to obtain is:
[root@localhost tmp]# vi test
ULNET-PA,client_sgcib,broker_keplersecurities,KEPLER
Use tr:
tr -d '^M' < inputfile
(Note that the ^M character can be entered by typing Ctrl+V Ctrl+M.)
EDIT: As suggested by Glenn Jackman, if you're using bash, you could also say:
tr -d $'\r' < inputfile
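A quick way to check the tr approach without touching a real file is to build the sample line with printf. This is only a sketch: the variable name cleaned is made up for the illustration, and it relies on tr interpreting the '\r' escape itself, which POSIX tr does.

```shell
# Rebuild the line from the question, with a carriage return before ",KEPLER".
# tr deletes every carriage return; the shell passes '\r' through untouched
# and tr interprets the escape itself.
cleaned=$(printf 'ULNET-PA,client_sgcib,broker_keplersecurities\r,KEPLER\n' | tr -d '\r')

# prints ULNET-PA,client_sgcib,broker_keplersecurities,KEPLER
printf '%s\n' "$cleaned"
```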
It's still the same command:
sed -i 's/^M//g' file
When you type the command, enter ^M by pressing Ctrl+V Ctrl+M.
If you have already opened the file in vim, you can do it right there:
:%s/^M//g
Again, type ^M as Ctrl-V Ctrl-M.
You can simply use dos2unix which is available in most Unix/Linux systems. However I found the following sed command to be better as it removed ^M where dos2unix couldn't:
sed 's/\r//g' < input.txt > output.txt
Hope that helps.
Note: ^M is actually carriage return character which is represented in code as \r
What dos2unix does is most likely equivalent to:
sed 's/\r\n/\n/g' < input.txt > output.txt
It doesn't remove \r when it is not immediately followed by \n and replaces both with just \n. This fails with certain types of files like one I just tested with.
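The difference between the two substitutions can be seen on a line that has a stray carriage return in the middle as well as one at the end. The sketch below uses a made-up helper named sample and puts a literal carriage return into the variable cr, so it does not depend on sed understanding \r.

```shell
cr=$(printf '\r')   # one literal carriage return

# A CRLF-terminated line that also contains a CR in the middle.
sample() { printf 'left\rright\r\n'; }

# Remove only a CR that sits directly before the newline
# (the behaviour the answer attributes to dos2unix).
eol_only=$(sample | sed "s/${cr}\$//")

# Remove every CR, wherever it appears.
all_cr=$(sample | tr -d '\r')

# Count the carriage returns that survive each approach.
printf '%s' "$eol_only" | tr -cd '\r' | wc -c   # 1
printf '%s' "$all_cr"   | tr -cd '\r' | wc -c   # 0
```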
alias dos2unix="sed -i -e 's/'\"\$(printf '\015')\"'//g' "
Usage:
dos2unix file
If Perl is an option:
perl -i -pe 's/\r\n$/\n/g' file
-i edits the file in place (write -i.bak if you want a .bak backup of the input file)
\r = carriage return
\n = linefeed
$ = end of line
s/foo/bar/g = globally substitute "foo" with "bar"
In awk:
awk '{ sub(/\r/, "") } 1' file
If the \r is at the end of the record, sub(/\r/,"",$NF) should suffice. No need to scan the whole record.
This is a better way to achieve it:
tr -d '\015' < inputfile_name > outputfile_name
Afterwards, rename the output file to the original file name.
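The two steps (write to a new file, then rename) could be wrapped in a small function along these lines. The name strip_cr_in_place is hypothetical, and the sketch assumes mktemp is installed, which is common but not guaranteed.

```shell
strip_cr_in_place() {
    # $1 = file to clean. The temporary file is created next to it so that
    # the final mv stays on the same filesystem.
    tmp=$(mktemp "$1.XXXXXX") || return 1
    # Only replace the original if tr succeeded.
    if tr -d '\015' < "$1" > "$tmp"; then
        # Note: the result carries the permissions mktemp gave the temp file.
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

It would be called as strip_cr_in_place inputfile_name.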
I agree with @twalberg (see the comments on the accepted answer, above): dos2unix on Mac OS X covers this. Quoting man dos2unix:
To run in Mac mode use the command-line option "-c mac" or use the
commands "mac2unix" or "unix2mac"
I settled on 'mac2unix', which got rid of my '^M' entries (visible in less), introduced by an Apple 'Messages' transfer of a bash script between two Yosemite (OS X 10.10) Macs!
I installed 'dos2unix', trivially, on Mac OS X using the popular Homebrew package installer. I highly recommend it and its companion command, Cask.
This is clean and simple and it works:
sed -i 's/\r//g' file
where \r of course is the equivalent for ^M.
Simply run the following command:
sed -i -e 's/\r$//' input.file
I verified this as valid in Mac OSX Monterey.
remove any \r :
nawk 'NF+=OFS=_' FS='\r'
gawk 3 ORS= RS='\r'
remove end of line \r :
mawk2 8 RS='\r?\n'
mawk -F'\r$' NF=1
I need to remove all the blank lines from an input file and write the result to an output file. Here is my data:
11216,33,1032747,64310,1,0,0,1.878,0,0,0,1,1,1.087,5,1,1,18-JAN-13,000603221321
11216,33,1033196,31300,1,0,0,1.5391,0,0,0,1,1,1.054,5,1,1,18-JAN-13,059762153003
11216,33,1033246,31300,1,0,0,1.5391,0,0,0,1,1,1.054,5,1,1,18-JAN-13,000603211032
11216,33,1033280,31118,1,0,0,1.5513,0,0,0,1,1,1.115,5,1,1,18-JAN-13,055111034001
11216,33,1033287,31118,1,0,0,1.5513,0,0,0,1,1,1.115,5,1,1,18-JAN-13,000378689701
11216,33,1033358,31118,1,0,0,1.5513,0,0,0,1,1,1.115,5,1,1,18-JAN-13,000093737301
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802041926
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802041954
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802049326
11216,33,1035476,37340,1,0,0,1.7046,0,0,0,1,1,1.123,5,1,1,18-JAN-13,045802049383
11216,33,1036985,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000093415580
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781202001
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781261305
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781603955
11216,33,1037003,15151,1,0,0,1.4436,0,0,0,1,1,1.065,5,1,1,18-JAN-13,000781615746
sed -i '/^$/d' foo
This tells sed to delete every line matching the regex ^$ i.e. every empty line. The -i flag edits the file in-place, if your sed doesn't support that you can write the output to a temporary file and replace the original:
sed '/^$/d' foo > foo.tmp
mv foo.tmp foo
If you also want to remove lines consisting only of whitespace (not just empty lines) then use:
sed -i '/^[[:space:]]*$/d' foo
Edit: also remove whitespace at the end of lines, because apparently you've decided you need that too:
sed -i '/^[[:space:]]*$/d;s/[[:space:]]*$//' foo
awk 'NF' filename
awk 'NF > 0' filename
sed -i '/^$/d' filename
awk '!/^$/' filename
awk '/./' filename
The NF also removes lines containing only blanks or tabs, the regex /^$/ does not.
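The difference is easy to see on a small input that has one whitespace-only line and one truly empty line. In this sketch, sample is a made-up helper that prints the test data.

```shell
# Four lines: text, a line holding only a space and a tab, an empty line, text.
sample() { printf 'data\n \t\n\nmore\n'; }

# NF is 0 for both the whitespace-only line and the empty line.
sample | awk 'NF' | wc -l      # 2

# /^$/ matches only the truly empty line, so the whitespace-only line survives.
sample | awk '!/^$/' | wc -l   # 3
```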
Use grep to match any line that has nothing between the start anchor (^) and the end anchor ($):
grep -v '^$' infile.txt > outfile.txt
If you want to remove lines with only whitespace, you can still use grep. I am using Perl regular expressions in this example, but here are other ways:
grep -P -v '^\s*$' infile.txt > outfile.txt
or, without Perl regular expressions:
grep -v '^[[:space:]]*$' infile.txt > outfile.txt
sed -e '/^ *$/d' input > output
This deletes all lines which consist only of blanks (or are completely empty). You can change the blank to [ \t] where the \t is a representation for tab. Whether your shell or your sed will do the expansion varies, but you can probably type the tab character directly. And if you're using GNU or BSD sed, you can do the edit in-place, if that's what you want, with the -i option.
If I execute the above command still I have blank lines in my output file. What could be the reason?
There could be several reasons. It might be that you don't have blank lines but you have lots of spaces at the end of a line so it looks like you have blank lines when you cat the file to the screen. If that's the problem, then:
sed -e 's/ *$//' -e '/^ *$/d' input > output
The new regex removes repeated blanks at the end of the line; see previous discussion for blanks or tabs.
Another possibility is that your data file came from Windows and has CRLF line endings. Unix sees the carriage return at the end of the line; it isn't a blank, so the line is not removed. There are multiple ways to deal with that. A reliable one is tr to delete (-d) character code octal 15, aka control-M or \r or carriage return:
tr -d '\015' < input | sed -e 's/ *$//' -e '/^ *$/d' > output
If neither of those works, then you need to show a hex dump or octal dump (od -c) of the first two lines of the file, so we can see what we're up against:
head -n 2 input | od -c
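The CRLF case can be reproduced with a few lines of synthetic input. This is a sketch with a made-up sample helper: the middle line looks blank but still holds a carriage return, so ^$ does not match it until the carriage returns are removed.

```shell
# Three CRLF-terminated lines; the middle one contains nothing but the CR.
sample() { printf 'row1\r\n\r\nrow2\r\n'; }

# The CR keeps the "blank" line from matching ^$, so nothing is deleted.
sample | sed '/^$/d' | wc -l                    # 3

# Removing the carriage returns first lets the blank-line pattern match.
sample | tr -d '\015' | sed '/^ *$/d' | wc -l   # 2

# od -c makes the hidden \r characters visible.
sample | od -c
```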
Judging from the comments that sed -i does not work for you, you are not working on Linux or Mac OS X or BSD — which platform are you working on? (AIX, Solaris, HP-UX spring to mind as relatively plausible possibilities, but there are plenty of other less plausible ones too.)
You can try the POSIX named character classes such as sed -e '/^[[:space:]]*$/d'; it will probably work, but is not guaranteed. You can try it with:
echo "Hello World" | sed 's/[[:space:]][[:space:]]*/   /'
If it works, there'll be three spaces between the 'Hello' and the 'World'. If not, you'll probably get an error from sed. That might save you grief over getting tabs typed on the command line.
grep . file
grep looks at your file line-by-line; the dot . matches anything except a newline character. The output from grep is therefore all the lines that consist of something other than a single newline.
with awk
awk 'NF > 0' filename
To be thorough and remove lines even if they include spaces or tabs something like this in perl will do it:
cat file.txt | perl -lane "print if /\S/"
Of course there are the awk and sed equivalents. Best not to assume the lines are totally blank as ^$ would do.
Cheers
You can use sed's -i option to edit in place without using a temporary file:
sed -i '/^$/d' file
For grep there's a fixed string option, -F (fgrep) to turn off regex interpretation of the search string.
Is there a similar facility for sed? I couldn't find anything in the man. A recommendation of another gnu/linux tool would also be fine.
I'm using sed for the find and replace functionality: sed -i "s/abc/def/g"
Do you have to use sed? If you're writing a bash script, you can do
#!/bin/bash
pattern='abc'
replace='def'
file=/path/to/file
tmpfile="${TMPDIR:-/tmp}/$( basename "$file" ).$$"
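# IFS= keeps leading and trailing whitespace; quoting "$pattern" makes bash
# treat it as a literal string rather than a glob pattern.
while IFS= read -r line
do
    printf '%s\n' "${line//"$pattern"/$replace}"
done < "$file" > "$tmpfile" && mv "$tmpfile" "$file"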
With an older Bourne shell (such as ksh88 or POSIX sh), you may not have that cool ${var/pattern/replace} structure, but you do have ${var#pattern} and ${var%pattern}, which can be used to split the string up and then reassemble it. If you need to do that, you're in for a lot more code - but it's really not too bad.
If you're not in a shell script already, you could pretty easily make the pattern, replace, and filename parameters and just call this. :)
PS: The ${TMPDIR:-/tmp} structure uses $TMPDIR if that's set in your environment, or uses /tmp if the variable isn't set. I like to stick the PID of the current process on the end of the filename in the hopes that it'll be slightly more unique. You should probably use mktemp or similar in the "real world", but this is ok for a quick example, and the mktemp binary isn't always available.
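For shells that lack the ${var//pattern/replace} expansion, one alternative (not the method described above, just a sketch) is to let awk do the literal search with index() and substr(), which never interpret the search string as a regex. The function name replace_literal is hypothetical. The strings are passed through the environment so that awk does not process backslash escapes in them.

```shell
replace_literal() {
    # $1 = literal search string, $2 = literal replacement.
    # Reads stdin and writes the result to stdout.
    s=$1 r=$2 awk '
    BEGIN { s = ENVIRON["s"]; r = ENVIRON["r"]; n = length(s) }
    {
        line = $0
        out = ""
        # index() finds the next literal occurrence; text already copied to
        # out is never rescanned, so a replacement containing the search
        # string cannot cause an endless loop.
        while (n > 0 && (i = index(line, s)) > 0) {
            out = out substr(line, 1, i - 1) r
            line = substr(line, i + n)
        }
        print out line
    }'
}

# prints a,b,c
printf 'a.b.c\n' | replace_literal '.' ','
```

To edit a file, redirect the output to a temporary file and move it over the original, as in the script above.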
Option 1) Escape regexp characters. E.g. sed 's/\$0\.0/0/g' will replace all occurrences of $0.0 with 0.
Option 2) Use perl -p -e in conjunction with quotemeta (\Q...\E). E.g. perl -p -e 's/\Q.\E/,/g' will replace all occurrences of . with ,.
You can use option 2 in scripts like this:
SEARCH="C++"
REPLACE="C#"
cat $FILELIST | perl -p -e "s/\\Q$SEARCH\\E/$REPLACE/g" > $NEWLIST
If you're not opposed to Ruby or long lines, you could use this:
alias replace='ruby -e "File.write(ARGV[0], File.read(ARGV[0]).gsub(ARGV[1]) { ARGV[2] })"'
replace test3.txt abc def
This loads the whole file into memory, performs the replacements and saves it back to disk. Should probably not be used for massive files.
If you don't want to escape your string, you can reach your goal in 2 steps:
fgrep the line (getting the line number) you want to replace, and
afterwards use sed for replacing this line.
E.g.
#!/bin/sh
PATTERN='foo*[)*abc' # we need it literal
LINENUMBER="$( fgrep -n "$PATTERN" "$FILE" | cut -d':' -f1 )"
NEWSTRING='my new string'
sed -i "${LINENUMBER}s/.*/$NEWSTRING/" "$FILE"
You can do this in two lines of bash code if you're OK with reading the whole file into memory. This is quite flexible -- the pattern and replacement can contain newlines to match across lines if needed. It also preserves any trailing newline or lack thereof, which a simple loop with read does not.
mapfile -d '' < file
printf '%s' "${MAPFILE//"$pat"/"$rep"}" > file
For completeness, if the file can contain null bytes (\0), we need to extend the above, and it becomes
mapfile -d '' < <(cat file; printf '\0')
last=${MAPFILE[-1]}; unset "MAPFILE[-1]"
printf '%s\0' "${MAPFILE[@]//"$pat"/"$rep"}" > file
printf '%s' "${last//"$pat"/"$rep"}" >> file
perl -i.orig -pse 'while (($i = index($_,$s)) >= 0) { substr($_,$i,length($s), $r)}' -- \
-s='$_REQUEST['\'old\'']' -r='$_REQUEST['\'new\'']' sample.txt
-i.orig in-place modification with backup.
-p print lines from the input file by default
-s enable rudimentary parsing of command line arguments
-e run this script
index($_,$s) search for the $s string
substr($_,$i,length($s), $r) replace the string
while (($i = index($_,$s)) >= 0) repeat until $s is no longer found
-- end of perl parameters
-s='$_REQUEST['\'old\'']', -r='$_REQUEST['\'new\'']' - set $s,$r
You still need to "escape" ' chars but the rest should be straightforward.
Note: this started as an answer to How to pass special character string to sed hence the $_REQUEST['old'] strings, however this question is a bit more appropriately formulated.
You should be using replace, a utility distributed with MySQL, instead of sed.
From the man page:
The replace utility program changes strings in place in files or on the
standard input.
Invoke replace in one of the following ways:
shell> replace from to [from to] ... -- file_name [file_name] ...
shell> replace from to [from to] ... < file_name
from represents a string to look for and to represents its replacement.
There can be one or more pairs of strings.
I'm trying to do the opposite of this question, replacing Unix line endings with Windows line endings, so that I can use SQL Server bcp over samba to import the file. I have sed installed but not dos2unix. I tried reversing the examples but to no avail.
Here's the command I'm using.
sed -e 's/\n/\r\n/g' myfile
I executed this and then ran od -c myfile, expecting to see \r\n where there used to be \n. But they are all still \n. (Or at least they appear to be. The output of od overflows my screen buffer, so I don't get to see the beginning of the file.)
I haven't been able to figure out what I'm doing wrong. Any suggestions?
When faced with this, I use a simple perl one-liner:
perl -pi -e 's/\n/\r\n/' filename
because sed behavior varies, and I know this works.
What is the problem with getting dos2unix onto the machine?
What is the platform you are working with?
Do you have GNU sed or regular non-GNU sed?
On Solaris, /usr/bin/sed requires:
sed 's/$/^M/'
where I entered the '^M' by typing Control-V Control-M. The '$' matches at the end of the line, and the substitution appends the control-M there. You can script that, too.
Mechanisms expecting sed to expand '\r' or '\\r' to control-M are going to be platform-specific, at best.
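One way to avoid both typing the control character and relying on sed to expand \r is to let printf produce the carriage return and hand it to sed through a shell variable. This is a sketch: the names cr and unix2dos_stream are made up for the example.

```shell
cr=$(printf '\r')   # one literal carriage return

unix2dos_stream() {
    # Append a carriage return to every line read from stdin.
    # sed receives the CR as a literal byte, so no escape handling is needed.
    sed "s/\$/${cr}/"
}

# od -c shows \r \n at the end of each line.
printf 'one\ntwo\n' | unix2dos_stream | od -c
```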
You don't need the -e option.
$ matches the end of the line. This sed command will insert a \r character before the end of line:
sed 's/$/\r/' myfile
Just adding a \r (aka ^M, see Jonathan Leffler's answer) in front of \n is not safe because the file might have mixed-mode EOLs, so you risk ending up with some lines becoming \r\r\n. The safe thing to do is to first remove all '\r' characters, and then insert (a single) \r before \n.
#!/bin/sh
sed 's/^M//g' ${1+"$@"} | sed 's/$/^M/'
Updated to use ^M.
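The same remove-then-add idea can also be sketched in a single awk call, which avoids typing the control character. The function name to_crlf is hypothetical. The sample input mixes a Unix line and a DOS line to show that no line ends up with a doubled carriage return.

```shell
to_crlf() {
    # Drop every existing CR, then end each line with exactly one CR LF.
    awk '{ gsub(/\r/, ""); printf "%s\r\n", $0 }'
}

# One LF-terminated and one CRLF-terminated line; od -c shows a single \r
# before each \n in the output.
printf 'unix\ndos\r\n' | to_crlf | od -c
```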
sed 's/\([^^M]\)$/\1^M/' your_file
This makes sure you only insert a \r when there is no \r before \n. This worked for me.
Try using:
echo " this is output" > input
sed 's/$/\r/g' input |od -c
Maybe try it this way:
sed 's/$/\r/' myfile > myfile.win
As I understand it, you're only making the replacements in the console output. You need to redirect the output to a file, in this case myfile.win, which you can then rename to whatever you want. Note that s/\n/\r\n/g never matches, because sed strips the trailing newline from each line before applying the script, so the substitution is anchored on $ instead (\r in the replacement is a GNU sed feature). The whole script would be (run inside a directory full of files of this kind):
#!/bin/bash
find . -type f -print0 | while IFS= read -r -d '' file
do
    sed 's/$/\r/' "$file" > "$file.new" && mv -f "$file.new" "$file"
done