How to colorize diff on the command line - unix

When I have a diff, how can I colorize it so that it looks good?
I want it for the command line, so please no GUI solutions.

Man pages for diff suggest no solution for colorization from within itself. Please consider using colordiff. It's a wrapper around diff that produces the same output as diff, except that it augments the output using colored syntax highlighting to increase readability:
diff old new | colordiff
or just:
colordiff old new
Installation:
Ubuntu/Debian: sudo apt-get install colordiff
OS X: brew install colordiff or port install colordiff

Use Vim:
diff /path/to/a /path/to/b | vim -R -
Or better still, VimDiff (or vim -d, which is shorter to type) will show differences between two, three or four files side-by-side.
Examples:
vim -d /path/to/[ab]
vimdiff file1 file2 file3 file4

Actually there seems to be yet another option (which I only noticed recently, when running into the problem described above):
git diff --no-index <file1> <file2>
# output to console instead of opening a pager
git --no-pager diff --no-index <file1> <file2>
If you have Git around (which you already might be using anyway), then you will be able to use it for comparison, even if the files themselves are not under version control. If not enabled for you by default, then enabling color support here seems to be considerably easier than some of the previously mentioned workarounds.

diff --color option (added to GNU diffutils 3.4 in 2016-08-08)
This is the default diff implementation on most distributions, which will soon be getting it.
Ubuntu 18.04 (Bionic Beaver) has diffutils 3.6 and therefore has it.
On 3.5 it looks like this:
Tested with:
diff --color -u \
<(seq 6 | sed 's/$/ a/') \
<(seq 8 | grep -Ev '^(2|3)$' | sed 's/$/ a/')
Apparently added in commit c0fa19fe92da71404f809aafb5f51cfd99b1bee2 (Mar 2015).
Word-level diff
Like diff-highlight. It is not possible it seems, but there is a feature request: https://lists.gnu.org/archive/html/diffutils-devel/2017-01/msg00001.html
Related questions:
Using 'diff' (or anything else) to get character-level diff between text files
https://unix.stackexchange.com/questions/11128/diff-within-a-line
https://superuser.com/questions/496415/using-diff-on-a-long-one-line-file
ydiff does it though. See below.
ydiff side-by-side word level diff
https://github.com/ymattw/ydiff
Is this nirvana?
python3 -m pip install --user ydiff
diff -u a b | ydiff -s
Outcome:
If the lines are too narrow (default 80 columns), fit to the screen with:
diff -u a b | ydiff -w 0 -s
Contents of the test files:
a
1
2
3
4
5 the original line the original line the original line the original line
6
7
8
9
10
11
12
13
14
15 the original line the original line the original line the original line
16
17
18
19
20
b
1
2
3
4
5 the original line the original line the original line the original line
6
7
8
9
10
11
12
13
14
15 the original line the original line the original line the original line
16
17
18
19
20
ydiff Git integration
ydiff integrates with Git without any configuration required.
From inside a Git repository, instead of git diff, you can do just:
ydiff -s
and instead of git log:
ydiff -ls
See also: How can I get a side-by-side diff when I do "git diff"?
Tested on Ubuntu 16.04 (Xenial Xerus), Git 2.18.0, and ydiff 1.1.

And for those occasions when a yum install colordiff or an apt-get install colordiff is not an option due to some insane constraint beyond your immediate control, or you're just feeling crazy, you can reinvent the wheel with a line of sed:
sed 's/^-/\x1b[41m-/;s/^+/\x1b[42m+/;s/^#/\x1b[34m#/;s/$/\x1b[0m/'
Throw that in a shell script and pipe unified diff output through it.
It makes hunk markers blue and highlights new/old filenames and added/removed lines in green and red background, respectively.1 And it will make trailing space2 changes more readily apparent than colordiff can.
1 Incidentally, the reason for highlighting the filenames the same as the modified lines is that to correctly differentiate between the filenames and the modified lines requires properly parsing the diff format, which is not something to tackle with a regex. Highlighting them the same works "well enough" visually and makes the problem trivial. That said, there are some interesting subtleties.
2 But not trailing tabs. Apparently tabs don't get their background set, at least in my xterm. It does make tab vs. space changes stand out a bit though.

Coloured, word-level diff ouput
Here's what you can do with the the below script and diff-highlight:
#!/bin/sh -eu
# Use diff-highlight to show word-level differences
diff -U3 --minimal "$#" |
sed 's/^-/\x1b[1;31m-/;s/^+/\x1b[1;32m+/;s/^#/\x1b[1;34m#/;s/$/\x1b[0m/' |
diff-highlight
(Credit to #retracile's answer for the sed highlighting)

You can change the Subversion configuration to use colordiff:
~/.subversion/config.diff
### Set diff-cmd to the absolute path of your 'diff' program.
### This will override the compile-time default, which is to use
### Subversion's internal diff implementation.
-# diff-cmd = diff_program (diff, gdiff, etc.)
+diff-cmd = colordiff
via: https://gist.github.com/westonruter/846524

I use grc (Generic Colouriser), which allows you to colour the output of a number of commands including diff.
It is a Python script which can be wrapped around any command. So instead of invoking diff file1 file2, you would invoke grc diff file1 file2 to see colourised output. I have aliased diff to grc diff to make it easier.

Here is another solution that invokes sed to insert the appropriate ANSI escape sequences for colors to show the +, -, and # lines in red, green, and cyan, respectively.
diff -u old new | sed "s/^-/$(tput setaf 1)&/; s/^+/$(tput setaf 2)&/; s/^#/$(tput setaf 6)&/; s/$/$(tput sgr0)/"
Unlike the other solutions to this question, this solution does not spell out the ANSI escape sequences explicitly. Instead, it invokes the tput setaf and tput sgr0 commands to generate the ANSI escape sequences to set an appropriate color and reset terminal attributes, respectively.
To see the available colors for each argument to tput setaf, use this command:
for i in {0..255}; do tput setaf $i; printf %4d $i; done; tput sgr0; echo
Here is how the output looks:
Here is the evidence that the tput setaf and tput sgr0 commands generate the appropriate ANSI escape sequences:
$ tput setaf 1 | xxd -g1
00000000: 1b 5b 33 31 6d .[31m
$ tput setaf 2 | xxd -g1
00000000: 1b 5b 33 32 6d .[32m
$ tput setaf 6 | xxd -g1
00000000: 1b 5b 33 36 6d .[36m
$ tput sgr0 | xxd -g1
00000000: 1b 28 42 1b 5b 6d .(B.[m

Since wdiff accepts arguments specifying the string at the beginning and end of both insertions and deletions, you can use ANSI color sequences as those strings:
wdiff -n -w $'\033[30;41m' -x $'\033[0m' -y $'\033[30;42m' -z $'\033[0m' file1 file2
For example, this is the output of comparing two CSV files:
Example from 2.2 Actual examples of wdiff usage.

I would suggest you to give diff-so-fancy a try. I use it during my work and it sure seems great as of now. It comes packed with many options and it's really easy to configure your diffs the way you want.
You can install it by:
sudo npm install -g diff-so-fancy
or on Mac:
brew install diff-so-fancy
Afterwards, you can highlight your diffs like this:
diff -u file1 file2 | diff-so-fancy

Character-level color diff:
Install ccdiff
ccdiff -r /usr/share/dict/words /tmp/new-dict

No one has mentioned delta so far. It supports syntax colored diff view with syntax highlighting.

With the bat command:
diff file1 file2 | bat -l diff

On recent versions of Git on Ubuntu, you can enable diff-highlighting with:
sudo ln -s /usr/share/doc/git/contrib/diff-highlight/diff-highlight /usr/local/bin
sudo chmod a+x /usr/share/doc/git/contrib/diff-highlight/diff-highlight
And then adding this to your .gitconfig file:
[pager]
log = diff-highlight | less
show = diff-highlight | less
diff = diff-highlight | less
It's possible the script is located somewhere else in other distributions. You can use locate diff-highlight to find out where.

My favorite choice is vdiff <file1> <file2> function (I forgot from where I got it).
It will open two windows in Vim side-by-side, to see clearly see the difference between the two files.
vdiff () {
if [ "${#}" -ne 2 ] ; then
echo "vdiff requires two arguments"
echo " comparing dirs: vdiff dir_a dir_b"
echo " comparing files: vdiff file_a file_b"
return 1
fi
local left="${1}"
local right="${2}"
if [ -d "${left}" ] && [ -d "${right}" ]; then
vim +"DirDiff ${left} ${right}"
else
vim -d "${left}" "${right}"
fi
}
Put this script in your (.alias) or (.zshrc), and then call it using
vdiff <file1> <file2>.
Example
The results are:

diff --color=always file_a file_b | less
works for me

For me I found some solutions: it is a working solution
#echo off
Title a game for YouTube
explorer "https://thepythoncoding.blogspot.com/2020/11/how-to-echo-with-different-colors-in.html"
SETLOCAL EnableDelayedExpansion
for /F "tokens=1,2 delims=#" %%a in ('"prompt #$H#$E# & echo on & for %%b in (1) do rem"') do (
set "DEL=%%a"
)
echo say the name of the colors, don't read
call :ColorText 0a "blue"
call :ColorText 0C "green"
call :ColorText 0b "red"
echo(
call :ColorText 19 "yellow"
call :ColorText 2F "black"
call :ColorText 4e "white"
goto :Beginoffile
:ColorText
echo off
<nul set /p ".=%DEL%" > "%~2"
findstr /v /a:%1 /R "^$" "%~2" nul
del "%~2" > nul 2>&1
goto :eof
:Beginoffile

Related

grep at multiple files, then cut at those files, then cut at another file, then diff between those 2 files

What I need to do is the following 3 operations at 1 single line, if possible:
1) grep -h A1234 BATCHFILE.201509* | cut -c 38-49 > file1.out
(Note: Here I use the ':' delimiter, because there are many BATCHFILE.201509xxxx files, for which the grep returns:
BATCHFILE.2015091915: ..............
BATCHFILE.2015091918: ..............
BATCHFILE.2015091922: ..............
So, I cut for characters 38 till 49 from results after the ':'.)
2) Then I need to cut for characters 38 till 49 at a file "WORKFILE.IN":
cut -c 38-49 WORKFILE.IN > file2.out
3) Then I need to find the differences between the 2 files extracted:
diff file1.out file2.out
Thank you.
grep -h A1234 BATCHFILE.201509* | cut -c 38-49 > file1.out && cut -c 38-49 WORKFILE.IN > file2.out && diff file1.out file2.out
Well technically there is a single line. the -h option to grep will suppress the filename from being output when multiple files are searched, and the && will chain the commands so they are run only if the previous completes.
I still question the need for 1 line, though. simple shell script with the steps/variables might be better / easier to understand
diff with anonymous pipelines
diff <(grep -h A1234 BATCHFILE.201509* | cut -c 38-49) <(cut -c 38-49 WORKFILE.IN)

use gpsd or cgps to return latitude and longitude then quit

I would like a dead-simple way to query my gps location from a usb dongle from the unix command line.
Right now, I know I've got a functioning software and hardware system, as evidenced by the success of the cgps command in showing me my position. I'd now like to be able to make short requests for my gps location (lat,long in decimals) from the command line. my usb serial's path is /dev/ttyUSB0 and I'm using a Global Sat dongle that outputs generic NMEA sentences
How might I accomplish this?
Thanks
telnet 127.0.0.1 2947
?WATCH={"enable":true}
?POLL;
gives you your answer, but you still need to separate the wheat from the chaff. It also assumes the gps is not coming in from a cold start.
A short script could be called, e.g.;
#!/bin/bash
exec 2>/dev/null
# get positions
gpstmp=/tmp/gps.data
gpspipe -w -n 40 >$gpstmp"1"&
ppid=$!
sleep 10
kill -9 $ppid
cat $gpstmp"1"|grep -om1 "[-]\?[[:digit:]]\{1,3\}\.[[:digit:]]\{9\}" >$gpstmp
size=$(stat -c%s $gpstmp)
if [ $size -gt 10 ]; then
cat $gpstmp|sed -n -e 1p >/tmp/gps.lat
cat $gpstmp|sed -n -e 2p >/tmp/gps.lon
fi
rm $gpstmp $gpstmp"1"
This will cause 40 sentences to be output and then grep lat/lon to temporary files and then clean up.
Or, from GPS3 github repository place the alpha gps3.py in the same directory as, and execute, the following Python2.7-3.4 script.
from time import sleep
import gps3
the_connection = gps3.GPSDSocket()
the_fix = gps3.DataStream()
try:
for new_data in the_connection:
if new_data:
the_fix.refresh(new_data)
if not isinstance(the_fix.TPV['lat'], str): # check for valid data
speed = the_fix.TPV['speed']
latitude = the_fix.TPV['lat']
longitude = the_fix.TPV['lon']
altitude = the_fix.TPV['alt']
print('Latitude:', latitude, 'Longitude:', longitude)
sleep(1)
except KeyboardInterrupt:
the_connection.close()
print("\nTerminated by user\nGood Bye.\n")
If you want it to close after one iteration also import sys and then replace sleep(1) with sys.exit()
much easier solution:
$ gpspipe -w -n 10 | grep -m 1 lon
{"class":"TPV","device":"tcp://localhost:4352","mode":2,"lat":11.1111110000,"lon":22.222222222}
source
You can use my script : gps.sh return "x,y"
#!/bin/bash
x=$(gpspipe -w -n 10 |grep lon|tail -n1|cut -d":" -f9|cut -d"," -f1)
y=$(gpspipe -w -n 10 |grep lon|tail -n1|cut -d":" -f10|cut -d"," -f1)
echo "$x,$y"
sh gps.sh
43.xx4092000,6.xx1269167
Putting a few of the bits of different answers together with a bit more jq work, I like this version:
$ gpspipe -w -n 10 | grep -m 1 TPV | jq -r '[.lat, .lon] | #csv'
40.xxxxxx054,-79.yyyyyy367
Explanation:
(1) use grep -m 1 after invoking gpspipe, as used by #eadmaster's answer, because the grep will exit as soon as the first match is found. This gets you results faster instead of having to wait for 10 lines (or using two invocations of gpspipe).
(2) use jq to extract both fields simultaneously; the #csv formatter is more readable. Note the use of jq -r (raw output), so that the output is not put in quotes. Otherwise the output would be "40.xxxx,-79.xxxx" - which might be fine or better for some applications.
(3) Search for the TPV field by name for clarity. This is the "time, position, velocity" record, which is the one we want for extracting the current lat & lon. Just searching for "lat" or "lon" risks getting confused by the GST object that some GPSes may supply, and in that object, 'lat' and 'lon' are the standard deviation of the position error, not the position itself.
Improving on eadmaster's answer here is a more elegant solution:
gpspipe -w -n 10 | jq -r '.lon' | grep "[[:digit:]]" | tail -1
Explanation:
Ask from gpsd 10 times the data
Parse the received JSONs using jq
We want only numeric values, so filter using grep
We want the last received value, so use tail for that
Example:
$ gpspipe -w -n 10 | jq -r '.lon' | grep "[[:digit:]]" | tail -1
28.853181286

Generate a random filename in unix shell

I would like to generate a random filename in unix shell (say tcshell). The filename should consist of random 32 hex letters, e.g.:
c7fdfc8f409c548a10a0a89a791417c5
(to which I will add whatever is neccesary). The point is being able to do it only in shell without resorting to a program.
Assuming you are on a linux, the following should work:
cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32
This is only pseudo-random if your system runs low on entropy, but is (on linux) guaranteed to terminate. If you require genuinely random data, cat /dev/random instead of /dev/urandom. This change will make your code block until enough entropy is available to produce truly random output, so it might slow down your code. For most uses, the output of /dev/urandom is sufficiently random.
If you on OS X or another BSD, you need to modify it to the following:
cat /dev/urandom | env LC_CTYPE=C tr -cd 'a-f0-9' | head -c 32
why do not use unix mktemp command:
$ TMPFILE=`mktemp tmp.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX` && echo $TMPFILE
tmp.MnxEsPDsNUjrzDIiPhnWZKmlAXAO8983
One command, no pipe, no loop:
hexdump -n 16 -v -e '/1 "%02X"' -e '/16 "\n"' /dev/urandom
If you don't need the newline, for example when you're using it in a variable:
hexdump -n 16 -v -e '/1 "%02X"' /dev/urandom
Using "16" generates 32 hex digits.
uuidgen generates exactly this, except you have to remove hyphens. So I found this to be the most elegant (at least to me) way of achieving this. It should work on linux and OS X out of the box.
uuidgen | tr -d '-'
As you probably noticed from each of the answers, you generally have to "resort to a program".
However, without using any external executables, in Bash and ksh:
string=''; for i in {0..31}; do string+=$(printf "%x" $(($RANDOM%16)) ); done; echo $string
in zsh:
string=''; for i in {0..31}; do string+=$(printf "%x" $(($RANDOM%16)) ); dummy=$RANDOM; done; echo $string
Change the lower case x in the format string to an upper case X to make the alphabetic hex characters upper case.
Here's another way to do it in Bash but without an explicit loop:
printf -v string '%X' $(printf '%.2s ' $((RANDOM%16))' '{00..31})
In the following, "first" and "second" printf refers to the order in which they're executed rather than the order in which they appear in the line.
This technique uses brace expansion to produce a list of 32 random numbers mod 16 each followed by a space and one of the numbers in the range in braces followed by another space (e.g. 11 00). For each element of that list, the first printf strips off all but the first two characters using its format string (%.2) leaving either single digits followed by a space each or two digits. The space in the format string ensures that there is then at least one space between each output number.
The command substitution containing the first printf is not quoted so that word splitting is performed and each number goes to the second printf as a separate argument. There, the numbers are converted to hex by the %X format string and they are appended to each other without spaces (since there aren't any in the format string) and the result is stored in the variable named string.
When printf receives more arguments than its format string accounts for, the format is applied to each argument in turn until they are all consumed. If there are fewer arguments, the unmatched format string (portion) is ignored, but that doesn't apply in this case.
I tested it in Bash 3.2, 4.4 and 5.0-alpha. But it doesn't work in zsh (5.2) or ksh (93u+) because RANDOM only gets evaluated once in the brace expansion in those shells.
Note that because of using the mod operator on a value that ranges from 0 to 32767 the distribution of digits using the snippets could be skewed (not to mention the fact that the numbers are pseudo random in the first place). However, since we're using mod 16 and 32768 is divisible by 16, that won't be a problem here.
In any case, the correct way to do this is using mktemp as in Oleg Razgulyaev's answer.
Tested in zsh, should work with any BASH compatible shell!
#!/bin/zsh
SUM=`md5sum <<EOF
$RANDOM
EOF`
FN=`echo $SUM | awk '// { print $1 }'`
echo "Your new filename: $FN"
Example:
$ zsh ranhash.sh
Your new filename: 2485938240bf200c26bb356bbbb0fa32
$ zsh ranhash.sh
Your new filename: ad25cb21bea35eba879bf3fc12581cc9
Yet another way[tm].
R=$(echo $RANDOM $RANDOM $RANDOM $RANDOM $RANDOM | md5 | cut -c -8)
FILENAME="abcdef-$R"
This answer is very similar to fmarks, so I cannot really take credit for it, but I found the cat and tr command combinations quite slow, and I found this version quite a bit faster. You need hexdump.
hexdump -e '/1 "%02x"' -n32 < /dev/urandom
Another thing you can add is running the date command as follows:
date +%S%N
Reads nonosecond time and the result adds a lot of randomness.
The first answer is good but why fork cat if not required.
tr -dc 'a-f0-9' < /dev/urandom | head -c32
Grab 16 bytes from /dev/random, convert them to hex, take the first line, remove the address, remove the spaces.
head /dev/random -c16 | od -tx1 -w16 | head -n1 | cut -d' ' -f2- | tr -d ' '
Assuming that "without resorting to a program" means "using only programs that are readily available", of course.
If you have openssl in your system you can use it for generating random hex (also it can be -base64) strings with defined length. I found it pretty simple and usable in cron in one line jobs.
openssl rand -hex 32
8c5a7515837d7f0b19e7e6fa4c448400e70ffec88ecd811a3dce3272947cb452
Hope to add a (maybe) better solution to this topic.
Notice: this only works with bash4 and some implement of mktemp(for example, the GNU one)
Try this
fn=$(mktemp -u -t 'XXXXXX')
echo ${fn/\/tmp\//}
This one is twice as faster as head /dev/urandom | tr -cd 'a-f0-9' | head -c 32, and eight times as faster as cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32.
Benchmark:
With mktemp:
#!/bin/bash
# a.sh
for (( i = 0; i < 1000; i++ ))
do
fn=$(mktemp -u -t 'XXXXXX')
echo ${fn/\/tmp\//} > /dev/null
done
time ./a.sh
./a.sh 0.36s user 1.97s system 99% cpu 2.333 total
And the other:
#!/bin/bash
# b.sh
for (( i = 0; i < 1000; i++ ))
do
cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c 32 > /dev/null
done
time ./b.sh
./b.sh 0.52s user 20.61s system 113% cpu 18.653 total
If you are on Linux, then Python will come pre-installed. So you can go for something similar to the below:
python -c "import uuid; print str(uuid.uuid1())"
If you don't like the dashes, then use replace function as shown below
python -c "import uuid; print str(uuid.uuid1()).replace('-','')"

Remove carriage return in Unix

What is the simplest way to remove all the carriage returns \r from a file in Unix?
I'm going to assume you mean carriage returns (CR, "\r", 0x0d) at the ends of lines rather than just blindly within a file (you may have them in the middle of strings for all I know). Using this test file with a CR at the end of the first line only:
$ cat infile
hello
goodbye
$ cat infile | od -c
0000000 h e l l o \r \n g o o d b y e \n
0000017
dos2unix is the way to go if it's installed on your system:
$ cat infile | dos2unix -U | od -c
0000000 h e l l o \n g o o d b y e \n
0000016
If for some reason dos2unix is not available to you, then sed will do it:
$ cat infile | sed 's/\r$//' | od -c
0000000 h e l l o \n g o o d b y e \n
0000016
If for some reason sed is not available to you, then ed will do it, in a complicated way:
$ echo ',s/\r\n/\n/
> w !cat
> Q' | ed infile 2>/dev/null | od -c
0000000 h e l l o \n g o o d b y e \n
0000016
If you don't have any of those tools installed on your box, you've got bigger problems than trying to convert files :-)
tr -d '\r' < infile > outfile
See tr(1)
The simplest way on Linux is, in my humble opinion,
sed -i.bak 's/\r$//g' <filename>
-i will edit the file in place, while the .bak will create a backup of the original file by making a copy of your file and adding the extension .bak at the end. (You can specify what ever you want after the -i, or specify only -i to not create a backup.)
The strong quotes around the substitution operator 's/\r//' are essential. Without them the shell will interpret \r as an escape+r and reduce it to a plain r, and remove all lower case r. That's why the answer given above in 2009 by Rob doesn't work.
And adding the /g modifier ensures that even multiple \r will be removed, and not only the first one.
Old School:
tr -d '\r' < filewithcarriagereturns > filewithoutcarriagereturns
There's a utility called dos2unix that exists on many systems, and can be easily installed on most.
sed -i s/\r// <filename> or somesuch; see man sed or the wealth of information available on the web regarding use of sed.
One thing to point out is the precise meaning of "carriage return" in the above; if you truly mean the single control character "carriage return", then the pattern above is correct. If you meant, more generally, CRLF (carriage return and a line feed, which is how line feeds are implemented under Windows), then you probably want to replace \r\n instead. Bare line feeds (newline) in Linux/Unix are \n.
If you are a Vi user, you may open the file and remove the carriage return with:
:%s/\r//g
or with
:1,$ s/^M//
Note that you should type ^M by pressing ctrl-v and then ctrl-m.
Someone else recommend dos2unix and I strongly recommend it as well. I'm just providing more details.
If installed, jump to the next step. If not already installed, I would recommend installing it via yum like:
yum install dos2unix
Then you can use it like:
dos2unix fileIWantToRemoveWindowsReturnsFrom.txt
Once more a solution... Because there's always one more:
perl -i -pe 's/\r//' filename
It's nice because it's in place and works in every flavor of unix/linux I've worked with.
Removing \r on any UNIX® system:
Most existing solutions in this question are GNU-specific, and wouldn't work on OS X or BSD; the solutions below should work on many more UNIX systems, and in any shell, from tcsh to sh, yet still work even on GNU/Linux, too.
Tested on OS X, OpenBSD and NetBSD in tcsh, and on Debian GNU/Linux in bash.
With sed:
In tcsh on an OS X, the following sed snippet could be used together with printf, as neither sed nor echo handle \r in the special way like the GNU does:
sed `printf 's/\r$//g'` input > output
With tr:
Another option is tr:
tr -d '\r' < input > output
Difference between sed and tr:
It would appear that tr preserves a lack of a trailing newline from the input file, whereas sed on OS X and NetBSD (but not on OpenBSD or GNU/Linux) inserts a trailing newline at the very end of the file even if the input is missing any trailing \r or \n at the very end of the file.
Testing:
Here's some sample testing that could be used to ensure this works on your system, using printf and hexdump -C; alternatively, od -c could also be used if your system is missing hexdump:
% printf 'a\r\nb\r\nc' | hexdump -C
00000000 61 0d 0a 62 0d 0a 63 |a..b..c|
00000007
% printf 'a\r\nb\r\nc' | ( sed `printf 's/\r$//g'` /dev/stdin > /dev/stdout ) | hexdump -C
00000000 61 0a 62 0a 63 0a |a.b.c.|
00000006
% printf 'a\r\nb\r\nc' | ( tr -d '\r' < /dev/stdin > /dev/stdout ) | hexdump -C
00000000 61 0a 62 0a 63 |a.b.c|
00000005
%
If you're using an OS (like OS X) that doesn't have the dos2unix command but does have a Python interpreter (version 2.5+), this command is equivalent to the dos2unix command:
python -c "import sys; import fileinput; sys.stdout.writelines(line.replace('\r', '\n') for line in fileinput.input(mode='rU'))"
This handles both named files on the command line as well as pipes and redirects, just like dos2unix. If you add this line to your ~/.bashrc file (or equivalent profile file for other shells):
alias dos2unix="python -c \"import sys; import fileinput; sys.stdout.writelines(line.replace('\r', '\n') for line in fileinput.input(mode='rU'))\""
... the next time you log in (or run source ~/.bashrc in the current session) you will be able to use the dos2unix name on the command line in the same manner as in the other examples.
you can simply do this :
$ echo $(cat input) > output
Here is the thing,
%0d is the carriage return character. To make it compatabile with Unix. We need to use the below command.
dos2unix fileName.extension fileName.extension
try this to convert dos file into unix file:
fromdos file
For UNIX... I've noticed dos2unix removed Unicode headers form my UTF-8 file. Under git bash (Windows), the following script seems to work nicely. It uses sed. Note it only removes carriage-returns at the ends of lines, and preserves Unicode headers.
#!/bin/bash
inOutFile="$1"
backupFile="${inOutFile}~"
mv --verbose "$inOutFile" "$backupFile"
sed -e 's/\015$//g' <"$backupFile" >"$inOutFile"
If you are running an X environment and have a proper editor (visual studio code), then I would follow the reccomendation:
Visual Studio Code: How to show line endings
Just go to the bottom right corner of your screen, visual studio code will show you both the file encoding and the end of line convention followed by the file, an just with a simple click you can switch that around.
Just use visual code as your replacement for notepad++ on a linux environment and you are set to go.
Using sed
sed $'s/\r//' infile > outfile
Using sed on Git Bash for Windows
sed '' infile > outfile
The first version uses ANSI-C quoting and may require escaping \ if the command runs from a script. The second version exploits the fact that sed reads the input file line by line by removing \r and \n characters. When writing lines to the output file, however, it only appends a \n character. A more general and cross-platform solution can be devised by simply modifying IFS
IFS=$'\r\n' # or IFS+=$'\r' if the lines do not contain whitespace
printf "%s\n" $(cat infile) > outfile
IFS=$' \t\n' # not necessary if IFS+=$'\r' is used
Warning: This solution performs filename expansion (*, ?, [...] and more if extglob is set). Use it only if you are sure that the file does not contain special characters or you want the expansion.
Warning: None of the solutions can handle \ in the input file.
cat input.csv | sed 's/\r/\n/g' > output.csv
worked for me
I've used python for it, here my code;
end1='/home/.../file1.txt'
end2='/home/.../file2.txt'
with open(end1, "rb") as inf:
with open(end2, "w") as fixed:
for line in inf:
line = line.replace("\n", "")
line = line.replace("\r", "")
fixed.write(line)
Though it's a older post, recently I came across with same problem. As I had all the files to rename inside /tmp/blah_dir/ as each file in this directory had "/r" trailing character ( showing "?" at end of file), so doing it script way was only I could think of.
I wanted to save final file with same name (without trailing any character).
With sed, problem was the output filename which I was needed to mention something else ( which I didn't want).
I tried other options as suggested here (not considered dos2unix because of some limitations) but didn't work.
I tried with "awk" finally which worked where I used "\r" as delimiter and taken the first part:
trick is:
echo ${filename}|awk -F"\r" '{print $1}'
Below script snippet I used ( where I had all file had "\r" as trailing character at path /tmp/blah_dir/) to fix my issue:
cd /tmp/blah_dir/
for i in `ls`
do
mv $i $(echo $i | awk -F"\r" '{print $1}')
done
Note: This example is not very exact though close to what I worked (Mentioning here just to give the better idea about what I did)
I made this shell-script to remove the \r character. It works in solaris and red-hat:
#!/bin/ksh
LOCALPATH=/Any_PATH
for File in `ls ${LOCALPATH}`
do
ARCACT=${LOCALPATH}/${File}
od -bc ${ARCACT}|sed -n 'p;n'|sed 's/015/012/g'|awk '{$1=""; print $0}'|sed 's/ /\\/g'|awk '{printf $0;}'>${ARCACT}.TMP
printf "`cat ${ARCACT}.TMP`"|sed '/^$/d'>${ARCACT}
rm ${ARCACT}.TMP
done
exit 0

Is there a Unix utility to prepend timestamps to stdin?

I ended up writing a quick little script for this in Python, but I was wondering if there was a utility you could feed text into which would prepend each line with some text -- in my specific case, a timestamp. Ideally, the use would be something like:
cat somefile.txt | prepend-timestamp
(Before you answer sed, I tried this:
cat somefile.txt | sed "s/^/`date`/"
But that only evaluates the date command once when sed is executed, so the same timestamp is incorrectly prepended to each line.)
ts from moreutils will prepend a timestamp to every line of input you give it. You can format it using strftime too.
$ echo 'foo bar baz' | ts
Mar 21 18:07:28 foo bar baz
$ echo 'blah blah blah' | ts '%F %T'
2012-03-21 18:07:30 blah blah blah
$
To install it:
sudo apt-get install moreutils
Could try using awk:
<command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }'
You may need to make sure that <command> produces line buffered output, i.e. it flushes its output stream after each line; the timestamp awk adds will be the time that the end of the line appeared on its input pipe.
If awk shows errors, then try gawk instead.
annotate, available via that link or as annotate-output in the Debian devscripts package.
$ echo -e "a\nb\nc" > lines
$ annotate-output cat lines
17:00:47 I: Started cat lines
17:00:47 O: a
17:00:47 O: b
17:00:47 O: c
17:00:47 I: Finished with exitcode 0
Distilling the given answers to the simplest one possible:
unbuffer $COMMAND | ts
On Ubuntu, they come from the expect-dev and moreutils packages.
sudo apt-get install expect-dev moreutils
How about this?
cat somefile.txt | perl -pne 'print scalar(localtime()), " ";'
Judging from your desire to get live timestamps, maybe you want to do live updating on a log file or something? Maybe
tail -f /path/to/log | perl -pne 'print scalar(localtime()), " ";' > /path/to/log-with-timestamps
Kieron's answer is the best one so far. If you have problems because the first program is buffering its out you can use the unbuffer program:
unbuffer <command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; }'
It's installed by default on most linux systems. If you need to build it yourself it is part of the expect package
http://expect.nist.gov
Just gonna throw this out there: there are a pair of utilities in daemontools called tai64n and tai64nlocal that are made for prepending timestamps to log messages.
Example:
cat file | tai64n | tai64nlocal
Use the read(1) command to read one line at a time from standard input, then output the line prepended with the date in the format of your choosing using date(1).
$ cat timestamp
#!/bin/sh
while read line
do
echo `date` $line
done
$ cat somefile.txt | ./timestamp
I'm not an Unix guy, but I think you can use
gawk '{print strftime("%d/%m/%y",systime()) $0 }' < somefile.txt
#! /bin/sh
unbuffer "$#" | perl -e '
use Time::HiRes (gettimeofday);
while(<>) {
($s,$ms) = gettimeofday();
print $s . "." . $ms . " " . $_;
}'
$ cat somefile.txt | sed "s/^/`date`/"
you can do this (with gnu/sed):
$ some-command | sed "x;s/.*/date +%T/e;G;s/\n/ /g"
example:
$ { echo 'line1'; sleep 2; echo 'line2'; } | sed "x;s/.*/date +%T/e;G;s/\n/ /g"
20:24:22 line1
20:24:24 line2
of course, you can use other options of the program date. just replace date +%T with what you need.
Here's my awk solution (from a Windows/XP system with MKS Tools installed in the C:\bin directory). It is designed to add the current date and time in the form mm/dd hh:mm to the beginning of each line having fetched that timestamp from the system as each line is read. You could, of course, use the BEGIN pattern to fetch the timestamp once and add that timestamp to each record (all the same). I did this to tag a log file that was being generated to stdout with the timestamp at the time the log message was generated.
/"pattern"/ "C\:\\\\bin\\\\date '+%m/%d %R'" | getline timestamp;
print timestamp, $0;
where "pattern" is a string or regex (without the quotes) to be matched in the input line, and is optional if you wish to match all input lines.
This should work on Linux/UNIX systems as well, just get rid of the C\:\\bin\\ leaving the line
"date '+%m/%d %R'" | getline timestamp;
This, of course, assumes that the command "date" gets you to the standard Linux/UNIX date display/set command without specific path information (that is, your environment PATH variable is correctly configured).
Mixing some answers above from natevw and Frank Ch. Eigler.
It has milliseconds, performs better than calling a external date command each time and perl can be found in most of the servers.
tail -f log | perl -pne '
use Time::HiRes (gettimeofday);
use POSIX qw(strftime);
($s,$ms) = gettimeofday();
print strftime "%Y-%m-%dT%H:%M:%S+$ms ", gmtime($s);
'
Alternative version with flush and read in a loop:
tail -f log | perl -pne '
use Time::HiRes (gettimeofday); use POSIX qw(strftime);
$|=1;
while(<>) {
($s,$ms) = gettimeofday();
print strftime "%Y-%m-%dT%H:%M:%S+$ms $_", gmtime($s);
}'
caerwyn's answer can be run as a subroutine, which would prevent the new processes per line:
timestamp(){
while read line
do
echo `date` $line
done
}
echo testing 123 |timestamp
Disclaimer: the solution I am proposing is not a Unix built-in utility.
I faced a similar problem a few days ago. I did not like the syntax and limitations of the solutions above, so I quickly put together a program in Go to do the job for me.
You can check the tool here: preftime
There are prebuilt executables for Linux, MacOS, and Windows in the Releases section of the GitHub project.
The tool handles incomplete output lines and has (from my point of view) a more compact syntax.
<command> | preftime
It's not ideal, but I though I'd share it in case it helps someone.
The other answers mostly work, but have some drawbacks. In particular:
Many require installing a command not commonly found on linux systems, which may not be possible or convenient.
Since they use pipes, they don't put timestamps on stderr, and lose the exit status.
If you use multiple pipes for stderr and stdout, then some do not have atomic printing, leading to intermingled lines of output like [timestamp] [timestamp] stdout line \nstderr line
Buffering can cause problems, and unbuffer requires an extra dependency.
To solve (4), we can use stdbuf -i0 -o0 -e0 which is generally available on most linux systems (see How to make output of any shell command unbuffered?).
To solve (3), you just need to be careful to print the entire line at a time.
Bad: ruby -pe 'print Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \")' (Prints the timestamp, then prints the contents of $_.)
Good: ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_' (Alters $_, then prints it.)
To solve (2), we need to use multiple pipes and save the exit status:
alias tslines-pipe="stdbuf -i0 -o0 ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_'"
function tslines() (
stdbuf -o0 -e0 "$#" 2> >(tslines-pipe) > >(tslines-pipe)
status="$?"
exit $status
)
Then you can run a command with tslines some command --options.
This almost works, except sometimes one of the pipes takes slightly longer to exit and the tslines function has exited, so the next prompt has printed. For example, this command seems to print all the output after the prompt for the next line has appeared, which can be a bit confusing:
tslines bash -c '(for (( i=1; i<=20; i++ )); do echo stderr 1>&2; echo stdout; done)'
There needs to be some coordination method between the two pipe processes and the tslines function. There are presumably many ways to do this. One way I found is to have the pipes send some lines to a pipe that the main function can listen to, and only exit after it's received data from both pipe handlers. Putting that together:
alias tslines-pipe="stdbuf -i0 -o0 ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_'"
function tslines() (
# Pick a random name for the pipe to prevent collisions.
pipe="/tmp/pipe-$RANDOM"
# Ensure the pipe gets deleted when the method exits.
trap "rm -f $pipe" EXIT
# Create the pipe. See https://www.linuxjournal.com/content/using-named-pipes-fifos-bash
mkfifo "$pipe"
# echo will block until the pipe is read.
stdbuf -o0 -e0 "$#" 2> >(tslines-pipe; echo "done" >> $pipe) > >(tslines-pipe; echo "done" >> $pipe)
status="$?"
# Wait until we've received data from both pipe commands before exiting.
linecount=0
while [[ $linecount -lt 2 ]]; do
read line
if [[ "$line" == "done" ]]; then
((linecount++))
fi
done < "$pipe"
exit $status
)
That synchronization mechanism feels a bit convoluted; hopefully there's a simpler way to do it.
doing it with date and tr and xargs on OSX:
alias predate="xargs -I{} sh -c 'date +\"%Y-%m-%d %H:%M:%S\" | tr \"\n\" \" \"; echo \"{}\"'"
<command> | predate
if you want milliseconds:
alias predate="xargs -I{} sh -c 'date +\"%Y-%m-%d %H:%M:%S.%3N\" | tr \"\n\" \" \"; echo \"{}\"'"
but note that on OSX, date doesn't give you the %N option, so you'll need to install gdate (brew install coreutils) and so finally arrive at this:
alias predate="xargs -I{} sh -c 'gdate +\"%Y-%m-%d %H:%M:%S.%3N\" | tr \"\n\" \" \"; echo \"{}\"'"
No need to specify all the parameters in strftime() unless you really want to customize the outputting format :
echo "abc 123 xyz\njan 765 feb" \
\
| gawk -Sbe 'BEGIN {_=strftime()" "} sub("^",_)'
Sat Apr 9 13:14:53 EDT 2022 abc 123 xyz
Sat Apr 9 13:14:53 EDT 2022 jan 765 feb
works the same if you have mawk 1.3.4. Even on awk-variants without the time features, a quick getline could emulate it :
echo "abc 123 xyz\njan 765 feb" \
\
| mawk2 'BEGIN { (__="date")|getline _;
close(__)
_=_" " } sub("^",_)'
Sat Apr 9 13:19:38 EDT 2022 abc 123 xyz
Sat Apr 9 13:19:38 EDT 2022 jan 765 feb
If you wanna skip all that getline and BEGIN { }, then something like this :
mawk2 'sub("^",_" ")' \_="$(date)"
If the value you are prepending is the same on every line, fire up emacs with the file, then:
Ctrl + <space>
at the beginning of the of the file (to mark that spot), then scroll down to the beginning of the last line (Alt + > will go to the end of file... which probably will involve the Shift key too, then Ctrl + a to go to the beginning of that line) and:
Ctrl + x r t
Which is the command to insert at the rectangle you just specified (a rectangle of 0 width).
2008-8-21 6:45PM <enter>
Or whatever you want to prepend... then you will see that text prepended to every line within the 0 width rectangle.
UPDATE: I just realized you don't want the SAME date, so this won't work... though you may be able to do this in emacs with a slightly more complicated custom macro, but still, this kind of rectangle editing is pretty nice to know about...

Resources