How to print ASCII value of a character using basic awk only - unix

I need to print the ASCII value of the given character in awk only.
Below code gives 0 as output:
echo a | awk '{ printf("%d \n",$1); }'

Using only basic awk (not even gawk, so the below should work on all BSD and Linux variants):
$ echo a | awk 'BEGIN{for(n=0;n<256;n++)ord[sprintf("%c",n)]=n}{print ord[$1]}'
97
Here's the opposite direction (for completeness):
$ echo 97 | awk 'BEGIN{for(n=0;n<256;n++)chr[n]=sprintf("%c",n)}{print chr[$1]}'
a
Basic premise is to use a lookup table.

see the awk manual for ordinal functions you can use. But since you are using awk, you should be on some version of shell, eg bash. so why not use the shell?
$ printf "%d" "'a"
97

It seems this is not a trivial problem. I found this approach using a lookup array, which should work for A-Z at least:
BEGIN { convert="ABCDEFGHIJKLMNOPQRSTUVWXYZ" }
{ num=index(convert,substr($0,2,1))+64; print num }

Related

Attempted to use awk sqrt but only returns 0

I am attempting to use the sqrt function from awk command in my script, but all it returns is 0. Is there anything wrong with my script below?
echo "enter number"
read root
awk 'BEGIN{ print sqrt($root) }'
This is my first time using the awk command, are there any mistakes that I am not understanding here?
Maybe you can try this.
echo "enter number"
read root
echo "$root" | awk '{print sqrt($0)}'
You have to give a data input to awk. So, you can pipe 'echo'.
The BEGIN statement is to do things, like print a header...etc before
awk starts reading the input.
$ echo "enter number"
enter number
$ read root
3
$ awk -v root="$root" 'BEGIN{ print sqrt(root) }'
1.73205
See the comp.unix.shell FAQ for the 2 correct ways to pass the value of a shell variable to an awk script.
UPDATE : My proposed solution turns out to be potentially dangerous. See Ed Morton's answer for a better solution. I'll leave this answer here as a warning.
Because of the single quotes, $root is interpreted by awk, not by the shell. awk treats root as an uninitialized variable, whose value is the empty string, treated as 0 in a numeric context. $root is the root'th field of the current line -- in this case, as $0, which is the entire line. Since it's in a BEGIN block, there is no current line, so $root is the empty string -- which again is treated as 0 when passed to sqrt().
You can see this by changing your command line a bit:
$ awk 'BEGIN { print sqrt("") }'
0
$ echo 2 | awk '{ print sqrt($root) }'
1.41421
NOTE: The above is merely to show what's wrong with the original command, and how it's interpreted by the shell and by awk.
One solution is to use double quotes rather than single quotes. The shell expands variable references within double quotes:
$ echo "enter number"
enter number
$ read x
2
$ awk "BEGIN { print sqrt($x) }" # DANGEROUS
1.41421
You'll need to be careful when doing this kind of thing. The interaction between quoting and variable expansion in the shell vs. awk can be complicated.
UPDATE: In fact, you need to be extremely careful. As Ed Morton points out in a comment, this method can result in arbitrary code execution given a malicious value for $x, which is always a risk for a value read from user input. His answer avoids that problem.
(Note that I've changed the name of your shell variable from $root to $x, since it's the number whose square root you want, not the root itself.)

How to repeat a character in Bourne Shell?

I want to repeat # 10 times, something like:
"##########"
How can i do it in Bourne shell (/bin/sh) ? I have tried using print but I guess it only works for bash shell.
Please don't give bash syntax.
The shell itself has no obvious facility for repeating a string. For just ten repetitions, it's hard to beat the obvious
echo '##########'
For repeating a single character a specified number of times, this should work even on a busy BusyBox.
dd if=/dev/zero bs=10 count=1 | tr '\0' '#'
Not very elegant but fairly low overhead. (You may need to redirect the standard error from dd to get rid of pesky progress messages.)
If you have a file which is guaranteed to be long enough (such as, for example, the script you are currently running) you could replace the first 10 characters with tr.
head -c 10 "$0" | tr '\000-\377' '#'
If you have a really traditional userspace (such that head doesn't support the -c option) a 1980s-compatible variant might be
yes '#' | head -n 10 | tr -d '\n'
(Your tr might not support exactly the backslash sequences I have used here. Consult its man page or your local academic programmer from the late 1970s.)
... or, heck
strings /bin/sh | sed 's/.*/#/;10q' | tr -d '\n' # don't do this at home :-)
In pure legacy Bourne shell with no external utilities, you can't really do much better than
for f in 0 1 2 3 4 5 6 7 8 9; do
printf '#'
done
In the general case, if you can come up with a generator expression which produces (at least) the required number of repetitions of something, you can loop over that. Here's a simple replacement for seq and jot:
echo | awk '{ for (i=0; i<10; ++i) print i }'
but then you might as well do the output from Awk:
echo | awk '{ for (i=0; i<10; ++i) printf "#" }'
Well, the pure Bourne Shell (POSIX) solution, without pipes and forks would probably be
i=0; while test $i -lt 10; do printf '#'; : $((++i)); done; printf '\n'
This easily generalizes to other repeated strings, e.g. a shell function
rept () {
i=0
while test $i -lt $1; do
printf '%s' "$2"
: $((++i))
done
printf '\n'
}
rept 10 '#'
If the HP Bourne Shell is not quite POSIX and does not support arithmetic substitution with : $(()) you can use i=$(expr $i + 1) instead.
You can use this trick with printf:
$ printf "%0.s#" {1..10}
##########
If the number can be a variable, then you need to use seq:
$ var=30
$ printf "%0.s#" $(seq $var)
##############################
This prints ##########:
for a in `seq 10`; do echo -n "#"; done

I need to make a unix script to read first word from a file

I need to make a unix script to read first word from a file, and if that is "Mon,Tue....Sat,Sun" then it will print echo 0 or else echo 1
I was trying with grep command but it didn't worked
This could even be done without grep or awk, using just bash builtins (assuming your shell is bash - this should also work in ksh and also zsh, and maybe in sh, but not csh, where the syntax is quite a bit different):
read firstword otherstuff < myfile.txt
case "${firstword}" in
Sun|Mon|Tue|Wed|Thu|Fri|Sat) echo 1;;
*) echo 0;;
esac
You could also use regexp matching to avoid the case statement (this is definitely bash-only, though):
if [[ "${firstword}" =~ ^(Sun|Mon|Tue|Wed|Thu|Fri|Sat)$ ]]; then
That's just a matter of preference, though...
When you need to parse input for words awk is better then grep, it still can do what grep does, but also can precess every line with simple scripts.
This is my take on solution:
awk 'NR==1{c=0} $1~/Sun|Mon|Tue|Wed|Thu|Fri|Sat/{c=1} END{print c}' test.txt
I encourage you to learn more about awk in this (short) tutorial
Try egrep command like:
head -1 myfile.txt | egrep '^[Sun|Mon|Tue|Wed|Thu|Fri|Sat] .*'; echo $?

Generate a random filename in unix shell

I would like to generate a random filename in unix shell (say tcshell). The filename should consist of random 32 hex letters, e.g.:
c7fdfc8f409c548a10a0a89a791417c5
(to which I will add whatever is neccesary). The point is being able to do it only in shell without resorting to a program.
Assuming you are on a linux, the following should work:
cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32
This is only pseudo-random if your system runs low on entropy, but is (on linux) guaranteed to terminate. If you require genuinely random data, cat /dev/random instead of /dev/urandom. This change will make your code block until enough entropy is available to produce truly random output, so it might slow down your code. For most uses, the output of /dev/urandom is sufficiently random.
If you on OS X or another BSD, you need to modify it to the following:
cat /dev/urandom | env LC_CTYPE=C tr -cd 'a-f0-9' | head -c 32
why do not use unix mktemp command:
$ TMPFILE=`mktemp tmp.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX` && echo $TMPFILE
tmp.MnxEsPDsNUjrzDIiPhnWZKmlAXAO8983
One command, no pipe, no loop:
hexdump -n 16 -v -e '/1 "%02X"' -e '/16 "\n"' /dev/urandom
If you don't need the newline, for example when you're using it in a variable:
hexdump -n 16 -v -e '/1 "%02X"' /dev/urandom
Using "16" generates 32 hex digits.
uuidgen generates exactly this, except you have to remove hyphens. So I found this to be the most elegant (at least to me) way of achieving this. It should work on linux and OS X out of the box.
uuidgen | tr -d '-'
As you probably noticed from each of the answers, you generally have to "resort to a program".
However, without using any external executables, in Bash and ksh:
string=''; for i in {0..31}; do string+=$(printf "%x" $(($RANDOM%16)) ); done; echo $string
in zsh:
string=''; for i in {0..31}; do string+=$(printf "%x" $(($RANDOM%16)) ); dummy=$RANDOM; done; echo $string
Change the lower case x in the format string to an upper case X to make the alphabetic hex characters upper case.
Here's another way to do it in Bash but without an explicit loop:
printf -v string '%X' $(printf '%.2s ' $((RANDOM%16))' '{00..31})
In the following, "first" and "second" printf refers to the order in which they're executed rather than the order in which they appear in the line.
This technique uses brace expansion to produce a list of 32 random numbers mod 16 each followed by a space and one of the numbers in the range in braces followed by another space (e.g. 11 00). For each element of that list, the first printf strips off all but the first two characters using its format string (%.2) leaving either single digits followed by a space each or two digits. The space in the format string ensures that there is then at least one space between each output number.
The command substitution containing the first printf is not quoted so that word splitting is performed and each number goes to the second printf as a separate argument. There, the numbers are converted to hex by the %X format string and they are appended to each other without spaces (since there aren't any in the format string) and the result is stored in the variable named string.
When printf receives more arguments than its format string accounts for, the format is applied to each argument in turn until they are all consumed. If there are fewer arguments, the unmatched format string (portion) is ignored, but that doesn't apply in this case.
I tested it in Bash 3.2, 4.4 and 5.0-alpha. But it doesn't work in zsh (5.2) or ksh (93u+) because RANDOM only gets evaluated once in the brace expansion in those shells.
Note that because of using the mod operator on a value that ranges from 0 to 32767 the distribution of digits using the snippets could be skewed (not to mention the fact that the numbers are pseudo random in the first place). However, since we're using mod 16 and 32768 is divisible by 16, that won't be a problem here.
In any case, the correct way to do this is using mktemp as in Oleg Razgulyaev's answer.
Tested in zsh, should work with any BASH compatible shell!
#!/bin/zsh
SUM=`md5sum <<EOF
$RANDOM
EOF`
FN=`echo $SUM | awk '// { print $1 }'`
echo "Your new filename: $FN"
Example:
$ zsh ranhash.sh
Your new filename: 2485938240bf200c26bb356bbbb0fa32
$ zsh ranhash.sh
Your new filename: ad25cb21bea35eba879bf3fc12581cc9
Yet another way[tm].
R=$(echo $RANDOM $RANDOM $RANDOM $RANDOM $RANDOM | md5 | cut -c -8)
FILENAME="abcdef-$R"
This answer is very similar to fmarks, so I cannot really take credit for it, but I found the cat and tr command combinations quite slow, and I found this version quite a bit faster. You need hexdump.
hexdump -e '/1 "%02x"' -n32 < /dev/urandom
Another thing you can add is running the date command as follows:
date +%S%N
Reads nonosecond time and the result adds a lot of randomness.
The first answer is good but why fork cat if not required.
tr -dc 'a-f0-9' < /dev/urandom | head -c32
Grab 16 bytes from /dev/random, convert them to hex, take the first line, remove the address, remove the spaces.
head /dev/random -c16 | od -tx1 -w16 | head -n1 | cut -d' ' -f2- | tr -d ' '
Assuming that "without resorting to a program" means "using only programs that are readily available", of course.
If you have openssl in your system you can use it for generating random hex (also it can be -base64) strings with defined length. I found it pretty simple and usable in cron in one line jobs.
openssl rand -hex 32
8c5a7515837d7f0b19e7e6fa4c448400e70ffec88ecd811a3dce3272947cb452
Hope to add a (maybe) better solution to this topic.
Notice: this only works with bash4 and some implement of mktemp(for example, the GNU one)
Try this
fn=$(mktemp -u -t 'XXXXXX')
echo ${fn/\/tmp\//}
This one is twice as faster as head /dev/urandom | tr -cd 'a-f0-9' | head -c 32, and eight times as faster as cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32.
Benchmark:
With mktemp:
#!/bin/bash
# a.sh
for (( i = 0; i < 1000; i++ ))
do
fn=$(mktemp -u -t 'XXXXXX')
echo ${fn/\/tmp\//} > /dev/null
done
time ./a.sh
./a.sh 0.36s user 1.97s system 99% cpu 2.333 total
And the other:
#!/bin/bash
# b.sh
for (( i = 0; i < 1000; i++ ))
do
cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c 32 > /dev/null
done
time ./b.sh
./b.sh 0.52s user 20.61s system 113% cpu 18.653 total
If you are on Linux, then Python will come pre-installed. So you can go for something similar to the below:
python -c "import uuid; print str(uuid.uuid1())"
If you don't like the dashes, then use replace function as shown below
python -c "import uuid; print str(uuid.uuid1()).replace('-','')"

Is there a Unix utility to prepend timestamps to stdin?

I ended up writing a quick little script for this in Python, but I was wondering if there was a utility you could feed text into which would prepend each line with some text -- in my specific case, a timestamp. Ideally, the use would be something like:
cat somefile.txt | prepend-timestamp
(Before you answer sed, I tried this:
cat somefile.txt | sed "s/^/`date`/"
But that only evaluates the date command once when sed is executed, so the same timestamp is incorrectly prepended to each line.)
ts from moreutils will prepend a timestamp to every line of input you give it. You can format it using strftime too.
$ echo 'foo bar baz' | ts
Mar 21 18:07:28 foo bar baz
$ echo 'blah blah blah' | ts '%F %T'
2012-03-21 18:07:30 blah blah blah
$
To install it:
sudo apt-get install moreutils
Could try using awk:
<command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }'
You may need to make sure that <command> produces line buffered output, i.e. it flushes its output stream after each line; the timestamp awk adds will be the time that the end of the line appeared on its input pipe.
If awk shows errors, then try gawk instead.
annotate, available via that link or as annotate-output in the Debian devscripts package.
$ echo -e "a\nb\nc" > lines
$ annotate-output cat lines
17:00:47 I: Started cat lines
17:00:47 O: a
17:00:47 O: b
17:00:47 O: c
17:00:47 I: Finished with exitcode 0
Distilling the given answers to the simplest one possible:
unbuffer $COMMAND | ts
On Ubuntu, they come from the expect-dev and moreutils packages.
sudo apt-get install expect-dev moreutils
How about this?
cat somefile.txt | perl -pne 'print scalar(localtime()), " ";'
Judging from your desire to get live timestamps, maybe you want to do live updating on a log file or something? Maybe
tail -f /path/to/log | perl -pne 'print scalar(localtime()), " ";' > /path/to/log-with-timestamps
Kieron's answer is the best one so far. If you have problems because the first program is buffering its out you can use the unbuffer program:
unbuffer <command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; }'
It's installed by default on most linux systems. If you need to build it yourself it is part of the expect package
http://expect.nist.gov
Just gonna throw this out there: there are a pair of utilities in daemontools called tai64n and tai64nlocal that are made for prepending timestamps to log messages.
Example:
cat file | tai64n | tai64nlocal
Use the read(1) command to read one line at a time from standard input, then output the line prepended with the date in the format of your choosing using date(1).
$ cat timestamp
#!/bin/sh
while read line
do
echo `date` $line
done
$ cat somefile.txt | ./timestamp
I'm not an Unix guy, but I think you can use
gawk '{print strftime("%d/%m/%y",systime()) $0 }' < somefile.txt
#! /bin/sh
unbuffer "$#" | perl -e '
use Time::HiRes (gettimeofday);
while(<>) {
($s,$ms) = gettimeofday();
print $s . "." . $ms . " " . $_;
}'
$ cat somefile.txt | sed "s/^/`date`/"
you can do this (with gnu/sed):
$ some-command | sed "x;s/.*/date +%T/e;G;s/\n/ /g"
example:
$ { echo 'line1'; sleep 2; echo 'line2'; } | sed "x;s/.*/date +%T/e;G;s/\n/ /g"
20:24:22 line1
20:24:24 line2
of course, you can use other options of the program date. just replace date +%T with what you need.
Here's my awk solution (from a Windows/XP system with MKS Tools installed in the C:\bin directory). It is designed to add the current date and time in the form mm/dd hh:mm to the beginning of each line having fetched that timestamp from the system as each line is read. You could, of course, use the BEGIN pattern to fetch the timestamp once and add that timestamp to each record (all the same). I did this to tag a log file that was being generated to stdout with the timestamp at the time the log message was generated.
/"pattern"/ "C\:\\\\bin\\\\date '+%m/%d %R'" | getline timestamp;
print timestamp, $0;
where "pattern" is a string or regex (without the quotes) to be matched in the input line, and is optional if you wish to match all input lines.
This should work on Linux/UNIX systems as well, just get rid of the C\:\\bin\\ leaving the line
"date '+%m/%d %R'" | getline timestamp;
This, of course, assumes that the command "date" gets you to the standard Linux/UNIX date display/set command without specific path information (that is, your environment PATH variable is correctly configured).
Mixing some answers above from natevw and Frank Ch. Eigler.
It has milliseconds, performs better than calling a external date command each time and perl can be found in most of the servers.
tail -f log | perl -pne '
use Time::HiRes (gettimeofday);
use POSIX qw(strftime);
($s,$ms) = gettimeofday();
print strftime "%Y-%m-%dT%H:%M:%S+$ms ", gmtime($s);
'
Alternative version with flush and read in a loop:
tail -f log | perl -pne '
use Time::HiRes (gettimeofday); use POSIX qw(strftime);
$|=1;
while(<>) {
($s,$ms) = gettimeofday();
print strftime "%Y-%m-%dT%H:%M:%S+$ms $_", gmtime($s);
}'
caerwyn's answer can be run as a subroutine, which would prevent the new processes per line:
timestamp(){
while read line
do
echo `date` $line
done
}
echo testing 123 |timestamp
Disclaimer: the solution I am proposing is not a Unix built-in utility.
I faced a similar problem a few days ago. I did not like the syntax and limitations of the solutions above, so I quickly put together a program in Go to do the job for me.
You can check the tool here: preftime
There are prebuilt executables for Linux, MacOS, and Windows in the Releases section of the GitHub project.
The tool handles incomplete output lines and has (from my point of view) a more compact syntax.
<command> | preftime
It's not ideal, but I though I'd share it in case it helps someone.
The other answers mostly work, but have some drawbacks. In particular:
Many require installing a command not commonly found on linux systems, which may not be possible or convenient.
Since they use pipes, they don't put timestamps on stderr, and lose the exit status.
If you use multiple pipes for stderr and stdout, then some do not have atomic printing, leading to intermingled lines of output like [timestamp] [timestamp] stdout line \nstderr line
Buffering can cause problems, and unbuffer requires an extra dependency.
To solve (4), we can use stdbuf -i0 -o0 -e0 which is generally available on most linux systems (see How to make output of any shell command unbuffered?).
To solve (3), you just need to be careful to print the entire line at a time.
Bad: ruby -pe 'print Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \")' (Prints the timestamp, then prints the contents of $_.)
Good: ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_' (Alters $_, then prints it.)
To solve (2), we need to use multiple pipes and save the exit status:
alias tslines-pipe="stdbuf -i0 -o0 ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_'"
function tslines() (
stdbuf -o0 -e0 "$#" 2> >(tslines-pipe) > >(tslines-pipe)
status="$?"
exit $status
)
Then you can run a command with tslines some command --options.
This almost works, except sometimes one of the pipes takes slightly longer to exit and the tslines function has exited, so the next prompt has printed. For example, this command seems to print all the output after the prompt for the next line has appeared, which can be a bit confusing:
tslines bash -c '(for (( i=1; i<=20; i++ )); do echo stderr 1>&2; echo stdout; done)'
There needs to be some coordination method between the two pipe processes and the tslines function. There are presumably many ways to do this. One way I found is to have the pipes send some lines to a pipe that the main function can listen to, and only exit after it's received data from both pipe handlers. Putting that together:
alias tslines-pipe="stdbuf -i0 -o0 ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_'"
function tslines() (
# Pick a random name for the pipe to prevent collisions.
pipe="/tmp/pipe-$RANDOM"
# Ensure the pipe gets deleted when the method exits.
trap "rm -f $pipe" EXIT
# Create the pipe. See https://www.linuxjournal.com/content/using-named-pipes-fifos-bash
mkfifo "$pipe"
# echo will block until the pipe is read.
stdbuf -o0 -e0 "$#" 2> >(tslines-pipe; echo "done" >> $pipe) > >(tslines-pipe; echo "done" >> $pipe)
status="$?"
# Wait until we've received data from both pipe commands before exiting.
linecount=0
while [[ $linecount -lt 2 ]]; do
read line
if [[ "$line" == "done" ]]; then
((linecount++))
fi
done < "$pipe"
exit $status
)
That synchronization mechanism feels a bit convoluted; hopefully there's a simpler way to do it.
doing it with date and tr and xargs on OSX:
alias predate="xargs -I{} sh -c 'date +\"%Y-%m-%d %H:%M:%S\" | tr \"\n\" \" \"; echo \"{}\"'"
<command> | predate
if you want milliseconds:
alias predate="xargs -I{} sh -c 'date +\"%Y-%m-%d %H:%M:%S.%3N\" | tr \"\n\" \" \"; echo \"{}\"'"
but note that on OSX, date doesn't give you the %N option, so you'll need to install gdate (brew install coreutils) and so finally arrive at this:
alias predate="xargs -I{} sh -c 'gdate +\"%Y-%m-%d %H:%M:%S.%3N\" | tr \"\n\" \" \"; echo \"{}\"'"
No need to specify all the parameters in strftime() unless you really want to customize the outputting format :
echo "abc 123 xyz\njan 765 feb" \
\
| gawk -Sbe 'BEGIN {_=strftime()" "} sub("^",_)'
Sat Apr 9 13:14:53 EDT 2022 abc 123 xyz
Sat Apr 9 13:14:53 EDT 2022 jan 765 feb
works the same if you have mawk 1.3.4. Even on awk-variants without the time features, a quick getline could emulate it :
echo "abc 123 xyz\njan 765 feb" \
\
| mawk2 'BEGIN { (__="date")|getline _;
close(__)
_=_" " } sub("^",_)'
Sat Apr 9 13:19:38 EDT 2022 abc 123 xyz
Sat Apr 9 13:19:38 EDT 2022 jan 765 feb
If you wanna skip all that getline and BEGIN { }, then something like this :
mawk2 'sub("^",_" ")' \_="$(date)"
If the value you are prepending is the same on every line, fire up emacs with the file, then:
Ctrl + <space>
at the beginning of the of the file (to mark that spot), then scroll down to the beginning of the last line (Alt + > will go to the end of file... which probably will involve the Shift key too, then Ctrl + a to go to the beginning of that line) and:
Ctrl + x r t
Which is the command to insert at the rectangle you just specified (a rectangle of 0 width).
2008-8-21 6:45PM <enter>
Or whatever you want to prepend... then you will see that text prepended to every line within the 0 width rectangle.
UPDATE: I just realized you don't want the SAME date, so this won't work... though you may be able to do this in emacs with a slightly more complicated custom macro, but still, this kind of rectangle editing is pretty nice to know about...

Resources