Shebang line C. How does it work? - unix

I was reading Advanced Programming in the UNIX Environment and stumbled upon this example. What is the shebang line doing here? Here's the top part of the code.
#!/usr/bin/awk -f
BEGIN {
    printf("#include \"apue.h\"\n")
    printf("#include <errno.h>\n")
    printf("\n")
    printf("static void pr_sysconf(char *, int);\n")
    printf("static void pr_pathconf(char *, char *, int);\n")
    printf("\n")
    printf("int\n")
    printf("main(int argc, char *argv[])\n")
    printf("{\n")
    printf("\tif (argc != 2)\n")
    printf("\t\terr_quit(\"usage: a.out <dirname>\");\n\n")
    FS="\t+"
    while (getline <"sysopt.sym" > 0) {
        printf("#ifdef %s\n", $1)
        printf("\tprintf(\"%s is defined (val is %%ld)\\n\", (long)%s+0);\n", $1, $1)
        printf("#else\n")
        printf("\tprintf(\"%s is undefined\\n\");\n", $1)
        printf("#endif\n")
        printf("#ifdef %s\n", $2)

The shebang line is (on Linux and most Unixes) understood by execve(2). However, POSIX doesn't specify anything about it. Your script will be run by e.g. GNU awk, assuming the script is an executable file (and you probably want it to be reachable through your PATH variable).
So when something (probably your Unix shell, but it could be something else) executes that script with execve, the /usr/bin/awk program gets executed instead, with the script's path passed to it. You are betting that this awk program is some implementation of AWK.
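To see that this is the kernel's doing and not the shell's, you can trigger the same substitution from a C program. A minimal sketch, assuming a hypothetical executable script /tmp/myscript.awk whose first line is #!/usr/bin/awk -f:
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* /tmp/myscript.awk is a made-up path for illustration; its first
     * line is assumed to be "#!/usr/bin/awk -f" and it is executable. */
    char *argv[] = { "/tmp/myscript.awk", NULL };
    char *envp[] = { NULL };

    /* The kernel sees "#!" at the start of the file and actually runs
     * /usr/bin/awk with "-f" and the script path as arguments.        */
    execve("/tmp/myscript.awk", argv, envp);

    /* Only reached if execve(2) failed (missing file, no +x bit, ...). */
    perror("execve");
    return 1;
}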

The OS's routines for executing a file look for the two characters #! at the start of the file; if they are present, then instead of loading the file directly as a binary executable, the OS runs the executable named by the rest of that line, along with any command-line arguments given there, and then the path of the original file as a final argument.
That's quite an involved description; a couple of examples make it clearer:
If myFile contains:
#!/bin/echo My path is
... and we make it executable:
$ chmod +x myFile
... then when we run it, we get:
$ ./myFile
My path is /home/slim/myFile
Or if we change its contents to:
#!/bin/cat
Hello
world
Then when we run it, it prints itself ...
$ ./myFile
#!/bin/cat
Hello
world
This is generally useful when the command it invokes is an interpreter which can work with the file's contents and itself ignores the shebang line. Of course, in many languages # denotes a comment, so we get this for free:
#!/bin/bash
#!/usr/bin/perl
#!/bin/awk -f
So essentially it arranges matters such that running myFile directly is equivalent to running awk -f myFile.

Related

Defining local variable in Makefile target

How to define local variable in Makefile target?
I would like to avoid repeating the filename, like:
zsh:
    FILENAME := "text.txt"
    @echo "Copying ${FILENAME}...";
    scp "${FILENAME}" "user@host:/home/user/${FILENAME}"
But I am getting an error:
FILENAME := "text.txt"
/bin/sh: FILENAME: command not found
Same with $(FILENAME)
Trying
zsh:
    export FILENAME="text.txt"
    @echo "Copying ${FILENAME} to $(EC2)";
Gives me an empty value:
Copying ...
You can't define a make variable inside a recipe. Recipes are run in the shell and must use shell syntax.
If you want to define a make variable, define it outside of a recipe, like this:
FILENAME := text.txt
zsh:
    @echo "Copying ${FILENAME}...";
    scp "${FILENAME}" "user@host:/home/user/${FILENAME}"
Note, it's virtually never correct to add quotes around a value when assigning it to a make variable. Make doesn't care about quotes (in variable values or expansion) and doesn't treat them specially in any way.
The rules for a target are executed by the shell, so you can set a variable using shell syntax:
zsh:
    @FILENAME="text.txt"; \
    echo "Copying $${FILENAME}..."; \
    scp "$${FILENAME}" "user@host:/home/user/$${FILENAME}"
Notice that:
- I'm escaping the end-of-line using \ so that everything executes in the same shell.
- I'm escaping the $ in shell variables by writing $$ (otherwise make will attempt to interpret them as make variables).
For this rule, which apparently depends on a file named text.txt, you could alternatively declare text.txt as an explicit dependency and then write:
zsh: text.txt
    @echo "Copying $<..."; \
    scp "$<" "user@host:/home/user/$<"

awk getline not accepting external variable from a file

I have a file test.sh from which I am executing the following awk command.
awk -f x.awk < result/output.txt >>difference.txt
x.awk
while (getline < result/$bld/$DeviceType)
The variables DeviceType and bld are available in test.sh; I have declared them as exported:
export DeviceType=$line
Even so, while executing the test.sh file, the script stops at the following line
awk -f x.awk < result/output.txt >>difference.txt
and I am getting
awk: x.awk:4: (FILENAME=- FNR=116) fatal: division by zero attempted
error.
The awk script is read by awk, not touched by the shell. Inside an awk script, $bld means 'the field designated by the number in the variable bld' (that's the awk variable bld).
You can set awk variables on the command line (officially with the -v option):
awk -v bld="$bld" -v dev="$DeviceType" -f x.awk < result/output.txt >> difference.txt
Whether that does what you want is still debatable. Most likely you need x.awk to contain something like:
BEGIN { file = sprintf("result/%s/%s", bld, dev); }
{ while ((getline < file) > 0) print }
awk is not shell just like C is not shell. You should not expect to be able to access shell variables within an awk program any more than you can access shell variables within a C program.
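To make that analogy concrete: since DeviceType is exported, a C program could only see it by looking it up in its environment; the shell syntax $DeviceType means nothing to it. A minimal sketch:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* An exported shell variable only reaches the program as an entry
     * in its environment; "$DeviceType" has no meaning in C.          */
    const char *dev = getenv("DeviceType");
    if (dev == NULL) {
        fprintf(stderr, "DeviceType is not in the environment\n");
        return 1;
    }
    printf("DeviceType=%s\n", dev);
    return 0;
}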
To pass the VALUE of shell variables to an awk script, see http://cfajohnson.com/shell/cus-faq-2.html#Q24 for details but essentially:
awk -v awkvar="$shellvar" '{ ... use awkvar ...}'
is usually the right approach.
Having said that, whatever you're trying to do, this looks like the wrong approach. If you are considering using getline, make sure to read http://awk.freeshell.org/AllAboutGetline first and understand all of its caveats, but if you tell us what you're trying to do, with sample input and expected output, we can almost certainly help you come up with a better approach that has nothing to do with getline.

Accepting command line parameters in awk

I've looked around for a while and found only questions touching on the subject or answers that do not work. Here's the question:
I'm working on an assignment for school that requires me to read in command line arguments for an awk script (which seems odd to begin with, but eh). We're using an older version of Unix and I'm running Bash. This awk only has the -f and -Fc options. Basically, I keep trying to do "awk -f awk_script arg1 arg2 arg3 arg4 arg5 arg6" but each time awk attempts to open arg1 as a file, which it isn't. An example I saw elsewhere addressing this was:
awk 'BEGIN { print "ARGV[1] = ", ARGV[1] }' foo bar
It was supposed to print "foo", but on this system I only get the output "ARGV[1] = awk: can't open foo". So, in summary, is there any way around this? Can an awk this old read command line arguments and use them for anything other than input files? The instructor's notes file hinted at the above usage (of printing foo), but his program doesn't even run, so...
Any help would be greatly appreciated.
After Edit: Using SunOS 5.10 and this awk does not support the -v option, ONLY the -f and -Fc
You can decrement ARGC after reading the arguments so that only the first argument(s) are considered by awk as input file(s):
#!/bin/awk -f
BEGIN {
    for (i=ARGC; i>2; i--) {
        print ARGV[ARGC-1];
        ARGC--;
    }
}
…
Or alternatively, you can reset ARGC after having read all arguments :
#!/bin/awk -f
BEGIN {
    for (i=0; i<ARGC; i++) {
        print ARGV[i];
    }
    ARGC=2;
}
Both methods will correctly process myawkscript.awk foobar foo bar … as if foobar were the only file to process (of course you can set ARGC to 3 if you want the first two arguments treated as files, etc.). In your particular case, it seems you don't want to process any file, so you would set ARGC to 1.
Use nawk or /usr/xpg4/bin/awk. These are newer versions of awk that support more features.
Alternatively, you can install another version of awk like mawk or GNU awk.
A possible workaround (maybe not acceptable) would be to use the -v option of awk:
awk -v arg1=foo 'BEGIN { print arg1; }'

Using the same file for stdin and stdout with redirection

I'm writing an application that acts like a filter: it reads input from a file (stdin), processes it, and writes output to another file (stdout). The input file is completely read before the application starts to write the output file.
Since I'm using stdin and stdout, I can run it like this:
$ ./myprog <file1.txt >file2.txt
It works fine, but if I try to use the same file as input and output (that is: read from a file, and write to the same file), like this:
$ ./myprog <file.txt >file.txt
it cleans file.txt before the program has the chance to read it.
Is there any way I can do something like this in a command line in Unix?
There's a sponge utility in the moreutils package:
./myprog < file.txt | sponge file.txt
To quote the manual:
Sponge reads standard input and writes it out to the specified file. Unlike a shell redirect, sponge soaks up all its input before opening the output file. This allows constructing pipelines that read from and write to the same file.
The shell is what clobbers your output file, as it's preparing the output filehandles before executing your program. There's no way to make your program read the input before the shell clobbers the file in a single shell command line.
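Roughly speaking, for ./myprog <file.txt >file.txt the shell sets up the output redirection before your program ever runs. A simplified C sketch of that step (not the shell's actual source, and with the fork omitted):
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* For ">file.txt" the shell opens (and truncates!) the file ... */
    int fd = open("file.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* ... makes it the program's stdout ... */
    dup2(fd, STDOUT_FILENO);
    close(fd);

    /* ... and only then executes the program.  By this point file.txt
     * is already empty, so myprog has nothing left to read.           */
    execl("./myprog", "./myprog", (char *)NULL);
    perror("execl");
    return 1;
}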
You need to use two commands, either moving or copying the file before reading it:
mv file.txt filecopy.txt
./myprog < filecopy.txt > file.txt
Or else outputting to a copy and then replacing the original:
./myprog < file.txt > filecopy.txt
mv filecopy.txt file.txt
If you can't do that, then you need to pass the filename to your program, which opens the file in read/write mode, and handles all the I/O internally.
./myprog file.txt # reads and writes according to its own rules
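A minimal C sketch of that approach (the upper-casing step is just a placeholder for whatever processing myprog really does): read the whole file first, then rewind, write the result back, and truncate.
#include <ctype.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }

    /* Open the one file for both reading and writing. */
    int fd = open(argv[1], O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* Read the whole file into memory first ... */
    off_t len = lseek(fd, 0, SEEK_END);
    char *buf = malloc(len + 1);
    if (buf == NULL) { perror("malloc"); return 1; }
    lseek(fd, 0, SEEK_SET);
    if (read(fd, buf, len) != len) { perror("read"); return 1; }

    /* ... process it (placeholder: upper-case everything) ... */
    for (off_t i = 0; i < len; i++)
        buf[i] = toupper((unsigned char)buf[i]);

    /* ... then rewind, write the result back and trim any leftover. */
    lseek(fd, 0, SEEK_SET);
    if (write(fd, buf, len) != len) { perror("write"); return 1; }
    ftruncate(fd, len);

    free(buf);
    close(fd);
    return 0;
}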
For a solution of a purely academic nature:
$ ( unlink file.txt && ./myprog >file.txt ) <file.txt
Possibly problematic side effects are:
- If ./myprog fails, you destroy your input. (Naturally...)
- ./myprog runs from a subshell. (Use { ... ; } instead of ( ... ) to avoid that.)
- file.txt becomes a new file with a new inode and new file permissions.
- You need +w permission on the directory housing file.txt.

'tee' and exit status

Is there an alternative to tee which captures standard output and standard error of the command being executed and exits with the same exit status as the processed command?
Something like the following:
eet -a some.log -- mycommand --foo --bar
Where "eet" is an imaginary alternative to "tee" :) (-a means append and -- separates the captured command). It shouldn't be hard to hack such a command, but maybe it already exists and I'm not aware of it?
This works with Bash:
(
set -o pipefail
mycommand --foo --bar | tee some.log
)
The parentheses are there to limit the effect of pipefail to just the one command.
From the bash(1) man page:
The return status of a pipeline is the exit status of the last command, unless the pipefail option is enabled. If pipefail is enabled, the pipeline's return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully.
I stumbled upon a couple of interesting solutions at Capture Exit Code Using Pipe & Tee.
There is the PIPESTATUS array available in Bash (save the value you need right away, since PIPESTATUS is overwritten by the very next command):
false | tee /dev/null
ret=${PIPESTATUS[0]}
[ $ret -eq 0 ] || exit $ret
And the simplest prototype of "eet" in Perl may look as follows:
open MAKE, "command 2>&1 |" or die;
open (LOGFILE, ">>some.log") or die;
while (<MAKE>) {
    print LOGFILE $_;
    print;
}
close MAKE; # To get $?
my $exit = $? >> 8;
close LOGFILE;
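For comparison, here is a C sketch of the same idea. It is a simplification of the question's interface (the log file and command are taken positionally, with no -a or -- handling): the child's combined output is copied to both the log and stdout, and the child's exit status is propagated.
/* eet.c - run a command, copy its stdout and stderr both to a log file
 * and to our own stdout, then exit with the command's exit status.
 * Usage sketch: ./eet some.log mycommand --foo --bar                  */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s logfile command [args...]\n", argv[0]);
        return 2;
    }

    int log = open(argv[1], O_WRONLY | O_CREAT | O_APPEND, 0666);
    if (log < 0) { perror("open"); return 2; }

    int pfd[2];
    if (pipe(pfd) < 0) { perror("pipe"); return 2; }

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 2; }

    if (pid == 0) {
        /* Child: send both stdout and stderr into the pipe, then exec. */
        dup2(pfd[1], STDOUT_FILENO);
        dup2(pfd[1], STDERR_FILENO);
        close(pfd[0]);
        close(pfd[1]);
        execvp(argv[2], &argv[2]);
        perror("execvp");
        _exit(127);
    }

    /* Parent: behave like tee, copying the pipe to stdout and the log. */
    close(pfd[1]);
    char buf[4096];
    ssize_t n;
    while ((n = read(pfd[0], buf, sizeof buf)) > 0) {
        write(STDOUT_FILENO, buf, n);
        write(log, buf, n);
    }
    close(pfd[0]);
    close(log);

    /* Exit with the same status as the command, like the Perl version. */
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}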
Here's an eet. Works with every Bash I can get my hands on, from 2.05b to 4.0.
#!/bin/bash
tee_args=()
while [[ $# > 0 && $1 != -- ]]; do
    tee_args=("${tee_args[@]}" "$1")
    shift
done
shift
# now ${tee_args[*]} has the arguments before --,
# and $* has the arguments after --
# redirect standard out through a pipe to tee
exec > >(tee "${tee_args[@]}")
# do the *real* exec of the desired program
exec "$@"
(pipefail and $PIPESTATUS are nice, but I recall them being introduced in 3.1 or thereabouts.)
This is what I consider to be the best pure-Bourne-shell solution to use as the base upon which you could build your "eet":
# You want to pipe command1 through command2:
exec 4>&1
exitstatus=`{ { command1; echo $? 1>&3; } | command2 1>&4; } 3>&1`
# $exitstatus now has command1's exit status.
I think this is best explained from the inside out – command1 will execute and print its regular output on stdout (file descriptor 1), then once it's done, echo will execute and print command1's exit code on its stdout, but that stdout is redirected to file descriptor three.
While command1 is running, its stdout is being piped to command2 (echo's output never makes it to command2 because we send it to file descriptor 3 instead of 1, which is what the pipe reads). Then we redirect command2's output to file descriptor 4, so that it also stays out of file descriptor one – because we want file descriptor one clear for when we bring the echo output on file descriptor three back down into file descriptor one so that the command substitution (the backticks) can capture it.
The final bit of magic is that first exec 4>&1 we did as a separate command – it opens file descriptor four as a copy of the external shell's stdout. Command substitution will capture whatever is written on standard out from the perspective of the commands inside it – but, since command2's output is going to file descriptor four as far as the command substitution is concerned, the command substitution doesn't capture it – however, once it gets "out" of the command substitution, it is effectively still going to the script's overall file descriptor one.
(The exec 4>&1 has to be a separate command to work with many common shells. In some shells it works if you just put it on the same line as the variable assignment, after the closing backtick of the substitution.)
(I use compound commands ({ ... }) in my example, but subshells (( ... )) would also work. The subshell will just cause a redundant forking and awaiting of a child process, since each side of a pipe and the inside of a command substitution already normally implies a fork and await of a child process, and I don't know of any shell being coded to recognize that it can skip one of those forks because it's already done or is about to do the other.)
You can look at it in a less technical and more playful way, as if the outputs of the commands are leapfrogging each other: command1 pipes to command2, then the echo's output jumps over command2 so that command2 doesn't catch it, and then command2's output jumps over and out of the command substitution just as echo lands just in time to get captured by the substitution so that it ends up in the variable, and command2's output goes on its way to the standard output, just as in a normal pipe.
Also, as I understand it, at the end of this command, $? will still contain the return code of the second command in the pipe, because variable assignments, command substitutions, and compound commands are all effectively transparent to the return code of the command inside them, so the return status of command2 should get propagated out.
A caveat is that it is possible that command1 will at some point end up using file descriptors three or four, or that command2 or any of the later commands will use file descriptor four, so to be more hygienic, we would do:
exec 4>&1
exitstatus=`{ { command1 3>&-; echo $? 1>&3; } 4>&- | command2 1>&4; } 3>&1`
exec 4>&-
Commands inherit file descriptors from the process that launches them, so the entire second line will inherit file descriptor four, and the compound command followed by 3>&1 will inherit the file descriptor three. So the 4>&- makes sure that the inner compound command will not inherit file descriptor four, and the 3>&- makes sure that command1 will not inherit file descriptor three, so command1 gets a 'cleaner', more standard environment. You could also move the inner 4>&- next to the 3>&-, but I figure why not just limit its scope as much as possible.
Almost no programs use pre-opened file descriptors three and four directly, so you almost never have to worry about it, but the latter is probably best kept in mind for general-purpose cases.
mycommand --foo --bar 2>&1 | tee -a some.log; (exit "${PIPESTATUS[0]}")
KornShell, all in one line:
foo; RET_VAL=$?; if test ${RET_VAL} != 0;then echo $RET_VAL; echo Error occurred!>/tmp/out.err;exit 2;fi |tee >>/tmp/out.err ; if test ${RET_VAL} != 0;then exit $RET_VAL;fi
#!/bin/sh
logfile="$1"
shift
exec 2>&1
exec "$#" | tee "$logfile"
Hopefully this works for you.
Assuming Bash or Z shell (zsh),
my_command >>my_log 2>&1
N.B. The sequence of redirection and duplication of standard error onto standard output is significant!
I didn't realise you wanted to see the output on screen as well. This will of course direct all output to the file my_log.
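The reason the order matters is that each redirection is effectively a dup2(2) of whatever the target descriptor refers to at that moment. A small C sketch of the two orderings (my_log is the file name used above):
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int log_fd = open("my_log", O_WRONLY | O_CREAT | O_APPEND, 0666);
    if (log_fd < 0) { perror("open"); return 1; }

    /* my_command >>my_log 2>&1 :
     *   1) stdout -> my_log
     *   2) stderr -> a copy of the *new* stdout, i.e. my_log
     * so both streams end up in the file.                         */
    dup2(log_fd, STDOUT_FILENO);
    dup2(STDOUT_FILENO, STDERR_FILENO);

    /* my_command 2>&1 >>my_log would instead do
     *   1) stderr -> a copy of the *old* stdout (the terminal)
     *   2) stdout -> my_log
     * leaving stderr on the terminal rather than in the log.      */

    fprintf(stdout, "this line goes to my_log\n");
    fprintf(stderr, "and so does this one\n");
    close(log_fd);
    return 0;
}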

Resources