I have a shell script that calls a Perl script and an R script.
My shell script, R.sh:
#!/bin/bash
./R.pl #calling Perl script
`perl -lane 'print $F[0]' /media/data/abc.cnv > /media/data/abc1.txt`;
#Shell script
Rscript R.r #calling R script
This is my R.pl (head):-
`export path=$PATH:/media/exe_folder/bin`;
print "Enter the path to your input file:";
$base_dir ="/media/exe_folder";
chomp($CEL_dir = <STDIN>);
opendir (DIR, "$CEL_dir") or die "Couldn't open directory $CEL_dir";
$cel_files = "$CEL_dir"."/cel_files.txt";
open(CEL,">$cel_files")|| die "cannot open $file to write";
print CEL "cel_files\n";
for ( grep { /^[\w\d]/ } readdir DIR ){
print CEL "$CEL_dir"."/$_\n";
}close (CEL);
The output of the Perl script is the input for the shell step, and the shell step's output is the input for the R script.
I want to run the shell script by providing the input file name and the output file name, like this:
./R.sh home/folder/inputfile.txt home/folder2/output.txt
If the folder contains many files, it should process only the file the user specifies.
Is there a way to do this?
I guess this is what you want:
#!/bin/bash
# command line parameters
_input_file=$1
_output_file=$2
# #TODO: not sure if this path is the one you intended...
_script_path=$(dirname "$0")
# sanity checks
if [[ -z "${_input_file}" ]] ||
[[ -z "${_output_file}" ]]; then
echo 1>&2 "usage: $0 <input file> <output file>"
exit 1
fi
if [[ ! -r "${_input_file}" ]]; then
echo 1>&2 "ERROR: can't find input file '${_input_file}'!"
exit 1
fi
# process input file
# 1. with Perl script (writes to STDOUT)
# 2. post-process with Perl filter
# 3. run R script (reads from STDIN, writes to STDOUT)
perl "${_script_path}/R.pl" <"${_input_file}" | \
perl -lane 'print $F[0]' | \
Rscript "${_script_path}/R.r" >"${_output_file}"
exit 0
Please see the notes on how the called scripts should behave.
NOTE: I don't quite understand why you need to post-process the output of the Perl script with Perl filter. Why not integrate it directly into the Perl script itself?
BONUS CODE: this is how you would write the main loop in R.pl to act as proper filter, i.e. reading lines from STDIN and writing the result to STDOUT. You can use the same approach also in other languages, e.g. R.
#!/usr/bin/perl
use strict;
use warnings;
# read lines from STDIN
while (<STDIN>) {
chomp;
# add your processing code here that does something with $_, i.e. the line
# EXAMPLE: upper case first letter in all words on the line
s/\b([[:lower:]])/\U$1/g;
# write result to STDOUT
print "$_\n";
}
I'm trying to tell unix to print out the command line arguments passed to a Bourne Shell script, but it's not working. I get the value of x at the echo statement, and not the command line argument at the desired location.
This is what I want:
./run a b c d
a
b
c
d
this is what I get:
1
2
3
4
What's going on? I know that UNIX is confused about what I'm referencing in the shell script (the variable x or the command line argument at the x'th position). How can I clarify what I mean?
#!/bin/sh
x=1
until [ $x -gt $# ]
do
echo $x
x=`expr $x + 1`
done
EDIT: Thank you all for the responses, but now I have another question; what if you wanted to start counting not at the first argument, but at the second, or third? So, what would I do to tell UNIX to process elements starting at the second position, and ignore the first?
echo $*
$x is not the xth argument. It's the variable x, and x=`expr $x + 1` is like x++ in other languages.
The simplest change to your script to make it do what you asked is this:
#!/bin/sh
x=1
until [ $x -gt $# ]
do
eval "echo \${$x}"
x=`expr $x + 1`
done
HOWEVER (and this is a big however), using eval (especially on user input) is a huge security problem. A better way is to use shift and the first positional argument variable like this:
#!/bin/sh
while [ $# -gt 0 ]; do
x=$1
shift
echo "${x}"
done
If you want to start counting at the 2nd argument (bash syntax):
for i in "${@:2}"
do
echo "$i"
done
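The slice of the positional parameters used above is a bash extension. As a minimal sketch for plain sh, assuming you only need to drop leading arguments, you can shift them away and loop over what is left. The helper name print_from and its 1-based position argument are made up for illustration.

```shell
#!/bin/sh
# print_from N ARGS...: print every argument from position N onward,
# one per line. (Hypothetical helper; N is 1-based.)
print_from() {
    start=$1
    shift                        # drop N itself
    skip=$((start - 1))
    # Never shift more than we have; some shells treat that as an error.
    if [ "$skip" -gt "$#" ]; then
        skip=$#
    fi
    shift "$skip"
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

print_from 2 a b c d             # prints b, c and d on separate lines
```

Because the loop uses "$@", arguments containing spaces stay intact.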
A solution not using shift:
#!/bin/sh
for arg in "$@"; do
printf "%s " "$arg"
done
echo
What's the difference between $@ and $* in UNIX? When echoed in a script, they both seem to produce the same output.
Please see the bash man page under Special Parameters.
Special Parameters
The shell treats several parameters specially. These parameters may
only be referenced; assignment to them is not allowed.
* Expands to the positional parameters, starting from one. When
the expansion occurs within double quotes, it expands to a single
word with the value of each parameter separated by the first
character of the IFS special variable. That is, "$*" is equivalent
to "$1c$2c...", where c is the first character of the value
of the IFS variable. If IFS is unset, the parameters are separated
by spaces. If IFS is null, the parameters are joined
without intervening separators.
@ Expands to the positional parameters, starting from one. When
the expansion occurs within double quotes, each parameter
expands to a separate word. That is, "$@" is equivalent to "$1"
"$2" ... If the double-quoted expansion occurs within a word,
the expansion of the first parameter is joined with the beginning
part of the original word, and the expansion of the last
parameter is joined with the last part of the original word.
When there are no positional parameters, "$@" and $@ expand to
nothing (i.e., they are removed).
One difference is in how they handle the IFS variable on output.
#!/bin/sh
echo "unquoted asterisk " $*
echo "quoted asterisk $*"
echo "unquoted at " $@
echo "quoted at $@"
IFS="X"
echo "IFS is now $IFS"
echo "unquoted asterisk " $*
echo "quoted asterisk $*"
echo "unquoted at " $@
echo "quoted at $@"
If you run this like this: ./demo abc def ghi, you get this output:
unquoted asterisk abc def ghi
quoted asterisk abc def ghi
unquoted at abc def ghi
quoted at abc def ghi
IFS is now X
unquoted asterisk abc def ghi
quoted asterisk abcXdefXghi
unquoted at abc def ghi
quoted at abc def ghi
Notice that (only) the "quoted asterisk" line shows an X between each "word" after IFS is changed to "X". If the value of IFS contains multiple characters, only the first character is used for this purpose.
This feature can also be used for other arrays:
$ array=(123 456 789)
$ saveIFS=$IFS; IFS="|"
$ echo "${array[*]}"
123|456|789
$ IFS=$saveIFS
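The same joining behaviour works on the positional parameters in plain sh, without arrays. The following is only a sketch: the function name join_args is hypothetical, and it runs its body in a subshell so that the IFS change does not need to be saved and restored by hand.

```shell
#!/bin/sh
# join_args SEP ARGS...: join ARGS using the first character of SEP.
# (Hypothetical helper name.)
join_args() (
    # The function body is a subshell, so changing IFS here
    # does not leak into the caller.
    IFS=$1
    shift
    printf '%s\n' "$*"
)

join_args , abc def ghi      # prints abc,def,ghi
join_args '|' 123 456 789    # prints 123|456|789
```

Only the first character of the separator is used, which matches the rule quoted from the man page.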
for i in "$@"
do
echo $i # loop $# times
done
for i in "$*"
do
echo $i # loop 1 time
done
It's safer to use "$@" instead of $*. When you use multiword strings as arguments to a shell script, it's only "$@" that interprets each quoted argument as a separate argument.
As the output above suggests, if you use $*, the shell makes a wrong count of the arguments.
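One way to see the miscount is to forward the same arguments in the three different ways and let the receiver report how many it got. This is a small sketch; the function names are invented for the demonstration.

```shell
#!/bin/sh
# count_args: print how many arguments it received.
count_args() { echo "$#"; }

# Each wrapper forwards its own arguments in a different way.
forward_at()       { count_args "$@"; }   # one word per original argument
forward_star()     { count_args "$*"; }   # everything joined into one word
forward_unquoted() { count_args $*; }     # re-split on whitespace

forward_at       "a b" c    # prints 2
forward_star     "a b" c    # prints 1
forward_unquoted "a b" c    # prints 3
```

Only the "$@" form reports the two arguments that were actually passed.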
I ended up writing a quick little script for this in Python, but I was wondering if there was a utility you could feed text into which would prepend each line with some text -- in my specific case, a timestamp. Ideally, the use would be something like:
cat somefile.txt | prepend-timestamp
(Before you answer sed, I tried this:
cat somefile.txt | sed "s/^/`date`/"
But that only evaluates the date command once when sed is executed, so the same timestamp is incorrectly prepended to each line.)
ts from moreutils will prepend a timestamp to every line of input you give it. You can format it using strftime too.
$ echo 'foo bar baz' | ts
Mar 21 18:07:28 foo bar baz
$ echo 'blah blah blah' | ts '%F %T'
2012-03-21 18:07:30 blah blah blah
$
To install it:
sudo apt-get install moreutils
Could try using awk:
<command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }'
You may need to make sure that <command> produces line buffered output, i.e. it flushes its output stream after each line; the timestamp awk adds will be the time that the end of the line appeared on its input pipe.
If awk shows errors, then try gawk instead.
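If you would rather not guess which awk is installed, a wrapper can probe for strftime first and fall back to a plain shell loop. This is a rough sketch, not a tested portability layer: the function name ts_lines and the probe itself are assumptions, and the fallback starts one date process per line, so it is slower.

```shell
#!/bin/sh
# ts_lines: prefix each line of stdin with a timestamp.
# (Hypothetical helper; the strftime probe is an assumption about how
# to detect support, it is not part of the answer above.)
ts_lines() {
    if awk 'BEGIN { strftime("%Y") }' </dev/null >/dev/null 2>&1; then
        # This awk knows strftime: stamp and flush each line.
        awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush() }'
    else
        # Slower fallback: one date process per input line.
        while IFS= read -r line; do
            printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
        done
    fi
}

printf 'foo\nbar\n' | ts_lines
```

Usage is the same as in the answer: <command> | ts_lines.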
annotate, available via that link or as annotate-output in the Debian devscripts package.
$ echo -e "a\nb\nc" > lines
$ annotate-output cat lines
17:00:47 I: Started cat lines
17:00:47 O: a
17:00:47 O: b
17:00:47 O: c
17:00:47 I: Finished with exitcode 0
Distilling the given answers to the simplest one possible:
unbuffer $COMMAND | ts
On Ubuntu, they come from the expect-dev and moreutils packages.
sudo apt-get install expect-dev moreutils
How about this?
cat somefile.txt | perl -pne 'print scalar(localtime()), " ";'
Judging from your desire to get live timestamps, maybe you want to do live updating on a log file or something? Maybe
tail -f /path/to/log | perl -pne 'print scalar(localtime()), " ";' > /path/to/log-with-timestamps
Kieron's answer is the best one so far. If you have problems because the first program is buffering its output, you can use the unbuffer program:
unbuffer <command> | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; }'
It's installed by default on most Linux systems. If you need to build it yourself, it is part of the expect package:
http://expect.nist.gov
Just gonna throw this out there: there are a pair of utilities in daemontools called tai64n and tai64nlocal that are made for prepending timestamps to log messages.
Example:
cat file | tai64n | tai64nlocal
Use the read(1) command to read one line at a time from standard input, then output the line prepended with the date in the format of your choosing using date(1).
$ cat timestamp
#!/bin/sh
while IFS= read -r line
do
echo "$(date) $line"
done
$ cat somefile.txt | ./timestamp
I'm not a Unix guy, but I think you can use
gawk '{ print strftime("%d/%m/%y", systime()), $0 }' < somefile.txt
#! /bin/sh
unbuffer "$@" | perl -e '
use Time::HiRes qw(gettimeofday);
while (<>) {
    ($s, $us) = gettimeofday();   # seconds, microseconds
    printf "%d.%06d %s", $s, $us, $_;
}'
$ cat somefile.txt | sed "s/^/`date`/"
you can do this (with gnu/sed):
$ some-command | sed "x;s/.*/date +%T/e;G;s/\n/ /g"
example:
$ { echo 'line1'; sleep 2; echo 'line2'; } | sed "x;s/.*/date +%T/e;G;s/\n/ /g"
20:24:22 line1
20:24:24 line2
Of course, you can use other options of the date program. Just replace date +%T with what you need.
Here's my awk solution (from a Windows/XP system with MKS Tools installed in the C:\bin directory). It is designed to add the current date and time in the form mm/dd hh:mm to the beginning of each line having fetched that timestamp from the system as each line is read. You could, of course, use the BEGIN pattern to fetch the timestamp once and add that timestamp to each record (all the same). I did this to tag a log file that was being generated to stdout with the timestamp at the time the log message was generated.
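/"pattern"/ {
    "C\:\\\\bin\\\\date '+%m/%d %R'" | getline timestamp;
    # close the command so that it is run again for the next line
    close("C\:\\\\bin\\\\date '+%m/%d %R'");
    print timestamp, $0;
}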
where "pattern" is a string or regex (without the quotes) to be matched in the input line, and is optional if you wish to match all input lines.
This should work on Linux/UNIX systems as well, just get rid of the C\:\\bin\\ leaving the line
"date '+%m/%d %R'" | getline timestamp;
This, of course, assumes that the command "date" gets you to the standard Linux/UNIX date display/set command without specific path information (that is, your environment PATH variable is correctly configured).
Mixing some answers above from natevw and Frank Ch. Eigler.
It has microsecond resolution, performs better than calling an external date command each time, and perl can be found on most servers.
tail -f log | perl -pe '
use Time::HiRes qw(gettimeofday);
use POSIX qw(strftime);
($s, $us) = gettimeofday();   # seconds and microseconds
print strftime("%Y-%m-%dT%H:%M:%S", gmtime($s)), sprintf(".%06d ", $us);
'
Alternative version with flush and read in a loop:
tail -f log | perl -e '
use Time::HiRes qw(gettimeofday); use POSIX qw(strftime);
$| = 1;   # flush STDOUT after every print
while (<>) {
    ($s, $us) = gettimeofday();
    print strftime("%Y-%m-%dT%H:%M:%S", gmtime($s)), sprintf(".%06d ", $us), $_;
}'
caerwyn's answer can also be wrapped in a shell function, which avoids needing a separate script file (date is still run once per line):
timestamp(){
while IFS= read -r line
do
echo "$(date) $line"
done
}
echo testing 123 |timestamp
Disclaimer: the solution I am proposing is not a Unix built-in utility.
I faced a similar problem a few days ago. I did not like the syntax and limitations of the solutions above, so I quickly put together a program in Go to do the job for me.
You can check the tool here: preftime
There are prebuilt executables for Linux, MacOS, and Windows in the Releases section of the GitHub project.
The tool handles incomplete output lines and has (from my point of view) a more compact syntax.
<command> | preftime
It's not ideal, but I thought I'd share it in case it helps someone.
The other answers mostly work, but have some drawbacks. In particular:
1. Many require installing a command not commonly found on Linux systems, which may not be possible or convenient.
2. Since they use pipes, they don't put timestamps on stderr, and they lose the exit status.
3. If you use multiple pipes for stderr and stdout, then some do not have atomic printing, leading to intermingled lines of output like [timestamp] [timestamp] stdout line \nstderr line
4. Buffering can cause problems, and unbuffer requires an extra dependency.
To solve (4), we can use stdbuf -i0 -o0 -e0 which is generally available on most linux systems (see How to make output of any shell command unbuffered?).
To solve (3), you just need to be careful to print the entire line at a time.
Bad: ruby -pe 'print Time.now.strftime("[%Y-%m-%d %H:%M:%S] ")' (Prints the timestamp, then prints the contents of $_.)
Good: ruby -pe '$_ = Time.now.strftime("[%Y-%m-%d %H:%M:%S] ") + $_' (Alters $_, then prints it.)
To solve (2), we need to use multiple pipes and save the exit status:
alias tslines-pipe="stdbuf -i0 -o0 ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_'"
function tslines() (
stdbuf -o0 -e0 "$@" 2> >(tslines-pipe) > >(tslines-pipe)
status="$?"
exit $status
)
Then you can run a command with tslines some command --options.
This almost works, except sometimes one of the pipes takes slightly longer to exit and the tslines function has exited, so the next prompt has printed. For example, this command seems to print all the output after the prompt for the next line has appeared, which can be a bit confusing:
tslines bash -c '(for (( i=1; i<=20; i++ )); do echo stderr 1>&2; echo stdout; done)'
There needs to be some coordination method between the two pipe processes and the tslines function. There are presumably many ways to do this. One way I found is to have the pipes send some lines to a pipe that the main function can listen to, and only exit after it's received data from both pipe handlers. Putting that together:
alias tslines-pipe="stdbuf -i0 -o0 ruby -pe '\$_ = Time.now.strftime(\"[%Y-%m-%d %H:%M:%S] \") + \$_'"
function tslines() (
# Pick a random name for the pipe to prevent collisions.
pipe="/tmp/pipe-$RANDOM"
# Ensure the pipe gets deleted when the method exits.
trap "rm -f $pipe" EXIT
# Create the pipe. See https://www.linuxjournal.com/content/using-named-pipes-fifos-bash
mkfifo "$pipe"
# echo will block until the pipe is read.
stdbuf -o0 -e0 "$@" 2> >(tslines-pipe; echo "done" >> "$pipe") > >(tslines-pipe; echo "done" >> "$pipe")
status="$?"
# Wait until we've received data from both pipe commands before exiting.
linecount=0
while [[ $linecount -lt 2 ]]; do
read line
if [[ "$line" == "done" ]]; then
((linecount++))
fi
done < "$pipe"
exit $status
)
That synchronization mechanism feels a bit convoluted; hopefully there's a simpler way to do it.
Doing it with date, tr, and xargs on OSX:
alias predate="xargs -I{} sh -c 'date +\"%Y-%m-%d %H:%M:%S\" | tr \"\n\" \" \"; echo \"{}\"'"
<command> | predate
if you want milliseconds:
alias predate="xargs -I{} sh -c 'date +\"%Y-%m-%d %H:%M:%S.%3N\" | tr \"\n\" \" \"; echo \"{}\"'"
but note that on OSX, date doesn't give you the %N option, so you'll need to install gdate (brew install coreutils) and so finally arrive at this:
alias predate="xargs -I{} sh -c 'gdate +\"%Y-%m-%d %H:%M:%S.%3N\" | tr \"\n\" \" \"; echo \"{}\"'"
No need to specify all the parameters in strftime() unless you really want to customize the output format:
echo "abc 123 xyz\njan 765 feb" \
\
| gawk -Sbe 'BEGIN {_=strftime()" "} sub("^",_)'
Sat Apr 9 13:14:53 EDT 2022 abc 123 xyz
Sat Apr 9 13:14:53 EDT 2022 jan 765 feb
works the same if you have mawk 1.3.4. Even on awk-variants without the time features, a quick getline could emulate it :
echo "abc 123 xyz\njan 765 feb" \
\
| mawk2 'BEGIN { (__="date")|getline _;
close(__)
_=_" " } sub("^",_)'
Sat Apr 9 13:19:38 EDT 2022 abc 123 xyz
Sat Apr 9 13:19:38 EDT 2022 jan 765 feb
If you wanna skip all that getline and BEGIN { }, then something like this :
mawk2 'sub("^",_" ")' \_="$(date)"
If the value you are prepending is the same on every line, fire up emacs with the file, then:
Ctrl + <space>
at the beginning of the file (to mark that spot), then scroll down to the beginning of the last line (Alt + > will go to the end of file... which probably will involve the Shift key too, then Ctrl + a to go to the beginning of that line) and:
Ctrl + x r t
Which is the command to insert at the rectangle you just specified (a rectangle of 0 width).
2008-8-21 6:45PM <enter>
Or whatever you want to prepend... then you will see that text prepended to every line within the 0 width rectangle.
UPDATE: I just realized you don't want the SAME date, so this won't work... though you may be able to do this in emacs with a slightly more complicated custom macro, but still, this kind of rectangle editing is pretty nice to know about...