I need to compare two files column by column using unix shell, and store the difference in a resulting file.
For example, if column 1 of the 1st record of the 1st file matches column 1 of the 1st record of the 2nd file, then '=' will be stored in the resulting file against that column; but if there is any difference in the column values, both values need to be printed in the resulting file.
Below is the exact requirement.
File 1:
id code name place
123 abc Tom phoenix
345 xyz Harry seattle
675 kyt Romil newyork
File 2:
id code name place
123 pkt Rosy phoenix
345 xyz Harry seattle
421 uty Romil Sanjose
Expected resulting file:
id_1 id_2 code_1 code_2 name_1 name_2 place_1 place_2
= = abc pkt Tom Rosy = =
= = = = = = = =
675 421 kyt uty = = newyork Sanjose
Columns are tab delimited.
This is rather crudely coded, but shows a way to use awk to emit what you want, and can handle files of identical "schema" - not just the particular 4-field files you give as tests.
This approach uses pr to do a simple merge of the files: the same line of each input file is concatenated to present one line to the awk script.
The awk script assumes clean input and uses the fact that if a variable n has the value 2, then $n in the script is the same as $2. So the script walks through pairs of fields using the i and j variables. For your test input, fields 1 and 5, then 2 and 6, etc., are processed.
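As a quick illustration of that dynamic field access (this one-liner is only a demo, not part of the script):
$ echo 'a b c d' | awk '{ n = 2; print $n }'
b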
Only very limited testing of input is performed: mainly, that the implied schema of the two input files (the names of columns/fields) is the same.
#!/bin/sh
[ $# -eq 2 ] || { echo "Usage: ${0##*/} <file1> <file2>" 1>&2; exit 1; }
[ -r "$1" -a -r "$2" ] || { echo "$1 or $2: cannot read" 1>&2; exit 1; }
set -e
pr -s -t -m "$@" | \
awk '
{
offset = int(NF/2)
tab = ""
for (i = 1; i <= offset; i++) {
j = i + offset
if (NR == 1) {
if ($i != $j) {
printf "\nColumn name mismatch (%s/%s)\n", $i, $j > "/dev/stderr"
exit
}
printf "%s%s_1\t%s_2", tab, $i, $j
} else if ($i == $j) {
printf "%s=\t=", tab
} else {
printf "%s%s\t%s", tab, $i, $j
}
tab = "\t"
}
printf "\n"
}
'
Tested on Linux: GNU Awk 4.1.0 and pr (GNU coreutils) 8.21.
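If the script above is saved as, say, colcmp.sh (that name is my own), running it against the two sample files should produce the tab-separated result shown in the question:
$ chmod +x colcmp.sh
$ ./colcmp.sh file1 file2 > result.tsv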
My log file data is
[10/04/16 02:07:20 BST] Data 1
[11/04/16 02:07:20 BST] Data 1
[10/05/16 04:11:09 BST] Data 2
[12/05/16 04:11:09 BST] Data 2
[11/06/16 06:22:35 BST] Data 3
My input format is
./filename Apr 11 16 00:00:00 Jul 10 16 00:00:00
I am converting the input format to logfile format with the following function,
convert_date () {
local months=( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec )
local i
for (( i=0; i<12; i++ )); do
[[ $1 = ${months[$i]} ]] && break
done
printf "\[%2d\/%02d\/%02d $4 BST\]\n" $2 $(( i+1 )) $3
for (( i=0; i<12; i++ )); do
[[ $5 = ${months[$i]} ]] && break
done
}
And also I am storing the result in variable and using it
Start=$( convert_date $1 $2 $3 $4 )
End=$( convert_date $5 $6 $7 $8 )
But the code gives me results only if the start time and end time are present in the log file. How can I get the data between the two times even if the start and end times are not present in the log file? What awk script can I use?
Your Bash (assumed) function seems to output the date in the following format:
$ bash test.sh "Apr 11 16 00:00:00"
\[11\/04\/16 00:00:00 BST\]
Working with that, test.awk:
BEGIN {
FS="[[/: ]+"; # set field separator to all delimiters in datetime format in the date
split(start,arr,"[\\\\[/ :]") # split the start variable in pieces for reorganize
start=arr[4]" "arr[3]" "arr[2]" "arr[5]" "arr[6]" "arr[7] # reorganize
}
start <= $4" "$3" "$2" "$5" "$6" "$7 # compare reorganized data to date in start variable
$ awk -v start="\[11\/04\/16 00:00:00 BST\]" -f test.awk test.in
[11/04/16 02:07:20 BST] Data 1
[10/05/16 04:11:09 BST] Data 2
[12/05/16 04:11:09 BST] Data 2
[11/06/16 06:22:35 BST] Data 3
It complains a bit, though:
awk: warning: escape sequence '\[' treated as plain '['
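If you also need an upper bound (the question asks for the data between two times), the same reorganization can be applied to an end variable. A minimal, untested sketch along the same lines (the names a, b, logkey and end are my own additions):
BEGIN {
FS="[[/: ]+"
split(start, a, "[\\\\[/ :]"); start = a[4]" "a[3]" "a[2]" "a[5]" "a[6]" "a[7]
split(end, b, "[\\\\[/ :]"); end = b[4]" "b[3]" "b[2]" "b[5]" "b[6]" "b[7]
}
{ logkey = $4" "$3" "$2" "$5" "$6" "$7 }
logkey >= start && logkey <= end
which would then be run with both bounds, for example:
$ awk -v start="\[11\/04\/16 00:00:00 BST\]" -v end="\[12\/05\/16 23:59:59 BST\]" -f test.awk test.in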
The input format is in "Input.Format" file and the log file is in "Log.File". This shell script 'cat' both files and pipes to 'awk'. The awk script changes the month format to number an compares to log file to turn on or off a print switch.
#!/bin/sh
cat Input.Format Log.File | awk 'BEGIN {
Month = " JanFebMarAprMayJunJulAugSepOctNovDec"
} {
if (NR == 1) {
startm = index(Month, $2) / 3
if (length(startm) == 1) { startm = "0" startm }
startm = $4 startm $3
endm = index(Month, $6) / 3
if (length(endm) == 1) { endm = "0" endm }
endm = $8 endm $7
# print startm " " endm
}
else {
logdate = substr($1,8,2) substr($1,5,2) substr($1,2,2)
# print logdate
if (logdate >= startm ) { prtsw = 1 }
if (logdate > endm ) { prtsw = 0 }
if (prtsw == 1 ) { print $0 }
}
}'
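With the sample Input.Format line and Log.File from the question, the start boundary works out to 160411 and the end to 160710 (YYMMDD), so the script should print every log line except the first:
[11/04/16 02:07:20 BST] Data 1
[10/05/16 04:11:09 BST] Data 2
[12/05/16 04:11:09 BST] Data 2
[11/06/16 06:22:35 BST] Data 3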
I am trying to split a file into different smaller files depending on the value of the fifth field. A very nice way to do this was already suggested and also here.
However, I am trying to incorporate this into a .sh script for qsub, without much success.
The problem is that in the section where the file to which output the line is specified,
i.e., f = "Alignments_" $5 ".sam" print > f
, I need to pass a variable declared earlier in the script, which specifies the directory where the file should be written. I need to do this with a variable which is built for each task when I send out the array job for multiple files.
So say $output_path = ./Sample1
I need to write something like
f = $output_path "/Alignments_" $5 ".sam" print > f
But it does not seem to like having a $variable that is not a $field belonging to awk. I don't even think it likes having two "strings" before and after the $5.
The error I get back is that it takes the first line of the file to be split (little.sam), tries to use it as the name of f, followed by "/Alignments_" $5 ".sam" (those last three put together correctly), and says, naturally, that the name is too long.
How can I write this so it works?
Thanks!
awk -F '[:\t]' ' # read the list of numbers in Tile_Number_List
FNR == NR {
num[$1]
next
}
# process each line of the .BAM file
# any lines with an "unknown" $5 will be ignored
$5 in num {
f = "Alignments_" $5 ".sam" print > f
} ' Tile_Number_List.txt little.sam
Update, after adding -v to awk and declaring the variable opath
input=$1
outputBase=${input%.bam}
mkdir -v $outputBase\_TEST
newdir=$outputBase\_TEST
samtools view -h $input | awk 'NR >= 18' | awk -F '[\t:]' -v opath="$newdir" '
FNR == NR {
num[$1]
next
}
$5 in num {
f = newdir"/Alignments_"$5".sam";
print > f
} ' Tile_Number_List.txt -
mkdir: created directory 'little_TEST'
awk: cmd. line:10: (FILENAME=- FNR=1) fatal: can't redirect to `/Alignments_1101.sam' (Permission denied)
awk variables are like C variables - just reference them by name to get their value, no need to stick a "$" in front of them like you do with shell variables:
awk -F '[:\t]' ' # read the list of numbers in Tile_Number_List
FNR == NR {
num[$1]
next
}
# process each line of the .BAM file
# any lines with an "unknown" $5 will be ignored
$5 in num {
output_path = "./Sample1/"
f = output_path "Alignments_" $5 ".sam"
print > f
} ' Tile_Number_List.txt little.sam
To pass the value of the shell variable such as $output_path to awk you need to use the -v option.
$ output_path=./Sample1/
$ awk -F '[:\t]' -v opath="$output_path" '
# read the list of numbers in Tile_Number_List
FNR == NR {
num[$1]
next
}
# process each line of the .BAM file
# any lines with an "unknown" $5 will be ignored
$5 in num {
f = opath"Alignments_"$5".sam"
print > f
} ' Tile_Number_List.txt little.sam
Also you still have the error from your previous question left in your script
EDIT:
The awk variable created with -v is opath but you use newdir; what you want is:
input=$1
outputBase=${input%.bam}
mkdir -v $outputBase\_TEST
newdir=$outputBase\_TEST
samtools view -h "$input" | awk -F '[\t:]' -v opath="$newdir" '
FNR == NR {
num[$1]
next
}
FNR < 18 { next } # skip the SAM header lines coming from the samtools stream
$5 in num {
f = opath"/Alignments_"$5".sam" # <-- opath is the awk variable not newdir
print > f
}' Tile_Number_List.txt -
Note that the NR >= 18 header skip now lives inside the single awk script (the FNR < 18 rule above), so the separate awk 'NR >= 18' stage from your update is no longer needed.
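Putting it together, if the corrected script is saved as, say, split_by_tile.sh (the name is mine, not from the question), a run over the sample data should create one Alignments_<tile>.sam per tile number inside the new directory, along the lines of:
$ ./split_by_tile.sh little.bam
mkdir: created directory 'little_TEST'
$ ls little_TEST/
Alignments_1101.sam ...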
I am working on a UNIX box, and trying to run an application, which gives some debug logs to the standard output. I have redirected this output to a log file, but now wish to get the lines where the error is being shown.
My problem here is that a simple
cat output.log | grep FAIL
does not help out, as this shows only the lines which have FAIL in them. I want some more information along with this, like the 2-3 lines above the line with FAIL. Is there any way to do this via a simple shell command? I would like to have a single command line (it can have pipes) to do the above.
grep -C 3 FAIL output.log
Note that this also gets rid of the useless use of cat (UUOC).
grep -A $NUM
This will print $NUM lines of trailing context after matches.
-B $NUM prints leading context.
man grep is your best friend.
So in your case:
cat log | grep -A 3 -B 3 FAIL
I have two implementations of what I call sgrep, one in Perl, one using just pre-Perl (pre-GNU) standard Unix commands. If you've got GNU grep, you've no particular need of these. It would be more complex to deal with forwards and backwards context searches, but that might be a useful exercise.
Perl solution:
#!/usr/perl/v5.8.8/bin/perl -w
#
# @(#)$Id: sgrep.pl,v 1.6 2007/09/18 22:55:20 jleffler Exp $
#
# Perl-based SGREP (special grep) command
#
# Print lines around the line that matches (by default, 3 before and 3 after).
# By default, include file names if more than one file to search.
#
# Options:
# -b n1 Print n1 lines before match
# -f n2 Print n2 lines following match
# -n Print line numbers
# -h Do not print file names
# -H Do print file names
use strict;
use constant debug => 0;
use Getopt::Std;
my(%opts);
sub usage
{
print STDERR "Usage: $0 [-hnH] [-b n1] [-f n2] pattern [file ...]\n";
exit 1;
}
usage unless getopts('hnf:b:H', \%opts);
usage unless @ARGV >= 1;
if ($opts{h} && $opts{H})
{
print STDERR "$0: mutually exclusive options -h and -H specified\n";
exit 1;
}
my $op = shift;
print "# regex = $op\n" if debug;
# print file names if -h omitted and more than one argument
$opts{F} = (defined $opts{H} || (!defined $opts{h} and scalar @ARGV > 1)) ? 1 : 0;
$opts{n} = 0 unless defined $opts{n};
my $before = (defined $opts{b}) ? $opts{b} + 0 : 3;
my $after = (defined $opts{f}) ? $opts{f} + 0 : 3;
print "# before = $before; after = $after\n" if debug;
my @lines = (); # Accumulated lines
my $tail = 0; # Line number of last line in list
my $tbp_1 = 0; # First line to be printed
my $tbp_2 = 0; # Last line to be printed
# Print lines from @lines in the range $tbp_1 .. $tbp_2,
# leaving $leave lines in the array for future use.
sub print_leaving
{
my ($leave) = @_;
while (scalar(@lines) > $leave)
{
my $line = shift @lines;
my $curr = $tail - scalar(@lines);
if ($tbp_1 <= $curr && $curr <= $tbp_2)
{
print "$ARGV:" if $opts{F};
print "$curr:" if $opts{n};
print $line;
}
}
}
# General logic:
# Accumulate each line at end of @lines.
# ** If current line matches, record range that needs printing
# ** When the line array contains enough lines, pop line off front and,
# if it needs printing, print it.
# At end of file, empty line array, printing requisite accumulated lines.
while (<>)
{
# Add this line to the accumulated lines
push @lines, $_;
$tail = $.;
printf "# array: N = %d, last = $tail: %s", scalar(@lines), $_ if debug > 1;
if (m/$op/o)
{
# This line matches - set range to be printed
my $lo = $. - $before;
$tbp_1 = $lo if ($lo > $tbp_2);
$tbp_2 = $. + $after;
print "# $. MATCH: print range $tbp_1 .. $tbp_2\n" if debug;
}
# Print out any accumulated lines that need printing
# Leave $before lines in array.
print_leaving($before);
}
continue
{
if (eof)
{
# Print out any accumulated lines that need printing
print_leaving(0);
# Reset for next file
close ARGV;
$tbp_1 = 0;
$tbp_2 = 0;
$tail = 0;
@lines = ();
}
}
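A typical invocation of the Perl version, using the options documented in its header comment, might look like this (2 lines of context before and after each match, with line numbers):
$ sgrep.pl -n -b 2 -f 2 FAIL output.log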
Pre-Perl Unix solution (using plain ed, sed, and sort - though it uses getopt which was not necessarily available back then):
#!/bin/ksh
#
# @(#)$Id: old.sgrep.sh,v 1.5 2007/09/15 22:15:43 jleffler Exp $
#
# Special grep
# Finds a pattern and prints lines either side of the pattern
# Line numbers are always produced by ed (substitute for grep),
# which allows us to eliminate duplicate lines cleanly. If the
# user did not ask for numbers, these are then stripped out.
#
# BUG: if the pattern occurs in the first line or two and
# the number of lines to go back is larger than the line number,
# it fails dismally.
set -- `getopt "f:b:hn" "$@"`
case $# in
0) echo "Usage: $0 [-hn] [-f x] [-b y] pattern [files]" >&2
exit 1;;
esac
# Tab required - at least with sed (perl would be different)
# But then the whole problem would be different if implemented in Perl.
number="'s/^\\([0-9][0-9]*\\) /\\1:/'"
filename="'s%^%%'" # No-op for sed
f=3
b=3
nflag=no
hflag=no
while [ $# -gt 0 ]
do
case $1 in
-f) f=$2; shift 2;;
-b) b=$2; shift 2;;
-n) nflag=yes; shift;;
-h) hflag=yes; shift;;
--) shift; break;;
*) echo "Unknown option $1" >&2
exit 1;;
esac
done
pattern="${1:?'No pattern'}"
shift
case $# in
0) tmp=${TMPDIR:-/tmp}/`basename $0`.$$
trap "rm -f $tmp ; exit 1" 0
cat - >$tmp
set -- $tmp
sort="sort -t: -u +0n -1"
;;
*) filename="'s%^%'\$file:%"
sort="sort -t: -u +1n -2"
;;
esac
case $nflag in
yes) num_remove='s/[0-9][0-9]*://';;
no) num_remove='s/^//';;
esac
case $hflag in
yes) fileremove='s%^$file:%%';;
no) fileremove='s/^//';;
esac
for file in $*
do
echo "g/$pattern/.-${b},.+${f}n" |
ed - $file |
eval sed -e "$number" -e "$filename" |
$sort |
eval sed -e "$fileremove" -e "$num_remove"
done
rm -f $tmp
trap 0
exit 0
The shell version of sgrep was written in February 1989, and bug fixed in May 1989. It then remained unchanged except for an administrative change (SCCS to RCS transition) in 1997 until 2007, when I added the -h option. I switched to the Perl version in 2007.
http://thedailywtf.com/Articles/The_Complicator_0x27_s_Gloves.aspx
You can use sed to print specific lines; let's say you want line 20:
sed -n '20p' FILE_YOU_WANT_THE_LINE_FROM
Done.
-n prevents echoing lines from the file. The part in quotes is a sed rule to apply, it specifies that you want the rule to apply to line 20, and you want to print.
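If you want a few lines of context around a known line number, a range address works the same way; for example, lines 17 through 23:
sed -n '17,23p' FILE_YOU_WANT_THE_LINE_FROM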
With GNU grep on Windows:
$ grep --context 3 FAIL output.log
$ grep --help | grep context
-B, --before-context=NUM print NUM lines of leading context
-A, --after-context=NUM print NUM lines of trailing context
-C, --context=NUM print NUM lines of output context
-NUM same as --context=NUM
I need to do date arithmetic in Unix shell scripts that I use to control the execution of third party programs.
I'm using a function to increment a day and another to decrement:
IncrementaDia(){
echo $1 | awk '
BEGIN {
diasDelMes[1] = 31
diasDelMes[2] = 28
diasDelMes[3] = 31
diasDelMes[4] = 30
diasDelMes[5] = 31
diasDelMes[6] = 30
diasDelMes[7] = 31
diasDelMes[8] = 31
diasDelMes[9] = 30
diasDelMes[10] = 31
diasDelMes[11] = 30
diasDelMes[12] = 31
}
{
anio=substr($1,1,4)
mes=substr($1,5,2)
dia=substr($1,7,2)
if((anio % 4 == 0 && anio % 100 != 0) || anio % 400 == 0)
{
diasDelMes[2] = 29;
}
if( dia == diasDelMes[int(mes)] ) {
if( int(mes) == 12 ) {
anio = anio + 1
mes = 1
dia = 1
} else {
mes = mes + 1
dia = 1
}
} else {
dia = dia + 1
}
}
END {
printf("%04d%02d%02d", anio, mes, dia)
}
'
}
if [ $# -eq 1 ]; then
tomorrow=$1
else
today=$(date +"%Y%m%d")
tomorrow=$(IncrementaDia $today)
fi
but now I need to do more complex arithmetic.
What is the best and most compatible way to do this?
Assuming you have GNU date, you can do it like so:
date --date='1 days ago' '+%a'
And similar phrases.
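A few more phrases GNU date understands, as a sketch of what is possible:
date --date='2 weeks ago' '+%Y%m%d'
date --date='next monday' '+%a %d %b %Y'
date --date='2008-08-21 +3 days' '+%Y%m%d'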
Here is an easy way to do date computations in shell scripting.
meetingDate='12/31/2011' # MM/DD/YYYY Format
reminderDate=`date --date=$meetingDate'-1 day' +'%m/%d/%Y'`
echo $reminderDate
Below are more variations of date computation that can be achieved using the date utility.
http://www.cyberciti.biz/tips/linux-unix-get-yesterdays-tomorrows-date.html
http://www.cyberciti.biz/faq/linux-unix-formatting-dates-for-display/
This worked for me on RHEL.
I have written a bash script for converting dates expressed in English into conventional
mm/dd/yyyy dates. It is called ComputeDate.
Here are some examples of its use. For brevity I have placed the output of each invocation
on the same line as the invocation, separated by a colon (:). The quotes shown below are not necessary when running ComputeDate:
$ ComputeDate 'yesterday': 03/19/2010
$ ComputeDate 'yes': 03/19/2010
$ ComputeDate 'today': 03/20/2010
$ ComputeDate 'tod': 03/20/2010
$ ComputeDate 'now': 03/20/2010
$ ComputeDate 'tomorrow': 03/21/2010
$ ComputeDate 'tom': 03/21/2010
$ ComputeDate '10/29/32': 10/29/2032
$ ComputeDate 'October 29': 10/1/2029
$ ComputeDate 'October 29, 2010': 10/29/2010
$ ComputeDate 'this monday': 'this monday' has passed. Did you mean 'next monday?'
$ ComputeDate 'a week after today': 03/27/2010
$ ComputeDate 'this satu': 03/20/2010
$ ComputeDate 'next monday': 03/22/2010
$ ComputeDate 'next thur': 03/25/2010
$ ComputeDate 'mon in 2 weeks': 03/28/2010
$ ComputeDate 'the last day of the month': 03/31/2010
$ ComputeDate 'the last day of feb': 2/28/2010
$ ComputeDate 'the last day of feb 2000': 2/29/2000
$ ComputeDate '1 week from yesterday': 03/26/2010
$ ComputeDate '1 week from today': 03/27/2010
$ ComputeDate '1 week from tomorrow': 03/28/2010
$ ComputeDate '2 weeks from yesterday': 4/2/2010
$ ComputeDate '2 weeks from today': 4/3/2010
$ ComputeDate '2 weeks from tomorrow': 4/4/2010
$ ComputeDate '1 week after the last day of march': 4/7/2010
$ ComputeDate '1 week after next Thursday': 4/1/2010
$ ComputeDate '2 weeks after the last day of march': 4/14/2010
$ ComputeDate '2 weeks after 1 day after the last day of march': 4/15/2010
$ ComputeDate '1 day after the last day of march': 4/1/2010
$ ComputeDate '1 day after 1 day after 1 day after 1 day after today': 03/24/2010
I have included this script as an answer to this problem because it illustrates how
to do date arithmetic via a set of bash functions and these functions may prove useful
for others. It handles leap years and leap centuries correctly:
#! /bin/bash
# ConvertDate -- convert a human-readable date to a MM/DD/YY date
#
# Date ::= Month/Day/Year
# | Month/Day
# | DayOfWeek
# | [this|next] DayOfWeek
# | DayofWeek [of|in] [Number|next] weeks[s]
# | Number [day|week][s] from Date
# | the last day of the month
# | the last day of Month
#
# Month ::= January | February | March | April | May | ... | December
# January ::= jan | january | 1
# February ::= feb | february | 2
# ...
# December ::= dec | december | 12
# Day ::= 1 | 2 | ... | 31
# DayOfWeek ::= today | Sunday | Monday | Tuesday | ... | Saturday
# Sunday ::= sun*
# ...
# Saturday ::= sat*
#
# Number ::= Day | a
#
# Author: Larry Morell
if [ $# = 0 ]; then
printdirections $0
exit
fi
# Request the value of a variable
GetVar () {
Var=$1
echo -n "$Var= [${!Var}]: "
local X
read X
if [ ! -z $X ]; then
eval $Var="$X"
fi
}
IsLeapYear () {
local Year=$1
if [ $[20$Year % 4] -eq 0 ]; then
echo yes
else
echo no
fi
}
# AddToDate -- compute another date within the same year
DayNames=(mon tue wed thu fri sat sun ) # To correspond with 'date' output
Day2Int () {
ErrorFlag=
case $1 in
-e )
ErrorFlag=-e; shift
;;
esac
local dow=$1
n=0
while [ $n -lt 7 -a $dow != "${DayNames[n]}" ]; do
let n++
done
if [ -z "$ErrorFlag" -a $n -eq 7 ]; then
echo Cannot convert $dow to a numeric day of week
exit
fi
echo $[n+1]
}
Months=(31 28 31 30 31 30 31 31 30 31 30 31)
MonthNames=(jan feb mar apr may jun jul aug sep oct nov dec)
# Returns the month (1-12) from a date, or a month name
Month2Int () {
ErrorFlag=
case $1 in
-e )
ErrorFlag=-e; shift
;;
esac
M=$1
Month=${M%%/*} # Remove /...
case $Month in
[a-z]* )
Month=${Month:0:3}
M=0
while [ $M -lt 12 -a ${MonthNames[M]} != $Month ]; do
let M++
done
let M++
esac
if [ -z "$ErrorFlag" -a $M -gt 12 ]; then
echo "'$Month' Is not a valid month."
exit
fi
echo $M
}
# Retrieve month,day,year from a legal date
GetMonth() {
echo ${1%%/*}
}
GetDay() {
echo $1 | col / 2
}
GetYear() {
echo ${1##*/}
}
AddToDate() {
local Date=$1
local days=$2
local Month=`GetMonth $Date`
local Day=`echo $Date | col / 2` # Day of Date
local Year=`echo $Date | col / 3` # Year of Date
local LeapYear=`IsLeapYear $Year`
if [ $LeapYear = "yes" ]; then
let Months[1]++
fi
Day=$[Day+days]
while [ $Day -gt ${Months[$Month-1]} ]; do
Day=$[Day - ${Months[$Month-1]}]
let Month++
done
echo "$Month/$Day/$Year"
}
# Convert a date to normal form
NormalizeDate () {
Date=`echo "$*" | sed 'sX *X/Xg'`
local Day=`date +%d`
local Month=`date +%m`
local Year=`date +%Y`
#echo Normalizing Date=$Date > /dev/tty
case $Date in
*/*/* )
Month=`echo $Date | col / 1 `
Month=`Month2Int $Month`
Day=`echo $Date | col / 2`
Year=`echo $Date | col / 3`
;;
*/* )
Month=`echo $Date | col / 1 `
Month=`Month2Int $Month`
Day=1
Year=`echo $Date | col / 2 `
;;
[a-z]* ) # Better be a month or day of week
Exp=${Date:0:3}
case $Exp in
jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec )
Month=$Exp
Month=`Month2Int $Month`
Day=1
#Year stays the same
;;
mon|tue|wed|thu|fri|sat|sun )
# Compute the next such day
local DayOfWeek=`date +%u`
D=`Day2Int $Exp`
if [ $DayOfWeek -le $D ]; then
Date=`AddToDate $Month/$Day/$Year $[D-DayOfWeek]`
else
Date=`AddToDate $Month/$Day/$Year $[7+D-DayOfWeek]`
fi
# Reset Month/Day/Year
Month=`echo $Date | col / 1 `
Day=`echo $Date | col / 2`
Year=`echo $Date | col / 3`
;;
* ) echo "$Exp is not a valid month or day"
exit
;;
esac
;;
* ) echo "$Date is not a valid date"
exit
;;
esac
case $Day in
[0-9]* );; # Day must be numeric
* ) echo "$Date is not a valid date"
exit
;;
esac
case $Year in
[0-9][0-9][0-9][0-9] );; # Year must be 4 digits
[0-9][0-9] )
Year=20$Year
;;
esac
Date=$Month/$Day/$Year
echo $Date
}
# NormalizeDate jan
# NormalizeDate january
# NormalizeDate jan 2009
# NormalizeDate jan 22 1983
# NormalizeDate 1/22
# NormalizeDate 1 22
# NormalizeDate sat
# NormalizeDate sun
# NormalizeDate mon
ComputeExtension () {
local Date=$1; shift
local Month=`GetMonth $Date`
local Day=`echo $Date | col / 2`
local Year=`echo $Date | col / 3`
local ExtensionExp="$*"
case $ExtensionExp in
*w*d* ) # like 5 weeks 3 days or even 5w2d
ExtensionExp=`echo $ExtensionExp | sed 's/[a-z]/ /g'`
weeks=`echo $ExtensionExp | col 1`
days=`echo $ExtensionExp | col 2`
days=$[7*weeks+days]
Due=`AddToDate $Month/$Day/$Year $days`
;;
*d ) # Like 5 days or 5d
ExtensionExp=`echo $ExtensionExp | sed 's/[a-z]/ /g'`
days=$ExtensionExp
Due=`AddToDate $Month/$Day/$Year $days`
;;
* )
Due=$ExtensionExp
;;
esac
echo $Due
}
# Pop -- remove the first element from an array and shift left
Pop () {
Var=$1
eval "unset $Var[0]"
eval "$Var=(\${$Var[*]})"
}
ComputeDate () {
local Date=`NormalizeDate $1`; shift
local Expression=`echo $* | sed 's/^ *a /1 /;s/,/ /' | tr A-Z a-z `
local Exp=(`echo $Expression `)
local Token=$Exp # first one
local Ans=
#echo "Computing date for ${Exp[*]}" > /dev/tty
case $Token in
*/* ) # Regular date
M=`GetMonth $Token`
D=`GetDay $Token`
Y=`GetYear $Token`
if [ -z "$Y" ]; then
Y=$Year
elif [ ${#Y} -eq 2 ]; then
Y=20$Y
fi
Ans="$M/$D/$Y"
;;
yes* )
Ans=`AddToDate $Date -1`
;;
tod*|now )
Ans=$Date
;;
tom* )
Ans=`AddToDate $Date 1`
;;
the )
case $Expression in
*day*after* ) #the day after Date
Pop Exp; # Skip the
Pop Exp; # Skip day
Pop Exp; # Skip after
#echo Calling ComputeDate $Date ${Exp[*]} > /dev/tty
Date=`ComputeDate $Date ${Exp[*]}` #Recursive call
#echo "New date is " $Date > /dev/tty
Ans=`AddToDate $Date 1`
;;
*last*day*of*th*month|*end*of*th*month )
M=`date +%m`
Day=${Months[M-1]}
if [ $M -eq 2 -a `IsLeapYear $Year` = yes ]; then
let Day++
fi
Ans=$Month/$Day/$Year
;;
*last*day*of* )
D=${Expression##*of }
D=`NormalizeDate $D`
M=`GetMonth $D`
Y=`GetYear $D`
# echo M is $M > /dev/tty
Day=${Months[M-1]}
if [ $M -eq 2 -a `IsLeapYear $Y` = yes ]; then
let Day++
fi
Ans=$[M]/$Day/$Y
;;
* )
echo "Unknown expression: " $Expression
exit
;;
esac
;;
next* ) # next DayOfWeek
Pop Exp
dow=`Day2Int $DayOfWeek` # First 3 chars
tdow=`Day2Int ${Exp:0:3}` # First 3 chars
n=$[7-dow+tdow]
Ans=`AddToDate $Date $n`
;;
this* )
Pop Exp
dow=`Day2Int $DayOfWeek`
tdow=`Day2Int ${Exp:0:3}` # First 3 chars
if [ $dow -gt $tdow ]; then
echo "'this $Exp' has passed. Did you mean 'next $Exp?'"
exit
fi
n=$[tdow-dow]
Ans=`AddToDate $Date $n`
;;
[a-z]* ) # DayOfWeek ...
M=${Exp:0:3}
case $M in
jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec )
ND=`NormalizeDate ${Exp[*]}`
Ans=$ND
;;
mon|tue|wed|thu|fri|sat|sun )
dow=`Day2Int $DayOfWeek`
Ans=`NormalizeDate $Exp`
if [ ${#Exp[*]} -gt 1 ]; then # Just a DayOfWeek
#tdow=`GetDay $Exp` # First 3 chars
#if [ $dow -gt $tdow ]; then
#echo "'this $Exp' has passed. Did you mean 'next $Exp'?"
#exit
#fi
#n=$[tdow-dow]
#else # DayOfWeek in a future week
Pop Exp # toss monday
Pop Exp # toss in/off
if [ $Exp = next ]; then
Exp=2
fi
n=$[7*(Exp-1)] # number of weeks
n=$[n+7-dow+tdow]
Ans=`AddToDate $Date $n`
fi
;;
esac
;;
[0-9]* ) # Number weeks [from|after] Date
n=$Exp
Pop Exp;
case $Exp in
w* ) let n=7*n;;
esac
Pop Exp; Pop Exp
#echo Calling ComputeDate $Date ${Exp[*]} > /dev/tty
Date=`ComputeDate $Date ${Exp[*]}` #Recursive call
#echo "New date is " $Date > /dev/tty
Ans=`AddToDate $Date $n`
;;
esac
echo $Ans
}
Year=`date +%Y`
Month=`date +%m`
Day=`date +%d`
DayOfWeek=`date +%a |tr A-Z a-z`
Date="$Month/$Day/$Year"
ComputeDate $Date $*
This script makes extensive use of another script I wrote (called col ... many apologies to those who use the standard col supplied with Linux). This version of
col simplifies extracting columns from the stdin. Thus,
$ echo a b c d e | col 5 3 2
prints
e c b
Here is the col script:
#!/bin/sh
# col -- extract columns from a file
# Usage:
# col [-r] [c] col-1 col-2 ...
# where [c] if supplied defines the field separator
# where each col-i represents a column interpreted according to the presence of -r as follows:
# -r present : counting starts from the right end of the line
# -r absent : counting starts from the left side of the line
Separator=" "
Reverse=false
case "$1" in
-r ) Reverse=true; shift;
;;
[0-9]* )
;;
* )Separator="$1"; shift;
;;
esac
case "$1" in
-r ) Reverse=true; shift;
;;
[0-9]* )
;;
* )Separator="$1"; shift;
;;
esac
# Replace each col-i with $i
Cols=""
for f in $*
do
if [ $Reverse = true ]; then
Cols="$Cols \$(NF-$f+1),"
else
Cols="$Cols \$$f,"
fi
done
Cols=`echo "$Cols" | sed 's/,$//'`
#echo "Using column specifications of $Cols"
awk -F "$Separator" "{print $Cols}"
It also uses printdirections for printing out directions when the script is invoked improperly:
#!/bin/sh
#
# printdirections -- print header lines of a shell script
#
# Usage:
# printdirections path
# where
# path is a *full* path to the shell script in question
# beginning with '/'
#
# To use printdirections, you must include (as comments at the top
# of your shell script) documentation for running the shell script.
if [ $# -eq 0 -o "$*" = "-h" ]; then
printdirections $0
exit
fi
# Delete the command invocation at the top of the file, if any
# Delete from the place where printdirections occurs to the end of the file
# Remove the # comments
# There is a bizarre oddity here.
sed '/#!/d;/.*printdirections/,$d;/ *#/!d;s/# //;s/#//' $1 > /tmp/printdirections.$$
# Count the number of lines
numlines=`wc -l /tmp/printdirections.$$ | awk '{print $1}'`
# Remove the last line
numlines=`expr $numlines - 1`
head -n $numlines /tmp/printdirections.$$
rm /tmp/printdirections.$$
To use this, place the three scripts in files named ComputeDate, col, and printdirections, respectively. Place the files in a directory on your PATH, typically ~/bin. Then make them executable with:
$ chmod a+x ComputeDate col printdirections
Problems? Send me some email: morell AT cs.atu.edu. Place ComputeDate in the subject.
For BSD / OS X compatibility, you can also use the date utility with -j and -v to do date math. See the FreeBSD manpage for date. You could combine the previous Linux answers with this answer which might provide you with sufficient compatibility.
On BSD, as Linux, running date will give you the current date:
$ date
Wed 12 Nov 2014 13:36:00 AEDT
Now with BSD's date you can do math with -v, for example listing tomorrow's date (+1d is plus one day):
$ date -v +1d
Thu 13 Nov 2014 13:36:34 AEDT
You can use an existing date as the base, and optionally specify the parse format using strftime-style conversion specifiers, and make sure you use -j so you don't change your system date:
$ date -j -f "%a %b %d %H:%M:%S %Y %z" "Sat Aug 09 13:37:14 2014 +1100"
Sat 9 Aug 2014 12:37:14 AEST
And you can use this as the base of date calculations:
$ date -v +1d -f "%a %b %d %H:%M:%S %Y %z" "Sat Aug 09 13:37:14 2014 +1100"
Sun 10 Aug 2014 12:37:14 AEST
Note that -v implies -j.
Multiple adjustments can be provided sequentially:
$ date -v +1m -v -1w
Fri 5 Dec 2014 13:40:07 AEDT
See the manpage for more details.
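For example, printing the date a week ago in a custom format with BSD date (again adjusting relative to now with -v):
$ date -v -7d '+%Y-%m-%d'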
To do arithmetic with dates on UNIX you get the date as the number of seconds since the UNIX epoch, do some calculation, then convert back to your printable date format. The date command should be able to both give you the seconds since the epoch and convert from that number back to a printable date. My local date command does this,
% date -n
1219371462
% date 1219371462
Thu Aug 21 22:17:42 EDT 2008
%
See your local date(1) man page.
To increment a day add 86400 seconds.
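If your local date lacks such an option, GNU date can do the same round trip with %s for output and @ for input; a small sketch using the timestamp above:
$ start=$(date -d '2008-08-21 22:17:42' +%s)
$ date -d "@$((start + 86400))"   # one day later, e.g. Fri Aug 22 22:17:42 EDT 2008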
Why not write your scripts using a language like perl or python instead which more naturally supports complex date processing? Sure you can do it all in bash, but I think you will also get more consistency across platforms using python for example, so long as you can ensure that perl or python is installed.
I should add that it is quite easy to wire in python and perl scripts into a containing shell script.
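For instance, a one-line Perl call embedded in a shell script handles the increment-a-day case portably (POSIX strftime ships with Perl); note that naive 86400-second arithmetic can be off by an hour around DST changes:
tomorrow=$(perl -MPOSIX -e 'print strftime("%Y%m%d", localtime(time + 86400))')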
date --date='1 days ago' '+%a'
It's not a very compatible solution. It will work only on Linux. At least, it didn't work on AIX and Solaris.
It works in RHEL:
date --date='1 days ago' '+%Y%m%d'
20080807
I have bumped into this a couple of times. My thoughts are:
Date arithmetic is always a pain
It is a bit easier when using EPOCH date format
date on Linux converts to EPOCH, but not on Solaris
For a portable solution, you need to do one of the following:
Install GNU date on Solaris (already mentioned, needs human interaction to complete)
Use Perl for the date part (most Unix installs include Perl, so I would generally assume that this action does not require additional work).
A sample script (checks for the age of certain user files to see if the account can be deleted):
#!/usr/local/bin/perl
$today = time();
$user = $ARGV[0];
$command="awk -F: '/$user/ {print \$6}' /etc/passwd";
chomp ($user_dir = `$command`);
if ( -f "$user_dir/.sh_history" ) {
#file_dates = stat("$user_dir/.sh_history");
$sh_file_date = $file_dates[8];
} else {
$sh_file_date = 0;
}
if ( -f "$user_dir/.bash_history" ) {
#file_dates = stat("$user_dir/.bash_history");
$bash_file_date = $file_dates[8];
} else {
$bash_file_date = 0;
}
if ( $sh_file_date > $bash_file_date ) {
$file_date = $sh_file_date;
} else {
$file_date = $bash_file_date;
}
$difference = $today - $file_date;
if ( $difference >= 3888000 ) {
print "User needs to be disabled, 45 days old or older!\n";
exit (1);
} else {
print "OK\n";
exit (0);
}
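Assuming the script is saved as, say, check_last_login.pl (my name, not the author's), it takes the user name as its only argument and prints either OK or the warning:
$ perl check_last_login.pl jsmith
OK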
Looking into it further, I think you can simply use date.
I've tried the following on OpenBSD: I took the date of Feb. 29th 2008 and a random time (in the form of 0802291535) and added +1 to the day part, like so:
$ date -j 0802301535
Sat Mar 1 15:35:00 EST 2008
As you can see, date formatted the time correctly...
HTH
If you want to continue with awk, then the mktime and strftime functions are useful:
BEGIN { dateinit() }
{ newdate = daysadd(OldDate, DaysToAdd) }
# daynum: convert DD-MON-YYYY to day count
#-----------------------------------------
function daynum(date, d,m,y,i,n)
{
y=substr(date,8,4)
m=gmonths[toupper(substr(date,4,3))]
d=substr(date,1,2)
return mktime(y" "m" "d" 12 00 00")
}
#numday: convert day count to DD-MON-YYYY
#-------------------------------------------
function numday(n, y,m,d)
{
m=toupper(substr(strftime("%B",n),1,3))
return strftime("%d-"m"-%Y",n)
}
# daysadd: add (or subtract) days from date (DD-MON-YYYY), return new date (DD-MON-YYYY)
#------------------------------------------
function daysadd(date, days)
{
return numday(daynum(date)+(days*86400))
}
#init variables for date calcs
#-----------------------------------------
function dateinit( x,y,z)
{
# Stuff for date calcs
split("JAN:1,FEB:2,MAR:3,APR:4,MAY:5,JUN:6,JUL:7,AUG:8,SEP:9,OCT:10,NOV:11,DEC:12", z)
for (x in z)
{
split(z[x],y,":")
gmonths[y[1]]=y[2]
}
}
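As a rough usage sketch (the file name, the gawk requirement for mktime/strftime, and the extra print rule are my assumptions): put the functions above, the BEGIN { dateinit() } block and a rule such as { print daysadd(OldDate, DaysToAdd) } into datefuncs.awk, then:
$ echo | gawk -v OldDate='01-JAN-2010' -v DaysToAdd=45 -f datefuncs.awk
which should print 15-FEB-2010 in an English locale.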
The book "Shell Script Recipes: A Problem Solution Approach" (ISBN: 978-1-59059-471-1) by Chris F.A. Johnson has a date functions library that might be helpful. The source code is available at http://apress.com/book/downloadfile/2146 (the date functions are in Chapter08/data-funcs-sh within the tar file).
If the GNU version of date works for you, why don't you grab the source and compile it on AIX and Solaris?
http://www.gnu.org/software/coreutils/
In any case, the source ought to help you get the date arithmetic correct if you are going to write your own code.
As an aside, comments like "that solution is good but surely you can note it's not as good as can be. It seems nobody thought of tinkering with dates when constructing Unix." don't really get us anywhere. I found each one of the suggestions so far to be very useful and on target.
Here are my two pennies worth - a script wrapper making use of date and grep.
Example Usage
> sh ./datecalc.sh "2012-08-04 19:43:00" + 1s
2012-08-04 19:43:00 + 0d0h0m1s
2012-08-04 19:43:01
> sh ./datecalc.sh "2012-08-04 19:43:00" - 1s1m1h1d
2012-08-04 19:43:00 - 1d1h1m1s
2012-08-03 18:41:59
> sh ./datecalc.sh "2012-08-04 19:43:00" - 1d2d1h2h1m2m1s2sblahblah
2012-08-04 19:43:00 - 1d1h1m1s
2012-08-03 18:41:59
> sh ./datecalc.sh "2012-08-04 19:43:00" x 1d
Bad operator :-(
> sh ./datecalc.sh "2012-08-04 19:43:00"
Missing arguments :-(
> sh ./datecalc.sh gibberish + 1h
date: invalid date `gibberish'
Invalid date :-(
Script
#!/bin/sh
# Usage:
#
# datecalc "<date>" <operator> <period>
#
# <date> ::= see "man date", section "DATE STRING"
# <operator> ::= + | -
# <period> ::= INTEGER<unit> | INTEGER<unit><period>
# <unit> ::= s | m | h | d
if [ $# -lt 3 ]; then
echo "Missing arguments :-("
exit; fi
date=`eval "date -d \"$1\" +%s"`
if [ -z $date ]; then
echo "Invalid date :-("
exit; fi
if ! ([ $2 == "-" ] || [ $2 == "+" ]); then
echo "Bad operator :-("
exit; fi
op=$2
minute=$[60]
hour=$[$minute*$minute]
day=$[24*$hour]
s=`echo $3 | grep -oe '[0-9]*s' | grep -m 1 -oe '[0-9]*'`
m=`echo $3 | grep -oe '[0-9]*m' | grep -m 1 -oe '[0-9]*'`
h=`echo $3 | grep -oe '[0-9]*h' | grep -m 1 -oe '[0-9]*'`
d=`echo $3 | grep -oe '[0-9]*d' | grep -m 1 -oe '[0-9]*'`
if [ -z $s ]; then s=0; fi
if [ -z $m ]; then m=0; fi
if [ -z $h ]; then h=0; fi
if [ -z $d ]; then d=0; fi
ms=$[$m*$minute]
hs=$[$h*$hour]
ds=$[$d*$day]
sum=$[$s+$ms+$hs+$ds]
out=$[$date$op$sum]
formattedout=`eval "date -d #$out +\"%Y-%m-%d %H:%M:%S\""`
echo $1 $2 $d"d"$h"h"$m"m"$s"s"
echo $formattedout
This works for me:
TZ=GMT+6;
export TZ
mes=`date --date='2 days ago' '+%m'`
dia=`date --date='2 days ago' '+%d'`
anio=`date --date='2 days ago' '+%Y'`
hora=`date --date='2 days ago' '+%H'`
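If a single string is all you need, the same can be captured in one call, for example:
fecha=`TZ=GMT+6 date --date='2 days ago' '+%Y%m%d%H'`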