To improve Calculate Number of Days Command - unix

I would like to generate a report that calculates the number of days each material has been in the warehouse. The number of days is the difference between the date the material came in (field $3) and a manually supplied reference date (01 OCT 2014).
Input.csv
Des11,Material,DateIN,Des22,Des33,MRP,Des44,Des55,Des66,Location,Des77,Des88
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj
aa,yyy,05-FEB-14.09:02:09,cc,dd,x20,ee,ff,gg,YY250,hh,jj
aa,yyy,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj
Currently I am using the command below to populate Ageing - No of days in field $13 (thanks to gboffi):
awk -F, 'NR>0 {date=$3;
gsub("[-.]"," ",date);
printf $0 ",";system("date --date=\"" date "\" +%s")}
' Input.csv | awk -F, -v OFS=, -v now=`date --date="01 OCT 2014 " +%s` '
NR>0 {$13=now-$13; $13=$13/24/3600;print $0}' >Op_Step11.csv
While using the above command in Cygwin (Windows), it takes 50 minutes for 100,000 (1 lakh) rows of sample input.
Since my actual input file contains 25 million rows, it seems the script will take a couple of days.
Looking for your suggestions to improve the command!
Expected Output:
Des11,Material,DateIN,Des22,Des33,MRP,Des44,Des55,Des66,Location,Des77,Des88,Ageing-NoOfDays
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj,42.6611
aa,xxx,19-AUG-14.08:08:01,cc,dd,x20,ee,ff,gg,XX128,hh,jj,42.6611
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj,109.621
aa,yyy,13-JUN-14.09:06:08,cc,dd,x20,ee,ff,gg,XX128,hh,jj,109.621
aa,yyy,05-FEB-14.09:02:09,cc,dd,x20,ee,ff,gg,YY250,hh,jj,237.624
aa,yyy,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj,237.624
aa,zzz,05-FEB-14.09:02:09,cc,dd,y35,ee,ff,gg,YY250,hh,jj,237.624
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj,476.787
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj,476.787
aa,zzz,11-JUN-13.05:06:17,cc,dd,y35,ee,ff,gg,YY250,hh,jj,476.787
I don't have access to change the input format, and I don't have Perl or Python available.
Update#3:
BEGIN{ FS = OFS = "," }
{
    t1 = $3
    t2 = "01-OCT-14.00:00:00"
    print $0, (cvttime(t2) - cvttime(t1)) / 24 / 3600
}
function cvttime(t,   a) {
    split(t, a, "[-.:]")
    match("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC", a[2])
    a[2] = sprintf("%02d", (RSTART + 2) / 3)
    return mktime("20" a[3] " " a[2] " " a[1] " " a[4] " " a[5] " " a[6])
}

Since you are on Cygwin you are using GNU awk, which has its own built-in time functions, so you do not need to shell out to the date command at all. Just tweak this old command I had lying around to suit your input and output format:
function cvttime(t,   a) {
    split(t, a, "[/:]")
    match("JanFebMarAprMayJunJulAugSepOctNovDec", a[2])
    a[2] = sprintf("%02d", (RSTART + 2) / 3)
    return mktime(a[3] " " a[2] " " a[1] " " a[4] " " a[5] " " a[6])
}
BEGIN{
    t1 = "01/Dec/2005:00:04:42"
    t2 = "01/Dec/2005:17:14:12"
    print cvttime(t2) - cvttime(t1)
}
It uses GNU awk for time functions, see http://www.gnu.org/software/gawk/manual/gawk.html#Time-Functions

Here is an example in Perl:
use feature qw(say);
use strict;
use warnings;
use Text::CSV;
use Time::Piece;

my $csv = Text::CSV->new;
my $te = Time::Piece->strptime('01-OCT-14', '%d-%b-%y');
my $fn = 'Input.csv';
open (my $fh, '<', $fn) or die "Could not open file '$fn': $!\n";
chomp(my $head = <$fh>);
say "$head,Ageing-NoOfDays";
while (my $line = <$fh>) {
    chomp $line;
    if ($csv->parse($line)) {
        my $t = ($csv->fields())[2];
        my $tp = Time::Piece->strptime($t, '%d-%b-%y.%T');
        my $s = $te - $tp;
        say "$line," . $s->days;
    } else {
        warn "Line could not be parsed: $line\n";
    }
}
close($fh);

Related

AWK print unexpected newline or end of string inside shell

I have a shell script which is trying to trim characters from the end of each line, but I always get an error.
Shell Script:
AWK_EXPRESSION='{if(length>'"$RANGE1"'){ print substr('"$0 "',0, length-'"$RANGE2"'}) } else { print '"$0 "'} }'
for report in ${ACTUAL_TARGET_FOLDER}/* ; do
awk $AWK_EXPRESSION $report > $target_file
done
If I trigger the AWK command, I get unexpected newline or end of string near print.
What am I missing?
Why are you trying to store the awk body in a shell variable? Just use awk and the -v option to pass a shell value into an awk variable:
awk -v range1="$RANGE1" -v range2="$RANGE2" '{
if (length > range1) {
print substr($0,0, length-range2)
} else {
print
}
}' "$ACTUAL_TARGET_FOLDER"/* > "$target_file"
Add a few newlines to help readability.
Get out of the habit of using ALLCAPS variable names, leave those as reserved by the shell. One day you'll write PATH=something and then wonder why your script is broken.
Unquoted variables are subject to word splitting and glob expansion. Use double quotes for all your variables unless you know what specific side-effect you want to use.
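A quick illustration of that word-splitting point (the variable f is hypothetical, just for the demo):

```shell
f='a  b'
printf '%s\n' $f      # unquoted: split on whitespace into two words -> "a" and "b" on separate lines
printf '%s\n' "$f"    # quoted: passed as one argument -> "a  b" on a single line
```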
I would recommend writing the AWK program using AWK variables instead of interpolating variables into it from the shell. You can pass variables into awk on the command line using the -v option.
Also, awk permits using white space to make the program readable, just like other programming languages. Like this:
AWK_EXPRESSION='{
if (length > RANGE1) {
print substr($0, 1, length-RANGE2)
} else {
print
}
}'
for report in "${ACTUAL_TARGET_FOLDER}"/* ; do
awk -v RANGE1="$RANGE1" -v RANGE2="$RANGE2" "$AWK_EXPRESSION" "$report" > "$target_file"
done

Move text up a line in Unix

I have a text file containing a suite of jobs that look like the one below. What I would like to do is move the JOB_NAME onto the delete_job: line, so each job looks like:
delete_job: JOB_NAME
I have tried many ways but can't get it to work! Any ideas how this can be achieved?
delete_job:
JOB_NAME
job_type: BOX
description: "*******"
date_conditions: 0
owner: *******
alarm_if_fail: 1
EDIT2: As per Ed's suggestion, adding one more solution now.
awk 'val{print val OFS $0;val="";next}/delete_job/||/update_job/||/insert_job/{val=$0;next} 1' Input_file
EDIT: As the OP said there could be many strings to search for, changing the solution accordingly.
awk '/delete_job/||/update_job/||/insert_job/{val=$0;getline;print val OFS $0;next} 1' Input_file
Could you please try the following and let me know if this helps.
awk '/delete_job/{val=$0;getline;print val OFS $0;next} 1' Input_file
With good old ed:
ed infile <<'EOE'
g/^delete_job:$/ s/$/ /\
.,+j
wq
EOE
This collects all lines that match ^delete_job:$ with the g// global command; s/$/ / appends a space to that line, and .,+j joins it with the next line before wq writes the buffer back to the file and exits.
A sed solution:
sed '/^delete_job:$/{N;s/\n/ /;}' filename
The right way to do this with awk (assuming multiple possible "action_job" lines) is to just save the "action_job" line when you see it and then print it as a prefix when you're printing the next line:
awk '/(delete|update|insert)_job/{act=$0 OFS; next} {print act $0; act=""}'
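For instance, on a trimmed-down version of the sample job above, this joins the action line with the following line and passes everything else through unchanged:

```shell
printf '%s\n' 'delete_job:' 'JOB_NAME' 'job_type: BOX' |
awk '/(delete|update|insert)_job/{act=$0 OFS; next} {print act $0; act=""}'
# -> delete_job: JOB_NAME
#    job_type: BOX
```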
I'd use awk, perhaps something like
<yourfile awk '
BEGIN { output = 1 }
/delete_job:/ { output = 0; next; }
output == 0 { print "delete_job: " $0; output = 1; next; }
output == 1 { print }' > newfile
Maybe that helps. The result will be written to newfile.
two other awks
$ awk '/^delete_job:/{printf "%s", $0 OFS; next}1' file
or
$ awk '{ORS=/^delete_job:/?OFS:RS}1' file
With paste, head and tail:
$ (head -n2 | paste -d' ' -s; tail -n+1) < filename.txt

Replace a string which is present on first line in UNIX file

I would like to replace a string which is present on the first line, even though it also appears on the rest of the lines in the file. How can I do that through a shell script? Can someone help me with this? My code is below. I am extracting the first line from the file, but after that I am not sure how to do the replacement. Any help would be appreciated. Thanks.
I would like to replace a string present in $line and write the new line back into the same file at the same place.
Code:
while read line
do
    if [[ $v_counter == 0 ]]; then
        echo "$line"
        v_counter=$(($v_counter + 1));
    fi
done < "$v_Full_File_Nm"
Sample data:
Input
BUXT_CMPID|MEDICAL_RECORD_NUM|FACILITY_ID|PATIENT_LAST_NAME|PATIENT_FIRST_NAME|HOME_ADDRESS_LINE_1|HOME_ADDRESS_LINE_2|HOME_CITY|HOME_STATE|HOME_ZIP|MOSAIC_CODE|MOSAIC_DESC|DRIVE_TIME| buxt_pt_apnd_20140624_head_5records.txt
100106086|5000120878|7141|HARRIS|NEDRA|6246 PARALLEL PKWY||KANSAS CITY|KS|66102|S71|Tough Times|2|buxt_pt_apnd_20140624_head_5records.txt
Output
BUXT_CMPID|MEDICAL_RECORD_NUM|FACILITY_ID|PATIENT_LAST_NAME|PATIENT_FIRST_NAME|HOME_ADDRESS_LINE_1|HOME_ADDRESS_LINE_2|HOME_CITY|HOME_STATE|HOME_ZIP|MOSAIC_CODE|MOSAIC_DESC|DRIVE_TIME| SRC_FILE_NM
100106086|5000120878|7141|HARRIS|NEDRA|6246 PARALLEL PKWY||KANSAS CITY|KS|66102|S71|Tough Times|2|buxt_pt_apnd_20140624_head_5records.txt
From the above sample data I need to replace buxt_pt_apnd_20140624_head_5records.txt with the string SRC_FILE_NM on the first line.
Why not use sed?
sed -e '1s/fred/frog/' yourfile
will replace fred with frog on line 1.
If your 'string' is a variable, you can do this to get the variable expanded:
sed -e "1s/$varA/$varB/" yourfile
If you want to do it in place and change your file, add -i before -e.
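A minimal demonstration (the file name and variable contents are stand-ins):

```shell
printf 'fred was here\nfred again\n' > yourfile
varA=fred
varB=frog
sed -e "1s/$varA/$varB/" yourfile
# -> frog was here
#    fred again
```

Only the first line is touched; the second fred survives because of the 1 address.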
awk -v old="string1" -v new="string2" '
NR==1 && (idx=index($0,old)) {
$0 = substr($0,1,idx-1) new substr($0,idx+length(old))
}
1' file > /usr/tmp/tmp$$ && mv /usr/tmp/tmp$$ file
The above will replace string1 with string2 only when it appears in the first line of file.
Any solution posted that uses awk but does not use index will not work in general; the same goes for any solution posted that uses sed. The reason is that those operate on REs, not strings, and so behave undesirably for string replacement depending on which characters are present in string1.
It looks like the OP went with a sed RE-replacement solution, so this is just for anyone else looking to replace a string. Here's what a string-replacement function would look like if you'd rather not have it inline:
awk -v old="string1" -v new="string2" '
function strsub(old,new,tgt, idx) {
if ( idx = index(tgt,old) ) {
tgt = substr(tgt,1,idx-1) new substr(tgt,idx+length(old))
}
return tgt
}
NR==1 { $0 = strsub(old,new,$0) }
1' file
A bash solution:
file="afile.txt"
str="hello"
repl="goodbye"
IFS= read -r line < "$file"
line=${line/$str/$repl}
tmpfile="/usr/tmp/$file.$$.tmp"
{
echo "$line"
tail -n+2 "$file"
} > "$tmpfile" && mv "$tmpfile" "$file"
Note that $str above will be interpreted as a "pattern" (a simple kind of regex) where * matches any number of any characters, ? matches any single character, [abc] matches any one of the characters in the brackets, and [^abc] (or [!abc]) matches any one character not in the brackets. See Pattern-Matching
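If you want $str treated literally even when it contains glob characters, quote the pattern side of the expansion. A small bash sketch (the variables are hypothetical):

```shell
line='price is 10* today'
str='10*'
repl='ten'
echo "${line/$str/$repl}"      # unquoted pattern: * globs to end of string -> "price is ten"
echo "${line/"$str"/$repl}"    # quoted pattern: matches the literal "10*" -> "price is ten today"
```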

Converting dates in AWK

I have a file containing many columns of text, including a timestamp along the lines of Fri Jan 02 18:23 and I need to convert that date into MM/DD/YYYY HH:MM format.
I have been trying to use the standard date tool with awk's getline to do the conversion, but I can't quite figure out how to pass the fields into the date command in the format it expects (quoted with " or '), as getline needs the command string enclosed in quotes too.
Something like "date -d '$1 $2 $3 $4' +'%D %H:%M'" | getline var
Now that I think about it, I guess what I'm really asking is how to embed awk variables into a string.
If you're using gawk, you don't need the external date which can be expensive to call repeatedly:
awk '
BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
for(o=1;o<=m;o++){
months[d[o]]=sprintf("%02d",o)
}
format = "%m/%d/%Y %H:%M"
}
{
split($4,time,":")
date = (strftime("%Y") " " months[$2] " " $3 " " time[1] " " time[2] " 0")
print strftime(format, mktime(date))
}'
Thanks to ghostdog74 for the months array from this answer.
You can try this, assuming just the date you specified is in the file:
awk '
{
cmd ="date \"+%m/%d/%Y %H:%M\" -d \""$1" "$2" "$3" "$4"\""
cmd | getline var
print var
close(cmd)
}' file
output
$ ./shell.sh
01/02/2010 18:23
And if you are not using GNU tools, for example if you are on Solaris, use nawk:
nawk 'BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
for(o=1;o<=m;o++){
months[d[o]]=sprintf("%02d",o)
}
cmd="date +%Y"
cmd|getline yr
close(cmd)
}
{
day=$3
mth=months[$2]
print mth"/"day"/"yr" "$4
} ' file
I had a similar issue converting a date from RRDtool databases using rrdfetch, but I prefer one-liners, which I've been using since Apollo computer days.
Data looked like this:
localTemp rs1Temp rs2Temp thermostatMode
1547123400: 5.2788174937e+00 4.7788174937e+00 -8.7777777778e+00 2.0000000000e+00
1547123460: 5.1687014581e+00 4.7777777778e+00 -8.7777777778e+00 2.0000000000e+00
One liner:
rrdtool fetch -s -14400 thermostatDaily.rrd MAX | sed s/://g | awk '{print "echo ""\`date -r" $1,"\`" " " $2 }' | sh
Result:
Thu Jan 10 07:25:00 EST 2019 5.3373432378e+00
Thu Jan 10 07:26:00 EST 2019 5.2788174937e+00
On the face of it this doesn't look very efficient to me, but this kind of methodology has always proven to be fairly low-overhead under most circumstances, even for very large files on very low-power computers (like 25 MHz NeXT machines). Yes, MHz.
Sed deletes the colon, awk is used to print the various other commands of interest, including just echoing the awk variables, and sh or bash executes the resulting string.
For large files or streams I just head the first few lines and gradually build up the one-liner. Throw-away code.

crontab report of what runs in a specified start and end datetime

Are there any tools or reports out there that given a crontab file can output which jobs run within a specified time-frame.
Our crontab file has become very large and our system administrators struggle to find out which jobs need to be rerun when we have scheduled downtime on the server. We're trying to figure out which jobs we need to run.
I was planning on writing my own script, but am wondering if there is something out there already.
One thing you can do is:
Get Perl module Schedule::Cron
Modify it to sleep only optionally (create a "fast-forward" mode: where it does sleep($sleep), change it to do nothing when fast-forwarding; this will also require changing the $now = time; call to $now++).
Modify it to be able to indicate start and end times for the emulation.
Create a Perl one-liner which takes the output of crontab -l and converts it into a similar crontab, but one which replaces the command cmd1 arg1 arg2 with a Perl subroutine sub { print "Execution: cmd1 arg1 arg2\n" }
Run the scheduler in the fast-forward mode, as indicated in the POD.
It will read in your modified crontab, and emulate the execution.
There is a fine and clean solution for a 'simulation mode' of Schedule::Cron (and for any other module using sleep, time, or alarm internally) without modifying Schedule::Cron itself. You can use Time::Mock for throttling, e.g. with
perl -MTime::Mock=throttle,600 schedule.pl
one can speed up your 'time machine' by a factor of 600 (so, instead of sleeping for 10 minutes it will only sleep a second). Please refer to the manpage of Time::Mock for more details.
For using a crontab file directly with Schedule::Cron you should be able to take the example from the README directly:
use Schedule::Cron;
my $cron = new Schedule::Cron(sub { system(shift) },
file => "/var/spool/crontab.perl");
$cron->run();
The trick here is to use a default dispatcher method which calls system() with the stored parameters. Please let me know, whether this will work for you or whether it will need to be fixed. Instead of system, you could use print as well, of course.
Here's a similar approach to DVK's but using Perl module Schedule::Cron::Events.
This is very much a "caveat user" posting - a starting point. Given this crontab file a_crontab.txt:
59 21 * * 1-5 ls >> $HOME/work/stack_overflow/cron_ls.txt
# A comment
18 09 * * 1-5 echo "wibble"
The below script cron.pl, run as follows, gives:
$ perl cron.pl a_crontab.txt "2009/11/09 00:00:00" "2009/11/12 00:00:00"
2009/11/09 09:18:00 "echo "wibble""
2009/11/09 21:59:00 "ls >> $HOME/work/stack_overflow/cron_ls.txt"
2009/11/10 09:18:00 "echo "wibble""
2009/11/10 21:59:00 "ls >> $HOME/work/stack_overflow/cron_ls.txt"
2009/11/11 09:18:00 "echo "wibble""
2009/11/11 21:59:00 "ls >> $HOME/work/stack_overflow/cron_ls.txt"
2009/11/12 09:18:00 "echo "wibble""
2009/11/12 21:59:00 "ls >> $HOME/work/stack_overflow/cron_ls.txt"
Prototype (!) script:
use strict;
use warnings;
use Schedule::Cron::Events;

my $crontab_file = shift || die "! Must provide crontab file name";
my $start_time   = shift || die "! Must provide start time YYYY/MM/DD HH:MM:SS";
my $stop_time    = shift || die "! Must provide stop time YYYY/MM/DD HH:MM:SS";

open my $fh, '<', $crontab_file or die "! Could not open file $crontab_file for reading: $!";
my $table = [];
while ( <$fh> ) {
    next if /^\s*$/;
    next if /^\s*#/;
    chomp;
    push @$table, new Schedule::Cron::Events( $_, Date => [ smhdmy_from_iso( $start_time ) ] );
}
close $fh;

my $events = [];
for my $cron ( @$table ) {
    my $event_time = $stop_time;
    while ( $event_time le $stop_time ) {
        my ( $sec, $min, $hour, $day, $month, $year ) = $cron->nextEvent;
        $event_time = sprintf q{%4d/%02d/%02d %02d:%02d:%02d}, 1900 + $year, 1 + $month, $day, $hour, $min, $sec;
        push @$events, qq{$event_time "} . $cron->commandLine . q{"};
    }
}

print join( qq{\n}, sort @$events ) . qq{\n};

sub smhdmy_from_iso {
    my $input = shift;
    my ( $y, $m, $d, $H, $M, $S ) = ( $input =~ m=(\d{4})/(\d\d)/(\d\d) (\d\d):(\d\d):(\d\d)= );
    ( $S, $M, $H, $d, --$m, $y - 1900 );
}
Hope you can adapt to your needs.