Please help me calculate a moving (rolling-back) weekly sum of Amount ($4), grouped by Distributor ($3) and by rolling date.
I want to set variables like
RollingStartDate == 01/05/2015, RollingInterval == 7 and RollingEndDate == 08/05/2015
For Example :
1st May 2015 Rolling 7 Days data set would be from 01/05/2015 to 25/04/2015
2nd May 2015 Rolling 7 Days data set would be from 02/05/2015 to 26/04/2015
....................................................................
7th May 2015 Rolling 7 Days data set would be from 07/05/2015 to 01/05/2015
8th May 2015 Rolling 7 Days data set would be from 08/05/2015 to 02/05/2015
Input.csv
Des,Date,Distributor,Amount,Loc
aaa,25/04/2015,abc123,25,bbb
aaa,25/04/2015,xyz456,75,bbb
aaa,26/04/2015,xyz456,50,bbb
aaa,27/04/2015,abc123,250,bbb
aaa,27/04/2015,abc123,100,bbb
aaa,29/04/2015,xyz456,50,bbb
aaa,30/04/2015,abc123,25,bbb
aaa,01/05/2015,xyz456,75,bbb
aaa,01/05/2015,abc123,50,bbb
aaa,02/05/2015,abc123,25,bbb
aaa,02/05/2015,xyz456,75,bbb
aaa,04/05/2015,abc123,30,bbb
aaa,04/05/2015,xyz456,35,bbb
aaa,05/05/2015,xyz456,12,bbb
aaa,06/05/2015,abc123,32,bbb
aaa,06/05/2015,xyz456,43,bbb
aaa,07/05/2015,xyz456,87,bbb
aaa,08/05/2015,abc123,58,bbb
aaa,08/05/2015,xyz456,98,bbb
Example: 8th May 2015 Rolling 7 Days data set would be from 08/05/2015 to 02/05/2015
aaa,02/05/2015,abc123,25,bbb
aaa,02/05/2015,xyz456,75,bbb
aaa,04/05/2015,abc123,30,bbb
aaa,04/05/2015,xyz456,35,bbb
aaa,05/05/2015,xyz456,12,bbb
aaa,06/05/2015,abc123,32,bbb
aaa,06/05/2015,xyz456,43,bbb
aaa,07/05/2015,xyz456,87,bbb
aaa,08/05/2015,abc123,58,bbb
aaa,08/05/2015,xyz456,98,bbb
Output for 8th May 2015 Rolling 7 Days data set
RollingDate,Distributor,Amount
08/05/2015,abc123,145
08/05/2015,xyz456,350
I am able to obtain the above output from this command :
awk -F, '{key=$3;b[key]=b[key]+$4} END {for(i in b) print i","b[i]}'
Kindly suggest how to derive the weekly split-up data sets and then sum them.
Desired Output:
RollingDate,Distributor,Amount
01/05/2015,abc123,450
01/05/2015,xyz456,250
02/05/2015,abc123,450
02/05/2015,xyz456,250
03/05/2015,abc123,450
03/05/2015,xyz456,200
04/05/2015,abc123,130
04/05/2015,xyz456,235
05/05/2015,abc123,130
05/05/2015,xyz456,247
06/05/2015,abc123,162
06/05/2015,xyz456,240
07/05/2015,abc123,137
07/05/2015,xyz456,327
08/05/2015,abc123,145
08/05/2015,xyz456,350
Edit#1
1. The logic is to find the sum of Amount billed to each distributor over a 7-day range. That is, to calculate the sum for 1st May I need to consider the line items from 1st May, 30th Apr, 29th Apr, 28th Apr, 27th Apr, 26th Apr and 25th Apr, which is equivalent to 1st May minus 6 days back. Likewise, the rolling range for 2nd May runs from 2nd May back to 26th Apr (2nd May minus 6 days).
2. The date format is DD/MM/YYYY, so 02/05/2015 is 2nd May.
Since the file contains 2 to 3 months of details, I don't want to take the first date in the file (25/04/2015) and count 6 days back from there. Hence "RollingStartDate" tells from which date the data needs to be considered, and "RollingInterval" selects a 7-day, 14-day or 30-day (monthly) moving-back analysis.
"RollingEndDate" guards against future-dated data in the actual file; in this case any 09th or 15th May line items need to be excluded.
Here's a solution that just excludes dates that don't have 7 days before them instead of requiring a specific start/stop range:
$ cat tst.awk
BEGIN { FS=OFS=","; window=(window?window:7); secsPerDay=24*60*60 }
NR==1 { print "RollingDate", $3, $4; next }
{
    endSecs = mktime(gensub(/(..)\/(..)\/(....)/,"\\3 \\2 \\1 0 0 0","",$2))
    if (begSecs=="") {
        begSecs = endSecs + ((window-1) * secsPerDay)
    }
    amount[endSecs][$3] += $4
    dists[$3]
}
END {
    for (currSecs=begSecs; currSecs<=endSecs; currSecs+=secsPerDay) {
        for (dayNr=1; dayNr<=window; dayNr++) {
            rollSecs = currSecs - ((dayNr-1) * secsPerDay)
            for (dist in dists) {
                sum[dist] += (rollSecs in amount ? amount[rollSecs][dist] : 0)
            }
        }
        for (dist in dists) {
            print strftime("%d/%m/%Y",currSecs), dist, sum[dist]
            delete sum[dist]
        }
    }
}
$ awk -f tst.awk file
RollingDate,Distributor,Amount
01/05/2015,xyz456,250
01/05/2015,abc123,450
02/05/2015,xyz456,250
02/05/2015,abc123,450
03/05/2015,xyz456,200
03/05/2015,abc123,450
04/05/2015,xyz456,235
04/05/2015,abc123,130
05/05/2015,xyz456,247
05/05/2015,abc123,130
06/05/2015,xyz456,240
06/05/2015,abc123,162
07/05/2015,xyz456,327
07/05/2015,abc123,137
08/05/2015,xyz456,350
08/05/2015,abc123,145
To use some different window size than 7 days, just set it on the command line:
$ awk -v window=5 -f tst.awk file
RollingDate,Distributor,Amount
29/04/2015,xyz456,175
29/04/2015,abc123,375
30/04/2015,xyz456,100
30/04/2015,abc123,375
01/05/2015,xyz456,125
01/05/2015,abc123,425
02/05/2015,xyz456,200
02/05/2015,abc123,100
03/05/2015,xyz456,200
03/05/2015,abc123,100
04/05/2015,xyz456,185
04/05/2015,abc123,130
05/05/2015,xyz456,197
05/05/2015,abc123,105
06/05/2015,xyz456,165
06/05/2015,abc123,87
07/05/2015,xyz456,177
07/05/2015,abc123,62
08/05/2015,xyz456,275
08/05/2015,abc123,120
The above uses GNU awk for true 2D arrays and time functions. Hopefully it's clear enough that you can make any modifications you need to include/exclude specific date ranges.
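For readers who also need the RollingStartDate, RollingInterval and RollingEndDate variables from the question, here is a minimal Python sketch of the same rolling sum. It is not a line-for-line translation of the awk answer above. The function name rolling_sums, its parameters, and the choice to parse Amount as an integer are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import date, datetime, timedelta

# The Input.csv from the question, inlined so the sketch is self-contained.
SAMPLE = """\
Des,Date,Distributor,Amount,Loc
aaa,25/04/2015,abc123,25,bbb
aaa,25/04/2015,xyz456,75,bbb
aaa,26/04/2015,xyz456,50,bbb
aaa,27/04/2015,abc123,250,bbb
aaa,27/04/2015,abc123,100,bbb
aaa,29/04/2015,xyz456,50,bbb
aaa,30/04/2015,abc123,25,bbb
aaa,01/05/2015,xyz456,75,bbb
aaa,01/05/2015,abc123,50,bbb
aaa,02/05/2015,abc123,25,bbb
aaa,02/05/2015,xyz456,75,bbb
aaa,04/05/2015,abc123,30,bbb
aaa,04/05/2015,xyz456,35,bbb
aaa,05/05/2015,xyz456,12,bbb
aaa,06/05/2015,abc123,32,bbb
aaa,06/05/2015,xyz456,43,bbb
aaa,07/05/2015,xyz456,87,bbb
aaa,08/05/2015,abc123,58,bbb
aaa,08/05/2015,xyz456,98,bbb
"""


def rolling_sums(rows, start, end, interval=7):
    """Yield (rolling_date, distributor, total) for every day in [start, end].

    Each total covers the rolling date and the (interval - 1) days before it.
    Amount is assumed to be a whole number.
    """
    daily = defaultdict(int)  # (date, distributor) -> amount billed that day
    distributors = set()
    for row in rows:
        day = datetime.strptime(row["Date"], "%d/%m/%Y").date()
        daily[(day, row["Distributor"])] += int(row["Amount"])
        distributors.add(row["Distributor"])

    current = start
    while current <= end:
        window = [current - timedelta(days=n) for n in range(interval)]
        for dist in sorted(distributors):
            yield current, dist, sum(daily.get((d, dist), 0) for d in window)
        current += timedelta(days=1)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print("RollingDate,Distributor,Amount")
for day, dist, total in rolling_sums(rows, date(2015, 5, 1), date(2015, 5, 8)):
    print(f"{day:%d/%m/%Y},{dist},{total}")
```

Because every window only looks backwards from a rolling date that is no later than the end date, rows dated after the end date never contribute to any total. Passing interval=14 or interval=30 gives the longer look-back periods mentioned in the question.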
I am using the following function to create a relative DateTime comparison string such as: Today (12 Minutes Ago), or Yesterday (21 Hours Ago), or 3/3/2015 (3 Days Ago).
The function is failing if I have a DateTime comparison between 1 and 2 days, so for example:
If the current time is: 3/6/2015 8:30pm and the comparison time is 3/4/2015 9:00pm
I get: 3/4/2015 (1 Day Ago)
When I should be getting: 3/4/2015 (2 Days Ago).
But what is interesting is that if I have a time comparison of 3/4/2015 7:00pm, it will return 3/4/2015 (2 Days Ago).
What's going on?
Public Function GetRelativeTime(givenDate As DateTime) As String
    If (givenDate.Date = DateTime.Today) Then
        Return "Today " + ConvertTimeSpanToRelativeTime(DateTime.Now.Subtract(givenDate))
    ElseIf (givenDate.Date = DateTime.Today.AddDays(-1)) Then
        Return "Yesterday " + ConvertTimeSpanToRelativeTime(DateTime.Now.Subtract(givenDate))
    Else
        Return givenDate.ToString("d") + " " + ConvertTimeSpanToRelativeTime(DateTime.Now.Subtract(givenDate))
    End If
End Function
Private Shared Function ConvertTimeSpanToRelativeTime(diffDate As TimeSpan) As String
    Dim d As New StringBuilder()
    If diffDate.Days > 0 Then
        d.AppendFormat("({0} {1} ago)", diffDate.Days, If(diffDate.Days > 1, "Days", "Day"))
    ElseIf diffDate.Hours > 0 Then
        d.AppendFormat("({0} {1} ago)", diffDate.Hours, If(diffDate.Hours > 1, "Hours", "Hour"))
    ElseIf diffDate.Minutes > 0 Then
        d.AppendFormat("({0} {1} ago)", diffDate.Minutes, If(diffDate.Minutes > 1, "Minutes", "Minute"))
    ElseIf diffDate.Seconds > 0 Then
        d.AppendFormat("({0} {1} ago)", diffDate.Seconds, If(diffDate.Seconds > 1, "Seconds", "Second"))
    ElseIf diffDate.Milliseconds > 0 Then
        d.Append("(Just Now)")
    End If
    Return d.ToString()
End Function
I think Steve Wellens nailed it. Basically you're only showing the day portion of 1 day, 23 hours, and 30 minutes. If you really want it to show up as 2 days, then instead of your
DateTime.Now.Subtract(givenDate)
you could use something like
DateDiff(DateInterval.Day, givenDate.Date, DateTime.Now.Date)
I believe this should give you the difference in days as if those two dates were set to midnight, which would be 2 in your case above. Then you could use a DateDiff by hours if this wasn't greater than 0, depending on how you want to play it. I think DateDiff returns a long though so you'll have to adjust your function.
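To make the difference between the two approaches concrete, here is a small Python sketch (Python rather than VB.NET, purely for illustration) using the times from the question. The helper names elapsed_days and calendar_days are made up for this example.

```python
from datetime import datetime


def elapsed_days(now, given):
    """Whole 24-hour periods between the two instants (what TimeSpan.Days shows)."""
    return (now - given).days


def calendar_days(now, given):
    """Difference between the two calendar dates, ignoring the time of day."""
    return (now.date() - given.date()).days


now = datetime(2015, 3, 6, 20, 30)   # 3/6/2015 8:30pm
late = datetime(2015, 3, 4, 21, 0)   # 3/4/2015 9:00pm
early = datetime(2015, 3, 4, 19, 0)  # 3/4/2015 7:00pm

print(elapsed_days(now, late), calendar_days(now, late))    # 1 2
print(elapsed_days(now, early), calendar_days(now, early))  # 2 2
```

The 9:00pm case is only 1 day, 23 hours and 30 minutes in the past, so the elapsed count is 1 even though two calendar dates have passed. The 7:00pm case is more than 48 hours back, so both counts agree.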
Using an if statement, I want to check whether field one + 1 day equals field two.
Using an if statement, I want to check whether field one + 1 month equals field two.
I have this input
09-11-2013 09-12-2013
10-02-2013 10-02-2013
26-10-2013 27-10-2013
12-01-2013 12-02-2013
22-02-2013 23-02-2013
I used this code but it works with years only:
awk '{if ($1+1==$2) print }'
Have a look at the mktime function in awk.
With it you can convert a date to seconds, so it is easy to compare.
This prints how many days there are between $1 and $2:
awk '{split($1, sd, "-");split($2, ed, "-");print $0,(mktime(ed[3] s ed[2] s ed[1] s 0 s 0 s 0)-mktime(sd[3] s sd[2] s sd[1] s 0 s 0 s 0))/86400}' s=' ' file
09-11-2013 09-12-2013 30
10-02-2013 10-02-2013 0
26-10-2013 27-10-2013 1
12-01-2013 12-02-2013 31
22-02-2013 23-02-2013 1
Here it prints 1 if it's one day, and 2 if it's one month.
It takes into account that February may have 28 or 29 days:
awk '
BEGIN {
    arr="31,28,31,30,31,30,31,31,30,31,30,31"
    split(arr, month, ",")
    x=0
}
{
    split($1, sd, "-")
    split($2, ed, "-")
    t=(mktime(ed[3] s ed[2] s ed[1] s 0 s 0 s 0)-mktime(sd[3] s sd[2] s sd[1] s 0 s 0 s 0))/86400
    # February has 29 days in leap years (Gregorian rule)
    month[2]=((sd[3]%4==0 && sd[3]%100!=0) || sd[3]%400==0)?29:28
}
t==month[sd[2]+0] {x=2}
t==1 {x=1}
{
    print $0,x
    x=0
}
' s=' ' file
09-11-2013 09-12-2013 2
10-02-2013 10-02-2013 0
26-10-2013 27-10-2013 1
12-01-2013 12-02-2013 2
22-02-2013 23-02-2013 1
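As a cross-check of the 1/2/0 classification, here is a minimal Python sketch. It assumes "+ 1 month" means the same day number in the following calendar month, which is one possible reading of the question. The function names parse and classify are made up for this example.

```python
from datetime import datetime, timedelta


def parse(text):
    """Parse a DD-MM-YYYY string into a date."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def classify(first, second):
    """Return 1 if second is first + 1 day, 2 if it is first + 1 month, else 0."""
    a, b = parse(first), parse(second)
    if b - a == timedelta(days=1):
        return 1
    # Number of calendar months between the two dates, ignoring the day.
    months = (b.year - a.year) * 12 + (b.month - a.month)
    if months == 1 and a.day == b.day:
        return 2
    return 0


for line in ["09-11-2013 09-12-2013", "10-02-2013 10-02-2013",
             "26-10-2013 27-10-2013", "12-01-2013 12-02-2013",
             "22-02-2013 23-02-2013"]:
    first, second = line.split()
    print(line, classify(first, second))
```

On the sample input this gives 2, 0, 1, 2, 1, the same as the awk output above. Unlike the days-in-month comparison, a start date such as 31-01 has no "+ 1 month" match under this reading, because February has no 31st.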
Help needed. I want to increment the Date column (which is a string) in a CSV file by one day.
e.g. (Date Format yyyy-MM-dd)
Col1,Col2,Col3
ABC,001,1900-01-01
XYZ,002,2000-01-01
Expected output
Col1,Col2,Col3
ABC,001,1900-01-02
XYZ,002,2000-01-02
There's one standard Unix utility that has all the date magic from September 14, 1752 through December 31, 9999 built in: the calendar, cal. Instead of reinventing the wheel and doing messy date calculations, we will use its intelligence to our advantage. The basic problem is: given a date, is it the last day of a month? If not, simply increment the day. If yes, reset the day to 1 and increment the month (and possibly the year).
However, the output format of cal is unspecified; it may look like this:
$ cal 2 1900
   February 1900
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28
What we would need is a list of days, 1 2 3 ... 28. We can do this by skipping everything up to the "1":
set -- $(cal 2 1900)
while test $1 != 1; do shift; done
Now the number of args gives us the number of days in February 1900:
$ echo $#
28
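For comparison, the same "number of days in a month" lookup can be sketched in Python with the standard calendar module. This is only an illustration of the idea, not part of the shell solution, and the helper name days_in_month is made up here.

```python
import calendar


def days_in_month(year, month):
    """Number of days in the given month; monthrange returns (first weekday, day count)."""
    return calendar.monthrange(year, month)[1]


print(days_in_month(1900, 2))  # 28, since 1900 is not a leap year
print(days_in_month(2000, 2))  # 29, since 2000 is a leap year
```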
Putting it all together in a script:
#!/bin/sh
read -r header
printf "%s\n" "$header"
while IFS=,- read -r col1 col2 y m d; do
    case $m-$d in
    (12-31) y=$((y+1)) m=01 d=01;;
    (*)
        set -- $(cal $m $y)
        # Shift away the month and weekday names.
        while test $1 != 1; do shift; done
        # Is the day the last day of a month?
        if test ${d#0} -eq $#; then
            # Yes: increment m and reset d=01.
            m=$(printf %02d $((${m#0}+1)))
            d=01
        else
            # No: increment d.
            d=$(printf %02d $((${d#0}+1)))
        fi
        ;;
    esac
    printf "%s,%s,%s-%s-%s\n" "$col1" "$col2" $y $m $d
done
Running it on this input:
Col1,Col2,Col3
ABC,001,1900-01-01
ABC,001,1900-02-28
ABC,001,1900-12-31
XYZ,002,2000-01-01
XYZ,002,2000-02-28
XYZ,002,2000-02-29
yields
Col1,Col2,Col3
ABC,001,1900-01-02
ABC,001,1900-03-01
ABC,001,1901-01-01
XYZ,002,2000-01-02
XYZ,002,2000-02-29
XYZ,002,2000-03-01
I made one little assumption: The first two columns don't contain a - or escaped comma. If they do, the IFS=,- read will act up.
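Using the (GNU) date command, this can be done in awk. Closing the command after each read lets repeated dates be processed again, and the trailing 1 also prints the header line:
awk 'BEGIN{FS=OFS=","} NR>1{cmd="date -d \""$3" +1 day\" +%Y-%m-%d"; cmd | getline newdate; close(cmd); $3=newdate} 1' file.in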
If you can extract the date from the file, you can use this:
d="1900-01-01" # date from file
date --date @$(( $(date --date "$d" +%s) + 86400 )) +%Y-%m-%d
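If neither cal nor GNU date is available, the increment can also be sketched with Python's standard library, which does the month-end and leap-year arithmetic itself. This is a minimal sketch assuming the date sits in a column named Col3 in yyyy-MM-dd format, as in the question. The function name increment_dates is made up for this example.

```python
import csv
import io
from datetime import date, timedelta


def increment_dates(text, column="Col3"):
    """Return the CSV text with every date in `column` moved forward by one day."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        next_day = date.fromisoformat(row[column]) + timedelta(days=1)
        row[column] = next_day.isoformat()  # isoformat() gives YYYY-MM-DD
        writer.writerow(row)
    return out.getvalue()


sample = """\
Col1,Col2,Col3
ABC,001,1900-01-01
ABC,001,1900-02-28
ABC,001,1900-12-31
XYZ,002,2000-02-28
XYZ,002,2000-02-29
"""
print(increment_dates(sample), end="")
```

The other columns are passed through as strings, so values like 001 keep their leading zeros.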