Calculating time difference in sas (military time) - datetime

I am having trouble figuring out how to calculate the duration from a time variable stored as military time.
Any thoughts on how to tackle this?

A military time value encoded as an integer of the form hhmm can be processed by converting the number to a SAS time value and then computing the difference under certain assumptions.
data sleep_log;
input name $ boots_down boots_up;
datalines;
Joe 2000 0600 slept over midnight
Joe 1000 1230 slept into lunch
Joe 1630 1700 30 winks
Joe 0100 0100 out cold!
run;
data sleep_data;
set sleep_log;
down = hms(
int(boots_down / 100) /* extract hours */
, mod(boots_down , 100) /* extract minutes */
, 0 /* seconds not logged, use zero */
);
up = hms(
int(boots_up / 100) /* extract hours */
, mod(boots_up , 100) /* extract minutes */
, 0 /* seconds not logged, use zero */
);
* SAS time values are linear and simple arithmetic can apply;
if up <= down
then delta = '24:00't + up - down; /* presume roll over midnight */
else delta = up - down;
format down up delta time5.;
run;
A more robust log would also record the day, eliminating presumptions and providing a proper time dimension.
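For readers following along outside SAS, the same arithmetic can be sketched in Python. This is only an illustration under the answer's assumptions (the value is an hhmm integer, and an end time at or before the start time means the interval crossed midnight). The helper names `hhmm_to_minutes` and `duration_minutes` are made up for this sketch.

```python
def hhmm_to_minutes(hhmm: int) -> int:
    """Convert a military-time integer such as 1630 to minutes after midnight."""
    hours, minutes = divmod(hhmm, 100)
    return hours * 60 + minutes


def duration_minutes(start_hhmm: int, end_hhmm: int) -> int:
    """Minutes from start to end, presuming a rollover past midnight when end <= start."""
    start = hhmm_to_minutes(start_hhmm)
    end = hhmm_to_minutes(end_hhmm)
    if end <= start:
        end += 24 * 60  # same presumption as the SAS step: the clock rolled over
    return end - start


# The four rows of the sleep log above
for down, up in [(2000, 600), (1000, 1230), (1630, 1700), (100, 100)]:
    total = duration_minutes(down, up)
    print(f"{down:04d} -> {up:04d}: {total // 60}:{total % 60:02d}")
```

As in the SAS version, identical start and end values are read as a full 24 hours, which is a presumption rather than something the data can confirm.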

You can extract the hours and minutes from your numeric military time HHMM, then create a SAS time using the HMS() function.
Extract hours: divide HHMM by 100 and keep the integer part,
Extract minutes: take the remainder (MOD) of HHMM divided by 100,
Create a new time variable using HMS(hour,minute,second),
Create a new datetime for each using DHMS(date,hour,minute,second).
Full Code:
data have;
input sleep awake date_s date_w;
informat date_s date9. date_w date9.;
format sleep z4. awake z4. date_s date9. date_w date9.;
datalines;
2300 0500 12feb2018 13feb2018
2000 0300 11feb2018 12feb2018
0530 1230 10feb2018 10feb2018
;
run;
data want;
set have;
new_sleep_time=hms(int(sleep/100),int(mod(sleep,100)),0);
new_awake_time=hms(int(awake/100),int(mod(awake,100)),0);
dt_awake=dhms(date_w,hour(new_awake_time),minute(new_awake_time),0);
dt_sleep=dhms(date_s,hour(new_sleep_time),minute(new_sleep_time),0);
diff=dt_awake-dt_sleep;
keep new_sleep_time new_awake_time dt_awake dt_sleep diff;
format new_sleep_time time8. new_awake_time time8. diff time8. dt_awake datetime21. dt_sleep datetime21.;
run;
Output:
new_sleep_time=23:00:00 new_awake_time=5:00:00 diff=6:00:00 dt_awake=13FEB2018:05:00:00 dt_sleep=12FEB2018:23:00:00
new_sleep_time=20:00:00 new_awake_time=3:00:00 diff=7:00:00 dt_awake=12FEB2018:03:00:00 dt_sleep=11FEB2018:20:00:00
new_sleep_time=5:30:00 new_awake_time=12:30:00 diff=7:00:00 dt_awake=10FEB2018:12:30:00 dt_sleep=10FEB2018:05:30:00
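When the date is recorded alongside the time, no midnight presumption is needed. A rough Python equivalent of the steps above, assuming dates in the `12feb2018` style of the sample data and an English locale for month names; `to_datetime` is a hypothetical helper, not a SAS or library function.

```python
from datetime import datetime, timedelta


def to_datetime(date_text: str, hhmm: int) -> datetime:
    """Combine a date such as '12feb2018' with a military-time integer such as 2300."""
    day = datetime.strptime(date_text, "%d%b%Y")
    hours, minutes = divmod(hhmm, 100)
    return day + timedelta(hours=hours, minutes=minutes)


# (sleep, awake, date_s, date_w) as in the "have" data set
rows = [
    (2300, 500, "12feb2018", "13feb2018"),
    (2000, 300, "11feb2018", "12feb2018"),
    (530, 1230, "10feb2018", "10feb2018"),
]
for sleep, awake, date_s, date_w in rows:
    diff = to_datetime(date_w, awake) - to_datetime(date_s, sleep)
    print(diff)
```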


What does NNN mean in date format <YYMMDDhhmmssNNN><C|D|G|H>?

Hi, I have a date in the following format and I want to convert it to the correct GMT date:
<YYMMDDhhmmssNNN><C|D|G|H>
Sample value on that date:
210204215026000C
I got this explanation for the NNN part:
NNN If flag is C or D then NNN is the number of hours relative to GMT,
if flag is G or H, NNN is the number of quarter hours relative to GMT
C|D|G|H C and G = Ahead of GMT, D and H = Behind GMT
But I do not understand how a number of hours relative to GMT can need 3 digits. It should fit in 2 digits, since as far as I know the hour offset relative to GMT ranges from 0 to 23. Also, what does "quarter hours relative to GMT" mean?
I want to use Scala or Java.
I don’t know why they set 3 digits aside for the offset. I agree with you that 2 digits suffice for all cases. Maybe they just wanted to be very sure they would never run out of space, and maybe they even overdid this a bit. 3 digits is not a problem as long as the actual values are within the range that java.time.ZoneOffset can handle, +/-18 hours. In your example NNN is 000, so 0 hours from GMT, which certainly is OK and trivial to handle.
A quarter hour is a quarter of an hour. As Salman A mentioned in a comment, 22 quarter hours ahead of Greenwich means an offset of +05:30, currently used in Sri Lanka and India. If the producer of the string wants to use this option, they can give numbers up to 72 (still comfortably within 2 digits). 18 * 4 = 72, so 18 hours equals 72 quarter hours. To imagine a situation where 2 digits would be too few, think of an offset of 25 hours, which is 100 quarter hours. I wouldn’t think it realistic, but on the other hand no one can guarantee that it will never happen.
Java solution: how to parse and convert to GMT time
I am using these constants:
private static final Pattern DATE_PATTERN
= Pattern.compile("(\\d{12})(\\d{3})(\\w)");
private static final DateTimeFormatter FORMATTER
= DateTimeFormatter.ofPattern("uuMMddHHmmss");
private static final int SECONDS_IN_A_QUARTER_HOUR
= Math.toIntExact(Duration.ofHours(1).dividedBy(4).getSeconds());
Parse and convert like this:
String sampleValue = "210204215026000C";
Matcher matcher = DATE_PATTERN.matcher(sampleValue);
if (matcher.matches()) {
LocalDateTime ldt = LocalDateTime.parse(matcher.group(1), FORMATTER);
int offsetAmount = Integer.parseInt(matcher.group(2));
char flag = matcher.group(3).charAt(0);
// offset amount denotes either hours or quarter hours
boolean quarterHours = flag == 'G' || flag == 'H';
boolean negative = flag == 'D' || flag == 'H';
if (negative) {
offsetAmount = -offsetAmount;
}
ZoneOffset offset = quarterHours
? ZoneOffset.ofTotalSeconds(offsetAmount * SECONDS_IN_A_QUARTER_HOUR)
: ZoneOffset.ofHours(offsetAmount);
OffsetDateTime dateTime = ldt.atOffset(offset);
OffsetDateTime gmtDateTime = dateTime.withOffsetSameInstant(ZoneOffset.UTC);
System.out.println("GMT time: " + gmtDateTime);
}
else {
System.out.println("Invalid value: " + sampleValue);
}
Output is:
GMT time: 2021-02-04T21:50:26Z
I think my code covers all valid cases. You will probably want to validate that the flag is indeed C, D, G or H, and also handle the potential DateTimeException and NumberFormatException from the parsing and creating the ZoneOffset (NumberFormatException should not happen).
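For comparison, here is the same logic as a short Python sketch. It is not part of the Java solution. It assumes the format is exactly 12 digits, a 3-digit offset, and one of the flags C, D, G or H, as described in the question. `to_gmt` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

PATTERN = re.compile(r"(\d{12})(\d{3})([CDGH])")


def to_gmt(value: str) -> datetime:
    """Parse <YYMMDDhhmmssNNN><C|D|G|H> and return the instant in GMT/UTC."""
    match = PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"Invalid value: {value}")
    local = datetime.strptime(match.group(1), "%y%m%d%H%M%S")
    amount = int(match.group(2))
    flag = match.group(3)
    # G and H count quarter hours; C and D count whole hours
    unit = timedelta(minutes=15) if flag in "GH" else timedelta(hours=1)
    offset = amount * unit
    if flag in "DH":  # D and H mean behind GMT
        offset = -offset
    # local time = GMT + offset, so GMT = local time - offset
    return (local - offset).replace(tzinfo=timezone.utc)


print(to_gmt("210204215026000C"))
```

With 22 quarter hours ahead (`022G`), the same local time would come out 5 hours 30 minutes earlier in GMT.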

Convert Minutes to Seconds in Kusto

How do I convert the duration from minutes to seconds? After the step below, I get the duration in minutes and I want to convert it to seconds, for example 00.04.19 to 259 seconds.
| extend duration = ((EndTime - StartTime)/60)
| summarize duration= avg(duration) by EndTime
Thanks
See: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/datetime-timespan-arithmetic
For example:
print timespan(01:23:45) / 1s
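The idea is that dividing a duration by a one-second duration yields a plain number of seconds. The same pattern exists in other languages. The following Python sketch is only an analogy for the arithmetic, not Kusto syntax.

```python
from datetime import timedelta

one_second = timedelta(seconds=1)

# 00:04:19 from the question, and 01:23:45 from the example above
print(timedelta(minutes=4, seconds=19) / one_second)
print(timedelta(hours=1, minutes=23, seconds=45) / one_second)
```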

pyspark daterange calculations in spark

I am trying to process website login session data by each user. I am reading an S3 session log file into an RDD. The data looks something like this.
----------------------------------------
User | Site | Session start | Session end
---------------------------------------
Joe |Waterloo| 9/21/19 3:04 AM |9/21/19 3:18 AM
Stacy|Kirkwood| 8/4/19 3:06 PM |8/4/19 3:54 PM
John |Waterloo| 9/21/19 8:48 AM |9/21/19 9:05 AM
Stacy|Kirkwood| 8/4/19 4:16 PM |8/4/19 5:41 PM
...
...
I want to find out how many users were logged in each second of the hour on a given day.
Example: I might be processing this data for 9/21/19 only. So, I would need to remove all other records and then SUM user sessions for each second of the hour for all 24 hours of 9/21/19. The output should possibly be 24 rows, one for each hour of 9/21/19, with counts for each second of the day (yikes, second-by-second data!).
Is this something possible to do in pyspark using either rdds or DF?
(Apologize for the tardiness in building the grid).
Thanks
my dataset
data=[['Joe','Waterloo','9/21/19 3:04 AM','9/21/19 3:18 AM'],['Stacy','Kirkwood','8/4/19 3:06 PM','8/4/19 3:54 PM'],['John','Waterloo','9/21/19 8:48 AM','9/21/19 9:05 AM'],
['Stacy','Kirkwood','9/21/19 4:06 PM', '9/21/19 4:54 PM'],
['Mo','Hashmi','9/21/19 1:06 PM', '9/21/19 5:54 PM'],
['Murti','Hash','9/21/19 1:00 PM', '9/21/19 3:00 PM'],
['Floo','Shmi','9/21/19 9:10 PM', '9/21/19 11:54 PM']]
cSchema = StructType([StructField("User", StringType())\
,StructField("Site", StringType())
, StructField("Sesh-Start", StringType())
, StructField("Sesh-End", StringType())])
df= spark.createDataFrame(data,schema=cSchema)
display(df)
parse timestamp
df1=df.withColumn("Start", F.from_unixtime(F.unix_timestamp("Sesh-Start",'MM/dd/yyyy hh:mm aa'),'20yy-MM-dd HH:mm:ss').cast("timestamp")).withColumn("End", F.from_unixtime(F.unix_timestamp("Sesh-End",'MM/dd/yyyy hh:mm aa'),'20yy-MM-dd HH:mm:ss').cast("timestamp")).drop("Sesh-Start","Sesh-End")
build and register udf, for multiple hours per person
def yo(a,b):
    from datetime import datetime
    d1 = datetime.strptime(str(a), '%Y-%m-%d %H:%M:%S')
    d2 = datetime.strptime(str(b), '%Y-%m-%d %H:%M:%S')
    y=[]
    if d1.hour == d2.hour:
        y.append(d1.hour)
    else:
        for i in range(d1.hour,d2.hour+1):
            y.append(i)
    return y
rng= udf(yo, ArrayType(IntegerType()))
explode list of hours into column
df2=df1.withColumn("new", rng(F.col("Start"),F.col("End"))).withColumn("new1",F.explode("new")).drop("new")
get seconds for each hour
df3=df2.withColumn("Seconds", when(F.hour("Start")==F.hour("End"), F.col("End").cast('long') - F.col("Start").cast('long'))
.when(F.hour("Start")==F.col("new1"), 3600-F.minute("Start")*60)
.when(F.hour("End")==F.col("new1"), F.minute("End")*60)
.otherwise(3600))
create temp view and query it
df3.createOrReplaceTempView("final")
display(spark.sql("Select new1, sum(Seconds) from final group by new1 order by new1"))
The other answer, by Lennart, could be more performant because it uses a join to get all the different hours, whereas I use a UDF, which could be slower. My code works for a user who is online for any number of hours. My data contained only the required day, so you could use the day filter from that answer to limit your query to the day in question.
Try to check this:
Initialize the filter.
from pyspark.sql.functions import count, hour, lit, minute, second, sum, to_date, to_timestamp, when
filter = to_date(lit("2019-09-21"))
startFilter = to_timestamp(lit("2019-09-21 00:00:00.000"))
endFilter = to_timestamp(lit("2019-09-21 23:59:59.999"))
Generate range (0 .. 23).
hours = spark.range(24)
Get actual user sessions that match the filter.
s = sessions.alias("s")
df = s \
    .where((filter >= to_date(s.start)) & (filter <= to_date(s.end))) \
    .select(s.user, \
        when(s.start < startFilter, startFilter).otherwise(s.start).alias("start"), \
        when(s.end > endFilter, endFilter).otherwise(s.end).alias("end"))
Combine the matched user sessions with the range of hours.
df2 = df.join(hours, hours.id.between(hour(df.start), hour(df.end)), 'inner') \
    .select(df.user, hours.id.alias("hour"), \
        (when(hour(df.end) > hours.id, 3600).otherwise(minute(df.end) * 60 + second(df.end)) - \
        when(hour(df.start) < hours.id, 0).otherwise(minute(df.start) * 60 + second(df.start))).alias("seconds"))
Generate summary: calculate the user count and the sum of seconds for each hour of sessions.
df2.groupBy(df2.hour) \
    .agg(count(df2.user).alias("user counts"), \
        sum(df2.seconds).alias("seconds")) \
    .show()
Hope this helps.
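Both answers rest on the same step: cutting each session at the hour boundaries and counting the seconds that fall into each hour. A plain-Python sketch of that step, without Spark, may make the arithmetic easier to check. It assumes each session starts and ends on the same day and uses two sessions from the sample data. `seconds_per_hour` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime, timedelta


def seconds_per_hour(start: datetime, end: datetime) -> Counter:
    """Split one session into the seconds spent in each clock hour."""
    totals = Counter()
    cursor = start
    while cursor < end:
        # End of the clock hour that contains the cursor
        next_hour = cursor.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
        piece_end = min(next_hour, end)
        totals[cursor.hour] += int((piece_end - cursor).total_seconds())
        cursor = piece_end
    return totals


fmt = "%m/%d/%y %I:%M %p"
sessions = [
    ("Murti", "9/21/19 1:00 PM", "9/21/19 3:00 PM"),
    ("Stacy", "9/21/19 4:06 PM", "9/21/19 4:54 PM"),
]
totals = Counter()
for _user, start_text, end_text in sessions:
    totals += seconds_per_hour(
        datetime.strptime(start_text, fmt), datetime.strptime(end_text, fmt)
    )
print(sorted(totals.items()))
```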

date and time (local time instead of UTC) (Maxima)

These lines give the date and time in UTC:
t:timedate(absolute_real_time() - (10*3600));
t0:substring(t,1,20);
t1:concat(substring(t,12,17), " ", substring(t,9,11), "/", substring(t,6,8), "/", substring(t,1,5));
t2:concat(substring(t,1,5), substring(t,6,8), substring(t,9,11), substring(t,12,14), substring(t,15,17), substring(t,18,20));
I know that '?\*autoconf\-version\*;' can give the Maxima version number, so maybe there is some undocumented way to get the local time.
Otherwise are there any ready-made functions that can convert
UTC time to local time given conditions for start/end of daylight saving time
e.g. UTC time to UK time (which is GMT/BST depending on the time of year)?
It's not clear to me exactly what you need, but perhaps the following helps. By the way, do you really need to extract the parts (year, month, day, etc)? If so, it might be more convenient to work directly in Lisp. See DECODE-UNIVERSAL-TIME at the Common Lisp Hyperspec (a web search will find it).
The timedate function now (in the just-released Maxima 5.39) accepts an optional argument which is the time zone offset, in hours (plus or minus). The time zone offset may be noninteger (e.g. 2.5). Offset 0 indicates UTC. If the offset is omitted, the time is formatted in the local time zone.
(%i5) t:absolute_real_time();
(%o5) 3691202499
(%i6) timedate (t, 0);
(%o6) 2016-12-20 06:01:39+00:00
(%i7) timedate (t);
(%o7) 2016-12-19 22:01:39-08:00
Note that the daylight saving time flag is applied at the "time of the time". Here is a time from next summer, when daylight saving time is in effect.
(%i8) timedate (t + 6*30.25*24*3600);
(%o8) 2017-06-19 11:01:39-07:00
The parse_timedate function has also been (in Maxima 5.39) updated to recognize time zone offsets.
(%i9) parse_timedate ("2016-12-19 22:01:39-08:00");
(%o9) 3691202499
As with timedate, if the offset is omitted, the time is assumed to be in the local time zone.
(%i10) parse_timedate ("2016-12-19 22:01:39");
(%o10) 3691202499
Note also that Maxima does not recognize any symbolic time zone indicators such as "UTC", "GMT", "EDT", "America/New_York", etc., only numerical time zone offsets.
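To cross-check such values outside Maxima, note that absolute_real_time counts seconds from midnight, January 1, 1900 UTC. A small Python sketch reproduces the two formatted times shown above from the same number. The function name `universal_to_datetime` and its fixed-offset parameter are assumptions made for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Epoch used by absolute_real_time: 1900-01-01 00:00:00 UTC
EPOCH_1900 = datetime(1900, 1, 1, tzinfo=timezone.utc)


def universal_to_datetime(seconds: int, offset_hours: float = 0.0) -> datetime:
    """Convert seconds since 1900-01-01 UTC to a datetime at a fixed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return (EPOCH_1900 + timedelta(seconds=seconds)).astimezone(tz)


print(universal_to_datetime(3691202499).isoformat(sep=" "))
print(universal_to_datetime(3691202499, -8).isoformat(sep=" "))
```

Unlike timedate without an offset, this sketch applies a fixed offset and knows nothing about daylight saving time.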
To clarify the problem, before revealing the solution:
these are the steps that I take in Maxima v5.30
to get the time in UTC, in a readable format:
Note: When I use Maxima v5.30 (in the UK),
for some unknown reason, the time is always UTC adjusted
by 10 hours, and does not adjust for DST.
/* 1st Jan 2017 12 noon: */
timedate(3692260800); /* "2017-01-01 22:00:00+10:00" */
timedate(3692260800-10*3600); /* "2017-01-01 12:00:00+10:00" */
substring(timedate(3692260800-10*3600),1,20); /* "2017-01-01 12:00:00" */
Note: timedate works better/differently in later versions of Maxima,
but some institutions recommend installing a specific version of Maxima.
Sometimes I want the date in the form: 'yyyyMMddHHmmss'.
A function for this is:
SecUTCToDate(vSec,vHour):=
block([d1,d2],
d1:timedate(vSec+vHour*3600),
d2:concat(substring(d1,1,5), substring(d1,6,8), substring(d1,9,11), substring(d1,12,14), substring(d1,15,17), substring(d1,18,20)),
parse_string(d2)
);
Note: [d1,d2] keeps those variables local to within the block, and not global.
To get the local time I have to add on hours based on my time zone (0 in the UK), and DST.
To calculate whether a time is within the DST period requires an individual function per time zone: in the UK, and many European countries, one such function is:
/* correct for the years 1900-2200 inclusive */
SecUTCIsDSTUK(vSec):=
block([vYear,vLeap,vDaysMar25,vDaysOct25,vWDayMar25,vWDayOct25,vRange1,vRange2],
vYear : parse_string(substring(timedate(vSec),1,5)),
vLeap : floor((vYear-1900)/4), if (vYear>=2100) then vLeap : vLeap-1,
vDaysMar25 : (vYear-1900)*365 + vLeap + 83,
vDaysOct25 : vDaysMar25 + 214,
vWDayMar25 : mod(vDaysMar25+1,7),
vWDayOct25 : mod(vDaysOct25+1,7),
vRange1 : (vDaysMar25+mod(-vWDayMar25,7))*86400 + 3600,
vRange2 : (vDaysOct25+mod(-vWDayOct25,7))*86400 + 3600,
if ((vSec >= vRange1) and (vSec < vRange2)) then 1 else 0);
You can create a mac file with such a function, and call up the function when needed, e.g.:
load("C:\\MyFolder\\MyFile.mac");
SecUTCIsDSTUK(absolute_real_time());
SecUTCIsDSTUK(absolute_real_time()+86400*180);
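The rule that the function above encodes (DST runs from 01:00 UTC on the last Sunday of March to 01:00 UTC on the last Sunday of October) can be cross-checked with a short Python sketch. This is not a translation of the Maxima code; it takes a timezone-aware UTC datetime, and the names `last_sunday` and `is_uk_dst` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def last_sunday(year: int, month: int) -> datetime:
    """01:00 UTC on the last Sunday of a 31-day month (March or October here)."""
    day = datetime(year, month, 31, 1, 0, tzinfo=timezone.utc)
    # weekday(): Monday=0 ... Sunday=6, so step back to the most recent Sunday
    return day - timedelta(days=(day.weekday() + 1) % 7)


def is_uk_dst(moment: datetime) -> bool:
    """True if the given UTC moment falls within the UK daylight saving period."""
    return last_sunday(moment.year, 3) <= moment < last_sunday(moment.year, 10)


print(is_uk_dst(datetime(2016, 12, 20, 6, 1, 39, tzinfo=timezone.utc)))
print(is_uk_dst(datetime(2017, 6, 19, 18, 1, 39, tzinfo=timezone.utc)))
```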
thank you for your helpful response,
results (v. 5.39.0) (works fine, param 2 omitted gives local time, param 2 as 0 gives UTC):
t:3691202499;
timedate (t);
timedate (t + 6*30.25*24*3600);
timedate (t + 6*30*24*3600);
timedate (t, 0);
timedate (t + 6*30.25*24*3600, 0);
timedate (t + 6*30*24*3600, 0);
:lisp (decode-universal-time 3691202499)
:lisp (decode-universal-time 3691202499 0)
:lisp (decode-universal-time 3706754499)
:lisp (decode-universal-time 3706754499 0)
3691202499
"2016-12-20 06:01:39+00:00"
"2017-06-19 19:01:39+01:00"
"2017-06-18 07:01:39+01:00"
"2016-12-20 06:01:39+00:00"
"2017-06-19 18:01:39+00:00"
"2017-06-18 06:01:39+00:00"
39 1 6 20 12 2016 1 NIL 0
39 1 6 20 12 2016 1 NIL 0
39 1 7 18 6 2017 6 T 0
39 1 6 18 6 2017 6 NIL 0
results (v. 5.30.0) (it seems param 2 omitted gives UTC+10, with no daylight saving time):
(if this is true, I would have to find another way to get local time, possibly by Common LISP commands)
t:3691202499;
timedate (t);
timedate (t + 6*30.25*24*3600);
timedate (t + 6*30*24*3600);
:lisp (decode-universal-time 3691202499)
:lisp (decode-universal-time 3691202499 0)
:lisp (decode-universal-time 3706754499)
:lisp (decode-universal-time 3706754499 0)
3691202499
"2016-12-20 16:01:39+10:00"
"2017-06-20 04:01:39.0+10:00"
"2017-06-18 16:01:39+10:00"
39 1 16 20 12 2016 1 NIL -10
39 1 6 20 12 2016 1 NIL 0
39 1 16 18 6 2017 6 NIL -10
39 1 6 18 6 2017 6 NIL 0
(I can see that the timedate and decode-universal-time functions
have key differences between Maxima versions)
thank you for the website mention,
CLHS: Section The Environment Dictionary
http://clhs.lisp.se/Body/c_enviro.htm
is there a list of LISP commands that work in Maxima?
the main reason for the datestamp concerns:
to produce datestamps for filenames such as 'z title yyyymmddhhmmss.txt',
or for friendly dates inside those files such as 'hh:mm dd/mm/yyyy',
the string manipulation method was the simplest method
that I could successfully code (I don't explicitly need to extract individual d m y etc)

To calculate Moving/Rolling back Weekly (7 days) Sum:

Please help me calculate the moving (rolling-back) weekly sum of Amount ($4) per Distributor ($3) and per rolling date.
I want to set variables like
RollingStartDate ==01/05/2015 and RollingInterval==7 and RollingEndDate ==08/05/2015
For Example :
1st May 2015 Rolling 7 Days data set would be from 01/05/2015 to 25/04/2015
2nd May 2015 Rolling 7 Days data set would be from 02/05/2015 to 26/04/2015
....................................................................
7th May 2015 Rolling 7 Days data set would be from 07/05/2015 to 01/05/2015
8th May 2015 Rolling 7 Days data set would be from 08/05/2015 to 02/05/2015
Input.csv
Des,Date,Distributor,Amount,Loc
aaa,25/04/2015,abc123,25,bbb
aaa,25/04/2015,xyz456,75,bbb
aaa,26/04/2015,xyz456,50,bbb
aaa,27/04/2015,abc123,250,bbb
aaa,27/04/2015,abc123,100,bbb
aaa,29/04/2015,xyz456,50,bbb
aaa,30/04/2015,abc123,25,bbb
aaa,01/05/2015,xyz456,75,bbb
aaa,01/05/2015,abc123,50,bbb
aaa,02/05/2015,abc123,25,bbb
aaa,02/05/2015,xyz456,75,bbb
aaa,04/05/2015,abc123,30,bbb
aaa,04/05/2015,xyz456,35,bbb
aaa,05/05/2015,xyz456,12,bbb
aaa,06/05/2015,abc123,32,bbb
aaa,06/05/2015,xyz456,43,bbb
aaa,07/05/2015,xyz456,87,bbb
aaa,08/05/2015,abc123,58,bbb
aaa,08/05/2015,xyz456,98,bbb
Example: 8th May 2015 Rolling 7 Days data set would be from 08/05/2015 to 02/05/2015
aaa,02/05/2015,abc123,25,bbb
aaa,02/05/2015,xyz456,75,bbb
aaa,04/05/2015,abc123,30,bbb
aaa,04/05/2015,xyz456,35,bbb
aaa,05/05/2015,xyz456,12,bbb
aaa,06/05/2015,abc123,32,bbb
aaa,06/05/2015,xyz456,43,bbb
aaa,07/05/2015,xyz456,87,bbb
aaa,08/05/2015,abc123,58,bbb
aaa,08/05/2015,xyz456,98,bbb
Output for 8th May 2015 Rolling 7 Days data set
RollingDate,Distributor,Amount
08/05/2015,abc123,145
08/05/2015,xyz456,350
I am able to obtain the above output from this command :
awk -F, '{key=$3;b[key]=b[key]+$4} END {for(i in b) print i","b[i]}'
Kindly suggest how to derive weekly split-up data sets then Sum.
Desired Output:
RollingDate,Distributor,Amount
01/05/2015,abc123,450
01/05/2015,xyz456,250
02/05/2015,abc123,450
02/05/2015,xyz456,250
03/05/2015,abc123,450
03/05/2015,xyz456,200
04/05/2015,abc123,130
04/05/2015,xyz456,235
05/05/2015,abc123,130
05/05/2015,xyz456,247
06/05/2015,abc123,162
06/05/2015,xyz456,240
07/05/2015,abc123,137
07/05/2015,xyz456,327
08/05/2015,abc123,145
08/05/2015,xyz456,350
Edit#1
1.
The logic is to find the sum of Amount billed to the distributor over a 7-day range, i.e. if I need to calculate the sum for 1st May, then I need to consider the line items from 1st May, 30th Apr, 29th Apr, 28th Apr, 27th Apr, 26th Apr and 25th Apr. That is equivalent to 1st May minus 6 days back. Likewise, the rolling range for 2nd May runs from 2nd May back to 26th Apr (2nd May minus 6 days).
2.
Date format is DD/MM/YYYY - 02/05/2015 is 2nd May
Since the file contains 2 to 3 months of details, I don't want to take the first date in the file (25/04/2015) and do the minus-6-days analysis from there. Hence "RollingStartDate" sets the date from which the data should be considered, and "RollingInterval" selects a "7 days", "14 days" or "30 days monthly" moving-back analysis.
"RollingEndDate" helps to exclude any future-dated data in the actual file; in this case, line items dated 09th or 15th May need to be excluded.
Here's a solution that just excludes dates that don't have 7 days before them instead of requiring a specific start/stop range:
$ cat tst.awk
BEGIN { FS=OFS=","; window=(window?window:7); secsPerDay=24*60*60 }
NR==1 { print "RollingDate", $3, $4; next }
{
endSecs = mktime(gensub(/(..)\/(..)\/(....)/,"\\3 \\2 \\1 0 0 0","",$2))
if (begSecs=="") {
begSecs = endSecs + ((window-1) * secsPerDay)
}
amount[endSecs][$3] += $4
dists[$3]
}
END {
for (currSecs=begSecs; currSecs<=endSecs; currSecs+=secsPerDay) {
for (dayNr=1; dayNr<=window; dayNr++) {
rollSecs = currSecs - ((dayNr-1) * secsPerDay)
for (dist in dists) {
sum[dist] += (rollSecs in amount ? amount[rollSecs][dist] : 0)
}
}
for (dist in dists) {
print strftime("%d/%m/%Y",currSecs), dist, sum[dist]
delete sum[dist]
}
}
}
$ awk -f tst.awk file
RollingDate,Distributor,Amount
01/05/2015,xyz456,250
01/05/2015,abc123,450
02/05/2015,xyz456,250
02/05/2015,abc123,450
03/05/2015,xyz456,200
03/05/2015,abc123,450
04/05/2015,xyz456,235
04/05/2015,abc123,130
05/05/2015,xyz456,247
05/05/2015,abc123,130
06/05/2015,xyz456,240
06/05/2015,abc123,162
07/05/2015,xyz456,327
07/05/2015,abc123,137
08/05/2015,xyz456,350
08/05/2015,abc123,145
To use some different window size than 7 days, just set it on the command line:
$ awk -v window=5 -f tst.awk file
RollingDate,Distributor,Amount
29/04/2015,xyz456,175
29/04/2015,abc123,375
30/04/2015,xyz456,100
30/04/2015,abc123,375
01/05/2015,xyz456,125
01/05/2015,abc123,425
02/05/2015,xyz456,200
02/05/2015,abc123,100
03/05/2015,xyz456,200
03/05/2015,abc123,100
04/05/2015,xyz456,185
04/05/2015,abc123,130
05/05/2015,xyz456,197
05/05/2015,abc123,105
06/05/2015,xyz456,165
06/05/2015,abc123,87
07/05/2015,xyz456,177
07/05/2015,abc123,62
08/05/2015,xyz456,275
08/05/2015,abc123,120
The above uses GNU awk for true 2D arrays and time functions. Hopefully it's clear enough that you can make any modifications you need to include/exclude specific date ranges.
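If GNU awk is not available, the same rolling-window idea can be sketched in Python. Unlike the awk script, this version takes an explicit start and end date, roughly in the spirit of the RollingStartDate and RollingEndDate variables from the question. The function name `rolling_sums` and its parameters are assumptions for this sketch, and the embedded rows are the 02/05 to 08/05 lines of Input.csv.

```python
import csv
import io
from collections import defaultdict
from datetime import date, datetime, timedelta

SAMPLE = """Des,Date,Distributor,Amount,Loc
aaa,02/05/2015,abc123,25,bbb
aaa,02/05/2015,xyz456,75,bbb
aaa,04/05/2015,abc123,30,bbb
aaa,04/05/2015,xyz456,35,bbb
aaa,05/05/2015,xyz456,12,bbb
aaa,06/05/2015,abc123,32,bbb
aaa,06/05/2015,xyz456,43,bbb
aaa,07/05/2015,xyz456,87,bbb
aaa,08/05/2015,abc123,58,bbb
aaa,08/05/2015,xyz456,98,bbb
"""


def rolling_sums(rows, start: date, end: date, window: int = 7):
    """For each day in [start, end], sum Amount per Distributor over the last `window` days."""
    daily = defaultdict(int)
    distributors = set()
    for row in rows:
        day = datetime.strptime(row["Date"], "%d/%m/%Y").date()
        daily[(day, row["Distributor"])] += int(row["Amount"])
        distributors.add(row["Distributor"])
    result = []
    current = start
    while current <= end:
        for dist in sorted(distributors):
            total = sum(
                daily.get((current - timedelta(days=back), dist), 0)
                for back in range(window)
            )
            result.append((current.strftime("%d/%m/%Y"), dist, total))
        current += timedelta(days=1)
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for line in rolling_sums(rows, date(2015, 5, 8), date(2015, 5, 8)):
    print(",".join(map(str, line)))
```

Passing `window=14` or `window=30` would cover the other intervals mentioned in the question, provided the input holds enough earlier days.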
