ISO 8601 Repeating Interval - datetime

Wikipedia gives an example of an ISO 8601 repeating interval:
R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M
This is what this means:
R5 means that the interval after the slash is repeated 5 times.
2008-03-01T13:00:00Z means that the interval begins at this given datetime.
P1Y2M10DT2H30M means that the interval lasts for
1 year
2 months
10 days
2 hours
30 minutes
My problem is that I do not know exactly what is being repeated here. Does the repetition occur immediately after the interval ends? Can I specify that every Monday something happens from 13:00 to 14:00?

The standard itself doesn't clarify, but the only obvious interpretation here is that the interval repeats back-to-back. So this recurring interval:
R2/2008-03-01T13:00:00Z/P1Y2M10DT2H30M
Will be equivalent to these non-recurring intervals:
2008-03-01T13:00:00Z/P1Y2M10DT2H30M
2009-05-11T15:30:00Z/P1Y2M10DT2H30M
(Note: my reading is that the number of repetitions does include the first occurrence)
There is no way to represent "every Monday from 13:00 to 14:00" inside of ISO 8601, but it's natural to do for a VEVENT in the iCalendar format. (If you could do that entirely within ISO 8601, then that would give rise to a slew of further feature requests)

Yes, ISO 8601 does define a regular repeating interval (or as regular as it can be when a "month" is one of the units).
R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M
Should generate these times:
2009-05-11T15:30:00Z
2010-07-21T18:00:00Z
2011-10-01T20:30:00Z
2012-12-11T23:00:00Z
2014-02-22T01:30:00Z
It doesn't define a separate "start time" and "end time" for each occurrence the way RFC 5545 (iCalendar) does, and it can't express irregular repetition the way RRULE or crontab can.
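A minimal Python sketch of the back-to-back reading, for illustration only. ISO 8601 does not say in which order the parts of a duration are applied, so this assumes years and months are added first (clamping the day to the end of a shorter month), then days, hours and minutes, and that each repetition starts where the previous one ended. The helper names add_months, add_duration and expand are made up for this example.

```python
import calendar
from datetime import datetime, timedelta, timezone

def add_months(dt, months):
    """Shift dt by whole calendar months, clamping the day to the month's length."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

def add_duration(dt, years=0, months=0, days=0, hours=0, minutes=0):
    """Assumption: apply the calendar parts first, then the fixed-length parts."""
    dt = add_months(dt, 12 * years + months)
    return dt + timedelta(days=days, hours=hours, minutes=minutes)

def expand(start, repetitions, **duration):
    """Return (start, end) pairs, each interval starting where the last one ended."""
    intervals = []
    for _ in range(repetitions):
        end = add_duration(start, **duration)
        intervals.append((start, end))
        start = end
    return intervals

# R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M
start = datetime(2008, 3, 1, 13, 0, tzinfo=timezone.utc)
intervals = expand(start, 5, years=1, months=2, days=10, hours=2, minutes=30)
for begin, end in intervals:
    print(begin.isoformat(), "->", end.isoformat())
```

Under these assumptions the five end points run from 2009-05-11T15:30:00Z to 2014-02-22T01:30:00Z.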
You should be able to specify a weekly repetition using the ISO Week Date as a starting point, but you'll need separate repetitions for "start" and "end" times:
R/2021-W01-1T13:00:00Z/P1W
R/2021-W01-1T14:00:00Z/P1W
The first interval is for the start times: Mondays at 13:00 (starting in 2021), and the second is for the end times: Mondays at 14:00 (starting in 2021).
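To see which dates those two week-date repetitions land on, here is a small Python sketch. It relies on date.fromisocalendar (Python 3.8+) to resolve 2021-W01-1; the function name weekly_occurrences and the count argument are assumptions of the example, since an "R/" with no number is unbounded and has to be cut off somewhere.

```python
from datetime import date, datetime, time, timedelta, timezone

def weekly_occurrences(iso_year, iso_week, iso_weekday, at, count):
    """First `count` weekly instants starting from an ISO week date at time `at` (UTC)."""
    first_day = date.fromisocalendar(iso_year, iso_week, iso_weekday)
    first = datetime.combine(first_day, at, tzinfo=timezone.utc)
    return [first + timedelta(weeks=i) for i in range(count)]

# R/2021-W01-1T13:00:00Z/P1W and R/2021-W01-1T14:00:00Z/P1W, first three occurrences
starts = weekly_occurrences(2021, 1, 1, time(13, 0), 3)
ends = weekly_occurrences(2021, 1, 1, time(14, 0), 3)
for s, e in zip(starts, ends):
    print(s.isoformat(), "to", e.isoformat())
```

Note that 2021-W01-1 is 4 January 2021, not 1 January, because 1 January 2021 still belongs to ISO week 53 of 2020.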

I'm probably being an idiot (Long Covid Brain) but isn't the obvious extension to ISO-8601 a second duration part? In the absence of the second duration, the repeats are back to back, in its presence what is actually repeating is a smaller duration event at the start of each period. e.g.
R/2021-W01-1T13:00:00Z/P1W/P1H
indefinite weekly repeat of hour long slots every Monday 1pm starting week 1 2021.
EDIT: Maybe you could even nest them ...
R/2021-W01-1T09:00:00Z/P1W/R5/P1D/P8H
Mon to Fri, 9am to 5pm, every week? Ok I'll get my coat

Related

How do I transform half-hourly data that does not span the whole day to a Time Series in R?

This is my first question on stackoverflow, sorry if the question is poorly put.
I am currently developing a project where I predict how much a person drinks each day. I currently have data that looks like this:
The menge column represents how much water a person has actually drunk in 30 minutes (so the first value represents the amount from 8:00 until just before 8:30, and so on). This is a 1 day sample from 3 months of data. The day starts at 8 AM and ends at 8 PM.
I am trying to forecast the Time Series for each day. For example, given the first one or two time steps, we would predict the whole day and then we know how much in total the person has drunk until 8 PM.
I am trying to model this data as a Time Series object in R (Google Colab), in order to use Croston's Method for the forecasting. Using the ts() function, what should I set the frequency to knowing that:
The data is half-hourly
The data is from 8:00 till 20:00 each day (Does not span the whole day)
Would I need to make the data span the whole day by adding 0 values? Are there maybe better approaches for this? Thank you in advance.
When using the ts() function, the frequency defines the number of (usually regularly spaced) observations within a given time period. For your example, your observations are every 30 minutes between 8AM and 8PM, and your time period is 1 day. Choosing a period of 1 day assumes that the pattern within each day is of most interest; you could also use 1 week as the period.
So within each day of your data (8AM-8PM) you have 24 observations (24 half hours). So a suitable frequency for this data would be 24.
You can also pad the data with 0 values, however this isn't necessary and would complicate the model. If you padded the data so that it has observations for all half-hours of the day, the frequency would then be 48.
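The arithmetic behind those two frequencies, as a short Python sketch rather than R. The constant and function names are invented for the example, and the zero-filled list is only a placeholder standing in for real measurements.

```python
from datetime import timedelta

DAY_START = timedelta(hours=8)    # first slot begins at 08:00
DAY_END = timedelta(hours=20)     # last slot ends at 20:00
STEP = timedelta(minutes=30)      # half-hourly observations

# Observations per day when only 08:00-20:00 is recorded
frequency = (DAY_END - DAY_START) // STEP

# Observations per day if every half hour of the day were padded in
padded_frequency = timedelta(days=1) // STEP

def split_into_days(values, per_day):
    """Cut a flat series into consecutive days of `per_day` observations each."""
    if len(values) % per_day:
        raise ValueError("series length is not a whole number of days")
    return [values[i:i + per_day] for i in range(0, len(values), per_day)]

values = [0.0] * (3 * frequency)  # placeholder: three days of readings
days = split_into_days(values, frequency)
print(frequency, padded_frequency, len(days))
```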

ceiling date in R

I was using ceiling_date when I saw that it was behaving in a way inconsistent with floor_date. For example,
> floor_date(as.Date("05/10/2020","%m/%d/%Y"),unit="week",week_start=7)
[1] "2020-05-10"
> ceiling_date(as.Date("05/10/2020","%m/%d/%Y"),unit="week",week_start=7)
[1] "2020-05-17"
But floor(5)=ceiling(5)=5 in R.
One has to set change_on_boundary = FALSE in the ceiling_date function to make it behave like floor_date, but I think this should be the default behavior. I read up on the rationale behind having ceiling_date behave the way it does above, and it did not make sense to me. In fact, there was a time when what I think should be default behavior was indeed the default behavior. Please see my comments in italics below against the documentation.
change_on_boundary
If NULL (the default) don't change instants on the boundary (ceiling_date(ymd_hms('2000-01-01 00:00:00')) is 2000-01-01 00:00:00), but round up Date objects to the next boundary (ceiling_date(ymd("2000-01-01"), "month") is "2000-02-01"). When TRUE, instants on the boundary are rounded up to the next boundary. When FALSE, date-time on the boundary are never rounded up (this was the default for lubridate prior to v1.6.0). See section Rounding Up Date Objects below for more details. <- So there was a time when what I indicated should be default behavior was default behavior.
By default rounding up Date objects follows 3 steps:
Convert to an instant representing lower bound of the Date: 2000-01-01 –> 2000-01-01 00:00:00
Round up to the next closest rounding unit boundary. For example, if the rounding unit is month then next closest boundary of 2000-01-01 is 2000-02-01 00:00:00.
The motivation for this is that the "partial" 2000-01-01 is conceptually an interval (2000-01-01 00:00:00 – 2000-01-02 00:00:00) and the day hasn't started clocking yet at the exact boundary 00:00:00. Thus, it seems wrong to round up a day to its lower boundary.
<-I don't follow what "and the day hasn't started clocking yet at the exact boundary 00:00:00" means, and how and why " 2000-01-01 is conceptually an interval (2000-01-01 00:00:00 – 2000-01-02 00:00:00) " is relevant.
Even if 5/10/2020 is considered as a whole day, its ceiling date should, for unit="week" and week_start=7, still be 5/10/2020, because ceiling_date(as.Date("05/10/2020","%m/%d/%Y"),unit="week",week_start=7) should return the earliest Sunday no earlier than 5/10/2020. And this day is clearly 5/10/2020. It is not 5/17/2020.
Can someone weigh in on this?
I understand what you're saying, but I do see what they mean in the documentation as well.
I think the easiest way to illustrate this is to imagine someone telling you "happy Monday!" You would think that what they imply is that Monday has already started, right? Therefore, "Monday" loosely refers to the time period between Monday and Tuesday. If you think of days of a week on a real line and let the integers 1 to 7 represent days Monday to Sunday, then what the documentation is trying to say is that Monday loosely refers to the interval (1, 2), Tuesday loosely refers to the interval (2, 3), and so on. So basically you wanna think of a date as an instant that's strictly between that date and the next date. Thus it's not entirely illogical that they make the default behavior of rounding up dates the way they do.
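To make the two behaviors concrete, here is a rough Python imitation of the week case. This is not lubridate's implementation, only a sketch of the documented semantics for Date inputs; floor_week and ceiling_week are invented names, and week_start follows the lubridate convention (1 = Monday, 7 = Sunday).

```python
from datetime import date, timedelta

def floor_week(d, week_start=7):
    """Latest week-start day that is on or before d."""
    offset = (d.isoweekday() - week_start) % 7
    return d - timedelta(days=offset)

def ceiling_week(d, week_start=7, change_on_boundary=True):
    """Next week boundary; a date already on the boundary moves up only if asked to."""
    floored = floor_week(d, week_start)
    if floored == d and not change_on_boundary:
        return d
    return floored + timedelta(weeks=1)

sunday = date(2020, 5, 10)
print(floor_week(sunday))                                # 2020-05-10
print(ceiling_week(sunday))                              # 2020-05-17
print(ceiling_week(sunday, change_on_boundary=False))    # 2020-05-10
```

With change_on_boundary=True the date is treated as the interval described above, so its ceiling is the next Sunday; with False it is treated as the single instant at midnight and stays put.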

Can I define exact weekday in iso8601 datetime to schedule a job?

The date-time that I have now:
"schedule": "R/2017-10-05T17:21:00/PT15M"
For now the job is scheduled to run every 15 minutes (in chron), but what if I want to run it three times a day at certain times, and only Monday to Friday?
Is it possible to define in this format?
It's not possible.
ISO 8601 can only define static intervals, where the gap between repetitions is always the same. There is no way to define weekdays or weekends. This means you can define an interval for 3 times a day, but not for every day from Monday to Friday only, because the gap between Friday and Monday would not be equal to the others.
What you can do is create 15 jobs (3 times a day for each of the 5 weekdays), each scheduled weekly.
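The workaround can be generated rather than typed by hand. A sketch in Python, where the three run times (08:00, 12:00, 17:00) and the starting Monday are placeholders I picked for the example, and weekly_schedules is an invented helper:

```python
from datetime import date, datetime, time, timedelta

def weekly_schedules(week_monday, times):
    """One weekly repeating interval per (weekday, time) pair, Monday to Friday."""
    if week_monday.weekday() != 0:
        raise ValueError("week_monday must be a Monday")
    schedules = []
    for day_offset in range(5):              # Monday .. Friday
        day = week_monday + timedelta(days=day_offset)
        for t in times:
            start = datetime.combine(day, t)
            schedules.append(f"R/{start.isoformat()}/P1W")
    return schedules

run_times = [time(8, 0), time(12, 0), time(17, 0)]   # hypothetical run times
schedules = weekly_schedules(date(2017, 10, 2), run_times)
for s in schedules:
    print(s)
```

Each of the 15 strings repeats every P1W, so together they cover three runs on every weekday and nothing on the weekend.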

Is IDL able to add / subtract from date?

As the title says, I was wondering if IDL is able to add or subtract days / months / years to or from a given date.
For example:
given_date = anytim('01-jan-2000')
print, given_date
1-Jan-2000 00:00:00.000
If I add 2 weeks to given_date, this date should be the result:
15-Jan-2000 00:00:00.000
I was already looking for a solution for this problem, but I unfortunately couldn't find any solution.
Note:
I am using a normal calendar date, not the julian date.
Are you only concerned with dates after 1582? Is accuracy to the second important?
The ANYTIM routine is not part of the IDL distribution. Possibly there are third party routines to handle time increments, but I don't know of any built into the IDL library.
By default, which you are using, ANYTIM returns seconds from Jan 1, 1979. So to add/subtract some number of days, weeks, or years, you could calculate the number of seconds in the time interval. Of course, this does not take into account leap seconds or leap years (leap years are fairly easy to take into account, but leap seconds require a database of when they were added). And adding months requires knowing which month it is, in order to determine the number of days in it.
IDL can convert to and from Julian dates using JULDAY and CALDAT.
You may also read and write Julian dates (which are doubles or long integers) to and from strings using the format keyword to PRINT, STRING, and READS.
You'll want to use the (C()) calendar date format code.
format='(c(cdi0,"-",cMoa,"-",cyi04," ",cHi02,":",cmi02,":",csf06.3))'
date = julday(1, 1, 2000)
print, date, format=format
; 1-Jan-2000 00:00:00.000
date = date + 14
print, date, format=format
; 15-Jan-2000 00:00:00.000

How to calculate epoch day?

Is calculating the epoch day as simple as taking the epoch seconds and dividing by 86400? Or are there some special calculations that need to be done to take account of daylight savings or leap year or some other factor?
Update: by "epoch day" I mean number of days since the epoch.
POSIX defines that you can deduce the number of days since The Epoch (1970-01-01 00:00:00Z) by dividing the timestamp by 86400. This deliberately and consciously ignores leap seconds.
See the definition of Seconds Since the Epoch:
4.15 Seconds Since the Epoch
A value that approximates the number of seconds that have elapsed since the Epoch. A Coordinated Universal Time name (specified in terms of seconds (tm_sec), minutes (tm_min), hours (tm_hour), days since January 1 of the year (tm_yday), and calendar year minus 1900 (tm_year)) is related to a time represented as seconds since the Epoch, according to the expression below.
If the year is <1970 or the value is negative, the relationship is undefined. If the year is >=1970 and the value is non-negative, the value is related to a Coordinated Universal Time name according to the C-language expression, where tm_sec, tm_min, tm_hour, tm_yday, and tm_year are all integer types:
tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
(tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400
The relationship between the actual time of day and the current value for seconds since the Epoch is unspecified.
How any changes to the value of seconds since the Epoch are made to align to a desired relationship with the current actual time is implementation-defined. As represented in seconds since the Epoch, each and every day shall be accounted for by exactly 86400 seconds.
Note:
The last three terms of the expression add in a day for each year that follows a leap year starting with the first leap year since the Epoch. The first term adds a day every 4 years starting in 1973, the second subtracts a day back out every 100 years starting in 2001, and the third adds a day back in every 400 years starting in 2001. The divisions in the formula are integer divisions; that is, the remainder is discarded leaving only the integer quotient.
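As a sanity check, the expression can be transcribed into Python and compared with the standard library. This is only a sketch: epoch_day and posix_seconds are names chosen for the example, and the two adjustments at the top of posix_seconds exist because Python's struct_time stores the full year and a 1-based day of year, whereas the C fields are year minus 1900 and a 0-based day of year.

```python
import calendar
import time

SECONDS_PER_DAY = 86400

def epoch_day(epoch_seconds):
    """Days since 1970-01-01, for non-negative POSIX timestamps."""
    return epoch_seconds // SECONDS_PER_DAY

def posix_seconds(tm):
    """The POSIX 'Seconds Since the Epoch' expression applied to a UTC struct_time."""
    tm_year = tm.tm_year - 1900   # C convention: years since 1900
    tm_yday = tm.tm_yday - 1      # C convention: 0-based day of year
    return (tm.tm_sec + tm.tm_min * 60 + tm.tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000 + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400 + ((tm_year + 299) // 400) * 86400)

stamp = calendar.timegm((2024, 3, 1, 12, 30, 15))   # 2024-03-01T12:30:15Z
print(posix_seconds(time.gmtime(stamp)) == stamp)   # True
print(epoch_day(stamp))                             # 19783
```

Because every day is counted as exactly 86400 seconds, the integer division is all that is needed; daylight saving time does not enter into it since the timestamp is defined in terms of UTC.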

Resources