Create a custom time zone - r

Is it possible to create a custom time zone in R for handling datetime objects?
More specifically, I am interested in dealing with POSIXct objects, and would like to create a time zone that corresponds to "US/Eastern" minus 17 hours. Time zones with a similar offset do not follow the same daylight saving time convention as the US.
The reason for using a time zone so defined comes from FX trading, for which 5 pm EST is a reasonable 'midnight'.

When you are concerned about a specific "midnight-like" time for each day, I assume that you want to obtain a date (without a time component) which switches over at that time. If that is your intention, then how about simply subtracting 17 hours (= 17*3600 seconds) from your vector of times, and taking the date of the resulting POSIXct value?
That would avoid complicated time zone manipulations, which, as far as I know, are usually handled not by R itself but by the underlying C library, so they might be difficult to achieve from within R. Instead, all computations would be performed in EST, and you'd still get a switchover time different from the local midnight.
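A minimal sketch of that approach; the timestamps are made-up illustrations:

# 'times' is assumed to be a POSIXct vector in "US/Eastern"
times <- as.POSIXct(c("2023-03-10 16:59:00", "2023-03-10 17:01:00"),
                    tz = "US/Eastern")
# shift back by 17 hours so the day boundary falls at 5 pm Eastern
trading_day <- as.Date(times - 17 * 3600, tz = "US/Eastern")
trading_day
# [1] "2023-03-09" "2023-03-10"

The 16:59 observation still belongs to the session labelled with the previous date, while 17:01 has already rolled over to the next one.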

Related

seconds since date to date in R

I have a dataset file with a time variable in "seconds since 1981-01-01 00:00:00". What I need is to convert this time into calendar date (YYYY-MM-DD HH:mm:ss). I've seen a lot of different ways to do this for time since epoch (1970) (timestamp, calendar.timegm, etc) but I'm failing to do this with a different reference date.
One option is to simply add the offset between the two epochs to each value in seconds: 1981-01-01 falls 4018 days (11 years, including three leap days) after 1970-01-01, i.e. 347155200 s. This then allows you to simply use the conversion as it would be from 1970-01-01.
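A hedged sketch; 'secs' is a made-up example vector of the raw values:

# raw values in "seconds since 1981-01-01 00:00:00"
secs <- c(0, 86400)

# option 1: shift to the 1970 epoch by hand, then convert as usual
offset <- 347155200   # 4018 days * 86400 s
as.POSIXct(secs + offset, origin = "1970-01-01", tz = "UTC")

# option 2: let as.POSIXct() take the reference date directly
as.POSIXct(secs, origin = "1981-01-01", tz = "UTC")
# [1] "1981-01-01 UTC" "1981-01-02 UTC"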

What are the consequences of choosing different frequencies for ts objects?

To create a ts object in R, one has to specify the data (a vector or matrix), a start date and the frequency of the time series.
When searching the internet (e.g. Role of frequency parameter in ts), I get the impression that by choosing the frequency, one can emphasise whatever periodic pattern one believes is the most important in the data. However, I doubt that this is actually true. My impression is that it is solely used to compute the dates of the time series on the fly. E.g. when I set the start date "2015-08-01", R automatically transforms it into a decimal date and I get something like 2015.58. If I now choose a frequency of 365 (or 365.25), R divides one year into 365 parts and advances the time by that fraction per observation, so the 366th entry (365 days later) is exactly 2016.58. However, if I choose frequency=7, the fraction assigned to each observation is 1/7, so the date assigned to the 8th day after my start date corresponds to a decimal number between 2016 and 2017. So the only sensible choice for a data set with 365 entries per year is 365, isn't it? And it is only used to actually construct the time axis?
In contrast, if I choose the xts class, an xts object is built from a vector and a matrix, where the index vector has to be created in advance. So there is no need to compute the dates on the fly from a start date and a frequency, which is why no frequency has to be assigned at all.
In both cases I can apply forecasting packages to either ts or xts objects (such as ARIMA, ets, stl, bats, tbats, etc.) without specifying anything else, so this suggests that the frequency is actually not used for anything else. Or am I missing something here?
Thanks in advance for your comments!
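A quick sketch confirming the decimal-date arithmetic described in the question (ts() places observation i at start + (i - 1)/frequency):

x365 <- ts(1:366, start = 2015.58, frequency = 365)
time(x365)[366]   # 2015.58 + 365/365 = 2016.58, exactly one "year" later

x7 <- ts(1:9, start = 2015.58, frequency = 7)
time(x7)[9]       # 2015.58 + 8/7 = 2016.72..., already past 2016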

Creating a Time Series with Half Hourly Data in R

This is my first time ever asking a question on Stack Overflow and I'm a programming novice so any advice as to how to improve my question asking abilities would be appreciated.
Onto my question: I have two csv files. One contains three columns: the date-time in dd/mm/yyyy hh:(00 or 30) format, production of a certain product, and demand for said product. The other contains several columns: the date-time decomposed into year, month, day, hour, and whether it is :00 or :30 (represented by 1 or 2 respectively), alongside several columns of independent variables which may affect production/demand of said product.
I've only played around with the first csv file, converting the string into a datetime object, but the ts() function won't recognise the datetime objects as my times. I've tried adjusting the frequency parameter but ultimately failed, and I have no idea how to create a time series using half-hourly data. Would appreciate any help.
Thanks in advance!
My suggestion is to apply difftime() over all your time data. For instance, as in the following code, you can use the time of the first record as time_start and every other timestamp as time_finish. difftime() then returns the intervals as numbers of seconds, and you can use the other column values as the values at those time stamps.
# include the date in the format so intervals stay correct across days
interval <- as.integer(difftime(strptime(time_finish, "%d/%m/%Y %H:%M"),
                                strptime(time_start, "%d/%m/%Y %H:%M"),
                                units = "secs"))
# Second: 0 10 15 ....
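Beyond computing the intervals, here is a hedged sketch of actually building the series; 'dat', 'datetime' and 'production' are assumed names, not from the original post:

# parse the "dd/mm/yyyy hh:mm" strings into POSIXct
dat$datetime <- as.POSIXct(dat$datetime, format = "%d/%m/%Y %H:%M")

# ts() only ever sees a plain vector; frequency = 48 encodes 48 half-hours per day
production_ts <- ts(dat$production, frequency = 48)

# alternatively, an xts object keeps the real timestamps
library(xts)
production_xts <- xts(dat$production, order.by = dat$datetime)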

Starting minute in minute data aggregation in xts

I have a question about the xts to.hourly and split methods. They seem to assume that if my timestamp is 12:00, it means the minute 12:00-12:01. However, my data provider has 11:59-12:00 in mind. I didn't find any parameters controlling this. Is the only solution to simply lag my time series by one minute?
Your question is actually about the endpoints function, which is what to.hourly, split.xts, and several other functions use to define intervals.
The endpoints function "assumes" that your timestamps are actual datetimes, and a time of 12:00:00 is the beginning of the 12 o'clock hour. If your data have 1-minute resolution, a time of 12:00:00 falls in the interval of 12:00:00.0-12:00:59.999.
There is no parameter you can change to make xts behave as if a time of 12:00:00 means anything other than what it actually is.
If you're certain that your data provider is putting a timestamp of 12:00 on data that occur in the interval of 11:59:00.0-11:59:59.999, then you might be able to simply subtract a small value from the xts object index.
# assuming 'x' is your xts object
.index(x) <- .index(x) - 0.001
That said, you need to think carefully about doing this because making a mistake can cause you to have look-ahead bias.
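An illustrative check with made-up data:

library(xts)
# four one-minute observations straddling the top of the hour
times <- as.POSIXct("2023-01-02 11:58:00", tz = "UTC") + 60 * (0:3)
x <- xts(1:4, order.by = times)

endpoints(x, on = "hours")      # c(0, 2, 4): 12:00 starts the new hour
.index(x) <- .index(x) - 0.001
endpoints(x, on = "hours")      # c(0, 3, 4): 12:00 now falls in the 11 o'clock hour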

Can I make a time series work with date objects rather than integers?

I have time series data that I'm trying to analyse in R. It was provided as a CSV from Excel, which I subsequently read as a data.frame all. Let's say it has two columns, all$date and all$people, representing the count of people on a particular date. The frequency is hence daily.
Being from Excel, the dates are integers representing the number of days since 1900-01-01.
I could read the data as people = ts(all$people, start=c(all$date[1], 1), frequency=365); but that gives a silly start value of almost 40000 because the data starts in 2006. The start parameter doesn't take a date object, according to ?ts, so I can't just use as.Date():
From ?ts:
start: the time of the first observation. Either a single number
or a vector of two integers, which specify a natural time unit and
a (1-based) number of samples into the time unit. See the examples
for the use of the second form.
I could of course set start=1, but it's a bit painful to figure out what season we're in when the plot tells me interesting things are happening around day 2100. (To be clear, setting frequency=365 does tell me what year we're in, but isn't useful for more precise dates.) Is there a useful way of expressing the date in ts in a human-readable form so that I don't have to keep calling as.Date() to understand when the interesting features are happening?
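One hedged option is to skip ts() and index the series by real Date objects with the zoo package (xts works the same way). Note the origin: Windows Excel's 1900 date system converts to R dates with origin "1899-12-30", not "1900-01-01", because of Excel's leap-year quirk.

library(zoo)
# convert Excel day counts to Date objects
all$date <- as.Date(all$date, origin = "1899-12-30")
people <- zoo(all$people, order.by = all$date)
plot(people)   # the x-axis now shows human-readable dates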
