Generate random times in sample of POSIXct - r

I want to generate a load of POSIXct dates. I want the time component to fall only between 9am and 5pm, and only on 15-minute blocks. I know how to generate random POSIXct values between certain dates, but how do I specify the minute blocks and the time range? This is where I am at:
sample(seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="day"), 1000)

Just change the by argument to "15 mins":
sample(seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="15 mins"), 1000)
EDIT:
I overlooked that the time component should be between 9am and 5pm. To take this into account I would filter the sequence:
library(lubridate)
possible_dates <- seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="15 mins")
possible_dates <- possible_dates[hour(possible_dates) < 17 & hour(possible_dates) >=9]
sample(possible_dates, 1000)
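The same filter can be sketched in base R without lubridate. This is only an illustration under assumptions that are not in the question: the time zone is fixed to UTC so the result does not depend on the local setting, and the names slots, slot_hour, office_slots and picked are made up for the example.

```r
set.seed(1)  # only so the sketch is reproducible

# All 15-minute timestamps in the range (UTC is an assumption made here
# so that the result does not depend on the local time zone)
slots <- seq(as.POSIXct("2013-01-01", tz = "UTC"),
             as.POSIXct("2017-05-01", tz = "UTC"),
             by = "15 mins")

# Read the hour as an integer and keep 09:00 through 16:45
slot_hour <- as.integer(format(slots, "%H"))
office_slots <- slots[slot_hour >= 9 & slot_hour < 17]

# Draw 1000 distinct timestamps from the allowed slots
picked <- sample(office_slots, 1000)
head(picked)
```

Because the hour is compared as a number, every sampled value falls inside the 9am to 5pm window.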

As @AEF also pointed out, you can use the by argument to create the sequence in steps of 15 minutes.
x <- seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="15 mins")
You then can use lubridate::hour() like this to extract the values from the sequence and create the sample:
library(lubridate)
# hour() returns an integer, so compare against numbers rather than strings
sample(x[hour(x) >= 9 & hour(x) < 17], 1000)
# Returns 1000 random timestamps on 15-minute marks between 09:00 and 16:45 (the values differ on every run).

OK so I used this in the end:
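# Random appointment days (sampled without replacement)
ApptDate <- sample(seq(as.Date('2013/01/01'), as.Date('2017/05/01'), by = "day"), 1000)
# Hours 9..15 and minutes 0/15/30/45, so the latest possible time is 15:45
Time <- paste(sample(9:15, 1000, replace = TRUE), ":", sample(seq(0, 59, by = 15), 1000, replace = TRUE), sep = "")
FinalPOSIXDate <- as.POSIXct(paste(ApptDate, " ", Time, sep = ""))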
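A variant of this idea avoids pasting hour and minute strings by adding a random number of quarter-hour slots to 09:00 on each day. It is a rough sketch rather than the poster's method: it assumes UTC, samples days with replacement, and covers 09:00 through 16:45. The names appt_day, slot and appt are invented for the example.

```r
set.seed(42)  # only so the sketch is reproducible
n <- 1000

# Random days in the range (sampled with replacement in this sketch)
days <- seq(as.Date("2013-01-01"), as.Date("2017-05-01"), by = "day")
appt_day <- sample(days, n, replace = TRUE)

# 32 quarter-hour slots: slot 0 is 09:00, slot 31 is 16:45
slot <- sample(0:31, n, replace = TRUE)

# Start each day at 09:00 and add the slot offset in seconds
appt <- as.POSIXct(paste(appt_day, "09:00:00"), tz = "UTC") + slot * 15 * 60
head(appt)
```

Working in seconds keeps the minutes on exact 15-minute marks and sidesteps any question about how unpadded strings such as "9:0" are parsed.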

Related

Floor datetime with custom start time (lubridate)

Is there a way to floor dates using a custom start time instead of the earliest possible time?
For example, flooring hours in a day into 2 12-hour intervals starting at 8am and 8pm rather than 12am and 12pm.
Example:
library(lubridate)
x <- ymd_hms("2009-08-03 21:00:00")
y <- ymd_hms("2009-08-03 09:00:00")
floor_date(x, '12 hours')
floor_date(y, '12 hours')
# default lubridate output:
# [1] "2009-08-03 12:00:00 UTC"
# [1] "2009-08-03 UTC"
# what I would like to have:
# [1] "2009-08-03 20:00:00 UTC"
# [1] "2009-08-03 08:00:00 UTC"
You could program a small switch (without lubridate, though).
FUN <- function(x) {
  s <- switch(which.min(abs(mapply(`-`, c(8, 20), as.numeric(substr(x, 12, 13))))),
              "08:00:00", "20:00:00")
  as.POSIXct(paste(as.Date(x), s))
}
FUN("2009-08-03 21:00:00")
# [1] "2009-08-03 20:00:00 CEST"
FUN("2009-08-03 09:00:00")
# [1] "2009-08-03 08:00:00 CEST"
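Note that the switch above picks whichever of 08:00 and 20:00 is nearer, so for an hour such as 19:00 it differs from a true floor. One way to get floor behaviour, sketched here in base R, is to shift the timestamps back by the start offset, floor to the block size, and shift forward again. The function floor_with_offset and its arguments are hypothetical names, and the sketch assumes the times are in UTC, where 12-hour blocks counted from the epoch line up with midnight and noon.

```r
# Floor to blocks of `block_hours` that start at `start_hour` (UTC assumed)
floor_with_offset <- function(x, block_hours = 12, start_hour = 8) {
  block <- block_hours * 3600   # block length in seconds
  shift <- start_hour * 3600    # offset of the first block in seconds
  secs <- as.numeric(x)         # seconds since the epoch
  floored <- floor((secs - shift) / block) * block + shift
  as.POSIXct(floored, origin = "1970-01-01", tz = "UTC")
}

x <- as.POSIXct("2009-08-03 21:00:00", tz = "UTC")
y <- as.POSIXct("2009-08-03 09:00:00", tz = "UTC")
z <- as.POSIXct("2009-08-03 03:00:00", tz = "UTC")

format(floor_with_offset(x), "%Y-%m-%d %H:%M:%S")  # "2009-08-03 20:00:00"
format(floor_with_offset(y), "%Y-%m-%d %H:%M:%S")  # "2009-08-03 08:00:00"
format(floor_with_offset(z), "%Y-%m-%d %H:%M:%S")  # "2009-08-02 20:00:00"
```

With lubridate, the same shift should be expressible as floor_date(x - hours(8), '12 hours') + hours(8).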

convert numeric variable into POSIXct

I have a variable whose values are described as
" the beginning of time interval expressed as the number of millisecond elapsed from the Unix Epoch on January 1st, 1970 at UTC." (according to data source metadata)
This is the head:
x$timeInt
[1] 1.388068e+12 1.388075e+12 1.388096e+12 1.388051e+12 1.388051e+12 1.388072e+12
So I tried to convert it to POSIXct:
as.POSIXct(x$timeInt, origin = '01-01-1970',tz='UTC')
but I get this result:
[1] "43987-03-01 05:20:00 UTC" "43987-05-23 13:20:00 UTC" "43988-01-28 13:20:00 UTC" "43986-08-25 17:20:00 UTC"
[5] "43986-08-25 17:20:00 UTC" "43987-04-25 18:40:00 UTC"
As you can see, the year is totally wrong. I tried other formats for origin, such as "1970-01-01", but the result is the same.
I know that the data was collected in December 2013.
The values are in milliseconds, while as.POSIXct expects seconds, so divide by 1000 first:
x$timeInt <- x$timeInt/1000
And then one of the two approaches:
as.POSIXct(x$timeInt, origin = '1970-01-01',tz='UTC')
or
library(anytime)
anytime(x$timeInt)
#[1] "2013-12-26 15:26:40 CET" "2013-12-26 17:23:20 CET" "2013-12-26 23:13:20 CET" "2013-12-26 10:43:20 CET" "2013-12-26 10:43:20 CET"
#[6] "2013-12-26 16:33:20 CET"
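As a small self-contained check of the conversion, the sketch below uses the first two values from the question (as printed, so rounded) and formats them in UTC. The variable names time_ms and time_ct are made up for the example.

```r
# Milliseconds since the Unix epoch (first two values shown in the question)
time_ms <- c(1.388068e12, 1.388075e12)

# as.POSIXct() counts in seconds, so scale down by 1000 first
time_ct <- as.POSIXct(time_ms / 1000, origin = "1970-01-01", tz = "UTC")

format(time_ct, "%Y-%m-%d %H:%M:%S")
# "2013-12-26 14:26:40" "2013-12-26 16:23:20"
```

These are the same instants as the CET values printed above, one hour earlier because they are shown in UTC.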

Generate a uniformly sampled time series object in R

Hi, I am looking to generate a uniformly sampled time series at 30-minute intervals from a particular start date to some end date. However, the constraint is that on each day the 30-minute intervals begin at 7:00 and end at 18:30, i.e. I need the time series object to be something like
c('2016-08-19 07:00:00',
'2016-08-19 07:30:00',
...,
'2016-08-19 18:30:00',
'2016-08-20 07:00:00',
...,
'2016-08-20 18:30:00',
...
'2016-08-31 18:30:00')
Without the constraints it can be done with something like
seq(as.POSIXct('2016-08-19 07:00:00'), as.POSIXct('2016-08-21 18:30:00'), by="30 min")
But I don't want the times between '2016-08-20 18:30:00' and '2016-08-21 07:30:00' in this case. Any help will be appreciated. Thanks!
Using the example series you created:
ts <- seq(as.POSIXct('2016-08-19 07:00:00'),
as.POSIXct('2016-08-21 18:30:00'), by="30 min")
Pull out the hours from your series using strftime:
hours <- strftime(ts, format="%H:%M:%S")
> head(hours)
[1] "07:00:00" "07:30:00"
[3] "08:00:00" "08:30:00"
[5] "09:00:00" "09:30:00"
You can then convert it back to POSIXct:
hours <- as.POSIXct(hours, format="%H:%M:%S")
This will retain the times of the day but it will make the date today's date:
> head(hours)
[1] "2016-09-11 07:00:00 EDT"
[2] "2016-09-11 07:30:00 EDT"
[3] "2016-09-11 08:00:00 EDT"
[4] "2016-09-11 08:30:00 EDT"
[5] "2016-09-11 09:00:00 EDT"
[6] "2016-09-11 09:30:00 EDT"
> tail(hours)
[1] "2016-09-11 16:00:00 EDT"
[2] "2016-09-11 16:30:00 EDT"
[3] "2016-09-11 17:00:00 EDT"
[4] "2016-09-11 17:30:00 EDT"
[5] "2016-09-11 18:00:00 EDT"
[6] "2016-09-11 18:30:00 EDT"
You can then create a TRUE/FALSE vector based on the condition you want:
today <- format(Sys.Date())  # the date that as.POSIXct() filled in above
condition <- hours >= as.POSIXct(paste(today, "07:00:00")) &
  hours <= as.POSIXct(paste(today, "18:30:00"))
Then filter your original series based on this condition:
ts[condition]
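The round trip through today's date can be avoided, because zero-padded "HH:MM:SS" strings sort in the same order as the times they represent. The sketch below compares the time-of-day strings directly. It assumes UTC so that the result does not depend on the local time zone, and the names stamps, time_of_day and kept are invented for the example.

```r
# Full 30-minute grid over three days (UTC is an assumption of this sketch)
stamps <- seq(as.POSIXct("2016-08-19 07:00:00", tz = "UTC"),
              as.POSIXct("2016-08-21 18:30:00", tz = "UTC"),
              by = "30 min")

# Time of day as zero-padded text, which compares correctly as a string
time_of_day <- format(stamps, "%H:%M:%S")

# Keep 07:00 through 18:30 inclusive on every day
kept <- stamps[time_of_day >= "07:00:00" & time_of_day <= "18:30:00"]

length(kept)  # 24 slots per day over 3 days
```

Both bounds are inclusive here, so each day runs from 07:00 to 18:30 as the question asks.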
Here is my short and handy solution with package lubridate
library("lubridate")
times <- lapply(0:2, function(x) {
  temp <- ymd_hms('2016-08-19 07:00:00') + days(x)
  result <- temp + minutes(seq(0, 690, 30))
  format(result, "%Y-%m-%d %H:%M:%S")
})
do.call("c", times)
I use format(result, ...) to drop the time zone label. It formats the times in the object's own time zone (UTC here), whereas strftime(result) converts them to the local time zone first.

Decompose xts hourly time series

I want to decompose hourly time series with decompose, ets, or stl or whatever function. Here is an example code and its output:
require(xts)
require(forecast)
time_index1 <- seq(from = as.POSIXct("2012-05-15 07:00"),
                   to = as.POSIXct("2012-05-17 18:00"), by="hour")
head(time_index1 <- format(time_index1, format="%Y-%m-%d %H:%M:%S",
                           tz="UTC", usetz=TRUE))
# [1] "2012-05-15 05:00:00 UTC" "2012-05-15 06:00:00 UTC"
# [3] "2012-05-15 07:00:00 UTC" "2012-05-15 08:00:00 UTC"
# [5] "2012-05-15 09:00:00 UTC" "2012-05-15 10:00:00 UTC"
head(time_index <- as.POSIXct(time_index1))
# [1] "2012-05-15 05:00:00 CEST" "2012-05-15 06:00:00 CEST"
# [3] "2012-05-15 07:00:00 CEST" "2012-05-15 08:00:00 CEST"
# [5] "2012-05-15 09:00:00 CEST" "2012-05-15 10:00:00 CEST"
Why does the timezone for time_index change back to CEST?
set.seed(1)
value <- rnorm(n = length(time_index1))
eventdata1 <- xts(value, order.by = time_index)
tzone(eventdata1)
# [1] ""
head(index(eventdata1))
# [1] "2012-05-15 05:00:00 CEST" "2012-05-15 06:00:00 CEST"
# [3] "2012-05-15 07:00:00 CEST" "2012-05-15 08:00:00 CEST"
# [5] "2012-05-15 09:00:00 CEST" "2012-05-15 10:00:00 CEST"
ets(eventdata1)
# ETS(A,N,N)
#
# Call:
# ets(y = eventdata1)
#
# Smoothing parameters:
# alpha = 1e-04
#
# Initial states:
# l = 0.1077
#
# sigma: 0.8481
#
# AIC AICc BIC
# 229.8835 230.0940 234.0722
decompose(eventdata1)
# Error in decompose(eventdata1) :
# time series has no or less than 2 periods
stl(eventdata1)
# Error in stl(eventdata1) :
# series is not periodic or has less than two periods
When I call tzone or indexTZ there is no timezone, but the index clearly shows that the times are defined with a timezone.
Also, why does only ets work? Can it be used to decompose a time series?
Why does the timezone for time_index change back to CEST?
Because you didn't specify tz= in your call to as.POSIXct. It will only pick up the timezone from the string if it's specified by offset from UTC (e.g. -0800). See ?strptime.
R> head(time_index <- as.POSIXct(time_index1, "UTC"))
[1] "2012-05-15 12:00:00 UTC" "2012-05-15 13:00:00 UTC"
[3] "2012-05-15 14:00:00 UTC" "2012-05-15 15:00:00 UTC"
[5] "2012-05-15 16:00:00 UTC" "2012-05-15 17:00:00 UTC"
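To see the effect of the tz argument in isolation, the sketch below parses the same clock time once as UTC and once as Europe/Berlin. The zone name is only an assumption standing in for the asker's CEST setting, and in_utc and in_berlin are made-up names.

```r
stamp <- "2012-05-15 05:00:00"

# Same text, two interpretations depending on tz
in_utc <- as.POSIXct(stamp, tz = "UTC")
in_berlin <- as.POSIXct(stamp, tz = "Europe/Berlin")

attr(in_utc, "tzone")     # "UTC"
attr(in_berlin, "tzone")  # "Europe/Berlin"

# 05:00 in Berlin (CEST, UTC+2) is two hours earlier than 05:00 UTC
as.numeric(in_utc) - as.numeric(in_berlin)  # 7200 seconds
```

The printed clock time is the same in both cases, but the underlying instants differ by the UTC offset, which is why passing tz explicitly matters.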
When I call tzone or indexTZ there is no timezone, but the index clearly shows that the times are defined with a timezone.
All POSIXct objects have a timezone. A timezone of "" simply means R wasn't able to determine a specific timezone, so it is using the timezone specified by your operating system. See ?timezone.
Only the ets function works because your xts object doesn't have a properly defined frequency attribute. This is a known limitation of xts objects, and I have plans to address it over the next several months. You can work around the current issue by explicitly specifying the frequency attribute after calling the xts constructor.
R> set.seed(1)
R> value <- rnorm(n = length(time_index1))
R> eventdata1 <- xts(value, order.by = time_index)
R> attr(eventdata1, 'frequency') <- 24 # set frequency attribute
R> decompose(as.ts(eventdata1)) # decompose expects a 'ts' object
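The role of the frequency attribute can be seen without xts at all. The following sketch builds a plain ts object with 24 observations per period from the same 60 simulated hourly values, which is an assumption made to keep the example self-contained. The names hourly, parts and fit are invented.

```r
set.seed(1)
value <- rnorm(60)  # 60 hourly values, as in the question's date range

# frequency = 24 declares one seasonal period per day of hourly data
hourly <- ts(value, frequency = 24)

# With 2.5 periods available, both functions now accept the series
parts <- decompose(hourly)
fit <- stl(hourly, s.window = "periodic")

length(parts$figure)  # 24 seasonal effects, one per hour of the day
```

With the default frequency of 1 the series has no seasonal period, which is what the "less than 2 periods" errors in the question were reporting.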
You can use tbats to decompose hourly data:
require(forecast)
set.seed(1)
time_index1 <- seq(from = as.POSIXct("2012-05-15 07:00"),
to = as.POSIXct("2012-05-17 18:00"), by="hour")
value <- rnorm(n = length(time_index1))
eventdata1 <- msts(value, seasonal.periods = c(24) )
seasonaldecomp <- tbats(eventdata1)
plot(seasonaldecomp)
Additionally, using msts instead of xts allows you to specify multiple seasonal periods, for instance daily and weekly cycles in hourly data: c(24, 24*7)

How to add the time to a date when using as.Date?

I have measurements that were taken at this time: 13880 and they represent "days since 1970-01-01 00:00:00"
So now I want to know the date and time:
as.Date(13880, origin="1970-01-01")
[1] "2008-01-02" # works fine
Now to add the time:
as.Date(13880, origin="1970-01-01",tz = "UTC", format="%Y/%m/%d %H:%M:%S")
[1] NA
or
as.POSIXct(13880, origin="1970-01-01")
[1] "1970-01-01 04:51:20 CET"
as.POSIXlt(13879, origin="1970-01-01")
[1] "1970-01-01 04:51:19 CET"
None of these worked for me. Any idea?
as.POSIXct(as.Date("1970-01-01") + 13880) # returns "2008-01-01 19:00:00 EST"
as.POSIXct(as.Date("1970-01-01") + 13880.5) # returns "2008-01-02 07:00:00 EST"
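An equivalent route, sketched below, is to convert the day count to seconds and let as.POSIXct count from the epoch. It assumes the values are in UTC, and days_since_epoch and stamps are made-up names for the example.

```r
# Days since 1970-01-01 00:00:00, possibly with a fractional part
days_since_epoch <- c(13880, 13880.5)

# as.POSIXct() counts seconds, and a day has 86400 of them
stamps <- as.POSIXct(days_since_epoch * 86400, origin = "1970-01-01", tz = "UTC")

format(stamps, "%Y-%m-%d %H:%M:%S")
# "2008-01-02 00:00:00" "2008-01-02 12:00:00"
```

Fixing tz to UTC keeps the printed time independent of the machine's local time zone, which is the source of the EST offsets in the outputs above.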
You can also set your time zone:
How to change the default time zone in R?
also: http://blog.revolutionanalytics.com/2009/06/converting-time-zones.html

Resources