Hi I am looking to generate a uniformly sampled time series at 30 minute interval from a particular start date to some end date. However the constraint is that on each day the 30 minute interval begins at 7:00 and ends at 18:30 i.e. I need the time series object to be something like
c('2016-08-19 07:00:00',
'2016-08-19 07:30:00',
...,
'2016-08-19 18:30:00',
'2016-08-20 07:00:00',
...,
'2016-08-20 18:30:00',
...
'2016-08-31 18:30:00')
Without the constraints it can be done with something like
seq(as.POSIXct('2016-08-19 07:00:00'), as.POSIXct('2016-08-21 18:30:00'), by="30 min")
But I dont want the times between '2016-08-20 18:30:00' and '2016-08-21 07:30:00' in this case. Any help will be appreciated. Thanks!
Using the example series you created:
ts <- seq(as.POSIXct('2016-08-19 07:00:00'),
as.POSIXct('2016-08-21 18:30:00'), by="30 min")
Pull out the hours from your series using strftime:
hours <- strftime(ts, format="%H:%M:%S")
> head(hours)
[1] "07:00:00" "07:30:00"
[3] "08:00:00" "08:30:00"
[5] "09:00:00" "09:30:00"
You can then convert it back to POSIXct:
hours <- as.POSIXct(hours, format="%H:%M:%S")
This will retain the times of the day but it will make the date today's date:
> head(hours)
[1] "2016-09-11 07:00:00 EDT"
[2] "2016-09-11 07:30:00 EDT"
[3] "2016-09-11 08:00:00 EDT"
[4] "2016-09-11 08:30:00 EDT"
[5] "2016-09-11 09:00:00 EDT"
[6] "2016-09-11 09:30:00 EDT"
> tail(hours)
[1] "2016-09-11 16:00:00 EDT"
[2] "2016-09-11 16:30:00 EDT"
[3] "2016-09-11 17:00:00 EDT"
[4] "2016-09-11 17:30:00 EDT"
[5] "2016-09-11 18:00:00 EDT"
[6] "2016-09-11 18:30:00 EDT"
You can then create a TRUE/FALSE vector based on the condition you want:
condition <- hours > "2016-09-11 07:30:00 EDT" &
hours < "2016-09-11 18:30:00 EDT"
Then filter your original series based on this condition:
ts[condition]
Here is my short and handy solution with package lubridate
library("lubridate")
list <- lapply(0:2, function(x){
temp <- ymd_hms('2016-08-19 07:00:00') + days(x)
result <- temp + minutes(seq(0, 690, 30))
return(strftime(result))
})
do.call("c", list)
I have to use strftime(result) to remove the timezone and to have the right times.
Related
Is there a way to floor dates using a custom start time instead of the earliest possible time?
For example, flooring hours in a day into 2 12-hour intervals starting at 8am and 8pm rather than 12am and 12pm.
Example:
x <- ymd_hms("2009-08-03 21:00:00")
y <- ymd_hms("2009-08-03 09:00:00")
floor_date(x, '12 hours')
floor_date(y, '12 hours')
# default lubridate output:
[1] "2009-08-03 12:00:00 UTC"
[1] "2009-08-03 UTC"
# what i would like to have:
[1] "2009-08-03 20:00:00 UTC"
[1] "2009-08-03 08:00:00 UTC"
You could program a small switch (without lubridate, though).
FUN <- function(x) {
s <- switch(which.min(abs(mapply(`-`, c(8, 20), as.numeric(substr(x, 12, 13))))),
"08:00:00", "20:00:00")
as.POSIXct(paste(as.Date(x), s))
}
FUN("2009-08-03 21:00:00")
# [1] "2009-08-03 20:00:00 CEST"
FUN("2009-08-03 09:00:00")
# [1] "2009-08-03 08:00:00 CEST"
I want to generate a load of POSIXct dates. I want to have the time component only between 9am and 5pm and only at 15 minute blocks. I know how to generate the random POSIXct between certain dates but how do I specify the minute blocks and the time range. This is where I am at:
sample(seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="day"), 1000)
Just change the by argument to 15mins:
sample(seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="15 mins"), 1000)
EDIT:
I overlooked that the time component should be between 9am and 5pm. To take this into account I would filter the sequence:
library(lubridate)
possible_dates <- seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="15 mins")
possible_dates <- possible_dates[hour(possible_dates) < 17 & hour(possible_dates) >=9]
sample(possible_dates, 1000)
As #AEF also pointed out, you can use the argument by to create the sequence in steps of 15 minutes.
x <- seq(as.POSIXct('2013/01/01'), as.POSIXct('2017/05/01'), by="15 mins")
You then can use lubridate::hour() like this to extract the values from the sequence and create the sample:
library(lubridate)
sample(x[hour(x) > "09:00" & hour(x) < "17:00"], 1000)
# [1] "2015-06-28 12:45:00 CEST" "2014-05-04 10:15:00 CEST" "2017-01-08 01:00:00 CET" "2015-06-22 12:30:00 CEST"
# [5] "2016-01-14 13:30:00 CET" "2015-06-15 14:00:00 CEST" "2014-11-20 13:15:00 CET" "2013-09-23 11:15:00 CEST"
# [9] "2014-11-25 11:30:00 CET" "2014-12-04 15:30:00 CET" "2016-05-28 14:45:00 CEST" "2017-01-12 14:15:00 CET"
# .....
OK so I used this in the end:
ApptDate<-sample(seq(as.Date('2013/01/01'), as.Date('2017/05/01'), by="day"), 1000)
Time<-paste(sample(9:15,1000,replace=T),":",sample(seq(0,59,by=15),1000,replace=T),sep="")
FinalPOSIXDate<-as.POSIXct(paste(ApptDate," ",Time,sep=""))
I am trying to do some simple operation in R, after loading a table i encountered a date column which has many formats combined.
**Date**
1/28/14 6:43 PM
1/29/14 4:10 PM
1/30/14 12:09 PM
1/30/14 12:12 PM
02-03-14 19:49
02-03-14 20:03
02-05-14 14:33
I need to convert this to format like 28-01-2014 18:43 i.e. %d-%m-%y %h:%m
I tried this
tablename$Date <- as.Date(as.character(tablename$Date), "%d-%m-%y %h:%m")
but doing this its filling NA in the entire column. Please help me to get this right!
The lubridate package makes quick work of this:
library(lubridate)
d <- parse_date_time(dates, names(guess_formats(dates, c("mdy HM", "mdy IMp"))))
d
## [1] "2014-01-28 18:43:00 UTC" "2014-01-29 16:10:00 UTC"
## [3] "2014-01-30 12:09:00 UTC" "2014-01-30 12:12:00 UTC"
## [5] "2014-02-03 19:49:00 UTC" "2014-02-03 20:03:00 UTC"
## [7] "2014-02-05 14:33:00 UTC"
# put in desired format
format(d, "%m-%d-%Y %H:%M:%S")
## [1] "01-28-2014 18:43:00" "01-29-2014 16:10:00" "01-30-2014 12:09:00"
## [4] "01-30-2014 12:12:00" "02-03-2014 19:49:00" "02-03-2014 20:03:00"
## [7] "02-05-2014 14:33:00"
You'll need to adjust the vector in guess_formats if you come across other format variations.
From two integers (1, 5) one can create a range in the following way
1:5
[1] 1 2 3 4 5
How can you make a range of dates if you are give two dates ("2014-09-04 JST", "2014-09-11 JST")
The output must be
[1] ("2014-09-04 JST", "2014-09-05 JST", "2014-09-06 JST", "2014-09-07 JST", "2014-09-08 JST")
Does this help?
seq(as.Date("2014/09/04"), by = "day", length.out = 5)
# [1] "2014-09-04" "2014-09-05" "2014-09-06" "2014-09-07" "2014-09-08"
edit: adding in something about timezones
this works for my current timezone
seq(c(ISOdate(2014,4,9)), by = "DSTday", length.out = 5)
#[1] "2014-04-09 08:00:00 EDT" "2014-04-10 08:00:00 EDT" "2014-04-11 08:00:00 EDT" "2014-04-12 08:00:00 EDT"
#[5] "2014-04-13 08:00:00 EDT"
edit2:
OlsonNames() # I used this to find out what to write for the JST tz - it's "Japan"
x <- as.POSIXct("2014-09-04 23:59:59", tz="Japan")
format(seq(x, by="day", length.out=5), "%Y-%m-%d %Z")
# [1] "2014-09-04 JST" "2014-09-05 JST" "2014-09-06 JST" "2014-09-07 JST" "2014-09-08 JST"
To get a sequence of dates ( days, weeks,.. ) using only start and end dates you can use:
seq(as.Date("2014/1/1"), as.Date("2014/1/10"), "days”)
[1] "2014-01-01" "2014-01-02" "2014-01-03" "2014-01-04" "2014-01-05" "2014-01-06" "2014-01-07"
[8] "2014-01-08" "2014-01-09" "2014-01-10”
Here's an answer, admittedly worse than #jalapic's, that doesn't use seq and instead uses a for loop:
date1 <- "2014-09-04"
date2 <- "2014-09-11"
dif <- as.numeric(abs(as.Date(date1) - as.Date(date2)))
dates <- vector()
for (i in 1:dif) {
date <- (as.Date(date1) + i)
dates <- append(dates, date)
}
# [1] "2014-09-05" "2014-09-06" "2014-09-07" "2014-09-08" "2014-09-09" "2014-09-10" "2014-09-11
here's a shot though the timezone JST isn't recognized by my system
d1<-ISOdate(year=2014,month=9,day=4,tz="GMT")
seq(from=d1,by="day",length.out=5)
[1] "2014-09-04 12:00:00 GMT" "2014-09-05 12:00:00 GMT" "2014-09-06 12:00:00 GMT" "2014-09-07 12:00:00 GMT" "2014-09-08 12:00:00 GMT"
While using seq(date1, date2, "days") is by far the better option in nearly all cases, I'd just like to add, that the following works too, if you need a range of dates that are n_number of days from a date:
1:10 + as.Date("2020-01-01")
# [1] "2020-01-02" "2020-01-03" "2020-01-04" "2020-01-05"
# [5] "2020-01-06" "2020-01-07" "2020-01-08" "2020-01-09"
# [9] "2020-01-10" "2020-01-11"
I am running into an error when I try to localize times for "date" (a variable of class=POSIXlt) in my dataset. Example code is as follows:
# All dates are coded by survey software in EST(not local time)
date <- c("2011-07-26 07:23", "2011-07-29 07:34", "2011-07-29 07:40")
region <-c("USA-EST", "UK", "Singapore")
#Change the times based on time-zone differences
start_time<-strptime(date,"%Y-%m-%d %h:%m")
localtime=as.POSIXlt(start_time)
localtime<-ifelse(region=="UK",start_time+6,start_time)
localtime<-ifelse(region=="Singapore",start_time+12,start_time)
#Then, I need to extract the hour and weekday
weekday<-weekdays(localtime)
hour<-factor(localtime)
There must be something wrong with my "ifelse" statement, because I get the error: number of items to replace is not a multiple of replacement length. Please help!
How about using R's native time code? The trick is that you can't have more than one time-zone in a POSIX vector, so use a list instead:
region <- c("EST","Europe/London","Asia/Singapore")
(localtime <- lapply(seq(date),function(x) as.POSIXlt(date[x],tz=region[x])))
[[1]]
[1] "2011-07-26 07:23:00 EST"
[[2]]
[1] "2011-07-29 07:34:00 Europe/London"
[[3]]
[1] "2011-07-29 07:40:00 Asia/Singapore"
And to convert to a vector in a single timezone:
Reduce("c",localtime)
[1] "2011-07-26 13:23:00 BST" "2011-07-29 07:34:00 BST"
[3] "2011-07-29 00:40:00 BST"
Note that my system timezone is BST, but if yours is EST it will convert to that.
You can use the timezone handling built in in POSIXct:
> start_time <- as.POSIXct(date,"%Y-%m-%d %H:%M", tz = "America/New_York")
> start_time
[1] "2011-07-26 07:23:00 EDT" "2011-07-29 07:34:00 EDT" "2011-07-29 07:40:00 EDT"
> format(start_time, tz="Europe/London", usetz=TRUE)
[1] "2011-07-26 12:23:00 BST" "2011-07-29 12:34:00 BST" "2011-07-29 12:40:00 BST"
> format(start_time, tz="Asia/Singapore", usetz=TRUE)
[1] "2011-07-26 19:23:00 SGT" "2011-07-29 19:34:00 SGT" "2011-07-29 19:40:00 SGT"