I'm currently playing around a lot with dates and times for a package I'm building.
Stumbling across this post reminded me that it's generally a good idea to check whether something can be done with base R features before turning to contrib packages.
Thus, is it possible to round a date of class POSIXct with base R functionality?
I checked
methods(round)
which "only" gave me
[1] round.Date round.timeDate*
Non-visible functions are asterisked
This is what I'd like to do (Pseudo Code)
x <- as.POSIXct(Sys.time())
[1] "2012-07-04 10:33:55 CEST"
round(x, atom="minute")
[1] "2012-07-04 10:34:00 CEST"
round(x, atom="hour")
[1] "2012-07-04 11:00:00 CEST"
round(x, atom="day")
[1] "2012-07-04 CEST"
I know this can be done with timeDate, lubridate etc., but I'd like to keep package dependencies down. So before going ahead and checking out the source code of the respective packages, I thought I'd ask if someone has already done something like this.
base has round.POSIXt to do this. Not sure why it doesn't come up with methods.
x <- as.POSIXct(Sys.time())
x
[1] "2012-07-04 10:01:08 BST"
round(x,"mins")
[1] "2012-07-04 10:01:00 BST"
round(x,"hours")
[1] "2012-07-04 10:00:00 BST"
round(x,"days")
[1] "2012-07-04"
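One thing to keep in mind: round() and trunc() on a date-time return a POSIXlt object, so wrap the result in as.POSIXct() if you want to stay in POSIXct. A minimal sketch, using a fixed timestamp and tz = "UTC" (both my own choices, only so the result is reproducible):

```r
# Fixed timestamp in UTC so the results do not depend on the local time zone
x <- as.POSIXct("2012-07-04 10:33:55", tz = "UTC")

# round()/trunc() return POSIXlt; convert back to POSIXct explicitly
r_min  <- as.POSIXct(round(x, "mins"))   # nearest minute
r_hour <- as.POSIXct(round(x, "hours"))  # nearest hour
r_day  <- as.POSIXct(round(x, "days"))   # nearest day
f_hour <- as.POSIXct(trunc(x, "hours"))  # floor to the hour

format(r_min, "%Y-%m-%d %H:%M:%S")
format(r_hour, "%Y-%m-%d %H:%M:%S")
```

trunc() is the base counterpart for rounding down; base R has no direct "ceiling" for date-times as far as I know.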
On this theme with lubridate, also look into the ceiling_date() and floor_date() functions:
x <- as.POSIXct("2009-08-03 12:01:59.23")
ceiling_date(x, "second")
# "2009-08-03 12:02:00 CDT"
ceiling_date(x, "hour")
# "2009-08-03 13:00:00 CDT"
ceiling_date(x, "day")
# "2009-08-04 CDT"
ceiling_date(x, "week")
# "2009-08-09 CDT"
ceiling_date(x, "month")
# "2009-09-01 CDT"
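If you want something similar without lubridate, rounding up can be sketched in base R by working on the underlying seconds since the epoch. ceiling_secs below is a hypothetical helper of mine, not a base or lubridate function, and it only lines up with clock boundaries when the time zone's UTC offset is a whole multiple of the unit (so it is fine for minutes, and for hours in most zones, but not for days, weeks or months):

```r
# Hypothetical helper: round a POSIXct up to the next multiple of `secs`
# seconds since 1970-01-01 UTC, keeping the time zone attribute of the input.
ceiling_secs <- function(x, secs) {
  tz <- attr(x, "tzone")
  if (is.null(tz)) tz <- ""
  as.POSIXct(ceiling(as.numeric(x) / secs) * secs,
             origin = "1970-01-01", tz = tz)
}

x <- as.POSIXct("2009-08-03 12:01:59.23", tz = "UTC")
up_min  <- ceiling_secs(x, 60)    # next full minute
up_hour <- ceiling_secs(x, 3600)  # next full hour

format(up_min, "%Y-%m-%d %H:%M:%S")
format(up_hour, "%Y-%m-%d %H:%M:%S")
```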
If you don't want to call external libraries and want to keep POSIXct, as I do, here is one idea (inspired by this question): use strptime and paste in a fake month and day. It should be possible to do it in a more straightforward way, as said in this comment:
"For strptime the input string need not specify the date completely:
it is assumed that unspecified seconds, minutes or hours are zero, and
an unspecified year, month or day is the current one."
Thus it seems that you have to use strftime to output a truncated string, paste the missing part and convert again in POSIXct.
This is how an updated answer could look:
x <- as.POSIXct(Sys.time())
x
[1] "2018-12-27 10:58:51 CET"
round(x,"mins")
[1] "2018-12-27 10:59:00 CET"
round(x,"hours")
[1] "2018-12-27 11:00:00 CET"
round(x,"days")
[1] "2018-12-27 CET"
as.POSIXct(paste0(strftime(x,format="%Y-%m"),"-01")) #trunc by month
[1] "2018-12-01 CET"
as.POSIXct(paste0(strftime(x,format="%Y"),"-01-01")) #trunc by year
[1] "2018-01-01 CET"
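An alternative that avoids the round trip through strings is to zero out the fields of the POSIXlt representation. This is only a sketch: trunc_to_month is a hypothetical helper name, and the fixed timestamp and tz = "UTC" are my assumptions to make the example reproducible.

```r
# Hypothetical helper: truncate a date-time to the first instant of its month
trunc_to_month <- function(x) {
  lt <- as.POSIXlt(x)
  lt$mday  <- 1L   # first day of the month
  lt$hour  <- 0L
  lt$min   <- 0L
  lt$sec   <- 0
  lt$isdst <- -1L  # let R work out daylight saving time again
  as.POSIXct(lt)
}

x <- as.POSIXct("2018-12-27 10:58:51", tz = "UTC")
m <- trunc_to_month(x)
format(m, "%Y-%m-%d %H:%M:%S")
```

Setting lt$mon to 0L as well would give truncation by year in the same way.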
I am trying to convert strings of date and time in R using the anytime() function. The strings have the format 'day-month-year hour:minute:second'. It seems that anytime() does not work in some cases, as shown in the example below.
library(anytime)
x1<-"03-01-2019 01:00:00"
x2<-"23-01-2019 17:00:00"
anytime(x1)
[1] "2019-03-01 01:00:00 CET"
anytime(x2)
[1] NA
I am trying to figure out how to get rid of this problem. Thanks for your help :)
You can use addFormats() to add a specific format to the stack known and used by anytime() (and/or anydate())
> library(anytime)
> addFormats("%d-%m-%Y %H:%M:%S")
> anytime(c("03-01-2019 01:00:00", "23-01-2019 17:00:00"))
[1] "2019-01-03 01:00:00 CST" "2019-01-23 17:00:00 CST"
>
As explained a few times before on this site, the xx-yy notation is ambiguous and interpreted differently in different parts of the world.
So anytime is guided by the separator used: / is more common in North America, so we use "mm/dd/yyyy". On the other hand, a hyphen is more common in Europe, so the "dd-mm-yyyy" starts that way. You can use getFormats() to see the formats known to anytime(). In a fresh session:
> head(getFormats(), 12) ## abbreviated for display here
[1] "%Y-%m-%d %H:%M:%S%f" "%Y-%m-%e %H:%M:%S%f" "%Y-%m-%d %H%M%S%f"
[4] "%Y-%m-%e %H%M%S%f" "%Y/%m/%d %H:%M:%S%f" "%Y/%m/%e %H:%M:%S%f"
[7] "%Y%m%d %H%M%S%f" "%Y%m%d %H:%M:%S%f" "%m/%d/%Y %H:%M:%S%f"
[10] "%m/%e/%Y %H:%M:%S%f" "%m-%d-%Y %H:%M:%S%f" "%m-%e-%Y %H:%M:%S%f"
>
You can use it without the head() to see all.
An alternative approach could be using parsedate package:
library(parsedate)
> parsedate::parse_date(x1)
[1] "2019-03-01 01:00:00 UTC"
> parsedate::parse_date(x2)
[1] "2019-01-23 17:00:00 UTC"
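If the format of the strings is known in advance, base R can also parse them without any guessing by passing an explicit format to as.POSIXct(). A minimal sketch, where tz = "UTC" is my assumption (use whatever zone the data were recorded in):

```r
x <- c("03-01-2019 01:00:00", "23-01-2019 17:00:00")

# Day first, then month, then year: no ambiguity for the parser
p <- as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

format(p, "%Y-%m-%d %H:%M:%S")
```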
I’m having difficulties with a date time problem. My data frame looks like this and I want to find the duration that each person watches TV.
Start.Time <- c(193221,201231,152324,182243,123432,192245)
End.Time <- c(202013,211232,154521,183422,133121,201513)
cbind(Start.Time,End.Time)
I have tried different methods to convert them so that I can do calculations, but none of them produced useful results.
as.POSIXct(Start.Time , origin="2015-11-01")
My results are completely wrong
[1] "2015-11-03 05:40:21 GMT" "2015-11-03 07:53:51 GMT"
[3] "2015-11-02 18:18:44 GMT" "2015-11-03 02:37:23 GMT"
[5] "2015-11-02 10:17:12 GMT" "2015-11-03 05:24:05 GMT"
For example I want 193221 to become 19:32:21 HH:MM:SS
Is there a package out there that easily does the conversion? And if it's possible, I don't want the date displayed, just the time.
You can convert your numbers to actual time stamps (in POSIXct format) like this:
Start.Time <- c(193221,201231,152324,182243,123432,192245)
Start.POSIX <- as.POSIXct(as.character(Start.Time), format = "%H%M%S")
Start.POSIX
## [1] "2015-12-19 19:32:21 CET" "2015-12-19 20:12:31 CET" "2015-12-19 15:23:24 CET"
## [4] "2015-12-19 18:22:43 CET" "2015-12-19 12:34:32 CET" "2015-12-19 19:22:45 CET"
As you can see, as.POSIXct assumes the times belong to the current date. POSIXct always denotes a specific moment in time and thus contains not only a time but also a date. You can now easily do calculations with these:
End.Time <- c(202013,211232,154521,183422,133121,201513)
End.POSIX <- as.POSIXct(as.character(End.Time), format = "%H%M%S")
End.POSIX - Start.POSIX
## Time differences in mins
## [1] 47.86667 60.01667 21.95000 11.65000 56.81667 52.46667
When you print the POSIXct objects (as I did above with Start.POSIX), they are actually converted to characters, and these are printed. You can see this because there are " around the dates. You can control the format that is used when printing, and thus you can print only the times as follows:
format(Start.POSIX, "%H:%M:%S")
## [1] "19:32:21" "20:12:31" "15:23:24" "18:22:43" "12:34:32" "19:22:45"
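One caveat with as.character() on the raw numbers: a time before 10:00:00 such as 93005 has only five digits and may not parse as intended with "%H%M%S". A hedged sketch that pads to six digits first and pins the date and time zone so the result does not depend on the current day; parse_hms is a hypothetical helper, and the dummy date, tz = "UTC" and the second pair of times are my assumptions:

```r
# Hypothetical helper: turn HHMMSS numbers into POSIXct on a fixed dummy date
parse_hms <- function(x) {
  padded <- sprintf("%06d", as.integer(x))  # 93005 -> "093005"
  as.POSIXct(paste("1970-01-01", padded),
             format = "%Y-%m-%d %H%M%S", tz = "UTC")
}

start <- parse_hms(c(193221, 93005))
end   <- parse_hms(c(202013, 101505))

# Ask for minutes explicitly instead of letting R pick the unit
watched <- difftime(end, start, units = "mins")
watched
format(start, "%H:%M:%S")
```

This assumes that start and end fall on the same day; a session that runs past midnight would need the date as well.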
When I put a single date to be parsed, it parses accurately
> ymd("20011001")
[1] "2001-10-01 UTC"
But when I try to create a vector of dates they all come out one day off:
> b=c(ymd("20111001"),ymd("20101001"),ymd("20091001"),ymd("20081001"),ymd("20071001"),ymd("20061001"),ymd("20051001"),ymd("20041001"),ymd("20031001"),ymd("20021001"),ymd("20011001"))
> b
[1] "2011-09-30 19:00:00 CDT" "2010-09-30 19:00:00 CDT" "2009-09-30 19:00:00 CDT"
[4] "2008-09-30 19:00:00 CDT" "2007-09-30 19:00:00 CDT" "2006-09-30 19:00:00 CDT"
[7] "2005-09-30 19:00:00 CDT" "2004-09-30 19:00:00 CDT" "2003-09-30 19:00:00 CDT"
[10] "2002-09-30 19:00:00 CDT" "2001-09-30 19:00:00 CDT"
how can I fix this??? Many thanks.
I don't claim to understand exactly what's going on here, but the proximal problem is that c() strips attributes, so using c() on a POSIX[c?]t vector strips the time zone attribute, and the result is then displayed in the time zone specified by your locale rather than in UTC (this happens even if you set the time zone to agree with the one specified by your locale). On my system:
library(lubridate)
(y1 <- ymd("20011001"))
## [1] "2001-10-01 UTC"
(y2 <- ymd("20011002"))
c(y1,y2)
## now in EDT (and a day earlier/4 hours before UTC):
## [1] "2001-09-30 20:00:00 EDT" "2001-10-01 20:00:00 EDT"
(y12 <- ymd(c("20011001","20011002")))
## [1] "2001-10-01 UTC" "2001-10-02 UTC"
c(y12)
## back in EDT
## [1] "2001-09-30 20:00:00 EDT" "2001-10-01 20:00:00 EDT"
You can set the time zone explicitly ...
y3 <- ymd("20011001",tz="EDT")
## [1] "2001-10-01 EDT"
But c() is still problematic.
(y3c <- c(y3))
## [1] "2001-09-30 20:00:00 EDT"
So two solutions are
convert a character vector rather than combining the objects after converting them one by one or
restore the tzone attribute after combining.
For example:
attr(y3c,"tzone") <- attr(y3,"tzone")
@Joran points out that this is almost certainly a general property of applying c() to POSIX[c?]t objects, not specifically lubridate-related. I hope someone will chime in and explain whether this is a well-known design decision/infelicity/misfeature.
Update: there is some discussion of this on R-help in 2012, and Brian Ripley comments:
But in any case, the documentation (?c.POSIXct) is clear:
Using ‘c’ on ‘"POSIXlt"’ objects converts them to the current time
zone, and on ‘"POSIXct"’ objects drops any ‘"tzone"’ attributes
(even if they are all marked with the same time zone).
So the recommended way is to add a "tzone" attribute if you know what
you want it to be. POSIXct objects are absolute times: the timezone
merely affects how they are converted (including to character for
printing).
It might be nice if lubridate added a method to do this ...
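The "add a tzone attribute" advice can be sketched in base R alone, without lubridate. As far as I know, newer versions of R (4.1.0 and later) keep the attribute in c() when all arguments share the same time zone, so treat the reassignment below as a harmless safeguard there rather than a requirement:

```r
y1 <- as.POSIXct("2001-10-01", tz = "UTC")
y2 <- as.POSIXct("2001-10-02", tz = "UTC")

combined <- c(y1, y2)

# Restore (or simply re-assert) the time zone after combining; the
# underlying instants are unchanged, only the display is affected
attr(combined, "tzone") <- attr(y1, "tzone")

format(combined, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
```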
I am running into an error when I try to localize times for "date" (a variable of class=POSIXlt) in my dataset. Example code is as follows:
# All dates are coded by survey software in EST(not local time)
date <- c("2011-07-26 07:23", "2011-07-29 07:34", "2011-07-29 07:40")
region <-c("USA-EST", "UK", "Singapore")
#Change the times based on time-zone differences
start_time<-strptime(date,"%Y-%m-%d %h:%m")
localtime=as.POSIXlt(start_time)
localtime<-ifelse(region=="UK",start_time+6,start_time)
localtime<-ifelse(region=="Singapore",start_time+12,start_time)
#Then, I need to extract the hour and weekday
weekday<-weekdays(localtime)
hour<-factor(localtime)
There must be something wrong with my "ifelse" statement, because I get the error: number of items to replace is not a multiple of replacement length. Please help!
How about using R's native time code? The trick is that you can't have more than one time-zone in a POSIX vector, so use a list instead:
region <- c("EST","Europe/London","Asia/Singapore")
(localtime <- lapply(seq(date),function(x) as.POSIXlt(date[x],tz=region[x])))
[[1]]
[1] "2011-07-26 07:23:00 EST"
[[2]]
[1] "2011-07-29 07:34:00 Europe/London"
[[3]]
[1] "2011-07-29 07:40:00 Asia/Singapore"
And to convert to a vector in a single timezone:
Reduce("c",localtime)
[1] "2011-07-26 13:23:00 BST" "2011-07-29 07:34:00 BST"
[3] "2011-07-29 00:40:00 BST"
Note that my system timezone is BST, but if yours is EST it will convert to that.
You can use the timezone handling built into POSIXct:
> start_time <- as.POSIXct(date, format = "%Y-%m-%d %H:%M", tz = "America/New_York")
> start_time
[1] "2011-07-26 07:23:00 EDT" "2011-07-29 07:34:00 EDT" "2011-07-29 07:40:00 EDT"
> format(start_time, tz="Europe/London", usetz=TRUE)
[1] "2011-07-26 12:23:00 BST" "2011-07-29 12:34:00 BST" "2011-07-29 12:40:00 BST"
> format(start_time, tz="Asia/Singapore", usetz=TRUE)
[1] "2011-07-26 19:23:00 SGT" "2011-07-29 19:34:00 SGT" "2011-07-29 19:40:00 SGT"
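To get the local hour and weekday for each row in its own region, one option is to convert element by element, since a single POSIXct vector carries only one time zone. This is a sketch under assumptions: the mapping from the region labels to the Olson names in tzs is mine, and I use the numeric wday field (0 = Sunday) because weekdays() returns locale-dependent names.

```r
date <- c("2011-07-26 07:23", "2011-07-29 07:34", "2011-07-29 07:40")
# Assumed mapping of "USA-EST", "UK", "Singapore" to time zone names
tzs  <- c("America/New_York", "Europe/London", "Asia/Singapore")

# All timestamps were recorded in US Eastern time
start_time <- as.POSIXct(date, format = "%Y-%m-%d %H:%M",
                         tz = "America/New_York")

# Convert each instant to the local clock of its own region
local_lt <- lapply(seq_along(start_time),
                   function(i) as.POSIXlt(start_time[i], tz = tzs[i]))

hour    <- vapply(local_lt, function(t) t$hour, integer(1))
weekday <- vapply(local_lt, function(t) t$wday, integer(1))  # 0 = Sunday

data.frame(region = tzs, hour = hour, weekday = weekday)
```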
I need to use as.Date on the index of a zoo object. Some of the dates are in BST and so when converting I lose a day on (only) these entries. I don't care about one hour's difference or even the time part of the date at all, I just want to make sure that the dates displayed stay the same. I'm guessing this is not very hard but I can't manage it. Can somebody help please?
class(xtsRet)
#[1] "xts" "zoo"
index(xtsRet)
#[1] "2007-07-31 BST" "2007-08-31 BST" "2007-09-30 BST" "2007-10-31 GMT"
class(index(xtsRet))
#[1] "POSIXt" "POSIXct"
index(xtsRet) <- as.Date(index(xtsRet))
index(xtsRet)
#[1] "2007-07-30" "2007-08-30" "2007-09-29" "2007-10-31"
Minimally reproducible example (not requiring zoo package):
my_date <- as.POSIXct("2007-04-01") # Users in non-UK timezone will need to
# do as.POSIXct("2007-04-01", "Europe/London")
my_date
#[1] "2007-04-01 BST"
as.Date(my_date)
#[1] "2007-03-31"
Suppose we have this sample data:
library(zoo)
x <- as.POSIXct("2000-01-01", tz = "GMT")
Then see if any of these are what you want:
# use current time zone
as.Date(as.character(x, tz = ""))
# use GMT
as.Date(as.character(x, tz = "GMT"))
# set entire session to GMT
Sys.setenv(TZ = "GMT")
as.Date(x)
Also try "BST" in place of "GMT", and note the article on dates and times in R News 4/1.
You can offset the POSIX objects so that they do not sit exactly at midnight. 1 hour (3600 secs) should be sufficient:
d <- as.POSIXct(c("2007-07-31","2007-08-31","2007-09-30","2007-10-31"))
d
[1] "2007-07-31 BST" "2007-08-31 BST" "2007-09-30 BST" "2007-10-31 GMT"
as.Date(d)
[1] "2007-07-30" "2007-08-30" "2007-09-29" "2007-10-31"
as.Date(d+3600)
[1] "2007-07-31" "2007-08-31" "2007-09-30" "2007-10-31"
I would suggest converting with as.POSIXlt first and wrapping the result in as.Date:
d <- as.POSIXct(c("2007-07-31","2007-08-31","2007-09-30","2007-10-31"))
d
[1] "2007-07-31 BST" "2007-08-31 BST" "2007-09-30 BST" "2007-10-31 GMT"
as.Date(as.POSIXlt(d))
[1] "2007-07-31" "2007-08-31" "2007-09-30" "2007-10-31"
This achieves the same thing as the +3600 above, but is slightly less of a hack.
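Another option worth trying is the tz argument of as.Date() itself: for POSIXct input the conversion is done in UTC by default, which is what pushes midnight BST back to the previous day. A minimal sketch, assuming the index is meant to be read in "Europe/London":

```r
d <- as.POSIXct(c("2007-07-31", "2007-10-31"), tz = "Europe/London")

# Default conversion is done in UTC, so midnight BST (23:00 UTC on the
# previous day) is expected to land one day early
as.Date(d)

# Converting in the zone the timestamps belong to keeps the displayed dates
dd <- as.Date(d, tz = "Europe/London")
format(dd)
```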