In lubridate's ymd, the default value of tz used to be UTC. I don't know exactly when the change was made, but in 1.5 the default was UTC and in 1.5.8 the default is NULL.
This changes the output of ymd from POSIXct objects to Date objects, which breaks a lot of my code that relies on having a POSIXct object. Is there a convenient way to make this backwards compatible, or do I need to add tz='UTC' to all of my old code that relied on this?
Write a wrapper that replaces ymd with ymd_hms, for which the default is still tz = "UTC":
library(lubridate)
ymd2 = function(x){
ymd_hms(paste(x, "00:00:00"))
}
ymd2("2017/3/4")
#[1] "2017-03-04 UTC"
class(ymd2("2017/3/4"))
#[1] "POSIXct" "POSIXt"
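Another option, sketched here under the assumption that your installed lubridate version exposes the tz argument on ymd (as the question describes), is a thin wrapper that only changes the default. The name ymd_utc is hypothetical and chosen for illustration:

```r
library(lubridate)

# Hypothetical wrapper: forwards all arguments to ymd(), but defaults tz to "UTC"
# so that the result is a POSIXct object rather than a Date.
ymd_utc <- function(..., tz = "UTC") {
  ymd(..., tz = tz)
}

x <- ymd_utc("2017/3/4")
class(x)
```

Old code can then call ymd_utc in place of ymd, and callers that need a different time zone can still pass tz explicitly.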
I am working with a data.frame whose dates are given in the following format: Aug 12, 2017.
class(data[,1]) = factor
How can I convert these into dates?
data[,1] <- as.Date.factor(data[,1],format = "%m.%d.%y") returns NAs.
I would suggest the package lubridate for very easy to use functions to operate with dates. For example:
mdy("Aug 12,2017")
[1] "2017-08-12"
If your date is in YYYY-MM-DD format, you can use the ymd function. There are also other functions such as dmy, dmy_hms (for datetime), etc.
If your column is called my.date, you can do:
data$my.date <- mdy(data$my.date)
Alternatively, you can use the %<>% operator from magrittr to make your code even shorter:
data$my.date %<>% mdy
Use as.POSIXct (Base-R Solution):
as.POSIXct("Aug 12,2017", format="%b%d,%Y")
Output:
[1] "2017-08-12 CEST"
Using strptime could also work:
strptime("Aug 12,2017", "%b%d,%Y")
Output:
[1] "2017-08-12 UTC"
The second parameter for strptime is the format of the dates you have. For instance, if your dates are like this "1/5/2005", then the format would be:
format="%m/%d/%Y"
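Applied to the question's factor column, the NAs most likely come from a format string ("%m.%d.%y") that does not match text like "Aug 12, 2017". A minimal base R sketch, assuming an English locale (since %b is locale-dependent) and using a made-up example vector named dates_factor:

```r
# Example data mimicking the question: a factor of "Mon DD, YYYY" strings.
dates_factor <- factor(c("Aug 12, 2017", "Sep 3, 2017"))

# Convert the factor to character first, then parse with a matching format:
# %b = abbreviated month name, %d = day of month, %Y = four-digit year.
parsed <- as.Date(as.character(dates_factor), format = "%b %d, %Y")
parsed
```

The result is a Date vector, so differences and comparisons work directly on it.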
Hope it helps
Why does the ymd_hms function from R's lubridate package return "2018-01-09 15:43:44.843 UTC" for ymd_hms('2018-01-09T15:43:44.844Z')?
I naively would have expected "2018-01-09 15:43:44.844 UTC".
ymd_hms('2018-01-09T15:43:44.822Z') returns "2018-01-09 15:43:44.822 UTC".
Since this is GMT/UTC, I don't believe daylight savings would be a factor, and different values for the truncated = option don't seem to make a difference.
From the ymd_hms documentation:
NOTE: The ymd family of functions are based on strptime()
As described here, the issue seems to lie in how strptime truncates rather than rounds fractions of a second. Play around with options(digits.secs = n) to see how it handles various numbers of decimal places.
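The truncation can be seen in base R as well. The sketch below assumes the common workaround of adding half a millisecond before formatting, so that truncation to three decimals behaves like rounding; it is an illustration, not something the lubridate documentation prescribes:

```r
# Parse a timestamp with millisecond precision (fractional seconds are kept).
x <- as.POSIXct("2018-01-09 15:43:44.844", tz = "UTC")

# %OS3 truncates to three decimals; because the value is stored as a
# floating-point number, the last digit may display one lower than expected.
format(x, "%H:%M:%OS3")

# Workaround sketch: shift by half a millisecond before truncating.
shown <- format(x + 0.0005, "%H:%M:%OS3")
shown
```

Only the displayed string is adjusted here; the stored value x is unchanged.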
I would rather use:
format(strptime("2018-01-09T15:43:44.844Z", "%Y-%m-%dT%H:%M:%OS", tz = "EST"), format="%Y-%m-%d %H:%M:%OS3 %Z", tz = "EST")
[1] "2018-01-09 15:43:44.844 EST"
Note that the fractional seconds are parsed with the %OS conversion specification in the format string (and printed to three decimals with %OS3).
I have a dataframe where one column lists a bunch of datetimes. Oddly, the data type for that column is "integer." I need to coerce the column to a proper datetime data type such as POSIXct so that I can subtract these timestamps from those in another field. However, when I try to coerce these datetime values into POSIXct, they lose the time component. When I try to do math on the datetimes without first coercing into another datatype, R acts as if the time component of the timestamp isn't there (it assumes each date has a time of midnight). What's going on and how do I fix it so that R recognizes the timestamp?
> dates[1]
[1] 2016-05-05T16:46:21-04:00
48 Levels: 2016-05-03T06:45:42-04:00 2016-05-03T06:45:43-04:00 ... 2016-05-05T16:50:00-04:00
> typeof(dates)
[1] "integer"
> as.POSIXct(dates[1])
[1] "2016-05-05 EDT"
> as.character(dates[1])
[1] "2016-05-05T16:46:21-04:00"
> as.POSIXct(as.character(dates[1]))
[1] "2016-05-05 EDT"
You can use as.POSIXct with the tz argument to convert the timestamps with the right level of control.
If the timezones are all UTC-04:00 and that is your local timezone, you can use:
dates = as.POSIXct(dates, format="%Y-%m-%dT%H:%M:%S", tz=Sys.timezone())
If they are all UTC-04:00 and that is not your local timezone, but you know the exact location, then you can specify the appropriate timezone from the tz database:
dates = as.POSIXct(dates, format="%Y-%m-%dT%H:%M:%S", tz="America/Port_of_Spain")
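Alternatively, you can use a generic fixed-offset timezone. Note that the sign in the Etc/GMT names is inverted relative to the usual notation, so UTC-04:00 corresponds to "Etc/GMT+4":
dates = as.POSIXct(dates, format="%Y-%m-%dT%H:%M:%S", tz="Etc/GMT+4")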
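If the offsets in the strings should be honored rather than assumed, one possible approach is to let strptime's %z read them. The sketch below is an assumption-laden illustration: it strips the colon from the offset first, because %z is documented for the -0400 form, and it uses a made-up vector named stamps:

```r
# Example timestamps in the same shape as the question's data.
stamps <- c("2016-05-05T16:46:21-04:00", "2016-05-03T06:45:42-04:00")

# Turn "-04:00" into "-0400" so that %z can parse the offset.
no_colon <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", stamps)

# Parse the offset with %z; the result is stored and displayed in UTC.
parsed <- as.POSIXct(no_colon, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
parsed
```

Because the result is POSIXct, subtracting it from another POSIXct column gives a difftime that respects the time of day.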
[EDIT: With thanks to Roland for his comment below. I originally used strptime, which uses the same syntax, but returns a POSIXlt object.]
I'm working with the POSIXct data type in R. In my work, I incorporate a function that returns two POSIXct dates in a vector. However, I am discovering some unexpected behavior. I wrote some example code to illustrate my problem:
# POSIXct returning issue:
returnTime <- function(date) {
oneDay <- 60 * 60 * 24
nextDay <- date + oneDay
print(date)
print(nextDay)
return(c(date, nextDay))
}
myTime <- as.POSIXct("2015-01-01", tz = "UTC")
bothDays <- returnTime(myTime)
print(bothDays)
The print statements in the function give:
[1] "2015-01-01 UTC"
[1] "2015-01-02 UTC"
While the print statement at the end of the code gives:
[1] "2014-12-31 19:00:00 EST" "2015-01-01 19:00:00 EST"
I understand what is happening, but I don't see why. It could be a simple mistake that is eluding me, but I am quite confused. I don't understand why the time zone changes on the return. The class is still POSIXct; only the time zone has changed.
Additionally, I did the same as above, but just returned one of the dates and the date's timezone did not change. I can work around this for now, but wanted to see if anyone had any insight to my problem. Thank you in advance!
Thanks for the help below. I instead did:
return(list(date, nextDay))
and this solved my issue of the time zone being dropped.
From ?c.POSIXct:
Using c on "POSIXlt" objects converts them to the current time zone,
and on "POSIXct" objects drops any "tzone" attributes (even if they
are all marked with the same time zone).
See also here.
The problem is that the function c removes the timezone attribute:
attributes(myTime)
#$class
#[1] "POSIXct" "POSIXt"
#
#$tzone
#[1] "UTC"
attributes(c(myTime))
#$class
#[1] "POSIXct" "POSIXt"
To fix, you can e.g. use the setattr function from data.table, to modify the attribute in place:
(setattr(c(myTime), 'tzone', attributes(myTime)$tzone))
#[1] "2015-01-01 UTC"
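The same repair can be sketched in base R without data.table by copying the tzone attribute back after combining. This is a reworking of the question's returnTime function for illustration. As far as I know, newer R versions (4.1.0 and later) keep the time zone in c() when all inputs share it, in which case the extra assignment is harmless:

```r
returnTime <- function(date) {
  oneDay <- 60 * 60 * 24            # seconds in one day
  both <- c(date, date + oneDay)    # c() may drop the "tzone" attribute
  attr(both, "tzone") <- attr(date, "tzone")  # restore the input's time zone
  both
}

myTime <- as.POSIXct("2015-01-01", tz = "UTC")
bothDays <- returnTime(myTime)
bothDays
```

Unlike returning a list, this keeps a single POSIXct vector, so vectorized date arithmetic still works on the result.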
In R, how can I convert time variable "30MAY07" or "21AUG09" to a value? I want to find the time difference between them. Thanks!
You can use the lubridate package for this:
library(lubridate)
dmy(c('30MAY07', '21AUG09'))
# [1] "2007-05-30 UTC" "2009-08-21 UTC"
strptime and as.Date from base R are also good options, but lubridate makes very good informed guesses as to the format of the date. As you can see in your example, there is no need to specify anything beyond the day-month-year function (dmy), and things work out of the box.
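Since the question asks for the time difference, here is a base R sketch of that last step. It assumes an English locale, because %b matches month abbreviations in the current locale, and it relies on month names being matched case-insensitively on input:

```r
# Parse day, abbreviated month name, and two-digit year.
d <- as.Date(c("30MAY07", "21AUG09"), format = "%d%b%y")

# Difference between the two dates, expressed in days.
gap <- difftime(d[2], d[1], units = "days")
gap
```

Subtracting the two elements directly (d[2] - d[1]) gives the same difftime; difftime just makes the units explicit.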