lubridate package assignments involving thousandths of a second - r

Why does the ymd_hms function from R's lubridate package return "2018-01-09 15:43:44.843 UTC" for ymd_hms('2018-01-09T15:43:44.844Z')?
I naively would have expected "2018-01-09 15:43:44.844 UTC".
ymd_hms('2018-01-09T15:43:44.822Z') returns "2018-01-09 15:43:44.822 UTC".
Since this is GMT/UTC, I don't believe daylight savings would be a factor, and different values for the truncated = option don't seem to make a difference.

From the ymd_hms documentation:
NOTE: The ymd family of functions are based on strptime()
As described here, the issue seems to lie in how strptime truncates rather than rounds fractions of a second. Play around with options(digits.secs = n) to see how it handles various numbers of decimal places.
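The truncation can be observed in base R without lubridate. The snippet below is a minimal sketch, not the lubridate implementation: it parses the same timestamp with as.POSIXct, prints the stored number of seconds, and formats it with %OS3. The half-millisecond nudge at the end is only one possible workaround and is an assumption of this example, not something the documentation recommends.

```r
# Parse the timestamp from the question, keeping the fractional seconds.
x <- as.POSIXct("2018-01-09 15:43:44.844",
                format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# The value is stored as a double, so .844 may not be represented exactly.
stored <- sprintf("%.6f", as.numeric(x))

# %OS3 truncates to three decimals rather than rounding.
truncated <- format(x, "%Y-%m-%d %H:%M:%OS3")

# Hypothetical workaround: add half a millisecond before truncating.
nudged <- format(x + 0.0005, "%Y-%m-%d %H:%M:%OS3")

print(c(stored = stored, truncated = truncated, nudged = nudged))
```

Comparing the truncated and nudged strings shows whether the stored value landed just below the intended millisecond.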

I would rather use:
format(strptime("2018-01-09T15:43:44.844Z", "%Y-%m-%dT%H:%M:%OS", tz = "EST"), format="%Y-%m-%d %H:%M:%OS3 %Z", tz = "EST")
[1] "2018-01-09 15:43:44.844 EST"
Note that the milliseconds are handled in the format string by %OS.

Related

Time series, lubridate, and the seq function

I'm just starting out with the lubridate package and dates in R, and some behavior surprised me right away. When I run seq(ymd("2021-01-01"), ymd("2021-01-04"), ddays(1)) I get only one date: "2021-01-01". But when I run seq(ymd_h("2021-01-01 00"), ymd_h("2021-01-04 00"), ddays(1)) I get the result I expected, which is four dates: "2021-01-01 UTC" "2021-01-02 UTC" "2021-01-03 UTC" "2021-01-04 UTC".
I admit that it surprised me a lot.
I would be very grateful for an explanation, in simple words, of why this happens.
A second question: is there any function like seq that correctly understands the d... functions in the lubridate package (ddays, dhours, dminutes, etc.)?
seq is not part of the lubridate package and doesn't understand the d... functions.
ymd returns a Date, so when you call seq, you are using seq.Date.
You want seq(ymd("2021-01-01"), ymd("2021-01-04"), "days")
ymd_h returns a POSIXct object, so then seq is using seq.POSIXct.
You again want seq(ymd_h("2021-01-01 00"), ymd_h("2021-01-04 00"), "days"), but now the result is a POSIXct vector.
See the help for seq.Date and seq.POSIXct to see how they differ.
The new clock package has many good functions for date manipulation, including one called date_seq you might find useful.
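The difference between the two seq methods can be sketched in base R. This example assumes the usual documented behavior: seq.Date reads a numeric by as a number of days, while seq.POSIXct reads it as a number of seconds. That would explain why a one-day duration of 86400 seconds only gives the expected step on a POSIXct vector. The variable names are illustrative.

```r
# Date endpoints: a character "day" step works; a bare number means days.
start_date <- as.Date("2021-01-01")
end_date   <- as.Date("2021-01-04")
by_day     <- seq(start_date, end_date, by = "day")

# A numeric step of 86400 is read as 86400 days, so only the start fits.
by_86400_days <- seq(start_date, end_date, by = 86400)

# POSIXct endpoints: a numeric step is read as seconds, so 86400 is one day.
start_time <- as.POSIXct("2021-01-01 00:00:00", tz = "UTC")
end_time   <- as.POSIXct("2021-01-04 00:00:00", tz = "UTC")
by_86400_secs <- seq(start_time, end_time, by = 86400)

print(by_day)
print(by_86400_days)
print(by_86400_secs)
```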

How to deal with change in default of lubridate's ymd

In ymd from lubridate, the default value of tz was UTC. I don't know exactly when the change was made but I know that in 1.5 the default was UTC but in 1.5.8 the default is now NULL.
This changes the output of ymd from POSIXct objects to Date objects which breaks a lot of my code where I rely on having a POSIXct object but now have a Date. Is there a convenient way to make this backwards compatible or do I need to add the tz='UTC' to all of my old code that relied on this?
Write a wrapper that replaces ymd with ymd_hms, for which the default is still tz = "UTC":
library(lubridate)
ymd2 <- function(x) {
  # Append midnight so ymd_hms can parse the value and return POSIXct
  ymd_hms(paste(x, "00:00:00"))
}
ymd2("2017/3/4")
#[1] "2017-03-04 UTC"
class(ymd2("2017/3/4"))
#[1] "POSIXct" "POSIXt"
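As the question itself notes, ymd still returns a POSIXct object when tz is supplied, so a thinner wrapper is also possible. The following is a sketch under that assumption; ymd_utc is a hypothetical name, not a lubridate function.

```r
library(lubridate)

# Hypothetical helper: forward everything to ymd() but pin tz to "UTC",
# which makes ymd() return POSIXct instead of Date.
ymd_utc <- function(x, ...) {
  ymd(x, ..., tz = "UTC")
}

result <- ymd_utc("2017/3/4")
print(class(result))
```

Old code can then call ymd_utc wherever it relied on the former default.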

How do I change the format of a char vector containing milliseconds to timeseries vector in R

I have a DF in R which has two character columns. The first column is a time series array and the second column contains continuous numbers. The time series field has time recorded in milliseconds. I am trying to convert this array to a date array. However whichever method I use to convert the same, I lose the milliseconds information.
Following is the dataframe:
time = c("08-08-2016 09:16:33.430","08-08-2016 09:16:37.930")
values <- c(45,21)
my_data <- data.frame(time,values)
I would like to preserve the millisecond information. However, when I convert the time character vector using the following method, I lose the milliseconds (the resulting time values are 2016-08-08 09:16:33 and 2016-08-08 09:16:37).
my_data$time=strptime(my_data$time,format="%m-%d-%Y %H:%M:%S.%OS")
I also tried using as.POSIXct, as.Date functions but could not resolve. Can someone please help?
Use %OS instead of %S, not in addition to it. The required format string is "%m-%d-%Y %H:%M:%OS":
options(digits.secs=6)
as.POSIXct(my_data$time, format="%m-%d-%Y %H:%M:%OS")
#[1] "2016-08-08 09:16:33.43 AEST" "2016-08-08 09:16:37.93 AEST"
Your format is standard enough that the anytime package can parse it automagically, without additional input from you:
R> timevec <- c("08-08-2016 09:16:33.430","08-08-2016 09:16:37.930")
R> anytime(timevec)
[1] "2016-08-08 09:16:33.43 CDT" "2016-08-08 09:16:37.93 CDT"
I tend to have options(digits.secs=6) set by default which is why the display also shows the fractional seconds.
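To check that the milliseconds survive parsing regardless of the display setting, one can look at the difference between the two timestamps. This is a minimal base R sketch using the data from the question; tz = "UTC" is an assumption made here so the result does not depend on the local timezone.

```r
# Rebuild the data frame from the question, keeping time as character.
time <- c("08-08-2016 09:16:33.430", "08-08-2016 09:16:37.930")
my_data <- data.frame(time = time, values = c(45, 21),
                      stringsAsFactors = FALSE)

# %OS reads the seconds together with their fractional part.
my_data$time <- as.POSIXct(my_data$time,
                           format = "%m-%d-%Y %H:%M:%OS", tz = "UTC")

# The gap is about 4.5 seconds only if the fractions were kept.
gap <- as.numeric(diff(my_data$time), units = "secs")
print(gap)

# %OS3 requests three decimals on output without changing global options.
print(format(my_data$time, "%Y-%m-%d %H:%M:%OS3"))
```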

R not recognizing time component of datetime values

I have a dataframe where one column lists a bunch of datetimes. Oddly, the data type for that column is "integer." I need to coerce the column to a proper datetime data type such as POSIXct so that I can subtract these timestamps from those in another field. However, when I try to coerce these datetime values into POSIXct, they lose the time component. When I try to do math on the datetimes without first coercing into another datatype, R acts as if the time component of the timestamp isn't there (it assumes each date has a time of midnight). What's going on and how do I fix it so that R recognizes the timestamp?
> dates[1]
[1] 2016-05-05T16:46:21-04:00
48 Levels: 2016-05-03T06:45:42-04:00 2016-05-03T06:45:43-04:00 ... 2016-05-05T16:50:00-04:00
> typeof(dates)
[1] "integer"
> as.POSIXct(dates[1])
[1] "2016-05-05 EDT"
> as.character(dates[1])
[1] "2016-05-05T16:46:21-04:00"
> as.POSIXct(as.character(dates[1]))
[1] "2016-05-05 EDT"
You can use as.POSIXct with the tz argument to convert the timestamps with the right level of control.
If the timezones are all UTC-04:00 and that is your local timezone, you can use:
dates = as.POSIXct(dates, format="%Y-%m-%dT%H:%M:%S", tz=Sys.timezone())
If they are all UTC-04:00 and that is not your local timezone, but you know the exact location, then you can specify the appropriate timezone from the tz database:
dates = as.POSIXct(dates, format="%Y-%m-%dT%H:%M:%S", tz="America/Port_of_Spain")
Alternatively, you can use a generic fixed-offset timezone. Note that the Etc/GMT names use an inverted sign, so UTC-04:00 is written Etc/GMT+4:
dates = as.POSIXct(dates, format="%Y-%m-%dT%H:%M:%S", tz="Etc/GMT+4")
[EDIT: With thanks to Roland for his comment below. I originally used strptime, which uses the same syntax, but returns a POSIXlt object.]
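The "integer" type reported in the question is what typeof returns for a factor, and the time is lost because the default formats tried by as.POSIXct do not match the "T" separator. The sketch below uses two of the sample values and also reads the offset itself with %z instead of assuming a timezone. Stripping the colon from the offset first is a precaution taken in this example, since %z is documented for the -0400 form.

```r
# Two sample values from the question, stored as a factor.
dates <- factor(c("2016-05-03T06:45:42-04:00",
                  "2016-05-05T16:46:21-04:00"))
print(typeof(dates))  # factors are integer codes with labels

# Convert to character and turn "-04:00" into "-0400" for %z.
chr <- as.character(dates)
no_colon <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", chr)

# Parse with an explicit format; %z applies the UTC offset.
parsed <- as.POSIXct(no_colon, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
print(parsed)

# Differences now include the time of day.
print(difftime(parsed[2], parsed[1], units = "hours"))
```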

Odd POSIXct Function Behavior In R

I'm working with the POSIXct data type in R. In my work, I incorporate a function that returns two POSIXct dates in a vector. However, I am discovering some unexpected behavior. I wrote some example code to illustrate my problem:
# POSIXct returning issue:
returnTime <- function(date) {
oneDay <- 60 * 60 * 24
nextDay <- date + oneDay
print(date)
print(nextDay)
return(c(date, nextDay))
}
myTime <- as.POSIXct("2015-01-01", tz = "UTC")
bothDays <- returnTime(myTime)
print(bothDays)
The print statements in the function give:
[1] "2015-01-01 UTC"
[1] "2015-01-02 UTC"
While the print statement at the end of the code gives:
[1] "2014-12-31 19:00:00 EST" "2015-01-01 19:00:00 EST"
I understand what is happening, but I don't see why. It could be a simple mistake that is eluding me, but I really am quite confused. I don't understand why the time zone changes on return. The class is still POSIXct; only the time zone has changed.
Additionally, I did the same as above, but just returned one of the dates and the date's timezone did not change. I can work around this for now, but wanted to see if anyone had any insight to my problem. Thank you in advance!
Thanks for the help below. I instead did:
return(list(date, nextDay))
and this solved my issue of the time zone being dropped.
From ?c.POSIXct:
Using c on "POSIXlt" objects converts them to the current time zone,
and on "POSIXct" objects drops any "tzone" attributes (even if they
are all marked with the same time zone).
See also here.
The problem is that the function c removes the timezone attribute:
attributes(myTime)
#$class
#[1] "POSIXct" "POSIXt"
#
#$tzone
#[1] "UTC"
attributes(c(myTime))
#$class
#[1] "POSIXct" "POSIXt"
To fix, you can e.g. use the setattr function from data.table, to modify the attribute in place:
(setattr(c(myTime), 'tzone', attributes(myTime)$tzone))
#[1] "2015-01-01 UTC"
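The same repair can be done in base R by assigning the attribute directly. This is a small sketch that does not need data.table; it sets "tzone" on the combined vector, which works whether or not c has dropped it.

```r
myTime  <- as.POSIXct("2015-01-01", tz = "UTC")
nextDay <- myTime + 60 * 60 * 24

# Combine, then restore the timezone attribute explicitly.
bothDays <- c(myTime, nextDay)
attr(bothDays, "tzone") <- attr(myTime, "tzone")

print(bothDays)
```

The underlying instants are never changed by c; only the timezone used for printing is affected.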
