This might sound like a duplicate issue, but I have gone through many POSIXct-related questions and did not come across this one. If you do find one, I would really appreciate being pointed in that direction. as.POSIXct() is behaving very oddly in my case. See the example below:
options(digits.secs = 3)
test_time <- "2017-01-26 23:00:00.010"
test_time <- as.POSIXct(test_time, format = "%Y-%m-%d %H:%M:%OS")
This returns:
"2017-01-26 23:00:00.00"
Now, if I try the following option, it returns NA. I have no idea why it behaves like that when all I need it to convert to is "2017-01-26 23:00:00.010".
test_time <- "2017-01-26 23:00:00.010"
test_time <- as.POSIXct(test_time, format = "%Y-%m-%d %H:%M:%OS3")
Now it works fine when I do this:
as.POSIXlt(strptime(test_time,format = "%Y-%m-%d %H:%M:%OS"), format = "%Y-%m-%d %H:%M:%OS")
But for my purposes I need this as a POSIXct object, because some libraries I am working with only accept POSIXct objects. Converting the POSIXlt back to POSIXct results in the same problem as before.
Is there an issue with my system settings? The date is also not one of those daylight saving time transitions that could throw things off. Why would it work with one format and not the others? Any leads/suggestions are welcome!
Running on Windows 10 64-bit
The issue here has to do with the maximum precision that POSIXct can handle. It is backed by a double under the hood, representing the number of seconds since the epoch, midnight on 1970-01-01 UTC. Fractional seconds are represented as fractional parts of that double, i.e. 63.02 represents 1970-01-01 00:01:03.02 UTC.
options(digits = 22, digits.secs = 3)
.POSIXct(63.02, tz = "UTC")
#> [1] "1970-01-01 00:01:03.02 UTC"
63.02
#> [1] 63.02000000000000312639
Now, when working with doubles there are limits to the precision that they can represent exactly. You can see this with the above example; typing in 63.02 in the console doesn't return exactly the same number, and instead returns something close, but with some extra bits at the end.
So now let's take a look at your example. If we start as "low level" as possible, the first thing as.POSIXct() does is call strptime(), which returns a POSIXlt object. That keeps each "field" of the date-time as a separate element (i.e. year is kept separate from month, day, second, etc). We can see that it parsed correctly and our sec field holds 0.01.
# `digits.secs` to print 3 fractional digits (has no effect on parsing)
# `digits` to print 22 fractional digits for double values
options(digits.secs = 3, digits = 22)
x <- "2017-01-26 23:00:00.010"
# looks good
lt <- strptime(x, format = "%Y-%m-%d %H:%M:%OS", tz = "America/New_York")
lt
#> [1] "2017-01-26 23:00:00.01 EST"
# This is a POSIXlt, which is a list holding fields like year,month,day,...
class(lt)
#> [1] "POSIXlt" "POSIXt"
# sure enough...
lt$sec
#> [1] 0.01000000000000000020817
But now convert that to POSIXct. At this point, the individual fields are collapsed into a single double, which might have precision issues.
# now convert to POSIXct (i.e. a single double holding all the info)
# looks like we lost the fractional seconds?
ct <- as.POSIXct(lt)
ct
#> [1] "2017-01-26 23:00:00.00 EST"
# no, they are still there, but the precision in the `double` data type
# isn't enough to be able to represent this exactly as `1485489600.010`
unclass(ct)
#> [1] 1485489600.009999990463
#> attr(,"tzone")
#> [1] "America/New_York"
So the fractional part of the double underlying ct is close to .010, but it can't be represented exactly; the stored value is slightly less than .010, and that gets truncated (rounded down) when the POSIXct is printed, making it look like you lost the fractional seconds.
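You can see that truncation directly by asking for three digits when formatting. And, purely as a display workaround (assuming you only care about millisecond output, not the stored value), adding half a millisecond before formatting turns the truncation into a round:

# printing with %OS3 truncates the stored value, which sits just below .010
format(ct, "%Y-%m-%d %H:%M:%OS3")
#> [1] "2017-01-26 23:00:00.009"

# display-only workaround: nudge up by half of the smallest displayed unit
format(ct + 0.0005, "%Y-%m-%d %H:%M:%OS3")
#> [1] "2017-01-26 23:00:00.010"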
Because these issues are so troublesome, I recommend using the low level API of the clock package (note that I wrote this package). It has support for fractional seconds up to nanoseconds without loss of precision (by using a different data structure than POSIXct).
https://clock.r-lib.org/
library(clock)
x <- "2017-01-26 23:00:00.010"
nt <- naive_time_parse(x, format = "%Y-%m-%d %H:%M:%S", precision = "millisecond")
nt
#> <time_point<naive><millisecond>[1]>
#> [1] "2017-01-26 23:00:00.010"
# If you need it in a time zone
as_zoned_time(nt, zone = "America/New_York")
#> <zoned_time<millisecond><America/New_York>[1]>
#> [1] "2017-01-26 23:00:00.010-05:00"
Related
I have a column - avg_pace - which I want to use as a time in the format %M %S .%f (I think that's right for milliseconds, but please correct me).
garmin2 <- garmin %>%
strptime(avg_pace, "%M:%S:.%f")
This doesn't work for me - I keep getting an error that strptime can't find avg_pace - even though I've used strptime elsewhere.
Fractional seconds are handled in the POSIXlt/strptime format specifications using %OS (see ?strptime). You need to specify this format both when parsing the time and when printing it out:
times <- c("10:56.391", "25:21.188")
times2 <- strptime(times, "%M:%OS")
# default printing: not what you want (note that it defaults to today's date)
times2
[1] "2022-10-24 00:10:56 BST" "2022-10-24 00:25:21 BST"
# specify printing format with %OS and the number of fractional second digits
format(times2, "%M:%OS3")
[1] "10:56.391" "25:21.188"
I have some MATLAB serial date numbers that I need to use in R, but I have to convert them to a normal date first.
Matlab:
datestr(733038.6)
ans =
27-Dec-2006 14:24:00
You can see it gives both the date and the time.
Now we try in R:
Matlab2Rdate <- function(val) as.Date(val - 1, origin = '0000-01-01')
> Matlab2Rdate(733038.6)
[1] "2006-12-27"
It gives only the date, but I also need the time. Any ideas?
The trick is that Matlab uses "January 01, 0000", a fictional reference date, to calculate its date numbers. The origin of time for the "POSIXct" class in R is '1970-01-01 00:00.00 UTC'. You can read about how different systems handle dates here.
Before converting, you need to account for this difference in reference from one format to another. The POSIX manual contains such an example. Here's my output:
> val<-733038.6
> as.POSIXct((val - 719529)*86400, origin = "1970-01-01", tz = "UTC")
[1] "2006-12-27 14:23:59 UTC"
Here 719529 is '1970-01-01 00:00.00 UTC' expressed as a Matlab datenum, and 86400 is the number of seconds in a standard UTC day.
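Wrapping that up in a helper along the lines of the original Matlab2Rdate() (the name below is just illustrative): rounding to whole seconds also avoids the one-second-off display above, which comes from 733038.6 not being exactly representable as a double.

Matlab2Rdatetime <- function(val, tz = "UTC") {
  # 719529 is the Matlab datenum of 1970-01-01; rounding to whole seconds
  # removes the floating point residue so the result prints as 14:24:00
  as.POSIXct(round((val - 719529) * 86400), origin = "1970-01-01", tz = tz)
}

> Matlab2Rdatetime(733038.6)
[1] "2006-12-27 14:24:00 UTC"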
How can I import the following date/time format into R? I want to keep all the information contained in this format.
2016-09-12T09:47:00.000+0200
where:
YYYY = four-digit year
MM = two-digit month (01=January, etc.)
DD = two-digit day of month (01 through 31)
hh = two digits of hour (00 through 23) (am/pm NOT allowed)
mm = two digits of minute (00 through 59)
ss = two digits of second (00 through 59)
s = one or more digits representing a decimal fraction of a second
TZD = time zone designator (Z or +hh:mm or -hh:mm)
I've tried strptime without success, since I cannot figure out how to match s and TZD. Example:
> strptime("2016-09-12T09:47:00.000+0200", format = '%Y-%m-%dT%H:%M:%S.000%z')
[1] "2016-09-12 09:47:00
To match the decimal fraction of a second (from the docs ?strptime in Examples) use:
format = '%Y-%m-%dT%H:%M:%OS%z'
Then, to see the 3-digits:
op <- options(digits.secs = 3)
strptime("2016-09-12T09:47:00.123+0200", format = '%Y-%m-%dT%H:%M:%OS%z')
##[1] "2016-09-12 03:47:00.123"
To go back to not seeing the 3-digits:
options(op)
I believe this does parse the offset from UTC (i.e., the +0200). I'm on the east coast of the United States, and it is EDT (-0400). Therefore, I'm 6 hours behind (+0200) so that 09:47:00.123+0200 becomes 03:47:00.123 EDT.
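If you would rather see the result in UTC than in your local zone, passing tz = "UTC" should work, since (as far as I can tell) the %z offset is applied during parsing and tz only controls how the instant is represented:

strptime("2016-09-12T09:47:00.123+0200",
         format = "%Y-%m-%dT%H:%M:%OS%z", tz = "UTC")
##[1] "2016-09-12 07:47:00.123 UTC"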
You could use the (pretty new) anytime package which does this without formats:
R> anytime("2016-09-12T09:47:00.000+0200")
[1] "2016-09-12 09:47:00 CDT"
R>
I may try to extend it to also recognize the trailing TZ offset, as the underlying Boost date_time code supports it. However, I have so far followed R and chosen to interpret the time as local time, for which it also (automatically) finds the local timezone.
anytime also supports fractional seconds automatically (but you need to ensure you display them):
R> anytime("2016-09-12T09:47:00.123456+0200")
[1] "2016-09-12 09:47:00.123456 CDT"
R>
I tend to work with microsecond data, so I tend to have six digits on all of them anyway, as shown here.
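The "ensure you display them" part is the same digits.secs option as in base R; a minimal check (the CDT in the output simply reflects the Chicago locale used above):

R> options(digits.secs = 0)
R> anytime("2016-09-12T09:47:00.123456+0200")
[1] "2016-09-12 09:47:00 CDT"
R> options(digits.secs = 6)
R> anytime("2016-09-12T09:47:00.123456+0200")
[1] "2016-09-12 09:47:00.123456 CDT"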
I run discrete event simulations where the time originates from dates. I think the simulations run much faster when I convert all the dates to integers (relative time in seconds).
What is the best way to switch between dates and seconds in a well-defined way, where I want to
set the reference time (e.g. "1970-01-01 00:00:00 GMT" or "2016-01-01 00:00:00 GMT") manually,
the time zone and
the origin (Not possible in lubridate?)
I thought I could use origin for this purpose, but it does not influence the result:
> as.numeric(as.POSIXct("2016-01-01 00:00:00 GMT",origin="2016-01-01",tz="GMT"))
> as.numeric(as.POSIXct("2016-01-01 00:00:00 GMT",origin="1970-01-01",tz="GMT"))
both result in [1] 1451606400.
(Only the tz argument changes the result, which is ok of course:
> as.numeric(as.POSIXct("2016-01-01 00:00:00 CEST", tz= "America/Chicago"))
[1] 1451628000)
You can use difftime() to calculate the difference between some timestamp and a reference time:
as.numeric(difftime(as.POSIXct("2016-01-01 00:00:00",tz="GMT"),
as.POSIXct("1970-01-01 00:00:00",tz="GMT"), units = "secs"))
## [1] 1451606400
By choosing another value for units, you could also get the number of minutes, hours, etc.
The reason that you get the same result for both choices of origin is that this argument is only intended to be used when converting a number into a date. Then, the number is interpreted as seconds since the origin that you pass to the function.
Internally, a POSIXct object is always stored as seconds since 1970-01-01 00:00:00, UTC, independent of the origin that you specified when doing the conversion. And accordingly, converting to numeric gives the same result for any choice of origin.
You can have a look at the documentation of as.POSIXct():
## S3 method for class 'character'
as.POSIXlt(x, tz = "", format, ...)
## S3 method for class 'numeric'
as.POSIXlt(x, tz = "", origin, ...)
As you can see, origin is only an argument for the method for numeric, but not for character.
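Putting both directions together, a sketch of a round trip against a manually chosen reference time (2016-01-01 00:00:00 GMT here):

# seconds relative to a reference of your choosing; the same reference is
# reused as `origin` when converting the numbers back into date-times
ref <- as.POSIXct("2016-01-01 00:00:00", tz = "GMT")

to_secs   <- function(t) as.numeric(difftime(t, ref, units = "secs"))
from_secs <- function(s) as.POSIXct(s, origin = ref, tz = "GMT")

to_secs(as.POSIXct("2016-01-02 06:00:00", tz = "GMT"))
## [1] 108000
from_secs(108000)
## [1] "2016-01-02 06:00:00 GMT"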