How to avoid that anytime(<numeric>) "updates by reference"?

I want to convert a numeric variable to POSIXct using anytime. My issue is that anytime(<numeric>) converts the input variable as well - I want to keep it.
Simple example:
library(anytime)
t_num <- 1529734500
anytime(t_num)
# [1] "2018-06-23 08:15:00 CEST"
t_num
# [1] "2018-06-23 08:15:00 CEST"
This differs from the 'non-update by reference' behaviour of as.POSIXct in base R:
t_num <- 1529734500
as.POSIXct(t_num, origin = "1970-01-01")
# [1] "2018-06-23 08:15:00 CEST"
t_num
# 1529734500
Similarly, anydate(<numeric>) also updates by reference:
d_num <- 17707
anydate(d_num)
# [1] "2018-06-25"
d_num
# [1] "2018-06-25"
I can't find an explicit description of this behaviour in ?anytime. I could use as.POSIXct as above, but does anyone know how to handle this within anytime?

anytime author here: this is standard R and Rcpp and passing-by-SEXP behaviour: you cannot protect a SEXP being passed from being changed.
The view that anytime takes is that you are asking for an input to be converted to a POSIXct as that is what anytime does: from char, from int, from factor, from anything. As a POSIXct really is a numeric value (plus a S3 class attribute) this is what you are getting.
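To make the "numeric value plus a class attribute" point concrete, here is a minimal base R sketch. It does not use anytime at all, and the UTC time zone is only an illustrative choice:

```r
t_num <- 1529734500

# Build a POSIXct from the number; tz = "UTC" is an illustrative assumption
p <- as.POSIXct(t_num, origin = "1970-01-01", tz = "UTC")

class(p)                 # "POSIXct" "POSIXt"
is.numeric(unclass(p))   # TRUE: underneath it is still a double
as.numeric(p) == t_num   # TRUE: same count of seconds since the epoch
```

So a conversion from numeric to POSIXct does not change the stored number; it only attaches attributes to it.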
If you do not want this (counter to the design of anytime) you can do what @Moody_Mudskipper and @PKumar showed: use a temporary expression (or variable).
(I also think the data.table example is a little unfair as data.table -- just like Rcpp -- is very explicit about taking references where it can. So of course it refers back to the original variable. There are idioms for deep copy if you need them.)
Lastly, an obvious trick is to use format if you just want different display:
R> d <- data.frame(t_num=1529734500)
R> d[1, "posixct"] <- format(anytime::anytime(d[1, "t_num"]))
R> d
t_num posixct
1 1529734500 2018-06-23 01:15:00
R>
That would work the same way in data.table, of course, as the string representation is a type change. Ditto for IDate / ITime.
Edit: And the development version in the Github repo has had functionality to preserve the incoming argument since June 2017. So the next CRAN version, whenever I will push it, will have it too.

You could hack it like this:
library(anytime)
t_num <- 1529734500
anytime(t_num+0)
# POSIXct[1:1], format: "2018-06-23 08:15:00"
t_num
# [1] 1529734500
Note that an integer input will be treated differently:
t_int <- 1529734500L
anytime(t_int)
# POSIXct[1:1], format: "2018-06-23 08:15:00"
t_int
# [1] 1529734500

If you do this, it will work:
t_num <- 1529734500
anytime(t_num*1)
#> anytime(t_num*1)
#[1] "2018-06-23 06:15:00 UTC"
#> t_num
#[1] 1529734500

Any reason to be married to anytime?
.POSIXct(t_num, tz = 'Europe/Berlin')
# [1] "2018-06-23 08:15:00 CEST"
.POSIXct(x, tz) is a wrapper for structure(x, class = c('POSIXct', 'POSIXt'), tzone = tz), so you do not need to declare the origin. It is essentially as.POSIXct.numeric, except that the latter is more flexible in that it also accepts a different origin; look at print(as.POSIXct.numeric).
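A small sketch of that equivalence, using only base R. The time zone "UTC" is chosen here so the example does not depend on the local time zone database:

```r
t_num <- 1529734500

a <- .POSIXct(t_num, tz = "UTC")
b <- structure(t_num, class = c("POSIXct", "POSIXt"), tzone = "UTC")
d <- as.POSIXct(t_num, origin = "1970-01-01", tz = "UTC")

identical(a, b)              # TRUE
as.numeric(d) == t_num       # TRUE
format(a, usetz = TRUE)      # "2018-06-23 06:15:00 UTC"
t_num                        # still the plain number 1529734500
```

All three constructions leave t_num itself untouched.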

When I did my homework before posting the question, I checked the open anytime issues. I have now browsed the closed ones as well, where I found exactly the same issue as mine:
anytime is overwriting inputs
There the package author writes:
I presume because as.POSIXct() leaves its input alone, we should too?
So from anytime version 0.3.1 (unreleased):
Numeric input is now preserved rather than silently cast to the return object type
Thus, one answer to my question is: "wait for 0.3.1"*.
When 0.3.1 is released, the behaviour of anytime(<numeric>) will agree with anytime(<non-numeric>) and as.POSIXct(<numeric>), and workarounds will no longer be needed.
*Didn't have to wait too long: 0.3.1 is now released: "Numeric input is now preserved rather than silently cast to the return object type"

Related

Convert "xx-xxx-xxxx" to date in R

I want to convert strings such as "19-SEP-2022" to date. Is there any available function in R? Thank you.
For completeness, I want to add the parse_date_time function from the lubridate package. Without doubt, the preferred answer here is that of @Marco Sandri:
library(lubridate)
x <- "19-SEP-2022"
x <- parse_date_time(x, "dmy")
x
[1] "2022-09-19 UTC"
> class(x)
[1] "POSIXct" "POSIXt"
Yes, strptime can be used to parse strings into dates.
You could do something like strptime("19-SEP-2022", "%d-%b-%Y").
If your days are not zero-padded, then use %e instead of %d.
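One caveat: %b matches month names from the current locale, so the strptime call above assumes an English (or C) locale. As a hedged alternative, here is a locale-independent sketch in base R. The helper name parse_dmy_abb is made up for this example and is not part of any package:

```r
# Hypothetical helper: parse "dd-MON-yyyy" strings using the built-in
# English constant month.abb, so the session locale is not involved.
parse_dmy_abb <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  pick  <- function(i) vapply(parts, function(p) p[i], character(1))
  day   <- as.integer(pick(1))
  month <- match(toupper(pick(2)), toupper(month.abb))  # NA if unknown
  year  <- as.integer(pick(3))
  as.Date(sprintf("%04d-%02d-%02d", year, month, day), format = "%Y-%m-%d")
}

d <- parse_dmy_abb(c("19-SEP-2022", "1-Jan-2023"))
format(d)   # "2022-09-19" "2023-01-01"
class(d)    # "Date"
```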
A decade or so ago I started writing the anytime package because of the firm belief that for obvious date(time) patterns we should not need to specify patterns, or learn grammars.
I still use it daily, and so do a bunch of other CRAN users.
> anytime::anydate("19-SEP-2022")
[1] "2022-09-19"
>
So here we do exactly what you ask for: supply the string, return a date object.

Convert timestamps without adding yyyy-mm-dd

I want to convert timestamps from characters to time but I do not want any date. On top of that, it must display milliseconds, as in "hh:mm:ss,os".
If I use as.POSIXct it always adds a date prefix to my timestamp, and that is not my intention. I also checked the lubridate package, but I can't find a function that goes beyond "as.hms" and displays at least two digits of the fractional seconds.
Example using POSIXct
df <-c("01:31:12.20","01:31:14.56","01:31:14.84")
options(digits.secs = 2)
df <- as.POSIXct(df, format="%H:%M:%OS")
This is the outcome:
[1] "2019-03-15 01:31:12.20 EDT" "2019-03-15 01:31:14.55 EDT"
[3] "2019-03-15 01:31:14.83 EDT"
Thank you.
Perhaps, the hms package does what the OP expects. hms implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.
library(hms)
as.hms(df)
01:31:12.200000
01:31:14.560000
01:31:14.840000
It can be used for calculation, e.g.,
diff(as.hms(df))
00:00:02.360000
00:00:00.280000
Please, note that the print() and format() methods do not accept other parameters and do not respect options(digits.secs = 2).
The hms class is similar to lubridate's hms() function, which creates a period object:
lubridate::hms(df)
[1] "1H 31M 12.2S" "1H 31M 14.56S" "1H 31M 14.84S"
Arithmetic can be done as well:
diff(lubridate::hms(df))
[1] 2.36 0.28
Please, be aware that the internal representation of time, date, and datetime objects usually is based on numeric types to allow for doing calculations. The internal representation is different from the character string when the object is printed.
The ITime class in the data.table package is a time-of-day class which stores the integer number of seconds in the day. So, it cannot handle milliseconds.
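To illustrate the point about numeric internal representation without any package, here is a rough base R sketch. It stores time of day as seconds since midnight and formats it back with two decimals. The helper names tod_to_secs and secs_to_tod are made up for this example:

```r
x <- c("01:31:12.20", "01:31:14.56", "01:31:14.84")

# Hypothetical helper: "hh:mm:ss.ss" -> seconds since midnight
tod_to_secs <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(3600, 60, 1)),
         numeric(1))
}

# Hypothetical helper: seconds since midnight -> "hh:mm:ss.ss"
secs_to_tod <- function(secs) {
  cs <- round(secs * 100)  # whole centiseconds, so digits are rounded, not truncated
  sprintf("%02d:%02d:%02d.%02d",
          as.integer(cs %/% 360000),
          as.integer((cs %/% 6000) %% 60),
          as.integer((cs %/% 100) %% 60),
          as.integer(cs %% 100))
}

secs <- tod_to_secs(x)
secs_to_tod(secs)      # "01:31:12.20" "01:31:14.56" "01:31:14.84"
round(diff(secs), 2)   # 2.36 0.28
```

The numbers are what you calculate with; the character form is only produced for display.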

Is there a comparable R method to pandas' to_datetime? Char to POSIXct issue

I'm simply attempting to take a string (char) time (mm:ss.s) (minute:second.fractional second) to a POSIXct object.
I've attempted many solutions with base R and lubridate, but I can't seem to preserve the fractional second.
In Python, I can simply use to_datetime and I can parse out what I need into the correct object type.
I'm wondering from the community if there is a solution.
The column data looks like this if you need a more clear visual:
> glimpse(update_times$Times)
chr [1:318] "24:45.0" "24:11.8" "22:22.6"
Thank you in advance. Much appreciated!
lubridate::ms(c("24:45.0", "24:11.8", "22:22.6"))
## [1] "24M 45S" "24M 11.8S" "22M 22.6S"
I believe this is possible with the standard strptime() function. From help(strptime):
## time with fractional seconds
z <- strptime("20/2/06 11:16:16.683", "%d/%m/%y %H:%M:%OS")
z # prints without fractional seconds
op <- options(digits.secs = 3)
z
options(op)
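For the "mm:ss.s" strings in the question specifically, a minimal base R sketch could split on the colon and keep the result as seconds. Anchoring the values at the Unix epoch to get a POSIXct is an arbitrary choice made here for illustration, since the data carry no date:

```r
times <- c("24:45.0", "24:11.8", "22:22.6")

# minutes * 60 + seconds, keeping the fractional part
parts <- strsplit(times, ":", fixed = TRUE)
secs  <- vapply(parts,
                function(p) 60 * as.numeric(p[1]) + as.numeric(p[2]),
                numeric(1))
secs                     # 1485.0 1451.8 1342.6

# Assumption: treat the values as offsets from 1970-01-01 00:00:00 UTC
p <- .POSIXct(secs, tz = "UTC")
class(p)                 # "POSIXct" "POSIXt"
as.numeric(p) - secs     # 0 0 0: the fractional seconds are preserved
```

Whether the fraction is shown when printing still depends on options(digits.secs), as in the help page example above.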

Issue in converting date format to numeric format in R

I had a dataset that looked like this:
#df
id date
1 2016-08-30 10:46:46.810
I tried to remove the hour part and only keep the date. This function worked:
df$date <- format(as.POSIXct(strptime(df$date,"%Y-%m-%d %H:%M:%S")) ,format = "%Y-%m-%d")
and the date now looks like this
id date
1 2016-08-30
Which is something that I was looking for. But the problem is I wish to do some calculation on this data and have to convert it to integer:
temp <- as.numeric(df$date )
It gives me the following warning:
Warning message:
NAs introduced by coercion
and results in
NA
Does anyone know where the issue is?
It's pretty easy as you have a standard format (see ISO 8601) which inter alia the anytime package supports (and it supports other, somewhat regular formats):
R> library(anytime)
R> at <- anytime("2016-08-30 10:46:46.810")
R> at
[1] "2016-08-30 10:46:46.80 CDT"
R> ad <- anydate("2016-08-30 10:46:46.810")
R> ad
[1] "2016-08-30"
R>
The key, though, is to understand the relationship between the underlying date formats. You will have to read and try a bit more on that. Here, in essence we just have
R> as.Date(anytime("2016-08-30 10:46:46.810"))
[1] "2016-08-30"
R>
The anytime package has a few other tricks such as automagic conversion from integer, character, factor, ordered, ...
As for the second part of your question, you were so close and then you spoiled it again with format() creating a character representation.
You almost always want Date representation instead:
R> ad <- as.Date(anytime("2016-08-30 10:46:46.810"))
R> as.integer(ad)
[1] 17043
R> as.numeric(ad)
[1] 17043
R> ad + 1:3
[1] "2016-08-31" "2016-09-01" "2016-09-02"
R>
Not format(). format gives you a character vector (string), and this confuses as.numeric because there are weird non-numeric characters in there. As far as the parser is concerned, you might as well have asked as.numeric("ripe red tomatoes").
Use as.Date() instead. e.g.
as.Date(as.POSIXct(df$date, format="%Y-%m-%d %H:%M:%S"))
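A short side-by-side sketch of the two paths in base R, using the timestamp from the question. The tz = "UTC" argument is an assumption added here so the result does not depend on the local time zone:

```r
stamp <- "2016-08-30 10:46:46.810"
pt <- as.POSIXct(stamp, tz = "UTC")

# Path 1: format() yields a character string, which is not a number
as_text <- format(pt, "%Y-%m-%d")
class(as_text)                                     # "character"
from_text <- suppressWarnings(as.numeric(as_text))
from_text                                          # NA

# Path 2: as.Date() keeps a numeric representation (days since 1970-01-01)
as_date <- as.Date(pt)
as.numeric(as_date)                                # 17043
```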

Odd POSIXct Function Behavior In R

I'm working with the POSIXct data type in R. In my work, I incorporate a function that returns two POSIXct dates in a vector. However, I am discovering some unexpected behavior. I wrote some example code to illustrate my problem:
# POSIXct returning issue:
returnTime <- function(date) {
oneDay <- 60 * 60 * 24
nextDay <- date + oneDay
print(date)
print(nextDay)
return(c(date, nextDay))
}
myTime <- as.POSIXct("2015-01-01", tz = "UTC")
bothDays <- returnTime(myTime)
print(bothDays)
The print statements in the function give:
[1] "2015-01-01 UTC"
[1] "2015-01-02 UTC"
While the print statement at the end of the code gives:
[1] "2014-12-31 19:00:00 EST" "2015-01-01 19:00:00 EST"
I understand what is happening, but I don't see why. It could be a simple mistake that is eluding me, but I really am quite confused. I don't understand why the time zone is changing on the return. The class is still POSIXct as well, just the time zone has changed.
Additionally, I did the same as above, but just returned one of the dates and the date's timezone did not change. I can work around this for now, but wanted to see if anyone had any insight to my problem. Thank you in advance!
Thanks for the help below. I instead did:
return(list(date, nextDay))
and this solved my issue of the time zone being dropped.
From ?c.POSIXct:
Using c on "POSIXlt" objects converts them to the current time zone,
and on "POSIXct" objects drops any "tzone" attributes (even if they
are all marked with the same time zone).
The problem is that the function c removes the timezone attribute:
attributes(myTime)
#$class
#[1] "POSIXct" "POSIXt"
#
#$tzone
#[1] "UTC"
attributes(c(myTime))
#$class
#[1] "POSIXct" "POSIXt"
To fix, you can e.g. use the setattr function from data.table, to modify the attribute in place:
(setattr(c(myTime), 'tzone', attributes(myTime)$tzone))
#[1] "2015-01-01 UTC"
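If you would rather not depend on data.table, the same repair can be sketched in base R by assigning the attribute back after combining. This version of the question's function is only an illustration, and it assumes both values share one time zone:

```r
returnTime <- function(date) {
  oneDay <- 60 * 60 * 24
  both <- c(date, date + oneDay)
  # Reapply the time zone of the input, in case c() dropped it
  attr(both, "tzone") <- attr(date, "tzone")
  both
}

myTime <- as.POSIXct("2015-01-01", tz = "UTC")
bothDays <- returnTime(myTime)
attr(bothDays, "tzone")                # "UTC"
format(bothDays, "%Y-%m-%d %H:%M")     # "2015-01-01 00:00" "2015-01-02 00:00"
```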

Resources