Time series, lubridate, and the seq function - r

I'm just starting out with the lubridate package and dates in R, and some behavior surprised me right away. When I run seq(ymd("2021-01-01"), ymd("2021-01-04"), ddays(1)) I get only one date, [1] "2021-01-01". But when I run seq(ymd_h("2021-01-01 00"), ymd_h("2021-01-04 00"), ddays(1)) I get the result I expected, four dates: "2021-01-01 UTC" "2021-01-02 UTC" "2021-01-03 UTC" "2021-01-04 UTC".
I admit that this surprised me a lot.
I would be very grateful if someone could explain in simple words why this happens.
And a second question: is there a function like seq that correctly understands the d... functions in the lubridate package (ddays, dhours, dminutes, etc.)?

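seq is not part of the lubridate package and doesn't understand the d... functions. It only sees the number stored inside the duration, and ddays(1) is stored as 86400 seconds.
ymd returns a Date, so when you call seq, you are using seq.Date, which reads a numeric by as a number of days. A step of 86400 days overshoots the end date, so you get only the first date.
You want seq(ymd("2021-01-01"), ymd("2021-01-04"), "days")
ymd_h returns a POSIXct object, so then seq is using seq.POSIXt, which reads a numeric by as seconds. That is why 86400 happens to give one-day steps there.
You again want seq(ymd_h("2021-01-01 00"), ymd_h("2021-01-04 00"), "days"), but now the result is a POSIXct vector.
See the help for seq.Date and seq.POSIXt to see how they differ.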
The new clock package has many good functions for date manipulation, including one called date_seq you might find useful.
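To see the difference between the two seq methods without lubridate, here is a minimal base R sketch. It assumes only that ddays(1) is stored as 86400 seconds and passes that number directly as by; the variable names are illustrative.

```r
# A lubridate Duration such as ddays(1) is stored as a number of seconds.
one_day_in_seconds <- 86400

# seq.Date reads a numeric `by` as days, so the step is 86400 days
# and only the start date fits before the end date.
dates_numeric_by <- seq(as.Date("2021-01-01"), as.Date("2021-01-04"),
                        by = one_day_in_seconds)

# seq.POSIXt reads a numeric `by` as seconds, so the same number means one day.
start <- as.POSIXct("2021-01-01 00:00:00", tz = "UTC")
end   <- as.POSIXct("2021-01-04 00:00:00", tz = "UTC")
times_numeric_by <- seq(start, end, by = one_day_in_seconds)

# A character `by` is unambiguous for both classes.
dates_by_day <- seq(as.Date("2021-01-01"), as.Date("2021-01-04"), by = "day")

length(dates_numeric_by)  # 1
length(times_numeric_by)  # 4
length(dates_by_day)      # 4
```

Passing a unit string such as "day" avoids depending on how each method interprets a bare number.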

Related

Convert "xx-xxx-xxxx" to date in R

I want to convert strings such as "19-SEP-2022" to date. Is there any available function in R? Thank you.
Just for completeness, I want to add the parse_date_time function from the lubridate package. Without doubt, the preferred answer here is that of @Marco Sandri:
library(lubridate)
x <- "19-SEP-2022"
x <- parse_date_time(x, "dmy")
x
[1] "2022-09-19 UTC"
class(x)
[1] "POSIXct" "POSIXt"
Yes, strptime can be used to parse strings into dates.
You could do something like strptime("19-SEP-2022", "%d-%b-%Y").
If your days are not zero-padded, then use %e instead of %d.
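As a small sketch of the strptime route: strptime returns a POSIXlt date-time rather than a Date, so wrapping it in as.Date gives a plain date. This assumes an English (or C) locale, because %b matches month abbreviations in the current locale; the variable names are illustrative.

```r
# strptime() returns a POSIXlt date-time; as.Date() drops the time part.
parsed <- strptime("19-SEP-2022", format = "%d-%b-%Y", tz = "UTC")
as_date <- as.Date(parsed)

class(parsed)    # "POSIXlt" "POSIXt"
class(as_date)   # "Date"
format(as_date)  # "2022-09-19" when month names are English
```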
A decade or so ago I started writing the anytime package because of the firm belief that for obvious date(time) patterns we should not need to specify patterns, or learn grammars.
I still use it daily, and so do a bunch of other CRAN users.
> anytime::anydate("19-SEP-2022")
[1] "2022-09-19"
So here we do exactly what you ask for: supply the string, return a date object.

Convert timestamps without adding yyyy-mm-dd

I want to convert timestamps from characters to time but I do not want any date. On top of that, it must display milliseconds, as in "hh:mm:ss,os".
If I use as.POSIXct, it always adds a date prefix to my timestamp, and that is not my intention. I also checked the lubridate package, but I can't seem to find a function that goes beyond "as.hms" so that it displays at least two digits of fractional seconds.
Example using POSIXct
df <-c("01:31:12.20","01:31:14.56","01:31:14.84")
options(digits.secs = 2)
df <- as.POSIXct(df, format="%H:%M:%OS")
This is the outcome:
[1] "2019-03-15 01:31:12.20 EDT" "2019-03-15 01:31:14.55 EDT"
[3] "2019-03-15 01:31:14.83 EDT"
Thank you.
Perhaps, the hms package does what the OP expects. hms implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.
library(hms)
as.hms(df)
01:31:12.200000
01:31:14.560000
01:31:14.840000
It can be used for calculation, e.g.,
diff(as.hms(df))
00:00:02.360000
00:00:00.280000
Please, note that the print() and format() methods do not accept other parameters and do not respect options(digits.secs = 2).
The hms class is similar to lubridate's hms() function, which creates a Period object:
lubridate::hms(df)
[1] "1H 31M 12.2S" "1H 31M 14.56S" "1H 31M 14.84S"
Arithmetic can be done as well:
diff(lubridate::hms(df))
[1] 2.36 0.28
Please, be aware that the internal representation of time, date, and datetime objects usually is based on numeric types to allow for doing calculations. The internal representation is different from the character string when the object is printed.
The ITime class in the data.table package is a time-of-day class which stores the integer number of seconds in the day. So, it cannot handle milliseconds.
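To illustrate the point about internal representation versus printed form, here is a minimal base R sketch that keeps the time of day as seconds since midnight and only formats it for display. The helpers to_seconds and format_tod are hypothetical names written for this example, not functions from any package, and the sketch assumes input of the form "hh:mm:ss.ss".

```r
# Hypothetical helper: "hh:mm:ss.ss" strings -> numeric seconds since midnight.
to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
}

# Hypothetical helper: seconds since midnight -> "hh:mm:ss.ss" strings.
# Rounding to whole hundredths first avoids truncation artifacts
# such as 14.56 being shown as 14.55.
format_tod <- function(secs) {
  ticks <- round(secs * 100)            # hundredths of a second
  h <- ticks %/% 360000
  m <- (ticks %/% 6000) %% 60
  s <- (ticks %% 6000) / 100
  sprintf("%02d:%02d:%05.2f", h, m, s)
}

stamps <- c("01:31:12.20", "01:31:14.56", "01:31:14.84")
secs <- to_seconds(stamps)

format_tod(secs)       # "01:31:12.20" "01:31:14.56" "01:31:14.84"
round(diff(secs), 2)   # 2.36 0.28
```

Arithmetic is done on the numeric vector, and the character form is produced only when printing.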

R - converting dates within data.frame

I am working with a data.frame whose dates are given in the following format: Aug 12, 2017.
class(data[,1]) returns "factor".
How can I convert these into dates?
data[,1] <- as.Date.factor(data[,1],format = "%m.%d.%y") returns NAs.
I would suggest the package lubridate for very easy to use functions to operate with dates. For example:
mdy("Aug 12,2017")
[1] "2017-08-12"
If your date is in YYYY-MM-DD format, you can use the ymd function. There are also other functions such as dmy, dmy_hms (for datetime), etc.
If your column is called my.date, you can do:
data$my.date <- mdy(data$my.date)
Alternatively, you can use the %<>% operator from magrittr to make your code even shorter:
data$my.date %<>% mdy
Use as.POSIXct (Base-R Solution):
as.POSIXct("Aug 12,2017", format="%b%d,%Y")
Output:
[1] "2017-08-12 CEST"
Using strptime could also work:
strptime("Aug 12,2017", "%b%d,%Y")
Output:
[1] "2017-08-12 UTC"
The second parameter for strptime is the format of the dates you have. For instance, if your dates are like this "1/5/2005", then the format would be:
format="%m/%d/%Y"
Hope it helps
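For the factor column from the question, a minimal base R sketch might look like the following. The data frame and the column name my.date are made up for illustration, and %b assumes an English (or C) locale for the month abbreviations.

```r
# Hypothetical data frame with a factor column, as in the question.
data <- data.frame(my.date = factor(c("Aug 12, 2017", "Sep 01, 2017")))

# The original attempt used "%m.%d.%y", which expects input like "08.12.17".
# "%b %d, %Y" matches an abbreviated month name, day, comma, four-digit year.
data$my.date <- as.Date(as.character(data$my.date), format = "%b %d, %Y")

class(data$my.date)   # "Date"
format(data$my.date)  # "2017-08-12" "2017-09-01" with English month names
```

Converting the factor to character first makes it explicit that the labels, not the underlying integer codes, are being parsed.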

how to take the difference between dates like "30MAY07"?

In R, how can I convert time variable "30MAY07" or "21AUG09" to a value? I want to find the time difference between them. Thanks!
You can use the lubridate package for this:
library(lubridate)
dmy(c('30MAY07', '21AUG09'))
# [1] "2007-05-30 UTC" "2009-08-21 UTC"
strptime and as.Date from base R are also good options, but lubridate makes very good informed guesses as to the format of the date. As you can see in your example, there is no need to specify anything other than the day-month-year function (dmy), and things work out of the box.
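Once the strings are parsed, the difference asked about in the question is a subtraction. Here is a small base R sketch, assuming an English (or C) locale so that %b matches "MAY" and "AUG"; the variable names are illustrative.

```r
# Parse day, abbreviated month name and two-digit year.
dates <- as.Date(c("30MAY07", "21AUG09"), format = "%d%b%y")

# difftime() makes the unit explicit; dates[2] - dates[1] gives the same value.
gap <- difftime(dates[2], dates[1], units = "days")

format(dates)    # "2007-05-30" "2009-08-21"
as.numeric(gap)  # 814
```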

INTNX equivalent in R

I am currently working on replicating a SAS code to R.
In SAS, there is INTNX function that helps to advance a date by a given interval.
For example -
intnx('month','2013/12/10',3) = 2014/03/10
I was wondering if there is a function in R that works in a similar fashion?
Using lubridate package you can simply do this:
library(lubridate)
ymd("2013/12/10") + months(3)
[1] "2014-03-10 UTC"
Note also that if you want to add a month without exceeding the last day of the new month, you should use %m+%:
ymd("2013/01/31") %m+% months(1)
[1] "2013-02-28 UTC"
There is. You could do:
seq(as.Date("2013-12-10"), length=2, by="3 months")[2]
[1] "2014-03-10"
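The seq approach can be wrapped in a small helper. This is only a sketch: shift_months is a hypothetical name, not a base R or SAS function. It also shows that, unlike %m+%, base R's month stepping counts an invalid day of the month forward into the next month.

```r
# Hypothetical helper: advance a Date by n months using seq().
shift_months <- function(date, n) {
  seq(date, by = paste(n, "months"), length.out = 2)[2]
}

shift_months(as.Date("2013-12-10"), 3)  # "2014-03-10"

# "2013-02-31" does not exist, so seq() counts forward into March
# instead of stopping at the last day of February.
shift_months(as.Date("2013-01-31"), 1)  # "2013-03-03"
```

If end-of-month dates matter, check which of the two behaviors matches what the SAS code relies on.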
