I am having trouble recovering lubridate intervals when reading data back from the csv and fst formats.
Does anyone have a suggestion for how to do this?
library(tidyverse)
library(fst)
library(lubridate)
test <- tibble(
start = ymd_hms("2020-01-01 12:13:14", tz="UTC"),
end = ymd_hms("2021-01-01 12:13:14", tz="UTC"),
interval = lubridate::interval(start, end)
) %>%
write_csv("test1.csv")
test %>% fst::write_fst("test1.fst")
str(test)
test_read_back_csv <- read_csv("test1.csv")
str(test_read_back_csv)
test_read_back_fst <- read_fst("test1.fst")
str(test_read_back_fst)
You will see that test_read_back_csv$interval and test_read_back_fst$interval are not lubridate intervals. I need to be able to save this data and read it back with the interval column intact.
Save to a binary format such as .RDS:
str(test)
#> tibble [1 x 3] (S3: tbl_df/tbl/data.frame)
#> $ start : POSIXct[1:1], format: "2020-01-01 12:13:14"
#> $ end : POSIXct[1:1], format: "2021-01-01 12:13:14"
#> $ interval:Formal class 'Interval' [package "lubridate"] with 3 slots
#> .. ..@ .Data: num 31622400
#> .. ..@ start: POSIXct[1:1], format: "2020-01-01 12:13:14"
#> .. ..@ tzone: chr "UTC"
write_rds(test, "test1.rds")
test_read_back_rds <- read_rds("test1.rds")
str(test_read_back_rds)
#> tibble [1 x 3] (S3: tbl_df/tbl/data.frame)
#> $ start : POSIXct[1:1], format: "2020-01-01 12:13:14"
#> $ end : POSIXct[1:1], format: "2021-01-01 12:13:14"
#> $ interval:Formal class 'Interval' [package "lubridate"] with 3 slots
#> .. ..@ .Data: num 31622400
#> .. ..@ start: POSIXct[1:1], format: "2020-01-01 12:13:14"
#> .. ..@ tzone: chr "UTC"
Created on 2022-03-14 by the reprex package (v2.0.1)
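If the file has to stay in a text format such as csv, one possible workaround (not part of the answer above, just a sketch) is to write only the start and end columns and rebuild the interval after reading. The example assumes base write.csv()/read.csv() rather than readr, a temporary file, and that both timestamps are stored in UTC; timestamps that fall exactly on midnight may be written without a time part and would need a more lenient parser.

```r
library(lubridate)

# Same start/end values as in the question; the interval column is left out on purpose.
test <- data.frame(
  start = ymd_hms("2020-01-01 12:13:14", tz = "UTC"),
  end   = ymd_hms("2021-01-01 12:13:14", tz = "UTC")
)

# Assumption: a throwaway temporary file stands in for "test1.csv".
path <- tempfile(fileext = ".csv")
write.csv(test, path, row.names = FALSE)

# read.csv() returns the timestamps as text, so parse them again.
back <- read.csv(path, stringsAsFactors = FALSE)
back$start <- ymd_hms(back$start, tz = "UTC")
back$end   <- ymd_hms(back$end, tz = "UTC")

# Rebuild the interval from its two endpoints.
back$interval <- interval(back$start, back$end)

is.interval(back$interval)
int_length(back$interval)
```

This works because an interval is fully determined by its start, its end and the time zone, so nothing is lost by leaving the derived column out of the file.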
Related
I am trying to separate a column. I have the columns date, lmp, and node. The code recognizes date but does not recognize lmp or node.
names(Pjm)
# [1] "date" "lmp" "node"
names(Pjm) <- c("date", "lmp", "node")
str(Pjm)
#tibble [210,240 × 3] (S3: tbl_df/tbl/data.frame)
# $ date: POSIXct[1:210240], format: "2021-01-01 00:00:00" "2021-01-01 01:00:00" "2021-01-01 02:00:00" "2021-01-01 03:00:00" ...
# $ lmp : num [1:210240] 19.2 17.8 17.8 17.8 17.9 ...
# $ node: chr [1:210240] "AEP_GEN" "AEP_GEN" "AEP_GEN" "AEP_GEN" ...
exists("lmp")
# [1] FALSE
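A likely explanation for the FALSE above is that exists() looks for a variable with that name in the environment, not for a column inside a data frame. The sketch below uses a hypothetical three-row stand-in for Pjm (the values are copied from the str() output, the row count is made up) to show the difference and two ways to reach the column.

```r
# Hypothetical miniature version of the Pjm table from the question.
Pjm <- data.frame(
  date = as.POSIXct(c("2021-01-01 00:00:00", "2021-01-01 01:00:00",
                      "2021-01-01 02:00:00"), tz = "UTC"),
  lmp  = c(19.2, 17.8, 17.8),
  node = c("AEP_GEN", "AEP_GEN", "AEP_GEN")
)

# exists() checks for an object called "lmp"; a column does not count.
has_object <- exists("lmp")

# To test for a column, look at the names of the data frame instead.
has_column <- "lmp" %in% names(Pjm)

# Columns are reached through the data frame, either with $ or with with().
lmp_values <- Pjm$lmp
mean_lmp   <- with(Pjm, mean(lmp))
```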
I have read a lot of blogs, but I cannot find the answer to my question:
I have a date 2020-25-02 17:42:03 and I would like to convert it to two columns, day and time.
hello <- strptime("2020-25-02 17:42:03", "%Y-%d-%m %H:%M:%S")
df$day <- as.Date(hello, format = "%Y-%d-%m")
But I would also like df$time. Is it possible?
> library(chron)
> dtimes = c("2002-06-09 12:45:40","2003-01-29 09:30:40",
+ "2002-09-04 16:45:40","2002-11-13 20:00:40",
+ "2002-07-07 17:30:40")
> dtparts = t(as.data.frame(strsplit(dtimes,' ')))
> row.names(dtparts) = NULL
> thetimes = chron(dates=dtparts[,1],times=dtparts[,2],
+ format=c('y-m-d','h:m:s'))
> thetimes
[1] (02-06-09 12:45:40) (03-01-29 09:30:40) (02-09-04 16:45:40)
[4] (02-11-13 20:00:40) (02-07-07 17:30:40)
Please see this link
Use function hms in package lubridate.
df <- data.frame(day = as.Date(hello, format = "%Y-%d-%m"))
df$time <- lubridate::hms(sub("^[^ ]*\\b(.*)$", "\\1", hello))
df
# day time
#1 2020-02-25 17H 42M 3S
str(df)
#'data.frame': 1 obs. of 2 variables:
# $ day : Date, format: "2020-02-25"
# $ time:Formal class 'Period' [package "lubridate"] with 6 slots
# .. ..@ .Data : num 3
# .. ..@ year : num 0
# .. ..@ month : num 0
# .. ..@ day : num 0
# .. ..@ hour : num 17
# .. ..@ minute: num 42
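If a plain character column is enough for the time, a base R variant without lubridate might look like the sketch below. It assumes the input string is in year-day-month order, as in the question, and fixes the time zone to UTC purely to keep the example reproducible.

```r
# Parse the year-day-month timestamp; tz = "UTC" is an assumption for reproducibility.
hello <- strptime("2020-25-02 17:42:03", "%Y-%d-%m %H:%M:%S", tz = "UTC")

# Split the parsed value into a Date column and a character time column.
df <- data.frame(
  day  = as.Date(hello),
  time = format(hello, "%H:%M:%S")
)
df
```

The trade-off is that time is stored as text here, so arithmetic on it needs another conversion, whereas the Period column produced by lubridate::hms() can be added to dates directly.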
Does anyone have experience parsing or converting strings like "now-1h", "today", "now-3d", "today+30m" in R?
How can such a string (for example, passed as a function argument) be recognized and converted to a date-time?
I do not recommend using this code, which is brittle. This is just to demonstrate this is possible:
library(lubridate)
library(stringr)
library(dplyr)
data <- c("now-1h", "today", "now-3d", "today+30m", "-3d")
We're going to use lubridate functions now(), today(), days(), hours(), ... which resemble your input data:
str(now())
# POSIXct[1:1], format: "2017-03-24 12:26:18"
str(today())
# Date[1:1], format: "2017-03-24"
str(now() - days(3))
# POSIXct[1:1], format: "2017-03-21 12:27:11"
str(- days(3))
# Formal class 'Period' [package "lubridate"] with 6 slots
# ..@ .Data : num 0
# ..@ year : num 0
# ..@ month : num 0
# ..@ day : num -3
# ..@ hour : num 0
# ..@ minute: num 0
We need to parse() them as strings before we can evaluate them, like this:
eval(parse(text = "now() + days(3)"))
# [1] "2017-03-27 12:41:34 CEST"
Now let's parse the input strings with a regex, manipulate them a bit to match lubridate syntax, then eval()uate them:
res <-
str_match(data, "(today|now)?([+-])?(\\d+)?([dhms])?")[, - 1] %>%
apply(1, function(x) {
time_ <- if (is.na(x[1])) NULL else paste0(x[1], "()")
offset_ <- if (any(is.na(x[2:4]))) NULL else paste(x[2],
recode(x[4], "d" = "days(", "h" = "hours(", "m" = "minutes(", "s" = "seconds("),
x[3],
")")
parse(text = paste(time_, offset_))
}) %>%
lapply(eval)
Notice that you get a variety of classes as output (POSIXct, POSIXlt, Date or lubridate::Period):
invisible(lapply(res, function(x) { print(x) ; str(x) }))
# [1] "2017-03-24 11:57:52 CET"
# POSIXct[1:1], format: "2017-03-24 11:57:52"
# [1] "2017-03-24"
# Date[1:1], format: "2017-03-24"
# [1] "2017-03-21 12:57:52 CET"
# POSIXct[1:1], format: "2017-03-21 12:57:52"
# [1] "2017-03-24 00:30:00 UTC"
# POSIXlt[1:1], format: "2017-03-24 00:30:00"
# [1] "-3d 0H 0M 0S"
# Formal class 'Period' [package "lubridate"] with 6 slots
# ..@ .Data : num 0
# ..@ year : num 0
# ..@ month : num 0
# ..@ day : num -3
# ..@ hour : num 0
# ..@ minute: num 0
(What I recommend instead is to pre-process the data with the language that produced it and possesses the right tools for the job, which appears to be Perl).
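For comparison, here is a sketch of the same idea without eval(parse()), using only base R. The function name parse_relative and its arguments now and tz are made up for this illustration. It also makes two simplifying assumptions that differ from the code above: a bare offset such as "-3d" is taken relative to now instead of being returned as a Period, and a day always counts as 86400 seconds, so daylight saving changes are ignored.

```r
# Hypothetical helper: turn strings like "now-1h" or "today+30m" into POSIXct.
parse_relative <- function(x, now = Sys.time(), tz = "UTC") {
  pattern <- "^(now|today|)([+-]?)(\\d*)([dhms]?)$"
  parts <- regmatches(x, regexec(pattern, x, perl = TRUE))[[1]]
  if (nchar(x) == 0L || length(parts) == 0L) {
    stop("unrecognised expression: ", x)
  }
  base <- parts[2]; sign <- parts[3]; amount <- parts[4]; unit <- parts[5]

  # Sign, amount and unit must be given together or not at all.
  has_offset <- c(sign, amount, unit) != ""
  if (any(has_offset) && !all(has_offset)) {
    stop("incomplete offset in: ", x)
  }

  # "today" means midnight of the current day in tz; anything else means now.
  anchor <- if (base == "today") {
    as.POSIXct(format(now, "%Y-%m-%d", tz = tz), tz = tz)
  } else {
    now
  }
  if (!all(has_offset)) return(anchor)

  # Assumption: fixed-length units, expressed in seconds.
  seconds_per_unit <- c(d = 86400, h = 3600, m = 60, s = 1)
  delta <- as.numeric(amount) * seconds_per_unit[[unit]]
  if (sign == "-") anchor - delta else anchor + delta
}

# A fixed reference time makes the results reproducible.
ref <- as.POSIXct("2017-03-24 12:00:00", tz = "UTC")
res <- lapply(c("now-1h", "today", "now-3d", "today+30m", "-3d"),
              parse_relative, now = ref)
```

Because nothing is evaluated as code, an unexpected input fails with an error instead of running arbitrary expressions, and every result has the same class.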
I can generate a list of timeDate objects holding the New York Stock Exchange holidays. However, most of the analytical functions expect a single timeDate object. The underlying data representation is POSIXct, so I can't just append them like a vector or a list.
How can I combine them?
library(timeDate)
x <- lapply(c(1885: 1886), holidayNYSE)
x
[[1]]
NewYork
[1] [1885-01-01] [1885-02-23] [1885-04-03] [1885-11-03] [1885-11-26] [1885-12-25]
[[2]]
NewYork
[1] [1886-01-01] [1886-02-22] [1886-04-23] [1886-05-31] [1886-07-05] [1886-11-02] [1886-11-25]
class(x[[1]])
[1] "timeDate"
attr(,"package")
[1] "timeDate"
class(x[[1]]@Data)
[1] "POSIXct" "POSIXt"
# ??? How do I combine my two datetime objects ???
We can use do.call with c:
x1 <- do.call(c, x)
x1
#NewYork
#[1] [1885-01-01] [1885-02-23] [1885-04-03] [1885-11-03] [1885-11-26] [1885-12-25] [1886-01-01] [1886-02-22] [1886-04-23] [1886-05-31] [1886-07-05] [1886-11-02]
#[13] [1886-11-25]
str(x1)
#Formal class 'timeDate' [package "timeDate"] with 3 slots
# ..@ Data : POSIXct[1:13], format: "1885-01-01 05:00:00" "1885-02-23 05:00:00" "1885-04-03 05:00:00" "1885-11-03 05:00:00" ...
# ..@ format : chr "%Y-%m-%d"
# ..@ FinCenter: chr "NewYork"
and the structure of OP's list is
str(x)
#List of 2
#$ :Formal class 'timeDate' [package "timeDate"] with 3 slots
# .. ..@ Data : POSIXct[1:6], format: "1885-01-01 05:00:00" "1885-02-23 05:00:00" "1885-04-03 05:00:00" "1885-11-03 05:00:00" ...
# .. ..@ format : chr "%Y-%m-%d"
# .. ..@ FinCenter: chr "NewYork"
# $ :Formal class 'timeDate' [package "timeDate"] with 3 slots
# .. ..@ Data : POSIXct[1:7], format: "1886-01-01 05:00:00" "1886-02-22 05:00:00" "1886-04-23 05:00:00" "1886-05-31 05:00:00" ...
# .. ..@ format : chr "%Y-%m-%d"
# .. ..@ FinCenter: chr "NewYork"
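The do.call(c, ...) idiom is not specific to timeDate; it relies on the class providing its own c() method. The sketch below shows the same pattern with base Date vectors (an illustration that does not need the timeDate package, with dates borrowed from the output above) and contrasts it with unlist(), which drops the class.

```r
# A list of Date vectors, standing in for the list of timeDate objects.
date_list <- list(
  as.Date(c("1885-01-01", "1885-02-23")),
  as.Date(c("1886-01-01", "1886-02-22", "1886-04-23"))
)

# do.call(c, ...) calls c(date_list[[1]], date_list[[2]]), which dispatches
# to the Date method of c() and therefore keeps the class.
combined <- do.call(c, date_list)
class(combined)

# unlist() flattens to the underlying numbers and loses the class.
flattened <- unlist(date_list)
class(flattened)
```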
I have the following xts object with indexClass "Date". index(data) gives me a "POSIXct" object; I thought index(data) would return a "Date" object.
How can I get a "Date" object from index()?
str(data)
An ‘xts’ object from 2007-01-15 to 2012-04-27 containing:
Data: num [1:1282, 1:5] 1881 2003 2064 2026 2098 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:5] "open" "high" "low" "close" ...
Indexed by objects of class: [Date] TZ: GMT
xts Attributes:
List of 2
$ tclass: chr "Date"
$ tzone : chr "GMT"
indexClass(data)
"Date"
str(index(data))
Class 'POSIXct' atomic [1:1282] 1.17e+09 1.17e+09 1.17e+09 1.17e+09 1.17e+09 ...
..- attr(*, "tzone")= chr "GMT"
..- attr(*, "tclass")= chr "Date"
Quick answer: Dates don't have timezones. So it (I presume) has to wrap your Date in a POSIXct to preserve the timezone information.
Here is an example without timezones, where it shows the behaviour you expect:
x=xts(1:10, seq.Date(as.Date('2012-01-01'),by=1,length.out=10))
indexClass(x)
# [1] "Date"
index(x)
# "2012-01-01" "2012-01-02" "2012-01-03" "2012-01-04" "2012-01-05" "2012-01-06" "2012-01-07" "2012-01-08" "2012-01-09" "2012-01-10"
str(index(x))
# Date[1:10], format: "2012-01-01" "2012-01-02" "2012-01-03" "2012-01-04" "2012-01-05" "2012-01-06" "2012-01-07" "2012-01-08" "2012-01-09" "2012-01-10"
UPDATE: Adding the tzone attribute to the xts object does not change anything:
x=xts(1:10, seq.Date(as.Date('2012-01-01'),by=1,length.out=10), tzone="GMT")
str(index(x))
# Date[1:10], format: "2012-01-01" "2012-01-02" "2012-01-03" "2012-01-04" "2012-01-05" "2012-01-06" "2012-01-07" "2012-01-08" "2012-01-09" "2012-01-10"
This is despite str(x) giving the same output as yours:
An ‘xts’ object from 2012-01-01 to 2012-01-10 containing:
Data: int [1:10, 1] 1 2 3 4 5 6 7 8 9 10
Indexed by objects of class: [Date] TZ: GMT
xts Attributes:
List of 2
$ tclass: chr "Date"
$ tzone : chr "GMT"
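If index() does return POSIXct for your object, one pragmatic workaround is to convert the result explicitly. The sketch below does not use xts at all: it builds a plain POSIXct vector in GMT as a stand-in for the index shown in the question (the three dates are made up apart from the first) and converts it with as.Date(), passing the same time zone so that no date shifts across midnight.

```r
# Stand-in for index(data): a POSIXct vector in GMT (assumed values).
idx <- as.POSIXct(c("2007-01-15", "2007-01-16", "2007-01-17"), tz = "GMT")

# Convert to Date, naming the time zone explicitly.
dates <- as.Date(idx, tz = "GMT")
str(dates)
```

With a real xts object the equivalent call would presumably be as.Date(index(data), tz = "GMT"), but that line is untested here.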