POSIXct date conversion error [duplicate] - r

This question already has an answer here:
R: strptime() and is.na () unexpected results
(1 answer)
Closed 8 years ago.
I've encountered the following error when converting a set of dates in character format to a POSIXct object.
Example Data:
t<-c("3/11/2007 1:30", "3/11/2007 2:00", "4/11/2007 2:00")
str(t)
chr [1:3] "3/11/2007 1:30" "3/11/2007 2:00" "4/11/2007 2:00"
z<-as.POSIXct(strptime(t, format ="%m/%d/%Y %H:%M"))
z
"2007-03-11 01:30:00 MST" NA "2007-04-11 02:00:00 MDT"
str(z)
POSIXct[1:3], format: "2007-03-11 01:30:00" NA "2007-04-11 02:00:00"
My question is: why is NA returned for the second date in z? I have a dataset that contains 8 years of hourly data (from which I copied the dates above), and this NA appears only for dates between 3/8 and 3/14, and only when the hour is 02:00:00.
I do not encounter the problem if the dates are converted to POSIXlt, so that is my current workaround.
Any thoughts?

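The second timestamp does not exist in your local time zone: on 11 March 2007, US clocks jumped from 02:00 straight to 03:00 when daylight saving time began, so 02:00 cannot be represented and the conversion returns NA. Try using a time zone that does not observe daylight saving time: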
as.POSIXct(t, format = "%m/%d/%Y %H:%M", tz = "GMT")
## [1] "2007-03-11 01:30:00 GMT" "2007-03-11 02:00:00 GMT" "2007-04-11 02:00:00 GMT"
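If the timestamps were logged in local standard time all year round (an assumption about the data, not something the question states), a fixed-offset zone is another way to avoid the gap. A minimal sketch, assuming the zone name Etc/GMT+7 is available in the system's time zone database (under the POSIX sign convention it means UTC-7):

```r
# Timestamps copied from the question
t <- c("3/11/2007 1:30", "3/11/2007 2:00", "4/11/2007 2:00")

# Assumption: the logger never switched to daylight time, so every
# value is UTC-7. "Etc/GMT+7" is the fixed-offset zone for UTC-7.
z_fixed <- as.POSIXct(t, format = "%m/%d/%Y %H:%M", tz = "Etc/GMT+7")

# A fixed-offset zone has no skipped wall-clock times
anyNA(z_fixed)

# Consecutive readings stay 30 minutes apart
difftime(z_fixed[2], z_fixed[1], units = "mins")

# Display the same instants in a DST-observing zone if needed
format(z_fixed, tz = "America/Denver", usetz = TRUE)
```

How a nonexistent local time is handled is platform-specific in R, so parsing in a zone without transitions tends to be more predictable than relying on the NA.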

Converting long integer into date and time in r [duplicate]

This question already has answers here:
Convert integer to class Date
(3 answers)
Closed 1 year ago.
I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as follows:
"2020-10-19 08:30"
So far I have tried this but cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year (%Y), month (%m), or day (%d).
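The base R approach above can be made a little more defensive. This is a sketch only: the choice of UTC is an assumption (use whichever zone the timestamps were recorded in), and sprintf is used so the number can never be rendered in scientific notation before parsing.

```r
z <- 20201019083000

# Convert the number to text without any risk of scientific notation
z_chr <- sprintf("%.0f", z)

# An explicit tz keeps the result independent of the machine's settings
# (UTC is an assumption here, not something stated in the question)
stamp <- as.POSIXct(z_chr, format = "%Y%m%d%H%M%S", tz = "UTC")

# Format for display
out <- format(stamp, "%Y-%m-%d %H:%M")
out
# [1] "2020-10-19 08:30"
```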

R: xts conversion problem (Add x's at the index row)

This is my setup: I have an Excel file with hourly electricity prices that I want to index by the hourly interval (file here: Data). I load the data the usual way.
library(readxl)
library(tidyverse)
rm(list = ls())
DK1 <- read_excel("DK1.xlsx")
time_index <- as.POSIXct(DK1$Datetime, format="%Y/%m/%d %H:%M:%S", tz=Sys.timezone())
test <- xts(DK1[,-1], order.by = time_index)
This is just one of many ways I've tried to index it in XTS to no avail. The index row looks wrong and I do not know what to do.
UPDATE 1: dput(head(DK1))
It appears that read_excel is converting your time column into a datetime, but with all the dates set to "1899-12-31". This can be seen by running:
> str(DK1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 8760 obs. of 6 variables:
$ Date : POSIXct, format: "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"...
$ Hours : POSIXct, format: "1899-12-31 00:00:00" "1899-12-31 01:00:00" "1899-12-31 02:00:00" "1899-12-31 03:00:00" ...
$ Datetime : chr "2019-01-01 00:00:00" "2019-01-01 01:00:00" "2019-01-01 02:00:00" "2019-01-01 03:00:00" ...
$ DK1 : num 211.5 75.2 -30.5 -74 -55.3 ...
This is more of a data import problem, and the Datetime concatenation done in Excel can be performed in R instead. Generally it's simpler to have all data manipulation performed in a single place.
library(readxl)
library(xts)
DK1 <- read_excel("DK1.xlsx")
# paste date and time together in a new column (Datetime2) for comparison
# note the use of strftime to drop the spurious 1899-12-31 date discussed earlier
DK1$Datetime2 <- paste(DK1$Date, strftime(DK1$Hours, "%H:%M:%S", tz = "UTC"))
# the / separators used in Excel must become - to match how the strings appear in R
DK1$time_index <- as.POSIXct(DK1$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())
# drop the NA produced for 2019-03-10 02:00:00, the hour skipped when daylight saving time began
DK1 <- DK1[!is.na(DK1$time_index), ]
DK1a <- xts(DK1[, "DK1"], order.by = DK1$time_index)
> head(DK1a)
DK1
2019-01-01 00:00:00 211.48
2019-01-01 01:00:00 75.20
2019-01-01 02:00:00 -30.47
2019-01-01 03:00:00 -74.00
2019-01-01 04:00:00 -55.33
2019-01-01 05:00:00 -93.72
We can select the numeric column and then order.by the 'Date' column, which is already a date-time class:
library(xts)
xts(DK1$DK1, order.by = DK1$Date)
As the column is already in the default format, we don't have to specify the format.
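The import issue discussed above can be reproduced without the spreadsheet. The sketch below uses small stand-in vectors (hypothetical values shaped like the str() output, assumed to be POSIXct in UTC) and builds the index in UTC so that no hour can fall into a daylight saving gap. Whether UTC is the right zone for the real data is an assumption to verify.

```r
# Stand-ins for the columns read_excel produced in the question
date  <- as.POSIXct(rep("2019-01-01", 3), tz = "UTC")
hours <- as.POSIXct(c("1899-12-31 00:00:00", "1899-12-31 01:00:00",
                      "1899-12-31 02:00:00"), tz = "UTC")
price <- c(211.48, 75.20, -30.47)

# Keep the date part of one column and the time-of-day part of the other
stamp <- paste(format(date, "%Y-%m-%d"), format(hours, "%H:%M:%S", tz = "UTC"))
time_index <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Check the index before building the series: NA or duplicated
# timestamps usually point to a parsing problem
ok <- !anyNA(time_index) && anyDuplicated(time_index) == 0
ok

# With xts installed: xts::xts(price, order.by = time_index)
```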

I would like to extract the time from a character vector [duplicate]

This question already has answers here:
Convert date-time string to class Date
(4 answers)
Closed 3 years ago.
I have date-time stamps stored as a character vector:
"2018-12-13 11:00:01 EST" "2018-10-23 22:00:01 EDT" "2018-11-03 14:15:00 EDT" "2018-10-04 19:30:00 EDT" "2018-11-10 17:15:31 EST" "2018-10-05 13:30:00 EDT"
How can I strip the time from this character vector?
PS: I have tried using strptime, but I am getting NA values as a result.
It's a bit unclear whether you want the date or time but if you want the date then as.Date ignores any junk after the date so:
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")
as.Date(x)
## [1] "2018-12-13" "2018-10-23"
would be sufficient to get a Date vector from the input vector x. No packages are used.
If you want the time then:
read.table(text = x, as.is = TRUE)[[2]]
## [1] "11:00:01" "22:00:01"
If you want a data frame with each part in a separate column then:
read.table(text = x, as.is = TRUE, col.names = c("date", "time", "tz"))
## date time tz
## 1 2018-12-13 11:00:01 EST
## 2 2018-10-23 22:00:01 EDT
I think the OP wants to extract the time from date-time variable (going by the title of the question).
x <- "2018-12-13 11:00:01 EST"
format(strptime(x, "%Y-%m-%d %H:%M:%S"), "%H:%M:%S")
[1] "11:00:01"
Another option:
library(lubridate)
format(ymd_hms(x, tz = "EST"), "%H:%M:%S")
[1] "11:00:01"
The package lubridate makes everything like this easy:
library(lubridate)
x <- "2018-12-13 11:00:01 EST"
as_date(ymd_hms(x))
You can use the as.Date function and specify the format:
> as.Date("2018-12-13 11:00:01 EST", format="%Y-%m-%d")
[1] "2018-12-13"
If all values are in a vector:
x = c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
"2018-11-03 14:15:00 EDT", "2018-10-04 19:30:00 EDT",
"2018-11-10 17:15:31 EST", "2018-10-05 13:30:00 EDT")
> as.Date(x, format="%Y-%m-%d")
[1] "2018-12-13" "2018-10-23" "2018-11-03" "2018-10-04" "2018-11-10"
[6] "2018-10-05"
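For the time-of-day part, two base R routes are sketched below. Both treat the trailing zone abbreviation as text to be ignored. Using tz = "UTC" in the second option is only a neutral container chosen for this sketch, so that no wall-clock time can land in a daylight saving gap; it is not a claim about the data's real zone.

```r
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")

# Option 1: pull the HH:MM:SS token out with a regular expression
times_regex <- regmatches(x, regexpr("[0-9]{2}:[0-9]{2}:[0-9]{2}", x))

# Option 2: parse, then format; input beyond the format string is ignored
parsed <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
times_parsed <- format(parsed, "%H:%M:%S")

# Both routes return the same character vector
identical(times_regex, times_parsed)
```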

change timezone for some POSIXct entries in a data frame in R

I am having difficulty changing time zones for a POSIXct object. Following the suggestion in:
Change timezone in a POSIXct object
I tried
> test
timestamp dttm_utc value estimated anomaly SITE_ID
954157 1328043600 2012-02-01 00:00:00 16.4803 0 NA 31
954158 1328043900 2012-02-01 00:05:00 16.4364 0 NA 31
> attributes(test[2,2])$tzone
TIME_ZONE
"America/New_York"
> attributes(test[2,2])$tzone <- "America/Los_Angeles"
> attributes(test[2,2])$tzone
TIME_ZONE
"America/New_York"
Why does this not work? How can I solve this problem?
The problem is that tzone is a property of the entire vector; individual elements cannot have their own time zone. You can only change the time zone for the entire vector. Consider this example:
x<-as.POSIXct(c("2012-02-01 00:00:00","2012-02-01 00:05:00"), tz="America/New_York")
attributes(x[1])$tzone
# [1] "America/New_York"
# does not change
attributes(x[1])$tzone<-"America/Los_Angeles"
attributes(x[1])$tzone
# [1] "America/New_York"
#changes
attributes(x)$tzone<-"America/Los_Angeles"
attributes(x[1])$tzone
# [1] "America/Los_Angeles"
If you have dates from different time zones, you can give each one a UTC offset when parsing; they will then all be converted to a common time zone:
x<-as.POSIXct(c("2012-02-01 00:00:00-0800","2012-02-01 00:05:00-0500"),
format="%Y-%m-%d %H:%M:%S%z", tz="America/Los_Angeles")
# [1] "2012-02-01 00:00:00 PST" "2012-01-31 21:05:00 PST"
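If each row really needs to be shown in its own zone, one option is to keep the instants in a single POSIXct vector and produce per-row text only for display. A minimal sketch, where the zones vector is a hypothetical per-row lookup (for example one zone per SITE_ID):

```r
# Two instants stored in one POSIXct vector (a single tzone attribute)
x <- as.POSIXct(c("2012-02-01 05:00:00", "2012-02-01 05:05:00"), tz = "UTC")

# Hypothetical per-row zones
zones <- c("America/New_York", "America/Los_Angeles")

# Changing the attribute changes how the vector prints, not the instants
y <- x
attr(y, "tzone") <- "America/Los_Angeles"
all(as.numeric(x) == as.numeric(y))

# Per-element zones are only possible as text: format each value separately
local_text <- vapply(
  seq_along(x),
  function(i) format(x[i], format = "%Y-%m-%d %H:%M:%S", tz = zones[i]),
  character(1)
)
local_text
# [1] "2012-02-01 00:00:00" "2012-01-31 21:05:00"
```

The result is a character vector, so it is suitable for reporting but not for further date arithmetic; arithmetic should stay on the original POSIXct column.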
