Bizarre as.POSIXct error [duplicate] - r

This question already has an answer here:
R: strptime() and is.na () unexpected results
(1 answer)
Closed 8 years ago.
I've encountered the following error when converting a set of dates in character format to a POSIXct object.
Example Data:
t<-c("3/11/2007 1:30", "3/11/2007 2:00", "4/11/2007 2:00")
str(t)
chr [1:3] "3/11/2007 1:30" "3/11/2007 2:00" "4/11/2007 2:00"
z<-as.POSIXct(strptime(t, format ="%m/%d/%Y %H:%M"))
z
"2007-03-11 01:30:00 MST" NA "2007-04-11 02:00:00 MDT"
str(z)
POSIXct[1:3], format: "2007-03-11 01:30:00" NA "2007-04-11 02:00:00"
My question is: why is NA returned for the second date in z? I have a dataset that contains 8 years of hourly data (from which I copied the dates above), and this NA pops up only for dates between 3/8 and 3/14, and only when the hour is 02:00:00.
I do not encounter the problem if the dates are converted to POSIXlt, so that is my current workaround.
Any thoughts?

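The time 02:00 on 3/11/2007 does not exist in a US time zone that observes daylight saving time, because the clocks jump from 02:00 straight to 03:00 on that date (note the switch from MST to MDT in your output). Try using a time zone that does not observe daylight saving time: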
as.POSIXct(t, format = "%m/%d/%Y %H:%M", tz = "GMT")
## [1] "2007-03-11 01:30:00 GMT" "2007-03-11 02:00:00 GMT" "2007-04-11 02:00:00 GMT"
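To find which timestamps fall into the daylight saving gap, one option is to parse the same strings twice and compare. This is only a sketch: the zone "America/Denver" is an assumption based on the MST/MDT labels in the question, and how a nonexistent local time is handled depends on the platform, so the local parse may or may not return NA on your system.

```r
# Timestamps from the question
x <- c("3/11/2007 1:30", "3/11/2007 2:00", "4/11/2007 2:00")
fmt <- "%m/%d/%Y %H:%M"

# UTC has no daylight saving transitions, so every well-formed string parses
utc <- as.POSIXct(x, format = fmt, tz = "UTC")

# Assumed local zone (MST/MDT); a time inside the spring-forward gap
# may come back as NA here, depending on the platform
mountain <- as.POSIXct(x, format = fmt, tz = "America/Denver")

# Strings that parse in UTC but not locally are candidates for the gap
in_gap <- !is.na(utc) & is.na(mountain)
x[in_gap]
```

If the data were logged in local standard time all year, parsing in a fixed-offset zone and keeping every row is usually preferable to dropping the flagged values.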

Related

Converting long integer into date and time in r [duplicate]

This question already has answers here:
Convert integer to class Date
(3 answers)
Closed 1 year ago.
I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as follows:
"2020-10-19 20:20"
So far I have tried this but cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year (%Y), month (%m), or day (%d).
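Putting the pieces together in base R, a minimal sketch might look like the following. The use of sprintf() to avoid scientific notation and the choice of tz = "UTC" are assumptions for illustration; substitute the time zone the values were recorded in.

```r
z <- 20201019083000

# Convert the number to text without scientific notation
z_chr <- sprintf("%.0f", z)

# Parse all six fields; UTC is an assumed time zone for this sketch
parsed <- as.POSIXct(z_chr, format = "%Y%m%d%H%M%S", tz = "UTC")

# Drop the seconds for display
out <- format(parsed, "%Y-%m-%d %H:%M")
out
```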

R: xts conversion problem (Add x's at the index row)

This is my setup: I have an excel-file with hourly electricity prices. I want to index them by the hourly interval, file here: Data. I load the data the usual way.
library(readxl)
library(tidyverse)
rm(list = ls())
DK1 <- read_excel("DK1.xlsx")
time_index <- as.POSIXct(DK1$Datetime, format="%Y/%m/%d %H:%M:%S", tz=Sys.timezone())
test <- xts(DK1[,-1], order.by = time_index)
This is just one of many ways I've tried to index it in XTS to no avail. The index row looks wrong and I do not know what to do.
UPDATE 1: dput(head(DK1))
It appears that read_excel is converting your time column into a datetime, but with all the dates set to "1899-12-31". This can be seen by running:
> str(DK1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 8760 obs. of 6 variables:
$ Date : POSIXct, format: "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"...
$ Hours : POSIXct, format: "1899-12-31 00:00:00" "1899-12-31 01:00:00" "1899-12-31 02:00:00" "1899-12-31 03:00:00" ...
$ Datetime : chr "2019-01-01 00:00:00" "2019-01-01 01:00:00" "2019-01-01 02:00:00" "2019-01-01 03:00:00" ...
$ DK1 : num 211.5 75.2 -30.5 -74 -55.3 ...
This is more of a data import problem: the date-time concatenation done in Excel can be performed in R instead. Generally it's simpler to have all data manipulation performed in one place.
library(readxl)
library(xts)
DK1 <- read_excel("DK1.xlsx")
# pasting date and time together in new column name for comparison
# note the use of strftime to remove the date information discussed earlier
DK1$Datetime2 <- paste(DK1$Date, strftime(DK1$Hours, "%H:%M:%S", tz = "UTC"))
# Excel displays the date with "/" separators, but R shows it with "-", so the format string must use "-"
DK1$time_index <- as.POSIXct(DK1$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())
# drop the NA produced for 2019-03-10 02:00:00, a local time skipped when daylight saving time started
DK1 <- DK1[!is.na(DK1$time_index), ]
DK1a <- xts(DK1[, "DK1"], order.by = DK1$time_index)
> head(DK1a)
DK1
2019-01-01 00:00:00 211.48
2019-01-01 01:00:00 75.20
2019-01-01 02:00:00 -30.47
2019-01-01 03:00:00 -74.00
2019-01-01 04:00:00 -55.33
2019-01-01 05:00:00 -93.72
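The combine-then-parse step can be tried in isolation on a small stand-in data frame. In this sketch, dk is hypothetical data shaped like the str() output above, and tz = "UTC" is an assumed choice that sidesteps the daylight saving gap, so no rows need to be dropped.

```r
# Hypothetical stand-in for the imported sheet
dk <- data.frame(
  Date  = as.POSIXct(c("2019-01-01", "2019-01-01"), tz = "UTC"),
  Hours = as.POSIXct(c("1899-12-31 00:00:00", "1899-12-31 01:00:00"), tz = "UTC"),
  DK1   = c(211.48, 75.20)
)

# Keep only the date part of Date and only the time part of Hours
stamp <- paste(format(dk$Date, "%Y-%m-%d"), format(dk$Hours, "%H:%M:%S"))

# Parse the combined string; UTC has no skipped hours
time_index <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
time_index
```

The resulting time_index could then be passed to xts(dk$DK1, order.by = time_index).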
We can select the numeric column and then order.by the 'Date' column, which is already a date-time class:
library(xts)
xts(DK1$DK1, order.by = DK1$Date)
Because the column is already in the default format, we don't have to specify a format.

I would like to extract the time from a character vector [duplicate]

This question already has answers here:
Convert date-time string to class Date
(4 answers)
Closed 3 years ago.
I have a date-time stamp stored as a character variable:
"2018-12-13 11:00:01 EST" "2018-10-23 22:00:01 EDT" "2018-11-03 14:15:00 EDT" "2018-10-04 19:30:00 EDT" "2018-11-10 17:15:31 EST" "2018-10-05 13:30:00 EDT"
How can I strip the time from this character vector?
PS: I have tried using strptime, but I am getting NA values as a result.
It's a bit unclear whether you want the date or time but if you want the date then as.Date ignores any junk after the date so:
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")
as.Date(x)
## [1] "2018-12-13" "2018-10-23"
would be sufficient to get a Date vector from the input vector x. No packages are used.
If you want the time then:
read.table(text = x, as.is = TRUE)[[2]]
## [1] "11:00:01" "22:00:01"
If you want a data frame with each part in a separate column then:
read.table(text = x, as.is = TRUE, col.names = c("date", "time", "tz"))
## date time tz
## 1 2018-12-13 11:00:01 EST
## 2 2018-10-23 22:00:01 EDT
I think the OP wants to extract the time from date-time variable (going by the title of the question).
x <- "2018-12-13 11:00:01 EST"
format(strptime(x, "%Y-%m-%d %H:%M:%S"), "%H:%M:%S")
[1] "11:00:01"
Another option:
library(lubridate)
format(ymd_hms(x, tz = "EST"), "%H:%M:%S")
[1] "11:00:01"
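If only the clock time is needed as text, it can also be pulled out of the string directly, without parsing the date or the time zone label. This is a sketch that assumes every element contains exactly one HH:MM:SS field.

```r
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")

# Find the first HH:MM:SS pattern in each string and extract it
times <- regmatches(x, regexpr("[0-9]{2}:[0-9]{2}:[0-9]{2}", x))
times
```

Note that regmatches() silently drops elements with no match, so the result can be shorter than the input if some strings lack a time.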
The package lubridate makes everything like this easy:
library(lubridate)
x <- "2018-12-13 11:00:01 EST"
as_date(ymd_hms(x))
You can use the as.Date function and specify the format
> as.Date("2018-12-13 11:00:01 EST", format="%Y-%m-%d")
[1] "2018-12-13"
If all values are in a vector:
x = c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
"2018-11-03 14:15:00 EDT", "2018-10-04 19:30:00 EDT",
"2018-11-10 17:15:31 EST", "2018-10-05 13:30:00 EDT")
> as.Date(x, format="%Y-%m-%d")
[1] "2018-12-13" "2018-10-23" "2018-11-03" "2018-10-04" "2018-11-10"
[6] "2018-10-05"

Formatting inconsistent datetime variable [duplicate]

This question already has answers here:
Convert dd/mm/yy and dd/mm/yyyy to Dates
(6 answers)
Closed 5 years ago.
I have a large dataset (a few million observations) that contains a datetime variable with inconsistent formats: "%Y-%m-%d %H:%M:%S" and "%m/%d/%Y %H:%M:%S", with some of the latter missing the seconds.
Here is what the dataset looks like:
df <- data.frame(var1 = c(1:6),
var2 = c("A", "B", "C", "A", "B", "C"),
datetime = c("2013-07-01 00:00:02", "2016-07-01 00:00:01",
"9/2/2014 00:01:20", "9/1/2014 00:00:25",
"1/1/2015 0:07", "6/1/2015 0:01"))
Is there an efficient way to convert the datetime variable into a single, consistent date-time format?
You can use the lubridate package like this.
lubridate::parse_date_time(x = df$datetime, c("ymd HMS","mdy HMS"))
[1] "2013-07-01 00:00:02 UTC" "2016-07-01 00:00:01 UTC" "2014-09-02 00:01:20 UTC"
[4] "2014-09-01 00:00:25 UTC" NA NA
Warning message:
2 failed to parse.
lubridate::parse_date_time(x = df$datetime, c("ymd HMS","mdy HMS","mdy HM"))
[1] "2013-07-01 00:00:02 UTC" "2016-07-01 00:00:01 UTC" "2014-09-02 00:01:20 UTC"
[4] "2014-09-01 00:00:25 UTC" "2015-01-01 00:07:00 UTC" "2015-06-01 00:01:00 UTC"
You can specify your date-time formats as needed; compare the two examples above.
Hope this helps you. :)
POSIXct solution using parse_date_time.
EDIT: incorporating @Akarsh Jain's POSIXct formatting for better time alignment.
df$new_date <- lubridate::parse_date_time(df$datetime, c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S", "%m/%d/%Y %H:%M"))
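The same idea can be sketched in base R by trying each format in turn and filling in only the values that are still NA. This is an illustration rather than a tuned solution for millions of rows; the tz = "UTC" choice and the shortened input vector are assumptions.

```r
datetime <- c("2013-07-01 00:00:02", "9/2/2014 00:01:20", "1/1/2015 0:07")

# Order matters: strptime ignores trailing text, so the format with
# seconds must be tried before the one without
formats <- c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S", "%m/%d/%Y %H:%M")

# Start with an all-NA POSIXct vector of the right length
parsed <- as.POSIXct(rep(NA_real_, length(datetime)),
                     origin = "1970-01-01", tz = "UTC")

for (fmt in formats) {
  todo <- is.na(parsed)  # values not yet matched by an earlier format
  parsed[todo] <- as.POSIXct(datetime[todo], format = fmt, tz = "UTC")
}
parsed
```

Each pass is vectorised over the remaining values, so the loop runs once per format rather than once per row.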

