Converting date and hour into xts in R

I have this table of consumption values. I am trying to convert the first two columns into a single xts date-time index.
1 01.01.2016 00:00:00 26.27724
2 01.01.2016 01:00:00 24.99182
3 01.01.2016 02:00:00 23.53261
4 01.01.2016 03:00:00 22.46478
5 01.01.2016 04:00:00 22.00291
6 01.01.2016 05:00:00 21.95708
7 01.01.2016 06:00:00 22.20354
8 01.01.2016 07:00:00 21.84416
I have tried the code below and got this error:
timestamp=format(as.POSIXct(paste(datecol,hourcol)), "%d/%m/%Y %H:%M:%S")
Error in as.POSIXlt.character(x, tz, ...) :
character string is not in a standard unambiguous format
The date column is character and the hour column is stored as double.

If you are trying to combine the date and time values into a timestamp, you can use as.POSIXct in base R, passing a format that matches the input strings:
df$timestamp <- as.POSIXct(paste(df$datecol,df$hourcol),
format = "%d.%m.%Y %T", tz = "UTC")
Or using lubridate
df$timestamp <- lubridate::dmy_hms(paste(df$datecol,df$hourcol))
Or using anytime
df$timestamp <- anytime::anytime(paste(df$datecol,df$hourcol))
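Since the question asks for an xts object, here is a minimal sketch of the whole path from the two columns to xts. The column names datecol and hourcol come from the question, while consumption is an assumed name for the third column, and the xts package must be installed. The original call failed because as.POSIXct() without a format argument only tries ISO-like layouts such as 2016-01-01 00:00:00, which 01.01.2016 does not match. If hourcol really is numeric rather than text, it would need to be turned into a time string first; that step is not shown.

```r
library(xts)

# Toy data mirroring the question; "consumption" is an assumed column name
df <- data.frame(
  datecol = rep("01.01.2016", 3),
  hourcol = c("00:00:00", "01:00:00", "02:00:00"),
  consumption = c(26.27724, 24.99182, 23.53261),
  stringsAsFactors = FALSE
)

# Parse with a format that describes the input strings (day.month.year)
df$timestamp <- as.POSIXct(paste(df$datecol, df$hourcol),
                           format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

# Build the xts object: data first, time index supplied via order.by
consumption_xts <- xts(df$consumption, order.by = df$timestamp)
consumption_xts
```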

Related

NA errors when converting datetime column in R

I'm trying to convert a column to a date-time format in R. I've tried the following conversion but it fills my output as NA:
migtimes$mig_start<- format(migtimes$mig_start, "%Y-%m-%d %H:%M:%S")
migtimes$mig_start<-strptime(x = as.character(migtimes$mig_start), format = "%Y-%m-%d %H:%M:%S")
migtimes$mig_start <- as.POSIXct(strptime(migtimes$mig_start , format = "%Y-%m-%d %H:%M:%S"), tz ="MST")
migtimes$mig_start<- strptime(x = as.character(migtimes$mig_start),
format = "%Y-%m-%d %H:%M:%S")
ymd_hms( as.character(migtimes$mig_start),tz ="MST" )
For the ymd_hms conversion I also get an NA error:
Warning message:
All formats failed to parse. No formats found.
Here's what my dataframe looks like. When I read in my csv file it says the mig_start (which is my date field) is a factor. I want to convert this field to a 2018-12-13 22:00:00 format. I'm at a loss of what else I can try. Any suggestions?
X mig_start
1 3/20/2019 11:00
2 4/3/2019 15:00
3 3/17/2019 22:00
4 3/6/2019 12:00
5 3/6/2019 12:00
6 5/3/2019 5:01
I think it's just a matter of the format string you provided. You want it to match the strings you are converting, not the format you want the dates to print with. Try this:
migtimes <- data.frame(
X = 1:6,
mig_start = c("3/20/2019 11:00", "4/3/2019 15:00", "3/17/2019 22:00",
"3/6/2019 12:00", "3/6/2019 12:00", "5/3/2019 5:01")
)
# "UTC" is only a placeholder; use whatever time zone is appropriate for you
migtimes$mig_start <- as.POSIXct(migtimes$mig_start, format = "%m/%d/%Y %H:%M",
tz = "UTC")
You could also try as.POSIXlt instead of as.POSIXct, whichever you're more comfortable dealing with.
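To make the difference between the parsing format and the printing format concrete, here is a small base R sketch. It assumes the column arrived as a factor, as described in the question, and uses "UTC" purely as a placeholder time zone.

```r
# mig_start as a factor, the way the question describes it after import
mig_start <- factor(c("3/20/2019 11:00", "4/3/2019 15:00", "5/3/2019 5:01"))

# Step 1: parse with a format that describes the INPUT strings
parsed <- as.POSIXct(as.character(mig_start),
                     format = "%m/%d/%Y %H:%M", tz = "UTC")

# Step 2: choose the OUTPUT layout only when printing
printed <- format(parsed, "%Y-%m-%d %H:%M:%S")
printed
# [1] "2019-03-20 11:00:00" "2019-04-03 15:00:00" "2019-05-03 05:01:00"
```

Keeping the parsed POSIXct column for calculations and calling format() only for display avoids turning the dates back into text too early.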
Your format string is wrong. You have month/day/year hour:minute.
Using lubridate, you can use mdy_hm():
library(lubridate)
library(dplyr)
migtimes<- migtimes %>%
mutate(dt = mdy_hm(mig_start))
Result:
X mig_start dt
1 1 3/20/2019 11:00 2019-03-20 11:00:00
2 2 4/3/2019 15:00 2019-04-03 15:00:00
3 3 3/17/2019 22:00 2019-03-17 22:00:00
4 4 3/6/2019 12:00 2019-03-06 12:00:00
5 5 3/6/2019 12:00 2019-03-06 12:00:00
6 6 5/3/2019 5:01 2019-05-03 05:01:00
Data:
migtimes <- structure(list(X = 1:6,
mig_start = c("3/20/2019 11:00", "4/3/2019 15:00", "3/17/2019 22:00",
"3/6/2019 12:00", "3/6/2019 12:00", "5/3/2019 5:01")),
class = "data.frame", row.names = c(NA, -6L))

problem with hour interval in time series data in r

I am new to R and I am encountering a problem with historical hourly electric load data that I have downloaded. My goal is to make a load forecast based on an ARIMA model and/or artificial neural networks.
The problem is that the data is in the following Date-time (hourly) format:
#> DateTime Day_ahead_Load Actual_Load
#> [1,] "01.01.2015 00:00 - 01.01.2015 01:00" "6552" "6100"
#> [2,] "01.01.2015 01:00 - 01.01.2015 02:00" "6140" "5713"
#> [3,] "01.01.2015 02:00 - 01.01.2015 03:00" "5950" "5553"
I have tried to make a POSIXct object but it didn't work:
as.Date.POSIXct(DateTime, format = "%d-%m-%Y %H:%M:%S", tz="EET", usetz=TRUE)
The message I get is that it is not in an unambiguous format. I would really appreciate your feedback on this.
Thank you in advance.
Best Regards,
Iro
You have 2 major problems. First, your DateTime column contains two dates, so you need to split that column into two. Second, your format argument has - characters but your date has . characters.
We can use separate from tidyr and mutate with across to change the columns to POSIXct.
library(dplyr)
library(tidyr)
# Assumes data is a data frame; usetz belongs to format(), not as.POSIXct(), so it is dropped
data %>%
separate(DateTime, c("StartDateTime","EndDateTime"), " - ") %>%
mutate(across(c("StartDateTime","EndDateTime"),
~ as.POSIXct(., format = "%d.%m.%Y %H:%M",
tz = "EET")))
StartDateTime EndDateTime Day_ahead_Load Actual_Load
1 2015-01-01 00:00:00 2015-01-01 01:00:00 6552 6100
2 2015-01-01 01:00:00 2015-01-01 02:00:00 6140 5713
3 2015-01-01 02:00:00 2015-01-01 03:00:00 5950 5553
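If you would rather avoid extra packages, the same split can be sketched in base R. This assumes every interval string uses " - " as the separator, and it works on a small stand-in vector rather than the original data.

```r
# Stand-in for the DateTime column from the question
DateTime <- c("01.01.2015 00:00 - 01.01.2015 01:00",
              "01.01.2015 01:00 - 01.01.2015 02:00",
              "01.01.2015 02:00 - 01.01.2015 03:00")

# Split each interval string into its start and end parts (one row per interval)
parts <- do.call(rbind, strsplit(DateTime, " - ", fixed = TRUE))

# Parse both parts with a format that uses dots, matching the data
start <- as.POSIXct(parts[, 1], format = "%d.%m.%Y %H:%M", tz = "EET")
end   <- as.POSIXct(parts[, 2], format = "%d.%m.%Y %H:%M", tz = "EET")

# Sanity check: every interval should be one hour long
interval_hours <- as.numeric(difftime(end, start, units = "hours"))
interval_hours
```

For forecasting, the start column alone is usually enough as the time index, since the intervals are all the same length.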

CSV date format changes after import into R

I tried to import csv with date format:
3/1/2017 0:00
3/1/2017 1:00
3/1/2017 2:00
3/1/2017 3:00
3/1/2017 4:00
3/1/2017 5:00
into R; however, the dates appear in R as:
2017-03-01 00:00:00 2017-03-01 01:00:00 2017-03-01 02:00:00 2017-03-01 03:00:00 2017-03-01 04:00:00 2017-03-01 05:00:00
How can I read csv into R as the original format without changing anything?
It is in the "original" format, in the sense that you're probably looking at a POSIXct or POSIXlt object. You can reformat dates and datetimes using format() or strftime(), but this will render them character.
So as long as you're working with the datetime objects, just leave it as is. If you need to report, you can use any of the aforementioned functions to format the string:
x <- "3/1/2017 3:00"
# Month comes first here, matching the 2017-03-01 values shown in the question
x1 <- as.POSIXct(x, format = "%m/%d/%Y %H:%M")
x1
# [1] "2017-03-01 03:00:00 CET"
strftime(x1, format = "%m/%d/%Y %H:%M")
# [1] "03/01/2017 03:00"
format(x1, format = "%m/%d/%Y %H:%M")
# [1] "03/01/2017 03:00"
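If the goal really is to keep the strings exactly as they appear in the file, one option is to read the column as character so that no date parsing happens at all. A minimal sketch, assuming base R's read.csv() and hypothetical column names timestamp and value; the inline text stands in for the real file.

```r
# Inline stand-in for the CSV file; the column names are hypothetical
csv_text <- "timestamp,value
3/1/2017 0:00,1
3/1/2017 1:00,2"

# colClasses = "character" keeps every column as the raw text from the file
raw <- read.csv(text = csv_text, colClasses = "character")
raw$timestamp
# [1] "3/1/2017 0:00" "3/1/2017 1:00"
```

The trade-off is that character columns cannot be sorted or differenced as times, so this only makes sense when the values are passed through unchanged.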

dateTime with different format in data frame

I have imported some data into R, which looks like the following:
dateTime temp
1 10/25/2005 12:00:00 15.50
2 10/25/2005 1:00:00 15.49
3 10/25/2005 2:00:00 15.52
4 10/25/2005 3:00:00 15.50
5 10/25/2005 4:00:00 15.50
6 10/25/2005 5:00:00 15.46
where the class of the dateTime column of the data.frame is factor and the second column is numeric.
I try to convert the dateTime into POSIXct format as follows:
dat[,1] <- as.POSIXct(dat[,1])
but receive the error
Error in as.POSIXlt.character(as.character(x), ...) :
character string is not in a standard unambiguous format
which I think is to do with the dateTime varying in the format that hour is presented e.g. 12, 1, 2 etc and not 12, 01, 02.
How can I change this to POSIXct?
You need to specify the format:
datetime <- factor("10/25/2005 12:00:00")
as.POSIXct(datetime)
#Error in as.POSIXlt.character(as.character(x), ...) :
# character string is not in a standard unambiguous format
as.POSIXct(datetime, format="%m/%d/%Y %H:%M:%S")
#[1] "2005-10-25 12:00:00 CEST"
Note: I advise you to always specify a time zone explicitly when creating datetime variables. Otherwise, you could get into trouble with daylight saving time.
as.POSIXct(datetime, format="%m/%d/%Y %H:%M:%S", tz="GMT")
#[1] "2005-10-25 12:00:00 GMT"
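To see why the time zone matters, the sketch below parses the same clock time under two zones and compares the resulting instants. "Europe/Berlin" is an assumed example zone, chosen only because the CEST output above suggests a Central European locale.

```r
datetime <- "10/25/2005 12:00:00"
fmt <- "%m/%d/%Y %H:%M:%S"

utc    <- as.POSIXct(datetime, format = fmt, tz = "UTC")
berlin <- as.POSIXct(datetime, format = fmt, tz = "Europe/Berlin")

# Same text, different instants: Berlin was on summer time (UTC+2) on that day
offset_hours <- as.numeric(difftime(utc, berlin, units = "hours"))
offset_hours
# [1] 2
```

Without an explicit tz, the result depends on the machine running the code, which is how the same script can produce shifted times on another computer.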

obtain hour from DateTime vector

I have a DateTime vector within a data.frame where the data frame is made up of 8760 observations representing hourly intervals throughout the year e.g.
2010-01-01 00:00
2010-01-01 01:00
2010-01-01 02:00
2010-01-01 03:00
and so on.
I would like to create a data.frame which has the original DateTime vector as the first column and then the hourly values in the second column e.g.
2010-01-01 00:00 00:00
2010-01-01 01:00 01:00
How can this be achieved?
Use format or strptime to extract the time information.
Create a POSIXct vector:
x <- seq(as.POSIXct("2012-05-21"), by=("+1 hour"), length.out=5)
Extract the time:
data.frame(
date=x,
time=format(x, "%H:%M")
)
date time
1 2012-05-21 00:00:00 00:00
2 2012-05-21 01:00:00 01:00
3 2012-05-21 02:00:00 02:00
4 2012-05-21 03:00:00 03:00
5 2012-05-21 04:00:00 04:00
If the input vector is a character vector, then you have to convert to POSIXct first:
Create some data
dat <- data.frame(
DateTime=format(seq(as.POSIXct("2012-05-21"), by=("+1 hour"), length.out=5), format="%Y-%m-%d %H:%M")
)
dat
DateTime
1 2012-05-21 00:00
2 2012-05-21 01:00
3 2012-05-21 02:00
4 2012-05-21 03:00
5 2012-05-21 04:00
Split time out:
data.frame(
DateTime=dat$DateTime,
time=format(as.POSIXct(dat$DateTime, format="%Y-%m-%d %H:%M"), format="%H:%M")
)
DateTime time
1 2012-05-21 00:00 00:00
2 2012-05-21 01:00 01:00
3 2012-05-21 02:00 02:00
4 2012-05-21 03:00 03:00
5 2012-05-21 04:00 04:00
Or generically, not treating them as dates, you can use the following provided that the time and dates are padded correctly.
library(stringr)
df <- data.frame(DateTime = c("2010-01-01 00:00", "2010-01-01 01:00", "2010-01-01 02:00", "2010-01-01 03:00"))
df <- data.frame(df, Time = str_sub(df$DateTime, -5, -1))
It depends on your needs really.
Using lubridate
library(stringr)
library(lubridate)
library(plyr)
df <- data.frame(DateTime = c("2010-01-01 00:00", "2010-01-01 01:00", "2010-01-01 02:00", "2010-01-01 03:00"))
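# Pad hour and minute on the left so that 5 becomes "05" (padding on the right would give "50")
df <- mutate(df, DateTime = ymd_hm(DateTime),
time = str_c(str_pad(hour(DateTime), 2, side = 'left', pad = '0'), str_pad(minute(DateTime), 2, side = 'left', pad = '0'), sep = ':'))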
On a more general note, for anyone who comes here from Google and wants to group by hour:
The key here is lubridate::hour(datetime), which returns the hour of day as a number.
See p. 22 of the CRAN documentation: https://cran.r-project.org/web/packages/lubridate/lubridate.pdf
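As a rough illustration of grouping by hour, the sketch below extracts the hour with lubridate::hour() and averages a value per hour using base R's aggregate(). The data frame readings and its load column are made up for the example.

```r
library(lubridate)

# Hypothetical half-hourly readings covering three hours
readings <- data.frame(
  DateTime = as.POSIXct("2010-01-01 00:00", tz = "UTC") + (0:5) * 1800,
  load = c(10, 12, 20, 22, 30, 32)
)

# hour() extracts the hour of day (0-23) from each timestamp
readings$hour <- hour(readings$DateTime)

# Mean load per hour of day
hourly_mean <- aggregate(load ~ hour, data = readings, FUN = mean)
hourly_mean
```

The same grouping column could be passed to dplyr::group_by() instead of aggregate(); the choice here is only to keep the dependencies small.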

Resources