NA errors when converting datetime column in R

I'm trying to convert a column to a date-time format in R. I've tried the following conversion but it fills my output as NA:
migtimes$mig_start<- format(migtimes$mig_start, "%Y-%m-%d %H:%M:%S")
migtimes$mig_start<-strptime(x = as.character(migtimes$mig_start), format = "%Y-%m-%d %H:%M:%S")
migtimes$mig_start <- as.POSIXct(strptime(migtimes$mig_start , format = "%Y-%m-%d %H:%M:%S"), tz ="MST")
migtimes$mig_start<- strptime(x = as.character(migtimes$mig_start),
format = "%Y-%m-%d %H:%M:%S")
ymd_hms( as.character(migtimes$mig_start),tz ="MST" )
For the ymd_hms conversion I also get an NA error:
Warning message:
All formats failed to parse. No formats found.
Here's what my dataframe looks like. When I read in my csv file it says the mig_start (which is my date field) is a factor. I want to convert this field to a 2018-12-13 22:00:00 format. I'm at a loss of what else I can try. Any suggestions?
X mig_start
1 3/20/2019 11:00
2 4/3/2019 15:00
3 3/17/2019 22:00
4 3/6/2019 12:00
5 3/6/2019 12:00
6 5/3/2019 5:01

I think it's just a matter of the format string you provided. You want it to match the strings you are converting, not the format you want the dates to print with. Try this:
migtimes <- data.frame(
X = 1:6,
mig_start = c("3/20/2019 11:00", "4/3/2019 15:00", "3/17/2019 22:00",
"3/6/2019 12:00", "3/6/2019 12:00", "5/3/2019 5:01")
)
migtimes$mig_start <- as.POSIXct(migtimes$mig_start, format = "%m/%d/%Y %H:%M",
tz = "MST") # replace "MST" with whatever time zone is appropriate for you
You could also try as.POSIXlt instead of as.POSIXct, whichever you're more comfortable dealing with.
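To make the distinction between the two kinds of format string concrete, here is a minimal sketch. It assumes the column arrived as a factor (as the question says) and uses tz = "UTC" purely for illustration; the variable names are made up.

```r
# A factor, as read.csv may produce for a text column (values from the question).
raw <- factor(c("3/20/2019 11:00", "5/3/2019 5:01"))

# A format describing the desired OUTPUT does not match the input text, so parsing gives NA.
wrong <- as.POSIXct(as.character(raw), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# A format describing the INPUT parses correctly (tz = "UTC" is an assumption).
parsed <- as.POSIXct(as.character(raw), format = "%m/%d/%Y %H:%M", tz = "UTC")

# Only when printing do you choose the layout you want to see.
formatted <- format(parsed, "%Y-%m-%d %H:%M:%S")
print(formatted)
```

So the parse step uses the layout of the text you have, and format() is applied afterwards if you need a particular printed layout.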

Your format string is wrong. You have month/day/year hour:minute.
Using lubridate, you can use mdy_hm():
library(lubridate)
library(dplyr)
migtimes<- migtimes %>%
mutate(dt = mdy_hm(mig_start))
Result:
X mig_start dt
1 1 3/20/2019 11:00 2019-03-20 11:00:00
2 2 4/3/2019 15:00 2019-04-03 15:00:00
3 3 3/17/2019 22:00 2019-03-17 22:00:00
4 4 3/6/2019 12:00 2019-03-06 12:00:00
5 5 3/6/2019 12:00 2019-03-06 12:00:00
6 6 5/3/2019 5:01 2019-05-03 05:01:00
Data:
migtimes <- structure(list(X = 1:6,
mig_start = c("3/20/2019 11:00", "4/3/2019 15:00", "3/17/2019 22:00",
"3/6/2019 12:00", "3/6/2019 12:00", "5/3/2019 5:01")),
class = "data.frame", row.names = c(NA, -6L))

Related

Changing column data type from chr to date?

I have a dataset with 43,000 rows that looks like this:
dat <- data.frame(RecordNumber = 1:6,
date = c("2/2/21 14:20", "2/2/21 14:30", "2/2/21 14:40", "2/2/21 14:50", "2/2/21 15:00", "2/2/21 15:10"),
airTemp_C = c(2.4, -11.3, -15, -21.1, -8.5, 0.1))
# RecordNumber date airTemp_C
# 1 1 2/2/21 14:20 2.4
# 2 2 2/2/21 14:30 -11.3
# 3 3 2/2/21 14:40 -15.0
# 4 4 2/2/21 14:50 -21.1
# 5 5 2/2/21 15:00 -8.5
# 6 6 2/2/21 15:10 0.1
and I am trying to convert the date column from chr to date/datetime, because when I visualize the data with shiny/plotly the dates are out of order (maybe because it's sorting the date column alphabetically, since it's a chr datatype and not a date? Not sure, that's what I'm trying to figure out).
I've tried formatting the cells in Excel, I've tried lubridate's parse_date_time(), & base-R's strptime(), however the lists always return NA for all 43,000 of my date rows. Any ideas why this may be?
df$date <- strptime(df$date, format = "%m-%d-%y %H:%M:%S", tz="MST")
reassigns the entire date column as NA.
Your data has no seconds, so there should be no %S in the format, and the separator is / rather than -. It may also be better to use POSIXct:
dat$date <- as.POSIXct(dat$date, format = '%m/%d/%y %H:%M', tz = 'MST')
-output
dat$date
[1] "2021-02-02 14:20:00 MST" "2021-02-02 14:30:00 MST" "2021-02-02 14:40:00 MST" "2021-02-02 14:50:00 MST" "2021-02-02 15:00:00 MST"
[6] "2021-02-02 15:10:00 MST"
If we want to convert to Date class, it should be wrapped with as.Date
Or using lubridate
library(lubridate)
dat$date <- mdy_hm(dat$date)
data
dat <- structure(list(RecordNumber = 1:6, date = c("2/2/21 14:20", "2/2/21 14:30",
"2/2/21 14:40", "2/2/21 14:50", "2/2/21 15:00", "2/2/21 15:10"
), airTemp_C = c(2.4, -11.3, -15, -21.1, -8.5, 0.1)),
class = "data.frame", row.names = c(NA,
-6L))
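On the ordering problem mentioned in the question, a small sketch may help. The three values below are hypothetical, picked so that text order and time order disagree, and tz = "UTC" is an assumption. It also shows the as.Date wrapping mentioned above.

```r
# Hypothetical sample values in the same month/day/year layout as the question.
x <- c("2/10/21 09:00", "2/2/21 14:20", "12/1/20 08:00")

# Sorted as text, "2/10/21" comes before "2/2/21".
text_sorted <- sort(x, method = "radix")
print(text_sorted)

# Parsed as date-times, the values sort chronologically.
parsed <- as.POSIXct(x, format = "%m/%d/%y %H:%M", tz = "UTC")
print(sort(parsed))

# Wrapping in as.Date drops the time of day; tz is passed explicitly to avoid surprises.
dates <- as.Date(parsed, tz = "UTC")
print(dates)
```

Once the column is POSIXct (or Date), plotting libraries generally treat the axis as time rather than as categories, which is likely what fixes the ordering.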

Converting date and hour into xts R

I have this table of consumption values. I am trying to convert the first two columns into a single xts date format.
1 01.01.2016 00:00:00 26.27724
2 01.01.2016 01:00:00 24.99182
3 01.01.2016 02:00:00 23.53261
4 01.01.2016 03:00:00 22.46478
5 01.01.2016 04:00:00 22.00291
6 01.01.2016 05:00:00 21.95708
7 01.01.2016 06:00:00 22.20354
8 01.01.2016 07:00:00 21.84416
I tried the code below and got this error:
timestamp=format(as.POSIXct(paste(datecol,hourcol)), "%d/%m/%Y %H:%M:%S")
Error in as.POSIXlt.character(x, tz, ...) :
character string is not in a standard unambiguous format
The date is character and the hour is in double format.
If you were trying to combine date and time value to create timestamp, we can use as.POSIXct in base R.
df$timestamp <- as.POSIXct(paste(df$datecol,df$hourcol),
format = "%d.%m.%Y %T", tz = "UTC")
Or using lubridate
df$timestamp <- lubridate::dmy_hms(paste(df$datecol,df$hourcol))
Or using anytime
df$timestamp <- anytime::anytime(paste(df$datecol,df$hourcol))
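Since the goal was an xts object, here is a rough sketch of the last step. It assumes the xts package is installed, that both columns are text, and that the columns are named datecol, hourcol and consumption; those names are hypothetical, and only the values come from the question.

```r
library(xts)  # assumes the xts package is installed

# Hypothetical column names; the values are the first rows of the question's table.
df <- data.frame(
  datecol = c("01.01.2016", "01.01.2016", "01.01.2016"),
  hourcol = c("00:00:00", "01:00:00", "02:00:00"),
  consumption = c(26.27724, 24.99182, 23.53261)
)

# Build the timestamp with a format matching day.month.year (tz = "UTC" is an assumption).
timestamp <- as.POSIXct(paste(df$datecol, df$hourcol),
                        format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

# xts() takes the data and a time index via order.by.
consumption_xts <- xts(df$consumption, order.by = timestamp)
print(consumption_xts)
```

If the hour column really is numeric rather than text, it would need to be turned into an "HH:MM:SS" string (or added as seconds) before this step.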

CSV date format changes after import into R

I tried to import a CSV with this date format:
3/1/2017 0:00
3/1/2017 1:00
3/1/2017 2:00
3/1/2017 3:00
3/1/2017 4:00
3/1/2017 5:00
into R, but in R the dates appear as:
2017-03-01 00:00:00 2017-03-01 01:00:00 2017-03-01 02:00:00 2017-03-01 03:00:00 2017-03-01 04:00:00 2017-03-01 05:00:00
How can I read csv into R as the original format without changing anything?
It is in the "original" format, in the sense that you're probably looking at a POSIXct or POSIXlt object. You can reformat dates and datetimes using format() or strftime(), but this will render them character.
So as long as you're working with the datetime objects, just leave it as is. If you need to report, you can use any of the aforementioned functions to format the string:
x <- "3/1/2017 3:00"
x1 <- as.POSIXct(x, format = "%m/%d/%Y %H:%M")
x1
# [1] "2017-03-01 03:00:00 CET"
strftime(x1, format = "%m/%d/%Y %H:%M")
# [1] "03/01/2017 03:00"
format(x1, format = "%m/%d/%Y %H:%M")
# [1] "03/01/2017 03:00"
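If you really want the column kept exactly as it appears in the file, one option is to read it as text. The sketch below assumes base read.csv and a made-up two-column file (column names when and value are hypothetical); other readers have their own way of setting column types. It also shows that formatting a date-time gives back a character vector.

```r
# A small in-memory CSV standing in for the real file (hypothetical column names).
csv_text <- "when,value\n3/1/2017 0:00,1\n3/1/2017 1:00,2"

# colClasses forces the column to stay character, so nothing is reinterpreted.
raw <- read.csv(text = csv_text, colClasses = c(when = "character"))
print(raw$when)

# Parse only when you need to compute with the values
# (month/day order and tz = "UTC" are assumptions).
parsed <- as.POSIXct(raw$when, format = "%m/%d/%Y %H:%M", tz = "UTC")
print(class(parsed))

# format() returns text again, now with zero-padded fields.
shown <- format(parsed, "%m/%d/%Y %H:%M")
print(class(shown))
```

Note that the formatted text is zero-padded ("03/01/2017 00:00"), so it is not character-for-character identical to the file's "3/1/2017 0:00".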

R add specific (different) amounts of times to entire column

I have a table in R like:
start duration
02/01/2012 20:00:00 5
05/01/2012 07:00:00 6
etc... etc...
I got to this by importing a table from Microsoft Excel that looked like this:
date time duration
2012/02/01 20:00:00 5
etc...
I then merged the date and time columns by running the following code:
d.f <- within(d.f, { start=format(as.POSIXct(paste(date, time)), "%m/%d/%Y %H:%M:%S") })
I want to create a third column called 'end', which will be calculated as the number of hours after the start time. I am pretty sure that my time is a POSIXct vector. I have seen how to manipulate one datetime object, but how can I do that for the entire column?
The expected result should look like:
start duration end
02/01/2012 20:00:00 5 02/02/2012 01:00:00
05/01/2012 07:00:00 6 05/01/2012 13:00:00
etc... etc... etc...
Using lubridate
> library(lubridate)
> df$start <- mdy_hms(df$start)
> df$end <- df$start + hours(df$duration)
> df
# start duration end
#1 2012-02-01 20:00:00 5 2012-02-02 01:00:00
#2 2012-05-01 07:00:00 6 2012-05-01 13:00:00
data
df <- structure(list(start = c("02/01/2012 20:00:00", "05/01/2012 07:00:00"
), duration = 5:6), .Names = c("start", "duration"), class = "data.frame", row.names = c(NA,
-2L))
You can simply add duration * 3600 (POSIXct arithmetic is in seconds) to the start column of the data frame, once start is a POSIXct. E.g. with one date:
start = as.POSIXct("02/01/2012 20:00:00",format="%m/%d/%Y %H:%M:%S")
start
[1] "2012-02-01 20:00:00 CST"
start + 5*3600
[1] "2012-02-02 01:00:00 CST"
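The same arithmetic is vectorised, so it works on the whole column at once. A minimal base R sketch using the two rows from the question, with tz = "UTC" assumed for reproducibility:

```r
# The two rows from the question, with start still stored as text.
df <- data.frame(start = c("02/01/2012 20:00:00", "05/01/2012 07:00:00"),
                 duration = 5:6)

# Convert the text to POSIXct first (tz = "UTC" is an assumption).
df$start <- as.POSIXct(df$start, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# POSIXct arithmetic is in seconds, so hours are multiplied by 3600, row by row.
df$end <- df$start + df$duration * 3600

# Format only for display, in the layout the question asked for.
end_text <- format(df$end, "%m/%d/%Y %H:%M:%S")
print(end_text)
```

Keeping start and end as POSIXct and formatting only for display avoids having to re-parse the strings for every later calculation.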

obtain hour from DateTime vector

I have a DateTime vector within a data.frame where the data frame is made up of 8760 observations representing hourly intervals throughout the year e.g.
2010-01-01 00:00
2010-01-01 01:00
2010-01-01 02:00
2010-01-01 03:00
and so on.
I would like to create a data.frame which has the original DateTime vector as the first column and then the hourly values in the second column e.g.
2010-01-01 00:00 00:00
2010-01-01 01:00 01:00
How can this be achieved?
Use format or strftime to extract the time information.
Create a POSIXct vector:
x <- seq(as.POSIXct("2012-05-21"), by=("+1 hour"), length.out=5)
Extract the time:
data.frame(
date=x,
time=format(x, "%H:%M")
)
date time
1 2012-05-21 00:00:00 00:00
2 2012-05-21 01:00:00 01:00
3 2012-05-21 02:00:00 02:00
4 2012-05-21 03:00:00 03:00
5 2012-05-21 04:00:00 04:00
If the input vector is a character vector, then you have to convert to POSIXct first:
Create some data
dat <- data.frame(
DateTime=format(seq(as.POSIXct("2012-05-21"), by=("+1 hour"), length.out=5), format="%Y-%m-%d %H:%M")
)
dat
DateTime
1 2012-05-21 00:00
2 2012-05-21 01:00
3 2012-05-21 02:00
4 2012-05-21 03:00
5 2012-05-21 04:00
Split time out:
data.frame(
DateTime=dat$DateTime,
time=format(as.POSIXct(dat$DateTime, format="%Y-%m-%d %H:%M"), format="%H:%M")
)
DateTime time
1 2012-05-21 00:00 00:00
2 2012-05-21 01:00 01:00
3 2012-05-21 02:00 02:00
4 2012-05-21 03:00 03:00
5 2012-05-21 04:00 04:00
Or generically, not treating them as dates, you can use the following provided that the time and dates are padded correctly.
library(stringr)
df <- data.frame(DateTime = c("2010-01-01 00:00", "2010-01-01 01:00", "2010-01-01 02:00", "2010-01-01 03:00"))
df <- data.frame(df, Time = str_sub(df$DateTime, -5, -1))
It depends on your needs really.
Using lubridate
library(lubridate)
library(plyr)
df <- data.frame(DateTime = c("2010-01-01 00:00", "2010-01-01 01:00", "2010-01-01 02:00", "2010-01-01 03:00"))
# sprintf zero-pads both hour and minute on the left, e.g. 5:01 becomes "05:01"
df <- mutate(df, DateTime = ymd_hm(DateTime),
time = sprintf("%02d:%02d", hour(DateTime), minute(DateTime)))
On a more general note, for anyone that comes here from google and maybe wants to group by hour:
The key here is: lubridate::hour(datetime)
p22 in the cran doc here: https://cran.r-project.org/web/packages/lubridate/lubridate.pdf
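For the group-by-hour case, a small sketch in base R, where as.POSIXlt(x)$hour plays the same role as lubridate::hour(x). The half-hourly timestamps and the values are made up for illustration.

```r
# Hypothetical half-hourly observations (tz = "UTC" is an assumption).
x <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"), by = "30 min", length.out = 6)
value <- c(1, 3, 5, 7, 9, 11)

# Hour of day as an integer; lubridate::hour(x) would give the same numbers.
hr <- as.POSIXlt(x)$hour

# Mean of the values within each hour.
hourly_mean <- tapply(value, hr, mean)
print(hourly_mean)
```

With dplyr the same idea would typically be written as a group_by on the hour followed by summarise, but the grouping key is computed the same way.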
