I have been given a dataset that lists date and time separately. The dates are fine; however, the time is stored as a character vector rather than a date/time object.
The current time column looks like "13:00", "13:05", "13:10" etc.
I tried mutating the column using as.POSIXct(), but it changed every value in the column to NA.
This was my attempt:
data = data %>%
mutate(time = as.POSIXct(time, format = "h:m"))
I expected a similar-looking column, but holding times/dates instead of strings. Thanks for any help!
The times class in chron can represent times without dates:
library(chron)
library(dplyr)
# input data
data <- data.frame(date = "2000-01-01", time = c("13:00", "13:05", "13:10"))
data %>%
mutate(date = as.chron(as.character(date)),
time = times(paste0(time, ":00")),
datetime = chron(date, time))
giving:
      date     time            datetime
1 01/01/00 13:00:00 (01/01/00 13:00:00)
2 01/01/00 13:05:00 (01/01/00 13:05:00)
3 01/01/00 13:10:00 (01/01/00 13:10:00)
For a simple, non-package solution, I would first create a column containing both the date and the time:
dateandtime <- as.character(paste(date, time, sep = ' '))
and then use the strptime function:
dateandtime <- strptime(dateandtime,
format = "%Y-%m-%d %H:%M",
tz = 'GMT')
Just put the data frame name in front of each variable, e.g.:
df$dateandtime <- as.character(paste(df$date, df$time, sep = ' '))
Hope it helps!
If you use as.POSIXct, the format has to be written with strptime-style conversion specifications (%H for the hour, %M for the minute) rather than "h:m":
as.POSIXct("13:05", format = "%H:%M")
This, however, returns [1] "2019-03-26 13:05:00 CET" (the current date is filled in), since date/times are represented as calendar dates plus time to the nearest second.
If you only want to use the time, you could use data.table::as.ITime:
data.table::as.ITime(c("13:00", "13:05", "13:10"))
This returns:
str(data.table::as.ITime(c("13:00", "13:05", "13:10")))
'ITime' int [1:3] 13:00:00 13:05:00 13:10:00
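Applied to the original attempt, the fix might look like the sketch below. It uses base R only; the small data frame is made up to mirror the question, and tz = "UTC" is an assumption chosen so the result does not depend on the local time zone.

```r
# Illustrative data mirroring the question (values are made up)
data <- data.frame(date = "2000-01-01",
                   time = c("13:00", "13:05", "13:10"),
                   stringsAsFactors = FALSE)

# "h:m" contains no conversion specifications, so nothing matches: all NA
bad <- as.POSIXct(data$time, format = "h:m", tz = "UTC")

# %H (hour, 00-23) and %M (minute) parse the strings; the current date is filled in
good <- as.POSIXct(data$time, format = "%H:%M", tz = "UTC")

# Pasting the date column in first gives a complete date-time instead
data$datetime <- as.POSIXct(paste(data$date, data$time),
                            format = "%Y-%m-%d %H:%M", tz = "UTC")

format(data$datetime, "%Y-%m-%d %H:%M")
```

If only the time of day matters, format(good, "%H:%M") turns the parsed values back into strings, which is why a time-only class such as ITime or chron's times can be more convenient.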
I'm having trouble converting my date variable in my data frame to a new date-time variable. I know parse_date_time(x, orders="ymd HMS") but I don't know what code is needed to say: use this dataframe (workHours) and grab this column (date) now change to a new column named (date_time) and convert to "ymd HMS"
The date column already has the date and time in this format: mm/dd/yyyy hh:mm:ss, and it's a fct (factor).
If you have data like this -
workHours <- data.frame(date = c('3/22/2020 04:51:12', '3/15/2019 10:12:32'))
workHours
#                date
#1 3/22/2020 04:51:12
#2 3/15/2019 10:12:32
You can use as.POSIXct in base R
workHours$date_time <- as.POSIXct(workHours$date, format = '%m/%d/%Y %T', tz = 'UTC')
Or lubridate::mdy_hms
workHours$date_time <- lubridate::mdy_hms(workHours$date)
Both of which would return -
#                date           date_time
#1 3/22/2020 04:51:12 2020-03-22 04:51:12
#2 3/15/2019 10:12:32 2019-03-15 10:12:32
class(workHours$date_time)
#[1] "POSIXct" "POSIXt"
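Since the question says the column is a factor, a cautious base R sketch converts it to character explicitly before parsing. The data frame here is rebuilt with stringsAsFactors = TRUE purely to reproduce that situation, and tz = "UTC" is an assumption.

```r
# Rebuild the example with the date column stored as a factor
workHours <- data.frame(date = c("3/22/2020 04:51:12", "3/15/2019 10:12:32"),
                        stringsAsFactors = TRUE)
is.factor(workHours$date)

# Convert the factor labels to character, then parse them into a new column
workHours$date_time <- as.POSIXct(as.character(workHours$date),
                                  format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

class(workHours$date_time)
```

The original date column is left untouched, so the raw strings remain available for checking the parse.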
We can use anytime
library(anytime)
workHours <- data.frame(date = c('3/22/2020 04:51:12', '3/15/2019 10:12:32'))
anytime(workHours$date)
[1] "2020-03-22 04:51:12 EDT" "2019-03-15 10:12:32 EDT"
I have a table with two columns "start" and "end" containing both dates and times of the respective start and end period as follows:
Sr. No.  Start                End
1        22May2001:00:00:00   27May2001:23:59:59
2        28May2001:00:00:00   26Jun2001:23:59:59
I would like to convert above date time in the following format (ISO8601 with time stamp):
Sr. No.  Start                 End
1        2001-05-22 00:00:00   2001-05-27 23:59:59
2        2001-05-28 00:00:00   2001-06-26 23:59:59
I have used the code available at this link: http://www.stat.berkeley.edu/~s133/dates.html
View(my_table)
str(my_table)
my_table$startD <- as.Date(my_table$start, "%d%b%Y:%H:%M:%S")
my_table$startT <- strptime(my_table$start, format = "%d%b%Y:%H:%M:%S")
So far, my attempt gave me two columns like this:
StartD StartT
2001-05-22 2001-05-22
This is not what I want. Could someone please suggest how to convert the date-times into the desired format, using the above or any alternative approach?
In answer form for clarity, you need a datetime class, which in R means either POSIXct or POSIXlt. Usually we use as.POSIXct and strptime for parsing strings into each class, respectively (as.POSIXlt exists, but rarely gets used), though there are lubridate alternatives if you like.
At its most basic,
my_table$Start <- as.POSIXct(my_table$Start, format = '%d%b%Y:%H:%M:%S')
my_table$End <- as.POSIXct(my_table$End, format = '%d%b%Y:%H:%M:%S')
my_table
##   Sr.No.      Start                 End
## 1      1 2001-05-22 2001-05-27 23:59:59
## 2      2 2001-05-28 2001-06-26 23:59:59
Note you need to pass the format string by name (format = ...), as the second parameter of as.POSIXct is actually tz (for setting the time zone). Also note that while Start looks like it's missing a time, that's because the print methods for POSIX*t omit the time when every value being printed is exactly midnight; the times are still stored.
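To check that the midnight times are really stored, one option is to format the values explicitly. This is a small sketch using ISO-formatted strings (to avoid locale-dependent month names) and tz = "UTC" as an assumption; the values match the question's table.

```r
start <- as.POSIXct(c("2001-05-22 00:00:00", "2001-05-28 00:00:00"), tz = "UTC")
end   <- as.POSIXct(c("2001-05-27 23:59:59", "2001-06-26 23:59:59"), tz = "UTC")

# Default formatting drops the time because every value is exactly midnight
format(start)

# An explicit format string shows the stored 00:00:00
format(start, "%Y-%m-%d %H:%M:%S")

# The elapsed time in seconds also reflects the midnight start
difftime(end, start, units = "secs")
```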
If you'd like to change both in a single line, you could use
my_table[-1] <- lapply(my_table[,-1], as.POSIXct, format = '%d%b%Y:%H:%M:%S')
or in dplyr (which prefers POSIXct over POSIXlt):
library(dplyr)
my_table %>% mutate_at(-1, as.POSIXct, format = '%d%b%Y:%H:%M:%S')
both of which return exactly the same thing. You could also use lubridate::dmy_hms, which parses to POSIXct:
library(lubridate)
my_table$Start <- dmy_hms(my_table$Start) # or lapply like above
my_table$End <- dmy_hms(my_table$End)
# or dplyr
my_table %>% mutate_at(-1, dmy_hms)
which also return the same thing.
Data
my_table <- structure(list(Sr.No. = 1:2, Start = structure(1:2, .Label = c("22May2001:00:00:00",
"28May2001:00:00:00"), class = "factor"), End = structure(c(2L,
1L), .Label = c("26Jun2001:23:59:59", "27May2001:23:59:59"), class = "factor")), .Names = c("Sr.No.",
"Start", "End"), class = "data.frame", row.names = c(NA, -2L))
Hope this helps.
my_table <- "22May2001:22:02:50"
my_table <- strptime(as.character(my_table), "%d%b%Y:%H:%M:%S")
my_table <- format(my_table, "%Y-%m-%d %H:%M:%S")
str(my_table)
I have a data frame in R with 2 columns (date_time, temp). I want to extract all the temp values for the daytime (between 5:59 AM and 18:00). I first separated the dates and times (hours) with this code:
Th$Hours <- format(as.POSIXct(Th$`date`,
"%Y-%m-%d %H:%M:%S", tz = ""),
format = "%H:%M")%>% as_tibble()
Th$Dates <- format(as.Date(Th$`date`,"%Y-%m-%d",
tz = ""), format = "%Y-%m-%d")%>% as_tibble()
and then I used the following command to extract specific times:
Th_day<- Th[Th$Hours >= " 06:00 AM" & Th$Hours<= "18:00 PM",]%>% as_tibble()
But I get a tibble that gives rows from 00:00 to 18:00 for every day.
What is the problem?
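Don't separate out the date from the time. When you use format to get the time, you are converting it to a character class, which doesn't know how to do time-based comparisons with > and <. The strings are compared as text instead, and the leading space in " 06:00 AM" likely sorts before every digit, so all times from 00:00 onwards pass the lower bound.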
Instead, use hour to extract the hour component as an integer, and do comparisons on that:
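library(dplyr)
library(lubridate)
# use = (not <-) inside mutate() so the new column is actually named DateTime
Th %>%
mutate(DateTime = as.POSIXct(date, format = "%Y-%m-%d %H:%M:%S", tz = "")) %>%
filter(hour(DateTime) >= 6 & hour(DateTime) < 18)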
I'm making some assumptions about your data structure - if you need more help than this please edit your question to share some sample data with dput() as the commenters have requested. dput(Th[1:5, ]) should be plenty.
Note that if you want to do a lot of operations on just the times (ignoring the date part), you could use the times class from the chron package, see here for some more info.
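For a base R variant of the same idea, the sketch below compares minutes since midnight, which also makes it easy to include 18:00 itself. The Th data frame here is invented sample data (the question did not share any), and tz = "UTC" is an assumption.

```r
# Invented sample data standing in for the asker's Th
Th <- data.frame(
  date = c("2021-06-01 00:30:00", "2021-06-01 06:00:00", "2021-06-01 12:15:00",
           "2021-06-01 18:00:00", "2021-06-01 21:45:00"),
  temp = c(11.2, 13.0, 19.4, 17.8, 14.1),
  stringsAsFactors = FALSE
)

# Parse once, then pull out the numeric hour and minute components
dt <- as.POSIXlt(as.POSIXct(Th$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
mins <- dt$hour * 60 + dt$min   # minutes since midnight

# Keep rows from 06:00 up to and including 18:00
Th_day <- Th[mins >= 6 * 60 & mins <= 18 * 60, ]
Th_day
```

Because the comparison is numeric, it does not depend on how the time strings happen to sort.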
I would like to create a vector of dates between two specified moments in time with step 1 month, as described in this thread (Create a Vector of All Days Between Two Dates), to be then converted into factors for data visualization.
However, I'd like to have the dates in the YYYY-Mon format, i.e. 2010-Feb. So far I have only managed to get the dates in the standard format 2010-02-01, using code like this:
require(lubridate)
first <- ymd_hms("2010-02-07 15:00:00 UTC")
start <- ymd(floor_date(first, unit="month"))
last <- ymd_hms("2017-10-29 20:00:00 UTC")
end <- ymd(ceiling_date(last, unit="month"))
> start
[1] "2010-02-01"
> end
[1] "2017-11-01"
How can I change the format to YYYY-Mon?
You can use format():
start %>% format('%Y-%b')
To create the vector, use seq():
seq(start, end, by = 'month') %>% format('%Y-%b')
Note: use a capital 'B' for the full month name: '%Y-%B'.
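Since the labels are meant to become factors for plotting, it may help to set the factor levels explicitly so they stay in chronological rather than alphabetical order. This is a base R sketch under that assumption; the names month_starts, labels and month_factor are illustrative only.

```r
start <- as.Date("2010-02-01")
end   <- as.Date("2017-11-01")

# One date per month, then formatted as YYYY-Mon
month_starts <- seq(start, end, by = "month")
labels <- format(month_starts, "%Y-%b")   # month abbreviations follow the current locale

# Passing levels = labels keeps the chronological order of the sequence
month_factor <- factor(labels, levels = labels)

head(levels(month_factor), 3)
```

Without the levels argument, factor() would sort the labels as text, so for example "2010-Apr" would come before "2010-Feb".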
I am calling an API that returns the time in the format 130800Z, which represents the day, hour and minutes. I'm wondering if there is an easy way to convert this?
My approach would be string splitting Sys.Date to get the month and year, then splitting 130800Z by every 2 characters and merging all the results into 2017-10-13 08:00:00
Just wondering if there is a way to use strptime(sprintf(... without having to split anything?
Try this:
strptime(x = time,format = "%d%H%M",tz = "GMT")
#[1] "2017-10-13 08:00:00 GMT"
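This works because strptime fills in an unspecified year and month from the current date and ignores trailing characters such as the "Z". The sketch below shows that, plus a variant that pins the year and month explicitly; the "2017-10-" prefix is only an assumed example value taken from the question.

```r
time <- "130800Z"

# Day, hour and minute come from the string; year and month from the current date
parsed <- strptime(time, format = "%d%H%M", tz = "GMT")
format(parsed, "%d %H:%M")

# Prepending a year and month makes the result independent of today's date
pinned <- strptime(paste0("2017-10-", time), format = "%Y-%m-%d%H%M", tz = "GMT")
format(pinned, "%Y-%m-%d %H:%M:%S")
```

Pinning the month may matter near month boundaries, where the API's day number could belong to a different month than the one on the local clock.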
This works, but I had to split the string:
library(stringr)  # for str_split()
time = "130950Z"
# two-character chunks of the string: day, hour, minute
day = substring(time, seq(1,nchar(time),2), seq(2,nchar(time),2))[1]
hour = substring(time, seq(1,nchar(time),2), seq(2,nchar(time),2))[2]
min = substring(time, seq(1,nchar(time),2), seq(2,nchar(time),2))[3]
# year and month are taken from the current date
year = unlist(str_split(as.character(Sys.Date()), "-"))[1]
month = unlist(str_split(as.character(Sys.Date()), "-"))[2]
dt = strptime(sprintf("%s-%s-%s %s:%s:%s", year, month, day, hour, min, 0), "%Y-%m-%d %H:%M:%S", tz = "GMT")
dt
# [1] "2017-10-13 09:50:00 GMT"   (when run in October 2017)