Say I have a start date and time "2016-01-09 15:23:26" and measurements taken at varying intervals with the date and time recorded for each measurement.
Sample data follow:
date time date_time clock
"2016-01-09" "15:23:26" 2016-01-09 15:23:26 0s (0 seconds)
"2016-01-09" "23:59:53" 2016-01-09 23:59:53 30987s (~8.61 hours)
"2016-01-10" "00:04:53" NA NA
"2016-01-10" "01:04:55" 2016-01-10 01:04:55 34889s (~9.69 hours)
If I try to use lubridate to calculate the duration, in minutes since the start time, the calculation works fine for all but row 3 (where the date is "2016-01-10" and the time is "00:04:53"). I've searched the web for an explanation of what is going wrong here, but to no avail. What am I missing? My code follows:
library(lubridate)
df$date2 = mdy(df$Date)
df$time2 = hms(df$Time)
df$date_time = paste(df$date2, df$time2, sep = " ")
df$date_time = ymd_hms(df$date_time)
df$start_time = ymd_hms("2016-01-09 15:23:26", tz="UTC")
df$clock = as.duration(df$date_time - df$start_time)
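hms() returns a Period, and a Period of less than an hour prints without an hour component (something like "4M 53S"). When you paste the date and the time together, the third row therefore becomes a string with only two time fields, which ymd_hms() cannot parse, so you get NA.
Since date2 is already a Date and time2 is already a Period, you can add them directly instead of going through a string.
Simply replace: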
df$date_time = paste(df$date2, df$time2, sep = " ")
df$date_time = ymd_hms(df$date_time)
with:
df$date_time = df$date2 + df$time2
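For comparison, the same elapsed times can be computed without lubridate. This is only a minimal base R sketch, assuming the raw columns are character vectors named date and time (as in the sample data, with the second row's date taken to be 2016-01-09) and that all timestamps are in UTC. The data frame df is rebuilt here purely for illustration.

```r
# Illustrative data; column names follow the sample table in the question
df <- data.frame(
  date = c("2016-01-09", "2016-01-09", "2016-01-10", "2016-01-10"),
  time = c("15:23:26", "23:59:53", "00:04:53", "01:04:55"),
  stringsAsFactors = FALSE
)

# Parse date and time together from a single string
df$date_time <- as.POSIXct(paste(df$date, df$time),
                           format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

start_time <- as.POSIXct("2016-01-09 15:23:26", tz = "UTC")

# Elapsed time since the start, in minutes
df$clock_min <- as.numeric(difftime(df$date_time, start_time, units = "mins"))
df
```

Because the time stays a plain string until it is parsed, a time of less than one hour past midnight is handled like any other row.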
I have a spreadsheet that has the date and 12-hour time in one column and another column that specifies AM/PM. How do I combine these columns so I can use them as a POSIXct/POSIXlt/POSIXt object?
The spreadsheet has the time column as
DAY/MONTH/YEAR HOUR:MINUTE
where the hour is in 12-hour format, taken from a roster of check-in times. The other column just says AM or PM. I am trying to combine these columns, convert the result to 24-hour time, and use it as a POSIXt object.
Example of what I see:
Timesheet AM-PM
8/10/2022 9:00 AM
8/10/2022 9:01 AM
And this continues until 5:00 PM (same day)
What I have tried so far:
Timesheet %>%
unite("timestamp_24", c("timestamp_12","am_pm"),na.rm=FALSE)%>%
mutate(timestamp=(as.POSIXct(timestamp, format = "%d-%m-%Y %H:%M"))
This does not work as when they are combined it gives:
Timestamp_24
DAY/MONTH/YEAR HOUR:MINUTE_AM
and I think this is the crux of the issue because then as.POSIXct can't read it.
Here's my solution. The approach is simply to extract the hour, add 12 if it is PM, then convert with as.POSIXct (you need to use / rather than - in the format argument if your data frame is as it appears in your example).
I've done that with stringr::str_replace(), which allows you to supply a function as the replacement argument.
Timesheet %>%
  mutate(
    time_24hr = stringr::str_replace(
      time,
      "\\d+(?=:..$)",
      function(x) {
        hr <- as.numeric(x) %% 12
        ifelse(am_pm == "PM", hr + 12, hr)
      }
    ),
    time_24hr = as.POSIXct(time_24hr, format = "%d/%m/%Y %H:%M")
  )
This is the result:
time am_pm time_24hr
1 8/10/2022 9:00 AM 2022-10-08 09:00:00
2 8/10/2022 9:01 PM 2022-10-08 21:01:00
3 8/10/2022 12:01 PM 2022-10-08 12:01:00
4 8/10/2022 12:01 AM 2022-10-08 00:01:00
EDIT: I realized that this didn't work for 11 and 12, because the regex was extracting only the single character before the colon. It also wasn't working for 12:xx times. Both are fixed now, and I've added test cases to show that they work.
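As an alternative to rewriting the hour by hand, base R's date parser can read a 12-hour clock directly through the %I (hour 01-12) and %p (AM/PM indicator) conversion specifications. The sketch below is not the approach from the answer above. It assumes the columns are named time and am_pm as in that answer, that the times are in UTC, and that the C locale is acceptable for this session, since %p depends on the locale.

```r
# Assumption: use the C locale so that "AM"/"PM" are recognised by %p
invisible(Sys.setlocale("LC_TIME", "C"))

# Illustrative data matching the test cases shown above
Timesheet <- data.frame(
  time  = c("8/10/2022 9:00", "8/10/2022 9:01", "8/10/2022 12:01", "8/10/2022 12:01"),
  am_pm = c("AM", "PM", "PM", "AM"),
  stringsAsFactors = FALSE
)

# Join the two columns with a space, then parse the 12-hour time in one step
Timesheet$time_24hr <- as.POSIXct(
  paste(Timesheet$time, Timesheet$am_pm),
  format = "%d/%m/%Y %I:%M %p",
  tz = "UTC"
)
Timesheet
```

Joining with a space rather than the underscore that unite() inserts by default keeps the string in a shape the format can describe.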
I have a data frame in R with 2 columns (date_time, temp). I want to extract all the temp values for the daytime (between 5:59 AM and 18:00). I first separated the dates and times (hours) with this code:
Th$Hours <- format(as.POSIXct(Th$`date`,
"%Y-%m-%d %H:%M:%S", tz = ""),
format = "%H:%M")%>% as_tibble()
Th$Dates <- format(as.Date(Th$`date`,"%Y-%m-%d",
tz = ""), format = "%Y-%m-%d")%>% as_tibble()
and then I use following command to extract specific times:
Th_day<- Th[Th$Hours >= " 06:00 AM" & Th$Hours<= "18:00 PM",]%>% as_tibble()
But I get a tibble that contains rows from 00:00 to 18:00 for every day.
What is the problem?
Don't separate the date from the time. When you use format() to get the time, you convert it to a character vector, so > and < compare the values as strings rather than as times. That is why a bound like " 06:00 AM" does not behave as a time of day.
Instead, use hour() to extract the hour component as an integer, and do the comparisons on that:
library(dplyr)
library(lubridate)
Th %>%
  mutate(DateTime = as.POSIXct(date, format = "%Y-%m-%d %H:%M:%S", tz = "")) %>%
  filter(hour(DateTime) >= 6 & hour(DateTime) < 18)
I'm making some assumptions about your data structure - if you need more help than this please edit your question to share some sample data with dput() as the commenters have requested. dput(Th[1:5, ]) should be plenty.
Note that if you want to do a lot of operations on just the times (ignoring the date part), you could use the times class from the chron package.
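If the 18:00 reading itself should be kept, comparing on the hour alone is not quite enough, because hour < 18 drops 18:00 and hour <= 18 also keeps 18:01 to 18:59. One way around that is to compare minutes since midnight. The following base R sketch assumes a character column named date and a numeric column named temp, and uses made-up values purely for illustration.

```r
# Illustrative data; `date` and `temp` are assumed column names
Th <- data.frame(
  date = c("2020-01-01 00:00:00", "2020-01-01 05:59:00", "2020-01-01 06:00:00",
           "2020-01-01 12:30:00", "2020-01-01 18:00:00", "2020-01-01 18:01:00"),
  temp = c(1.5, 2.0, 2.5, 8.0, 5.5, 5.0),
  stringsAsFactors = FALSE
)

dt <- as.POSIXct(Th$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Minutes since midnight, so the 18:00 endpoint can be included exactly
minute_of_day <- as.integer(format(dt, "%H")) * 60L + as.integer(format(dt, "%M"))

# Keep readings from 06:00 up to and including 18:00
Th_day <- Th[minute_of_day >= 6 * 60 & minute_of_day <= 18 * 60, ]
Th_day
```

The comparison is numeric throughout, so it does not depend on how the time strings happen to sort.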
I have been given a dataset that lists date and time separately. The dates are fine; however, the time is being treated as a character rather than a date/time object.
The current time column looks like "13:00", "13:05", "13:10" etc.
I tried mutating the column using as.POSIXct(), but it changed the whole column to NA.
This was my attempt:
data = data %>%
mutate(time = as.POSIXct(time, format = "h:m"))
I expected a similar looking column but instead of strings I wanted it to be times/dates. Thanks for any help!
The times class in chron can represent times without dates:
library(chron)
library(dplyr)
# input data
data <- data.frame(date = "2000-01-01", time = c("13:00", "13:05", "13:10"))
data %>%
  mutate(date = as.chron(as.character(date)),
         time = times(paste0(time, ":00")),
         datetime = chron(date, time))
giving:
date time datetime
1 01/01/00 13:00:00 (01/01/00 13:00:00)
2 01/01/00 13:05:00 (01/01/00 13:05:00)
3 01/01/00 13:10:00 (01/01/00 13:10:00)
For a simple, non-package solution:
I would first create a column with both the date and the time in it
dateandtime <- as.character(paste(date, time, sep = ' '))
and then use the strptime function:
dateandtime <- strptime(dateandtime,
                        format = "%Y-%m-%d %H:%M",
                        tz = 'GMT')
Just put the data frame name in front of all the variables, e.g.:
df$dateandtime <- as.character(paste(df$date, df$time, sep = ' '))
Hope it helps!
If you use as.POSIXct, you need to provide the format differently:
as.POSIXct("13:05", format = "%H:%M")
This, however, returns [1] "2019-03-26 13:05:00 CET", since date-times are represented as calendar dates plus time to the nearest second, so the current date is filled in.
If you only want to use the time, you could use data.table::as.ITime:
data.table::as.ITime(c("13:00", "13:05", "13:10"))
This returns:
str(data.table::as.ITime(c("13:00", "13:05", "13:10")))
'ITime' int [1:3] 13:00:00 13:05:00 13:10:00
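If adding a package is not an option, base R can also hold a time of day on its own as a difftime, that is, as an offset from midnight. This is a small sketch of that idea, using the same three example times. Unlike ITime, the result does not print as a clock time.

```r
times_chr <- c("13:00", "13:05", "13:10")

# Time of day stored as minutes since midnight
mins <- as.difftime(times_chr, format = "%H:%M", units = "mins")
mins

# Plain numbers, if needed for arithmetic or comparisons
as.numeric(mins)
```

A difftime sorts and subtracts correctly, which is often all that is needed when the date lives in a separate column.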
I need to parse dates and have cases like "31/02/2018":
library(lubridate)
> dmy("31/02/2018", quiet = T)
[1] NA
This makes sense as the 31st of Feb does not exist. Is there a way to parse the string "31/02/2018" to e.g. 2018-02-28 ? So not to get an NA, but an actual date?
Thanks.
We can write a function, assuming that the only problem is a day number larger than the number of days in that month and that the format is always the same.
library(lubridate)
get_correct_date <- function(example_date) {
  # Split vector on "/" and get 3 components (date, month, year)
  vecs <- as.numeric(strsplit(example_date, "\\/")[[1]])
  # Check number of days in that month
  last_day_of_month <- days_in_month(vecs[2])
  # If the input date is higher than actual number of days in that month
  # replace it with last day of that month
  if (vecs[1] > last_day_of_month)
    vecs[1] <- last_day_of_month
  # Paste the date components together to get new modified date
  dmy(paste0(vecs, collapse = "/"))
}
get_correct_date("31/02/2018")
#[1] "2018-02-28"
get_correct_date("31/04/2018")
#[1] "2018-04-30"
get_correct_date("31/05/2018")
#[1] "2018-05-31"
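With a small modification you can adjust the dates if they have a different format, or even if some dates are smaller than the first date. Note that days_in_month() receives only the month number here, so February is always treated as having 28 days.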
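The same clamping idea can be written so that it works on a whole vector and takes leap years into account. The following is a base R sketch rather than the function above. clamp_to_month_end is a hypothetical name, and it assumes every input is a day/month/year string with a valid month and year.

```r
# Hypothetical helper: clamp an out-of-range day to the last day of its month
clamp_to_month_end <- function(x) {
  parts <- do.call(rbind, strsplit(x, "/", fixed = TRUE))
  day   <- as.integer(parts[, 1])
  month <- as.integer(parts[, 2])
  year  <- as.integer(parts[, 3])

  first_of_month <- as.Date(sprintf("%04d-%02d-01", year, month))

  # Last day of the month = first day of the following month minus one day
  next_year  <- ifelse(month == 12L, year + 1L, year)
  next_month <- ifelse(month == 12L, 1L, month + 1L)
  last_day   <- as.Date(sprintf("%04d-%02d-01", next_year, next_month)) - 1
  n_days     <- as.integer(format(last_day, "%d"))

  # Use the requested day, but never more than the month actually has
  first_of_month + pmin(day, n_days) - 1L
}

clamp_to_month_end(c("31/02/2018", "31/02/2020", "15/06/2018", "31/12/2018"))
```

Deriving the month length from the calendar itself avoids a lookup table and handles 29 February without a special case.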
I am calling an API that returns the time in the format 130800Z, which represents the day, hour and minutes. I'm wondering if there is an easy way to convert this.
My approach would be to split Sys.Date() to get the month and year, then split 130800Z every 2 characters and merge all the results into 2017-10-13 08:00:00.
Just wondering if there is a way to use strptime(sprintf(... without having to split anything?
Try this:
strptime(x = time,format = "%d%H%M",tz = "GMT")
#[1] "2017-10-13 08:00:00 GMT"
I did this but had to use splitting:
library(stringr)
time = "130950Z"
day = substring(time, seq(1,nchar(time),2), seq(2,nchar(time),2))[1]
hour = substring(time, seq(1,nchar(time),2), seq(2,nchar(time),2))[2]
min = substring(time, seq(1,nchar(time),2), seq(2,nchar(time),2))[3]
year = unlist(str_split(as.character(Sys.Date()), "-"))[1]
month = unlist(str_split(as.character(Sys.Date()), "-"))[2]
dt = strptime(sprintf("%s-%s-%s %s:%s:%s", year, month, day, hour, min, 0), "%Y-%m-%d %H:%M:%S", tz = "GMT")
# dt is "2017-10-13 09:50:00 GMT" when run in October 2017
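Both answers rely on the current month and year, either implicitly (strptime fills in unspecified fields from today's date) or through Sys.Date(). If that dependence should be explicit, the month and year can be prepended to the string and the whole thing parsed in one call. This is a sketch under that assumption. parse_ddhhmm and its ref argument are hypothetical names, and the trailing Z is taken to mean UTC.

```r
# Hypothetical helper: parse "DDHHMMZ" using the year and month of `ref`
parse_ddhhmm <- function(x, ref = Sys.Date()) {
  # Prefix year and month, e.g. "201710" + "130800Z"
  stamp <- paste0(format(ref, "%Y%m"), x)
  # The literal "Z" in the format matches the trailing "Z" in the input
  as.POSIXct(stamp, format = "%Y%m%d%H%MZ", tz = "UTC")
}

# Fixed reference date so the example is reproducible
parse_ddhhmm("130800Z", ref = as.Date("2017-10-01"))
```

Passing the reference date as an argument also makes it possible to handle a reading that arrives just after the month has rolled over.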