How to convert times over 24:00:00 in R - r

In R I have this data.frame
24:43:30 23:16:02 14:05:44 11:44:30 ...
Note that some of the times are over 24:00:00 ! In fact all my times are within 02:00:00 to 25:59:59.
I want to subtract all entries in my dataset data with 2 hours. This way I get a regular data-set. How can I do this?
I tried this
strptime(data, format="%H:%M:%S") - 2*60*60
and this work for all entries below 23:59:59. For all entries above I simply get NA since the strptime command produce NA to all entries above 23:59:59.

Using lubridate package can make the job easier!
> library(lubridate)
> t <- '24:43:30'
> hms(t) - hms('2:0:0')
[1] "22H 43M 30S"
Update:
Converting the date back to text!
> substr(strptime(hms(t) - hms('2:0:0'),format='%HH %MM %SS'),12,20)
[1] "22:43:30"
Adding #RHertel's update:
format(strptime(hms(t) - hms('2:0:0'),format='%HH %MM %SS'),format='%H:%M:%S')
Better way of formating the lubridate object:
s <- hms('02:23:58) - hms('2:0:0')
paste(hour(s),minute(s),second(s),sep=":")
"0:23:58"

Although the answer by #amrrs solves the main problem, the formatting could remain an issue because hms() does not provide a uniform output. This is best shown with an example:
library(lubridate)
hms("01:23:45")
#[1] "1H 23M 45S"
hms("00:23:45")
#[1] "23M 45S"
hms("00:00:45")
#[1] "45S"
Depending on the time passed to hms() the output may or may not contain an entry for the hours and for the minutes. Moreover leading zeros are omitted in single-digit values of hours, minutes and seconds. This can result pretty much in a formatting nightmare if one tries to put that data into a common form.
To resolve this difficulty one could first convert the time into a duration with lubridate's as.duration() function. Then, the duration in seconds can be transformed into a POSIXct object from which the hours, minutes, and seconds can be extracted easily with format():
times <- c("24:43:30", "23:16:02", "14:05:44", "11:44:30", "02:00:12")
shifted_times <- hms(times) - hms("02:00:00")
format(.POSIXct(as.duration(shifted_times),tz="GMT"), "%H:%M:%S")
#[1] "22:43:30" "21:16:02" "12:05:44" "09:44:30" "00:00:12"
The last entry "02:00:12" would have caused difficulties if shifted_times had been passed to strptime().

Related

R: Turn timestamps into (as short as possible) integers

Edit 1: I think a possible solution would be to count the number of 15-minute intervals elapsed since a starting date. If anyone has thoughts on this, please come forward. Thanks
As the title says, I am looking for a way to turn timestamps into as small as possible integers.
Explanation of the situation:
I am working with "panelAR". I have T>N panel-data containing different timestamps that look like this (300,000 rows in total):
df$timestamp[1]
[1] "2013-08-01 00:15:00 UTC"
class(df$timestamp)
[1] "POSIXct" "POSIXt"
I am using panelAR and thus need the timestamp as an integer. I can't simply use "as.integer" because I would hit the max length for integers resulting in only NA's. This was my first try to work around this problem:
df$timestamp <- as.numeric(gsub("[: -]", "" , df$timestamp, perl=TRUE))
Subtract the numbers starting at te 3rd position (Because "20" is irrelevant) and stop before the 2nd last position (Because they all end at 00 seconds)
(I need shorter integers in order to not hit the max level of integers in R)
df$timestamp <- substr(df$timestamp, 3, nchar(df$timestamp)-2)
#Save as integer
df$timestamp <- as.integer(df$timestamp)
#Result
df$timestamp[1]
1308010015
This allows panelAR to work with it, but the numbers seem to be way too large. When I try to run a regression with it, i get the following error message:
"cannot allocate vector of size 1052.2 GB"
I am looking for a way to turn these timestamps into (as small as possible) integers in order to work with panelAR.
Any help is greatly appreciated.
this big number that you get corresponds to the number of seconds elapsed since 1970-01-01 00:00:00. Do your time stamps have regular intervals? If it is, let's say, every 15 minutes you could divide all integers by 900, and it might help.
Another option is to pick your earliest date and subtract it from the others
#generate some dates:
a <- as.POSIXct("2013-01-01 00:00:00 UTC")
b <- as.POSIXct("2013-08-01 00:15:00 UTC")
series <- seq(a,b, by = 'min')
#calculate the difference (result are integers/seconds)
integer <- as.numeric(series - min(series))
If you still get memory problems, I might combine both.
I managed to solve the main question. Since this still results in a memory error, I think it stems from the number of observations and the way panelAR computes things. I will open a separate question for that matter.
I used
df$timestampnew <- as.integer(difftime(df$timestamp, "2013-01-01 00:00:00", units = "min")/15)
to get integers that count the number of 15-min intervals elapsed since a certain date.

removing date from %d/%m/%Y %H:%M in R

The r code that I am working on is supposed to use the data collected in every five minute intervals.
The data is saved in csv format. However, due to inconsistency in the data collected, the time column in the data sometimes represent timestamp instead of just time.(dd/mm/yyyy HH:MM, instead of HH:MM)
This causes an error to my system as the system reads the data as having multiple different values for the same time value. Therefore, I would like to omit the date format from the timestamp such that the code would only read the time value.
My failed attempt was:
as.Date(data[[1]],"%H:%M")
which gave me all NA values for the time column.
I have searched for similar questions in SO, but I did not manage to find a clear answer to my question. Can anyone suggest me some possible functions to use?
I appreciate your help.
You could just strip the date portion of the text and then use as.POSIXct to convert them all to a %H:%M timestamp, e.g.:
x <- c("10:25","01/01/2014 10:30")
x <- gsub("^.+(\\d{2}:\\d{2})$","\\1",x)
as.POSIXct(x,format="%H:%M",tz="UTC")
#[1] "2014-06-02 10:25:00 UTC" "2014-06-02 10:30:00 UTC"

Strip the date and keep the time

Lots of people ask how to strip the time and keep the date, but what about the other way around? Given:
myDateTime <- "11/02/2014 14:22:45"
I would like to see:
myTime
[1] "14:22:45"
Time zone not necessary.
I've already tried (from other answers)
as.POSIXct(substr(myDateTime, 12,19),format="%H:%M:%S")
[1] "2013-04-13 14:22:45 NZST"
The purpose is to analyse events recorded over several days by time of day only.
Thanks
Edit:
It turns out there's no pure "time" object, so every time must also have a date.
In the end I used
as.POSIXct(as.numeric(as.POSIXct(myDateTime)) %% 86400, origin = "2000-01-01")
rather than the character solution, because I need to do arithmetic on the results. This solution is similar to my original one, except that the date can be controlled consistently - "2000-01-01" in this case, whereas my attempt just used the current date at runtime.
I think you're looking for the format function.
(x <- strptime(myDateTime, format="%d/%m/%Y %H:%M:%S"))
#[1] "2014-02-11 14:22:45"
format(x, "%H:%M:%S")
#[1] "14:22:45"
That's character, not "time", but would work with something like aggregate if that's what you mean by "analyse events recorded over several days by time of day only."
If the time within a GMT day is useful for your problem, you can get this with %%, the remainder operator, taking the remainder modulo 86400 (the number of seconds in a day).
stamps <- c("2013-04-12 19:00:00", "2010-04-01 19:00:01", "2018-06-18 19:00:02")
as.numeric(as.POSIXct(stamps)) %% 86400
## [1] 0 1 2

Round an POSIXct date up to the next day

I have a question similar to Round a POSIX date (POSIXct) with base R functionality, but I'm hoping to always round the date up to midnight the next day (00:00:00).
Basically, I want a function equivalent to ceiling for POSIX-formatted dates. As with the related question, I'm writing my own package, and I already have several package dependencies so I don't want to add more. Is there a simple way to do this in base R?
Maybe
trunc(x,"days") + 60*60*24
> x <- as.POSIXct(Sys.time())
> x
[1] "2012-08-09 18:40:08 BST"
> trunc(x,"days")+ 60*60*24
[1] "2012-08-10 BST"
A quick and dirty method is to convert to a Date (which truncates the time), add 1 (which is a day for Date) and then convert back to POSIX to be at midnight UTC on the next day. As #Joshua Ulrich points out, timezone/daylight savings issues may give results you don't expect:
as.POSIXct(as.Date(Sys.time())+1)
[1] "2012-08-10 01:00:00 BST"

what is the best practice of handling time in R?

I am working with a survey dataset. It has two string vectors, start and finish, indicating the time of the day when the interview was started, and finished, respectively.
They are character strings that look like: "9:24 am", "12:35 pm", and so forth. I am trying to calculate the duration of the interview based on these two. What is the best way of doing this?
I know that, for dates, there are lots of classes or functions like as.date(), as.Date(), chron(), or as.POSIXct(). So I was looking for something like as.time(), but could not find it. Should I just append a made-up date and convert the whole thing into a POSIX() date-time class, then use difftime()?
What is the best practice of handling time in R?
You need to use strptime() to convert the string to a date. For example:
strptime("9:24 am",format="%I:%M %p")
Then you can take differences just by taking one away from the other:
strptime("9:24 am",format="%I:%M %p")-strptime("12:14 am",format="%I:%M %p")
Time difference of 9.166667 hours
You can store this and then do an as.numeric() if you just want the number out, otherwise you can pass around the time objects.
Hope this helps!
one option is to use regular expressions. if you are not familiar with them, they are used to parse strings using patterns. i would research regular expressions and then here are the functions in r
hope it helps
best practice is using lubridate package
https://www.rdocumentation.org/packages/lubridate/versions/1.5.6/topics/hm
hm(c("09:10", "09:02", "1:10"))
## [1] "9H 10M 0S" "9H 2M 0S" "1H 10M 0S
Then use difftime for difference in the date time formats created above
https://stat.ethz.ch/R-manual/R-devel/library/base/html/difftime.html
difftime(time1, time2, tz,
units = c("auto", "secs", "mins", "hours",
"days", "weeks"))

Resources