I have a data frame with a column labeled Time that indicates when an order was placed. All of the Time values are in a 07:16:00 format. I would like to convert these from character to numeric so that I can add another column giving the type of menu, based on the time.
Using as.numeric, all the values become NA. I also tried strptime, and the values were also NA.
One approach is to use lubridate's hms():
Transforms a character or numeric vector into a period object with
the specified number of hours, minutes, and seconds.
library(lubridate)
times <- c("07:16:00", "07:18:12", "08:56:00")
new_time <- hms(times)
new_time - new_time[1]
[1] "0S" "2M 12S" "1H 40M 0S"
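To get from the period object to the menu column the question asks for, one option is to convert the periods to seconds since midnight and bin them. This is only a sketch: the `orders` data frame, the menu names, and the 11:00 and 16:00 cutoffs are made up for illustration.

```r
library(lubridate)

# Hypothetical example data; only the Time column comes from the question
orders <- data.frame(Time = c("07:16:00", "11:45:30", "18:05:00"),
                     stringsAsFactors = FALSE)

# Seconds since midnight as a plain numeric vector
orders$secs <- period_to_seconds(hms(orders$Time))

# Assumed cutoffs: breakfast before 11:00, lunch before 16:00, dinner after
orders$Menu <- cut(orders$secs,
                   breaks = c(0, 11, 16, 24) * 3600,
                   labels = c("breakfast", "lunch", "dinner"),
                   right = FALSE)
orders
```

With `right = FALSE` each bin includes its lower bound, so an order at exactly 11:00:00 would count as lunch.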
I have a list of times that are the output of a model. They are in decimal format and represent the time in hours since the model started running, but the output gives them no units; they are just numbers.
What I would like to do is convert these numbers into a timestamp format, treating the numbers as hours the model has run, specifically as either a POSIXct or Date variable so I can start using the bupaR library.
So for example I would like to convert 1.75 into 1 hour 45 minutes, and 25.5 into a time that equals 1 day, 1 hour and 30 minutes.
Thanks for any help
One possible way:
We could use the seconds_to_period() function from lubridate after multiplying by 3600 to get seconds:
library(lubridate)
x <- 1.75
seconds_to_period(x*3600)
#[1] "1H 45M 0S"
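If a POSIXct is needed rather than a period, the hours can be added to a start time in base R, since adding a number to a POSIXct adds that many seconds. A minimal sketch, assuming a made-up model start of 2020-01-01 00:00:00 UTC (the question does not give one):

```r
model_hours <- c(1.75, 25.5)

# Hypothetical start time of the model run
start <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")

# POSIXct arithmetic is in seconds, so convert hours to seconds first
stamps <- start + model_hours * 3600
format(stamps, "%Y-%m-%d %H:%M:%S")
#[1] "2020-01-01 01:45:00" "2020-01-02 01:30:00"
```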
In R I have this data.frame
24:43:30 23:16:02 14:05:44 11:44:30 ...
Note that some of the times are over 24:00:00! In fact all my times are within 02:00:00 to 25:59:59.
I want to subtract 2 hours from every entry in my dataset data. This way I get a regular dataset. How can I do this?
I tried this
strptime(data, format="%H:%M:%S") - 2*60*60
and this works for all entries below 23:59:59. For all entries above that I simply get NA, since strptime produces NA for every entry above 23:59:59.
Using lubridate package can make the job easier!
> library(lubridate)
> t <- '24:43:30'
> hms(t) - hms('2:0:0')
[1] "22H 43M 30S"
Update:
Converting the date back to text!
> substr(strptime(hms(t) - hms('2:0:0'),format='%HH %MM %SS'),12,20)
[1] "22:43:30"
Adding @RHertel's update:
format(strptime(hms(t) - hms('2:0:0'),format='%HH %MM %SS'),format='%H:%M:%S')
Better way of formatting the lubridate object:
s <- hms('02:23:58') - hms('2:0:0')
paste(hour(s),minute(s),second(s),sep=":")
"0:23:58"
Although the answer by @amrrs solves the main problem, the formatting could remain an issue because hms() does not provide a uniform output. This is best shown with an example:
library(lubridate)
hms("01:23:45")
#[1] "1H 23M 45S"
hms("00:23:45")
#[1] "23M 45S"
hms("00:00:45")
#[1] "45S"
Depending on the time passed to hms(), the output may or may not contain an entry for the hours and for the minutes. Moreover, leading zeros are omitted in single-digit values of hours, minutes and seconds. This can turn into a formatting nightmare if one tries to put that data into a common form.
To resolve this difficulty one could first convert the time into a duration with lubridate's as.duration() function. Then, the duration in seconds can be transformed into a POSIXct object from which the hours, minutes, and seconds can be extracted easily with format():
times <- c("24:43:30", "23:16:02", "14:05:44", "11:44:30", "02:00:12")
shifted_times <- hms(times) - hms("02:00:00")
format(.POSIXct(as.duration(shifted_times),tz="GMT"), "%H:%M:%S")
#[1] "22:43:30" "21:16:02" "12:05:44" "09:44:30" "00:00:12"
The last entry "02:00:12" would have caused difficulties if shifted_times had been passed to strptime().
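For comparison, the same two-hour shift can be sketched in base R without lubridate by splitting the strings and doing the arithmetic in seconds. This assumes every entry has the form HH:MM:SS and is at least 02:00:00, as stated in the question; the variable names are only illustrative.

```r
times <- c("24:43:30", "23:16:02", "14:05:44", "11:44:30", "02:00:12")

# Split "HH:MM:SS" into a numeric matrix with one row per time
parts <- do.call(rbind, lapply(strsplit(times, ":"), as.numeric))

# Total seconds, shifted back by two hours
secs <- parts[, 1] * 3600 + parts[, 2] * 60 + parts[, 3] - 2 * 3600

# Rebuild zero-padded HH:MM:SS strings
shifted <- sprintf("%02d:%02d:%02d",
                   as.integer(secs %/% 3600),
                   as.integer((secs %% 3600) %/% 60),
                   as.integer(secs %% 60))
shifted
#[1] "22:43:30" "21:16:02" "12:05:44" "09:44:30" "00:00:12"
```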
I am attempting to eliminate the leading zero of a 12-hour time value, but for graphing purposes the result must be a POSIXlt value. Therefore, I cannot use regular expressions because they would leave the result as a character instead of a POSIXlt value.
My time value begins as a character.
a <- "02:57"
Then I use strptime to convert the character to the POSIXlt class. Within strptime, I use the conversion specification %l, which according to the strptime help, displays "12-hour clock time with single digits preceded by a blank".
b <- strptime(x = a, tz = "UTC", format = "%l")
The variable b is a POSIXlt value, and consists of "current date" + "02:57:00" + "local time zone". I can live with the date and time zone, but the leading zero of the 12-hour time value remains.
How can I eliminate the leading zero of the 12-hour time value and still retain POSIXlt class?
I appreciate any insight.
I would use lubridate, an excellent package for working with POSIXlt time objects.
library(lubridate)
a <- "02:57"
b <- hm(a)
yields
> b
[1] "2H 57M 0S"
no pesky 0 and sooo much other time goodness to boot. Good luck.
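If the value really has to stay POSIXlt for plotting, another possibility is to keep the POSIXlt object as the data and strip the zero only when building the text label. This is a sketch of that idea in base R, not something from the answer above; it assumes the leading zero matters only for display, and it uses the `%I:%M` format rather than the `%l` from the question.

```r
a <- "02:57"

# Parse hour and minute; b is POSIXlt and carries today's date
b <- strptime(a, format = "%I:%M", tz = "UTC")
class(b)

# Drop the leading zero only in the character label used for display
label <- sub("^0", "", format(b, "%I:%M"))
label
#[1] "2:57"
```

The object `b` can then be used for positioning on the time axis, and `label` for the tick or annotation text.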
I have a data set, rfm, which has a column named cohort, of class Date as shown below. I also have a vector of calendar dates, rather cleverly named dates, whose elements are also of class Date:
> class(rfm)
[1] "tbl_df" "data.frame"
> class(rfm$cohort)
[1] "Date"
> class(dates)
[1] "Date"
Both rfm$cohort and dates show the date of the first of the month over a certain time period, with the possibility that dates covers a more recent set of months. My problem is simple: I just want to see how many months there are between max(rfm$cohort) and max(dates).
The lubridate package makes this easy with the interval() function, but the arguments of that function must be POSIXct, not Date objects:
> as.period(interval(ymd(as.character(max(rfm$cohort))),ymd(as.character(max(dates)))), months)
[1] "1m 0d 0H 0M 0S"
But do I really need a chained call to ymd() and as.character()? Wouldn't as.POSIXct() suffice? Here's a try:
> as.period(interval(as.POSIXct(max(rfm$cohort), tz = 'GMT'),as.POSIXct(max(dates), tz = 'GMT')), months)
Error in while (any(start + est * per < end)) est[start + est * per < :
missing value where TRUE/FALSE needed
This didn't work. It seems that lubridate wants me to set the time zone for the interval, not separately for its ends. Like this:
> as.period(interval(as.POSIXct(max(rfm$cohort)),as.POSIXct(max(dates)), tz = 'GMT'), months)
[1] "1m 0d 0H 0M 0S"
I know that this is less typing, so I should just do what lubridate wants, but why wouldn't setting the time zone of interval ends separately also work? By my reading of it, ?interval suggests that the third argument of this function, tzone, should default to getting its value from the time zone of the first. Somehow as.POSIXct() won't reveal the time zone to interval(), even if it's explicitly set in the same call. What am I missing?
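As a side note, when both values are first-of-month Dates, the month difference can also be computed in base R without any time zone handling, from the year and month fields of POSIXlt. A minimal sketch; the two dates are made-up stand-ins for max(rfm$cohort) and max(dates):

```r
# Hypothetical stand-ins for max(rfm$cohort) and max(dates)
cohort_max <- as.Date("2015-03-01")
dates_max  <- as.Date("2015-04-01")

lt1 <- as.POSIXlt(cohort_max)
lt2 <- as.POSIXlt(dates_max)

# $year counts years since 1900 and $mon runs from 0 to 11
months_between <- (lt2$year - lt1$year) * 12 + (lt2$mon - lt1$mon)
months_between
#[1] 1
```

This ignores the day of the month, which is fine here only because both vectors are described as holding the first of each month.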
I have a data set in R that contains a column of dates in the format yyyy/mm/dd. I am trying to use as.Date to convert these dates to date objects in R. However, I cannot seem to find the correct argument for origin to input into as.Date. The following code is an example of what I have been trying. I am using a CSV file from Excel, so I used origin="1899/12/30" based on other sites I have looked at.
> as.Date(2001/04/26, origin="1899/12/30")
[1] "1900-01-18"
However, this is not working since the input date 2001/04/26 is returned as "1900-01-18". I need to convert the dates into date objects so I can then convert the dates into julian dates.
You can either use as.Date with a numeric value, or with a character value. When you type just 2001/04/26 into R, that's doing division and getting 19.24 (a numeric value). Numeric values require an origin, and the number you supply is the offset from that origin. So you're getting 19 days away from your origin, i.e. "1900-01-18". A date like Apr 26 2001 would be
as.Date(37007, origin="1899-12-30")
# [1] "2001-04-26"
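The arithmetic can be checked directly at the console. A quick sketch using only the values from the question:

```r
# Unquoted, 2001/04/26 is just two divisions
x <- 2001/04/26
round(x, 2)
# [1] 19.24

# Interpreted as a day offset from the Excel origin
format(as.Date(x, origin = "1899-12-30"))
# [1] "1900-01-18"
```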
If your dates from Excel "look like" dates, chances are they are character values (or factors). To convert a character value to a Date with as.Date() you want to specify a format. Here
as.Date("2001/04/26", format="%Y/%m/%d")
# [1] "2001-04-26"
See ?strptime for details on the special % variables. Now if you've read your data into a data.frame with read.table or something, there's a chance your variable may be a factor. If that's the case, you'll want to convert to character with
as.Date(as.character(mydf$datecol), format="%Y/%m/%d")
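Putting the pieces together, here is a small sketch that goes from a factor column to Date and then to a day-of-year number. The `mydf` contents are invented, and reading "julian date" as day of the year is an assumption; if days since a fixed origin are wanted instead, base R's julian() may be closer.

```r
# Hypothetical data frame with dates stored as a factor
mydf <- data.frame(datecol = factor(c("2001/04/26", "2001/12/31")))

# Factor -> character -> Date
d <- as.Date(as.character(mydf$datecol), format = "%Y/%m/%d")
d
# [1] "2001-04-26" "2001-12-31"

# Day of the year (assumed meaning of "julian date" here)
doy <- as.numeric(format(d, "%j"))
doy
# [1] 116 365
```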