I have data below for work hours which I need to compare - start and stop with date and time. I first extract the time portion of each as start and stop variables, then use the chron package to change them from factor data to something I can compare more easily.
require(chron)
eg_data3 <- data.frame(
id = c('42', '42', '42', '42', '42'),
time_in = as.factor(c('11/5/2017 13:52', '11/4/2017 14:25', '11/5/2017 15:30', '11/5/2017 17:10', '11/6/2017 18:20')),
time_out = as.factor(c('11/5/2017 13:59', '11/4/2017 14:59', '11/5/2017 16:00', '11/5/2017 17:45', '11/6/2017 18:50')))
eg_data3$start_time <- substring(strptime(eg_data3$time_in, format = "%m/%d/%Y %H:%M"),12,19)
eg_data3$end_time <- substring(strptime(eg_data3$time_out, format = "%m/%d/%Y %H:%M"),12,19)
eg_data3$end_time <- chron(times = eg_data3$end_time)
eg_data3$start_time <- chron(times = eg_data3$start_time)
Next, I generate another variable that holds the difference between stop time 1 and start time 2, i.e. the stop time in row 1 and the start time in row 2, to see the gap between them.
require(dplyr)
eg_data3 <- eg_data3 %>% group_by(id) %>% mutate(diff_outX0_inX1 = start_time - lag(end_time))
When I do this, the variable is formatted as a decimal. I cannot for the life of me get it to display as hh:mm:ss. I have tried specifying out.format as hh:mm:ss in chron, changing time_in / time_out to numeric and character before and after extraction and applying chron(times), changing the format of the diff_ variable after, etc.
What seems like a very simple question -
How do I get the result comparison (diff_outX0_inX1) variable to display as time, either hh:mm or hh:mm:ss? I know the formula to convert fractional days into minutes in Excel, but I'd prefer not to write out a two-step function; I assume it's a simple formatting issue.
Any help is appreciated.
EDIT - got flagged as a duplicate... OK. I asked if there was a way to do this that did not involve writing a function. The linked answer involves a function. The first comment provided a clean, simple answer. I can reproduce the answer in the comment; I could not reproduce the function myself, so it was not nearly as helpful. I also added another solution that does not require dplyr. Nowhere I looked online showed me something as simple as "just format the result with chron."
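For anyone landing here, a minimal base-R sketch of the same formatting idea. It is not the chron answer from the comments, just an equivalent that needs no extra packages. It assumes the gap is stored as a fraction of a day (which is how chron keeps times), is non-negative, and is shorter than 24 hours. The gap_days values are made up for illustration.

```r
# Hypothetical gaps expressed as fractions of a day (26, 31 and 70 minutes)
gap_days <- c(26, 31, 70) / (24 * 60)

# Convert to whole seconds, then let a UTC date-time do the hh:mm:ss formatting
gap_secs <- round(gap_days * 86400)
gap_hms <- format(as.POSIXct(gap_secs, origin = "1970-01-01", tz = "UTC"), "%H:%M:%S")
print(gap_hms)
```

Pinning tz = "UTC" matters here: with a local time zone the formatted clock time would be shifted by the zone offset.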
I've been provided a csv file with the date column as follows:
1990.12466
1990.20137
1990.2863
1990.36849
1990.45342
1990.53562
1990.62055
1990.70548
1990.78767
1990.8726
1990.95479
1991.03973
This is data I'll be using in Highcharts, and I can't seem to find any functionality to get these values into YYYYMMDD format.
It appears this data was made in R using something like the lubridate package, but I have no way of confirming this.
Any ideas on the best way to get this data into YYYYMMDD?
Assuming that the first four digits represent the year, and the digits after the decimal represent the percentage through the year, you can use the following formula to convert these values into a MS Excel date-time code: (with dates to be converted residing in column "A")
=DATE(MID(A1,1,4),1,1)+((A1-MID(A1,1,4))*(IF(OR(MOD(MID(A1,1,4),400)=0,AND(MOD(MID(A1,1,4),4)=0,MOD(MID(A1,1,4),100)<>0)),366,365)))
Once you have these MS Excel date-time codes, you can format the date in Excel to whatever format you need (see Format a date the way you want).
Something like this should work. First we linearly interpolate between the beginning of the year and the end of the year, and then we format the output into YYYYMMDD format as requested:
decimal_to_date = function(dt){
yr = floor(dt)
yr_begin = ISOdate(yr, 1, 1, 0, 0, 0)
yr_end = ISOdate(yr+1, 1, 1, 0, 0, 0)
interpolated_date = yr_begin + (yr_end - yr_begin) * (dt - yr)
return(format(interpolated_date, '%Y%m%d'))
}
Then for example decimal_to_date(1990.12466) returns 19900215 for February 15, 1990.
If you output the times as well as the dates, the time of day is always very near noon, which suggests something about the process that generated your data, although I'm not exactly sure what.
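To see the near-noon pattern for yourself, here is a small sketch that repeats the same interpolation but keeps the time of day in the output. It uses only the first three values from the question, and the variable names are illustrative.

```r
# First few values from the question
decimals <- c(1990.12466, 1990.20137, 1990.2863)

yr <- floor(decimals)
yr_begin <- ISOdate(yr, 1, 1, 0, 0, 0)      # POSIXct in GMT
yr_end <- ISOdate(yr + 1, 1, 1, 0, 0, 0)

# Same linear interpolation as decimal_to_date(), but keep the time of day
stamps <- yr_begin + (yr_end - yr_begin) * (decimals - yr)
stamp_text <- format(stamps, "%Y-%m-%d %H:%M")
print(stamp_text)
```

Each printed time should land within a minute or so of 12:00, which is consistent with the values having been generated from mid-day timestamps.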
For what it's worth, here's a very slight variation on Michael Lugo's answer, which indeed does the trick. The ISOdate() function outputs a date-time object. The following code uses as.Date(), which outputs a date only. It also takes a brief shortcut in calculating the number of days in a calendar year, which you'll need for the interpolation. This shortcut requires loading a library, however, which the original answer does not.
library(lubridate)
decimals <- c(1990.12466,1990.20137,1990.2863,1990.36849,1990.45342,1990.53562,1990.62055,1990.70548,1990.78767,1990.8726,1990.95479,1991.03973)
decimal_to_date2 = function(dt){
nDays <- yday(paste0(floor(dt),"-12-31"))
day1 <- as.Date(paste0(floor(dt),"-01-01"))
interpolated_date <- day1+(dt-floor(dt))*nDays
return(format(interpolated_date, '%Y%m%d'))
}
decimal_to_date2(decimals)
Results of first answer and mine are identical.
I am relatively new to R and I have a dataset in which I am trying to convert a date and time into a numeric value. The date and time are in the format 01JUN17:00:00:00 under a variable called pickup_datetime. I have tried using the code
cab_small_sample$pickup_datetime <- as.numeric(as.Date(cab_small_sample$pickup_datetime, format = '%d%b%y'))
but this doesn't incorporate the time. I tried adding the time codes to the format string, but that still did not work. Is there an R function that will convert the data into a numeric value?
R has two main time classes: "Date" and "POSIXct". POSIXct is a datetime class and you can get all the gory details at ?DateTimeClasses. The help page for the formats used when reading the data in, however, is ?strptime.
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")
cab_small_sample$pickup_dt <- as.numeric(as.POSIXct(cab_small_sample$pickup_datetime,
format = '%d%b%y:%H:%M:%S'))
cab_small_sample
# pickup_datetime pickup_dt
#1 01JUN17:00:00:00 1496300400 # seconds since 1970-01-01 UTC; the value depends on the session time zone
I find that a "destructive reassignment of values" is generally a bad idea so as a "my (best?) practice rule" I don't assign to the same column until I'm sure I have the code working properly. (And I always leave an untouched copy somewhere safe.)
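Because the numeric value depends on the time zone used for parsing, it may help to pin the zone explicitly. A minimal sketch, assuming the pickup times are meant as UTC (an assumption; substitute the zone your data was recorded in) and that the session locale uses English month abbreviations so that %b matches "JUN":

```r
# Two example strings in the question's format
pickup <- c("01JUN17:00:00:00", "01JUN17:11:00:00")

# Parse with an explicit time zone so the result does not depend on the machine
parsed <- as.POSIXct(pickup, format = "%d%b%y:%H:%M:%S", tz = "UTC")

# Seconds since 1970-01-01 00:00:00 UTC
pickup_num <- as.numeric(parsed)
print(pickup_num)
```

The two values differ by 39600 seconds, i.e. the 11 hours between the two pickups.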
lubridate is an extremely handy package for dealing with dates. It includes a variety of functions which do the date/time parsing for you, as long as you can provide the order of components. In this case, since your data is in day-month-year-hms form, you can use the dmy_hms function.
library(lubridate)
cab_small_sample <- dplyr::tibble(
pickup_datetime = c("01JUN17:00:00:00", "01JUN17:11:00:00"))
cab_small_sample$pickup_POSIX <- dmy_hms(cab_small_sample$pickup_datetime)
I'm trying to make a timeline like you'd make with any of the timevis, vistime, or timeline R packages, but I'm only interested in times and not dates. I don't mind putting a placeholder date in there, but it seems that all of these packages require the start and end times to include dates and include the date in the timeline.
I've been searching for ways to either not include dates in a timeline or only print the time but not the date in any of these package, but haven't been able to find anything. Does anyone have any ideas?
All of those packages use as.POSIXct under the hood, which needs a full date-time and doesn't work with times alone. So, if your data covers only one day, you can prepend a date to the clock times (using paste), and vistime, for example, will then display only the time (OK, with the date almost completely hidden in the corner):
library(vistime)
dat <- data.frame(event = 1:2,
start = c("14:00", "16:00"),
end = c("15:30", "17:00"))
# add a Date
dat[,c("start", "end")] <- sapply(dat[,c("start", "end")], function(x) paste(Sys.Date(), x))
vistime(dat)
I use vistime version 0.7.0.9000 which can be obtained by executing devtools::install_github("shosaco/vistime").
If you want to represent times without any date information, you should try out the package hms. It is part of the tidyverse collection and is described as:
A simple class for storing durations or time-of-day values and displaying them in the hh:mm:ss format.
Example use:
library(hms)
hms(56, 34, 12)
#> 12:34:56
How can I convert dates in R to a string without dashes, slashes, or letters, and times without colons? For example, I can get 2017-12-07 in R, but I need 201712071520 to use in a Weather API call. How can I do that? For reference, please see the example call below for startDateTime and endDateTime. I would like to convert the dates that I have into 20171207 format and append a fixed time (1520) without the colon. Thanks for helping!
I have been told this question has been asked before, but the other examples do the opposite: they convert character strings into R dates and times.
Here is an example of the API I am calling:
https://api.weather.com/v3/wx/hod/conditions/historical/point?pointType=nearest&geocode=39.86,-104.67&startDateTime=201712071520&endDateTime=201712071520&units=e&format=json&apiKey=yourApiKey
Moved from comments.
If x is of R's "Date" class then use the indicated format statement:
x <- as.Date("2017-12-07") # test input
format(x, "%Y%m%d1520")
## [1] "201712071520"
See ?strptime for more on percent codes.
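If the time of day is not fixed, the same format() approach works on a date-time. A small sketch, assuming the value is held as POSIXct and that UTC is the intended zone (both are assumptions; the query string below only mirrors the parameter names from the example call):

```r
# A date-time instead of a Date, so the clock time comes from the data
stamp <- as.POSIXct("2017-12-07 15:20", tz = "UTC")

# %H%M appends the hour and minute without a colon
compact <- format(stamp, "%Y%m%d%H%M")

# Build the two query parameters used in the example call
query <- paste0("startDateTime=", compact, "&endDateTime=", compact)
print(query)
```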
This is a slightly more generic solution. It would look like this:
library(lubridate)
input_date = "2017-1-7" #intentionally taking different date to make it more generic
fixed_text = "1520"
input_date = ymd(input_date)
output_date = paste(year(input_date), sprintf(fmt = '%02d', month(input_date)), sprintf(fmt = '%02d', day(input_date)), fixed_text, sep = "")
print(output_date)
I have a dataset with date-times stored like this: "{datetime:2015-07-01 09:10:00". I want to remove the text and keep both the date and the time, since as.Date returns only the date. I wrote the code below, but the second line (the one with strsplit) returns only the date-time of the first row and overwrites all the others. I would like to get ALL my date-times, not only the first. I thought about using sapply or a for loop, but I get many errors and can't make it work. I am new to R, so I don't really know the best way to do this.
Could you help me, please? If you have another idea for the date and time format, or a simpler way to do this, that would be very welcome too.
data$`Date Time`=as.character(data$`Date Time`)
data$`Date Time`=unlist(strsplit(data[,1], split='e:'))[2]
date=substr(data$`Date Time`,0,10)
date=as.Date(date)
time=substr(data$`Date Time`,12,19)
data$Date=date
data$Time=time
Thank you very much for your help!
You could use the format argument to avoid all the strsplit:
times <- as.POSIXct(data$`Date Time`, format='{datetime:%Y-%m-%d %H:%M:%S')
(The reason for the "{datetime:" in the format is because you mentioned this is the format of your strings).
This object has both date and time in it, and then you can just store it in the dataframe as a single column of type POSIXct rather than two columns of type string e.g.
data$datetime <- times
but if you do want to store the date as a Date and the time as a string (as in your example above):
data$Date <- as.Date(times)
data$Time <- strftime(times, format='%H:%M:%S')
See ?as.Date, ?as.POSIXct, ?strptime for more details on that format argument and various conversions between date and string.
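A self-contained sketch of the same approach on a small vector, to show that every element is parsed rather than only the first. The raw strings are invented to match the format in the question, and tz = "UTC" is an assumption made so the result is reproducible.

```r
# Made-up input in the question's format
raw <- c("{datetime:2015-07-01 09:10:00", "{datetime:2015-07-02 18:45:30")

# The literal prefix is part of the format string, so no strsplit is needed
times <- as.POSIXct(raw, format = "{datetime:%Y-%m-%d %H:%M:%S", tz = "UTC")

# Split into a Date column and a character time column, one value per row
date_part <- as.Date(times)
time_part <- format(times, "%H:%M:%S")

# Alternative: strip the prefix first and keep a plain character vector
clean <- sub("^\\{datetime:", "", raw)

print(data.frame(Date = date_part, Time = time_part, clean = clean))
```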