Suppose I have the following data.frame foo
start.time duration
1 2012-02-06 15:47:00 1
2 2012-02-06 15:02:00 2
3 2012-02-22 10:08:00 3
4 2012-02-22 09:32:00 4
5 2012-03-21 13:47:00 5
And class(foo$start.time) returns
[1] "POSIXct" "POSIXt"
I'd like to create a plot of foo$duration v. foo$start.time. In my scenario, I'm only interested in the time of day rather than the actual day of the year. How does one go about extracting the time of day as hours:minutes from a POSIXct vector?
This is a good question, and it highlights some of the difficulty of dealing with dates in R. The lubridate package is very handy, so below I present two approaches, one using base R (as suggested by @RJ-) and the other using lubridate.
Recreate (the first three rows of) the data frame in the original post:
foo <- data.frame(start.time = c("2012-02-06 15:47:00",
"2012-02-06 15:02:00",
"2012-02-22 10:08:00"),
duration = c(1,2,3))
Convert to a date-time class (two ways to do this; strptime returns POSIXlt, ymd_hms returns POSIXct)
# using base::strptime
t.str <- strptime(foo$start.time, "%Y-%m-%d %H:%M:%S")
# using lubridate::ymd_hms
library(lubridate)
t.lub <- ymd_hms(foo$start.time)
Now, extract time as decimal hours
# using base::format
h.str <- as.numeric(format(t.str, "%H")) +
as.numeric(format(t.str, "%M"))/60
# using lubridate::hour and lubridate::minute
h.lub <- hour(t.lub) + minute(t.lub)/60
Demonstrate that these approaches are equal:
identical(h.str, h.lub)
Then choose one of above approaches to assign decimal hour to foo$hr:
foo$hr <- h.str
# If you prefer, the choice can be made at random:
foo$hr <- if(runif(1) > 0.5){ h.str } else { h.lub }
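then plot using the ggplot2 package (foo$hr holds numeric decimal hours, so a continuous scale applies rather than a date-time one):
library(ggplot2)
# Label each axis break as HH:MM
qplot(foo$hr, foo$duration) +
scale_x_continuous(labels = function(h) sprintf("%02d:%02d", h %/% 1, round(h %% 1 * 60)))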
You could rely on base R:
# Using R 2.14.2
# The same toy data
foo <- data.frame(start.time = c("2012-02-06 15:47:00",
"2012-02-06 15:02:00",
"2012-02-22 10:08:00"),
duration = c(1,2,3))
Since class POSIXct contains date-time information in a structured manner, you can rely on substr to extract the characters in time positions within the POSIXct vector. That is, given you know the format of your POSIXct (how it would be presented when printed), you can extract hours and minutes:
# Extract hour and minute as a character vector, of the form "%H:%M"
substr(foo$start.time, 12, 16)
Then paste it onto an arbitrary date to convert it back to POSIXct. The example uses 1 January 2012; if you supply no date and pass only a time format, R fills in the current date.
# Store time information as POSIXct, using an arbitrary date
foo$time <- as.POSIXct(paste("2012-01-01", substr(foo$start.time, 12, 16)))
And both plot and ggplot2 know how to format times in POSIXct out of the box.
# Plot it using base graphics
plot(duration~time, data=foo)
# Plot it using ggplot2 (0.9.2.1)
library(ggplot2)
qplot(x=time, y=duration, data=foo)
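If start.time is already POSIXct rather than character, a minimal sketch of the same idea is to let format() pull out the time of day instead of counting character positions. The UTC time zone and the names hhmm and time.only are assumptions for illustration, chosen so the result does not depend on the local time zone.

```r
# Assumption: start.time is POSIXct; tz = "UTC" is fixed only for reproducibility
start.time <- as.POSIXct(c("2012-02-06 15:47:00", "2012-02-22 10:08:00"), tz = "UTC")

# Time of day as "HH:MM" text
hhmm <- format(start.time, "%H:%M")

# Re-attach an arbitrary date so plotting functions see a POSIXct time axis
time.only <- as.POSIXct(paste("2012-01-01", hhmm), tz = "UTC")
hhmm
```

Unlike substr, this does not rely on the printed representation having the time in fixed character positions.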
Lubridate doesn't handle time of day data, so Hadley recommends the hms package for this type of data. Something like this would work:
library(lubridate)  # hour(), minute(), second()
library(readr)      # parse_datetime()
library(dplyr)      # %>% and mutate()
foo <- data.frame(start.time = parse_datetime(c("2012-02-06 15:47:00",
"2012-02-06 15:02:00",
"2012-02-22 10:08:00")),
duration = c(1,2,3))
foo<-foo %>% mutate(time_of_day=hms::hms(second(start.time),minute(start.time),hour(start.time)))
Watch out for two potential issues: 1) lubridate has a different function called hms, and 2) hms::hms takes its arguments in the opposite order to that suggested by its name (so that seconds alone may be supplied).
This code is much faster than converting to string and back to numeric
time <- c("1979-11-13T08:37:19-0500", "2014-05-13T08:37:19-0400");
time.posix <- as.POSIXct(time, format = "%Y-%m-%dT%H:%M:%S%z");
time.epoch <- as.vector(unclass(time.posix));
time.poslt <- as.POSIXlt(time.posix, tz = "America/New_York");
time.hour.new.york <- time.poslt$hour + time.poslt$min/60 + time.poslt$sec/3600;
> time;
[1] "1979-11-13T08:37:19-0500" "2014-05-13T08:37:19-0400"
> time.posix;
[1] "1979-11-13 15:37:19 IST" "2014-05-13 15:37:19 IDT"
> time.poslt;
[1] "1979-11-13 08:37:19 EST" "2014-05-13 08:37:19 EDT"
> time.epoch;
[1] 311348239 1399984639
> time.hour.new.york;
[1] 8.621944 8.621944
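When the time of day wanted is in UTC, the same decimal hour can be sketched with plain arithmetic on the epoch seconds, with no POSIXlt conversion at all. This is only an illustration under that assumption: it ignores time zones and daylight saving, so for local times the POSIXlt route is the safer one. The variable names are hypothetical.

```r
# Assumption: the time of day is wanted in UTC
x <- as.POSIXct("2014-05-13 08:37:19", tz = "UTC")

# POSIXct stores seconds since 1970-01-01 00:00:00 UTC,
# so the remainder modulo one day is seconds since UTC midnight
secs.since.midnight <- as.numeric(x) %% 86400
hour.utc <- secs.since.midnight / 3600
hour.utc
```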
This is an ancient topic, but I have found very few questions and answers on the matter. My solution is the following:
library(ggplot2)
library(hms)
foo <- data.frame(start.time = c("2012-02-06 15:47:00",
"2012-02-06 15:02:00",
"2012-02-22 10:08:00"),
duration = c(1,2,3))
foo$start.time = as.POSIXct( foo$start.time )
# as_hms() replaces the deprecated as.hms()
g1 = ggplot( ) + xlab("") +
geom_line( data = foo, aes(x = as_hms(start.time), y = duration ), color = "steelblue" )
g1
If you would like to add manual time (!) breaks, then
time_breaks = as.POSIXlt(c(
"2012-02-06 12:35:00 MSK",
"2012-02-06 13:15:00 MSK",
"2012-02-06 14:22:00 MSK",
"2012-02-06 15:22:00 MSK"))
g1 +
scale_x_time( breaks = as_hms( time_breaks ) ) +
theme( axis.text.x = element_text( angle=45, vjust=0.25) )
I am trying to convert numeric values into times and dates. I am working with a data set, so it would be appreciated if you could show an example using a data set.
Here are some examples: converting 93537 into 09:35:37 (HH:MM:SS). Additionally, I need to convert 220703 into 22-07-03 (YY-MM-DD).
I will add an example of my code below:
CPLF_data$HMS <- substr(as.POSIXct(sprintf("%04.0f", CPLF_data$StartTime), format='%H%M%S'), 12, 16)
CPLF_data$YMD <- as.POSIXct(CPLF_data$Date, tz="UTC", origin ="1970-01-01", format ="%Y-%M-%D")
The first line is correct; however, it does not show seconds.
The second line is incorrect.
Thank you.
I want my final product to be a new column with the times and dates in the correct format with their own columns.
Use chron times class to get the times or if a character string is wanted use as.character on that. Use as.Date to get a Date class object. The sub puts colons between the parts of the time after which we can convert it to times class. The sprintf pads the date with 0 on the left if it is only 5 characters and otherwise leaves it as 6 characters and then we convert that to Date class.
library(chron)
time <- 93537
date <- 220703
tt <- times(sub("(..)(..)$", ":\\1:\\2", time))
tt
## [1] "09:35:37"
as.character(tt)
## [1] "09:35:37"
dd <- as.Date(sprintf("%06d", date), "%y%m%d")
dd
## [1] "2022-07-03"
as.character(dd)
## [1] "2022-07-03"
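For whole columns, and without chron, a base R sketch of the same padding idea could look like the following. The vectors start_time and date_num are hypothetical stand-ins for the StartTime and Date columns, and the times are assumed to be integers of at most six digits.

```r
# Hypothetical stand-ins for CPLF_data$StartTime and CPLF_data$Date
start_time <- c(93537L, 153000L)
date_num   <- c(220703L, 220704L)

# Pad to six digits, then put colons between the HH, MM and SS pairs
hms_chr <- sub("(..)(..)(..)", "\\1:\\2:\\3", sprintf("%06d", start_time))

# Pad to six digits and parse as two-digit year, month, day
ymd <- as.Date(sprintf("%06d", date_num), format = "%y%m%d")

hms_chr
ymd
```

The result of sub() is a character vector, not a time class, which is enough when the goal is only a correctly formatted column.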
Try the ymd_hms function in the lubridate package.
output$datetime <- ymd_hms(paste(input$year, input$month, input$day,
input$HH, input$MM, input$SS, sep="-"))
You can enter 00 if you don't have seconds, for example ....
Base R does not have a class for just "time" (of day); as.POSIXct doesn't deal with "times", it deals with "date-times". The lubridate package does give number-like HMS values, which may be relevant, but since each row has both a date and a time, it seems more useful to combine them instead of putting them into separate columns.
CPLF_data |>
transform(
StartTime = as.numeric(StartTime),
Date = as.numeric(Date)
) |>
transform(
DateTime = ISOdate(
2000 + Date %/% 10000, (Date %% 10000) %/% 100, Date %% 100,
StartTime %/% 10000, (StartTime %% 10000) %/% 100, StartTime %% 100)
)
# StartTime Date DateTime
# 1 93537 220703 2022-07-03 09:35:37
Note: I'm assuming that all years are 2-digits and at/after 2000. If this is not true, it's not difficult to work around it with some custom code. Also, over to you if you want to set the timezone of this timestamp by adding tz="US/Mountain" or whichever is more appropriate for the data.
Data
CPLF_data <- data.frame(StartTime = "93537", Date = "220703")
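The integer division and modulo steps passed to ISOdate can be checked on their own. This small sketch just splits the two example numbers into their parts; the names n, d, time_parts and date_parts are illustrative only.

```r
n <- 93537   # HHMMSS stored as a number
d <- 220703  # YYMMDD stored as a number

time_parts <- c(hour   = n %/% 10000,
                minute = (n %% 10000) %/% 100,
                second = n %% 100)

date_parts <- c(year  = 2000 + d %/% 10000,   # assumes years at/after 2000
                month = (d %% 10000) %/% 100,
                day   = d %% 100)

time_parts
date_parts
```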
I have a vector of date strings in the form month_name-2_digit_year i.e.
a = rbind("April-21", "March-21", "February-21", "January-21")
I'm trying to convert that vector into a vector of date objects. I'm aware this question is very similar to this: Convert non-standard date format to date in R posted some years ago, but unfortunately, it has not answered my question.
I have tried the following as.Date() calls to do this, but it just returns a vector of NA. I.e.
b = as.Date(a, format = "%B-%y")
b = as.Date(a, format = "%B%y")
b = as.Date(a, "%B-%y")
b = as.Date(a, "%B%y")
I've also attempted to do it using the convertToDate function from the openxlsx package:
b = convertToDate(a, format = "%B-%y")
I have also tried all the above but using a single character string rather than a vector, but that produced the same issue.
I'm a little lost as to why this isn't working, as this format has worked in reverse earlier in my script (that is, I had a date object already in dd-mm-yyyy format and converted it to month_name-yy using %B-%y). Is there another way to go from string to date when the string is in a non-standard date format (anything other than dd-mm-yyyy, or mm-dd-yy if you're in the US)?
For the record, my R locales are all UK and English.
Thanks in advance.
A Date must have all three of day, month and year. Convert to yearmon class which requires only month and year and then to Date as in (1) and (2) below or add the day as in (3).
(1) and (3) give first of month and (2) gives the end of the month.
(3) uses only functions from base R.
Also consider not converting to Date at all but just use yearmon objects instead since they directly represent a year and month which is what the input represents.
library(zoo)
# test input
a <- c("April-21", "March-21", "February-21", "January-21")
# 1
as.Date(as.yearmon(a, "%B-%y"))
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# 2
as.Date(as.yearmon(a, "%B-%y"), frac = 1)
## [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
# 3
as.Date(paste(1, a), "%d %B-%y")
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
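To see the "all three of day, month and year" point in isolation, here is a short base R sketch. It assumes an English locale for month names, as in the question; the names no_day and with_day are illustrative.

```r
a <- c("April-21", "March-21")

# Month and year only: the day is missing, so parsing gives NA
no_day <- as.Date(a, format = "%B-%y")

# Prepending a day lets the same month/year format parse
with_day <- as.Date(paste0("01-", a), format = "%d-%B-%y")

no_day
with_day
```

This also explains why the format worked "in reverse": format() can always drop the day from a complete Date, but parsing cannot invent one.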
In addition to zoo, which @G. Grothendieck mentioned, you can also use clock or lubridate.
clock supports a variable precision calendar type called year_month_day. In this case you'd want "month" precision, then you can set the day to whatever you'd like and convert back to Date.
library(clock)
x <- c("April-21", "March-21", "February-21", "January-21")
ymd <- year_month_day_parse(x, format = "%B-%y", precision = "month")
ymd
#> <year_month_day<month>[4]>
#> [1] "2021-04" "2021-03" "2021-02" "2021-01"
# First of month
as.Date(set_day(ymd, 1))
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# End of month
as.Date(set_day(ymd, "last"))
#> [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
The simplest solution may be to use lubridate::my(), which parses strings in the order of "month then year". That assumes that you want the first day of the month, which may or may not be correct for you.
library(lubridate)
x <- c("April-21", "March-21", "February-21", "January-21")
# Assumes first of month
my(x)
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
I would like to create a vector of dates between two specified moments in time with step 1 month, as described in this thread (Create a Vector of All Days Between Two Dates), to be then converted into factors for data visualization.
However, I'd like to have the dates in the YYYY-Mon format, i.e. 2010-Feb. So far I have only managed to get the dates in the standard format 2010-02-01, using code like this:
require(lubridate)
first <- ymd_hms("2010-02-07 15:00:00 UTC")
start <- ymd(floor_date(first, unit="month"))
last <- ymd_hms("2017-10-29 20:00:00 UTC")
end <- ymd(ceiling_date(last, unit="month"))
> start
[1] "2010-02-01"
> end
[1] "2017-11-01"
How can I change the format to YYYY-Mon?
You can use format():
library(magrittr)  # provides %>% (also re-exported by dplyr)
start %>% format('%Y-%b')
To create the vector, use seq():
seq(start, end, by = 'month') %>% format('%Y-%b')
Note: Use capital 'B' for the full month name: '%Y-%B'.
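Since the labels are meant to become factors for plotting, it may help to fix the level order explicitly; otherwise factor() sorts the labels alphabetically and "2010-Apr" would come before "2010-Feb". A minimal base R sketch, with a shortened date range and illustrative variable names:

```r
start <- as.Date("2010-02-01")
end   <- as.Date("2010-06-01")

months <- seq(start, end, by = "month")
labels <- format(months, "%Y-%b")   # month abbreviations depend on the locale

# Keep chronological order by passing the levels in sequence order
f <- factor(labels, levels = labels)
levels(f)
```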
Help me find the difference between two times. For example, given these dates and times:
2015-11-24 16:49:14
2014-12-02 16:52:43
I need the result in HH:MM:SS format using R.
Since you need the difference between the times only, ignoring the dates, you can first extract the time using strptime
x <- strptime(substr(a, 12, 19), format="%H:%M:%S")
y <- strptime(substr(b, 12, 19), format="%H:%M:%S")
Then using the seconds_to_period function of lubridate package you can get the time difference and then format the output using sprintf
library(lubridate)
temp <- seconds_to_period(as.numeric(difftime(y, x, units = "secs")))
sprintf('%02d:%02d:%02d', hour(temp), minute(temp), second(temp))
# [1] "00:03:29"
data
a <- as.POSIXct("2015-11-24 16:49:14")
b <- as.POSIXct("2014-12-02 16:52:43")
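The same HH:MM:SS result can be sketched in base R alone with integer division and modulo. This version assumes both values are in UTC (so that seconds since midnight can be taken directly from the epoch seconds) and reports the absolute difference; tod is a hypothetical helper.

```r
# Assumption: both date-times are in UTC
a <- as.POSIXct("2015-11-24 16:49:14", tz = "UTC")
b <- as.POSIXct("2014-12-02 16:52:43", tz = "UTC")

# Hypothetical helper: seconds since midnight (time of day only)
tod <- function(x) as.integer(as.numeric(x) %% 86400)

d <- abs(tod(b) - tod(a))
out <- sprintf("%02d:%02d:%02d", d %/% 3600L, (d %% 3600L) %/% 60L, d %% 60L)
out
```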
The following code gets the difference:
library(lubridate)
interval(ymd_hms("2015-11-24 16:17:38"),ymd_hms("2015-11-24 14:19:44"))
span<-interval(as.POSIXct("2015-11-24 16:17:38"),
as.POSIXct("2015-11-24 14:19:44"))
as.period(span)
Format of the answer:
> -1H -57M -54S
as.period() also reports the difference in years, months and days when the interval spans them.
I have data for more than 3 years. For each date I want to find the day of the year, counted from January 1 of that year. For example:
> x <- c('5/5/2007','12/31/2007','1/2/2008')
> #Convert to day of year (julian date) –
> strptime(x,"%m/%d/%Y")$yday+1
[1] 125 365 2
I want to know how to do the same thing with the time included, but I still get only the day, not the time. Can anyone suggest a better way to find the Julian date from a date and time?
> x1 <- c('5/5/2007 02:00','12/31/2007 05:58','1/2/2008 16:25')
> #Convert to day of year (julian date) –
> strptime(x1,"%m/%d/%Y %H:%M")$yday+1
[1] 125 365 2
Rather than this result, I want the output in decimal days. For example the first example would be 125.0833333 and so on.
Thank you so much.
Are you hoping to get the day + a numerical part of a day as output? If so, something like this will work:
test <- strptime(x1,"%m/%d/%Y %H:%M")
(test$yday+1) + (test$hour/24) + (test$min/(24*60))
#[1] 125.083333 365.248611 2.684028
Although this matches what you ask for, I think removing the +1 might make more sense:
(test$yday) + (test$hour/24) + (test$min/(24*60))
#[1] 124.083333 364.248611 1.684028
Though my spidey senses are tingling that Dirk is going to show up and show me how to do this with a POSIXct date/time representation.
Here is an attempt at such an answer using base functions:
mapply(julian, as.POSIXct(test), paste(format(test,"%Y"),"01","01",sep="-"))
#[1] 124.083333 364.248611 1.684028
You can also use the POSIXct and POSIXlt representations along with the firstof function from xts.
x1 <- c("5/5/2007 02:00", "12/31/2007 05:58", "1/2/2008 16:25")
x1
## [1] "5/5/2007 02:00" "12/31/2007 05:58" "1/2/2008 16:25"
y <- as.POSIXlt(x1, format = "%m/%d/%Y %H:%M")
result <- mapply(julian, x = as.POSIXct(y), origin = firstof(y$year + 1900))
result
## [1] 124.083333 364.248611 1.684028
If you don't want to use xts, then perhaps something like this:
result <- mapply(julian,
x = as.POSIXct(x1, format = "%m/%d/%Y %H:%M", tz = "GMT"),
origin = as.Date(paste0(gsub(".*([0-9]{4}).*", "\\1", x1),
"-01-01"),
tz = "GMT"))
result
## [1] 124.083333 364.248611 1.684028
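A vectorised variant without mapply is also possible. The sketch below subtracts January 1 of each value's own year using difftime; it assumes UTC throughout so that daylight saving cannot shift the fraction, and it adds 1 to match the one-based day of year the question asked for. The names tt, jan1 and doy are illustrative.

```r
x1 <- c("5/5/2007 02:00", "12/31/2007 05:58", "1/2/2008 16:25")

# Assumption: all times are interpreted as UTC
tt   <- as.POSIXct(x1, format = "%m/%d/%Y %H:%M", tz = "UTC")
jan1 <- as.POSIXct(paste0(format(tt, "%Y"), "-01-01"), tz = "UTC")

# Days elapsed since January 1, plus 1 so that January 1 itself is day 1
doy <- as.numeric(difftime(tt, jan1, units = "days")) + 1
doy
```

Drop the + 1 to get the zero-based values shown in the other answers.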
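If you want to go the other way (convert day of the year to date and time), you can use this little function. Note that it hard-codes 2008 as the year and treats its input as days elapsed since January 1, so 0 is January 1 itself: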
doy2date = function(mydoy){
mydate = as.Date(mydoy, origin = "2008-01-01 00:00:00", tz = "GMT")
dech = (mydoy - as.integer(mydoy)) * 24
myh = as.integer(dech)
mym = as.integer( (dech - as.integer(dech)) * 60)
mys = round(I( (((dech - as.integer(dech)) * 60) - mym) * 60), digits=0 )
posixdate = as.POSIXct(paste(mydate, " ", myh,":",mym,":",mys, sep=""), tz = "GMT")
return(posixdate)
}
As an example, if you try:
doy2date(117.6364)
The function will return "2008-04-27 15:16:25 GMT" as a POSIXct.
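A shorter sketch of the same conversion adds the elapsed seconds directly to midnight on January 1. The function name doy_to_posix and its year argument are hypothetical, and it keeps the same convention that the input is days elapsed since January 1.

```r
# Hypothetical helper: doy is days elapsed since January 1 of `year`
doy_to_posix <- function(doy, year = 2008) {
  jan1 <- as.POSIXct(paste0(year, "-01-01"), tz = "GMT")
  # POSIXct arithmetic is in seconds; round to the nearest whole second
  jan1 + round(doy * 86400)
}

res <- doy_to_posix(117.6364)
format(res, "%Y-%m-%d %H:%M:%S")
```

Rounding the total number of seconds once avoids the case where rounding the seconds field separately produces 60.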