Lots of people ask how to strip the time and keep the date, but what about the other way around? Given:
myDateTime <- "11/02/2014 14:22:45"
I would like to see:
myTime
[1] "14:22:45"
Time zone not necessary.
I've already tried (from other answers)
as.POSIXct(substr(myDateTime, 12,19),format="%H:%M:%S")
[1] "2013-04-13 14:22:45 NZST"
The purpose is to analyse events recorded over several days by time of day only.
Thanks
Edit:
It turns out there's no pure "time" object, so every time must also have a date.
In the end I used
as.POSIXct(as.numeric(as.POSIXct(myDateTime, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")) %% 86400, origin = "2000-01-01", tz = "UTC")
rather than the character solution, because I need to do arithmetic on the results. This solution is similar to my original one, except that the date can be controlled consistently - "2000-01-01" in this case, whereas my attempt just used the current date at runtime.
I think you're looking for the format function.
(x <- strptime(myDateTime, format="%d/%m/%Y %H:%M:%S"))
#[1] "2014-02-11 14:22:45"
format(x, "%H:%M:%S")
#[1] "14:22:45"
That's character, not "time", but would work with something like aggregate if that's what you mean by "analyse events recorded over several days by time of day only."
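If the time within a GMT day is useful for your problem, you can get this with %%, the remainder operator, taking the remainder modulo 86400 (the number of seconds in a day). Note that as.POSIXct() parses the strings in the session's local time zone, so the output below comes from a machine whose local time is five hours behind GMT on those dates.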
stamps <- c("2013-04-12 19:00:00", "2010-04-01 19:00:01", "2018-06-18 19:00:02")
as.numeric(as.POSIXct(stamps)) %% 86400
## [1] 0 1 2
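To make the result independent of the machine's time zone, the zone can be fixed explicitly. This is only a sketch: the timestamps are made up for illustration, and "UTC" and the anchor date "2000-01-01" are arbitrary choices.

```r
# Hypothetical timestamps; tz is fixed so the result does not depend
# on the local time zone of the machine running the code.
stamps <- c("2013-04-12 19:00:00", "2010-04-01 07:30:15")
parsed <- as.POSIXct(stamps, tz = "UTC")

# Seconds since midnight (UTC) for each timestamp
secs <- as.numeric(parsed) %% 86400

# Re-anchor every time of day on one arbitrary date, so the values are
# still POSIXct and arithmetic on them keeps working
anchored <- as.POSIXct(secs, origin = "2000-01-01", tz = "UTC")
format(anchored, "%Y-%m-%d %H:%M:%S")
#[1] "2000-01-01 19:00:00" "2000-01-01 07:30:15"
```

Events from different days can then be compared or binned by time of day alone, because they all share the same date.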
In R I have this data.frame
24:43:30 23:16:02 14:05:44 11:44:30 ...
Note that some of the times are over 24:00:00 ! In fact all my times are within 02:00:00 to 25:59:59.
I want to subtract 2 hours from every entry in my dataset data, so that I end up with a regular data set. How can I do this?
I tried this
strptime(data, format="%H:%M:%S") - 2*60*60
and this works for all entries up to 23:59:59. For all entries above that I simply get NA, since strptime produces NA for any time beyond 23:59:59.
Using lubridate package can make the job easier!
> library(lubridate)
> t <- '24:43:30'
> hms(t) - hms('2:0:0')
[1] "22H 43M 30S"
Update:
Converting the date back to text!
> substr(strptime(hms(t) - hms('2:0:0'),format='%HH %MM %SS'),12,20)
[1] "22:43:30"
Adding @RHertel's update:
format(strptime(hms(t) - hms('2:0:0'),format='%HH %MM %SS'),format='%H:%M:%S')
A better way of formatting the lubridate object:
s <- hms('02:23:58') - hms('2:0:0')
paste(hour(s),minute(s),second(s),sep=":")
"0:23:58"
Although the answer by @amrrs solves the main problem, the formatting could remain an issue because hms() does not provide a uniform output. This is best shown with an example:
library(lubridate)
hms("01:23:45")
#[1] "1H 23M 45S"
hms("00:23:45")
#[1] "23M 45S"
hms("00:00:45")
#[1] "45S"
Depending on the time passed to hms() the output may or may not contain an entry for the hours and for the minutes. Moreover leading zeros are omitted in single-digit values of hours, minutes and seconds. This can result pretty much in a formatting nightmare if one tries to put that data into a common form.
To resolve this difficulty one could first convert the time into a duration with lubridate's as.duration() function. Then, the duration in seconds can be transformed into a POSIXct object from which the hours, minutes, and seconds can be extracted easily with format():
times <- c("24:43:30", "23:16:02", "14:05:44", "11:44:30", "02:00:12")
shifted_times <- hms(times) - hms("02:00:00")
format(.POSIXct(as.numeric(as.duration(shifted_times)), tz="GMT"), "%H:%M:%S")
#[1] "22:43:30" "21:16:02" "12:05:44" "09:44:30" "00:00:12"
The last entry "02:00:12" would have caused difficulties if shifted_times had been passed to strptime().
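If adding a package dependency is not an option, the same shift can be sketched in base R by converting the strings to seconds by hand. The helper name shift_hms is hypothetical, and the sketch assumes well-formed "HH:MM:SS" input and a non-negative result after shifting.

```r
# Hypothetical helper: shift "HH:MM:SS" strings by a number of seconds.
# Hours of 24 or more are accepted, unlike with strptime().
shift_hms <- function(x, shift_secs) {
  parts <- strsplit(x, ":", fixed = TRUE)
  # Total seconds represented by each string
  secs <- vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
  secs <- secs + shift_secs
  # Zero-padded output, so every result has the same width
  sprintf("%02d:%02d:%02d",
          as.integer(secs %/% 3600),
          as.integer((secs %% 3600) %/% 60),
          as.integer(secs %% 60))
}

shift_hms(c("24:43:30", "02:00:12"), -2 * 3600)
#[1] "22:43:30" "00:00:12"
```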
Given an initial date, I want to generate a sequence of dates with monthly intervals, ensuring every element has the same day as the initial date or the last day of the month in case the same day would yield an invalid date.
Sounds pretty standard, right?
Using difftime is not possible. Here's what the help file of difftime says:
Units such as "months" are not possible as they are not of constant
length. To create intervals of months, quarters or years use seq.Date
or seq.POSIXt.
But then looking at the help file of seq.POSIXt I find that:
Using "month" first advances the month without changing the day: if
this results in an invalid day of the month, it is counted forward
into the next month: see the examples.
This is the example in the help file.
> seq(ISOdate(2000,1,31), by = "month", length.out = 4)
[1] "2000-01-31 12:00:00 GMT" "2000-03-02 12:00:00 GMT"
"2000-03-31 12:00:00 GMT" "2000-05-01 12:00:00 GMT"
So, given that the initial date is on day 31, this would yield invalid dates in February, April, etc. The sequence therefore ends up skipping those months, because it "counts forward" and lands on March-02 instead of February-29.
If I start on 2000-01-31, I would like the sequence as follows:
2000-01-31
2000-02-29
2000-03-31
2000-04-30
...
And it should properly handle leap-years, so if the initial date is 2015-01-31 the sequence should be:
2015-01-31
2015-02-28
2015-03-31
2015-04-30
...
These are just examples to illustrate the problem and I do not know the initial date in advance, nor can I assume anything about it. The initial date may well be in the middle of the month (2015-01-15) in which case seq works fine. But it can also be, as in the examples, towards the end of the month on dates that using seq alone would be problematic (days 29, 30 and 31). I cannot assume either that the initial date is the last day of the month.
I have looked around trying to find a solution. In some questions here on SO (e.g. here) there is a "trick" to get the last day of a month: take the first day of the next month and subtract 1. Finding the first day is "easy" because it is just day 1.
So my solution so far is:
# Given an initial date for my sequence
initial_date <- as.Date("2015-01-31")
# Find the first day of the month
library(magrittr) # to use pipes and make the code more readable
first_day_of_month <- initial_date %>%
  format("%Y-%m") %>%
  paste0("-01") %>%
  as.Date()
# Generate a sequence from the initial date, using seq
# This is the sequence that will have incorrect values in months that would
# have invalid dates
given_date_seq <- seq(initial_date, by = "month", length.out = 4)
# And then generate an auxiliary sequence for the last day of each month
# I do this by generating a sequence that starts on the first day of the
# same month as the initial date and goes one month further
# (length 5 instead of 4), then subtracting 1 from all the elements
last_day_seq <- seq(first_day_of_month, by = "month", length.out = 5) - 1
# And finally, for each pair of elements, I take the earlier of the two dates
pmin(given_date_seq, last_day_seq[2:5])
It works, but it is, at the same time, kinda dumb, hacky and convoluted. So I do not like it. And most importantly, I cannot believe there is no easier way to do this in R.
Can someone please point me to a simpler solution? (I guess it should have been as simple as seq(initial_date, "month", 4), but apparently it is not). I've googled it and looked here in SO and R mailing lists, but apart from the tricks I mentioned above, I couldn't find a solution.
The simplest solution is %m+% from lubridate, which solves this exact problem. So:
library(lubridate)
seq_monthly <- function(from, length.out) {
  from %m+% months(0:(length.out - 1))
}
Output:
> seq_monthly(as.Date("2015-01-31"),length.out=4)
[1] "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30"
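For comparison, the same clamping can be sketched in base R by wrapping the "first day of next month minus one" trick in a function. The name seq_monthly_base is hypothetical, and the sketch assumes from is a single Date.

```r
# Hypothetical base R helper: monthly sequence that keeps the day of month
# of `from`, falling back to the last day of shorter months.
seq_monthly_base <- function(from, length.out) {
  target_day <- as.POSIXlt(from)$mday
  # First day of each month, starting with the month of `from`;
  # stepping from day 1 never produces an invalid date
  first <- as.Date(format(from, "%Y-%m-01"))
  firsts <- seq(first, by = "month", length.out = length.out + 1)
  # Last day of a month = first day of the following month minus one day
  last_days <- firsts[-1] - 1
  # Cap the target day at the length of each month
  day <- pmin(target_day, as.integer(format(last_days, "%d")))
  as.Date(sprintf("%s-%02d", format(firsts[-(length.out + 1)], "%Y-%m"), day))
}

seq_monthly_base(as.Date("2015-01-31"), 4)
#[1] "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30"
```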
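Similar to the lubridate answer, here is one using RcppBDT (which wraps the Boost Date.Time library from C++). Note that the loop adds i months to the running date on each pass, so the steps grow from one to five months rather than forming a strictly monthly sequence; the point is that each result lands on a valid end of month.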
R> dt <- new(bdtDt, 2010, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2010-02-28"
[1] "2010-04-30"
[1] "2010-07-31"
[1] "2010-11-30"
[1] "2011-04-30"
R> dt <- new(bdtDt, 2000, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2000-02-29"
[1] "2000-04-30"
[1] "2000-07-31"
[1] "2000-11-30"
[1] "2001-04-30"
R>
I have some numbers that represent dates in milliseconds since epoch, 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970
1365368400000,
1365973200000,
1366578000000
I'm converting them to date format:
as.Date(as.POSIXct(my_dates/1000, origin="1970-01-01", tz="GMT"))
answer:
[1] "2013-04-07" "2013-04-14" "2013-04-21"
How to convert these strings back to milliseconds since epoch?
Here are your JavaScript dates
x <- c(1365368400000, 1365973200000, 1366578000000)
You can convert them to R dates more easily by dividing by the number of milliseconds in one day.
y <- as.Date(x / 86400000, origin = "1970-01-01")
To convert back, just convert to numeric and multiply by this number.
z <- as.numeric(y) * 86400000
Finally, check that the answer is what you started with.
stopifnot(identical(x, z))
As per the comment, you may sometimes get numerical rounding errors leading to x and z not being identical. For numerical comparisons like this, use:
library(testthat)
expect_equal(x, z)
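One caveat, shown here as a rough sketch: the round trip above is exact only because y keeps the fractional day. The timestamps in the question are not at midnight UTC, so starting from the printed date strings loses the time of day. Going through POSIXct keeps it; the variable names below are just for illustration.

```r
x <- c(1365368400000, 1365973200000, 1366578000000)

# Keep the time of day by going through POSIXct instead of Date
stamps <- as.POSIXct(x / 1000, origin = "1970-01-01", tz = "UTC")
format(stamps, "%Y-%m-%d %H:%M:%S")
#[1] "2013-04-07 21:00:00" "2013-04-14 21:00:00" "2013-04-21 21:00:00"

# Back to milliseconds since the epoch
ms <- as.numeric(stamps) * 1000
isTRUE(all.equal(ms, x))
#[1] TRUE

# Starting from the date strings alone gives midnight UTC,
# which is 21 hours (75600000 ms) earlier than the original values
from_strings <- as.numeric(as.Date(c("2013-04-07", "2013-04-14", "2013-04-21"))) * 86400000
x - from_strings
#[1] 75600000 75600000 75600000
```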
I will provide a simple framework to handle various kinds of date encodings and how to go back and forth. With the R package ‘lubridate’ this is made very easy using the period and interval classes.
When dealing with days, it can be easy, as one can use as.numeric(Date) to get the number of days since the epoch. To get any unit of time smaller than a day, one can convert using the appropriate factor (24 for hours, 24 * 60 for minutes, etc.). However, for months the math can get a bit more tricky, and thus I prefer in many instances to use this method.
library(lubridate)
as.period(interval(start = epoch, end = Date), unit = 'month')#month
This can be used for year, month, day, hour, minute, and smaller units by applying the appropriate factors.
Going the other way such as being given months since epoch:
library(lubridate)
epoch %m+% as.period(Date, unit = 'months')
I presented this approach with months as it might be the more complicated one. An advantage to using period and intervals is that it can be adjusted to any epoch and unit very easily.
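For readers without lubridate, the month arithmetic can be roughly sketched in base R. The values of epoch and d below are assumptions chosen for illustration, and the step back from the month count relies on the epoch falling on a day that exists in every month (day 1 here).

```r
epoch <- as.Date("1970-01-01")  # assumed epoch
d     <- as.Date("2013-04-07")  # hypothetical date to encode

e <- as.POSIXlt(epoch)
l <- as.POSIXlt(d)

# Whole months elapsed since the epoch: year and month differences,
# minus one if the day of month of the epoch has not been reached yet
n_months <- (l$year - e$year) * 12 + (l$mon - e$mon) - (l$mday < e$mday)
n_months
#[1] 519

# Going the other way: the date that lies n_months after the epoch
back <- seq(epoch, by = paste(n_months, "months"), length.out = 2)[2]
back
#[1] "2013-04-01"
```

The round trip returns the start of the month, since whole months carry no information about the day.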
I have a question similar to Round a POSIX date (POSIXct) with base R functionality, but I'm hoping to always round the date up to midnight the next day (00:00:00).
Basically, I want a function equivalent to ceiling for POSIX-formatted dates. As with the related question, I'm writing my own package, and I already have several package dependencies so I don't want to add more. Is there a simple way to do this in base R?
Maybe
trunc(x,"days") + 60*60*24
> x <- as.POSIXct(Sys.time())
> x
[1] "2012-08-09 18:40:08 BST"
> trunc(x,"days")+ 60*60*24
[1] "2012-08-10 BST"
A quick and dirty method is to convert to a Date (which truncates the time), add 1 (which is a day for Date) and then convert back to POSIX to be at midnight UTC on the next day. As @Joshua Ulrich points out, timezone/daylight savings issues may give results you don't expect:
as.POSIXct(as.Date(Sys.time())+1)
[1] "2012-08-10 01:00:00 BST"
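One way to land on local midnight rather than midnight UTC is to take the calendar date as seen in the target time zone and re-parse it in that same zone. This is a sketch in base R only; the helper name next_midnight is hypothetical, and it assumes midnight exists in that zone on the following day (a few zones switch to daylight saving time at midnight).

```r
# Hypothetical helper: midnight at the start of the next day, in time zone tz.
next_midnight <- function(x, tz = "UTC") {
  # Calendar date of x as seen in tz, advanced by one day
  next_day <- as.Date(format(x, "%Y-%m-%d", tz = tz)) + 1
  # Re-parse the bare date as 00:00:00 in the same zone
  as.POSIXct(format(next_day), tz = tz)
}

x <- as.POSIXct("2012-08-09 18:40:08", tz = "Europe/London")
format(next_midnight(x, tz = "Europe/London"), "%Y-%m-%d %H:%M:%S %Z")
#[1] "2012-08-10 00:00:00 BST"
```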
I am manipulating some POSIXlt DateTime objects. For example I would like to add an hour:
my.lt = as.POSIXlt("2010-01-09 22:00:00")
new.lt = my.lt + 3600
new.lt
# [1] "2010-01-09 23:00:00 EST"
class(new.lt)
# [1] "POSIXct" "POSIXt"
The thing is I want new.lt to be a POSIXlt object. I know I could use as.POSIXlt to convert it back to POSIXlt, but is there a more elegant and efficient way to achieve this?
POSIXct-classed objects are internally a numeric value that allows numeric calculations. POSIXlt-objects are internally lists. Unfortunately for your desires, Ops.POSIXt (which is what is called when you use "+") coerces to POSIXct with this code:
if (inherits(e1, "POSIXlt") || is.character(e1))
e1 <- as.POSIXct(e1)
Fortunately, if you just want to add an hour there is a handy alternative to adding 3600. Instead use the list structure and add 1 to the hour element:
> my.lt$hour <- my.lt$hour +1
> my.lt
[1] "2010-01-09 23:00:00"
This approach is very handy when you want to avoid thorny questions about DST changes, at least if you want adding days to give you the same time-of-day.
Edit (adding @sunt's code demonstrating that Ops.POSIXlt is careful with time "overflow"):
my.lt = as.POSIXlt("2010-01-09 23:05:00")
my.lt$hour=my.lt$hour+1
my.lt
# [1] "2010-01-10 00:05:00"
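A small sketch of what happens underneath, with tz = "UTC" added as an assumption to keep the output reproducible: the class stays POSIXlt after the field assignment, and a round trip through POSIXct rewrites the fields into their normal ranges.

```r
my.lt <- as.POSIXlt("2010-01-09 23:05:00", tz = "UTC")
my.lt$hour <- my.lt$hour + 1   # hour field is now out of range (24)

class(my.lt)                   # modifying a field keeps the class
#[1] "POSIXlt" "POSIXt"

# Converting to POSIXct and back normalises the fields
normalised <- as.POSIXlt(as.POSIXct(my.lt))
normalised$hour
#[1] 0
normalised$mday
#[1] 10
format(normalised, "%Y-%m-%d %H:%M:%S")
#[1] "2010-01-10 00:05:00"
```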
Short answer: No
Long answer:
POSIXct and POSIXlt objects are two specific types of the more general POSIXt class (not in a strictly object oriented inheritance sense, but in a quasi-object oriented implementation sense). Code freely switches between these. When you add to a POSIXlt object, the actual function used is +.POSIXt, not one specifically for POSIXlt. Inside this function, the argument is converted into a POSIXct and then dealt with (added to).
Additionally, POSIXct is the number of seconds from a specific date and time. POSIXlt is a list of date parts (seconds, minutes, hours, day of month, month, year, day of week, day of year, DST info) so adding to that directly doesn't make any sense. Converting it to a number of seconds (POSIXct) and adding to that does make sense.
It may not be significantly more elegant, but
seq.POSIXt( from=Sys.time(), by="1 hour", length.out=2 )[2]
IMHO is more descriptive than
Sys.time()+3600; # 60 minutes * 60 seconds
because the code itself documents that you're going for a "POSIX" "seq"uence incremented "by 1 hour", but it's a matter of taste. Works just fine on POSIXlt, but note that it returns a POSIXct either way. Also works for "days". See help(seq.POSIXt) for details on how it handles months, daylight savings, etc.
?POSIXlt tells you that:
Any conversion that needs to go between the two date-time classes requires a timezone: conversion from "POSIXlt" to "POSIXct" will validate times in the selected timezone.
So I guess that, since 3600 is not a POSIXlt object, there is an automatic conversion.
I would stick with simple:
new.lt = as.POSIXlt(my.lt + 3600)
class(new.lt)
[1] "POSIXlt" "POSIXt"
It's not that much of a hassle to add as.POSIXlt before your time operation.