I have a character variable that contains a calendar date, e.g.,
x <- "2018-10-31"
I also have a variable y that contains a time span, say 200 days.
y <- 200
How do I find out the calendar date for x + y?
I am not familiar with the date types in R and am struggling with how to approach this.
An add-on question: would this calculation be different if y = 4.3 months? Of course I can convert this into days, but I wonder if there is a more direct way to handle months without converting.
You could utilise the lubridate package, which is specifically designed for handling date time data.
library(lubridate)
x <- ymd("2018-10-31")
x + days(200)
[1] "2019-05-19"
lubridate works with 'period' objects, which require integers, so you would need to convert "4.3" months into something interpretable beforehand. "4.3 months" doesn't mean anything concrete in date-time arithmetic anyway, because calendar months differ in length.
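For comparison, here is a minimal base R sketch of the same arithmetic. Adding a number to a Date adds that many days. The fractional-month part is only an assumption for illustration: it treats a month as an average of 365.25 / 12 days, which is one possible convention, not something the question or lubridate defines. The names `avg_month_days`, `in_200_days` and `in_4_3_months` are made up for this example.

```r
# Base R: a Date plus a number means "plus that many days".
x <- as.Date("2018-10-31")
in_200_days <- x + 200
print(in_200_days)    # "2019-05-19"

# ASSUMPTION: interpret 4.3 months as 4.3 average-length months.
avg_month_days <- 365.25 / 12                          # about 30.44 days
in_4_3_months <- x + round(4.3 * avg_month_days)       # 131 days
print(in_4_3_months)  # "2019-03-11"
```

If whole calendar months are what you actually mean, converting to days first will not give the same answer, so pick the convention that matches your data.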
I was wondering if there was a way to calculate time differences using the xts package without having to convert time values etc. if possible. I have an xts object with a time format given as 2010-02-15 13:35:59.123 (where the .123 is the milliseconds).
Now, I would like to find the number of milliseconds until the end of the day (i.e. 17:00:00). The problem however is that I basically have to do a few conversions of the data before I can do this (such as using as.POSIXct) and this becomes more complicated since I have to do it for several different days and possibly even different times. For this reason, I would prefer to not have to convert the "end of day time" and leave it as 17:00:00 such that in order to find the number of milliseconds between the present time and the end of day time I can just have a fairly simple operation such as 17:00:00.000 - 13:35:59.123 = ...
Is there a simple way to do this with minimal conversions? I'm certain xts has a function which I don't know of but I couldn't find anything in the documentation :/
EDIT: I forgot to mention, I tried the more 'straightforward' route by trying to compute the time differences by first trying to use the function as.POSIXct(16:00:00, format = "%H:%M:%S") but this gives an error, and I'm honestly not sure why...
You should be able to do this using a combination of ave(), .indexDate(), and a custom function. You didn't provide a reproducible example, so here's one using the daily data that comes with xts.
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)
# time left until the last observation in each group
secsRemaining <- function(x) { end(x) - index(x) }
tdiff <- ave(x[,1], as.yearmon(index(x)), FUN = secsRemaining)
tdiff[86:92,]
# Open
# 2007-03-28 259200
# 2007-03-29 172800
# 2007-03-30 86400
# 2007-03-31 0
# 2007-04-01 2505600
# 2007-04-02 2419200
# 2007-04-03 2332800
In your case, the call would use .indexDate(x) instead of as.yearmon(index(x)).
tdiff <- ave(x[,1], .indexDate(x), FUN = secsRemaining)
Also note that this call to ave() only works on a 1-column xts object. It seems like a bug that it doesn't work with more columns. Finally, you have to write FUN = explicitly with ave(), since the FUN argument comes after the ... argument.
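If the goal is literally "milliseconds until 17:00:00 on the same day", plain POSIXct arithmetic may be enough. The sketch below is an assumption-laden illustration, not part of the xts answer above: the two timestamps are made up, UTC is assumed, and the names `stamps`, `end_of_day` and `ms_left` are hypothetical. With an xts object whose index is POSIXct, `index(x)` would take the place of `stamps`. Note that the time of day has to be a quoted string; unquoted, R parses the colons in 16:00:00 as the sequence operator.

```r
# Made-up sample timestamps with millisecond precision (UTC assumed).
stamps <- as.POSIXct(c("2010-02-15 13:35:59.123",
                       "2010-02-16 09:00:00.500"), tz = "UTC")

# Build "17:00:00 on the same calendar day" for every timestamp.
end_of_day <- as.POSIXct(paste(format(stamps, "%Y-%m-%d"), "17:00:00"),
                         tz = "UTC")

# difftime in seconds, converted to whole milliseconds.
ms_left <- round(as.numeric(difftime(end_of_day, stamps, units = "secs")) * 1000)
print(ms_left)   # 12240877 28799500
```

The only conversion is building `end_of_day` once per timestamp, and it is vectorised, so several days and different times are handled in one call.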
This is what I have done so far, but it's wrong.
earthquakes<- c(6.6,6.8,8.4)
dates <- (13/02/2001 ,28/02/2001,23/06/2001)
plot(earthquakes,dates)
I have only started learning R. Please help.
earthquakes<- c(6.6,6.8,8.4)
dates <- as.Date(c("13/02/2001", "28/02/2001", "23/06/2001"), format="%d/%m/%Y")
plot(dates, earthquakes)
You had a few issues:
Dates should be in quotes, otherwise R will think you're trying to do arithmetic (i.e. 13 divided by 02 divided by 2001).
To convert dates to actual date objects, use as.Date, pass a vector of dates (this is the c(...) part), and then specify the format they are in so that R knows what to do with the strings.
You had x and y swapped.
Note, the as.Date step is not strictly necessary, but if you don't do that, then the x axis of the plot will plot every item equidistant, irrespective of how far apart the dates actually are in time.
How to convert between year,month,day and dates in R?
I know one can do this via strings, but I would prefer to avoid converting to strings, partly because there may be a performance hit, and partly because I worry about regionalization issues, where some of the world uses "year-month-day" and some uses "year-day-month".
It looks like ISOdate handles the direction year,month,day -> DateTime, although it first converts the numbers to a string, so if there is a way that doesn't go via a string I would prefer that.
I couldn't find anything that goes the other way, from datetimes to numerical values. I would prefer not to need strsplit or things like that.
Edit: just to be clear, what I have is a data frame which looks like:
year month day hour somevalue
2004 1 1 1 1515353
2004 1 1 2 3513535
....
I want to be able to freely convert to this format:
time(hour units) somevalue
1 1515353
2 3513535
....
... and also be able to go back again.
Edit: to clear up some confusion about what 'time' (hour units) means, this is ultimately what I did, using information from "How to find the difference between two dates in hours in R?":
forwards direction:
lh$time <- as.numeric( difftime(ISOdate(lh$year,lh$month,lh$day,lh$hour), ISOdate(2004,1,1,0), units="hours"))
lh$year <- NULL; lh$month <- NULL; lh$day <- NULL; lh$hour <- NULL
backwards direction:
... well, I didn't do backwards yet, but I imagine something like:
create difftime object out of lh$time (somehow...)
add ISOdate(2004,1,1,0) to difftime object
use one of the solution below to get the year,month,day, hour back
I suppose in the future I could ask about the exact problem I'm trying to solve. I was trying to factor my specific problem into generic reusable questions, but maybe that was a mistake?
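The backwards direction outlined above could be sketched roughly like this. It assumes the same origin, ISOdate(2004,1,1,0) in GMT, and uses a small made-up data frame. The third row is invented purely to show the roll-over into 2005, and the names `origin`, `stamp` and `parts` are hypothetical.

```r
# Origin used in the forwards direction (ISOdate returns POSIXct in GMT).
origin <- ISOdate(2004, 1, 1, 0)

# Made-up data in the "hour units" format; 8784 = 366 * 24 (2004 is a leap year).
lh <- data.frame(time = c(1, 2, 8784),
                 somevalue = c(1515353, 3513535, 42))

# POSIXct arithmetic is in seconds, so hours are multiplied by 3600.
stamp <- origin + lh$time * 3600

# POSIXlt exposes the components; $year counts from 1900, $mon is 0-based.
parts <- as.POSIXlt(stamp, tz = "GMT")
lh$year  <- parts$year + 1900
lh$month <- parts$mon + 1
lh$day   <- parts$mday
lh$hour  <- parts$hour
print(lh)
```

No difftime object needs to be built by hand here: adding a plain number of seconds to a POSIXct value is enough.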
Because there are so many ways in which a date can be passed in from files, databases etc and for the reason you mention of just being written in different orders or with different separators, representing the inputted date as a character string is a convenient and useful solution. R doesn't hold the actual dates as strings and you don't need to process them as strings to work with them.
Internally R is using the operating system to do these things in a standard way. You don't need to manipulate strings at all - just perhaps convert some things from character to their numerical equivalent. For example, it is quite easy to wrap up both operations (forwards and backwards) in simple functions you can deploy.
toDate <- function(year, month, day) {
  ISOdate(year, month, day)
}
toNumerics <- function(Date) {
  stopifnot(inherits(Date, c("Date", "POSIXt")))
  day <- as.numeric(strftime(Date, format = "%d"))
  month <- as.numeric(strftime(Date, format = "%m"))
  year <- as.numeric(strftime(Date, format = "%Y"))
  list(year = year, month = month, day = day)
}
I forgo a single call to strptime() and subsequent splitting on a separator character because you don't like that kind of manipulation.
> toDate(2004, 12, 21)
[1] "2004-12-21 12:00:00 GMT"
> toNumerics(toDate(2004, 12, 21))
$year
[1] 2004
$month
[1] 12
$day
[1] 21
Internally R's datetime code works well and is well tested and robust if a bit complex in places because of timezone issues etc. I find the idiom used in toNumerics() more intuitive than having a date time as a list and remembering which elements are 0-based. Building on the functionality provided would seem easier than trying to avoid string conversions etc.
I'm a bit late to the party, but one other way to convert from integers to date is the lubridate::make_date function. See the example below from R for Data Science:
library(lubridate)
library(nycflights13)
library(tidyverse)
a <- flights %>%
  mutate(date = make_date(year, month, day))
Found one solution for going from date to year,month,day.
Let's say we have a date object, that we'll create here using ISOdate:
somedate <- ISOdate(2004,12,21)
Then, we can get the numerical components of this as follows:
unclass(as.POSIXlt(somedate))
Gives:
$sec
[1] 0
$min
[1] 0
$hour
[1] 12
$mday
[1] 21
$mon
[1] 11
$year
[1] 104
Then one can get what one wants for example:
unclass(as.POSIXlt(somedate))$mon
Note that $year is [actual year] - 1900, $mon is 0-based, and $mday is 1-based (as per the POSIX standard).
I have started using data.table. It is indeed very fast and has quite a nice syntax, but I am having trouble with dates. I like to use lubridate. In many of my data sets I have dates, or dates and times, and have used lubridate to manipulate them. Lubridate stores the instant as a POSIX class. I have seen answers here that create new variables just to get, for instance, the year (e.g. 2005). I do not like that. There are times when I will be analyzing by year, other times by quarter, other times by month, and other times by durations. I would like to do something simple such as this:
mydatatable[,length(medical.record.number),by=year(date.of.service)]
that should give me the number of patient encounters in a given year. The by function is not working.
Error in names(byval) = as.character(bysuborig) :
'names' attribute [2] must be the same length as the vector [1]
Can you please point me to vignettes where data.table is used with dates, and where manipulations and categorizations of those dates are done on the fly?
This uses one of the examples in the help(IDateTime) page. It shows that you can change the syntax for the by= argument to a character value of the form "name = expression", or (after @Matthew Dowle's comment below) you can try to use the functional form that you were using, although I have not been able to get it to work myself. I did get the preferred form, by=list(wday=wday(idate)), to work. Note that the key creation assumes an IDateTime class, since there is no idate or itime variable. Those are attributes of the class.
library(data.table)
datetime <- seq(as.POSIXct("2001-01-01"), as.POSIXct("2001-01-03"), by = "5 hour")
(af <- data.table(IDateTime(datetime), a = rep(1:2, 5), key = "a,idate,itime"))
af[, length(a), by = "wday = wday(idate)"]
wday V1
[1,] 2 4
[2,] 3 5
[3,] 4 1
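For the original question (counting encounters per year without storing a year column), a sketch along the following lines may work with a current data.table. The table contents are made up, `visits`, `per_year` and `per_quarter` are hypothetical names, and the year() and quarter() helpers used here are the ones exported by data.table itself (lubridate's functions of the same name should behave the same on Date columns).

```r
library(data.table)

# Made-up encounters; column names follow the question.
visits <- data.table(
  medical.record.number = c(101, 102, 103, 104),
  date.of.service = as.Date(c("2005-03-01", "2005-11-15",
                              "2006-01-20", "2006-07-04"))
)

# Group on an expression computed on the fly; .N is the row count per group.
per_year <- visits[, .N, by = .(year = year(date.of.service))]
print(per_year)

# The same idea with two grouping expressions: year and quarter.
per_quarter <- visits[, .N, by = .(year = year(date.of.service),
                                   quarter = quarter(date.of.service))]
print(per_quarter)
```

Naming the grouping expression inside .( ) (or list( )) gives the result column a name, so nothing has to be added to the table beforehand.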
I regret that I come to post here after being burned-out on hours of Internet searching regarding this simplistic question.
I have several data sets to plot in R, each consisting of two columns of data: time, date. I am using R 2.11.0 on a Windows computer, via the Rgui.
Time is "time of day" that an event is observed. As an example, it is recognized as:
Factor w/ 87 levels "5:53","5:54",..: 84 85 85 85 86 ...
Date is calendar date, recognized as:
Class 'Date' num [1:730] 13879 13880 13881 13882 13883 ...
The time values are recorded in the format of a 24-hr clock, h:mm or hh:mm. The date values are displayed yyyy-mm-dd.
I want to plot time (y-axis) vs. date (x-axis).
Using
plot(date,time)
gives an accurate-looking plot, but the y-axis is labeled as the numeric factor values (about 0 to 90), rather than the desired, temporally-ordered levels of the factor variable. The x-axis is labeled in the desired, human-readable format.
How can I correct this? Is there a "time of day" format in R that I can convert my "time" variable into? I would subsequently like to do arithmetic on the time values as well, and would not mind having to carry one column of values for plotting and one column of values for maths.
I ran across several examples online of manipulation of (date + time) variables in R, and converting those to different formats. I do not believe this is my problem, as I have separate fields for time and date and want to plot one against the other.
My thanks to you in advance for your suggestions, or your directions to a web-accessible resource (no appropriate libraries or bookstores at my location).
There may be an easier way to do this, but you can always label the y-axis yourself. Adjust the ticksAt vector below to find something that looks suitable for your data.
Data <- data.frame(date=Sys.Date()+1:10,time=paste(5,41:50,sep=":"))
with(Data, plot(date,time,yaxt="n"))
ticksAt <- c(1,3,5,7,9)
axis(2, at=ticksAt, labels=as.character(Data$time)[ticksAt])
?plot.zoo has some good examples of how to create pretty axis annotations, though some of them may be zoo-specific. ?par is also a good resource.
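Another option, sketched here under the assumption that the factor levels always look like "h:mm" or "hh:mm", is to convert the times to minutes since midnight. That gives a numeric column that sorts correctly and can be used for arithmetic. The sample values and the names `time_f`, `mins` and `labels_back` are made up.

```r
# Made-up factor of clock times, as in the question.
time_f <- factor(c("5:53", "5:54", "17:05"))

# Split "h:mm" into hours and minutes, then combine into minutes since midnight.
parts <- strsplit(as.character(time_f), ":", fixed = TRUE)
mins <- vapply(parts,
               function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
               numeric(1))
print(mins)          # 353 354 1025

# Turn minutes back into a readable label, e.g. for axis annotations.
labels_back <- sprintf("%d:%02d", mins %/% 60, mins %% 60)
print(labels_back)   # "5:53" "5:54" "17:05"
```

The numeric `mins` can then go on the y-axis with yaxt = "n", with axis(2, ...) called as in the answer above using `at` values in minutes and labels built the same way as `labels_back`.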
ts and timeSeries are two good choices.
Let's assume you have two vectors, one of Date class named "dt" and the other a factor named "tm":
x <- paste(as.character(dt[1:2]), as.character(tm[1:2]))
strptime(x, "%Y-%m-%d %H:%M")
## [1] "2008-01-01 05:53:00" "2008-01-02 05:54:00"
class(strptime(x, "%Y-%m-%d %H:%M"))
## [1] "POSIXt" "POSIXlt"