Day-of-the-year (1-366) in R [duplicate]

The title has it: how do you convert a POSIX date to day-of-year?

An alternative is to format the "POSIXt" object using strftime():
R> today <- Sys.time()
R> today
[1] "2012-10-19 19:12:04 BST"
R> doy <- strftime(today, format = "%j")
R> doy
[1] "293"
R> as.numeric(doy)
[1] 293
which is preferable to remembering that the day of the year is zero-based in the POSIX standard.

As ?POSIXlt reveals, the $yday component of a POSIXlt date (or even a vector of such) gives the day of year; a POSIXct value has to be converted with as.POSIXlt() first. Beware that POSIX counts Jan 1 as day 0, so you might want to add 1 to the result.
It took me embarrassingly long to find this, so I thought I'd ask and answer my own question.
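To make the off-by-one concrete, here is a minimal sketch comparing the two base R approaches on fixed dates. The dates and the variable names (x, doy_lt, doy_fmt) are made up for illustration, and tz = "UTC" is assumed so the result does not depend on the local time zone.

```r
# Illustrative dates: first and last day of a normal year, last day of a leap year
x <- as.POSIXct(c("2015-01-01 00:00:00",
                  "2015-12-31 12:00:00",
                  "2016-12-31 12:00:00"), tz = "UTC")

# $yday lives on POSIXlt and is zero-based, so add 1 for a 1-366 day of year
doy_lt <- as.POSIXlt(x)$yday + 1L

# "%j" is already one-based; format() returns character, so convert
doy_fmt <- as.integer(format(x, "%j"))

doy_lt    # 1 365 366
doy_fmt   # 1 365 366
```

Both routes should agree; the only thing to remember is the + 1L on the $yday side.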
Alternatively, the excellent lubridate package provides the yday function, which is just a wrapper for the above method (it adds 1 for you, so Jan 1 is day 1). It conveniently defines similar functions for other units (month, year, hour, ...).
library(lubridate)
today <- Sys.time()
yday(today)

I realize it isn't quite what the poster was looking for, but I needed to convert POSIX date-times into a fractional day of the year for time series analysis and ended up doing this:
today <- Sys.time()
# Days elapsed since the start of 2015 (GMT); midnight on Jan 1 gives 0, not 1
doy2015f <- difftime(today, as.POSIXct("2015-01-01 00:00", tz = "GMT"), units = "days")
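The line above hard-codes 2015. A possible generalisation, as a rough sketch only: fractional_yday is a hypothetical helper name, and tz = "UTC" is an assumed default, chosen so that every day has exactly 24 hours.

```r
# Hypothetical helper: fractional days elapsed since the start of x's own year
fractional_yday <- function(x, tz = "UTC") {
  # Calendar year of each element, as seen in tz
  yr <- as.POSIXlt(x, tz = tz)$year + 1900L
  # Midnight on Jan 1 of that year, in the same zone
  year_start <- as.POSIXct(sprintf("%04d-01-01 00:00:00", yr), tz = tz)
  as.numeric(difftime(x, year_start, units = "days"))
}

# Noon on Jan 2 is one and a half days into the year
fractional_yday(as.POSIXct("2015-01-02 12:00:00", tz = "UTC"))   # 1.5
```

Like the original, this is zero-based (Jan 1 00:00 gives 0); add 1 if you want it to line up with a 1-366 day of year.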

The data.table package also provides a yday() function.
require(data.table)
today <- Sys.time()
yday(today)

This is how I do it:
as.POSIXlt(c("15.4", "10.5", "15.5", "10.6"), format = "%d.%m")$yday
# [1] 104 129 134 160
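Two things are easy to miss in that one-liner: $yday is zero-based, and since the strings carry no year, the current year is filled in, so the numbers shift by one after February in a leap year. A small sketch that pins the year down and adds 1 (the year 2015 and the variable names are picked arbitrarily for illustration):

```r
dm <- c("15.4", "10.5", "15.5", "10.6")

# Append an explicit year so the result does not depend on when the code runs
lt <- as.POSIXlt(paste0(dm, ".2015"), format = "%d.%m.%Y", tz = "UTC")

# Convert the zero-based $yday to a 1-366 day of year
doy <- lt$yday + 1L
doy   # 105 130 135 161
```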

Related

How to convert date to datetime to seconds since UNIX epoch in R with lubridate?

I'm noticing this very confusing behavior.
library(lubridate)
x = as_date(-25567)
as.integer(as_datetime(x)) # Returns NA
How can I get this to return the seconds since (or in this case before) UNIX epoch?
This works with base R, now that we covered that you really want as.Date("1970-01-01").
R> as.POSIXct("1900-01-01 00:00:00")
[1] "1900-01-01 CST"
R> as.numeric(as.POSIXct("1900-01-01 00:00:00"))
[1] -2208967200
R>
I vaguely recall some OS-level irritations for dates prior to the epoch. This may fail for you on the world's most commonly used OS but that is not really R's fault...
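Note that the value printed above is in CST, so it is six hours off from a UTC-based count. Here is a hedged base R sketch that pins the zone to UTC; it does not use lubridate, and the variable names are just for illustration. The second route is pure arithmetic and does not touch the OS date routines at all.

```r
days <- -25567                              # days relative to 1970-01-01
d <- as.Date(days, origin = "1970-01-01")
format(d)                                   # "1900-01-01"

# Route 1: parse the calendar date as midnight UTC, then drop the class
secs_ct <- as.numeric(as.POSIXct(format(d), tz = "UTC"))

# Route 2: a Date is a day count, and a UTC day is 86400 seconds
secs_arith <- days * 86400

secs_ct      # -2208988800
secs_arith   # -2208988800
```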

sequence of monthly dates making sure it's the same day, or the last day of month in case of invalid

Given an initial date, I want to generate a sequence of dates with monthly intervals, ensuring every element has the same day as the initial date or the last day of the month in case the same day would yield an invalid date.
Sounds pretty standard, right?
Using difftime is not possible. Here's what the help file of difftime says:
Units such as "months" are not possible as they are not of constant
length. To create intervals of months, quarters or years use seq.Date
or seq.POSIXt.
But then looking at the help file of seq.POSIXt I find that:
Using "month" first advances the month without changing the day: if
this results in an invalid day of the month, it is counted forward
into the next month: see the examples.
This is the example in the help file.
> seq(ISOdate(2000,1,31), by = "month", length.out = 4)
[1] "2000-01-31 12:00:00 GMT" "2000-03-02 12:00:00 GMT"
[3] "2000-03-31 12:00:00 GMT" "2000-05-01 12:00:00 GMT"
So, given that the initial date is on day 31, this would yield invalid dates in February, April, etc. The sequence therefore ends up skipping those months, because it "counts forward" and lands on March 2 instead of February 29.
If I start on 2000-01-31, I would like the sequence as follows:
2000-01-31
2000-02-29
2000-03-31
2000-04-30
...
And it should properly handle leap-years, so if the initial date is 2015-01-31 the sequence should be:
2015-01-31
2015-02-28
2015-03-31
2015-04-30
...
These are just examples to illustrate the problem and I do not know the initial date in advance, nor can I assume anything about it. The initial date may well be in the middle of the month (2015-01-15), in which case seq works fine. But it can also be, as in the examples, towards the end of the month, on dates for which using seq alone would be problematic (days 29, 30 and 31). I cannot assume either that the initial date is the last day of the month.
I have looked around trying to find a solution. In some questions here on SO there is a "trick" to get the last day of a month: take the first day of the next month and subtract 1. And finding the first day is "easy" because it is just day 1.
So my solution so far is:
# Given an initial date for my sequence
initial_date <- as.Date("2015-01-31")
# Find the first day of the month
library(magrittr) # to use pipes and make the code more readable
first_day_of_month <- initial_date %>%
  format("%Y-%m") %>%
  paste0("-01") %>%
  as.Date()
# Generate a sequence from the initial date, using seq
# This is the sequence that will have incorrect values in months that would
# have invalid dates
given_date_seq <- seq(initial_date, by = "month", length.out = 4)
# And then generate an auxiliary sequence for the last day of each month
# I do this by generating a sequence that starts on the first day of the
# same month as the initial date and goes one month further
# (length 5 instead of 4), and subtracting 1 from all the elements
last_day_seq <- seq(first_day_of_month, by = "month", length.out = 5) - 1
# And finally, for each pair of elements, I take the min date of both
pmin(given_date_seq, last_day_seq[2:5])
It works, but it is, at the same time, kinda dumb, hacky and convoluted. So I do not like it. And most importantly, I cannot believe there is no easier way to do this in R.
Can someone please point me to a simpler solution? (I guess it should have been as simple as seq(initial_date, "month", 4), but apparently it is not). I've googled it and looked here in SO and R mailing lists, but apart from the tricks I mentioned above, I couldn't find a solution.
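For what it's worth, the trick above can be folded into a small dependency-free function. This is only a sketch of one way to do it: seq_monthly_clamped is a made-up name, and it steps through the first of each month (which always exists) and then clamps the day to the month's length.

```r
# Hypothetical helper: monthly sequence that keeps the starting day of month,
# falling back to the last day of the month when that day does not exist
seq_monthly_clamped <- function(from, length.out) {
  from <- as.Date(from)
  # First day of the starting month; stepping by month from day 1 never overflows
  first <- as.Date(format(from, "%Y-%m-01"))
  firsts <- seq(first, by = "month", length.out = length.out + 1L)
  # Length of each month = gap between consecutive month starts
  days_in_month <- as.integer(diff(firsts))
  target_day <- as.integer(format(from, "%d"))
  firsts[seq_len(length.out)] + pmin(target_day, days_in_month) - 1L
}

seq_monthly_clamped("2015-01-31", 4)
# "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30"
seq_monthly_clamped("2000-01-31", 4)
# "2000-01-31" "2000-02-29" "2000-03-31" "2000-04-30"
```

A start date in the middle of the month (say the 15th) just gives the 15th of every month, same as plain seq.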
The simplest solution is %m+% from lubridate, which solves this exact problem. So:
library(lubridate)
seq_monthly <- function(from, length.out) {
  return(from %m+% months(0:(length.out - 1)))
}
Output:
> seq_monthly(as.Date("2015-01-31"),length.out=4)
[1] "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30"
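Similar to the lubridate answer, here is one using RcppBDT (which wraps the Boost Date.Time library from C++; load it with library(RcppBDT) first). Note that the loop adds i months to the running date on each pass, so the steps grow (1, 2, 3, ... months) rather than giving consecutive months; the point is that every result is clamped to a valid end of month: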
R> dt <- new(bdtDt, 2010, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2010-02-28"
[1] "2010-04-30"
[1] "2010-07-31"
[1] "2010-11-30"
[1] "2011-04-30"
R> dt <- new(bdtDt, 2000, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2000-02-29"
[1] "2000-04-30"
[1] "2000-07-31"
[1] "2000-11-30"
[1] "2001-04-30"
R>

Calendar date arithmetic in R

Is there a way in R to do calendar arithmetic, e.g.
> as.Date('2014-03-30') - months(1)
[1] 2014-02-28
except in reality there's no such months function. This can be done with awareness of leap years and daylight savings time in SQL and Java, but I can't find a way to do it in R. I thought I'd get clever and use seq but no:
> seq(as.POSIXct('2014-03-30', tz='UTC'), by = '-1 months', length=2)[2]
[1] "2014-03-02 UTC"
Here is one way using RcppBDT, which wraps (parts of) Boost Date_Time for use by R:
R> library(RcppBDT)
R> dt <- new(bdtDt, 2014, 3, 30)
R> dt
[1] "2014-03-30"
R> dt$addMonths(-1)
R> dt
[1] "2014-02-28"
R>
This is too long for an organized comment, but months is a function, and using as.POSIXlt (as opposed to ct) can allow for easy extraction of date attributes.
test <- as.POSIXlt('2014-03-30', tz='UTC')
attributes(test)$names
test$mon
months(test)
Given that test$mon returns a numeric value, it would be easy to perform arithmetic on the months. However, subtracting 1 month from January just gives you -1 (Jan is 0), and redefining test$mon <- test$mon - 1 doesn't seem to be of much help.
Nonetheless, depending on your application, the above information may still be useful.
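Building on the idea of doing arithmetic on the month number, here is a rough base R sketch that sidesteps the January problem by counting months on a single scale (year * 12 + month) and then clamping the day to the length of the target month. add_months_clamped is a hypothetical name, not an existing function, and it works on Dates only, so daylight saving does not come into it.

```r
# Hypothetical helper: shift a Date by n calendar months (n may be negative),
# clamping to the last day of the target month when the day does not exist
add_months_clamped <- function(date, n) {
  date <- as.Date(date)
  y <- as.integer(format(date, "%Y"))
  m <- as.integer(format(date, "%m"))
  d <- as.integer(format(date, "%d"))
  # Months on one linear scale, so year boundaries need no special casing
  total <- y * 12L + (m - 1L) + as.integer(n)
  first_of <- function(k) {
    as.Date(sprintf("%04d-%02d-01", k %/% 12L, k %% 12L + 1L))
  }
  first <- first_of(total)
  # Month length = days until the first of the following month
  month_len <- as.integer(first_of(total + 1L) - first)
  first + pmin(d, month_len) - 1L
}

add_months_clamped("2014-03-30", -1)   # "2014-02-28"
add_months_clamped("2014-01-15", -1)   # "2013-12-15"
```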

Round a POSIXct date up to the next day

I have a question similar to Round a POSIX date (POSIXct) with base R functionality, but I'm hoping to always round the date up to midnight the next day (00:00:00).
Basically, I want a function equivalent to ceiling for POSIX-formatted dates. As with the related question, I'm writing my own package, and I already have several package dependencies so I don't want to add more. Is there a simple way to do this in base R?
Maybe
trunc(x,"days") + 60*60*24
> x <- as.POSIXct(Sys.time())
> x
[1] "2012-08-09 18:40:08 BST"
> trunc(x,"days")+ 60*60*24
[1] "2012-08-10 BST"
A quick and dirty method is to convert to a Date (which truncates the time), add 1 (which is a day for Date) and then convert back to POSIX to be at midnight UTC on the next day. As @Joshua Ulrich points out, timezone/daylight savings issues may give results you don't expect:
as.POSIXct(as.Date(Sys.time())+1)
[1] "2012-08-10 01:00:00 BST"
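If midnight in a particular time zone is what you are after, one possible base R sketch is to step the calendar date rather than add a fixed number of seconds or go through UTC. next_midnight is a made-up helper name, and the tz argument is an assumption about how you would want to pass the zone; it relies on the system having time zone data for the zone you name.

```r
# Hypothetical helper: midnight at the start of the next calendar day in `tz`
next_midnight <- function(x, tz = "") {
  # Calendar date of x as seen in tz ("" means the session's time zone)
  local_date <- as.Date(format(x, "%Y-%m-%d", tz = tz))
  # Step one calendar day, then re-parse as 00:00:00 in the same zone
  as.POSIXct(format(local_date + 1), tz = tz)
}

x <- as.POSIXct("2012-08-09 18:40:08", tz = "Europe/London")
next_midnight(x, tz = "Europe/London")   # "2012-08-10 BST"
```

Because it moves by calendar day, this should still land on 00:00 on days when the clocks change, where adding 60*60*24 seconds would be an hour off. It does not try to handle zones where midnight itself is skipped by a DST change.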
