Is there an inverse of the yday lubridate function? - r

I have a list of days in the format "day of the year", obtained by applying the lubridate::yday() function to a list of dates. For instance, from the following dates (mm-dd-yyyy format):
01-01-2015
01-02-2015
...
by applying yday() you get
1
2
...
Is there a function that can do the reverse given the yday output and the year? I.e., from a yday value and a year, get back to a date in the mm-dd-yyyy format?

To do this with the lubridate package, use the parse_date_time() function with "j" in the orders argument.
Example:
my.yday <- "1"
parse_date_time(x = my.yday, orders = "j")
# [1] "2021-01-01 UTC"
The default appears to be the current year. If you want to specify the year, just add it in!
my.yday <- "1"
parse_date_time(x = paste(1995, my.yday), orders = "yj")
# [1] "1995-01-01 UTC"
Note: make sure you supply the julian day as character, not numeric. If numeric, it will fail when over 3 digits (a weird bug that they don't plan to fix).
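For comparison, base R can parse the same year/day-of-year pair without lubridate, using the %j conversion that strptime() understands. This is only a minimal sketch; the names my.year and my.yday are illustrative and not from the answer above.

```r
# Day-of-year values and the year they belong to (illustrative values)
my.year <- 1995
my.yday <- c(1, 32, 365)

# %j is the day of the year (1-366); paste() builds strings such as "1995 32"
dates <- as.Date(paste(my.year, my.yday), format = "%Y %j")
dates
# [1] "1995-01-01" "1995-02-01" "1995-12-31"

# Day 366 only exists in leap years, so it parses to NA for 1995
as.Date("1995 366", format = "%Y %j")
# [1] NA
```

Because paste() turns the numbers into character strings, this also sidesteps the character-versus-numeric issue mentioned in the note.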

Any sequence added to a Date() type creates a new Date() sequence with just that offset.
Witness:
R> as.Date("2016-01-01") + 0:9
[1] "2016-01-01" "2016-01-02" "2016-01-03"
[4] "2016-01-04" "2016-01-05" "2016-01-06"
[7] "2016-01-07" "2016-01-08" "2016-01-09"
[10] "2016-01-10"
R> as.Date("2016-01-01") + 100:109
[1] "2016-04-10" "2016-04-11" "2016-04-12"
[4] "2016-04-13" "2016-04-14" "2016-04-15"
[7] "2016-04-16" "2016-04-17" "2016-04-18"
[10] "2016-04-19"
R>
So once again, a so-called lubridate question has nothing to do with that package but simply requires knowing how the base R types work.

> yday ("1990-03-17") - 1 + as.Date ("1990-01-01")
[1] "1990-03-17"
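The one-liner above can be wrapped in a small helper. This is a sketch only: yday_to_date() is a hypothetical name, not a function from lubridate or base R, and it assumes the year is known for every value.

```r
# Hypothetical helper: convert a day-of-year plus its year back to a Date
yday_to_date <- function(yday, year) {
  # Start at January 1st of each year, then shift by (yday - 1) days
  as.Date(paste0(year, "-01-01")) + (yday - 1)
}

# Day 60 falls on different calendar dates in non-leap and leap years
d <- yday_to_date(c(60, 60), c(2015, 2016))
d
# [1] "2015-03-01" "2016-02-29"

# format() gives the mm-dd-yyyy strings asked for in the question
format(d, "%m-%d-%Y")
# [1] "03-01-2015" "02-29-2016"
```

Note that the result of format() is a character vector, so keep the Date object around if you still need date arithmetic.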

Just to round out the answer by @DirkEddelbuettel, since I was having the same issue and it's been 3 years.
To get the desired mm-dd-yyyy date for a dataset where you have only the year and the day of the year (the output of lubridate::yday()), offset a Date object set to January 1st of that year by the yday() value minus one.
Then, if you want the month or the day within that month, you can use lubridate's month() and day() functions to pull out those parts.
(I assume you have the year, since leap years shift the calendar date and would otherwise throw off the month/day assignment. If not, any year will do.)
library(dplyr)
library(magrittr)
# Example dataset with years on/around a leap year
my_df <- data.frame(
  year = c(2010, 2011, 2012, 2013, 2014, 2015),
  my_yday = c(150, 150, 150, 150, 150, 150)
)
# Skip straight back to the yyyy-mm-dd format
my_df %>% mutate(new_date = as.Date(paste0(year, "-01-01")) + (my_yday - 1))
# Get every component
my_df %>%
  mutate(
    new_day = lubridate::day(as.Date(paste0(year, "-01-01")) + (my_yday - 1)),
    new_month = lubridate::month(as.Date(paste0(year, "-01-01")) + (my_yday - 1)),
    # paste() replaces str_c(), which needs stringr; %Y parses a four-digit year
    new_date = as.POSIXct(paste(new_month, new_day, year, sep = "/"),
                          format = "%m/%d/%Y"))
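A quick way to check the conversion is to round-trip it: turn the rebuilt dates back into day-of-year values and compare. The sketch below uses only base R (format() with %j stands in for lubridate::yday()) and repeats the example values from the data frame above.

```r
years <- c(2010, 2011, 2012, 2013, 2014, 2015)
ydays <- rep(150, 6)

# Rebuild the dates: January 1st of each year plus (yday - 1) days
dates <- as.Date(paste0(years, "-01-01")) + (ydays - 1)

# %j formats a date as its zero-padded day of the year
back <- as.integer(format(dates, "%j"))
all(back == ydays)
# [1] TRUE

# The leap year 2012 lands one calendar day earlier than the others
format(dates, "%m-%d")
# [1] "05-30" "05-30" "05-29" "05-30" "05-30" "05-30"
```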

Related

Changing date formats and calculating duration using lubridate

I am trying to get the duration between issue_d and last_pymnt_d in my dataset using the lubridate package. issue_d is in the following format: chr "2015-05-01T00:00:00Z", and last_pymnt_d is in chr "Feb-2017". I need them in the same format (I just need "my"; "myd" is fine if "my" is not an option). Then I need to calculate the duration between issue_d and last_pymnt_d.
lcDataSet2$issue_d<-parse_date_time(lcDataSet2$issue_d, "myd")
turns my issue_d into NA. I also get the below error when even just trying to view last_pymnt_d in date format
as.Date(lcRawData$last_pymnt_d)
Error in charToDate(x) :
character string is not in a standard unambiguous format
How can I get these into the same date format and then calculate the duration?
The order and letter case of the format string are important for parsing dates.
library(lubridate)
parse_date_time('2015-05-01T00:00:00Z', 'Y-m-d H:M:S')
[1] "2015-05-01 UTC"
parse_date_time('Feb-2017', 'b-Y')
[1] "2017-02-01 UTC"
If you want just the month and year, there is a zoo function:
library(zoo)
date1 <- as.yearmon('2015-05-01T00:00:00Z')
[1] "May 2015"
date2 <- as.yearmon('Feb-2017', '%b-%Y')
[1] "Feb 2017"
difftime(date2, date1)
Time difference of 642 days
The zoo package gives you a function as.yearmon to convert dates into yearmon objects containing only the month and year. Since your last_pymnt_d is only month and year, the best date difference you will get is number of months:
library(zoo)
issue_d <- "2015-05-01T00:00:00Z"
last_pymnt_d <- "Feb-2017"
diff <- as.yearmon(last_pymnt_d, format = "%b-%Y") - as.yearmon(as.Date(issue_d))
diff
1.75
Under the hood, the yearmon object is a number of years, with the decimal component representing the months. A difference in yearmon of 1.75 is 1 year and 9 months.
diff_months <- paste(round(diff * 12, 0), "months")
"21 months"
diff_yearmon <- paste(floor(diff), "years and", round((diff %% 1) * 12, 0), "months")
diff_yearmon
"1 years and 9 months"

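The same month count can be cross-checked without zoo. This is a rough sketch in base R, assuming an English locale (the %b month abbreviation is locale-dependent) and reusing the two example strings from the question; months_between is an illustrative name.

```r
issue_d <- "2015-05-01T00:00:00Z"
last_pymnt_d <- "Feb-2017"

# Keep only the yyyy-mm-dd part of the timestamp
issue <- as.POSIXlt(as.Date(substr(issue_d, 1, 10)))
# Prepend a day so the month-year string becomes a complete date
last <- as.POSIXlt(as.Date(paste0("01-", last_pymnt_d), format = "%d-%b-%Y"))

# POSIXlt stores years since 1900 and months as 0-11
months_between <- (last$year - issue$year) * 12 + (last$mon - issue$mon)
months_between
# [1] 21
```

This agrees with the 1.75 years (21 months) obtained from the yearmon difference.
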
as.POSIXct does not recognise date format = "%Y-%W"

library(xts)
data <- data.frame(year_week = c("2016-46", "2016-47", "2016-48"),
satisfaction = c(0.25, 0.45, 0.58))
data = xts(data[-1], order.by = as.POSIXct(data$year_week, format = "%Y-%W"))
I want to create an xts object from the data.frame data where the dates keep the format year-week. When I am running the code the columns take the form 2016-12-05 which is incorrect and far from what I am trying to achieve.
This is a variant on a quasi-FAQ on 'why can one not parse year and month as a date': because a date is day and month and year.
Or a year and week and day. Otherwise you are indeterminate:
> as.Date(format(Sys.Date(), "%Y-%W-%d"), "%Y-%W-%d")
[1] "2017-12-04"
>
using
> Sys.Date()
[1] "2017-12-04"
> format(Sys.Date(), "%Y-%W-%d")
[1] "2017-49-04"
>
so %W works on input and output provided you also supply a day.
For input data that does not have a day, you could just add a given weekday, say 1:
> as.Date(paste0(c("2016-46", "2016-47", "2016-48"), "-1"), "%Y-%W-%w")
[1] "2016-11-14" "2016-11-21" "2016-11-28"
>
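Putting this together for the data in the question: the time index has to consist of full dates, but the year-week label can be regenerated from those dates whenever it is needed for display. A minimal base R sketch, where week_start and labels are illustrative names:

```r
year_week <- c("2016-46", "2016-47", "2016-48")

# Append weekday 1 (Monday) so each string identifies exactly one day
week_start <- as.Date(paste0(year_week, "-1"), format = "%Y-%W-%w")
week_start
# [1] "2016-11-14" "2016-11-21" "2016-11-28"

# Formatting the dates with the same week convention recovers the labels
labels <- format(week_start, "%Y-%W")
identical(labels, year_week)
# [1] TRUE
```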

POSIX date from dates in weekly time format

I have dates encoded in a weekly time format (European convention >> 01 through 52/53, e.g. "2016-48") and would like to standardize them to a POSIX date:
require(magrittr)
(x <- as.POSIXct("2016-12-01") %>% format("%Y-%V"))
# [1] "2016-48"
as.POSIXct(x, format = "%Y-%V")
# [1] "2016-01-11 CET"
I expected the last statement to return "2016-12-01" again. What am I missing here?
Edit
Thanks to Dirk, I was able to piece it together:
y <- sprintf("%s-1", x)
While I still don't get why this doesn't work
(as.POSIXct(y, format = "%Y-%V-%u"))
# [1] "2016-01-11 CET"
this does
(as.POSIXct(y, format = "%Y-%U-%u"))
# [1] "2016-11-28 CET"
Edit 2
Oh my, I think using %V is a very bad idea in general:
as.POSIXct("2016-01-01") %>% format("%Y-%V")
# [1] "2016-53"
Should this be considered to be on a "serious bug" level that requires further action?!
Sticking to either %U or %W seems to be the right way to go
as.POSIXct("2016-01-01") %>% format("%Y-%U")
# [1] "2016-00"
Edit 3
Nope, not quite finished/still puzzled: the approach doesn't work for the very first week
(x <- as.POSIXct("2016-01-01") %>% format("%Y-%W"))
# [1] "2016-00"
as.POSIXct(sprintf("%s-1", x), format = "%Y-%W-%u")
# [1] NA
It does for week 01 as defined in the underlying convention when using %U or %W (so "week 2", actually)
as.POSIXct("2016-01-1", format = "%Y-%W-%u")
# [1] "2016-01-04 CET"
As I have to deal a lot with reporting by ISO weeks, I created the ISOweek package some years ago.
The package includes the function ISOweek2date() which returns the date of a given weekdate (year, week of the year, day of week according to ISO 8601). It's the inverse function to date2ISOweek().
With ISOweek, your examples become:
library(ISOweek)
# define dates to convert
dates <- as.Date(c("2016-12-01", "2016-01-01"))
# convert to full ISO 8601 week-based date yyyy-Www-d
(week_dates <- date2ISOweek(dates))
[1] "2016-W48-4" "2015-W53-5"
# convert back to class Date
ISOweek2date(week_dates)
[1] "2016-12-01" "2016-01-01"
Note that ISOweek2date() requires a full ISO week-based date in the format yyyy-Www-d including the day of the week (1 to 7, Monday to Sunday).
So, if you only have year and ISO week number you have to create a character string with a day of the week specified.
A typical phrase in many reports is, e.g., "reporting week 31 ending 2017-08-06":
yr <- 2017
wk <- 31
ISOweek2date(sprintf("%4i-W%02i-%1i", yr, wk, 7))
[1] "2017-08-06"
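For readers who want to see the arithmetic behind such a conversion, here is a rough base R sketch. It is not how the ISOweek package is implemented; iso_week_to_date() is a hypothetical helper that relies on the ISO 8601 rule that January 4th always lies in week 1, and it does no validation of the week number.

```r
# Hypothetical helper, not part of the ISOweek package
iso_week_to_date <- function(year, week, weekday = 1) {
  # By ISO 8601, January 4th always lies in week 1
  jan4 <- as.Date(paste0(year, "-01-04"))
  # Weekday of January 4th as 1 (Monday) to 7 (Sunday)
  jan4_wday <- (as.POSIXlt(jan4)$wday + 6) %% 7 + 1
  # Monday of week 1, then move forward by whole weeks and days
  week1_monday <- jan4 - (jan4_wday - 1)
  week1_monday + 7 * (week - 1) + (weekday - 1)
}

iso_week_to_date(2017, 31, 7)
# [1] "2017-08-06"
iso_week_to_date(c(2016, 2015), c(48, 53), c(4, 5))
# [1] "2016-12-01" "2016-01-01"
```

The results match the ISOweek2date() output above. In the other direction, the strptime documentation lists %G (the ISO week-based year) alongside %V for output; pairing %V with the calendar year %Y is what produces labels such as "2016-53" for 2016-01-01.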
Addendum
Please, see this answer for another use case and more background information on the ISOweek package.

Get Dates of a Certain Weekday from a Year in R

How might I generate a list of date objects (POSIXct or lt) for each Monday of a year?
For instance this year would be (In Year, Month, Day):
2012_01_02,
2012_01_09,
2012_01_16,
etc
EDIT: On further reflection, here's a cleaner function for doing the same thing:
getAllMondays <- function(year) {
  days <- as.POSIXlt(paste(year, 1:366, sep="-"), format="%Y-%j")
  Ms <- days[days$wday==1]
  Ms[!is.na(Ms)] # Needed to remove NA from day 366 in non-leap years
}
getAllMondays(2012)
Here's a function that'll perform the more general task of finding the first Monday in an arbitrary year, and then listing it and all of the other Mondays in that year. It uses seq.POSIXt(), and the argument by = "week" (which is also available for seq.Date()).
getAllMondays <- function(year) {
  day1 <- as.POSIXlt(paste(year, "01-01", sep="-"))
  day365 <- as.POSIXlt(paste(year, "12-31", sep="-"))
  # Find the first Monday of year
  week1 <- as.POSIXlt(seq(day1, length.out=7, by="day"))
  monday1 <- week1[week1$wday == 1]
  # Return all Mondays in year
  seq(monday1, day365, by="week")
}
head(getAllMondays(2012))
# [1] "2012-01-02 PST" "2012-01-09 PST" "2012-01-16 PST" "2012-01-23 PST"
# [5] "2012-01-30 PST" "2012-02-06 PST"
I found seq.Date which is part of base. Not sure if there are caveats to this method but it seems to do what I want:
x = seq(as.Date("2012/01/02"), as.Date("2013/01/01"), "7 days")
as.POSIXct(x)
as.Date("2012_01_02", format="%Y_%m_%d") +seq(0,366,by=7) # 2012 is a leap year.
If you really want them as DateTimes with all the attendant hassles of timezones then you can coerce them with as.POSIXct.
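The Date-based approach generalises to any weekday without hard-coding the first Monday. A minimal sketch, where get_all_weekdays() is a hypothetical name and the weekday numbering follows POSIXlt (0 = Sunday, 1 = Monday, ..., 6 = Saturday):

```r
# Hypothetical generalisation: all dates in `year` that fall on weekday `wday`
get_all_weekdays <- function(year, wday = 1) {
  # Every day of the year as a Date, so no time zones are involved
  days <- seq(as.Date(paste0(year, "-01-01")),
              as.Date(paste0(year, "-12-31")), by = "day")
  # Keep only the days whose weekday matches
  days[as.POSIXlt(days)$wday == wday]
}

mondays <- get_all_weekdays(2012)
head(mondays, 3)
# [1] "2012-01-02" "2012-01-09" "2012-01-16"
length(mondays)
# [1] 53
```

Because the sequence runs from January 1st to December 31st, leap years need no special handling.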
