R strptime Monday date from weeknumber weird

I am making a trivial error here but cannot figure out what the problem is.
I need to get the date of the Monday of the week containing an arbitrary date, but I seem to be getting something quite different:
library(lubridate) # provides date(), month(), year(), day() and isoweek()
mydate <- date("2013-11-05")
format(mydate, "%A") # this is Tuesday, right
#[1] "Tuesday"
month(mydate) # Month November, right
#[1] 11
myyr <- year(mydate); myyr # year is 2013, right
#[1] 2013
day(mydate) # day number is 5, right
#[1] 5
mywk <- isoweek(mydate);mywk # weeknumber is 45, right (yes, not US convention here)
#[1] 45
format(mydate, "%V") # weeknumber is 45, right as well
#[1] "45"
# Monday of week 45 is 2013-11-04 but strptime following gives something else...
strptime(paste0(myyr, "Monday", mywk), "%Y%A%V")
#[1] "2013-11-19 EET"
# and for checking
strptime("2013Monday45","%Y%A%V")
#[1] "2013-11-19 EET"
Thanks in advance

Gabor's comment is all you need, essentially. Here is a full function:
mondayForGivenDate <- function(d) {
    if (!inherits(d, "Date")) d <- anytime::anydate(d)
    # wday is 0 (Sunday) to 6 (Saturday); a Sunday input therefore maps to the following Monday
    d - as.POSIXlt(d)$wday + 1
}
Running this for today (a Saturday), next Monday and the previous Saturday gets us three different Mondays, as you would expect:
R> mondayForGivenDate(Sys.Date())
[1] "2016-11-14"
R> mondayForGivenDate(Sys.Date()+2)
[1] "2016-11-21"
R> mondayForGivenDate(Sys.Date()-7)
[1] "2016-11-07"
R>
The use of the anydate() function from the anytime package is optional but nice, because you now get to use many different input formats:
R> mondayForGivenDate(20161119)
[1] "2016-11-14"
R> mondayForGivenDate("20161119")
[1] "2016-11-14"
R> mondayForGivenDate("2016-11-19")
[1] "2016-11-14"
R> mondayForGivenDate("2016-Nov-19")
[1] "2016-11-14"
R>
The key point, once again, is to work with the proper Date and/or POSIXt classes in R, which almost always give you what is needed -- in this case the wday component for the day of the week, which is what you need to step back to the week's Monday.
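For ISO weeks, which run Monday through Sunday as in the question, a Sunday should map to the preceding Monday rather than the following one. The following is a minimal base R sketch of that variant; the name monday_of_iso_week is hypothetical and not part of the answer above.

```r
# Monday of the ISO week (Monday..Sunday) that contains d.
# monday_of_iso_week is a hypothetical helper name used for illustration.
monday_of_iso_week <- function(d) {
  d <- as.Date(d)
  # wday is 0 for Sunday, 1 for Monday, ..., 6 for Saturday, so
  # (wday + 6) %% 7 is the number of days since the most recent Monday.
  d - (as.POSIXlt(d)$wday + 6) %% 7
}

monday_of_iso_week("2013-11-05")  # a Tuesday -> "2013-11-04"
monday_of_iso_week("2013-11-10")  # a Sunday  -> "2013-11-04"
```

The modulo keeps the offset in the range 0 to 6, so the result never lands after the input date.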

Related

Why subtracting months from a lubridate date gives inconsistent results? [duplicate]

I'm trying to subtract n months from a date as follows:
maturity <- as.Date("2012/12/31")
m <- as.POSIXlt(maturity)
m$mon <- m$mon - 6
but the resulting date is 01-Jul-2012, and not 30-Jun-2012, as I would expect.
Is there a short way to get that result?
1) seq.Date. Note that June has only 30 days, so it cannot give June 31st and gives July 1st instead.
seq(as.Date("2012/12/31"), length = 2, by = "-6 months")[2]
## [1] "2012-07-01"
If we knew it was at month end we could do this:
seq(as.Date(cut(as.Date("2012/12/31"), "month")), length=2, by="-5 month")[2]-1
## [1] "2012-06-30"
2) yearmon. Also if we knew it was month end then we could use the "yearmon" class of the zoo package like this:
library(zoo)
as.Date(as.yearmon(as.Date("2012/12/31")) -.5, frac = 1)
## [1] "2012-06-30"
This converts the date to "yearmon", subtracts 6 months (.5 of a year) and then converts it back to "Date" using frac=1, which means the end of the month (frac=0 would mean the beginning of the month). This also has the advantage over the previous solution that it is vectorized automatically, i.e. as.Date(...) could have been a vector of dates.
Note that if "Date" class is only being used as a way of representing months then we can get rid of it altogether and directly use "yearmon" since that models what we want in the first place:
as.yearmon("2012-12") - .5
## [1] "Jun 2012"
3) mondate. A third solution is the mondate package which has the advantage here that it returns the end of the month 6 months ago without having to know that we are month end:
library(mondate)
mondate("2011/12/31") - 6
## mondate: timeunits="months"
## [1] 2011/06/30
This is also vectorized.
4) lubridate. This lubridate answer has been changed in line with changes in the package:
library(lubridate)
as.Date("2012/12/31") %m-% months(6)
## [1] "2012-06-30"
lubridate is also vectorized.
5) sqldf/SQLite
library(sqldf)
sqldf("select date('2012-12-31', '-6 months') as date")
## date
## 1 2012-07-01
or if we knew we were at month end:
sqldf("select date('2012-12-31', '+1 day', '-6 months', '-1 day') as date")
## date
## 1 2012-06-30
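If no add-on package is available, the same "roll back to the last valid day" behaviour can be approximated in base R. This is only a sketch under that assumption; add_months_clamped is a hypothetical name, not a function from any of the packages above.

```r
# Shift dates by n months, clamping the day to the last day of the target month.
# add_months_clamped is a hypothetical helper name used for illustration.
add_months_clamped <- function(d, n) {
  lt <- as.POSIXlt(as.Date(d))
  day <- lt$mday
  # Move to the first of the month so the month shift itself cannot overflow.
  lt$mday <- rep_len(1L, length(day))
  lt$mon <- lt$mon + n
  first <- as.Date(lt)             # out-of-range months are normalised here
  # Last day of the target month: first of the following month minus one day.
  nxt <- as.POSIXlt(first)
  nxt$mon <- nxt$mon + 1
  last_day <- as.POSIXlt(as.Date(nxt) - 1)$mday
  first + pmin(day, last_day) - 1
}

add_months_clamped("2012-12-31", -6)  # "2012-06-30"
add_months_clamped("2000-03-31", -1)  # "2000-02-29"
```

Unlike the seq() approach, this does not overflow into the next month when the day does not exist in the target month.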
You can use the lubridate package for this:
library(lubridate)
maturity <- maturity %m-% months(6)
There is no reason to change the day field.
You can set the day field back to the last day of that month with
day(maturity) <- days_in_month(maturity)
lubridate works correctly with such calculations:
library(lubridate)
as.Date("2000-01-01") - days(1) # 1999-12-31
as.Date("2000-03-31") - months(1) # 2000-02-29 as reported here; recent lubridate releases return NA, and %m-% months(1) gives 2000-02-29
but sometimes fails:
as.Date("2000-02-29") - years(1) # NA, should be 1999-02-28
In addition to lubridate, the tidyverse has added the clock package, which has nice functionality for this:
library(clock)
# sequence of dates
date_build(2018, 1:5, 31, invalid = "previous")
[1] "2018-01-31" "2018-02-28" "2018-03-31" "2018-04-30" "2018-05-31"
Building the sequence produces dates such as 2018-02-31, which is not a valid date. The invalid argument makes explicit what to do in this case: "previous" rolls back to the previous valid date, here the last day of February.
There is also a series of add functions; in your case you would use add_months. Again, it has an invalid argument that you can specify:
x <- as.Date("2022-03-31")
# The previous valid moment in time
add_months(x, -1, invalid = "previous")
[1] "2022-02-28"
# The next valid moment in time, 2022-02-31 is not a valid date
add_months(x, -1, invalid = "next")
[1] "2022-03-01"
# Overflow the days. There were 28 days in February 2022, but we
# specified 31, so this overflows 3 days past day 28.
add_months(x, -1, invalid = "overflow")
[1] "2022-03-03"
You can also set invalid to "NA"; if you leave off this argument, an invalid date produces an error.
Technically, you cannot add or subtract one month from every date (you can always add or subtract 30 days, but I suppose that is not what you want). I think this is what you are looking for:
> lubridate::ceiling_date(as.Date("2020-01-31"), unit = "month")
[1] "2020-02-01"
> lubridate::floor_date(as.Date("2020-01-31"), unit = "month")
[1] "2020-01-01"
UPDATE, I just realised that Tung-nguyen also wrote the same method and has a two line version here https://stackoverflow.com/a/44690219/19563460
Keeping this answer here so newbies can see different ways of doing it
With the R updates, you can now do this easily in base R using seq.Date(). Here are some examples that should work without additional packages:
ANSWER 1: typing directly
maturity <- as.Date("2012/12/31")
seq(maturity, length.out=2, by="-3 months")[2]
# see here for more help
?seq.Date
ANSWER 2: Adding in some flexibility, e.g. 'n' months
maturity <- as.Date("2012/12/31")
n <- 3
bytime <- paste("-",n," months",sep="")
seq(maturity,length.out=2,by=bytime)[2]
ANSWER 3: Make a function
# Here's a little function that will let you add X days/months/weeks
# to any base R date. Commented for new users
#---------------------------------------------------------
# MyFunction
# DateIn, either a date or a string that as.Date can convert into one
# TimeBack, number of units back/forward
# TimeUnit, unit of time e.g. "weeks"/"month"/"days"
# Direction can be "back" or "forward", not case sensitive
#---------------------------------------------------------
MyFunction <- function(DateIn,TimeBack,TimeUnit,Direction="back"){
#--- Set up the by string
if(tolower(Direction)=="back"){
bystring <- paste("-",TimeBack," ",tolower(TimeUnit),sep="")
}else{
bystring <- paste(TimeBack," ",tolower(TimeUnit),sep="")
}
#--- Return the new date using seq in the base package
output <- seq(as.Date(DateIn),length.out=2,by=bystring)[2]
return(output)
}
# EXAMPLES
MyFunction("2000-02-29",3,"months","forward")
Answer <- MyFunction(DateIn="2002-01-01",TimeBack=14,
TimeUnit="weeks",Direction="back")
print(Answer)
maturity <- as.Date("2012/12/31")
n <- 3
MyFunction(DateIn=maturity,TimeBack=n,TimeUnit="months",Direction="back")
ANSWER 4: I quite like my little function, so I just uploaded it to my mini personal R package.
This is freely available, so now technically the answer is use the JumpDate function from the Greatrex.Functions package
Can't guarantee it'll work forever and no support available, but you're welcome to use it.
# Install/load my package
install.packages("remotes")
remotes::install_github('hgreatrex/Greatrex.Functions',force=TRUE)
library(Greatrex.Functions)
# run it
maturity <- as.Date("2012/12/31")
n <- 3
Answer <- JumpDate(DateIn=maturity,TimeBack=n,TimeUnit="months",
Direction="back",verbose=TRUE)
print(Answer)
JumpDate("2000-02-29",3,"months","forward")
# Help file here
?Greatrex.Functions::JumpDate
You can see how I made the function/package here:
https://github.com/hgreatrex/Greatrex.Functions/blob/master/R/JumpDate.r
With nice instructions here on making your own mini compilation of functions.
http://web.mit.edu/insong/www/pdf/rpackage_instructions.pdf
and here
How do I insert a new function into my R package?
Hope that helps! I hope it's also useful to see the different levels of designing an answer to a coding problem, depending on how often you need it and the level of flexibility you need.

POSIX date from dates in weekly time format

I have dates encoded in a weekly time format (European convention >> 01 through 52/53, e.g. "2016-48") and would like to standardize them to a POSIX date:
require(magrittr)
(x <- as.POSIXct("2016-12-01") %>% format("%Y-%V"))
# [1] "2016-48"
as.POSIXct(x, format = "%Y-%V")
# [1] "2016-01-11 CET"
I expected the last statement to return "2016-12-01" again. What am I missing here?
Edit
Thanks to Dirk, I was able to piece it together:
y <- sprintf("%s-1", x)
While I still don't get why this doesn't work
(as.POSIXct(y, format = "%Y-%V-%u"))
# [1] "2016-01-11 CET"
this does
(as.POSIXct(y, format = "%Y-%U-%u"))
# [1] "2016-11-28 CET"
Edit 2
Oh my, I think using %V is a very bad idea in general:
as.POSIXct("2016-01-01") %>% format("%Y-%V")
# [1] "2016-53"
Should this be considered to be on a "serious bug" level that requires further action?!
Sticking to either %U or %W seems to be the right way to go
as.POSIXct("2016-01-01") %>% format("%Y-%U")
# [1] "2016-00"
Edit 3
Nope, not quite finished/still puzzled: the approach doesn't work for the very first week
(x <- as.POSIXct("2016-01-01") %>% format("%Y-%W"))
# [1] "2016-00"
as.POSIXct(sprintf("%s-1", x), format = "%Y-%W-%u")
# [1] NA
It does for week 01 as defined in the underlying convention when using %U or %W (so "week 2", actually)
as.POSIXct("2016-01-1", format = "%Y-%W-%u")
# [1] "2016-01-04 CET"
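For context on the %V results above: the R documentation for strptime describes %V (the ISO 8601 week) and %G (the matching week-based year) as accepted but ignored on input, so they can be used to format a date but not to parse one. On output, %V belongs with %G rather than %Y, as this short sketch illustrates:

```r
d <- as.Date("2016-01-01")

# Calendar year paired with the ISO week number: a misleading combination.
format(d, "%Y-%V")  # "2016-53"

# ISO week-based year paired with the ISO week number: consistent.
format(d, "%G-%V")  # "2015-53"
```

1 January 2016 was a Friday, so under ISO 8601 it still belongs to week 53 of the week-based year 2015.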
As I have to deal a lot with reporting by ISO weeks, I've created the ISOweek package some years ago.
The package includes the function ISOweek2date() which returns the date of a given weekdate (year, week of the year, day of week according to ISO 8601). It's the inverse function to date2ISOweek().
With ISOweek, your examples become:
library(ISOweek)
# define dates to convert
dates <- as.Date(c("2016-12-01", "2016-01-01"))
# convert to full ISO 8601 week-based date yyyy-Www-d
(week_dates <- date2ISOweek(dates))
[1] "2016-W48-4" "2015-W53-5"
# convert back to class Date
ISOweek2date(week_dates)
[1] "2016-12-01" "2016-01-01"
Note that ISOweek2date() requires a full ISO week-based date in the format yyyy-Www-d, including the day of the week (1 to 7, Monday to Sunday).
So, if you only have year and ISO week number you have to create a character string with a day of the week specified.
A typical phrase in many reports is, e.g., "reporting week 31 ending 2017-08-06":
yr <- 2017
wk <- 31
ISOweek2date(sprintf("%4i-W%02i-%1i", yr, wk, 7))
[1] "2017-08-06"
Addendum
Please, see this answer for another use case and more background information on the ISOweek package.
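For readers who cannot install a package, the same conversion can be sketched in base R using the ISO 8601 rule that 4 January always falls in week 1. The function name iso_week_to_date is hypothetical and is not part of the ISOweek package.

```r
# Date for a given ISO week-based year, ISO week and ISO weekday (1 = Monday ... 7 = Sunday).
# iso_week_to_date is a hypothetical helper name used for illustration.
iso_week_to_date <- function(year, week, wday = 1) {
  # 4 January is always in ISO week 1 of its year.
  jan4 <- as.Date(sprintf("%d-01-04", as.integer(year)))
  # Step back to the Monday of that week (POSIXlt wday: 0 = Sunday).
  monday_w1 <- jan4 - (as.POSIXlt(jan4)$wday + 6) %% 7
  monday_w1 + 7 * (week - 1) + (wday - 1)
}

iso_week_to_date(2016, 48, 4)  # "2016-12-01"
iso_week_to_date(2017, 31, 7)  # "2017-08-06"
```

This sketch does no validation, for example it does not check whether the requested year actually has a week 53.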

Converting dates before January 1, 1970 in R

I am trying to convert a column of dates into Date objects in R, but I can't seem to get the desired results. These individuals have birth dates before January 1, 1970, so when I use as.Date R converts a date like 1/12/54, for example, to 2054-01-12. How can I work around this? Thanks so much.
No need for add-on packages, base R is fine. But you need to specify the century:
R> as.Date("1954-01-12")
[1] "1954-01-12"
R>
If you need non-default formats, just specify them:
R> as.Date("19540112", "%Y%m%d")
[1] "1954-01-12"
R>
Edit: In case your data really comes in using the %y format, and you happen to make the policy decision that the 1900s are needed, here is one base R way of doing it:
R> d <- as.Date("540112", "%y%m%d")
R> dlt <- as.POSIXlt(d)
R> dlt$year <- dlt$year - 100
R> as.Date(dlt)
[1] "1954-01-12"
R>
If everything is in the 1900s, it's a one-liner: just format it with a two-digit year, slap a 19 on the front, and convert to a date again. Man, this would look cool with some %>% stuff:
s = c("1/12/54","1/12/74")
as.Date(format(as.Date(s,format="%d/%m/%y"), "19%y%m%d"), "%Y%m%d")
# [1] "1954-12-01" "1974-12-01"
If years from "69" to "99" are 1800s, then here's another one-liner:
library(dplyr) # for pipe operator:
s %>% as.Date(format="%d/%m/%y") %>%
format("%y%m%d") %>%
(function(d){
paste0(ifelse(d>700101,"18","19"),d)
}) %>%
as.Date("%Y%m%d")
## [1] "1954-12-01" "1874-12-01"
Note: not thoroughly tested, so there might be some off-by-one errors, or I may have mixed up months and days (which is why you should be ISO 8601 compliant).
I would do:
library(lubridate)
x <- as.Date("1/12/54", format = "%m/%d/%y")
year(x) <- 1900 + year(x) %% 100
> x
[1] "1954-01-12"
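For a whole column of birth dates, one possible policy is to shift back by a century only those parsed dates that land in the future. The sketch below assumes that policy and month/day/year input; fix_century and its latest argument are hypothetical names, not from the answers above.

```r
# Parse two-digit-year dates and move any that fall after `latest` back 100 years.
# fix_century and its arguments are hypothetical names used for illustration.
fix_century <- function(x, format = "%m/%d/%y", latest = Sys.Date()) {
  d <- as.Date(x, format = format)
  lt <- as.POSIXlt(d)
  # Birth dates cannot lie in the future, so those must belong to the previous century.
  future <- !is.na(d) & d > latest
  lt$year <- lt$year - ifelse(future, 100L, 0L)
  as.Date(lt)
}

fix_century(c("1/12/54", "3/5/12"), latest = as.Date("2020-01-01"))
# "1954-01-12" "2012-03-05"
```

Fixing latest in the call, as in the example, keeps the result reproducible regardless of when the code is run.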

Day of the Week, Month, Year, Week of the Year in R

Are there any built-in functions that can be used on a data frame to generate variables from a Date-class time series, such as day of the week, month, year, week of the year, etc. in R?
The weekdays, months and quarters functions in the base package generate text output; I am looking for numerical output to denote that 3/5/2012, for example, is a Friday, the 3rd day of the month, in week 1 of the month, the 63rd day of the year, etc.
You get a few of those just from POSIXlt, with its weird conventions: the year is stored as years since 1900 and
the month is in the 0 to 11 range -- but you do get weekday and day-of-the-year.
R> dd <- as.Date("2012-05-03")
R> as.POSIXlt(dd)
[1] "2012-05-03 UTC"
Then
R> unclass(as.POSIXlt(dd))
$sec
[1] 0
$min
[1] 0
$hour
[1] 0
$mday
[1] 3
$mon
[1] 4
$year
[1] 112
$wday
[1] 4
$yday
[1] 123
$isdst
[1] 0
attr(,"tzone")
[1] "UTC"
R>
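Putting those components into ordinary numeric columns might look like the following sketch, which adjusts for the POSIXlt offsets. The variable name parts and the column names are arbitrary choices for illustration.

```r
dd <- as.Date("2012-05-03")
lt <- as.POSIXlt(dd)

# One numeric column per component; works the same for a vector of dates.
parts <- data.frame(
  year    = lt$year + 1900,                  # stored as years since 1900
  month   = lt$mon + 1,                      # stored as 0..11
  mday    = lt$mday,                         # day of the month, 1..31
  wday    = lt$wday,                         # 0 = Sunday ... 6 = Saturday
  yday    = lt$yday + 1,                     # stored as 0..365
  isoweek = as.integer(format(dd, "%V"))     # ISO 8601 week number
)
parts
```

For 2012-05-03 this gives year 2012, month 5, mday 3, wday 4 (a Thursday), yday 124 and ISO week 18.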
You can use the lubridate package to do a lot with dates.
From the help file: Lubridate provides tools that make it easier to parse and manipulate dates.
For example:
> library(lubridate)
> d <- today()
> d
[1] "2014-04-29"
> day(d)
[1] 29
> month(d)
[1] 4
> year(d)
[1] 2014
> week(d)
[1] 18
> weekdays(d)
[1] "Tuesday"
> days_in_month(d)
Apr
30
I prefer it to the built-in functions because it has a lot of date splicing, casting and arithmetic functions.
There are a couple of options that I can think of.
First, you could convert the column with as.POSIXlt() so that you can subset with things like df$date$yday. A POSIXlt object stores the elements of the date as a list underneath, which can be accessed that way.
Also, the package lubridate has functions like
yday(x)
wday(x)
mday(x)

R Strptime Year and Month with No Delimiter returning NA

I'm probably doing something stupid and not seeing it, but:
> strptime("201101","%Y%m")
[1] NA
From help strptime:
%Y Year with century
%m Month as decimal number (01–12)
Just paste a day field (say, "01") that you ignore:
R> shortdate <- "201101"
R> as.Date(paste(shortdate, "01", sep=""), "%Y%m%d")
[1] "2011-01-01"
R>
I prefer as.Date() for dates and strptime() for POSIXlt objects, i.e. dates and times.
You can then convert the parsed Date object into a POSIXlt object to retrieve year and month:
R> mydt <- as.Date(paste(shortdate, "01", sep=""), "%Y%m%d")
R> myp <- as.POSIXlt(mydt)
R> c(myp$year, myp$mon)
[1] 111 0
R>
This is standard POSIX behaviour with years as "year - 1900" and months as zero-indexed.
Edit seven years later: For completeness, and as someone just upvoted this, the functions in my anytime package can help:
R> anytime::anydate("201101") ## returns a Date
[1] "2011-01-01"
R> anytime::anytime("201101") ## returns a Datetime
[1] "2011-01-01 CST"
R>
These use a different parser (from Boost Date_Time), which is more generous and imputes the missing day (or day/hour/minute/second in the second case).
