unexpected behaviour of as.Date() with %W format - r

I thought I had understood how I can use as.Date() but I get an unexpected result when using the UK convention for counting weeks in a given year (%W).
I have a year-week combination and want to find the date corresponding to Monday and Sunday of that week (if they exist, so it's fine that NA is returned if the week does not contain Mon or Sun). In the UK week counting, the week starts with a Monday. So to find the Monday of, say, week 1 in 2016, I use the following code, which returns the correct result (4 Jan 2016):
as.Date("2016011", format = "%Y%W%u")
To find the Sunday of that week, I change the last number to 7 because %u takes 1 to be Monday and 7 to be Sunday (I have also used %w instead with its definition of Sunday 0 and Monday 1 but with the same result):
as.Date("2016017", format = "%Y%W%u")
My expected output is 10 Jan 2016 but I get 3 Jan 2016. So it seems that as.Date() treats the week as beginning with Sunday. This however contradicts the definition of %W.
Any ideas what I'm missing? Thanks!

For working with dates, I recommend the lubridate package.
Below are some examples.
library(lubridate)
date1 = ymd("20160103")
weekdays(date1) # Name of the weekday in the current locale (here Polish for "Sunday")
#[1] "niedziela"
wday(date1) # Weekday number; by default Sunday is day 1
#[1] 1
wday(date1, week_start = 1) # Weekday number with Monday as day 1
#[1] 7
You can set the lubridate.week.start option to control this parameter globally.
Here's a bit more information.
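As a possible base R workaround (a minimal sketch, not a fix for as.Date() itself): the question reports that parsing the Monday with %W%u gives the expected date, so the Sunday of the same week can be derived by adding six days to that Monday. The helper name uk_week_range is hypothetical, and the sketch does not return NA when the week spills over into an adjacent year.

```r
# Hypothetical helper: Monday and Sunday of a UK-convention (%W) week.
# Assumes the Monday parse behaves as reported in the question.
uk_week_range <- function(year, week) {
  monday <- as.Date(sprintf("%04d%02d1", year, week), format = "%Y%W%u")
  list(monday = monday, sunday = monday + 6)
}

rng <- uk_week_range(2016, 1)
rng$monday  # 4 Jan 2016, as in the question
rng$sunday  # six days later, 10 Jan 2016
```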

Related

Convert from character to date in a "YYYY-WW" format in R

I have a hard time converting character to date in R.
I have a file where the dates are given as "2014-01", where the first is the year and the second is the week of the year. I want to convert this to a date type.
I have tried the following
z <- as.Date('2014-01', '%Y-%W')
print(z)
Output: "2014-12-05"
This is not what I want. I want to get the same format out, i.e. the output should be "2014-01", but now as a date type.
It sounds like you are dealing with some version of year week, which exists in three forms in lubridate:
week() returns the number of complete seven day periods that have
occurred between the date and January 1st, plus one.
isoweek() returns the week as it would appear in the ISO 8601 system,
which uses a reoccurring leap week.
epiweek() is the US CDC version of epidemiological week. It follows
same rules as isoweek() but starts on Sunday. In other parts of the
world the convention is to start epidemiological weeks on Monday,
which is the same as isoweek.
Lubridate has functions to extract these from a date, but I don't know of a built-in way to go the other direction, from a week to one representative day (out of 7 possible). If you're dealing with the first version, one simple way is to add 7 * (Week - 1) days to January 1 of the year.
library(dplyr)
data.frame(yearweek = c('2014-01', '2014-03')) %>%
tidyr::separate(yearweek, c("Year", "Week"), convert = TRUE) %>%
mutate(Date = as.Date(paste0(Year, "-01-01")) + 7 * (Week-1))
  Year Week       Date
1 2014    1 2014-01-01
2 2014    3 2014-01-15
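The same calculation can be sketched in base R without dplyr or tidyr. This assumes the strings always have the form "YYYY-WW" and that week 1 starts on January 1, as in the week() convention above.

```r
# Base R variant: week 1 starts on January 1, each later week 7 days after.
yearweek <- c("2014-01", "2014-03")
year <- as.integer(substr(yearweek, 1, 4))  # characters 1-4: the year
week <- as.integer(substr(yearweek, 6, 7))  # characters 6-7: the week number
dates <- as.Date(paste0(year, "-01-01")) + 7 * (week - 1)
dates
```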

R not recognizing week number correctly [duplicate]

String contains 'YEAR WEEK' and I want to transform it with parse_date_time() to a date object but I can't make the code work:
parse_date_time(c("201510"), "YW")
I don't have to use lubridate, can be other packages, too.
Before converting year-week to a date you have to specify a day of the week, but more importantly you have to establish which of the different conventions is being used.
Base R's strptime() function knows 3 definitions of week of the year (but supports only 2 of them on input) and 2 definitions of weekday number,
see ?strptime:
Week of the year
US convention: Week of the year as decimal number (00–53) using Sunday as the first day 1 of the week (and typically with the first Sunday of the year as day 1 of week 1): %U
UK convention: Week of the year as decimal number (00–53) using Monday as the first day of week (and typically with the first Monday of the year as day 1 of week 1): %W
ISO 8601 definition: Week of the year as decimal number (01–53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1: %V which is accepted but ignored on input.
Note that there is also a week-based year (%G and %g) which is to be used with %V as it may differ from the calendar year (%Y and %y).
Numeric weekday
Weekday as a decimal number (1–7, Monday is 1): %u
Weekday as decimal number (0–6, Sunday is 0): %w
Interestingly, there is no format for the case Sunday is counted as day 1 of the week.
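To see how the conventions differ on output, here is a small sketch that formats one date with each specifier. It uses 3 January 2016, a Sunday, where the week numbers disagree. Note that ?strptime warns that output of some specifiers may be platform-specific, so treat the results as what a typical recent R installation is expected to show.

```r
# Format a single Sunday with every week and weekday specifier listed above.
d <- as.Date("2016-01-03")
codes <- c(US_week = "%U", UK_week = "%W", ISO_week = "%V",
           ISO_year = "%G", weekday_u = "%u", weekday_w = "%w")
res <- vapply(codes, function(f) format(d, f), character(1))
res
# US week 01, UK week 00, ISO week 53 of week-based year 2015;
# the Sunday is weekday 7 under %u and 0 under %w.
```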
Converting year-week-day with the different conventions
If we append day 1 to the string and use the different formats we do get
as.Date("2015101", "%Y%U%u")
# [1] "2015-03-09"
as.Date("2015101", "%Y%U%w")
# [1] "2015-03-09"
as.Date("2015101", "%Y%W%u")
# [1] "2015-03-09"
as.Date("2015101", "%Y%W%w")
# [1] "2015-03-09"
as.Date("2015101", "%G%V%u")
# [1] NA
For weekday formats %u and %w we do get the same result because day 1 is Monday in both conventions (but watch out when dealing with Sundays).
For 2015, the US and the UK definition for week of the year coincide but this is not true for all years, e.g., not for 2001, 2007, and 2018:
as.Date("2018101", "%Y%U%u")
#[1] "2018-03-12"
as.Date("2018101", "%Y%W%u")
#[1] "2018-03-05"
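The years in which the two conventions disagree for a Monday can be found by parsing the same string both ways and comparing. This is a quick sketch; the range 2000 to 2020 is an arbitrary choice for illustration.

```r
# Parse "Monday of week 10" under the US (%U) and UK (%W) conventions
# and keep the years where the resulting dates differ.
years <- 2000:2020
us <- as.Date(paste0(years, "101"), format = "%Y%U%u")
uk <- as.Date(paste0(years, "101"), format = "%Y%W%u")
differing <- years[us != uk]
differing
# 2001 2007 2018 (the years in this range that start on a Monday)
```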
The ISO 8601 format specifiers aren't supported on input. Therefore, I created the ISOweek package some years ago:
ISOweek::ISOweek2date("2015-W10-1")
#[1] "2015-03-02"
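If installing a package is not an option, the conversion can also be sketched in base R. The helper name iso_week_to_date is hypothetical and is not part of the ISOweek package. It relies on the ISO 8601 rule that 4 January always lies in week 1, and it does not validate the week number.

```r
# Hypothetical base R helper: ISO year, ISO week and weekday (1 = Monday) to Date.
iso_week_to_date <- function(year, week, weekday = 1) {
  jan4 <- as.Date(sprintf("%04d-01-04", year))
  # Step back from 4 January to the Monday of ISO week 1.
  week1_monday <- jan4 - (as.integer(format(jan4, "%u")) - 1)
  week1_monday + 7 * (week - 1) + (weekday - 1)
}

iso_week_to_date(2015, 10, 1)  # same date as ISOweek2date("2015-W10-1")
```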
Edit: Using Thursday to associate a week with a month
As mentioned above you need to specify a day of the week to get a full calendar date. This is also required if the dates need to be aggregated by month later on.
If no weekday is specified and the dates are supposed to be aggregated by month later on, you may take the Thursday of each week as the reference day (following a suggestion by djhurio). This ensures that the whole week is assigned to the month to which the majority of its days belong.
For example, taking Sunday as reference day would return
ISOweek::ISOweek2date("2015-W09-7")
#[1] "2015-03-01"
which consequently would associate the whole week to the month of March although only one day of the week belongs to March while the other 6 days belong to February. Taking Thursday as reference day will return a date in February:
ISOweek::ISOweek2date("2015-W09-4")
#[1] "2015-02-26"
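When you start from full dates rather than week strings, the same Thursday rule can be sketched in base R: shift each date to the Thursday of its Monday-based week and take the month of that Thursday. This assumes %u is available in format() on your platform.

```r
# Monday and Sunday of ISO week 2015-W09.
dates <- as.Date(c("2015-02-23", "2015-03-01"))
# %u is 1 for Monday ... 7 for Sunday, so (u - 4) is the offset from Thursday.
thursday <- dates - (as.integer(format(dates, "%u")) - 4)
month_key <- format(thursday, "%Y-%m")
month_key  # both days are assigned to February 2015
```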
Yes, the ISOweek package does this:
ISOweek::ISOweek2date(isoWeek)
For the other direction, check out the newer lubridate package as well:
ISOweek::date2ISOweek(yourDate)
lubridate::isoweek(lubridate::ymd(yourDate))

Yearweek is parsed wrongly in R

Problem: I am facing the problem that R parses a date (30 December 2019) into yearweek wrongly (Output: 2019 W01). I do not know why this is happening. Any suggestions what to change/alternative way of coding?
format(lubridate::ymd("2019-12-30"), "%Y W%V")
# Output
# 2019 W01
# Desired Output:
# 2019 W52
From the strptime documentation:
%U
Week of the year as decimal number (00–53) using Sunday as the first day 1 of the
week (and typically with the first Sunday of the year as day 1 of week 1). The US
convention.
%V
Week of the year as decimal number (01–53) as defined in ISO 8601. If the week
(starting on Monday) containing 1 January has four or more days in the new year,
then it is considered week 1. Otherwise, it is the last week of the previous year,
and the next week is week 1. (Accepted but ignored on input.)
%W
Week of the year as decimal number (00–53) using Monday as the first day of week
(and typically with the first Monday of the year as day 1 of week 1). The UK
convention.
It sounds like you may want either %U or %W, depending on whether you want to treat Sunday or Monday as the start of the week.
Note however that these can result in values between 00 and 53, which is a consequence of fixing the start of the week to a particular weekday (either Sunday or Monday). Doing that means that there can actually be a partial week at the start and at the end of the year.
If you prefer to count based on week number 1 beginning on the first day of the year, you can use the function lubridate::week.
For example:
library(lubridate)
year_week <- function(date) paste0(year(date), ' W', week(date))
year_week(ymd("2019-01-01"))
# Result: "2019 W1"
year_week(ymd("2019-12-30"))
# Result: "2019 W52"
After some more research I found that this is the best solution:
format(lubridate::ymd("2019-12-30"), "%G W%V")
Use %G instead of %Y to reflect that the week-based year (%G and %g) may differ from the calendar year (%Y and %y).
See also: https://community.rstudio.com/t/converting-week-number-and-year-into-date/27202/2
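A short base R comparison of the two year specifiers for the date in question (a sketch; lubridate is not needed for this). The Monday 30 December 2019 already belongs to ISO week 1 of 2020, so only %G pairs correctly with %V.

```r
d <- as.Date("2019-12-30")
with_calendar_year <- format(d, "%Y W%V")  # calendar year + ISO week: misleading
with_iso_year <- format(d, "%G W%V")       # week-based year + ISO week: consistent
with_calendar_year
with_iso_year
```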

Calculating the number of weeks for each year based on dates using R

I have a dataset with dates of 2 different years (2009 and 2010) and would like to have the corresponding week number for each date.
My dataset is similar to this:
anim <- c(012,023,045,098,067)
dob <- c("01-09-2009","12-09-2009","22-09-2009","10-10-2010","28-10-2010")
mydf <- data.frame(anim,dob)
mydf
  anim        dob
1   12 01-09-2009
2   23 12-09-2009
3   45 22-09-2009
4   98 10-10-2010
5   67 28-10-2010
I would like to have variable "week" in the third column with the corresponding week numbers for each date.
EDIT:
Note: Week one begins on January 1st, week two begins on January 8th for each year
Any help would be highly appreciated.
Baz
Your definition of "week of year"
EDIT: Note: Week one begins on January 1st, week two begins on January 8th for each year
differs from the standard ones supported by strftime:
%U
Week of the year as decimal number (00–53) using Sunday as the first day 1
of the week (and typically with the first Sunday of the year as day 1 of
week 1). The US convention.
%W
Week of the year as decimal number (00–53) using Monday as the first day
of week (and typically with the first Monday of the year as day 1 of week
1). The UK convention.
So you need to compute it based on the day-of-year number.
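# Subtract 1 from the day of year so that days 1-7 fall in week 1 and day 8 starts week 2
mydf$week <- ((as.numeric(strftime(as.POSIXct(mydf$dob,
                                              format = "%d-%m-%Y"),
                                   format = "%j")) - 1) %/% 7) + 1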
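As a quick check of the day-of-year approach, here is a minimal sketch with a hypothetical helper, week_from_jan1, applied to boundary dates. It subtracts 1 from the day of year before the integer division so that 1 to 7 January map to week 1 and 8 January starts week 2, matching the definition in the question.

```r
# Hypothetical helper: week number where week 1 always begins on January 1.
week_from_jan1 <- function(x) {
  yday <- as.integer(format(x, "%j"))  # day of year, 1-366
  (yday - 1) %/% 7 + 1
}

d <- as.Date(c("2009-01-01", "2009-01-07", "2009-01-08", "2009-12-31"))
weeks <- week_from_jan1(d)
weeks  # 1 1 2 53
```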
Post 2011 Answer
library(lubridate)
mydf$week <- week(dmy(mydf$dob))
The lubridate package makes day-to-day tasks like this straightforward.
If you want to know how many weeks (or 7-day periods) have passed between your date of interest and the first day of the year, regardless of the weekday on which the year started, the following is a solution (using floor_date from lubridate).
dob <- dmy(mydf$dob)
mydf$weeks <- difftime(dob, floor_date(dob, "year"), units = "weeks")
