Second to last Wednesday of month in R - r

In R, how can I produce a list of dates of all 2nd to last Wednesdays of the month in a specified date range? I've tried a few things but have gotten inconsistent results for months with five Wednesdays.

To generate a regular sequence of dates you can use seq with dates for parameter from and to. See the seq.Date documentation for more options.
Create a data frame with the date, the month and weekday. And then obtain the second to last wednesday for each month with the help of aggregate.
day_sequence = seq(as.Date("2020/1/1"), as.Date("2020/12/31"), "day")
df = data.frame(day = day_sequence,
month = months(day_sequence),
weekday = weekdays(day_sequence))
#Filter only wednesdays
df = df[df$weekday == "Wednesday",]
result = aggregate(day ~ month, df, function(x){head(tail(x,2),1)})
tail(x,2) will return the last two rows, then head(.., 1) will give you the first of these last two.
Result:
month day
1 April 2020-04-22
2 August 2020-08-19
3 December 2020-12-23
4 February 2020-02-19
5 January 2020-01-22
6 July 2020-07-22
7 June 2020-06-17
8 March 2020-03-18
9 May 2020-05-20
10 November 2020-11-18
11 October 2020-10-21
12 September 2020-09-23

There are probably simpler ways of doing this but the function below does what the question asks for. it returns a named vector of days such that
They are between from and to.
Are weekday day, where 1 is Monday.
Are n to last of the month.
By n to last I mean the nth counting from the end of the month.
whichWeekday <- function(from, to, day, n, format = "%Y-%m-%d"){
from <- as.Date(from, format = format)
to <- as.Date(to, format = format)
day <- as.character(day)
d <- seq(from, to, by = "days")
m <- format(d, "%Y-%m")
f <- c(TRUE, m[-1] != m[-length(m)])
f <- cumsum(f)
wed <- tapply(d, f, function(x){
i <- which(format(x, "%u") == day)
x[ tail(i, n)[1] ]
})
y <- as.Date(wed, origin = "1970-01-01")
setNames(y, format(y, "%Y-%m"))
}
whichWeekday("2019-01-01", "2020-03-31", 4, 2)
# 2019-01 2019-02 2019-03 2019-04 2019-05
#"2019-01-23" "2019-02-20" "2019-03-20" "2019-04-17" "2019-05-22"
# 2019-06 2019-07 2019-08 2019-09 2019-10
#"2019-06-19" "2019-07-24" "2019-08-21" "2019-09-18" "2019-10-23"
# 2019-11 2019-12 2020-01 2020-02 2020-03
#"2019-11-20" "2019-12-18" "2020-01-22" "2020-02-19" "2020-03-18"

Related

How to convert week numbers into date format using R

I am trying to convert a column in my dataset that contains week numbers into weekly Dates. I was trying to use the lubridate package but could not find a solution. The dataset looks like the one below:
df <- tibble(week = c("202009", "202010", "202011","202012", "202013", "202014"),
Revenue = c(4543, 6764, 2324, 5674, 2232, 2323))
So I would like to create a Date column with in a weekly format e.g. (2020-03-07, 2020-03-14).
Would anyone know how to convert these week numbers into weekly dates?
Maybe there is a more automated way, but try something like this. I think this gets the right days, I looked at a 2020 calendar and counted. But if something is off, its a matter of playing with the (week - 1) * 7 - 1 component to return what you want.
This just grabs the first day of the year, adds x weeks worth of days, and then uses ceiling_date() to find the next Sunday.
library(dplyr)
library(tidyr)
library(lubridate)
df %>%
separate(week, c("year", "week"), sep = 4, convert = TRUE) %>%
mutate(date = ceiling_date(ymd(paste(year, "01", "01", sep = "-")) +
(week - 1) * 7 - 1, "week", week_start = 7))
# # A tibble: 6 x 4
# year week Revenue date
# <int> <int> <dbl> <date>
# 1 2020 9 4543 2020-03-01
# 2 2020 10 6764 2020-03-08
# 3 2020 11 2324 2020-03-15
# 4 2020 12 5674 2020-03-22
# 5 2020 13 2232 2020-03-29
# 6 2020 14 2323 2020-04-05

Filter data by last 12 Months of the total data available in R

R:
I have a data-set with N Products sales value from some yyyy-mm-dd to some yyyy-mm-dd, I just want to filter the data for the last 12 months for each product in the data-set.
Eg:
Say, I have values from 2016-01-01 to 2020-02-01
So now I want to filter the sales values for the last 12 months that is from 2019-02-01 to 2020-02-01
I just cannot simply mention a "filter(Month >= as.Date("2019-04-01") & Month <= as.Date("2020-04-01"))" because the end date keeps changing for my case as every months passes by so I need to automate the case.
You can use :
library(dplyr)
library(lubridate)
data %>%
group_by(Product) %>%
filter(between(date, max(date) - years(1), max(date)))
#filter(date >= (max(date) - years(1)) & date <= max(date))
You can test whether the date is bigger equal the maximal date per product minus 365 days:
library(dplyr)
df %>%
group_by(Products) %>%
filter(Date >= max(Date)-365)
# A tibble: 6 x 2
# Groups: Products [3]
Products Date
<dbl> <date>
1 1 2002-01-21
2 1 2002-02-10
3 2 2002-02-24
4 2 2002-02-10
5 2 2001-07-01
6 3 2005-03-10
Data
df <- data.frame(
Products = c(1,1,1,1,2,2,2,3,3,3),
Date = as.Date(c("2000-02-01", "2002-01-21", "2002-02-10",
"2000-06-01", "2002-02-24", "2002-02-10",
"2001-07-01", "2003-01-02", "2005-03-10",
"2002-05-01")))
If your aim is to just capture entries from today back to the same day last year, then:
The function Sys.Date() returns the current date as an object of type Date. You can then convert that to POSIXlc form to adjust the year to get the start date. For example:
end.date <- Sys.Date()
end.date.lt <- asPOSIXlt(end.date)
start.date.lt <- end.date.lt
start.date.lt$year <- start.date.lt$year - 1
start.date <- asPOSIXct(start.date.lt)
Now this does have one potential fail-state: if today is February 29th. One way to deal with that would be to write a "today.last.year" function to do the above conversion, but give an explicit treatment for leap years - possibly including an option to count "today last year" as either February 28th or March 1st, depending on which gives you the desired behaviour.
Alternatively, if you wanted to filter based on a start-of-month date, you can make your function also set start.date.lt$day = 1, and so forth if you need to adjust in different ways.
Input:
product date
1: a 2017-01-01
2: b 2017-04-01
3: a 2017-07-01
4: b 2017-10-01
5: a 2018-01-01
6: b 2018-04-01
7: a 2018-07-01
8: b 2018-10-01
9: a 2019-01-01
10: b 2019-04-01
11: a 2019-07-01
12: b 2019-10-01
Code:
library(lubridate)
library(data.table)
DT <- data.table(
product = rep(c("a", "b"), 6),
date = seq(as.Date("2017-01-01"), as.Date("2019-12-31"), by = "quarter")
)
yearBefore <- function(x){
year(x) <- year(x) - 1
x
}
date_DT <- DT[, .(last_date = last(date)), by = product]
date_DT[, year_before := yearBefore(last_date)]
result <- DT[, date_DT[DT, on = .(product, year_before <= date), nomatch=0]]
result[, last_date := NULL]
setnames(result, "year_before", "date")
Output:
product date
1: a 2018-07-01
2: b 2018-10-01
3: a 2019-01-01
4: b 2019-04-01
5: a 2019-07-01
6: b 2019-10-01
Is this what you are looking for?

How to repeate the value of the last day of February for a leap year in R?

I have a data.frame that doesn't account for leap year (ie all years are 365 days). I would like to repeat the last day value in February during the leap year. The DF in my code below has fake data set, I intentionally remove the leap day value in DF_NoLeapday. I would like to add a leap day value in DF_NoLeapday by repeating the value of the last day of February in a leap year (in our example it would Feb 28, 2004 value). I would rather like to have a general solution to apply this to many years data.
set.seed(55)
DF <- data.frame(date = seq(as.Date("2003-01-01"), to= as.Date("2005-12-31"), by="day"),
A = runif(1096, 0,10),
Z = runif(1096,5,15))
DF_NoLeapday <- DF[!(format(DF$date,"%m") == "02" & format(DF$date, "%d") == "29"), ,drop = FALSE]
We can use complete on the 'date' column which is already a Date class to expand the rows to fill in the missing dates
library(dplyr)
library(tidyr)
out <- DF_NoLeapday %>%
complete(date = seq(min(date), max(date), by = '1 day'))
dim(out)
#[1] 1096 3
out %>%
filter(date >= '2004-02-28', date <= '2004-03-01')
# A tibble: 3 x 3
# date A Z
# <date> <dbl> <dbl>
#1 2004-02-28 9.06 9.70
#2 2004-02-29 NA NA
#3 2004-03-01 5.30 7.35
By default, the other columns values are filled with NA, if we need to change it to a different value, it can be done within complete with fill
If we need the previous values, then use fill
out <- out %>%
fill(A, Z)
out %>%
filter(date >= '2004-02-28', date <= '2004-03-01')
# A tibble: 3 x 3
# date A Z
# <date> <dbl> <dbl>
#1 2004-02-28 9.06 9.70
#2 2004-02-29 9.06 9.70
#3 2004-03-01 5.30 7.35

R: assign months to day of the year

Here's my data which has 10 years in one column and 365 day of another year in second column
dat <- data.frame(year = rep(1980:1989, each = 365), doy= rep(1:365, times = 10))
I am assuming all years are non-leap years i.e. they have 365 days.
I want to create another column month which is basically month of the year the day belongs to.
library(dplyr)
dat %>%
mutate(month = as.integer(ceiling(day/31)))
However, this solution is wrong since it assigns wrong months to days. I am looking for a dplyr
solution possibly.
We can convert it to to datetime class by using the appropriate format (i.e. %Y %j) and then extract the month with format
dat$month <- with(dat, format(strptime(paste(year, doy), format = "%Y %j"), '%m'))
Or use $mon to extract the month and add 1
dat$month <- with(dat, strptime(paste(year, doy), format = "%Y %j")$mon + 1)
tail(dat$month)
#[1] 12 12 12 12 12 12
This should give you an integer value for the months:
dat$month.num <- month(as.Date(paste(dat$year, dat$doy), '%Y %j'))
If you want the month names:
dat$month.names <- month.name[month(as.Date(paste(dat$year, dat$doy), '%Y %j'))]
The result (only showing a few rows):
> dat[29:33,]
year doy month.num month.names
29 1980 29 1 January
30 1980 30 1 January
31 1980 31 1 January
32 1980 32 2 February
33 1980 33 2 February

Convert character month name to date time object

I must be missing something simple.
I have a data.frame of various date formats and I'm using lubridate which works great with everything except month names by themselves. I can't get the month names to convert to date time objects.
> head(dates)
From To
1 June August
2 January December
3 05/01/2013 10/30/2013
4 July November
5 06/17/2013 10/14/2013
6 05/04/2013 11/23/2013
Trying to change June into date time object:
> as_date(dates[1,1])
Error in charToDate(x) :
character string is not in a standard unambiguous format
> as_date("June")
Error in charToDate(x) :
character string is not in a standard unambiguous format
The actual year and day do not matter. I only need the month. zx8754 suggested using dummy day and year.
lubridate can handle converting the name or abbreviation of a month to its number when it's paired with the rest of the information needed to make a proper date, i.e. a day and year. For example:
lubridate::mdy("August/01/2013", "08/01/2013", "Aug/01/2013")
#> [1] "2013-08-01" "2013-08-01" "2013-08-01"
You can utilize that to write a function that appends "/01/2013" to any month names (I threw in abbreviations as well to be safe). Then apply that to all your date columns (dplyr::mutate_all is just one way to do that).
name_to_date <- function(x) {
lubridate::mdy(ifelse(x %in% c(month.name, month.abb), paste0(x, "/01/2013"), x))
}
dplyr::mutate_all(dates, name_to_date)
#> From To
#> 1 2013-06-01 2013-08-01
#> 2 2013-01-01 2013-12-01
#> 3 2013-05-01 2013-10-30
#> 4 2013-07-01 2013-11-01
#> 5 2013-06-17 2013-10-14
#> 6 2013-05-04 2013-11-23
The following is a crude example of how you could achieve that.
Given that dummy values are fine:
match(dates[1, 1], month.abb)
The above would return you, given that we had Dec in dates[1. 1]:
12
To generate the returned value above along with dummy number in a date format, I tried:
tmp = paste(match(dates[1, 1], month.abb), "2013", sep="/")
which gives us:
12/2013
and then lastly:
result = paste("01", tmp, sep="/")
which returns:
01/12/2013
I am sure there are more flexible approaches than this; but this is just an idea, which I just tried.
Using a custom function:
# dummy data
df1 <- read.table(text = "
From To
1 June August
2 January December
3 05/01/2013 10/30/2013
4 July November
5 06/17/2013 10/14/2013
6 05/04/2013 11/23/2013", header = TRUE, as.is = TRUE)
# custom function
myFun <- function(x, dummyDay = "01", dummyYear = "2013"){
require(lubridate)
x <- ifelse(substr(x, 1, 3) %in% month.abb,
paste(match(substr(x, 1, 3), month.abb),
dummyDay,
dummyYear, sep = "/"), x)
#return date
mdy(x)
}
res <- data.frame(lapply(df1, myFun))
res
# From To
# 1 2013-06-01 2013-08-01
# 2 2013-01-01 2013-12-01
# 3 2013-05-01 2013-10-30
# 4 2013-07-01 2013-11-01
# 5 2013-06-17 2013-10-14
# 6 2013-05-04 2013-11-23

Resources