convert year week string to date - r

I have a column of strings in my data set formatted as year week (e.g. '201401' is equivalent to 7th April 2014, or the first fiscal week of the year)
I am trying to convert these to a proper date so I can manipulate them later, however I always receive the dame date for a given year, specifically the 14th of April.
e.g.
test_set <- c('201401', '201402', '201403')
as.Date(test_set, '%Y%U')
gives me:
[1] "2014-04-14" "2014-04-14" "2014-04-14"

Try something like this:
> test_set <- c('201401', '201402', '201403')
>
> extractDate <- function(dateString, fiscalStart = as.Date("2014-04-01")) {
+ week <- substr(dateString, 5, 6)
+ currentDate <- fiscalStart + 7 * as.numeric(week) - 1
+ currentDate
+ }
>
> extractDate(test_set)
[1] "2014-04-07" "2014-04-14" "2014-04-21"
Basically, I'm extracting the weeks from the start of the year, converting it to days and then adding that number of days to the start of the fiscal year (less 1 day to make things line up).

Not 100% sure what is your desired output but this may work
as.Date(paste0(substr(test_set, 1, 4), "-04-07")) +
(as.numeric(substr(test_set, 5, 6)) - 1) * 7
# [1] "2014-04-07" "2014-04-14" "2014-04-21"

Related

change the weekend date to weekday date in R

I am reading in an excel file in R and calculating the date 6 months prior to the date. If the date is falls on Weekend, need to change the date to the following weekday.
for example: if date is 2020-2-7, the six months prior is 2019-08-11. Which is Sunday.
How do I change the date to 2019-08-12?
I tried the following code:
date <- as.date.character("2020-2-7")
nxtd <- date-180
if(weekdays(nxtd)=="Saturday"){nxtd <- date-182} else if(weekdays(nxtd)=="Sunday"){nxtd <- date-181}
else{nxtd <- date-180}
this code gives an error/warning " the condition has length > 1 and only the first element will be used"
How do I resolve it?
library(lubridate)
d1 = as.Date("2020-2-9")
d2 = d1 - 180
if(weekdays(d2) %in% c("Saturday", "Sunday")){
floor_date(d2 - 3, "week") + 8
} else {
d2
}

Format date strings comprising weeks and quarters as Date objects

I have dates in an R dataframe column formatted as character strings as WK01Q32014.
I want to turn each date into a Date() object.
So I altered the format to make it look like 01-3-2014. I want to try to do something like as.Date("01-3-2014","%W-%Q-%Y") for example, but there is no format code for quarters that I know of.
Is there any way to do this using the lubridate, zoo, or any other libraries?
I dont know of any specific function, but here's a basic one:
convert_WQ_to_Date <- function(D) {
weeks <- as.integer(substr(D, 3, 4))
quarter <- as.integer(substr(D, 6, 6))
year <- substr(D, 7, 10)
days <- 7 * ((quarter - 1) * 13 + (weeks-1))
as.Date(sprintf("%s-01-01", year)) + days
}
Example
D <- c("WK01Q32014", "WK01Q12014", "WK05Q42014", "WK01Q22014", "WK02Q32014")
convert_WQ_to_Date(D)
[1] "2014-07-02" "2014-01-01" "2014-10-29" "2014-04-02" "2014-07-09"
The week, quarter and year does not uniquely define a date so we will have to add some assumption. Here we add the assumption that the first week is the first day of the quarter, the second week is 7 days later and so on,
Below, we extract the qtr-year part and use as.yearqtr in the zoo package to convert that to a yearqtr object and then use as.Date to convert that to a date which is the first of the quarter. We then extract the week, subtract 1 and multiply by 7 to get the days offset. Adding the first of the quarter to the offset gives the result:
library(zoo)
xx <- "01-3-2014" # week-quarter-year
qtr.start <- as.Date(as.yearqtr(sub("...", "", xx), "%q-%Y"))
days <- 7 * (as.numeric(sub("-.*", "", xx)) - 1)
qtr.start + days
## [1] "2014-07-01"
Assuming the traditional notion of each quarter starting respectively at the 1st January, 1st April, 1st July and 1st September (in line with the quarters function), just start at these dates and add 7 days for each week:
x <- c("01-3-2014","01-1-2014","05-4-2014","01-2-2014","02-3-2014")
y <- as.numeric(substr(x,6,9))
m <- as.numeric(substr(x,4,4))
d <- as.numeric(substr(x,1,2))
as.Date(paste(y,(m-1)*3+1,"01",sep="-")) + (7*(d-1))
#[1] "2014-07-01" "2014-01-01" "2014-10-29" "2014-04-01" "2014-07-08"

Count the number of Fridays or Mondays in Month in R

I would like a function that counts the number of specific days per month..
i.e.. Nov '13 -> 5 fridays.. while Dec'13 would return 4 Fridays..
Is there an elegant function that would return this?
library(lubridate)
num_days <- function(date){
x <- as.Date(date)
start = floor_date(x, "month")
count = days_in_month(x)
d = wday(start)
sol = ifelse(d > 4, 5, 4) #estimate that is the first day of the month is after Thu or Fri then the week will have 5 Fridays
sol
}
num_days("2013-08-01")
num_days(today())
What would be a better way to do this?
1) Here d is the input, a Date class object, e.g. d <- Sys.Date(). The result gives the number of Fridays in the year/month that contains d. Replace 5 with 1 to get the number of Mondays:
first <- as.Date(cut(d, "month"))
last <- as.Date(cut(first + 31, "month")) - 1
sum(format(seq(first, last, "day"), "%w") == 5)
2) Alternately replace the last line with the following line. Here, the first term is the number of Fridays from the Epoch to the next Friday on or after the first of the next month and the second term is the number of Fridays from the Epoch to the next Friday on or after the first of d's month. Again, we replace all 5's with 1's to get the count of Mondays.
ceiling(as.numeric(last + 1 - 5 + 4) / 7) - ceiling(as.numeric(first - 5 + 4) / 7)
The second solution is slightly longer (although it has the same number of lines) but it has the advantage of being vectorized, i.e. d could be a vector of dates.
UPDATE: Added second solution.
There are a number of ways to do it. Here is one:
countFridays <- function(y, m) {
fr <- as.Date(paste(y, m, "01", sep="-"))
to <- fr + 31
dt <- seq(fr, to, by="1 day")
df <- data.frame(date=dt, mon=as.POSIXlt(dt)$mon, wday=as.POSIXlt(dt)$wday)
df <- subset(df, df$wday==5 & df$mon==df[1,"mon"])
return(nrow(df))
}
It creates the first of the months, and a day in the next months.
It then creates a data frame of month index (on a 0 to 11 range, but we only use this for comparison) and weekday.
We then subset to a) be in the same month and b) on a Friday. That is your result set, and
we return the number of rows as your anwser.
Note that this only uses base R code.
Without using lubridate -
#arguments to pass to function:
whichweekday <- 5
whichmonth <- 11
whichyear <- 2013
#function code:
firstday <- as.Date(paste('01',whichmonth,whichyear,sep="-"),'%d-%m-%Y')
lastday <- if(whichmonth == 12) { '31-12-2013' } else {seq(as.Date(firstday,'%d-%m-%Y'), length=2, by="1 month")[2]-1}
sum(
strftime(
seq.Date(
from = firstday,
to = lastday,
by = "day"),
'%w'
) == whichweekday)

Add a month to a Date [duplicate]

This question already has answers here:
How to subtract months from a date in R?
(6 answers)
Closed 4 years ago.
I am trying to add a month to a date i have. But then its not possible in a straight manner so far. Following is what i tried.
d <- as.Date("2004-01-31")
d + 60
# [1] "2004-03-31"
Adding wont help as the month wont be overlapped.
seq(as.Date("2004-01-31"), by = "month", length = 2)
# [1] "2004-01-31" "2004-03-02"
Above might work , but again its not straight forward.
Also its also adding 30 days or something to the date which has issues like the below
seq(as.Date("2004-01-31"), by = "month", length = 10)
# [1] "2004-01-31" "2004-03-02" "2004-03-31" "2004-05-01" "2004-05-31" "2004-07-01" "2004-07-31" "2004-08-31" "2004-10-01" "2004-10-31"
In the above , for the first 2 dates , month haven’t changed.
Also the following approach also failed for month but was success for year
d <- as.POSIXlt(as.Date("2010-01-01"))
d$year <- d$year +1
d
# [1] "2011-01-01 UTC"
d <- as.POSIXlt(as.Date("2010-01-01"))
d$month <- d$month +1
d
Error in format.POSIXlt(x, usetz = TRUE) : invalid 'x' argument
What is the right method to do this ?
Function %m+% from lubridate adds one month without exceeding last day of the new month.
library(lubridate)
(d <- ymd("2012-01-31"))
1 parsed with %Y-%m-%d
[1] "2012-01-31 UTC"
d %m+% months(1)
[1] "2012-02-29 UTC"
It is ambiguous when you say "add a month to a date".
Do you mean
add 30 days?
increase the month part of the date by 1?
In both cases a whole package for a simple addition seems a bit exaggerated.
For the first point, of course, the simple + operator will do:
d=as.Date('2010-01-01')
d + 30
#[1] "2010-01-31"
As for the second I would just create a one line function as simple as that (and with a more general scope):
add.months= function(date,n) seq(date, by = paste (n, "months"), length = 2)[2]
You can use it with arbitrary months, including negative:
add.months(d, 3)
#[1] "2010-04-01"
add.months(d, -3)
#[1] "2009-10-01"
Of course, if you want to add only and often a single month:
add.month=function(date) add.months(date,1)
add.month(d)
#[1] "2010-02-01"
If you add one month to 31 of January, since 31th February is meaningless, the best to get the job done is to add the missing 3 days to the following month, March. So correctly:
add.month(as.Date("2010-01-31"))
#[1] "2010-03-03"
In case, for some very special reason, you need to put a ceiling to the last available day of the month, it's a bit longer:
add.months.ceil=function (date, n){
#no ceiling
nC=add.months(date, n)
#ceiling
day(date)=01
C=add.months(date, n+1)-1
#use ceiling in case of overlapping
if(nC>C) return(C)
return(nC)
}
As usual you could add a single month version:
add.month.ceil=function(date) add.months.ceil(date,1)
So:
d=as.Date('2010-01-31')
add.month.ceil(d)
#[1] "2010-02-28"
d=as.Date('2010-01-21')
add.month.ceil(d)
#[1] "2010-02-21"
And with decrements:
d=as.Date('2010-03-31')
add.months.ceil(d, -1)
#[1] "2010-02-28"
d=as.Date('2010-03-21')
add.months.ceil(d, -1)
#[1] "2010-02-21"
Besides you didn't tell if you were interested to a scalar or vector solution. As for the latter:
add.months.v= function(date,n) as.Date(sapply(date, add.months, n), origin="1970-01-01")
Note: *apply family destroys the class data, that's why it has to be rebuilt.
The vector version brings:
d=c(as.Date('2010/01/01'), as.Date('2010/01/31'))
add.months.v(d,1)
[1] "2010-02-01" "2010-03-03"
Hope you liked it))
Vanilla R has a naive difftime class, but the Lubridate CRAN package lets you do what you ask:
require(lubridate)
d <- ymd(as.Date('2004-01-01')) %m+% months(1)
d
[1] "2004-02-01"
Hope that helps.
The simplest way is to convert Date to POSIXlt format.
Then perform the arithmetic operation as follows:
date_1m_fwd <- as.POSIXlt("2010-01-01")
date_1m_fwd$mon <- date_1m_fwd$mon +1
Moreover, incase you want to deal with Date columns in data.table, unfortunately, POSIXlt format is not supported.
Still you can perform the add month using basic R codes as follows:
library(data.table)
dt <- as.data.table(seq(as.Date("2010-01-01"), length.out=5, by="month"))
dt[,shifted_month:=tail(seq(V1[1], length.out=length(V1)+3, by="month"),length(V1))]
Hope it helps.
"mondate" is somewhat similar to "Date" except that adding n adds n months rather than n days:
> library(mondate)
> d <- as.Date("2004-01-31")
> as.mondate(d) + 1
mondate: timeunits="months"
[1] 2004-02-29
Here's a function that doesn't require any packages to be installed. You give it a Date object (or a character that it can convert into a Date), and it adds n months to that date without changing the day of the month (unless the month you land on doesn't have enough days in it, in which case it defaults to the last day of the returned month). Just in case it doesn't make sense reading it, there are some examples below.
Function definition
addMonth <- function(date, n = 1){
if (n == 0){return(date)}
if (n %% 1 != 0){stop("Input Error: argument 'n' must be an integer.")}
# Check to make sure we have a standard Date format
if (class(date) == "character"){date = as.Date(date)}
# Turn the year, month, and day into numbers so we can play with them
y = as.numeric(substr(as.character(date),1,4))
m = as.numeric(substr(as.character(date),6,7))
d = as.numeric(substr(as.character(date),9,10))
# Run through the computation
i = 0
# Adding months
if (n > 0){
while (i < n){
m = m + 1
if (m == 13){
m = 1
y = y + 1
}
i = i + 1
}
}
# Subtracting months
else if (n < 0){
while (i > n){
m = m - 1
if (m == 0){
m = 12
y = y - 1
}
i = i - 1
}
}
# If past 28th day in base month, make adjustments for February
if (d > 28 & m == 2){
# If it's a leap year, return the 29th day
if ((y %% 4 == 0 & y %% 100 != 0) | y %% 400 == 0){d = 29}
# Otherwise, return the 28th day
else{d = 28}
}
# If 31st day in base month but only 30 days in end month, return 30th day
else if (d == 31){if (m %in% c(1, 3, 5, 7, 8, 10, 12) == FALSE){d = 30}}
# Turn year, month, and day into strings and put them together to make a Date
y = as.character(y)
# If month is single digit, add a leading 0, otherwise leave it alone
if (m < 10){m = paste('0', as.character(m), sep = '')}
else{m = as.character(m)}
# If day is single digit, add a leading 0, otherwise leave it alone
if (d < 10){d = paste('0', as.character(d), sep = '')}
else{d = as.character(d)}
# Put them together and convert return the result as a Date
return(as.Date(paste(y,'-',m,'-',d, sep = '')))
}
Some examples
Adding months
> addMonth('2014-01-31', n = 1)
[1] "2014-02-28" # February, non-leap year
> addMonth('2014-01-31', n = 5)
[1] "2014-06-30" # June only has 30 days, so day of month dropped to 30
> addMonth('2014-01-31', n = 24)
[1] "2016-01-31" # Increments years when n is a multiple of 12
> addMonth('2014-01-31', n = 25)
[1] "2016-02-29" # February, leap year
Subtracting months
> addMonth('2014-01-31', n = -1)
[1] "2013-12-31"
> addMonth('2014-01-31', n = -7)
[1] "2013-06-30"
> addMonth('2014-01-31', n = -12)
[1] "2013-01-31"
> addMonth('2014-01-31', n = -23)
[1] "2012-02-29"
addedMonth <- seq(as.Date('2004-01-01'), length=2, by='1 month')[2]
addedQuarter <- seq(as.Date('2004-01-01'), length=2, by='1 quarter')[2]
I turned antonio's thoughts into a specific function:
library(DescTools)
> AddMonths(as.Date('2004-01-01'), 1)
[1] "2004-02-01"
> AddMonths(as.Date('2004-01-31'), 1)
[1] "2004-02-29"
> AddMonths(as.Date('2004-03-30'), -1)
[1] "2004-02-29"

Extract Date in R

I struggle mightily with dates in R and could do this pretty easily in SPSS, but I would love to stay within R for my project.
I have a date column in my data frame and want to remove the year completely in order to leave the month and day. Here is a peak at my original data.
> head(ds$date)
[1] "2003-10-09" "2003-10-11" "2003-10-13" "2003-10-15" "2003-10-18" "2003-10-20"
> class((ds$date))
[1] "Date"
I "want" it to be.
> head(ds$date)
[1] "10-09" "10-11" "10-13" "10-15" "10-18" "10-20"
> class((ds$date))
[1] "Date"
If possible, I would love to set the first date to be October 1st instead of January 1st.
Any help you can provide will be greatly appreciated.
EDIT: I felt like I should add some context. I want to plot an NHL player's performance over the course of a season which starts in October and ends in April. To add to this, I would like to facet the plots by each season which is a separate column in my data frame. Because I want to compare cumulative performance over the course of the season, I believe that I need to remove the year portion, but maybe I don't; as I indicated, I struggle with dates in R. What I am looking to accomplish is a plot that compares cumulative performance over relative dates by season and have the x-axis start in October and end in April.
> d = as.Date("2003-10-09", format="%Y-%m-%d")
> format(d, "%m-%d")
[1] "10-09"
Is this what you are looking for?
library(ggplot2)
## make up data for two seasons a and b
a = as.Date("2010/10/1")
b = as.Date("2011/10/1")
a.date <- seq(a, by='1 week', length=28)
b.date <- seq(b, by='1 week', length=28)
## make up some score data
a.score <- abs(trunc(rnorm(28, mean = 10, sd = 5)))
b.score <- abs(trunc(rnorm(28, mean = 10, sd = 5)))
## create a data frame
df <- data.frame(a.date, b.date, a.score, b.score)
df
## Since I am using ggplot I better create a "long formated" data frame
df.molt <- melt(df, measure.vars = c("a.score", "b.score"))
levels(df.molt$variable) <- c("First season", "Second season")
df.molt
Then, I am using ggplot2 for plotting the data:
## plot it
ggplot(aes(y = value, x = a.date), data = df.molt) + geom_point() +
geom_line() + facet_wrap(~variable, ncol = 1) +
scale_x_date("Date", format = "%m-%d")
If you want to modify the x-axis (e.g., display format), then you'll probably be interested in scale_date.
You have to remember Date is a numeric format, representing the number of days passed since the "origin" of the internal date counting :
> str(Date)
Class 'Date' num [1:10] 14245 14360 14475 14590 14705 ...
This is the same as in EXCEL, if you want a reference. Hence the solution with format as perfectly valid.
Now if you want to set the first date of a year as October 1st, you can construct some year index like this :
redefine.year <- function(x,start="10-1"){
year <- as.numeric(strftime(x,"%Y"))
yearstart <- as.Date(paste(year,start,sep="-"))
year + (x >= yearstart) - min(year) + 1
}
Testing code :
Start <- as.Date("2009-1-1")
Stop <- as.Date("2011-11-1")
Date <- seq(Start,Stop,length.out=10)
data.frame( Date=as.character(Date),
year=redefine.year(Date))
gives
Date year
1 2009-01-01 1
2 2009-04-25 1
3 2009-08-18 1
4 2009-12-11 2
5 2010-04-05 2
6 2010-07-29 2
7 2010-11-21 3
8 2011-03-16 3
9 2011-07-09 3
10 2011-11-01 4

Resources