Extract month from a character year and doy in R

Given a year and a day of the year (Julian day), how can I extract the month? For example:
Year <- '2000'
Doy <- '159'
I want to extract the month for the above Year and Doy. My plan was to first convert them into a date and then extract the month from it using format(mydate, "%m"):
# first convert into date and then extract the month
as.Date(paste0(Year, '-', Doy), format = '%Y-%d')
NA
This gives me NA.

%d is for day of month. %j is for day of year where Jan 1 is day of year 1, Jan 2 is day of year 2, ..., Dec 31 is day of year 365 (or 366 on leap years). See ?strptime for the percent codes.
Year <- '2000'
Doy <- '159'
date <- as.Date(paste(Year, Doy), format = "%Y %j"); date
## [1] "2000-06-07"
as.numeric(format(date, "%m")) # month number
## [1] 6
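The same conversion can be wrapped in a small helper that also works on vectors. This is only a sketch: the name month_from_doy is made up for illustration, and it assumes the year and day of year arrive as character values, as in the question.

```r
# Hypothetical helper (the name is not from the original answer):
# returns the month number for a given year and day of year.
month_from_doy <- function(year, doy) {
  # %j parses the day of year (001-366); paste() joins the parts with a space
  date <- as.Date(paste(year, doy), format = "%Y %j")
  as.numeric(format(date, "%m"))
}

month_from_doy("2000", "159")
## [1] 6

# Day 60 is Feb 29 in the leap year 2000 but Mar 1 in 2001
month_from_doy(c("2000", "2001"), c("060", "060"))
## [1] 2 3
```

Because as.Date and format are vectorised, the helper can be applied directly to two data frame columns.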

Related

Get days based on integer week and year in R

I have many strings like "200046". The first four digits are the year, and the last two are the week number within that year. I'm trying to find the 7 days of that week. I tried something like
date = as.Date(str, "%Y%M")
but it returns "2000-01-29" which is not the 46th week of 2000. How can I do that?
Add day of the week to str.
str <- '200046'
as.Date(paste0(str, 1), "%Y%U%u")
#[1] "2000-11-13"
This is the Monday of the 46th week of 2000 (%U numbers the weeks, and %u = 1 selects Monday).
Now, to get the seven days starting from that Monday, you can do:
as.Date(paste0(str, 1), "%Y%U%u") + 0:6
#[1] "2000-11-13" "2000-11-14" "2000-11-15" "2000-11-16" "2000-11-17" "2000-11-18" "2000-11-19"
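For many such strings, the two steps above could be combined into one function. The sketch below assumes every input is a six-character "YYYYWW" string; week_dates is a hypothetical name chosen for illustration.

```r
# Hypothetical helper: the seven dates, starting on Monday, for a "YYYYWW" string.
week_dates <- function(yyyyww) {
  # Append weekday 1 (Monday, %u) so the string identifies a single date
  monday <- as.Date(paste0(yyyyww, 1), format = "%Y%U%u")
  monday + 0:6
}

days <- week_dates("200046")
days[1]
## [1] "2000-11-13"

# Weekday numbers of the result (Monday = 1, ..., Sunday = 7)
format(days, "%u")
## [1] "1" "2" "3" "4" "5" "6" "7"
```

To process a whole vector of strings, something like lapply(strings, week_dates) would return one vector of seven dates per input.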

Adding fiscal month end

I would like to add a fiscal month-end column to a dataset in R (e.g. with mutate). In my company the fiscal month ends on the 21st of each month. For example
12/22/2019 to 1/21/2020 will be Jan-2020
1/22/2020 to 2/21/2020 will be Feb-2020
2/22/2020 to 3/21/2020 will be Mar-2020
etc
Dataset
Desired_output
How would I accomplish this in R? The Date column in my data is in %m/%d/%Y format (e.g. 1/22/2020).
You could extract the day of the month and, if it is 22 or greater, add 10 days to the date, then format the result as month-year:
transform(dat, Fiscal_Month = format(Date +
ifelse(as.integer(format(Date, '%d')) >= 22, 10, 0), '%b %Y'))
# Date Fiscal_Month
#1 2020-01-20 Jan 2020
#2 2020-01-21 Jan 2020
#3 2020-01-22 Feb 2020
#4 2020-01-23 Feb 2020
#5 2020-01-24 Feb 2020
This can also be done without ifelse like this :
transform(dat, Fiscal_Month = format(Date + c(0, 10)
[(as.integer(format(Date, '%d')) >= 22) + 1], '%b %Y'))
data
Used this sample data :
dat <- data.frame(Date = seq(as.Date('2020-01-20'), by = '1 day',length.out = 5))
1) yearmon We perform the following steps:
Create test data d which contains both a date in the start-of-period month (i.e. the 22nd or later) and a date in the end-of-period month (i.e. the 21st or earlier).
Convert the input d to Date class, giving dd.
Subtract 21 days, thereby shifting the date into the month that starts the fiscal period.
Convert that to ym of yearmon class (which represents a year and a month without a day, internally stored as the year plus 0 for Jan, 1/12 for Feb, ..., 11/12 for Dec) and then add 1/12 to get to the month at the end of the fiscal period.
Format it as shown. (We could omit this step, i.e. the last line of code, if the default format that yearmon uses, e.g. Jan 2020, is ok.)
The whole thing could easily be written in a single line of code but we have broken it up for clarity.
library(zoo)
d <- c("1/22/2020", "1/21/2020") # test data
dd <- as.Date(d, "%m/%d/%Y")
ym <- as.yearmon(dd - 21) + 1/12
format(ym, "%b-%y")
## [1] "Feb-20" "Jan-20"
2) Base R This could be done using only base R as follows. We make use of dd from above. cut computes the first of the month that dd - 21 lies in (as a factor rather than a Date class object) and then as.Date converts it to a Date. Adding 31 shifts it into the end-of-period month, and formatting this gives the final answer.
format(as.Date(cut(dd - 21, "month")) + 31, "%b-%y")
## [1] "Feb-20" "Jan-20"
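The shift-by-21-days idea can be checked against the boundary dates listed in the question. The following is a sketch in base R only; fiscal_month is a hypothetical name, and it returns a numeric "YYYY-MM" label (an assumption made here so that the result does not depend on the locale's month names).

```r
# Hypothetical helper: fiscal month label for dates given as m/d/Y strings.
# A fiscal month runs from the 22nd of one month to the 21st of the next.
fiscal_month <- function(x, fmt = "%m/%d/%Y") {
  d <- as.Date(x, format = fmt)
  # Shifting back 21 days maps the 22nd..21st window onto one calendar month
  shifted <- d - 21
  # First day of that calendar month
  start <- as.Date(format(shifted, "%Y-%m-01"))
  # Adding 31 days always lands in the following month
  format(start + 31, "%Y-%m")
}

fiscal_month(c("12/22/2019", "1/21/2020", "1/22/2020", "2/21/2020", "2/22/2020"))
## [1] "2020-01" "2020-01" "2020-02" "2020-02" "2020-03"
```

Swapping the final format string for "%b-%Y" would give labels such as Jan-2020, as requested in the question.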

How to Get the Same Weekday Last Year Given any Given Year?

I would like to get the same day last year given any year. How can I best do this in R? For example, given Sunday 2010/01/03, I would like to obtain the Sunday of the same week the year before.
# "Sunday"
weekdays(as.Date("2010/01/03", format="%Y/%m/%d"))
# "Saturday"
weekdays(as.Date("2009/01/03", format="%Y/%m/%d"))
To find the same weekday one year ago, simply subtract 52 weeks or 364 days from the given date:
d <- as.Date("2010-01-03")
weekdays(d)
#[1] "Sunday"
d - 52L * 7L
#[1] "2009-01-04"
weekdays(d - 52L * 7L)
#[1] "Sunday"
Please note that a calendar year has 365 days (or 366 days in a leap year), which is one or two days more than 52 weeks. So the calendar date of the same weekday one year ago moves forward by one or two days. (This also explains why New Year's Eve falls on a different weekday each year.)
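That subtracting 52 weeks always preserves the weekday can be verified over a range of dates, including leap years. A minimal sketch; same_weekday_last_year is a hypothetical name, and the weekday is compared via the locale-independent %u code instead of weekdays().

```r
# Hypothetical helper: the same weekday roughly one year earlier
same_weekday_last_year <- function(d) as.Date(d) - 52L * 7L

# Six years of daily dates, covering the leap years 2016 and 2020
dates <- seq(as.Date("2015-01-01"), as.Date("2020-12-31"), by = "1 day")
prior <- same_weekday_last_year(dates)

# Weekday numbers (Monday = 1) agree for every date
all(format(dates, "%u") == format(prior, "%u"))
## [1] TRUE
```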
The following formula gives the corresponding weekday in the same week of the previous year (as written it uses only base R functions, so lubridate is not actually required):
as.Date(dDate - 364 - ifelse(weekdays( dDate - 363) == weekdays( dDate ), 1, 0))
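Here dDate is some date, e.g. dDate <- as.Date("2016-02-29"). The ifelse is meant to account for leap years; note, however, that 363 is not a multiple of 7, so its condition is never TRUE and the expression reduces to dDate - 364.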
Here's a simple algorithm. Subtract 365 days from the day of interest, then adjust that day to the closest matching day of the week using the Tableau code below (easily translatable into other languages). This is equivalent to the rule in the table below (with 1 = Monday and 7 = Sunday). Basically, you move day - 365 onto the correct day of the week within the same week if that shifts it by 3 days or fewer; otherwise you use the matching weekday from the previous or next week. Either way, it picks whichever candidate differs from day - 365 by the fewest days.
[day prior year raw] = [day] - 365
[matching day prior year] =
if abs(datepart('weekday',[day]) - datepart('weekday',[day prior year raw]))<= 3
then [day prior year raw]+datepart('weekday',[day]) - datepart('weekday',[day prior year raw])
else [day prior year raw]+(if datepart('weekday',[day]) > datepart('weekday',[day prior year raw])
then -7+(datepart('weekday',[day]) - datepart('weekday',[day prior year raw]))
else 7+(datepart('weekday',[day]) - datepart('weekday',[day prior year raw])) end
)
end
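A possible R translation of the Tableau logic is sketched below. The function name matching_day_prior_year is hypothetical, and weekdays are numbered with %u (1 = Monday, 7 = Sunday), as in the rule described above.

```r
# Hypothetical R translation of the Tableau calculation above.
matching_day_prior_year <- function(day) {
  day <- as.Date(day)
  raw <- day - 365                            # [day prior year raw]
  wd_day <- as.integer(format(day, "%u"))     # weekday of the original date
  wd_raw <- as.integer(format(raw, "%u"))     # weekday of the raw prior-year date
  shift <- wd_day - wd_raw
  # Keep the shift within -3..3 so the nearest matching weekday is chosen
  shift <- ifelse(shift > 3, shift - 7, ifelse(shift < -3, shift + 7, shift))
  raw + shift
}

matching_day_prior_year("2010-01-03")
## [1] "2009-01-04"
matching_day_prior_year("2016-02-29")
## [1] "2015-03-02"
```

Since 365 leaves a remainder of 1 when divided by 7, the adjustment always works out to +1 day, so for these inputs the result coincides with simply subtracting 364 days.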
Look at ?years in package lubridate. This creates a period object which correctly spans a period, across leap years.
> library(lubridate)
> # set the reference date
> d1 = as.Date("2017/01/03", format="%Y/%m/%d")
>
> # verify across years and leap years
> d1 - years(1)
[1] "2016-01-03"
> d1 - years(2)
[1] "2015-01-03"
> d1 - years(3)
[1] "2014-01-03"
> d1 - years(4)
[1] "2013-01-03"
> d1 - years(5)
[1] "2012-01-03"
>
> weekdays(d1 - years(1))
[1] "Sunday"
> weekdays(d1 - years(2))
[1] "Saturday"
>
> # subtracting a one-year period from Feb 29 yields NA
> ymd("2016/02/29") - years(1)
[1] NA
>
> # feb 29 in a non-leap year fails to convert
> ymd("2015/02/29") - years(1)
[1] NA
Warning message:
All formats failed to parse. No formats found.
>
> # feb 29, leap year with 4 year period works.
> ymd("2016/02/29") - years(4)
[1] "2012-02-29"
>
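If the NA for Feb 29 is a problem, lubridate's %m-% operator is one possible alternative: it rolls back to the last valid day of the month instead of returning NA. A short sketch, assuming lubridate is installed; note that this keeps the calendar date, not the weekday.

```r
library(lubridate)

leap_day <- ymd("2016-02-29")

# Plain period arithmetic: 2015-02-29 does not exist, so the result is NA
leap_day - years(1)
## [1] NA

# %m-% rolls back to the last day of the month instead
leap_day %m-% years(1)
## [1] "2015-02-28"
```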

Convert date to day-of-week in R

I have a date in this format in my data frame:
"02-July-2015"
And I need to convert it to the day of the week (i.e. 183). Something like:
df$day_of_week <- weekdays(as.Date(df$date_column))
But this doesn't understand the format of the dates.
You could use lubridate to convert to day of week or day of year.
library(lubridate)
# "02-July-2015" is Thursday
date_string <- "02-July-2015"
dt <- dmy(date_string)
dt
## [1] "2015-07-02 UTC"
### Day of week : (1-7, Sunday is 1)
wday(dt)
## [1] 5
### Day of year (1-366; for 2015, only 365)
yday(dt)
## [1] 183
### Or a little shorter to do the same thing for Day of year
yday(dmy("02-July-2015"))
## [1] 183
day = as.POSIXct("02-July-2015",format="%d-%b-%Y")
# see ?strptime for more information on date-time conversions
# Day of year as decimal number (001–366).
format(day,format="%j")
[1] "183"
#Weekday as a decimal number (1–7, Monday is 1).
format(day,format="%u")
[1] "4"
This is what anotherFishGuy proposed, plus converting the values with as.numeric so they can be fed to a classifier.
# day <- Sys.time()
as.num.format <- function(day, ...){
  as.numeric(format(day, ...))
}
doy <- as.num.format(day, format = "%j")  # day of year
dow <- as.num.format(day, format = "%u")  # day of week (Monday is 1)
hour <- as.num.format(day, "%H")
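For a whole column of such strings, the base R pieces above can be collected into one function. This is a sketch under two assumptions: the dates look like "02-July-2015", and the session locale uses English month names (otherwise %B will not match "July"). The name date_features is hypothetical.

```r
# Hypothetical helper: numeric day of year and day of week for date strings.
date_features <- function(x, fmt = "%d-%B-%Y") {
  d <- as.Date(x, format = fmt)
  data.frame(
    date        = d,
    day_of_year = as.integer(format(d, "%j")),  # 1-366
    day_of_week = as.integer(format(d, "%u"))   # 1-7, Monday is 1
  )
}

feats <- date_features("02-July-2015")
feats$day_of_year
## [1] 183
feats$day_of_week
## [1] 4
```

With a data frame df as in the question, df <- cbind(df, date_features(df$date_column)) would append the new columns.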

How to calculate date based on week number in R

I was wondering if there is a way in R to get the date on which a week begins, based on its week number? For example, if I enter week number = 10, it should give me 9th March, 2014.
I know how to do the reverse (i.e. given a date, get the week number by using as.POSIX functions).
Thanks!
Prakhar
You can try this:
first.day <- as.numeric(format(as.Date("2014-01-01"), "%w"))
week <- 10
as.Date("2014-01-01") + week * 7 - first.day
# [1] "2014-03-09"
This assumes weeks start on Sundays. First, find what day of the week Jan 1 falls on; then add 7 * the number of weeks to Jan 1 and subtract Jan 1's day-of-week number.
Note this is slightly different to what you get if you use %W when doing the reverse, as from that perspective the first day of the week seems to be Monday:
format(seq(as.Date("2014-03-08"), by="1 day", len=5), "%W %A %m-%d")
# [1] "09 Saturday 03-08" "09 Sunday 03-09" "10 Monday 03-10" "10 Tuesday 03-11"
# [5] "10 Wednesday 03-12"
but you can adjust the above code easily if you prefer the Monday centric view.
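One way the Monday-centric adjustment might look is sketched below. It follows the %W convention, in which week 1 starts on the first Monday of the year; monday_of_week is a hypothetical name.

```r
# Hypothetical helper: Monday of a given %W-style week (week 1 starts on the
# first Monday of the year).
monday_of_week <- function(year, week) {
  jan1 <- as.Date(paste0(year, "-01-01"))
  u <- as.integer(format(jan1, "%u"))      # weekday of Jan 1, Monday = 1
  first_monday <- jan1 + (8L - u) %% 7L    # 0 days ahead if Jan 1 is a Monday
  first_monday + (week - 1) * 7
}

monday_of_week(2014, 10)
## [1] "2014-03-10"

# Round trip: the result falls in %W week 10
format(monday_of_week(2014, 10), "%W")
## [1] "10"
```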
You may try the ISOweek2date function in package ISOweek.
Create a function which takes year, week, weekday as arguments and returns date(s):
library(ISOweek)
date_in_week <- function(year, week, weekday){
  w <- paste0(year, "-W", sprintf("%02d", week), "-", weekday)
  ISOweek2date(w)
}
date_in_week(year = 2014, week = 10, weekday = 1)
# [1] "2014-03-03"
This date corresponds to the ISO 8601 calendar (see %V in ?strptime). I assume you are using the US convention (see %U in ?strptime). Then some tweaking is needed to convert between ISO 8601 and the US standard:
date_in_week(year = 2014, week = 10 + 1, weekday = 1) - 1
# [1] "2014-03-09"
You can enter several weekdays, e.g.
date_in_week(year = 2014, week = 10 + 1, weekday = 1:3) - 1
# [1] "2014-03-09" "2014-03-10" "2014-03-11"
You can also use strptime to easily get dates from weeks starting on Mondays:
first_date_of_week <- function(year, week){
strptime(paste(year, week, 1), format = "%Y %W %u")
}
You can accomplish this using the package lubridate
library(lubridate)
start = ymd("2014-01-01")
#[1] "2014-01-01 UTC"
end = start+10*weeks()
end = end-(wday(end)-1)*days()
#[1] "2014-03-09 UTC"
