I'd like to get the week number of the year in R, with weeks that start on Friday.
For example:
2016-01-01 to 2016-01-07 is the 1st week of the year
2016-01-08 to 2016-01-14 is the 2nd week of the year
Can you help?
One package you might find useful is lubridate. It really helps with tasks involving dates and times.
For example:
> library(lubridate)
> week(as.POSIXct("2016-01-01"))
[1] 1
Or, if you are curious about what week number Halloween falls on:
> week(as.Date("2016-10-31"))
[1] 44
For more info: https://cran.r-project.org/web/packages/lubridate/lubridate.pdf
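Note that lubridate's week() counts complete seven-day periods since 1 January, plus one, so it matches the Friday-start numbering in the question only because 1 January 2016 happens to be a Friday. A minimal base-R sketch of the same day-of-year count (the helper name week_of_year is made up for illustration):

```r
# Day-of-year week count in base R: 1-7 January is week 1, 8-14 January is week 2, ...
week_of_year <- function(d) as.POSIXlt(as.Date(d))$yday %/% 7L + 1L

week_of_year(c("2016-01-01", "2016-01-07", "2016-01-08", "2016-10-31"))
# [1]  1  1  2 44

# 1 January 2016 is a Friday (wday 5, with Sunday = 0), so these weeks start on Friday
as.POSIXlt(as.Date("2016-01-01"))$wday
# [1] 5
```

For a year whose 1 January is not a Friday, this count will not line up with Friday-start weeks.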
You can use lubridate in this case. I've just had the same problem as you. In my case I needed the week to start on Wednesday, so the code below starts the week on Wednesday:
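library(lubridate)
weekOnWed <- function(theDate) {
  formattedDate <- dmy(theDate)
  ## find the first Wednesday of the year among 1-7 January
  i <- 1
  while (i <= 7) {
    isWed <- dmy(paste("0", as.character(i), "/01/", year(formattedDate), sep = ""))
    ## Wednesday is the 4th day of the week in lubridate (Sunday = 1 by default);
    ## to start weeks on Friday, change the 4 below to 6 (Friday is the 6th day)
    if (wday(isWed) == 4) {
      break
    } else {
      i <- i + 1
    }
  }
  firstWed <- day(isWed)
  ## if the year does not start on a Wednesday, the days before the
  ## first Wednesday form week 1 and the first Wednesday starts week 2
  if (firstWed > 1) {
    firstWeek <- 2
  } else {
    firstWeek <- 1
  }
  ## whole weeks elapsed since the first Wednesday (-1 for dates before it)
  rangeWeek <- as.integer(formattedDate - isWed) %/% 7
  weekNum <- rangeWeek + firstWeek
  weekNum
}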
Input: a string in dd-mm-yyyy format.
Output: the week number as an integer.
Hopefully that helps :)
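The same idea can be sketched in base R without the loop, and vectorised over dates. This is only an illustration of the approach above under the same convention (any days before the first start-of-week day form week 1); the name week_from and its start_wday argument are made up here.

```r
# Week number of the year for weeks starting on a chosen weekday.
# start_wday follows the POSIXlt convention: 0 = Sunday, ..., 5 = Friday.
week_from <- function(dates, start_wday = 5) {
  d <- as.Date(dates)
  jan1 <- as.Date(paste0(format(d, "%Y"), "-01-01"))
  # Days from 1 January to the first occurrence of start_wday in that year
  offset <- (start_wday - as.POSIXlt(jan1)$wday) %% 7
  first_start <- jan1 + offset
  # A partial week before the first start day counts as week 1
  first_week <- ifelse(offset > 0, 2L, 1L)
  as.integer(d - first_start) %/% 7L + first_week
}

week_from(c("2016-01-01", "2016-01-07", "2016-01-08", "2016-01-14"))
# [1] 1 1 2 2
```

Passing start_wday = 3 reproduces the Wednesday version, for example week_from("2016-01-05", 3) is 1 and week_from("2016-01-06", 3) is 2.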
I have a data frame in R with the week of the year that I would like to convert to a date. I know I have to pick a year and a day of the week so I am fixing those values at 2014 and 1. Converting this to a date seems simple:
as.Date(paste(2014,df$Week,1,sep=""),"%Y%U%u")
But this code only works if the week is greater than 9. Weeks 1 to 9 return NA. If I change the week to 01, 02, 03... it still returns NA.
Does anyone see what I am missing?
as.Date returns NA for weeks 1 to 9 because it expects two digits for the week number and can't parse the string properly.
To fix it, add a "-" separator to split the fields up:
as.Date(paste(2014, df$Week, 1, sep="-"), "%Y-%U-%u")
An alternative solution is to use date arithmetic from the lubridate package:
lubridate::ymd( "2014-01-01" ) + lubridate::weeks( df$Week - 1 )
The -1 is necessary because 2014-01-01 is already week 1. In other words, we want:
df$Week == 1 to map to 2014-01-01 (which is ymd("2014-01-01") + weeks(1-1))
df$Week == 2 to map to 2014-01-08 (which is ymd("2014-01-01") + weeks(2-1))
and so on.
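The same date arithmetic also works in base R, since adding a number to a Date adds that many days. A small sketch, with a made-up Week vector standing in for df$Week:

```r
# Hypothetical week numbers standing in for df$Week
Week <- c(1, 2, 10)

# Week 1 maps to 2014-01-01, and each further week adds 7 days
week_dates <- as.Date("2014-01-01") + 7 * (Week - 1)
week_dates
# [1] "2014-01-01" "2014-01-08" "2014-03-05"
```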
Another option with lubridate
lubridate::parse_date_time(paste(2014, df$Week, 1, sep="/"),'Y/W/w')
W - week number, w - weekday number, 0-6 (Sun-Sat)
Another alternative is to make sure that week numbers have two digits. This can be done with stringr::str_pad(), which pads the value with pad="0" up to width=2 characters:
year <- 2015
week <- 1
as.Date(paste(year, week, "1", sep=""), "%Y%U%u")
#> [1] NA
as.Date(paste(year, stringr::str_pad(week,width=2, pad="0"), "1", sep=""), "%Y%U%u")
#> [1] "2015-01-05"
as.Date(paste(year, week, "1", sep="-"), "%Y-%U-%u")
#> [1] "2015-01-05"
Created on 2021-04-19 by the reprex package (v1.0.0)
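If you would rather not depend on stringr, base R's sprintf() can do the zero-padding. A short sketch along the same lines (the variable names padded, d1 and d2 are only for illustration):

```r
year <- 2015
week <- c(1L, 9L, 10L)

# Zero-pad the week numbers to two digits with base R
padded <- sprintf("%02d", week)
padded
# [1] "01" "09" "10"

# Both spellings should parse to the same dates
d1 <- as.Date(paste0(year, padded, "1"), "%Y%U%u")
d2 <- as.Date(paste(year, week, 1, sep = "-"), "%Y-%U-%u")
identical(d1, d2)
```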
If the week numbers keep counting across several years, subtract whole years of weeks: 2nd year = (week - 52), 3rd year = (week - 104), and so on. No loop is needed, because the arithmetic is vectorised over the whole column:
# weeks 1-52 fall in 2016, 53-104 in 2017, 105 onwards in 2018, ...
yearOffset <- (train$week - 1) %/% 52
weekInYear <- (train$week - 1) %% 52 + 1
train$weekdate <- as.Date(paste(2016 + yearOffset, weekInYear, 1, sep="-"), "%Y-%U-%u")
I have a column of strings in my data set formatted as year week (e.g. '201401' is equivalent to 7th April 2014, or the first fiscal week of the year)
I am trying to convert these to a proper date so I can manipulate them later; however, I always receive the same date for a given year, specifically the 14th of April.
e.g.
test_set <- c('201401', '201402', '201403')
as.Date(test_set, '%Y%U')
gives me:
[1] "2014-04-14" "2014-04-14" "2014-04-14"
Try something like this:
> test_set <- c('201401', '201402', '201403')
>
> extractDate <- function(dateString, fiscalStart = as.Date("2014-04-01")) {
+ week <- substr(dateString, 5, 6)
+ currentDate <- fiscalStart + 7 * as.numeric(week) - 1
+ currentDate
+ }
>
> extractDate(test_set)
[1] "2014-04-07" "2014-04-14" "2014-04-21"
Basically, I'm extracting the week number from the string, converting it to days, and then adding that number of days to the start of the fiscal year (less 1 day to make things line up). Note that the year part of the string is ignored; the fiscal year is set through fiscalStart.
I'm not 100% sure what your desired output is, but this may work:
as.Date(paste0(substr(test_set, 1, 4), "-04-07")) +
(as.numeric(substr(test_set, 5, 6)) - 1) * 7
# [1] "2014-04-07" "2014-04-14" "2014-04-21"
I want to be able to create a water year column for a time series. The US water year is from Oct-Sept and is considered the year it ends on. For example the 2014 water year is from October 1, 2013 - September 30, 2014.
This is the US water year, but not the only water year. Therefore I want to enter in a start month and have a water year calculated for the date.
For example if my data looks like
date
2008-01-01 00:00:00
2008-02-01 00:00:00
2008-03-01 00:00:00
2008-04-01 00:00:00
.
.
.
2008-12-01 00:00:00
I want my function to work something like:
wtr_yr <- function(data, start_month) {
does stuff
}
Then my output would be
wtr_yr(data, 2)
date wtr_yr
2008-01-01 00:00:00 2008
2008-02-01 00:00:00 2009
2008-03-01 00:00:00 2009
2008-04-01 00:00:00 2009
.
.
.
2009-01-01 00:00:00 2009
2009-02-01 00:00:00 2010
2009-03-01 00:00:00 2010
2009-04-01 00:00:00 2010
I started by breaking the date up into separate columns, but I don't think that is the best way to go about it. Any advice?
Thanks in advance!
We can use POSIXlt to come up with an answer. The default start_month of 10 corresponds to the US water year (October to September).
wtr_yr <- function(dates, start_month=10) {
# Convert dates into POSIXlt
dates.posix = as.POSIXlt(dates)
# Year offset
offset = ifelse(dates.posix$mon >= start_month - 1, 1, 0)
# Water year
adj.year = dates.posix$year + 1900 + offset
# Return the water year
adj.year
}
Let's now use this function in an example.
# Sample input vector
dates = c("2008-01-01 00:00:00",
"2008-02-01 00:00:00",
"2008-03-01 00:00:00",
"2008-04-01 00:00:00",
"2009-01-01 00:00:00",
"2009-02-01 00:00:00",
"2009-03-01 00:00:00",
"2009-04-01 00:00:00")
# Display the function output
wtr_yr(dates, 2)
# Combine the input and output vectors in a dataframe
df = data.frame(dates, wtr_yr=wtr_yr(dates, 2))
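For comparison, here is a minimal sketch of the same rule built on format() rather than POSIXlt fields. The name water_year is made up, and treating start_month = 1 as the plain calendar year is an assumption of this sketch, not something stated in the question.

```r
water_year <- function(dates, start_month = 10) {
  d <- as.Date(dates)
  yr <- as.integer(format(d, "%Y"))
  mon <- as.integer(format(d, "%m"))
  # Dates on or after the start month belong to the year in which the
  # water year ends; start_month = 1 is treated as the calendar year
  yr + as.integer(start_month > 1 & mon >= start_month)
}

dates <- c("2008-01-01", "2008-02-01", "2008-03-01", "2008-04-01",
           "2009-01-01", "2009-02-01", "2009-03-01", "2009-04-01")
water_year(dates, 2)
# [1] 2008 2009 2009 2009 2009 2010 2010 2010

# US water year: 1 October 2013 to 30 September 2014 is water year 2014
water_year(c("2013-09-30", "2013-10-01", "2014-09-30"))
# [1] 2013 2014 2014
```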
I had a similar problem a while back, but dealing with fiscal years that started in October. I found this function, which also computes the quarters within the year. For one part, I only wanted it to output the fiscal year, so I edited a small part of the function to do that. There is surely a cleaner and more efficient way of doing it, but this should work for smaller data sets. Here is the edited function:
getYearQuarter <- function(x,
firstMonth=7,
fy.prefix='FY',
quarter.prefix='Q',
sep='-',
level.range=c(min(x), max(x)) ) {
if(level.range[1] > min(x) | level.range[2] < max(x)) {
warning(paste0('The range of x is greater than level.range. Values ',
'outside level.range will be returned as NA.'))
}
quarterString <- function(d) {
year <- as.integer(format(d, format='%Y'))
month <- as.integer(format(d, format='%m'))
y <- ifelse(firstMonth > 1 & month >= firstMonth, year+1, year)
q <- cut( (month - firstMonth) %% 12, breaks=c(-Inf,2,5,8,Inf),
labels=paste0(quarter.prefix, 1:4))
return(paste0(fy.prefix, substring(y,3,4)))
}
vals <- quarterString(x)
levels <- unique(quarterString(seq(
as.Date(format(level.range[1], '%Y-%m-01')),
as.Date(format(level.range[2], '%Y-%m-28')), by='month')))
return(factor(vals, levels=levels, ordered=TRUE))
}
Your input vector should be of type Date; then specify the start month. Assuming you have a data frame (df) with the 'date' column as in your question, this should do the trick.
df$wtr_yr <- getYearQuarter(df$date, firstMonth=10)
You can also add a water year column by using the "lfstat" package:
https://www.rdocumentation.org/packages/lfstat/versions/0.9.4/topics/water_year
How can a date/time object in R be transformed into a fractional Julian day?
For example, how can I turn this date:
date <- as.POSIXct('2006-12-12 12:00:00',tz='GMT')
into a number like this
> fjday
[1] 345.5
where the Julian day is the number of elapsed days counted from January 1st. The fraction 0.5 means that it's 12 pm, i.e. half of the day has passed.
This is just an example, but my real data covers all 365 days of the year 2006.
Since all your dates are from the same year (2006) this should be pretty easy:
julian(date, origin = as.POSIXct('2006-01-01', tz = 'GMT'))
If you or another reader happen to expand your dataset to other years, then you can set the origin for the beginning of each year as follows:
sapply(date, function(x) julian(x, origin = as.POSIXct(paste0(format(x, "%Y"),'-01-01'), tz = 'GMT')))
Have a look at the difftime function:
> unclass(difftime('2006-12-12 12:00:00', '2006-01-01 00:00:00', tz="GMT", units = "days"))
[1] 345.5
attr(,"units")
[1] "days"
Here is a function to convert a POSIX date-time to a Julian day, extending the answer above. It counts from 1, so January 1st at midnight is day 1. Source it before using.
julian_conv <- function(x) {
if (is.na(x)) { # Because julian() cannot accept NA values
return(NA)
}
else {
j <-julian(x, origin = as.POSIXlt(paste0(format(x, "%Y"),'-01-01')))
temp <- unclass(j) # To unclass the object julian day to extract julian day
return(temp[1] + 1) # Because Julian day 1 is 1 e.g., 2016-01-01
}
}
Example:
date <- as.POSIXct('2006-12-12 12:00:00')
julian_conv(date)
#[1] 346.5
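A vectorised alternative, sketched here under the assumption that you want elapsed days since 1 January (zero-based, as in the question) in the time zone of the input, is to combine the POSIXlt fields directly. The name frac_yday is made up for this example.

```r
# Fractional day of year, counted from 0 at midnight on 1 January
frac_yday <- function(x) {
  lt <- as.POSIXlt(x)
  # yday is zero-based; the time of day is added as a fraction of 24 hours
  lt$yday + (lt$hour + lt$min / 60 + lt$sec / 3600) / 24
}

x <- as.POSIXct(c("2006-01-01 00:00:00", "2007-01-01 06:00:00",
                  "2006-12-12 12:00:00"), tz = "GMT")
frac_yday(x)
# [1]   0.00   0.25 345.50
```

Because the year is taken from each element, the same call works for dates spread over several years. Add 1 to the result if you want 1 January to count as day 1.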
I would like a function that counts the number of occurrences of a specific weekday in a month,
e.g. Nov '13 -> 5 Fridays, while Dec '13 would return 4 Fridays.
Is there an elegant function that would return this?
library(lubridate)
num_days <- function(date){
x <- as.Date(date)
start = floor_date(x, "month")
count = days_in_month(x)
d = wday(start)
sol = ifelse(d > 4, 5, 4) # rough estimate: if the first day of the month is Thursday or later (wday > 4), assume the month has 5 Fridays
sol
}
num_days("2013-08-01")
num_days(today())
What would be a better way to do this?
1) Here d is the input, a Date class object, e.g. d <- Sys.Date(). The result gives the number of Fridays in the year/month that contains d. Replace 5 with 1 to get the number of Mondays:
first <- as.Date(cut(d, "month"))
last <- as.Date(cut(first + 31, "month")) - 1
sum(format(seq(first, last, "day"), "%w") == 5)
2) Alternately replace the last line with the following line. Here, the first term is the number of Fridays from the Epoch to the next Friday on or after the first of the next month and the second term is the number of Fridays from the Epoch to the next Friday on or after the first of d's month. Again, we replace all 5's with 1's to get the count of Mondays.
ceiling(as.numeric(last + 1 - 5 + 4) / 7) - ceiling(as.numeric(first - 5 + 4) / 7)
The second solution is slightly longer (although it has the same number of lines) but it has the advantage of being vectorized, i.e. d could be a vector of dates.
UPDATE: Added second solution.
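The vectorised formula can be wrapped in a small function that takes the weekday as an argument. This is only a sketch of the second solution above: the name count_wday_in_month is made up, and the month boundaries are computed with format() instead of cut().

```r
# Count how often a weekday (0 = Sunday, ..., 5 = Friday) occurs in the
# month containing each date in d
count_wday_in_month <- function(d, wday = 5) {
  d <- as.Date(d)
  first <- as.Date(format(d, "%Y-%m-01"))
  # first + 31 always lands in the following month; snap back to its day 1
  next_first <- as.Date(format(first + 31, "%Y-%m-01"))
  # 1970-01-01 (day number 0) was a Thursday (wday 4), so dates falling on
  # `wday` have a day number congruent to r modulo 7
  r <- (wday - 4) %% 7
  ceiling((as.numeric(next_first) - r) / 7) - ceiling((as.numeric(first) - r) / 7)
}

count_wday_in_month(c("2013-11-15", "2013-12-01"))      # Fridays
# [1] 5 4
count_wday_in_month("2013-11-15", wday = 1)             # Mondays
# [1] 4
```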
There are a number of ways to do it. Here is one:
countFridays <- function(y, m) {
fr <- as.Date(paste(y, m, "01", sep="-"))
to <- fr + 31
dt <- seq(fr, to, by="1 day")
df <- data.frame(date=dt, mon=as.POSIXlt(dt)$mon, wday=as.POSIXlt(dt)$wday)
df <- subset(df, df$wday==5 & df$mon==df[1,"mon"])
return(nrow(df))
}
It creates the first of the month, and a day in the next month.
It then creates a data frame of month index (on a 0 to 11 range, but we only use this for comparison) and weekday.
We then subset to a) be in the same month and b) on a Friday. That is your result set, and
we return the number of rows as your answer.
Note that this only uses base R code.
Without using lubridate:
#arguments to pass to function:
whichweekday <- 5
whichmonth <- 11
whichyear <- 2013
#function code:
firstday <- as.Date(paste('01',whichmonth,whichyear,sep="-"),'%d-%m-%Y')
lastday <- seq(firstday, length.out=2, by="1 month")[2] - 1  # last day of the month as a Date; also works for December
sum(
strftime(
seq.Date(
from = firstday,
to = lastday,
by = "day"),
'%w'
) == whichweekday)