Bizdays doesn't exclude weekends - r

I am trying to calculate utilization rates by relative employee lifespans. I need to assign a total number of hours available to each employee between the earliest and latest dates on which time was recorded. I will then use this total as the divisor in utilization rate = workhours / totalhours.
When testing the bizdays function, I tried a simple example.
bizdays::bizdays("2020-02-07","2020-02-14")
[1] 7
Any reason why the function is not returning the correct number of business days?
I am expecting 5 business days since 2/07 was a Friday so only 1 week should be included.
The goal is to use bizdays in the following function, which will be applied to a grouped df with gapply.
timeentry = function(x){
  end_date = max(x$terminus) # latest end date in the group
  start_date = min(x$onset) # earliest start date in the group
  start_date %>% bizdays(end_date) * 8 # business days between the two dates, multiplied by 8 to get work hours
}
I will apply the function in this manner. Unfortunately, it returns an error suggesting it cannot allocate vector of size 4687 gb. This is a separate issue I hope someone can point out.
util = group %>% gapply(.,timeentry)
where group is the grouped df.

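Try setting up your calendar with create.calendar. The call in the question does not use a calendar that treats Saturday and Sunday as non-working days, so every day in the range is counted, which is why it returns 7.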
library(bizdays)
create.calendar(name = "demo", weekdays = c("saturday", "sunday"))
bizdays::bizdays("2020-02-07","2020-02-14", cal = "demo")
[1] 5
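As a cross-check that does not depend on the bizdays package, the same count can be reproduced in base R. This is only a minimal sketch: count_weekdays is a hypothetical helper (not part of bizdays), it ignores holidays, and it excludes the start date, an assumption chosen so that the result matches the 5 returned above.

```r
# Hypothetical helper: count Monday-Friday dates after `from`, up to and including `to`.
count_weekdays <- function(from, to) {
  from <- as.Date(from)
  to <- as.Date(to)
  if (to <= from) return(0L)
  days <- seq(from + 1, to, by = "day")
  # "%u" is the weekday number (1 = Monday, ..., 7 = Sunday), independent of locale
  sum(as.integer(format(days, "%u")) <= 5)
}

count_weekdays("2020-02-07", "2020-02-14")      # 5
count_weekdays("2020-02-07", "2020-02-14") * 8  # 40 available hours at 8 hours per day
```

If the two approaches disagree for a given range, the calendar definition (weekdays, holidays, whether the start date is counted) is the first thing to check.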

Related

How can I conditionally add a value to rows that match a certain condition in R?

I run some code for the next four weeks to determine capacity hours for our professional services team. I need a way to account for the reduction in available hours due to company holidays. I have a table below with some hypothetical holidays. The table contains the actual date of the holiday and the Monday of the week it begins on.
Each week, I run a model to project PTO and capacity for the next four weeks. I also want to check my list of holidays to see if one occurs. If it does, I want a new column Holiday Hours to reflect the amount of holiday hours for that week, which is Billable Capacity Hours / 5 for a single holiday in a 5-day work week.
library(tidyverse)
library(lubridate)
Four_week_capacity_PTO <- read_csv("Week,Billable PTO Hours,Billable Capacity Hours\n
2022-07-11,10.00000,691\n
2022-07-18,80.03333,691\n
2022-07-25,19.00000,691\n
2022-08-01,33.0000,691\n
2022-08-08,30.6000,691\n")
holidays <- ymd(c("2022-08-02","2022-08-04","2022-11-24","2022-12-26"))
holiday_df <- tibble(
holidays,
Week = holidays - wday(holidays, week_start = 2), #This finds the Monday of the week in which the holiday occurs
)
For some reason, when I try to create the Holiday Hours variable using the below code, the code is executing with no errors or warnings but no variable is being created.
Four_week_capacity_PTO_holiday <- Four_week_capacity_PTO %>%
rowwise() %>%
mutate(
for (i in length(Week)){
`Holiday Hours` = if_else(
holiday_df$Week[i] %in% Week, `Billable PTO Hours`/5, 0)
}
)
The other issue I'm having is that if a week has two holidays, such as Christmas Eve and Christmas, I need to reduce the weekly Billable Capacity Hours by two days of work -- i.e., (Billable Capacity Hours / 5) * 2...
I have tested the individual conditionals in my for loop and they are correctly evaluating to TRUE and FALSE as it iterates, so I don't understand why no variable is being created.
How about making your holiday_df include a count of holidays for each week:
holiday_df <- tibble(
  holidays,
  Week = holidays - wday(holidays, week_start = 2), #This finds the Monday of the week in which the holiday occurs
) %>% group_by(Week) %>% summarise(H_Count = n())
Then join that to your original data set and use the H_Count for your logic and calculations
Four_week_capacity_PTO_holiday <- Four_week_capacity_PTO %>%
  left_join(holiday_df, by = "Week") %>%
  mutate(H_Count = coalesce(H_Count, 0L), # weeks without a holiday get a count of 0 instead of NA
         `Holiday Hours` = if_else(H_Count > 0, `Billable PTO Hours` / 5, 0),
         NEW_BCH = if_else(H_Count == 0, `Billable Capacity Hours`, `Billable Capacity Hours` / 5 * H_Count))
Or, if you don't want to join and would rather keep the backbone of your previous workflow, you don't really need the for loop; you just need to switch the order of the arguments to %in%. Without the count, though, this is not useful for adjusting the Billable Capacity Hours.
Four_week_capacity_PTO_holiday <- Four_week_capacity_PTO %>%
mutate(
`Holiday Hours` = if_else(
Week %in% holiday_df$Week, `Billable PTO Hours`/5, 0)
)
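One thing to watch in the week lookup: holidays - wday(holidays, week_start = 2) maps a holiday that falls on a Monday (such as 2022-12-26) to the previous Monday, because wday() returns 7 for Mondays when the week starts on Tuesday. The sketch below shows the counting idea in base R with a Monday lookup that keeps Mondays in their own week. The column names capacity_hours, holiday_count, holiday_hours and remaining_hours and the helper monday_of are made up for illustration, and holiday hours are taken as capacity / 5 per holiday, as described in the question.

```r
# Data from the question, rebuilt in base R with simplified column names
capacity <- data.frame(
  Week = as.Date(c("2022-07-11", "2022-07-18", "2022-07-25", "2022-08-01", "2022-08-08")),
  capacity_hours = 691
)
holidays <- as.Date(c("2022-08-02", "2022-08-04", "2022-11-24", "2022-12-26"))

# Monday of the week containing each date; "%u" is 1 for Monday, ..., 7 for Sunday
monday_of <- function(d) d - (as.integer(format(d, "%u")) - 1)

# Number of holidays that fall in each week of the capacity table
holiday_weeks <- monday_of(holidays)
capacity$holiday_count <- sapply(
  seq_along(capacity$Week),
  function(i) sum(holiday_weeks == capacity$Week[i])
)

# One fifth of the weekly capacity per holiday, assuming a 5-day work week
capacity$holiday_hours <- capacity$capacity_hours / 5 * capacity$holiday_count
capacity$remaining_hours <- capacity$capacity_hours - capacity$holiday_hours
```

With this data only the week of 2022-08-01 is affected: it contains two holidays, so two fifths of its capacity are removed.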

Trying to calculate a difference in dates, while excluding weekends - R-studio

IT_tickets[,"ticket_age"] <- NA
{R aging_count for Tasks}
IT_tickets$ticket_age[c(all_tasks)] <- difftime(IT_tickets$closed_at_date[c(all_tasks)], IT_tickets$sys_created_date[c(all_tasks)], units = "days")
I have a column called "ticket_age" in my dataset IT_tickets, which holds the difference in days between when a ticket was created and when it was closed. How can I recode this so that weekends are excluded from the difference in days?
I want something similar to how the NETWORKDAYS function works in Excel.
If you don't have to include the holidays, you can do this
IT_tickets$ticket_age[c(all_tasks)] <- sum(!weekdays(seq(IT_tickets$sys_created_date[c(all_tasks)],
IT_tickets$closed_at_date[c(all_tasks)],
"days")) %in% c("Saturday", "Sunday")) - 1
If you want to include the start date into the count, you can remove the subtraction of 1.
Another way:
IT_tickets$ticket_age[c(all_tasks)] <- (IT_tickets$ticket_age[c(all_tasks)]%/%7) * 5 + IT_tickets$ticket_age[c(all_tasks)]%%7
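Note that seq() accepts only a single start and a single end date, and sum() collapses everything to one number, so the first approach handles one ticket at a time. A minimal sketch of a per-row version follows. weekday_diff is a hypothetical helper, it ignores holidays, and it counts the weekdays after the created date up to and including the closed date, which is an assumption about how the age should be defined.

```r
# Hypothetical helper: weekdays after `start`, up to and including `end`, for each pair.
weekday_diff <- function(start, end) {
  start <- as.Date(start)
  end <- as.Date(end)
  vapply(seq_along(start), function(i) {
    # Missing or reversed dates give NA; same-day tickets have age 0
    if (is.na(start[i]) || is.na(end[i]) || end[i] < start[i]) return(NA_integer_)
    if (end[i] == start[i]) return(0L)
    days <- seq(start[i] + 1, end[i], by = "day")
    # "%u" is 1 for Monday, ..., 7 for Sunday, independent of locale
    sum(as.integer(format(days, "%u")) <= 5)
  }, integer(1))
}

created <- c("2020-02-07", "2020-02-07")
closed  <- c("2020-02-14", "2020-02-10")
weekday_diff(created, closed)  # 5 1

# Possible usage with the question's columns, assuming both are Date columns:
# IT_tickets$ticket_age <- weekday_diff(IT_tickets$sys_created_date, IT_tickets$closed_at_date)
```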

how to calculate MTD in R for a different month

The task is to find number of days in MTD for two months, the month that had highest inflow numbers (July) in my case and the present month.
Because I plan to run the statement as a script every day, I don't want to hardcode anything.
The dataframe is like this:
SERVICE               BEST MONTH TOTAL    BEST MONTH MTD    CURR. MONTH MTD
No of Working Days
..
..
..
For "BEST MONTH TOTAL", I used following statement:
report[1,2] <- sum(!weekdays(seq(as.Date('2019-07-01'), as.Date('2019-07-31'), 'days')) %in% c('Sunday','Saturday'))
For the current month's MTD, I calculated the number of days using:
difftime(Sys.Date(),'2019-09-01',units = "days" )
This gives the output:
Time difference of 12.22917 days
Is there a way that I can get just the integer 12?
And how do I calculate BEST MONTH MTD? Is there a function that will help me go back to the same day of the month as Sys.Date() in July, so that I can calculate the number of working days MTD?
i.e. essentially what I need is:
difftime('2019-07-13','2019-07-01', units = "days")
But don't want to hardcode '2019-07-13' as I want to run this as a script and want to avoid changing date every day. Also I just need the difference in integer without "Time difference of ... days".
To convert to the number of days, as a numeric:
as.numeric(difftime(Sys.Date(),'2019-09-01',units = "days" ))
Is this what you want?
trunc(difftime(Sys.Date(),'2019-09-01',units = "days"))
#output
#Time difference of 12 days
best_month_mtd <- function(y) {
trunc(difftime(Sys.Date(),y, units = "days"))
}
#typical usage
best_month_mtd('2019-09-01')
#output
#Time difference of 12 days
That should give you what you want.
Update
If you just want to compute the number of days elapsed since the start of the month, a one-liner using lubridate might be handy. This may differ from the actual number of working days. The latter requires handling the days an organization's calendar marks as closed, as well as national holidays.
library(lubridate)
# -------------------------------------------------------------------------
#The start date of the month can be computed using floor_date() from lubridate
#floor_date() takes a date-time object and rounds it down to the nearest
#boundary of the specified time unit.
month_start <- floor_date(Sys.Date(), "month")
month_start
#"2019-09-01"
# -------------------------------------------------------------------------
#difftime between Sys.Date() and month_start
difftime(Sys.Date(),month_start, units = "days")
#Time difference of 12 days
# -------------------------------------------------------------------------
# to print only the number of days
paste0(trunc(difftime(Sys.Date(),month_start, units = "days")))
#"12"
# -------------------------------------------------------------------------
#Using the above, a one-liner can be constructed as follows.
paste0(trunc(difftime(Sys.Date(),floor_date(Sys.Date(), "month"), units = "days")))
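For the BEST MONTH MTD part of the question, one option is to take today's day of the month and count weekdays in the best month up to that same day. The following is a minimal base R sketch. working_days_mtd is a hypothetical helper, it ignores holidays, and it caps the cutoff at the end of the month (for example when today is the 31st and the best month has 30 days).

```r
# Hypothetical helper: weekdays from `month_start` up to the same day of the month as `today`.
working_days_mtd <- function(month_start, today = Sys.Date()) {
  month_start <- as.Date(month_start)
  day_of_month <- as.integer(format(today, "%d"))
  # Last day of the month that begins at month_start
  month_end <- seq(month_start, by = "month", length.out = 2)[2] - 1
  cutoff <- min(month_start + day_of_month - 1, month_end)
  days <- seq(month_start, cutoff, by = "day")
  # "%u" is 1 for Monday, ..., 7 for Sunday, independent of locale
  sum(as.integer(format(days, "%u")) <= 5)
}

# Fixed dates are used here only so the results are reproducible
working_days_mtd("2019-07-01", today = as.Date("2019-09-13"))  # 10, best month MTD
working_days_mtd("2019-09-01", today = as.Date("2019-09-13"))  # 10, current month MTD
working_days_mtd("2019-07-01", today = as.Date("2019-08-31"))  # 23, all of July
```

In a daily script the today argument can be left at its default, so nothing needs to be hardcoded apart from the start of the best month.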

Calculate mean of one column for 14 rows before certain row, as identified by date for each group (year)

I would like to calculate the mean of Mean.Temp.c. before a certain date, such as 1963-03-23, as shown in the date2 column in this example. This is when peak snowmelt runoff occurred in 1963 in my area. I want to know the mean temperature over the 10 days before this date (i.e., 1963-03-23). How can I do this? I have 50 years of data, and the peak snowmelt date is different each year.
example data
You can try:
library(dplyr)
df %>%
mutate(date2 = as.Date(as.character(date2)),
ten_day_mean = mean(Mean.Temp.c[between(date2, "1963-03-14", "1963-03-23")]))
In this case the desired mean would populate the whole column.
Or with data.table:
library(data.table)
setDT(df)[between(as.Date(as.character(date2)), "1963-03-14", "1963-03-23"), ten_day_mean := mean(Mean.Temp.c)]
In the latter case you'd get NA for those days that are not relevant for your date range.
Supposing date2 is a Date field and your data.frame is called x:
start_date <- as.Date("1963-03-23")-10
end_date <- as.Date("1963-03-23")
mean(x$Mean.Temp.c.[x$date2 >= start_date & x$date2 <= end_date])
Now, if you have multiple years of interest, you could wrap this code within a for loop (or [s|l]apply) taking elements from a vector of dates.
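The multi-year version could look like the sketch below. Everything in it is an assumption made for illustration: the data frame temps with columns date and temp is synthetic, the peaks vector holds one made-up peak date per year (only 1963-03-23 comes from the question), and the window is taken as the n_days days strictly before each peak date.

```r
# Synthetic daily data for two years (values are made up)
temps <- data.frame(
  date = c(seq(as.Date("1963-03-01"), as.Date("1963-03-31"), by = "day"),
           seq(as.Date("1964-03-01"), as.Date("1964-03-31"), by = "day")),
  temp = c(1:31, 11:41)
)
# One peak snowmelt date per year
peaks <- as.Date(c("1963-03-23", "1964-03-15"))

# Mean temperature over the n_days days strictly before each peak date
mean_before_peak <- function(data, peaks, n_days = 10) {
  vapply(seq_along(peaks), function(i) {
    in_window <- data$date >= peaks[i] - n_days & data$date < peaks[i]
    mean(data$temp[in_window])
  }, numeric(1))
}

mean_before_peak(temps, peaks)  # 17.5 19.5
```

To include the peak date itself, as in the answer above, change the second comparison to <=.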

Next week day for a given vector of dates

I'm trying to get the next weekday for a vector of dates in R. My approach was to create a vector of weekdays and then locate the weekend date I have within it. The problem is that for Saturdays and some holidays (of which there are a lot in my country) I end up getting the previous weekday, which doesn't work.
This is an example of my problem:
vecDates = as.Date(c("2011-01-11","2011-01-12","2011-01-13","2011-01-14","2011-01-17","2011-01-18",
"2011-01-19","2011-01-20","2011-01-21","2011-01-24"))
testDates = as.Date(c("2011-01-22","2011-01-23"))
findInterval(testDates,vecDates)
For both dates the correct answer should be 10, which is "2011-01-24", but I get 9.
I thought of a solution where I remove all the dates before the one I'm analyzing and then use findInterval. It works, but it is not vectorized and is therefore rather slow, which does not work for my actual purpose.
Does this do what you want?
vecDates = as.Date(c("2011-01-11","2011-01-12",
"2011-01-13","2011-01-14",
"2011-01-17","2011-01-18",
"2011-01-19","2011-01-20",
"2011-01-21","2011-01-24"))
testDates = as.Date(c("2011-01-20","2011-01-22","2011-01-23"))
get_next_biz_day <- function(testdays, bizdays){
o <- findInterval(testdays, bizdays) + 1
bizdays[o]
}
get_next_biz_day(testDates, vecDates)
#[1] "2011-01-21" "2011-01-24" "2011-01-24"
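As the output shows, a test date that is itself a business day (2011-01-20) is moved to the following business day. If such a date should map to itself instead, one possible variant uses the left.open argument of findInterval. This is a sketch, and next_or_same_biz_day is a hypothetical name.

```r
vecDates <- as.Date(c("2011-01-11", "2011-01-12",
                      "2011-01-13", "2011-01-14",
                      "2011-01-17", "2011-01-18",
                      "2011-01-19", "2011-01-20",
                      "2011-01-21", "2011-01-24"))
testDates <- as.Date(c("2011-01-20", "2011-01-22", "2011-01-23"))

# First business day on or after each test date.
# With left.open = TRUE, findInterval counts the business days strictly before
# the test date, so adding 1 gives the index of the first one on or after it.
next_or_same_biz_day <- function(testdays, bizdays) {
  idx <- findInterval(testdays, bizdays, left.open = TRUE) + 1
  bizdays[idx]
}

next_or_same_biz_day(testDates, vecDates)
# "2011-01-20" "2011-01-24" "2011-01-24"
```

In both versions a test date after the last business day in the vector yields NA, since the index points past the end.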
