Trying to calculate a difference in dates, while excluding weekends - R-studio - r

IT_tickets[,"ticket_age"] <- NA
{R aging_count for Tasks}
IT_tickets$ticket_age[c(all_tasks)] <- difftime(IT_tickets$closed_at_date[c(all_tasks)], IT_tickets$sys_created_date[c(all_tasks)], units = "days")
I have a column called "ticket_age" in my dataset IT_tickets, which holds the difference in days between when a ticket is created and when it is closed. How can I recode this so that it excludes weekends from the difference in days, similar to how the NETWORKDAYS function works in Excel?

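If you don't need to account for holidays, you can count the non-weekend days between the two dates. seq() needs a single start and a single end date, so the count is computed one ticket at a time with mapply:
IT_tickets$ticket_age[all_tasks] <- mapply(
  function(from, to) sum(!weekdays(seq(from, to, by = "days")) %in% c("Saturday", "Sunday")) - 1,
  IT_tickets$sys_created_date[all_tasks],
  IT_tickets$closed_at_date[all_tasks])
Note that weekdays() returns day names in the current locale, so the comparison assumes an English locale.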
If you want to include the start date into the count, you can remove the subtraction of 1.
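Another way, starting from the calendar-day difference already stored in ticket_age. This is an approximation: it counts five working days per full week and treats the leftover days as working days, so it can overcount by up to two days when the leftover span includes a weekend: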
IT_tickets$ticket_age[c(all_tasks)] <- (IT_tickets$ticket_age[c(all_tasks)]%/%7) * 5 + IT_tickets$ticket_age[c(all_tasks)]%%7
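To see how the two approaches compare, here is a small self-contained check in base R. The helper name business_days and the sample dates are made up for illustration. It uses format(x, "%u") (ISO weekday numbers) instead of weekdays() so the result does not depend on the locale; treat it as a sketch rather than a drop-in replacement.

```r
# Hypothetical helper: weekdays from `from` to `to`, not counting the start date.
business_days <- function(from, to) {
  days <- seq(from, to, by = "day")
  # "%u" is the ISO weekday number: 1 = Monday, ..., 6 = Saturday, 7 = Sunday
  sum(!format(days, "%u") %in% c("6", "7")) - 1
}

# Made-up sample tickets: Friday to Friday, and Monday to Sunday
created <- as.Date(c("2020-02-07", "2020-02-03"))
closed  <- as.Date(c("2020-02-14", "2020-02-09"))

# Exact count, one ticket at a time
exact <- mapply(business_days, created, closed)

# Week arithmetic on the calendar-day difference
age <- as.numeric(difftime(closed, created, units = "days"))
approx <- (age %/% 7) * 5 + age %% 7

print(data.frame(created, closed, exact, approx))
```

In this sample the two agree for the Friday-to-Friday ticket (5) but differ for the Monday-to-Sunday one (4 versus 6), because the six leftover days include a weekend.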

Related

Bizdays doesn't exclude weekends

I am trying to calculate utilization rates by relative employee lifespans. I need to assign each employee the total number of hours available between the earliest and latest dates on which time was recorded. I will then use this as the divisor in utilization rate = workhours / totalhours.
When testing the bizdays function, I tried a simple example.
bizdays::bizdays("2020-02-07","2020-02-14")
[1] 7
Any reason why the function is not returning the correct number of business days?
I am expecting 5 business days since 2/07 was a Friday so only 1 week should be included.
The goal is to use bizdays in the following function, which will be applied to a grouped df with gapply.
timeentry = function(x){
end_date = max(x$terminus) # latest end date in the group
start_date = min(x$onset) # earliest start date in the group
start_date %>% bizdays(end_date) * 8 # business days between the two dates, times 8 work hours per day
}
I apply the function as shown below. Unfortunately, it returns an error saying it cannot allocate a vector of size 4687 GB. This is a separate issue that I hope someone can help with.
util = group %>% gapply(.,timeentry)
where group is the grouped df.
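Without a calendar that defines the non-working days, bizdays counts every day in your example, weekends included. Try setting up your calendar with create.calendar, where the weekdays argument lists the days to exclude: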
library(bizdays)
create.calendar(name = "demo", weekdays = c("saturday", "sunday"))
bizdays::bizdays("2020-02-07","2020-02-14", cal = "demo")
[1] 5
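With the calendar in place, the timeentry function from the question could pass it along explicitly. The following is only a sketch under assumptions: the entries data frame and its employee column are invented for illustration, and base R split/sapply stands in for gapply.

```r
library(bizdays)

# Calendar from the answer above: Saturdays and Sundays are non-working days
create.calendar(name = "demo", weekdays = c("saturday", "sunday"))

# Work hours between the earliest onset and the latest terminus of a group
timeentry <- function(x, cal = "demo") {
  bizdays(min(x$onset), max(x$terminus), cal) * 8
}

# Made-up time entries (hypothetical column `employee`)
entries <- data.frame(
  employee = c("a", "a", "b"),
  onset    = as.Date(c("2020-02-07", "2020-02-10", "2020-02-10")),
  terminus = as.Date(c("2020-02-11", "2020-02-14", "2020-02-12"))
)

# One value per employee
hours <- sapply(split(entries, entries$employee), timeentry)
print(hours)
```

Employee "a" spans 2020-02-07 to 2020-02-14, the same range as in the answer, so the 5 business days become 40 hours.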

R carry forward last observation n times by group

This one is driving me nuts. I have a large data.table with monthly stock data. Every June I assign every stock to one of 10 portfolios based on an accounting variable. I would like to carry the assigned portfolio variable forward over the next 11 months, until each stock gets assigned to a new portfolio (1 to 10) in June of the next year. na.locf is basically what I'm looking for, but I am running into 2 issues:
1) Some stocks lack sufficient accounting data the next year, so they shouldn't be assigned to a portfolio in that year (i.e. the portfolio variable should stay NA). But of course na.locf keeps carrying the portfolio number forward until there is a new one.
2) Some stocks may get delisted after, e.g., 3 months, so they don't have another 11 months of data.
That's why I am looking for code that carries the last observation forward a maximum of 11 times, until June of the next year (when there is a new portfolio number).
That's the na.locf solution right now with the 2 issues (PERMNO is the stock identifier):
COMPUSTAT_CRSP_IBES1[,
Portfolio_Monthly := na.locf(Portfolio_Monthly,
na.rm = FALSE),
by = PERMNO]
I tried to use rep but that didn't work at all:
COMPUSTAT_CRSP_IBES1[,
Portfolio_Monthly := if_else(!is.na(Portfolio_Monthly),
rep(Portfolio_Monthly, 11),
NA),
by = PERMNO]
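Thanks for any hints!
You can create your fiscal year (June to May) and use it as one of the grouping criteria in your na.locf solution. The carry-forward then stops at every June, so a stock with no new assignment stays NA for that year: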
#show data before calculations
data.frame(dat)
#demo FY calculation
dat[, FY := year(MONTH) + as.numeric(month(MONTH) >= 6)]
#actual code
dat[, Portfolio_Monthly := zoo::na.locf(Portfolio_Monthly, na.rm=FALSE),
by=list(PERMNO, year(MONTH) + as.numeric(month(MONTH) >= 6))]
#show results
data.frame(dat)
sample data:
library(data.table)
set.seed(0L)
dat <- data.table(PERMNO=rep(LETTERS[1:12], each=20),
MONTH=rep(seq(as.Date("2000-01-01"), by="1 month", length.out=20), 12),
Portfolio_Monthly=NA_real_)
for (i in sample(1:dat[,.N], 5)) {
set(dat, i, 3L, rnorm(1))
}
setorder(dat, PERMNO, MONTH)
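The effect of the fiscal-year grouping can be checked on a tiny made-up series. This sketch uses base R ave with zoo::na.locf instead of data.table, purely to keep it short; the variable names and values are assumptions for illustration.

```r
library(zoo)

# Made-up data: one stock, monthly rows from June 2000 to August 2001
month <- seq(as.Date("2000-06-01"), by = "1 month", length.out = 15)
permno <- rep("A", length(month))
portfolio <- rep(NA_real_, length(month))
portfolio[1] <- 3  # assigned in June 2000; June 2001 stays NA (no accounting data)

# Fiscal year running June to May, labelled by the calendar year in which it ends
yr <- as.integer(format(month, "%Y"))
mo <- as.integer(format(month, "%m"))
fy <- yr + as.integer(mo >= 6)

# Carry forward within each stock and fiscal year only
filled <- ave(portfolio, permno, fy,
              FUN = function(v) na.locf(v, na.rm = FALSE))

print(data.frame(month, fy, portfolio, filled))
```

The value 3 is carried from June 2000 through May 2001, and the rows from June 2001 onward stay NA because they fall into a new fiscal-year group with no assignment.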

Remove incomplete month from monthly return calculation

I have some code for grabbing stock prices and calculating monthly returns. I would like to drop the last return if the price used to calculate it did not occur at month end. For example, running the code below returns prices through 2014-06-13. And, the monthlyReturn function calculates a return for June even though there hasn't been a full month. Is there an easy way to make sure monthlyReturn is only computing returns on full months or to drop the last month from the return vector if it wasn't calculated on a full month of prices?
library(quantmod)
symbols <- c('XLY', 'XLP', 'XLE', 'XLF', 'XLV', 'XLI', 'XLB', 'XLK', 'XLU')
Stock <- xts()
Prices <- xts()
for (i in 1:length(symbols)){
Stock <- getSymbols(symbols[i],auto.assign = FALSE)
Prices <- merge(Prices,Stock[,6])
}
returns <- do.call(cbind, lapply(Prices, monthlyReturn, leading=FALSE))
names(returns) <- symbols
I found this bit of code, but it seems to have some limitations. Is there a way to improve this?
if(tail(index(x.xts),1) != as.Date(as.yearmon(tail(index(x.xts),1)), frac=1)){
x.m.xts = x.m.xts[-dim(x.m.xts)[1],]
}
# That test isn't quite right, but it's close. It won't work on the first
# day of a new month when the last business day wasn't the last day of
# the previous month. It will work on the second day.
You can use negative subsetting with xts:::last.xts. This will remove the last month:
last(returns, "-1 months")
But you only want to remove the last month if that month hasn't ended yet, so compare the month of the last row with the month of the current date:
if (format(end(returns), "%Y%m") == format(Sys.Date(), "%Y%m"))
returns <- last(returns, "-1 month")
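The same check can be wrapped in a small function so it is easy to try with different dates. This is a sketch, not part of the original answer: the function name drop_incomplete_month, the today parameter and the sample returns are hypothetical, and it drops the final row by position, which matches last(returns, "-1 month") here only because there is one row per month.

```r
library(xts)

# Hypothetical helper: drop the last row when it belongs to the month of `today`
drop_incomplete_month <- function(returns, today = Sys.Date()) {
  if (format(end(returns), "%Y%m") == format(today, "%Y%m")) {
    returns <- returns[-nrow(returns), ]
  }
  returns
}

# Made-up monthly returns; the last one was computed on 2014-06-13
r <- xts(c(0.01, 0.02, -0.01),
         as.Date(c("2014-04-30", "2014-05-30", "2014-06-13")))

# Mid-June: the June row is dropped
print(drop_incomplete_month(r, today = as.Date("2014-06-14")))

# Early July: June counts as complete under this test, so all rows are kept
print(drop_incomplete_month(r, today = as.Date("2014-07-01")))
```

The second call shows the limit of this test: it only asks whether the last row falls in the current month, not whether the price was actually taken at month end.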

How do I subset every day except the last five days of zoo data?

I am trying to extract all dates except for the last five days from a zoo dataset into a single object.
This question is somewhat related to How do I subset the last week for every month of a zoo object in R?
You can reproduce the dataset with this code:
library(zoo)
set.seed(123)
price <- rnorm(365)
# Build the date index directly; cbind() would turn the dates into plain numbers
dates <- seq(as.Date("2013-01-01"), by = "day", length.out = 365)
zoodata <- zoo(price, dates)
For my output, I'm hoping to get a combined dataset of everything except the last five days of each month. For example, if there are 20 days in the first month's data and 19 days in the second month's, I only want to subset the first 15 and 14 days of data respectively.
I tried using the head() function and the first() function to extract the first three weeks, but since each month has a different number of days depending on the month and on leap years, that's not ideal.
Thank you.
Here are a few approaches:
1) as.Date Let tt be the dates. Then we compute a Date vector the same length as tt which has the corresponding last date of the month. We then pick out those dates which are at least 5 days away from that:
tt <- time(zoodata)
last.date.of.month <- as.Date(as.yearmon(tt), frac = 1)
zoodata[ last.date.of.month - tt >= 5 ]
2) tapply/head For each month tapply head(x, -5) to the data and then concatenate the reduced months back together:
do.call("c", tapply(zoodata, as.yearmon(time(zoodata)), head, -5))
3) ave Define revseq which given a vector or zoo object returns sequence numbers in reverse order so that the last element corresponds to 1. Then use ave to create a vector ix the same length as zoodata which assigns such reverse sequence numbers to the days of each month. Thus the ix value for the last day of the month will be 1, for the second last day 2, etc. Finally subset zoodata to those elements corresponding to sequence numbers greater than 5:
revseq <- function(x) rev(seq_along(x))
ix <- ave(seq_along(zoodata), as.yearmon(time(zoodata)), FUN = revseq)
z <- zoodata[ ix > 5 ]
ADDED Solutions (1) and (2).
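As a quick sanity check, approaches (1) and (3) can be run side by side on the 2013 data from the question. This sketch rebuilds the data directly from the dates; the names kept1 and kept3 are made up here.

```r
library(zoo)

set.seed(123)
zoodata <- zoo(rnorm(365),
               seq(as.Date("2013-01-01"), by = "day", length.out = 365))
tt <- time(zoodata)

# Approach (1): keep dates at least 5 days before the last date of their month
kept1 <- zoodata[as.Date(as.yearmon(tt), frac = 1) - tt >= 5]

# Approach (3): reverse sequence number within each month, keep those above 5
revseq <- function(x) rev(seq_along(x))
ix <- ave(seq_along(zoodata), as.yearmon(tt), FUN = revseq)
kept3 <- zoodata[ix > 5]

# Each of the 12 months loses 5 days
length(kept1)
all(time(kept1) == time(kept3))
```

With a full year of daily data both approaches keep 365 - 12 * 5 = 305 observations. They only coincide because every calendar day is present: (1) measures distance from the calendar month end, while (3) drops the last five rows of each month whatever their dates are.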
This works exactly the same way as in the answer to your other question: split the dataset by month, then remove the last 5 days of each month by adding a "-" to the period:
library(xts)
xts.data <- as.xts(zoodata)
lapply(split(xts.data, "months"), last, "-5 days")
And the same way, if you want it on one single object:
do.call(rbind, lapply(split(xts.data, "months"), last, "-5 days"))

Replace values in an xts object according to some events on specific dates in R

I have two signal series and a data series as below.
BuyDates<-seq(as.Date("2013/1/1"), as.Date("2013/3/1"), by = "5 days")
SellDates<-seq(as.Date("2013/1/1"), as.Date("2013/3/1"), by = "7 days")
data<- xts(c(rnorm(32,100,3)),seq(as.Date("2013/1/1"), as.Date("2013/2/1"), by = "days"))
What I want is this: on the dates where data gets a buy signal from BuyDates, the value of data should be replaced by 1, and on the dates in SellDates it should be replaced by -1. On the remaining days in the sequence, the 1 or -1 should be carried forward until the opposite signal arrives, and the days before the first signal should be NA.
Any help is appreciated.
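You can assign by date as usual. Because the sell assignment runs second, a date that appears in both series (such as 2013/1/1 here) ends up as -1: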
data<- xts(rep(NA, 32),seq(as.Date("2013/1/1"), as.Date("2013/2/1"), by = "days"))
data[BuyDates] <- 1
data[SellDates] <- -1
Then you can carry the non-NA values forward using na.locf:
na.locf(data)
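The logic of the two steps (mark the signal dates, then carry the last signal forward) can also be written out in base R on a short made-up series. The dates and variable names below are invented for illustration, and the cummax indexing trick stands in for na.locf.

```r
# Made-up example: 15 days with two buy signals and one sell signal
dates      <- seq(as.Date("2013-01-01"), as.Date("2013-01-15"), by = "day")
buy_dates  <- as.Date(c("2013-01-03", "2013-01-11"))
sell_dates <- as.Date("2013-01-07")

# Step 1: mark signal days, leave everything else NA
signal <- rep(NA_real_, length(dates))
signal[dates %in% buy_dates]  <- 1
signal[dates %in% sell_dates] <- -1

# Step 2: carry the last signal forward.
# idx holds the position of the most recent non-NA value (0 if none so far)
idx <- cummax(seq_along(signal) * !is.na(signal))
position <- c(NA, signal)[idx + 1]

print(data.frame(dates, signal, position))
```

The first two days stay NA because no signal has occurred yet; after that the position is 1 from January 3, -1 from January 7, and 1 again from January 11.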
