Wrong week-ending date using 'to.weekly' function in 'xts' package - r

I have a really odd issue. I am using the to.weekly and to.period functions to convert a daily xts object to weekly data. In most cases the week-ending date is a Friday (day.of.week returns 5), e.g. "2010-01-08" or "2011-02-11", but in a few cases I get a day other than Friday (Saturday, Sunday, Thursday, etc.).
I have tried both to.weekly(x) and to.period(x, period = 'weeks'), and both show the same problem.
Why is this happening? Is there a workaround?
Thanks!!
[EDIT: EXAMPLE BELOW]
library(xts)
library(chron) # provides month.day.year() and day.of.week()
test.dates <- as.Date(c("2010-04-27","2010-04-28","2010-04-29","2010-04-30","2010-05-03","2010-05-04","2010-05-05","2010-05-06","2010-05-07","2010-05-10","2010-05-11","2010-05-12","2010-05-13","2010-05-14","2010-05-17","2010-05-18","2010-05-19","2010-05-20","2010-05-21","2010-05-22","2010-05-24","2010-05-25","2010-05-26","2010-05-27","2010-05-28","2010-06-01","2010-06-02","2010-06-03","2010-06-04"))
test.data <- rnorm(length(test.dates),mean=1,sd=2)
test.xts <- xts(x=test.data,order.by=test.dates)
#Function that takes a vector of dates (e.g. the index of a zoo/xts object, such as "2010-01-08") and returns the day of the week for each
dayofweek <- function(x) {
placeholder <- vector("list",length=length(x))
names(placeholder) <- x
for(i in 1:length(x)) {placeholder[[i]] <- month.day.year(x[i])}
placeholder2 <- rep(NA,times=length(x))
for(i in 1:length(x)) {placeholder2[i] <- day.of.week(placeholder[[i]][[1]],placeholder[[i]][[2]],placeholder[[i]][[3]])}
return(placeholder2)}
This returns the date(s) that are not Friday: time(to.weekly(test.xts))[dayofweek(time(to.weekly(test.xts))) != 5]

There are two problems with your example:
Your dayofweek function is cumbersome and probably returns incorrect results.
Your example dates are irregular: they include 2010-05-22 (a Saturday) but not 2010-05-23 (the Sunday that ends that week).
Here is a cleaned-up version of your code:
library(xts)
test.dates <- as.Date(c("2010-04-27","2010-04-28","2010-04-29","2010-04-30","2010-05-03","2010-05-04","2010-05-05","2010-05-06","2010-05-07","2010-05-10","2010-05-11","2010-05-12","2010-05-13","2010-05-14","2010-05-17","2010-05-18","2010-05-19","2010-05-20","2010-05-21","2010-05-22","2010-05-24","2010-05-25","2010-05-26","2010-05-27","2010-05-28","2010-06-01","2010-06-02","2010-06-03","2010-06-04"))
test.data <- rnorm(length(test.dates),mean=1,sd=2)
test.xts <- xts(x=test.data,order.by=test.dates)
test.weekly <- to.weekly(test.xts)
library(lubridate)
test.weekly[wday(test.weekly, label = TRUE, abbr = TRUE) != "Fri"]
The only row this returns is
test.xts.Open test.xts.High test.xts.Low test.xts.Close
2010-05-22 -1.705749 1.273982 -2.084203 -1.502611
The problem, of course, is that this week ends on 2010-05-23, but that date is not present in the time series. Therefore to.weekly uses the last available date in that week, 2010-05-22 (a Saturday), as the end point. This is the source of your problem.
Here is a better example, built from every weekday between 1900 and 2011, which reveals no issue with the to.weekly function.
library(lubridate); library(xts)
test.dates <- seq(as.Date("1900-01-01"),as.Date("2011-10-01"),by='days')
test.dates <- test.dates[wday(test.dates)!=1 & wday(test.dates)!=7] #Remove weekends
test.data <- rnorm(length(test.dates),mean=1,sd=2)
test.xts <- xts(x=test.data,order.by=test.dates)
test.weekly <- to.weekly(test.xts)
test.weekly[wday(test.weekly, label = TRUE, abbr = TRUE) != "Fri"]
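The grouping behind this can be sketched in base R, without xts. This is only an illustration: it assumes weeks run Monday to Sunday, and week_id, ends and week_end are made-up names. Each week is labelled by the last date actually present in the data, so a stray Saturday becomes the week-ending date.

```r
# A subset of the question's dates: one week containing a Saturday, one normal week
test.dates <- as.Date(c("2010-05-17", "2010-05-18", "2010-05-19", "2010-05-20",
                        "2010-05-21", "2010-05-22", "2010-05-24", "2010-05-25",
                        "2010-05-26", "2010-05-27", "2010-05-28"))

# Day 0 (1970-01-01) was a Thursday, so shifting by 3 days makes the
# integer week number change on Mondays (assumed Monday-to-Sunday weeks)
week_id <- (as.integer(test.dates) + 3L) %/% 7L

# Last available date within each week
ends <- vapply(split(as.integer(test.dates), week_id), max, integer(1))
week_end <- as.Date(unname(ends), origin = "1970-01-01")

# Numeric weekday: 0 = Sunday, ..., 5 = Friday, 6 = Saturday
week_end_wday <- as.POSIXlt(week_end)$wday

print(data.frame(week_end, week_end_wday))
```

Dropping the Saturday observation before aggregating would make the first week end on Friday 2010-05-21 as well.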

Related

lapply with multiple function arguments

library(quantmod)
library(xts)
getSymbols("SY1.DE", from = "2019-4-10", to = "2019-4-19", auto.assign = TRUE)
getSymbols("PEP", from = "2019-4-9", to = "2019-4-19", auto.assign = TRUE)
calcreturn <- function(data, amount = 24) {
start <- as.numeric(data[,4][1])
end <- as.numeric(data[,4][nrow(data)])
difference <- end - start
winning <- difference * amount
return(winning)
}
allstocks <- list(SY1.DE, PEP)
amount <- list(24, 23)
lapply(allstocks, calcreturn)
Hello everybody!
This is my code to calculate the returns on my stocks. However, the number of shares I bought differs per stock, so lapply only works when the amount argument does not change. Is there a way to deal with changing arguments?
Thank you!
You can modify your lapply call to run over an index, pairing each stock with its amount:
lapply(1:length(allstocks), function(x) calcreturn(allstocks[[x]], amount[[x]]))
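An alternative is to iterate over both lists in parallel with Map (or mapply for a simplified vector). The sketch below replaces the downloaded series with two small made-up data frames, stock_a and stock_b, whose fourth column plays the role of the close price, so it runs without quantmod.

```r
# Hypothetical stand-ins for the downloaded price series (column 4 = close)
stock_a <- data.frame(Open = c(10, 11), High = c(12, 15), Low = c(9, 10), Close = c(11, 14))
stock_b <- data.frame(Open = c(21, 19), High = c(22, 20), Low = c(19, 17), Close = c(20, 18))

calcreturn <- function(data, amount = 24) {
  start <- as.numeric(data[1, 4])           # first close
  end   <- as.numeric(data[nrow(data), 4])  # last close
  (end - start) * amount
}

allstocks <- list(stock_a, stock_b)
amount    <- list(24, 23)

# Map pairs the i-th stock with the i-th amount and returns a list
winnings <- Map(calcreturn, allstocks, amount)

# mapply does the same but simplifies the result to a numeric vector
winnings_vec <- mapply(calcreturn, allstocks, amount)
print(winnings_vec)
```

If you keep the index-based version, seq_along(allstocks) is safer than 1:length(allstocks) because it yields an empty sequence for an empty list.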

Perform test, get result from another column for the same row

I'm creating a function, but I need some help with best practices.
Active.Test <- function(date) {
date <- rep(date,length(df$Start.Date))
active <- rep(0,length(df$Start.Date))
active[date > df$Start.Date & date < df$End.Date] <- 1
active[df$Start.Date == df$End.Date ] <- df$Active.Time
return (active)
}
I basically want to check if a date (which is passed to the function) is between the start and end date in my data frame. If it is, assign a 1. If the start and end dates are equal, get the result from the same row in Active.Time column. Everything else has a default value of 0.
This returns an error as it's retrieving a vector which is of a different size for the second test.
I can re-write the above as:
Active.Test <- function(date) {
date <- rep(date,length(df$Start.Date))
active <- rep(0,length(df$Start.Date))
active[date > df$Start.Date & date < df$End.Date] <- 1
active[df$Start.Date == df$End.Date] <- df$Active.Time[df$Start.Date == df$End.Date]
return (active)
}
This gets the correct elements from the Active.Time column, but it doesn't seem an elegant way to write it. I'm also guessing it's slower, as I'm performing the same comparison twice.
Could you please help me rewrite this using best practices?
EDIT: Here's some code to create a few rows of data and then test the function by checking whether the start and end dates encompass 25/05/2016.
#Create a data frame
df <- data.frame(End.Date = as.Date(c("1/05/2016","28/05/2016", "25/05/2016"), format = "%d/%m/%Y"), Start.Date = as.Date(c("20/04/2016 11:00","20/05/2016 23:00", "25/05/2016 10:00"), format = "%d/%m/%Y" ), Active.Time = as.numeric(c(0.5,0.4,0.8)))
#Test the function
df$new <- Active.Test(as.Date("25/05/2016", format = "%d/%m/%Y"))
Thanks
# Using the data.table approach
library(data.table)
# Make data table instead of data.frame (you can also do as.data.table(df) to get a data.table)
my_dt <- data.table(Start.Date=as.Date(c("20/04/2016 11:00","20/05/2016 23:00", "25/05/2016 10:00"), format = "%d/%m/%Y" ),
End.Date=as.Date(c("1/05/2016","28/05/2016", "25/05/2016"), format = "%d/%m/%Y"),
Active.Time = as.numeric(c(0.5,0.4,0.8))
)
setkey(my_dt)
# Sample date to test
datte <- as.Date("25/05/2016", format = "%d/%m/%Y")
# Create function with conditions and result to return
Active.Test <- function(datte, Start.Date, End.Date, Active.Time) {
if(datte > Start.Date & datte < End.Date){
return(1)
}
else if(Start.Date==End.Date){
return(Active.Time)
}
else{return(0)}
}
# Test function
my_dt[, res:=Active.Test(datte, Start.Date, End.Date, Active.Time), by=1:nrow(my_dt)]
See the data.table vignette for more on data.table. Also, in your function above, note the warning you get when you run df$new <- Active.Test(as.Date("25/05/2016", format = "%d/%m/%Y"))!
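For comparison, here is a fully vectorised base R sketch that avoids both the row-by-row call and the repeated comparison. The function name active_test and its argument names are made up for illustration; the data are the three rows from the question, with the dates written in ISO format.

```r
df <- data.frame(
  Start.Date  = as.Date(c("2016-04-20", "2016-05-20", "2016-05-25")),
  End.Date    = as.Date(c("2016-05-01", "2016-05-28", "2016-05-25")),
  Active.Time = c(0.5, 0.4, 0.8)
)

# Hypothetical helper: each condition is computed once and reused
active_test <- function(date, start, end, active_time) {
  same_day <- start == end               # rows that take their value from Active.Time
  inside   <- date > start & date < end  # rows where the date falls strictly between
  ifelse(same_day, active_time, as.numeric(inside))
}

df$new <- active_test(as.Date("2016-05-25"), df$Start.Date, df$End.Date, df$Active.Time)
print(df)
```

Passing the columns as arguments, rather than reading df inside the function, also makes the function usable on other data frames.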

Rollapply backwards time series in R

I need to fill in historical prices backwards, knowing the returns (in the real situation they are simulated).
So far I have this code:
library(quantmod)
getSymbols("AAPL")
df = AAPL["2014-01-01/2015-01-01", "AAPL.Close"]
df_ret = diff(log(df),1)
# imagine the half of the past prices are missing
df["2014-01-01/2014-07-01"] = NA
df_tot = cbind(df, df_ret)
fillBackwards = function(data, range_to_fill){
index_array = index(data[range_to_fill,])
data_out = data
for (i in (length(index_array)-1):1){
inx = index_array[i]
inx_0 = index_array[i+1]
data_out[inx,1] = exp(-(data_out[inx_0,2]))*(data_out[inx_0,1])
}
return (data_out)
}
df_filled = fillBackwards(df_tot,"2014-01-01/2014-07-02")
sum(AAPL["2014-01-01/2015-01-01", "AAPL.Close"] - df_filled[,1]) # zero up to computation error, i.e. identical
This works perfectly, but it is a bit slow. Could you please suggest something using the built-in rollapply()?
# i want something like this
df_filled = rollapply(df_tot["2014-07-02/2014-01-01",], by=-1, function(x) {....})
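You don't need rollapply or a loop. You can use cumprod on the returns. Here's a version of fillBackwards that uses cumprod:
fillBackwards <- function(data, range_to_fill) {
# rows to fill, plus the first known price as the last row
data_range <- data[range_to_fill,]
# returns from newest to oldest, dropping the first row's return
returns <- rev(coredata(data_range[-1L, 2L]))
# the known price that anchors the reconstruction
last_price <- drop(coredata(last(data_range[, 1L])))
# walk backwards: p[t-1] = p[t] * exp(-r[t]), then restore chronological order
new_prices <- rev(last_price * cumprod(exp(-returns)))
# the known price is the last row of the range, so it goes at the end
data[range_to_fill, 1L] <- c(new_prices, last_price)
return(data)
}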
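The cumprod idea can be checked on a small plain numeric vector, without quantmod or xts. The prices below are made up, and known_from is a hypothetical name for the position of the first price that is still known.

```r
prices  <- c(100, 102, 101, 105, 107)   # made-up "true" prices
returns <- c(NA, diff(log(prices)))     # log return at t is log(p[t] / p[t-1])

known_from <- 4                         # pretend prices 1..3 are missing
anchor <- prices[known_from]

# Returns for positions known_from, ..., 2, from newest to oldest
back_returns <- rev(returns[2:known_from])

# p[t-1] = p[t] * exp(-r[t]); cumprod chains this backwards from the anchor,
# and rev() puts the rebuilt prices back in chronological order
filled <- rev(anchor * cumprod(exp(-back_returns)))

reconstructed <- c(filled, prices[known_from:length(prices)])
print(reconstructed)
```

The rebuilt values match the original prices up to floating-point error, which is the same check the question performs with sum().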

Converting date in For Loop in R - origin must be supplied

I have a sequence of dates in R, and for each date I need to get the year, month, and day. I tried to use the strftime function to print out the year, but R behaves very strangely. This code fails:
# sequence of dates
dates <- seq(as.Date("1987-03-29"), as.Date("1991-12-31"), by=1)
# this fails with "'origin' must be supplied" error:
for (d in dates) {
year <- strftime(d, "%Y")
print(year)
}
The exact error message is: Error in as.POSIXlt.numeric(x, tz = tz) : 'origin' must be supplied
On the other hand, this code works without any error:
# sequence of dates
dates <- seq(as.Date("1987-03-29"), as.Date("1991-12-31"), by=1)
# this works
for (i in 1: length(dates)) {
year <- strftime(dates[i], "%Y")
print(year)
}
Why does the first example fail and the second example works? I suspect that in the first example R is trying to convert my date to some kind of POSIXct object and in the second example it doesn't? I'm confused why there's any difference and I'd appreciate an explanation of what's going on. I'm using R version 3.2.2.
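The for loop strips the class attribute from dates, so d is a plain number (days since 1970-01-01) rather than a "Date" object, and strftime cannot convert it without an origin. Here are two approaches.
The comments have been removed below, and only the lines marked ## have changed.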
1) list Use a list like this:
dates <- seq(as.Date("1987-03-29"), as.Date("1991-12-31"), by=1)
for (d in as.list(dates)) { ##
year <- strftime(d, "%Y")
print(year)
}
2) as.Date or convert d back to "Date" class.
dates <- seq(as.Date("1987-03-29"), as.Date("1991-12-31"), by=1)
for (d in dates) {
year <- strftime(as.Date(d, origin = "1970-01-01"), "%Y") ##
print(year)
}
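If the loop is only there to extract the date parts, a vectorised sketch avoids the problem altogether, because format() is applied to the whole "Date" vector at once. The shorter date range and the names loop_classes and parts are just for illustration.

```r
dates <- seq(as.Date("1987-03-29"), as.Date("1987-04-02"), by = 1)

# Show what the loop variable actually is: the Date class is dropped
loop_classes <- character(0)
for (d in dates) loop_classes <- c(loop_classes, class(d))
print(unique(loop_classes))

# Vectorised extraction of year, month and day
parts <- data.frame(
  year  = as.integer(format(dates, "%Y")),
  month = as.integer(format(dates, "%m")),
  day   = as.integer(format(dates, "%d"))
)
print(parts)
```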

Find first Tuesday of Month

I am trying to write a function which takes a vector of dates as an input and returns a vector of dates -- where the output is the date of the first Tuesday of the month which matches the input date.
So 2012-11-19 --> 2012-11-06, etc.
I have had some success with a single date, but have not been able to generalise to the vector case. Could someone please help?
This is what I have so far:
firstTuesday <- function(tt){
ct <- as.POSIXct(tt)
lt <- as.POSIXlt(tt)
firstOf <- as.POSIXlt(ct - 60*60*24* (lt$mday - 1))
if (firstOf$wday > 2)
{
adjDays <- (9 - firstOf$wday)
firstTues <- as.POSIXlt(as.POSIXct(firstOf) + 60*60*24*adjDays)
}
else {
adjDays <- (2 - firstOf$wday)
firstTues <- as.POSIXlt(as.POSIXct(firstOf) + 60*60*24*adjDays)
}
return(firstTues)
}
Which works for a single date: firstTuesday(Sys.Date()), but yielded junk for vectors of dates (because if is not a vectorised control construct, I think).
I got around my limited understanding by using indexing. The following code seems to do the trick.
firstTuesday <- function(tt){
ct <- as.POSIXct(tt)
lt <- as.POSIXlt(tt)
firstOf <- as.POSIXlt(ct - 60*60*24* (lt$mday - 1))
firstTue <- as.POSIXct(firstOf)
idx <- firstOf$wday > 2
firstTue[idx] <- as.POSIXct(firstOf[idx]) + 60*60*24*(9 - firstOf$wday[idx])
firstTue[!idx] <- as.POSIXct(firstOf[!idx]) + 60*60*24*(2 - firstOf$wday[!idx])
return(firstTue)
}
This uses lubridate and makes the logic a little simpler. Given a vector of dates the second function will return a vector of characters, similar to your input. You can change things around to suit your needs.
library(lubridate)
getTuesday = function(x) {
date = ymd(x)
first = floor_date(date,"month")
dow = sapply(seq(0,6),function(x) wday(first+days(x)))
firstTuesday = first + days(which(dow==3)-1)
return(firstTuesday)
}
getMultipleTuesdays = function(y) {
tmp = lapply(y, getTuesday)
tmp = lapply(tmp, as.character)
return(unlist(tmp))
}
Edit
Sample input/output
getMultipleTuesdays(c("2012-11-19","2012-11-19","2011-01-15"))
[1] "2012-11-06" "2012-11-06" "2011-01-04"
Here's a simple solution using base functions:
firstDayOfMonth <- function(dates, day="Mon", abbreviate=TRUE) {
# first 7 days of month
s <- lapply(as.Date(format(dates,"%Y-%m-01")), seq, by="day", length.out=7)
# first matching weekday within those 7 days
d <- lapply(s, function(d) d[weekdays(d,abbreviate)==day])
# unlist converts to atomic, so use do.call(c,...) instead
do.call(c, d)
}
Well, maybe the do.call at the end isn't so simple... but it's a handy piece of knowledge. :)
R> d <- as.Date(c("2012-11-19","2012-11-19","2011-01-15"))
R> firstDayOfMonth(d, "Tuesday", FALSE)
[1] "2012-11-06" "2012-11-06" "2011-01-04"
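A fully vectorised variant is also possible with plain date arithmetic, which avoids both lapply and locale-dependent weekday names. This is a sketch; first_weekday_of_month is a made-up name, and target uses the POSIXlt convention (0 = Sunday, 2 = Tuesday).

```r
first_weekday_of_month <- function(dates, target = 2L) {
  dates <- as.Date(dates)
  # first day of each date's month
  first <- as.Date(format(dates, "%Y-%m-01"))
  # numeric weekday of that first day: 0 = Sunday, ..., 6 = Saturday
  wd <- as.POSIXlt(first)$wday
  # days to move forward to reach the target weekday (0 if already there)
  first + (target - wd) %% 7
}

d <- as.Date(c("2012-11-19", "2011-01-15", "2013-01-20"))
first_tuesdays <- first_weekday_of_month(d)
print(first_tuesdays)
```

The modulo keeps the offset between 0 and 6, so a month that starts on the target weekday maps to its first day.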
