How does R handle entering and leaving positions with quantmod?

library(quantmod)
library(PerformanceAnalytics)
s <- getSymbols("SPY", auto.assign = FALSE)["2012::"]  # SPY OHLC data from 2012 onward
s$sma20 <- SMA(Cl(s), 20)                              # 20-day simple moving average of the close
s$position <- ifelse(Cl(s) > s$sma20, 1, -1)           # long (+1) above the SMA, short (-1) below
myReturn <- lag(s$position) * dailyReturn(s)           # yesterday's position earns today's return
charts.PerformanceSummary(cbind(dailyReturn(s), myReturn))  # buy-and-hold vs. strategy
I found the code above in a relatively old Stack Overflow answer. It implements a simple strategy that trades on whether the close is above its 20-day SMA.
What does it mean for this code to enter a trade? Does it enter a new trade every day the close is above the 20-day SMA, or does it enter once when the close crosses above the SMA and then exit when it falls back below?
I don't see the portion of the code where it leaves the trade.
Simple question, but I wasn't sure how R calculates the returns on strategies like this.
Thank you in advance.

After searching and researching: the strategy is always in the market, flipping between long (+1) and short (-1), so there is no separate exit. dailyReturn(s) gives SPY's return for each day, and the lagged position (what you held coming into the day) is multiplied by that daily return to obtain the strategy's daily returns.
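To see exactly where the strategy flips sides, you can look at changes in the position column. A minimal sketch, reusing s and the column names defined in the code above:

# A "trade" happens whenever the position flips between +1 (long) and -1 (short).
# diff() marks those flip days; everything in between is just holding.
flips <- diff(s$position)
trade_days <- index(s)[which(flips != 0)]
head(trade_days)  # dates on which the strategy switched sides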

Related

In Surv(start_time, end_time, new_death) : Stop time must be > start time, NA created

I am using the package "survival" to fit a cox model with time intervals (intervals are 30 days long). I am reading the data in from an xlsx worksheet. I keep getting the error that says my stop time must be greater than my start time. The start values are all smaller than the stop values.
I checked to make sure these are being read in as numbers which they are. I also changed them to integers which did not solve the problem. I used this code to see if any observations met this criterion:
a <- a1[which(a1$end_time > a1$start_time),]
Only about half the dataset meets this criterion, yet when I look at the data all the start times appear to be less than the end times.
Does anyone know why this is happening and how I can fix it? I am an R newbie so perhaps there is something obvious I don't know about?
model1 <- survfit(Surv(start_time, end_time, censor) ~ exp, data = a1, weights = weight)
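A quick way to see exactly which rows Surv() is rejecting is to pull out the complement of the filter above, including any NAs the xlsx import may have silently introduced. A minimal diagnostic sketch, assuming the a1 data frame from the question:

# Rows where stop <= start, or where either time is NA after the xlsx import
bad <- which(!(a1$end_time > a1$start_time) | is.na(a1$start_time) | is.na(a1$end_time))
a1[bad, c("start_time", "end_time")]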

Increasing Bootstrap size in R

I have a time series of returns that is approximately 20 years long. Based on this series, I want to compute some kind of moving (expanding) bootstrap to estimate the mean return at every observation.
Let me demonstrate this on an example:
Let's say we have data starting at 01.01.1990, and I want to compute the bootstrap means starting at 01.01.1991.
At 01.01.1991 I want to compute the mean based on the returns between 01.01.1990 and 01.01.1991.
Then, on 02.01.1991, I also want to take the return of 02.01.1991 into account, and therefore want to calculate the bootstrap mean based on the returns from 01.01.1990 to 02.01.1991.
To sum up, the window of data for my bootstrap should grow by one observation at each step through the time series.
I hope that you can understand what I am trying to say.
I would appreciate any help.
Cheers
Sven
So I managed to answer the question myself.
Let's say we want the bootstrap means starting at 01.01.1991, which is the 300th observation in our sample
(overall we have 1000 observations in our time series).
Then the code is the following:
h <- rep(1, 1000)  # placeholder vector, one slot per observation
for (i in 300:1000) {
  # resample everything observed up to day i, with replacement, and take the mean
  h[i] <- mean(sample(rawdata$retoil[1:i], 5000, replace = TRUE))
}
The first 299 rows of h are still 1 (the loop starts at i = 300) and can be deleted at the end.
Hope I could help some of you :)
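For reference, the mean of one large resample only approximates the plain sample mean; if you want an actual bootstrap distribution per day, one conventional variation (same expanding window, still assuming rawdata$retoil) is:

# For each day i, draw B resamples of the returns seen so far,
# keep the mean of each resample, and average the bootstrap means
B <- 1000
h <- rep(NA_real_, 1000)
for (i in 300:1000) {
  boot_means <- replicate(B, mean(sample(rawdata$retoil[1:i], i, replace = TRUE)))
  h[i] <- mean(boot_means)
}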

Can't get discount.rate function to calculate rate of return

I'm trying to use the discount.rate function in the package FinCal to calculate the rate of return and it doesn't seem to work for me.
discount.rate(n=360,pv=-100000,fv=0,pmt=500,type=0)
n = 360 means there are 360 payments (in other words, a 30-year loan with monthly payments)
pv = present value (the bank gives a borrower $100,000 to purchase a home)
pmt = monthly payment
fv = future value (set to 0 because the bank pays out $100,000 up front and after 30 years receives nothing back except the monthly mortgage payments)
type = 0 means that payments are made at the end of each period
I get the following error:
Error in uniroot(function(r) fv.simple(r, n, pv) + fv.annuity(r, n, pmt, :
  f.upper = f(upper) is NA
I used the same values in a similar finance function in SAS and it worked fine. Thanks for any help.
Per one user's suggestion, I also tried it in Excel, where it works fine as well. So it works in SAS and Excel, but not in R.
The FinCal package does not let you pass the compounding/discounting frequency (12 per year in this case) as an argument to discount.rate(), and this seems to be causing the issue. If you convert pmt to a yearly amount (6000 = 12 * 500) and set n = 30, the function gives you 4.31%, which is the stated annual rate:
discount.rate(n=30, pv= -100000, fv=0, pmt=6000, type=0)
You then use the ear(r, m) function from the same package with m = 12 and r = 0.04306572 to get the Effective Annual Rate (EAR) of 4.39%:
ear(0.04306572,12)
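For reference, ear() is just the standard effective-rate identity, so the conversion is easy to check by hand:

# EAR = (1 + r/m)^m - 1, with r the stated annual rate and m periods per year
(1 + 0.04306572 / 12)^12 - 1  # ~0.0439, i.e. the 4.39% above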
Hope this helps.
It was caused by uniroot(). The default search interval in discount.rate() was (1e-10, 1e10); when I changed it to (1e-4, 1), I got 0.003683461.
discount.rate(n=360,pv=-100000,fv=0,pmt=500,type=0,lower=0.0001, upper = 1)
[1] 0.003683461
I added two new parameters, 'lower' and 'upper', to discount.rate() so you can now try different intervals. You need to reinstall the FinCal package:
library("devtools")
install_github("felixfan/FinCal") # from GitHub, now
or
install.packages("FinCal",dependencies=TRUE) # from CRAN, several days later
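To see why the interval matters, note from the error message that discount.rate() hands uniroot() the time-value-of-money identity; with upper = 1e10 the compounding term overflows, so f(upper) evaluates to NA. A standalone sketch of the same root-finding (a hypothetical helper, assuming end-of-period payments, i.e. type = 0):

# The periodic rate r solves: pv*(1+r)^n + pmt*((1+r)^n - 1)/r + fv = 0
tvm_rate <- function(n, pv, fv, pmt, lower = 1e-6, upper = 1) {
  f <- function(r) pv * (1 + r)^n + pmt * ((1 + r)^n - 1) / r + fv
  uniroot(f, interval = c(lower, upper))$root
}
tvm_rate(n = 360, pv = -100000, fv = 0, pmt = 500)  # monthly rate, roughly 0.37%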

Using xts with slightly different date structures

I'm working on implementing a finance model in R. I'm using quantmod::getSymbols(), which is returning a xts object. I'm using both stock data from google (or yahoo) and economic/yield data from FRED. Right now I'm receiving errors for non-conformable arrays when attempting to do a comparison.
require(quantmod)
fiveYearsAgo = Sys.Date() - (365 * 5)
bondIndex <- getSymbols("LQD",src="google",from = fiveYearsAgo, auto.assign = FALSE)[,c(0,4)]
bondIndex$score <- 0
bondIndex$low <- runMin(bondIndex,365)
bondIndex$high <- runMax(bondIndex,365)
bondIndex$score <- ifelse(bondIndex > (bondIndex$low * 1.006), bondIndex$score + 1, bondIndex$score)
# Error in `>.default`(bondIndex, (bondIndex$low * 1.006)) :
# non-conformable arrays
bondIndex$score <- ifelse(bondIndex < (bondIndex$high * .994), bondIndex$score - 1, bondIndex$score)
# Error in `<.default`(bondIndex, (bondIndex$high * 0.994)) :
# non-conformable arrays
print (bondIndex$score)
I added the following before the offending line:
print (length(bondIndex))
print (length(bondIndex$low))
print (length(bondIndex$high))
My results were 5024, 1256, and 1256. I want them to be the same length, so that every day has the close, 52-week high, and 52-week low. I additionally want to add more data so the days also have a 50-day moving average. Further still, what really put an ax in my progress was bringing in yield data from FRED. My theory is that the stock and bond markets have different holidays, resulting in slightly different sets of trading days with data. In this case, I'd like to na.spline() the missing data.
I know I'm going about this the wrong way; what's the best way to do what I'm attempting? I want each row to be a day, with columns for the close price, high, low, moving average, a few different yields for that day, and finally a "score" computed daily from the other columns.
Thanks for the help and let me know if you want or need more information.
You need to tell your comparison which column you want. Right now you are asking whether bondIndex as a whole (all four columns, which is why length() reported 5024 = 4 * 1256) is greater or less than low or high. That doesn't make sense. Presumably you want bondIndex[,1], aka bondIndex$LQD.Close:
bondIndex$score <- ifelse(bondIndex[,1] > (bondIndex$low * 1.006), bondIndex$score + 1, bondIndex$score)
bondIndex$score <- ifelse(bondIndex[,1] < (bondIndex$high * .994), bondIndex$score - 1, bondIndex$score)
As a side note, Sys.Date() - (365 * 5) is not five years ago (hint, leap years). This will be a bug that might bite you down the line.
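On the FRED part of the question: merge() is the idiomatic xts way to line up series whose trading calendars differ, after which na.spline() can fill the holiday gaps. A minimal sketch, using DGS10 (10-year Treasury constant-maturity yield) as an assumed example series:

# Left-join the yield onto the equity index's dates, then spline the gaps
yield <- getSymbols("DGS10", src = "FRED", auto.assign = FALSE)
combined <- merge(bondIndex, yield, join = "left")  # keep only bondIndex's dates
combined$DGS10 <- na.spline(combined$DGS10)         # interpolate holiday mismatches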

Time Series Clustering in R

I have two time series- a baseline (x) and one with an event (y). I'd like to cluster based on dissimilarity of these two time series. Specifically, I'm hoping to create new features to predict the event. I'm much more familiar with clustering, but fairly new to time series.
I've tried a few different things with a limited understanding...
Simulating data...
x<-rnorm(100000,mean=1,sd=10)
y<-rnorm(100000,mean=1,sd=10)
This package seems awesome but there is limited information available on SO or Google.
library(TSclust)
d <- diss.ACF(x, y)
the value of d is
[,1]
[1,] 0.07173596
I then move on to clustering...
hc <- hclust(d)
but I get the following error:
Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") :
missing value where TRUE/FALSE needed
My assumption is this error is because I only have one value in d.
Alternatively, I've tried the following on a single timeseries (the event).
library(dtw)
distMatrix <- dist(y, method="DTW")
hc <- hclust(distMatrix, method="complete")
but it takes FOREVER to run the distance matrix.
I have a couple of guesses at what is going wrong, but could use some guidance.
My questions...
Do I need a set of baseline and a set of event time series? Or is one pairing ok to start?
My time series are quite large (100000 rows). I'm guessing this is causing the SLOW distMatrix calculation. Thoughts on this?
Any resources on applied clustering on large time series are welcome. I've done a pretty thorough search, but I'm sure there are things I haven't found.
Is this the code you would use to accomplish these goals?
Thanks!
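On the first question and the hclust() error: diss.ACF(x, y) returns a single pairwise dissimilarity, which is exactly why hclust() fails; clustering needs a full dist object over several series. A minimal sketch using TSclust's diss() on toy data (short series, for speed):

library(TSclust)
set.seed(1)
# Rows are series; in practice these would be your baseline/event windows
series <- rbind(a = rnorm(500), b = rnorm(500, mean = 2), c = cumsum(rnorm(500)))
d <- diss(series, METHOD = "ACF")  # pairwise ACF dissimilarities, a dist object
hc <- hclust(d)
plot(hc)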
