I've got a data series which is of the form date. open high low and close (prices). I want to create local maxima and minima for the close column of the data. I further want to buy after 2 days of local minima # close and sell after two days of local maxima # close. I further want to calculate the profit and loss for the same. the code for the same is as under.
require(quantmod)
tckr1<-"^NSEI"
start<-Sys.Date()-200
end<- format(Sys.Date(),"%Y-%m-%d") # yyyy-mm-dd
getSymbols(tckr1, from=start, to=end)
data<- NSEI$NSEI.Close
data$n <- 1:nrow(data)
data$z <- ZigZag(data$NSEI.Close , change = 2 , percent = T)
data$level<- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
data$NSEI.Close.1<- NULL
data$n.1<- NULL
data$trade<- lag(data$level,2)
Now i need the data column to tell me when to buy and sell by +1 and -1 and also to calculate the profit and loss for the same. In this above mentioned data i will buy when n= 29 # 5719.70 and when n=36 # 5851.20 etc.
regards
Ashish
require(quantmod)
tckr1<-"^NSEI"
start<-Sys.Date()-200
end<- format(Sys.Date(),"%Y-%m-%d") # yyyy-mm-dd
getSymbols(tckr1, from=start, to=end)
data<- NSEI$NSEI.Close
data$n <- 1:nrow(data)
data$z <- ZigZag(data$NSEI.Close , change = 2 , percent = T)
data$level<- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
ex <- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
data$trade<- data$level
data$trade[is.na(data$level)]<- 0
data$trade[data$trade!=0,]<- c(1,-1)
This way you can get your trade column +/- 1.
Just to provide an answer to the final "P/L" curve part of the question, the below code will generate an equity curve based upon ANUP's code,
require(PerformanceAnalytics)
ex <- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
returns <- ROC(ex$NSEI.Close)*(Lag(ex$trade))
equity <- exp(cumsum(na.trim(returns)))
charts.PerformanceSummary(equity)
Related
I am in interested in finding Pearson correlation coefficients between a list of genes. Basically, I have Affymetrix gene level expression matrix (genes in the rows and sample ID on the columns), and I have annotation data of microarray experiment observation where sample ID in the rows and description identification on the columns.
data
> expr_mat[1:8, 1:3]
Tarca_001_P1A01 Tarca_003_P1A03 Tarca_004_P1A04
1_at 6.062215 6.125023 5.875502
10_at 3.796484 3.805305 3.450245
100_at 5.849338 6.191562 6.550525
1000_at 3.567779 3.452524 3.316134
10000_at 6.166815 5.678373 6.185059
100009613_at 4.443027 4.773199 4.393488
100009676_at 5.836522 6.143398 5.898364
10001_at 6.330018 5.601745 6.137984
> anodat[1:8, 1:3]
V1 V2 V3
1 SampleID GA Batch
2 Tarca_001_P1A01 11 1
3 Tarca_013_P1B01 15.3 1
4 Tarca_025_P1C01 21.7 1
5 Tarca_037_P1D01 26.7 1
6 Tarca_049_P1E01 31.3 1
7 Tarca_061_P1F01 32.1 1
8 Tarca_051_P1E03 19.7 1
goal:
I intend to see how the genes in each sample are correlated with GA value of corresponding samples in the annotation data, then generate sub expression matrix of keeping high correlated genes with target observation data anodat$GA.
my attempt:
gene_corrs <- function(expr_mat, anno_mat){
stopifnot(ncol(expr_mat)==nrow(anno_mat))
res <- list()
lapply(colnames(expr_mat), function(x){
lapply(x, rownames(y){
if(colnames(x) %in% rownames(anno_mat)){
cor_mat <- stats::cor(y, anno_mat$GA, method = "pearson")
ncor <- ncol(cor_mat)
cmatt <- col(cor_mat)
ord <- order(-cmat, cor_mat, decreasing = TRUE)- (ncor*cmatt - ncor)
colnames(ord) <- colnames(cor_mat)
res <- cbind(ID=c(cold(ord), ID2=c(ord)))
res <- as.data.frame(cbind(out, cor=cor_mat[res]))
res <- cbind(res, cor=cor_mat[out])
res <- as.dara.frame(res)
}
})
})
return(res)
}
however, my above implementation didn't return what I expected, I need to filter out the genes by finding genes which has a strong correlation with anodat$GA.
Another attempt:
I read few post about similar issue and some people discussed about using limma package. Here is my attempt by using limma. Here I used anodat$GA as a covariate to fit limma linear model:
library(limma)
fit <- limma::lmFit(expr_mat, design = model.matrix( ~ 0 + anodat$GA)
fit <- eBayes(fit)
topTable(fit, coef=2)
then I am expecting to get a correlation matrix from the above code, and would like to do following in order to get filtered sub expression matrix:
idx <- which( (abs(cor) > 0.8) & (upper.tri(cor)), arr.ind=TRUE)
idx <- unique(c(idx[, 1],idx[, 2])
correlated.genes <- matrix[idx, ]
but I still didn't get the right answer. I am confident about using limma approach but I couldn't figure out what went wrong above code again. Can anyone point me out how to make this work? Is there any efficient way to make this happen?
Don't have your data so hard to double check, but in the abstract I would try this:
library(matrixTests)
cors <- row_cor_pearson(expr_mat, anodat$GA)
which(cors$cor > 0.9) # to get the indeces of genes with correlation > 0.9
I'm new to R and programming in general, and I'm struggling with a for-loop for building the lx function in a life table.
I have the age function x, the death function qx (the probability that someone aged exactly x will die before reaching age x+1), and the surviving function px = 1 - qx.
I want to write a function that returns a vector with all the lx values from first to last age in my table. The function is simple...
I've defined cohort = 1000000. The first age in my table is x = 5, so, considering x = 5...
l_(x) = cohort
And, from now on, l_(x+n) = l_(x+n-1)*p_(x+n-1)
I've searched about for-loops, and I can only get my code working for lx[1] and lx[2], and I get nothing for lx[n] if n > 2.
I wrote that function:
living_x <- function(px, cohort){
result <- vector("double", length(px))
l_x <- vector("double", length(px))
for (i in 1:length(px)){
if (i == 1){
l_x[i] = cohort
}
else l_x[i] = l_x[i-1]*px[i-1]
result[i] = l_x
print(result)
}
}
When I run it, I get several outputs (more than length(px)) and "There were 50 or more warnings (use warnings() to see the first 50)".
When I run warnings(), I get "In result[i] <- l_x : number of items to replace is not a multiple of replacement length" for every number.
Also, everything I try besides it give me different errors or only calculate lx for lx[1] and lx[2]. I know there's something really wrong with my code, but I still couldn't identify it. I'd be glad if someone could give me a hint to find out what to change.
Thank you!
Here's an approach using dplyr from the tidyverse packages, to use px to calculate lx. This can be done similarly in "Base R" using excerpt$lx = 100000 * cumprod(1 - lag(excerpt$qx)).
lx is provided in the babynames package, so we can check our work:
library(tidyverse)
library(babynames)
# Get excerpt with age, qx, and lx.
excerpt <- lifetables %>%
filter(year == 2010, sex == "F") %>%
select(x, qx_given = qx, lx_given = lx)
excerpt
# A tibble: 120 x 3
x qx_given lx_given
<dbl> <dbl> <dbl>
1 0 0.00495 100000
2 1 0.00035 99505
3 2 0.00022 99471
4 3 0.00016 99449
5 4 0.00012 99433
6 5 0.00011 99421
7 6 0.00011 99410
8 7 0.0001 99399
9 8 0.0001 99389
10 9 0.00009 99379
# ... with 110 more rows
Using that data to estimate lx_calc:
est_lx <- excerpt %>%
mutate(px = 1 - qx_given,
cuml_px = cumprod(lag(px, default = 1)),
lx_calc = cuml_px * 100000)
And finally, comparing visually the given lx with the one calculated based on px. They match exactly.
est_lx %>%
gather(version, val, c(lx_given, lx_calc)) %>%
ggplot(aes(x, val, color = version)) + geom_line()
I could do it in a very simple way after thinking for some minutes more.
lx = c()
for (i in 2:length(px)){
lx[1] = 10**6
lx[i] = lx[i-1]*px[i-1]
}
I'm trying to replicate an Excel solver model in R. It's a simple problem to start looking to maximize points with the only constraint being limited the number events than can be played. So I have a two column data frame with a tournament number and project points. In Excel we have a Play Yes/no binary column and multiply it by the points and set to maximize, allowing the model to change the Play Yes/No column to 0 or 1. The constraint limits the sum of the play yes/no variable to the constraint value, for example 25.
library(lpSolve)
tournament<-rep(1:48,1)
mean<-c(12.2,30.4,30.9,44.1,31.3,27.6,31.5,25.0,31.2,24.0,28.0,23.9,14.1,9.5,17.2,37.8,30.5,43.0,32.1,30.7,30.2,37.0,32.1,28.9,23.7,4.6,29.0,29.1,30.7,31.6,49.5,25.1,30.2,10.3,30.3,21.8,88.5,31.0,30.9,2.9,31.1,30.3,29.7,63.7,31.6,91.6,30.6,31.0)
aggdata<-data.frame(tournament,mean)
maxevents <-25
obj<-aggdata$mean
con <- rep(1,nrow(aggdata))
dir <- c("==")
rhs <- maxevents
result <- lp("max", obj, con, dir, rhs, all.bin = TRUE)
The result looks at only 3 rows of the data frame and it should look at the top 25. Eventually, I'll add additional constraints as I know lp is not required for this simple example, but need to get past this roadblock first.
library(lpSolve)
#objective function
obj <- rep(1, nrow(aggdata))
#constraints
con <- matrix(c(obj <- rep(1, nrow(aggdata)),
as.vector(aggdata$point)), nrow = 2, byrow = T) #you can add another constraints here and make 'nrow' equals to number of total constraints
dir <- c("==", "<=")
rhs <- c(25, #total number of tournament
1000) #let's assume that total points can't exceeds 1000
#optimization solution
result <- lp ("max", obj, con, dir, rhs, all.bin=TRUE)
result$solution
Sample data:
aggdata <- data.frame(tournament = rep(1:48,1),
point = c(12.2,30.4,30.9,44.1,31.3,27.6,31.5,25.0,31.2,24.0,28.0,23.9,14.1,
9.5,17.2,37.8,30.5,43.0,32.1,30.7,30.2,37.0,32.1,28.9,23.7,4.6,
29.0,29.1,30.7,31.6,49.5,25.1,30.2,10.3,30.3,21.8,88.5,31.0,30.9,
2.9,31.1,30.3,29.7,63.7,31.6,91.6,30.6,31.0))
# tournament point
#1 1 12.2
#2 2 30.4
#3 3 30.9
#4 4 44.1
#5 5 31.3
#6 6 27.6
I new new to R and am trying to program a pair trading strategy in R.
I have already written the code for downloading the data. And have created additional columns and prepared the data. Now i need to calculate the trading signals.
My signal rules are as follows.
- If Z-Score is greater than 2.25 , Sell the pair; Buy back when Z-Score is less than 0.25.
- If Z-Score is less than -2.25 , Buy the pair; sell (Exit) when z-score is above -0.25.
- close any open position if there is a change in signal.
When we sell a pair, we sell the first stock and buy the second stock. In this case, we sell ACC and Buy Ambujacem.
When we buy a pair, we buy the first stock and sell the second stock. In this case, we buy ACC and Sell Ambujacem.
Could anyone help me with the coding for the trading signals.
Enclosing the code.
Regards,
Subash
# Trading Code
library(quantmod)
getSymbols("ACC.NS", from=as.Date('2007-01-01'), to=as.Date('2015-07-24'))
getSymbols("AMBUJACEM.NS", from=as.Date('2007-01-01'), to=as.Date('2015-07-24'))
acc=ACC.NS[,6]
amb=AMBUJACEM.NS[,6]
t.zoo <- merge(acc, amb, all=TRUE)
t.zoo=as.data.frame(t.zoo)
typeof(t.zoo)
t.zoo=na.omit(t.zoo)
#adding columns
t.zoo$spread <- 0
t.zoo$adfTest <- 0
t.zoo$mean <- 0
t.zoo$stdev <- 0
t.zoo$zScore <- 0
t.zoo$signal <- 0
t.zoo$BuyPrice <- 0
t.zoo$SellPrice <- 0
t.zoo$LongReturn <- 0
t.zoo$ShortReturn <- 0
t.zoo$Slippage <- 0
t.zoo$TotalReturn <- 0
#preparing the data
#Calculating the pair ratio
t.zoo$pairRatio <- t.zoo$ACC.NS.Adjusted/t.zoo$AMBUJACEM.NS.Adjusted
#Calculate the log prices of the two time series
t.zoo$LogA <- log10(t.zoo$ACC.NS.Adjusted)
t.zoo$LogB <- log10(t.zoo$AMBUJACEM.NS.Adjusted)
#Calculating the spread
t.zoo$spread <- t.zoo$ACC.NS.Adjusted/t.zoo$AMBUJACEM.NS.Adjusted
#Calculating the mean
# Computes the mean using the SMA function
# choose the number of days for calculating the mean
SMAdays = 20
t.zoo$mean <- SMA(t.zoo$spread,SMAdays)
#Calculating the Std Deviation
t.zoo$stdev <- rollapply(t.zoo$spread,20,sd, fill=NA, align='right')
#Calculating the Z Score
t.zoo$zScore <- (t.zoo$pairRatio - t.zoo$mean)/t.zoo$spread
View(t.zoo)
#Calculation of trading signals and trading prices
#Trigger sell or buy signal if Z Score moves above 2.25 or below -2.25.
# Close position if Z Score reaches 0.2 or -0.2.
# close any open position if there is a change in signal.
I think the main issue was to come up with trading signals for a strategy that depends not only on the current level of indicator but also on the direction from which the indicator is crossed.
There were a number of problems with the code posted in comments, including use of single = for comparisons . So I've worked it afresh
Here's my attempt at solving this. It seems to be fine. I've added some plotting code to eyeball the results. I suggest you check the result over different periods.
This code comes after the one in the original question . Only difference is that I have kept t.zoo as an xts/zoo object and not converted it to data.frame. Also, I've multiplied zScores with 100
It generates trigger dates and also a column depicting the state of strategy. Calculating returns would be easy from there
colnames(t.zoo)
#t.zoo must be an xts object
#working on a separate xts object
sigs<- t.zoo[, c("ACC.NS.Adjusted", "AMBUJACEM.NS.Adjusted" , "zScore")]
# creating my own triggers as there are not enough good values
# buyTrig<- mean(t.zoo$zScore ,na.rm = T) - 1*sd(t.zoo$zScore ,na.rm = T)
# sellTrig<- (-1) * buyTrig
# sqOffTrig<- mean(t.zoo$zScore ,na.rm = T) - 0.5*sd(t.zoo$zScore ,na.rm = T)
# Another approach: scaling tz.zoo to fit your criterion
sigs$zScore<- sigs$zScore*100
buyTrig<- (-2.25)
sellTrig<- (-1) * buyTrig
sqOffTrig<- 0.25
cat ( buyTrig, sellTrig , sqOffTrig)
hist(sigs$zScore, breaks = 40)
abline(v=c(buyTrig,sellTrig), col="red")
abline(v=c(-sqOffTrig, sqOffTrig), col="green")
sum(sigs$zScore >= -sqOffTrig & sigs$zScore<= sqOffTrig , na.rm = T) # 139
sigs$action<- 0
sigs$mode <- NA
sigs$zLag<- lag.xts(sigs$zScore,1)
sigs[19:22,]
#these are not the real trigger dates, but they will serve our purpose
# along with na.locf
buyTrigDays<- time(sigs[sigs$zScore<= buyTrig & sigs$zLag > buyTrig, ])
sellTrigDays<- time(sigs[sigs$zScore>= sellTrig & sigs$zLag < sellTrig, ])
#square offs
buySqOffDays<- time( sigs[sigs$zScore>= (-1*sqOffTrig) & sigs$zLag < (-1*sqOffTrig), ] )
buySqOffDays
sellSqOffDays<- time( sigs[sigs$zScore<= (sqOffTrig) & sigs$zLag > (sqOffTrig), ] )
sellSqOffDays
sigs$mode[buyTrigDays]=1 ; sigs$mode[sellTrigDays]= -1;
sigs$mode[buySqOffDays]=0 ; sigs$mode[sellSqOffDays]= 0;
sigs$mode
# use local fill to repeat these triggered position into future
# till you meet another non NA value
sigs$mode<- na.locf(sigs$mode, fromLast = F)
plot((sigs$zScore["2015"] ))
points(sigs$zScore[sigs$mode==1], col="red", on=1, pch = 19)
points(sigs$zScore[sigs$mode==-1], col="green", on=1 , pch = 19)
points(sigs$zScore[sigs$mode==0], col="blue", on=1)
sum(is.na(sigs$mode))
#now to get the real dates when square off is triggered
trigdays<- time( sigs[diff(sigs$mode,1) != 0, ] ) #when the value changes
squareOffTrigger_real<- time(sigs[sigs$mode==0][trigdays])
buyTrigger_real<- time(sigs[sigs$mode==1] [trigdays])
sellTrigger_real<- time(sigs[sigs$mode==-1][trigdays])
#check
length(sellTrigger_real) + length(buyTrigger_real) == length(squareOffTrigger_real)
plot(sigs$zScore["2015"])
points(sigs$zScore[buyTrigger_real] , col="blue", pch = 19, on=1)
points(sigs$zScore[sellTrigger_real] , col="red", pch = 19, on=1)
points(sigs$zScore[squareOffTrigger_real] , col="green", pch = 19, on=1)
abline(h=c(-sqOffTrig, sqOffTrig) , col= "green" )
# further calculations can be easily made using either the mode
# column or the trigger dates computed at the end
I want to reallocate a strategy portfolio at specific dates:
require(PerformanceAnalytics)
require(TTR)
require(quantmod)
Get asset prices and obtain the daily discrete Returns
tickers = c("ABI.BR","AI.PA","AIR.PA","ALV.DE","ASML.AS")
getSymbols(tickers, from="2012-01-01", to="2013-12-01")
close.prices = do.call(merge, lapply(tickers, function(x) Cl(get(x))))
colnames(close.prices) = c("Anheuser-Busch InBev",
"L'Air Liquide","AIRBUS GROUP","Allianz","ASML HLDG")
assets.ret = ROC(close.prices,type="discrete")[-1]
Now I obtain RSI signals by applying the RSI function to each asset
rsi.fct = function(x) RSI(x, n=20, maType = SMA)
rsi = xts(apply(close.prices, 2, rsi.fct),
order.by=index(rsi.fct(close.prices[,1]) ) )
> tail(rsi)
Anheuser-Busch InBev L'Air Liquide AIRBUS GROUP Allianz ASML HLDG
2013-11-22 51.15171 49.36494 60.25836 61.07143 46.84159
2013-11-25 54.95495 50.82237 63.54717 61.07143 49.63168
2013-11-26 49.65470 52.55102 58.29563 58.18182 48.59023
2013-11-27 54.60575 61.81980 57.94677 62.05674 52.11640
2013-11-28 46.52778 60.76994 57.85061 63.35616 45.70000
2013-11-29 50.99905 61.90476 56.09756 65.49296 48.82479
The strategy is as follows: I buy the asset when the RSI is < 30 and do not buy when RSI >= 30
ret.mat.rsi = lag(ifelse (rsi < 30, 1, 0))*assets.ret
Now this is the part where I have problems. The returns from the ret.mat.rsi are daily returns.
Assume that I want to look at the rsi matrix at the first day of the month e.g.
> rsi[110]
Anheuser-Busch InBev L'Air Liquide AIRBUS GROUP Allianz ASML HLDG
2012-06-01 39.66126 31.1599 30.39443 17.17647 43.85172
I want to buy the first 4 assets equally weighted into my portfolio since their RSI is below 30
and leave the positions unchanged for the rest of the month (regardless of further RSI signals) until the first day of the next month:
> rsi[131]
Anheuser-Busch InBev L'Air Liquide AIRBUS GROUP Allianz ASML HLDG
2012-07-02 84.69529 73.87205 66.25561 74.52642 71.65021
where I choose to buy none of the assets.
The whole question is now how to elegantly code an automatic reallocation of the portfolio
at specific dates, i.e. at the beginning of each month (could also be each week or every three weeks). The portfolio return should only consist of those assets that fulfill the indicator condition (here RSI < 30) at the reallocation date.
How I would have coded your example:
require(quantmod)
tickers <- c("ABI.BR","AI.PA","AIR.PA","ALV.DE","ASML.AS")
myEnv <- new.env()
getSymbols(tickers, from="2012-01-01", to="2013-12-01", env=myEnv)
close.prices <- do.call(merge, eapply(myEnv, Cl))
close.prices <- close.prices[,pmatch(tickers,colnames(close.prices))]
colnames(close.prices) <- c("Anheuser-Busch InBev",
"L'Air Liquide","AIRBUS GROUP","Allianz","ASML HLDG")
assets.ret <- ROC(close.prices,type="discrete")[-1]
rsi.fct <- function(x) RSI(x, n=20, maType = SMA)
rsi <- xts(apply(close.prices, 2, rsi.fct), index(close.prices))
Now, to answer your question, use GSee's startpoints function to get the first RSI value for each month. startpoints allows you to choose any number of weeks, months, quarters, etc as the rebalancing period.
startpoints <- function (x, on = "months", k = 1) {
head(endpoints(x, on, k) + 1, -1)
}
# get the signal at the beginning of each month
rsi.signal <- lag(ifelse(rsi < 30, 1, 0))[startpoints(rsi),]
# rsi.signal is monthly; we need a daily series where each day has the
# value from the first day of the month, so we merge with an empty xts
# object that has the daily index and use na.locf to fill the gaps
rsi.signal <- merge(rsi.signal, xts(,index(rsi)), fill=na.locf)
# now calculate returns
rsi.ret <- rsi.signal * assets.ret