I am trying to use the R package ROI for a simple portfolio optimization problem.
I can get the results using the quadprog solver "manually", but I'd really like to understand how the ROI package works.
Unfortunately I run into an error, even though I am sticking to the example provided by Stefan Theussl at http://statmath.wu.ac.at/courses/optimization/Presentations/ROI-2011.pdf (slides 26-27).
Here is the code:
library(fPortfolio)
library(ROI)
data(LPP2005.RET)
lppData <- 100 * LPP2005.RET[, 1:6]
r <- mean(lppData)
foo <- Q_objective(Q = cov(lppData), L = rep(0, ncol(lppData)))
full_invest <- L_constraint(rep(1, ncol(lppData)), "==", 1)
target_return <- L_constraint(apply(lppData, 2, mean), "==",r)
op <- OP(objective = foo, constraints = rbind(full_invest, target_return))
sol <- ROI_solve(op, solver = "quadprog")
The error message I get is:
Error in (dir == "<=") | (dir == "<") :
  operations are possible only for numeric, logical or complex types
Thanks for your help!
It turns out that there was a bug in the ROI quadprog plugin, which has since been fixed by the developer.
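For comparison, the "manual" quadprog route mentioned in the question looks roughly like this; a sketch reusing lppData and r from above, where solve.QP minimizes (1/2) w'Qw and treats the first meq constraints as equalities:

library(quadprog)

# minimize (1/2) w' Q w  subject to  sum(w) = 1  and  mean-return(w) = r
Dmat <- cov(lppData)                    # quadratic part of the objective
dvec <- rep(0, ncol(lppData))           # no linear part
Amat <- cbind(rep(1, ncol(lppData)),    # full-investment constraint
              apply(lppData, 2, mean))  # target-return constraint
bvec <- c(1, r)
qp <- solve.QP(Dmat, dvec, Amat, bvec, meq = 2)
qp$solution                             # optimal portfolio weights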
The goal: Use current lpSolve code to create a new code using the lpSolveAPI package.
The background: I have been using lpSolve to find optimal solutions for fantasy sports contest lineups: maximize the projected points (DK) of the players on the team subject to the maximum allowed total salary (SALARY), with a handful of other constraints to fit the rules of the contest. I have discovered in a few instances, however, that lpSolve fails to find the optimal solution: it seemingly overlooks the best points-per-dollar solution for some unknown reason and finds only the nth-best solution instead. Unfortunately, I do not have an example of this, as I had issues with my archive drive recently and lost quite a bit of data.
My research/ask: I have read other threads here describing similar issues with lpSolve (like this one here). In those instances, lpSolveAPI was able to find the optimal solution when lpSolve could not. Not being familiar with lpSolveAPI, I am looking for assistance from someone familiar with both packages in converting my current code to use the lpSolveAPI package instead and eliminate these lpSolve oversights going forward. I have tried, but for some reason I keep getting lost in the translation.
My lpSolve code:
# count the number of unique teams and players
unique_teams = unique(slate_players$TEAM)
unique_players = unique(slate_players$PLAYERID)
# define the objective for the solver
obj = slate_players$DK
# create a constraint matrix for the solver
con = rbind(t(model.matrix(~ POS + 0, slate_players)),      # Positions
            t(model.matrix(~ PLAYERID + 0, slate_players)), # DupPlayers
            t(model.matrix(~ TEAM + 0, slate_players)),     # SameTeam
            rep(1, nrow(slate_players)),                    # TotPlayers
            slate_players$SALARY)                           # MaxSalary
# set the direction for each of the constraints
dir = c("==", #1B
"==", #2B
"==", #3B
"==", #C
"==", #OF
"==", #SP
"==", #SS
rep('<=',length(unique_players)), #DupPlayers
rep('<=',length(unique_teams)), #SameTeam
"==", #TotPlayers
"<=") #MaxSalary
# set the limits for the right-hand side of the constraints
rhs = c(1,                                 # 1B
        1,                                 # 2B
        1,                                 # 3B
        1,                                 # C
        3,                                 # OF
        2,                                 # SP
        1,                                 # SS
        rep(1, length(unique_players)),    # DupPlayers
        rep(5, length(unique_teams)),      # SameTeam
        10,                                # TotPlayers
        50000)                             # MaxSalary
# find the optimal solution using the solver
result = lp("max", obj, con, dir, rhs, all.bin = TRUE)
# create a data frame for the players in the optimal solution
solindex = which(result$solution==1)
optsolution = slate_players[solindex,]
Thank you for your help!
This should be straightforward:
library(lpSolveAPI)

ncons <- nrow(con)
nvars <- length(obj)
lprec <- make.lp(nrow = ncons, ncol = nvars)
lp.control(lprec, sense = "max")    # lp("max", ...); lpSolveAPI minimizes by default
set.objfn(lprec, obj)
set.type(lprec, 1:nvars, "binary")  # all.bin=TRUE
dir[dir == "=="] <- "="             # lpSolveAPI writes equality as "=" rather than "=="
for (i in 1:ncons) {
  set.row(lprec, row = i, xt = con[i, ])
  set.constr.type(lprec, dir[i], constraints = i)
  set.rhs(lprec, b = rhs[i], constraints = i)
}
status <- solve(lprec)
if (status != 0) stop("no solution found, error code=", status)
sol <- get.variables(lprec)
This code is untested, since the question does not include reproducible data or an expected solution.
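To recover the optimal lineup the same way the lpSolve version did, something like this should work once sol is available (the > 0.5 comparison just guards against floating-point representations of the binary variables):

# rows of slate_players selected by the solver
solindex <- which(sol > 0.5)
optsolution <- slate_players[solindex, ]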
I am using the pc algorithm function from the pcalg package, in which the conditional-independence test is one of the arguments. I am facing an error in the following code. Note that 'data' here is the dataset I have been using, and the 1, 6, 2 passed to gaussCItest are the positions of the nodes x and y and the conditioning set in my data.
code:
library(pcalg)
suffstat <- list(C = cor(data), n = nrow(data))
pc.data <- pc(suffstat,
              indepTest = gaussCItest(1, 6, 2, suffstat),
              p = ncol(data), alpha = 0.01)
Error:
Error in indepTest(x, y, nbrs[S], suffStat) :
could not find function "indepTest"
Below is the code that worked. I removed the arguments to gaussCItest, since it is a function that can be passed directly; pc() then calls it internally with the appropriate x, y, S, and suffStat.
library(pcalg)
suffstat <- list(C = cor(data), n = nrow(data))
pc.data <- pc(suffstat,indepTest=gaussCItest, p=ncol(data),alpha=0.01)
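For a self-contained check of this pattern, it also runs on the gmG example data shipped with pcalg (a sketch based on the package's documentation; gmG$x is a simulated Gaussian dataset):

library(pcalg)

data(gmG)                                        # example Gaussian data from pcalg
d <- gmG$x
suffstat <- list(C = cor(d), n = nrow(d))        # sufficient statistics for gaussCItest
pc.fit <- pc(suffstat, indepTest = gaussCItest,  # pass the test function itself
             p = ncol(d), alpha = 0.01)
pc.fit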
I am attempting to run the corr.test function in R, with code that my professor submitted and tested on his system. Unfortunately, when I run it I am getting the error "object 'sef' not found".
This is confounding both my professor and me, and having done a thorough search, we're not sure how to address it.
I really appreciate any help you can provide.
Edit: Here is the code I am using:
trendan1 <- read.table("trendan1.for.R.dat", header=TRUE, na.strings=".")
head(trendan1)
tail(trendan1)
attributes(trendan1)
is.matrix(trendan1)
id <- trendan1$id
famenv1 <- trendan1$famenv1
famenv2 <- trendan1$famenv2
famenv3 <- trendan1$famenv3
conf1 <- trendan1$conf1
conf2 <- trendan1$conf2
conf3 <- trendan1$conf3
trendan1dataset1 <- cbind(id,famenv1,famenv2,famenv3,conf1,conf2,conf3)
attributes(trendan1dataset1)
is.matrix(trendan1dataset1)
is.data.frame(trendan1dataset1)
require("psych")
describe(trendan1dataset1[,2:7])
print(describe(trendan1dataset1[,2:7]), digits=6)
famave <- (1*famenv1 + 1*famenv2 + 1*famenv3)/3
famlin <- -1*famenv1 + 0*famenv2 + 1*famenv3
famquad <- 1*famenv1 - 2*famenv2 + 1*famenv3;
trendandataset2 <- cbind(famenv1,famenv2,famenv3,famave,famlin,famquad)
print(describe(trendandataset2), digits=6)
hist(famenv1)
boxplot(famenv1)
abline(h=mean(famenv1))
qqnorm(famenv1,ylab="famenv1")
qqline(famenv1)
shapiro.test(famenv1)
hist(famenv2)
boxplot(famenv2)
abline(h=mean(famenv2)) # add mean to the boxplot
qqnorm(famenv1,ylab="famenv2")
qqline(famenv2)
shapiro.test(famenv2)
corvars1 <- cbind(famenv1,famenv2,famenv3)
cor(corvars1,use = "everything", method = "pearson")
cov(corvars1,use = "everything")
sscp1 <- t(corvars1)%*%(corvars1) # matrix multiplication
sscp1
rc1 <- corr.test(corvars1,
use="pairwise",method="pearson",adjust="holm",alpha=.05, ci=FALSE)
attributes(rc1)
print(rc1$p, digits=6)
This is a bug in psych that sometimes happens when you do not evaluate the confidence intervals. It should be fixed if you change the option to ci=TRUE, or simply delete that option, since the default is ci=TRUE.
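So the call from the question would become, for example (a sketch assuming the corvars1 matrix defined above):

library(psych)

# ci defaults to TRUE; leaving it at the default avoids the "object 'sef' not found" error
rc1 <- corr.test(corvars1, use = "pairwise", method = "pearson",
                 adjust = "holm", alpha = .05)
print(rc1$p, digits = 6)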
Very new user here. I am trying to use lsoda to solve differential equations stratified into two layers (as denoted by the for(s in 1:2) loop).
When running this full code, I keep getting the error message
object 'N' not found
no matter where or how I try to define N.
Can anyone help spot the error or advise on what I'm doing wrong? Thanks in advance.
R code:
library(deSolve)
Dyn <- function(t, var, par) {
  with(as.list(c(par, var)), {
    for (s in 1:2) {
      # Derivatives
      dX[s]    <- mu*N[s] - sigma*X[s] - (c[s]*beta*(InD[s] + ID[s] + IdT[s])/N[s])*X[s] - mu*X[s]
      dXint[s] <- sigma*X[s] - (1-omega)*(c[s]*beta*(InD[s] + ID[s] + IdT[s])/N[s])*Xint[s] - mu*Xprep[s]
      dInD[s]  <- (c[s]*beta*(InD[s] + ID[s] + IdT[s])/N[s])*X[s] - psi*InD[s] - mu*InD[s]
      dID[s]   <- (1-omega)*(c[s]*beta*(InD[s] + ID[s] + IdT[s])/N[s])*Xint[s] + psi*InD[s] - mu*ID[s]
      N[s]     <- X[s] + Xint[s] + InD[s] + ID[s]
      diffs    <- c(dX[s], dXint[s], dInD[s], dID[s], N[s])
    }
    return(list(diffs))
  })
}
#Defining parameter and initial values
par <- c(mu=0.033, sigma=0.29, beta=0.40, c=c(2, 30), Ctot=1773600, N=c(332550, 36950), psi=0.022, omega=0.44)
init <- c(X=c(332550,36950), Xint=c(0,0), InD=c(1,1), ID=c(0,0))
t <- seq(0, 30, by=0.1)
#Numerical solution#
Hom.sol <- lsoda(init, t, Dyn,par)
I think you are mixing up parameters and variables. N seems to be defined as a parameter in par with dimension 2, but as.list(c(par, var)) flattens named vectors, so inside with() the elements are named N1 and N2 and no object N exists; that is where "object 'N' not found" comes from. In the model definition you then also overwrite N[s] as if it were a state variable. N should instead be computed from the state variables inside the function rather than passed as a parameter.
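Not a definitive fix, but a sketch of one way to restructure it along those lines: N is computed from the state variables inside the function instead of being passed in par, the two strata are handled as vectors rather than a loop, the Xprep term is assumed to be a typo for Xint, and the never-defined IdT compartment and the unused Ctot are dropped:

library(deSolve)

Dyn <- function(t, var, par) {
  with(as.list(par), {
    v    <- matrix(var, nrow = 2)        # rows = strata; cols = X, Xint, InD, ID
    X    <- v[, 1]; Xint <- v[, 2]; InD <- v[, 3]; ID <- v[, 4]
    N    <- X + Xint + InD + ID          # derived from the states, not a parameter
    cs   <- c(c1, c2)                    # c = c(2, 30) is flattened to c1, c2 by as.list()
    foi  <- cs * beta * (InD + ID) / N   # force of infection per stratum
    dX    <- mu*N - sigma*X - foi*X - mu*X
    dXint <- sigma*X - (1 - omega)*foi*Xint - mu*Xint   # assumes Xprep was a typo for Xint
    dInD  <- foi*X - psi*InD - mu*InD
    dID   <- (1 - omega)*foi*Xint + psi*InD - mu*ID
    list(c(dX, dXint, dInD, dID), N = N) # derivatives first; N returned as a global value
  })
}

par  <- c(mu = 0.033, sigma = 0.29, beta = 0.40, c = c(2, 30),
          psi = 0.022, omega = 0.44)
init <- c(X = c(332550, 36950), Xint = c(0, 0), InD = c(1, 1), ID = c(0, 0))
t    <- seq(0, 30, by = 0.1)
Hom.sol <- lsoda(init, t, Dyn, par)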
I asked a question a few weeks back regarding how one would do optimization in R (Optimizing for Vector Using Optimize R). Now that I have a proper grip on basic optimization in R, I would like to start employing GAs (genetic algorithms) to solve for solutions.
Given the objective function:
div.ratio <- function(weight, vol, cov.mat) {
  weight <- weight / sum(weight)
  dr <- (t(weight) %*% vol) / sqrt(t(weight) %*% cov.mat %*% weight)
  return(-dr)
}
I am using the genalg package for the optimization, specifically the rbga.bin function. But one cannot seem to pass in more than one parameter, i.e., I can't pass in vol and cov.mat. Am I missing something, or am I understanding this incorrectly?
Edit:
In the genalg package, there is a function called rbga.bin which is the one I am using.
Here is the simple code from previous question that can get you started:
rm(list=ls())
require(RCurl)
sit = getURLContent('https://github.com/systematicinvestor/SIT/raw/master/sit.gz', binary=TRUE, followlocation = TRUE, ssl.verifypeer = FALSE)
con = gzcon(rawConnection(sit, 'rb'))
source(con)
close(con)
load.packages('quantmod')
data <- new.env()
tickers<-spl("VTI,VGK,VWO,GLD,VNQ,TIP,TLT,AGG,LQD")
getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
bt.prep(data, align='remove.na', dates='1990::2013')
prices<-data$prices[,-10]
ret<-na.omit(prices/mlag(prices) - 1)
vol<-apply(ret,2,sd)
cov.mat<-cov(ret)
out <- optim(par = rep(1 / length(vol), length(vol)), # initial guess
             fn = div.ratio,
             vol = vol,
             cov.mat = cov.mat,
             method = "L-BFGS-B",
             lower = 0,
             upper = 1)
opt.weights <- out$par / sum(out$par) #optimal weights
While the above optim call works just fine, I was wondering whether it is possible to reproduce this using a GA, so that in the future, if I am searching over multiple objectives, I will be able to do this faster with a GA. (I am not sure if it is faster, but this is the step to take to find out.)
GAmodel <- rbga.bin(size = 7,              # genes
                    popSize = 200,         # initial number of chromosomes
                    iters = 100,           # number of iterations
                    mutationChance = 0.01, # chance of mutation
                    evalFunc = div.ratio)  # objective function
Doing the above produces an error, as div.ratio needs extra parameters, so I am looking for some help in structuring my problem so that it produces the optimal answer. I hope the above edit clarifies things.
Thanks
This is what you need:
GAmodel <- rbga(stringMin = rep(0, length(vol)), stringMax = rep(1, length(vol)),
                popSize = 200,
                iters = 100,
                mutationChance = 0.01,
                evalFunc = function(weight) div.ratio(weight, vol = vol, cov.mat = cov.mat))
(see first and last lines above).
The problems were:
The weight and vol vectors must have matching lengths.
evalFunc is called with a single argument, causing the others to be missing. As I understand it, you want to optimize over the weight vector only, keeping vol and cov.mat fixed, which is exactly what the anonymous-function wrapper does.
If you want weight to be treated as a continuous variable, then use rbga instead.
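Afterwards you can pull the optimized weights out of the final population; a sketch assuming genalg's documented return fields population and evaluations (since evalFunc returns -dr and genalg minimizes, the smallest evaluation marks the best individual):

best <- GAmodel$population[which.min(GAmodel$evaluations), ]
ga.weights <- best / sum(best)   # normalize, mirroring what div.ratio does internally
ga.weights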