How to solve this quadratic optimization problem in R?

I am currently trying to implement a bigger simulation exercise but I'm stuck on this bit.
The aim is to find the vector p* (2x1) that maximizes this function (p* = argmax of h):
Equation
Also Y and q are given and all other quantities in the function are defined using them.
P_priority_i <- function(unknown, arg1, arg2, i){
mu = 2
delta = 0.00001
c <- c(pbar[i,] + rep(delta,m))
e <- rep(0,2)
s <- rep(0,2)
for (j in 1:m){
e[j] <- x[i,j] + sum(A[[j]][i,]*min(pbar[i,j],arg1[i,j]))
}
if(y[i,'countries'] != 'IT'){
s[1] <- min(pbar[i,1],(max(0,sum(arg2*e)))/arg2[1])
s[2] <- min(pbar[i,2],(max(0,sum(arg2*e)-arg2[1]*s[1]))/arg2[2])
value <- -0.5*t(c-unknown)%*%diag(arg2/(c-s))%*%(c-unknown)
return(value)
} else {
s[2] <- min(pbar[i,2],(max(0,sum(arg2*e)))/arg2[2])
s[1] <- min(pbar[i,1],(max(0,sum(arg2*e)-arg2[2]*s[2]))/arg2[1])
value <- -0.5*t(c-unknown)%*%diag(arg2/(c-s))%*%(c-unknown)
return(value)
}}
I've checked the formulation of the function, whose output is a scalar, and it is correct.
I also have 3 constraints on p*:
Constraints
where \bar{p} and x are given quantities.
I've found the quadprog package, but I don't know how to solve this particular problem using its solve.QP() function, which assumes an objective function of the form (− d^T b + 0.5 b^T D b). The problem is that the argument of my maximization should be p and not (c-p) (the constraints are also formulated w.r.t. p).
How can I set this up in R?
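One possible way to set this up, sketched under assumptions: writing W = diag(arg2/(c-s)), maximizing -0.5 (c-p)^T W (c-p) is the same as minimizing 0.5 p^T W p - (W c)^T p plus a constant, which matches the solve.QP form with Dmat = W and dvec = W c. The constraints in the question are not visible here, so the box constraints 0 <= p <= pbar below are only a placeholder, and all numeric values (pbar_i, q, s) are hypothetical stand-ins for one row i.

```r
library(quadprog)  # assumption: the quadprog package is installed

# Hypothetical stand-ins for one row i of the question's data
pbar_i <- c(1.0, 2.0)   # plays the role of pbar[i, ] (assumed values)
q      <- c(0.6, 0.4)   # plays the role of arg2 (assumed values)
s      <- c(0.5, 1.0)   # plays the role of s computed in the function (assumed values)
delta  <- 1e-5
cvec   <- pbar_i + delta

# Weight matrix of the quadratic form; positive definite when q > 0 and cvec > s
W <- diag(q / (cvec - s))

# Expanding 0.5 * (c - p)' W (c - p) gives 0.5 * p' W p - (W c)' p + const
Dmat <- W
dvec <- as.numeric(W %*% cvec)

# Placeholder constraints 0 <= p <= pbar_i, written as t(Amat) %*% p >= bvec
Amat <- cbind(diag(2), -diag(2))
bvec <- c(0, 0, -pbar_i)

sol    <- solve.QP(Dmat, dvec, Amat, bvec)
p_star <- sol$solution
p_star
```

The real constraints would replace Amat and bvec, one column of Amat per inequality, with any equality constraints placed first and counted by the meq argument.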

Related

Integration problem in R when I use the function "integrate"

I'm trying to compute a kind of Gini index using a generated dataset.
But, I got a problem in the last integrate function.
If I try to integrate the function named f1,
R says
Error in integrate(Q, 0, p) : length(upper) == 1 is not TRUE
My code is
# set up parameters b>a>1 and the number of observations n
n <- 1000
a <- 2
b <- 4
# generate x and y
# where x follows beta distribution
# y = 10x+3
x <- rbeta(n,a,b)
y <- 10*x+3
# the starting point of the integration having problem
Q <- function(q) {
quantile(y,q)
}
# integrate the function Q from 0 to p
G <- function(p) {
integrate(Q,0,p)
}
# compute a function
L <- function(p) {
numer <- G(p)$value
dino <- G(1)$value
numer/dino
}
# the part having problem
d <- 3
f1 <- function(p) {
((1-p)^(d-2))*L(p)
}
integrate(f1,0,1) # In this integration, the aforementioned error appears
I think the nested calls to integrate could be causing the problem, but I have no idea what exactly goes wrong.
Please help me!
As mentioned by @John Coleman, integrate needs a vectorized function and a suitable subdivisions option to carry out the integration. Even if you have already provided a vectorized function, it is sometimes tricky to set subdivisions properly in integrate(..., subdivisions = ).
To address your problem, I recommend integral from the package pracma. You still need a vectorized function for the integrand (see what I have done to functions G and L), but there is no need to set subdivisions manually, i.e.,
library(pracma)
# set up parameters b>a>1 and the number of observations n
n <- 1000
a <- 2
b <- 4
# generate x and y
# where x follows beta distribution
# y = 10x+3
x <- rbeta(n,a,b)
y <- 10*x+3
# the starting point of the integration having problem
Q <- function(q) {
quantile(y,q)
}
# integrate the function Q from 0 to p
G <- function(p) {
integral(Q,0,p)
}
# compute a function
L <- function(p) {
numer <- Vectorize(G)(p)
dino <- G(1)
numer/dino
}
# the part having problem
d <- 3
f1 <- function(p) {
((1-p)^(d-2))*L(p)
}
res <- integral(f1,0,1)
then you will get
> res
[1] 0.1283569
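The error that you reported arises because the function passed to integrate must be vectorized, while integrate itself isn't vectorized: the outer integrate calls f1 with a whole vector of p values, and that vector ends up as the upper limit in integrate(Q, 0, p).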
From the help (?integrate):
f must accept a vector of inputs and produce a vector of function
evaluations at those points. The Vectorize function may be helpful to
convert f to this form.
Thus one "fix" is to replace your definition of f1 by:
f1 <- Vectorize(function(p) {
((1-p)^(d-2))*L(p)
})
But when I run the resulting code I always get:
Error in integrate(Q, 0, p) : maximum number of subdivisions reached
A solution might be to assemble a large number of quantiles and then smooth it out and use that rather than your Q, although the error here strikes me as odd.
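A rough sketch of the "assemble a large number of quantiles" idea, using only base R. This swaps the adaptive integrate calls for a plain trapezoid rule on a fixed grid, so it is an approximation rather than the same method; the seed and the grid size of 2001 points are assumptions chosen for illustration.

```r
set.seed(1)                       # assumption: fixed seed for reproducibility
n <- 1000; a <- 2; b <- 4; d <- 3
y <- 10 * rbeta(n, a, b) + 3

# Evaluate the empirical quantile function once on a fine grid
grid <- seq(0, 1, length.out = 2001)
qv   <- as.numeric(quantile(y, grid))

# Cumulative trapezoid rule: G[k] approximates the integral of Q from 0 to grid[k]
h <- diff(grid)
G <- c(0, cumsum(h * (head(qv, -1) + tail(qv, -1)) / 2))
L <- G / G[length(G)]             # normalised curve L(p) on the grid

# Trapezoid rule again for the outer integral of (1 - p)^(d - 2) * L(p)
f1v <- (1 - grid)^(d - 2) * L
res <- sum(h * (head(f1v, -1) + tail(f1v, -1)) / 2)
res
```

Because everything is computed on one grid, no nested adaptive integration takes place and the subdivision limit never comes into play; a finer grid trades speed for accuracy.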

R - finding roots for a cartesian product of function parameters

I have a function f(x,c,d) of x that also depends on some parameters c and d. I would like to find the zeroes for a cartesian product of certain values c_1,...,c_n and d_1,...,d_m of the parameters, i.e. an x_ij such that f(x_ij,c_i,d_j)=0 for i=1,...,n and j=1,...,m. Although it is not that crucial, I am applying a Newton-Raphson algorithm for the root finding:
newton.raphson <- function(f, a, b, tol = 1e-5, n = 1000){
require(numDeriv) # Package for computing f'(x)
x0 <- a # Set start value to supplied lower bound
k <- n # Initialize for iteration results
# Check the upper and lower bounds to see if approximations result in 0
fa <- f(a)
if (fa == 0.0){
return(a)
}
fb <- f(b)
if (fb == 0.0) {
return(b)
}
for (i in 1:n) {
dx <- genD(func = f, x = x0)$D[1] # First-order derivative f'(x0)
x1 <- x0 - (f(x0) / dx) # Calculate next value x1
k[i] <- x1 # Store x1
# Once the difference between x0 and x1 becomes sufficiently small, output the results.
if (abs(x1 - x0) < tol) {
root.approx <- tail(k, n=1)
res <- list('root approximation' = root.approx, 'iterations' = k)
return(res)
}
# If Newton-Raphson has not yet reached convergence set x1 as x0 and continue
x0 <- x1
}
print('Too many iterations in method')
}
The actual function that I am interested in is more complicated, but the following example illustrates my problem.
test.function <- function(x=1,c=1,d=1){
return(c*d-x)
}
Then for any given c_i and d_j I can easily calculate the zero by
newton.raphson(function(x) test.function(x,c=c_i,d=d_j),0,1)[1]
which here is obviously just the product c_i*d_j.
Now I tried to define a function that finds for two given vectors (c_1,...,c_n) and (d_1,...,d_m) the zeroes for all combinations. For this, I tried to define
zeroes <- function(ci=1,dj=1){
x<-newton.raphson(function(x) test.function(x,c=ci,d=dj),0,1)[1]
return(as.numeric(x))
}
and then use the outer-function, e.g.
outer(c(1,2),c(1,2,3),FUN=zeroes)
Unfortunately, this did not work. I got an error message
Error during wrapup: dims [product 6] do not match the length of object [1]
There might be also a much better solution to my problem. I am happy for any input.
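A likely cause, for what it's worth: outer calls FUN once with the two fully expanded argument vectors (here of length 6) and expects a result of the same length, whereas zeroes returns a single number. Wrapping the function in Vectorize is one way around this. The sketch below uses base R's uniroot instead of the newton.raphson function above, purely to stay self-contained; the bracket [-100, 100] and the tolerance are assumptions.

```r
test.function <- function(x = 1, c = 1, d = 1) {
  c * d - x
}

# Scalar root finder for one (ci, dj) pair; uniroot stands in for newton.raphson
zeroes <- function(ci = 1, dj = 1) {
  uniroot(function(x) test.function(x, c = ci, d = dj),
          interval = c(-100, 100), tol = 1e-9)$root
}

# Vectorize makes zeroes loop over the expanded (ci, dj) pairs that outer supplies
res <- outer(c(1, 2), c(1, 2, 3), FUN = Vectorize(zeroes))
res
```

The same wrapper should work with the original definition, i.e. outer(c(1,2), c(1,2,3), FUN = Vectorize(zeroes)), since only the vectorization is missing there.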

How to obtain the probability distribution of a sum of dependent discrete random variables more efficiently

I hope you are well. I was wondering if you could help me with the question provided in the attached link, please. Below the link I attach an R-code that solves the problem recursively for particular values of the parameters of the distributions involved. However, I realized that this method is inefficient. Thanks a lot for your help.
How to obtain the probability distribution of a sum of dependent discrete random variables more efficiently
library(boot) # The library boot is necessary to use the command inv.logit.
TMax <- 500 # In this R-code, I am using TMax instead of using T.
M <- 2000
beta0 <- 1
beta1 <- 0.5
Prob_S <- function(k, r){ # In this R-code, I am using r instead of using t.
if(r == 1){
Aux <- dbinom(x = k, size = M, prob = inv.logit(beta0))
}
if(r %in% 2:TMax){
Aux <- 0
for(u in 0:k){
Aux <- Aux + dbinom(x = k - u, size = M - u,
prob = inv.logit(beta0 + beta1 * u)) * Prob_S(u, r - 1)
}
}
Aux
}
m <- 300
P <- Prob_S(k = m, r = TMax) # Computing P takes a loooong time. :(
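One way to avoid the repeated recursive calls, sketched here as a suggestion: the factor dbinom(k - u, M - u, ...) does not depend on r, so it can be stored once in a matrix, and the distribution for each r then follows from the previous one by a vector-matrix product. The function name prob_S_dp is hypothetical, and plogis from base R is used in place of boot's inv.logit since both compute 1/(1 + exp(-x)).

```r
prob_S_dp <- function(m, TMax, M, beta0, beta1) {
  k <- 0:m
  # Tmat[u + 1, kk + 1] holds the one-step term of the recursion for moving
  # from u to kk; entries with kk < u are set to zero
  Tmat <- outer(k, k, function(u, kk) {
    ifelse(kk >= u,
           dbinom(pmax(kk - u, 0), size = M - u,
                  prob = plogis(beta0 + beta1 * u)),
           0)
  })
  # Distribution for r = 1 over k = 0, ..., m
  p <- dbinom(k, size = M, prob = plogis(beta0))
  # Each step applies the recursion to all k at once
  for (r in seq_len(TMax - 1)) {
    p <- as.numeric(p %*% Tmat)
  }
  p[m + 1]
}

# Small illustrative call (parameter values chosen only to keep it quick)
prob_S_dp(m = 3, TMax = 4, M = 6, beta0 = 1, beta1 = 0.5)
```

The cost is roughly TMax * (m + 1)^2 multiplications instead of a number of recursive calls that grows exponentially in r, so the original settings (TMax = 500, m = 300) should become feasible.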

Non-conformable arguments in R

I am rewriting in R, for practice, an algorithm called the Finite Difference Method that I originally wrote in C++. I am pretty new to R, so I don't know all the rules regarding vector/matrix multiplication. For some reason I am getting a non-conformable arguments error when I do this:
ST_u <- matrix(0,M,1)
ST_l <- matrix(0,M,1)
for(i in 1:M){
Z <- matrix(gaussian_box_muller(i),M,1)
ST_u[i] <- (S0 + delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
ST_l[i] <- (S0 - delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
}
I get this error:
Error in sqrt(T) %*% Z : non-conformable arguments
Here is my whole code:
gaussian_box_muller <- function(n){
theta <- runif(n, 0, 2 * pi)
rsq <- rexp(n, 0.5)
x <- sqrt(rsq) * cos(theta)
return(x)
}
d_j <- function(j, S, K, r, v,T) {
return ((log(S/K) + (r + (-1^(j-1))*0.5*v*v)*T)/(v*(T^0.5)))
}
call_delta <- function(S,K,r,v,T){
return (S * dnorm(d_j(1, S, K, r, v, T))-K*exp(-r*T) * dnorm(d_j(2, S, K, r, v, T)))
}
Finite_Difference <- function(S0,K,r,sigma,T,M,delta_S){
ST_u <- matrix(0,M,1)
ST_l <- matrix(0,M,1)
for(i in 1:M){
Z <- matrix(gaussian_box_muller(i),M,1)
ST_u[i] <- (S0 + delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
ST_l[i] <- (S0 - delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
}
Delta <- matrix(0,M,1)
totDelta <- 0
for(i in 1:M){
if(ST_u[i] - K > 0 && ST_l[i] - K > 0){
Delta[i] <- ((ST_u[i] - K) - (ST_l[i] - K))/(2*delta_S)
}else{
Delta <- 0
}
totDelta = totDelta + exp(-r*T)*Delta[i]
}
totDelta <- totDelta * 1/M
Var <- 0
for(i in 1:M){
Var = Var + (Delta[i] - totDelta)^2
}
Var = Var*1/M
cat("The Finite Difference Delta is : ", totDelta)
call_Delta_a <- call_delta(S,K,r,sigma,T)
bias <- abs(call_Delta_a - totDelta)
cat("The bias is: ", bias)
cat("The Variance of the Finite Difference method is: ", Var)
MSE <- bias*bias + Var
cat("The marginal squared error is thus: ", MSE)
}
S0 <- 100.0
delta_S <- 0.001
K <- 100.0
r <- 0.05
sigma <- 0.2
T <- 1.0
M <- 10
result1 <- Finite_Difference(S0,K,r,sigma,T,M,delta_S)
I can't seem to figure out the problem, any suggestions would be greatly appreciated.
In R, the %*% operator is reserved for multiplying two conformable matrices. As one special case, you can also use it to multiply a vector by a matrix (or vice versa), if the vector can be treated as a row or column vector that conforms to the matrix; as a second special case, it can be used to multiply two vectors to calculate their inner product.
However, one thing it cannot do is perform scalar multiplication. Scalar multiplication of vectors or matrices always uses the plain * operator. Specifically, in the expression sqrt(T) %*% Z, the first term sqrt(T) is a scalar, and the second Z is a matrix. If what you intend to do here is multiply the matrix Z by the scalar sqrt(T), then this should just be written sqrt(T) * Z.
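A small illustration of the difference, using made-up values (the variable name T is kept from the question, although it shadows R's shorthand for TRUE):

```r
T <- 1.0                              # same name as in the question
Z <- matrix(rnorm(10), 10, 1)         # 10 x 1 column matrix

# Plain * scales every element and keeps the 10 x 1 shape
scaled <- sqrt(T) * Z
dim(scaled)

# %*% treats sqrt(T) as a 1 x 1 object, which does not conform with a 10 x 1 matrix
err <- try(sqrt(T) %*% Z, silent = TRUE)
inherits(err, "try-error")

# %*% on two vectors of equal length gives their inner product as a 1 x 1 matrix
v <- c(1, 2, 3)
v %*% v
```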
When I made this change, your program still didn't work because of another bug -- S is used but never defined -- but I don't understand your algorithm well enough to attempt a fix.
A few other comments on the program not directly related to your original question:
The first loop in Finite_Difference looks suspicious: gaussian_box_muller(i) generates a vector of length i as i varies in the loop from 1 up to M, and forcing these vectors into a column matrix of length M to generate Z is probably not doing what you want. It will "reuse" the values in a cycle to populate the matrix. Try these to see what I mean:
matrix(gaussian_box_muller(1),10,1) # all one value
matrix(gaussian_box_muller(3),10,1) # cycle of three values
You also use loops in many places where R's vector operations would be easier to read and (typically) faster to execute. For example, your definition of Var is equivalent to:
Var <- sum((Delta - totDelta)^2)/M
and the definitions of Delta and totDelta could also be written in this simplified fashion.
I'd suggest Googling for "vector and matrix operations in r" or something similar and reading some tutorials. Vector arithmetic in particular is idiomatic R, and you'll want to learn it early and use it often.
You might find it helpful to consider the rnorm function to generate random Gaussians.
Happy R-ing!
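To make the vectorization suggestions concrete, here is a sketch of how the simulation part might look without loops. It rests on a guess about the intent (one standard normal draw per path, and a central difference of discounted call payoffs), so it is not a drop-in replacement for the original function; the seed is an assumption, and the parameter values are those from the question.

```r
set.seed(42)                       # assumption: fixed seed for reproducibility
S0 <- 100; delta_S <- 0.001; K <- 100
r <- 0.05; sigma <- 0.2; T <- 1; M <- 10

Z <- rnorm(M)                      # one standard normal draw per simulated path

# Common growth factor; plain * is used because sqrt(T) is a scalar
growth <- exp((r - sigma^2 / 2) * T + sigma * sqrt(T) * Z)
ST_u <- (S0 + delta_S) * growth    # terminal prices from the bumped-up start
ST_l <- (S0 - delta_S) * growth    # terminal prices from the bumped-down start

# Assumed intent: central difference of discounted call payoffs, path by path
Delta <- exp(-r * T) * (pmax(ST_u - K, 0) - pmax(ST_l - K, 0)) / (2 * delta_S)

totDelta <- mean(Delta)
Var <- sum((Delta - totDelta)^2) / M
```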

Trying to use the collin function in the R package FME to identify parameters and then fit them using modFit

I have a system of ODEs and some data, and I am using the R packages deSolve and FME to fit the parameters of the ODE system to the data. When I fit the full parameter set, I get a singular matrix result. So I went back and checked the collinearity of the parameters, using a collinearity index cut-off of 20 as suggested in the FME package documentation, and then picked a few models with subsets of parameters to fit. When I run modFit on those, I get this error:
Error in approx(xMod, yMod, xout = xDat) :
need at least two non-NA values to interpolate
Can anyone suggest a fix for this? Everything else is working fine, so I don't think this is a coding problem.
Here is a minimal working example (removing r=2 in modFit creates the error which I can fix in the minimal working example but not in my actual problem so I doubt a minimal working example helps here):
## =======================================================================
## Now suppose we do not know K and r and they are to be fitted...
## The "observations" are the analytical solution
## =======================================================================
# You need these packages
library('deSolve')
library('FME')
## logistic growth model
TT <- seq(1, 100, 2.5)
N0 <- 0.1
r <- 0.5
K <- 100
## analytical solution
Ana <- cbind(time = TT, N = K/(1 + (K/N0 - 1) * exp(-r*TT)))
time <- 0:100
parms <- c(r = r, K = K)
x <- c(N = N0)
logist <- function(t, x, parms) {
with(as.list(parms), {
dx <- r * x[1] * (1 - x[1]/K)
list(dx)
})
}
## Run the model with initial guess: K = 10, r = 2
parms["K"] <- 10
parms["r"] <- 2
init <- ode(x, time, logist, parms)
## FITTING algorithm uses modFit
## First define the objective function (model cost) to be minimised
## more general: using modFit
Cost <- function(P) {
parms["K"] <- P[1]
parms["r"] <- P[2]
out <- ode(x, time, logist, parms)
return(modCost(out, Ana))
}
(Fit<-modFit(p = c(K = 10,r=2), f = Cost))
summary(Fit)
I think the problem is in your Cost function. If you don't provide both K and r in P, then P[2] is NA and the cost function overwrites the value of r in parms with NA. You can test this:
Cost <- function(P) {
parms["K"] <- P[1]
parms["r"] <- P[2]
print(parms)
#out <- ode(x, time, logist, parms)
#return(modCost(out, Ana))
}
Cost(c(K=10, r = 2))
Cost(c(K=10))
This function works:
Cost <- function(P) {
parms[names(P)] <- P
out <- ode(x, time, logist, parms)
return(modCost(out, Ana))
}
The vignette FMEDyna is very helpful: https://cran.r-project.org/web/packages/FME/vignettes/FMEdyna.pdf See page 14 on how to specify the Objective (Cost) function.
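The difference between the two ways of updating parms can be seen in isolation, without deSolve or FME. The values below are taken from the example; the names bad and good are only for illustration.

```r
parms <- c(r = 0.5, K = 100)
P <- c(K = 10)            # only a subset of the parameters is being fitted

# Positional update: P[2] does not exist, so r is overwritten with NA
bad <- parms
bad["K"] <- P[1]
bad["r"] <- P[2]
bad

# Name-based update: only the parameters present in P are changed
good <- parms
good[names(P)] <- P
good
```

With the name-based version, r keeps its value of 0.5 whenever it is left out of the fitted subset.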
