I am trying to use the optim function in R to match theoretical data with experimental data.
Basically, I have a function f which computes an output (say a matrix), depending on some parameters defined outside that function. For example, simplifying for clarity:
a=0.2
b=-5
c=9
f=function(dx,Xmax){
Nx=round(Xmax/dx) #Nb of columns
out=matrix(0,2,Nx) #Matrix initialization
for (i in 1:Nx){
out[1,i]=(a^2*dx*i)/b #random example
out[2,i]=(c*(dx*i)^2-b)/a #random example
}
return(out)
}
result=f(0.1,10)
(The actual function f calls many other functions, which in turn use the parameters a, b, c.)
I have some experimental data for some values of x, say (random again)
expr=data.frame(x=c(0,2,5,7),y=c(1,8,14,23))
I would like to use optim to adjust parameters a,b,c such that the function
WLS=function(theo,expr){
#theo is extracted from the output of f
out=sum(((theo-expr$y)/theo)^2) #weighted least squares
}
is minimized, where theo is given by function f at the x values in expr (the experimental data are available for some values of x only).
The issue here is that, in all the examples of optim I have seen, the function to minimize (here WLS) must take the parameters to vary as an argument. Is my only option to add a,b,c to the arguments of f (so we would have f=function(dx,Xmax,a,b,c)), and then write something like this for WLS
WLS=function(expr,param){
theo=f(dx,Xmax,param[1],param[2],param[3])
#then extract the values of theo for only the x I am interested in (not written here) and plug
#them into theo again
out=sum(((theo-expr$y)/theo)^2) #weighted least squares
}
Or are there better ways to deal with that problem?
Thanks in advance
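One way the approach described above could look is sketched here. It is only a minimal illustration, not the only design: optim() varies its first argument and forwards any extra named arguments to the objective, so a, b, c travel in param while the data and grid settings are passed through. The data set expr_demo, the starting values, and the choice of row 2 of the output as the measured quantity are assumptions made for this example.

```r
# Model with the parameters passed explicitly (a, b, c are no longer globals).
f <- function(dx, Xmax, a, b, c) {
  x <- dx * seq_len(round(Xmax / dx))      # grid points dx, 2*dx, ..., Xmax
  rbind(a^2 * x / b, (c * x^2 - b) / a)    # same two rows as the example f
}

# Objective: the parameter vector comes first, because optim() varies it.
WLS <- function(param, expr, dx, Xmax) {
  theo_all <- f(dx, Xmax, param[1], param[2], param[3])
  idx <- round(expr$x / dx)                # column of each x (assumes x lies on the grid)
  theo <- theo_all[2, idx]                 # assumption: row 2 is what was measured
  sum(((theo - expr$y) / theo)^2)          # weighted least squares
}

# Hypothetical data generated from known parameters, so the fit can be checked.
truth <- c(0.2, -5, 9)
xs <- c(2, 5, 7)
expr_demo <- data.frame(
  x = xs,
  y = f(0.1, 10, truth[1], truth[2], truth[3])[2, round(xs / 0.1)]
)

# Extra named arguments (expr, dx, Xmax) are forwarded to WLS by optim().
fit <- optim(par = c(0.3, -4, 8), fn = WLS, expr = expr_demo, dx = 0.1, Xmax = 10)
```

fit$par then holds the fitted values and fit$value the minimized criterion. Note that in this toy model only the ratios c/a and b/a affect row 2, so the individual parameters are not uniquely determined by such data.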
I am trying to translate an Excel Solver problem (which solves correctly) into R code. I am aware that there is an array of packages for solving constrained optimisation problems, but I still have not found one that lets me formulate my problem and solve it.
The problem is the following
min sum(x)
s.t.:
g(f(x)) > 0.9
where:
x is a vector of boolean parameters to be identified
f(x) = x+4
g(f(x)) = SUMIF(f(x), "<90", M)/1000
M is a vector of constant terms with length equal to the length of the parameter vector x.
In words, the problem aims at minimising the objective function sum(x). Here, x is the vector of boolean parameters whose values must be determined, f(x) is a function which adds 4 to each parameter, and g(f(x)) is the sum of the vector of constant terms M, conditional on the value of each corresponding f(x).
I struggle to set up the problem with every constrained optimisation R package I have found. Is this problem solvable in R in the first place? If so, how?
I would like to evaluate the inverse Student's t-distribution function for small values, e.g., 1e-18, in Matlab. The degrees of freedom is 2.
Unfortunately, Matlab returns NaN:
tinv(1e-18,2)
NaN
However, if I use R's built-in function:
qt(1e-18,2)
-707106781
The result is sensible. Why can Matlab not evaluate the function for this small value? The Matlab and R results are quite similar down to 1e-15, but for smaller values the difference is considerable:
tinv(1e-16,2)/qt(1e-16,2) = 1.05
Does anyone know what is the difference in the implemented algorithms of Matlab and R, and if R gives correct results, how could I effectively calculate the inverse t-distribution, in Matlab, for smaller values?
It appears that R's qt may use a completely different algorithm than Matlab's tinv. I think that you and others should report this deficiency to The MathWorks by filing a service request. By the way, in R2014b and R2015a, -Inf is returned instead of NaN for small values (about eps/8 and less) of the first argument, p. This is more sensible, but I think they should do better.
In the interim, there are several workarounds.
Special Cases
First, in the case of the Student's t-distribution, there are several simple analytic solutions to the inverse CDF or quantile function for certain integer parameters of ν. For your example of ν = 2:
% for v = 2
p = 1e-18;
x = (2*p-1)./sqrt(2*p.*(1-p))
which returns -7.071067811865475e+08. At a minimum, Matlab's tinv should include these special cases (they only do so for ν = 1). It would probably improve the accuracy and speed of these particular solutions as well.
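As a cross-check, the same closed form can be compared against R's qt, which the question uses as the reference. This is only a sketch: the helper name qt2 and the test probabilities are chosen here for illustration.

```r
# Closed-form quantile of Student's t with 2 degrees of freedom.
qt2 <- function(p) (2 * p - 1) / sqrt(2 * p * (1 - p))

p <- c(1e-18, 1e-16, 1e-10, 0.25)
closed  <- qt2(p)                 # analytic inverse CDF for v = 2
builtin <- qt(p, df = 2)          # R's built-in quantile function

# Relative difference between the two; small values mean they agree.
rel_diff <- abs(closed - builtin) / abs(builtin)
print(rel_diff)
```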
Numeric Inverse
The tinv function is based on the betaincinv function. It appears that it may be this function that is responsible for the loss of precision for small values of the first argument, p. However, as suggested by the OP, one can use the CDF function, tcdf, and root-finding methods to evaluate the inverse CDF numerically. The tcdf function is based on betainc, which doesn't appear to be as sensitive. Using fzero:
p = 1e-18;
v = 2
x = fzero(@(x)tcdf(x,v)-p, 0)
This returns -7.071067811865468e+08. Note that this method is not very robust for values of p close to 1.
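The same root-finding idea can be sketched in R with uniroot, here applied to the log of the CDF so that the very small probabilities stay well scaled. The bracketing interval and tolerance are assumptions picked for this example, not recommended defaults.

```r
p <- 1e-18
v <- 2

# Difference between the log CDF at x and the log of the target probability.
g <- function(x) pt(x, df = v, log.p = TRUE) - log(p)

# The root must lie in the lower tail; the bracket below is a guess that
# happens to contain it for this p and v.
root <- uniroot(g, lower = -1e10, upper = -1, tol = 1e-6)$root
print(root)
```

For other values of p the bracket has to be widened or shifted so that g changes sign across it.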
Symbolic Solutions
For more general cases, you can take advantage of symbolic math and variable precision arithmetic. You can use identities in terms of Gaussian hypergeometric functions, 2F1, as given here for the CDF. Thus, using solve and hypergeom:
% Supposedly valid for x^2 < v, but appears to work for your example
p = sym('1e-18');
v = sym(2);
syms x
F = 0.5+x*gamma((v+1)/2)*hypergeom([0.5 (v+1)/2],1.5,-x^2/v)/(sqrt(sym('pi')*v)*gamma(v/2));
sol_x = solve(p==F,x);
vpa(sol_x)
The tinv function is based on the betaincinv function. There is no equivalent function or even an incomplete Beta function in the Symbolic Math toolbox or MuPAD, but a similar 2F1 relation for the incomplete Beta function can be used:
p = sym('1e-18');
v = sym(2);
syms x
a = v/2;
F = 1-x^a*hypergeom([a 0.5],a+1,x)/(a*beta(a,0.5));
sol_x = solve(2*abs(p-0.5)==F,x);
sol_x = sign(p-0.5).*sqrt(v.*(1-sol_x)./sol_x);
vpa(sol_x)
Both symbolic schemes return results that agree to -707106781.186547523340184 using the default value of digits.
I've not fully validated the two symbolic methods above so I can't vouch for their correctness in all cases. The code also needs to be vectorized and will be slower than a fully numerical solution.
I want to minimize the function FLogV (working with a multinormal distribution; Z is the N x C data matrix, SIGMA is the C x C variance-covariance matrix of the data, and R is a vector of length C)
FLogV <- function(P){
# (here I define parameters, P, within R and SIGMA)
logC <- (C/2)*N*log(2*pi)+(1/2)*N*log(det(SIGMA))
SOMA.t <- 0
for (j in 1:N){
SOMA.t <- SOMA.t+sum(t(Z[j,]-R)%*%solve(SIGMA)%*%(Z[j,]-R))
}
MlogV <- logC + (1/2)*SOMA.t
return(MlogV)
}
minLogV <- optim(P,FLogV)
All this is part of a larger code base which has already been tested and works well, except for the most important thing: I can't optimize because I get this error:
“Error in solve.default(SIGMA) :
system is computationally singular: reciprocal condition number = 3.57726e-55”
If I use ginv() or pseudoinverse() or qr.solve() I get:
“Error in svd(X) : infinite or missing values in 'x'”
The thing is: if I take the SIGMA matrix after the error message, I can compute solve(SIGMA), the eigenvalues are all positive, and the determinant is very small but positive
det(SIGMA)
[1] 3.384674e-76
eigen(SIGMA)$values
[1] 0.066490265 0.024034173 0.018738777 0.015718562 0.013568884 0.013086845
….
[31] 0.002414433 0.002061556 0.001795105 0.001607811
I have already read several papers on handling near-singular matrices like SIGMA and tried several transformations of the scale and form of the data, but I found that, for a 34x34 matrix like the one in this example, once det(SIGMA) gets close to 1e-40, R treats it as 0 and the calculation fails. I also can't reduce the matrix dimensions, and I can't add correction algorithms for singular matrices inside my function, because R can't evaluate them when working with optimization functions like optim. I would really appreciate any suggestions for this problem.
Thanks in advance,
Maria D.
It isn't clear from your post whether the failure is coming from det() or solve().
If it's just the solve in the quadratic term, you may want to try the two-argument version of solve, which can be a bit more stable. solve(X,Y) is the same as solve(X) %*% Y
If you can factor Sigma using chol(), you will get an upper triangular matrix L such that L'L=Sigma. The determinant is the square of the product of the diagonal of L, so log(det(Sigma)) is 2*sum(log(diag(L))), and you might try this for the quadratic term:
crossprod(backsolve(L, Z[j,]-R, transpose=TRUE))
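A minimal sketch of the Cholesky route follows, assuming SIGMA is positive definite. R's chol() returns an upper triangular factor U with t(U) %*% U equal to SIGMA, so the log-determinant is twice the sum of the logs of its diagonal, and the quadratic forms come from one triangular solve against t(U). The matrices below are hypothetical stand-ins for the poster's data; only the names C, N, Z, R and SIGMA follow the question.

```r
set.seed(1)
C <- 5; N <- 20
A <- matrix(rnorm(C * C), C, C)
SIGMA <- crossprod(A) + diag(C)     # hypothetical positive-definite covariance
Z <- matrix(rnorm(N * C), N, C)     # hypothetical data, one observation per row
R <- rnorm(C)                       # hypothetical mean vector

U <- chol(SIGMA)                    # upper triangular, t(U) %*% U == SIGMA
log_det <- 2 * sum(log(diag(U)))    # log(det(SIGMA)) without calling det()

D <- t(Z) - R                       # C x N matrix, column j is Z[j, ] - R
Y <- backsolve(U, D, transpose = TRUE)  # solves t(U) %*% Y = D
quad <- sum(Y^2)                    # sum over j of the quadratic forms

# Same quantity as MlogV in the question, with no explicit inverse or determinant.
neg_loglik <- (C / 2) * N * log(2 * pi) + (N / 2) * log_det + quad / 2
```

Working with log_det directly avoids the underflow that makes a determinant of order 1e-76 look like zero, and chol() fails with an error when SIGMA is not positive definite, which makes the problem case easy to detect inside the objective.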
In R, I wrote a log-likelihood function containing two recursive calculations. The log-likelihood function works properly (it gives an answer for known values of the parameters), but when I try to maximize it using optim(), it takes too much time. How can I optimize the code? Thanks in advance for ideas.
This is the log-likelihood function for a markov regime switching model with a dependence structure using copula functions.
Named g in the for loop:
Named p in the for loop:
Named f in the codes:
Some data:
u <- cbind(rt(100,10),rt(100,13))
f function:
f=function(u,p,e1,e2){
s=diag(2);s[1,2]=p
ff=dcopula.gauss(cbind(pt(u[,1],e1),pt(u[,2],e2)),Sigma=s)*dt(u[,1],e1)*dt(u[,2],e2)
return(ff)
}
log-likelihood function:
loglik=function(x){
p11<-x[1];p12<-x[2];p21<-x[3];p22<-x[4];p31<-x[5];p32<-x[6];r<-x[7];a1<-x[8];a2<-x[9];s<-x[10];b1<-x[11];b2<-x[12];t<-x[13];c1<-x[14];c2<-x[15]
p1=c(numeric(nrow(u)));p2=c(numeric(nrow(u)));p3=c(numeric(nrow(u)))
g=c(numeric(nrow(u)))
p1_0=.3
p2_0=.3
g[1]<-(p1_0*f(u,r,a1,a2)[1])+(p2_0*f(u,s,b1,b2)[1])+((1-(p1_0+p2_0))*f(u,t,c1,c2)[1])
p1[1]<-((p1_0*p11*f(u,r,a1,a2)[1])+(p2_0*p21*f(u,r,a1,a2)[1])+((1-(p1_0+p2_0))*p31*f(u,r,a1,a2)[1]))/g[1]
p2[1]<-((p1_0*p12*f(u,s,b1,b2)[1])+(p2_0*p22*f(u,s,b1,b2)[1])+((1-(p1_0+p2_0))*p32*f(u,s,b1,b2)[1]))/g[1]
p3[1]<-((p1_0*(1-(p11+p12))*f(u,t,c1,c2)[1])+(p2_0*(1-(p21+p22))*f(u,t,c1,c2)[1])+((1-(p1_0+p2_0))*(1-(p31+p32))*f(u,t,c1,c2)[1]))/g[1]
for(i in 2:nrow(u)){
g[i]<-(p1[i-1]*p11*f(u,r,a1,a2)[i])+(p1[i-1]*p12*f(u,s,b1,b2)[i])+(p1[i-1]*(1-(p11+p12))*f(u,t,c1,c2)[i])+
(p2[i-1]*p21*f(u,r,a1,a2)[i])+(p2[i-1]*p22*f(u,s,b1,b2)[i])+(p2[i-1]*(1-(p21+p22))*f(u,t,c1,c2)[i])+
(p3[i-1]*p31*f(u,r,a1,a2)[i])+(p3[i-1]*p32*f(u,s,b1,b2)[i])+(p3[i-1]*(1-(p31+p32))*f(u,t,c1,c2)[i])
p1[i]<-((p1[i-1]*p11*f(u,r,a1,a2)[i])+(p1[i-1]*p12*f(u,s,b1,b2)[i])+(p1[i-1]*(1-(p11+p12))*f(u,t,c1,c2)[i]))/g[i]
p2[i]<-((p2[i-1]*p21*f(u,r,a1,a2)[i])+(p2[i-1]*p22*f(u,s,b1,b2)[i])+(p2[i-1]*(1-(p21+p22))*f(u,t,c1,c2)[i]))/g[i]
p3[i]<-((p3[i-1]*p31*f(u,r,a1,a2)[i])+(p3[i-1]*p32*f(u,s,b1,b2)[i])+(p3[i-1]*(1-(p31+p32))*f(u,t,c1,c2)[i]))/g[i]
}
return(-sum(log(g)))
}
Optimization:
library(QRM)
library(copula)
start=c(0,1,0,0,0,0,1,9,7,-1,10,13,1,6,4) # numeric vector of initial values for optim
##
optim(start,loglik,lower=c(rep(0,6),-1,1,1,-1,1,1,-1,1,1),
upper=c(rep(1,6),1,Inf,Inf,1,Inf,Inf,1,Inf,Inf),
method="L-BFGS-B") -> fit
This looks like a question for Stack Overflow.
Something that springs to my mind is:
Define a vector containing the values f(.,.,.,.) in order to avoid doing k*nrow(u) evaluations of the same function and simply call those entries of interest.
It seems like the loop could be replaced by matrix and/or vector products. However, without further information it is unclear what the code is doing and it would take eons to extract this information from the code.
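The two suggestions can be sketched as follows. This is an illustration under stated assumptions, not a drop-in replacement: dens is a hypothetical stand-in for the poster's f (independent Student-t margins, no copula term, so that no package is needed), the transition matrix P and the initial probabilities are made-up values, and the recursion reproduces the loop body of the post as written, started from the initial probabilities rather than from the separate first-step formulas.

```r
# Stand-in for f(): product of two Student-t densities (assumption).
dens <- function(u, e1, e2) dt(u[, 1], e1) * dt(u[, 2], e2)

set.seed(42)
u <- cbind(rt(100, 10), rt(100, 13))

# Evaluate each regime density once for all rows: an n x 3 matrix.
# Inside the original loop, f() was re-evaluated on the whole data set
# for every single entry that was needed.
Fmat <- cbind(dens(u, 9, 7), dens(u, 10, 13), dens(u, 6, 4))

# Row k holds the probabilities of leaving regime k (hypothetical values).
P <- rbind(c(0.8, 0.1, 0.1),
           c(0.1, 0.8, 0.1),
           c(0.1, 0.1, 0.8))

prob <- c(0.3, 0.3, 0.4)            # hypothetical initial regime probabilities
g <- numeric(nrow(u))
for (i in seq_len(nrow(u))) {
  # w[k, m] = prob[k] * P[k, m] * Fmat[i, m]
  w <- (prob * P) * rep(Fmat[i, ], each = 3)
  g[i] <- sum(w)                    # all nine terms of g[i] in the post
  prob <- rowSums(w) / g[i]         # p1[i], p2[i], p3[i] in the post
}
nll <- -sum(log(g))                 # value returned by loglik
```

Each call of the objective then costs three density evaluations over the data instead of dozens per row, which is usually where most of the time goes.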