With the program below, I get the error:
solve.default(Sigma0[cs.idx, cs.idx]) : 'a' is 0-diml
But when I step through the em() function line by line outside the function, solve() raises no error. So I am confused and desperate for help. Thank you!
###----------------------------------------------------------------
### Maximal Likelihood estimation of mean and covariance
### for multivariate normal distribution by EM algorithm,
### for demonstration purposes only
###----------------------------------------------------------------
em <- function(xdata, mu0, Sigma0){
  n <- nrow(xdata)
  p <- ncol(xdata)
  # convergence criterion: Euclidean distance between successive parameter vectors
  err <- function(mu0, Sigma0, mu1, Sigma1){
    th0 <- c(mu0, as.vector(Sigma0))
    th1 <- c(mu1, as.vector(Sigma1))
    sqrt(sum((th0 - th1) * (th0 - th1)))
  }
  mu1 <- mu0 + 1
  Sigma1 <- Sigma0 + 1
  while(err(mu0, Sigma0, mu1, Sigma1) > 1e-6){
    mu1 <- mu0
    Sigma1 <- Sigma0
    zdata <- xdata
    Ai <- matrix(0, p, p)
    # E-step: impute missing entries by their conditional expectations
    for(i in 1:n){
      if(any(is.na(xdata[i, ]))){
        zi <- xdata[i, ]
        na.idx <- (1:p)[is.na(zi)]
        cs.idx <- (1:p)[-na.idx]
        Sigma012 <- Sigma0[na.idx, cs.idx, drop = FALSE]
        Sigma022.iv <- solve(Sigma0[cs.idx, cs.idx])
        zdata[i, na.idx] <- mu0[na.idx] + (Sigma012 %*% Sigma022.iv) %*% (zi[cs.idx] - mu0[cs.idx])
        Ai[na.idx, na.idx] <- Ai[na.idx, na.idx] + Sigma0[na.idx, na.idx] - Sigma012 %*% Sigma022.iv %*% t(Sigma012)
      }
    }
    # M-step: update the mean and covariance
    mu0 <- colMeans(zdata)
    Sigma0 <- (n - 1) * cov(zdata) / n + Ai / n
  }
  return(list(mu = mu0, Sigma = Sigma0))
}
##A simulation example
library(MASS)
set.seed(1200)
p=3
mu<-c(1,0,-1)
n<-1000
Sig <- matrix(c(1, .7, .6, .7, 1, .4, .6, .4, 1), nrow = 3)
triv<-mvrnorm(n,mu,Sig)
misp<-0.2 #MCAR probability
misidx<-matrix(rbinom(3*n,1,misp)==1,nrow=n)
triv[misidx]<-NA
#exclude cases in which every element is missing
er<-which(apply(apply(triv,1,is.na),2,sum)==p)
if(length(er)>=1) triv<-triv[-er,]
#initial values
mu0<-rep(0,p)
Sigma0<-diag(p)
system.time(rlt<-em(triv,mu0,Sigma0))
#better initial values
mu0<-apply(triv,2,mean,na.rm=TRUE)
nas<-is.na(triv)
na.num<-apply(nas,2,sum)
zdata<-triv
zdata[nas]<-rep(mu0,na.num)
Sigma0<-cov(zdata)
system.time(rlt<-em(triv,mu0,Sigma0))
Your er<-which(apply(apply(triv,1,is.na),2,sum)==) piece of code is not valid. As the comment above it states, you wish to remove the completely missing cases. If so, er<-which(apply(apply(triv,1,is.na),2,sum)==ncol(triv)) is the right piece of code.
The error itself happens when a completely missing case is still present in triv when it is passed to em. At some point cs.idx is empty, so Sigma0[cs.idx,cs.idx] is also empty, which is what the error message reflects.
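A quick way to reproduce the message in isolation (a minimal sketch, not part of the original post): for a row in which every entry is missing, cs.idx has length zero, so the subscripted matrix is 0 x 0 and solve() rejects it.
S <- diag(3)
cs.idx <- integer(0)        # what cs.idx becomes for an all-NA row
S[cs.idx, cs.idx]           # a 0 x 0 matrix
solve(S[cs.idx, cs.idx])    # Error in solve.default(...) : 'a' is 0-diml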
However, once the correction above is applied, everything runs fine:
> system.time(rlt<-em(triv,mu0,Sigma0))
user system elapsed
0.46 0.00 0.47
> rlt
$mu
[1] 0.963058487 -0.006246175 -1.024260183
$Sigma
[,1] [,2] [,3]
[1,] 0.9721301 0.6603700 0.5549126
[2,] 0.6603700 1.0292379 0.3745184
[3,] 0.5549126 0.3745184 0.9373208
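For what it is worth, an equivalent and perhaps clearer way to drop the all-NA rows before calling em (a one-line sketch):
triv <- triv[rowSums(is.na(triv)) < ncol(triv), , drop = FALSE]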
Let's say I train a model in R.
model <- lm(as.formula(paste((model_Data)[2],"~",paste((model_Data)[c(4,5,6,7,8,9,10,11,12,13,15,16,17,18,20,21,22,63,79,90,91,109,125,132,155,175,197,202,210,251,252,279,287,292,300,313,318)],collapse="+"),sep="")),data=model_Data)
I then use the model to predict an unknown.
prediction <- predict(model,unknown[1,])
1
8.037219
Instead of using predict, let's pull out the coefficients and do it manually.
model$coefficients
9.250265284
0.054054202
0.052738367
-0.55119556
0.019686046
0.392728331
0.794558094
0.200555755
-0.63218309
0.050404541
0.089660195
-0.04889444
-0.24645514
0.225817891
-0.10411162
0.108317865
0.004281512
0.219695437
0.037514904
-0.00914805
0.077885231
0.656321472
-0.05436867
0.033296525
0.072551915
-0.11498145
-0.03414029
0.081145352
0.11187141
0.690106624
NA
-0.11112986
-0.18002883
0.006238802
0.058387332
-0.04469568
-0.02520228
0.121577926
Looks like the model couldn't find a coefficient for one of the variables.
Here are the independent variables for our unknown.
2.048475484
1.747222331
-1.240658767
-1.26971135
-0.61858754
-1.186401425
-1.196781456
-0.437969964
-1.37330171
-1.392555895
-0.147275619
0.315190159
0.544014105
-1.137999082
0.464498153
-1.825631473
-1.824991143
0.61730876
-1.311527708
-0.457725059
-0.455920549
-0.196326975
0.636723746
0.128123676
-0.0064055
-0.788435688
-0.493452602
-0.563353694
-0.441559371
-1.083489708
-0.882784077
-0.567873188
1.068504735
1.364721122
0.294178454
2.302875604
-0.998685333
If I multiply each independent variable by its coefficient and add the intercept, the predicted value for the unknown is 8.450137349.
The predict function gave us 8.037219 and the manual calculation gave 8.450137349. What is happening within the predict function that causes it to predict a different value than the manual calculation? What has to be done to make the values match?
I get a lot closer to the predict answer when using the code below:
b <- c(9.250265284, 0.054054202, 0.052738367, -0.55119556, 0.019686046, 0.392728331, 0.794558094, 0.200555755, -0.63218309, 0.050404541, 0.089660195, -0.04889444, -0.24645514, 0.225817891, -0.10411162, 0.108317865, 0.004281512, 0.219695437, 0.037514904, -0.00914805, 0.077885231, 0.656321472, -0.05436867, 0.033296525, 0.072551915, -0.11498145, -0.03414029, 0.081145352, 0.11187141, 0.690106624, NA, -0.11112986, -0.18002883, 0.006238802, 0.058387332, -0.04469568, -0.02520228, 0.121577926)
x <- c(1, 2.048475484, 1.747222331, -1.240658767, -1.26971135, -0.61858754, -1.186401425, -1.196781456, -0.437969964, -1.37330171, -1.392555895, -0.147275619, 0.315190159, 0.544014105, -1.137999082, 0.464498153, -1.825631473, -1.824991143, 0.61730876, -1.311527708, -0.457725059, -0.455920549, -0.196326975, 0.636723746, 0.128123676, -0.0064055, -0.788435688, -0.493452602, -0.563353694, -0.441559371, -1.083489708, -0.882784077, -0.567873188, 1.068504735, 1.364721122, 0.294178454, 2.302875604, -0.998685333)
# remove the missing value in `b` and the corresponding value in `x`
x <- x[-31]
b <- b[-31]
x %*% b
# [,1]
# [1,] 8.036963
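The key point is the NA coefficient: lm() dropped that term as aliased (collinear with the others), and predict() simply ignores it, so any manual calculation has to drop the NA coefficient together with its variable, exactly as the code above does. A sketch of a manual calculation that stays aligned automatically (assuming model and unknown from the question are available and the predictors are numeric):
b  <- coef(model)
tt <- delete.response(terms(model))
X  <- model.matrix(tt, unknown[1, ])          # design row for the new observation
ok <- !is.na(b)                               # drop aliased (NA) coefficients
drop(X[, ok, drop = FALSE] %*% b[ok])         # should match predict(model, unknown[1, ])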
I have two data sets that are not very different from each other. Each data set has 27 rows of actual and forecast values. When tested against Solver in Excel for minimization of the sum of absolute errors, abs(actual - par * forecast), they both give nearly equal values for the parameter 'par'. However, when each of these data sets is passed to the same optimization function that I have written, it only works for one of them. For the other data set, the objective always evaluates to zero (0), with 'par' assigned the upper bound value.
This is definitely incorrect. What I am not able to understand is why R is doing this.
Here are the two data sets:
test
dateperiod,usage,fittedlevelusage
2019-04-13,16187.24,17257.02
2019-04-14,16410.18,17347.49
2019-04-15,18453.52,17246.88
2019-04-16,18113.1,17929.24
2019-04-17,17712.54,17476.67
2019-04-18,15098.13,17266.89
2019-04-19,13026.76,15298.11
2019-04-20,13689.49,13728.9
2019-04-21,11907.81,14122.88
2019-04-22,13078.29,13291.25
2019-04-23,15823.23,14465.34
2019-04-24,14602.43,15690.12
2019-04-25,12628.7,13806.44
2019-04-26,15064.37,12247.59
2019-04-27,17163.32,16335.43
2019-04-28,17277.18,16967.72
2019-04-29,20093.13,17418.99
2019-04-30,18820.68,18978.9
2019-05-01,18799.63,17610.66
2019-05-02,17783.24,17000.12
2019-05-03,17965.56,17818.84
2019-05-04,16891.25,18002.03
2019-05-05,18665.49,18298.02
2019-05-06,21043.86,19157.41
2019-05-07,22188.93,21092.36
2019-05-08,22358.08,21232.56
2019-05-09,22797.46,22229.69
Optimization result from R
$minimum
[1] 1.018188
$objective
[1] 28031.49
test1
dateperiod,Usage,fittedlevelusage
2019-04-13,16187.24,17248.29
2019-04-14,16410.18,17337.86
2019-04-15,18453.52,17196.25
2019-04-16,18113.10,17896.74
2019-04-17,17712.54,17464.45
2019-04-18,15098.13,17285.82
2019-04-19,13026.76,15277.10
2019-04-20,13689.49,13733.90
2019-04-21,11907.81,14152.27
2019-04-22,13078.29,13337.53
2019-04-23,15823.23,14512.41
2019-04-24,14602.43,15688.68
2019-04-25,12628.70,13808.58
2019-04-26,15064.37,12244.91
2019-04-27,17163.32,16304.28
2019-04-28,17277.18,16956.91
2019-04-29,20093.13,17441.80
2019-04-30,18820.68,18928.29
2019-05-01,18794.10,17573.40
2019-05-02,17779.00,16969.20
2019-05-03,17960.16,17764.47
2019-05-04,16884.77,17952.23
2019-05-05,18658.16,18313.66
2019-05-06,21036.49,19149.12
2019-05-07,22182.11,21103.37
2019-05-08,22335.57,21196.23
2019-05-09,22797.46,22180.51
Optimization result from R
$minimum
[1] 1.499934
$objective
[1] 0
The optimization function used is shown below :-
optfn <- function(x) {
  act <- x$usage
  fcst <- x$fittedlevelusage
  fn <- function(par) {
    sum(abs(act - (fcst * par)))
  }
  adjfac <- optimize(fn, c(0.5, 1.5))
  return(adjfac)
}
adjfacresults <- optfn(test)
adjfacresults <- optfn(test1)
Optimization result from R
adjfacresults <- optfn(test)
$minimum
[1] 1.018188
$objective
[1] 28031.49
Optimization result from R
adjfacresults <- optfn(test1)
$minimum
[1] 1.499934
$objective
[1] 0
Can anyone help identify why R is not applying the same process to the two data sets and returning the correct result in both cases?
The corresponding results using Excel Solver for the two data sets are as follows:
For the 'test' data set
par value = 1.018236659
objective function value (min): 28031
For the 'test1' data set
par value = 1.01881062927878
objective function value (min): 28010
Best regards
Deepak
That's because the second column of test1 is named Usage, not usage. Therefore act <- x$usage is NULL, and the function fn returns sum(abs(NULL - something)), which is sum(numeric(0)) = 0. You have to rename this column to usage.
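A tiny illustration of that failure mode with a made-up two-row data frame, plus a defensive check one could add at the top of optfn (a sketch):
dat <- data.frame(Usage = c(10, 20), fittedlevelusage = c(11, 19))
dat$usage                                         # NULL: $ matching is case sensitive
sum(abs(dat$usage - dat$fittedlevelusage * 1.2))  # sum over numeric(0) is 0
# a safer optfn could fail loudly instead of silently returning 0, e.g.
# stopifnot(all(c("usage", "fittedlevelusage") %in% names(x)))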
I am using a Kalman filter to estimate various dynamic and arbitrage-free Nelson-Siegel models for yield curves. I give some starting values to optim and the algorithm converges just fine. However, when I want to calculate standard errors using the Hessian supplied by optim, I get NaNs due to non-positive values on the diagonal of the variance-covariance matrix. I think it is because I have a highly nonlinear function with many local optima; however, it keeps happening for all starting values I try.
The function I use is optim together with the default Nelder-Mead algorithm.
The command I use is
opt_para <- optim(par = par0, fn = Kalman_filter, y = y,
                  maturities = maturities, control = list(maxit = 20000), hessian = TRUE)
The starting values are given in par0, which is
> par0
[1] 9.736930e-01 1.046646e+00 5.936238e-01 4.444669e-02 2.889251e-07 6.646960e+00 7.715964e-01 9.945551e-01 9.663361e-01
[10] 6.000000e-01 6.000000e-01 6.000000e-01 6.000000e-02 5.000000e-01 5.000000e-01 5.000000e-01 5.000000e-01
The optim output that I get is
$par
 [1] 0.833208307 1.373442068 0.749313983 0.646577154 0.237102069 6.882644818 0.788775982 0.918378263 0.991982038
[10] 0.748509055 0.005115171 0.392213941 0.717186499 0.121525623 0.386227284 0.001970431 0.845279611
$value
[1] 575.7886
$counts
function gradient
5225 NA
$convergence
[1] 0
$message
NULL
I then use the following command to produce the standard errors of the estimates.
hessian<-opt_para$hessian
fish_info<-solve(hessian,tol=1e-100)
st_errors<- diag(sqrt(fish_info))
st_errors
I get the following output
st_errors
[1] NaN NaN 2.9170315888 NaN NaN NaN 0.0294300357 0.0373614751 NaN
[10] 0.0785349634 0.0005656580 NaN 0.0470600219 0.0053255251 0.0408666177 0.0001561243 0.4540428740
The NaNs are being produced by a negative value on the diagonal, which should be impossible in a variance-covariance matrix. However, I suspect that this is due to the optimization procedure not being correct.
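One way to confirm this, for example, is to inspect the eigenvalues of the returned Hessian; the inverse of a valid variance-covariance matrix must be positive definite, so any non-positive eigenvalue explains the NaNs (a small sketch using the objects above):
eigen(opt_para$hessian, symmetric = TRUE, only.values = TRUE)$values
# any value <= 0 means the Hessian is not positive definite at the reported
# optimum, so solve(hessian) cannot be a valid variance-covariance matrix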
To be clear, I also include the function I want to optimize. It is a Kalman-filter with updating equations and some restrictions built in.
Kalman_filter <- function(par, y, maturities){
  # unpack the parameter vector
  b0 <- c(par[1], par[2], par[3])
  P0 <- diag(c(par[4], par[5], par[6]))
  Phi <- diag(c(par[7], par[8], par[9]))
  mu <- c(par[10], par[11], par[12])
  lambda <- par[13]
  sigma11 <- par[14]
  sigma21 <- par[15]
  sigma22 <- par[16]
  sigma33 <- par[17]
  m <- length(b0)
  n <- length(y[, 1])
  d <- length(y[1, ])
  sigma_eps <- sigma11 * diag(d)
  sigma_nu <- diag(c(sigma21^2, sigma22^2, sigma33^2)) * (1/12)
  colnames(sigma_nu) <- c("level", "slope", "Curvat")
  # Nelson-Siegel factor loadings
  X <- matrix(cbind(rep(1, length(maturities)), slope_factor(lambda, maturities),
                    curv_factor(lambda, maturities)), ncol = 3)
  colnames(X) <- c("level", "slope", "Curvature")
  bt <- matrix(NA, nrow = m, ncol = n + 1)
  Pt <- array(NA, dim = c(m, m, n + 1))
  btt <- matrix(NA, nrow = m, ncol = n + 1)
  Ptt <- array(NA, dim = c(m, m, n + 1))
  vt <- matrix(NA, nrow = d, ncol = n)
  # stationarity restriction on the transition matrix
  eigen_values <- eigen(Phi, only.values = TRUE)$values
  if(eigen_values[1] >= 1 || eigen_values[2] >= 1 || eigen_values[3] >= 1){
    loglike <- -70000000
  } else {
    c <- (diag(3) - Phi) %*% mu
    loglike <- 0
    i <- 1
    btt[, 1] <- b0
    Ptt[, , 1] <- P0
    while(i < n + 1){
      # prediction step
      bt[, i] <- c + Phi %*% btt[, i]
      Pt[, , i] <- Phi %*% tcrossprod(Ptt[, , i], Phi) + sigma_nu
      vt[, i] <- y[i, ] - X %*% bt[, i]
      ft <- X %*% tcrossprod(Pt[, , i], X) + sigma_eps
      det_f <- det(ft)
      if(is.nan(det_f) || is.na(det_f) || is.infinite(det_f)){
        loglike <- -700000000
      } else {
        if(det_f < 0){
          loglike <- -700000000
        } else {
          if(abs(det_f) > 1e-20){
            # update step and log-likelihood contribution
            logdet_f <- log(det_f)
            f_inv <- solve(ft, tol = 1e-200)
            Kt <- tcrossprod(Pt[, , i], X) %*% f_inv
            btt[, i + 1] <- bt[, i] + Kt %*% vt[, i]
            Ptt[, , i + 1] <- (diag(3) - Kt %*% X) %*% Pt[, , i]
            loglike_contr <- -0.5 * d * log(2 * pi) - 0.5 * logdet_f -
              0.5 * crossprod(vt[, i], f_inv) %*% vt[, i]
            loglike <- loglike + loglike_contr
          } else {
            loglike <- -700000
          }
        }
      }
      i <- i + 1
    }
  }
  return(-loglike)
}
Any help would be appreciated.
I have just solved the problem. I programmed the likelihood function once more so that its only input is the parameter vector, evaluated it at the estimates returned by optim, and then used the hessian function from the numDeriv package. This produces viable estimates for the standard errors.
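A minimal sketch of that approach, assuming Kalman_filter, y, maturities and opt_para from above are still in the workspace:
library(numDeriv)
negloglik <- function(par) Kalman_filter(par, y = y, maturities = maturities)
H <- hessian(negloglik, opt_para$par)   # numerical Hessian at the optimum found by optim
vc <- solve(H)                          # inverse observed information
st_errors <- sqrt(diag(vc))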
When I run the R command:
outer(37:42, 37:42, complex, 1)
I get an error
"Error in dim(robj) <- c(dX, dY) : dims [product 36] do not match the length of object [37]"
in my R session. But when I run
outer(36:42, 36:42, complex, 1)
I have a valid matrix as a result. The problem persists for all values greater than 36. And there is no problem for all values less then 37.
Is this a bug?
My system: Microsoft R Open 3.4.4 / RStudio 1.1.447 / Ubuntu 16.04
More specifically, when running the function with arguments m:n, m:n it returns the error whenever m > (n - m + 1)^2, i.e. whenever the first value exceeds the number of cells in the result. Try for example outer(20:23, 20:23, complex, 1) and outer(20:24, 20:24, complex, 1): the first fails because 20 > (23 - 20 + 1)^2 = 16, while the second works because 20 <= (24 - 20 + 1)^2 = 25. This appears to happen because the first argument of complex is length.out, which defines the length of the vector to return. So your first argument 37:42 is passed to the length.out parameter, only its first element (37) is used, and complex returns a vector of length 37 where outer expects 6 * 6 = 36. Passing the row values as length.out does not make a lot of sense, so please correct me if I am wrong, but I think what you want to do is the following:
outer(37:42, 37:42, function(x,y) {complex(1, real = x, imaginary = y)})
Which outputs:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 37+37i 37+38i 37+39i 37+40i 37+41i 37+42i
[2,] 38+37i 38+38i 38+39i 38+40i 38+41i 38+42i
[3,] 39+37i 39+38i 39+39i 39+40i 39+41i 39+42i
[4,] 40+37i 40+38i 40+39i 40+40i 40+41i 40+42i
[5,] 41+37i 41+38i 41+39i 41+40i 41+41i 41+42i
[6,] 42+37i 42+38i 42+39i 42+40i 42+41i 42+42i
Hope this helps.
The problem is in the 4th argument: it should be named:
outer(37:42, 37:42, complex, length.out = 1)
works fine!
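As a side note, a one-line sketch that avoids complex()'s length.out argument altogether and builds the same matrix by adding purely imaginary column values to the real row values:
outer(37:42, 37:42 * 1i, "+")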
I am looking for a way to optimize a function in R under several constraints. That's a piece of cake in Excel, but I cannot make it work in R.
What I want is to find the set of parameters that maximizes a function under the constraints that the parameters should be non-increasing and that the sum of the parameters x_i ... x_max is bounded for each i.
I wrote a simple example. It works for two parameters but not for three. For three parameters it looks like the optimization procedure is not doing anything.
In real-life cases I would like to use between 12 and 120 parameters so I am a bit worried it does not work with 3 ...
So any help is welcome ... and thanks in advance for the (eventual) reply.
The code for two parameters is (working)
Omp <- function (p)
{
calc <- -p[1]-2*p[2]
return (calc)
}
ui1 <-matrix(c(-1,0,1,-1,-1,-1),ncol =2)
ci1 <-c(-100,-70,0)-0.0000001
init1 <-c(100,0)
(ui1 %*% init1) - ci1
tst <- constrOptim(init1, Omp, grad = NULL, ui = ui1, ci = ci1)
The output is conform expectations:
> sum(tst$par)
[1] 100
> tst$par
[1] 50 50
The code for 3 parameters is (not working)
Opm <- function (p)
{
calc <- -p[1]-2*p[2]-3*p[3]
print(calc)
return (calc)
}
ui1 <-matrix(c(-1,0,0,1,0,-1,-1,0,-1,1,-1,-1,-1,0,-1),ncol =3)
ci1 <-c(-100,-70,0,0,0)-0.0000001
init1 <-c(65,35,0)
(ui1 %*% init1) - ci1
tst <- constrOptim(init1, Opm, grad = NULL, ui = ui1, ci = ci1)
It runs but always remains close to the initial guess.
> tst$par
[1] 6.500000e+01 3.500000e+01 9.685755e-08
Someone else may be able to provide more insight, but your starting values may not be far enough inside the feasible region.
As you stated, this does not produce the expected result:
ui1 <-matrix(c(-1,0,0,1,0,-1,-1,0,-1,1,-1,-1,-1,0,-1),ncol =3)
ci1 <-c(-100,-70,0,0,0)-0.0000001
init1 <-c(65,35,0)
(ui1 %*% init1) - ci1
tst <- constrOptim(init1, Opm, grad = NULL, ui = ui1, ci = ci1)
round(tst$par)
[1] 65 35 0
But after adjusting the small offset in ci1 slightly, I get a different result, similar to what you expected from your first example.
ci1 <-c(-100,-70,0,0,0)-0.00001
init1 <-c(65,35,0)
(ui1 %*% init1) - ci1
tst <- constrOptim(init1, Opm, grad = NULL, ui = ui1, ci = ci1)
round(tst$par)
[1] 50 50 0
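A small check that makes this visible before optimizing: constrOptim uses a logarithmic barrier, so the starting value needs strictly positive slack ui %*% theta - ci on every constraint, and a slack on the order of the original 1e-7 offset leaves the start essentially on the boundary. A sketch using the objects above:
slack <- (ui1 %*% init1) - ci1
min(slack)   # about 1e-7 with the original ci1: the start hugs the boundary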