lme warning message because of random effects - r
I have a data frame with 5 variables: Lot / Wafer / Serial Number / Voltage / Amplification. In this data frame there are 1020 subsets grouped by Serial_number. Each subset has a certain number of measurement data points (Amplification against voltage).
I fit the data with
> summary(fit2.lme <- lmer(log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) | Serial_number),
+ data = APD))
which yields:
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) | Serial_number)
Data: APD
REML criterion at convergence: 35286.1
Scaled residuals:
Min 1Q Median 3Q Max
-20.7724 -0.2438 -0.1297 0.2434 13.2663
Random effects:
Groups Name Variance Std.Dev. Corr
Serial_number (Intercept) 1.439e-02 0.1199
poly(Voltage, 1) 2.042e+03 45.1908 -0.76
Residual 8.701e-02 0.2950
Number of obs: 76219, groups: Serial_number, 1020
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.944e-02 3.914e-03 15.2
poly(Voltage, 3)1 5.862e+02 1.449e+00 404.5
poly(Voltage, 3)2 -1.714e+02 3.086e-01 -555.4
poly(Voltage, 3)3 4.627e+01 3.067e-01 150.8
Correlation of Fixed Effects:
(Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.713
ply(Vlt,3)2 0.015 -0.004
ply(Vlt,3)3 0.004 0.012 0.018
and when I add a higher polynomial in the random effects I get a warning:
> summary(fit3.lme <- lmer(log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number),
+ data = APD))
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number)
Data: APD
REML criterion at convergence: 16285.9
Scaled residuals:
Min 1Q Median 3Q Max
-20.5042 -0.2393 -0.0697 0.3165 13.9634
Random effects:
Groups Name Variance Std.Dev. Corr
Serial_number (Intercept) 1.584e-02 0.1259
poly(Voltage, 2)1 1.777e+03 42.1536 -0.67
poly(Voltage, 2)2 1.579e+03 39.7365 0.87 -0.95
Residual 6.679e-02 0.2584
Number of obs: 76219, groups: Serial_number, 1020
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.858e-02 4.062e-03 14.4
poly(Voltage, 3)1 5.938e+02 1.351e+00 439.5
poly(Voltage, 3)2 -1.744e+02 1.276e+00 -136.7
poly(Voltage, 3)3 5.719e+01 2.842e-01 201.2
Correlation of Fixed Effects:
(Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.641
ply(Vlt,3)2 0.825 -0.906
ply(Vlt,3)3 -0.001 0.030 -0.004
convergence code: 1
Model failed to converge with max|grad| = 2.22294 (tol = 0.002, component 1)
Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
Warning messages:
1: In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), :
convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 2.22294 (tol = 0.002, component 1)
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
The data are as follows (I can provide the complete data, if desired). They include 77479 observations of 6 variables:
'data.frame': 77479 obs. of 6 variables:
$ Serial_number: num 9.12e+08 9.12e+08 9.12e+08 9.12e+08 9.12e+08 ...
$ Lot : int 9 9 9 9 9 9 9 9 9 9 ...
$ Wafer : int 912 912 912 912 912 912 912 912 912 912 ...
$ Amplification: num 1 1 1.01 1.01 1.01 ...
$ Voltage : num 25 30 34.9 44.9 49.9 ...
and the data itself looks like:
Serial_number Lot Wafer Amplification Voltage
1 912009913 9 912 1.00252 24.9681
2 912009913 9 912 1.00452 29.9591
3 912009913 9 912 1.00537 34.9494
(...)
73 912009913 9 912 918.112 375.9850
74 912009913 9 912 1083.74 377.9990
75 912009897 9 912 1.00324 19.9895
76 912009897 9 912 1.00449 29.9777
(...)
What do the warnings mean?
According to the anova the fit3.lme model describes the data better:
> anova(fit3.lme, fit2.lme)
refitting model(s) with ML (instead of REML)
Data: APD
Models:
fit2.lme: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) |
fit2.lme: Serial_number)
fit3.lme: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) |
fit3.lme: Serial_number)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
fit2.lme 8 35294 35368 -17638.9 35278
fit3.lme 11 16264 16366 -8121.1 16242 19036 3 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In optwrap(optimizer, devfun, x@theta, lower = x@lower, calc.derivs = TRUE, :
convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
Therefore I would like to use that model, but I am stuck on the warning.
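The first warning says that bobyqa stopped because it ran out of function evaluations, so one thing worth trying is a larger evaluation budget. A minimal sketch, assuming lme4 is installed and using its bundled sleepstudy data as a stand-in for APD (the value 2e5 is an arbitrary illustrative choice, not a recommendation):

```r
library(lme4)

# Raise bobyqa's evaluation budget via optCtrl; 2e5 is an arbitrary illustrative value.
ctrl <- lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5))

# Stand-in model on lme4's bundled sleepstudy data (not the APD data from the question).
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy, control = ctrl)

# Any optimizer or convergence-check messages are stored in the fitted object.
fit@optinfo$conv$lme4$messages
```

For the model in the question, the same control object could be passed with update(fit3.lme, control = ctrl). A larger budget only addresses the evaluation limit; it does not by itself resolve the "nearly unidentifiable" message.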
Update:
1. center and scale predictors
> ss.CS<- transform(APD, Voltage=scale(Voltage))
> fit31.lme<- update(fit3.lme, data=ss.CS)
Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
'degree' must be less than number of unique points
I also tried the other variable (I don't know for which one scaling makes sense):
> ss.CS<- transform(APD, Amplitude=scale(Amplitude))
Error in scale(Amplitude) : object 'Amplitude' not found
> ss.CS<- transform(APD, Amplification=scale(Amplification))
> fit31.lme<- update(fit3.lme, data=ss.CS)
Warning messages:
1: In log(Amplification) : NaNs produced
2: In log(log(Amplification)) : NaNs produced
3: In log(Amplification) : NaNs produced
4: In log(log(Amplification)) : NaNs produced
5: In log(Amplification) : NaNs produced
6: In log(log(Amplification)) : NaNs produced
7: Some predictor variables are on very different scales: consider rescaling
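Two details of the scaling attempts can be checked in isolation. scale() returns a one-column matrix rather than a plain vector, and standardizing Amplification centers it at zero, so log() produces NaN for the negative values. The sketch below uses an invented toy data frame, not APD; whether the matrix column is what triggers the poly() error is not established here.

```r
# Toy stand-in for APD; the values are invented for illustration only.
d <- data.frame(Voltage = seq(25, 375, by = 25),
                Amplification = exp(seq(0.01, 6, length.out = 15)))

# scale() returns a one-column matrix carrying centering/scaling attributes.
is.matrix(scale(d$Voltage))

# as.numeric() drops the matrix structure, leaving an ordinary numeric vector.
d$Voltage_s <- as.numeric(scale(d$Voltage))
is.null(dim(d$Voltage_s))

# Standardizing the response centers it at zero, so some values become negative
# and log() returns NaN for them (with a warning).
amp_s <- as.numeric(scale(d$Amplification))
sum(amp_s < 0)
suppressWarnings(anyNA(log(amp_s)))
```

Since the response enters the model as log(log(Amplification)), rescaling the predictor (Voltage) is the option that keeps the transformation defined.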
2. check singularity
> diag.vals<- getME(fit3.lme, "theta")[getME(fit3.lme, "lower")==0]
> any(diag.vals<- 1e-6)
[1] TRUE
Warning message:
In any(diag.vals <- 1e-06) : coercing argument of type 'double' to logical
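Note that `diag.vals<- 1e-6` is parsed as an assignment, not a comparison, which is why R warns about coercing a double to logical and why the result is TRUE regardless of the fit. A minimal sketch with made-up values (not taken from fit3.lme) shows the difference:

```r
# Hypothetical stand-in for the diagonal Cholesky elements (not the real fit).
diag.vals <- c(0.52, 114.6, 15.8)

# Intended check: is any diagonal element numerically zero?
any(diag.vals < 1e-6)

# What was typed: `clobbered <- 1e-6` assigns 1e-6, and any() then coerces
# that non-zero number to TRUE (hence the warning).
clobbered <- diag.vals
suppressWarnings(any(clobbered <- 1e-6))
clobbered
```

After running the mistyped line, diag.vals itself holds 1e-6, so it has to be recomputed with getME() before repeating the check with spaces around `<`.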
3. compute gradient and Hessian with Richardson extrapolation
> devfun<- update(fit3.lme, devFunOnly=TRUE)
> if(isLMM(fit3.lme)){
+ pars<- getME(fit3.lme, "theta")
+ } else {
+ pars<- getME(fit3.lme, c("theta", "fixef"))
+ }
> if (require("numDeriv")) {
+ cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
+ cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
+ cat("scaled gradient:\n")
+ print(scgrad <- solve(chol(hess), grad))
+ }
Loading required package: numDeriv
hess:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 39137.840764 -56.189442277 -1.348127e+02 3.789427141 25.456612941 -3.806942811
[2,] -56.189442 0.508077776 6.283795e-01 -0.068882737 -0.056159369 0.003228274
[3,] -134.812704 0.628379462 1.061584e+00 -0.079620905 -0.152816413 0.007457255
[4,] 3.789427 -0.068882737 -7.962090e-02 0.516054976 0.534346634 0.001513168
[5,] 25.456613 -0.056159369 -1.528164e-01 0.534346634 0.901191745 -0.002344407
[6,] -3.806943 0.003228274 7.457255e-03 0.001513168 -0.002344407 0.179283416
grad:
[1] -22.9114985 2.2229416 -0.2959238 0.6790044 -0.2343368 -0.4020556
scaled gradient:
[1] -0.1123624 4.4764140 -0.8777938 1.3980054 -0.4223921 -0.9508207
> fit3.lme@optinfo$derivs
$gradient
[1] -22.9118920 2.2229424 -0.2959264 0.6790037 -0.2343360 -0.4020605
$Hessian
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 39137.915527 -56.20745850 -134.87176514 3.74780273 25.47540283 -3.79016113
[2,] -56.207458 0.44262695 0.61462402 -0.04736328 -0.06585693 0.02130127
[3,] -134.871765 0.61462402 1.04296875 -0.10467529 -0.23223877 0.05438232
[4,] 3.747803 -0.04736328 -0.10467529 0.52026367 0.50909424 -0.02130127
[5,] 25.475403 -0.06585693 -0.23223877 0.50909424 0.68481445 -0.02044678
[6,] -3.790161 0.02130127 0.05438232 -0.02130127 -0.02044678 0.07617188
4. restart the fit from the original value (or a slightly perturbed value):
> fit3.lme.restart <- update(fit3.lme, start=pars)
> summary(fit3.lme.restart)
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number)
Data: APD
REML criterion at convergence: 16250.3
Scaled residuals:
Min 1Q Median 3Q Max
-20.4868 -0.2404 -0.0697 0.3166 13.9464
Random effects:
Groups Name Variance Std.Dev. Corr
Serial_number (Intercept) 1.823e-02 0.1350
poly(Voltage, 2)1 2.124e+03 46.0903 -0.77
poly(Voltage, 2)2 1.937e+03 44.0164 0.90 -0.96
Residual 6.668e-02 0.2582
Number of obs: 76219, groups: Serial_number, 1020
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.05823 0.00434 13.4
poly(Voltage, 3)1 593.83396 1.47201 403.4
poly(Voltage, 3)2 -174.61257 1.40711 -124.1
poly(Voltage, 3)3 57.15901 0.28427 201.1
Correlation of Fixed Effects:
(Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.735
ply(Vlt,3)2 0.868 -0.927
ply(Vlt,3)3 -0.001 0.028 -0.003
5. try all available optimizers
> source(system.file("utils", "allFit.R", package="lme4"))
Loading required package: optimx
Loading required package: dfoptim
> fit3.lme.all <- allFit(fit3.lme)
bobyqa : [OK]
Nelder_Mead : [OK]
nlminbw : [OK]
nmkbw : [OK]
optimx.L-BFGS-B : [OK]
nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
nloptwrap.NLOPT_LN_BOBYQA : [OK]
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
5: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
6: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
7: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
8: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
> ss <- summary(fit3.lme.all)
> ss$ fixef ## extract fixed effects
(Intercept) poly(Voltage, 3)1 poly(Voltage, 3)2 poly(Voltage, 3)3
bobyqa 0.05822789 593.8340 -174.6126 57.15901
Nelder_Mead 0.05822787 593.8340 -174.6126 57.15902
nlminbw 0.05822787 593.8340 -174.6126 57.15902
nmkbw 0.05841966 593.7804 -174.4999 57.17107
optimx.L-BFGS-B 0.05822845 593.8336 -174.6116 57.16183
nloptwrap.NLOPT_LN_NELDERMEAD 0.05823870 593.8330 -174.6076 57.16039
nloptwrap.NLOPT_LN_BOBYQA 0.05823870 593.8330 -174.6076 57.16039
> ss$ llik ## log-likelihoods
bobyqa Nelder_Mead nlminbw nmkbw optimx.L-BFGS-B
-8125.125 -8125.125 -8125.125 -8129.827 -8125.204
nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
-8125.137
> ss$ sdcor ## SDs and correlations
Serial_number.(Intercept) Serial_number.poly(Voltage, 2)1.(Intercept) Serial_number.poly(Voltage, 2)2.(Intercept)
bobyqa 0.1350049 46.09013 44.01631
Nelder_Mead 0.1350064 46.09104 44.01705
nlminbw 0.1350065 46.09106 44.01707
nmkbw 0.1347214 46.11336 43.81219
optimx.L-BFGS-B 0.1356576 46.32849 44.27171
nloptwrap.NLOPT_LN_NELDERMEAD 0.1347638 45.97995 43.91054
nloptwrap.NLOPT_LN_BOBYQA 0.1347638 45.97995 43.91054
Serial_number.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2
bobyqa -0.7665898 0.9042387 -0.9608662
Nelder_Mead -0.7665981 0.9042424 -0.9608680
nlminbw -0.7665980 0.9042425 -0.9608680
nmkbw -0.7658163 0.9076551 -0.9649999
optimx.L-BFGS-B -0.7713801 0.9067725 -0.9617129
nloptwrap.NLOPT_LN_NELDERMEAD -0.7645748 0.9034336 -0.9606020
nloptwrap.NLOPT_LN_BOBYQA -0.7645748 0.9034336 -0.9606020
sigma
bobyqa 0.2582156
Nelder_Mead 0.2582156
nlminbw 0.2582156
nmkbw 0.2584714
optimx.L-BFGS-B 0.2582244
nloptwrap.NLOPT_LN_NELDERMEAD 0.2582207
nloptwrap.NLOPT_LN_BOBYQA 0.2582207
> ss$ theta ## Cholesky factors
Serial_number.(Intercept) Serial_number.poly(Voltage, 2)1.(Intercept) Serial_number.poly(Voltage, 2)2.(Intercept)
bobyqa 0.5228377 -136.8323 154.1396
Nelder_Mead 0.5228438 -136.8364 154.1428
nlminbw 0.5228439 -136.8365 154.1429
nmkbw 0.5212237 -136.6278 153.8521
optimx.L-BFGS-B 0.5253478 -138.3947 155.4631
nloptwrap.NLOPT_LN_NELDERMEAD 0.5218936 -136.1436 153.6293
nloptwrap.NLOPT_LN_BOBYQA 0.5218936 -136.1436 153.6293
Serial_number.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2
bobyqa 114.6181 -71.06063 1.578418e+01
Nelder_Mead 114.6186 -71.06067 1.578354e+01
nlminbw 114.6187 -71.06067 1.578351e+01
nmkbw 114.7270 -71.14411 3.440466e-42
optimx.L-BFGS-B 114.1731 -70.65227 1.527854e+01
nloptwrap.NLOPT_LN_NELDERMEAD 114.7688 -71.19817 1.568481e+01
nloptwrap.NLOPT_LN_BOBYQA 114.7688 -71.19817 1.568481e+01
> ss$ which.OK ## which fits worked
bobyqa Nelder_Mead nlminbw nmkbw optimx.L-BFGS-B
TRUE TRUE TRUE TRUE TRUE
nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
TRUE TRUE
Following a user's comment, I add the following:
> bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE)
Family: gaussian
Link function: identity
Formula:
log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs = "re") +
s(Voltage, Serial_number, bs = "re")
Estimated degrees of freedom:
9 993 987 total = 1990.18
fREML score: -226.8182
> summary(bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE))
Family: gaussian
Link function: identity
Formula:
log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs = "re") +
s(Voltage, Serial_number, bs = "re")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.11500 0.01896 6.066 1.31e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(Voltage) 8.998 9 89229 <2e-16 ***
s(Serial_number) 993.441 1019 55241 <2e-16 ***
s(Voltage,Serial_number) 986.741 1019 36278 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.989 Deviance explained = 99%
fREML = -226.82 Scale est. = 0.051396 n = 76219
On https://uploadfiles.io/n7h9z you can download the r script and data.
Update
(some plots concerning the gam model):
Here are all measurement data points transformed double-logarithmically:
The physical behaviour of the device is at least exponential and even almost double-exponential (as I found in a book). After a double-logarithmic transformation the data behave almost linearly. A lower-degree polynomial already described the data well, but a polynomial of degree three did better. I guess the plot also shows why.
Some additional plots (I'm not used to GAMs so I just add them):
You can download the data from the link: https://uploadfiles.io/n7h9z
The convergence warnings disappeared when I removed all data points <2. I stumbled upon this by coincidence.
This is probably connected to the fact that, for each subset, all data points in the Voltage range from 0 to about 50 are almost exactly the same (with values of about 1).
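The effect of the double logarithm on values close to 1 can be seen directly on the Amplification values from the data excerpt above. This is only an illustration of the transformation, not a diagnosis of the warnings:

```r
# Amplification values copied from the data excerpt shown in the question.
amp <- c(1.00252, 1.00452, 1.00537, 918.112, 1083.74)

# The double log sends values just above 1 far into the negatives,
# while the large values end up close together.
ll <- log(log(amp))
round(ll, 2)

# At exactly 1 the transform is -Inf, and below 1 it is undefined (NaN).
log(log(1))
suppressWarnings(log(log(0.99)))
```

Tiny differences between amplifications near 1 are therefore stretched over a wide range on the transformed scale, which may be why dropping those points changes the behaviour of the fit.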
Related
How to fix model convergence error in glmer code?
I am in the process of trying to run the following code and am continuously getting the same error: > model5 <- glmer(violentyn~vpul + bmi_new + wmax + (1|fid), data = cohort4, family = binomial) Warning messages: 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.254024 (tol = 0.002, component 1) 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue Rescale variables? A couple of details about the variables I am using: I am predicting violent behavior in sons (binary 0/1) from sons' resting heart rate (continuous), alongside BMI and physical energy capacity variables as covariates (also continuous). I am clustering on the family id variable. This is a very large population sized dataset with fathers and sons included, but currently this analysis is only utilizing son-variables. Upon looking at ideas I also tried running the above code with this optimizer modification at the end: control=glmerControl(optimizer="bobyqa")) but am still getting the same error. Does anyone have any thoughts on 1) why this is happening? or 2) things I can try to resolve this error? When running allFit I am getting: > summary(Newmodel) $which.OK bobyqa Nelder_Mead nlminbwrap optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD TRUE TRUE TRUE TRUE TRUE nloptwrap.NLOPT_LN_BOBYQA TRUE $msgs $msgs$bobyqa $msgs$bobyqa[[1]] [1] "Model failed to converge with max|grad| = 0.0492524 (tol = 0.002, component 1)" $msgs$bobyqa[[2]] [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?" $msgs$Nelder_Mead $msgs$Nelder_Mead[[1]] [1] "Model failed to converge with max|grad| = 0.0208731 (tol = 0.002, component 1)" $msgs$Nelder_Mead[[2]] [1] "Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?" 
$msgs$nlminbwrap [1] "boundary (singular) fit: see help('isSingular')" $msgs$`optimx.L-BFGS-B` [1] "unable to evaluate scaled gradient" "Model failed to converge: degenerate Hessian with 1 negative eigenvalues" $msgs$nloptwrap.NLOPT_LN_NELDERMEAD [1] "unable to evaluate scaled gradient" "Model failed to converge: degenerate Hessian with 1 negative eigenvalues" $msgs$nloptwrap.NLOPT_LN_BOBYQA [1] "Model failed to converge with max|grad| = 17.0248 (tol = 0.002, component 1)" $fixef (Intercept) bobyqa -12.325043 Nelder_Mead -12.326691 nlminbwrap -3.119328 optimx.L-BFGS-B -12.328315 nloptwrap.NLOPT_LN_NELDERMEAD -12.325046 nloptwrap.NLOPT_LN_BOBYQA -11.525685 $llik bobyqa Nelder_Mead nlminbwrap optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD -7945.366 -7945.358 -15968.103 -7945.366 -7945.365 nloptwrap.NLOPT_LN_BOBYQA -7987.759 $sdcor fid.(Intercept) bobyqa 28.77715048132 Nelder_Mead 28.81300356231 nlminbwrap 0.00004213222 optimx.L-BFGS-B 28.83265512110 nloptwrap.NLOPT_LN_NELDERMEAD 28.79536607386 nloptwrap.NLOPT_LN_BOBYQA 22.37746729938 $theta fid.(Intercept) bobyqa 28.77715048132 Nelder_Mead 28.81300356231 nlminbwrap 0.00004213222 optimx.L-BFGS-B 28.83265512110 nloptwrap.NLOPT_LN_NELDERMEAD 28.79536607386 nloptwrap.NLOPT_LN_BOBYQA 22.37746729938 $times user.self sys.self elapsed user.child sys.child bobyqa 169.55 12.90 182.62 NA NA Nelder_Mead 240.92 18.10 259.18 NA NA nlminbwrap 9.69 0.37 10.06 NA NA optimx.L-BFGS-B 226.92 10.24 237.44 NA NA nloptwrap.NLOPT_LN_NELDERMEAD 136.09 5.62 141.89 NA NA nloptwrap.NLOPT_LN_BOBYQA 80.90 3.26 84.19 NA NA $feval bobyqa Nelder_Mead nlminbwrap optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD 142 191 NA 50 103 nloptwrap.NLOPT_LN_BOBYQA 96 attr(,"class") [1] "summary.allFit"
Why am I getting good accuracy but low ROC AUC for multiple models?
My dataset size is 42542 x 14 and I am trying to build different models like logistic regression, KNN, RF, Decision trees and compare the accuracies. I get a high accuracy but low ROC AUC for every model. The data has about 85% samples with target variable = 1 and 15% with target variable 0. I tried taking samples in order to handle this imbalance, but it still gives the same results. Coeffs for glm are as follow: glm(formula = loan_status ~ ., family = "binomial", data = lc_train) Deviance Residuals: Min 1Q Median 3Q Max -2.7617 0.3131 0.4664 0.6129 1.6734 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -8.264e+00 8.338e-01 -9.911 < 2e-16 *** annual_inc 5.518e-01 3.748e-02 14.721 < 2e-16 *** home_own 4.938e-02 3.740e-02 1.320 0.186780 inq_last_6mths1 -2.094e-01 4.241e-02 -4.938 7.88e-07 *** inq_last_6mths2-5 -3.805e-01 4.187e-02 -9.087 < 2e-16 *** inq_last_6mths6-10 -9.993e-01 1.065e-01 -9.380 < 2e-16 *** inq_last_6mths11-15 -1.448e+00 3.510e-01 -4.126 3.68e-05 *** inq_last_6mths16-20 -2.323e+00 7.946e-01 -2.924 0.003457 ** inq_last_6mths21-25 -1.399e+01 1.970e+02 -0.071 0.943394 inq_last_6mths26-30 1.039e+01 1.384e+02 0.075 0.940161 inq_last_6mths31-35 -1.973e+00 1.230e+00 -1.604 0.108767 loan_amnt -1.838e-05 3.242e-06 -5.669 1.43e-08 *** purposecredit_card 3.286e-02 1.130e-01 0.291 0.771169 purposedebt_consolidation -1.406e-01 1.032e-01 -1.362 0.173108 purposeeducational -3.591e-01 1.819e-01 -1.974 0.048350 * purposehome_improvement -2.106e-01 1.189e-01 -1.771 0.076577 . purposehouse -3.327e-01 1.917e-01 -1.735 0.082718 . 
purposemajor_purchase -7.310e-03 1.288e-01 -0.057 0.954732 purposemedical -4.955e-01 1.530e-01 -3.238 0.001203 ** purposemoving -4.352e-01 1.636e-01 -2.661 0.007800 ** purposeother -3.858e-01 1.105e-01 -3.493 0.000478 *** purposerenewable_energy -8.150e-01 3.036e-01 -2.685 0.007263 ** purposesmall_business -9.715e-01 1.186e-01 -8.191 2.60e-16 *** purposevacation -4.169e-01 2.012e-01 -2.072 0.038294 * purposewedding 3.909e-02 1.557e-01 0.251 0.801751 open_acc -1.408e-04 4.147e-03 -0.034 0.972923 gradeB -4.377e-01 6.991e-02 -6.261 3.83e-10 *** gradeC -5.858e-01 8.340e-02 -7.024 2.15e-12 *** gradeD -7.636e-01 9.558e-02 -7.990 1.35e-15 *** gradeE -7.832e-01 1.115e-01 -7.026 2.13e-12 *** gradeF -9.730e-01 1.325e-01 -7.341 2.11e-13 *** gradeG -1.031e+00 1.632e-01 -6.318 2.65e-10 *** verification_statusSource Verified 6.340e-02 4.435e-02 1.429 0.152898 verification_statusVerified 6.864e-02 4.400e-02 1.560 0.118739 dti -4.683e-03 2.791e-03 -1.678 0.093373 . fico_range_low 6.705e-03 9.292e-04 7.216 5.34e-13 *** term 5.773e-01 4.499e-02 12.833 < 2e-16 *** emp_length2-4 years 6.341e-02 4.911e-02 1.291 0.196664 emp_length5-9 years -3.136e-02 5.135e-02 -0.611 0.541355 emp_length10+ years -2.538e-01 5.185e-02 -4.895 9.82e-07 *** delinq_2yrs2+ 5.919e-02 9.701e-02 0.610 0.541754 --- Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 25339 on 29779 degrees of freedom Residual deviance: 23265 on 29739 degrees of freedom AIC: 23347 Number of Fisher Scoring iterations: 10 The confusion matrix for LR is as below: Confusion Matrix and Statistics Reference Prediction 0 1 0 32 40 1 1902 10788 Accuracy : 0.8478 95% CI : (0.8415, 0.854) No Information Rate : 0.8485 P-Value [Acc > NIR] : 0.5842 Kappa : 0.0213 Mcnemar's Test P-Value : <2e-16 Sensitivity : 0.016546 Specificity : 0.996306 Pos Pred Value : 0.444444 Neg Pred Value : 0.850118 Prevalence : 0.151544 Detection Rate : 0.002507 Detection Prevalence : 0.005642 Balanced Accuracy : 0.506426 'Positive' Class : 0 Is there any way I can improve the AUC?
If someone presents a confusion matrix and reports a low ROC AUC, it usually means that they converted the predicted probabilities into 0 and 1 before computing it. The ROC AUC does not require that: it works on the raw probabilities, which usually gives much better results. If the aim is the best AUC value, it also helps to use AUC as the evaluation metric during training, which can give better results than optimizing other metrics.
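The difference between the two ways of computing AUC can be sketched in base R. The auc() helper below is a hypothetical name for the standard rank-based (Mann-Whitney) formula, and the data are invented to mimic an imbalanced problem where every predicted probability lies above 0.5:

```r
# Rank-based (Mann-Whitney) AUC: the probability that a randomly chosen positive
# gets a higher score than a randomly chosen negative; ties count one half.
auc <- function(score, y) {
  r  <- rank(score)            # average ranks handle ties
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Invented toy data: imbalanced classes, all predicted probabilities above 0.5.
y <- c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1)
p <- c(0.55, 0.60, 0.70, 0.65, 0.80, 0.85, 0.90, 0.90, 0.95, 0.99)

auc_raw  <- auc(p, y)                      # AUC from the raw probabilities
auc_hard <- auc(as.numeric(p > 0.5), y)    # AUC after thresholding at 0.5
c(raw = auc_raw, thresholded = auc_hard)
```

Here the raw probabilities give 20/21 (about 0.95), while the thresholded predictions, which are all 1, give 0.5, even though the accuracy at that threshold is 70%.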
How to retrieve AIC value in `rmgarch`
I tried to use gogarchspec, gogarchfit and gogarchforecast in rmgarch yesterday but noticed there has no aic value able to be retrieved. > fit *------------------------------* * GO-GARCH Fit * *------------------------------* Mean Model : VAR (Lag) : 1 (Robust) : FALSE GARCH Model : gjrGARCH Distribution : mvnorm ICA Method : fastica No. Factors : 3 No. Periods : 1475 Log-Likelihood : NA ------------------------------------ U (rotation matrix) : [,1] [,2] [,3] [1,] -0.059 0.9973 0.0435 [2,] -0.594 -0.0701 0.8017 [3,] 0.803 0.0214 0.5962 A (mixing matrix) : [,1] [,2] [,3] [1,] 4.49e-05 -0.0258 0.000127 [2,] 1.80e-03 -0.0260 -0.004237 [3,] 4.19e-03 -0.0257 -0.000478 [4,] 6.78e-05 -0.0259 0.000116 Above is the fit model and below is the attributes of the model but the aic values seen unable to retrieve. > ## attributes of univariate stage 1 > attributes(attributes(fit)$mfit$ufit) $`fit` $`fit`[[1]] *---------------------------------* * GARCH Model Fit * *---------------------------------* Conditional Variance Dynamics ----------------------------------- GARCH Model : gjrGARCH(1,1) Mean Model : ARFIMA(0,0,0) Distribution : norm Optimal Parameters ------------------------------------ Estimate Std. Error t value Pr(>|t|) omega 0.005346 0.002855 1.8723 0.061167 alpha1 0.057142 0.012491 4.5746 0.000005 beta1 0.955136 0.006263 152.4948 0.000000 gamma1 -0.026556 0.012794 -2.0756 0.037931 Robust Standard Errors: Estimate Std. 
Error t value Pr(>|t|) omega 0.005346 0.004330 1.2344 0.217043 alpha1 0.057142 0.014525 3.9340 0.000084 beta1 0.955136 0.004851 196.9049 0.000000 gamma1 -0.026556 0.016780 -1.5826 0.113509 LogLikelihood : -2016.878 Information Criteria ------------------------------------ Akaike 2.7402 Bayes 2.7545 Shibata 2.7402 Hannan-Quinn 2.7455 Weighted Ljung-Box Test on Standardized Residuals ------------------------------------ statistic p-value Lag[1] 0.003505 0.952788 Lag[2*(p+q)+(p+q)-1][2] 6.443923 0.016755 Lag[4*(p+q)+(p+q)-1][5] 12.082773 0.002649 d.o.f=0 H0 : No serial correlation Weighted Ljung-Box Test on Standardized Squared Residuals ------------------------------------ statistic p-value Lag[1] 2.143 0.1432 Lag[2*(p+q)+(p+q)-1][5] 3.096 0.3898 Lag[4*(p+q)+(p+q)-1][9] 3.467 0.6801 d.o.f=2 Weighted ARCH LM Tests ------------------------------------ Statistic Shape Scale P-Value ARCH Lag[3] 1.363 0.500 2.000 0.2430 ARCH Lag[5] 1.384 1.440 1.667 0.6233 ARCH Lag[7] 1.470 2.315 1.543 0.8274 Nyblom stability test ------------------------------------ Joint Statistic: 1.0821 Individual Statistics: omega 0.07099 alpha1 0.10828 beta1 0.11995 gamma1 0.09015 Asymptotic Critical Values (10% 5% 1%) Joint Statistic: 1.07 1.24 1.6 Individual Statistic: 0.35 0.47 0.75 Sign Bias Test ------------------------------------ t-value prob sig Sign Bias 0.2798 0.7796 Negative Sign Bias 1.0104 0.3125 Positive Sign Bias 0.3739 0.7086 Joint Effect 1.9887 0.5748 Adjusted Pearson Goodness-of-Fit Test: ------------------------------------ group statistic p-value(g-1) 1 20 292.2 8.015e-51 2 30 322.3 3.064e-51 3 40 315.2 6.495e-45 4 50 345.2 3.814e-46 Elapsed time : 0.6845639 $`fit`[[2]] *---------------------------------* * GARCH Model Fit * *---------------------------------* Conditional Variance Dynamics ----------------------------------- GARCH Model : gjrGARCH(1,1) Mean Model : ARFIMA(0,0,0) Distribution : norm Optimal Parameters ------------------------------------ Estimate Std. 
Error t value Pr(>|t|) omega 0.002153 0.000008 261.73 0 alpha1 0.102055 0.000012 8831.33 0 beta1 0.884322 0.002903 304.64 0 gamma1 -0.102709 0.000139 -740.41 0 Robust Standard Errors: Estimate Std. Error t value Pr(>|t|) omega 0.002153 0.000282 7.6367 0 alpha1 0.102055 0.000145 704.3463 0 beta1 0.884322 0.046124 19.1728 0 gamma1 -0.102709 0.006434 -15.9638 0 LogLikelihood : -215.4113 Information Criteria ------------------------------------ Akaike 0.29751 Bayes 0.31187 Shibata 0.29749 Hannan-Quinn 0.30286 Weighted Ljung-Box Test on Standardized Residuals ------------------------------------ statistic p-value Lag[1] 3.416 0.06456 Lag[2*(p+q)+(p+q)-1][2] 3.526 0.10118 Lag[4*(p+q)+(p+q)-1][5] 3.744 0.28773 d.o.f=0 H0 : No serial correlation Weighted Ljung-Box Test on Standardized Squared Residuals ------------------------------------ statistic p-value Lag[1] 0.0001372 0.9907 Lag[2*(p+q)+(p+q)-1][5] 0.0008338 1.0000 Lag[4*(p+q)+(p+q)-1][9] 0.0035321 1.0000 d.o.f=2 Weighted ARCH LM Tests ------------------------------------ Statistic Shape Scale P-Value ARCH Lag[3] 0.0001646 0.500 2.000 0.9898 ARCH Lag[5] 0.0004930 1.440 1.667 1.0000 ARCH Lag[7] 0.0032936 2.315 1.543 1.0000 Nyblom stability test ------------------------------------ Joint Statistic: 3.7551 Individual Statistics: omega 0.3142 alpha1 1.6889 beta1 0.2903 gamma1 1.7023 Asymptotic Critical Values (10% 5% 1%) Joint Statistic: 1.07 1.24 1.6 Individual Statistic: 0.35 0.47 0.75 Sign Bias Test ------------------------------------ t-value prob sig Sign Bias 0.16496 0.8690 Negative Sign Bias 0.19577 0.8448 Positive Sign Bias 0.17061 0.8646 Joint Effect 0.07736 0.9944 Adjusted Pearson Goodness-of-Fit Test: ------------------------------------ group statistic p-value(g-1) 1 20 66.98 2.901e-07 2 30 89.45 4.416e-08 3 40 84.46 3.381e-05 4 50 108.76 2.002e-06 Elapsed time : 2.061266 $`fit`[[3]] *---------------------------------* * GARCH Model Fit * *---------------------------------* Conditional Variance Dynamics 
----------------------------------- GARCH Model : gjrGARCH(1,1) Mean Model : ARFIMA(0,0,0) Distribution : norm Optimal Parameters ------------------------------------ Estimate Std. Error t value Pr(>|t|) omega 0.002290 0.001686 1.35806 0.174445 alpha1 0.033963 0.007582 4.47926 0.000007 beta1 0.966294 0.003796 254.58599 0.000000 gamma1 -0.002514 0.007508 -0.33489 0.737707 Robust Standard Errors: Estimate Std. Error t value Pr(>|t|) omega 0.002290 0.002543 0.90055 0.367828 alpha1 0.033963 0.007854 4.32413 0.000015 beta1 0.966294 0.004485 215.46707 0.000000 gamma1 -0.002514 0.009729 -0.25844 0.796069 LogLikelihood : -1976.537 Information Criteria ------------------------------------ Akaike 2.6855 Bayes 2.6998 Shibata 2.6855 Hannan-Quinn 2.6908 Weighted Ljung-Box Test on Standardized Residuals ------------------------------------ statistic p-value Lag[1] 4.856e-04 9.824e-01 Lag[2*(p+q)+(p+q)-1][2] 1.243e+01 4.364e-04 Lag[4*(p+q)+(p+q)-1][5] 3.929e+01 7.361e-11 d.o.f=0 H0 : No serial correlation Weighted Ljung-Box Test on Standardized Squared Residuals ------------------------------------ statistic p-value Lag[1] 0.5075 0.4762 Lag[2*(p+q)+(p+q)-1][5] 1.6769 0.6964 Lag[4*(p+q)+(p+q)-1][9] 4.2823 0.5417 d.o.f=2 Weighted ARCH LM Tests ------------------------------------ Statistic Shape Scale P-Value ARCH Lag[3] 0.5119 0.500 2.000 0.4743 ARCH Lag[5] 2.0544 1.440 1.667 0.4593 ARCH Lag[7] 3.8795 2.315 1.543 0.3643 Nyblom stability test ------------------------------------ Joint Statistic: 1.5558 Individual Statistics: omega 0.28550 alpha1 0.12862 beta1 0.18188 gamma1 0.09135 Asymptotic Critical Values (10% 5% 1%) Joint Statistic: 1.07 1.24 1.6 Individual Statistic: 0.35 0.47 0.75 Sign Bias Test ------------------------------------ t-value prob sig Sign Bias 1.15840 0.24689 Negative Sign Bias 1.72872 0.08407 * Positive Sign Bias 0.03879 0.96906 Joint Effect 3.23634 0.35660 Adjusted Pearson Goodness-of-Fit Test: ------------------------------------ group statistic p-value(g-1) 
1 20 317.7 4.733e-56 2 30 345.8 6.180e-56 3 40 351.4 6.851e-52 4 50 339.5 4.331e-45 Elapsed time : 0.7231081 $desc $desc$`type` [1] "equal" $class [1] "uGARCHmultifit" attr(,"package") [1] "rugarch" There has no something like Information Criteria after log.likelihoods when I check the names list. Same with using str() to check all available string in the list. > ## attributes of univariate stage 2 > names(attributes(attributes(attributes(fit)$mfit$ufit)[[1]][[1]])$fit) [1] "hessian" "cvar" "var" "sigma" [5] "condH" "z" "LLH" "log.likelihoods" [9] "residuals" "coef" "robust.cvar" "A" [13] "B" "scores" "se.coef" "tval" [17] "matcoef" "robust.se.coef" "robust.tval" "robust.matcoef" [21] "fitted.values" "convergence" "kappa" "persistence" [25] "timer" "ipars" "solver" > names(attributes(attributes(attributes(fit)$mfit$ufit)[[1]][[2]])$fit) [1] "hessian" "cvar" "var" "sigma" [5] "condH" "z" "LLH" "log.likelihoods" [9] "residuals" "coef" "robust.cvar" "A" [13] "B" "scores" "se.coef" "tval" [17] "matcoef" "robust.se.coef" "robust.tval" "robust.matcoef" [21] "fitted.values" "convergence" "kappa" "persistence" [25] "timer" "ipars" "solver" > names(attributes(attributes(attributes(fit)$mfit$ufit)[[1]][[3]])$fit) [1] "hessian" "cvar" "var" "sigma" [5] "condH" "z" "LLH" "log.likelihoods" [9] "residuals" "coef" "robust.cvar" "A" [13] "B" "scores" "se.coef" "tval" [17] "matcoef" "robust.se.coef" "robust.tval" "robust.matcoef" [21] "fitted.values" "convergence" "kappa" "persistence" [25] "timer" "ipars" "solver"
You don't provide a reproducible example, so this code is untested but might provide a solution:
rugarch:::.information.test(likelihood(fit@mfit$ufit), nObs = nrow(fitted(fit@mfit$ufit)), nPars = 4)$AIC
I dug into the source code, and the following will give you what you need:
object = fit
if(object@model$modelinc[1]>0){
npvar = dim(object@model$varcoef)[1] * dim(object@model$varcoef)[2]
} else{
npvar = 0
}
m = dim(object@model$modeldata$data)[2]
T = object@model$modeldata$T
itest = rugarch:::.information.test(object@mfit$llh, nObs = T, nPars = npvar + (m^2 - m)/2 + length(object@mfit$matcoef[,1]))
itest$AIC
Logistic regression model does not converge using glmer() function
I tried to fit a mixed-effects logistic regression model using the glmer() function, but the model does not converge. First, I converted the categorical variables from vectors to factors.
schwa_completed_2$Outcome <- as.factor(schwa_completed_2$Outcome)
schwa_completed_2$frequency_grouped <- as.factor(schwa_completed_2$frequency_grouped)
schwa_completed_2$sonority_grouped <- as.factor(schwa_completed_2$sonority_grouped)
schwa_completed_2$participant_gender <- as.factor(schwa_completed_2$participant_gender)
schwa_completed_2$participant_age_group <- as.factor(schwa_completed_2$participant_age_group)
schwa_completed_2$Speaker <- as.factor(schwa_completed_2$Speaker)
There is also one continuous variable. Then I created the model:
model <- glmer(Outcome ~ frequency_grouped + sonority_grouped + syl_sec_EN + participant_gender + participant_age_group + 1|Speaker, data = schwa_completed_2, family = binomial, optimizer = "bobyqa")
Unfortunately, the model does not converge. If I drop the "Speaker" effect, the model works just fine; however, the results are then probably skewed.
Warning messages:
1: In commonArgs(par, fn, control, environment()) :
  maxfun < 10 * length(par)^2 is not recommended.
2: In optwrap(optimizer, devfun, start, rho$lower, control = control, :
  convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
3: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
  failure to converge in 10000 evaluations
4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.0785481 (tol = 0.001, component 1)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Outcome ~ frequency_grouped + sonority_grouped + syl_sec_EN + participant_gender + participant_age_group + 1 | Speaker
Data: schwa_completed_2
     AIC      BIC   logLik deviance df.resid
  1820.8   2066.1   -864.4   1728.8     1486
Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.5957 -0.6255 -0.3987  0.7714  3.4432
Random effects:
 Groups  Name                                 Variance Std.Dev. Corr
 Speaker (Intercept)                          2.08476  1.4439
         frequency_groupedmoderately_frequent 0.78914  0.8883   -0.15
         frequency_groupedvery_frequent       3.07514  1.7536   -0.90  0.35
         sonority_groupedsonorants            1.33795  1.1567    0.82 -0.44 -0.91
         sonority_groupedstops                1.76849  1.3298    0.02 -0.42 -0.36  0.51
         sonority_groupedvowels               2.97690  1.7254    0.23  0.02 -0.32  0.55  0.77
         syl_sec_EN                           0.03217  0.1794   -0.62 -0.42  0.32 -0.44  0.11 -0.52
         participant_genderM                  0.41458  0.6439   -0.86 -0.18  0.77 -0.77 -0.24 -0.62  0.82
         participant_age_groupY               0.52428  0.7241    0.46  0.80 -0.20  0.06 -0.44  0.08 -0.73 -0.63
Number of obs: 1532, groups: Speaker, 40
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.7650     0.1862  -4.108 3.99e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
convergence code: 0
Model failed to converge with max|grad| = 0.0785481 (tol = 0.001, component 1)
failure to converge in 10000 evaluations
Is it because the model is too complicated, or is my laptop not powerful enough? I don't know what to do at this point. Is there anything I can do to fix this?
OK, so what helped me was grouping the speakers with group by and then scaling the syl_sec_EN variable.
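Two things in this thread can be checked with base R alone. First, the summary in the question lists every predictor under Random effects and only the intercept under Fixed effects. That is consistent with `|` binding more loosely than `+`, so that the unparenthesised formula puts the whole right-hand side to the left of `| Speaker`. Second, scaling a continuous predictor is a one-liner with `scale()`. The sketch below uses a shortened formula and made-up `syl_sec_EN` values; the data frame `dat` and the column `syl_sec_EN_z` are hypothetical names, and whether a random intercept alone is the right structure for these data is an assumption.

```r
# 1. How R parses the formula (shortened here; no lme4 needed).
# As written in the question: `|` is the top-level operator on the right-hand side.
f_question <- Outcome ~ frequency_grouped + syl_sec_EN + 1 | Speaker
top_op_question <- as.character(f_question[[3]][[1]])   # "|"

# With parentheses, `+` is the top-level operator and (1 | Speaker) is one term,
# i.e. fixed effects for the predictors plus a random intercept per speaker.
f_parens <- Outcome ~ frequency_grouped + syl_sec_EN + (1 | Speaker)
top_op_parens <- as.character(f_parens[[3]][[1]])       # "+"

# 2. Centre and scale a continuous predictor (hypothetical values).
set.seed(1)
dat <- data.frame(syl_sec_EN = runif(20, min = 2, max = 8))
dat$syl_sec_EN_z <- as.numeric(scale(dat$syl_sec_EN))   # mean 0, sd 1
```

In the real model the scaled column would replace `syl_sec_EN` in the formula, which puts all predictors on comparable scales and often helps the optimizer.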
Convergence glmer() warnings
I have some convergence concerns about one of my models; the data are in the attached file: https://mon-partage.fr/f/06LTiBGt/ . To explain the data a little: the question is whether the formation of pre-nymphs is affected by the different modalities. The Obs column records whether a nymph formed or not. The day variable is very important in the model. I would like to use at least the hive variable as a random effect. I tried adding the bobyqa control to the function and followed ?convergence, but the convergence problem persists. However, all optimizers converge. Can I consider the warning a false positive? Thank you in advance.
library("lme4", lib.loc="~/R/win-library/3.3")
> glmpn<-glmer(Obs~moda*jour+(1|ruch)+(1|code_test),data=dataall_pn,family=binomial(logit),glmerControl(optimizer="bobyqa"))
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.0054486 (tol = 0.001, component 1)
> summary(glmpn)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Obs ~ moda * jour + (1 | ruch) + (1 | code_test)
Data: dataall_pn
Control: glmerControl(optimizer = "bobyqa")
     AIC      BIC   logLik deviance df.resid
 22477.2  22651.2 -11216.6  22433.2    20072
Scaled residuals:
    Min      1Q  Median      3Q     Max
-8.4511 -0.9370  0.4435  0.6890  1.5559
Random effects:
 Groups    Name        Variance Std.Dev.
 ruch      (Intercept) 0.2811   0.5302
 code_test (Intercept) 0.2475   0.4975
Number of obs: 20094, groups: ruch, 7; code_test, 5
Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -3.47806    0.46287  -7.514 5.73e-14 ***
modaA         0.46866    0.50489   0.928 0.353281
modaL        -2.77363    0.75599  -3.669 0.000244 ***
modaLA        2.04869    0.52218   3.923 8.73e-05 ***
modaP         2.19098    0.48984   4.473 7.72e-06 ***
modaB         1.75376    0.49874   3.516 0.000437 ***
modaBP        2.27875    0.52120   4.372 1.23e-05 ***
modaPL        2.01771    0.48696   4.143 3.42e-05 ***
modaBL        1.06337    0.48795   2.179 0.029312 *
modaBLP       1.93218    0.51939   3.720 0.000199 ***
jour          0.41973    0.02981  14.079  < 2e-16 ***
modaA:jour   -0.06559    0.04188  -1.566 0.117369
modaL:jour    0.19876    0.06491   3.062 0.002198 **
modaLA:jour  -0.22419    0.04267  -5.254 1.49e-07 ***
modaP:jour   -0.25363    0.04012  -6.322 2.58e-10 ***
modaB:jour   -0.19555    0.04097  -4.773 1.82e-06 ***
modaBP:jour  -0.23478    0.04262  -5.509 3.61e-08 ***
modaPL:jour  -0.24454    0.03988  -6.131 8.71e-10 ***
modaBL:jour  -0.16590    0.04003  -4.145 3.40e-05 ***
modaBLP:jour -0.21726    0.04245  -5.119 3.08e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation matrix not shown by default, as p = 20 > 12.
Use print(x, correlation=TRUE) or vcov(x) if you need it
convergence code: 0
Model failed to converge with max|grad| = 0.0054486 (tol = 0.001, component 1)
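One of the checks described in lme4's ?convergence help page, as I understand it, is to rescale the gradient by the Hessian before comparing it with the tolerance, since the raw max|grad| depends on the scale of the parameters. The sketch below shows the arithmetic in base R. The gradient and Hessian values are hypothetical stand-ins; for the model in the question they would come from `glmpn@optinfo$derivs`, which needs the fitted object.

```r
# Scaled-gradient check: a base-R sketch with made-up numbers.
# For a real fit, replace `derivs` with glmpn@optinfo$derivs
# (a list with elements `gradient` and `Hessian`).
derivs <- list(
  gradient = c(0.0054486, -0.0011, 0.0004),        # hypothetical
  Hessian  = matrix(c(120,   5,   2,
                        5, 300,   8,
                        2,   8, 450),
                    nrow = 3, byrow = TRUE)         # hypothetical
)

# Solve Hessian %*% x = gradient; x is the gradient in Hessian-scaled units
sc_grad <- with(derivs, solve(Hessian, gradient))

max_abs_grad <- max(abs(derivs$gradient))           # what the warning reports
max_sc_grad  <- max(abs(sc_grad))                   # scaled version
# Element-wise minimum of scaled and raw gradient, then the largest of those
check_value  <- max(pmin(abs(sc_grad), abs(derivs$gradient)))
```

If `check_value` is well below the 0.001 tolerance while the raw gradient is not, and the different optimizers agree on the estimates, that supports treating the warning as a false positive. It is not proof.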