I have to run an lmer with a log-transformed response variable, a continuous variable as a fixed effect, and a nested random effect:
first<-lmer(logterrisize~spm + (1|studyarea/teriid),
data = Data_table_for_analysis_Character_studyarea,
control=lmerControl(optimizer="Nelder_Mead",
optCtrl=list(maxfun=1e4)))
I got this error message: Error in length(value <- as.numeric(value)) == 1L :
Downdated VtV is not positive definite
I tried this with bobyqa as the optimizer and got these warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.753065 (tol = 0.002, component 1)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model is nearly unidentifiable: very large eigenvalue - Rescale variables?
My summary looks like this:
Linear mixed model fit by REML ['lmerMod']
Formula: logterrisize ~ spm + (1 | studyarea/teriid)
   Data: Data_table_for_analysis_Character_studyarea
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 10000))
REML criterion at convergence: -6079.6
Scaled residuals:
Min 1Q Median 3Q Max
-3.639e-07 -4.962e-08 3.310e-09 5.293e-08 9.725e-07
Random effects:
Groups Name Variance Std.Dev.
teriid:studyarea (Intercept) 1.291e-01 3.593e-01
studyarea (Intercept) 1.944e-02 1.394e-01
Residual 4.506e-15 6.712e-08
Number of obs: 273, groups: teriid:studyarea, 66; studyarea, 22
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.480e+00 5.631e-02 26.28
spm -5.785e-16 8.507e-10 0.00
Correlation of Fixed Effects:
    (Intr)
spm 0.000
convergence code: 0
Model failed to converge with max|grad| = 0.753065 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue - Rescale variables?
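For context, the "Rescale variables?" part of the warning refers to centring and scaling predictors so their magnitudes are comparable; a minimal, self-contained sketch of what scale() does, using the first few spm values from my data (spm_sc is just an illustrative name):

```r
spm <- c(18.461538, 22.641509, 35.172414, 10.418006, 15.611285)  # first few spm values
spm_sc <- as.numeric(scale(spm))  # subtract the mean, divide by the sd
mean(spm_sc)  # numerically zero after centring
sd(spm_sc)    # exactly 1 after scaling
```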
My data looks like the following:
show(logterrisize)
 [1] 1.3317643 1.3317643 1.3317643 0.1295798 0.1295798 1.5051368 1.5051368 1.5051368 1.5051368
[10] 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368 1.5051368
[19] 1.5051368 1.5051368 1.5051368 1.4665993 1.4665993 1.4665993 1.8282328 1.8282328 1.9252934
[28] 1.9252934 1.9252934 2.3006582 2.3006582 2.5160920 2.7774040 2.7774040 3.3398623 3.3398623
[37] 3.4759297 1.2563594 1.6061204 1.6061204 1.7835139 1.7835139 2.1669498 2.1669498 2.1669498
[46] 2.1669498 0.7264997 0.7458155 0.8380524 0.8380524 0.8380524 0.8380524 0.8380524 0.8380524
show(spm)
 [1] 18.461538 22.641509 35.172414 10.418006 15.611285  3.482143  3.692308  4.483986  4.821429
[10]  6.000000  6.122449  6.176471  6.220736  6.260870  6.593407  7.010309  9.200000  9.473684
[19]  9.600000 12.600000 14.200000 16.146179 28.125000 30.099010 13.731343 14.432990 11.089109
[28] 17.960526 32.903226  8.955224 33.311688  8.800000 11.578947 20.000000 14.455446 18.181818
[37] 28.064516 25.684211 17.866667 23.142857 18.208955 20.536913 11.419355 11.593220 12.703583
[46] 20.000000  3.600000 11.320755  6.200000  6.575342 12.800000 19.109589 20.124224 22.941176
[55]  4.600000  6.600000  6.771160  8.000000 19.200000 19.400000 22.773723  3.333333  4.214047
studyarea contains character names, and teriid is a numeric ID for the study sites.
My data frame looks like this: (screenshot not reproduced here)
Did I forget to include anything in the equation when using a log-transformed variable?
Thanks!
EDIT:
I used ?convergence to check on convergence errors. I tried this:
## 3. recompute gradient and Hessian with Richardson extrapolation
devfun <- update(first, devFunOnly=TRUE)
if (isLMM(first)) {
pars <- getME(first,"theta")
} else {## GLMM: requires both random and fixed parameters
pars <- getME(first, c("theta","fixef"))
}
if (require("numDeriv")) {
cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
cat("scaled gradient:\n")
print(scgrad <- solve(chol(hess), grad))}
and got this answer:
hess:
[,1] [,2]
[1,] 147.59157 -14.37956
[2,] -14.37956 120.85329
grad:
[1] -222.1020 -108.1038
scaled gradient:
[1] -19.245584 -9.891077
Unfortunately I don't know what this output is telling me.
2nd EDIT:
I tried numerous optimizers, and while using this:
first<-lmer(logterrisize~spm + (1|studyarea/teriid),REML=FALSE,
data = Data_table_for_analysis_Character_studyarea,
control=lmerControl(optimizer="optimx",
optCtrl=list(method='nlminb')))
I only got one warning: In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), :
convergence code 1 from optimx
now my summary looks like this:
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: logterrisize ~ spm + (1 | studyarea/teriid)
Data: Data_table_for_analysis_Character_studyarea
Control: lmerControl(optimizer = "optimx", optCtrl = list(method ="nlminb"))
AIC BIC logLik deviance df.resid
-3772.4 -3754.3 1891.2 -3782.4 268
Scaled residuals:
Min 1Q Median 3Q Max
-1.523e-04 -1.693e-05 1.480e-06 1.436e-05 3.332e-04
Random effects:
Groups Name Variance Std.Dev.
teriid:studyarea (Intercept) 8.219e-02 0.2866882
studyarea (Intercept) 7.478e-02 0.2734675
Residual 3.843e-10 0.0000196
Number of obs: 273, groups: teriid:studyarea, 66; studyarea, 22
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.551e+00 7.189e-02 21.58
spm 3.210e-11 2.485e-07 0.00
Correlation of Fixed Effects:
    (Intr)
spm 0.000
convergence code: 1
So can I turn a blind eye to this warning message, or would that be a huge mistake?
tl;dr Every observation on a territory shares the same territory size, so your random effect of territory ID essentially explains everything, leaving no variation at all for either the spm fixed effect or the residual variance. Leaving the random effect of territory ID out of the model seems to give reasonable answers; a simulated data set replicates this pathology pretty well, but suggests that you'll end up with an underestimate of the spm effect ...
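A minimal illustration of that pathology, kept self-contained by using plain lm() with a group factor instead of lmer(): when every observation in a group shares the same response value, the group effect reproduces the data exactly and the residual standard deviation collapses to (numerically) zero, which is what the near-zero Residual rows in the summaries above reflect.

```r
set.seed(1)
d <- data.frame(g = factor(rep(1:10, each = 5)))  # 10 groups, 5 observations each
d$y <- rnorm(10)[d$g]                             # identical y within each group
fit <- lm(y ~ g, data = d)                        # group means fit y perfectly
summary(fit)$sigma                                # residual sd is essentially zero
```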
read data and plot
library(readxl)
library(dplyr)
dd <- (read_excel("lme4_terr_dataset.xlsx")
%>% rename(spm="scans per min",
studyarea="Study areaID",
teriid="TerritoryID",
terrsize="Territory_Size")
)
library(ggplot2); theme_set(theme_bw())
library(ggalt)
gg1 <- (ggplot(dd, aes(spm, terrsize, colour = studyarea))
  + geom_point()
  + geom_encircle(aes(group = teriid))
  + theme(legend.position = "none")
  + scale_y_log10()
)
print(gg1)
This plot, with its horizontal lines of values from the same territory ID, is what helped me diagnose the problem. Confirming that every territory ID has a single territory size for all observations:
tt <- with(dd,table(terrsize,teriid))
all(rowSums(tt>0)==1) ## TRUE
model fitting
library(lme4)
m1 <- lmer(log(terrsize) ~ spm + (1|studyarea/teriid), dd)
## replicate warnings
m2 <- lmer(log(terrsize) ~ spm + (1|studyarea), dd)
## no warnings
now simulate similar-looking data
set.seed(101)
## experimental design: rep within f2 (terr_id) within f1 (study area)
ddx <- expand.grid(studyarea=factor(letters[1:10]),
teriid=factor(1:4),rep=1:5)
## study-area, terr_id effects, and spm
b_studyarea <- rnorm(10)
b_teriid <- rnorm(40)
ddx <- within(ddx, {
int <- interaction(studyarea,teriid)
spm <- rlnorm(nrow(ddx), meanlog=1,sdlog=1)
})
## compute average spm per terr/id
## (because response will be identical across id)
spm_terr <- aggregate(spm~int, data=ddx, FUN=mean)[,"spm"]
ddx <- within(ddx, {
mu <- 1+0.2*spm_terr[int]+b_studyarea[studyarea] + b_teriid[int]
tsize <- rlnorm(length(levels(int)), meanlog=mu, sdlog=1)
terrsize <- tsize[int]
})
gg1 %+% ddx
fit simulated data
This gives similar behaviour to the real data:
lmer(log(terrsize) ~ spm + (1|studyarea/teriid), ddx)
We can avoid the warnings by dropping teriid:
m1 <- lmer(log(terrsize) ~ spm + (1|studyarea), ddx)
But the true effect of spm (0.2) will be underestimated (because of the ignored noise from teriid ...)
round(confint(m1, parm="beta_"),3)
## 2.5 % 97.5 %
## (Intercept) 1.045 2.026
## spm 0.000 0.070
aggregating
On the basis of this single simulation, it looks like aggregating to the level of the territory (as recommended e.g. by Murtaugh 2007, "Simplicity and complexity in ecological data analysis" Ecology) and weighting by number of samples per territory gives a reasonable estimate of the true spm effect ...
ddx_agg <- (ddx
%>% group_by(studyarea,terrsize,teriid)
%>% summarise(spm=mean(spm),
n=n())
)
library(nlme)
m3x <- lme(log(terrsize) ~ spm, random=~1|studyarea, data=ddx_agg,
weights=varFixed(~I(1/n)))
round(summary(m3x)$tTab,3)
Value Std.Error DF t-value p-value
(Intercept) 0.934 0.465 29 2.010 0.054
spm 0.177 0.095 29 1.863 0.073
I tried to use gogarchspec, gogarchfit and gogarchforecast in rmgarch yesterday, but noticed that no AIC value can be retrieved.
> fit
*------------------------------*
* GO-GARCH Fit *
*------------------------------*
Mean Model : VAR
(Lag) : 1
(Robust) : FALSE
GARCH Model : gjrGARCH
Distribution : mvnorm
ICA Method : fastica
No. Factors : 3
No. Periods : 1475
Log-Likelihood : NA
------------------------------------
U (rotation matrix) :
[,1] [,2] [,3]
[1,] -0.059 0.9973 0.0435
[2,] -0.594 -0.0701 0.8017
[3,] 0.803 0.0214 0.5962
A (mixing matrix) :
[,1] [,2] [,3]
[1,] 4.49e-05 -0.0258 0.000127
[2,] 1.80e-03 -0.0260 -0.004237
[3,] 4.19e-03 -0.0257 -0.000478
[4,] 6.78e-05 -0.0259 0.000116
Above is the fitted model, and below are the attributes of the model, but the AIC values seem impossible to retrieve.
> ## attributes of univariate stage 1
> attributes(attributes(fit)$mfit$ufit)
$`fit`
$`fit`[[1]]
*---------------------------------*
* GARCH Model Fit *
*---------------------------------*
Conditional Variance Dynamics
-----------------------------------
GARCH Model : gjrGARCH(1,1)
Mean Model : ARFIMA(0,0,0)
Distribution : norm
Optimal Parameters
------------------------------------
Estimate Std. Error t value Pr(>|t|)
omega 0.005346 0.002855 1.8723 0.061167
alpha1 0.057142 0.012491 4.5746 0.000005
beta1 0.955136 0.006263 152.4948 0.000000
gamma1 -0.026556 0.012794 -2.0756 0.037931
Robust Standard Errors:
Estimate Std. Error t value Pr(>|t|)
omega 0.005346 0.004330 1.2344 0.217043
alpha1 0.057142 0.014525 3.9340 0.000084
beta1 0.955136 0.004851 196.9049 0.000000
gamma1 -0.026556 0.016780 -1.5826 0.113509
LogLikelihood : -2016.878
Information Criteria
------------------------------------
Akaike 2.7402
Bayes 2.7545
Shibata 2.7402
Hannan-Quinn 2.7455
Weighted Ljung-Box Test on Standardized Residuals
------------------------------------
statistic p-value
Lag[1] 0.003505 0.952788
Lag[2*(p+q)+(p+q)-1][2] 6.443923 0.016755
Lag[4*(p+q)+(p+q)-1][5] 12.082773 0.002649
d.o.f=0
H0 : No serial correlation
Weighted Ljung-Box Test on Standardized Squared Residuals
------------------------------------
statistic p-value
Lag[1] 2.143 0.1432
Lag[2*(p+q)+(p+q)-1][5] 3.096 0.3898
Lag[4*(p+q)+(p+q)-1][9] 3.467 0.6801
d.o.f=2
Weighted ARCH LM Tests
------------------------------------
Statistic Shape Scale P-Value
ARCH Lag[3] 1.363 0.500 2.000 0.2430
ARCH Lag[5] 1.384 1.440 1.667 0.6233
ARCH Lag[7] 1.470 2.315 1.543 0.8274
Nyblom stability test
------------------------------------
Joint Statistic: 1.0821
Individual Statistics:
omega 0.07099
alpha1 0.10828
beta1 0.11995
gamma1 0.09015
Asymptotic Critical Values (10% 5% 1%)
Joint Statistic: 1.07 1.24 1.6
Individual Statistic: 0.35 0.47 0.75
Sign Bias Test
------------------------------------
t-value prob sig
Sign Bias 0.2798 0.7796
Negative Sign Bias 1.0104 0.3125
Positive Sign Bias 0.3739 0.7086
Joint Effect 1.9887 0.5748
Adjusted Pearson Goodness-of-Fit Test:
------------------------------------
group statistic p-value(g-1)
1 20 292.2 8.015e-51
2 30 322.3 3.064e-51
3 40 315.2 6.495e-45
4 50 345.2 3.814e-46
Elapsed time : 0.6845639
$`fit`[[2]]
*---------------------------------*
* GARCH Model Fit *
*---------------------------------*
Conditional Variance Dynamics
-----------------------------------
GARCH Model : gjrGARCH(1,1)
Mean Model : ARFIMA(0,0,0)
Distribution : norm
Optimal Parameters
------------------------------------
Estimate Std. Error t value Pr(>|t|)
omega 0.002153 0.000008 261.73 0
alpha1 0.102055 0.000012 8831.33 0
beta1 0.884322 0.002903 304.64 0
gamma1 -0.102709 0.000139 -740.41 0
Robust Standard Errors:
Estimate Std. Error t value Pr(>|t|)
omega 0.002153 0.000282 7.6367 0
alpha1 0.102055 0.000145 704.3463 0
beta1 0.884322 0.046124 19.1728 0
gamma1 -0.102709 0.006434 -15.9638 0
LogLikelihood : -215.4113
Information Criteria
------------------------------------
Akaike 0.29751
Bayes 0.31187
Shibata 0.29749
Hannan-Quinn 0.30286
Weighted Ljung-Box Test on Standardized Residuals
------------------------------------
statistic p-value
Lag[1] 3.416 0.06456
Lag[2*(p+q)+(p+q)-1][2] 3.526 0.10118
Lag[4*(p+q)+(p+q)-1][5] 3.744 0.28773
d.o.f=0
H0 : No serial correlation
Weighted Ljung-Box Test on Standardized Squared Residuals
------------------------------------
statistic p-value
Lag[1] 0.0001372 0.9907
Lag[2*(p+q)+(p+q)-1][5] 0.0008338 1.0000
Lag[4*(p+q)+(p+q)-1][9] 0.0035321 1.0000
d.o.f=2
Weighted ARCH LM Tests
------------------------------------
Statistic Shape Scale P-Value
ARCH Lag[3] 0.0001646 0.500 2.000 0.9898
ARCH Lag[5] 0.0004930 1.440 1.667 1.0000
ARCH Lag[7] 0.0032936 2.315 1.543 1.0000
Nyblom stability test
------------------------------------
Joint Statistic: 3.7551
Individual Statistics:
omega 0.3142
alpha1 1.6889
beta1 0.2903
gamma1 1.7023
Asymptotic Critical Values (10% 5% 1%)
Joint Statistic: 1.07 1.24 1.6
Individual Statistic: 0.35 0.47 0.75
Sign Bias Test
------------------------------------
t-value prob sig
Sign Bias 0.16496 0.8690
Negative Sign Bias 0.19577 0.8448
Positive Sign Bias 0.17061 0.8646
Joint Effect 0.07736 0.9944
Adjusted Pearson Goodness-of-Fit Test:
------------------------------------
group statistic p-value(g-1)
1 20 66.98 2.901e-07
2 30 89.45 4.416e-08
3 40 84.46 3.381e-05
4 50 108.76 2.002e-06
Elapsed time : 2.061266
$`fit`[[3]]
*---------------------------------*
* GARCH Model Fit *
*---------------------------------*
Conditional Variance Dynamics
-----------------------------------
GARCH Model : gjrGARCH(1,1)
Mean Model : ARFIMA(0,0,0)
Distribution : norm
Optimal Parameters
------------------------------------
Estimate Std. Error t value Pr(>|t|)
omega 0.002290 0.001686 1.35806 0.174445
alpha1 0.033963 0.007582 4.47926 0.000007
beta1 0.966294 0.003796 254.58599 0.000000
gamma1 -0.002514 0.007508 -0.33489 0.737707
Robust Standard Errors:
Estimate Std. Error t value Pr(>|t|)
omega 0.002290 0.002543 0.90055 0.367828
alpha1 0.033963 0.007854 4.32413 0.000015
beta1 0.966294 0.004485 215.46707 0.000000
gamma1 -0.002514 0.009729 -0.25844 0.796069
LogLikelihood : -1976.537
Information Criteria
------------------------------------
Akaike 2.6855
Bayes 2.6998
Shibata 2.6855
Hannan-Quinn 2.6908
Weighted Ljung-Box Test on Standardized Residuals
------------------------------------
statistic p-value
Lag[1] 4.856e-04 9.824e-01
Lag[2*(p+q)+(p+q)-1][2] 1.243e+01 4.364e-04
Lag[4*(p+q)+(p+q)-1][5] 3.929e+01 7.361e-11
d.o.f=0
H0 : No serial correlation
Weighted Ljung-Box Test on Standardized Squared Residuals
------------------------------------
statistic p-value
Lag[1] 0.5075 0.4762
Lag[2*(p+q)+(p+q)-1][5] 1.6769 0.6964
Lag[4*(p+q)+(p+q)-1][9] 4.2823 0.5417
d.o.f=2
Weighted ARCH LM Tests
------------------------------------
Statistic Shape Scale P-Value
ARCH Lag[3] 0.5119 0.500 2.000 0.4743
ARCH Lag[5] 2.0544 1.440 1.667 0.4593
ARCH Lag[7] 3.8795 2.315 1.543 0.3643
Nyblom stability test
------------------------------------
Joint Statistic: 1.5558
Individual Statistics:
omega 0.28550
alpha1 0.12862
beta1 0.18188
gamma1 0.09135
Asymptotic Critical Values (10% 5% 1%)
Joint Statistic: 1.07 1.24 1.6
Individual Statistic: 0.35 0.47 0.75
Sign Bias Test
------------------------------------
t-value prob sig
Sign Bias 1.15840 0.24689
Negative Sign Bias 1.72872 0.08407 *
Positive Sign Bias 0.03879 0.96906
Joint Effect 3.23634 0.35660
Adjusted Pearson Goodness-of-Fit Test:
------------------------------------
group statistic p-value(g-1)
1 20 317.7 4.733e-56
2 30 345.8 6.180e-56
3 40 351.4 6.851e-52
4 50 339.5 4.331e-45
Elapsed time : 0.7231081
$desc
$desc$`type`
[1] "equal"
$class
[1] "uGARCHmultifit"
attr(,"package")
[1] "rugarch"
There is nothing like Information Criteria after log.likelihoods when I check the names list; the same goes for using str() to inspect everything available in the list.
> ## attributes of univariate stage 2
> names(attributes(attributes(attributes(fit)$mfit$ufit)[[1]][[1]])$fit)
[1] "hessian" "cvar" "var" "sigma"
[5] "condH" "z" "LLH" "log.likelihoods"
[9] "residuals" "coef" "robust.cvar" "A"
[13] "B" "scores" "se.coef" "tval"
[17] "matcoef" "robust.se.coef" "robust.tval" "robust.matcoef"
[21] "fitted.values" "convergence" "kappa" "persistence"
[25] "timer" "ipars" "solver"
> names(attributes(attributes(attributes(fit)$mfit$ufit)[[1]][[2]])$fit)
[1] "hessian" "cvar" "var" "sigma"
[5] "condH" "z" "LLH" "log.likelihoods"
[9] "residuals" "coef" "robust.cvar" "A"
[13] "B" "scores" "se.coef" "tval"
[17] "matcoef" "robust.se.coef" "robust.tval" "robust.matcoef"
[21] "fitted.values" "convergence" "kappa" "persistence"
[25] "timer" "ipars" "solver"
> names(attributes(attributes(attributes(fit)$mfit$ufit)[[1]][[3]])$fit)
[1] "hessian" "cvar" "var" "sigma"
[5] "condH" "z" "LLH" "log.likelihoods"
[9] "residuals" "coef" "robust.cvar" "A"
[13] "B" "scores" "se.coef" "tval"
[17] "matcoef" "robust.se.coef" "robust.tval" "robust.matcoef"
[21] "fitted.values" "convergence" "kappa" "persistence"
[25] "timer" "ipars" "solver"
You don't provide a reproducible example, so this code is untested but might provide a solution:
rugarch:::.information.test(likelihood(fit@mfit$ufit),
                            nObs = nrow(fitted(fit@mfit$ufit)),
                            nPars = 4)$AIC
I dug into the source code, and the following will give you what you need:
object <- fit
# number of VAR mean-model parameters (0 if no VAR component)
if (object@model$modelinc[1] > 0) {
  npvar <- dim(object@model$varcoef)[1] * dim(object@model$varcoef)[2]
} else {
  npvar <- 0
}
m <- dim(object@model$modeldata$data)[2]  # number of series
T <- object@model$modeldata$T             # number of observations
itest <- rugarch:::.information.test(object@mfit$llh, nObs = T,
                                     nPars = npvar + (m^2 - m)/2 +
                                       length(object@mfit$matcoef[, 1]))
itest$AIC
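As a sanity check on what `.information.test` computes: rugarch reports information criteria in per-observation form, AIC = (-2 logL + 2k)/n. Plugging in the first univariate fit printed above (logL = -2016.878, k = 4 estimated parameters, n = 1475 periods) reproduces its printed Akaike value:

```r
llh <- -2016.878  # LogLikelihood of the first univariate fit above
k   <- 4          # estimated parameters: omega, alpha1, beta1, gamma1
n   <- 1475       # No. Periods
aic <- (-2 * llh + 2 * k) / n
round(aic, 4)     # 2.7402, matching the printed Akaike criterion
```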
I tried to create a mixed-effects logistic regression model using the glmer() function; however, the model does not converge. First, I converted the categorical variables from character vectors to factors.
schwa_completed_2$Outcome <- as.factor(schwa_completed_2$Outcome)
schwa_completed_2$frequency_grouped <- as.factor(schwa_completed_2$frequency_grouped)
schwa_completed_2$sonority_grouped <- as.factor(schwa_completed_2$sonority_grouped)
schwa_completed_2$participant_gender <- as.factor(schwa_completed_2$participant_gender)
schwa_completed_2$participant_age_group <- as.factor(schwa_completed_2$participant_age_group)
schwa_completed_2$Speaker <- as.factor(schwa_completed_2$Speaker)
There is also one more continuous variable. Then I created a model:
model <- glmer(Outcome ~ frequency_grouped + sonority_grouped + syl_sec_EN +
participant_gender + participant_age_group + 1|Speaker,
data = schwa_completed_2, family = binomial, optimizer = "bobyqa")
Unfortunately, the model does not converge. If I get rid of the "Speaker" effect, the model works just fine; however, the results are probably skewed.
Warning messages:
1: In commonArgs(par, fn, control, environment()) :
maxfun < 10 * length(par)^2 is not recommended.
2: In optwrap(optimizer, devfun, start, rho$lower, control = control, :
convergence code 1 from bobyqa: bobyqa -- maximum number of function
evaluations exceeded
3: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
failure to converge in 10000 evaluations
4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0785481 (tol = 0.001, component 1)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Outcome ~ frequency_grouped + sonority_grouped + syl_sec_EN +
participant_gender + participant_age_group + 1 | Speaker
Data: schwa_completed_2
AIC BIC logLik deviance df.resid
1820.8 2066.1 -864.4 1728.8 1486
Scaled residuals:
Min 1Q Median 3Q Max
-2.5957 -0.6255 -0.3987 0.7714 3.4432
Random effects:
Groups Name Variance Std.Dev. Corr
Speaker (Intercept) 2.08476 1.4439
frequency_groupedmoderately_frequent 0.78914 0.8883 -0.15
frequency_groupedvery_frequent 3.07514 1.7536 -0.90 0.35
sonority_groupedsonorants 1.33795 1.1567 0.82 -0.44 -0.91
sonority_groupedstops 1.76849 1.3298 0.02 -0.42 -0.36 0.51
sonority_groupedvowels 2.97690 1.7254 0.23 0.02 -0.32 0.55 0.77
syl_sec_EN 0.03217 0.1794 -0.62 -0.42 0.32 -0.44 0.11 -0.52
participant_genderM 0.41458 0.6439 -0.86 -0.18 0.77 -0.77 -0.24 -0.62 0.82
participant_age_groupY 0.52428 0.7241 0.46 0.80 -0.20 0.06 -0.44 0.08 -0.73 -0.63
Number of obs: 1532, groups: Speaker, 40
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.7650 0.1862 -4.108 3.99e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
convergence code: 0
Model failed to converge with max|grad| = 0.0785481 (tol = 0.001, component 1)
failure to converge in 10000 evaluations
Is it because the model is too complicated, or is my laptop not powerful enough? I don't know what I should do at this point. Is there anything I can do to fix this?
OK, so what helped me was grouping the speakers with group_by() and then scaling the syl_sec_EN variable.
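One additional observation (not part of the original self-answer): in R formulas `|` binds less tightly than `+`, so without parentheses the entire fixed part is swallowed into the random-effects term. That is why the summary above shows random slopes for every predictor but only an intercept among the fixed effects. The parsing can be checked directly; the intended call would wrap the random term in parentheses and pass the optimizer through glmerControl (the commented model is an untested sketch requiring the original data):

```r
## `|` has lower precedence than `+`: the whole sum becomes the LHS of `|`
f <- Outcome ~ frequency_grouped + syl_sec_EN + 1 | Speaker
as.character(f[[3]][[1]])  # "|" is the top-level operator of the RHS

## intended model (sketch; requires schwa_completed_2 and lme4):
## model <- glmer(Outcome ~ frequency_grouped + sonority_grouped + syl_sec_EN +
##                  participant_gender + participant_age_group + (1 | Speaker),
##                data = schwa_completed_2, family = binomial,
##                control = glmerControl(optimizer = "bobyqa"))
```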
I have a data frame with 5 variables: Lot / Wafer / Serial Number / Voltage / Amplification. In this data frame there are 1020 subsets grouped by Serial_number. Each subset has a certain number of measurement data points (Amplification against voltage).
I fit the data with
summary(fit2.lme <- lmer(log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) | Serial_number),
+ data = APD))
which yields:
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) | Serial_number)
Data: APD
REML criterion at convergence: 35286.1
Scaled residuals:
Min 1Q Median 3Q Max
-20.7724 -0.2438 -0.1297 0.2434 13.2663
Random effects:
Groups Name Variance Std.Dev. Corr
Serial_number (Intercept) 1.439e-02 0.1199
poly(Voltage, 1) 2.042e+03 45.1908 -0.76
Residual 8.701e-02 0.2950
Number of obs: 76219, groups: Serial_number, 1020
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.944e-02 3.914e-03 15.2
poly(Voltage, 3)1 5.862e+02 1.449e+00 404.5
poly(Voltage, 3)2 -1.714e+02 3.086e-01 -555.4
poly(Voltage, 3)3 4.627e+01 3.067e-01 150.8
Correlation of Fixed Effects:
(Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.713
ply(Vlt,3)2 0.015 -0.004
ply(Vlt,3)3 0.004 0.012 0.018
and when I add a higher polynomial in the random effects I get a warning:
> summary(fit3.lme <- lmer(log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number),
+ data = APD))
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number)
Data: APD
REML criterion at convergence: 16285.9
Scaled residuals:
Min 1Q Median 3Q Max
-20.5042 -0.2393 -0.0697 0.3165 13.9634
Random effects:
Groups Name Variance Std.Dev. Corr
Serial_number (Intercept) 1.584e-02 0.1259
poly(Voltage, 2)1 1.777e+03 42.1536 -0.67
poly(Voltage, 2)2 1.579e+03 39.7365 0.87 -0.95
Residual 6.679e-02 0.2584
Number of obs: 76219, groups: Serial_number, 1020
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.858e-02 4.062e-03 14.4
poly(Voltage, 3)1 5.938e+02 1.351e+00 439.5
poly(Voltage, 3)2 -1.744e+02 1.276e+00 -136.7
poly(Voltage, 3)3 5.719e+01 2.842e-01 201.2
Correlation of Fixed Effects:
(Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.641
ply(Vlt,3)2 0.825 -0.906
ply(Vlt,3)3 -0.001 0.030 -0.004
convergence code: 1
Model failed to converge with max|grad| = 2.22294 (tol = 0.002, component 1)
Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
Warning messages:
1: In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), :
convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 2.22294 (tol = 0.002, component 1)
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
The data is as follows (I can provide the complete data, of course, if desired). It includes 77479 observations of 6 variables:
'data.frame': 77479 obs. of 6 variables:
$ Serial_number: num 9.12e+08 9.12e+08 9.12e+08 9.12e+08 9.12e+08 ...
$ Lot : int 9 9 9 9 9 9 9 9 9 9 ...
$ Wafer : int 912 912 912 912 912 912 912 912 912 912 ...
$ Amplification: num 1 1 1.01 1.01 1.01 ...
$ Voltage : num 25 30 34.9 44.9 49.9 ...
and the data itself looks like:
Serial_number Lot Wafer Amplification Voltage
1 912009913 9 912 1.00252 24.9681
2 912009913 9 912 1.00452 29.9591
3 912009913 9 912 1.00537 34.9494
(...)
73 912009913 9 912 918.112 375.9850
74 912009913 9 912 1083.74 377.9990
75 912009897 9 912 1.00324 19.9895
76 912009897 9 912 1.00449 29.9777
(...)
What do these warnings mean?
According to the anova the fit3.lme model describes the data better:
> anova(fit3.lme, fit2.lme)
refitting model(s) with ML (instead of REML)
Data: APD
Models:
fit2.lme: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) |
fit2.lme: Serial_number)
fit3.lme: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) |
fit3.lme: Serial_number)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
fit2.lme 8 35294 35368 -17638.9 35278
fit3.lme 11 16264 16366 -8121.1 16242 19036 3 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In optwrap(optimizer, devfun, x@theta, lower = x@lower, calc.derivs = TRUE, :
convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
Therefore I would like to use that model, but I am stuck on the warning.
update:
center and scale predictors
ss.CS<- transform(APD, Voltage=scale(Voltage))
> fit31.lme<- update(fit3.lme, data=ss.CS)
Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
'degree' must be less than number of unique points
I also tried this for the other variable (I don't know for which one it makes sense):
> ss.CS<- transform(APD, Amplitude=scale(Amplitude))
Error in scale(Amplitude) : object 'Amplitude' not found
> ss.CS<- transform(APD, Amplification=scale(Amplification))
> fit31.lme<- update(fit3.lme, data=ss.CS)
Warning messages:
1: In log(Amplification) : NaNs produced
2: In log(log(Amplification)) : NaNs produced
3: In log(Amplification) : NaNs produced
4: In log(log(Amplification)) : NaNs produced
5: In log(Amplification) : NaNs produced
6: In log(log(Amplification)) : NaNs produced
7: Some predictor variables are on very different scales: consider rescaling
check singularity
> diag.vals<- getME(fit3.lme, "theta")[getME(fit3.lme, "lower")==0]
> any(diag.vals<- 1e-6)
[1] TRUE
Warning message:
In any(diag.vals <- 1e-06) : coercing argument of type 'double' to logical
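(An aside on the coercion warning just above: `<-` was typed where the comparison `<` was intended, so `any()` received the assigned numeric values rather than a logical vector. A self-contained version of the intended check, using illustrative theta values of a similar magnitude to those in the fit:)

```r
diag.vals <- c(0.52, 114.62, 15.78)  # illustrative theta diagonal entries
any(diag.vals < 1e-6)                # FALSE: no variance component has collapsed
```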
compute gradient and Hessian with Richardson extrapolation
> devfun<- update(fit3.lme, devFunOnly=TRUE)
> if(isLMM(fit3.lme)){
+ pars<- getME(fit3.lme, "theta")
+ } else {
+ pars<- getME(fit3.lme, c("theta", "fixef"))
+ }
> if (require("numDeriv")) {
+ cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
+ cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
+ cat("scaled gradient:\n")
+ print(scgrad <- solve(chol(hess), grad))
+ }
Loading required package: numDeriv
hess:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 39137.840764 -56.189442277 -1.348127e+02 3.789427141 25.456612941 -3.806942811
[2,] -56.189442 0.508077776 6.283795e-01 -0.068882737 -0.056159369 0.003228274
[3,] -134.812704 0.628379462 1.061584e+00 -0.079620905 -0.152816413 0.007457255
[4,] 3.789427 -0.068882737 -7.962090e-02 0.516054976 0.534346634 0.001513168
[5,] 25.456613 -0.056159369 -1.528164e-01 0.534346634 0.901191745 -0.002344407
[6,] -3.806943 0.003228274 7.457255e-03 0.001513168 -0.002344407 0.179283416
grad:
[1] -22.9114985 2.2229416 -0.2959238 0.6790044 -0.2343368 -0.4020556
scaled gradient:
[1] -0.1123624 4.4764140 -0.8777938 1.3980054 -0.4223921 -0.9508207
> fit3.lme@optinfo$derivs
$gradient
[1] -22.9118920 2.2229424 -0.2959264 0.6790037 -0.2343360 -0.4020605
$Hessian
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 39137.915527 -56.20745850 -134.87176514 3.74780273 25.47540283 -3.79016113
[2,] -56.207458 0.44262695 0.61462402 -0.04736328 -0.06585693 0.02130127
[3,] -134.871765 0.61462402 1.04296875 -0.10467529 -0.23223877 0.05438232
[4,] 3.747803 -0.04736328 -0.10467529 0.52026367 0.50909424 -0.02130127
[5,] 25.475403 -0.06585693 -0.23223877 0.50909424 0.68481445 -0.02044678
[6,] -3.790161 0.02130127 0.05438232 -0.02130127 -0.02044678 0.07617188
4. restart the fit from the original value (or a slightly perturbed value):
> fit3.lme.restart <- update(fit3.lme, start=pars)
> summary(fit3.lme.restart)
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number)
Data: APD
REML criterion at convergence: 16250.3
Scaled residuals:
Min 1Q Median 3Q Max
-20.4868 -0.2404 -0.0697 0.3166 13.9464
Random effects:
Groups Name Variance Std.Dev. Corr
Serial_number (Intercept) 1.823e-02 0.1350
poly(Voltage, 2)1 2.124e+03 46.0903 -0.77
poly(Voltage, 2)2 1.937e+03 44.0164 0.90 -0.96
Residual 6.668e-02 0.2582
Number of obs: 76219, groups: Serial_number, 1020
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.05823 0.00434 13.4
poly(Voltage, 3)1 593.83396 1.47201 403.4
poly(Voltage, 3)2 -174.61257 1.40711 -124.1
poly(Voltage, 3)3 57.15901 0.28427 201.1
Correlation of Fixed Effects:
(Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.735
ply(Vlt,3)2 0.868 -0.927
ply(Vlt,3)3 -0.001 0.028 -0.003
5. try all available optimizers
> source(system.file("utils", "allFit.R", package="lme4"))
Loading required package: optimx
Loading required package: dfoptim
> fit3.lme.all <- allFit(fit3.lme)
bobyqa : [OK]
Nelder_Mead : [OK]
nlminbw : [OK]
nmkbw : [OK]
optimx.L-BFGS-B : [OK]
nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
nloptwrap.NLOPT_LN_BOBYQA : [OK]
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
5: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
6: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
7: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
8: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
ss <- summary(fit3.lme.all)
> ss$ fixef ## extract fixed effects
(Intercept) poly(Voltage, 3)1 poly(Voltage, 3)2 poly(Voltage, 3)3
bobyqa 0.05822789 593.8340 -174.6126 57.15901
Nelder_Mead 0.05822787 593.8340 -174.6126 57.15902
nlminbw 0.05822787 593.8340 -174.6126 57.15902
nmkbw 0.05841966 593.7804 -174.4999 57.17107
optimx.L-BFGS-B 0.05822845 593.8336 -174.6116 57.16183
nloptwrap.NLOPT_LN_NELDERMEAD 0.05823870 593.8330 -174.6076 57.16039
nloptwrap.NLOPT_LN_BOBYQA 0.05823870 593.8330 -174.6076 57.16039
> ss$ llik ## log-likelihoods
bobyqa Nelder_Mead nlminbw nmkbw optimx.L-BFGS-B
-8125.125 -8125.125 -8125.125 -8129.827 -8125.204
nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
-8125.137
> ss$ sdcor ## SDs and correlations
Serial_number.(Intercept) Serial_number.poly(Voltage, 2)1.(Intercept) Serial_number.poly(Voltage, 2)2.(Intercept)
bobyqa 0.1350049 46.09013 44.01631
Nelder_Mead 0.1350064 46.09104 44.01705
nlminbw 0.1350065 46.09106 44.01707
nmkbw 0.1347214 46.11336 43.81219
optimx.L-BFGS-B 0.1356576 46.32849 44.27171
nloptwrap.NLOPT_LN_NELDERMEAD 0.1347638 45.97995 43.91054
nloptwrap.NLOPT_LN_BOBYQA 0.1347638 45.97995 43.91054
Serial_number.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2
bobyqa -0.7665898 0.9042387 -0.9608662
Nelder_Mead -0.7665981 0.9042424 -0.9608680
nlminbw -0.7665980 0.9042425 -0.9608680
nmkbw -0.7658163 0.9076551 -0.9649999
optimx.L-BFGS-B -0.7713801 0.9067725 -0.9617129
nloptwrap.NLOPT_LN_NELDERMEAD -0.7645748 0.9034336 -0.9606020
nloptwrap.NLOPT_LN_BOBYQA -0.7645748 0.9034336 -0.9606020
sigma
bobyqa 0.2582156
Nelder_Mead 0.2582156
nlminbw 0.2582156
nmkbw 0.2584714
optimx.L-BFGS-B 0.2582244
nloptwrap.NLOPT_LN_NELDERMEAD 0.2582207
nloptwrap.NLOPT_LN_BOBYQA 0.2582207
> ss$ theta ## Cholesky factors
Serial_number.(Intercept) Serial_number.poly(Voltage, 2)1.(Intercept) Serial_number.poly(Voltage, 2)2.(Intercept)
bobyqa 0.5228377 -136.8323 154.1396
Nelder_Mead 0.5228438 -136.8364 154.1428
nlminbw 0.5228439 -136.8365 154.1429
nmkbw 0.5212237 -136.6278 153.8521
optimx.L-BFGS-B 0.5253478 -138.3947 155.4631
nloptwrap.NLOPT_LN_NELDERMEAD 0.5218936 -136.1436 153.6293
nloptwrap.NLOPT_LN_BOBYQA 0.5218936 -136.1436 153.6293
Serial_number.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2
bobyqa 114.6181 -71.06063 1.578418e+01
Nelder_Mead 114.6186 -71.06067 1.578354e+01
nlminbw 114.6187 -71.06067 1.578351e+01
nmkbw 114.7270 -71.14411 3.440466e-42
optimx.L-BFGS-B 114.1731 -70.65227 1.527854e+01
nloptwrap.NLOPT_LN_NELDERMEAD 114.7688 -71.19817 1.568481e+01
nloptwrap.NLOPT_LN_BOBYQA 114.7688 -71.19817 1.568481e+01
> ss$ which.OK ## which fits worked
bobyqa Nelder_Mead nlminbw nmkbw optimx.L-BFGS-B
TRUE TRUE TRUE TRUE TRUE
nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
TRUE TRUE
Following a user's comment, I add the following:
> bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE)
Family: gaussian
Link function: identity
Formula:
log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs = "re") +
s(Voltage, Serial_number, bs = "re")
Estimated degrees of freedom:
9 993 987 total = 1990.18
fREML score: -226.8182
> summary(bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE))
Family: gaussian
Link function: identity
Formula:
log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs = "re") +
s(Voltage, Serial_number, bs = "re")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.11500 0.01896 6.066 1.31e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(Voltage) 8.998 9 89229 <2e-16 ***
s(Serial_number) 993.441 1019 55241 <2e-16 ***
s(Voltage,Serial_number) 986.741 1019 36278 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.989 Deviance explained = 99%
fREML = -226.82 Scale est. = 0.051396 n = 76219
The R script and data can be downloaded from https://uploadfiles.io/n7h9z.
Update
(some plots concerning the GAM model):
Here are all measurement data points transformed double-logarithmically:
According to a book I found, the physical behaviour of the device is at least exponential, and almost double-exponential. After the double-log transform the data therefore behave almost linearly. A lower-degree polynomial already described the data well, but a polynomial of degree three did better; I think the plot also shows why.
Some additional plots (I'm not used to GAMs so I just add them):
The convergence warnings disappeared when I removed all data points < 2; I stumbled on this by coincidence.
This is probably connected to the fact that, for each subset within the range from 0 to about 50, all data points are almost exactly the same (with values of about 1).
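For reference, a minimal sketch of that filtering step, assuming the threshold applies to the Amplification column of the APD data frame used in the bam() call above:

```r
# Drop the near-constant points below 2; these appear to cause the
# degenerate (almost zero-variance) fits described above.
APD_sub <- subset(APD, Amplification >= 2)
```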
I have some convergence concerns for one of my models; the data are in the attached file: https://mon-partage.fr/f/06LTiBGt/
To explain the data a little: the question is whether the formation of the pre-nymphs is affected by the different modalities. The obs column corresponds to formed/unformed nymphs. The day variables are very important in the model, and I would like to use at least the hive variable as a random effect.
I tried adding the bobyqa optimizer via the control argument and followed the advice in ?convergence, but the convergence warning persists, even though all optimizers converge.
Can I consider it a false positive?
Thank you in advance.
library("lme4", lib.loc="~/R/win-library/3.3")
> glmpn <- glmer(Obs ~ moda*jour + (1|ruch) + (1|code_test), data=dataall_pn, family=binomial(logit), control=glmerControl(optimizer="bobyqa"))
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0054486 (tol = 0.001, component 1)
> summary(glmpn)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Obs ~ moda * jour + (1 | ruch) + (1 | code_test)
Data: dataall_pn
Control: glmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
22477.2 22651.2 -11216.6 22433.2 20072
Scaled residuals:
Min 1Q Median 3Q Max
-8.4511 -0.9370 0.4435 0.6890 1.5559
Random effects:
Groups Name Variance Std.Dev.
ruch (Intercept) 0.2811 0.5302
code_test (Intercept) 0.2475 0.4975
Number of obs: 20094, groups: ruch, 7; code_test, 5
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.47806 0.46287 -7.514 5.73e-14 ***
modaA 0.46866 0.50489 0.928 0.353281
modaL -2.77363 0.75599 -3.669 0.000244 ***
modaLA 2.04869 0.52218 3.923 8.73e-05 ***
modaP 2.19098 0.48984 4.473 7.72e-06 ***
modaB 1.75376 0.49874 3.516 0.000437 ***
modaBP 2.27875 0.52120 4.372 1.23e-05 ***
modaPL 2.01771 0.48696 4.143 3.42e-05 ***
modaBL 1.06337 0.48795 2.179 0.029312 *
modaBLP 1.93218 0.51939 3.720 0.000199 ***
jour 0.41973 0.02981 14.079 < 2e-16 ***
modaA:jour -0.06559 0.04188 -1.566 0.117369
modaL:jour 0.19876 0.06491 3.062 0.002198 **
modaLA:jour -0.22419 0.04267 -5.254 1.49e-07 ***
modaP:jour -0.25363 0.04012 -6.322 2.58e-10 ***
modaB:jour -0.19555 0.04097 -4.773 1.82e-06 ***
modaBP:jour -0.23478 0.04262 -5.509 3.61e-08 ***
modaPL:jour -0.24454 0.03988 -6.131 8.71e-10 ***
modaBL:jour -0.16590 0.04003 -4.145 3.40e-05 ***
modaBLP:jour -0.21726 0.04245 -5.119 3.08e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation matrix not shown by default, as p = 20 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
convergence code: 0
Model failed to converge with max|grad| = 0.0054486 (tol = 0.001, component 1)
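Following the advice in ?convergence, one way to judge whether such a warning is a false positive is to refit the model with all available optimizers and check that they agree; a minimal sketch, using the glmpn fit from above:

```r
library(lme4)

# Refit glmpn with every optimizer lme4 supports; if all fits that
# worked return essentially identical fixed effects and log-likelihoods,
# the warning is likely a false positive.
glmpn.all <- allFit(glmpn)
ss <- summary(glmpn.all)
ss$which.OK  # which optimizers succeeded
ss$fixef     # fixed-effect estimates, one row per optimizer
ss$llik      # log-likelihoods should agree closely
```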