"Vector memory exhausted" w/ multiple imputation + multilevel model + robust standard errors - r

I am attempting to run a multilevel model with robust standard errors on multiple-imputation output from the mice package. I can't provide the data, unfortunately, but here is the code I'm running:
imputed <- mice(impvars, maxit = 5)
barg_imp <- complete(imputed, action = "long", include = TRUE)
fitimp <- with(barg_imp,
               rlmer(success_new ~ conflict_imp + positiondistance_coun +
                       positiondistance_ep + positiondistance_com +
                       population_log + salience_rel + (1 | prnrnmc)))
Note that I'm using rlmer() from the robustlmm package. It gives me the error "vector memory exhausted (limit reached?)". I've tried some of the solutions from other threads on this error message, but none have worked.
I'm able to run this fine when using lmer(). Any suggestions on another way to get robust standard errors here (coeftest() doesn't work), on running this code more efficiently, or on remedying this error? Thank you!
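One possible workaround (a hedged sketch, not from the original thread): since lmer() runs fine, fit it to each completed dataset separately and compute cluster-robust (CR2) standard errors with the clubSandwich package, then pool. Variable names are taken from the question.
library(mice)
library(lme4)
library(clubSandwich)
imputed <- mice(impvars, maxit = 5)
# Fit lmer() to each of the m completed datasets
fits <- lapply(seq_len(imputed$m), function(i) {
  lmer(success_new ~ conflict_imp + positiondistance_coun +
         positiondistance_ep + positiondistance_com +
         population_log + salience_rel + (1 | prnrnmc),
       data = complete(imputed, action = i))
})
# Cluster-robust (CR2) tests, clustered on the random-effect grouping factor
robust_tests <- lapply(fits, function(m) coef_test(m, vcov = "CR2"))
robust_tests[[1]]
The per-imputation estimates and robust variances would still need pooling by Rubin's rules, e.g. with mice::pool.scalar().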

Related

MASS package glm.nb() "Error: no valid set of coefficients has been found: please supply starting values"

I am running a negative binomial regression on my dataset using the glm.nb() function from the MASS package.
My model looks something like this:
m_nb <- glm.nb(Error_Count ~ TotalWL + Auto_frac + PHONE + JUSTIF_weight +
                 MESSAGE_OTHER_count + Hour + I(Auto_frac^2) + I(TotalWL^2),
               data = df)
When I ran it on a dataset of 10,000 rows, the model fit without trouble; however, on a larger dataset (60,000 rows), I got this error:
`Error: no valid set of coefficients has been found: please supply starting values`
I then tried to give it some starting values, but it still throws an error:
m_nb <- glm.nb(Error_Count ~ TotalWL + Auto_frac + PHONE + JUSTIF_weight +
                 MESSAGE_OTHER_count + Hour + I(Auto_frac^2) + I(TotalWL^2),
               data = df,
               start = c(0.02, 0.3, 0.2, 3, 43, 4, 13, 0.04, 100))
Error: cannot find valid starting values: please specify some
But the model still doesn't converge. How should I set the starting values?
I also tried the same model with the fenegbin() function in the fixest package, and that works. However, I need glm.nb(), since fixest does not provide standard errors (SEs) in predict().
Thank you.
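A hedged sketch of one common remedy (not from the original thread): seed glm.nb() with the coefficients from a Poisson fit of the same formula, which are on the same log-link scale and give the IRLS iterations a valid region to start in.
library(MASS)
form <- Error_Count ~ TotalWL + Auto_frac + PHONE + JUSTIF_weight +
  MESSAGE_OTHER_count + Hour + I(Auto_frac^2) + I(TotalWL^2)
# A Poisson GLM almost always converges; its coefficients make
# reasonable starting values for the negative binomial fit
m_pois <- glm(form, data = df, family = poisson)
m_nb   <- glm.nb(form, data = df, start = coef(m_pois))
Centering or rescaling TotalWL and Auto_frac before squaring may also help, since this error often signals numerical overflow under the log link.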

Huber-White robust standard errors for a GLMM - R

I have discovered some heteroscedasticity in my model that I would like to compensate for with more robust standard errors. I have tried to use the Huber-White robust standard errors from the merDeriv package in R, but I believe these only work for a GLMM with a binomial distribution. Is there a way I could achieve the same thing for a negative binomial distribution?
Model:
library(lme4)
model <- glmer.nb(Jobs ~ 1 + Month + Year + (1|Region), data = df)
Huber-White robust standard errors:
library(merDeriv)
bread.glmerMod(model)
Error:
Error in vcov.lmerMod(object, full = full) : estfun.lmerMod() only works for lmer() models.
Thank you for any help!
This looks like a bug in the package, as far as I can tell: the bread.glmerMod() function was calling estfun.lmerMod() rather than estfun.glmerMod(). (There's a broader question here about the design of the generic functions, but never mind ...)
You should be able to install a fixed version from my fork via remotes::install_github("bbolker/merDeriv"), then reload the package and try again.
Alternatively, download the tarball, change vcov.lmerMod to vcov.glmerMod in the last line of R/bread.glmerMod.R, and re-install the package ...
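Assuming the patched merDeriv installs cleanly (a hedged sketch; whether its methods fully cover glmer.nb() fits is untested here), sandwich() can then assemble the Huber-White covariance from the bread() and estfun() methods:
# remotes::install_github("bbolker/merDeriv")   # the patched fork
library(merDeriv)
library(sandwich)
rob_vcov <- sandwich(model)       # bread %*% meat %*% bread / n
rob_se   <- sqrt(diag(rob_vcov))
rob_se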
Try something like this:
library(lme4)
library(sandwich)  # provides vcovHC()
model <- glmer.nb(Jobs ~ 1 + Month + Year + (1|Region), data = df)
cov <- vcovHC(model, type = "HC1", sandwich = TRUE)
se <- sqrt(diag(cov))
(I can't confirm whether this works, since there isn't a reproducible example.)

Error in lme4::allFit() -- no applicable method for 'isGLMM'

I'm hitting a confusing error while trying to run lme4::allFit() using its built-in parallelization. I fit an initial model m0 on a larger dataframe ckDF (n = 265,623 rows), modelling a binary response to a number of categorical and continuous predictors in a logistic framework with a random intercept for year.
I'm interested in determining whether different optimizers yield different results, following some recommendations I've found online (e.g. by @BenBolker here). My data is fairly large and the model usually takes ~20 minutes to run, so I'm hoping to use the parallel and ncpus arguments of allFit() to speed things up a bit. Here's my relevant code:
require(lme4)
require(parallel)
m0 <- glmer(returned ~ 1 + barge + site + barge:site +
              (run + rearType + basin)^2 +
              (tdg + temp + holdingTime)^2 +
              (1 | year),
            data = ckDF, family = 'binomial',
            control = glmerControl(optimizer = 'bobyqa',
                                   optCtrl = list(maxfun = 1e5)))
af1 <- allFit(m0, parallel = 'multicore', ncpus = detectCores())
Upon doing this, I encounter the following error:
Error in checkForRemoteErrors(val) :
7 nodes produced errors; first error: no applicable method for 'isGLMM' applied to an object of class "list"
Any ideas? It seems to me that when it constructs the worker nodes, some of them don't load the lme4 package and thus don't recognize isGLMM(); but I don't know why allFit() would do this, since it's part of lme4. I tried looking under the hood and altering the function in a local copy of allFit(), but ran into other errors.
Any help would be appreciated. R Version: 3.6.1; lme4 Version: 1.1-21; platform: Windows 10 64-bit
Thanks to @user20650 and @Ben Bolker for the tips in the comments above -- it worked, and I was able to get allFit() to run as expected by using parallel = "snow" in my function call, since I'm running on Windows. Just posting the edited code here for anyone else who finds it useful:
require(lme4); require(snow)

# Define initial model (switched to defaults here)
m0 <- glmer(returned ~ 1 + barge + site + barge:site +
              (run + rearType + basin)^2 +
              (tdg + temp + holdingTime)^2 +
              (1 | year),
            data = ckDF, family = 'binomial')

# Set up cluster for running allFit()
optCls <- makeCluster(detectCores() - 1, type = "SOCK")
clusterEvalQ(optCls, library("lme4"))
clusterExport(optCls, "ckDF")

# Use allFit() to look at differences between optimizers
system.time(af1 <- allFit(m0, parallel = 'snow',
                          ncpus = detectCores() - 1, cl = optCls))
stopCluster(optCls)
Ended up taking ~40 minutes using 11 cores on my machine.
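To actually compare what the optimizers found, summary() on the allFit object is a convenient entry point. A hedged sketch; the component names follow ?allFit:
ss <- summary(af1)
ss$which.OK  # which optimizers completed without error
ss$msgs      # convergence messages per optimizer
ss$llik      # log-likelihoods, which should agree closely
ss$fixef     # fixed-effect estimates side by side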

Trying to run a generalised linear model of cricket mass with multiple random effects, and getting the following error

I am trying to run a glm to check how the mass of crickets varies with age (numeric), altitude (high or low), temperature (two different temperatures) and the incubators (4 different incubators) they are kept in.
I have tried the glm below, which seems fine theoretically. The data in Excel has all been checked as well. I am assuming I have to convert some of the data into binary or something of the sort.
glm(crick$Mass ~ crick$Altitude*crick$Age + crick$Altitude*crick$suare(Altitude) + 1/crick$Nymph.ID + 1/crick$Population + crick$Temperature*crick$Altitude + 1/crick$Incubator)
This is the code I am trying to run.
This is the error message:
Error in eval(predvars, data, env) : attempt to apply non-function
You need to specify the formula properly (e.g. see the examples in ?glm). Try:
glm(Mass ~ Altitude*Age + 1/Nymph.ID + 1/Population + Temperature*Altitude + 1/Incubator, data = crick)
Note that I excluded Altitude*suare(Altitude), which is invalid R formula syntax. You'll have to explain more about what you're trying to do there; see also the mixed-model sketch below.
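If Nymph.ID, Population and Incubator are meant as random intercepts (the question does mention random effects), a mixed model is the more direct route. A hedged sketch with lme4, assuming Mass is continuous so a Gaussian model applies; note that a quadratic term in Altitude cannot make sense while Altitude is a two-level factor:
library(lme4)
m <- lmer(Mass ~ Altitude * Age + Altitude * Temperature +
            (1 | Nymph.ID) + (1 | Population) + (1 | Incubator),
          data = crick)
summary(m)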

glmmadmb help: multiple errors when running models

I'm trying to run a mixed-effects model that includes three fixed effects with interactions, plus a random intercept and slope. The model I'm trying to specify in glmmadmb is:
fit_zipoiss_ambig <- glmmadmb(AmbigCount ~ Posn.c*mood.c*Valence.c +
                                offset(InputAmbig) + (1 + Valence.c | mood.c/Chain),
                              data = Data, zeroInflation = TRUE, family = "poisson")
First I received this error message:
Error in Droplevels(eval(parse(text = x), data)) :
all grouping variables in random effects must be factors
So I used, for example, fPosn.c = as.factor(Data$Posn.c) to convert all my predictors to factors. Then I ran this model:
fit_zipoiss_ambig <- glmmadmb(AmbigCount ~ fPosn.c*fmood.c*fValence.c +
                                offset(InputAmbig) + (1 + fValence.c | fmood.c/Chain),
                              data = Data, zeroInflation = TRUE, family = "poisson")
Then I got this error:
Error in glmmadmb(AmbigCount ~ fPosn.c * fmood.c * fValence.c + offset(InputAmbig) + :
The function maximizer failed (couldn't find STD file) Troubleshooting steps include (1) run with 'save.dir' set and inspect output files; (2) change run parameters: see '?admbControl'
In addition: Warning message:
running command 'C:\Windows\system32\cmd.exe /c "C:/Program Files/R/R-3.2.2/library/glmmADMB/bin/windows64/glmmadmb.exe" -maxfn 500 -maxph 5 -noinit -shess' had status 1
I tried to follow the troubleshooting advice, so I included admb.opts = admbControl(shess = FALSE, noinit = FALSE) at the end of my model call. Now I am receiving this error:
Error in glmmadmb(AmbigCount ~ fPosn.c * fmood.c * fValence.c + offset(InputAmbig) + :
rank of X = 106 < ncol(X) = 107
I have no idea what this error means. I'm hoping someone can help me work out how to specify my model in glmmadmb or, failing that, in some other package that will let me fit a Poisson or negative binomial distribution.
Without being able to run it myself, what jumps out at me is:
As for your first error message, it is saying that the grouping variables in your nested random-effects formula need to be factors.
Then, in your code: fPosn.c = as.factor(Data$Posn.c)
you are not creating "fPosn.c" within your data frame. To do that you need to run:
Data$fPosn.c = as.factor(Data$Posn.c)
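For completeness, a hedged extension of the same point: all three converted variables need the same treatment inside the data frame. The later rank of X = 106 < ncol(X) = 107 error usually means the fixed-effects model matrix is rank-deficient, which with a full three-way factor interaction often traces back to an empty cell:
# Create the factor versions inside the data frame, where glmmadmb() looks
Data$fPosn.c    <- as.factor(Data$Posn.c)
Data$fmood.c    <- as.factor(Data$mood.c)
Data$fValence.c <- as.factor(Data$Valence.c)
# Check for empty factor combinations, a common cause of rank deficiency
with(Data, table(fPosn.c, fmood.c, fValence.c))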
