Huber-White robust standard errors for a GLMM in R

I have discovered some heteroscedasticity in my model that I would like to compensate for with more robust standard errors. I have tried to use the Huber-White robust standard errors from the merDeriv package in R, but I believe these only work for a GLMM with a binomial distribution. Is there a way I could achieve the same thing for a negative binomial distribution?
Model:
library(lme4)
model <- glmer.nb(Jobs ~ 1 + Month + Year + (1|Region), data = df)
Huber-White robust standard errors:
library(merDeriv)
bread.glmerMod(model)
Error:
Error in vcov.lmerMod(object, full = full) : estfun.lmerMod() only works for lmer() models.
Thank you for any help!

This looks like a bug in the package, as far as I can tell (the bread.glmerMod function was calling estfun.lmerMod rather than estfun.glmerMod; there's a broader question here about the design of the generic functions, but never mind ...)
You should be able to install a fixed version from my fork via remotes::install_github("bbolker/merDeriv"), then reload the package and try again.
Alternately, download the tarball, change vcov.lmerMod to vcov.glmerMod in the last line of R/bread.glmerMod.R, and re-install the package ...
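For context on what bread and meat refer to: the Huber-White estimator combines a "bread" (the inverse information matrix) and a "meat" (the summed outer products of the per-observation scores) as bread %*% meat %*% bread. Below is a minimal base-R sketch for an ordinary Poisson GLM on simulated data. It is not merDeriv's code and it ignores random effects entirely; the data and all variable names are assumptions made up for illustration.

```r
# Huber-White (HC0) sandwich for an ordinary Poisson GLM, base R only.
# Illustrative assumptions: simulated data, no random effects, not merDeriv's code.
set.seed(1)
n <- 500
x <- rnorm(n)
# Overdispersed counts, so model-based and robust SEs can differ
y <- rnbinom(n, mu = exp(0.5 + 0.3 * x), size = 1)

fit <- glm(y ~ x, family = poisson)
X   <- model.matrix(fit)
mu  <- fitted(fit)

# Per-observation score contributions for the log link: x_i * (y_i - mu_i)
scores <- X * (y - mu)

bread <- solve(crossprod(X, X * mu))  # (X' W X)^-1 with W = diag(mu)
meat  <- crossprod(scores)            # sum of outer products of the scores

vcov_robust <- bread %*% meat %*% bread
se_robust   <- sqrt(diag(vcov_robust))
se_model    <- sqrt(diag(vcov(fit)))
cbind(model = se_model, robust = se_robust)
```

For a mixed model the scores and the information matrix also involve the random-effect variance parameters, which is the part merDeriv takes care of.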

Try something like this:
library(lme4)
library(sandwich)  # provides vcovHC()
model <- glmer.nb(Jobs ~ 1 + Month + Year + (1|Region), data = df)
cov_m1 <- vcovHC(model, type = "HC1", sandwich = TRUE)
se <- sqrt(diag(cov_m1))
(I can't confirm whether it works, since there isn't a reproducible example.)

Related

glm package glm.nb() "Error: no valid set of coefficients has been found: please supply starting values"

I am running a negative binomial regression on my dataset using the glm.nb() function.
My model looks something like this:
library(MASS)  # provides glm.nb()
m_nb <- glm.nb(Error_Count ~ TotalWL + Auto_frac + PHONE + JUSTIF_weight + MESSAGE_OTHER_count + Hour +
I(Auto_frac^2) + I(TotalWL^2), data = df)
When I ran it on a dataset of 10,000, the model ran. However, when I ran it on a larger dataset (60,000), I got this error:
`Error: no valid set of coefficients has been found: please supply starting values`
I then tried to give it some starting values, but it still throws an error:
m_nb <- glm.nb(Error_Count ~ TotalWL + Auto_frac + PHONE + JUSTIF_weight + MESSAGE_OTHER_count + Hour +
I(Auto_frac^2) + I(TotalWL^2), data = df, start = c(0.02, 0.3, 0.2, 3, 43, 4, 13, 0.04, 100))
Error: cannot find valid starting values: please specify some
But the model still doesn't converge. How should I set the starting value?
I also tried the same model with the fenegbin() function in the fixest package, and the model works. However, I need glm.nb(), since the fixest package does not provide the standard error (S.E.) in predict().
Thank you.

Is R Sandwich package not generating the expected clustered robust standard errors?

Load data
utils::data("InstInnovation", package = "sandwich")
df <- InstInnovation
Create group variable combining 'company' and 'year'
df[['cluster_var']] <- factor(paste0(df$company,"-",df$year))
Linear regression model
model <- lm(sales ~ competition + log(capital/employment) + year, data = df)
Why does this:
lmtest::coeftest(model, vcov = sandwich::vcovCL(model, type = "HC3", cluster = ~ company + year))
produce standard errors different from this?
lmtest::coeftest(model, vcov = sandwich::vcovCL(model, type = "HC3", cluster = ~ cluster_var))
Shouldn't cluster = ~ company + year and cluster = ~ cluster_var be equivalent?
In addition, I cannot find a place (e.g. GitHub) to report issues with the R sandwich package. I found this, but it is just a read-only mirror: https://github.com/cran/sandwich
Thank you very much in advance.
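cluster = ~ company + year is indeed something different: it requests 'multiway clustering', i.e. clustering on company and on year as two separate dimensions, whereas cluster = ~ cluster_var is one-way clustering on their combination. I found the explanation here: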
http://fmwww.bc.edu/repec/bost10/BOS10.baum.pdf
https://francish.netlify.app/post/note-on-robust-standard-errors/
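To make the difference concrete, here is a rough base-R sketch of the usual inclusion-exclusion construction for two-way clustering: the covariance clustered on the first variable, plus the one clustered on the second, minus the one clustered on their intersection. The data are simulated and every name is made up for illustration. It uses no small-sample adjustments, so it will not reproduce the numbers from vcovCL(type = "HC3"); it only shows why the two cluster specifications differ.

```r
# Two-way cluster-robust covariance for OLS via inclusion-exclusion, base R only.
# Illustrative assumptions: simulated data, HC0-style, no finite-sample adjustments.
set.seed(42)
n <- 400
d <- data.frame(
  company = sample(paste0("c", 1:20), n, replace = TRUE),
  year    = sample(2001:2010, n, replace = TRUE),
  x       = rnorm(n)
)
d$y <- 1 + 2 * d$x + rnorm(n)

fit    <- lm(y ~ x, data = d)
X      <- model.matrix(fit)
scores <- X * residuals(fit)   # per-observation estimating functions
bread  <- solve(crossprod(X))

# One-way clustered covariance: sum the scores within each cluster first
vcov_cluster <- function(g) {
  s <- rowsum(scores, g)
  bread %*% crossprod(s) %*% bread
}

v_company <- vcov_cluster(d$company)
v_year    <- vcov_cluster(d$year)
v_cell    <- vcov_cluster(paste(d$company, d$year, sep = "-"))  # analogue of cluster_var

v_twoway <- v_company + v_year - v_cell  # analogue of cluster = ~ company + year

sqrt(diag(v_cell))    # one-way clustering on the combined variable
sqrt(diag(v_twoway))  # two-way clustering
```

The combined variable only appears as the subtracted term of the two-way estimator, which is why the two calls are not expected to agree.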

Extracting path coefficients of piecewise SEM (structural equation model)

I'm constructing a piecewise structural equation model using the piecewiseSEM package in R (Lefcheck - https://cran.r-project.org/web/packages/piecewiseSEM/vignettes/piecewiseSEM.html)
I already created the model set and I could evaluate the model fit, so the model itself works. Also, the data fits the model (p = 0.528).
But I do not succeed in extracting the path coefficients.
This is the error I get: Error in cbind(Xlarge, Xsmall) : number of rows of matrices must match (see arg 2)
I already tried the following (neither worked):
standardising my data, because of the warning: Some predictor variables are on very different scales: consider rescaling
adapting my data (I dropped some NA values)
This is my modellist:
predatielijst = list(
lmer(plantgrootte ~ gapfraction + olsen_P + (1|plot_ID), data = d),
glmer(piek1 ~ gapfraction + olsen_P + plantgrootte + (1|plot_ID),
family = poisson, data = d),
glmer(predatie ~ piek1 + (1|plot_ID), family = binomial, data = d)
)
with "predatie" being a binary variable (yes or no) and all the rest continuous variables (gapfraction, plantgrootte, olsen_P & piek1)
Thanks in advance!
Try installing the development version:
library(devtools)
install_github("jslefche/piecewiseSEM@2.0")
Replace list with psem and run the coefs or summary function. It will likely get rid of your error. If not, open a bug on GitHub!
WARNING: this will overwrite your current version from CRAN. You will need to reinstall from CRAN to get version 1.4 back.
Try using lme (from the nlme package) instead of glmer. As far as I understand, the problem seems to be that lmer does not provide p-values (while lme does).
Hope this works.

Is lme4:::profile.merMod() supposed to work with glmer models?

Is lme4:::profile.merMod() supposed to work with glmer models? What about Negative Binomial models?
I have a negative binomial model that throws this error:
Error in names(opt) <- profnames(fm, signames) :
'names' attribute [2] must be the same length as the vector [1]
This happens when I try to run the profile function on my model, profile(model12), to get standard errors for my random effects.
Am I missing something or is this a problem with lme4?
I should mention that I'm using glmer(..., family = negative.binomial(theta = lme4:::est_theta(poissonmodel))) rather than glmer.nb(), because I had issues with the update() function when using glmer.nb().
I can reproduce your error with the CRAN version (1.1-8). There has been some improvement in glmer.nb in the most recent development version, so if you have compilation tools installed I would definitely do devtools::install_github("lme4/lme4") and try again. In addition, update() works better with NB models now, so you might not need your workaround.
This works fine with version 1.1-9:
library("lme4")
m1 <- glmer.nb(TICKS~cHEIGHT+(1|BROOD),data=grouseticks)
pp <- profile(m1)
lattice::xyplot(pp)
Note by the way that your solution with est_theta only does the initial step or two of an iterative solution where the theta value and the other parameters are optimized in alternation ...
m0 <- glmer(TICKS~cHEIGHT+(1|BROOD),data=grouseticks,family=poisson)
m2 <- update(m0,
family = negative.binomial(theta = lme4:::est_theta(m0)))
cbind(glmer.nb=fixef(m1),pois=fixef(m0),fakenb=fixef(m2))
##                glmer.nb        pois      fakenb
## (Intercept)  0.58573085  0.56835340  0.57759498
## cHEIGHT     -0.02520326 -0.02521386 -0.02520702
profile() works OK on this model too, at least in the devel version ...
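The alternation mentioned above can be sketched for a plain GLM (no random effects) using MASS: fit the coefficients with theta held fixed, re-estimate theta with the fitted means held fixed, and repeat until theta stops changing. This is only an illustration of the idea on simulated data, not lme4's implementation; the data, the loop limit of 25, and the tolerance are arbitrary assumptions.

```r
# Alternating estimation of NB coefficients and theta for a plain GLM.
# Illustrative assumptions: simulated data, no random effects, not lme4's code.
library(MASS)  # negative.binomial(), theta.ml()

set.seed(123)
n <- 1000
x <- rnorm(n)
y <- rnbinom(n, mu = exp(1 + 0.5 * x), size = 2)  # simulated with theta = 2

fit   <- glm(y ~ x, family = poisson)          # step 0: Poisson fit
theta <- as.numeric(theta.ml(y, fitted(fit)))  # first theta estimate

for (i in 1:25) {
  # (a) coefficients with theta held fixed
  fit <- glm(y ~ x, family = negative.binomial(theta), start = coef(fit))
  # (b) theta with the fitted means held fixed
  theta_new <- as.numeric(theta.ml(y, fitted(fit)))
  converged <- abs(theta_new - theta) < 1e-6
  theta <- theta_new
  if (converged) break
}

c(coef(fit), theta = theta)
```

Stopping after step 0 and a single pass of (a) corresponds roughly to the est_theta workaround in the question; the loop is what carries the estimates the rest of the way.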

coding multivariate response in R Party Package

I am looking to do multivariate prediction using the party package in R (Party package documentation below)
http://cran.r-project.org/web/packages/party/party.pdf
However, I cannot figure out how to do multivariate prediction (multiple response variables). The documentation says it can do it, so I tried this:
f <-cbind(A,B,C~shopping_pt+n_A_0)
model_1 <- ctree(f, data= train)
But that produces the following error:
Error in [<-(tmp, nas, drop = FALSE, value = 0) : (subscript) logical subscript too long
The documentation says it supports multivariate... but doesn't suggest how one can write the syntax correctly, any ideas?
Use the following syntax:
ctree(A + B + C ~ shopping_pt + n_A_0, data=train)
Try the partykit package.
The sample below shows how to code a multivariate response for a conditional inference tree:
library(partykit)
airq <- subset(airquality, !is.na(Ozone))  # drop rows with a missing response
### multivariate responses
airq2 <- ctree(Ozone + Temp ~ ., data = airq)
airq2
plot(airq2)
