Is there a way of getting "marginal effects" from a `glmer` object?

I am estimating a random effects logit model using glmer() and I would like to report marginal effects for the independent variables. For glm models, the mfx package helps compute marginal effects. Is there any package or function for glmer objects?
Thanks for your help.
A reproducible example is given below
## mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
## as of 2020-08-24:
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank) #creating ranks
id <- rep(1:ceiling(nrow(mydata)/2), times=c(2)) #creating ID variable
mydata <- cbind(mydata,data.frame(id,stringsAsFactors=FALSE))
set.seed(12345)
mydata$ran <- runif(nrow(mydata),0,1) #creating a random variable
library(lme4)
cfelr <- glmer(admit ~ (1 | id) + rank + gpa + ran + gre, data=mydata ,family = binomial)
summary(cfelr)
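(For reference, the mfx workflow mentioned above applies only to plain glm fits, not to glmer objects; a rough sketch, with logitmfx() reporting logit marginal effects at the covariate means by default:)
library(mfx)
logitmfx(admit ~ rank + gpa + ran + gre, data = mydata) # glm-based; ignores the random intercept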

Here's an approach using the margins package:
library(margins)
library(lme4)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp,
             family = binomial)
m <- margins(gm1, data = cbpp)
m
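Calling summary() on the margins object gives a table of average marginal effects with standard errors; applying the same recipe to the questioner's model would look like the sketch below (merMod support in margins is partial, so this may require a recent package version):
summary(m)
m2 <- margins(cfelr, data = mydata) # same recipe on the original model
summary(m2)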

You could use the ggeffects package (see the examples in the package vignettes). For your code, this might look like this:
library(ggeffects)
# dat is a data frame with marginal effects
dat <- ggpredict(cfelr, terms = "rank")
plot(dat)
Or, as Benjamin described, you could use the sjPlot package's plot_model() function with plot-type "pred" (this simply wraps the ggeffects package for marginal effect plots):
library(sjPlot)
plot_model(cfelr, type = "pred", terms = "rank")
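ggpredict() also accepts several terms at once, e.g. predicted probabilities over gpa at each level of rank (a sketch):
dat2 <- ggpredict(cfelr, terms = c("gpa", "rank"))
plot(dat2)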

This is a much less technical answer, but perhaps provides a useful resource. I am a fan of the sjPlot package which provides plots of marginal effects of glmer objects, like so:
library(sjPlot)
sjp.glmer(cfelr, type = "eff")
The package provides a lot of options for exploring a glmer model's fixed and random effects as well. https://github.com/strengejacke/sjPlot
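For example, to inspect fixed and random effects with the same function (a sketch for older sjPlot versions that still export sjp.glmer; current versions replace it with plot_model()):
sjp.glmer(cfelr, type = "fe") # fixed effects
sjp.glmer(cfelr, type = "re") # random effects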

My solution does not answer the question,
"Is there a way of getting “marginal effects” from a glmer object",
but rather,
"Is there a way of getting marginal logistic regression coefficients from a conditional logistic regression with one random intercept?"
I am only offering this write-up because the reproducible example provided was a conditional logistic regression with one random intercept and I'm intending to be helpful. Please do not downvote; I will take down if this answer is deemed too off topic.
The R-code is based on the work of Patrick Heagerty (click "View Raw" to see pdf), and I include a reproducible example below from my github version of his lnMLE package (excuse the warnings at installation -- I'm shoehorning Patrick's non-CRAN package). I'm omitting the output for all except the last line, compare, which shows the fixed effect coefficients side-by-side.
library(devtools)
install_github("lnMLE_1.0-2", "swihart")
library(lnMLE)
## run the example from the logit.normal.mle help page
## see also the accompanying document (click 'View Raw' on page below:)
## https://github.com/swihart/lnMLE_1.0-2/blob/master/inst/doc/lnMLEhelp.pdf
data(eye_race)
attach(eye_race)
marg_model <- logit.normal.mle(meanmodel = value ~ black,
                               logSigma = ~1,
                               id = eye_race$id,
                               model = "marginal",
                               data = eye_race,
                               tol = 1e-5,
                               maxits = 100,
                               r = 50)
marg_model
cond_model <- logit.normal.mle(meanmodel = value ~ black,
                               logSigma = ~1,
                               id = eye_race$id,
                               model = "conditional",
                               data = eye_race,
                               tol = 1e-5,
                               maxits = 100,
                               r = 50)
cond_model
compare <- round(cbind(marg_model$beta, cond_model$beta), 2)
colnames(compare) <- c("Marginal", "Conditional")
compare
The output of the last line:
compare
            Marginal Conditional
(Intercept)    -2.43       -4.94
black           0.08        0.15
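The gap between the two columns is expected: for a logistic model with a normal random intercept, the marginal coefficients are attenuated relative to the conditional ones, approximately by the factor 1/sqrt(1 + 0.346*sigma^2), where sigma is the random-intercept SD (Zeger, Liang & Albert, 1988). A back-of-the-envelope check, with sigma as a placeholder value you would take from the fitted conditional model:
sigma <- 3  # placeholder: random-intercept SD from cond_model
round(cond_model$beta / sqrt(1 + 0.346 * sigma^2), 2)  # should be close to marg_model$beta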
I attempted the reproducible example given, but had problems with both the glmer and lnMLE implementations; again I only include output pertaining to the comparison results and the warnings from the glmer() call:
##original question / answer... glmer() function gave a warning and the lnMLE did not fit well...
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank) #creating ranks
id <- rep(1:ceiling(nrow(mydata)/2), times=c(2)) #creating ID variable
mydata <- cbind(mydata,data.frame(id,stringsAsFactors=FALSE))
set.seed(12345)
mydata$ran <- runif(nrow(mydata),0,1) #creating a random variable
library(lme4)
cfelr <- glmer(admit ~ (1 | id) + rank + gpa + ran + gre,
               data = mydata,
               family = binomial)
Which gave:
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00161047 (tol = 0.001, component 2)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?
but I foolishly went on without rescaling, trying to apply logit.normal.mle to the example given. However, the conditional model didn't converge or produce standard error estimates:
summary(cfelr)
library(devtools)
install_github("lnMLE_1.0-2", "swihart")
library(lnMLE)
mydata$rank2 = mydata$rank==2
mydata$rank3 = mydata$rank==3
mydata$rank4 = mydata$rank==4
cfelr_cond = logit.normal.mle(meanmodel = admit ~ rank2+rank3+rank4+gpa+ran+gre,
                              logSigma = ~1,
                              id = id,
                              model = "conditional",
                              data = mydata,
                              r = 50,
                              tol = 1e-6,
                              maxits = 500)
cfelr_cond
cfelr_marg = logit.normal.mle(meanmodel = admit ~ rank2+rank3+rank4+gpa+ran+gre,
                              logSigma = ~1,
                              id = id,
                              model = "marginal",
                              data = mydata,
                              r = 50,
                              tol = 1e-6,
                              maxits = 500)
cfelr_marg
compare_glmer <- round(cbind(cfelr_marg$beta, cfelr_cond$beta, summary(cfelr)$coeff[, "Estimate"]), 3)
colnames(compare_glmer) <- c("Marginal", "Conditional", "glmer() Conditional")
compare_glmer
The last line reveals that cfelr_cond did not actually evaluate a conditional model: it simply returned the marginal coefficients, with no standard errors.
> compare_glmer
            Marginal Conditional glmer() Conditional
(Intercept)   -4.407      -4.407              -4.425
rank2         -0.667      -0.667              -0.680
rank3         -1.832      -1.833              -1.418
rank4         -1.930      -1.930              -1.585
gpa            0.547       0.548               0.869
ran            0.860       0.860               0.413
gre            0.004       0.004               0.002
I hope to iron out these issues. Any help/comments appreciated. I'll give status updates when I can.
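For what it's worth, the rescaling that the glmer() warnings ask for would look something like the sketch below (gre is on a much larger scale than the other predictors; gre_sc is a name I made up):
mydata$gre_sc <- as.numeric(scale(mydata$gre))  # center and scale the wide-ranging predictor
cfelr_sc <- glmer(admit ~ (1 | id) + rank + gpa + ran + gre_sc,
                  data = mydata, family = binomial)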

Related

Marginal effects plot of PLM

I’ve run an individual-fixed effects panel model in R using the plm-package. I now want to plot the marginal effects.
However, neither plot_model() nor effect_plot() work for plm objects. plot_model() works for type = "est" but not for type = "pred".
My online search so far only suggests using ggplot (which, however, only displays OLS regressions, not fixed effects) or outdated functions (e.g., sjp.lm()).
Does anyone have any recommendations how I can visualize effects of plm-objects?
IFE_Aut_uc <- plm(LoC_Authorities_rec ~ Compassion_rec, index = c("id","wave"), model = "within", effect = "individual", data = D3_long2)
summary(IFE_Aut_uc)
plot_model(IFE_Aut_uc, type = "pred")
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 50238, 82308
and:
effect_plot(IFE_Pol_uc, pred = Compassion_rec)
Error in `stop_wrap()`:
! ~does not appear to be a one- or two-sided formula.
LoC_Politicians_recdoes not appear to be a one- or two-sided formula.
Compassion_recdoes not appear to be a one- or two-sided formula.
Backtrace:
1. jtools::effect_plot(IFE_Pol_uc, pred = Compassion_rec)
2. jtools::get_data(model, warn = FALSE)
4. jtools:::get_lhs(formula)
Edit 2022-08-20: The latest version of plm on CRAN now includes a predict() method for within models. In principle, the commands illustrated below using fixest should now work with plm as well.
In my experience, plm models are kind of tricky to deal with, and many of the packages which specialize in “post-processing” fail to handle these objects properly.
One alternative would be to estimate your “within” model using the fixest package and to plot the results using the marginaleffects package. (Disclaimer: I am the marginaleffects author.)
Note that many of the models estimated by plm are officially supported and tested with marginaleffects (e.g., random effects, Amemiya, Swamy-Arora). However, this is not the case for this specific "within" model, which is even trickier than the others to support.
First, we estimate two models to show that the plm and fixest versions are equivalent:
library(plm)
library(fixest)
library(marginaleffects)
library(modelsummary)
data("EmplUK")
mod1 <- plm(
    emp ~ wage * capital,
    index = c("firm", "year"),
    model = "within",
    effect = "individual",
    data = EmplUK)
mod2 <- feols(
    emp ~ wage * capital | firm,
    se = "standard",
    data = EmplUK)
models <- list("PLM" = mod1, "FIXEST" = mod2)
modelsummary(models)
                 PLM       FIXEST
wage              0.000     0.000
                 (0.034)   (0.034)
capital           2.014     2.014
                 (0.126)   (0.126)
wage × capital   -0.043    -0.043
                 (0.004)   (0.004)
Num.Obs.          1031      1031
R2                0.263     0.986
R2 Adj.           0.145     0.984
R2 Within                   0.263
R2 Within Adj.              0.260
AIC               4253.9    4253.9
BIC               4273.7    4273.7
RMSE              1.90      1.90
Std.Errors                  IID
FE: firm                    X
Now, we use the marginaleffects package to plot the results. There are two main functions for this:
plot_cap(): plot conditional adjusted predictions. How does my predicted outcome change as a function of a covariate?
plot_cme(): plot conditional marginal effects. How does the slope of my model with respect to one variable (i.e., a derivative or “marginal effect”) change with respect to another variable?
See the website for definitions and details: https://vincentarelbundock.github.io/marginaleffects/
plot_cap(mod2, condition = "capital")
plot_cme(mod2, effect = "wage", condition = "capital")
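If you want numbers rather than plots, the package can also compute the slopes directly. Note that in more recent versions of marginaleffects, plot_cap() and plot_cme() were renamed plot_predictions() and plot_slopes(); a sketch using the newer function names (which names are available depends on your package version):
avg_slopes(mod2)  # average marginal effects (slopes)
slopes(mod2, newdata = datagrid(capital = c(1, 10)))  # slopes at chosen values of capital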

How to do negative binomial regression with the rms package in R?

How can I use the rms package in R to execute a negative binomial regression? (I originally posted this question on Statistics SE, but it was closed apparently because it is a better fit here.)
With the MASS package, I use the glm.nb function, but I am trying to switch to the rms package because I sometimes get weird errors when bootstrapping with glm.nb and some other functions. But I cannot figure out how to do a negative binomial regression with the rms package.
Here is sample code of what I would like to do (copied from the rms::Glm function documentation):
library(rms)
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
f <- Glm(counts ~ outcome + treatment, family=poisson())
f
anova(f)
summary(f, outcome=c('1','2','3'), treatment=c('1','2','3'))
So, instead of using family=poisson(), I would like to use something like family=negative.binomial(), but I cannot figure out how to do this.
In the documentation for family {stats}, I found this note in the "See also" section:
For binomial coefficients, choose; the binomial and negative binomial distributions, Binomial, and NegBinomial.
But even after clicking the link for ?NegBinomial, I cannot make any sense of this.
I would appreciate any help on how to use the rms package in R to execute a negative binomial regression.
opinion up front You might be better off posting (as a separate question) a reproducible example of the "weird errors" from your bootstrap attempts and seeing whether people have ideas for resolving them. It's fairly common for NB fitting procedures to throw warnings or errors when data are equi- or underdispersed, as the estimates of the dispersion parameter become infinite in this case ...
@coffeinjunky is correct that using family = negative.binomial(theta = VALUE) will work (where VALUE is a numeric constant, e.g. theta = 1 for the geometric distribution [a special case of the NB]). However: you won't be able (without significantly more work) to fit the general NB model, i.e. the model where the dispersion parameter (theta) is estimated as part of the fitting procedure. That's what MASS::glm.nb does, and AFAICS there is no analogue in the rms package.
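If you really wanted to stay within rms, one crude workaround (the "significantly more work" alluded to above) is to profile the likelihood over a grid of theta values and refit Glm at the best one; a sketch, not equivalent to glm.nb's joint estimation:
library(rms)
library(MASS)  # for negative.binomial()
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9); treatment <- gl(3,3)
thetas <- exp(seq(log(0.5), log(50), length.out = 25))  # candidate dispersion values
ll <- sapply(thetas, function(th)
    logLik(glm(counts ~ outcome + treatment, family = negative.binomial(th))))
th_hat <- thetas[which.max(ll)]  # crude profile-likelihood estimate of theta
f <- Glm(counts ~ outcome + treatment, family = negative.binomial(theta = th_hat))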
There are a few other packages/functions in addition to MASS::glm.nb that fit the negative binomial model, including (at least) bbmle and glmmTMB — there may be others such as gamlss.
## Dobson (1990) Page 93: Randomized Controlled Trial :
dd <- data.frame(
    counts = c(18,17,15,20,10,20,25,13,12),
    outcome = gl(3,1,9),
    treatment = gl(3,3))
MASS::glm.nb
library(MASS)
m1 <- glm.nb(counts ~ outcome + treatment, data = dd)
## "iteration limit reached" warning
glmmTMB
library(glmmTMB)
m2 <- glmmTMB(counts ~ outcome + treatment, family = nbinom2, data = dd)
## "false convergence" warning
bbmle
library(bbmle)
m3 <- mle2(counts ~ dnbinom(mu = exp(logmu), size = exp(logtheta)),
           parameters = list(logmu ~ outcome + treatment),
           data = dd,
           start = list(logmu = 0, logtheta = 0))
signif(cbind(MASS=coef(m1), glmmTMB=fixef(m2)$cond, bbmle=coef(m3)[1:5]), 5)
                   MASS     glmmTMB       bbmle
(Intercept)  3.0445e+00  3.04540000  3.0445e+00
outcome2    -4.5426e-01 -0.45397000 -4.5417e-01
outcome3    -2.9299e-01 -0.29253000 -2.9293e-01
treatment2  -1.1114e-06  0.00032174  8.1631e-06
treatment3  -1.9209e-06  0.00032823  6.5817e-06
These all agree fairly well (at least for the intercept/outcome parameters). This example is fairly difficult for a NB model (5 parameters + dispersion for 9 observations, data are Poisson rather than NB).
Based on this, the following seems to work:
library(rms)
library(MASS)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
Glm(counts ~ outcome + treatment, family = negative.binomial(theta = 1))
General Linear Model

rms::Glm(formula = counts ~ outcome + treatment, family = negative.binomial(theta = 1))

                          Model Likelihood
                                Ratio Test
Obs               9    LR chi2        0.31
Residual d.f.     4    d.f.              4
g         0.2383063    Pr(> chi2)   0.9892

            Coef    S.E.   Wald Z Pr(>|Z|)
Intercept    3.0756 0.2121 14.50  <0.0001
outcome=2   -0.4598 0.2333 -1.97  0.0487
outcome=3   -0.2962 0.2327 -1.27  0.2030
treatment=2 -0.0347 0.2333 -0.15  0.8819
treatment=3 -0.0503 0.2333 -0.22  0.8293

Plot Effects of Variables in Interaction Terms

I would like to plot the effects of variables in interaction terms, using panel data and a FE model.
I have various interaction effects in my equation, for example this one here:
FIXED1 <- plm(GDPPCgrowth ~ FDI * PRIVCR, data = dfp)
I can only find solutions for lm, but not for plm.
So on the x-axis there should be PRIVCR and on the y-axis the effect of FDI on growth.
Thank you for your help!
Lisa
I am not aware of a package that supports plm objects directly. As you are asking for FE models, you can just take an LSDV approach for FE and do the estimation by lm to get an lm object which works with the effects package. Here is an example for the Grunfeld data:
library(plm)
library(effects)
data("Grunfeld", package = "plm")
mod_fe <- plm(inv ~ value + capital + value:capital, data = Grunfeld, model = "within")
Grunfeld[ , "firm"] <- factor(Grunfeld[ , "firm"]) # needs to be factor in the data NOT in the formula [required by package effects]
mod_lsdv <- lm(inv ~ value + capital + value:capital + firm, data = Grunfeld)
coefficients(mod_fe) # estimates are the same
coefficients(mod_lsdv) # estimates are the same
eff_obj <- effects::Effect(c("value", "capital"), mod_lsdv)
plot(eff_obj)
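A quick sanity check that the two parameterizations agree on the shared coefficients (the LSDV fit additionally carries the firm dummies):
shared <- names(coefficients(mod_fe))
cbind(within = coefficients(mod_fe), lsdv = coefficients(mod_lsdv)[shared])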

Error in nlme repeated measures

I'm trying to run a linear mixed model with repeated measures at 57 different timepoints. But I keep getting the error message:
Error in solve.default(estimates[dimE[1L] - (p:1), dimE[2L] - (p:1), drop = FALSE]) :
system is computationally singular: reciprocal condition number = 7.7782e-18
What does this mean?
My code is such:
model.dataset = data.frame(TimepointM=timepoint,SubjectM=sample,GeneM=gene)
library("nlme")
model = lme(score ~ TimepointM + GeneM,data=model.dataset,random = ~1|SubjectM)
Here's the data:
score = c(2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7)
timepoint = c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9,9,9,9,10,10,10,10,10,10,10,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,14,14,14,14,14,14,14,14,15,15,15,15,15,15,16,16,16,16,16,16,16,16,17,17,17,17,17,17,18,18,18,18,18,18,19,19,19,19,19,19,19,20,20,20,20,20,20,21,21,21,21,21,21,24,24,24,24,24,24,24,25,25,25,25,25,25,25,27,27,27,27,27,27,28,28,28,28,28,28,29,29,29,29,29,29,30,30,30,30,30,30,30,31,31,31,31,31,31,31,32,32,32,32,32,32,33,33,33,33,33,33,33,33,34,34,34,34,34,34,34,35,35,35,35,35,35,36,36,36,36,36,36,36,37,37,37,37,37,37,38,38,38,38,38,38,39,39,39,39,39,39,39,40,40,40,40,40,40,40,41,41,41,41,41,41,41,41,42,42,42,42,42,42,42,42,43,43,43,43,43,43,44,44,44,44,44,44,44,45,45,45,45,45,45,46,46,46,46,46,46,47,47,47,47,47,47,48,48,48,48,48,48,49,49,49,49,49,49,49,50,50,50,50,50,50,51,51,51,51,51,51,52,52,52,52,52,52,52,53,53,53,53,53,53,53,54,54,54,54,54,54,55,55,55,55,55,55,56,56,56,56,56,56,57,57,57,57,57,57)
sample = c("S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0")
gene = c(24.1215870,-18.8771658,-27.3747309,-41.5740199,26.1561877,-2.7836332,20.8322796,36.5745088,-24.1541743,-11.2362216,4.9042852,7.4230219,155.8663563,16.4465366,-11.7982286,-1.6102783,-35.9559091,27.7909495,-13.9181661,-29.6037658,-68.4297261,-45.0877920,-48.3157529,17.1649982,-26.9084544,19.7358439,-5.8991143,-24.1541743,-23.5960654,13.0780939,-2.7836332,18.6394081,-28.3157487,-49.9186269,-33.7086648,41.6864242,-30.6199654,36.1823804,-36.5745088,-49.9186269,-44.9448864,-4.9042852,-34.3314764,62.3465425,-42.7609951,-11.7982286,-32.2055657,-56.1811080,5.7216661,-17.6296771,4.3857431,-43.6534459,9.6616697,-44.9448864,18.7997599,-12.9902884,109.1064494,7.6750504,-43.6534459,-17.7130611,-25.8433097,5.7216661,-18.5575548,35.2750175,36.1823804,2.3596457,-25.7644526,-55.0574858,15.5302365,-19.4854325,73.3687689,63.1668918,20.8322796,16.5175201,-22.5438960,-28.0905540,15.5302365,7.4230219,39.5062602,107.4657509,36.1823804,-23.5964573,-45.0877920,-43.8212642,4.0869043,-40.8266205,26.3375068,13.1572292,-25.9561030,-40.2569571,-52.8102415,2.4521426,-49.1775202,246.1047731,36.1823804,11.7982286,-35.4261223,-26.9669318,-2.4521426,-38.0429873,38.5656349,9.8679219,16.5175201,8.0513914,-42.6976421,26.9735686,-26.9084544,4.3857431,12.9780515,-32.2055657,-33.7086648,9.8085704,-36.2800196,215.7518511,6.5786146,-9.4385829,-19.3233394,-40.4503978,17.1649982,-7.4230219,14.2536650,-23.5964573,-53.1391834,-52.8102415,22.0692834,-54.7447866,24.1215870,-44.8332688,-24.1541743,-42.6976421,26.9735686,-40.8266205,191.1413737,17.5429723,-70.7893718,-37.0364006,-39.3267756,-4.9042852,-0.9278777,93.5198138,-6.5786146,-24.7762801,-28.9850091,-39.3267756,22.0692834,-50.1053979,14.2536650,23.5964573,-20.9336177,-53.9338637,14.7128556,-39.8987428,4.3857431,-64.8902575,-59.5802966,-33.7086648,22.0692834,2.7836332,46.0503024,-35.3946859,-43.4775137,-53.9338637,30.2430921,-34.3314764,80.3942259,28.5073300,-87.3068919,-24.1541743,-62.9228410,13.0780939,-25.0526990,35.0859447,-24.7762801,-38.6466789,-58.4283523,31.0604729,0.0000000,24.4562563,1.0964358,-27.1359259,-75.6830794,-16.8543324,20.4345217,-11.1345329,74.1390629,18.2282447,-27.3044720,-45.2890768,-46.7707724,15.3258912,-27.9523169,-6.9763039,117.3099418,18.6394081,-21.2368115,-38.6466789,-34.8322870,22.0692834,-48.2496425,6.5786146,-64.8902575,-51.5289052,-80.9007955,23.7040451,-26.9084544,223.1349942,8.7714862,10.6184058,-127.2119846,-31.4614205,0.8173809,-16.7017993,9.8679219,-35.3946859,-54.7494617,-44.9448864,14.7128556,-18.5575548,97.5827836,-166.3550237,-95.0064189,-123.5984376,104.6247509,-121.5519839,33.9895089,-44.8332688,-40.2569571,-56.1811080,51.4949946,0.0000000,-16.9312544,95.9808615,6.5786146,-21.2368115,-9.6616697,-13.4834659,10.6259513,-25.9805767,116.4895926,-1.0964358,-16.5175201,-56.3597400,-44.9448864,13.8954747,-12.9902884,-5.6437515,71.3703842,25.2180227,-41.2938002,-53.1391834,-32.5850426,8.9911895,12.9902884,31.9812582,1.0964358,-70.7893718,-33.8158440,-38.2031534,-15.5302365,-25.0526990,153.4053085,36.1823804,-34.2148630,-41.8672354,-19.1015767,22.8866643,0.9278777,20.8322796,-29.4955716,-43.4775137,-69.6645739,33.5126155,-45.4660092,26.3144585,-33.0350402,24.1541743,-42.6976421,0.0000000,-28.7642099,38.3752520,-7.0789372,-22.5438960,-20.2251989,34.3299964,19.4854325,4.3857431,-61.3507889,-33.8158440,-64.0464631,39.2342816,-28.7642099,183.7582306,-4.3857431,-22.4166344,-28.9850091,-57.3047302,25.3388069,-26.9084544,35.0859447,7.0789372,-33.8158440,-43.8212642,-1.6347617,5.5672664,-35.0859447,-40.1139773,-14.4925046,-12.3598438,21.2519025,-14.8460438,119.7709896,30.7002016,-22.4166344,-46.6980703,-43.8212642,5.7216661,-10.2066551,203.4466124,116.2221917,-83.7674233,-109.4989234,-38.2031534,78.4685632,-56.6005421,21.9287154,-63.7104346,-56.3597400,-4.4944886,25.3388069,-73.3023414,29.6037658,-31.8552173,-46.6980703,-79.7771734,21.2519025,-18.5575548,16.4465366,-27.1359259,-43.4775137,-41.5740199,-11.4433321,-23.1969435,27.4108943,-84.9472461,-53.1391834,-40.4503978,22.8866643,16.7017993)
tl;dr I think your problem is that every individual has exactly the same response value (score) for every time point (i.e. perfect homogeneity within individuals), so the random-effects term completely explains the data; there's nothing left over for the fixed effects. Are you sure you didn't want to use gene as your response variable?? (Discovered after running through a bunch of modeling attempts, by plotting the damn data, something everyone should always do first ...)
## simplifying names etc. slightly
dd <- data.frame(timepoint, sample, gene, score)
library("nlme")
m0 <- lme(score ~ timepoint + gene, data = dd,
          random = ~1|sample)
## reproduces error
As a first check, let's just see if there's something in your fixed-effect model that is singular:
lm(score~timepoint+gene,dd)
##
## Call:
## lm(formula = score ~ timepoint + gene, data = dd)
##
## Coefficients:
## (Intercept) timepoint gene
## 5.414652 -0.004064 -0.024485
No, that works fine.
Let's try it in lme4:
library(lme4)
m1 <- lmer(score ~ timepoint + gene + (1|sample), data=dd)
## Error in fn(x, ...) : Downdated VtV is not positive definite
Let's try scaling & centering the data -- sometimes that helps:
ddsc <- transform(dd,
                  timepoint = scale(timepoint),
                  gene = scale(gene))
lme still fails:
m0sc <- lme(score ~ timepoint + gene, data = ddsc,
            random = ~1|sample)
lmer works -- sort of!
m1sc <- lmer(score ~ timepoint + gene + (1|sample), data=ddsc)
## Warning message:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model is nearly unidentifiable: very large eigenvalue
## - Rescale variables?
The results give coefficients for the parameters that are vanishingly close to zero. (The residual variance is also vanishingly small.)
## m1sc
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ timepoint + gene + (1 | sample)
## Data: ddsc
## REML criterion at convergence: -9062.721
## Random effects:
## Groups Name Std.Dev.
## sample (Intercept) 7.838e-01
## Residual 3.344e-07
## Number of obs: 348, groups: sample, 8
## Fixed Effects:
## (Intercept) timepoint gene
## 5.714e+00 -4.194e-16 -1.032e-14
At this point I can only think of a couple of possibilities:
- there's something about the experimental design that means the random effects are somehow (?) completely confounded with one or both of the fixed effects
- these are simulated data that are artificially constructed to be perfectly balanced ... ?
library(ggplot2); theme_set(theme_bw())
ggplot(dd, aes(timepoint, score, group = sample, colour = gene)) +
    geom_point(size = 4) +
    geom_line(colour = "red", alpha = 0.5)
Aha!
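You can confirm the plot's message numerically: score never varies within an individual, so the random intercept absorbs everything. A one-line check:
with(dd, tapply(score, sample, function(x) length(unique(x))))  # all 1s: score is constant within each sample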
In order for R to solve the system, the matrix involved needs to be computationally invertible. The error you are getting tells you that, for computational purposes, your matrix is singular, which means it does not have an inverse.
As this error deals more with the statistical theory side, it's probably better suited for Cross Validated. See this link for more information.
Check your data to make sure you do not have perfectly correlated independent variables.
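A programmatic version of that check is to test whether the fixed-effects model matrix has full column rank (a sketch using this question's variables):
X <- model.matrix(~ timepoint + gene, data = dd)
qr(X)$rank == ncol(X)  # FALSE would indicate linearly dependent predictors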

Hausman type test in R

I have been using the "plm" package in R to analyze panel data. An important test in this package for choosing between the "fixed effects" and "random effects" models is the Hausman test. A similar test is also available in Stata. The point here is that Stata requires the fixed effects model to be estimated first, followed by the random effects model. However, I didn't see any such restriction in the "plm" package. So, I was wondering whether the "plm" package defaults to "fixed effects" first and "random effects" second. For reference, I list below the steps in Stata and R that I followed for the analysis.
Stata Steps: (data=mydata, y=dependent variable,X1:X4: explanatory variables)
*step 1 : Estimate the FE model
xtreg y X1 X2 X3 X4 ,fe
*step 2: store the estimator
est store fixed
*step 3 : Estimate the RE model
xtreg y X1 X2 X3 X4,re
* step 4: store the estimator
est store random
*step 5: run Hausman test
hausman fixed random
#R steps (data=mydata, y=dependent variable,X1:X4: explanatory variables)
#step 1 : Estimate the FE model
fe <- plm(y~X1+X2+X3+X4,data=mydata,model="within")
summary(fe)
#step 2 : Estimate the RE model
re <- pggls(y~X1+X2+X3+X4,data=mydata,model="random")
summary(re)
#step 3 : Run Hausman test
phtest(fe, re)
Update: Be sure to read the comments. Original answer below.
Trial-and-error way of finding this out:
> library(plm)
> data("Gasoline", package = "plm")
> form <- lgaspcar ~ lincomep + lrpmg + lcarpcap
> wi <- plm(form, data = Gasoline, model = "within")
> re <- plm(form, data = Gasoline, model = "random")
> phtest(wi, re)
Hausman Test
data: form
chisq = 302.8037, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
> phtest(re, wi)
Hausman Test
data: form
chisq = 302.8037, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
As you can see, the test yields the same result no matter which of the models you feed it as the first and which as the second argument.
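As an aside, phtest() also has a regression-based variant (method = "aux") which, unlike the classical statistic, can be combined with a robust covariance matrix; a sketch (see ?phtest for details):
phtest(form, data = Gasoline, method = "aux",
       vcov = function(x) vcovHC(x, method = "white2", type = "HC3"))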
