How to code a fixed effects Poisson model in R?

I am trying to fit a fixed effects Poisson model in R using the pglm function. I need to include both individual and time fixed effects in the model. How can I include both of them?
My example data are:
library(pglm)
library(readstata13)
library(lmtest)
library(MASS)
ships <- readstata13::read.dta13("http://www.stata-press.com/data/r13/ships.dta")
ships$lnservice <- log(ships$service)
ships$time <- rep(1:8, 5)
I have tried using the following code to estimate the model with time and individual fixed effects:
res <- pglm(accident ~ op_75_79 + co_65_69 + co_70_74 + co_75_79 + lnservice,
            family = poisson, data = ships, effect = "twoways",
            model = "within", index = c("ship", "time"))
summary(res)
The code runs, but I get the same results as with the model that has only individual effects and no time effects:
res <- pglm(accident ~ op_75_79 + co_65_69 + co_70_74 + co_75_79 + lnservice,
            family = poisson, data = ships, effect = "individual",
            model = "within", index = c("ship"))
summary(res)
Do you know where the problem could be? I don't expect the two models to produce the same estimates.
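One way to check the expectation is to enter the fixed effects as explicit dummy variables in a plain glm() call, so the one-way and two-way fits can be compared directly. The sketch below uses simulated data (the names d, id, time, x and y are made up for illustration and are not the ships data) and only shows that adding factor(time) generally changes the slope estimate. The analogous workaround with pglm would be to add factor(time) to the formula while keeping effect = "individual". Whether pglm actually ignores effect = "twoways" for the within Poisson model is an assumption here, not something verified.

```r
# Simulated panel: 5 individuals observed over 8 periods (hypothetical data)
set.seed(1)
d <- expand.grid(time = 1:8, id = 1:5)
d$x <- rnorm(nrow(d)) + 0.3 * d$time          # regressor that trends over time
d$y <- rpois(nrow(d), exp(0.2 * d$x + 0.1 * d$id + 0.15 * d$time))

# Individual fixed effects only, entered as dummies
fit_ind <- glm(y ~ x + factor(id), family = poisson, data = d)

# Individual and time fixed effects, both entered as dummies
fit_two <- glm(y ~ x + factor(id) + factor(time), family = poisson, data = d)

# Compare the slope on x between the two specifications
b_ind <- unname(coef(fit_ind)["x"])
b_two <- unname(coef(fit_two)["x"])
print(c(individual = b_ind, twoways = b_two))
```

If the two slopes printed here differ while the two pglm fits agree exactly, that points at how pglm handles the effect argument rather than at the data.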

Related

plotting semivariograms with non-nlme package models

I am trying to plot a semivariogram of the residuals from a generalised mixed effects model in R. Doing this for a mixed effects model with a normal distribution is straightforward with the nlme package, here using the quakes dataset as an example:
library(nlme)
data(quakes)
head(quakes)
model1 <- lme(mag ~ depth , random = ~1|stations, data = quakes)
summary(model1)
semivario <- Variogram(model1, form = ~long+lat,resType = "normalized")
plot(semivario, smooth = TRUE)
I want to fit a model with a non-normal distribution, which I can't do with nlme, so I have tried glmer and glmmPQL. I turned 'mag' into a binary variable and then tried to reapply the Variogram function to plot the new models.
quakes$thresh <- ifelse(quakes$mag > 5, 0, 1)
library(MASS)
model2 <- glmmPQL(as.factor(thresh) ~ depth , random = ~1|stations, family = binomial, data = quakes)
summary(model2)
semivario <- Variogram(model2, form = ~long+lat,resType = "normalized")
plot(semivario, smooth = TRUE)
library(lme4)
model3 <-glmer(as.factor(thresh) ~ depth + (1|stations), data = quakes, family = binomial)
summary(model3)
semivario <- Variogram(model3, form = ~long+lat,resType = "normalized")
plot(semivario, smooth = TRUE)
Neither of these appears to work for plotting the variogram. With glmmPQL the error says that lat and long aren't found, and with glmer it says that distance isn't specified.
How can I code a plot of semivariogram of these models? Is the Variogram function from the nlme package unusable for them? And if so what alternatives can I use?
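If Variogram() will not accept the fitted object, one alternative is to compute an empirical semivariogram directly from the model residuals and the pairwise distances between sites. The sketch below is a rough base-R version under stated assumptions: it uses a plain glm() as a stand-in for the mixed model, treats long/lat as planar coordinates (no great-circle correction), and bins distances into 20 equal-width classes. The residual vector r could in principle be replaced by Pearson residuals from the glmer or glmmPQL fits.

```r
data(quakes)
quakes$thresh <- as.integer(quakes$mag <= 5)   # 1 if magnitude is at most 5

# Stand-in model (assumption): ordinary logistic regression, no random effects
fit <- glm(thresh ~ depth, family = binomial, data = quakes)
r <- residuals(fit, type = "pearson")

# Pairwise distances between sites and half squared residual differences.
# dist() orders the pairs the same way in both calls, so the vectors align.
d <- as.vector(dist(quakes[, c("long", "lat")]))
g <- as.vector(dist(r))^2 / 2

# Average the semivariance within distance bins
bins <- cut(d, breaks = 20)
semivar <- tapply(g, bins, mean)
mid <- tapply(d, bins, mean)

# Plot the binned semivariogram with a smoother through the non-empty bins
ok <- !is.na(semivar)
plot(mid, semivar, xlab = "Distance (degrees)", ylab = "Semivariance")
lines(lowess(mid[ok], semivar[ok]))
```

Binning by hand keeps the example free of extra packages. Dedicated geostatistics packages offer more careful estimators, but their interfaces are not shown here.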

Getting estimated means after multiple imputation using the mitml, nlme & geepack R packages

I'm running multilevel multiple imputation with the mitml package (using the panImpute() function) and am fitting linear mixed models and marginal models with the nlme and geepack packages through the with() function from mitml.
I can get the estimates, p-values, etc. for those through the testEstimates() function, but I'm also looking to get estimated means across my model predictors. I've tried the emmeans package, which I normally use for estimated means when running nlme and geepack without multiple imputation, but emmeans tells me "Can't handle an object of class “mitml.result”".
I'm wondering is there a way to get pooled estimated means from the multiple imputation analyses I've run?
The data frames I'm analyzing are longitudinal/repeated measures and in long format. In the linear mixed model I want to get the estimated means for a 2x2 interaction effect and in the marginal model I'm trying to get estimated means for the 6 levels of 'time' variable. The outcome in all models is continuous.
Here's my code
# mixed model
fml <- Dep + time ~ 1 + (1|id)
imp <- panImpute(data=Data, formula=fml, n.burn=50000, n.iter=5000, m=100, group = "treatment")
summary(imp)
plot(imp, trace="all")
implist <- mitmlComplete(imp, "all", force.list = TRUE)
fit <- with(implist, lme(Dep ~ time*treatment, random = ~ 1|id, method = "ML", na.action = na.exclude, control = list(opt = "optim")))
testEstimates(fit, var.comp = TRUE)
confint.mitml.testEstimates(testEstimates(fit, var.comp = TRUE))
# marginal model
fml <- Dep + time ~ 1 + (1|id)
imp <- panImpute(data=Data, formula=fml, n.burn=50000, n.iter=5000, m=100)
summary(imp)
plot(imp, trace="all")
implist <- mitmlComplete(imp, "all", force.list = TRUE)
fit <- with(implist, geeglm(Dep ~ time, id = id, corstr ="unstructured"))
testEstimates(fit, var.comp = TRUE)
confint.mitml.testEstimates(testEstimates(fit, var.comp = TRUE))
is there a way to get pooled estimated means from the multiple imputation analyses I've run?
This is not a reprex without Data, so I can't verify this works for you. But emmeans provides support for mira-class (lists of) models in the mice package. So if you fit your model in with() using the mids rather than mitml.list class object, then you can use that to obtain marginal means of your outcome (and any contrasts or pairwise comparisons afterward).
Using example data found here, which uncomfortably loads an external workspace:
con <- url("https://www.gerkovink.com/mimp/popular.RData")
load(con)
## imputation
library(mice)
ini <- mice(popNCR, maxit = 0)
meth <- ini$meth
meth[c(3, 5, 6, 7)] <- "norm"
pred <- ini$pred
pred[, "pupil"] <- 0
imp <- mice(popNCR, meth = meth, pred = pred, print = FALSE)
## analysis
library(lme4) # fit multilevel model
mod <- with(imp, lmer(popular ~ sex + (1|class)))
library(emmeans) # obtain pooled estimates of means
(em <- emmeans(mod, specs = ~ sex) )
pairs(em) # test comparison
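To show what pooling means for a single estimated mean, the following is a minimal base-R sketch of Rubin's rules. The function name pool_rubin and the numbers passed to it are hypothetical. In practice est and se would be the estimated marginal mean and its standard error from each imputed-data fit. It is not the code that emmeans or mitml runs internally.

```r
# Pool one scalar estimate across m imputed data sets using Rubin's rules.
# est: vector of m point estimates; se: vector of their m standard errors.
pool_rubin <- function(est, se) {
  m <- length(est)
  qbar <- mean(est)                    # pooled point estimate
  w <- mean(se^2)                      # within-imputation variance
  b <- var(est)                        # between-imputation variance
  total <- w + (1 + 1 / m) * b         # total variance
  c(estimate = qbar, se = sqrt(total))
}

# Hypothetical estimated means from m = 5 imputations (made-up numbers)
pooled <- pool_rubin(est = c(10.2, 9.8, 10.5, 10.1, 9.9),
                     se  = c(0.50, 0.52, 0.49, 0.51, 0.50))
print(pooled)
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates vary across imputations, which is the point of the between-imputation term.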

R - fixed effect of panel data analysis and robust standard errors

I am working with panel data using the plm package in R. I am considering fixed effects models for group (cities), for time, and for both group and time (two-way). Because I detected heteroskedasticity with the Breusch-Pagan test, I want to compute robust standard errors.
I read the help page ?vcovHC, but I could not fully understand how to use it with coeftest.
My current code is:
library(plm)
library(lmtest)
library(sandwich)
fem_city <- plm (z ~ x+y, data = rawdata, index = c("city","year"), model = "within", effect = "individual")
fem_year <- plm (z ~ x+y, data = rawdata, index = c("city","year"), model = "within", effect = "time")
fem_both <- plm (z ~ x+y, data = rawdata, index = c("city","year"), model = "within", effect = "twoways")
coeftest(fem_city, vcovHC(fem_city, type = 'HC3', cluster = 'group'))
coeftest(fem_year, vcovHC(fem_year, type = 'HC3', cluster = 'time'))
Are these coeftest calls appropriate for computing the robust standard errors? I am wondering how to set the cluster option for effect = 'individual' and effect = 'time', respectively.
For example, I set coeftest codes:
cluster = 'group' in plm of fem_city for effect = 'individual' in coeftest
cluster = 'time' in plm of fem_year for effect = 'time' in coeftest
Is this way appropriate?
And how can I compute the robust standard errors for the two-way model with both city and year?
Set cluster='group' if you want to cluster on the variable serving as the individual index (city in your example).
Set cluster='time' if you want to cluster on the variable serving as the time index (year in your example).
You can cluster on the time index even for a fixed effects one-way individual model.
You cannot cluster on both index variables with plm::vcovHC. Look at vcovDC from the same package, which provides double clustering (DC = double clustering), e.g.,
coeftest(fem_city, vcovDC(fem_city))
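A minimal sketch of the calls described above, using the Grunfeld data shipped with plm as a stand-in for rawdata (an assumption, since the original data are not available). It applies clustering by group, clustering by time, and double clustering to the same two-way model. The object names fe_both, ct_group, ct_time and ct_both are illustrative.

```r
library(plm)
library(lmtest)

# Stand-in data: firm plays the role of city, year is the time index
data("Grunfeld", package = "plm")

fe_both <- plm(inv ~ value + capital, data = Grunfeld,
               index = c("firm", "year"), model = "within", effect = "twoways")

# Cluster on the individual index
ct_group <- coeftest(fe_both, vcov. = vcovHC(fe_both, type = "HC3", cluster = "group"))

# Cluster on the time index
ct_time <- coeftest(fe_both, vcov. = vcovHC(fe_both, type = "HC3", cluster = "time"))

# Double clustering on both indexes (default type)
ct_both <- coeftest(fe_both, vcov. = vcovDC(fe_both))

print(ct_group)
print(ct_time)
print(ct_both)
```

The coefficient estimates are identical across the three tables. Only the standard errors, and therefore the test statistics and p-values, change with the clustering choice.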

How to get the corr(u_i, Xb) for panel data fixed effects regression in R

I am trying to develop a fixed effect regression model for a panel data using the plm package in R. I want to get the correlation between fixed effects and the regressors. Something like the corr(u_i, Xb) that comes in the Stata output.
How to get it in R?
I have tried the following (using the built-in dataset in the plm package):
data("Grunfeld", package = "plm")
library(plm)
# build the model
gi <- plm(inv ~ value + capital, data = Grunfeld, model = "within")
# extract the fixed effects fixef(gi)
summary(fixef(gi))
fixefs <- fixef(gi)[index(gi, which = "id")] ## get the fixed effects
newdata <- as.data.frame(cbind(fixefs, Grunfeld$value, Grunfeld$capital))
colnames(newdata) <- c("fixed_effects", "value", "capital")
cor(newdata)
EDIT: I asked this question on Cross Validated first and got this reply: "Questions that are solely about programming or carrying out an operation within a statistical package are off-topic for this site and may be closed." Since my question has more to do with an operation in a package, I guess this is the right place!
How about the following considering functions of plm:
# Run the model
gi <- plm(inv ~ value + capital, data = Grunfeld, model = "within")
# Get the residuals (res) and fixed effects (fix)
res = residuals(gi)
fix = fixef(gi)
# Aggregate residuals and fixed effects
newdata = cbind(res, fix)
# Correlation
cor(newdata)
           res        fix
res 1.00000000 0.05171279
fix 0.05171279 1.00000000
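A hedged alternative, in case the target is specifically the correlation between the estimated individual effects and the fitted linear index Xb: the sketch below expands the fixed effects to one value per observation and correlates them with X %*% coef(gi). This follows one reading of what the Stata statistic measures, so treat it as an assumption and compare against Stata output before relying on it. Note also that cbind(res, fix) pairs 200 residuals with only 10 fixed effects, which R recycles silently. The names u_i, xb and corr_u_xb are illustrative.

```r
library(plm)
data("Grunfeld", package = "plm")

gi <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "within")

# One fixed effect per observation, matched to each row by firm name
u_i <- as.numeric(fixef(gi)[as.character(Grunfeld$firm)])

# Linear index from the within slopes (no intercept; it does not affect cor)
xb <- as.numeric(as.matrix(Grunfeld[, c("value", "capital")]) %*% coef(gi))

corr_u_xb <- cor(u_i, xb)
print(corr_u_xb)
```

Matching by name rather than by position avoids the recycling problem, since every row gets the effect of its own firm.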

Converting a mixed model with repeated and random effects and different covariance structures from SAS to R

I have a model, created in SAS by a colleague, with a repeated effect that has an ARH1 (autoregressive heterogeneous variances) covariance structure and a random effect (with a variance components covariance structure) that I am trying to re-create in R, where I have more experience.
The original SAS code is:
PROC MIXED DATA=mylib.sep_cover_data plots=all ALPHA=0.15 CL COVTEST;
CLASS soil_grp dummy_year plot pasture;
MODEL cover = pcp soil_grp / ALPHA=0.15 CL residual SOLUTION;
RANDOM pasture / ALPHA=0.15 CL SOLUTION;
REPEATED dummy_year / subject = plot type = ARH(1);
After looking through other similar questions on here, I'm pretty sure I'm able to re-create the repeated statement in R, including the covariance structure, using the nlme library:
library(nlme)
cover.data <- read.csv("https://drive.google.com/uc?export=donload&id=0Bxdatltmq5ljMVlObXh1NHFGck0", header = TRUE)
cover.data <- cover.data[cover.data$Flag=="data",]
cover.data$soil_grp <- factor(cover.data$soil_grp)
cover.data$dummy_year <- factor(cover.data$dummy_year)
cover.data$pcp <- as.numeric(as.character(cover.data$pcp))
cover.data$cover <- as.numeric(as.character(cover.data$cover))
model.test1 <- gls(cover ~ pcp + soil_grp, corr = corAR1(form = ~ 1 | year/plot), weights = varIdent(form = ~ 1 | dummy_year), data = cover.data, na.action = "na.omit")
but I can't figure out how to add in the random effect (with a different covariance structure) into this model.
Additionally, I can put both the repeated and random variables into the same model but don't know how to specify the covariance structures for them.
model.test2 <- lme(cover ~ Pcp_jan_jun + soil, random = list( plot = ~ 1, pasture = ~1), data = cover.data, na.action = "na.omit")
Is it possible to put both a repeated and a random effect with different covariance structures into the same model in R? If so, how do I code it?
The data can be downloaded from
https://drive.google.com/uc?export=donload&id=0Bxdatltmq5ljMVlObXh1NHFGck0
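One commonly suggested way to combine the two pieces is a single lme() call in which random supplies the pasture intercept, correlation supplies the AR(1) structure within plot, and weights lets the residual variance differ by year. The sketch below runs on simulated data (all values are made up and only the column names mirror the question), assumes plots are nested within pastures, and has not been checked against the PROC MIXED output.

```r
library(nlme)

# Simulated stand-in data: 6 pastures x 4 plots x 5 years (hypothetical values)
set.seed(42)
sim <- expand.grid(yr = 1:5, plot = 1:4, pasture = 1:6)
sim$pcp <- rnorm(nrow(sim), mean = 20, sd = 5)
sim$soil_grp <- factor(ifelse(sim$plot <= 2, "a", "b"))
pasture_eff <- rnorm(6, sd = 2)[sim$pasture]

# AR(1) errors within each plot; rows are ordered by year inside each plot
err <- unlist(lapply(seq_len(24), function(i) {
  e <- numeric(5)
  e[1] <- rnorm(1)
  for (t in 2:5) e[t] <- 0.5 * e[t - 1] + rnorm(1, sd = sqrt(1 - 0.5^2))
  e * c(1, 1.5, 1, 2, 1)     # year-specific scale gives heterogeneous variances
}))
sim$cover <- 10 + 0.5 * sim$pcp + pasture_eff + err
sim$dummy_year <- factor(sim$yr)
sim$pasture <- factor(sim$pasture)
sim$plot <- factor(sim$plot)

fit <- lme(cover ~ pcp + soil_grp,
           random = ~ 1 | pasture,                            # RANDOM pasture
           correlation = corAR1(form = ~ yr | pasture/plot),  # AR(1) within plot
           weights = varIdent(form = ~ 1 | dummy_year),       # variance per year
           data = sim, method = "REML",
           control = lmeControl(opt = "optim"))
summary(fit)
```

Together, corAR1 and varIdent play the role of the ARH(1) structure: the first gives the autoregressive correlation and the second the year-specific variances. The time covariate in corAR1 (yr here) needs to be integer valued.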
