Contrasts for geeglm (binomial(link="logit")) using glht (multcomp package) - r

I’m using the multcomp package to generate contrasts for a geeglm (binomial(link="logit")) model in R. I fit the geeglm model with the following script:
library(geepack)
u1<-geeglm(outcome~ px_race_jama,id=npi_gp, family=binomial(link="logit"),data=mf)
summary(u1)
Call:
geeglm(formula = outcome ~ px_race_jama, family = binomial(link = "logit"),
data = mf, id = npi_gp)
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) -0.4671 0.1541 9.19 0.0024 **
px_race_jama1 0.0959 0.1155 0.69 0.4067
px_race_jama2 -0.0293 0.1503 0.04 0.8453
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Estimated Scale Parameters:
Estimate Std.err
(Intercept) 1 0.0506
Correlation: Structure = independence
Number of clusters: 83 Maximum cluster size: 792
To get the contrasts for the model I run the script
library(multcomp)
glht(u1,mcp(px_race_jama="Tukey"))
I receive the error:
Error in match.arg(type) :
'arg' should be one of “pearson”, “working”, “response”
Error in modelparm.default(model, ...) :
no ‘vcov’ method for ‘model’ found!
Alternatively, I have tried creating a contrast matrix:
contrast.matrix <- rbind(
`Other-Black` = c(0, -1, 1))
comps <- glht(u1, contrast.matrix)
summary(comps)
However, I receive the same error. Any help on how to correctly generate the contrasts would be greatly appreciated.
Respectfully,
Jdukes

Something like this?
library(dplyr) # for %>% and mutate()
contrast.matrix <- matrix(c(0,-1,1,
0,1,-1),nrow=2,byrow=TRUE)
contrasts_geeglm <- function(fit,model_matrix,vcov_type = "robust"){
# robust (sandwich) or naive (model-based) covariance of the coefficients
vcov_gee = if(vcov_type =="robust"){
fit$geese$vbeta}else{fit$geese$vbeta.naiv}
contrast_est = coef(fit)%*%t(model_matrix)
# contrast variances are the diagonal of K V K'; take the square root of the diagonal only
contrast_se = sqrt(diag(model_matrix%*%vcov_gee%*% t(model_matrix)))
output = data.frame(Estimate = contrast_est[1,],
SE = contrast_se) %>%
mutate(LCI = Estimate - 1.96*SE,
UCI = Estimate + 1.96*SE)
return(output)
}
contrasts_geeglm(u1,contrast.matrix,vcov_type="robust")
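The same Wald-contrast arithmetic can be checked without geepack. The sketch below is an illustration only: it uses an ordinary glm on the built-in mtcars data as a stand-in for the geeglm fit (the data, the variable names and the helper wald_contrast are assumptions, not part of the question). It confirms that the contrast c(0, -1, 1) reproduces the coefficient obtained by refitting with a different reference level.

```r
# Stand-in model: three-level factor, logit link (mtcars ships with base R)
d <- transform(mtcars, cyl = factor(cyl))
fit <- glm(am ~ cyl, family = binomial(link = "logit"), data = d)

# Hypothetical helper: estimate K %*% beta with SE sqrt(diag(K V K'))
wald_contrast <- function(beta, V, K) {
  est <- as.numeric(K %*% beta)
  se  <- sqrt(diag(K %*% V %*% t(K)))
  data.frame(Estimate = est, SE = se,
             LCI = est - qnorm(0.975) * se,
             UCI = est + qnorm(0.975) * se)
}

K <- rbind(`8 vs 6` = c(0, -1, 1))   # third level minus second level
res <- wald_contrast(coef(fit), vcov(fit), K)

# Cross-check: make level "6" the reference and read off the cyl 8 coefficient
refit <- glm(am ~ relevel(cyl, ref = "6"), family = binomial, data = d)
chk <- summary(refit)$coefficients[3, c("Estimate", "Std. Error")]
print(res)
print(chk)
```

For the GEE fit in the question, beta and V would be coef(u1) and u1$geese$vbeta, as in the function above.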

Related

Difference between glm with sandwich package and glmrob for Poisson distribution

I'm having some doubts about glmrob (package: robustbase). I want to use glmrob to get the same results I was getting with glm + sandwich.
I was writing:
library(lmtest) # coeftest()
library(sandwich) # sandwich()
p_3 <- glm(formula = var1~ var2,
family = poisson(link=log),
data = p3,
na.action = na.omit)
coeftest(p_3, vcov = sandwich)
Both variables are categorical. var1 has two categories and var2 has four.
Now I'm trying to use glmrob to get everything in the same step:
p_2 <- glmrob(formula = var1~ var2,
family = poisson (link=log),
data = p3,
na.action = na.omit,
method= "Mqle",
control = glmrobMqle.control(tcc= 1.2)
)
summary(p_2) and summary(p_3) don't yield the same results, so I think I need to change these two lines: method= "Mqle", control = glmrobMqle.control(tcc= 1.2), but I don't really know how.
Maybe I have to use method="MT", since it works for Poisson models, but I'm not sure.
The outputs:
With glmrob:
summary(p_2)
>Call: glmrob(formula = dummy28_n ~ coexistencia, family = poisson(link = log), data = p3, na.action = na.omit, method = "Mqle", control = glmrobMqle.control(tcc = 1.2))
>Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.7020 0.5947 -2.862 0.00421 **
coexistenciaPobreza energética 1.1627 0.9176 1.267 0.20514
coexistenciaInseguridad residencial 0.7671 0.6930 1.107 0.26830
coexistenciaCoexistencia de inseguridades 1.3688 0.6087 2.249 0.02453 *
---
>Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Robustness weights w.r * w.x:
143 weights are ~= 1. The remaining 3 ones are
71 124 145
0.6266 0.6266 0.6266
>Number of observations: 146
Fitted by method ‘Mqle’ (in 8 iterations)
>(Dispersion parameter for poisson family taken to be 1)
>No deviance values available
Algorithmic parameters:
acc tcc
0.0001 1.2000
maxit
50
test.acc
"coef"
with glm and sandwich:
coef <- coeftest(p_3, vcov = sandwich)
coef
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.79176 0.52705 -3.3996 0.0006748 ***
coexistenciaPobreza energética 1.09861 0.72648 1.5122 0.1304744
coexistenciaInseguridad residencial 0.69315 0.60093 1.1535 0.2487189
coexistenciaCoexistencia de inseguridades 1.32972 0.53259 2.4967 0.0125349 *
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
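One reason the two outputs need not match: a sandwich covariance only replaces the standard errors of the ordinary glm fit, whereas glmrob with method "Mqle" is a different (robust) estimator that downweights observations, so its coefficients can differ too. The base-R sketch below illustrates the first half of that statement on simulated data (the data and variable names are made up for illustration). It builds the basic HC0-type sandwich covariance by hand for a Poisson log-link model and leaves the coefficients untouched.

```r
set.seed(1)
n <- 200
x <- factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))
y <- rbinom(n, 1, c(a = 0.15, b = 0.35, c = 0.30, d = 0.45)[as.character(x)])

fit <- glm(y ~ x, family = poisson(link = "log"))

X  <- model.matrix(fit)
mu <- fitted(fit)

# "Bread": inverse of X' W X, with W = diag(mu) for the Poisson log link
bread <- solve(crossprod(X, mu * X))
# "Meat": outer product of the score contributions (y_i - mu_i) * x_i
meat  <- crossprod(X * (y - mu))

V_model    <- vcov(fit)                 # model-based covariance
V_sandwich <- bread %*% meat %*% bread  # robust covariance, same coefficients

cbind(Estimate    = coef(fit),
      SE_model    = sqrt(diag(V_model)),
      SE_sandwich = sqrt(diag(V_sandwich)))
```

Only the standard-error column changes between the two covariance choices; the Estimate column is the plain glm fit in both cases.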

GAM with binomial distribution and with spatial autocorrelation in R

I am using gam (from the mgcv package in R) to model presence/absence data in 3355 cells of 1x1 km (151 presences and 3204 absences). Even though I include a smooth of the spatial locations in the model to address the spatial dependence in my data, a variogram shows spatial autocorrelation in the residuals of my gam (range = 6000 meters). Since I am modelling a binary response, using gamm with a correlation structure is not advisable because it "performs poorly with binary data", and neither is gamm4 because, although it is supposed to be appropriate for binary data, it has "no facility for nlme style correlation structures".
The alternative I have found is to fit my model using the function magic from the same mgcv package. Because I found no examples of how to use magic for spatially correlated data, I have adapted the ?magic example for temporally correlated data. The output (see below) shows that this changes the coefficients of the model but does not remove the spatial autocorrelation, and the smooth plots show the same effect.
> summary(data)
Object of class SpatialPolygonsDataFrame
Coordinates:
min max
x 670000 780000
y 140000 234000
Is projected: TRUE
proj4string :
[+proj=tmerc +lat_0=0 +lon_0=19 +k=0.9993 +x_0=500000 +y_0=-5300000 +datum=WGS84 +units=m
+no_defs +ellps=WGS84 +towgs84=0,0,0]
Data attributes:
f_edge lat long dam
Min. : 0.0 Min. :670500 Min. :140500 0:3204
1st Qu.: 0.0 1st Qu.:722500 1st Qu.:165500 1: 151
Median : 8.0 Median :743500 Median :181500
Mean :11.5 Mean :736992 Mean :180949
3rd Qu.:21.0 3rd Qu.:756500 3rd Qu.:194500
Max. :50.0 Max. :779500 Max. :233500
> gam1x1 <- gam(dam ~ s(lat, long) + s(f_edge),
+ family = binomial, data=data,
+ method = "ML", select = TRUE, na.action = "na.fail")
> # generate standardized residuals
> res <- residuals(gam1x1, type = "pearson")
> # calculate the sample variogram and fit the variogram model
> var <- variogram(res ~ long + lat, data=data)
> fit <- fit.variogram(var, vgm(c("Sph", "Exp"))); plot(var, fit)
> # extract the nugget and range from the variogram model to create the correlation structure
> # to include in the magic formula
> fit
model psill range
1 Nug 0.5324180 0.000
2 Sph 0.2935823 6002.253
> cs1Exp <- corSpher(c(6002.253, 0.5324180), form = ~ long + lat, nugget=TRUE)
> cs1Exp <- Initialize(cs1Exp, dataPOD1x1s)
> # follow instructions from mgcv 'magic' function
> V <- corMatrix(cs1Exp)
> Cv <- chol(V)
> ## gam ignoring correlation
> b <- gam(dam ~ s(lat, long) +
+ s(f_edge),
+ family = binomial,
+ data=data,
+ method = "ML",
+ select = TRUE,
+ na.action = "na.fail")
> ## Fit smooth, taking account of *known* correlation...
> w <- solve(t(Cv)) # V^{-1} = w'w
> ## Use `gam' to set up model for fitting...
> G <- gam(dam10_15 ~ s(lat, long) +
+ s(f_edge),
+ family = binomial,
+ data=dataPOD1x1s,
+ method = "ML",
+ select = TRUE,
+ na.action = "na.fail",
+ fit=FALSE)
> ## fit using magic, with weight *matrix*
> mgfit <- magic(G$y,G$X,G$sp,G$S,G$off,rank=G$rank,C=G$C,w=w)
> mg.stuff <- magic.post.proc(G$X,mgfit,w)
> b$edf <- mg.stuff$edf;b$Vp <- mg.stuff$Vb
> b$coefficients <- mgfit$b
> summary(gam1x1)
Family: binomial
Link function: logit
Formula:
dam ~ s(lat, long) + s(f_edge)
Parametric coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.6634 0.1338 -27.37 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df Chi.sq p-value
s(lat,long) 15.208 29 83.35 2.70e-14 ***
s(f_edge) 3.176 9 54.05 2.85e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-sq.(adj) = 0.0595 Deviance explained = 15.6%
-ML = 551.92 Scale est. = 1 n = 3355
> summary(b)
Family: binomial
Link function: logit
Formula:
dam ~ s(lat, long) + s(f_edge)
Parametric coefficients:
Estimate Std. Error z value Pr(>|z|)
[1,] 0.06537 0.01131 5.778 7.56e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Approximate significance of smooth terms:
edf Ref.df Chi.sq p-value
s(lat,long) 3.716 29 0.046 1
s(f_edge) 6.753 9 0.295 1
R-sq.(adj) = 0.0618 Deviance explained = 15.6%
-ML = 551.92 Scale est. = 1 n = 3355
plot(b, select=2)
plot(gam1x1, select =2)
> resMag <- residuals.gam(b, type = "pearson")
> varMag <- variogram(resMag ~ long + lat, data=dataPOD1x1s)
> fitMag <- fit.variogram(varMag, vgm(c("Sph", "Exp"))); plot(varMag, fitMag)
> fitMag
model psill range
1 Nug 0.5324180 0.000
2 Sph 0.2935823 6002.253
Is something wrong in my script? Is there any other alternative to remove the residuals' spatial autocorrelation from my binomial gam?
Thanks a lot!
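The whitening step in the script, w <- solve(t(Cv)), relies on the identity V^{-1} = w'w for V = Cv'Cv. A minimal base-R sketch of that identity follows. The coordinates are made up, and the hand-written spherical correlation with a nugget is only assumed to mirror the general form that corSpher uses, so treat the numbers as illustrative.

```r
# Illustrative 1 km grid (coordinates in metres, made up for the sketch)
xy <- expand.grid(long = seq(0, 9000, by = 1000),
                  lat  = seq(0, 9000, by = 1000))
d  <- as.matrix(dist(xy))

# Spherical correlation with a nugget proportion (assumed parametrisation)
spher_cor <- function(d, range, nugget) {
  r   <- pmin(d / range, 1)
  rho <- (1 - nugget) * (1 - 1.5 * r + 0.5 * r^3)
  rho[d == 0] <- 1          # a point is perfectly correlated with itself
  rho
}
V <- spher_cor(d, range = 6000, nugget = 0.5)

Cv <- chol(V)               # upper triangular, V = t(Cv) %*% Cv
w  <- solve(t(Cv))          # whitening matrix

# Both differences should be numerically zero
max(abs(t(w) %*% w - solve(V)))
max(abs(w %*% V %*% t(w) - diag(nrow(V))))
```

The second check shows what the weight matrix is for: data with correlation matrix V, premultiplied by w, have an identity correlation matrix.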

linearHypothesis equivalent for ols command (rms package) in R

I am trying to use "linearHypothesis" function from "car" package to test coefficients of a model estimated with "ols" from "rms" package. The function works with "lrm" objects but not with "ols" objects. Have you got any alternatives? I know that using "lm" would sort the issue but I want to use "ols" since it is easier getting clustered standard errors there.
You can use glht from the multcomp package.
library(rms)
library(multcomp)
d <- datadist(swiss); options(datadist="d")
fit <- ols(Fertility ~ ., data = swiss)
summary(fit)
test <- glht(fit, linfct = "Agriculture = 0")
summary(test)
# Fit: ols(formula = Fertility ~ ., data = swiss, x = TRUE)
#
# Linear Hypotheses:
# Estimate Std. Error z value Pr(>|z|)
# Agriculture == 0 -0.1721 0.0703 -2.448 0.0144 *
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
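What this test computes can be reproduced by hand. The sketch below uses lm() as a stand-in for ols() (an assumption; the figures in the output above match what lm() gives for this model) and evaluates the single-row hypothesis Agriculture = 0 as a Wald z test.

```r
fit <- lm(Fertility ~ ., data = swiss)

# One-row contrast picking out the Agriculture coefficient
K <- matrix(0, nrow = 1, ncol = length(coef(fit)),
            dimnames = list("Agriculture = 0", names(coef(fit))))
K[1, "Agriculture"] <- 1

est <- as.numeric(K %*% coef(fit))
se  <- sqrt(as.numeric(K %*% vcov(fit) %*% t(K)))
z   <- est / se
p   <- 2 * pnorm(-abs(z))     # normal reference, as in the z test above

round(c(Estimate = est, SE = se, z = z, p = p), 4)
```

Adding further rows to K tests several linear combinations at once; the standard errors then come from the diagonal of K V K'.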

F-score and standardized Beta for heteroscedasticity-corrected covariance matrix (hccm) in R

I have multiple regression models which failed Breusch-Pagan tests, and so I've recalculated the variance using a heteroscedasticity-corrected covariance matrix, like this: coeftest(lm.model,vcov=hccm(lm.model)). coeftest() is from the lmtest package, while hccm() is from the car package.
I'd like to provide F-scores and standardized betas, but am not sure how to do this, because the output looks like this...
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000261 0.038824 0.01 0.995
age 0.004410 0.041614 0.11 0.916
exercise -0.044727 0.023621 -1.89 0.059 .
tR -0.038375 0.037531 -1.02 0.307
allele1_num 0.013671 0.038017 0.36 0.719
tR:allele1_num -0.010077 0.038926 -0.26 0.796
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Any advice on how to report these so they are as consistent as possible with the standard summary() and Anova() output from R and car, and the function std_beta() from the sjmisc package?
In case anyone else has this question, here was my solution. It is not particularly elegant, but it works.
I simply used the code of std_beta() as a template, and then replaced the standard error with the one derived from hccm().
# This is adapted from the std_beta() function in the sjmisc package.
# =====================================
library(lmtest) # coeftest()
library(car) # hccm()
b <-coef(lm.model) # Same estimates
b <-b[-1] # Drop the intercept
fit.data <- as.data.frame(stats::model.matrix(lm.model)) # Same model
fit.data <- fit.data[, -1] # Drop the intercept column
fit.data <- as.data.frame(sapply(fit.data, function(x) if (is.factor(x))
to_value(x, keep.labels = F)
else x))
sx <- sapply(fit.data, sd, na.rm = T)
sy <- sapply(as.data.frame(lm.model$model)[1], sd, na.rm = T)
beta <- b * sx/sy
se <-coeftest(lm.model,vcov=hccm(lm.model))[,2] # ** USE HCCM covariance for SE **
se <- se[-1]
beta.se <- se * sx/sy
data.frame(beta = beta, ci.low = (beta - beta.se *
1.96), ci.hi = (beta + beta.se * 1.96))
For the F-scores, I just squared the t-values.
I hope this saves someone some time.
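The scaling used above (multiply the coefficient and its standard error by sx/sy) and the F = t^2 shortcut for single-coefficient tests can be sketched in base R. This is an illustration on mtcars, not the poster's data. hc3_vcov is a hypothetical hand-written helper implementing the usual HC3 formula, which is assumed here to correspond to the default type of hccm().

```r
# Hypothetical helper: HC3 covariance, (X'X)^-1 X' diag(e^2/(1-h)^2) X (X'X)^-1
hc3_vcov <- function(fit) {
  X <- model.matrix(fit)
  u <- residuals(fit) / (1 - hatvalues(fit))
  bread <- solve(crossprod(X))
  bread %*% crossprod(X * u) %*% bread
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
b   <- coef(fit)[-1]                         # drop the intercept
se  <- sqrt(diag(hc3_vcov(fit)))[-1]         # robust SEs, intercept dropped

sx <- apply(model.matrix(fit)[, -1, drop = FALSE], 2, sd)
sy <- sd(model.frame(fit)[[1]])

std_beta <- b * sx / sy                      # standardized coefficients
std_se   <- se * sx / sy                     # robust SEs on the same scale
t_rob    <- b / se
F_rob    <- t_rob^2                          # 1-df Wald F for each coefficient

data.frame(std_beta, std_se, t = t_rob, F = F_rob)
```

Because the same factor sx/sy multiplies both the coefficient and its standard error, the t and F statistics are identical on the raw and standardized scales.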

R fGarch: presample matrix for garchSpec()

I am trying to specify a GARCH model with the function fGarch::garchSpec() and I need a specified presample. As defined in the manual:
presample: a numeric three column matrix with start values for the
series, for the innovations, and for the conditional
variances.
But I am pretty sure that this is not the correct order. After reading the manuals and the code of the functions 'garchFit', 'garchSpec' and 'garchSim', I am still quite confused.
The question is: how exactly should the presample matrix be built?
You don't need to supply the presample argument. garchSpec() will supply a "good" guess, and it does not matter for estimating the parameters. If you want to simulate data, just make sure the burn-in, n.start, is large enough.
Let's look at an example:
library(fGarch)
## First we simulate some data without setting presample:
# we set up the model by spec:
set.seed(911)
spec <- garchSpec(model = list(mu = 0.02, omega = 0.05, alpha = 0.2, beta = 0.75))
# then simulate our GARCH(1,1) model:
garchSim <- garchSim(spec, n = 200, n.start = 1)
plot(garchSim)
And estimates:
> garchFit(~ garch(1, 1), data = garchSim)
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu -0.02196 0.05800 -0.379 0.7049
omega 0.03567 0.02681 1.331 0.1833
alpha1 0.12074 0.04952 2.438 0.0148 *
beta1 0.84527 0.05597 15.103 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-265.8417 normalized: -1.329209
Let us now try to add a very extreme presample. In the above model (and this seed) the presample was:
> spec@presample
Presample:
time z h y
1 0 -0.4324072 1 0.02
Now we replace it by c(100, 0.1, 0.1). Since my model is a GARCH(1,1) without any ARMA part, I only need to set 3 parameters, as described in the documentation ?garchSpec. After updating spec we estimate the same model:
set.seed(911)
spec@presample <- matrix(c(100, 0.1, 0.1), ncol = 3)
garchFit(~ garch(1, 1), data = garchSim)
with the same output:
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu -0.02196 0.05800 -0.379 0.7049
omega 0.03567 0.02681 1.331 0.1833
alpha1 0.12074 0.04952 2.438 0.0148 *
beta1 0.84527 0.05597 15.103 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-265.8417 normalized: -1.329209
The likelihood and estimates are identical, but notice when we simulate with the new spec:
set.seed(911)
garchSim <- garchSim(spec, n = 200, n.start = 1)
plot(garchSim)
The extreme presample distorts the start of the simulation. Increasing the burn-in (n.start) fixes this:
set.seed(911)
garchSim <- garchSim(spec, n = 200, n.start = 100)
plot(garchSim)
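Why a longer burn-in helps can be seen from the GARCH(1,1) recursion itself. The sketch below is a hand-rolled base-R simulation (not fGarch's internal code; the function sim_garch11 and its arguments are made up for illustration) using the parameter values from the spec above. Two runs share the same innovations but start from different presample values, and the gap between their conditional variances shrinks as the recursion moves away from the start.

```r
# Hypothetical helper: h_t = omega + alpha * eps_{t-1}^2 + beta * h_{t-1}
sim_garch11 <- function(z, eps0, h0,
                        mu = 0.02, omega = 0.05, alpha = 0.2, beta = 0.75) {
  n <- length(z)
  h <- numeric(n)
  eps <- numeric(n)
  eps_prev <- eps0
  h_prev <- h0
  for (t in seq_len(n)) {
    h[t]   <- omega + alpha * eps_prev^2 + beta * h_prev
    eps[t] <- sqrt(h[t]) * z[t]
    eps_prev <- eps[t]
    h_prev <- h[t]
  }
  data.frame(y = mu + eps, h = h)
}

set.seed(911)
z <- rnorm(700)                                  # shared innovations

mild    <- sim_garch11(z, eps0 = 0,   h0 = 1)    # unremarkable start
extreme <- sim_garch11(z, eps0 = 100, h0 = 0.1)  # extreme presample shock

gap <- abs(extreme$h - mild$h)
gap[c(1, 10, 100, 700)]   # difference in conditional variance over time

# Discarding the first 500 draws plays the role of a burn-in (n.start)
kept <- extreme[-(1:500), ]
```

With alpha + beta below 1 the influence of the starting values dies out, which is why the presample matters for the first simulated points and not for a series taken after a long enough burn-in.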
