Post hoc (emmeans) for binomial glmer - r

I made a glmer model to predict correct responses as a function of two independent variables (2x2 within-subjects design).
Here is the head of the df with ID, stimulus, the two within-subj conditions, the dependent variable "correct" and the predicted probability from the glmer fit (added after model computation).
Note: the correct variable was computed from the rate column (a response is correct if rate < 49 and cond_A = near, or if rate > 51 and cond_A = far; the opposite pattern counts as incorrect).
ID stimulus cond_A cond_B rate correct prob
1 5ef197dadda04d0008ba9dce FIORE near noncongr 0 1 0.5239730
2 5ef197dadda04d0008ba9dce TRENO near noncongr 19 1 0.5443476
3 5ef197dadda04d0008ba9dce MESE far noncongr 9 0 0.6530908
4 5ef197dadda04d0008ba9dce MONDO far noncongr 28 0 0.7131941
5 5ef197dadda04d0008ba9dce VASO near noncongr 63 0 0.4607776
6 5ef197dadda04d0008ba9dce SEGNO far congr 7 0 0.6626701
The results of the ANOVA are below:
fit = glmer(correct ~ spazio*linguag + (1|ID) + (1|stimulus), family = "binomial", data = correct)
car::Anova(fit, type = 3)
Analysis of Deviance Table (Type III Wald chisquare tests)
Response: correct
Chisq Df Pr(>Chisq)
(Intercept) 24.6270 1 6.957e-07 ***
spazio 0.2052 1 0.6505643
linguag 1.2483 1 0.2638814
spazio:linguag 14.1912 1 0.0001651 ***
Now I need to find out between which levels the interaction is significant.
Is the emmeans code correct? (Why is df Inf?)
Any suggestions for improving and/or plotting the post hoc?
Thanks!
post = emmeans(fit, pairwise~linguag|spazio)
$emmeans
spazio = far:
linguag emmean SE df asymp.LCL asymp.UCL
congr 0.6050 0.122 Inf 0.332 0.878
noncongr 0.8009 0.129 Inf 0.512 1.090
spazio = near:
linguag emmean SE df asymp.LCL asymp.UCL
congr 0.5396 0.121 Inf 0.268 0.811
noncongr -0.0529 0.123 Inf -0.328 0.222
Results are given on the logit (not the response) scale.
Confidence level used: 0.95
Conf-level adjustment: bonferroni method for 2 estimates
$contrasts
spazio = far:
contrast estimate SE df z.ratio p.value
congr - noncongr -0.196 0.175 Inf -1.117 0.2639
spazio = near:
contrast estimate SE df z.ratio p.value
congr - noncongr 0.593 0.170 Inf 3.478 0.0005
Results are given on the log odds ratio (not the response) scale.
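On the two side points in the question: emmeans reports df = Inf for a glmer fit because its tests are asymptotic z tests rather than t tests, and the estimates are on the logit scale. A minimal base-R sketch, using only the numbers printed above (the object names are illustrative), back-transforms them to probabilities and an odds ratio. Passing type = "response" to emmeans() should give the same back-transformation directly from the fitted model.

```r
# Estimated marginal means on the logit scale, copied from the output above
logit_emm <- c(far_congr = 0.6050, far_noncongr = 0.8009,
               near_congr = 0.5396, near_noncongr = -0.0529)

# The inverse logit gives predicted probabilities of a correct response
prob_emm <- plogis(logit_emm)
round(prob_emm, 3)

# A contrast is a difference of logits, so exp() turns it into an odds ratio
or_near <- exp(0.593)   # congr vs noncongr when spazio = near

# df = Inf means the z.ratio is referred to a standard normal distribution
p_near <- 2 * pnorm(-abs(3.478))
p_near
```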

Related

lrtest() not working: posthoc testing for multinomial with vglm() not working with lrtest()

I am running a multinomial analysis with vglm(). It all works, but then I try to follow the instructions from the following website (https://rcompanion.org/handbook/H_08.html) to do a pairwise test, because emmeans cannot handle pairwise for vglm models. The lrtest() part gives me the following error:
Error in lrtest.default(model) :
'list' object cannot be coerced to type 'double'
I cannot figure out what is wrong. I even copied and pasted the exact code that the website used (see below) and get the same error with their own code and dataset. Any ideas?
Their code and suggestion for doing pairwise testing with vglm() is the only pairwise testing option I found for vglm() anywhere on the web.
Here is the code along with all the expected output and extra details from their website (it is simpler than mine but gives the same error anyway).
Input = ("
County Sex Result Count
Bloom Female Pass 9
Bloom Female Fail 5
Bloom Male Pass 7
Bloom Male Fail 17
Cobblestone Female Pass 11
Cobblestone Female Fail 4
Cobblestone Male Pass 9
Cobblestone Male Fail 21
Dougal Female Pass 9
Dougal Female Fail 7
Dougal Male Pass 19
Dougal Male Fail 9
Heimlich Female Pass 15
Heimlich Female Fail 8
Heimlich Male Pass 14
Heimlich Male Fail 17
")
Data = read.table(textConnection(Input),header=TRUE)
### Order factors otherwise R will alphabetize them
Data$County = factor(Data$County,
levels=unique(Data$County))
Data$Sex = factor(Data$Sex,
levels=unique(Data$Sex))
Data$Result = factor(Data$Result,
levels=unique(Data$Result))
### Check the data frame
library(psych)
headTail(Data)
str(Data)
summary(Data)
### Remove unnecessary objects
rm(Input)
Multinomial regression
library(VGAM)
model = vglm(Result ~ Sex + County + Sex:County,
family=multinomial(refLevel=1),
weights = Count,
data = Data)
summary(model)
library(car)
Anova(model,
type="II",
test="Chisq")
Analysis of Deviance Table (Type II tests)
Response: Result
Df Chisq Pr(>Chisq)
Sex 1 6.7132 0.00957 **
County 3 4.1947 0.24120
Sex:County 3 7.1376 0.06764 .
library(rcompanion)
nagelkerke(model)
$Pseudo.R.squared.for.model.vs.null
Pseudo.R.squared
McFadden 0.0797857
Cox and Snell (ML) 0.7136520
Nagelkerke (Cragg and Uhler) 0.7136520
$Likelihood.ratio.test
Df.diff LogLik.diff Chisq p.value
7 -10.004 20.009 0.0055508
library(lmtest)
lrtest(model)
Likelihood ratio test
Model 1: Result ~ Sex + County + Sex:County
Model 2: Result ~ 1
#Df LogLik Df Chisq Pr(>Chisq)
1 8 -115.39
2 15 -125.39 7 20.009 0.005551 **
Post-hoc analysis
At the time of writing, the lsmeans package cannot be used with vglm models.
One option for post-hoc analysis would be to conduct analyses on reduced models, including only two levels of a factor. For example, if the variable County x Sex term had been significant, the following code could be used to create a reduced dataset with only Bloom–Female and Bloom–Male, and analyze this data with vglm.
Data.b = Data[Data$County=="Bloom" &
(Data$Sex=="Female"| Data$Sex=="Male") , ]
Data.b$County = factor(Data.b$County)
Data.b$Sex = factor(Data.b$Sex)
summary(Data.b)
County Sex Result Count
Bloom:4 Female:2 Pass:2 Min. : 5.0
Male :2 Fail:2 1st Qu.: 6.5
Median : 8.0
Mean : 9.5
3rd Qu.:11.0
Max. :17.0
library(VGAM)
model.b = vglm(Result ~ Sex,
family=multinomial(refLevel=1),
weights = Count,
data = Data.b)
lrtest(model.b)
Likelihood ratio test
#Df LogLik Df Chisq Pr(>Chisq)
1 2 -23.612
2 3 -25.864 1 4.5041 0.03381 *
Summary table of results
Comparison p-value
Bloom–Female - Bloom–Male 0.034
Cobblestone–Female - Cobblestone–Male 0.0052
Dougal–Female - Dougal–Male 0.44
Heimlich–Female - Heimlich–Male 0.14
p.value = c(0.034, 0.0052, 0.44, 0.14)
p.adj = p.adjust(p.value,
method = "fdr")
p.adj = signif(p.adj,
2)
p.adj
[1] 0.068 0.021 0.440 0.190
Comparison p-value p.adj
Bloom–Female - Bloom–Male 0.034 0.068
Cobblestone–Female - Cobblestone–Male 0.0052 0.021
Dougal–Female - Dougal–Male 0.44 0.44
Heimlich–Female - Heimlich–Male 0.14 0.19
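Because Result has only two categories here, the multinomial fit reduces to a binomial logistic regression, so the per-county reduced-model tests can be reproduced without lmtest. The following is a sketch in base R under that assumption; lr_p_value is a hypothetical helper name, not something from the website or from VGAM.

```r
# Data from the example above
Data <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "
County Sex Result Count
Bloom Female Pass 9
Bloom Female Fail 5
Bloom Male Pass 7
Bloom Male Fail 17
Cobblestone Female Pass 11
Cobblestone Female Fail 4
Cobblestone Male Pass 9
Cobblestone Male Fail 21
Dougal Female Pass 9
Dougal Female Fail 7
Dougal Male Pass 19
Dougal Male Fail 9
Heimlich Female Pass 15
Heimlich Female Fail 8
Heimlich Male Pass 14
Heimlich Male Fail 17
")

# Likelihood ratio test of Sex within one county (hypothetical helper)
lr_p_value <- function(county) {
  sub <- Data[Data$County == county, ]
  sub$fail <- as.numeric(sub$Result == "Fail")
  fit <- glm(fail ~ Sex, family = binomial, weights = Count, data = sub)
  chisq <- fit$null.deviance - fit$deviance   # model vs intercept-only
  df <- fit$df.null - fit$df.residual
  pchisq(chisq, df = df, lower.tail = FALSE)
}

counties <- levels(Data$County)
p.value <- sapply(counties, lr_p_value)
p.adj <- p.adjust(p.value, method = "fdr")
signif(cbind(p.value, p.adj), 2)
```

For Bloom this gives a chi-square of about 4.50 on 1 df, in line with the lrtest() output quoted above.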
It looks to me like qdrg() can be used. As I commented, you can't use the lazy interface; you have to give all the specific parameters needed:
> library(emmeans)
> RG = qdrg(formula(model), Data, coef(model), vcov(model), link = "log")
> RG
'emmGrid' object with variables:
Sex = Female, Male
County = Bloom, Cobblestone, Dougal, Heimlich
Transformation: “log”
> emmeans(RG, consec ~ Sex | County)
$emmeans
County = Bloom:
Sex emmean SE df asymp.LCL asymp.UCL
Female -0.588 0.558 Inf -1.68100 0.5054
Male 0.887 0.449 Inf 0.00711 1.7675
County = Cobblestone:
Sex emmean SE df asymp.LCL asymp.UCL
Female -1.012 0.584 Inf -2.15597 0.1328
Male 0.847 0.398 Inf 0.06643 1.6282
County = Dougal:
Sex emmean SE df asymp.LCL asymp.UCL
Female -0.251 0.504 Inf -1.23904 0.7364
Male -0.747 0.405 Inf -1.54032 0.0459
County = Heimlich:
Sex emmean SE df asymp.LCL asymp.UCL
Female -0.629 0.438 Inf -1.48668 0.2295
Male 0.194 0.361 Inf -0.51320 0.9015
Results are given on the log (not the response) scale.
Confidence level used: 0.95
$contrasts
County = Bloom:
contrast estimate SE df z.ratio p.value
Male - Female 1.475 0.716 Inf 2.060 0.0394
County = Cobblestone:
contrast estimate SE df z.ratio p.value
Male - Female 1.859 0.707 Inf 2.630 0.0085
County = Dougal:
contrast estimate SE df z.ratio p.value
Male - Female -0.496 0.646 Inf -0.767 0.4429
County = Heimlich:
contrast estimate SE df z.ratio p.value
Male - Female 0.823 0.567 Inf 1.450 0.1470
Results are given on the log (not the response) scale.
If I understand this model correctly, the response is the log of the ratio of the 2nd multinomial response to the 1st. So what we see above is estimated differences of logs and estimated differences of those differences. If run with type = "response" you would get estimated ratios, and ratios of those ratios.
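Since the model includes the full Sex by County interaction, the estimates for each cell can be checked against the raw counts. A small base-R sketch for Bloom (object names are illustrative), showing that the emmeans are log(Fail/Pass) and that exponentiating the contrast gives a ratio of those ratios:

```r
# Counts for Bloom from the data above
odds_fail_female <- 5 / 9     # Fail / Pass
odds_fail_male   <- 17 / 7

log(odds_fail_female)   # about -0.588, the Female emmean for Bloom
log(odds_fail_male)     # about  0.887, the Male emmean for Bloom

# The "Male - Female" contrast is a difference of logs ...
log_ratio <- log(odds_fail_male) - log(odds_fail_female)

# ... so back-transforming it gives the ratio of the two Fail/Pass ratios
ratio <- exp(log_ratio)
c(log_ratio = log_ratio, ratio = ratio)
```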
Probably something changed in either the VGAM package or the lmtest package since that was written.
But the following will work for a likelihood ratio test for vglm models:
VGAM::lrtest(model)
VGAM::lrtest(model, model2)

Linear mixed model confidence intervals question

Hoping that you can clear some confusion in my head.
Linear mixed model is constructed with lmerTest:
MODEL <- lmer(Ca content ~ SYSTEM +(1 | YEAR/replicate) +
(1 | YEAR:SYSTEM), data = IOSDV1)
Fun starts happening when I'm trying to get the confidence intervals for the specific levels of the main effect.
Commands emmeans and lsmeans produce the same intervals (example; SYSTEM A3: 23.9-128.9, mean 76.4, SE:8.96).
However, the command as.data.frame(effect("SYSTEM", MODEL)) produces different, narrower confidence intervals (example; SYSTEM A3: 58.0-94.9, mean 76.4, SE:8.96).
What am I missing and what number should I report?
To summarize, for the Ca content I have 6 measurements in total per treatment (three per year, each from a different replicate). I will leave the names in the code in my language, as used. The idea is to test whether certain production practices affect the content of specific minerals in the grains. Random effects with zero estimated variance were left in the model for this example.
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: CA ~ SISTEM + (1 | LETO/ponovitev) + (1 | LETO:SISTEM)
Data: IOSDV1
REML criterion at convergence: 202.1
Scaled residuals:
Min 1Q Median 3Q Max
-1.60767 -0.74339 0.04665 0.73152 1.50519
Random effects:
Groups Name Variance Std.Dev.
LETO:SISTEM (Intercept) 0.0 0.0
ponovitev:LETO (Intercept) 0.0 0.0
LETO (Intercept) 120.9 11.0
Residual 118.7 10.9
Number of obs: 30, groups: LETO:SISTEM, 10; ponovitev:LETO, 8; LETO, 2
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 76.417 8.959 1.548 8.530 0.0276 *
SISTEM[T.C0] -5.183 6.291 24.000 -0.824 0.4181
SISTEM[T.C110] -13.433 6.291 24.000 -2.135 0.0431 *
SISTEM[T.C165] -7.617 6.291 24.000 -1.211 0.2378
SISTEM[T.C55] -10.883 6.291 24.000 -1.730 0.0965 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) SISTEM[T.C0 SISTEM[T.C11 SISTEM[T.C16
SISTEM[T.C0 -0.351
SISTEM[T.C11 -0.351 0.500
SISTEM[T.C16 -0.351 0.500 0.500
SISTEM[T.C5 -0.351 0.500 0.500 0.500
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see ?isSingular
> ls_means(MODEL, ddf="Kenward-Roger")
Least Squares Means table:
Estimate Std. Error df t value lower upper Pr(>|t|)
SISTEMA3 76.4167 8.9586 1.5 8.5299 23.9091 128.9243 0.02853 *
SISTEMC0 71.2333 8.9586 1.5 7.9514 18.7257 123.7409 0.03171 *
SISTEMC110 62.9833 8.9586 1.5 7.0305 10.4757 115.4909 0.03813 *
SISTEMC165 68.8000 8.9586 1.5 7.6797 16.2924 121.3076 0.03341 *
SISTEMC55 65.5333 8.9586 1.5 7.3151 13.0257 118.0409 0.03594 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Confidence level: 95%
Degrees of freedom method: Kenward-Roger
> emmeans(MODEL, spec = c("SISTEM"))
SISTEM emmean SE df lower.CL upper.CL
A3 76.4 8.96 1.53 23.9 129
C0 71.2 8.96 1.53 18.7 124
C110 63.0 8.96 1.53 10.5 115
C165 68.8 8.96 1.53 16.3 121
C55 65.5 8.96 1.53 13.0 118
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
> as.data.frame(effect("SISTEM", MODEL))
SISTEM fit se lower upper
1 A3 76.41667 8.958643 57.96600 94.86734
2 C0 71.23333 8.958643 52.78266 89.68400
3 C110 62.98333 8.958643 44.53266 81.43400
4 C165 68.80000 8.958643 50.34933 87.25067
5 C55 65.53333 8.958643 47.08266 83.98400
Many thanks.
I'm pretty sure this has to do with the dreaded "denominator degrees of freedom" question, i.e. what kind (if any) of finite-sample correction is being employed. tl;dr emmeans is using a Kenward-Roger correction, which is more or less the most accurate available option — the only reason not to use K-R is if you have a large data set for which it becomes unbearably slow.
load packages, simulate data, fit model
library(lmerTest)
library(emmeans)
library(effects)
dd <- expand.grid(f=factor(letters[1:3]),g=factor(1:20),rep=1:10)
set.seed(101)
dd$y <- simulate(~f+(1|g), newdata=dd, newparams=list(beta=rep(1,3),theta=1,sigma=1))[[1]]
m <- lmer(y~f+(1|g), data=dd)
compare default emmeans with effects
emmeans(m, ~f)
## f emmean SE df lower.CL upper.CL
## a 0.848 0.212 21.9 0.409 1.29
## b 1.853 0.212 21.9 1.414 2.29
## c 1.863 0.212 21.9 1.424 2.30
## Degrees-of-freedom method: kenward-roger
## Confidence level used: 0.95
as.data.frame(effect("f",m))
## f fit se lower upper
## 1 a 0.8480161 0.2117093 0.4322306 1.263802
## 2 b 1.8531805 0.2117093 1.4373950 2.268966
## 3 c 1.8632228 0.2117093 1.4474373 2.279008
effects doesn't explicitly tell us what/whether it's using a finite-sample correction: we could dig around in the documentation or the code to try to find out. Alternatively, we can tell emmeans not to use finite-sample correction:
emmeans(m, ~f, lmer.df="asymptotic")
## f emmean SE df asymp.LCL asymp.UCL
## a 0.848 0.212 Inf 0.433 1.26
## b 1.853 0.212 Inf 1.438 2.27
## c 1.863 0.212 Inf 1.448 2.28
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
Testing shows that these are equivalent to about a tolerance of 0.001 (probably close enough). In principle we should be able to specify KR=TRUE to get effects to use Kenward-Roger correction, but I haven't been able to get that to work yet.
However, I will also say that there's something a little bit funky about your example. If we compute the distance between the mean and the lower CI in units of standard error, for emmeans we get (76.4-23.9)/8.96 = 5.86, which implies a very small effect degrees of freedom (e.g. about 1.55). That seems questionable to me unless your data set is extremely small ...
From your updated post, it appears that Kenward-Roger is indeed estimating only 1.5 denominator df.
In general it is dicey/not recommended to try fitting random effects where the grouping variable has a small number of levels (although see here for a counterargument). I would try treating LETO (which has only two levels) as a fixed effect, i.e.
CA ~ SISTEM + LETO + (1 | LETO:ponovitev) + (1 | LETO:SISTEM)
and see if that helps. (I would expect you would then get on the order of 7 df, which would make your CIs ± 2.4 SE instead of ± 6 SE ...)
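The two sets of intervals share the same estimate and SE, so the whole difference lies in the t quantile. A sketch in base R that recovers the implied degrees of freedom from the printed numbers for SISTEM A3 (df_for is a hypothetical helper). The value of about 25 for effect() is an inference from this output (it equals 30 observations minus 5 fixed-effect coefficients), not something taken from the effects documentation.

```r
est <- 76.41667
se  <- 8.958643

# Half-widths of the two reported intervals, in units of SE
mult_emmeans <- (128.9243 - est) / se   # Kenward-Roger interval
mult_effects <- (94.86734 - est) / se   # interval from effect()

# Degrees of freedom whose 97.5% t quantile reproduces a given multiplier
df_for <- function(mult, interval) {
  uniroot(function(d) qt(0.975, d) - mult, interval)$root
}
df_emmeans <- df_for(mult_emmeans, c(1, 2))
df_effects <- df_for(mult_effects, c(5, 100))

round(c(mult_emmeans = mult_emmeans, df_emmeans = df_emmeans,
        mult_effects = mult_effects, df_effects = df_effects), 2)

# Multiplier to expect if treating LETO as fixed brings the df to about 7
qt(0.975, 7)
```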

Check for statistically significant differences between groups after running a logistic regression w/ interaction & random effect?

I ran an ordinal logistic regression (using the function clmm from the R package ordinal) with a two-factor interaction and a random effect.
The response is a factor w/ 5 levels (Likert scale: 1 2 3 4 5), the independent variables are a factor w/ 2 levels (time) and a factor w/ 3 levels (group).
The code looks like this:
library(ordinal)
# dataset
ID time group response
person1 1 a 3
person2 1 a 5
person3 1 c 5
person4 1 b 2
person5 1 c 2
person6 1 a 4
person1 2 a 2
person2 2 a 2
person3 2 c 1
person4 2 b 4
person5 2 c 3
person6 2 a 4
... ... ... ...
# model
model <- clmm(response ~ time*group + (1|ID), data = dataset)
# model results
formula: response ~ time * group + (1 | ID)
data: dataset
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 168 -226.76 473.52 508(4150) 9.42e-05 1.8e+02
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 5.18 2.276
Number of groups: ID 84
Coefficients:
Estimate Std. Error z value Pr(>|z|)
time2 0.2837 0.5289 0.536 0.59170
group_b 1.8746 0.6946 2.699 0.00695 **
group_c 4.0023 0.9383 4.265 2e-05 ***
time2:group_b -0.5100 0.7294 -0.699 0.48447
time2:group_c -0.8830 0.9749 -0.906 0.36508
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Threshold coefficients:
Estimate Std. Error z value
1|2 -2.6223 0.6440 -4.072
2|3 0.2474 0.5427 0.456
3|4 2.5384 0.5824 4.359
4|5 4.6786 0.7143 6.550
As you can see, the model results show only whether there are differences compared to the intercept (i.e. time1:group_a). However, what I am also interested in is to check if the difference between time1:group_b and time2:group_b is statistically significant, same for group_c.
Since I have to account for the random effect, I cannot use a simple chi-square test to check for statistically significant differences between groups. I therefore tried to run the function contrast from the R package emmeans, which uses the output of the function emmeans, see the code below:
library(emmeans)
em <- emmeans(model, ~ time | group) #calculates the estimated marginal means
contrast(em, "consec", simple = "each")
# contrast results
$`simple contrasts for time`
group = a:
contrast estimate SE df z.ratio p.value
2 - 1 0.284 0.529 Inf 0.536 0.5917
group = b:
contrast estimate SE df z.ratio p.value
2 - 1 -0.226 0.482 Inf -0.470 0.6386
group = c:
contrast estimate SE df z.ratio p.value
2 - 1 -0.599 0.816 Inf -0.734 0.4629
Note: contrasts are still on the as.factor scale
$`simple contrasts for group`
time = 1:
contrast estimate SE df z.ratio p.value
b - a 1.87 0.695 Inf 2.699 0.0137
c - b 2.13 0.871 Inf 2.443 0.0284
time = 2:
contrast estimate SE df z.ratio p.value
b - a 1.36 0.687 Inf 1.986 0.0897
c - b 1.75 0.838 Inf 2.095 0.0695
Note: contrasts are still on the as.factor scale
P value adjustment: mvt method for 2 tests
My questions are:
a) Is this a correct and valid method to check whether the differences are significant?
b) If not, what is the correct way to do this?
Of course any other suggestion is extremely welcome! Thanks a lot.
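The simple contrasts for time are linear combinations of the fixed-effect coefficients, which gives a quick sanity check on the emmeans output. A sketch in base R using the printed coefficients (object names are illustrative); it reproduces the estimates only, since the standard errors also need the coefficient covariance matrix.

```r
# Fixed-effect coefficients copied from the clmm summary above
b <- c(time2 = 0.2837, group_b = 1.8746, group_c = 4.0023,
       "time2:group_b" = -0.5100, "time2:group_c" = -0.8830)

# Effect of time (2 - 1) within each group:
# main effect of time plus that group's interaction term
time_effect <- c(
  a = b[["time2"]],
  b = b[["time2"]] + b[["time2:group_b"]],
  c = b[["time2"]] + b[["time2:group_c"]]
)
round(time_effect, 3)   # compare with the "2 - 1" contrasts from emmeans

# On the logit scale, exp() turns these into cumulative odds ratios
round(exp(time_effect), 2)
```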

Estimating effect size with emmeans for post hoc

Is there a way to have effect size (such as Cohen's d or the most appropriate) directly using emmeans()?
I cannot find anything for obtaining effect size by using emmeans()
post <- emmeans(fit, pairwise~ favorite.pirate | sex)
emmip(fit, ~ favorite.pirate | sex)
There is not a built-in provision for effect-size calculations, but you can cobble one together by defining a custom contrast function that divides each pairwise comparison by a value of sigma:
mypw.emmc = function(..., sigma = 1) {
result = emmeans:::pairwise.emmc (...)
for (i in seq_along(result[1, ]))
result[[i]] = result[[i]] / sigma
result
}
Here's a test run:
> mypw.emmc(1:3, sigma = 4)
1 - 2 1 - 3 2 - 3
1 0.25 0.25 0.00
2 -0.25 0.00 0.25
3 0.00 -0.25 -0.25
With your model, the error SD is 9.246 (look at summary(fit)), so:
> emmeans(fit, mypw ~ sex, sigma = 9.246, name = "effect.size")
NOTE: Results may be misleading due to involvement in interactions
$emmeans
sex emmean SE df lower.CL upper.CL
female 63.8 0.434 3.03 62.4 65.2
male 74.5 0.809 15.82 72.8 76.2
other 68.8 1.439 187.08 65.9 71.6
Results are averaged over the levels of: favorite.pirate
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
$contrasts
effect.size estimate SE df t.ratio p.value
female - male -1.158 0.0996 399 -11.624 <.0001
female - other -0.537 0.1627 888 -3.299 0.0029
male - other 0.621 0.1717 981 3.617 0.0009
Results are averaged over the levels of: favorite.pirate
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 3 estimates
Some words of caution though:
The SEs of the effect sizes are misleading because they don't account for the variation in sigma.
This is not a very good example because
a. The factors interact (Edward Low is different in his profile). Also, see the warning message.
b. The model is singular (as warned when the model was fitted), yielding an estimated variance of zero for college.
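What the custom contrast computes is each pairwise difference of estimated means divided by sigma. A base-R sketch of that arithmetic, using the rounded emmeans printed above (so the results differ slightly from the output in the last digit); the object names are illustrative.

```r
# Rounded estimated marginal means and residual SD from the output above
emm <- c(female = 63.8, male = 74.5, other = 68.8)
sigma_hat <- 9.246

# All pairwise differences, standardized by the residual SD
pairs_idx <- combn(names(emm), 2)
d <- (emm[pairs_idx[1, ]] - emm[pairs_idx[2, ]]) / sigma_hat
names(d) <- paste(pairs_idx[1, ], "-", pairs_idx[2, ])
round(d, 3)
```

More recent emmeans releases also provide an eff_size() function that takes sigma and edf arguments and accounts for the uncertainty in sigma; its help page describes what is available in the installed version.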
library(yarrr)
View(pirates)
library(lme4)
library(lmerTest)
fit <- lmer(weight~ favorite.pirate * sex +(1|college), data = pirates)
anova(fit, ddf = "Kenward-Roger")
post <- emmeans(fit, pairwise~ sex)
post

R logistic regression and marginal effects - how to exclude NA values in categorical independent variable

I am a beginner with R. I am using glm to conduct logistic regression and then using the 'margins' package to calculate marginal effects but I don't seem to be able to exclude the missing values in my categorical independent variable.
I have tried to ask R to exclude NAs from the regression. The categorical variable is weight status at age 9 (wgt9), and it has three levels (1, 2, 3) and some NAs.
What am I doing wrong? Why do I get a wgt9NA result in my outputs and how can I correct it?
Thanks in advance for any help/advice.
Conduct logistic regression
summary(logit.phbehav <- glm(obese13 ~ gender + as.factor(wgt9) + aded08b,
data = gui, weights = bdwg01, family = binomial(link = "logit")))
Regression output
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) -3.99 0.293 -13.6 2.86e- 42
2 gender 0.387 0.121 3.19 1.42e- 3
3 as.factor(wgt9)2 2.49 0.177 14.1 3.28e- 45
4 as.factor(wgt9)3 4.65 0.182 25.6 4.81e-144
5 as.factor(wgt9)NA 2.60 0.234 11.1 9.94e- 29
6 aded08b -0.0755 0.0224 -3.37 7.47e- 4
Calculate the marginal effects
effects_logit_phtotal = margins(logit.phtot)
print(effects_logit_phtotal)
summary(effects_logit_phtotal)
Marginal effects output
> summary(effects_logit_phtotal)
factor AME SE z p lower upper
aded08a -0.0012 0.0002 -4.8785 0.0000 -0.0017 -0.0007
gender 0.0115 0.0048 2.3899 0.0169 0.0021 0.0210
wgt92 0.0941 0.0086 10.9618 0.0000 0.0773 0.1109
wgt93 0.4708 0.0255 18.4569 0.0000 0.4208 0.5207
wgt9NA 0.1027 0.0179 5.7531 0.0000 0.0677 0.1377
First of all, welcome to Stack Overflow. Please check the answer here to see how to make a great R question. Not providing a sample of your data sometimes makes it impossible to answer the question. However, taking a guess, I think your NA values are stored as the string "NA" rather than as true missing values. This behavior can be seen in the dummy data below.
First let's create the dummy data:
v1 <- c(2,3,3,3,2,2,2,2,NA,NA,NA)
v2 <- c(2,3,3,3,2,2,2,2,"NA","NA","NA")
v3 <- c(11,5,6,7,10,8,7,6,2,5,3)
obese <- c(0,1,1,0,0,1,1,1,0,0,0)
df <- data.frame(obese,v1,v2,v3)
Using the variable named v1 does not include NA as a category:
glm(formula = obese ~ as.factor(v1) + v3, family = binomial(link = "logit"),
data = df)
Deviance Residuals:
1 2 3 4 5 6 7 8
-2.110e-08 2.110e-08 1.168e-05 -1.105e-05 -2.110e-08 3.094e-06 2.110e-08 2.110e-08
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 401.48 898581.15 0 1
as.factor(v1)3 -96.51 326132.30 0 1
v3 -46.93 106842.02 0 1
Converting the string "NA" to a factor, on the other hand, gives output similar to that in the question:
glm(formula = obese ~ as.factor(v2) + v3, family = binomial(link = "logit"),
data = df)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.402e-05 -2.110e-08 -2.110e-08 2.110e-08 1.472e-05
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 394.21 744490.08 0.001 1
as.factor(v2)3 -95.33 340427.26 0.000 1
as.factor(v2)NA -327.07 613934.84 -0.001 1
v3 -45.99 84477.60 -0.001 1
Try the following to replace NAs that are strings:
gui$wgt9[ gui$wgt9 == "NA" ] <- NA
Don't forget to accept any answer that solved your problem.
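A quick way to check whether this is the problem is to look at the factor levels before and after the replacement. A minimal sketch on a made-up vector (wgt9 here is toy data, not the asker's column); it uses %in% for the replacement, which never returns NA and so also works when real NAs are already present.

```r
# Toy vector mixing the string "NA" with a true missing value
wgt9 <- c("1", "2", "3", "NA", "2", NA, "NA")

levels(factor(wgt9))     # "NA" appears as a level, so glm() estimates it

# Replace the string "NA" with a real missing value
wgt9_clean <- wgt9
wgt9_clean[wgt9_clean %in% "NA"] <- NA

levels(factor(wgt9_clean))   # only the real categories remain
sum(is.na(wgt9_clean))       # rows that glm() will now drop by default
```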

Resources