using "at" argument of margins function in R for logit model - r

I want to be able to analyze the marginal effect of continuous and binary variables in a logit model. I am hoping for R to provide what the independent marginal effect of hp is at its mean (in this example that is at 200), while also finding the marginal effect of the vs variable equaling 1. I am hoping the output table also includes the SE, p value, and z score. I am having trouble with the table and when I have gotten it to run it doesn't evaluate the two variables independently. Here is an MRE below. Thank you!
mod2 <- glm(am ~ hp + factor(vs), data=mtcars, family=binomial)
margins(mod2)
#> Average marginal effects
#> glm(formula = am ~ hp + factor(vs), family = binomial, data = mtcars)
#> hp vs1
#> -0.00203 -0.03154
#code where I am trying to evaluate at the desired values.
margins(mod2, at=list(hp=200, vs=1))

This is because you've changed vs to a factor.
Consider the following
library(margins)
mod3 <- glm(am ~ hp + vs, data=mtcars, family=binomial)
margins(mod3, at=list(hp=200, vs=1))
# Average marginal effects at specified values
# glm(formula = am ~ hp + vs, family = binomial, data = mtcars)
#
# at(hp) at(vs) hp vs
# 200 1 -0.001783 -0.02803
There is no real reason to turn vs into a factor here; it's dichotomous.

Related

fixest::feols and ggeffects::ggeffect not working together in R

I'm having a hard time getting a fixest object to play nicely with ggeffects in R, when fixed effects are included.
When I run the following code:
m <- feols(mpg ~ disp + gear + hp | cyl, mtcars,
cluster = c("am", "cyl"))
summary(m)
marg1 <- ggeffect(m, terms = c("disp"))
I get an error reading:
Can't compute marginal effects, 'effects::Effect()' returned an error.
Reason: non-conformable arguments
You may try 'ggpredict()' or 'ggemmeans()'.
However, there are no problems when I remove the fixed effects term / include it without using the pipe:
m <- feols(mpg ~ disp + gear + hp + cyl, mtcars,
cluster = c("am", "cyl"))
summary(m)
marg1 <- ggeffect(m, terms = c("disp"))
ggpredict also returns an error on my data (Could not compute variance-covariance matrix of predictions. No confidence intervals are returned.) but I am unable to replicate that same error using the toy data.

How to run a linear regression using lm() on a subset in R after multiple imputation using MICE

I want to run a linear regression analysis on my multiple imputed data. I imputed my dataset using mice. The formula I used to run a linear regression on my whole imputed set is as follows:
mod1 <-with(imp, lm(outc ~ age + sex))
pool_mod1 <- pool(mod1)
summary(pool_mod1)
This works fine. Now I want to create a subset of BMI, by saying: I want to apply this regression analysis to the group of people with a BMI below 30 and to the group of people with a BMI above or equal to 30. I tried to do the following:
mod2 <-with(imp, lm(outc ~ age + sex), subset=(bmi<30))
pool_mod2 <- pool(mod2)
summary(pool_mod2)
mod3 <-with(imp, lm(outc ~ age + sex), subset=(bmi>=30))
pool_mod3 <- pool(mod3)
summary(pool_mod3)
I do not get an error, but the problem is: all three analysis give me exactly the same results. I thought this could be just the real life situation, however, if I use variables other than bmi (like blood pressure < 150), the same thing happens to me.
So my question is: how can I do subset analysis in R when the data is imputed using mice?
(BMI is imputed as well, I do not know if that is a problem?)
You should place subset within lm(), not outside of it.
with(imp, lm(outc ~ age + sex, subset=(bmi<30)))
A reproducible example.
with(mtcars, lm(mpg ~ disp + hp)) # Both produce the same
with(mtcars, lm(mpg ~ disp + hp), subset=(cyl < 6))
Coefficients:
(Intercept) disp hp
30.73590 -0.03035 -0.02484
with(mtcars, lm(mpg ~ disp + hp, subset=(cyl < 6))) # Calculates on the subset
Coefficients:
(Intercept) disp hp
43.04006 -0.11954 -0.04609

margins.plot: using the 'which' argument to choose which margins to include in plot

I am trying to plot marginal effects in r based on a logistic regression. For example:
data <- mtcars
mod <- glm(am ~ cyl + hp + wt + mpg, family = binomial, data = data)
library(margins)
marg <- margins(mod, atmeans = TRUE)
summary(marg)
I can run the margins plot command:
plot(marg)
which plots marginal effects and confidence intervals for all of the IVs. I only want to include in the plot cyl and hp, my explanatory variables of interest. According to r documentation, this can be accomplished using the 'which' argument, which takes a character vector. However, the documentation doesn't say how to use this argument. Does anyone know how to use the 'which' argument to ask margins.plot to plot only select marginal effects? Unfortunately, the margins plot help page, linked above, does not have any examples.
plot image
Before plotting, we can specify variables of interest with the variables option within the margins()function.
mod <- glm(am ~ cyl + hp + wt + mpg, family=binomial, data=mtcars)
library(margins)
marg <- margins(mod, variables=c("cyl", "hp"))
plot(marg)
Gives:

Setting Different Levels of constants for categorical variables in R

Will anyone be able to explain how to set constants for different levels of categorical variables in r?
I have read the following: How to set the Coefficient Value in Regression; R and it does a good job for explaining how to set a constant for the whole of a categorical variable. I would like to know how to set one for each level.
As an example, let us look at the MTCARS dataset:
df <- as.data.frame(mtcars)
df$cyl <- as.factor(df$cyl)
set.seed(1)
glm(mpg ~ cyl + hp + gear, data = df)
This gives me the following output:
Call: glm(formula = mpg ~ cyl + hp + gear, data = df)
Coefficients:
(Intercept) cyl6 cyl8 hp gear
19.80268 -4.07000 -2.29798 -0.05541 2.79645
Degrees of Freedom: 31 Total (i.e. Null); 27 Residual
Null Deviance: 1126
Residual Deviance: 219.5 AIC: 164.4
If I wanted to set cyl6 to -.34 and cyl8 to -1.4, and then rerun to see how it effects the other variables, how would I do that?
I think this is what you can do
df$mpgCyl=df$mpg
df$mpgCyl[df$cyl==6]=df$mpgCyl[df$cyl==6]-0.34
df$mpgCyl[df$cyl==8]=df$mpgCyl[df$cyl==8]-1.4
model2=glm(mpgCyl ~ hp + gear, data = df)
> model2
Call: glm(formula = mpgCyl ~ hp + gear, data = df)
Coefficients:
(Intercept) hp gear
16.86483 -0.07146 3.53128
UPDATE withe comments:
cyl is a factor, therefore by default it contributes to glm as offset, not slope. Actually cyl==4 is 'hidden' but existing in the glm as well. So in your first glm what the models says is:
1) for cyl==4: mpg=19.8-0.055*hp+2.79*gear
2) for cyl==6: mpg=(19.8-4.07)-0.055*hp+2.79*gear
3) for cyl==8: mpg=(19.8-2.29)-0.055*hp+2.79*gear
Maybe you can also check here https://stats.stackexchange.com/questions/213710/level-of-factor-taken-as-intercept and here Is there any way to fit a `glm()` so that all levels are included (i.e. no reference level)?
Hope this helps

How to extract intraclass correlation or rho in r?

I am trying to use the between and within effects model specifications. My question is how do I extract the intraclass correlation from these models. In Stata, this output is rho, or the variance due to differences across panels. Below is a copy of what I have using the mtcars dataset. (Hopefully the between and within effects models are correctly specified.)
between <- lmer(mpg ~ disp + hp + (1|cyl), mtcars)
summary(between)
within <- felm(mpg ~ disp + hp | factor(cyl), data = mtcars)
summary(within)
I think I answered my own question regarding the lmer model specification. By running
between.null <- lmer(mpg ~ 1 + (1|cyl), mtcars)
summary(between.null)
There are reported both variance of cyl and variance of Residual. Rho should then just be variance of cyl / (variance of cyl + variance of Residual).

Resources