Hierarchical logistic regression - r

I am trying to predict depression using two quantitative variables and their interaction. However, before I look at how much variance they explain, I want to control for a few variables.
My plan was to build a logistic regression model:
Depression = Covariates + IV1 + IV2 + IV1:IV2
Unfortunately, R doesn't seem to care about the order in which you add the variables to the model (Type III sum of squares?). Is there a way to build a logistic regression model in which the order does matter?
Thanks in advance!
-Lukas
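One common way to make the order matter is to fit the blocks as nested models and compare them sequentially. The sketch below uses simulated data, since none was posted; the names `cov1`, `cov2`, `iv1`, `iv2` and `depression` are placeholders, not the poster's variables. `summary()` reports marginal Wald tests, which is why the order of terms appears not to matter there, whereas `anova()` on `glm` fits adds terms or models in sequence.

```r
# Simulated stand-in data; all variable names are hypothetical.
set.seed(1)
n <- 500
d <- data.frame(cov1 = rnorm(n), cov2 = rnorm(n),
                iv1 = rnorm(n), iv2 = rnorm(n))
eta <- -0.5 + 0.4 * d$cov1 + 0.6 * d$iv1 + 0.3 * d$iv1 * d$iv2
d$depression <- rbinom(n, 1, plogis(eta))

# Block 1: covariates only
m0 <- glm(depression ~ cov1 + cov2, family = binomial, data = d)
# Block 2: add the two predictors of interest
m1 <- update(m0, . ~ . + iv1 + iv2)
# Block 3: add their interaction
m2 <- update(m1, . ~ . + iv1:iv2)

# Likelihood-ratio tests for each block, given the blocks entered before it
block_tests <- anova(m0, m1, m2, test = "Chisq")
print(block_tests)
```

Each row of the table tests the block that was added on top of the preceding model, so the covariates are controlled for before the predictors of interest are assessed.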

Related

Multilevel model using glmer: Singularity issue

I'm using R to run a logistic multilevel model with random intercepts. I'm using the frequentist approach (glmer). I'm not able to use Bayesian methods due to the research centre's policy.
When I run my code it says that my model is singular. I'm not sure why or how to fix the issue. Any advice would be appreciated!
More information about the multilevel model I used:
I'm using a multilevel modelling method from intersectionality research called multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA). The method treats individuals as level 1 and nests them within their intersections (the intersection group, level 2).
My outcome is binary and I have three categorical variables as fixed effects (gender, marital status, and disability). The random effect (level 2) is called intersect1, which includes each unique combination of the categorical variables (gender x marital x disability).
This is the code:
MAIHDA_full <- glmer(IPV_pos ~ factor(sexgender) + factor(marital) + factor(disability) + (1|intersect1), data=Data, family=binomial, control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)))
The usual reason for a singular fit in mixed effects models is either that the random structure is overfitted, typically because random slopes are included, or, when there are only random intercepts as here, that the variation in the intercepts is so small that the model cannot detect it.
Looking at your model formula I suspect the issue is:
The random effect (level 2) is called intersect1 which includes each unique combination of the categorical variables (gender x marital x disability).
If I have understood this correctly, the model is equivalent to:
IPV_pos ~ sexgender + marital + disability + (1 | sexgender:marital:disability)
It is likely that any variation in sexgender:marital:disability is captured by the fixed effects, leading to near-zero variation in the random intercepts.
I suspect you will find almost identical results if you don't use any random effect.
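A rough way to check this without a mixed model is to compare the main-effects logistic regression against one with a separate parameter for every intersection. This is only a base-R sketch on simulated data, not MAIHDA itself; the factor levels and effect sizes are invented, and the outcome is generated with purely additive effects on purpose.

```r
# Simulated data with additive effects only; levels and effects are made up.
set.seed(1)
n <- 2000
d <- data.frame(
  sexgender  = factor(sample(c("man", "woman"), n, replace = TRUE)),
  marital    = factor(sample(c("single", "married", "other"), n, replace = TRUE)),
  disability = factor(sample(c("no", "yes"), n, replace = TRUE))
)
# One level per unique combination, analogous to intersect1 in the question
d$intersect1 <- interaction(d$sexgender, d$marital, d$disability, drop = TRUE)

eta <- -1 + 0.5 * (d$sexgender == "woman") +
  0.4 * (d$marital == "single") + 0.6 * (d$disability == "yes")
d$IPV_pos <- rbinom(n, 1, plogis(eta))

# Main effects only (the fixed part of the MAIHDA model)
main_fit <- glm(IPV_pos ~ sexgender + marital + disability,
                family = binomial, data = d)
# One free parameter per intersection (saturated in the three factors)
cell_fit <- glm(IPV_pos ~ intersect1, family = binomial, data = d)

# If this test shows little improvement, there is little between-intersection
# variation left over for a random intercept to pick up.
lrt <- anova(main_fit, cell_fit, test = "Chisq")
print(lrt)
```

When the intersections add almost nothing beyond the main effects, an estimated random-intercept variance at or near zero, and hence a singular fit, is what one would expect.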

Having trouble with overfitting in simple R logistic regression

I am a newbie to R and I am trying to perform a logistic regression on a set of clinical data.
My independent variables are AGE, TEMP, WBC, NLR, CRP, PCT, ESR, IL6, and TIME.
My dependent variable is the binary outcome CRKP.
After fitting the model with glm, I was given this warning message:
glm.fit <- glm(CRKP ~ AGE + TEMP + WBC + NLR + CRP + PCT + ESR, data = cv, family = binomial, subset=train)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
I searched up potential problems and used the corrplot function to see if there is multicollinearity that could potentially result in overfitting.
This is what I have as the plot.
The correlation plot shows that my ESR and PCT variables are highly correlated. Similarly, CRP and IL6 are highly correlated. But they are all important clinical indicators that I would like to include in the model.
I tried to use the VIF to selectively discard variables, but wouldn't that introduce bias? I would also have to sacrifice some of my variables of interest.
Does anyone know what I can do? Please help. Thank you!
You have a multicollinearity problem but don't want to drop variables. In this case you can use Partial Least Squares (PLS) or Principal Component Regression (PCR).
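As an illustration of the PCR idea for a binary outcome, here is a minimal base-R sketch: run a PCA on the standardized predictors, keep the leading components, and fit the logistic regression on their scores. The data are simulated stand-ins that only borrow the question's variable names, and the 90% variance cutoff is an arbitrary assumption, not a recommendation.

```r
# Simulated stand-in data: ESR/PCT and CRP/IL6 are built to be highly correlated.
set.seed(42)
n <- 300
shared1 <- rnorm(n)
shared2 <- rnorm(n)
cv <- data.frame(
  ESR = shared1 + rnorm(n, sd = 0.3),
  PCT = shared1 + rnorm(n, sd = 0.3),
  CRP = shared2 + rnorm(n, sd = 0.3),
  IL6 = shared2 + rnorm(n, sd = 0.3),
  WBC = rnorm(n)
)
cv$CRKP <- rbinom(n, 1, plogis(0.8 * shared1 - 0.6 * shared2))

predictors <- c("ESR", "PCT", "CRP", "IL6", "WBC")

# PCA on centered and scaled predictors
pca <- prcomp(cv[, predictors], center = TRUE, scale. = TRUE)

# Keep the smallest number of components explaining at least 90% of variance
var_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k <- which(var_explained >= 0.90)[1]

# Logistic regression on the (mutually uncorrelated) component scores
scores <- as.data.frame(pca$x[, seq_len(k), drop = FALSE])
scores$CRKP <- cv$CRKP
pcr_fit <- glm(CRKP ~ ., data = scores, family = binomial)
print(summary(pcr_fit)$coefficients)
```

All original variables contribute to each component through the loadings in `pca$rotation`, so none has to be dropped, at the cost of coefficients that are harder to interpret clinically.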

Correct model in R with lmer

I have a question about using linear mixed model effects in R using lmer.
I have a repeated-measures experiment with 117 participants. They all perform a task with 5 categories (Prime_Name). The dependent variable is reaction time (Score). I want to compare those 5 categories with each other. There is a lot of missing data, so I think a repeated-measures ANOVA is not an option.
I have two questions:
Am I using the correct analysis if I fit a linear mixed-effects model in R with lmer?
I am not sure if my model is completely correct, especially the random effects. When do you use only "+ (1|Resp_ID)" and when do you use "+ (Prime_Name|Resp_ID)"?
Two options:
Option 1:
model <- lmer(Score ~ Prime_Name + (1|Resp_ID), data=df)
Option 2:
model <- lmer(Score ~ Prime_Name + (Prime_Name|Resp_ID), data=df)
Any help will be appreciated.
Thank you

Use FE estimates for OLS

I am analyzing a panel data set and I am interested in some time-invariant explanatory variables (z). The Hausman test indicates that I should use a fixed effects model instead of a random effects model.
The downside is that the fixed effects model cannot estimate any coefficients for the time-invariant explanatory variables.
So one idea is to take the estimated coefficients (b) for the time-varying variables (x) from the FE model and apply them to the raw data, that is, to remove the effects of the explanatory variables already estimated. These corrected values then serve as the dependent variable in an OLS model with the time-invariant variables as explanatory variables. This leads to:
y - x'b = z'j + u (with j as the coefficients of interest)
Do these two models contradict each other through any of their necessary assumptions, or is it just that the standard errors of the OLS model need to be corrected?
Thanks for every hint!
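To make the proposed two-step procedure concrete, here is a minimal base-R sketch on simulated panel data. Everything in it is an assumption for illustration: the data-generating process, the true values (b = 2, j = 3), and in particular that the individual effect is correlated with x but not with z. The second step can only recover j under that last condition.

```r
# Simulated panel: 200 units, 5 periods. All values are illustrative.
set.seed(7)
n_id <- 200
n_t <- 5
id <- rep(seq_len(n_id), each = n_t)

alpha <- rnorm(n_id)                       # unobserved individual effect
z_i <- rnorm(n_id)                         # time-invariant regressor, independent of alpha
x <- alpha[id] + rnorm(n_id * n_t)         # time-varying regressor, correlated with alpha
y <- 1 + 2 * x + 3 * z_i[id] + alpha[id] + rnorm(n_id * n_t)

# Step 1: within (fixed effects) estimator of b via demeaning by unit
x_within <- x - ave(x, id)
y_within <- y - ave(y, id)
b <- unname(coef(lm(y_within ~ x_within - 1)))

# Step 2: remove x'b from y, average per unit, regress on z
partial <- y - x * b
partial_mean <- as.vector(tapply(partial, id, mean))
step2 <- lm(partial_mean ~ z_i)
j <- unname(coef(step2)["z_i"])

print(c(b = b, j = j))
```

The standard errors reported by `step2` treat `b` as known, so they do not account for the first-step estimation error; that part would still need a correction, for example by bootstrapping both steps together.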

How do you fit a linear mixed model with an AR(1) random effects correlation structure in R?

I am trying to use R to rerun someone else's project, so we need to use some macros in R.
Here comes a very basic question:
m1.nlme = lme(log.bp.dia ~ M25.9to9.ma5iqr + temp.c.9to9.ma4iqr + o3.ma5iqr + sea_spring + sea_summer + sea_fall + BMI + male + age_ini, data=barbara.1.clean, random = ~ 1|study_id)
Since the original SAS model uses an AR(1) [first-order autoregressive] covariance structure for the within-person variance, I am not sure how to do this in R.
And where can I see the index for different models, like unstructured?
Thanks
I don't know what you mean by "index" for different models, but to specify an AR(1) covariance structure for the residuals, you can add correlation = corAR1() to your lme call (the shorter corr = corAR1() also works through R's partial argument matching).
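As a sketch of what such a call looks like end to end, the example below fits a random-intercept model with AR(1) within-person errors using nlme on simulated data. The variables `visit`, `x` and `y` and all parameter values are invented for illustration; only `study_id` mirrors the question, and `visit` is an assumed integer time index that the real data would need to supply.

```r
library(nlme)

set.seed(123)
n_id <- 40      # number of persons (hypothetical)
n_visit <- 8    # measurements per person (hypothetical)
phi <- 0.6      # true lag-1 correlation used in the simulation

# Generate one stationary AR(1) error series of length n
sim_ar1 <- function(n, phi) {
  e <- numeric(n)
  e[1] <- rnorm(1, sd = 1 / sqrt(1 - phi^2))
  for (t in 2:n) e[t] <- phi * e[t - 1] + rnorm(1)
  e
}

d <- data.frame(
  study_id = factor(rep(seq_len(n_id), each = n_visit)),
  visit = rep(seq_len(n_visit), times = n_id),
  x = rnorm(n_id * n_visit)
)
u <- rnorm(n_id)  # person-level random intercepts
eps <- unlist(lapply(seq_len(n_id), function(i) sim_ar1(n_visit, phi)))
d$y <- 1 + 0.5 * d$x + u[as.integer(d$study_id)] + eps

# Random intercept per person plus AR(1) errors ordered by visit within person
fit <- lme(y ~ x, random = ~ 1 | study_id,
           correlation = corAR1(form = ~ visit | study_id),
           data = d)

# Estimated lag-1 correlation on its natural (-1, 1) scale
phi_hat <- coef(fit$modelStruct$corStruct, unconstrained = FALSE)
print(phi_hat)
```

Giving `form = ~ visit | study_id` explicitly tells `corAR1` which variable orders the observations within each person, rather than relying on row order.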
Say the correlation at lag $1$ is $r$, where $-1 < r < 1$ for a stationary $AR(1)$ model. The correlation at lag $k \geq 1$ is then $r^k$. Multiplying these correlations by the variance of $X_t$ gives the autocovariance matrix.
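The structure described above can be written out directly. This is a small base-R sketch; the helper name `ar1_cor` and the example values (4 time points, r = 0.5, variance 2) are made up for illustration.

```r
# Build the n x n AR(1) correlation matrix with entries r^|i - j|
ar1_cor <- function(n, r) {
  stopifnot(abs(r) < 1)  # stationarity condition
  lags <- abs(outer(seq_len(n), seq_len(n), "-"))
  r^lags
}

cor_mat <- ar1_cor(4, 0.5)

# Autocovariance matrix: correlation matrix times the variance of X_t
sigma2 <- 2
cov_mat <- sigma2 * cor_mat

print(cor_mat)
```

The diagonal holds the lag-0 correlation of 1, and each step away from the diagonal multiplies the correlation by another factor of r.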
