Error when using a multilevel regression (lme4) - r

I want to use a multilevel regression to analyse the effect of some independent variables on a dependent variable, with varying intercepts and slopes.
My regression includes non-numeric independent variables that I want to use for the varying intercept and slope. The dependent variable is numeric. When I fit this multilevel regression I get the following error:
Error: number of observations (=88594) <= number of random effects (=337477) for term (1 + x | z); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable
x and z are characters and are correlated with each other and with y. This is the regression I use:
multi_reg1 <- lmer(y ~ 1 + x + (1 + x | z), REML = FALSE, data = data_frame1)
Is there a way to fix this problem, or is it not possible and I have to use other regression methods?
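The error arises because (1 + x | z) asks for one random intercept plus one random coefficient per non-reference level of x within every level of z; with character variables that have many levels, this quickly exceeds the number of rows. A minimal sketch of two simpler structures that usually remove the error, assuming x and z are converted to factors (multi_reg2 and multi_reg3 are illustrative names):

library(lme4)

data_frame1$x <- factor(data_frame1$x)  # convert the character variables to factors
data_frame1$z <- factor(data_frame1$z)

# Option 1: random intercept per z only; x stays a fixed effect
multi_reg2 <- lmer(y ~ 1 + x + (1 | z), REML = FALSE, data = data_frame1)

# Option 2: crossed random intercepts for x and z instead of a random slope
multi_reg3 <- lmer(y ~ 1 + (1 | x) + (1 | z), REML = FALSE, data = data_frame1)

Whether a per-level random slope is identifiable at all depends on how many levels x has; with many levels it usually is not.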

Related

Multiple groups with Linear Quantile Mixed Models

I'm trying to fit a linear quantile mixed model with multiple group variables for random effects.
With lme4 in R, I can use
model <- lmer(target ~ feat1 + (1|feat2) + (1|feat3), data)
which gives me two sets of random effects, one for feat2 and one for feat3.
Is it possible to do anything similar for quantile models? Reading through the documentation for lqmm, it seems it can only handle one group variable?
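For reference, lqmm's group argument does take a single grouping factor, so there is no direct equivalent of two crossed random intercepts. A sketch of what lqmm supports directly, plus a rough workaround that collapses the two grouping factors into one combined factor (fit2, fit3 and feat23 are illustrative names, not from the question):

library(lqmm)

# Single grouping variable, as lqmm supports directly
fit2 <- lqmm(fixed = target ~ feat1, random = ~ 1, group = feat2,
             tau = 0.5, data = data)

# Rough workaround: one combined grouping factor for feat2 x feat3
data$feat23 <- interaction(data$feat2, data$feat3, drop = TRUE)
fit3 <- lqmm(fixed = target ~ feat1, random = ~ 1, group = feat23,
             tau = 0.5, data = data)

Note the workaround is not equivalent to crossed random effects; it treats each feat2/feat3 combination as its own group.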

How to deal with NA in the output of the linear regression in R, specifically NA for the three-way interaction terms?

I am running a linear regression with a three-way interaction in R:
lm(A ~ X*Y*Z), where A is a numerical variable and X, Y, Z are all categorical variables (factors).
X has 5 levels, Y has 2 levels, Z has 4 levels. Every time I run the regression, the three-way interaction terms for the last level are missing. For example, if I relevel the Z factor, the last level gets dropped from the three-way interaction.
Coefficients: (8 not defined because of singularities). (This message appears in the R output.)
I have tried fitting without an intercept, but it did not make any difference:
lm(A ~ 0 + X*Y*Z) or lm(A ~ X*Y*Z - 1), and all other possible combinations.
lm (A ~ X * Y * Z, data = dat)
X has 5 levels, Y has 2 levels and Z has 4 levels [one level of each variable acts as the base level].
The NAs arise from the singularities: typically, some combinations of the X, Y and Z levels never occur in the data, so those interaction coefficients cannot be estimated. If you still want to use the three categorical variables, you could use a mixed-effects model package such as lme4:
model = lmer(A ~ noise + (1 | Y) + (1 | X) + (1 | Z), data = data)
This will give you coefficients for the factors by fitting random intercepts, but you might need additional fixed-effect predictors, labelled "noise" here.
Another approach could be creating dummy variables using the fastdummies package. This will create a binary variable out of each category that you have.
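A sketch of the dummy-variable route, assuming the data frame and column names from the question (dat, X, Y, Z):

library(fastdummies)

dat_dummies <- dummy_cols(dat,
                          select_columns = c("X", "Y", "Z"),
                          remove_first_dummy = TRUE,      # drop one level per variable to avoid the dummy trap
                          remove_selected_columns = TRUE)

# dat_dummies now holds one 0/1 column per remaining level, which lm() can use
# directly. Note this does not by itself recover interaction coefficients for
# factor combinations that never occur in the data.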

Logistic regression for numerical predictor?

I'm working on a data set and want to use some of the following variables to predict "Operatieduur". All the predictors have been converted to factors.
LogicFit <- train(Operatieduur ~ Anesthesioloog + Aorta_chirurgie + Benadering +
Chirurg + Operatietype, data = TrainData,
method="glm", family="binomial")
Here I use "train" function from caret package to make a logistic fitting with glm. When I ran this code I got the error message:
1: model fit failed for Resample01: parameter=none Error in eval(family$initialize) : y values must be 0 <= y <= 1
I googled it and found that the reason is that the response "Operatieduur" is a continuous numerical value (it's a duration). So how should I modify the call so that the predictors (they are all categorical) predict a continuous numerical value? Can logistic regression do that?
Logistic regression predicts categories, not numerical variables. If you want to predict a continuous numerical variable (even from categorical predictors), use ordinary linear regression. Depending on the number of categories in your predictor variables, you may want to consider one-hot/dummy encoding.
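For example, keeping the same caret workflow but switching to ordinary least squares (a sketch using the variable names from the question; LinFit is an illustrative name):

library(caret)

LinFit <- train(Operatieduur ~ Anesthesioloog + Aorta_chirurgie + Benadering +
                Chirurg + Operatietype,
                data = TrainData,
                method = "lm")   # linear regression instead of a binomial glm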

High GLMER dispersion parameters

I am running a glmer with a random effect for count data (x) and two categorical variables (y and z):
fullmodel<-glmer(x~y*z + (1|Replicate), family = poisson, data = Data)
However, when I look at the dispersion parameter:
> dispersion_glmer(fullmodel)
[1] 2.338742
It is way higher than 1. Does this mean my model is overdispersed? How do I correct it? I want to keep my random effect, but when I tried to swap the family to quasipoisson it said you can't use it with glmer.
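Two commonly suggested remedies, sketched with the model from the question (obs, fullmodel_olre and fullmodel_nb are illustrative names):

library(lme4)

# 1. Observation-level random effect to absorb extra-Poisson variation
Data$obs <- factor(seq_len(nrow(Data)))
fullmodel_olre <- glmer(x ~ y*z + (1 | Replicate) + (1 | obs),
                        family = poisson, data = Data)

# 2. Negative binomial instead of quasi-Poisson, which glmer does not support
fullmodel_nb <- glmer.nb(x ~ y*z + (1 | Replicate), data = Data)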

Remove estimated effect of independent variable on dependent variables (interval/ratio) using linear regression

Question:
Can we use unstandardized coefficients derived from linear regression to remove the effect of an independent variable on the dependent variable?
I have a large dataset and suspect that an (unwanted) independent variable (IV) influences the hundreds of dependent variables (DVs) under research. All variables are ratio/interval. I need to remove this effect before continuing further analysis and would like to create a new, corrected dataset. My idea has been to calculate the regression coefficient, through linear regression, between the IV and each of the DVs. If the effect of the IV (X) on a DV (Y) turns out to be significant, I will calculate a new estimated Y by subtracting the regression coefficient multiplied by the IV value, giving a corrected Y value for the new dataset.
Y_new = Y_old - b*X
Y = dependent variable
X = independent variable
b = unstandardized regression coefficient
Is this method appropriate? And what should I do for IV-DV pairs that are not significantly correlated?
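A sketch of this residualization idea for a single DV, with made-up column names (dat$Y, dat$X):

fit <- lm(Y ~ X, data = dat)
b   <- coef(fit)["X"]

# Subtract only b*X, as in the formula above (keeps the original scale)
dat$Y_new <- dat$Y - b * dat$X

# Equivalent up to the intercept a: resid(fit) equals Y - (a + b*X)
dat$Y_resid <- resid(fit)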
