One of my random effects in REML causes error - r

Thank you for your time. I have been reading through some of the threads where the error message is
"Downdated VtV is not positive definite"
My response variable is numeric (latency to display in seconds).
Fixed effects are sexC2, Mixed (yes or no), AscoreQ (4 categories), WinnerC1 (yes or no)
The random effect is DyadC2, which is a factor, with a different number for each pair of animals.
This is the code I'm using:
lmm <- lmer(displayC2 ~ sexC2 + Mixed + WinnerC1 + AscoreQ + WinnerC1*AscoreQ + (1|DyadC2))
It works if I don't put in the random variable, but when I put this in, I get the output:
"Error in fn(x, ...) : Downdated VtV is not positive definite"
Please could you let me know if you have any ideas why this is not working?
Thank you

This is not a definitive answer, but a probable issue is that you have zero variance somewhere (for example, all-0 or all-1 categories in the data). It is impossible to test without the data. Maybe this discussion could help you.
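A minimal sketch of how one might look for such degenerate categories, using base R only. The data frame `toy` and its values are hypothetical stand-ins for the asker's data; only the column names come from the question.

```r
# Hypothetical toy data; the real data set is not available.
toy <- data.frame(
  WinnerC1 = factor(c("yes", "no", "yes", "no", "yes", "no")),
  AscoreQ  = factor(c("1", "1", "2", "2", "3", "3"),
                    levels = c("1", "2", "3", "4")),
  DyadC2   = factor(c(1, 1, 2, 2, 3, 3))
)

# Factor levels that never occur in the data
level_counts <- table(toy$AscoreQ)
empty_levels <- names(level_counts)[level_counts == 0]

# Empty cells in the WinnerC1 x AscoreQ interaction
cell_counts   <- table(toy$WinnerC1, toy$AscoreQ)
n_empty_cells <- sum(cell_counts == 0)

# Number of observations per dyad (the random-effect grouping factor)
per_dyad <- table(toy$DyadC2)

print(empty_levels)
print(cell_counts)
print(per_dyad)
```

Empty levels or empty interaction cells are worth ruling out before suspecting the random effect itself; droplevels() removes unused levels from a factor.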

Related

Error during wrapup: Number of parameters to be estimated is larger than the number of observations

I am running a meta-regression analysis in metafor package and my code is as follows
rma(yi, vi, mods = ~ duration_exp + feedstock_rename
+ temp_group + application_rate + fertilizer_app_rate +
manure_app_rate + compost_app_rate + continent
+ feedstock_rename + annual_temp, method = "REML", data=.)
When I run this code I get the following error
Error during wrapup: Number of parameters to be estimated is larger than the number of observations.
But when I remove "annual_temp" from the model it runs perfectly. I am not sure what is the problem. "annual_temp" is a continuous variable. Could you please suggest how I can solve this error? Thanks in advance
You are trying to fit a model that has more parameters than you actually have observations. If it runs when you remove "annual_temp", then apparently that just pushes the number of parameters under the threshold, but the model is going to be useless as it will totally overfit. To solve this problem, you need to reduce the model complexity.
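A rough way to compare the two counts before fitting, sketched in base R on hypothetical toy data (the values and the reduced set of moderators are made up). It assumes that rows with a missing moderator are dropped before fitting, which would also explain why adding a single continuous variable with NAs can tip the balance. Counting one extra parameter for the between-study variance is my assumption about how to tally a random-effects model.

```r
# Hypothetical toy data: 6 studies, one with a missing annual_temp.
toy <- data.frame(
  yi          = c(0.2, 0.5, 0.1, 0.4, 0.3, 0.6),
  continent   = factor(c("A", "B", "C", "D", "A", "B")),
  annual_temp = c(10, 12, 9, 15, NA, 11)
)

# model.matrix() drops rows with NAs (default na.action) and expands factors
X <- model.matrix(~ continent + annual_temp, data = toy)

n_coef   <- ncol(X)      # intercept + 3 dummy columns + 1 slope
n_usable <- nrow(X)      # rows left after dropping the NA row
n_params <- n_coef + 1   # assumed: plus one between-study variance

cat("coefficients:", n_coef, " usable rows:", n_usable, "\n")
cat("too many parameters:", n_params > n_usable, "\n")
```

Each factor moderator costs one coefficient per level beyond the first, so a few categorical moderators can use up a small data set quickly.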

glmmLasso error and warning

I am trying to perform variable selection in a generalized linear mixed model using glmmLasso, but am coming up with an error and a warning that I cannot resolve. The dataset is unbalanced, with some participants (PTNO) having more samples than others; no missing data. My dependent variable is binary, all other variables (besides the ID variable PTNO) are continuous.
I suspect something very generic is happening, but obviously fail to see it and have not found any solution in the documentation or on the web.
The code, which is basically just adapted from the glmmLasso soccer example is:
glm8 <- glmmLasso(Group~NDUFV2_dCTABL+GPER1_dCTABL+ ESR1_dCTABL+ESR2_dCTABL+KLF12_dCTABL+SP4_dCTABL+SP1_dCTABL+ PGAM1_dCTABL+ANK3_dCTABL+RASGRP1_dCTABL+AKT1_dCTABL+NUDT1_dCTABL+ POLG_dCTABL+ ADARB1_dCTABL+OGG_dCTABL+ PDE4B_dCTABL+ GSK3B_dCTABL+ APOE_dCTABL+ MAPK6_dCTABL, rnd = list(PTNO=~1),
family = poisson(link = log), data = stackdata, lambda=100,
control = list(print.iter=TRUE,start=c(1,rep(0,29)),q.start=0.7))
The error message is displayed below. Specifically, I do not believe there are any NAs in the dataset and I am unsure about the meaning of the warning regarding the factor variable.
Iteration 1
Error in grad.lasso[b.is.0] <- score.beta[b.is.0] - lambda.b * sign(score.beta[b.is.0]) :
NAs are not allowed in subscripted assignments
In addition: Warning message:
In Ops.factor(y, Mu) : ‘-’ not meaningful for factors
An abbreviated dataset containing the necessary variables is available in R format and can be downloaded here.
I hope I can be guided a bit as to how to go on with the analysis. Please let me know if there is anything wrong with the dataset or you cannot download it. ANY help is much appreciated.
Just to follow up on @Kristofersen's comment above. It is indeed the start vector that messes your analysis up.
If I run
glm8 <- glmmLasso(Group~NDUFV2_dCTABL+GPER1_dCTABL+ ESR1_dCTABL+ESR2_dCTABL+KLF12_dCTABL+SP4_dCTABL+SP1_dCTABL+ PGAM1_dCTABL+ANK3_dCTABL+RASGRP1_dCTABL+AKT1_dCTABL+NUDT1_dCTABL+ POLG_dCTABL+ ADARB1_dCTABL+OGG_dCTABL+ PDE4B_dCTABL+ GSK3B_dCTABL+ APOE_dCTABL+ MAPK6_dCTABL,
rnd = list(PTNO=~1),
family = binomial(),
data = stackdata,
lambda=100,
control = list(print.iter=TRUE))
then everything is fine and dandy (i.e., it converges and produces a solution). You have copied the example with poisson regression and you need to tweak the code to your situation. I have no idea about whether the output makes sense.
Quick note: I ran with the binomial distribution in the code above since your outcome is binary. If it makes sense to estimate relative risks then poisson may be reasonable (and it also converges), but you need to recode your outcome as the two groups are defined as 1 and 2 and that will certainly mess up the poisson regression.
In other words do a
stackdata$Group <- stackdata$Group-1
before you run the analysis.
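The warning about `-` not being meaningful for factors suggests that Group may be stored as a factor, in which case plain subtraction would return NAs with the same warning. A minimal sketch of the recoding under that assumption (the vector `group` is a hypothetical stand-in for stackdata$Group):

```r
# Hypothetical stand-in for stackdata$Group, stored as a factor with levels 1 and 2
group <- factor(c(1, 2, 2, 1, 2))

# Convert via character so the labels (not the internal codes) are used,
# then shift to 0/1
group01 <- as.numeric(as.character(group)) - 1

print(group01)
```

If Group is already numeric, the simple subtraction shown above is enough.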

Nested model in R

I'm having a huge problem with a nested model I am trying to fit in R.
I have a response time experiment with 2 conditions with 46 people each and 32 measures each. I would like measures to be nested within people and people nested within conditions, but I can't get it to work.
The code I thought should make sense was:
nestedmodel <- lmer(responsetime ~ 1 + condition +
(1|condition:person) + (1|person:measure), data=dat)
However, all I get is an error:
Error in checkNlevels(reTrms$flist, n = n, control) :
number of levels of each grouping factor must be < number of observations
Unfortunately, I do not even know where to start looking what the problem is here.
Any ideas? Please, please, please? =)
Cheers!
This might be more appropriate on CrossValidated, but: lme4 is trying to tell you that one or more of your random effects is confounded with the residual variance. As you've described your data, I don't quite see why: you should have 2*46*32=2944 total observations, 2*46=92 combinations of condition and person, and 46*32=1472 combinations of measure and person.
If you do
lf <- lFormula(responsetime ~ 1 + condition +
(1|condition:person) + (1|person:measure), data=dat)
and then
lapply(lf$reTrms$Ztlist,dim)
to look at the transposed random-effect design matrices for each term, what do you get? You should (based on your description of your data) see that these matrices are 1472 by 2944 and 92 by 2944, respectively.
As @MrFlick says, a reproducible example would be nice. Other things you could show us are:
fit the model anyway, using lmerControl(check.nobs.vs.nRE="ignore") to ignore the test, and show us the results (especially the random effects variances and the statement of the numbers of groups)
show us the results of with(dat, table(table(interaction(condition, person)))) to give information on the number of replicates per combination (and similarly for measure)
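The level counts can also be checked without lme4. The sketch below builds a hypothetical balanced data set matching the description (2 conditions, 46 people each, 32 measures each) and counts the levels of each grouping factor. It also illustrates one possible explanation, assuming person IDs are unique across conditions (92 distinct IDs rather than 1 to 46 reused): person:measure then has as many levels as there are observations, which is exactly what the check complains about. Whether the real data are coded this way is not known.

```r
# Hypothetical balanced design; person IDs 1..46 are reused in each condition.
dat <- expand.grid(measure = 1:32, person = 1:46, condition = c("A", "B"))

# Alternative coding: every person gets a globally unique ID (92 levels)
dat$person_unique <- interaction(dat$condition, dat$person, drop = TRUE)

n_obs <- nrow(dat)

n_cond_person    <- nlevels(interaction(dat$condition, dat$person, drop = TRUE))
n_person_measure <- nlevels(interaction(dat$person, dat$measure, drop = TRUE))
n_unique_measure <- nlevels(interaction(dat$person_unique, dat$measure, drop = TRUE))

cat("observations:            ", n_obs, "\n")
cat("condition:person levels: ", n_cond_person, "\n")
cat("person:measure levels:   ", n_person_measure, "\n")
cat("unique person:measure:   ", n_unique_measure, "\n")
```

With unique person IDs and one observation per person and measure, a (1|person:measure) term has one level per row and cannot be separated from the residual variance.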

Error message: Error in fn(x, ...) : Downdated VtV is not positive definite

I'm trying to use the lmer function to create a minimum adequate model. My model is Mated ~ Size * Attempts * Status + (random factor).
as.logical(Mated)
as.numeric(Size)
as.factor(Attempts)
as.factor(Status)
(These have all worked on previous models)
So after all that I try running my model:
Model1<-lmer(Mated ~ Size*Status*Attempts + (1|FemaleID),data=mydata)
And it can be submitted without fault. It's only when I try to apply this update that it goes wrong:
Model2<-update(Model1, REML=FALSE)
Here is the error message supplied:
Error in fn(x, ...) : Downdated VtV is not positive definite
If I make a third model without the interaction and do an ANOVA between that and model one, then it says the two are significantly different.
Model3<-update(Model1,~.-Size:Status:Attempts)
anova(Model1,Model3)
What am I doing wrong? Is the three way interaction really significant or have I made some mistake?
Thank you
If Mated is binary, then you should probably be using glmer with a logit or probit link function instead, something like:
model <- glmer(Mated ~ Size * Status * Attempts + (1|FemaleID),
data = mydata, family = binomial)
It would help if you could let us know what your data looks like (head(mydata) might be fine, or see here for how to make a reproducible example).
Also, I would avoid making Mated logical (see this question and answer for how it can make your life more difficult). Instead, as.factor(Mated) will explicitly make your response variable discrete.
After that, you can compare your full and reduced models with anova().
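One detail worth checking when converting variables: calls such as as.factor(Attempts) on their own return a converted copy and leave the data frame unchanged, so the result has to be assigned back. A minimal sketch on hypothetical toy data (`mydata_toy` and its values are made up):

```r
# Hypothetical toy data with the same column names as in the question
mydata_toy <- data.frame(Mated = c(TRUE, FALSE, TRUE), Attempts = c(1, 2, 1))

# Without assignment the converted copy is discarded
invisible(as.factor(mydata_toy$Attempts))
still_numeric <- is.numeric(mydata_toy$Attempts)

# Assign the result back so the model sees the converted columns
mydata_toy$Attempts <- as.factor(mydata_toy$Attempts)
mydata_toy$Mated    <- as.factor(mydata_toy$Mated)

str(mydata_toy)
```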

Odd behavior with step()

step() and stepAIC() produce a "remove missing values error" when running the code on data with missing values.
Error in step(mod1, direction = "backward") :
number of rows in use has changed: remove missing values?
According to ?step:
The model fitting must apply the models to the same dataset. This may be
a problem if there are missing values and R's default of na.action = na.omit
is used. We suggest you remove the missing values first.
I have a data frame with one variable which has four na values. However, when I run step on the lm object, I don't get the "missing values" error even though it has missing values. Can anyone tell me what could be going on?
> d1$Impressions
[1] NA NA NA 6924180 9313226 27888455 18213812 54557205 13495553
...
This does not produce an error message:
mod1 = lm(Leads ~ G + Con + GOO + DAY + Res + SD + ED +
ME + Impressions + Inc + Sea, data=d1)
step(mod1, direction="backward")
stepAIC(mod1)
Even with a variable which has missing values, it's not generating an error message. Any ideas on what's going on?
One reason for the stated behaviour is this. step() fits the full model and hence drops 3 (as stated) observations due to presence of NAs. As long as the variables for which there are NAs remain in the model, the lm() function will remove those observations at each step. If stepping stops before it removes a variable that would result in one of the previously removed observations remaining in the model, then no error will be raised, because the numbers of rows in the model matrix will not have changed.
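The row-count mechanism can be seen on toy data. The sketch below uses simulated, hypothetical values (`d`, `x1`, `x2` are made-up names) and shows that dropping the variable with NAs changes the number of rows used, which is the situation step() refuses to handle; filtering to complete cases first keeps the row count fixed.

```r
set.seed(1)
# Hypothetical toy data: x2 has three missing values
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
d$x2[1:3] <- NA

# lm() silently drops rows with NAs in any variable used by the model
full    <- lm(y ~ x1 + x2, data = d)
reduced <- lm(y ~ x1, data = d)
n_full    <- nobs(full)     # rows used while x2 is in the model
n_reduced <- nobs(reduced)  # rows used once x2 is dropped

# Removing incomplete rows first keeps the data set fixed across models
d_complete  <- d[complete.cases(d), ]
full_cc     <- lm(y ~ x1 + x2, data = d_complete)
reduced_cc  <- lm(y ~ x1, data = d_complete)

cat("full:", n_full, " reduced:", n_reduced, "\n")
cat("complete cases:", nobs(full_cc), nobs(reduced_cc), "\n")
```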
As an aside, stepwise selection like this is considered to be of somewhat dubious validity. Not least, in using it you are making a fairly bold statement that the effects of the eliminated variables are exactly equal to zero. This also has the effect of biasing the estimated coefficients of the variables retained in the model towards larger absolute values.
Alternatives to this stepwise selection include shrinkage methods such as the Lasso and the Elastic Net.
