Multinomial logit with random effects does not converge using mblogit in R

I would like to estimate a random effects (RE) multinomial logit model.
I have been applying mblogit from the mclogit package. However, once I introduce RE into my model, it fails to converge.
Is there a workaround for this?
For instance, I tried to adjust the fitting process of mblogit by increasing the maximal number of iterations (maxit), but I did not succeed in writing the correct syntax for the control function. Would this be the right approach? And if so, could you advise me on how to implement it in my model, which so far looks as follows:
meta.mblogit <- mblogit(Migration ~ ClimateHazard4, weights = logNsquare,
                        data = meta.df, subset = Panel == 1,
                        random = ~ 1 | StudyID)
Here, both variables (Migration and ClimateHazard4) are factor variables.
Or is there an alternative approach you could recommend for estimating a RE multinomial logit?
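For reference, a minimal sketch of what passing a control object might look like, assuming the current mclogit API (recent versions use mmclogit.control() for models with random effects; older versions use mclogit.control(), so adjust to your installed version):

```r
library(mclogit)

# Hypothetical sketch: raise the iteration cap via the control argument.
# mmclogit.control() is assumed here for the random-effects fit; swap in
# mclogit.control() if your package version uses that instead.
meta.mblogit <- mblogit(Migration ~ ClimateHazard4, weights = logNsquare,
                        data = meta.df, subset = Panel == 1,
                        random = ~ 1 | StudyID,
                        control = mmclogit.control(maxit = 100))
```

If the model still fails to converge after raising maxit, the problem may lie in the random-effects structure or sparse cells rather than the iteration limit.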
Thank you very much!

Related

Is there a way to include an autocorrelation structure in the gam function of mgcv?

I am building a model using the mgcv package in R. The data has serial measures (data collected during scans 15 minutes apart in time, but discontinuously, e.g. there might be 5 consecutive scans on one day, and then none until the next day, etc.). The model has a binomial response, a random effect of day, a fixed effect, and three smooth effects. My understanding is that REML is the best fitting method for binomial models, but that this method cannot be specified using the gamm function for a binomial model. Thus, I am using the gam function to allow for the use of REML fitting. When I fit the model, I am left with residual autocorrelation at a lag of 2 (i.e. at 30 minutes), assessed using ACF and PACF plots.
So, we wanted to include an autocorrelation structure in the model, but my understanding is that only the gamm function and not the gam function allows for the inclusion of such structures. I am wondering if there is anything I am missing and/or if there is a way to deal with autocorrelation with a binomial response variable in a GAMM built in mgcv.
My current model structure looks like:
gam(Response ~
s(Day, bs = "re") +
s(SmoothVar1, bs = "cs") +
s(SmoothVar2, bs = "cs") +
s(SmoothVar3, bs = "cs") +
as.factor(FixedVar),
family=binomial(link="logit"), method = "REML",
data = dat)
I tried thinning my data (using only every 3rd data point from consecutive scans), but found this overly restrictive: with only 42 data points left after thinning, my sample was too small for effects to be detected.
I also tried using the prior value of the binomial response variable as a factor in the model to account for the autocorrelation. This did appear to resolve the residual autocorrelation (based on the updated ACF/PACF plots), but it doesn't feel like the most elegant way to do so and I worry this added variable might be adjusting for more than just the autocorrelation (though it was not collinear with the other explanatory variables; VIF < 2).
I would use bam() for this. You don't need big data to fit a model with bam(); you just lose some of the guarantees about convergence that you get with gam(). bam() will fit a GEE-like model with an AR(1) working correlation matrix, but you need to specify the AR parameter via rho. This only works for non-Gaussian families if you also set discrete = TRUE when fitting the model.
You could use gamm() with family = binomial() but this uses PQL to estimate the GLMM version of the GAMM and if your binomial counts are low this method isn't very good.
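A sketch of the bam() approach described above, mirroring the question's model. The rho value is a placeholder (e.g. read off the lag-1 ACF of the residuals from the original gam() fit), and startOfRun is an assumed logical column flagging the first scan of each uninterrupted series:

```r
library(mgcv)

# dat$startOfRun is assumed: TRUE at the first scan of each consecutive
# run of 15-minute scans, FALSE otherwise. rho = 0.5 is a placeholder
# AR(1) coefficient, not an estimate from the actual data.
m <- bam(Response ~
           s(Day, bs = "re") +
           s(SmoothVar1, bs = "cs") +
           s(SmoothVar2, bs = "cs") +
           s(SmoothVar3, bs = "cs") +
           as.factor(FixedVar),
         family = binomial(link = "logit"),
         data = dat,
         rho = 0.5, AR.start = dat$startOfRun,
         discrete = TRUE)
```

It is worth refitting with a few candidate rho values and rechecking the ACF/PACF of the standardized residuals each time.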

non nested model comparison using R

To explain my problem, I simulated this data in R:
require(splines)
x <- rnorm(20, 0, 1)
y <- rep(c(0, 1), times = 10)
First I fitted a regular (linear effects) logistic regression model.
fit1 <- glm(y ~ x, family = "binomial")
Then, to check for non-linear effects, I fitted this natural spline model.
fit2 <- glm(y ~ ns(x, df = 2), family = "binomial")
Based on my thinking, I believe these two models are non-nested.
Next I wanted to check whether the non-linear model (fit2) has any significant effects compared to the regular logistic model (fit1).
Is there any way to compare these two models? I believe I cannot use the lrtest function in the lmtest package, because these two models are not nested.
Any suggestions will be highly appreciated.
Thank you.
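One common route for comparing non-nested models fitted to the same data is information criteria; Vuong's test is another option. A minimal sketch (the vuongtest() call assumes the nonnest2 package, which implements Vuong's test for non-nested models):

```r
# Information criteria can be compared across non-nested models
# fitted to the same response.
AIC(fit1, fit2)
BIC(fit1, fit2)

# Alternatively, a formal non-nested comparison via Vuong's test
# (assumes the nonnest2 package is installed).
library(nonnest2)
vuongtest(fit1, fit2)
```

With only 20 simulated points and a pure-noise relationship between x and y, none of these comparisons should favor the spline model, which makes this a reasonable sanity check.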

I get an error with functions of nlme package in R

I am trying to fit a linear growth model (LGM) in R, and I understand that the primary steps would be to fit a null model with time as a predictor of my dependent variable Y (allowing for random effects) and a null model not allowing for random effects, then compare the two and see whether the random effect is strong enough to justify the use of the model with a random intercept.
I managed to fit the model with a random intercept with the lmer function of the lme4 package, but I can't find a function in that package that allows me to fit a model without a random intercept.
I have tried to fit models both with a random intercept (the lme function) and without (the gls function) from the nlme package, but neither of them has been working for me.
My original code was:
library(nlme)
LMModel <- lme(Y ~ Time, random = ~ Time | ID, data = dataset,
               method = "ML")
and running that, I got an error saying "missing values in object" (apparently referring to my Time variable). I thus added a transformation of my dataset into a matrix with "matr <- as.matrix(dataset)" and added the missing data management part to my code, which ended up being:
LMModel <- lme(Y~Time, random=~Time| ID, data=dataset,
method="ML", na.action = na.exclude(matr))
Running this, I get the error: ' could not find function "1" '
I further tried to fit a model with no random effect with the gls function of nlme and got the exact same error.
I feel quite lost as I can't seem to figure out what that function 1 means. Any ideas of what might be happening here?
Thanks a lot in advance for the help!
Federico
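One likely culprit, offered as an assumption rather than a confirmed diagnosis: na.action expects a handler function such as na.exclude, not the result of calling it on a matrix, so na.action = na.exclude(matr) passes an evaluated object where a function is expected. A sketch of that fix, using the variable names from the question:

```r
library(nlme)

# Pass the na.action handler itself, rather than na.exclude(matr):
LMModel <- lme(Y ~ Time, random = ~ Time | ID, data = dataset,
               method = "ML", na.action = na.exclude)

# The model without random effects, for comparison:
GLSModel <- gls(Y ~ Time, data = dataset,
                method = "ML", na.action = na.exclude)

# Likelihood-ratio comparison of the two fits:
anova(GLSModel, LMModel)
```

If the original "missing values in object" error persists, it is worth checking for NAs in Time and ID directly (e.g. with summary(dataset)) before fitting.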

coxme proportional hazard assumption

I am running mixed effect Cox models using the coxme function {coxme} in R, and I would like to check the assumption of proportional hazard.
I know that the PH assumption can be verified with the cox.zph function {survival} on cox.ph model.
However, I cannot find the equivalent for coxme models.
In 2015 a similar question was posted here, but received no answer.
My questions are:
1) How can I test the PH assumption on a mixed-effect Cox model fitted with coxme?
2) If there is no equivalent of cox.zph for coxme models, is it valid for publication in a scientific article to run a mixed-effect coxme model but test the PH assumption on a coxph model identical to the coxme model but without the random effect?
Thanks in advance for your answers.
Regards
You can use the frailty option in the coxph function. Let's say your random-effect variable is B and your fixed-effect variable is A. Then you fit your model as below:
myfit <- coxph(Surv(Time, Censor) ~ A + frailty(B), data = mydata)
Now, you can use cox.zph(myfit) to test the proportional hazard assumption.
I don't have enough reputation to comment, but I don't think using the frailty option in the coxph function will work. The cox.zph documentation says:
Random effects terms such a frailty or random effects in a coxme model are not checked for proportional hazards, rather they are treated as a fixed offset in model.
Thus, it's not taking the random effects into account when testing the proportional hazards assumption.

Random forest evaluation in R

I am a newbie in R and I am trying to do my best to create my first model. I am working on a 2-class random forest project, and so far I have programmed the model as follows:
library(randomForest)
set.seed(2015)
randomforest <- randomForest(as.factor(goodkit) ~ ., data = training1,
                             importance = TRUE, ntree = 2000)
varImpPlot(randomforest)
prediction <- predict(randomforest, test,type='prob')
print(prediction)
I am not sure why I don't get the overall prediction for my model; I must be missing something in my code. I get the OOB error and the per-case predictions for the test set, but not the overall prediction of the model.
library(pROC)
auc <-roc(test$goodkit,prediction)
print(auc)
This doesn't work at all.
I have been through the pROC manual, but I cannot understand all of it. It would be very helpful if anyone could help with the code or post a link to a good practical example.
Using the ROCR package, the following code should work for calculating the AUC:
library(ROCR)
predictedROC <- prediction(prediction[,2], as.factor(test$goodkit))
as.numeric(performance(predictedROC, "auc")@y.values)
Your problem is that predict on a randomForest object with type = 'prob' returns two predictions: each column contains the probability of belonging to one class (for binary prediction).
You have to decide which of these predictions to use to build the ROC curve. Fortunately, for binary classification they are identical (just reversed):
auc1 <- roc(test$goodkit, prediction[,1])
print(auc1)
auc2 <- roc(test$goodkit, prediction[,2])
print(auc2)
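As a usage note, the numeric AUC can be pulled out of a pROC roc object with auc(), which may be closer to the single "overall" number the question is after (sketch based on the pROC API):

```r
library(pROC)

# Build the ROC curve from the positive-class probability column,
# then extract the AUC as a plain numeric value.
roc_obj <- roc(test$goodkit, prediction[,2])
as.numeric(auc(roc_obj))
```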
