How do I combine the posterior distribution of the parameters with new data to get predictions in R-INLA?

I fit a model using R-INLA, with a mix of linear and non-linear effects (but nothing spatial). I understand I can draw samples from the posterior distribution of the parameters with the inla.posterior.sample() function, but I can't work out how to combine this model fit with new data in order to make predictions for that new data. Any help is greatly appreciated.
Thanks,
Erin
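
The usual trick (a sketch, not from this thread): append the new covariate rows to the original data with the response set to NA and refit; INLA treats NA responses as unobserved and returns their posterior predictive summaries among the fitted values. A minimal sketch with hypothetical names (olddata, newdata, y, x1, x2):
library(INLA)
# Hypothetical example: y is the response, x1 a linear effect, x2 a
# non-linear (second-order random walk) effect
newdata$y <- NA                      # unobserved responses to be predicted
combined <- rbind(olddata, newdata)  # stack new rows under the fitting data
fit <- inla(y ~ x1 + f(x2, model = "rw2"),
            data = combined,
            control.predictor = list(compute = TRUE, link = 1))
# Fitted values for the NA rows are the predictions for the new data
pred_idx <- which(is.na(combined$y))
fit$summary.fitted.values[pred_idx, ]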

Related

How to use the Brant test with survey ordinal logistic regression (svyolr) to test the PO assumption in complex survey data?

I am using the R survey package to estimate an ordinal logistic regression on complex survey data using the function svyolr(). I would like to test the proportional odds assumption with this weighted data, but brant::brant() only seems to work with unweighted models generated using MASS::polr(). Is anyone aware of how to test the PO assumption with an svyolr model?
Brant test references, using brant::brant() on a MASS::polr() model:
https://cran.r-project.org/web/packages/brant/brant.pdf
https://conservancy.umn.edu/bitstream/handle/11299/166205/ThomasA_TheProportionalOddsModel.pdf?sequence=1
Thanks so much!
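
One informal workaround (my own sketch, not an established test for svyolr): the Brant test amounts to comparing coefficients across the cumulative binary logits, so you can fit each split with svyglm() and check whether the weighted coefficients are stable. A sketch assuming hypothetical names (dat, psu, strat, wt, predictor x) and an outcome y coded 1..K:
library(survey)
# Hypothetical complex design
des <- svydesign(ids = ~psu, strata = ~strat, weights = ~wt, data = dat)
# One weighted binary logit per cumulative split y >= j; under proportional
# odds the coefficient on x should be roughly constant across the splits
cuts <- sort(unique(dat$y))[-1]
coefs <- sapply(cuts, function(j) {
  fit <- svyglm(I(y >= j) ~ x, design = des, family = quasibinomial())
  coef(fit)["x"]
})
print(coefs)  # informal check only; large differences suggest PO is violated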

Subject specific prediction from heterogenous linear mixed effect model package (lcmm)

I am fitting a heterogeneous linear mixed effects model from the lcmm package in R. Currently I only get the class-specific and the weighted subject-specific predictions from the predictY function, but I want fully subject-specific predictions. Is there a way to construct subject-specific predictions with this package? Any help is appreciated.
I have found the answer. predictY gives the class-specific mean predictions; multiplying each subject's random effects (from ranef(model)) by the model's design matrix for the random part and adding the result to those means gives the subject-specific predictions.
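
A sketch of that recipe with hypothetical names (fit, dat with columns id and time, and a random intercept plus random slope on time; adjust the design matrix to your model):
library(lcmm)
# Class-specific mean predictions for each row of dat; for ng > 1 pick the
# column of the class of interest
pred <- predictY(fit, newdata = dat, var.time = "time")$pred
# Subject-level random effects, one row per subject, as the poster describes
b <- as.matrix(ranef(fit))
# Design matrix of the random part: here a random intercept and slope on time
Z <- cbind(1, dat$time)
# Subject-specific prediction = class-specific mean + Z_i %*% b_i,
# assuming subject ids in dat$id index the rows of b
dat$pred_subj <- pred[, 1] + rowSums(Z * b[dat$id, ])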

Random forest evaluation in R

I am a newbie in R, trying my best to create my first model. I am working on a two-class random forest project, and so far I have programmed the model as follows:
library(randomForest)
set.seed(2015)
randomforest <- randomForest(as.factor(goodkit) ~ ., data = training1,
                             importance = TRUE, ntree = 2000)
varImpPlot(randomforest)
prediction <- predict(randomforest, test, type = "prob")
print(prediction)
I am not sure why I don't get the overall prediction for my model; I must be missing something in my code. I get the OOB estimate and the prediction for each case in the test set, but not an overall evaluation of the model.
library(pROC)
auc <- roc(test$goodkit, prediction)
print(auc)
This doesn't work at all.
I have been through the pROC manual but I cannot get to understand everything. It would be very helpful if anyone can help with the code or post a link to a good practical sample.
Using the ROCR package, the following code should work for calculating the AUC:
library(ROCR)
predictedROC <- prediction(prediction[, 2], as.factor(test$goodkit))
as.numeric(performance(predictedROC, "auc")@y.values)
Your problem is that predict() on a randomForest object with type='prob' returns a two-column matrix: each column contains the predicted probability of belonging to one of the two classes.
You have to decide which of these columns to use to build the ROC curve. Fortunately, for binary classification the two columns carry the same information (one is just 1 minus the other), so either one yields the same AUC:
auc1 <- roc(test$goodkit, prediction[, 1])
print(auc1)
auc2 <- roc(test$goodkit, prediction[, 2])
print(auc2)

Using stepAIC to make out of sample predictions

Just a quick question on using stepAIC to make predictions. I'm a beginner in R, so please pardon me if the solution is obvious; I tried searching around but couldn't really find what I was looking for.
I'm trying to predict the response variable after running stepwise AIC on a main model (the main model has all the explanatory variables). stepAIC returns a new model with a reduced number of variables. My question is how to make an out-of-sample prediction using the new reduced model; in other words, how do I reduce the dataset so that when I feed it into predict.lm, it only has the variables that were selected for the reduced model?
Here's my code below:
# Specify start and end row of the first 5 year window
start_row=1
end_row=60
#declare matrix that will contain the predicted returns by specifying dimensions
predicted=matrix(0,179,7)
y_var=as.matrix(orig_data[start_row:end_row,2:7])
x_var=as.matrix(orig_data[start_row:end_row,8:27])
# Perform linear regression on all factors and then select factors using stepwise AIC
library(MASS)  # provides stepAIC()
initial_model <- lm(y_var[,1]~x_var[,1]+x_var[,2]+x_var[,3]+x_var[,4]+x_var[,5]+x_var[,6]+x_var[,7]+x_var[,8]+x_var[,9]+x_var[,10]+x_var[,11]+x_var[,12]+x_var[,13]+x_var[,14]+x_var[,15]+x_var[,16]+x_var[,17]+x_var[,18]+x_var[,19]+x_var[,20])
reduced_model <- stepAIC(initial_model, direction="both")
reduced_coefs<-t(as.matrix(coef(reduced_model)))
x_input<-as.matrix(x_var[60,])
Basically, how do I multiply the coefficients from the reduced model by only the corresponding explanatory variables in "x_var" (which contains all the explanatory variables)?
Thanks a lot for your help!
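
One way around the bookkeeping (a sketch, assuming orig_data is a data frame): skip the matrix interface and fit from a data frame with named columns; predict.lm() then matches variables by name and uses only the terms that stepAIC kept, so you never have to subset x_var yourself:
library(MASS)
# Response and all 20 predictors in one data frame with stable column names
train <- data.frame(y = orig_data[start_row:end_row, 2],
                    orig_data[start_row:end_row, 8:27])
initial_model <- lm(y ~ ., data = train)
reduced_model <- stepAIC(initial_model, direction = "both", trace = FALSE)
# Out-of-sample prediction: newdata may contain all predictors; predict()
# picks out only the columns named in the reduced formula
new_x <- orig_data[end_row + 1, 8:27]
predict(reduced_model, newdata = new_x)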

How do I plot predictions from new data fit with gee, lme, glmer, and gamm4 in R?

I have fit my discrete count data using a variety of functions for comparison. I fit a GEE model using geepack, a linear mixed effect model on the log(count) using lme (nlme), a GLMM using glmer (lme4), and a GAMM using gamm4 (gamm4) in R.
I am interested in comparing these models and would like to plot the expected (predicted) values for a new set of data (predictor variables). My goal is to compare the predicted effects for each model under particular conditions (x variables). Of particular interest is the comparison between marginal (GEE) and conditional estimates.
I think my main problem might be getting the new data in the correct form with the correct labels and attributes and such. I am still very much an R novice and struggle with this stuff (no course on this at my university unfortunately).
I currently have fitted models
gee1, lme1, lmer1, and gamm1
and can extract their fixed effect coefficients and standard errors without a problem. I also don't have a problem converting them from the log scale or estimating confidence intervals accounting for the random effects.
I also have my new dataframe newdat which has 365 observations of 23 variables (average environmental data for each day of the year).
I am stuck on how to predict new count estimates from this. I played around with the model.matrix function but couldn't get it to work. For example, I tried:
mm <- model.matrix(terms(glmm1), newdat)
# Error in model.frame.default(object, data, xlev = xlev) :
#   object is not a matrix
newdat$pcount <- mm %*% fixef(glmm1)
Any suggestions or good references would be greatly appreciated. Can anyone help with the error above?
Getting predictions for lme() and lmer() is documented on http://glmm.wikidot.com/faq
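
For the error above, a sketch of the recipe described in that FAQ (assuming glmm1 is the glmer fit with a log link and newdat contains every fixed-effect predictor); dropping the response from the terms object often avoids the model.frame complaint when newdat carries no response column:
# Fixed-effects design matrix for the new data
mm <- model.matrix(delete.response(terms(glmm1)), newdat)
eta <- drop(mm %*% fixef(glmm1))   # linear predictor on the log scale
newdat$pcount <- exp(eta)          # expected counts, random effects at zero
# Approximate 95% intervals from the fixed-effect covariance matrix only
pvar <- diag(mm %*% tcrossprod(as.matrix(vcov(glmm1)), mm))
newdat$lo <- exp(eta - 1.96 * sqrt(pvar))
newdat$hi <- exp(eta + 1.96 * sqrt(pvar))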
