Comparing model DICs across multiple MCMCglmm models - r

So I'm looking for a quick way to create a table of DIC scores from MCMCglmm models in R. I've run 10 different models and could extract the DIC from each separately using the following code, where the model is called m1:
m1.DIC <- m1$DIC
But then I have to do this for each model, and then create the dataframe, which is tedious. I've looked at the documentation for the MCMCglmm package and haven't found any hints about whether I can get a quick summary across models through some built-in function. Is there one? Is there another package that can do this? I know the rethinking package uses compare to get quick and easy model comparisons, but this doesn't appear to work with MCMCglmm outputs, as I get the following error message:
> compare(m1, m2, WAIC=FALSE)
Error in UseMethod("logLik") :
  no applicable method for 'logLik' applied to an object of class "MCMCglmm"
In addition: Warning message:
In DIC(z, n = n) :
  No specific DIC method for object of class MCMCglmm. Returning AIC instead.
Is there a similar method that will work to compare MCMCglmm models?
EDIT: Also note that the compare function in rethinking calculates weights for the models, from the DIC. Maybe this just doesn't exist in a form that works with the MCMCglmm package.
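In the meantime, a quick manual workaround (not a built-in; the model names below stand in for however many models you have fitted) is to collect the models in a list and pull the DICs out with sapply:

```r
# Assumes m1, m2, m3, ... are fitted MCMCglmm objects; shown here for three
models <- list(m1 = m1, m2 = m2, m3 = m3)
DICs <- data.frame(model = names(models),
                   DIC = sapply(models, function(m) m$DIC))
DICs[order(DICs$DIC), ]   # rank models from best (lowest DIC) to worst
```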

If you want to generate and compare a list of possible models from scratch you can use the dredge function in the MuMIn package http://cran.at.r-project.org/web/packages/MuMIn/MuMIn.pdf which supports MCMCglmm objects.
First you need to make the MCMCglmm call updateable though (so that dredge can change the model composition):
MCMCglmm.updateable <- updateable(MCMCglmm)
Then you can run your global model:
global.model <- MCMCglmm.updateable(y ~ x1 + ...etc.)
The call to dredge would then be something like
dredge.MCMCglmm <- dredge(global.model, rank = "DIC", ...)
You can also get standardized coefficients and adjusted R^2.
If you already have a list of fitted models you can use model.sel (in the same package) to generate a ranked table with model weights etc:
model.sel(model1, model2, model3, rank="DIC")
Good luck!
Best,
Adrian

Related

R save xgb model command error: 'model must be xgb.Booster'

'bst' is the name of an xgboost model that I built in R. It gives me predicted values for the test dataset using this code. So it is definitely an xgboost model.
pred.xgb <- predict(bst , xdtest) # get prediction in test sample
cor(ytestdata, pred.xgb)
Now, I would like to save the model so another can use the model with their data set which has the same predictor variables and the same variable to be predicted.
Consistent with page 4 of xgboost.pdf, the documentation for the xgboost package, I use the xgb.save command:
xgb.save(bst, 'xgb.model')
which produces the error:
Error in xgb.save(bst, "xgb.model") : model must be xgb.Booster.
Any insight would be appreciated. I searched Stack Overflow and could not locate relevant advice.
Mike
It's hard to know exactly what's going on without a fully reproducible example. But just because your model can make predictions on the test data, it doesn't mean it's an xgboost model. It can be any type of model with a predict method.
You can try class(bst) to check the class of your bst object. It should return "xgb.Booster", though I suspect it won't here (hence your error).
On another note, if you want to pass your model to another person using R, you can just save the R object rather than exporting to binary, via:
save(bst, file = "model.RData")
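A closely related option from base R is saveRDS(), which serialises a single object and lets whoever loads it bind the model to any name they like (the file name here is illustrative):

```r
saveRDS(bst, "xgb_model.rds")      # write the single object to disk
bst2 <- readRDS("xgb_model.rds")   # restore it under a new name elsewhere
```

Either way, the receiving session needs the xgboost package loaded before calling predict() on the restored model.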

Is there any way to dredge a PGLMM_compare model in R (phyr and MuMIn packages)?

I am doing a comparative analysis, and my response variables are 0 or 1, therefore I need a phylogenetically-corrected analysis with a binomial error distribution. I used the pglmm_compare function from the phyr package (https://rdrr.io/github/daijiang/phyr/man/pglmm_compare.html) to create a full model with all of my variables, but MuMIn does not accept this output as a 'global model', so I cannot dredge it. I am looking for a way to find the best models and possibly perform model averaging on them; however, it seems that these packages are not compatible. It would be difficult to create all the models by hand, since I have ~8 explanatory variables. Is there any way of dredging a phylogenetic model with binomial error structure? Thanks in advance.
You would need to implement at least the following methods for dredge and model.avg to work with pglmm_compare:
nobs.pglmm_compare(object, ...)
logLik.pglmm_compare(object, ...)
coef.pglmm_compare(object, ...)
coefTable.pglmm_compare(model, ...)
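A rough sketch of what those S3 stubs might look like. Be warned that the component names used here ($logLik, $B, $B.se, $data) are guesses about the pglmm_compare object, not its documented structure; inspect str() of a real fitted object and substitute the actual names:

```r
# Hypothetical stubs -- slot names are assumptions, check str(your_fit)
logLik.pglmm_compare <- function(object, ...) {
  structure(object$logLik, df = nrow(object$B), class = "logLik")
}
nobs.pglmm_compare <- function(object, ...) nrow(object$data)
coef.pglmm_compare <- function(object, ...) object$B
coefTable.pglmm_compare <- function(model, ...) {
  cbind(Estimate = as.vector(model$B),
        "Std. Error" = as.vector(model$B.se))
}
```

Once methods like these are defined in the session, dredge() and model.avg() dispatch to them automatically.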

Using a 'gbm' model created in R package 'dismo' with functions in R package 'gbm'

This is a follow-up to a previous question I asked a while back that was recently answered.
I have built several gbm models with dismo::gbm.step, which relies on the gbm fitting functions found in R package gbm, as well as cross validation tools from R package splines.
As part of my analysis, I would like to use some of the graphical tools available in R (e.g. perspective plots) to visualize pairwise interactions in the data. Both the gbm and the dismo packages have functions for detecting and modelling interactions in the data.
The implementation in dismo is explained in Elith et al. (2008) and returns a statistic which indicates departures of the model predictions from a linear combination of the predictors, while holding all other predictors at their means.
The implementation in gbm uses Friedman's H statistic (Friedman & Popescu, 2005), returns a different metric, and does NOT set the other variables at their means.
The interactions modelled and plotted with dismo::gbm.interactions are great and have been very informative. However, I would also like to use gbm::interact.gbm, partly for publication strength and also to compare the results from the two methods.
If I try to run gbm::interact.gbm on a gbm.object created with dismo, an error is returned:
"Error in is.factor(data[, x$var.names[j]]) :
argument "data" is missing, with no default"
I understand dismo::gbm.step adds extra data the authors thought would be useful to the gbm model.
I also understand that the answer to my question lies somewhere in the source code.
My question is...
Is it possible to modify a gbm object created in dismo to be used with gbm::interact.gbm? If so, would this be accomplished by...
a. Modifying the gbm object created in dismo::gbm.step?
b. Modifying the source code for gbm::interact.gbm?
c. Doing something else?
I will be going through the source code trying to solve this myself, if I come up with a solution before anyone answers I will answer my own question.
The gbm::interact.gbm function requires data as an argument: interact.gbm <- function(x, data, i.var = 1, n.trees = x$n.trees).
The dismo gbm.object is essentially the same as the gbm gbm.object, but with extra information attached so I don't imagine changing the gbm.object would help.
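Given that signature, the most direct route is probably (c): keep the dismo-fitted object and simply supply the training data yourself when calling interact.gbm. The object and data names below are placeholders for your own:

```r
library(gbm)
# 'my.model' is the gbm.object returned by dismo::gbm.step and
# 'my.data' is the data frame it was fitted to (both placeholders)
h <- interact.gbm(my.model, data = my.data,
                  i.var = c(1, 2),            # the pair of predictors to test
                  n.trees = my.model$n.trees)
h   # Friedman's H statistic for the chosen pair
```

The error in the question arises because data has no default, not because the dismo object is malformed, so no modification of either the object or the source code should be needed.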

what is the difference between lmFit and rlm

I want to use robust limma on my microarray data and R's user guide says rlm is the correct function to use according to:
http://rss.acs.unt.edu/Rdoc/library/limma/html/mrlm.html
I currently have:
lmFit(ExpressionMatrix, design, method = "robust", na.omit=T)
I can see that I chose the method to be robust. Does that mean that rlm will be called by this lmFit? And if I want it not to be robust, what method should I use?
The help page says:
The function mrlm is used if method="robust".
And then goes on:
If method="ls", then gls.series is used if a correlation structure has been specified, i.e., if ndups>1 or block is non-null and correlation is different from zero. If method="ls" and there is no correlation structure, lm.series is used.
If you follow the links from the help page for lmFit (06.LinearModels):

Fitting Models

The main function for model fitting is lmFit. This is the recommended interface for most users. lmFit produces a fitted model object of class MArrayLM containing coefficients, standard errors and residual standard errors for each gene. lmFit calls one of the following three functions to do the actual computations:

lm.series: Straightforward least squares fitting of a linear model for each gene.

mrlm: An alternative to lm.series using robust regression as implemented by the rlm function in the MASS package.

gls.series: Generalized least squares taking into account correlations between duplicate spots (i.e., replicate spots on the same array) or related arrays. The function duplicateCorrelation is used to estimate the inter-duplicate or inter-block correlation before using gls.series.
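So, to answer the second half of the question: yes, method = "robust" routes the fit through mrlm (and hence MASS::rlm), and for an ordinary non-robust fit you use the default least-squares method. Reusing the objects from the question:

```r
library(limma)
# Default (non-robust) fit: dispatches to lm.series, or to gls.series
# if a correlation structure has been specified
fit <- lmFit(ExpressionMatrix, design, method = "ls")
```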

Model fitting: glm vs glmmPQL

I am fitting a model regarding absence-presence data and I would like to check whether the random factor is significant or not.
To do this, one should compare the GLMM with a GLM and use a likelihood-ratio test to see whether the random effect improves the fit, if I understand correctly.
But if I perform anova(glm, glmm), I get an analysis-of-deviance table and no output that compares the models.
How do I get the output that I desire, thus comparing both models?
Thanks in advance,
Koen
Somewhere you got the wrong impression about using anova() for this. Below, re was fit using glmmPQL() from the MASS package and fe was fit using glm() from base R:
> anova(re,fe)
#Error in anova.glmmPQL(re, fe) : 'anova' is not available for PQL fits
That message appears to be the sole reason anova.glmmPQL() was created.
See this thread for verification and vague explanation:
https://stat.ethz.ch/pipermail/r-help/2002-July/022987.html
Simply put, anova() does not work with glmmPQL fits. If you need a likelihood-ratio comparison, refit the model with glmer() from the lme4 package, which does support anova().
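A hedged sketch of that route (the variable, grouping-factor, and data names are invented for illustration):

```r
library(lme4)
# Fixed-effects-only fit vs. a fit adding a random intercept for 'site'
fe <- glm(presence ~ x1 + x2, family = binomial, data = dat)
re <- glmer(presence ~ x1 + x2 + (1 | site), family = binomial, data = dat)
anova(re, fe)   # likelihood-ratio style comparison of the two fits
```

Note that the p-value for a variance component tested this way is conservative, because the null value (variance = 0) sits on the boundary of the parameter space.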
