Compare fit Cox models with a cluster() element in R

I would like to compare the model fits of Cox proportional-hazards models with a likelihood-ratio test (or equivalent). This usually works by comparing the fit with the anova() function. However, when I have data that is organized in clusters and therefore has to be defined that way, this does not work. See the example below (the lung data set is included in the survival package):
library(survival)
data(lung)
lung_cox <- coxph(Surv(time, event = status) ~ age + cluster(sex), data = lung)
lung_cox1 <- coxph(Surv(time, event = status) ~ 1 + cluster(sex), data = lung)
anova(lung_cox, lung_cox1)
The error I get after the anova() function is:
Error in anova.coxphlist(c(list(object), dotargs), test = test) :
Can't do anova tables with robust variances
Is there a way to work around this? I'm quite new to survival analysis in R, so maybe I'm not aware of a certain package that provides this?
Many thanks!
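
One possible direction, offered as a sketch rather than a definitive answer: with a cluster() term the partial likelihood no longer gives a valid likelihood-ratio test, but summary() of the fitted model reports robust Wald and robust score tests of all coefficients against the null model. For a comparison against an intercept-only model, as in the example above, that is the same hypothesis. The code below assumes the lung data from the survival package and uses inst (institution) as the cluster variable purely for illustration; that choice is an assumption, not part of the question.

```r
library(survival)

# Assumption: cluster on institution (inst) instead of sex, only so that
# there are more than two clusters for the robust variance.
fit <- coxph(Surv(time, status) ~ age + cluster(inst), data = lung)

s <- summary(fit)

# Wald test of all coefficients, based on the robust variance
s$waldtest

# Robust score test, reported only when a robust variance was used
s$robscore
```

Both tests compare the fitted model with the model that has no covariates, so they cover the age versus null comparison. Comparing two non-null models this way would need a Wald test restricted to the extra coefficients.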

Related

How to use sample weights in GAM (mgcv) on survey data for Logit regression?

I'm interested in fitting a GAM regression to data from a nationwide survey that comes with sample weights. I read this post with interest.
I selected my vars of interest generating a DF:
library(dplyr)
nhanesAnalysis <- nhanesDemo %>%
  select(fpl,
         age,
         gender,
         persWeight,
         psu,
         strata)
Then, from what I understood, I generated a weighted design object with the following code:
library(survey)
nhanesDesign <- svydesign( id = ~psu,
strata = ~strata,
weights = ~persWeight,
nest = TRUE,
data = nhanesAnalysis)
Let's say that I select only subjects with age ≥ 30:
ageDesign <- subset(nhanesDesign, age >= 30)
Now I would like to fit a GAM (fpl ~ s(age) + gender) with the mgcv package. Is it possible to do so with the weights argument, or by using the svydesign object ageDesign?
EDIT
I was wondering whether it is correct to extract the computed weights from an svyglm object and use them for the weights argument in GAM.
This is more difficult than it looks. There are two issues:
You want to get the right amount of smoothing.
You want valid standard errors.
Just giving the sampling weights to mgcv::gam() won't do either of these: gam() treats the weights as frequency weights and so will think it has a lot more data than it actually has. You will get undersmoothing and underestimated standard errors because of the weights, and you will also likely get underestimated standard errors because of the cluster sampling.
The simple work-around is to use regression splines (splines package) instead. These aren't quite as good as the penalised splines used by mgcv, but the difference usually isn't a big deal, and they work straightforwardly with svyglm. You do need to choose how many degrees of freedom to assign.
library(survey)
library(splines)
# natural cubic spline in age with 4 degrees of freedom
svyglm(fpl ~ ns(age, 4) + gender, design = nhanesDesign)
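
The regression-spline approach can be tried end to end on data that ships with the survey package. The sketch below is only an illustration: the api data, the cluster design dclus1, and the choice of 3 degrees of freedom are stand-ins for the NHANES variables above, not a recommendation.

```r
library(survey)
library(splines)

# Example data bundled with the survey package (a one-stage cluster sample)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Natural cubic spline in ell with 3 degrees of freedom (an arbitrary choice),
# plus a categorical covariate
fit <- svyglm(api00 ~ ns(ell, 3) + stype, design = dclus1)

# Design-based standard errors for the spline basis coefficients
summary(fit)
```

The individual spline coefficients are not interpretable on their own; the point is that the standard errors come from the survey design rather than from treating the weights as frequencies.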

SVM Plot does not show any result in R

I've used the following code:
library(e1071)
svm.model<-svm(default.payment.next.month~PAY_AMT6,data=creditdata,cost=5,gamma=1)
plot(svm.model,data=creditdata,fill=TRUE)
Using a reproducible example provided by @parth, you can try something like the code below.
library(caret)
library(e1071)
#sample data
data("GermanCredit")
#SVM model
svm.model <- svm(Class ~ Duration + Amount + Age, data = GermanCredit)
#plot SVM model
plot(svm.model, data = GermanCredit, Duration ~ Amount)
Here I ran a classification model, so y (i.e. Class in the case above) in ?svm should be a factor (you can verify this using str(GermanCredit)).
Once the model is built, you can plot it using plot. ?plot.svm says that you need to provide a formula (fixing 2 dimensions, i.e. Duration ~ Amount in the case above) if more than two independent variables are used in your model. You may also be interested in the slice option (for more detail refer to ?plot.svm).
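
The slice option mentioned above can be illustrated with the iris data that ships with R, following the pattern in ?plot.svm. The slice values are arbitrary; they only fix the dimensions that are not shown on the two plot axes.

```r
library(e1071)

# Classification model with four predictors; Species is a factor
data(iris)
iris.model <- svm(Species ~ ., data = iris)

# Plot two dimensions and hold the remaining two at fixed values
plot(iris.model, data = iris, Petal.Width ~ Petal.Length,
     slice = list(Sepal.Width = 3, Sepal.Length = 4))
```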

How to test significant improvement of LRM model

Using the rms package of Frank Harrell I constructed a predictive model using the lrm function.
I want to test whether this model has significantly better predictive value for a binomial event than another (lrm) model.
I tried different functions, such as anova(model1, model2) and the pR2 function of the pscl library (to compare the pseudo R^2), but none of them work with an lrm-based model.
What is the best way to see whether my new model is significantly better than the earlier model?
Update: Here is an example (where I want to predict the chance of bone metastasis) to check whether size or stage (in addition to the other variables) gives the best model:
library(rms)
getHdata(prostate)
ddd <- datadist(prostate)
options( datadist = "ddd" )
mod1 = lrm(as.factor(bm) ~ age + sz + rx, data=prostate, x=TRUE, y=TRUE)
mod2 = lrm(as.factor(bm) ~ age + stage + rx, data=prostate, x=TRUE, y=TRUE)
Fundamentally, the question seems to be about comparing two non-nested models.
If you fit your models using the glm function, you can use the vuong function in the pscl package.
To test the fit of two nested models, you can use the lrtest function from the rms package.
lrtest(mod1,mod2)
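
A likelihood-ratio test is only meaningful when one model is nested in the other and both are fitted to the same rows. The sketch below uses simulated data (the variables x1, x2 and y are made up for illustration) so that the nesting holds by construction.

```r
library(rms)

# Simulated data: binary outcome depending on two predictors
set.seed(1)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$x1 + 0.5 * d$x2))

# The smaller model is nested in the larger one
small <- lrm(y ~ x1, data = d)
large <- lrm(y ~ x1 + x2, data = d)

# Likelihood-ratio test for the added term x2
res <- lrtest(small, large)
res
```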

Cannot get adjusted means for glmer using lsmeans

I have a glm that I would like to get adjusted means for using lsmeans. The following code makes the model (and seems to be doing it correctly):
library(lmerTest)
data$group <- as.factor(data$grp)
data$site <- as.factor(data$site)
data$stimulus <- as.factor(data$stimulus)
data.acc1 = glmer(accuracy ~ site + grp*stimulus + (1|ID), data=data, family=binomial)
However, when I try to use any of the code below to get adjusted means for the model, I get the error
Error in lsmeansLT(model, test.effs = test.effs, ddf = ddf) :
The model is not linear mixed effects model.
lsmeans(data.acc1, "stimulus")
or
data.lsm <- lsmeans(data.acc1, accuracy ~ stimulus ~ grp)
pairs(data.lsm)
Any suggestions?
The problem is that you have created a generalised linear mixed model using glmer() (in this case a mixed logistic regression model), not a linear mixed model using lmer(). The error comes from lsmeansLT, so the lsmeans() being called here appears to be the one supplied by lmerTest, which does not accept objects created by glmer() because they are not linear mixed models.
Answers in this post might help: I can't get lsmeans output in glmer
And this post might be useful if you want to understand/compute marginal effects for mixed GLMs: Is there a way of getting "marginal effects" from a `glmer` object
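
As a hedged sketch of one way to get adjusted means from a glmer fit: the emmeans package (the successor to the lsmeans package) handles glmer objects. The example assumes emmeans is installed and uses the cbpp data from lme4 as a stand-in for the data in the question.

```r
library(lme4)
library(emmeans)

# Mixed logistic regression on example data shipped with lme4
gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

# Adjusted means per period, back-transformed to the probability scale
emm <- emmeans(gm, "period", type = "response")
emm

# Pairwise comparisons between periods
pairs(emm)
```

Calling the function with an explicit namespace, e.g. emmeans::emmeans(), avoids picking up a function of the same name from another loaded package.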

Is there a way to extrapolate predicted data from lmer

I am using lmer to fit a multilevel polynomial regression model with several fixed effects (including subject-specific variables like age, short-term memory span, etc.) and two sets of random effects (Subject and Subject:Condition). Now I would like to predict data for a hypothetical subject with particular properties (age, short-term memory span, etc.). I fit the model (m) and created a new data frame (pred) that contains my hypothetical subject, but when I tried predict(m, pred) I got an error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "mer"
I know I could use the brute-force method of extracting fixed effects from my model and multiplying it all out, but is there a more elegant solution?
You can do this type of extrapolated prediction easily with the merTools package for R: http://www.github.com/jknowles/merTools
merTools includes a function called predictInterval which provides robust prediction capabilities for lmer and glmer fits. Specifically, you can use this function to predict extrapolated data, and to obtain prediction intervals that account for the variance in both the fixed and random effects, as well as the residual error of the model.
Here's a quick code example:
library(merTools)
m1 <- lmer(Reaction ~ Days + (1|Subject), data = sleepstudy)
predOut <- predictInterval(m1, newdata = sleepstudy, n.sims = 100)
# extrapolated data
extrapData <- sleepstudy[1:10,]
extrapData$Days <- 20
extrapPred <- predictInterval(m1, newdata = extrapData)
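
For point predictions only, recent versions of lme4 (fits of class merMod rather than the older mer class in the error above) also provide a predict() method. The sketch below is a minimal illustration on the sleepstudy data; the subject label "hypothetical" is made up.

```r
library(lme4)

m <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# A subject that was not in the data, including a value of Days
# outside the observed range
new_subj <- data.frame(Days = c(0, 5, 20), Subject = "hypothetical")

# Population-level prediction: ignore all random effects
p_fixed <- predict(m, newdata = new_subj, re.form = NA)

# Alternatively, allow the unseen level; its random effect is taken as zero
p_new <- predict(m, newdata = new_subj, allow.new.levels = TRUE)

p_fixed
```

Unlike predictInterval, this gives no measure of uncertainty, so it is best seen as a replacement for the brute-force multiplication of fixed effects rather than for the interval approach.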
