Goal: get marginal and conditional R^2 in dredge results when using an optimizer in the original model
This branches off of this question: "dredge doesn't work when specifying glmer optimizer" and the two solutions provided there.
Solution 1: change the r.squaredLR.R package source code.
Solution 2: pass a function to dredge (via its extra argument) so that it calls r.squaredGLMM instead of r.squaredLR.
I tried Solution 2 first, which works perfectly on the simulated data, but when I try it on my model I get the error:
Error in r.squaredGLMM(x, null = nullmodel)["delta", ] :
subscript out of bounds
I then tried Solution 1 by altering the source code of r.squaredLR.R as described, saving it as an R script, and using source() to call the edited 'null.fit' function, so as to avoid editing r.squaredLR.R permanently (I load MuMIn before sourcing the edited function). Yet this doesn't work.
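One possible reason the sourced copy has no effect (my guess, not confirmed in the thread): functions inside the MuMIn namespace keep calling MuMIn's own null.fit rather than the edited copy in your global environment. Overriding the copy in the namespace itself, e.g. with assignInNamespace(), might behave differently ('null_fit_edited.R' is a hypothetical file name):
source("null_fit_edited.R")  # defines the edited null.fit in the global environment
assignInNamespace("null.fit", null.fit, ns = "MuMIn")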
Back to Solution 2...
I tried to simulate data similar to mine and was able to reproduce the same error (the lmercontrol argument is disregarded in this global model, but since I get the desired error I didn't try to adjust the data so that lmercontrol would actually be needed).
#Solution 2 attempt
set.seed(101)
dd <- data.frame(x1 = rnorm(1920), x2 = rnorm(1920), x3 = rnorm(1920), x4 = rnorm(1920),
                 treatment = factor(rep(1:2, each = 3)),
                 replicate = factor(rep(1:3, each = 1)),
                 stage = factor(rep(1:5, each = 384)),
                 country = factor(rep(1:4, each = 96)),
                 plot = factor(rep(1:10, each = 24)),
                 chamber = factor(rep(1:6, each = 1)),
                 n = 1920)
library(lme4)
dd$y <- simulate(~ x1 + x2 + x3 + (1|plot),
                 family = binomial,
                 weights = dd$n,
                 newdata = dd,
                 newparams = list(beta = c(1, 1, 1, 1),
                                  theta = 1))[[1]]
# my real response variable 'y' has a Poisson distribution, but I had difficulty figuring
# out how to simulate a Poisson distribution, so I kept the binomial.
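# Aside (my own sketch, not part of the original post): a Poisson response can
# be simulated the same way; with the log link the betas are on the log scale,
# so smaller slopes keep the counts moderate.
dd$y_pois <- simulate(~ x1 + x2 + x3 + (1|plot),
                      family = poisson,
                      newdata = dd,
                      newparams = list(beta = c(1, 0.5, 0.5, 0.5),
                                       theta = 1))[[1]]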
m0 <- lmer(y ~ x1 + x2 + x3 + x4 + treatment*replicate*stage + (1|chamber) + (1|country/plot),
           data = dd,
           na.action = "na.fail",
           REML = FALSE,
           lmercontrol = glmerControl(optimizer = "bobyqa"))
nullmodel <- MuMIn:::.nullFitRE(m0)
dredge(m0, m.lim = c(0, 5), rank = "AIC", extra = list(R2 = function(x) {
  r.squaredGLMM(x, null = nullmodel)["delta", ]
}))
A suggested reason for the "subscript out of bounds" error was that "the data being put into the algorithm are not in the format that the function expects."
Indeed, the function works when I remove ["delta", ], and I get the columns R21 and R22, but without taking the delta row into account these values are probably incorrect, and I'm not sure which one is the marginal and which the conditional R^2.
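One defensive workaround (my own sketch, not from the thread): depending on the MuMIn version, r.squaredGLMM() returns either a matrix whose rows are named after the method ("delta", "lognormal", "trigamma") or a plain one-row matrix / named vector, which would explain the subscript error. Checking the structure before subsetting avoids it; in either case the first column or element (R2m) is the marginal and the second (R2c) the conditional R^2:
R2_extra <- function(x) {
  r2 <- r.squaredGLMM(x, null = nullmodel)
  if (is.matrix(r2)) {
    # row names are only present for some MuMIn versions/families
    if (!is.null(rownames(r2)) && "delta" %in% rownames(r2)) r2["delta", ] else r2[1, ]
  } else {
    r2  # older MuMIn returns a named vector c(R2m = ..., R2c = ...)
  }
}
dredge(m0, m.lim = c(0, 5), rank = "AIC", extra = list(R2 = R2_extra))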
If you have any ideas, I'm all ears! Thanks in advance for all help.
Related
New to stackoverflow. I'm working on a project with NHIS data, but I cannot get the svyglm function to work even for a simple, unadjusted logistic regression with a binary predictor and binary outcome variable (ultimately I'd like to use multiple categorical predictors, but one step at a time).
El_under_glm<-svyglm(ElUnder~SO2, design=SAMPdesign, subset=NULL, family=binomial(link="logit"), rescale=FALSE, correlation=TRUE)
Error in eval(extras, data, env) :
object '.survey.prob.weights' not found
I changed the variables to 0 and 1 instead:
Under_narm$SO2REG<-ifelse(Under_narm$SO2=="Heterosexual", 0, 1)
Under_narm$ElUnderREG<-ifelse(Under_narm$ElUnder=="No", 0, 1)
But then get a different issue:
El_under_glm<-svyglm(ElUnderREG~SO2REG, design=SAMPdesign, subset=NULL, family=binomial(link="logit"), rescale=FALSE, correlation=TRUE)
Error in svyglm.survey.design(ElUnderREG ~ SO2REG, design = SAMPdesign, :
all variables must be in design= argument
This is the design I'm using to account for the weights -- I'm pretty sure it's correct:
SAMPdesign=svydesign(data=Under_narm, id= ~NHISPID, weight= ~SAMPWEIGHT)
Any and all assistance appreciated! I've got a good grasp of stats but am a slow coder. Let me know if I can provide any other information.
Using some make-believe sample data I was able to get your model to run by setting rescale = TRUE. The documentation states:
"Rescaling of weights, to improve numerical stability. The default rescales weights to sum to the sample size. Use FALSE to not rescale weights."
So one solution may be simply to set rescale = TRUE.
library(survey)
# sample data
Under_narm <- data.frame(SO2 = factor(rep(1:2, 1000)),
                         ElUnder = sample(0:1, 1000, replace = TRUE),
                         NHISPID = paste0("id", 1:1000),
                         SAMPWEIGHT = sample(c(0.5, 2), 1000, replace = TRUE))
# with 'rescale' = TRUE
SAMPdesign <- svydesign(ids = ~NHISPID,
                        data = Under_narm,
                        weights = ~SAMPWEIGHT)
El_under_glm <- svyglm(formula = ElUnder ~ SO2,
                       design = SAMPdesign,
                       family = quasibinomial(), # this family avoids warnings
                       rescale = TRUE)           # weights rescaled to sum to the sample size
summary(El_under_glm, correlation = TRUE)        # use correlation with summary()
Otherwise, looking at the code for this function's method with survey:::svyglm.survey.design, it seems like there may be a bug. I could be wrong, but on my reading, when rescale is FALSE, .survey.prob.weights does not appear to get assigned a value.
if (is.null(g$weights))
    g$weights <- quote(.survey.prob.weights)
else g$weights <- bquote(.survey.prob.weights * .(g$weights)) # bug?
g$data <- quote(data)
g[[1]] <- quote(glm)
if (rescale)
    data$.survey.prob.weights <- (1/design$prob)/mean(1/design$prob)
There may be a workaround if you assign a vector of numeric values to .survey.prob.weights in the global environment. I have no idea what these values should be, but your error goes away if you do something like the following. (.survey.prob.weights needs to be double the length of the data.)
SAMPdesign <- svydesign(ids = ~NHISPID,
                        data = Under_narm,
                        weights = ~SAMPWEIGHT)
.survey.prob.weights <- rep(1, 2000)
El_under_glm <- svyglm(formula = ElUnder ~ SO2,
                       design = SAMPdesign,
                       family = quasibinomial(),
                       rescale = FALSE)
summary(El_under_glm, correlation = TRUE)
I am trying to use inverse probability of treatment weighting in a cause-specific Cox regression using the CSC function in the riskRegression package.
I calculated the weights without a problem, but when I try to pass the weights to the CSC function I get the following error message:
Error in eval(extras, data, env) :
..1 used in an incorrect context, no ... to look in
A complete reproducible example looks like this:
library(ipw)
library(cmprsk)
library(survival)
library(riskRegression)
data(mgus2)
# get some example data
mgus2$etime <- with(mgus2, ifelse(pstat==0, futime, ptime))
mgus2$event <- with(mgus2, ifelse(pstat==0, 2*death, 1))
mgus2$event <- factor(mgus2$event, 0:2, labels=c("censor", "pcm", "death"))
mgus2$age_cat <- cut(mgus2$age, breaks=seq(0, 100, 25))
mgus2$sex <- ifelse(mgus2$sex=="F", 0, 1)
# remove NA
mgus2 <- subset(mgus2, !is.na(mspike))
# estimate inverse probability weights
weights <- ipwpoint(sex, "binomial", "logit", denominator= ~ age_cat + mspike,
data=mgus2)
mgus2$weights <- weights$ipw.weights
# rerun cox model using weights
mod2 <- CSC(Hist(etime, event) ~ sex + age_cat + mspike, cause="pcm",
surv.type="hazard", fitter="coxph", data=mgus2,
weights=weights)
I know from the documentation that the CSC function calls the coxph function internally, passing additional arguments on to it via the ... syntax. Other arguments can be passed through just fine, but the weights argument always produces the error message above.
How can I fix this?
UPDATE:
I have contacted the package maintainer and he has already fixed the bug. It should work fine now, with one small difference: instead of weights=weights, one has to use weights=mgus2$weights.
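For completeness, the working call after that fix would look like this (a sketch based on the update above, assuming the patched riskRegression version):
mod2 <- CSC(Hist(etime, event) ~ sex + age_cat + mspike, cause = "pcm",
            surv.type = "hazard", fitter = "coxph", data = mgus2,
            weights = mgus2$weights)  # pass the weight vector itself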
I created a function to roll-apply an exponentially weighted least-squares regression using the dynlm package. Here is the code:
library(zoo)    # rollapply(), as.zoo()
library(dynlm)  # dynlm()

residualization <- function(df, formula_ref, size) {
  rollapply(df,
            width = size,
            FUN = ewma_regression,
            formula_ref = formula_ref,
            by.column = FALSE, align = "right")
}

ewma_regression <- function(x, formula_ref) {
  n <- nrow(x)
  # exponential (EWMA) weights: most recent observation weighted highest
  weights <- 0.06 * 0.94^(seq(n - 1, 0, by = -1))
  t <- dynlm(formula = as.formula(formula_ref), data = as.zoo(x), weights = weights)
  return(t$residuals)
}
However, when I run this code on my dataset, I get the error:
Error in as.formula(formula_ref) : object 'formula_ref' not found
When I try to debug it, the variable formula_ref does exist inside the function's environment! However, even in debug mode I cannot run the dynlm regression, not even if I set formula_ref to a temporary formula object.
Can anyone help me out? I know it might be a silly mistake, but I can't find it!
A reproducible example would be:
library(xts)

dates <- seq.Date(from = as.Date("2010-01-01"), length.out = 1000, by = "day")
teste1 <- data.frame(x = rnorm(1000), y = rnorm(1000) * 5)
teste2 <- xts(teste1, order.by = dates)
formula.test <- y ~ x + I(x^2)
teste3 <- residualization(df = teste2, formula_ref = formula.test, size = 100)
You can just wrap y ~ x + I(x^2) in quotation marks ("y ~ x + I(x^2)").
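Put concretely (a minimal sketch of that suggestion, reusing the objects from the reproducible example above): pass the formula as a character string, so that as.formula() builds the formula inside ewma_regression() itself rather than passing along a formula object that carries its original environment:
formula.test <- "y ~ x + I(x^2)"  # character string instead of a formula object
teste3 <- residualization(df = teste2, formula_ref = formula.test, size = 100)
head(teste3)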
I have a random effects model that looks at the mean bias of the results from a numerical water temperature model given changes to a parameter, roughness. Roughness is the x variable, mean_bias is the y variable, and the location in the stream (1:9) is the random effect variable:
lmer_mb <- lmer(mean_bias ~ roughness + (1|location), data = W3T, REML=FALSE)
I've tried using approx() to find the roughness value where mean_bias = 0:
xval <- approx(roughness = lmer_mb$fitted, mean_bias = lmer_mb$roughness, xout = 0)$mean_bias
But I keep getting the error:
Error in approx(roughness = lmer_mb$fitted, mean_bias = lmer_mb$roughness, :
  unused arguments (roughness = lmer_mb$fitted, mean_bias = lmer_mb$roughness)
I also want to plot the xval (once I figure it out) on my plot, and was going to adapt the code that I found in another question on stackoverflow:
xval <- approx(x = fit$fitted.values, y = x, xout = 30)$y
Am I on the right track?
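As an aside (my suggestion, not part of the original thread): approx() only knows the arguments x, y, and xout, which is why the call above complains about unused arguments. Something along these lines would do the interpolation, assuming W3T$roughness is the predictor used in the fit:
# interpolate fitted mean_bias against roughness and solve for mean_bias = 0
xval <- approx(x = fitted(lmer_mb), y = W3T$roughness, xout = 0)$y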
I found the approach I needed to solve this problem. First, I needed to load the package:
library(merTools)
Then I used the wiggle function to generate a new data frame around the area where I wanted to solve for x:
opt2 <- wiggle(data=W3T, var="roughness", values = seq(0.05,0.1,0.001))
Then, I filtered opt2 for the values I wanted and turned them into a list:
sub.opt2 <- filter(opt2, mean_bias == "0")  # filter() here is presumably dplyr::filter()
sub.opt3 <- list(sub.opt2)
And finally, found the average roughness using the draw() function:
opt.roughness <- draw(lmer_mb, type = 'average', varList = sub.opt3)
opt.roughness
My solution was based on the code I found here:
https://cran.r-project.org/web/packages/merTools/vignettes/merToolsIntro.html
Extract the baseline hazard function h0(t) from a glmnet object
I want to know the hazard function at time t, h(t, X) = h0(t) * exp(Σ βi * Xi). How can I extract the baseline hazard function h0(t) from a glmnet object in R?
What I know is that the basehaz() function in the survival package can extract the baseline hazard from a coxph object only.
I also found a function, glmnet.basesurv(time, event, lp, times.eval = NULL, centered = FALSE), but when I try to use it, there is an error:
Error: could not find function "glmnet.basesurv"
Below is my code, which uses glmnet to fit the Cox model and obtain the coefficients of the selected variables. Is it possible to get the baseline hazard function h0(t) from this glmnet object?
Code
# Split data into training data and testing data
set.seed(101)
train_ratio = 2/3
sample <- sample.int(nrow(x), floor(train_ratio*nrow(x)), replace = F)
x.train <- x[sample, ]
x.test <- x[-sample, ]
y.train <- y[sample, ]
y.test <- y[-sample, ]
surv_obj <- Surv(y.train[,1],y.train[,2])
#
my_alpha = 0.5
fit = glmnet(x = x.train, y = surv_obj, family = "cox",alpha = my_alpha) # fit the model with elastic net method
plot(fit,xvar="lambda", main="cox model coefficient paths(glmnet.fit)\n\n") # Plot the paths for the fit
fit
# cross validation to find out best lambda
cv_fit = cv.glmnet(x = x.train,y = surv_obj , family = "cox",nfolds = 10,alpha = my_alpha)
tencrossfit <- cv_fit$glmnet.fit
plot(cv_fit, main="Cross-validated Deviance(10 folds cv.glmnet.fit)\n\n")
plot(tencrossfit, main="cox model coefficient paths(10 folds cv.glmnet.fit)\n\n")
max(cv_fit$cvm)
summary(cv_fit$cvm)
cv_fit$lambda.min
cv_fit$lambda.1se
coef.min = coef(cv_fit, s = "lambda.1se")
pred_min_value2 <- predict(cv_fit, s=cv_fit$lambda.min, newx=x.test,type="link")
I really appreciate any help you can provide.
The glmnet.basesurv function is part of the hdnom package (which is available on CRAN), not glmnet itself. So install that, and then call it.
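A sketch of what that call might look like with the objects from the question (my own guess, untested; it assumes an hdnom version that exports glmnet.basesurv with the signature quoted in the question):
# linear predictor for the training data at the chosen lambda
lp <- as.numeric(predict(cv_fit, newx = x.train, s = "lambda.1se", type = "link"))
basesurv <- glmnet.basesurv(time = y.train[, 1], event = y.train[, 2], lp = lp)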
I had a similar question. After installing hdnom (install.packages("hdnom")), if you check the package's function list with library(help = "hdnom"), you can see that the function is now called glmnet_survcurve(). Since it is not exported, I got it working as hdnom:::glmnet_survcurve(); an example is here:
S <- Surv(data$survtimed, data$outcome)
X_glm <- model.matrix(S ~ ., data[, c("factor1", "factor2")])
cox_model <- glmnet(X_glm, S, family = "cox", alpha = 1, lambda = 0.2)
times <- c(1, 2)  # for prediction of survival and linear predictors at times 1 and 2
predictions <- hdnom:::glmnet_survcurve(cox_model, S[, 1], S[, 2], X_glm, survtime = times)
predictions$p[, 1]  # survival probability at time 1