Difference between confint and predict in R

I'm analyzing a linear model with two factors and I need to find a confidence interval for the response variable. As I understand it, confint() and predict(, interval='confidence') both compute confidence intervals, so what is the difference between them?

confint() gives confidence intervals for the model parameters (the coefficients).
predict(., interval="confidence") gives confidence intervals for the model predictions, that is, for the mean response at given predictor values.
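As a minimal sketch of the distinction, the two calls can be run side by side on a model with two factors. The built-in warpbreaks data set and the names fit, param_ci, new_obs and mean_ci are assumptions chosen for illustration; they are not from the question.

```r
# Linear model with two factors (wool, tension) on a built-in data set
fit <- lm(breaks ~ wool + tension, data = warpbreaks)

# Intervals for the coefficients: one row per model parameter
param_ci <- confint(fit, level = 0.95)
print(param_ci)

# Interval for the mean response at one combination of factor levels
new_obs <- data.frame(wool = "A", tension = "M")
mean_ci <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
print(mean_ci)  # columns: fit, lwr, upr
```

So for a confidence interval on the response, predict() is the relevant call; confint() only describes uncertainty in the coefficients.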

Related

How do I specify the dispersion parameter when computing the confidence interval for a GLM?

I have a model of exponential decay in the form Y = exp{a + bX + cW}. In R, I represent this as a generalized linear model (GLM) using a gamma random component with log link function.
fitted <- glm(Y ~ X + W, family=Gamma(link='log'))
I know from this post that for the standard errors to really represent an exponential rather than gamma random component, I need to specify the dispersion parameter as being 1 when I call summary.
summary(fitted, dispersion=1)
summary(fitted) # not the same!
Now I want to find the 95% confidence intervals for my estimates of a, b and c. However, there seems to be no way to pass the dispersion parameter to confint(), even though it should affect the confidence interval (because it affects the standard error).
confint(fitted)
confint(fitted, dispersion=1) # same as the last confint :(
So, in order to get the confidence intervals corresponding to an exponential rather than gamma random component, how do I specify the dispersion parameter when computing the confidence interval for a GLM?
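One possible workaround, sketched here under assumptions, is to build Wald-type intervals by hand from the standard errors that summary(, dispersion=1) reports. This is a normal approximation and not the profile-likelihood interval that confint() computes for a GLM, so the two will generally differ. The simulated data and the names fit, s, est, se and wald_ci are hypothetical.

```r
# Simulated data, for illustration only: exponential response with log-linear mean
set.seed(42)
n <- 200
X <- runif(n)
W <- runif(n)
Y <- rexp(n, rate = 1 / exp(0.5 + 1.2 * X - 0.8 * W))

fit <- glm(Y ~ X + W, family = Gamma(link = "log"))

# Standard errors computed with the dispersion fixed at 1 (exponential case)
s   <- summary(fit, dispersion = 1)
est <- coef(s)[, "Estimate"]
se  <- coef(s)[, "Std. Error"]

# Wald 95% intervals: estimate +/- normal quantile * standard error
z <- qnorm(0.975)
wald_ci <- cbind(lower = est - z * se, upper = est + z * se)
print(wald_ci)
```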

plotting bootstrapped confidence intervals

I am attempting to plot a bootstrapped linear model and would like to include the model's upper and lower confidence bounds, but I am not sure I am calculating the bootstrapped bounds correctly. Below is an example using the cars dataset in R together with the boot library.
library(boot)
plot(speed~dist,cars,pch=21,bg="grey")
## standard linear model
mod<-lm(speed~dist,cars)
new.dat=seq(0,120,10)
mod.fit<-predict(mod,newdata=data.frame(dist=new.dat),interval="confidence")
lines(new.dat,mod.fit[,1]);#line fit
lines(new.dat,mod.fit[,2],lty=2);#lower confidence interval
lines(new.dat,mod.fit[,3],lty=2);#upper confidence interval
##Bootstrapped Confidence Intervals
lm.boot <- function(formula, data, indices) {
  d <- data[indices, ]  # allows boot to select sample
  fit <- lm(formula, data = d)
  return(coef(fit))
}
results <- boot(data = cars, statistic = lm.boot,
                R = 100, formula = speed ~ dist)
N.mod<-nrow(cars)
x.val<-new.dat
y.boot.fit<-(mean(results$t[,2])*x.val)+mean(results$t[,1])
y.boot.fit.uCI<-y.boot.fit+qt(0.975,N.mod-2)*sd(results$t[,2])
y.boot.fit.lCI<-y.boot.fit-qt(0.975,N.mod-2)*sd(results$t[,2])
lines(new.dat,y.boot.fit,col="red")
lines(new.dat,y.boot.fit.lCI,lty=2,col="red");#lower confidence interval
lines(new.dat,y.boot.fit.uCI,lty=2,col="red");#upper confidence interval
legend("bottomright",legend=c("Linear Model","Bootstrapped Model"),lty=1,
col=c("black","red"),ncol=1,cex=1,bty="n",y.intersp=1.5,x.intersp=0.75,
xpd=NA,xjust=0.5)
With this code, the resulting plot shows the bootstrapped confidence interval lying on top of the bootstrapped fitted line.
Any help or direction is appreciated. Granted, this might be better suited to Cross Validated or another statistics-specific board.
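In the code above, the offset qt(0.975, N.mod-2)*sd(results$t[,2]) is a single number that does not depend on x, so the bounds are just the fitted line shifted by a small constant. One alternative, sketched here and not the only option, is to bootstrap the fitted values on the grid and take pointwise percentiles. The names grid, boot_fit, res and band, and the choice of R = 1000, are assumptions for illustration.

```r
library(boot)

# Fixed grid of predictor values at which the band is evaluated
grid <- data.frame(dist = seq(0, 120, 10))

# Statistic: refit on the resampled rows and return fitted values on the grid,
# so every bootstrap replicate is a whole fitted line
boot_fit <- function(data, indices) {
  fit <- lm(speed ~ dist, data = data[indices, ])
  predict(fit, newdata = grid)
}

set.seed(1)
res <- boot(data = cars, statistic = boot_fit, R = 1000)

# Pointwise percentile band: 2.5% and 97.5% quantiles at each grid point
band <- t(apply(res$t, 2, quantile, probs = c(0.025, 0.975)))

plot(speed ~ dist, cars, pch = 21, bg = "grey")
lines(grid$dist, res$t0, col = "red")             # fit on the original data
lines(grid$dist, band[, 1], lty = 2, col = "red") # lower bound
lines(grid$dist, band[, 2], lty = 2, col = "red") # upper bound
```

A band built this way widens away from the centre of the data, as the predict(, interval="confidence") band does.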

Calculating 95% confidence intervals in quantile regression in R using rq function

I would like to get 95% confidence intervals for the regression coefficients of a quantile regression. You can calculate quantile regressions using the rq function of the quantreg package in R (compared to an OLS model):
library(quantreg)
LM<-lm(mpg~disp, data = mtcars)
QR<-rq(mpg~disp, data = mtcars, tau=0.5)
I am able to get 95% confidence intervals for the linear model using the confint function:
confint(LM)
When I use quantile regression I understand that the following code produces bootstrapped standard errors:
summary.rq(QR,se="boot")
But what I would actually like is something like 95% confidence intervals, that is, something to interpret as: "with a probability of 95%, the interval [...] includes the true coefficient". When I calculate standard errors using summary.lm(), I would just multiply SE*1.96 and get results similar to those from confint(). But this is not possible using bootstrapped standard errors.
So my question is: how do I get 95% confidence intervals for quantile regression coefficients?
You can use the boot.rq function directly to bootstrap the coefficients:
x<-1:50
y<-c(x[1:48]+rnorm(48,0,5),rnorm(2,150,5))
QR <- rq(y~x, tau=0.5)
summary(QR, se='boot')
LM<-lm(y~x)
QR.b <- boot.rq(cbind(1,x),y,tau=0.5, R=10000)
t(apply(QR.b$B, 2, quantile, c(0.025,0.975)))
confint(LM)
plot(x,y)
abline(coefficients(LM),col="green")
abline(coefficients(QR),col="blue")
for(i in seq_len(nrow(QR.b$B))) {
abline(QR.b$B[i,1], QR.b$B[i,2], col='#0000ff01')
}
You may want to use the boot package to compute intervals other than the percentile interval.
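A minimal sketch of that route, assuming the quantreg and boot packages are installed: resample rows with boot(), refit rq() on each resample, and ask boot.ci() for the interval types of interest. The names dat, rq_coef, res and ci, and the choice of R = 1000, are assumptions for illustration.

```r
library(quantreg)
library(boot)

set.seed(1)
x <- 1:50
y <- c(x[1:48] + rnorm(48, 0, 5), rnorm(2, 150, 5))
dat <- data.frame(x = x, y = y)

# Statistic: refit the median regression on the resampled rows
rq_coef <- function(data, indices) {
  coef(rq(y ~ x, tau = 0.5, data = data[indices, ]))
}

res <- boot(data = dat, statistic = rq_coef, R = 1000)

# index = 2 selects the slope; request percentile and basic intervals
ci <- boot.ci(res, conf = 0.95, type = c("perc", "basic"), index = 2)
print(ci)
```

The interval endpoints are in the last two entries of ci$percent and ci$basic.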
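You could also simply retrieve the covariance matrix from the summary object by setting covariance=TRUE and expose it through a vcov method, which confint() then picks up. With se="boot", this amounts to using bootstrapped standard errors in your CI:
vcov.rq <- function(x, se = "boot") {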
vc <- summary.rq(x, se=se, cov=TRUE)$cov
dimnames(vc) <- list(names(coef(x)), names(coef(x)))
vc
}
confint(QR)
But yes, the better way to do this is to use a studentized bootstrap.

Confidence intervals for predicted probabilities from predict.lrm

I am trying to determine confidence intervals for predicted probabilities from a binomial logistic regression in R. The model is estimated using lrm (from the package rms) to allow for clustering standard errors on survey respondents (each respondent appears up to 3 times in the data):
library(rms)
model1<-lrm(outcome~var1+var2+var3,data=mydata,x=T,y=T,se.fit=T)
model.rob<-robcov(model1,cluster=respondent.id)
I am able to estimate a predicted probability for the outcome using predict.lrm:
predicted.prob<-predict(model.rob,newdata=data.frame(var1=1,var2=.33,var3=.5),
type="fitted")
What I want to determine is a 95% confidence interval for this predicted probability. I have tried specifying se.fit=T, but this is not permissible in predict.lrm when type="fitted".
I have spent the last few hours scouring the Internet for how to do this with lrm to no avail (obviously). Can anyone point me toward a method for determining this confidence interval? Alternatively, if it is impossible or difficult with lrm models, is there another way to estimate a logit with clustered standard errors for which confidence intervals would be more easily obtainable?
The help file for predict.lrm has a clear example. Here is a slight modification of it:
L <- predict(fit, newdata=data.frame(...), se.fit=TRUE)
plogis(with(L, linear.predictors + 1.96*cbind(- se.fit, se.fit)))
For some problems you may want to use the gendata or Predict functions, e.g.
L <- predict(fit, gendata(fit, var1=1), se.fit=TRUE) # leave other vars at median/mode
Predict(fit, var1=1:2, var2=3) # leave other vars at median/mode; gives CLs
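The same idea, building the interval on the linear-predictor (logit) scale and then transforming both ends with plogis(), can be sketched with base R's glm(). This is an analogue for illustration only: it uses the built-in mtcars data and does not apply the clustered covariance from robcov. The names fit_glm, p and ci are assumptions.

```r
# Ordinary logistic regression on a built-in data set (no clustering)
fit_glm <- glm(am ~ wt, family = binomial, data = mtcars)

# Prediction and its standard error on the link (logit) scale
p <- predict(fit_glm, newdata = data.frame(wt = 3), type = "link", se.fit = TRUE)

# 95% interval on the logit scale, mapped back to probabilities
ci <- plogis(p$fit + qnorm(0.975) * c(-1, 1) * p$se.fit)
print(c(prob = plogis(unname(p$fit)), lower = ci[1], upper = ci[2]))
```

Transforming the endpoints, rather than adding and subtracting a standard error on the probability scale, keeps the interval inside [0, 1].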

gbm confidence intervals in R?

Does anybody know how to calculate confidence intervals for predictions from the predict.gbm() function? I'd like a method to obtain a 95% confidence band on my gbm predictions.

Resources