I'm trying to predict survival using a Cox proportional hazards model in R.
library(survival)
model <- coxph(Surv(time, status == 2) ~ age + sex + ph.karno + wt.loss, data = lung)
predict(model, newdata = lung, type = "expected")
When I use the above code, I get the predicted cumulative hazard for the formula.
But what I actually want is the predicted survival for the same formula.
How do I predict survival in R?
Thanks in advance.

You can use either predict or survfit. With predict you need to give the newdata argument a list with values for all the variables in the model:
predict(model,
        newdata = list(time = 100, status = 1, age = 60, sex = 1, ph.karno = 60, wt.loss = 15),
        type = "expected")
[1] 0.2007497
There's a plot method for survfit objects:
png(); plot(survfit(model));
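As a minimal sketch of getting a survival probability rather than a cumulative hazard: for a Cox model the two are linked by S(t) = exp(-H(t)), so the "expected" prediction can be converted directly, and survfit with newdata gives the whole predicted curve. The object names (nd, H, sf) and the choice of 100 days are only illustrative; the covariate values are the ones used in the predict call above.

```r
library(survival)

model <- coxph(Surv(time, status == 2) ~ age + sex + ph.karno + wt.loss, data = lung)

# Illustrative covariate pattern (same values as in the predict() call above)
nd <- data.frame(time = 100, status = 1, age = 60, sex = 1,
                 ph.karno = 60, wt.loss = 15)

# type = "expected" returns the cumulative hazard up to nd$time
H <- predict(model, newdata = nd, type = "expected")
S_expected <- exp(-H)            # survival probability at day 100

# survfit() returns the whole predicted survival curve for this covariate pattern
sf <- survfit(model, newdata = nd)
S_survfit <- summary(sf, times = 100)$surv

print(c(from_expected = unname(S_expected), from_survfit = S_survfit))
```

The two numbers should agree closely; survfit is usually the more convenient route when the survival at several time points (or a plot) is wanted.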


How to obtain the QQ plot of a spline model in R

I have a model that I've fitted using splines:
ssfit.3 <- smooth.spline(anage$lifespan ~ log(anage$,
df = 3)
I'm trying to obtain the model diagnostics such as the residual plot and the QQ plot for this model. I know for a linear model you can do
which outputs all the different plots. How can I do this with spline models since plot(ssfit.3) does not output the same?
Extract the residuals and use qqnorm()/qqline().
example(smooth.spline) ## to get a model to work with
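A minimal sketch of that approach, assuming the built-in cars data stands in for the original data set (the object names fit, fitted_y and res are illustrative). The residuals are computed by hand as observed minus fitted so that nothing depends on a particular residuals method:

```r
# Fit a smoothing spline to the built-in cars data (stand-in for the real data)
fit <- with(cars, smooth.spline(speed, dist, df = 3))

# Residuals = observed response minus the spline's fitted value at each x
fitted_y <- predict(fit, cars$speed)$y
res <- cars$dist - fitted_y

# Residuals vs fitted values
plot(fitted_y, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal QQ plot of the residuals
qqnorm(res)
qqline(res)
```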

Parameter estimates and variance for stratified variables in Cox regression (strata / survival package)

I have run a Cox regression using the survival package to calculate the mortality hazard ratio of an exposure A. I found that the age variable violated the proportional hazards assumption (checked with cox.zph), so I used strata(age) to stratify on age in further models.
I need a parameter estimate of the age variable, as well as the variance and the matrix of covariance (to calculate Rate Advancement Periods)... And I don't know where to find them!
Am I missing something or am I misunderstanding what strata is doing?
Here is a reproducible example, using the lung data from the survival package.
I create the survival object and do a first Cox regression with non-stratified age variable.
lung$SurvObj <- with(lung, Surv(time, status == 2))
coxreg1 <- coxph(SurvObj ~ age + sex, data = lung)
So, I get coefficients, variance, and covariance matrix for the parameter estimates.
> coxreg1$coefficients
        age         sex
 0.01704533 -0.51321852
> vcov(coxreg1)
             age          sex
age 8.506877e-05 8.510634e-05
sex 8.510634e-05 2.804217e-02
Now, if I do a second regression with the stratified age variable, I don't get any coefficient estimate, variance or covariance for age.
coxreg2 <- coxph(SurvObj ~ strata(age) + sex, data = lung)
> coxreg2$coefficients
> vcov(coxreg2)
sex 0.0449369
Thanks for the help!
When you use a variable for stratification you don't get any coefficient estimate for it. Instead, a separate baseline hazard is estimated for each age group.
The essence of a stratified cox regression is to fit a model that has a different baseline hazard in each stratum.
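To make this concrete, here is a small sketch using the lung data. The age cut points (55 and 65) and the names lung2, agegrp, fit and bh are hypothetical choices for illustration only. The fitted model has a coefficient and variance for sex alone, while basehaz returns one cumulative baseline hazard per stratum:

```r
library(survival)

lung2 <- lung
# Hypothetical age bands, chosen only for illustration
lung2$agegrp <- cut(lung2$age, breaks = c(0, 55, 65, Inf))

fit <- coxph(Surv(time, status == 2) ~ strata(agegrp) + sex, data = lung2)

coef(fit)   # only 'sex' has a coefficient
vcov(fit)   # 1 x 1 matrix: the variance of the sex coefficient

# One cumulative baseline hazard per stratum instead of an age coefficient
bh <- basehaz(fit, centered = FALSE)
table(bh$strata)
```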

How can I get the probability density function from a regression random forest?

I am using random-forest for a regression problem to predict the label values of Test-Y for a given set of Test-X (new values of features). The model has been trained over a given Train-X (features) and Train-Y (labels). "randomForest" of R serves me very well in predicting the numerical values of Test-Y. But this is not all I want.
Instead of only a number, I want to use random forest to produce a probability density function. I searched for a solution for several days, and here is what I have found so far:
"randomForest" doesn't produce probabilities for regression, only for classification (via "predict" with type="prob").
Using "quantregForest" provides a nice way to make and visualize prediction intervals. But still not the probability density function!
Any other thought on this?
Please see the predict.all parameter of the predict.randomForest function.
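library(randomForest)
library(ggplot2)  # provides the mpg data set
rf = randomForest(cty ~ displ + cyl + trans, data = mpg)
# Predict the first car in the dataset, keeping every tree's prediction
pred = predict(rf, newdata = mpg[1, ], predict.all = TRUE)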
The 500 "elementary" (per-tree) predictions are returned in pred$individual, and their histogram gives an empirical picture of the predictive distribution.
You can also use quantregForest with a very fine grid of quantiles, convert them into a cumulative distribution function (cdf) with the R function ecdf, and turn that cdf into a density estimate with a kernel density estimator.
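A minimal sketch of the predict.all route. To stay self-contained it uses the built-in mtcars data rather than mpg, and the formula, seed and object names are illustrative assumptions. Note that the spread of per-tree predictions is only a rough proxy for a predictive density, not a calibrated one:

```r
library(randomForest)

set.seed(1)
# Illustrative regression forest on a built-in data set
rf <- randomForest(mpg ~ disp + cyl + hp, data = mtcars, ntree = 500)

# predict.all = TRUE returns the forest average and every tree's own prediction
pred <- predict(rf, newdata = mtcars[1, ], predict.all = TRUE)
tree_preds <- as.numeric(pred$individual)   # 500 per-tree predictions

# Kernel density estimate over the per-tree predictions
dens <- density(tree_preds)

hist(tree_preds, freq = FALSE, main = "Per-tree predictions",
     xlab = "Predicted mpg")
lines(dens)
abline(v = pred$aggregate, lty = 2)   # the usual point prediction
```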

Predict Weibull instant hazard

I am trying to model baseline hazard using different survival models.
For example I can model the cumulative baseline hazard using Cox as follows:
df <- aml
cox <- coxph(Surv(time, status) ~ 1, df)
cox.summary <- basehaz(cox, centered = TRUE)
However, Cox is not ideal for this, and I would like to model the instantaneous hazard rather than the cumulative hazard, so I am trying to use a Weibull model as follows:
weibull <- survreg(Surv(time, status) ~ 1, data=df, dist="weibull")
However, I cannot work out how to predict the instantaneous hazard from this. For example, if I try the following code I just get a constant value (38.18681) at each time point.
predict(weibull, newdata=data.frame(time=df$time))
Where am I going wrong here? In case I wasn't clear, my aim is to visualize the instantaneous hazard in a plot of hazard vs time.
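One possible approach, sketched under the standard relationship between survreg's parameterisation and the usual Weibull one: shape = 1 / (survreg scale) and scale = exp(intercept). With no covariates, predict() returns exp(intercept), the Weibull scale on the time axis, which is why it is constant. The hazard then follows from the closed form h(t) = (shape / scale) * (t / scale)^(shape - 1). The names shape, wscale, haz and t_grid are illustrative.

```r
library(survival)

df <- aml
weibull <- survreg(Surv(time, status) ~ 1, data = df, dist = "weibull")

# Convert survreg's parameters to the usual Weibull shape and scale
shape  <- 1 / weibull$scale
wscale <- exp(unname(coef(weibull)[1]))

# Closed-form Weibull hazard function
haz <- function(t) (shape / wscale) * (t / wscale)^(shape - 1)

t_grid <- seq(1, max(df$time), length.out = 200)
plot(t_grid, haz(t_grid), type = "l",
     xlab = "Time", ylab = "Instantaneous hazard")
```

The same curve can be cross-checked as dweibull(t, shape, wscale) divided by pweibull(t, shape, wscale, lower.tail = FALSE).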

Both x and y variables are censored in a regression

Is there any existing routine in R to fit a Tobit regression model with both censored x and y variables? I know the survreg function in the survival package can deal with censored response variables. What about a left-censored x predictor variable?
There is a framework both for tobit regression and for "interval"-censored variables in the survival package. This is Therneau's example using Tobin's original data:
tfit <- survreg(Surv(durable, durable > 0, type = 'left') ~ age + quant,
                data = tobin, dist = 'gaussian')
And the Surv function will accept interval censoring.
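As a small sketch of the response side only (it does not address a censored predictor), an interval-censored response can be built with type = "interval2", where an NA lower bound marks left censoring and an NA upper bound marks right censoring. The toy vectors lower and upper are made up for illustration; the tobin fit repeats the example above.

```r
library(survival)

# Made-up bounds: NA lower = left-censored, NA upper = right-censored,
# equal bounds = exactly observed, different bounds = interval-censored
lower <- c(NA, 2, 5, 3)
upper <- c(1, 2, NA, 6)
y <- Surv(lower, upper, type = "interval2")
print(y)

# Left-censored Gaussian (tobit) fit on Tobin's data, as in the example above
tfit <- survreg(Surv(durable, durable > 0, type = "left") ~ age + quant,
                data = tobin, dist = "gaussian")
print(coef(tfit))
```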
