Let's say I want to plot the survival curves using a model of the lung data that controls for sex and a median split of the age variable (I could also control linearly for age, and that would make my problem even worse).
I would like to plot this model showing only the stratification between the levels of the sex factor. If I do what seems to be the standard approach, however, I get four survival curves instead of two.
library(survival)
library(survminer)
library(dplyr)  # provides %>% and mutate()
reg_lung <- lung %>% mutate(age_cat = ifelse(age > 63, "old", "young"))
lung_fit <- survfit(Surv(time, status) ~ age_cat + sex, data = reg_lung)
ggsurvplot(lung_fit, data = reg_lung)
resulting survival plot
That is to say, I would like to see the difference sex makes while holding the influence of age fixed (either as a factor old/young or linearly).
You can fit your model with coxph and define sex as strata:
lung_fit <- coxph(Surv(time, status) ~ age_cat + strata(sex), data = reg_lung)
ggsurvplot(survfit(lung_fit), data = reg_lung)
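If age should instead be held fixed as a continuous covariate, a minimal sketch (my assumption, not part of the answer above) is to keep strata(sex), enter age linearly, and pass a one-row newdata to survfit(). Because newdata omits the strata variable, one curve is returned per sex stratum. The reference age (the sample median) and the object names are illustrative choices.

```r
library(survival)

# Cox model: linear age effect, separate baseline hazard per sex
cox_age <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Hold age fixed at one illustrative value (the sample median)
ref <- data.frame(age = median(lung$age))

# newdata has no sex column, so one curve per stratum is produced
adj_fit <- survfit(cox_age, newdata = ref)

# Base-graphics plot: one curve per sex at the reference age
plot(adj_fit, col = c("blue", "red"),
     xlab = "Days", ylab = "Survival probability")
legend("topright", legend = names(adj_fit$strata),
       col = c("blue", "red"), lty = 1)
```

Changing the value in ref shifts both curves together, which makes it easy to check how sensitive the sex comparison is to the chosen age.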
Related
In R, I am using ggsurvplot_facet to produce survival curves grouped by sex and faceted by a variable ecog. However, I would also like to have an overall group in the same faceted plot. Is this possible?
ggsurvplot_add_all did not help.
Here is some example data:
library(survival)   # provides the lung data and Surv()
library(survminer)
lung$ecog <- ifelse(lung$ph.ecog == 0, 0, 1)
fit <- surv_fit(Surv(time, status) ~ sex, data = lung)
fig_os <- ggsurvplot_facet(fit, data = lung, facet.by = 'ecog')
I need a survival curve for the whole population, independent of ecog.
Cheat a little: make a copy of lung, set ecog to a constant (2) in the copy, row-bind the copy to the original, and convert ecog in the row-bound dataset to a factor.
lung2 <- lung  # plain assignment copies a data frame; copy() needs data.table
lung2$ecog <- 2
lung2 <- rbind(lung,lung2)
lung2$ecog <- factor(lung2$ecog,labels = c("0", "1", "Overall"))
Then use your code above, but using lung2 as the dataset.
fit <- surv_fit(Surv(time, status) ~ sex, data = lung2)
fig_os <- ggsurvplot_facet(fit, data = lung2, facet.by = 'ecog')
Output:
Context
I am using survival::coxph() to fit the cox model. I want to get HR for exposure factors under different sex.
I am using the dataset lung from the survival package to simplify the problem I am having.
Say that I want to get HR of ph.karno (Continuous variable) in different sex (Categorical variable). I can do this in two ways.
Method 1: Use the interaction term of sex and ph.karno to get the HR of ph.karno across sex.
Method 2: Filter the dataset to contain only males (sex == 1) or only females (sex == 2). Then fit the cox model to each of these two datasets separately.
Both of the above methods can get HR of ph.karno in different sex.
However, the estimates obtained by the two methods are not consistent (see Reproducible code).
Question
How can you explain the results of the two methods?
Which method obtains the correct HR of ph.karno in different sex?
Reproducible code
library(survival)
# method 1: get ph.karno effect in different sex by adding an interact term
fit <- coxph(Surv(time, status) ~ factor(sex):ph.karno, data = lung)
interact_result = fit$coefficients
# method 2: get ph.karno effect in different sex by subsetting the data
lung = lung
lung_1 = lung[lung$sex == 1,]
lung_2 = lung[lung$sex == 2,]
fit1 = coxph(Surv(time, status) ~ ph.karno, data = lung_1)
fit2 = coxph(Surv(time, status) ~ ph.karno, data = lung_2)
subgroup_result = c(sex_1 = fit1$coefficients, sex_2 = fit2$coefficients)
# comparing ph.karno effect in different sex using the two methods described above
rbind(interact_result, subgroup_result)
# factor(sex)1:ph.karno factor(sex)2:ph.karno
# interact_result -0.013675278 -0.02030161
# subgroup_result -0.009673906 -0.03744254
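One way to see where the gap comes from, offered as a sketch rather than a definitive answer: method 1 forces both sexes to share a single baseline hazard (and has no main effect of sex), while method 2 gives each sex its own baseline. Adding strata(sex) to the interaction model gives each sex its own baseline hazard, and the partial likelihood then splits into one piece per sex, so the per-sex slopes should match the subgroup fits. The object names below are illustrative.

```r
library(survival)

# Interaction model with a separate baseline hazard for each sex
fit_strat <- coxph(Surv(time, status) ~ factor(sex):ph.karno + strata(sex),
                   data = lung)

# Separate fits on each subgroup (method 2)
fit_m <- coxph(Surv(time, status) ~ ph.karno, data = lung[lung$sex == 1, ])
fit_f <- coxph(Surv(time, status) ~ ph.karno, data = lung[lung$sex == 2, ])

# The two rows should agree up to numerical tolerance
rbind(stratified = unname(coef(fit_strat)),
      subgroup   = unname(c(coef(fit_m), coef(fit_f))))
```

Which model is appropriate then depends on whether a shared baseline hazard across sexes is a reasonable assumption for the data.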
I am estimating an SEM model that has observed variables. I am using SEM to handle missing data using FIML. My model has an interaction term to test for moderation. Here is a toy example that illustrates the issue.
library(lavaan)
library(car)
library(dplyr)
data(starwars)
sw2 <- starwars %>% mutate(
male = Recode(sex, "'male' = 1; NA=NA; else = 0"),
human = Recode(species, "'Human' = 1; NA=NA; else = 0"),
maleXby = male * birth_year,
)
mod <- 'mass ~ height + human + male + birth_year + maleXby'
fit <- sem(mod, data = sw2, missing="fiml.x")
summary(fit)
What I want to do is plot the interaction term like a margin plot, to visualize the interaction effect. But packages like interactions do not work with an object of class lavaan. How could I visualize this? Is there a package (like interactions) that makes this easier?
You could fit this model using lm(), but I think you want to be able to use FIML estimates, yes? In that case, you could use the emmeans package, which can work on lavaan-class objects if you have the semTools package loaded.
You didn't say which predictor was focal vs. moderator, but I assume you want to treat male as the moderator because it is a grouping variable. The example below can be adapted by switching their roles in the pairs() function, as well as by choosing different birth_year levels in at= at which to probe the effect of male. When birth_year is the focal predictor, its linear effect will be the same regardless of which levels are chosen, so I chose the full range() below.
library(emmeans)
library(semTools)
## for ease of use, fit model using colon operator
mod <- 'mass ~ height + human + male + birth_year + male:birth_year'
fit <- sem(mod, data = sw2, missing = "fiml.x")
## calculate expected marginal means for multiple
## levels of male (1:0) and birth_year
BYrange <- range(sw2$birth_year, na.rm = TRUE)
em.mass <- emmeans(fit, specs = ~ birth_year | male,
                   at = list(male = 1:0, birth_year = BYrange),
                   # because SEMs can have multiple DVs:
                   lavaan.DV = "mass")
em.mass
## probe effect of year across sex
rbind(pairs(em.mass))
## plot effect of year across sex
emmip(em.mass, male ~ birth_year) # 2 lines in same plot
emmip(em.mass, ~ birth_year | male) # in separate panels
I am trying to compare the survival in my study cohort with the survival in the Dutch general population (matched for age and sex). I created a rate table of the Dutch population.
library(relsurv)
setwd("")
nldpop <- transrate.hmd("mltper_1x1.txt","fltper_1x1.txt")
Then, I wanted to create a plot of the survival of my cohort (observed) and the survival of the population (expected) with age on the X-axis. However, the 'survexp' function does not seem to support a (start,stop,event)-format. Only with the normal (futime, event)-format it works, see below, but then I have follow-up time on the X-axis. Does anyone know how to get the age on the X-axis instead of follow-up time?
# Observed and expected survival with time on X-axis
fit <- survfit(Surv(futime, event)~1)
efit <- survexp(futime ~ 1, rmap = list(year=(date_entry), age=(age_entry), sex=(sex)),
ratetable=nldpop)
plot(fit)
lines(efit)
You didn't provide example data, so I used the survival::mgus data for this. Your problem may be due to incorrectly specified variable names in the rmap option.
library(relsurv)
library(dplyr)  # provides %>% and mutate()
nldpop <- transrate.hmd("mltper_1x1.txt", "fltper_1x1.txt")
mgus2 <- mgus %>% mutate(date_year = dxyr + 1900)
fit <- survfit(Surv(futime, death) ~ 1, data = mgus2)
efit <- survexp(Surv(futime, death) ~ 1, data = mgus2,
ratetable = nldpop, rmap = list(age = age*365.25, year = date_year, sex = sex))
plot(fit)
lines(efit)
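To get age rather than follow-up time on the X-axis for the observed curve, one option is the counting-process form of Surv() with age at entry and age at exit, which treats subjects as left-truncated at their entry age. This is a sketch under assumptions: it reuses the mgus data (age in years, futime in days), the names mgus_age, age_exit and fit_age are illustrative, and it covers only the observed curve, not the expected one from survexp.

```r
library(survival)

# Age at end of follow-up, in years (futime is recorded in days)
mgus_age <- transform(mgus, age_exit = age + futime / 365.25)

# Counting-process form: each subject enters the risk set at `age`
# (left truncation) and leaves at `age_exit`
fit_age <- survfit(Surv(age, age_exit, death) ~ 1, data = mgus_age)

plot(fit_age, xlab = "Age (years)", ylab = "Survival probability")
```

At the youngest ages only a few subjects are at risk, so the early part of such a curve can be unstable and should be read with care.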
Because this is such a long question I've broken it down into 2 parts; the first being just the basic question and the second providing details of what I've attempted so far.
Question - Short
How do you fit an individual frailty survival model in R? In particular, I am trying to re-create the coefficient estimates and SEs in the table below, which were found by fitting a semi-parametric frailty model to this dataset (link). The model takes the form:
h_i(t) = z_i h_0(t) exp(\beta'X_i)
where z_i is the unknown frailty for patient i, X_i is a vector of explanatory variables, \beta is the corresponding vector of coefficients, and h_0(t) is the baseline hazard function. The explanatory variables are disease, gender, bmi and age (I have included code below to clean up the factor reference levels).
Question - Long
I am attempting to follow and re-create the textbook example for fitting frailty models in Modelling Survival Data in Medical Research. In particular, I am focusing on the semi-parametric model, for which the textbook provides parameter and variance estimates for the standard Cox model, the lognormal frailty model and the gamma frailty model, shown in the above table.
I am able to recreate the no frailty model estimates using
library(dplyr)
library(survival)
dat <- read.table(
"./Survival of patients registered for a lung transplant.dat",
header = TRUE
) %>%
as_tibble() %>%
mutate( disease = factor(disease, levels = c(3,1,2,4))) %>%
mutate( gender = factor(gender, levels = c(2,1)))
mod_cox <- coxph( Surv(time, status) ~ age + gender + bmi + disease ,data = dat)
mod_cox
However, I am really struggling to find a package that can reliably re-create the results of the second 2 columns. Searching online, I found this table, which attempts to summarise the available packages:
source
Below I have posted my current findings as well as the code I've used, in case it helps someone identify whether I have simply specified the functions incorrectly:
frailtyEM - Seems to work the best for gamma; however, it doesn't offer log-normal models
frailtyEM::emfrail(
Surv(time, status) ~ age + gender + bmi + disease + cluster(patient),
data = dat ,
distribution = frailtyEM::emfrail_dist(dist = "gamma")
)
survival - Gives warnings on the gamma, and from everything I've read it seems that its frailty functionality is classed as deprecated, with the recommendation to use coxme instead.
coxph(
Surv(time, status) ~ age + gender + bmi + disease + frailty.gamma(patient),
data = dat
)
coxph(
Surv(time, status) ~ age + gender + bmi + disease + frailty.gaussian(patient),
data = dat
)
coxme - Seems to work but provides different estimates from those in the table and doesn't support the gamma distribution
coxme::coxme(
Surv(time, status) ~ age + gender + bmi + disease + (1|patient),
data = dat
)
frailtySurv - I couldn't get this to work properly; it seemed to always fit the variance parameter with a flat value of 1 and provide coefficient estimates as if a no-frailty model had been fitted. Additionally, the documentation doesn't state which strings are supported for the frailty argument, so I couldn't work out how to get it to fit a log-normal
frailtySurv::fitfrail(
Surv(time, status) ~ age + gender + bmi + disease + cluster(patient),
dat = dat,
frailty = "gamma"
)
frailtyHL - Produces warning messages saying "did not converge"; it still produced coefficient estimates, but they were different from those in the textbook
mod_n <- frailtyHL::frailtyHL(
Surv(time, status) ~ age + gender + bmi + disease + (1|patient),
data = dat,
RandDist = "Normal"
)
mod_g <- frailtyHL::frailtyHL(
Surv(time, status) ~ age + gender + bmi + disease + (1|patient),
data = dat,
RandDist = "Gamma"
)
frailtypack - I simply don't understand the implementation (or at least it's very different from what is taught in the textbook). The function requires the specification of knots and a smoother, which seem to greatly impact the resulting estimates.
parfm - Only fits parametric models; having said that, every time I tried to use it to fit a Weibull proportional hazards model it just errored.
phmm - Have not yet tried
I fully appreciate, given the large number of packages that I've gone through unsuccessfully, that it is highly likely that the problem is me not properly understanding the implementation and misusing the packages. Any help or examples on how to successfully re-create the above estimates would be greatly appreciated.
Regarding
I am really struggling to find a package that can reliably re-create the results of the second 2 columns.
See the Survival Analysis CRAN task view under Random Effect Models or do a search on R Site Search on e.g., "survival frailty".