I am trying to plot a restricted cubic spline model using the rms package. However, I can't find a way to adjust my Cox proportional hazards model for confounders; I can only get the unadjusted fit.
Here is my code:
library(survival)
library(rms)
dd <- datadist(Cox9)
options(datadist="dd")
fit_vino <- cph(Surv(follcox,evento) ~ rcs(G_VINO,3))
plot(Predict(fit_vino), lty=1, lwd=3, ylim=c(-0.5,1.0),xlim = c(0,50), col="white")
With this coding I get the unadjusted spline model.
I would like to know how to add the confounding variables to adjust the model.
I tried:
fit_vino_adj <- cph(Surv(follcox,evento) ~rcs(G_VINO+edad0+actfis+energia))
plot(Predict(fit_vino_adj), lty=2, lwd=2)
But that gives me the spline model of each variable separately. Does anyone have an idea how I can adjust my model?
Since you did not include the data in Cox9, show how one might construct a similar data frame, or show any output, we can only guess at what happened and respond in generalities. It appears that you are bundling all the variables inside a single rcs call. That is unlikely to succeed, and if it does run, the results are likely to be incorrect. Instead, fit the model with the adjustment variables as separate terms, and then plot only the adjusted curve you are interested in by naming the variable of focus in the Predict call.
fit_vino_adj <- cph(Surv(follcox,evento) ~ rcs(G_VINO, 3)+edad0+actfis+energia)
plot(Predict(fit_vino_adj, name="G_VINO"), lty=2, lwd=2)
Or perhaps (assuming these are all continuous measurements) fit a spline for each variable and then make a very slightly modified plotting call:
fit_vino_adj2 <- cph(Surv(follcox,evento) ~ rcs(G_VINO, 3)+rcs(edad0, 3) +
rcs(actfis, 3) + rcs(energia, 3) )
plot(Predict(fit_vino_adj2), lty=2, lwd=2) # to see form of all variable fits.
If you want two or more rcs splines in the model, you need to wrap rcs around each variable separately. I did not think that the rcs function was like the ^ operator, which has a formula expansion method. (Although your claim that you got separate output from that second model makes me wonder whether I have completely kept up with that package.) If you wanted a complex surface, which I call "crossed splines", you would use the * operator between two rcs calls. Crossing with a factor variable will construct individual rcs-spline fits for each level of the factor.
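Since Cox9 is not available, here is a minimal sketch of the adjusted fit on simulated data. The data frame sim and its distributions are invented stand-ins; only the column names follow the question. It assumes rms is installed, and it passes the focus variable to Predict directly instead of through name=.

```r
library(rms)  # also attaches survival

set.seed(1)
n <- 300
# Hypothetical stand-in for Cox9: same column names, made-up values
sim <- data.frame(
  G_VINO  = runif(n, 0, 50),
  edad0   = rnorm(n, 60, 8),
  actfis  = rexp(n, 1 / 200),
  energia = rnorm(n, 2300, 400)
)
sim$follcox <- rexp(n, rate = 0.05 * exp(0.02 * (sim$edad0 - 60)))
sim$evento  <- rbinom(n, 1, 0.7)

dd <- datadist(sim)
options(datadist = "dd")

# Spline only on the exposure; confounders enter as separate linear terms
fit_adj <- cph(Surv(follcox, evento) ~ rcs(G_VINO, 3) + edad0 + actfis + energia,
               data = sim)

# Vary G_VINO only; the other covariates are held at their datadist
# reference values (medians for continuous variables)
p_adj <- Predict(fit_adj, G_VINO)
print(plot(p_adj))
```

The plotted curve is the log relative hazard as a function of G_VINO, adjusted for the other three covariates.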
Related
I have a dataframe which is a time series. I am using the function lm to build a multivariate regression model.
linearmodel <- lm(Y~X1+X2+X3, data = data)
I want to plot the residuals of this linearmodel on the y-axis and time on the x-axis using a simple function, with the lm() object as the input.
Standard residual plotting functions, like the one in the car package (car::residualPlot), give residuals on the Y-axis and fitted values on the X-axis.
Ideally, I need the residuals on the Y-axis and the timescale on the X-axis. But I understand that lm() is time-agnostic, so I can live with the residuals on the Y-axis in the same order as the data input and nothing on the X-axis.
Is there a plotting function that I can use by passing the linearmodel object into it (not something where I extract the residuals and use ggplot2)? For example, plot <- plotresidualsinorder(linearmodel) should give me the residuals on the Y-axis in the same order as the data input.
I want to use this plot in R-shiny ultimately.
My research led me to car package, which is wonderful in its own right, but doesn't have the function to solve my problem.
Many thanks in advance for the help.
You can build the residual plot yourself. Fit the model with lm, using a formula that describes Y by X1 + X2 + X3, and save it in the linearmodel variable. Then extract the residuals with the resid function and plot them against their position in the data, which is the input order. In your case, the following should work:
Proposed solution:
linearmodel <- lm(Y~X1+X2+X3, data = data)
lm_resid <- resid(linearmodel)
# x-axis is the row order of the input data, not a sum of predictors
plot(seq_along(lm_resid), lm_resid,
ylab="Residuals", xlab="Observation order",
main="Data")
abline(h = 0)
For help on how the resid function works, you can try:
help(resid)
Calisto's solution will work, but there is a simpler and more straightforward one. The object returned by lm already contains the regression residuals, so you can simply pass them to plot:
plot(XTime, linearmodel$residuals, main = "Residuals")
XTime is the date variable of your dataset; you may need to convert it with the POSIX functions: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/as.POSIX*
Add parameters as you need to share it on R-shiny.
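To get something closer to the single-call function the question asks for, the two answers above can be wrapped in a small helper. This is only a sketch: plot_residuals_in_order is a made-up name, not a function from car or any other package, and the data are simulated.

```r
# Hypothetical helper: takes an lm object and plots residuals in input order,
# or against a time vector if one is supplied
plot_residuals_in_order <- function(model, time = NULL, main = "Residuals") {
  res <- unname(residuals(model))
  x <- if (is.null(time)) seq_along(res) else time
  plot(x, res,
       xlab = if (is.null(time)) "Observation order" else "Time",
       ylab = "Residuals", main = main)
  abline(h = 0, lty = 2)
  invisible(data.frame(x = x, residual = res))  # returned for reuse
}

# Simulated example data (assumption: stands in for the asker's time series)
set.seed(42)
data <- data.frame(X1 = rnorm(50), X2 = rnorm(50), X3 = rnorm(50))
data$Y <- 1 + 2 * data$X1 - data$X2 + rnorm(50)

linearmodel <- lm(Y ~ X1 + X2 + X3, data = data)
out <- plot_residuals_in_order(linearmodel)
```

Because it uses base graphics, the call can go straight inside renderPlot() in a Shiny app.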
When using the lme/lmer function, I cannot get R to display all 4 diagnostic plots (res vs fit, normal-QQ, scale-location, res vs leverage) with par(mfrow=c(2,2)) and plot().
I just get the res vs fit plot and nothing else.
I have no problem when using the lm function.
Does anybody know how to do this?
library(lme4)
m0<-lmer(hematology~Treatment*day+Gender+(1|ID),data=long,na.action=na.omit,REML=FALSE)
par(mfrow=c(2,2))
plot(m0)
tl;dr ?plot.merMod explains in quite a bit of detail how the plotting methods work for fits produced by [g]lmer ...
You can get at least the first three plots corresponding to plot.lm fairly easily:
fitted vs residual with smooth line added
plot(lmer_model, type=c("p","smooth"), col.line=1)
(it's harder to get the smooth and the zero line drawn in different colours)
scale-location plot
plot(lmer_model,
sqrt(abs(resid(.)))~fitted(.),
type=c("p","smooth"), col.line=1)
Q-Q plot
lattice::qqmath(lmer_model)
residuals vs leverage
plot(lmer_model, rstudent(.) ~ hatvalues(.))
(the Cook's distances can be computed via cooks.distance() but superimposing the contours of CD={0.5,1} isn't so easy ...)
historical note
The design and implementation of lme4 diagnostic plot methods differ from plot.lm, which is the canonical example in base R. Why? I don't know for sure, but this approach is derived from the nlme package, which predates R; the earliest version I could find is this page from the Wayback Machine (1998), which links to a copy of the user's guide for version 1.2, dated February 1995; that's three months before the first source-code release of R (via ftp) in June 1995.
it uses lattice (derived from Trellis™ graphics) rather than base-R graphics
although it doesn't automatically construct e.g. scale-location plots, it is more flexible. You can use formulas to show fitted or residual values vs parameters, facet, etc., e.g. plot(fm1,residuals(.)~Days|Subject)
there are separate commands for plotting residuals etc. (plot) and Q-Q plots (qqnorm in nlme, qqmath in lme4)
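Because these are lattice plots, par(mfrow=c(2,2)) has no effect on them. One way to get a 2x2 panel is to store each plot and arrange them afterwards. The sketch below assumes the gridExtra package is installed (its grid.arrange function accepts trellis objects) and uses the sleepstudy data shipped with lme4 in place of the asker's data.

```r
library(lme4)
library(lattice)
library(gridExtra)  # assumption: used here only to arrange the lattice plots

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Each call returns a trellis object instead of drawing immediately
p_fit   <- plot(fm1, type = c("p", "smooth"), col.line = 1)
p_qq    <- qqmath(fm1)
p_scale <- plot(fm1, sqrt(abs(resid(.))) ~ fitted(.),
                type = c("p", "smooth"), col.line = 1)
p_lev   <- plot(fm1, rstudent(.) ~ hatvalues(.))

# Lay the four plots out in a 2x2 grid
grid.arrange(p_fit, p_qq, p_scale, p_lev, ncol = 2)
```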
I know that this is a 2-year-old question, but I was having the same issue (September 2022) and then I found the ggResidpanel package (Panel of Diagnostic Residual Plots) and redres. With a fitted model mod1:
library(ggResidpanel)
resid_panel(mod1, smoother = TRUE, qqbands = TRUE)
As long as we're adding answers, the performance package is now available:
library(lme4)
library(performance)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
check_model(fm1)
In R, plot is a generic function. This means that when you call plot, R examines the class of the object you have passed to the first argument and chooses the plotting method according to this class.
Let's take an example. Suppose I use the lm function to create a model. The resulting model object will have class "lm":
lm_model <- lm(Sepal.Length ~ Sepal.Width, data = iris)
class(lm_model)
#> [1] "lm"
That means that when I call plot(lm_model), R will see that I am calling plot on an object of class lm. Instead of trying to construct a basic xy plot as it would if I did plot(1:10), R now knows to call a plotting method that has been specifically written to plot objects of type "lm". In this case, it will dispatch the method stats:::plot.lm, which is a long function that takes the "lm" object and creates the 4 diagnostic plots.
Now let's see what we get when we create a model with lmer:
library(lme4)
lmer_model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
class(lmer_model)
#> [1] "lmerMod"
#> attr(,"package")
#> [1] "lme4"
Our model is an object of type "lmerMod". When we call plot on this object, R looks up the correct method to plot an object of this class. Since it has a completely different structure from an object of class "lm", it wouldn't make sense to plot it with plot.lm, so the authors who created the lme4 package had to decide what the best way to plot an object of class "lmerMod" was. They wrote the method lme4:::plot.merMod, which draws the single plot you see when you call plot on your model.
Why is this? That's one for the authors to answer, but it seems the main reason is that they wanted a plot method that would cover GLMM, LMM and REML models. The diagnostic plots for lm don't make sense for all of these model types.
So the short answer is that there is no problem to "solve" as such; this is just not how "lmerMod" objects are plotted. If you have specific concerns about some aspects of your fit that can be answered by these diagnostic plots, you should examine these individually.
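The dispatch mechanism described above can be seen directly in base R. In this small sketch, describe and its methods are hypothetical names invented for illustration; getS3method is the base function that looks up which method a generic would use for a class.

```r
lm_model <- lm(Sepal.Length ~ Sepal.Width, data = iris)

# Look up the method that plot() dispatches to for class "lm"
plot_method <- getS3method("plot", class(lm_model)[1])

# A toy generic with a default method and an "lm"-specific method
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "no specific method; using the default"
describe.lm <- function(x, ...) "method written for class 'lm'"

lm_result      <- describe(lm_model)  # dispatches to describe.lm
default_result <- describe(1:10)      # falls back to describe.default
```

A package author does the same thing when writing plot.merMod: the generic stays the same, and only the method chosen for the class changes.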
I know how to calculate risk difference with a 2x2 table, but I have no idea how to do this with a regression model, even though it is a quite widely used method when you need to adjust variables in question.
In case I'm not making any sense, here's an article that discusses proper ways to calculate risk difference, but unfortunately it doesn't contain any code: https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-016-0217-0
If I have understood your question, I think the book Regression Modeling Strategies may be what you are looking for. For example:
# Load the rms Package by FE Harrell et al., remember to install the package first..
library(rms)
# Create a fit object using some dummy data in the package
fit <- npsurv(Surv(time, status) ~ x, data = aml)
# Then you can plot a Kaplan-Meier survival curve.
plot(fit)
# Then plot the 'Risk difference' for your data, with 95% confidence limits
survdiffplot(fit, xlim = c(0,60))
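For a binary outcome, a different route to an adjusted risk difference is marginal standardization: fit a logistic regression, predict everyone's risk with the exposure set to 1 and then to 0, and take the difference of the average predicted risks. This is a technique swapped in here for illustration, not the method of the answer above; the data and variable names (exposed, age, outcome) are simulated and hypothetical, and no confidence interval is computed (a bootstrap would be one option).

```r
set.seed(7)
n <- 2000
# Simulated cohort: binary exposure, one continuous confounder
d <- data.frame(age = rnorm(n, 50, 10), exposed = rbinom(n, 1, 0.5))
d$outcome <- rbinom(n, 1, plogis(-2 + 0.8 * d$exposed + 0.03 * (d$age - 50)))

# Adjusted logistic regression
fit <- glm(outcome ~ exposed + age, family = binomial, data = d)

# Predicted risk for every subject under each exposure level
risk_exposed   <- mean(predict(fit, newdata = transform(d, exposed = 1),
                               type = "response"))
risk_unexposed <- mean(predict(fit, newdata = transform(d, exposed = 0),
                               type = "response"))

# Adjusted (standardized) risk difference
risk_difference <- risk_exposed - risk_unexposed
risk_difference
```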
I have some data and I plotted them using R.
After that, I drew a loess curve through the data.
Here is the code:
data <- read.table("D:/data.csv", header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
ur <- subset(data, select = c(users,responseTime))
ur <- ur[with(ur, order(users, responseTime)), ]
plot(ur, xlab="Users", ylab="Response Time (ms)")
lines(ur)
loess_fit <- loess(responseTime ~ users, ur)
lines(ur$users, predict(loess_fit), col = "blue")
Here's my plot's image:
How can I get the function of this regression?
For example: responseTime = 68 + 45 * users.
Thanks.
You can use the loess_fit object from your code to predict the response time. If you want to estimate the average response time for 230 users, you could do:
predict(loess_fit, newdata=data.frame(users=230))
Here is an interesting blog post on this subject.
EDIT: If you want to make predictions for values outside your data, you need a theory or further assumptions. The simplest assumption would be a linear fit,
lm_fit <- lm(responseTime ~ users, data=ur)
predict(lm_fit, newdata=data.frame(users=400))
However, your data may show heteroscedasticity (non-constant variance) and non-normal residuals. You might want to check whether that is the case. If it is, then a robust linear fitting procedure such as rlm from the package MASS, or a generalized linear model (glm), might be worth a try. I am not an expert on that; maybe someone else here or at Cross Validated can provide better help.
The loess.demo function in the TeachingDemos package shows the logic underlying the loess fit. This can help you understand what is going on and why there is not a simple prediction function. However, for predicting, there is a predict function that works with loess fits to create the prediction. You can also find the linear equation that will predict for a specific value of x (but it will be different for each value of x you may want to predict for).
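The difference between the two kinds of fit can be sketched on simulated data (the numbers below are made up and only borrow the column names from the question). With default settings, predict on a loess fit returns NA outside the range of the data, while lm yields coefficients that can be written as an equation.

```r
set.seed(3)
# Simulated stand-in for the asker's data
ur <- data.frame(users = 1:300)
ur$responseTime <- 68 + 45 * ur$users + rnorm(300, sd = 500)

loess_fit <- loess(responseTime ~ users, ur)

# Inside the observed range: a smoothed prediction
inside <- predict(loess_fit, newdata = data.frame(users = 230))

# Outside the observed range: NA, because loess does not extrapolate by default
outside <- predict(loess_fit, newdata = data.frame(users = 400))

# A linear model does give an explicit equation:
# responseTime = intercept + slope * users
lm_fit <- lm(responseTime ~ users, data = ur)
coef(lm_fit)
```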
I am running some tests to try to determine what distribution my data follow. From the look of the density of my data, I thought it resembled a logistic distribution. I then used the package MASS to estimate the parameters of the distribution. However, when I graph them together, the logistic fit is better than the normal but still not very good. Is there a way to find which distribution would fit better? Thank you for the help!
library(quantmod)
getSymbols("^NDX",src="yahoo", from='1997-6-01', to='2012-6-01')
daily<- allReturns(NDX) [,c('daily')]
dailySerieTemporel<-ts(data=daily)
x<-na.omit(dailySerieTemporel)
library(MASS)
(xFit<-fitdistr(x,"logistic"))
# location scale
# 0.0005210570 0.0106366354
# (0.0002941922) (0.0001444678)
xFitEst<-coef(xFit)
plot(density(x))
set.seed(125)
lines(density(rlogis(length(x), xFitEst['location'], xFitEst['scale'])), col=3)
lines(density(rnorm(length(x), mean(x), sd(x))), col=2)
This is elementary R: plot() creates a new plotting canvas by default, and you should use a command such as lines() to add to an existing plot.
This works for your example:
plot(density(x))
lines(density(rlogis(length(x), location = 0.0005210570,
scale = 0.0106366354)), col="blue")
as it adds the estimated logistic fit in blue to your existing plot.
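On the question of which distribution fits better, one rough approach is to fit several candidates with fitdistr and compare them by AIC, and to overlay the fitted density itself (dlogis, dnorm) instead of the density of random draws. The sketch below uses simulated heavy-tailed data as a hypothetical stand-in for the NDX returns, since the download may not be reproducible; a lower AIC suggests a better fit among the candidates, not that the distribution is correct.

```r
library(MASS)

set.seed(125)
# Hypothetical stand-in for daily returns: scaled t-distributed draws
returns <- 0.01 * rt(3000, df = 4)

fit_norm  <- fitdistr(returns, "normal")
fit_logis <- fitdistr(returns, "logistic")

# fitdistr objects have a logLik method, so AIC() can compare them
aic_table <- AIC(fit_norm, fit_logis)
aic_table

# Overlay the fitted densities on the empirical density
est_n <- coef(fit_norm)
est_l <- coef(fit_logis)
plot(density(returns), main = "Empirical vs fitted densities")
curve(dnorm(x, est_n["mean"], est_n["sd"]), add = TRUE, col = "red")
curve(dlogis(x, est_l["location"], est_l["scale"]), add = TRUE, col = "blue")
```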