ggsurvplot - axes crossing at 0,0 - r

Survminer produces nice plots, but is there a way to further change the outcome with regular ggplot-commands?
What I try to do is make the y-axis start in the origin, as stated here.
For a regular ggplot, this works perfectly, but I can't make it work with survminer:
library(survival)
library(survminer)
df<-genfan
df$treat<-sample(c(0,1),nrow(df),replace=TRUE)
fit <- survfit(Surv(hours, status) ~ treat,
data = df)
p<-ggsurvplot(fit, data = df, risk.table = TRUE, ncensor.plot=FALSE,tables.theme = theme_cleantable(),
ggtheme = theme_survminer(font.legend=c(12,"bold","black") ))
p%+%scale_x_continuous(expand=c(0,0))%+%scale_y_continuous(expand=c(0,0))
This produces an error
"Scale for 'y' is already present. Adding another scale for 'y', which will replace the existing scale."
and an additional error
"Error: Discrete value supplied to continuous scale"
Is there some way around this?

A possible solution to your problem is to modify ggsurvplot and ggrisktable, inserting expand=c(0,0) in the proper place.
At this link you can find my proposal.
Click on "download" and save the file a_modified_ggsurvplot.txt in your working directory.
Then run the following code:
library(survival)
library(survminer)
df <- genfan
df$treat <- sample(c(0,1),nrow(df),replace=TRUE)
fit <- survfit(Surv(hours, status) ~ treat, data = df)
source("a_modified_ggsurvplot.txt")
p <- myggsurvplot(fit, data = df, risk.table = TRUE,
ncensor.plot=FALSE,tables.theme = theme_cleantable(),
ggtheme = theme_survminer(font.legend=c(12,"bold","black")))
print(p)
Here is the plot:
Hope it could help you.
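If you would rather not source a modified copy of the function, here is a minimal sketch of another route. It relies on the fact that ggsurvplot() returns a list whose plot element is an ordinary ggplot object, so the scales are added to that element only and never reach the risk table, whose y-axis is discrete. The set.seed() call and the shortened argument list are my additions for illustration; I have not checked the result against every survminer version.

```r
library(survival)
library(survminer)

set.seed(1)  # added so the random treatment assignment is reproducible
df <- genfan
df$treat <- sample(c(0, 1), nrow(df), replace = TRUE)
fit <- survfit(Surv(hours, status) ~ treat, data = df)

p <- ggsurvplot(fit, data = df, risk.table = TRUE,
                tables.theme = theme_cleantable())

# Modify only the survival curve; p$table (the risk table) is left alone.
# ggplot2 will print a message that the existing scales are being replaced.
p$plot <- p$plot +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))

print(p)
```

Replacing the scales also discards any breaks or labels that ggsurvplot() had set on them, so those may need to be specified again. Recent survminer versions also document an axes.offset argument for ggsurvplot(); check ?ggsurvplot to see whether your installed version has it.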

Related

Why aren't any points showing up in the qqcomp function when using plotstyle="ggplot"?

I want to compare the fit of different distributions to my data in a single plot. The qqcomp function from the fitdistrplus package pretty much does exactly what I want to do. The only problem I have however, is that it's mostly written using base R plot and all my other plots are written in ggplot2. I basically just want to customize the qqcomp plots to look like they have been made in ggplot2.
From the documentation (https://www.rdocumentation.org/packages/fitdistrplus/versions/1.0-14/topics/graphcomp) I get that this is totally possible by setting plotstyle="ggplot". If I do this however, no points are showing up on the plot, even though it worked perfectly without the plotstyle argument. Here is a little example to visualize my problem:
library(fitdistrplus)
library(ggplot2)
set.seed(42)
vec <- rgamma(100, shape=2)
fit.norm <- fitdist(vec, "norm")
fit.gamma <- fitdist(vec, "gamma")
fit.weibull <- fitdist(vec, "weibull")
model.list <- list(fit.norm, fit.gamma, fit.weibull)
qqcomp(model.list)
This gives the following output:
While this:
qqcomp(model.list, plotstyle="ggplot")
gives the following output:
Why are the points not showing up? Am I doing something wrong here or is this a bug?
EDIT:
So I haven't figured out why this doesn't work, but there is a pretty easy workaround. The function call qqcomp(model.list, plotstyle="ggplot") still returns a ggplot object, which includes the data used to make the plot. Using that data, one can easily write a custom plot function that does exactly what one wants. It's not very elegant, but until someone finds out why it's not working as expected I will just use this method.
I was able to reproduce your error and it is indeed intriguing. You may want to contact the developers of this package to report this bug.
Otherwise, you can reproduce this Q-Q plot with ggplot2 and stat_qq by passing the corresponding quantile function and the associated parameters (stored in $estimate):
library(ggplot2)
df = data.frame(vec)
ggplot(df, aes(sample = vec))+
stat_qq(distribution = qgamma, dparams = as.list(fit.gamma$estimate), color = "green")+
stat_qq(distribution = qnorm, dparams = as.list(fit.norm$estimate), color = "red")+
stat_qq(distribution = qweibull, dparams = as.list(fit.weibull$estimate), color = "blue")+
geom_abline(slope = 1, color = "black")+
labs(title = "Q-Q Plots", x = "Theoretical quantiles", y = "Empirical quantiles")
Hope it will help you.
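The "write your own plot" workaround mentioned in the question's edit can also be sketched without touching the object returned by qqcomp. The version below computes the theoretical quantiles directly from each fit and draws them with geom_point. The names fits, probs and qq_data are mine, and it assumes each fitdist object exposes distname and estimate as documented in fitdistrplus.

```r
library(fitdistrplus)
library(ggplot2)

set.seed(42)
vec <- rgamma(100, shape = 2)
fits <- list(norm    = fitdist(vec, "norm"),
             gamma   = fitdist(vec, "gamma"),
             weibull = fitdist(vec, "weibull"))

# Plotting positions and the sorted sample (the empirical quantiles)
probs <- ppoints(length(vec))
empirical <- sort(vec)

# One block of rows per fitted distribution
qq_data <- do.call(rbind, lapply(names(fits), function(nm) {
  # Look up the quantile function, e.g. "gamma" -> qgamma
  qfun <- get(paste0("q", fits[[nm]]$distname))
  # Call it with the fitted parameters, e.g. qgamma(probs, shape = ., rate = .)
  theoretical <- do.call(qfun, c(list(probs), as.list(fits[[nm]]$estimate)))
  data.frame(distribution = nm, theoretical = theoretical,
             empirical = empirical)
}))

ggplot(qq_data, aes(theoretical, empirical, colour = distribution)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0) +
  labs(x = "Theoretical quantiles", y = "Empirical quantiles")
```

Because everything ends up in one data frame, the usual ggplot2 tools (facets, themes, manual colour scales) apply directly.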

Cannot plot all data in box-plotting in R

I wanted to make a box plot. I have more than 1000 rows but when I am plotting them, it shows only a few entries.
Dataset:
https://www.dropbox.com/s/tgaqfgm2gkl7i3r/maintenance_data_updated.csv
#Start of Box plot Temperature
training_data <- read.csv("C:/Users/akhan/Documents/maintenance_data_updated_2.csv", stringsAsFactors = TRUE)
library(dplyr)
dt_temperature <- select(training_data, Runtime, Defect, Machine, Temperature, Plant)
dt_temperature$Machine_Plant = paste(dt_temperature$Machine,dt_temperature$Plant,sep = "_")
attach(dt_temperature)
class(Temperature)
class(Defect)
class(Runtime)
class(Machine)
?boxplot
boxplot(Temperature ~ Machine_Plant)
Current output: https://www.dropbox.com/s/7nv5n80en1vpkyt/Rplot01.png
Can anyone please give a hint about the solution?
What do you mean by 'it shows only a few entries'? If the problem is that only 4 boxplots are labelled on the x-axis, the solution could look like this:
boxplot(Temperature ~ Machine_Plant, las=3)
Type
?par
for more information about las parameter.
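As a rough illustration of why las helps, here is a small sketch with made-up data (the groups and temp vectors are hypothetical, not the poster's file). With horizontal labels, base R silently drops labels that would overlap; rotating them and enlarging the bottom margin leaves room for all of them.

```r
set.seed(1)
# Hypothetical data: 12 machine/plant combinations, 20 readings each
groups <- paste0("M", rep(1:12, each = 20), "_Plant", rep(1:3, each = 80))
temp <- rnorm(240, mean = 70, sd = 5)

old <- par(mar = c(8, 4, 2, 1))   # larger bottom margin for rotated labels
b <- boxplot(temp ~ groups, las = 2, cex.axis = 0.7, xlab = "",
             ylab = "Temperature")
par(old)                          # restore the previous margins
```

Here las = 2 draws the labels perpendicular to the axis, and cex.axis shrinks them slightly.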

object 'count' not found when using bymedian sorting method in R?

I'm pretty new to R so this is probably a simple question for the community.
I have a .csv file with 10 variables and 79 observations. I can plot a box plot of the Vcmax data grouped by treatment but when I try to organise the box plot by median (either ascending or descending) I hit a snag.
I use this code:
library(graphics)
bymedian <- with(Results2, reorder(Vcmax, -count, median))
boxplot(Vcmax ~ treatment, data=Results2, ylim=c(70,300), axes=TRUE, cex.axis=0.4,
xlab = "Species", ylab = "Vcmax",
main = "Species-Specific Variation in Vcmax", varwidth = TRUE,
col = "lightgray")
However R returns the error message:
Error in tapply(X = X, INDEX = x, FUN = FUN, ...) : object 'count'
not found
Is there something else I need to do to my data to help R recognise which data to sort and how to sort it? I'm sure it is something obvious...
It produces a really nice graph but not sorted.
Here is (hopefully) a link where you can download the data: http://expirebox.com/download/4258dc66916cdc054486f4f1bb7cfb55.html
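A possible fix, sketched on made-up data because I have not downloaded the file: the error comes from count not being a column of Results2, and the reordered factor is never passed to boxplot anyway. reorder() expects the grouping variable first and the values to summarise second. The data frame below is hypothetical and only reuses the column names from the question.

```r
set.seed(1)
# Hypothetical stand-in for Results2: four treatments, 20 observations each
Results2 <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 20)),
  Vcmax = rnorm(80, mean = rep(c(150, 220, 100, 180), each = 20), sd = 15)
)

# Reorder the treatment levels by the median Vcmax in each group
Results2$bymedian <- with(Results2, reorder(treatment, Vcmax, FUN = median))

# Use the reordered factor, not the original one, in the formula
boxplot(Vcmax ~ bymedian, data = Results2,
        xlab = "Species", ylab = "Vcmax",
        main = "Species-Specific Variation in Vcmax",
        varwidth = TRUE, col = "lightgray")
```

For descending order, reorder(treatment, -Vcmax, FUN = median) should work, since negating the values reverses the ranking of the medians.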

How to set x limits on varImpPlot

How can I change the x limits of a plot produced by varImpPlot from the randomForest package?
If I try
library(randomForest)
set.seed(4543)
data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data=mtcars, ntree=1000, keep.forest=FALSE,
importance=TRUE)
varImpPlot(mtcars.rf, scale=FALSE, type=1, xlim=c(0,15))
I get the following error:
Error in dotchart(imp[ord, i], xlab = colnames(imp)[i], ylab = "", main = if (nmeas == : formal argument "xlim" matched by multiple actual arguments
This is because varImpPlot defines its own x limits, I think, but how could I get around this if I wanted to set the x limits myself (perhaps for consistency across plots)?
First I extracted the values using importance() (thanks to the suggestion from @dww)
impToPlot <- importance(mtcars.rf, scale=FALSE)
Then I plotted them using dotchart(), which allowed me to manually set the x limits (and any other plot features I'd like)
dotchart(sort(impToPlot[,1]), xlim=c(0,15), xlab="%IncMSE")

Visualize data using histogram in R

I am trying to visualize some data and in order to do it I am using R's hist.
Below are my data
jancoefabs <- as.numeric(as.vector(abs(Janmodelnorm$coef)))
jancoefabs
[1] 1.165610e+00 1.277929e-01 4.349831e-01 3.602961e-01 7.189458e+00
[6] 1.856908e-04 1.352052e-05 4.811291e-05 1.055744e-02 2.756525e-04
[11] 2.202706e-01 4.199914e-02 4.684091e-02 8.634340e-01 2.479175e-02
[16] 2.409628e-01 5.459076e-03 9.892580e-03 5.378456e-02
Now as the more cunning of you might have guessed these are the absolute values of some model's coefficients.
What I need is a histogram that will have the following axes:
x will be the number (count or length) of coefficients which is 19 in total, along with their names.
y will show values of each column (as breaks?) having a ylim="" set, according to min and max of those values (or something similar).
Note that Janmodelnorm$coef simply produces the following
(Intercept) LON LAT ME RAT
1.165610e+00 -1.277929e-01 -4.349831e-01 -3.602961e-01 -7.189458e+00
DS DSA DSI DRNS DREW
-1.856908e-04 1.352052e-05 4.811291e-05 -1.055744e-02 -2.756525e-04
ASPNS ASPEW SI CUR W_180_270
-2.202706e-01 -4.199914e-02 4.684091e-02 -8.634340e-01 -2.479175e-02
W_0_360 W_90_180 W_0_180 NDVI
2.409628e-01 5.459076e-03 -9.892580e-03 -5.378456e-02
So far, consulting ?hist, I have been trying to play with the code below without success, so I am starting again from scratch.
# hist(jancoefabs, col="lightblue", border="pink",
# breaks=8,
# xlim=c(0,10), ylim=c(20,-20), plot=TRUE)
When plot=FALSE is set, I get a bunch of somewhat useful info about the set. I also find it hard to use the breaks argument efficiently.
Any suggestion will be appreciated. Thanks.
Rather than using hist, why not use a barplot or a standard plot? For example,
## Generate some data
set.seed(1)
y = rnorm(19, sd=5)
names(y) = c("Inter", LETTERS[1:18])
Then plot the coefficients
barplot(y)
Alternatively, you could use a scatter plot
plot(1:19, y, axes=FALSE, ylim=c(-10, 10))
axis(2)
axis(1, 1:19, names(y))
and add error bars to indicate the standard errors (see for example Add error bars to show standard deviation on a plot in R)
Are you sure you want a histogram for this? A lattice barchart might be pretty nice. An example with the mtcars built-in data set.
coef <- lm(mpg ~ ., data = mtcars)$coef
library(lattice)
barchart(coef, col = 'lightblue', horizontal = FALSE,
ylim = range(coef), xlab = '',
scales = list(y = list(labels = coef),
x = list(labels = names(coef))))
A base R dotchart might be good too,
dotchart(coef, pch = 19, xlab = 'value')
text(coef, seq(coef), labels = round(coef, 3), pos = 2)
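For the coefficients in the question, a sketch of the barplot idea could look like the following. The values are copied from the Janmodelnorm$coef output shown above. The log-scaled y-axis is my own suggestion, since the absolute values span roughly six orders of magnitude and the small ones would otherwise be invisible.

```r
# Coefficients as printed in the question (Janmodelnorm$coef)
jancoef <- c("(Intercept)" = 1.165610e+00, LON = -1.277929e-01,
             LAT = -4.349831e-01, ME = -3.602961e-01, RAT = -7.189458e+00,
             DS = -1.856908e-04, DSA = 1.352052e-05, DSI = 4.811291e-05,
             DRNS = -1.055744e-02, DREW = -2.756525e-04,
             ASPNS = -2.202706e-01, ASPEW = -4.199914e-02,
             SI = 4.684091e-02, CUR = -8.634340e-01,
             W_180_270 = -2.479175e-02, W_0_360 = 2.409628e-01,
             W_90_180 = 5.459076e-03, W_0_180 = -9.892580e-03,
             NDVI = -5.378456e-02)
jancoefabs <- abs(jancoef)

old <- par(mar = c(7, 4, 2, 1))   # room for the rotated coefficient names
bp <- barplot(jancoefabs, log = "y", las = 2, cex.names = 0.7,
              col = "lightblue", ylab = "|coefficient| (log scale)")
par(old)
```

Each bar is labelled with its coefficient name, which gives the "19 coefficients along x, their values along y" layout described in the question.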
