I'm getting a half-plotted graph when trying to plot a dose-response curve fitted with the drc package using ggplot. I'm following the procedure given in the supplementary information of a recent paper on dose-response curves.
Here is my raw data
Please help me to find solution, thanks!
# Fit a log-logistic model with lower and upper asymptotic limits fixed at respectively 0 and 100
mod.Pyr <- drm(gi ~ conc, data = my_data, fct = LL.4(fixed = c(NA, 0, 100, NA)))
# new dose levels as support for the line
newdata <- expand.grid(conc=exp(seq(log(0.5), log(3000), length=1000)))
# predictions and confidence intervals
pm <- predict(mod.Pyr, newdata=newdata, interval="confidence")
# new data with predictions
newdata$p <- pm[,1]
newdata$pmin <- pm[,2]
newdata$pmax <- pm[,3]
# need to shift conc == 0 a bit up, otherwise there are problems with coord_trans
my_data$conc0 <- my_data$conc
my_data$conc0[my_data$conc0 == 0] <- 0.5
# plotting the curve
ggplot(my_data, aes(x = conc0, y = gi)) +
geom_point() +
geom_ribbon(data=newdata, aes(x=conc, y=p, ymin=pmin, ymax=pmax), alpha=0.2) +
geom_line(data=newdata, aes(x=conc, y=p)) +
coord_trans(x="log") +
xlab("Concentration (mg/l)") + ylab("Growth inhibition")
Your data contain conc values far above 100, but you only created newdata for values up to log(100), so the curve cannot be drawn out to 3000. Increase log(100) in expand.grid to a higher value, for example log(3000):
newdata <- expand.grid(conc=exp(seq(log(0.5), log(3000), length=100)))
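To avoid hard-coding the upper limit, the grid can also be derived from the data. This is a minimal sketch, assuming a hypothetical vector conc that stands in for my_data$conc and reusing 0.5 as the substitute for zero from the question:

```r
# Hypothetical concentrations standing in for my_data$conc
conc <- c(0, 1, 10, 100, 1000, 3000)

# Lower limit: the same substitute for zero used in the question
lower <- 0.5

# Log-spaced grid that always reaches the largest observed concentration
newdata <- data.frame(
  conc = exp(seq(log(lower), log(max(conc)), length.out = 100))
)

range(newdata$conc)
```

With the grid tied to max(conc), the predicted line and ribbon should span the same x range as the points.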
I would like to ask you, please, how to create from the table two statistical graphs:
regression line with prediction interval
regression line with confidence interval
I used this script, but I don't know what to do next:
pred <- lm(dta$Number.of.species ~ dta$Latitude)
pred_interval <- predict(lm(dta$Number.of.species ~ dta$Latitude), level = .99, interval = "confidence")[,2]
conf_interval <- predict(pred, newdata=dta, interval="prediction")[,3]
par(mfrow=c(2,2))
plot(
dta$Latitude,
dta$Number.of.species,
pch = 1,
ylim = c(0, 180),
xlim = c(37, 40)
)
plot(
dta$Latitude,
dta$Number.of.species,
pch = 1,
ylim = c(0, 180),
xlim = c(37, 40)
)
abline(pred)
Thank you for your time.
If you are just learning R, I would make 2 recommendations.
First, I would suggest learning the ggplot2 package, rather than using the base R plotting system. It is generally much easier to build up complex plots with many parts using ggplot().
Second, there are several packages designed to make working with model results easier in R. The most prominent of these are broom and the easystats collection of packages (modelbased, performance, parameters, etc.). Between the two, I would recommend easystats.
I'll demonstrate how to build up the data frame for plotting the model manually and using modelbased.
Manually building data frame
library(ggplot2)
# fit the model
m <- lm(mpg ~ disp, data = mtcars)
# construct prediction and confidence intervals using predict()
m_ci <- predict(m, interval = "confidence") |>
as.data.frame() |>
setNames(c("fit", "ci_lo", "ci_hi"))
m_pi <- predict(m, interval = "prediction") |>
as.data.frame() |>
setNames(c("fit", "pi_lo", "pi_hi"))
#> Warning in predict.lm(m, interval = "prediction"): predictions on current data refer to _future_ responses
# merge the interval data frames with the data frame used in the model
m_data <-
merge(
merge(
model.frame(m), m_ci, by = "row.names"
),
m_pi
)
# make a plot using the merged model data frame
ggplot(m_data) + # use m_data in the plot
aes(x = disp) + # put the 'disp' variable on the x axis
geom_point(aes(y = mpg)) + # add points, put the 'mpg' variable on the y axis for these
geom_ribbon(aes(ymin = pi_lo, ymax = pi_hi), fill = "lightblue", alpha = .4) + # add a ribbon for the prediction interval, put the pi_lo/pi_hi values on the y axis for this, color it lightblue and make it semitransparent
geom_ribbon(aes(ymin = ci_lo, ymax = ci_hi), fill = "lightblue", alpha = .4) + # add a ribbon for the confidence interval, put the ci_lo/ci_hi values on the y axis for this, color it lightblue and make it semitransparent
geom_line(aes(y = fit)) + # add a line for the fitted values, put the 'fit' values on the y axis
theme_minimal() # use a white background for the plot
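An alternative sketch for the merging step: predict() without newdata returns one row per observation, in the same order as model.frame(m), so the interval columns can be bound on directly. This is a variant of the approach above rather than the original code, and it avoids a merge keyed on fit, which may produce extra rows when several cars share the same disp. The names ci_mat and pi_mat are introduced here for illustration.

```r
# Same model as above
m <- lm(mpg ~ disp, data = mtcars)

# Matrices with columns "fit", "lwr", "upr", one row per observation
ci_mat <- predict(m, interval = "confidence")
pi_mat <- suppressWarnings(predict(m, interval = "prediction"))

# Bind the interval columns onto the model frame, row for row
m_data <- cbind(
  model.frame(m),
  fit   = ci_mat[, "fit"],
  ci_lo = ci_mat[, "lwr"], ci_hi = ci_mat[, "upr"],
  pi_lo = pi_mat[, "lwr"], pi_hi = pi_mat[, "upr"]
)

head(m_data)
```

The resulting m_data has the same column names as in the ggplot() call above, so it can be passed to that call unchanged.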
Using the modelbased package to streamline some of the above steps
library(modelbased)
# compute intervals, including fitted values and original model matrix
ci <- estimate_expectation(m) # model fitted values and confidence intervals (uncertainty intervals on the expected values/predicted means)
pi <- estimate_prediction(m) # model fitted values and prediction intervals (uncertainty intervals on the individual predictions)
plot(ci) + # this produces a ggplot with points, fitted line, and confidence ribbon
geom_ribbon(aes(x = disp, ymin = CI_low, ymax = CI_high), data = pi, alpha = .4) + # add a prediction ribbon
theme_minimal() # use a white background
Here is how to modify the color of the ribbon when working with modelbased:
plot(ci, ribbon = list(fill = "lightblue")) +
geom_ribbon(aes(x = disp, ymin = CI_low, ymax = CI_high), data = pi, fill = "lightblue", alpha = .4) +
theme_minimal()
Created on 2021-08-18 by the reprex package (v2.0.0)
GauPro is an R library for fitting Gaussian processes. You can also get it to produce a nice predicted curve for you.
The documentation for GauPro uses built-in R plotting functions to do plots like this:
gp <- GauPro(x,y) ## fit a gaussian process model to x & y
plot(x,y) ## plots the x,y points
curve(gp$predict(x), add=T, col=2) ## adds the predicted curve from the gaussian process
What would be the equivalent using ggplot? I can get the points to show up, but I can't quite figure out how to add the curve.
GauPro documentation I refer to is here
We can do this by building a little data frame of predictions. Let's start by loading the necessary packages and creating some sample data:
library(GauPro)
library(ggplot2)
set.seed(69)
x <- 1:10
y <- cumsum(runif(10))
Now we can create our model and plot it using the same plotting functions shown in the vignette you linked:
gp <- GauPro(x, y)
plot(x, y)
curve(gp$predict(x), add = TRUE, col = 2)
Now if we want to customize this plot using ggplot, we need a data frame with columns for the x values at which we wish to predict, the y prediction at that point, and a column each for upper and lower 95% confidence intervals. We can obtain the x values like this:
new_x <- seq(min(x), max(x), length.out = 100)
and we can get the three sets of corresponding y values using predict like this:
predict_df <- predict(gp, new_x, se.fit = TRUE)
predict_df$x <- new_x
predict_df$y <- predict_df$mean
predict_df$lower <- predict_df$y - 1.96 * predict_df$se
predict_df$upper <- predict_df$y + 1.96 * predict_df$se
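The multiplier 1.96 in the code above is the 97.5% quantile of the standard normal distribution, which makes the band an approximate 95% interval. Here is a small sketch of the same calculation for an arbitrary level, using made-up mean and se values rather than GauPro output:

```r
# Normal quantile for a two-sided interval at the chosen level
level <- 0.95
z <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% band

# Hypothetical predictions and standard errors (not produced by GauPro)
pred <- data.frame(mean = c(1.0, 2.5, 3.2), se = c(0.2, 0.1, 0.3))
pred$lower <- pred$mean - z * pred$se
pred$upper <- pred$mean + z * pred$se

pred
```

Changing level to, say, 0.99 widens the band without touching the rest of the code.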
The predict_df data frame is now quite straightforward to plot in ggplot, with themes customized as you choose:
ggplot(data.frame(x, y), aes(x, y)) +
geom_point() +
geom_line(data = predict_df, color = "deepskyblue4", linetype = 2) +
geom_ribbon(data = predict_df, aes(ymin = lower, ymax = upper),
alpha = 0.2, fill = "deepskyblue4") +
theme_minimal()
Created on 2020-07-29 by the reprex package (v0.3.0)
I've some data for fitting crude and adjusted logit GAMs:
library(mgcv)
## Simulate some data...
set.seed(3);n<-400
dat <- gamSim(1,n=n)
mu <- binomial()$linkinv(dat$f/4-2)
phi <- .5
a <- mu*phi;b <- phi - a;
dat$y <- rbeta(n,a,b)
## Fitting GAMs
crude <- gam(y~s(x0),family=binomial(link="logit"),data=dat)
adj <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=binomial(link="logit"),data=dat)
Now I would like to find the value of x0 at which the odds ratio (OR) equals 1.00 (i.e. probability 0.50). For this purpose I use visreg with the argument plot = FALSE.
## Prepare data for ggplotting
library(visreg)
p.crude <- visreg(crude, "x0", plot = FALSE)
p.adj <- visreg(adj, "x0", plot = FALSE)
library(dplyr)
bind_rows(
mutate(p.crude$fit, Model = "crude"),
mutate(p.adj$fit, Model = "adj")
) -> fits
OK. Next I compute the OR from the log OR. Is the following code correct?
# Compute ORs and CI from LogOR
fits$or <- exp(fits$visregFit)
fits$ci.low <- exp(fits$visregLwr)
fits$ci.up <- exp(fits$visregUpr)
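As a quick check of the back-transformation, here is a sketch with made-up values on the logit scale (not visreg output). exp() turns a log-odds value into odds, plogis() turns it into a probability, and odds of 1 correspond to a probability of 0.5:

```r
# Hypothetical values on the logit (log-odds) scale
log_odds <- c(-2, 0, 1.5)

odds <- exp(log_odds)      # odds = p / (1 - p)
prob <- plogis(log_odds)   # p = 1 / (1 + exp(-log_odds))

data.frame(log_odds, odds, prob)
```

Because exp() is monotone, applying it to the lower and upper limits separately, as in the code above, keeps them in the right order.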
Now I use approx in order to interpolate the x0 value with OR 1.00
## Interpolate x0 which give OR 1.00 (or 50% of probability)
x.crude <- round(approx(x = crude$fitted.values, y=crude$model$x0, xout = .5)$y, 1)
x.adj <- round(approx(x = adj$fitted.values, y=adj$model$x0, xout = .5)$y, 1)
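One way to make the interpolation less dependent on where the observed x0 values happen to fall is to predict on a dense grid on the link scale and find where the log-odds cross 0. The sketch below uses simulated data and a plain glm as a stand-in for the GAMs, so the names x, y, fit and grid are illustrative only. It also assumes the fitted curve is monotone over the grid, which approx() needs for a unique answer.

```r
set.seed(1)

# Simulated data; the true curve crosses probability 0.5 at x = 0.5
x <- runif(200)
y <- rbinom(200, 1, plogis(4 * x - 2))
fit <- glm(y ~ x, family = binomial())

# Dense grid of predictions on the link (log-odds) scale
grid <- data.frame(x = seq(0, 1, length.out = 1001))
eta <- predict(fit, newdata = grid, type = "link")

# Interpolate the x at which the log-odds equal 0 (probability 0.5)
x_half <- approx(x = eta, y = grid$x, xout = 0)$y
x_half
```

If the requested value lies outside the range of the predictions, approx() returns NA, which is one way to detect a curve that never reaches the target.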
Finally, I'm plotting the two models in a single graph:
## Plotting using ggplot
library(ggplot2)
ggplot(data = fits) +
geom_vline(aes(xintercept = x.crude), size=.2, color="black")+
geom_vline(aes(xintercept = x.adj), size=.2, color="red")+
annotate(geom ="text", x= x.crude - 0.05, y=.5, label = x.crude, size=3.5) +
annotate(geom ="text", x= x.adj - 0.05, y=.5, label = x.adj, size=3.5, color="red") +
geom_ribbon(aes(x0, ymin=ci.low, ymax=ci.up, group=Model, fill=Model), alpha=.05) +
geom_line(aes(x0, or, group=Model, color=Model)) +
labs(x="X0", y="Odds ratio")+
theme_bw(16)
As you can see, only the crude model shows an intercept with OR almost equal to 1.00 (x0 = 0.9), while this never happens for the adj model.
First, how can I get an interpolation with OR that is exactly at 1?
Second, with my limited statistical knowledge, I expected to see an intercept with OR = 1 for the adj model as well, based on the observed values of x0 under this model. Why is its curve shifted upwards?
I have produced a scatter plot in R of expected/observed values. I calculated orthogonal regression and added the line using the following:
library(ggplot2)
library(MethComp)
library(readr) # provides read_csv()
r<-read_csv("Uni/MSci/Project/DATA/new data sheets/comparisons/for comarison graphs/R Regression/GCdNi.csv")
x<-r[1]
y<-r[2]
P<-ggplot()+geom_point(aes(x=x,y=y))+
scale_size_area()+xlab("Expected")+ylab("Observed")+ggtitle("G - Cd x Ni")+
xlim(0, 40)+ylim(0, 40)
# Orthogonal, total least squares or Deming regression
deming <- Deming(y=r$Observed, x=r$Expected)[1:2]
deming
R <- prcomp( ~ r$Expected + r$Observed )
slope <- R$rotation[2,1] / R$rotation[1,1]
slope
intercept <- R$center[2] - slope*R$center[1]
intercept
#Plot orthogonal regression
P+geom_abline(intercept = deming[1], slope = deming[2])
This gives me the following plot:
Is there a way I can calculate and add an R squared value to the graph?
Here's some of the data frame to allow for reproduction:
Expected Observed
2.709093153 1.37799781
2.611562579 1.410720257
2.22411805 1.287685907
3.431914392 1.906787706
3.242018129 1.823698676
3.46139841 1.767857729
2.255673738 1.111307235
2.400606765 1.294583377
1.818447253 0.995226256
2.528992184 1.173159775
2.46829393 1.101852756
1.826044939 0.883336715
1.78702201 1.050122993
2.37226253 1.025298403
2.140921846 1.094761918
I could not reproduce your data, but here's how you could do something like that with linear regression.
library(ggplot2)
set.seed(1)
x <- rnorm(20,1,100)
y<- x + rnorm(20,50,10)
regression <- lm(y ~ x)
r2 <- summary(regression)$r.squared
ggplot() + geom_point(aes(x, y)) +
geom_line(aes(x, regression$fitted.values)) +
annotate("text", x = -100, y = 200, label = paste0("r squared = ", r2))
In the future, you should provide a reproducible example.
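The rows posted in the question are enough for a small check. In simple linear regression, R squared equals the squared Pearson correlation, which treats x and y symmetrically. Whether that is the right goodness-of-fit summary for an orthogonal (Deming) fit is a judgment call, so treat this as one option rather than the definitive answer:

```r
# The 15 rows posted in the question
expected <- c(2.709093153, 2.611562579, 2.22411805, 3.431914392, 3.242018129,
              3.46139841, 2.255673738, 2.400606765, 1.818447253, 2.528992184,
              2.46829393, 1.826044939, 1.78702201, 2.37226253, 2.140921846)
observed <- c(1.37799781, 1.410720257, 1.287685907, 1.906787706, 1.823698676,
              1.767857729, 1.111307235, 1.294583377, 0.995226256, 1.173159775,
              1.101852756, 0.883336715, 1.050122993, 1.025298403, 1.094761918)

# R squared from lm() and from the squared correlation agree
r2_lm  <- summary(lm(observed ~ expected))$r.squared
r2_cor <- cor(expected, observed)^2

# Rounded text suitable for annotate("text", ..., label = label)
label <- sprintf("R squared = %.2f", r2_cor)
label
```

Rounding the value before pasting it into the label keeps the annotation short.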
I'm trying to plot a line smoothed by loess, and I'd like to include shaded error areas that are defined by existing variables and are also smoothed.
This code creates example data:
set.seed(12345)
data <- cbind(rep("A", 100), rnorm(100, 0, 1))
data <- rbind(data, cbind(rep("B", 100), rnorm(100, 5, 1)))
data <- rbind(data, cbind(rep("C", 100), rnorm(100, 10, 1)))
data <- rbind(data, cbind(rep("D", 100), rnorm(100, 15, 1)))
data <- cbind(rep(1:100, 4), data)
data <- data.frame(data)
names(data) <- c("num", "category", "value")
data$num <- as.numeric(data$num)
data$value <- as.numeric(data$value)
data$upper <- data$value+0.20
data$lower <- data$value-0.30
Plotting the data below, this is what I get:
ggplot(data, aes(x=num, y=value, colour=category)) +
stat_smooth(method="loess", se=F)
What I'd like is a plot that looks like the following, except with the upper and lower bounds of the shaded areas being bounded by smoothed lines of the "upper" and "lower" variables in the generated data.
Any help would be greatly appreciated.
Here's one way to add smoothed versions of upper and lower. We'll add LOESS predictions for upper and lower to the data frame and then plot those using geom_ribbon. It would be more elegant if this could all be done within the call to ggplot. That's probably possible by feeding a special-purpose function to stat_summary, and hopefully someone else will post an answer using that approach.
# Expand the scale of the upper and lower values so that the difference
# is visible in the plot
data$upper = data$value + 10
data$lower = data$value - 10
# Order data by category and num
data = data[order(data$category, data$num),]
# Create LOESS predictions for the values of upper and lower
# and add them to the data frame. I'm sure there's a better way to do this,
# but my attempts with dplyr and tapply both failed, so I've resorted to the clunky
# method below.
data$upperLoess = unlist(lapply(LETTERS[1:4],
function(x) predict(loess(data$upper[data$category==x] ~
data$num[data$category==x]))))
data$lowerLoess = unlist(lapply(LETTERS[1:4],
function(x) predict(loess(data$lower[data$category==x] ~
data$num[data$category==x]))))
# Use geom_ribbon to add a prediction band bounded by the LOESS predictions for
# upper and lower
ggplot(data, aes(num, value, colour=category, fill=category)) +
geom_smooth(method="loess", se=FALSE) +
geom_ribbon(aes(x=num, y=value, ymax=upperLoess, ymin=lowerLoess),
alpha=0.2)
And here's the result:
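As a possible tidier alternative to the lapply step in the answer above, the per-category LOESS fits can be written back by row index, so the result does not depend on how the data frame is sorted. This is a sketch only: smooth_by_group is a hypothetical helper, and the data are regenerated as a data frame d instead of reusing the question's cbind construction.

```r
set.seed(12345)

# Example data with the same shape as in the question
d <- data.frame(
  num      = rep(1:100, 4),
  category = rep(LETTERS[1:4], each = 100),
  value    = rnorm(400, mean = rep(c(0, 5, 10, 15), each = 100))
)
d$upper <- d$value + 10
d$lower <- d$value - 10

# Hypothetical helper: fit LOESS of y on x within each group g and
# return the fitted values in the original row order
smooth_by_group <- function(y, x, g) {
  out <- numeric(length(y))
  for (idx in split(seq_along(y), g)) {
    out[idx] <- predict(loess(y[idx] ~ x[idx]))
  }
  out
}

d$upperLoess <- smooth_by_group(d$upper, d$num, d$category)
d$lowerLoess <- smooth_by_group(d$lower, d$num, d$category)

head(d)
```

The columns upperLoess and lowerLoess can then be passed to geom_ribbon as ymax and ymin, as in the answer's plotting code.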