How can I increase the axis resolution with `ggplot` and `xlim()`

I'm trying to simulate a profile from a scientific experiment separating 3 compounds with HPLC.
I am plotting peaks with ggplot(), but get really poor resolution on the x-axis. How can I increase the number of points that ggplot is using on the x-axis?
I tried using tibble(x = seq(0, 9, 0.005)) as my input data, but no change.
Current output:
peaks <- list(p2 = list(name = "peak 1", tr = 2.5, sigma = 0.00873037869711973,
k = 0.25), p5 = list(name = "peak 2", tr = 5, sigma = 0.0174607573942395,
k = 1.5), p7 = list(name = "peak 3", tr = 7, sigma = 0.0244450603519352,
k = 2.5))
ggplot(tibble(x = 0), aes(x = x)) +
stat_function(fun = function(x) rowSums(mapply(dnorm,
mean = sapply(peaks, function(x) x$tr),
sd = sapply(peaks, function(x) x$sigma),
MoreArgs = list(x = x)))) +
xlim(0, 9) +
ylim(0, 50) +
theme_classic()
(thanks Roland)
Desired output:
This was done with a data frame, but I like the stat_function() approach, if possible. Thanks.
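One likely cause, sketched here rather than confirmed against the original data: stat_function() evaluates the function at only n = 101 evenly spaced points by default, whatever data frame is supplied, so peaks this narrow fall between the grid points. The sketch below uses the peak values from the question; profile, coarse and fine are names made up for illustration. It compares a 101-point grid with a much finer one, then passes a larger n to stat_function() if ggplot2 is installed.

```r
# Peak positions and widths copied from the question
peak_tr    <- c(2.5, 5, 7)
peak_sigma <- c(0.00873037869711973, 0.0174607573942395, 0.0244450603519352)

# Sum of the three Gaussian peaks, vectorised over x
profile <- function(x) {
  Reduce(`+`, Map(function(m, s) dnorm(x, mean = m, sd = s),
                  peak_tr, peak_sigma))
}

# 101 points mimics stat_function()'s default n; 10001 is an arbitrary finer grid
coarse <- profile(seq(0, 9, length.out = 101))
fine   <- profile(seq(0, 9, length.out = 10001))
max(coarse)  # the coarse grid misses the tops of the narrow peaks
max(fine)    # close to the true height of the tallest peak

# Same plot as in the question, but with a finer evaluation grid
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(x = c(0, 9)), aes(x = x)) +
    stat_function(fun = profile, n = 10001) +
    xlim(0, 9) +
    ylim(0, 50) +
    theme_classic()
}
```

The value of n is a trade-off: it only needs to be large enough that the grid spacing is well below the smallest sigma.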

Related

Issues connected to creating a nls model in R

I want to fit two curves with the model Curing ~ a * atan(b * Time), using the data in the code below. I am running into two problems:
library(tidyverse)
library(investr)
library(ggplot2)
#DATAFRAME
RawData <- data.frame("Time" = c(0, 4, 8, 24, 28, 32, 0, 4, 8, 24, 28, 32), "Curing" = c(0, 28.57, 56.19, 86.67, 89.52, 91.42, 0, 85.71, 93.33, 94.28, 97.62, 98.09), "Grade" = c("Product A", "Product A", "Product A", "Product A", "Product A", "Product A", "Product B", "Product B", "Product B", "Product B", "Product B", "Product B"))
attach(RawData)
model <- nls(Curing ~ a * atan(b * Time), data= RawData, control=nls.control(printEval=TRUE, minFactor=2^-24, warnOnly=TRUE))
new.data <- data.frame(time=seq(1, 32, by = 0.1))
interval <- as_tibble(predFit(model, newdata = new.data, interval = "confidence", level= 0.9)) %>% mutate(Time = RawData$Time)
The first is an error as soon as I input the last line:
Error in assign(xname, newdata[, xname]) : first argument not valid
I have tried changing the values of new.data without success. If I remove the optional argument newdata =, the fit works, but it appears to be made on the whole data set without distinguishing between the two series.
Below is the code that produces the graph:
Graph <- ggplot(data=RawData, aes(x=`Time`, y=`Curing`, col=Grade)) + geom_point(aes(color = Grade), shape = 1, size = 2.5)
Graph + geom_line(data=interval, aes(x = Time, y = fit))+
geom_ribbon(data=interval, aes(x=Time, ymin=lwr, ymax=upr), alpha=0.5, inherit.aes=F, fill="blue")+
theme_classic()
Is it possible to have both a smooth fit and a separate fit for each series?
Your error is caused by a typo (time instead of Time in new.data). However, this will not fix the problem of getting one ribbon for each series.
To do this as a one-off, you will need two separate models, one for each series. The split-apply-bind idiom is a convenient way to build a single prediction data frame. Plotting is also easier if this data frame has a Grade column and a Curing column holding the fitted values:
library(tidyverse)
library(investr)
library(ggplot2)
pred_df <- do.call(rbind, lapply(split(RawData, RawData$Grade), function(d) {
new.data <- data.frame(Time = seq(0, 32, by = 0.1))
nls(Curing ~ a * atan(b * Time), data = d, start = list(a = 5, b = 1)) %>%
predFit(newdata = new.data, interval = "confidence", level = 0.9) %>%
as_tibble() %>%
mutate(Time = new.data$Time,
Grade = d$Grade[1],
Curing = fit)
}))
This then allows the plot to be quite straightforward:
ggplot(data = RawData, aes(x = Time, y = Curing, color = Grade)) +
geom_point(shape = 1, size = 2.5) +
geom_ribbon(data = pred_df, aes(ymin = lwr, ymax = upr, fill = Grade),
alpha = 0.3, color = NA) +
geom_line(data = pred_df) +
theme_classic(base_size = 16)
General approach
I think this is quite a useful technique that might be of broader interest. A more general solution for plotting confidence bands from an nls model with geom_smooth is to create small wrappers around nls and predFit:
nls_se <- function(formula, data, start, ...) {
mod <- nls(formula, data, start)
class(mod) <- "nls_se"
mod
}
predict.nls_se <- function(model, newdata, level = 0.9, ...) {
class(model) <- "nls"
p <- investr::predFit(model, newdata = newdata,
interval = "confidence", level = level)
list(fit = p, se.fit = p[,3] - p[,1])
}
This allows very simple plotting with ggplot:
ggplot(data = RawData, aes(x = Time, y = Curing, color = Grade)) +
geom_point(size = 2.5) +
geom_smooth(method = nls_se, formula = y ~ a * atan(b * x),
method.args = list(start = list(a = 5, b = 1))) +
theme_minimal(base_size = 16)
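The wrappers above rely on S3 dispatch: as far as I understand, geom_smooth calls predict() on whatever the fitting function returns, so giving the model a custom class lets a custom predict method supply the band. Here is a minimal sketch of that mechanism in base R, using lm and the built-in cars data instead of nls and predFit. The names lm_tagged and predict.lm_tagged are hypothetical and exist only for this illustration.

```r
# Fit an ordinary linear model, then tag it with a custom class
lm_tagged <- function(formula, data, ...) {
  mod <- lm(formula, data, ...)
  class(mod) <- "lm_tagged"
  mod
}

# predict() dispatches here because of the custom class
predict.lm_tagged <- function(object, newdata, level = 0.9, ...) {
  class(object) <- "lm"  # restore the real class before predicting
  p <- predict(object, newdata = newdata,
               interval = "confidence", level = level)
  # Same return shape as the nls wrapper: fit matrix plus band half-width
  list(fit = p, se.fit = p[, "upr"] - p[, "fit"])
}

mod <- lm_tagged(dist ~ speed, data = cars)
out <- predict(mod, newdata = data.frame(speed = c(10, 20)))
colnames(out$fit)  # fit, lwr, upr
```

Resetting the class inside the method is what lets the wrapper reuse the existing prediction code without recursing into itself.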
To show both prediction and confidence bands, we can do:
nls_se <- function(formula, data, start, type = "confidence", ...) {
mod <- nls(formula, data, start)
class(mod) <- "nls_se"
attr(mod, "type") <- type
mod
}
predict.nls_se <- function(model, newdata, level = 0.9, interval, ...) {
class(model) <- "nls"
p <- investr::predFit(model, newdata = newdata,
interval = attr(model, "type"), level = level)
list(fit = p, se.fit = p[,3] - p[,1])
}
ggplot(data = RawData, aes(x = Time, y = Curing, color = Grade)) +
geom_point(size = 2.5) +
geom_smooth(method = nls_se, formula = y ~ a * atan(b * x),
method.args = list(start = list(a = 5, b = 1),
type = "prediction"), alpha = 0.2,
aes(fill = after_scale(color))) +
geom_smooth(method = nls_se, formula = y ~ a * atan(b * x),
method.args = list(start = list(a = 5, b = 1)),
aes(fill = after_scale(color))) +
theme_minimal(base_size = 16)

Labeling a ggplot with a mix of variables and expressions

I'm trying to make a labeled scatterplot in ggplot and the specifics of the labels are causing me fits. Basically, among other things, I want to annotate my facet_wrapped 2-panel ggplot with the R^2 and Mean Bias. Notably, I want to label the mean bias with the appropriate units.
A simple version of my data might look as follows:
library(tidyverse)
library(ggthemes) # provides theme_tufte()
Demo_Df <- tibble(Modeled = rnorm(50,0,1), Observed = rnorm(50, 0.5, 1),
Scheme = c(rep("Scheme1", 25), rep("Scheme2", 25)))
Demo_Annotation <- tibble(r.squared = c(0.589, 0.573), Mean_Bias = c(-2.038, -1.049), Scheme = c("Scheme1", "Scheme2"))
Demo_Scatter <- Demo_Df %>%
ggplot(aes(x = Observed, y = Modeled, color = Scheme)) +
geom_point(size = 1.5) +
facet_wrap(~Scheme) +
theme_tufte() +
xlab(expression(paste("Observed Aerosol (", mu, "g m" ^ "-3", ")"), sep = "")) +
ylab(expression(paste("Modeled Aerosol (", mu, "g m" ^ "-3", ")"), sep = "")) +
ylim(-3, 4) +
theme(legend.position = "none")
Demo_Labeled <- Demo_Scatter +
geom_text(data = Demo_Annotation, aes(-2, 3,
label = paste(
"R2 = ", sprintf("%.2f", signif(r.squared, 3)), "\n",
"Mean Bias = ", sprintf("%.2f", signif(Mean_Bias, 3))
)),
size = 5, hjust = 0, color = "black")
This produces almost the right figure, but I would like the R2 to have a superscript 2 and I need to add micrograms per cubic meter (ug/m3) to the end of the "Mean Bias = " label, as it is on the x and y-axes.
To date, I've completely failed at this. I cannot find a solution that supports multiple lines, facet_wrap, variable inputs, AND expressions. There has to be a way to do this. Please help me, tidyverse gods!
One option to achieve your desired result is to add your multiple lines via multiple geom_text layers. To parse the labels as math notation, add parse = TRUE to geom_text. Finally, I added the labels to your annotations data frame, using ?plotmath syntax for the math notation.
library(tidyverse)
library(ggthemes)
Demo_Annotation <- Demo_Annotation %>%
mutate(r.squared = paste0("R^{2} == ", sprintf("%.2f", signif(r.squared, 3))),
Mean_Bias = paste0("Mean~Bias == ", sprintf("%.2f", signif(Mean_Bias, 3)), "~mu*g~m^{-3}"))
Demo_Scatter +
geom_text(data = Demo_Annotation, aes(x = -2, y = 4, label = r.squared),
size = 5, hjust = 0, color = "black", parse = TRUE, family = "serif") +
geom_text(data = Demo_Annotation, aes(x = -2, y = 3.5, label = Mean_Bias),
size = 5, hjust = 0, color = "black", parse = TRUE, family = "serif")
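With parse = TRUE, each label string has to be a syntactically valid R expression, which is why spaces are written as ~ and juxtaposition as *. A quick way to check label strings before plotting, sketched in base R with the annotation values from the question (r2_lab, bias_lab, parsed and bad are names made up for this check):

```r
# Annotation values copied from the question
r.squared <- c(0.589, 0.573)
Mean_Bias <- c(-2.038, -1.049)

# Build the same plotmath strings as in the answer
r2_lab   <- paste0("R^{2} == ", sprintf("%.2f", signif(r.squared, 3)))
bias_lab <- paste0("Mean~Bias == ", sprintf("%.2f", signif(Mean_Bias, 3)),
                   "~mu*g~m^{-3}")

# Every label must survive parse(); an error here means geom_text would fail too
parsed <- lapply(c(r2_lab, bias_lab), function(s) parse(text = s))

# A plain space between two words is not valid expression syntax
bad <- try(parse(text = "Mean Bias == 1"), silent = TRUE)
inherits(bad, "try-error")
```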
DATA
set.seed(42)
Demo_Df <- tibble(Modeled = rnorm(50,0,1), Observed = rnorm(50, 0.5, 1),
Scheme = c(rep("Scheme1", 25), rep("Scheme2", 25)))

Combined scatter and line ggplot with proper legend

I am trying to find a clean approach for combined scatter and line plots with ggplot2 that have an appropriate legend. The following works in principle, but produces warnings:
library("ggplot2")
library("dplyr")
## 2 data sets, one for the lines, one for the points
tbl <- tibble(
f = rep(letters[1:2], each = 10),
x = rep(1:10, 2),
y = c(1e-4 * exp(1:10), log(1:10))
)
obs <- tibble(
f = rep("c", 5),
x = seq(2, 10, 2),
y = log(seq(2, 10, 2)) + rnorm(5, sd = 0.1)
)
rbind(tbl, obs) %>%
ggplot(aes(x, y, color = f, linetype = f)) +
geom_line(show.legend = TRUE) +
geom_point(show.legend = TRUE, aes(shape = f), size = 3) +
scale_linetype_manual(values=c("solid", "solid", "blank")) +
scale_shape_manual(values=c(NA, NA, 16))
but I would like to get rid of warnings and to write something like:
scale_shape_manual(values=c("none", "none", "circle"))
Is there already a "none" or "empty" shape code? Several past answers have been suggested on SO, but I wonder if there is a recent canonical way.

plotting density cauchy distribution in R

Just curious: how can you generate the plot of Cauchy densities shown on Wikipedia?
Normally, you have
dcauchy(x, location = 0, scale = 1, log = FALSE)
for a single density curve, p(x) vs. x.
I assume that generating the diagram from Wikipedia involves a data.frame?
cauchy_dist <- data.frame(cauchy1 = rcauchy(10, location = 0, scale = 1, log = FALSE), cauchy2 = ....... , cauchy3 = ..... )
or you just need to
plot(x, P(x))
and then add lines to it?
You can use ggplot2's stat_function:
library(ggplot2)
ggplot(data.frame(x = c(-5, 5)), aes(x)) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = 0, scale = 0.5), aes(color = "a"), size = 2) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = 0, scale = 1), aes(color = "b"), size = 2) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = 0, scale = 2), aes(color = "c"), size = 2) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = -2, scale = 1), aes(color = "d"), size = 2) +
scale_x_continuous(expand = c(0, 0)) +
scale_color_discrete(name = "",
labels = c("a" = expression(x[0] == 0*","~ gamma == 0.5),
"b" = expression(x[0] == 0*","~ gamma == 1),
"c" = expression(x[0] == 0*","~ gamma == 2),
"d" = expression(x[0] == -2*","~ gamma == 1))) +
ylab("P(x)") +
theme_bw(base_size = 24) +
theme(legend.position = c(0.8, 0.8),
legend.text.align = 0)
You could create the data as follows:
location <- c(0, 0, 0, -2)
scale <- c(0.5, 1, 2, 1)
x <- seq(-5, 5, by = 0.1)
cauchy_data <- Map(function(l, s) dcauchy(x, l, s), location, scale)
names(cauchy_data) <- paste0("cauchy", seq_along(location))
cauchy_tab <- data.frame(x = x, cauchy_data)
head(cauchy_tab)
## x cauchy1 cauchy2 cauchy3 cauchy4
## 1 -5.0 0.006303166 0.01224269 0.02195241 0.03183099
## 2 -4.9 0.006560385 0.01272730 0.02272830 0.03382677
## 3 -4.8 0.006833617 0.01324084 0.02354363 0.03600791
## 4 -4.7 0.007124214 0.01378562 0.02440091 0.03839685
## 5 -4.6 0.007433673 0.01436416 0.02530285 0.04101932
## 6 -4.5 0.007763656 0.01497929 0.02625236 0.04390481
Map is used to apply a function of multiple variables to just as many vectors element by element. Thus, the first list element of cauchy_data will contain the following
dcauchy(x, location[1], scale[1])
and so on. I then put the Cauchy data in a data frame together with the vector of x coordinates, x. So you have the desired data table.
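To make the element-by-element behaviour of Map concrete, here is a smaller sketch with two parameter pairs and three x values (the values are chosen only for illustration). Each list element equals the corresponding direct call to dcauchy:

```r
# Two (location, scale) pairs and a short x grid, for illustration only
location <- c(0, -2)
scale    <- c(1, 1)
x        <- c(-1, 0, 1)

# Map pairs location[i] with scale[i] and returns a list with one element per pair
res <- Map(function(l, s) dcauchy(x, l, s), location, scale)

length(res)                                   # one element per parameter pair
all.equal(res[[1]], dcauchy(x, 0, 1))         # first pair
all.equal(res[[2]], dcauchy(x, -2, 1))        # second pair
```

Map always returns a list, which is why it can be passed straight to data.frame() once the elements are named.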
There are, of course, many ways to plot this. I prefer ggplot, so here is one way to do it as an example:
library(tidyr)
library(ggplot2)
curve_labs <- paste(paste("x0 = ", location), paste("gamma = ", scale), sep = ", ")
plot_data <- gather(cauchy_tab, key = curve, value = "P", -x )
ggplot(plot_data, aes(x = x, y = P, colour = curve)) + geom_line() +
scale_colour_discrete(labels = curve_labs)
You could tweak the plot in many ways to get something that more closely resembles the plot from Wikipedia.

How do I create a legend in ggplot2 for a prior and posterior? [duplicate]

This question already has an answer here:
Adding legend to ggplot when lines were added manually
(1 answer)
Closed 9 years ago.
I want to create a legend for my prior and posterior in ggplot2. I'm using knitr, so the solution needs to work there, but that shouldn't be a problem.
Below is the code I have:
<<echo=FALSE,message=FALSE,cache=FALSE,include=TRUE,fig.height=5,fig.pos="h!",
warning=FALSE>>=
require(ggplot2)
x <- seq(0, 1, len = 100)
y <- seq(0,6,len=100)
p <- qplot(x, geom = "blank")
Prior <- stat_function(aes(x = x, y = y), fun = dbeta, colour="red", n = 1000,
args = list(shape1 = 3, shape2 = 7))
Posterior <- stat_function(aes(x = x, y = ..y..), fun = dbeta, colour="blue",
n = 1000,args = list(shape1 = 7, shape2 = 23))
p + Prior + Posterior
#
I've tried a few things but I can't figure out the best way. Thanks!
If you put colour= inside the calls to aes(...), ggplot makes a color scale and creates a legend automatically.
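p <- qplot(x, geom = "blank")
# Map colour to a label inside aes(); the manual scale below assigns the actual
# colours by name, so a label cannot end up attached to the wrong curve
Prior <- stat_function(aes(x = x, y = y, colour = "Prior"), fun = dbeta, n = 1000,
args = list(shape1 = 3, shape2 = 7))
Posterior <- stat_function(aes(x = x, y = y, colour = "Posterior"), fun = dbeta,
n = 1000, args = list(shape1 = 7, shape2 = 23))
p + Prior + Posterior +
scale_colour_manual(name = "Distribution",
values = c(Prior = "red", Posterior = "blue"),
breaks = c("Prior", "Posterior"))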
