I'm trying to make a labeled scatterplot in ggplot and the specifics of the labels are causing me fits. Basically, among other things, I want to annotate my facet_wrapped 2-panel ggplot with the R^2 and Mean Bias. Notably, I want to label the mean bias with the appropriate units.
A simple version of my data might look as follows:
library(tidyverse)
library(ggthemes) # provides theme_tufte()
Demo_Df <- tibble(Modeled = rnorm(50,0,1), Observed = rnorm(50, 0.5, 1),
Scheme = c(rep("Scheme1", 25), rep("Scheme2", 25)))
Demo_Annotation <- tibble(r.squared = c(0.589, 0.573), Mean_Bias = c(-2.038, -1.049), Scheme = c("Scheme1", "Scheme2"))
Demo_Scatter <- Demo_Df %>%
ggplot(aes(x = Observed, y = Modeled, color = Scheme)) +
geom_point(size = 1.5) +
facet_wrap(~Scheme) +
theme_tufte() +
xlab(expression(paste("Observed Aerosol (", mu, "g m"^"-3", ")"))) +
ylab(expression(paste("Modeled Aerosol (", mu, "g m"^"-3", ")"))) +
ylim(-3, 4) +
theme(legend.position = "none")
Demo_Labeled <- Demo_Scatter +
geom_text(data = Demo_Annotation, aes(-2, 3,
label = paste(
"R2 = ", sprintf("%.2f", signif(r.squared, 3)), "\n",
"Mean Bias = ", sprintf("%.2f", signif(Mean_Bias, 3))
)),
size = 5, hjust = 0, color = "black")
This produces almost the right figure, but I would like the R2 to have a superscript 2 and I need to add micrograms per cubic meter (ug/m3) to the end of the "Mean Bias = " label, as it is on the x and y-axes.
To date, I've completely failed at this. I cannot find a solution that supports multiple lines, facet_wrap, variable inputs, AND expressions. There has to be a way to do this. Please help me, tidyverse gods!
One option to achieve your desired result is to add your multiple lines via multiple geom_text layers. To parse the labels as math notation, add parse = TRUE to geom_text. Finally, I added the labels to your annotations df, where I made use of ?plotmath for the math notation.
library(tidyverse)
library(ggthemes)
Demo_Annotation <- Demo_Annotation %>%
mutate(r.squared = paste0("R^{2} == ", sprintf("%.2f", signif(r.squared, 3))),
Mean_Bias = paste0("Mean~Bias == ", sprintf("%.2f", signif(Mean_Bias, 3)), "~mu*g~m^{-3}"))
Demo_Scatter +
geom_text(data = Demo_Annotation, aes(x = -2, y = 4, label = r.squared),
size = 5, hjust = 0, color = "black", parse = TRUE, family = "serif") +
geom_text(data = Demo_Annotation, aes(x = -2, y = 3.5, label = Mean_Bias),
size = 5, hjust = 0, color = "black", parse = TRUE, family = "serif")
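A quick way to check a plotmath label before handing it to geom_text(parse = TRUE) is to run it through parse() yourself. The sketch below does that in base R. Note that make_bias_label is a hypothetical helper name, not something from the answer above, and the quoted-number variant is an optional tweak for keeping trailing zeros.

```r
# Hypothetical helper (an assumption, not from the answer): builds the same
# kind of plotmath string that the answer uses for the Mean Bias label.
make_bias_label <- function(bias) {
  paste0("Mean~Bias == ", sprintf("%.2f", bias), "~mu*g~m^{-3}")
}

lbl <- make_bias_label(-2.038)
lbl

# parse() throws an error if the string is not valid plotmath/R syntax,
# which is the same failure geom_text(parse = TRUE) would run into.
expr <- parse(text = lbl)

# Numbers are re-read as numerics when parsed, so 0.50 is stored as 0.5.
# Wrapping the formatted number in quotes keeps it as text.
unquoted <- parse(text = "R^{2} == 0.50")[[1]]
quoted <- parse(text = 'R^{2} == "0.50"')[[1]]
unquoted[[3]] # a numeric
quoted[[3]]   # a character string
```

If a label fails here, it will also fail inside the plot, so this can save a few rounds of re-rendering.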
DATA
set.seed(42)
Demo_Df <- tibble(Modeled = rnorm(50,0,1), Observed = rnorm(50, 0.5, 1),
Scheme = c(rep("Scheme1", 25), rep("Scheme2", 25)))
Related
I'm working on my research project and I want to add something to my ggplot. I have concentration-time graphs and I want to point out at what point a dose is given. I need to point this out with a triangle just above the graph so that the reader knows at what point a dose is given. An example of what I mean is added underneath.
The data is sensitive, so I can't give you that, but the idea is simple. It's concentration-time data. My code for the actual graph is:
ggplot(data = df, aes(x = "Time", y = "Concentration", col = "Species"))
+ ylab("Concentration (mg/mL)") + xlab ("Time (h)")
+ geom_point() + scale_color_viridis(discrete = T, option = "F", begin = 0, end = 0.8)
+ theme_bw() + scale_y_log10()
I know that there is an annotate() function, but I don't think there's an option for adding triangles to the graph. I haven't tried anything else yet, because I don't know what other options there are. I hope someone can help me with this problem.
Suppose your administration times are at 1, 6, 12 and 18 hours. Then you could do:
admin_times <- c(1, 6, 12, 18)
and
ggplot(data = df, aes(x = Time, y = Concentration, col = Species)) +
ylab("Concentration (mg/mL)") +
scale_x_continuous("Time (h)", breaks = 0:4 * 6, limits = c(1, 24)) +
geom_point() +
scale_color_viridis_d(option = "F", begin = 0, end = 0.8) +
theme_bw() +
scale_y_log10() +
annotate('point', x = admin_times, y = max(df$Concentration)*2,
shape = 25, size = 6, color = 'gray80', fill = 'gray80')
Note that you don't put quotation marks around column names inside aes when creating a ggplot.
Data used:
df <- data.frame(Time = rep(1:24, 2),
Concentration = dexp(c(1:24, 1:24),
rep(c(0.1, 0.15), each = 24)),
Species = rep(c('A', 'B'), each = 24))
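If the dosing times live in their own data frame, a second geom_point layer is a possible alternative to annotate(). This is only a sketch under that assumption: doses is a hypothetical name, and the dosing times are the same made-up ones as above. It also shows why the y position is a multiple of the maximum: on a log axis, multiplying by a constant gives a fixed visual offset.

```r
library(ggplot2)

# Same toy data as in the answer above
df <- data.frame(Time = rep(1:24, 2),
                 Concentration = dexp(c(1:24, 1:24),
                                      rep(c(0.1, 0.15), each = 24)),
                 Species = rep(c("A", "B"), each = 24))

# Assumed dosing times; `doses` is a hypothetical name for illustration
doses <- data.frame(Time = c(1, 6, 12, 18))
# Place the markers a constant factor above the highest observation
doses$Concentration <- max(df$Concentration) * 2

p <- ggplot(df, aes(x = Time, y = Concentration)) +
  geom_point(aes(colour = Species)) +
  # shape 25 is a filled, downward-pointing triangle
  geom_point(data = doses, shape = 25, size = 4,
             colour = "gray40", fill = "gray40") +
  scale_y_log10() +
  theme_bw()
p
```

Keeping the colour mapping inside the first geom_point means the triangle layer does not need a Species column.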
I used the iris data as an example:
iris %>%
ggplot(aes(Sepal.Length,
fill = Species))+
geom_density(alpha = .6,
bw = 0.5)+
theme_classic()+
annotate("text",
x = 7,
y = c(.55, .60),
size = 4,
label = c(
paste0("Mean = ", round(mean(iris$Sepal.Length),4), " cm"),
paste0("r = ", round(cor(iris$Sepal.Length, iris$Sepal.Width),2), "")))
I tried to force italics using the expression() function, but it doesn't work.
Adding parse=TRUE and using ?plotmath notation you could do:
EDIT: Getting a "," as decimal mark is a bit tricky. In the code below I use gsub to replace the "." by "*','*".
library(ggplot2)
mean <- round(mean(iris$Sepal.Length), 4)
mean <- gsub("\\.", "*','*", mean)
cor <- round(cor(iris$Sepal.Length, iris$Sepal.Width), 2)
cor <- gsub("\\.", "*','*", cor)
ggplot(iris, aes(Sepal.Length,
fill = Species
)) +
geom_density(
alpha = .6,
bw = 0.5
) +
theme_classic() +
annotate("text",
x = 7,
y = c(.55, .60),
size = 4,
label = c(
paste0("Mean == ", mean, "~cm"),
paste0("italic(r) == ", cor, "")
),
parse = TRUE
)
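One caveat with the gsub approach: the digits after the comma are parsed as a separate number, so a value such as 0.05 would be drawn as "0,5". A possible alternative, sketched below, is to format the number with decimal.mark = "," and wrap it in quotes so that plotmath treats it as text. Note that comma_label is a hypothetical helper, not part of the answer.

```r
# The pitfall: after parsing, "05" becomes the number 5
pitfall <- parse(text = "0*','*05")[[1]]
pitfall[[3]]

# Hypothetical helper (an assumption for illustration): formats the number
# with a decimal comma and quotes it so it is kept as a string.
comma_label <- function(prefix, x, digits = 2, suffix = "") {
  num <- format(round(x, digits), nsmall = digits, decimal.mark = ",")
  paste0(prefix, " == '", num, "'", suffix)
}

mean_lbl <- comma_label("Mean", mean(iris$Sepal.Length), suffix = "~cm")
small_lbl <- comma_label("italic(r)", 0.05)

# Both labels are valid plotmath and can go to annotate(..., parse = TRUE)
exprs <- parse(text = c(mean_lbl, small_lbl))
```

The resulting strings can replace the paste0() calls in the label vector above without any other change to the plot.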
I'm plotting a scatterplot with ggplot() as follows:
library(data.table)
library(plotly)
library(ggplot2)
library(lubridate)
dt.allData <- data.table(date = seq(as.Date('2020-01-01'), by = '1 day', length.out = 365),
DE = rnorm(365, 4, 1), Austria = rnorm(365, 10, 2),
Czechia = rnorm(365, 1, 2), check.names = FALSE)
## Calculate Pearson correlation coefficient: ##
corrCoeff <- cor(dt.allData$Austria, dt.allData$DE, method = "pearson", use = "complete.obs")
corrCoeff <- round(corrCoeff, digits = 2)
## Linear regression function extraction by creating linear model: ##
regLine <- lm(DE ~ Austria, data = dt.allData)
## Extract k and d values for the linear function f(x) = kx+d: ##
k <- round(regLine$coef[2], digits = 5)
d <- round(regLine$coef[1], digits = 2)
linRegFunction <- paste0("y = ", d, " + (", k, ")x")
## PLOT: ##
p1 <- ggplot(data = dt.allData, aes(x = Austria, y = DE,
text = paste("Date: ", date, '\n',
"Austria: ", Austria, "MWh/h", '\n',
"DE: ", DE, "\u20ac/MWh"),
group = 1)
) +
geom_point(aes(color = ifelse(date >= now()-weeks(5), "#419F44", "#F07D00"))) +
scale_color_manual(values = c("#F07D00", "#419F44")) +
geom_smooth(method = "lm", se = FALSE, color = "#007d3c") +
annotate("text", x = 10, y = 10,
label = paste("\u03c1 =", corrCoeff, '\n',
linRegFunction), parse = TRUE) +
theme_classic() +
theme(legend.position = "none") +
theme(panel.background = element_blank()) +
xlab("Austria") +
ylab("DE")+
ggtitle("DE vs Austria") +
theme(plot.title = element_text(hjust = 0.5, face = "bold"))
# Correlation plot converting from ggplot to plotly: #
plot <- plotly::ggplotly(p1, tooltip = "text")
which gives the following plot here:
I use annotate() to represent the correlation coefficient and the regression function. I define the x and y coordinates manually so that the text output is displayed in the middle at the top. Since I have some of such data tables dt.allData that have different axis scalings, I would like to define in the plot that the text should always be displayed in the middle at the top, depending on the axis scaling without defining x and y coordinate manually before.
I'd suggest using ggtitle and hjust = 0.5:
Edit: using plotly::layout and a span tag to create the title:
library(data.table)
library(ggplot2)
library(plotly)
library(lubridate)
dt.allData <- data.table(date = seq(as.Date('2020-01-01'), by = '1 day', length.out = 365),
DE = rnorm(365, 4, 1), Austria = rnorm(365, 10, 2),
Czechia = rnorm(365, 1, 2), check.names = FALSE)
## Calculate Pearson correlation coefficient: ##
corrCoeff <- cor(dt.allData$Austria, dt.allData$DE, method = "pearson", use = "complete.obs")
corrCoeff <- round(corrCoeff, digits = 2)
## Linear regression function extraction by creating linear model: ##
regLine <- lm(DE ~ Austria, data = dt.allData)
## Extract k and d values for the linear function f(x) = kx+d: ##
k <- round(regLine$coef[2], digits = 5)
d <- round(regLine$coef[1], digits = 2)
linRegFunction <- paste0("y = ", d, " + (", k, ")x")
## PLOT: ##
p1 <- ggplot(data = dt.allData, aes(x = Austria, y = DE,
text = paste("Date: ", date, '\n',
"Austria: ", Austria, "MWh/h", '\n',
"DE: ", DE, "\u20ac/MWh"),
group = 1)
) +
geom_point(aes(color = ifelse(date >= now()-weeks(5), "#419F44", "#F07D00"))) +
scale_color_manual(values = c("#F07D00", "#419F44")) +
geom_smooth(method = "lm", formula = 'y ~ x', se = FALSE, color = "#007d3c") +
# ggtitle(label = paste("My pretty useful title", '\n', "\u03c1 =", corrCoeff, '\n', linRegFunction)) +
theme_classic() +
theme(plot.title = element_text(hjust = 0.5)) +
theme(legend.position = "none") +
theme(panel.background = element_blank()) +
xlab("Austria") +
ylab("DE")
# Correlation plot converting from ggplot to plotly: #
# using span tag (directly in control of font-size):
span_plot <- plotly::ggplotly(p1, tooltip = "text") %>% layout(
title = paste(
'<b>My pretty useful title</b>',
'<br><span style="font-size: 15px;">',
'\u03c1 =<i>',
corrCoeff,
'</i><br>',
linRegFunction,
'</span>'
),
margin = list(t = 100)
)
span_plot
Edit: added the sup alternative as per this answer
# using sup tag:
sup_plot <- plotly::ggplotly(p1, tooltip = "text") %>% layout(
title = paste(
'<b>My pretty useful title</b>',
'<br><sup>',
"\u03c1 =<i>",
corrCoeff,
'</i><br>',
linRegFunction,
'</sup>'
),
margin = list(t = 100)
)
sup_plot
Here you can find some related information in the plotly docs.
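Since you mention having several data tables, the title string could be assembled by a small function instead of being repeated per plot. This is just a sketch in base R: make_title is a hypothetical helper, and the numbers passed to it are made up.

```r
# Hypothetical helper (not from the answer): assembles the HTML title string
# in the same shape as the sup variant above.
make_title <- function(title, rho, equation) {
  paste0("<b>", title, "</b>",
         "<br><sup>\u03c1 = <i>", rho, "</i><br>", equation, "</sup>")
}

# Made-up values for illustration only
ttl <- make_title("DE vs Austria", 0.02, "y = 4.1 + (-0.005)x")
ttl

# Usage, assuming p1 is the ggplot object built above:
# plotly::ggplotly(p1, tooltip = "text") %>%
#   plotly::layout(title = ttl, margin = list(t = 100))
```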
First I would start by seeing if something like this could help you:
annotate("text",
x = mean(dt.allData$Austria, na.rm = TRUE),
y = max(dt.allData$DE, na.rm = TRUE),
label = paste("\u03c1 =",
corrCoeff, '\n',
linRegFunction),
hjust = .5)
Then, if you want to go through a list of x,y pairs, you'd eventually want to move towards functional programming, where you pass x columns x1, x2, x3 and y columns y1, y2, y3 to a map function, which pulls out the relevant information from each pair and plots it.
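A minimal base R sketch of that idea, under some assumptions: top_centre is a hypothetical helper, the data are random stand-ins for dt.allData, and the midpoint of the x range is used instead of the mean because it sits closer to the horizontal centre of the axis.

```r
# Hypothetical helper: derive the label position from the data themselves
top_centre <- function(x, y) {
  list(x = mean(range(x, na.rm = TRUE)), # midpoint of the x range
       y = max(y, na.rm = TRUE))         # highest observed y value
}

# Stand-in data with the same column names as in the question
set.seed(1)
dat <- data.frame(DE = rnorm(20, 4, 1),
                  Austria = rnorm(20, 10, 2),
                  Czechia = rnorm(20, 1, 2))

# Each pair is c(x column, y column)
pairs <- list(c("Austria", "DE"), c("Czechia", "DE"))
positions <- lapply(pairs, function(p) top_centre(dat[[p[1]]], dat[[p[2]]]))
names(positions) <- vapply(pairs, paste, character(1), collapse = "_vs_")
positions
```

Each element of positions can then be passed as the x and y arguments of annotate() for the matching plot.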
I'm plotting some percentages of answers on a Likert scale, and need to mark the sum of answers 1-2 and 4-5. I've done this reasonably well with brackets, but I'm having some trouble with their colors. I'd like each bracket to be the same color as the answer alternatives it represents, but I'd still like the text to be black. Is there a neat way to fix this?
Here's my code (a function for creating the plot I need with different questions and different percentages):
### FUNCTION
plot_function <- function (question, percent_1, percent_2, percent_3, percent_4, percent_5, percent_6) {
# Create temporary df
answers <- c(percent_1, percent_2, percent_3, percent_4, percent_5, percent_6)
labs = vector(mode = "list", length = 6)
for (i in 1:6) {
labs[i] <- paste(toString(answers[i]), "%", sep = "")
}
df <- data.frame(Scale, answers, labs)
# Create plot
plot <- ggplot(df, aes(x=Scale, y = answers)) +
geom_bar(aes(fill=factor(..x..)), stat = "identity") +
labs(x = "", y="Percent", fill="Scale", title = question) +
scale_fill_manual(name = "Scale",
labels = Scale,
values = c(color_1, color_2, color_3, color_4, color_5, color_6)) +
theme(panel.background = element_blank(),
panel.grid.minor = element_line(colour = 'grey',
size = 0.25,
linetype = 'dashed')) +
theme_tufte() +
geom_text(label = labs, nudge_y = 1.5, family = font) +
geom_bracket(xmin = 1,
xmax = 2,
y.position = max(answers) + 10,
label = paste(toString(percent_1 + percent_2), "%", sep = ""),
tip.length = c(0.05, 0.05),
family = font,
vjust = -1,
size = 1,
color = color_1) +
geom_bracket(xmin = 4,
xmax = 5,
y.position = max(answers) + 10,
label = paste(toString(percent_4 + percent_5), "%", sep = ""),
tip.length = c(0.05, 0.05),
family = font,
vjust = -1,
size = 1,
color = color_5) +
ylim(c(0, max(answers + 15)))
# Return plot
plot
}
As you can see below, the labels for the brackets take on the same color as the bracket itself. How can I make the label black, while keeping the colored brackets? :)
I am trying to find a clear approach for combined scatter and line plots with ggplot2 that have an appropriate legend. The following works in principle, but with warnings:
library("ggplot2")
library("dplyr")
## 2 data sets, one for the lines, one for the points
tbl <- tibble(
f = rep(letters[1:2], each = 10),
x = rep(1:10, 2),
y = c(1e-4 * exp(1:10), log(1:10))
)
obs <- tibble(
f = rep("c", 5),
x = seq(2, 10, 2),
y = log(seq(2, 10, 2)) + rnorm(5, sd = 0.1)
)
rbind(tbl, obs) %>%
ggplot(aes(x, y, color = f, linetype = f)) +
geom_line(show.legend = TRUE) +
geom_point(show.legend = TRUE, aes(shape = f), size = 3) +
scale_linetype_manual(values=c("solid", "solid", "blank")) +
scale_shape_manual(values=c(NA, NA, 16))
but I would like to get rid of warnings and to write something like:
scale_shape_manual(values=c("none", "none", "circle"))
Is there already a "none" or "empty" shape code? Several past answers have been suggested on SO, but I wonder if there is a recent canonical way.