R - Interpreting a subscript in a variable used in ggplot - r

I'm using ggplot to do some multiline plots that are constructed with lots of variables and the use of paste. I have not been able to figure out how to get the subscript 3 in O3 to appear in the following simplified version of the code.
gasSubscript <- "O[3]"
color1 <- paste(gasSubscript,"some additional text")
df <- data.frame(x = c(1,2,3,4,5,6,7,8,9,10), y = c(10,9,8,7,6,5,4,3,2,1))
testPlot <- ggplot(data = df, aes(x = x)) + geom_line(aes(y = y, color = color1))
color1 contains
"O[3] some additional text"
The legend displays as "O[3] some additional text" rather than with a subscripted 3.

The label in the scale needs to be an expression so that it is rendered according to the rules of plotmath. However, ggplot works with data.frames, and a data.frame column cannot be a vector of expressions. The way around this is to store the text (string) version of the plotmath expression and, as the last step in making the labels, turn the strings into expressions. This works because the labels argument to the scale functions can itself be a function that transforms or formats the labels.
Putting this together with your example:
color1 <- paste(gasSubscript,"*\" some additional text\"")
This is now in a format that can be made into an expression.
> color1
[1] "O[3] *\" some additional text\""
> cat(color1)
O[3] *" some additional text"
> parse(text=color1)
expression(O[3] *" some additional text")
With that format, you can force the scale to interpret the labels as expressions which will cause them to be rendered as per the rules of plotmath.
testPlot <- ggplot(data = df, aes(x = x)) +
geom_line(aes(y = y, color = color1)) +
scale_colour_discrete(labels = function(x) parse(text=x))
Using the labels function approach works for data which is stored in the data.frame as well, so long as the strings are formatted so that they can be parsed.
DF <- data.frame(x=1:4, y=rep(1:2, times=2),
group = rep(c('O[3]*" some additional text"',
'H[2]*" some different text"'), each = 2))
ggplot(DF, aes(x, y, colour=group)) +
geom_line() +
scale_colour_discrete(labels=function(x) parse(text=x))
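Because the labels function calls parse() on every label, a string that is not valid R syntax will raise an error at draw time. A minimal sketch for checking labels beforehand is shown here; parses_ok is a hypothetical helper name, not part of ggplot2. It also shows why the original string fails: without the * (or a ~) joining the pieces, the parser sees two adjacent tokens and rejects them.

```r
# Hypothetical helper: returns TRUE for each string that parse() accepts
parses_ok <- function(x) {
  vapply(x, function(s) {
    !inherits(try(parse(text = s), silent = TRUE), "try-error")
  }, logical(1), USE.NAMES = FALSE)
}

labels <- c('O[3]*" some additional text"',  # pieces joined with *
            "O[3] some additional text")     # original string, no operator

parses_ok(labels)
```

The first string should pass and the second should not, which matches the behaviour described above.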

This should do what I think you want. It took me a little tinkering to get the right order of paste and expression.
require(ggplot2)
test <- data.frame(x = c(1,2,3,4,5,6,7,8,9,10), y = c(10,9,8,7,6,5,4,3,2,1))
colour1 <- "1"
testPlot <- ggplot(data = test, aes(x = x)) + geom_line(aes(y = y, colour = colour1))
testPlot + scale_colour_discrete(labels = c(expression(paste(O[3], "some other text here"))))
It also returns the warning
Warning message:
In is.na(scale$labels) :
is.na() applied to non-(list or vector) of type 'expression'
to which I haven't been able to find an explanation.

Related

different titles for each element after using facet_wrap()

After setting ncol = 1 in the facet_wrap() function, I'm trying to use ggtitle() function inside the facet_wrap() function to set a different title for each graph created (there are only two of them).
ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~ var, ncol = 1) +
ggtitle(function(x) paste("Title for", df$title[df$var == x]))
I'm trying to use the value of the "title" column of the dataframe, where the value of the "var" column matches the value of the current plot's var.
But I get this error:
Error in as.character(x$label) :
cannot coerce type 'closure' to vector of type 'character'
How can I set different titles for each graph in ggplot2 using the facet_wrap() function with ncol=1?
Thanks, Ido
Here is something that might give you what you want. My comment above still stands, but this may be more what you are looking for.
library(ggplot2)
library(dplyr)
df <- mtcars %>%
mutate(strip_title = paste(cyl, "Cylinders"))
ggplot(df, aes(x = mpg, y = wt)) +
geom_point() +
facet_wrap(~strip_title, ncol = 1)
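If the titles live in a separate lookup rather than in a data column, the labeller argument of facet_wrap() is another way to get there. The following is only a sketch: the strip_titles vector and its wording are made up for illustration, and as_labeller() is used to turn the named vector into a labeller.

```r
library(ggplot2)

# Hypothetical lookup table: facet value -> strip title
strip_titles <- c("4" = "Title for 4 cylinders",
                  "6" = "Title for 6 cylinders",
                  "8" = "Title for 8 cylinders")

p <- ggplot(mtcars, aes(x = mpg, y = wt)) +
  geom_point() +
  facet_wrap(~cyl, ncol = 1, labeller = as_labeller(strip_titles))
```

Printing p draws one panel per cylinder count, each with its own strip text; the data frame itself is left unchanged.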

R: Programmatically changing ggplot scale labels to Greek letters with expressions

I am trying to change the labels in a ggplot object to Greek symbols for an arbitrary number of labels. Thanks to this post, I can do this manually when I know the number of labels in advance and the number is not too large:
# Simulate data
df <- data.frame(name = rep(c("alpha1","alpha2"), 50),
value = rnorm(100))
# Create a plot with greek letters for labels
ggplot(df, aes(x = value, y = name)) + geom_density() +
scale_y_discrete(labels = c("alpha1" = expression(alpha[1]),
"alpha2" = expression(alpha[2])))
For our purposes, assume I need to change k default labels, where each of the k labels is the pre-fix "alpha" followed by a number 1:k. Their corresponding updated labels would substitute the greek letter for "alpha" and use a subscript. An example of this is below:
# default labels
paste0("alpha", 1:k)
# desired labels
for (i in 1:k) { expression(alpha[i]) }
I was able to hack together the below programmatic solution that appears to produce the desired result thanks to this post:
ggplot(df, aes(x = value, y = name)) + geom_density() +
scale_y_discrete(labels = parse(text = paste("alpha[", seq_along(unique(df$name)), "]")))
However, I do not understand this code and am seeking clarification about:
What is parse() doing here that expression() otherwise would do?
While I understand everything to the right-hand side of =, what is text doing on the left-hand side of the =?
Another option to achieve your desired result would be to add a new column to your data which contains the ?plotmath expression as a string and map this new column on y. Afterwards you could use scales::label_parse() to parse the expressions:
set.seed(123)
df <- data.frame(name = rep(c("alpha1","alpha2"), 50),
value = rnorm(100))
df$label <- gsub("^(.*?)(\\d+)$", "\\1[\\2]", df$name)
library(ggplot2)
library(scales)
ggplot(df, aes(x = value, y = label)) + geom_density() +
scale_y_discrete(labels = scales::label_parse())
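On the two questions themselves: expression() captures code that is typed literally, whereas parse() builds the same kind of object from character strings, which is what makes it usable when the labels are generated with paste(). The text on the left of = is simply the name of an argument of parse(); by default parse() reads from a file, and text = tells it to read from the supplied strings instead. A small sketch, with k chosen arbitrarily for illustration:

```r
k <- 3  # arbitrary number of labels for this illustration

# Build the label strings programmatically
label_strings <- paste0("alpha[", seq_len(k), "]")

# parse(text = ...) turns each string into one element of an expression vector
label_exprs <- parse(text = label_strings)

is.expression(label_exprs)   # an expression vector, as expression() would give
length(label_exprs)          # one element per input string
deparse(label_exprs[[1]])    # the first element is the call alpha[1]
```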

How do I pass a string of symbols for bquote to evaluate in ggplot?

The axis labels vary for a ggplot that I create within a function. Some of the labels have super/subscripts, while others don't. Example:
m.data <- data.frame(x = runif(10), y = runif(10))
x.labs <- c("rain, mm", "light*','~W~m^-2")
for (i in 1:2) {
ggplot(m.data, aes(x = x, y = y)) +
labs(title = bquote(.(x.labs[i])))
}
The title for the graph when i=2 is literally
light*','~W~m^-2
rather than the formatted version of same. With the same result, I also tried moving bquote inside each string.
x.labs <- c("bquote(rain*','~mm)", "bquote(light*','~W~m^-2)")
and
title = x.labs[i]
Of the many questions about ggplot and bquote, none seem to address passing in a symbol like the superscript indicator.
One alternative is to use expression() in your vector of titles instead of bquote().
For example
x.labs <- c("rain, mm", expression("light,"~W~m^-2))
ggplot(m.data, aes(x = x, y = y)) +
labs(title = x.labs[2])
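If the labels have to stay as plain strings, another possibility is to parse them at the point of use and fall back to the raw string when parsing fails. This is only a sketch: to_label is a hypothetical helper, and it assumes that any string which happens to be valid R syntax is meant as plotmath (so a label such as "a-b" would be typeset as math).

```r
library(ggplot2)

# Hypothetical helper: parse a string as plotmath, or return it unchanged
to_label <- function(s) {
  tryCatch(parse(text = s), error = function(e) s)
}

m.data <- data.frame(x = runif(10), y = runif(10))
x.labs <- c("rain, mm", "light*','~W~m^-2")

# One plot per label; "rain, mm" is not valid syntax and stays a string
plots <- lapply(x.labs, function(lab) {
  ggplot(m.data, aes(x = x, y = y)) +
    geom_point() +
    labs(title = to_label(lab))
})
```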

ggplot2 - passing dataframe with column names

I looked at other solutions but cannot get a logical within ggplot to work correctly. I have the following function. A data frame is passed along with two columns to plot as a scatter plot.
scatter_plot2 <- function(df, xaxis, yaxis){
b <- ggplot(data = df, aes_string(xaxis, yaxis), environment = environment())
gtype <- geom_point(aes(alpha = 0.2, color = yaxis > 0))
sm <- geom_smooth(formula = xaxis ~ yaxis, color="black")
b + gtype + sm + theme_bw()
}
which I call using :
scatter_plot2(train_df, "train_df$signal", "train_df$yhat5")
The color = yaxis > 0 mapping is intended to plot points above 0 (on the y-axis) in "green" and those below in "red". While I'm able to get the string names to display correctly on the axes, I'm not able to get the logical to work correctly.
Please help.
Since you're creating your own function for this, just calculate the needed color ahead of time. Since you're passing in a data frame and the variables, you'll need to use some standard evaluation (you're already doing this using aes_string).
I cleaned up the code a bit, putting the ggplot statement into a single chain, making some aes calls explicit, and making your smooth formula y ~ x. You also don't want to use $ when passing in the variables; just pass quoted names.
library(dplyr)
library(ggplot2)
scatter_plot2 <- function(df, xaxis, yaxis){
  # yaxis holds a column name (a string), so look the column up with .data[[ ]]
  df <- mutate(df, color = ifelse(.data[[yaxis]] > 0, "green", "red"))
  ggplot(data = df, aes_string(x = xaxis, y = yaxis)) +
    geom_point(aes(alpha = 0.2, color = color)) +
    geom_smooth(formula = y ~ x, color = "black") +
    scale_color_identity() +
    theme_bw()
}
The call would be (using iris for an example):
scatter_plot2(iris, "Sepal.Width", "Sepal.Length")
(The resulting plot is not reproduced here.)
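For newer ggplot2 versions, where aes_string() is soft-deprecated, the same idea can be sketched with the .data pronoun. This is a sketch under assumptions rather than the answer's own method: scatter_plot3 is a hypothetical name, the smoother is left out for brevity, and the colours are assigned through a manual scale instead of a precomputed column.

```r
library(ggplot2)

# Hypothetical variant: column names arrive as strings and are looked up
# with .data[[ ]] inside aes()
scatter_plot3 <- function(df, xaxis, yaxis) {
  ggplot(df, aes(x = .data[[xaxis]], y = .data[[yaxis]])) +
    # the logical test is mapped to colour; alpha is a fixed setting
    geom_point(aes(colour = .data[[yaxis]] > 0), alpha = 0.2) +
    scale_colour_manual(values = c("TRUE" = "green", "FALSE" = "red"),
                        guide = "none") +
    labs(x = xaxis, y = yaxis) +
    theme_bw()
}

p <- scatter_plot3(iris, "Sepal.Width", "Sepal.Length")
```

Setting alpha outside aes() keeps it from generating its own legend entry.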

Problems when using ggplot aes_string, group, and linetype

Let's say I have this data set:
x <- rnorm(1000)
y <- rnorm(1000, 2, 5)
line.color <- sample(rep(1:4, 250))
line.type <- as.factor(sample(rep(1:5, 200)))
data <- data.frame(x, y, line.color, line.type)
I'm trying to plot the x and y variables group by the interaction of line.type and line.color. In addition I want to specify the linetype using line.type and the color using line.color. If I write this:
ggplot(data, aes(x = x, y = y, group = interaction(line.type, line.color), colour = line.color, linetype = line.type)) + geom_line()
This works, but if I try to use aes_string like this:
interact <- c("line.color", "line.type")
inter <- paste0("interaction(", paste0('"', interact, '"', collapse = ", "), ")")
ggplot(data, aes_string(x = "x", y = "y", group = inter, colour = "line.color", linetype = "line.type")) + geom_line()
I get the error:
Error: geom_path: If you are using dotted or dashed lines, colour, size and linetype must be constant over the line
What am I doing wrong? I need to use aes_string because I have a lot of variables to plot.
Turns out I was mistaken on several counts in my comments above. This appears to work:
data$inter <- interaction(data$line.type,data$line.color)
ggplot(data, aes_string(x = "x", y = "y", group = "inter",colour = "line.color",linetype = "line.type")) + geom_line()
(I was completely wrong about the graph specifying varying colour, etc within a single dashed/dotted line.)
I take this as a slight vindication, though, that relying on parsing of the interaction code inside aes_string() is a generally bad idea. My guess is that there is simply a small bug in ggplot's attempt to parse what you're giving aes_string() in complex cases that's causing it to evaluate things in an order that makes it look like you're asking for varying aesthetics over dashed/dotted lines.
You were almost there defining
inter <- paste0("interaction(", paste0('"', interact, '"', collapse = ", "), ")")
However, for aes_string to work, you need to pass a character string of what would work if you were calling aes; that is, the arguments within interaction do not need to be strings themselves. You want to create the string "interaction(line.color, line.type)". Therefore
inter <- paste0('interaction(', paste0(interact, collapse = ', ' ),')')
# or
# inter <- sprintf('interaction(%s)', paste0(interact, collapse = ', '))
# the result being
inter
## [1] "interaction(line.color, line.type)"
# and the following works
ggplot(data, aes_string(x = "x", y = "y",
group = inter, colour = "line.color", linetype = "line.type")) +
geom_line()
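Both answers can also be combined without building code as a string. The sketch below assumes the grouping columns are named in a character vector, as in the question; it computes the interaction up front from the selected columns (interaction() accepts a single list of factors) and then refers to columns by name with the .data pronoun, which avoids aes_string() altogether.

```r
library(ggplot2)

set.seed(1)  # seed added here only to make the sketch reproducible
data <- data.frame(
  x = rnorm(1000),
  y = rnorm(1000, 2, 5),
  line.color = sample(rep(1:4, 250)),
  line.type = as.factor(sample(rep(1:5, 200)))
)

interact <- c("line.color", "line.type")

# Precompute the grouping factor from the named columns
data$inter <- interaction(data[interact])

p <- ggplot(data, aes(x = .data[["x"]], y = .data[["y"]],
                      group = .data[["inter"]],
                      colour = .data[["line.color"]],
                      linetype = .data[["line.type"]])) +
  geom_line()
```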
