How do I pass multiple arguments through an R function to ggplot, both aesthetics and scale format?

How do I pass multiple arguments through to my ggplot function?
Here is an example of the plot I want to automate.
library(ggplot2)
library(scales)
p <- ggplot(diamonds, aes(x=cut, y=price) ) +
geom_boxplot() +
scale_y_continuous(labels = dollar)
p
But I want to plot several different variables (e.g. price, depth), each with the appropriate scale; some are in dollars.
So I made a function
myfunction <- function(var1,var2){
p <- ggplot(diamonds, aes(x=cut, y= var1) ) +
geom_boxplot() +
scale_y_continuous(labels = var2)
p
return(p)
}
When I test the function, it doesn't work. Both arguments cause different errors on their own.
myfunction("price","dollar")
For var1 I get:
Error: Discrete value supplied to continuous scale
and var2:
Error in f(..., self = self) : Breaks and labels are different lengths
Question 1. Why doesn't that work? This is the most important question for me.
I then wish to make multiple graphs, which I can do with a for loop, but I keep hearing I should do it with apply. Here's what I tried.
Question 2. How would you make the multiple graphs work with apply?
FirstPlotData <- c("price","dollar")
SecondPlotData <- c("depth", "comma")
plotMetaData <- data.frame(FirstPlotData,SecondPlotData)
lapply doesn't work for me with multiple arguments. Can it pass multiple arguments?
lapply(plotMetaData, function(avar,bvar)myfunction(avar, bvar))
Would mapply work? How?
mapply(mytestfunction,plotMetaData[1,],plotMetaDataList[2,])
Thanks in advance. I note that I could do the multiple graphs with facet, but for my more complex example, with hiding outliers, scaling, and also doing stats, then doing the multiple plots and putting in a {cowplot} grid seems easier.

Try this
library(ggplot2)
library(scales)
library(rlang) # for sym
myfunction <- function(var1,var2){
p <- ggplot(diamonds, aes(x=cut, y= !! sym(var1)) ) +
geom_boxplot() +
scale_y_continuous(labels = get(var2))
p
return(p)
}
myfunction('price','dollar')
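As to why the original attempt fails: inside aes(), y = var1 maps the string "price" itself (a constant, discrete value) rather than the price column, and labels = "dollar" supplies a single character label instead of a labelling function. A minimal sketch of one alternative, assuming you are free to pass the labelling function itself rather than its name, so that no get() lookup is needed (boxplot_by_cut and label_fun are illustrative names, not from the question):

```r
library(ggplot2)
library(scales)
library(rlang)  # for sym()

# Hypothetical helper: var1 is a column name given as a string,
# label_fun is a labelling function such as scales::dollar.
boxplot_by_cut <- function(var1, label_fun = waiver()) {
  ggplot(diamonds, aes(x = cut, y = !!sym(var1))) +
    geom_boxplot() +
    scale_y_continuous(labels = label_fun)
}

p_price <- boxplot_by_cut("price", dollar)
p_depth <- boxplot_by_cut("depth", comma)
```

Leaving label_fun at its default, waiver(), keeps the scale's default labels.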

You probably want aes_string. This function has been designed to make programming with ggplot easier (similar ideas have also been applied to dplyr commands). The following works:
library(tidyverse)
data(diamonds)
myfunction <- function(var1){
p <- ggplot(diamonds, aes_string(x="cut", y= var1) ) +
geom_boxplot()
p
return(p)
}
myfunction("price")
Why?
Contrast the following:
# works
ggplot(diamonds, aes(x=cut, y= price) ) + geom_boxplot()
# these 2 are equivalent, but do not work
ggplot(diamonds, aes(x=cut, y= "price") ) + geom_boxplot()
var1 = "price"
ggplot(diamonds, aes(x=cut, y= var1) ) + geom_boxplot()
# these 2 are equivalent; both work, but the inputs are strings
ggplot(diamonds, aes_string(x="cut", y= "price") ) + geom_boxplot()
var1 = "price"
ggplot(diamonds, aes_string(x="cut", y= var1) ) + geom_boxplot()
Using apply?
For this purpose I would be inclined to use loops (others are welcome to disagree). If you are set on using an apply approach, then you probably want apply itself, which works over the rows or columns of a matrix or data frame; lapply, mapply, vapply and sapply are the list, multivariate, type-checked and simplifying variants respectively.
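If you do want to try the multivariate variant, here is a minimal sketch of mapply stepping through two argument vectors in parallel. It assumes the second argument names a function exported by {scales}, and it uses the .data pronoun instead of aes_string(), which is deprecated in recent ggplot2 releases. plot_box is an illustrative name, not part of the question.

```r
library(ggplot2)

# Hypothetical helper: var1 is a column name, var2 the name of a
# labelling function exported by the scales package.
plot_box <- function(var1, var2) {
  label_fun <- getExportedValue("scales", var2)
  ggplot(diamonds, aes(x = cut, y = .data[[var1]])) +
    geom_boxplot() +
    scale_y_continuous(labels = label_fun)
}

# SIMPLIFY = FALSE keeps the result as a plain list of plots.
plots <- mapply(
  plot_box,
  c("price", "depth"),
  c("dollar", "comma"),
  SIMPLIFY = FALSE
)
```

Because the first vector is a character vector, mapply uses its values as the names of the result, so plots$price and plots$depth pick out the individual plots.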

A more idiomatic ggplot2 way of doing this now is to use the .data pronoun.
library(ggplot2)
myfunction <- function(var1, var2) {
p <- ggplot(diamonds, aes(x = cut, y = .data[[var1]])) +
geom_boxplot() +
scale_y_continuous(
labels = getFromNamespace(x = var2, ns = "scales")
)
p
return(p)
}
myfunction("price", "dollar")
myfunction("price", "comma")
Then, to create multiple plots with this function by passing multiple arguments, a tidier approach is to use the map functions from {purrr}:
plots <- purrr::map2(
.x = c("price", "price"),
.y = c("dollar", "comma"),
.f = myfunction
)
So, plots[[1]] contains the 1st plot with var1 = "price" and var2 = "dollar" and plots[[2]] contains the 2nd plot with var1 = "price" and var2 = "comma".
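If the plot settings live in a data frame, as in the question's plotMetaData, purrr::pmap() can walk over its rows. The following is only a sketch under assumptions: the metadata frame has one row per plot, its columns are named after the function's arguments, and plot_var is an illustrative stand-in for myfunction.

```r
library(ggplot2)

# Hypothetical stand-in for myfunction(), repeated here so the
# example runs on its own.
plot_var <- function(var1, var2) {
  ggplot(diamonds, aes(x = cut, y = .data[[var1]])) +
    geom_boxplot() +
    scale_y_continuous(labels = getExportedValue("scales", var2))
}

# One row per plot; column names match the argument names of plot_var().
meta <- data.frame(
  var1 = c("price", "depth"),
  var2 = c("dollar", "comma"),
  stringsAsFactors = FALSE
)

# pmap() calls plot_var(var1 = ..., var2 = ...) once per row.
plots <- purrr::pmap(meta, plot_var)
```

The resulting list can then be handed to a grid function, for example cowplot::plot_grid(plotlist = plots).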

Related

Use tidy evaluation when passing an expression as a character to ggplot2::aes()

I am trying to convert my use of ggplot2 in functions to tidy evaluation so as to avoid the warning messages that are generated. In particular, I have used aes_string() extensively, and these calls need to be converted to aes(). I can handle cases where just the name of a column is passed as a character string. However, I have been unable to work out how to deal with the case where the string is a mathematical expression.
Here is a small reproducible example of the problem that I am trying to solve.
library(ggplot2)
set.seed(1)
dat <- data.frame(x=rnorm(100),y=rnorm(100))
xvar <- 'x+100'
yvar <- 'y'
#This works but uses the deprecated aes_string
ggplot(dat,aes_string(x=xvar,y=yvar))
#This works
ggplot(dat, ggplot2::aes(x=x+100, y=.data[[yvar]])) + geom_point()
#This does not work
ggplot(dat, aes(x={{xvar}}, y=.data[[yvar]])) + geom_point()
My question is what tidy evaluation techniques do I need to employ to use xvar to specify the x variable as is possible with aes_string()?
You could use eval(str2expression()):
library(ggplot2)
ggplot(dat, aes(x = eval(str2expression(xvar)), y = eval(str2expression(yvar)))) +
geom_point() +
labs(x = xvar, y = yvar)
Or using the analogous rlang functions:
library(rlang)
ggplot(dat, aes(x = eval_tidy(parse_expr(xvar)), y = eval_tidy(parse_expr(yvar)))) +
geom_point() +
labs(x = xvar, y = yvar)
Note you’ll want to manually set the axis labels using labs(); otherwise you’ll end up with e.g. "eval(str2expression(xvar))" for the x axis.
It may need parse_expr/eval
library(ggplot2)
ggplot(dat, aes(x=eval(rlang::parse_expr(xvar)), y=.data[[yvar]])) +
geom_point() +
xlab(xvar)
Or another option would be to interpolate and do the eval/parse
eval(parse(text = glue::glue("ggplot(dat, aes(x = {xvar}, y = {yvar}))",
"+ geom_point()")))

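A further variation on the answers above, offered as a sketch rather than a definitive recipe: parse the string once and inject the resulting expression into aes() with !!, instead of wrapping it in eval(). This assumes xvar holds a trusted string, since parsing arbitrary input into code is unsafe.

```r
library(ggplot2)
library(rlang)

set.seed(1)
dat <- data.frame(x = rnorm(100), y = rnorm(100))
xvar <- "x+100"
yvar <- "y"

# parse_expr() turns the string into an expression; !! injects that
# expression into aes(), so ggplot2 sees x + 100 directly.
p <- ggplot(dat, aes(x = !!parse_expr(xvar), y = .data[[yvar]])) +
  geom_point()
```

Because aes() receives the expression itself, the default x-axis label should be derived from x + 100 rather than from a wrapper call, so labs() may not be needed.
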
different titles for each element after using facet_wrap()

After setting ncol = 1 in the facet_wrap() function, I'm trying to use ggtitle() function inside the facet_wrap() function to set a different title for each graph created (there are only two of them).
ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~ var, ncol = 1) +
ggtitle(function(x) paste("Title for", df$title[df$var == x]))
I'm trying to use the value of the "title" column of the dataframe, where the value of the "var" column matches the value of the current plot's var.
But I get this error:
Error in as.character(x$label) :
cannot coerce type 'closure' to vector of type 'character'
How can I set different titles for each graph in ggplot2 using the facet_wrap() function with ncol=1?
Thanks, Ido
Here is something that might give you what you want. My comment above still stands, but this may be more what you are looking for.
library(ggplot2)
library(dplyr)
df <- mtcars %>%
mutate(strip_title = paste(cyl, "Cylinders"))
ggplot(df, aes(x = mpg, y = wt)) +
geom_point() +
facet_wrap(~strip_title, ncol = 1)
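The same strip titles can also be produced without adding a column, by giving facet_wrap() a labeller. A minimal sketch, assuming the title can be computed from the facet value alone:

```r
library(ggplot2)

# labeller() accepts one function per facetting variable; the function
# receives the facet values as a character vector and returns the labels.
p <- ggplot(mtcars, aes(x = mpg, y = wt)) +
  geom_point() +
  facet_wrap(
    ~cyl,
    ncol = 1,
    labeller = labeller(cyl = function(value) paste(value, "Cylinders"))
  )
```

A named character vector used as a lookup table works in place of the function when the titles cannot be derived from the values.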

for-loop to create ggplots

I am trying to make boxplots with ggplot2.
The code I have to make the boxplots with the format that I want is as follows:
p <- ggplot(mg_data, aes(x=Treatment, y=CD68, color=Treatment)) +
geom_boxplot(mg_data, mapping=aes(x=Treatment, y=CD68))
p+ theme_classic() + geom_jitter(shape=16, position=position_jitter(0.2))
I was able to use the following code to make looped boxplots:
variables <- mg_data %>%
select(10:17)
for(i in variables) {
print(ggplot(mg_data, aes(x = Treatment, y = i, color=Treatment)) +
geom_boxplot())
}
With this code I get the boxplots; however, they do not have the name of the variable being selected as the y-axis label, unlike the original code without the for loop. I also do not know how to add the formatting code to the loop:
p + theme_classic() + geom_jitter(shape=16, position=position_jitter(0.2))
Here is a way. I have tested with built-in data set iris, just change the data name and selected columns and it will work.
suppressPackageStartupMessages({
library(dplyr)
library(ggplot2)
})
variables <- iris %>%
select(1:4) %>%
names()
for(i in variables) {
g <- ggplot(iris, aes(x = Species, y = get(i), color=Species)) +
geom_boxplot() +
ylab(i)
print(g)
}
Edit
Answering a comment by user TarJae, reproduced here because answers are deleted less often than comments:
Could you please expand with saving all four files. Many thanks.
The code above can be made to save the plots with a ggsave instruction at the loop end. The filename is the variable name and the plot is the default, the return value of last_plot().
for(i in variables) {
g <- ggplot(iris, aes(x = Species, y = get(i), color=Species)) +
geom_boxplot() +
ylab(i)
print(g)
ggsave(paste0(i, ".png"), device = "png")
}
Try this:
variables <- mg_data %>%
colnames() %>%
`[`(10:17)
for (i in variables) {
print(ggplot(mg_data, aes(
x = Treatment, y = .data[[i]], color = Treatment
)) +
geom_boxplot())
}
Another option is to use lapply. It's approximately the same as using a loop, but it hides the actual looping part and can make your code look a little cleaner.
variables = iris %>%
select(1:4) %>%
names()
lapply(variables, function(x) {
ggplot(iris, aes(x = Species, y = get(x), color=Species)) +
geom_boxplot() + ylab(x)
})
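To cover the formatting part of the question as well, here is a sketch that combines the loop with the theme and jitter layers from the original plot. It is tested only against iris as a stand-in for mg_data, and it uses the .data pronoun in place of get().

```r
library(ggplot2)

variables <- names(iris)[1:4]

# setNames(nm = variables) names the vector by its own values, so the
# resulting list of plots is named by variable.
plots <- lapply(setNames(nm = variables), function(v) {
  ggplot(iris, aes(x = Species, y = .data[[v]], color = Species)) +
    geom_boxplot() +
    geom_jitter(shape = 16, position = position_jitter(0.2)) +
    theme_classic() +
    ylab(v)
})
```

Individual plots are then available by name, for example plots[["Sepal.Length"]].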

How to modify axis labels within ggplot labs()

Say I want to modify a ggplot axis label with the str_to_title() function.
library(tidyverse)
mtcars %>%
ggplot(aes(x = wt, y = mpg)) +
geom_point() +
labs(x = ~str_to_title(.x))
Rather than my x-axis being labeled 'Wt' it will be labeled 'str_to_title(.x)'. Is there a way to apply functions within the labs() function?
labs doesn't do programmatic NSE like many other components of ggplot2. One option is to define the columns programmatically, use aes_ and as.name (or other ways too) and it'll work.
library(ggplot2)
library(stringr) # str_to_title
xx <- "wt"; yy <- "mpg"
ggplot(mtcars, aes_(x = as.name(xx), y = as.name(yy))) +
geom_point() +
labs(x = str_to_title(xx))
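Since aes_() is deprecated in recent ggplot2 releases, the same idea can be sketched with the .data pronoun. The helper name titled_scatter is illustrative, and the approach assumes the column names are available as strings:

```r
library(ggplot2)
library(stringr)  # str_to_title

# Hypothetical helper: xx and yy are column names given as strings,
# reused both for the mapping and for the title-cased axis labels.
titled_scatter <- function(data, xx, yy) {
  ggplot(data, aes(x = .data[[xx]], y = .data[[yy]])) +
    geom_point() +
    labs(x = str_to_title(xx), y = str_to_title(yy))
}

p <- titled_scatter(mtcars, "wt", "mpg")
```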

How to pass a vector to the dots (...) argument of a function

Several R functions have an argument ... that allows you to pass an arbitrary number of arguments. An example of this is the paste function, to which you can provide an arbitrary number of arguments. But sometimes, you don't know ahead of time how many arguments you want to pass.
For example, say I want to produce a plot in ggplot, where I want to color points by the combination of two columns:
df <- data.frame(x=rnorm(100),
y=rnorm(100),
cat1=sample(c(TRUE, FALSE), 100, replace=TRUE),
cat2=sample(c(TRUE, FALSE), 100, replace=TRUE),
cat3=sample(c(TRUE, FALSE), 100, replace=TRUE))
ggplot(df) + aes(x=x, y=y, col=paste(cat1,cat2)) + geom_point()
But now consider that I want the list of columns to colour by to be determined at run-time. I would like to write a function that does something like:
library(rlang)
color_plot <- function(df, color_by) {
color_by = lapply(color_by, sym)
ggplot(df) + aes(x=x, y=y, col=paste(...=color_by)) + geom_point()
}
color_plot(df, list("cat1"))
color_plot(df, list("cat2", "cat3"))
color_plot(df, list("cat1", "cat2", "cat3"))
I guess I'm looking for something equivalent to Python's *args, as in:
args =[1,2,3]
my_fun(*args)
Use syms:
color_plot <- function(df, color_by) {
color_by <- syms(color_by)
ggplot(df) + aes(x=x, y=y, col=paste(!!!color_by)) + geom_point()
}
Another method would be to use quos if you prefer passing in unquoted column names instead of a list:
library(ggplot2)
library(rlang)
color_plot <- function(df, ...) {
color_by = quos(...)
ggplot(df) + aes(x=x, y=y, col=paste(!!!color_by)) + geom_point()
}
color_plot(df, cat1, cat2, cat3)
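Outside of tidy evaluation, the closest base R analogue to Python's *args is do.call(), which calls a function with a list of arguments. A minimal sketch with a small made-up data frame (the values below are illustrative only):

```r
# do.call() spreads a list over the arguments of a function,
# so this is the same as paste("a", "b", "c").
args <- list("a", "b", "c")
joined <- do.call(paste, args)

# The same idea with columns chosen at run time.
df <- data.frame(
  cat1 = c(TRUE, FALSE),
  cat2 = c(FALSE, FALSE),
  cat3 = c(TRUE, TRUE)
)
color_by <- c("cat1", "cat3")

# unname() drops the column names so they are not passed as argument names.
combo <- do.call(paste, unname(as.list(df[color_by])))
```

The combo vector could be stored as a new column and mapped to colour in the usual way.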
