Dynamically change ggplot depending on the arguments passed to a function in R - r

I am currently writing a function that ultimately returns a ggplot. I would like to let the user change different aspects of that plot, such as the x-axis label, by passing arguments to the function. Right now I am using code like this:
library(tidyverse)
d <- sample_n(diamonds,500)
plot_something <- function(data, x, y, x_axis_name = NULL) {
  if (is.null(x_axis_name)) {
    p <- ggplot(data, aes_string(x = x, y = y)) +
      geom_point()
  } else {
    p <- ggplot(data, aes_string(x = x, y = y)) +
      geom_point() +
      xlab(x_axis_name)
  }
  return(p)
}
plot_something(data=d, x="depth", y="price",x_axis_name = "random_name")
This works fine, but as you can see, a lot of the code is duplicated; the only difference is the xlab call. In this case it is not too bad, but my actual function is much more complicated, and things get harder if I also want to let the user modify, for example, the ylab.
So my question is whether there is a more elegant way to modify ggplots inside a function depending on the arguments passed by the user.
Any help is much appreciated!

There is no need for the duplicated code. You could conditionally add layers to a base plot as desired like so:
library(tidyverse)
d <- sample_n(diamonds, 500)
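plot_something <- function(data, x, y, x_axis_name = NULL) {
  # NULL when no name is given; adding NULL to a plot changes nothing
  x_lab <- if (!is.null(x_axis_name)) xlab(x_axis_name)
  # Use the `data` argument rather than the global `d`
  p <- ggplot(data, aes_string(x = x, y = y)) +
    geom_point() +
    x_lab
  return(p)
}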
plot_something(data = d, x = "depth", y = "price", x_axis_name = "random_name")
Or, using ggplot2's built-in mechanism for choosing defaults, you don't even need an if condition:
plot_something <- function(data, x, y, x_axis_name = ggplot2::waiver()) {
  # waiver() tells ggplot2 to fall back to its default axis label
  p <- ggplot(data, aes_string(x = x, y = y)) +
    geom_point() +
    xlab(x_axis_name)
  return(p)
}
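The same idea extends to several optional labels at once. The following is only a sketch, not part of the original answer: the y_axis_name and title arguments are hypothetical additions, and the .data pronoun is used in place of aes_string(), which newer ggplot2 releases deprecate (this assumes ggplot2 3.0.0 or later).

```r
library(ggplot2)

# Sketch: every optional label defaults to waiver(), which tells ggplot2
# to fall back to its own default for that label.
# `y_axis_name` and `title` are hypothetical extra arguments.
plot_something <- function(data, x, y,
                           x_axis_name = waiver(),
                           y_axis_name = waiver(),
                           title = waiver()) {
  # .data[[x]] looks up the column whose name is stored in the string x
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point() +
    labs(x = x_axis_name, y = y_axis_name, title = title)
}

d <- diamonds[sample(nrow(diamonds), 500), ]
p <- plot_something(d, x = "depth", y = "price", x_axis_name = "random_name")
```

Because each label has its own default, adding another optional label only requires one more argument and one more entry in labs(), with no extra branching.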

Related

Is there a way to pass the data of a ggplot2 call to the scale_* functions that works with .+gg in one pass [duplicate]

I would like to use a variable of the data frame passed to the data parameter of the ggplot function in another ggplot2 function in the same call.
For instance, in the following example I want to refer to the variable x in the dataframe passed to the data parameter in ggplot in another function scale_x_continuous such as in:
library(ggplot2)
set.seed(2017)
samp <- sample(x = 20, size = 1000, replace = TRUE)
ggplot(data = data.frame(x = samp), mapping = aes(x = x)) + geom_bar() +
scale_x_continuous(breaks = seq(min(x), max(x)))
And I get the error :
Error in seq(min(x)) : object 'x' not found
which I understand. Of course I can avoid the problem by doing :
df <- data.frame(x = samp)
ggplot(data = df, mapping = aes(x = x)) + geom_bar() +
scale_x_continuous(breaks = seq(min(df$x), max(df$x)))
but I don't want to be forced to define the object df outside the call to ggplot. I want to be able to directly refer to the variables in the dataframe I passed in data.
Thanks a lot
The scale_x_continuous function does not evaluate its parameters in the data environment. One reason for this is that each layer can have its own data source, so by the time you get to the scales it is no longer clear which data environment is the "correct" one.
You could write a helper function to initialize the plot with your default. For example
helper <- function(df, col) {
ggplot(data = df, mapping = aes_string(x = col)) +
scale_x_continuous(breaks = seq(min(df[[col]]), max(df[[col]])))
}
and then call
helper(data.frame(x = samp), "x") + geom_bar()
Or you could write a wrapper around just the scale part. For example
scale_x_custom <- function(x) {
scale_x_continuous(breaks = seq(min(x) , max(x)))
}
and then you can add your custom scale to your plot
ggplot(data = df, mapping = aes(x = x)) +
geom_bar() +
scale_x_custom(df$x)
Or, since you just want breaks at integer values, you can calculate the breaks from the default limits without needing to specify the data at all. For example
scale_x_custom <- function() {
scale_x_continuous(expand=expansion(0, .3),
breaks = function(x) {
seq(ceiling(min(x)), floor(max(x)))
})
}
ggplot(data = df, mapping = aes(x = x)) +
geom_bar() +
scale_x_custom()
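To see what the breaks function in that last variant does, note that ggplot2 calls it with the numeric range covered by the scale rather than with the data. A small sketch, using the hypothetical name integer_breaks, can be tried on its own:

```r
# Sketch: `integer_breaks` is a hypothetical name for the function passed to `breaks`.
# It receives a numeric range such as c(0.7, 20.3) and returns the whole
# numbers that lie inside it.
integer_breaks <- function(limits) {
  seq(ceiling(min(limits)), floor(max(limits)))
}

integer_breaks(c(0.7, 20.3))  # the whole numbers from 1 to 20
```

Rounding the lower end up and the upper end down keeps every break inside the range, which is why the approach needs no reference to the data frame.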
Another, less than ideal, alternative is to use the . placeholder of magrittr's %>% pipe in combination with {}.
Enclosing the ggplot call in curly braces lets you reference . multiple times.
library(magrittr)
data.frame(x = samp) %>%
{ggplot(data = ., mapping = aes(x = x)) + geom_bar() +
scale_x_continuous(breaks = seq(min(.$x), max(.$x)))}

R: How to pass parameters to ggplot geom_ within a function?

I'm trying to create a utility function that combines several geom_, like in this example (which doesn't work):
my_geom_y <- function(yy, colour){
geom_line(aes(y=yy), col=colour) + geom_point(aes(y=yy), col=colour)
}
so that then I can do this:
myX <- 0:90
ggplot(mapping = aes(x=myX)) + my_geom_y(dlnorm(myX), "red") + my_geom_y(dexp(myX), "blue")
Is that possible?
I tried using get(), eval(), substitute() and as.name(), to no avail.
The related posts "passing parameters to ggplot" and "Use of ggplot() within another function in R" didn't help.
I like MSM's approach, but if you want to be able to add my_geom_y to a ggplot you've already made, this is an alternative that might suit what you're after:
library(ggplot2)
x <- 1:100
my_geom_y <- function(yy, colour = "black") {
  # Return the layers as a list so they can be added to an existing plot with +
  list(
    geom_line(mapping = aes(y = yy),
              col = colour,
              data = data.frame(x, yy)),
    geom_point(mapping = aes(y = yy),
               col = colour,
               data = data.frame(x, yy))
  )
}
ggplot(mapping = aes(x)) +
my_geom_y(x, "red") +
my_geom_y(dlnorm(x), "blue") +
my_geom_y((x^1.1), "black") +
my_geom_y(x/2, "yellow")
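The reason this works is that ggplot2 accepts a list of layers on the right-hand side of +, adding each element in turn. A variation on the idea, shown here only as a sketch, passes x explicitly instead of reading it from the global environment; the name my_geom_xy and its arguments are hypothetical.

```r
library(ggplot2)

# Sketch: `my_geom_xy` is a hypothetical helper that bundles a line and a
# point layer. Each layer carries its own data, so nothing is looked up globally.
my_geom_xy <- function(x, y, colour = "black") {
  df <- data.frame(x = x, y = y)
  list(
    geom_line(aes(x = x, y = y), data = df, colour = colour),
    geom_point(aes(x = x, y = y), data = df, colour = colour)
  )
}

x <- 1:100
p <- ggplot() +
  my_geom_xy(x, dlnorm(x), "red") +
  my_geom_xy(x, dexp(x), "blue")
```

Each call contributes two layers, so the plot above ends up with four, and the helper can be added to any existing ggplot object in the same way.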
I don't have enough reputation to comment, so here is a suggestion:
my_geom_y <- function(xx, yy, colour){
ggplot() +
geom_line(aes(x=xx, y=yy), col=colour) +
geom_point(aes(x=xx, y=yy), col=colour)
}
This will create one plot. To create multiple ones, you need to pass your inputs to the function as a list and loop through it inside the function for each geom (since we can't add two or more ggplot objects) - if that makes sense.
Based on @luke-c's idea, this makes the function standalone and ready to cut and paste. We can now also add a label to each curve.
my_geom_y <- function(.xx, .yy, yLabel = 1, .colour = NA) {
  # Default colour: cycle through the palette (1-based, so the index is never 0)
  if (is.na(.colour))
    .colour <- palette()[(yLabel - 1) %% length(palette()) + 1]
  list(
    geom_line(mapping = aes(.xx, .yy), col = .colour, data = data.frame(.xx, .yy)),
    geom_point(mapping = aes(.xx, .yy), col = .colour, data = data.frame(.xx, .yy)),
    # Label each curve with the expression that was passed as .yy
    annotate(geom = "text", col = .colour, label = deparse(substitute(.yy)),
             x = mean(.xx), y = max(.yy) - (max(.yy) - min(.yy)) / 20 * yLabel)
  )
}
myX <- 1:10
ggplot() + my_geom_y(myX, dlnorm(myX), 1) +
my_geom_y(myX, dexp(myX), 2) + my_geom_y(myX, dexp(myX,0.7), 3)
This function becomes handy when you need to visually compare multiple distributions.

Using ggplot2 in a function [duplicate]

I would like to create a function that generates graphs using ggplot. For the sake of simplicity, the typical graph may be
ggplot(cars, aes(x=speed, y=dist)) + geom_point()
The function I would like to create is of the type
f <- function(DS, x, y) ggplot(DS, aes(x=x, y=y)) + geom_point()
This however won't work, since x and y are not strings. This problem has been noted in previous SO questions (e.g., this one), but without providing, in my view, a satisfactory answer. How would one modify the function above to make it work with arbitrary data frames?
One solution would be to pass x and y as string names of columns in data frame DS.
f <- function(DS, x, y) {
ggplot(DS, aes_string(x = x, y = y)) + geom_point()
}
And then call the function as:
f(cars, "speed", "dist")
However, it seems that you don't want that? Can you provide an example why you would need different functionality? Is it because you don't want to have the arguments in the same data frame?
I think something like the following is possible, which builds only the aes component:
require(ggplot2)
DS <- data.frame(speed=rnorm(10), dist=rnorm(10))
f <- function(DS, x, y, geom, opts=NULL) {
aes <- eval(substitute(aes(x, y),
list(x = substitute(x), y = substitute(y))))
p <- ggplot(DS, aes) + geom + opts
}
p <- f(DS, speed, dist, geom_point())
p
However, it seems to be a complicated approach.
Another option is to use do.call. Here is a one-line copy-paste from working code:
gg <- gg + geom_rect( do.call(aes, args=list(xmin=xValues-0.5, xmax=xValues+0.5, ymin=yValues, ymax=rep(Inf, length(yValues))) ), alpha=0.2, fill=colors )
One approach I can think of is to use match.call() to get at the variable names passed as arguments to the custom plotting function and then call eval() on them. This way you avoid having to pass them as quoted strings, if you would rather not.
library(ggplot2)
fun <- function(df, x, y) {
arg <- match.call()
ggplot(df, aes(x = eval(arg$x), y = eval(arg$y))) + geom_point()
}
fun(mpg, cty, hwy) # no need to pass the variables (column names) as quoted / as strings
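Newer versions of ggplot2 offer a shorter route to the same result. The sketch below assumes ggplot2 3.0.0 or later together with rlang 0.4.0 or later, where aes() supports the {{ }} (embrace) operator; the function name scatter is hypothetical.

```r
library(ggplot2)

# Sketch: {{ }} forwards the unquoted column names given by the caller to aes().
# `scatter` is a hypothetical name.
scatter <- function(df, x, y) {
  ggplot(df, aes(x = {{ x }}, y = {{ y }})) +
    geom_point()
}

p <- scatter(mpg, cty, hwy)  # column names are passed unquoted
```

Compared with match.call() and eval(), this keeps the function body close to an ordinary ggplot call, at the cost of requiring the more recent package versions noted above.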

Resources