I'm trying to create a utility function that combines several geom_, like in this example (which doesn't work):
my_geom_y <- function(yy, colour){
geom_line(aes(y=yy), col=colour) + geom_point(aes(y=yy), col=colour)
}
so that then I can do this:
myX <- 0:90
ggplot(mapping = aes(x=myX)) + my_geom_y(dlnorm(myX), "red") + my_geom_y(dexp(myX), "blue")
Is that possible?
I tried using get(), eval(), substitute(), and as.name(), to no avail.
Related posts ("passing parameters to ggplot" and "Use of ggplot() within another function in R") didn't help either.
I like MSM's approach, but if you want to be able to add my_geom_y to a ggplot you've already made, this is an alternative that might suit what you're after:
library(ggplot2)
x <- 1:100
my_geom_y <- function(yy, colour = "black"){
list(
geom_line(mapping = aes(y = yy),
col = colour,
data = data.frame(x, yy)),
geom_point(mapping = aes(y = yy),
col = colour,
data = data.frame(x, yy))
)
}
ggplot(mapping = aes(x)) +
my_geom_y(x, "red") +
my_geom_y(dlnorm(x), "blue") +
my_geom_y((x^1.1), "black") +
my_geom_y(x/2, "yellow")
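As far as I understand it, the list version works because ggplot2's + needs a plot (or theme) on its left-hand side, whereas a plain list of layers can be added to a plot and each element is added in turn. A minimal sketch of the difference, using a toy data frame d that is only assumed for illustration:

```r
library(ggplot2)

# Adding two layers to each other, with no plot on the left, is an error.
layers_added <- tryCatch(
  geom_line() + geom_point(),
  error = function(e) e
)
inherits(layers_added, "error")

# A plain list of layers can be added to a plot; each element is added in turn.
d <- data.frame(x = 1:5, y = (1:5)^2)  # toy data, assumed for illustration
p <- ggplot(d, aes(x, y)) + list(geom_line(), geom_point())
length(p$layers)
```

This is why the original my_geom_y, which used + between geom_line() and geom_point(), fails before it is ever added to a plot.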
I don't have enough reputation to comment, so here is a suggestion:
my_geom_y <- function(xx, yy, colour){
ggplot() +
geom_line(aes(x=xx, y=yy), col=colour) +
geom_point(aes(x=xx, y=yy), col=colour)
}
This creates a single plot with one curve. To draw several curves, pass the inputs to the function as a list and loop over it inside the function, adding the geoms for each element (two or more ggplot objects cannot be added together).
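The list-and-loop idea could look roughly like the sketch below. The name plot_curves and its arguments (ys, a list of y vectors, and colours, a matching vector of colours) are made up for illustration:

```r
library(ggplot2)

# Hypothetical helper: one line + point pair per element of `ys`.
plot_curves <- function(xx, ys, colours) {
  layers <- lapply(seq_along(ys), function(i) {
    d <- data.frame(x = xx, y = ys[[i]])
    list(
      geom_line(aes(x = x, y = y), data = d, colour = colours[i]),
      geom_point(aes(x = x, y = y), data = d, colour = colours[i])
    )
  })
  # Flatten the list of pairs into one flat list of layers and add it to the plot.
  ggplot() + unlist(layers, recursive = FALSE)
}

myX <- 0:10
p <- plot_curves(myX, list(dlnorm(myX), dexp(myX)), c("red", "blue"))
```

Printing p should then show both curves on a single plot.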
Based on @luke-c's idea, this makes the function standalone and cut-and-paste ready. We can now also add a label to each curve.
my_geom_y <- function(.xx, .yy, yLabel = 1, .colour=NA ){
if (is.na(.colour))
.colour <- palette()[(yLabel - 1) %% length(palette()) + 1] # wrap around without ever producing index 0
list( geom_line(mapping=aes(.xx,.yy), col=.colour, data=data.frame(.xx, .yy)),
geom_point(mapping=aes(.xx,.yy), col=.colour, data=data.frame(.xx, .yy)),
annotate(geom="text" , col = .colour, label=deparse(substitute(.yy)),
x=mean(.xx),y=max(.yy)-(max(.yy)-min(.yy))/20*yLabel)
)
}
myX <- 1:10
ggplot() + my_geom_y(myX, dlnorm(myX), 1) +
my_geom_y(myX, dexp(myX), 2) + my_geom_y(myX, dexp(myX,0.7), 3)
This function comes in handy when you need to visually compare multiple distributions.
Related
I can't seem to find a way to combine two ggplots having different function ranges.
library(ggplot2)
myfun <- function(x) {
1/(1 + exp(-x))}
ggplot( NULL,aes(x)) +
stat_function(data=data.frame(x=c(0, 20)),fun=myfun, geom="line") +
stat_function(data=data.frame(x=c(10, 20)),fun=1/myfun, geom="line")
EDIT: I had a mistake in the question: the second stat_function should use 1/myfun instead of myfun.
I am not sure if this is what you want, but the code below plots your function in two different colors over two ranges:
library(ggplot2)
myfun <- function(x) {
1/(1 + exp(-x))}
ggplot(NULL) +
stat_function(data= data.frame(x = c(0, 10)), aes(x, color = "blue"), fun=myfun, xlim = c(0,10)) +
stat_function(data= data.frame(x = c(10, 20)), aes(x, color = "red"), fun=myfun, xlim = c(10,20)) +
scale_color_manual(labels = c("blue", "red"), values = c("blue", "red"))
Output:
As you can see in the plot, the function is plotted within two different ranges.
Answer to edited question
I would suggest just making a second function like this:
library(ggplot2)
myfun1 <- function(x) {
1/(1 + exp(-x))}
myfun2 <- function(x) {
1/(1/(1 + exp(-x)))}
ggplot( NULL) +
stat_function(data=data.frame(x=c(0, 20)),fun=myfun1, geom="line") +
stat_function(data=data.frame(x=c(10, 20)),fun=myfun2, geom="line")
I am trying to create some ggplots automatically. Here is my working code example for adding stat_functions:
require(ggplot2)
p1 <- ggplot(data.frame(x = c(-2.5, 7.5)), aes(x = x)) + theme_minimal()+
stat_function(fun= function(x){1*x},lwd=1.25, colour = "navyblue") +
stat_function(fun= function(x){2*x},lwd=1.25, colour = "navyblue") +
stat_function(fun= function(x){3*-x},lwd=1.25, colour = "red")
p1
As you can see, the stat_functions all use (nearly) the same function, just with a different parameter.
Here is what I have tried to write:
f <- function(plot,list){
for (i in 1:length(list)){
plot <- plot + stat_function(fun= function(x){x*list[i]})
}
return(plot)
}
p1 <- ggplot(data.frame(x = c(-2.5, 7.5)), aes(x = x)) + theme_minimal()
p2 <- f(p1,c(1,2,3))
p2
However, this doesn't return 3 lines, but only one. Why?
Your question is a bit confusing, because the first plot also varies other things (the colour and the sign), whereas your function has a single stat_function call in which only the multiplier changes.
Anyways. Keep the ggplot main object separate and create a list of additional objects, very easy for example with lapply. Add this list to your main plot as usual.
Check also https://ggplot2-book.org/programming.html
library(ggplot2)
p <- ggplot(data.frame(x = c(-2.5, 7.5)), aes(x = x)) + theme_minimal()
ls_sumfun <- lapply(1:3, function(y){
stat_function(fun= function(x){y*x}, lwd=1.25, colour = "navyblue")
}
)
p + ls_sumfun
Created on 2021-04-26 by the reprex package (v2.0.0)
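As for why the loop in the question draws only one line: the anonymous functions created in the loop do not store the value of i, they look it up when they are finally called, which happens when the plot is rendered. By then the loop has finished and i is 3, so all three layers draw the same line on top of each other. A ggplot-free sketch of the same effect in base R:

```r
# Closures created in a for loop all look up the same variable i.
fs <- list()
for (i in 1:3) {
  fs[[i]] <- function(x) x * i
}
loop_result <- sapply(fs, function(f) f(1))

# With lapply, each function gets its own k.
gs <- lapply(1:3, function(k) {
  force(k)  # evaluate k now rather than lazily
  function(x) x * k
})
lapply_result <- sapply(gs, function(f) f(1))

loop_result    # every element uses the final value of i
lapply_result  # one distinct multiplier per function
```

The lapply version above avoids the problem for the same reason: each call has its own y.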
In R, you can pass functions as arguments. You can also return functions from functions. This might make your code simpler and cleaner.
Here's an example:
p1 <- ggplot(data.frame(x = c(-2.5, 7.5)), aes(x = x))
add_stat_fun <- function (ggp, f) {
ggp + stat_function(fun = f)
}
make_multiply_fun <- function (multiplier) {
force(multiplier) # evaluate the argument now, so each closure keeps its own value
f <- function (x) {multiplier * x}
return(f)
}
my_funs <- lapply(1:3, make_multiply_fun)
# my_funs is now a list of functions
add_stat_fun(p1, my_funs[[1]])
add_stat_fun(p1, my_funs[[2]])
add_stat_fun(p1, my_funs[[3]])
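Note that the three calls above each return a separate plot with one line. If the goal is all three lines on one plot, one possible way (a sketch, not the only option) is to fold the list of functions over the base plot with Reduce(), reusing the same add_stat_fun and make_multiply_fun:

```r
library(ggplot2)

p1 <- ggplot(data.frame(x = c(-2.5, 7.5)), aes(x = x))

add_stat_fun <- function(ggp, f) {
  ggp + stat_function(fun = f)
}

make_multiply_fun <- function(multiplier) {
  force(multiplier)
  function(x) multiplier * x
}

my_funs <- lapply(1:3, make_multiply_fun)

# Reduce() calls add_stat_fun(p1, my_funs[[1]]), then feeds the result back in
# together with my_funs[[2]], and so on.
p_all <- Reduce(add_stat_fun, my_funs, p1)
```

p_all then holds one plot with three stat_function layers.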
I would like to use a variable of the data frame passed to the data parameter of the ggplot function in another ggplot2 function in the same call.
For instance, in the following example I want to refer to the variable x in the dataframe passed to the data parameter in ggplot in another function scale_x_continuous such as in:
library(ggplot2)
set.seed(2017)
samp <- sample(x = 20, size= 1000, replace = T)
ggplot(data = data.frame(x = samp), mapping = aes(x = x)) + geom_bar() +
scale_x_continuous(breaks = seq(min(x), max(x)))
And I get the error :
Error in seq(min(x)) : object 'x' not found
which I understand. Of course I can avoid the problem by doing :
df <- data.frame(x = samp)
ggplot(data = df, mapping = aes(x = x)) + geom_bar() +
scale_x_continuous(breaks = seq(min(df$x), max(df$x)))
but I don't want to be forced to define the object df outside the call to ggplot. I want to be able to directly refer to the variables in the dataframe I passed in data.
Thanks a lot
The scale_x_continuous function does not evaluate its parameters in the data environment. One reason for this is that each layer can have its own data source, so by the time you get to the scales it wouldn't be clear which data environment is the "correct" one any more.
You could write a helper function to initialize the plot with your default. For example
helper <- function(df, col) {
ggplot(data = df, mapping = aes_string(x = col)) +
scale_x_continuous(breaks = seq(min(df[[col]]), max(df[[col]])))
}
and then call
helper(data.frame(x = samp), "x") + geom_bar()
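aes_string() is soft-deprecated in recent ggplot2 releases, so here is a sketch of the same helper using the .data pronoun instead. The name helper2 is made up to keep it apart from the version above, and the explicit labs() call is my addition so that the axis is labelled with the column name:

```r
library(ggplot2)

# Same idea as helper(), but looking the column up through the .data pronoun.
helper2 <- function(df, col) {
  ggplot(data = df, mapping = aes(x = .data[[col]])) +
    scale_x_continuous(breaks = seq(min(df[[col]]), max(df[[col]]))) +
    labs(x = col)
}

set.seed(2017)
samp <- sample(x = 20, size = 1000, replace = TRUE)
p <- helper2(data.frame(x = samp), "x") + geom_bar()
```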
Or you could write a wrapper around just the scale part. For example
scale_x_custom <- function(x) {
scale_x_continuous(breaks = seq(min(x) , max(x)))
}
and then you can add your custom scale to your plot
ggplot(data = df, mapping = aes(x = x)) +
geom_bar() +
scale_x_custom(df$x)
Or, since you just want breaks at integer values, you can calculate the breaks from the default limits without needing to specify the data at all. For example
scale_x_custom <- function() {
scale_x_continuous(expand=expansion(0, .3),
breaks = function(x) {
seq(ceiling(min(x)), floor(max(x)))
})
}
ggplot(data = df, mapping = aes(x = x)) +
geom_bar() +
scale_x_custom()
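When breaks is a function, ggplot2 calls it with the limits of the scale, so it never needs to see the data. To see what the function above does on its own, here it is pulled out under the hypothetical name int_breaks and applied to some illustrative limits (the numbers are made up, not taken from the plot):

```r
# Stand-alone copy of the breaks function used in scale_x_custom().
int_breaks <- function(limits) {
  seq(ceiling(min(limits)), floor(max(limits)))
}

# Illustrative limits slightly wider than the data range 1..20.
b <- int_breaks(c(0.55, 20.45))
b
```

Every integer inside the limits becomes a break, whatever the limits turn out to be.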
Another, less than ideal, alternative is to use the special . placeholder of the magrittr pipe (%>%) in combination with {}.
Enclosing the ggplot call in curly brackets allows one to reference . multiple times.
library(magrittr)
data.frame(x = samp) %>%
{ggplot(data = ., mapping = aes(x = x)) + geom_bar() +
scale_x_continuous(breaks = seq(min(.$x), max(.$x)))}
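If you would rather not depend on magrittr, a similar effect can be had in base R by wrapping the call in an anonymous function and calling it immediately. This is only a sketch of that idea; the argument name d is arbitrary:

```r
library(ggplot2)

set.seed(2017)
samp <- sample(x = 20, size = 1000, replace = TRUE)

# The data frame is created once, bound to d, and can be referenced repeatedly.
p <- (function(d) {
  ggplot(data = d, mapping = aes(x = x)) + geom_bar() +
    scale_x_continuous(breaks = seq(min(d$x), max(d$x)))
})(data.frame(x = samp))
```

No df object is left behind in the calling environment.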
As an example, suppose that I have this snippet of code:
binwidth <- 0.01
my.histogram <- ggplot(my.data, aes(x = foo, fill = type)) +
geom_histogram(binwidth = binwidth,
aes(y = ..density..),
position = "identity",
alpha = 0.5) +
lims(x = c(0 - binwidth, 1 + binwidth), y = c(0, 100)) +
labs(x = "foo", y = "density")
Further, suppose that my.data has many other columns besides foo that could be plotted using pretty much the same code. Therefore, I would like to define a helper function make.histogram, so that I could replace the assignment above with something like:
my.histogram <- make.histogram(foo, binwidth = 0.01)
Actually, this looks a bit weird to me. Would R complain that foo is not defined? Maybe the call would have to be this instead:
my.histogram <- make.histogram("foo", binwidth = 0.01)
Be that as it may, how would one define make.histogram?
For the purpose of this question, make.histogram may treat my.data as a global variable.
Also, note that in the snippet above, foo appears twice: once (as a variable) as the x argument in the first aes call, and once (as a string) as the x argument in the labs call. In other words, the make.histogram function somehow needs to translate the column specified in its first argument into both a variable name and a string.
I'm not sure I understand your question.
Why couldn't you use aes_string() and define a function like the one below?
make.histogram <- function(variable) {
p <- ggplot(my.data, aes_string(x = variable, fill = "type")) + (...) + xlab(variable)
print(p)
}
Since ggplot is part of the tidyverse, I think tidyeval will come in handy:
make.histogram <- function(var = "foo", binwidth = 0.01) {
varName <- as.name(var)
enquo_varName <- enquo(varName)
ggplot(my.data, aes(x = !!enquo_varName, fill = type)) +
...
labs(x = var)
}
Basically, with as.name() we generate a name object that matches var (here var is a string like "foo"). Then, following Programming with dplyr, we use enquo() to look at that name and return the associated value as a quosure. This quosure can then be unquoted inside the ggplot() call using !!.
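For a column name passed as a string, a shorter route that I believe works in ggplot2 3.0.0 and later is the .data pronoun, which skips the as.name()/enquo() step. A sketch, with a made-up my.data standing in for the real one and after_stat(density) used in place of ..density..:

```r
library(ggplot2)

# Hypothetical stand-in for my.data, for illustration only.
set.seed(1)
my.data <- data.frame(foo  = runif(200),
                      type = rep(c("a", "b"), each = 100))

make.histogram <- function(var = "foo", binwidth = 0.01) {
  ggplot(my.data, aes(x = .data[[var]], fill = type)) +
    geom_histogram(aes(y = after_stat(density)),
                   binwidth = binwidth,
                   position = "identity",
                   alpha = 0.5) +
    labs(x = var, y = "density")
}

my.histogram <- make.histogram("foo", binwidth = 0.01)
```

The string serves both purposes here: it selects the column through .data[[var]] and is used directly as the axis label.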
After reading the material that @andrew.punnett linked in his comment, it was very easy to code the desired function:
make.histogram <- function(column.name, binwidth = 0.02) {
base.aes <- eval(substitute(aes(x = column.name, fill = type)))
x.label <- deparse(substitute(column.name))
ggplot(my.data, base.aes) +
geom_histogram(binwidth = binwidth,
aes(y = ..density..),
position = "identity",
alpha = 0.5) +
lims(x = c(0 - binwidth, 1 + binwidth), y = c(0, 100)) +
labs(x = x.label, y = "density")
}
my.histogram <- make.histogram(foo, binwidth = 0.01)
The benefit of this solution is its generality: it relies only on base R functions (substitute, eval, and deparse), so it can be easily ported to situations outside of the ggplot2 context.
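To see what those two base R calls do outside of ggplot2, here is a tiny sketch. The function name show_arg is made up, and foo does not need to exist because it is never evaluated:

```r
# substitute() captures the unevaluated argument; deparse() turns it into a string.
show_arg <- function(arg) {
  expr <- substitute(arg)
  list(expr = expr, label = deparse(expr))
}

res <- show_arg(foo)
is.name(res$expr)  # the captured argument is a symbol
res$label          # its text form, usable as an axis label
```

In make.histogram above, the symbol is spliced into the aes() call by eval(substitute(...)), while the string goes to labs().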