Local Variables Within aes - r

I'm trying to use a local variable in aes when I plot with ggplot. This is my problem boiled down to the essence:
xy <- data.frame(x=1:10,y=1:10)
plotfunc <- function(Data,YMul=2){
ggplot(Data,aes(x=x,y=y*YMul))+geom_line()
}
plotfunc(xy)
This results in the following error:
Error in eval(expr, envir, enclos) : object 'YMul' not found
It seems as if I cannot use local variables (or function arguments) in aes. Could it be that this occurs because the content of aes is evaluated later, when the local variable is out of scope? How can I avoid this problem (other than by not using the local variable within aes)?

I would capture the local environment,
xy <- data.frame(x=1:10,y=1:10)
plotfunc <- function(Data, YMul = 2){
.e <- environment()
ggplot(Data, aes(x = x, y = y*YMul), environment = .e) + geom_line()
}
plotfunc(xy)

Here's an alternative that allows you to pass in any value through the YMul argument without having to add it to the Data data.frame or to the global environment:
plotfunc <- function(Data, YMul = 2){
eval(substitute(
expr = {
ggplot(Data,aes(x=x,y=y*YMul)) + geom_line()
},
env = list(YMul=YMul)))
}
plotfunc(xy, YMul=100)
To see how this works, try out the following line in isolation:
substitute({ggplot(Data, aes(x=x, y=y*YMul))}, list(YMul=100))
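As a rough illustration of what substitute() does here, the base-R sketch below replaces the symbol YMul inside a quoted expression before anything is evaluated. The name expr and the stand-in list for the data are only for illustration.

```r
# substitute() swaps the symbol YMul for the value 100 without evaluating y
expr <- substitute(y * YMul, list(YMul = 100))
print(expr)
# y * 100

# The result is still an unevaluated call, so it can be evaluated later
# against anything that provides y (a plain list stands in for the data frame)
result <- eval(expr, list(y = 1:3))
print(result)
# [1] 100 200 300
```

Because the value is baked into the expression, aes never has to look up YMul at all.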

ggplot()'s aes expects YMul to be a variable within the data frame. Try including YMul there instead.
Thanks to @Justin: ggplot()'s aes seems to look for YMul in the data frame first and, if it is not found there, in the global environment. I like to add such variables to the data frame, as follows, because it makes sense to me conceptually. I also don't have to worry about changes to global variables having unexpected consequences for functions. But all of the other answers are also correct, so use whichever suits you.
require("ggplot2")
xy <- data.frame(x = 1:10, y = 1:10)
xy <- cbind(xy, YMul = 2)
ggplot(xy, aes(x = x, y = y * YMul)) + geom_line()
Or, if you want the function in your example:
plotfunc <- function(Data, YMul = 2)
{
ggplot(cbind(Data, YMul), aes(x = x, y = y * YMul)) + geom_line()
}
plotfunc(xy)

I am using ggplot2, and your example seems to work fine with the current version.
However, it is easy to come up with variants which still create trouble. I was myself confused by similar behavior, and that's how I found this post (top Google result for "ggplot how to evaluate variables when passed"). For example, if we move ggplot out of plotfunc:
xy <- data.frame(x=1:10,y=1:10)
plotfunc <- function(Data,YMul=2){
geom_line(aes(x=x,y=y*YMul))
}
ggplot(xy)+plotfunc(xy)
# Error in eval(expr, envir, enclos) : object 'YMul' not found
In the above variant, "capturing the local environment" is not a solution because ggplot is not called from within the function, and only ggplot has the "environment=" argument.
But there is now a family of functions "aes_", "aes_string", "aes_q" which are like "aes" but capture local variables. If we use "aes_" in the above, we still get an error because now it doesn't know about "x". But it is easy to refer to the data directly, which solves the problem:
plotfunc <- function(Data,YMul=2){
geom_line(aes_(x=Data$x,y=Data$y*YMul))
}
ggplot(xy)+plotfunc(xy)
# works
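For what it's worth, in more recent ggplot2 releases (3.0.0 and later, as far as I know) aes() captures the environment it is called from, so a plain aes() inside the helper may already see the local variable. The sketch below assumes such a version. layer_data() is used only to inspect the computed values without drawing anything.

```r
library(ggplot2)

xy <- data.frame(x = 1:10, y = 1:10)

# Assumption: ggplot2 >= 3.0.0, where aes() records the calling environment
plotfunc <- function(YMul = 2) {
  # x and y are looked up in the plot data, YMul in this function's environment
  geom_line(aes(x = x, y = y * YMul))
}

p <- ggplot(xy) + plotfunc(YMul = 3)

# Inspect the y values the layer would draw
head(layer_data(p)$y)
```

With this approach the data frame does not have to be passed to the helper just to reach its columns.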

Have you looked at the solution given by @wch (W. Chang)?
https://github.com/hadley/ggplot2/issues/743
I think it is the better one. It is essentially the same as @baptiste's, but it passes the environment directly in the call to ggplot.
I reproduce it here:
g <- function() {
foo3 <- 4
ggplot(mtcars, aes(x = wt + foo3, y = mpg),
environment = environment()) +
geom_point()
}
g()
# Works

If you execute your code outside of the function it works. And if you execute the code within the function with YMul defined globally, it works. I don't fully understand the inner workings of ggplot but this works...
YMul <- 2
plotfunc <- function(Data){
ggplot(Data,aes(x=x,y=y*YMul))+geom_line()
}
plotfunc(xy)

Related

Scatterplot function that can change based on variables for axes

I am trying to write a function that seems like it should be very simple, but I am having problems with it. I want to write a function that takes in three arguments: a dataframe, an x-axis variable and a y-axis variable. Based on these, I want it to return a scatterplot in which the x-axis variable and y-axis variable can be changed. This is the very basic function I wrote:
scatter_plot <- function(dataframe, x_input, y_input) {
plot <- ggplot(data = dataframe) +
geom_point(mapping = aes(x = x_input, y = y_input),
)
}
For reproducibility, consider the dataset midwest that is in the ggplot2 package. The code I wrote does not produce errors when I run it, but when I try to pass arguments into it, such as
scatter_plot(midwest, percollege, percpovertyknown)
the function returns
"Error in FUN(X[[i]], ...) : object 'percollege' not found"
It seems like it does not recognize the variables in the argument, but I have been playing around with the function for quite some time and I can't seem to figure it out. Can someone help me with how to fix this so my function works correctly?
The tidyverse uses non-standard evaluation (NSE), which makes using its facilities inside functions slightly more complicated than you might expect. Here's a version of your function that works for me.
scatter_plot <- function(dataframe, x_input, y_input) {
qX <- enquo(x_input)
qY <- enquo(y_input)
plot <- ggplot(data = dataframe) +
geom_point(mapping = aes(x = !! qX, y = !! qY),
)
return(plot)
}
As you've assigned your plot to an object, I've added a return statement.
See here for more information on NSE.
Using !!rlang::ensym() in your function should work.
scatter_plot <- function(dataframe, x_input, y_input) {
plot <- ggplot(data = dataframe) +
geom_point(mapping = aes(x = !!rlang::ensym(x_input), y = !!rlang::ensym(y_input)))
plot
}
Example
scatter_plot(midwest, percollege, percpovertyknown)
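A shorter spelling of the same idea, assuming ggplot2 3.0.0+ together with rlang 0.4.0+, is the embrace operator {{ }}, which does the work of enquo() followed by !! in one step. This is only a sketch alongside the answers above, not a replacement for them.

```r
library(ggplot2)

# Assumption: ggplot2 >= 3.0.0 and rlang >= 0.4.0, so aes() understands {{ }}
scatter_plot <- function(dataframe, x_input, y_input) {
  ggplot(data = dataframe) +
    # {{ }} forwards the unquoted column names given by the caller
    geom_point(mapping = aes(x = {{ x_input }}, y = {{ y_input }}))
}

# Column names are passed unquoted, as in the question
p <- scatter_plot(midwest, percollege, percpovertyknown)
```

Printing p (or calling print(p) inside a script) draws the scatterplot.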

Aesthetics Error When Calling ggplot() Using Two Methods

My end goal is to create a function to easily build a series of ggplot objects. However, while testing a piece of the code I plan to use within my function, I'm receiving a geom_point aesthetics error whose cause doesn't seem to match other instances of this error for which I've found SO questions.
Reproducible code below
library(ggpubr)
library(ggplot2)
redData <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
,header = TRUE, sep = ";")
datatest <- redData
x <- "alcohol"
y <- "quality"
#PlotTest fails with Error: geom_point requires the following missing aesthetics: x, y
PlotTest<-ggplot(datatest, aes(datatest$x,datatest$y)) +
geom_point()+xlim(0,15)+ylim(0,10)
#PlotTest2 works just fine, they should be functionally equivalent
PlotTest2 <- ggplot(redData, aes(redData$"alcohol", redData$"quality")) +
geom_point()+xlim(0,15)+ylim(0,10)
PlotTest
PlotTest2
PlotTest and PlotTest2 should be functionally equivalent, but they clearly are not, and I can't see what causes one to work and not the other.
EDIT
I realize now that datatest$x and datatest$y don't actually resolve to datatest$"alcohol" and datatest$"quality". That was silly.
Is there some way to access data via a variable name that stores the column name? That would be what I need.
library(ggpubr)
library(ggplot2)
redData <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv" ,header = TRUE, sep = ";")
datatest <- redData
x <- "alcohol"
y <- "quality"
ggplot(datatest,aes(x=datatest[,x],y=datatest[,y]))+geom_point()+xlim(0,15)+ylim(0,10)+labs(x=x,y=y)
ggplot(redData,aes(x=alcohol,y=quality))+geom_point()+xlim(0,15)+ylim(0,10)
You can use aes_string() which takes character variables as argument names:
library(dplyr)
library(ggplot2)
plot_cars <- function(data = mtcars, x, y) {
data %>%
ggplot(aes_string(x, y)) +
geom_point()
}
plot_cars(x = "mpg", y = "cyl")
In your example above you'd call ggplot(redData, aes_string(x, y))..., though I don't have your data to test that.
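Newer ggplot2 documentation marks aes_string() as soft-deprecated. A sketch of an alternative for column names stored as strings, assuming ggplot2 3.0.0 or later, is the .data pronoun inside a regular aes(). The function name plot_cols is hypothetical.

```r
library(ggplot2)

# plot_cols is a hypothetical helper; x and y are column names given as strings
plot_cols <- function(data, x, y) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point() +
    # label the axes with the column names rather than the .data[[...]] expression
    labs(x = x, y = y)
}

p <- plot_cols(mtcars, x = "mpg", y = "cyl")
```

The same call shape should carry over to the wine data, for example plot_cols(redData, x, y) with x and y holding "alcohol" and "quality".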

passing varying columns to aes inside a function

I am trying to write a function that calls ggplot with varying arguments to the aes:
hmean <- function(data, column, Label=label){
ggplot(data,aes(column)) +
geom_histogram() +
facet_wrap(~Antibody,ncol=2) +
ggtitle(paste("Mean Antibody Counts (Log2) for ",Label," stain"))
}
hmean(Log2Means,Primary.Mean, Label="Primary")
Error in eval(expr, envir, enclos) : object 'column' not found
Primary.Mean is the varying argument (I have multiple means). Following various posts here I have tried
passing the column name quoted and unquoted (which yields either an "unexpected string constant" or the "object not found" error)
setting up a local environment (foo <- environment() followed by an environment= arg in ggplot)
creating a new copy of the data set using a data2$column <- data[,column]
None of these appear to work within ggplot. How do I write a function that works?
I will be calling it with different data.frames and columns:
hmean(Log2Means, Primary.mean, Label="Primary")
hmean(Log2Means, Secondary.mean, Label="Secondary")
hmean(SomeOtherFrame, SomeColumn, Label="Pretty Label")
Your example is not reproducible, but this will likely work:
hmean <- function(data, column, Label=label){
ggplot(data, do.call("aes", list(y = substitute(column))) ) +
geom_histogram() +
facet_wrap(~Antibody,ncol=2) +
ggtitle(paste("Mean Antibody Counts (Log2) for ",Label," stain"))
}
hmean(Log2Means,Primary.Mean, Label="Primary")
If you need more arguments to aes, do like this:
do.call("aes", list(y = substitute(function_parameter), x = quote(literal_parameter)))
You could try this:
hmean <- function(data, column, Label=label){
# cool trick?
data$pColumn <- data[, column]
ggplot(data,aes(pColumn)) +
geom_histogram() +
facet_wrap(~Antibody,ncol=2) +
ggtitle(paste("Mean Antibody Counts (Log2) for ",Label," stain"))
}
hmean(Log2Means,'Primary.Mean', Label="Primary")
I eventually got it to work with an aes_string() call: aes_string(x=foo, y=y, colour=color), where y and color were also defined externally to ggplot().

Function that returns an aesthetic mapping

I would like to create a function that "works just like" ggplot2's aes() function. My humble attempts fail with an "Object not found" error:
library(ggplot2)
data <- data.frame(a=1:5, b=1:5)
# Works
ggplot(data) + geom_point() + aes(x=a, y=b)
my.aes <- function(x, y) { aes(x=x, y=y) }
# Fails with "Error in eval(expr, envir, enclos) : object 'x' not found"
ggplot(data) + geom_point() + my.aes(x=a, y=b)
What is the correct way to implement my.aes()? This is for encapsulation and code reuse.
Perhaps this is related, I just don't see yet how:
How to write an R function that evaluates an expression within a data-frame.
Type aes without any parentheses or arguments to see what it's doing:
function (x, y, ...)
{
aes <- structure(as.list(match.call()[-1]), class = "uneval")
rename_aes(aes)
}
It takes the name of its arguments without evaluating them. It's basically saving the names for later so it can evaluate them in the context of the data frame you're trying to plot (that's why your error message is complaining about eval). So when you include my.aes(x=a, y=b) in your ggplot construction, it's looking for x in data--because x was not evaluated in aes(x=x, y=y).
An alternate way of thinking about what's going on in aes is something like
my.aes <- function(x, y) {
ans <- list(x = substitute(x), y = substitute(y))
class(ans) <- "uneval"
ans
}
which should work in the example above, but see the note in plyr::. (which uses the same match.call()[-1] paradigm as aes):
Similar tricks can be performed with substitute, but when functions
can be called in multiple ways it becomes increasingly tricky to
ensure that the values are extracted from the correct frame.
Substitute tricks also make it difficult to program against the
functions that use them, while the quoted class provides
as.quoted.character to convert strings to the appropriate data
structure.
If you want my.aes to call aes itself, perhaps something like:
my.aes <- function(x,y) {
do.call(aes, as.list(match.call()[-1]))
}
Example with the aes_string function pointed out by Roman Luštrik:
my.aes <- function(x,y) {
aes_string(x = x, y = y)
}
but you would need to change your call to my.aes("a", "b") in this case.
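With a ggplot2 version that supports tidy evaluation (3.0.0 or later, as an assumption), a wrapper can also forward unquoted arguments to aes() with the embrace operator. The sketch below uses the hypothetical name my_aes and inspects the captured expressions with rlang::as_label().

```r
library(ggplot2)

data <- data.frame(a = 1:5, b = 1:5)

# my_aes is a hypothetical wrapper; {{ }} passes the caller's expressions through
my_aes <- function(x, y) {
  aes(x = {{ x }}, y = {{ y }})
}

m <- my_aes(x = a, y = b)

# The mapping refers to the columns a and b, not to the argument names x and y
rlang::as_label(m$x)
rlang::as_label(m$y)

p <- ggplot(data) + geom_point() + my_aes(x = a, y = b)
```

Unlike the aes_string() variant, the call site stays the same as with aes() itself.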

Use of ggplot() within another function in R

I'm trying to write a simple plot function, using the ggplot2 library. But the call to ggplot doesn't find the function argument.
Consider a data.frame called means that stores two conditions and two mean values that I want to plot (condition will appear on the X axis, means on the Y).
library(ggplot2)
m <- c(13.8, 14.8)
cond <- c(1, 2)
means <- data.frame(means=m, condition=cond)
means
# The output should be:
# means condition
# 1 13.8 1
# 2 14.8 2
testplot <- function(meansdf)
{
p <- ggplot(meansdf, aes(fill=meansdf$condition, y=meansdf$means, x = meansdf$condition))
p + geom_bar(position="dodge", stat="identity")
}
testplot(means)
# This will output the following error:
# Error in eval(expr, envir, enclos) : object 'meansdf' not found
So it seems that ggplot is calling eval, which can't find the argument meansdf. Does anyone know how I can successfully pass the function argument to ggplot?
(Note: Yes I could just call the ggplot function directly, but in the end I hope to make my plot function do more complicated stuff! :) )
The "proper" way to use ggplot programmatically is to use aes_string() instead of aes() and use the names of the columns as characters rather than as objects:
For more programmatic uses, for example if you wanted users to be able to specify column names for various aesthetics as arguments, or if this function is going in a package that needs to pass R CMD CHECK without warnings about variable names without definitions, you can use aes_string(), with the columns needed as characters.
testplot <- function(meansdf, xvar = "condition", yvar = "means",
fillvar = "condition") {
p <- ggplot(meansdf,
aes_string(x = xvar, y= yvar, fill = fillvar)) +
geom_bar(position="dodge", stat="identity")
}
As Joris and Chase have already correctly answered, standard best practice is to simply omit the meansdf$ part and directly refer to the data frame columns.
testplot <- function(meansdf)
{
p <- ggplot(meansdf,
aes(fill = condition,
y = means,
x = condition))
p + geom_bar(position = "dodge", stat = "identity")
}
This works, because the variables referred to in aes are looked for either in the global environment or in the data frame passed to ggplot. That is also the reason why your example code - using meansdf$condition etc. - did not work: meansdf is neither available in the global environment, nor is it available inside the data frame passed to ggplot, which is meansdf itself.
The fact that the variables are looked for in the global environment instead of in the calling environment is actually a known bug in ggplot2 that Hadley does not consider fixable at the moment.
This leads to problems, if one wishes to use a local variable, say, scale, to influence the data used for the plot:
testplot <- function(meansdf)
{
scale <- 0.5
p <- ggplot(meansdf,
aes(fill = condition,
y = means * scale, # does not work, since scale is not found
x = condition))
p + geom_bar(position = "dodge", stat = "identity")
}
A very nice workaround for this case is provided by Winston Chang in the referenced GitHub issue: Explicitly setting the environment parameter to the current environment during the call to ggplot.
Here's what that would look like for the above example:
testplot <- function(meansdf)
{
scale <- 0.5
p <- ggplot(meansdf,
aes(fill = condition,
y = means * scale,
x = condition),
environment = environment()) # This is the only line changed / added
p + geom_bar(position = "dodge", stat = "identity")
}
## Now, the following works
testplot(means)
Here is a simple trick I use a lot to define my variables in my function's environment (second line):
FUN <- function(fun.data, fun.y) {
fun.data$fun.y <- fun.data[, fun.y]
ggplot(fun.data, aes(x, fun.y)) +
geom_point() +
scale_y_continuous(fun.y)
}
# x must exist before data.frame() is called, since y and z are built from it
x <- rnorm(100, 0, 1)
datas <- data.frame(x = x,
y = x + rnorm(100, 2, 2),
z = x + rnorm(100, 5, 10))
FUN(datas, "y")
FUN(datas, "z")
Note how the y-axis label also changes when different variables or data-sets are used.
I don't think you need to include the meansdf$ part in your function call itself. This seems to work on my machine:
meansdf <- data.frame(means = c(13.8, 14.8), condition = 1:2)
testplot <- function(meansdf)
{
p <- ggplot(meansdf, aes(fill=condition, y=means, x = condition))
p + geom_bar(position="dodge", stat="identity")
}
testplot(meansdf)
which produces the expected bar plot.
This is an example of a problem that is discussed earlier. Basically, it comes down to ggplot2 being coded for use in the global environment mainly. In the aes() call, the variables are looked for either in the global environment or within the specified dataframe.
library(ggplot2)
means <- data.frame(means=c(13.8,14.8),condition=1:2)
testplot <- function(meansdf)
{
p <- ggplot(meansdf, aes(fill=condition,
y=means, x = condition))
p + geom_bar(position="dodge", stat="identity")
}
EDIT: After seeing the other answer and updating the ggplot2 package, the code above works. The reason, as explained in the comments, is that ggplot will look for the variables in aes either in the global environment (when the data frame is referenced explicitly as meansdf$...) or within the mentioned environment.
For this, be sure you work with the latest version of ggplot2.
If it is important to pass the variables (column names) to the custom plotting function unquoted, while different variable names are used within the function, then another workaround that I tried is to make use of match.call() and eval (like here as well):
library(ggplot2)
meansdf <- data.frame(means = c(13.8, 14.8), condition = 1:2)
testplot <- function(df, x, y) {
arg <- match.call()
scale <- 0.5
p <- ggplot(df, aes(x = eval(arg$x),
y = eval(arg$y) * scale,
fill = eval(arg$x)))
p + geom_bar(position = "dodge", stat = "identity")
}
testplot(meansdf, condition, means)
Created on 2019-01-10 by the reprex package (v0.2.1)
Another workaround, this time passing quoted variables to the custom plotting function, is to use get():
meansdf <- data.frame(means = c(13.8, 14.8), condition = 1:2)
testplot <- function(df, x, y) {
scale <- 0.5
p <- ggplot(df, aes(x = get(x),
y = get(y) * scale,
fill = get(x)))
p + geom_bar(position = "dodge", stat = "identity")
}
testplot(meansdf, "condition", "means")
Created on 2019-01-10 by the reprex package (v0.2.1)
This frustrated me for some time. I wanted to send different data frames with different variable names and I wanted the ability to plot different columns from the data frame. I finally got a work around by creating some dummy (global) variables to handle plotting and forcing assignment inside the function
plotgraph <- function(df,df.x,df.y) {
dummy.df <<- df
dummy.x <<- df.x
dummy.y <<- df.y
p = ggplot(dummy.df,aes(x=dummy.x,y=dummy.y,.....))
print(p)
}
then in the main code I can just call the function
plotgraph(data,data$time,data$Y1)
plotgraph(data,data$time,data$Y2)
Short answer: Use qplot
Long answer:
In essence you want something like this:
my.barplot <- function(x=this.is.a.data.frame.typically) {
# R code doing the magic comes here
...
}
But that lacks flexibility because you must stick to consistent column naming to avoid the annoying R scope idiosyncrasies. Of course the next logical step is:
my.barplot <- function(data=data.frame(), x=..., y....) {
# R code doing something really really magical here
...
}
But then that starts looking suspiciously like a call to qplot(), right?
qplot(data=my.data.frame, x=some.column, y=some.other column,
geom="bar", stat="identity",...)
Of course now you'd like to change things like scale titles, but for that a function comes in handy... the good news is that scoping issues are mostly gone.
my.plot <- qplot(data=my.data.frame, x=some.column, y=some.other column,...)
set.scales <- function(p, xscale=scale_x_continuous, xtitle=NULL,
yscale=scale_y_continuous, ytitle=NULL) {
# the axis title is passed through the scale's name argument
return(p + xscale(name=xtitle) + yscale(name=ytitle))
}
my.plot.prettier <- set.scales(my.plot, scale_x_discrete, 'Days',
scale_y_discrete, 'Count')
Another workaround is to pass the aes(...) mapping as an argument of your function:
func<-function(meansdf, aes(...)){}
This just worked fine for me on a similar topic
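One way to read this suggestion is to build the mapping at the call site and hand it to the function as an ordinary argument. The sketch below assumes that reading. The argument name mapping is my own choice, not something from the answer.

```r
library(ggplot2)

means <- data.frame(means = c(13.8, 14.8), condition = 1:2)

# The caller builds the mapping with aes(); the function only forwards it
testplot <- function(meansdf, mapping) {
  ggplot(meansdf, mapping) +
    geom_bar(position = "dodge", stat = "identity")
}

p <- testplot(means, aes(x = condition, y = means, fill = condition))
```

Since aes() is evaluated where the caller writes it, the function body never needs to know the column names.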
You don't need anything fancy, not even dummy variables. You only need to add a print() inside your function; it is like using cat() when you want something to show in the console.
myplot <- ggplot(......) + Whatever you want here
print(myplot)
It worked for me more than once inside the same function.
I just generate new data frame variables with the desired names inside the function:
testplot <- function(df, xVar, yVar, fillVar) {
df$xVar = df[,which(names(df)==xVar)]
df$yVar = df[,which(names(df)==yVar)]
df$fillVar = df[,which(names(df)==fillVar)]
p <- ggplot(df,
aes(x=xVar, y=yVar, fill=fillVar)) +
geom_bar(position="dodge", stat="identity")
}
