Passing ggplot through a function and referencing an argument - r

I'm using ggplot2 within a function and trying to create average lines for the y-axis. I'm running into trouble seemingly because the variable that defines the y-axis is one of the function args, and I can't call it directly within ggplot.
library(ggplot2)
west <- data.frame(
spend = sample(50:100,50,replace=T),
trials = sample(100:200,50,replace=T),
courts = sample(25:50,50,replace=T),
country = sample(c("usa","canada","uk"),50,replace = T)
)
Here's a basic version of the function I'm working with:
ggfun <- function(data, xvar, yvar) {
newplot <- ggplot(data=west, aes_string(x=xvar, y=yvar)) +
geom_point(shape=21, fill="blue")
newplot
}
and calling that as below works:
ggfun(west, "spend", "trials")
But when I try to add in geom_hline, I get an error:
ggfun <- function(data, xvar, yvar) {
newplot <- ggplot(data=west, aes_string(x=xvar, y=yvar)) +
geom_point(shape=21, fill="blue") +
geom_hline(yintercept=mean(yvar))
newplot
}
ggfun(west, "spend", "trials")
Warning messages:
1: In mean.default(data$yvar) :
argument is not numeric or logical: returning NA
2: Removed 1 rows containing missing values (geom_hline).
Is it not possible to call data this way within a function using ggplot?

aes_string() parses each whole string as an R expression, not only as a variable name. You can therefore build an expression such as "mean(trials)" with paste0():
library(ggplot2)
west <- data.frame(
spend = sample(50:100,50,replace=T),
trials = sample(100:200,50,replace=T),
courts = sample(25:50,50,replace=T),
country = sample(c("usa","canada","uk"),50,replace = T)
)
ggfun <- function(data, xvar, yvar) {
newplot <- ggplot(data=data, aes_string(x=xvar, y=yvar)) +
geom_point(shape=21, fill="blue") +
geom_hline(aes_string(yintercept = paste0('mean(', yvar, ')')))
newplot
}
ggfun(west, "spend", "trials")

yvar is a string, so mean(yvar) behaves exactly as if you had written the following outside a function:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_hline(yintercept = mean("mpg"))
Warning messages:
1: In mean.default("mpg") :
argument is not numeric or logical: returning NA
2: Removed 1 rows containing missing values (geom_hline).
I'd recommend pre-computing the mean so that you can pass a numeric value to yintercept:
ggfun <- function(data, xvar, yvar) {
mean_yvar = mean(data[[yvar]])
newplot <- ggplot(data = data, aes_string(x = xvar, y = yvar)) +
geom_point(shape=21, fill="blue") +
geom_hline(yintercept = mean_yvar)
newplot
}
ggfun(west, "spend", "trials")
# works fine
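As a hedged aside: aes_string() is deprecated in recent ggplot2 releases. A minimal sketch of the same pre-computed-mean idea using the .data pronoun, assuming ggplot2 3.0.0 or later, might look like this. The name ggfun_tidy is hypothetical and does not come from the answers above.

```r
library(ggplot2)

# Hypothetical variant of ggfun: column names arrive as strings and are
# looked up with the .data pronoun instead of aes_string().
ggfun_tidy <- function(data, xvar, yvar) {
  # Compute the mean from the column named by the string yvar
  mean_yvar <- mean(data[[yvar]], na.rm = TRUE)
  ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_point(shape = 21, fill = "blue") +
    geom_hline(yintercept = mean_yvar)
}

set.seed(1)
west <- data.frame(
  spend  = sample(50:100, 50, replace = TRUE),
  trials = sample(100:200, 50, replace = TRUE)
)
p <- ggfun_tidy(west, "spend", "trials")
```

Because the mean is an ordinary number by the time it reaches geom_hline(), nothing inside the plot has to evaluate a string.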

Related

Creating boxplots for visualisation in R using a function

I wrote some code to create boxplots for visualisation in R. The code runs, but no boxplots appear. Can you please help me identify where I went wrong?
create_boxplots <- function(x, y){
ggplot(data = forest_fires) +
aes_string(x = x, y = y) +
geom_boxplot() +
theme(panel.background = element_rect(fill = "white"))
}
x_var_month <- names(forest_fires)[3]
y_var <- names(forest_fires)[5:12]
month_box <- map2(x_var_month, y_var, create_boxplots)
The code is fine. When ggplot() is called inside a function, the function returns the plot object without drawing it, and map2() collects those objects in a list. You need to print an element to display it. For example:
library(ggplot2)
library(purrr)
library(gridExtra)
create_boxplots <- function(x, y){
ggplot(data = forest_fires) +
aes_string(x = x, y = y) +
geom_boxplot() +
theme(panel.background = element_rect(fill = "white"))
}
forest_fires = data.frame(matrix(runif(1300),ncol=13))
forest_fires[,3] = factor(sample(1:12,nrow(forest_fires),replace=TRUE))
x_var_month <- names(forest_fires)[3]
y_var <- names(forest_fires)[5:12]
month_box <- map2(x_var_month, y_var, create_boxplots)
This shows you the plot for y_var[1]:
month_box[[1]]
#or print(month_box[[1]])
To get everything in one plot, do:
grid.arrange(grobs=month_box)
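To illustrate the point about printing, here is a small sketch that builds a list of plots and then draws each one explicitly. It uses mtcars instead of forest_fires, writes to a temporary PDF, and the helper name box_for is hypothetical.

```r
library(ggplot2)

# Hypothetical helper: one boxplot of the column named y_name, grouped by cyl.
box_for <- function(y_name) {
  ggplot(mtcars, aes(x = factor(cyl), y = .data[[y_name]])) +
    geom_boxplot() +
    labs(x = "cyl", y = y_name)
}

# Building the list draws nothing; it only stores plot objects.
plots <- lapply(c("mpg", "hp", "wt"), box_for)

# Draw every plot explicitly, one page each, into a temporary PDF.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
for (p in plots) print(p)
invisible(dev.off())
```

At the console, typing plots[[1]] has the same effect as print(plots[[1]]) because R auto-prints top-level values; inside loops and functions the print() call has to be explicit.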

R - How do I pass a function as an argument to another function?

I have a function I'm creating like this:
library(ggplot2)
plot_function <- function(data, x, y){
ggplot(data, aes_string(x=x, y=y)) +
geom_line() +
scale_y_continuous(labels = scales::comma_format())
}
I can call it like this:
df <- data.frame(date = seq(as.Date("2019/01/01"), as.Date("2019/01/05"),"1 day"),
value = seq(.1,.5, .1))
df
date value
2019-01-01 0.1
2019-01-02 0.2
2019-01-03 0.3
2019-01-04 0.4
2019-01-05 0.5
plot_function(df, x = "date", "value")
But what if I wanted to let the user change the y axis to a percentage? How can I let them replace scales::comma_format()? This doesn't work:
plot_function <- function(data, x, y, y_format){
ggplot(data, aes_string(x=x, y=y)) +
geom_line() +
scale_y_continuous(labels = y_format)
}
plot_function(df, x = "date", "value", y_format = "scales::percent_format()")
I get this error:
"Error in f(..., self = self) : Breaks and labels are different lengths"
Another option is to set up the function using the ... argument, so that passing a labels argument to scale_y_continuous is optional:
plot_function <- function(data, x, y, ...) {
ggplot(data, aes_string(x=x, y=y)) +
geom_line() +
scale_y_continuous(...)
}
# Pass nothing to scale_y_continuous
plot_function(mtcars, x = "cyl", y="hp")
# Add some big numbers to mtcars
mtcars$hp = 1e5 * mtcars$hp
# Pass a labels argument to scale_y_continuous to get comma formatted values
plot_function(mtcars, x = "cyl", y="hp", labels=scales::comma)
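Alternatively, pass the formatter function itself rather than a string, and call it inside the function. A character string is taken as a literal vector of labels, which is why the lengths do not match. Try this: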
plot_function <- function(data, x, y, y_format){
ggplot(data, aes_string(x=x, y=y)) +
geom_line() +
scale_y_continuous(labels = y_format())
}
plot_function(df, x = "date", "value", y_format = scales::percent_format)
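The two answers can be combined into one sketch in which the labelling function is an ordinary argument with a default. This assumes the scales package is installed and uses the .data pronoun (ggplot2 3.0.0 or later); the name plot_function2 and the is.function() check are illustrative additions.

```r
library(ggplot2)

# Hypothetical variant of plot_function: y_format must be a labelling
# function; it defaults to comma formatting.
plot_function2 <- function(data, x, y, y_format = scales::comma_format()) {
  stopifnot(is.function(y_format))  # reject strings such as "scales::percent_format()"
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_line() +
    scale_y_continuous(labels = y_format)
}

df <- data.frame(
  date  = seq(as.Date("2019-01-01"), as.Date("2019-01-05"), by = "1 day"),
  value = seq(0.1, 0.5, 0.1)
)

p_default <- plot_function2(df, "date", "value")
p_percent <- plot_function2(df, "date", "value",
                            y_format = scales::percent_format())
```

Note that here the caller passes the result of scales::percent_format(), which is itself a function, so the body uses y_format without calling it again.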

return plot and values from a function together [duplicate]

I have a function like this:
fun <- function(dataset){
require(ggplot2)
g <- ggplot(dataset, aes(x = x, y = y)) + geom_smooth(method = "lm") + geom_point()
l<-lm(y~x)
return (list(l, g))
}
and I want to return both plot and the values, but it doesn't return the plot and I face this error:
Error in .Call.graphics(C_palette2, .Call(C_palette2, NULL)) :
invalid graphics state
What can I do?
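The following works: lm() is given data = dataset so that it can find x and y, and the plot is returned as the second element of the list. However, R warns that this is not the recommended way to do it.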
fun <- function(dataset){
require(ggplot2)
p <- ggplot(dataset, aes(x = x, y = y)) +
geom_smooth(method = "lm") + geom_point()
l <- lm(y~x, data=dataset)
return (list(l, p))
}
dataset <- data.frame(x= 1:10, y=1:10)
out <- fun(dataset)
Edit: I looked into the warning, and it seems to be something you can ignore. See https://stat.ethz.ch/pipermail/r-devel/2016-December/073554.html
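A slightly tidier sketch of the same idea returns a named list, so the caller can pick out the model and the plot by name. The function name fit_and_plot and the element names model and plot are hypothetical; library() replaces require() at the top level.

```r
library(ggplot2)

# Hypothetical helper: fit a linear model and build the matching plot,
# returning both in a named list.
fit_and_plot <- function(dataset) {
  fit <- lm(y ~ x, data = dataset)  # data = dataset lets lm() find x and y
  p <- ggplot(dataset, aes(x = x, y = y)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x)
  list(model = fit, plot = p)
}

dataset <- data.frame(x = 1:10, y = 2 * (1:10) + 1)
out <- fit_and_plot(dataset)

coef(out$model)  # fitted intercept and slope
# print(out$plot) draws the plot when a graphics device is available
```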

ggplot line or point plotting conditionally

I'm using ggplot to write a bunch of plots inside a function. I want to pass a flag to the function so that I can choose, when calling it, whether to plot lines or points.
Currently I'm doing it like this:
plot2pdfHD = function(opdata, dir = 'plots'){
#... do something ...
plots <- list()
for (i in seq(strikes)){
#... do something ...
plots[[i]] <- ggplot(sset, aes(x = TIMESTAMP, y = value, col = optype)) +
geom_line() + labs(x = "Time", y = 'values') +
#... do something ...
}
pdf(paste0(dir, '/', Sys.Date(), '.pdf'), width=16, height=10)
for(i in seq(length(plots)))
tryCatch({print(plots[[i]])}, error = function(e) NULL)
dev.off()
}
I want to add a flag so that by setting appropriate value to the flag I can switch between geom_line() and geom_point() while calling the function.
Addition:
Can it be done without repeating the additional call part, i.e. #... do something ...? I am hoping for an answer that does that.
Also sset is a subset of the opdata.
Maybe this is what you're looking for? I like @arvi1000's answer (nice and clear), but you can also put the if statement inside a single ggplot addition expression:
type = "line"
## also try with
# type = "point"
ggplot(mtcars, aes(x = wt, y = mpg)) + {
if(type == "point") geom_point() else geom_line()
} +
theme_bw()
For multiple layers, you could do something like this:
gg = ggplot(mtcars, aes(x = wt, y = mpg))
{
if(type == "point") {
gg + geom_point()
} else {
gg + geom_line() + stat_smooth()
}
} + theme_bw()
(Of course, adding the theme_bw() to the original gg definition would be cleaner, but this demonstrates that it can be added later.)
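Another way to handle multiple layers, sketched here as an assumption rather than as part of the original answer, relies on the fact that a list of layers can be added to a ggplot with +. The helper name layers_for is hypothetical.

```r
library(ggplot2)

# Hypothetical helper: return the layers for a given plot type as a list.
layers_for <- function(type) {
  if (type == "point") {
    list(geom_point())
  } else {
    list(geom_line(), stat_smooth(method = "lm", formula = y ~ x))
  }
}

type <- "line"
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  layers_for(type) +  # every element of the list is added in order
  theme_bw()
```

This keeps the conditional logic out of the plotting expression, which may be easier to read when several layers depend on the same flag.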
Plain old if block?
plot2pdfHD = function(opdata, dir = 'plots', plot_type){
plots <- list()
for (i in seq(strikes)) {
# base gg call
p <- ggplot(sset, aes(x = TIMESTAMP, y = value, col = optype))
# add geom layer based on content of argument:
if(plot_type == 'line') p <- p + geom_line()
if(plot_type == 'point') p <- p + geom_point()
# add additional params and assign into list item
plots[[i]] <- p + labs(x = "Time", y = 'values')
#...
}
# ...
}
Other notes:
I'm assuming you do something to make sset different on each iteration; otherwise you will get a list of identical plots.
lapply might be better than a for loop here, especially since you want a list object as the result anyway.
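The lapply suggestion could be sketched as follows. Since sset and strikes are not defined in the question, this example splits mtcars by cyl as a stand-in; the names pick_geom and make_plots are hypothetical.

```r
library(ggplot2)

# Hypothetical helper: translate the flag into a single geom layer.
# match.arg() rejects anything other than "line" or "point".
pick_geom <- function(plot_type = c("line", "point")) {
  plot_type <- match.arg(plot_type)
  switch(plot_type, line = geom_line(), point = geom_point())
}

# Hypothetical helper: one plot per data subset, all using the chosen geom.
make_plots <- function(subsets, plot_type = "line") {
  geom <- pick_geom(plot_type)
  lapply(subsets, function(d) {
    ggplot(d, aes(x = wt, y = mpg)) +
      geom +
      labs(x = "Weight", y = "values")
  })
}

subsets <- split(mtcars, mtcars$cyl)  # stand-in for the per-strike subsets
plots <- make_plots(subsets, plot_type = "point")
```

The shared labs() call is written once, so nothing after the geom needs to be repeated for each plot type.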

Error in trying to write a plotting function in ggplot2

I am trying to write a function in ggplot2 and obtain this error message:
Error in layout_base(data, vars, drop = drop) :
At least one layer must contain all variables used for facetting
Here is my code:
growth.plot<-function(data,x,y,fac){
gp <- ggplot(data = data,aes(x = x, y = y))
gp <- gp + geom_point() + facet_wrap(~ fac)
return(gp)
}
growth.plot(data=mydata, x=x.var, y=y.var,fac= fac.var)
If I try without the function, the plot appears perfectly
gp1 <- ggplot(data = mydata, aes(x = x.var, y = y.var))
gp1+ geom_point()+ facet_wrap(~ fac.var) # this works
Here is a reproducible solution in which the x, y, and fac arguments must be passed as character strings:
library(ggplot2)
make_plot = function(data, x, y, fac) {
p = ggplot(data, aes_string(x=x, y=y, colour=fac)) +
geom_point(size=3) +
facet_wrap(as.formula(paste("~", fac)))
return(p)
}
p = make_plot(iris, x="Sepal.Length", y="Petal.Length", fac="Species")
ggsave("iris_plot.png", plot=p, height=4, width=8, dpi=120)
Thanks to commenters @Roland and @aosmith for pointing the way to this solution.
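For comparison, a hedged sketch of the same function without aes_string(), assuming ggplot2 3.0.0 or later: the .data pronoun handles the aesthetics, and facet_wrap() is given the column name as a character vector. The name make_plot2 is hypothetical.

```r
library(ggplot2)

# Hypothetical variant of make_plot: all column names are strings.
make_plot2 <- function(data, x, y, fac) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]], colour = .data[[fac]])) +
    geom_point(size = 3) +
    facet_wrap(fac) +          # a character vector names the facetting column
    labs(x = x, y = y, colour = fac)
}

p <- make_plot2(iris, x = "Sepal.Length", y = "Petal.Length", fac = "Species")
```

The labs() call restores readable axis and legend titles, which would otherwise show the .data[[...]] expressions.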

Resources