I often combine dplyr with ggplot2 in wrapper functions for analysis. As I am moving to the new NSE / SE paradigm of dplyr v0.7.1 with tidyeval, I am struggling to get this combination to work. I found that ggplot2 does not understand unquoted quosures (yet). The following does not work:
example_func <- function(col) {
  col <- enquo(col)
  mtcars %>% count(!!col) %>%
    ggplot(aes((!!col), n)) +
    geom_bar(stat = "identity")
}
example_func(cyl)
# Error in !col : invalid argument type
I currently use the following work-around. But I assume there must be a better way.
example_func2 <- function(col) {
  col <- enquo(col)
  mtcars %>% count(!!col) %>%
    ggplot(aes_string(rlang::quo_text(col), "n")) +
    geom_bar(stat = "identity")
}
Please show me the best way to combine these two. Thanks!
If you are already handling quosures it's easier to use aes_ which accepts inputs quoted as a formula: aes_(col, ~n).
This bit of code solves your problem:
library(tidyverse)
example_func <- function(col) {
  col <- enquo(col)
  mtcars %>% count(!!col) %>%
    ggplot(aes_(col, ~n)) +
    geom_bar(stat = "identity")
}
example_func(cyl)
There seem to be two ways of thinking about this.
Approach 1: Separation of concerns.
I like my plotting stuff to be very much separate from my wrangling stuff. Also, you can name your group, which feels like the easiest way to solve your problem [although you do lose the original column name]. So one way of doing what you're trying to do is:
library(tidyverse)
concern1_data <- function(df, col) {
  group <- enquo(col)
  df %>%
    group_by(group = !!group) %>%
    summarise(n = n())
}
concern2_plotting <- function(df){
  ggplot(data=df) +
    geom_bar(aes(group, n), stat = "identity")
}
mtcars %>%
  concern1_data(am) %>%
  concern2_plotting()
This achieves what you're trying to do more or less and keeps concerns apart (which deserves a mention).
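If losing the original column name is a concern, one option is to carry it along as a plain string and use it only for the axis label. This is a minimal sketch, not part of the original answer: count_with_label and plot_counts are hypothetical names, and it assumes an rlang version that provides as_label().

```r
library(dplyr)
library(ggplot2)

# Wrangling step: returns the counts plus the original column name as a string.
count_with_label <- function(df, col) {
  col <- enquo(col)
  counts <- df %>%
    group_by(group = !!col) %>%
    summarise(n = n())
  list(data = counts, label = rlang::as_label(col))
}

# Plotting step: knows nothing about the original column except its label.
plot_counts <- function(res) {
  ggplot(res$data) +
    geom_bar(aes(group, n), stat = "identity") +
    labs(x = res$label)
}

res <- count_with_label(mtcars, am)
p <- plot_counts(res)
```

The two concerns stay separate; only the label travels from the wrangling step to the plotting step.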
Approach 2: Accept and Wait
Thing is: tidyeval is not yet implemented in ggplot2.
- Colin Fay from link
I think this is support that is currently not in ggplot2 but I can't imagine that ggplot2 won't get this functionality. It's just not there yet.
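For readers on a later release: as far as I know, ggplot2 3.0.0 and up do support tidy evaluation inside aes(), so the wait described above is over. A minimal sketch, assuming ggplot2 >= 3.0.0 and rlang >= 0.4.0 (for the {{ }} operator); count_plot is a hypothetical name that does not appear in the original posts.

```r
library(dplyr)
library(ggplot2)

# Hypothetical wrapper: `col` is a bare column name supplied by the caller.
count_plot <- function(df, col) {
  df %>%
    count({{ col }}) %>%          # forwards the column to count()
    ggplot(aes({{ col }}, n)) +   # maps the same column to x
    geom_col()                    # geom_col() is geom_bar(stat = "identity")
}

p <- count_plot(mtcars, cyl)
```

With an older rlang, capturing the argument with enquo(col) and unquoting it with !!col inside aes() should behave the same way.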
Related
Problem: purrr::pmap() output incompatible with ggplot::aes()
The following reprex boils down to a single question: is there any way we can use quoted variable names inside ggplot2::aes() instead of the bare names? Example: we typically use ggplot(mpg, aes(displ, cyl)); how can we make aes() work normally with ggplot(mpg, aes("displ", "cyl"))?
If you understood my question, the remainder of this reprex really adds no information. However, I added it to draw the full picture of the problem.
More details: I want to use purrr functions to create a bunch of routine exploratory data analysis plots effortlessly. The problem is that purrr::pmap() passes the variable names as quoted strings, which ggplot2::aes() doesn't understand. As far as I can tell, the functions cat() and as.name() can take a quoted variable name and return it in the form that aes() normally understands: unquoted. However, neither of them worked. The following reprex reproduces the problem. I commented the code to spare you the pain of figuring out what it does.
library(tidyverse)
# Divide the classes of variables into numeric and non-numeric. Goal: place a combination of numeric variables on the axes while encoding a non-numeric variable.
mpg_numeric <- map_lgl(.x = seq_along(mpg), .f = ~ mpg[[.x]] %>% class() %in% c("numeric","integer"))
mpg_factor <- map_lgl(.x = seq_along(mpg), .f = ~ mpg[[.x]] %>% class() %in% c("factor","character"))
# create all possible combinations of the variables
eda_routine_combinations <- expand_grid(num_1 = mpg[mpg_numeric] %>% names(),
num_2 = mpg[mpg_numeric] %>% names(),
fct = mpg[mpg_factor] %>% names()) %>%
filter(num_1 != num_2) %>% slice_head(n = 2) # for simplicity, keep only the first 2 combinations
# use purrr::pmap() to create all the plots we want in a single call
pmap(.l = list(eda_routine_combinations$num_1,
eda_routine_combinations$num_2,
eda_routine_combinations$fct) ,
.f = ~ mpg %>%
ggplot(aes(..1 , ..2, col = ..3)) +
geom_point() )
Next we pinpoint the problem using a typical ggplot2 call.
This is what we want purrr::pmap() to create in its iterations:
mpg %>%
ggplot(aes(displ , cyl, fill = drv)) +
geom_boxplot()
However, this is what purrr::pmap() renders, with quoted variable names:
mpg %>%
ggplot(aes("displ" , "cyl", fill = "drv")) +
geom_boxplot()
Failing attempts
Using cat() to transform the quoted variable names from pmap() into unquoted form for aes() to understand fails.
mpg %>%
ggplot(aes(cat("displ") , cat("cyl"), fill = cat("drv"))) +
geom_boxplot()
Using as.name() to transform the quoted variable names from pmap() into unquoted form for aes() to understand fails.
mpg %>%
ggplot(aes(as.name("displ") , as.name("cyl"), fill = as.name("drv"))) +
geom_boxplot()
Bottom line
Is there a way to make ggplot(aes("quoted_var_name")) work properly?
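One way to approach this, sketched under the assumption of ggplot2 >= 3.0.0 (where aes() supports tidy evaluation), is to look each column up by name with the .data pronoun. aes() captures its arguments unevaluated, so a string stays a constant; cat() prints its input and returns NULL, and as.name() yields a symbol rather than column values, which is why neither helps. The helper plot_pair and the two combinations below are hypothetical, chosen only for illustration.

```r
library(ggplot2)
library(purrr)

# Two illustrative combinations of column names, stored as strings.
combos <- data.frame(
  num_1 = c("displ", "displ"),
  num_2 = c("cty", "hwy"),
  fct   = c("drv", "class"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: each argument is a column name given as a string.
plot_pair <- function(x, y, colour) {
  ggplot(mpg, aes(.data[[x]], .data[[y]], colour = .data[[colour]])) +
    geom_point()
}

# pmap() passes one row of names at a time, positionally.
plots <- pmap(list(combos$num_1, combos$num_2, combos$fct), plot_pair)
```

Writing aes(!!rlang::sym(x), ...) is an equivalent spelling of the same idea.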
When doing data analysis, we often use dplyr to modify the dataframe further in specific geoms. This allows us to change the default dataframe of a ggplot later, and have everything still work.
template <- ggplot(db, aes(x=time, y=value)) +
  geom_line(data=function(db){db %>% filter(event=="Bla")}) +
  geom_ribbon(aes(ymin=low, ymax=up))
# ggsave() takes the file name first, then the plot
ggsave("global.png", template)
for(i in unique(db$simulation))
  ggsave(paste0(i, ".png"), template %+% subset(db, simulation==i))
Is there a nicer/shorter way to specify the filter command, e.g. using some magical .?
EDIT
To clarify some of the comments: By using geom_line(data = db %>% filter(event=="Bla")), the layer would not be updated when I change the default dataframe later using %+%. I am really aiming to use the data argument of geom_* as a function.
Upon reading the documentation of %>% better, I have found the solution:
Using the dot-place holder as lhs
When the dot is used as lhs, the result will be a functional sequence, i.e. a function which applies the entire chain of right-hand sides in turn to its input. See the examples.
Therefore, the nicest way to formulate the above example, incorporating the suggestions from above as well:
db <- diamonds
template <- ggplot(db, aes(x=carat, y=price, color=cut)) +
geom_point() +
geom_smooth(data=. %>% filter(color=="J")) +
labs(caption="Smooths only for J color")
ggsave("global.png", template)
db %>% group_by(cut) %>% do(
ggsave( paste0(.$cut[1], ".png"), plot=template %+% .)
)
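To make the mechanism explicit, here is a minimal sketch showing that a pipeline starting with the dot is an ordinary function, and that a layer given such a function re-applies it whenever the default data is swapped with %+%. The names only_j and ideal are illustrative and not from the original post.

```r
library(dplyr)
library(ggplot2)

# A pipeline that starts with `.` is a function of one argument.
only_j <- . %>% filter(color == "J")
stopifnot(is.function(only_j))

template <- ggplot(diamonds, aes(carat, price)) +
  geom_point(data = only_j)   # applied to the plot's default data at build time

# Swap the default data; the layer's filter is applied to the new data frame.
ideal <- template %+% filter(diamonds, cut == "Ideal")
```

An explicit function(d) filter(d, color == "J") passed as data should behave the same way; the functional sequence is just shorter.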
I can find no solution for the following two issues:
First I try this:
library(tidyverse)
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) + geom_point(shape=group)
Error in layer(data = data, mapping = mapping, stat = stat, geom =
GeomPoint,:object 'group' not found
which is obviously not working. Using something like .$group is also not successful. Of note, I have to specify the shape outside of aes().
The second problem is this. I'm not able to call a saved ggplot (gg) within a pipe.
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) + geom_point()
mtcars %>%
filter(vs == 0) %>%
gg + geom_point(aes(x=carb, y=drat), size = 4)
Error in gg(.) : could not find function "gg"
Thanks for your help!
Edit
After a long time I found a solution here. One has to wrap the complete ggplot expression in {}.
mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>% {
ggplot(.,aes(carb,drat)) +
geom_point(shape=.$group)}
If you wrap your shape definition in aes() you can get the desired behavior. To use shape outside of aes() you can pass it a single value (i.e. shape=1). Also note that group is converted to a discrete variable, because geom_point throws an error when you map a continuous variable to shape.
library(tidyverse)
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) +
geom_point(aes(shape=as.factor(group)))
gg
Second, the %>% operator, when called as lhs %>% rhs, assumes that the rhs is a function. So, as the error shows, you are calling gg as a function. Calling a plot as a function on a data frame (i.e. gg(mtcars)) isn't a valid operation.
See @docendo discimus's comment on the question for how to use {} to add a layer to an existing ggplot object from a magrittr pipeline.
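A minimal sketch of that brace idiom, assuming only dplyr (for the magrittr pipe) and ggplot2; gg2 is an illustrative name. The filtered data is handed to the new layer through its data argument.

```r
library(dplyr)
library(ggplot2)

gg <- ggplot(mtcars, aes(x = carb, y = drat)) + geom_point()

# Inside { }, `.` is the piped data frame; it is not inserted as a first argument.
gg2 <- mtcars %>%
  filter(vs == 0) %>%
  { gg + geom_point(data = ., size = 4) }
```

The second layer inherits the x and y mappings from gg, so only the data and the size need to be given.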
When I integrate tables and figures in a document using knitr, adding the code makes it more reproducible and interesting.
Often a combination of dplyr and ggvis can make a plot that has relatively legible code (using the magrittr pipe operator %>%).
mtcars %>%
group_by(cyl, am) %>%
summarise( weight = mean(wt) ) %>%
ggvis(x=~am, y=~weight, fill=~cyl) %>%
layer_bars()
The problem is that the ggvis plot does not look quite as pretty as the ggplot2 plot (I know, factoring of cyl).
However, for ggplot2 we need:
mtcars %>%
group_by(am, cyl) %>%
summarise( weight = mean(wt) ) %>%
ggplot( aes(x=am, y=weight, fill=cyl) ) +
geom_bar(stat='identity')
My problem is that this switches from %>% to + for piping. I know this is a very minor itch, but I would much prefer to use:
mtcars %>%
group_by(am, cyl) %>%
summarise( weight = mean(wt) ) %>%
ggplot( aes(x=am, y=weight, fill=cyl) ) %>%
geom_bar(stat='identity')
Is there a way to modify the behaviour of ggplot2 so that this would work?
P.S. I don't like the idea of using magrittr's add(), since this again makes the code more complicated to read.
This is too long to expand on in the comments, and from your answer I am not sure whether you tried the bit of code I provided and it didn't work, or whether you had tried something similar previously without success:
geom_barw<-function(DF,x,y,fill,stat){
  require(ggplot2)
  p<-ggplot(DF,aes_string(x=x,y=y,fill=fill)) + geom_bar(stat=stat)
  return(p)
}
library(magrittr)
library(dplyr)
library(ggplot2)
mtcars %>%
group_by(cyl, am) %>%
summarise( weight = mean(wt) ) %>%
geom_barw(x='am', y='weight', fill='cyl', stat='identity')
This works for me with:
dplyr_0.4.2 ggplot2_2.1.0 magrittr_1.5
Of course geom_barw could be modified so you don't need to use the quotes anymore.
EDIT: There should be a more elegant and safer way with lazy (see the lazyeval package), but a very quick adaptation would be to use substitute (as pointed out by Axeman, though without the deparse part):
geom_barw<-function(DF,x,y,fill,stat){
  require(ggplot2)
  x<-substitute(x)
  y<-substitute(y)
  fill<-substitute(fill)
  p<- ggplot(DF,aes_string(x=x,y=y,fill=fill))
  p<- p + geom_bar(stat=stat)
  return(p)
}
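On current package versions the same wrapper can be written without aes_string() or substitute(). A minimal sketch, assuming ggplot2 >= 3.0.0 and rlang >= 0.4.0; bar_from_pipe is a hypothetical name, not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of geom_barw() that takes bare column names.
bar_from_pipe <- function(df, x, y, fill, stat = "identity") {
  ggplot(df, aes({{ x }}, {{ y }}, fill = {{ fill }})) +
    geom_bar(stat = stat)
}

# The whole chain stays in %>% because the wrapper hides the `+`.
p <- mtcars %>%
  group_by(cyl, am) %>%
  summarise(weight = mean(wt)) %>%
  bar_from_pipe(x = am, y = weight, fill = cyl)
```

The {{ }} operator forwards each bare name into aes(), so the caller no longer needs quotes.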
I am trying to replicate ggplot2's position="fill" in ggvis. I use this handy option all the time when presenting results. Below is a reproducible example that works in ggplot2, followed by my ggvis attempt. Can it be done using the scale_numeric function?
library(ggplot2)
p <- ggplot(mtcars, aes(x=factor(cyl), fill=factor(vs)))
p+geom_bar()
p+geom_bar(position="fill")
library(ggvis)
q <- mtcars %>%
ggvis(~factor(cyl), fill = ~factor(vs))%>%
layer_bars()
# Something like this?
q %>% scale_numeric("y", domain = c(0,1))
I think that to do this sort of thing with ggvis you have to do the heavy lifting of reshaping the data before sending it to ggvis. ggplot2's geom_bar handily does a lot of calculations for you (counting things up, weighting them, etc.) that you need to do explicitly yourself in ggvis. So try something like the below (there may be more elegant ways):
mtcars %>%
mutate(cyl=factor(cyl), vs=as.factor(vs)) %>%
group_by(cyl, vs) %>%
summarise(count=length(mpg)) %>%
group_by(cyl) %>%
mutate(proportion = count / sum(count)) %>%
ggvis(x= ~cyl, y = ~proportion, fill = ~vs) %>%
layer_bars()