Using dplyr functions in a pipe after ggplot - r

Is it possible to do a summarise after using ggplot in a pipe? The variable is not of great importance and I am just looking at the change for an exploratory purpose. Therefore, I don't really want to save the variable.
df %>%
mutate(change = t2 - t1) %>%
ggplot(aes(x = change)) +
geom_histogram() %>%
summarise(mean_change = mean(change))
Error in UseMethod("summarise_") : no applicable method for 'summarise_' applied to
an object of class "c('LayerInstance', 'Layer', 'ggproto')"
Is it possible to render ggplot output AND do a summarise (showing the mean) in the same pipe?

I don't know if this is exactly what you're looking for, but your question reminds me of the tee pipe (%T>%) in magrittr (part of the tidyverse), which I found in the online book "R for Data Science": http://r4ds.had.co.nz/pipes.html#other-tools-from-magrittr .
With the tee pipe, you can call ggplot and then continue to summarise, because the tee pipe returns not the ggplot object but the object that was passed to ggplot.
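A minimal sketch of that idea, assuming a small made-up data frame `df` with columns `t1` and `t2` (the question does not show its data). The plot is printed explicitly inside braces, because a ggplot object created in the middle of a pipeline is not drawn on its own, and `%T>%` then hands the mutated data frame on to `summarise()`.

```r
library(dplyr)
library(ggplot2)
library(magrittr)  # provides %T>%, which dplyr does not re-export

# Hypothetical data standing in for the asker's df
df <- data.frame(t1 = c(1, 2, 3, 4), t2 = c(2, 4, 3, 7))

result <- df %>%
  mutate(change = t2 - t1) %T>%
  # Side effect only: draw the histogram; %T>% discards this value
  { print(ggplot(., aes(x = change)) + geom_histogram(bins = 5)) } %>%
  # Receives the mutated data frame, not the plot
  summarise(mean_change = mean(change))

result
```

Nothing is saved permanently here except `result`; dropping the assignment prints the summary directly.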

Using the ggfun package, this would work:
# devtools::install_github("moodymudskipper/ggfun")
library(tidyverse)
library(ggfun)
iris %>%
mutate(x = Sepal.Length * Sepal.Width) %>%
ggplot(aes(x)) +
geom_density() -
as_mapper(~{print(summarise(.,mean_x = mean(x)));.})
The - operator is used to apply a function on the data and still return a ggplot object, so we apply it to print but still return the original data to get our plot.

Related

purrr::pmap() output incompatible with what ggplot::aes() expects

Problem: purrr::pmap() output incompatible with ggplot::aes()
The following reprex boils down to a single question: is there any way to use quoted variable names inside ggplot2::aes() instead of the bare names? Example: we typically use ggplot(mpg, aes(displ, cyl)); how can we make aes() work normally with ggplot(mpg, aes("displ", "cyl"))?
If you understood my question, the remainder of this reprex really adds no information. However, I added it to draw the full picture of the problem.
More details: I want to use purrr functions to create a batch of routine exploratory data analysis plots effortlessly. The problem is that purrr::pmap() passes the variable names as quoted strings, which ggplot2::aes() doesn't understand. As far as I know, cat() and as.name() can take a quoted variable name and return it in the unquoted form that aes() understands. However, neither of them worked. The following reprex reproduces the problem. I commented the code to spare you the pain of figuring out what it does.
library(tidyverse)
# Divide the classes of variables into numeric and non-numeric. Goal: place a combination of numeric variables on the axes while encoding a non-numeric variable.
mpg_numeric <- map_lgl(.x = seq_along(mpg), .f = ~ mpg[[.x]] %>% class() %in% c("numeric","integer"))
mpg_factor <- map_lgl(.x = seq_along(mpg), .f = ~ mpg[[.x]] %>% class() %in% c("factor","character"))
# create all possible combinations of the variables
eda_routine_combinations <- expand_grid(num_1 = mpg[mpg_numeric] %>% names(),
num_2 = mpg[mpg_numeric] %>% names(),
fct = mpg[mpg_factor] %>% names()) %>%
filter(num_1 != num_2) %>% slice_head(n = 2) # for simplicity, keep only the first 2 combinations
# use purrr::pmap() to create all the plots we want in a single call
pmap(.l = list(eda_routine_combinations$num_1,
eda_routine_combinations$num_2,
eda_routine_combinations$fct) ,
.f = ~ mpg %>%
ggplot(aes(..1 , ..2, col = ..3)) +
geom_point() )
Next we pinpoint the problem using a typical ggplot2 call.
This is what we want purrr::pmap() to create in its iterations:
mpg %>%
ggplot(aes(displ , cyl, fill = drv)) +
geom_boxplot()
However, this is what purrr::pmap() produces, with quoted variable names:
mpg %>%
ggplot(aes("displ" , "cyl", fill = "drv")) +
geom_boxplot()
Failing attempts
Using cat() to transform the quoted variable names from pmap() into unquoted form for aes() to understand fails.
mpg %>%
ggplot(aes(cat("displ") , cat("cyl"), fill = cat("drv"))) +
geom_boxplot()
Using as.name() to transform the quoted variable names from pmap() into unquoted form for aes() to understand fails.
mpg %>%
ggplot(aes(as.name("displ") , as.name("cyl"), fill = as.name("drv"))) +
geom_boxplot()
Bottom line
Is there a way to make ggplot(aes("quoted_var_name")) work properly?
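One possible approach, sketched here rather than taken from the thread, is the `.data` pronoun, which `aes()` has supported since ggplot2 3.0.0: `.data[["displ"]]` looks up a column by its string name. The `combos` list below is a hand-written stand-in for `eda_routine_combinations`, so its contents are an assumption.

```r
library(ggplot2)
library(purrr)

# Hypothetical stand-in for the first two rows of eda_routine_combinations
combos <- list(
  num_1 = c("displ", "displ"),
  num_2 = c("cty", "hwy"),
  fct   = c("drv", "drv")
)

# pmap() matches the list names to the function arguments;
# .data[[name]] turns each string into a column reference inside aes()
plots <- pmap(combos, function(num_1, num_2, fct) {
  ggplot(mpg, aes(.data[[num_1]], .data[[num_2]], colour = .data[[fct]])) +
    geom_point()
})

plots[[1]]
```

Because `pmap()` returns a plain list, the plots can be printed one at a time or passed on to another function for arranging.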

Pipe a single variable (i.e. a vector) to a ggplot?

We can see how to plot a single variable (along with its index).
How can we pipe to a ggplot?
Example
Since
library(ggplot2)
qplot(seq_along(iris$Sepal.Length), iris$Sepal.Length)
yields
I expected
iris$Sepal.Length %>% { qplot(seq_along(.), .) }
to yield the same. But
Error: Discrete value supplied to continuous scale
Question
How do we pipe a single variable to a ggplot?
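It seems that, to get this working, you need to print the plot explicitly inside the chain, presumably because the plot is otherwise evaluated only after the pipe has finished, when . no longer refers to the vector.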
library(magrittr)
library(ggplot2)
iris$Sepal.Length %>% {print(qplot(seq_along(.), .))}
Alternatively, pipe the whole data frame and compute the index inside aes():
library(tidyverse)
iris %>% ggplot(aes(seq_along(Sepal.Length), Sepal.Length))+
geom_point() + theme_bw()+
labs(title="Plot of Sepal length",x="Sepal.Length seq", y = "Sepal.Length")
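Another option, given as a sketch rather than as either answerer's method, is to turn the vector into a small data frame inside braces and then pipe that into `ggplot()`. This avoids `qplot()`, which has been deprecated since ggplot2 3.4.0. The column names `index` and `value` are chosen here for illustration.

```r
library(ggplot2)
library(magrittr)

p <- iris$Sepal.Length %>%
  # Braces stop magrittr from inserting the vector as the first argument
  { data.frame(index = seq_along(.), value = .) } %>%
  ggplot(aes(index, value)) +
  geom_point()

p
```

Since the data frame is built while `.` is still bound, the plot no longer depends on when it is printed.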

How to add legends in this context?

songs %>% group_by(year) %>% summarise(count=nth(pop,1))%>%
ggplot(aes(x=factor(year),y=count,fill=year))+geom_bar(stat ='identity' )+theme_classic()
1. How can I adjust my legend to show the years (2010:2019) rather than what it is showing right now?
2. scale_size_manual() is not working.
You need to set year as a factor each time (or externally), not just once. I don't have your data, so I'll use mtcars.
library(ggplot2)
library(dplyr)
# first plot
mtcars %>%
ggplot(aes(factor(carb), disp, fill=carb)) +
geom_bar(stat="identity")
# second plot
mutate(mtcars, carb = factor(carb)) %>%
ggplot(aes(carb, disp, fill=carb)) +
geom_bar(stat="identity")
# alternate code for second plot, not shown
mtcars %>%
ggplot(aes(factor(carb), disp, fill=factor(carb))) +
# both ^^^^^^ and ^^^^^^
geom_bar(stat="identity")
(There are numerous ways to convert to a factor. I'm using dplyr here, but it can easily be done in base or data.table.)
I included the "alternate" code above to show factor() being applied manually to each use of carb. It is not my preferred method: if you need the factor more than once, convert it once before plotting and reuse it. If you need both the ordinal year and the numeric version, you can add a new field, such as ordinal_year=factor(year).
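A short sketch of that last suggestion, again on mtcars because the songs data is not available. The column name `carb_f` is made up for illustration; it plays the role of `ordinal_year`, while the numeric `carb` stays untouched.

```r
library(dplyr)
library(ggplot2)

# Keep the numeric column and add a factor copy next to it
cars <- mtcars %>%
  mutate(carb_f = factor(carb))

# The factor copy drives both the axis and a discrete legend
p <- ggplot(cars, aes(carb_f, disp, fill = carb_f)) +
  geom_col() +
  labs(x = "carb", fill = "carb")

p
```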

Use dplyr SE with ggplot2

I often combine dplyr with ggplot2 in wrapper functions for analysis. As I move to the new NSE/SE paradigm of dplyr v0.7.1 with tidyeval, I am struggling to get this combination to work. I found that ggplot2 does not understand unquoted quosures (yet). The following does not work:
example_func <- function(col) {
col <- enquo(col)
mtcars %>% count(!!col) %>%
ggplot(aes((!!col), n)) +
geom_bar(stat = "identity")
}
example_func(cyl)
# Error in !col : invalid argument type
I currently use the following work-around. But I assume there must be a better way.
example_func2 <- function(col) {
col <- enquo(col)
mtcars %>% count(!!col) %>%
ggplot(aes_string(rlang::quo_text(col), "n")) +
geom_bar(stat = "identity")
}
Please show me the best way to combine the two. Thanks!
If you are already handling quosures it's easier to use aes_ which accepts inputs quoted as a formula: aes_(col, ~n).
This bit of code solves your problem:
library(tidyverse)
example_func <- function(col) {
col <- enquo(col)
mtcars %>% count(!!col) %>%
ggplot(aes_(col, ~n)) +
geom_bar(stat = "identity")
}
example_func(cyl)
There seem to be two ways of thinking about this.
Approach 1: Separation of concerns.
I like to keep my plotting code separate from my wrangling code. You can also name your group, which feels like the easiest way to solve your problem (although you do lose the original column name). So one way of doing what you're after is:
library(tidyverse)
concern1_data <- function(df, col) {
group <- enquo(col)
df %>%
group_by(group = !!group) %>%
summarise(n = n())
}
concern2_plotting <- function(df){
ggplot(data=df) +
geom_bar(aes(group, n), stat = "identity")
}
mtcars %>%
concern1_data(am) %>%
concern2_plotting()
This achieves what you're trying to do more or less and keeps concerns apart (which deserves a mention).
Approach 2: Accept and Wait
Thing is: tidyeval is not yet implemented in ggplot2.
- Colin Fay from link
I think this is support that is currently not in ggplot2 but I can't imagine that ggplot2 won't get this functionality. It's just not there yet.
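That support did arrive: from ggplot2 3.0.0 onward, `aes()` understands tidy evaluation, and `aes_()` and `aes_string()` are soft-deprecated. A minimal sketch of the original wrapper under those assumptions, using the `{{ }}` operator (which needs rlang 0.4.0 or later) in place of `enquo()` and `!!`:

```r
library(dplyr)
library(ggplot2)

example_func <- function(col) {
  mtcars %>%
    count({{ col }}) %>%          # forwards the bare column name to count()
    ggplot(aes({{ col }}, n)) +   # and to aes(), with no string conversion
    geom_col()
}

p <- example_func(cyl)
p
```

On older rlang versions, keeping `col <- enquo(col)` and writing `aes(!!col, n)` should behave the same way with ggplot2 3.0.0 or later.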

dplyr and ggplot piping is not working as expected

I can't find a solution for the following two issues.
First I try this:
library(tidyverse)
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) + geom_point(shape=group)
Error in layer(data = data, mapping = mapping, stat = stat, geom =
GeomPoint,:object 'group' not found
This obviously does not work, but using something like .$group is not successful either. Note that I have to specify the shape outside of aes().
The second problem is this. I'm not able to call a saved ggplot (gg) within a pipe.
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) + geom_point()
mtcars %>%
filter(vs == 0) %>%
gg + geom_point(aes(x=carb, y=drat), size = 4)
Error in gg(.) : could not find function "gg"
Thanks for your help!
Edit
After a long time I found a solution here. One has to wrap the complete ggplot expression in {}.
mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>% {
ggplot(.,aes(carb,drat)) +
geom_point(shape=.$group)}
If you wrap your shape definition in aes(), you get the desired behavior. To use shape outside of aes(), you can pass it a single value (i.e. shape=1). Also note that group is converted to a discrete variable, because geom_point throws an error when you map a continuous variable to shape.
library(tidyverse)
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) +
geom_point(aes(shape=as.factor(group)))
gg
Second, the %>% operator, when called as lhs %>% rhs, assumes that rhs is a function. So, as the error shows, you are calling gg as a function. Calling a plot as a function on a data frame (i.e. gg(mtcars)) isn't a valid operation.
See @docendo discimus's comment on the question for how to use {} to add a layer to an existing ggplot object from a magrittr pipeline.
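The comment itself is not reproduced here, so the following is only a sketch of how the braces approach can look. The filtered data is passed to the new layer through `data = .`, and the layer inherits the `x` and `y` mappings from `gg`.

```r
library(dplyr)
library(ggplot2)

# Saved base plot using all rows of mtcars
gg <- mtcars %>%
  ggplot(aes(x = carb, y = drat)) +
  geom_point()

# Braces make %>% evaluate the expression with . bound to the filtered
# data, instead of trying to call gg as a function
p <- mtcars %>%
  filter(vs == 0) %>%
  { gg + geom_point(data = ., size = 4) }

p
```

The result draws every car as a small point and the `vs == 0` subset again as larger points on top.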
