dplyr and ggplot piping is not working as expected

I can't find a solution for the following two issues.
First, I tried this:
library(tidyverse)
gg <- mtcars %>%
  mutate(group=ifelse(gear==3,1,2)) %>%
  ggplot(aes(x=carb, y=drat)) + geom_point(shape=group)
Error in layer(data = data, mapping = mapping, stat = stat, geom = GeomPoint, :
  object 'group' not found
which is obviously not working. Using something like .$group is also not successful. Of note, I have to specify the shape outside of aes().
The second problem is this. I'm not able to call a saved ggplot (gg) within a pipe.
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) + geom_point()
mtcars %>%
filter(vs == 0) %>%
gg + geom_point(aes(x=carb, y=drat), size = 4)
Error in gg(.) : could not find function "gg"
Thanks for your help!
Edit
After a long time I found a solution here. One has to wrap the complete ggplot expression in {}.
mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>% {
ggplot(.,aes(carb,drat)) +
geom_point(shape=.$group)}

If you wrap your shape definition in aes(), you get the desired behavior. To use shape outside of aes(), you can pass it a single value (e.g. shape=1). Also note that group is converted to a discrete variable: geom_point throws an error when you map a continuous variable to shape.
library(tidyverse)
gg <- mtcars %>%
mutate(group=ifelse(gear==3,1,2)) %>%
ggplot(aes(x=carb, y=drat)) +
geom_point(aes(shape=as.factor(group)))
gg
Second, the %>% operator, when called as lhs %>% rhs, assumes that rhs is a function. Because %>% binds more tightly than +, your pipeline ends in ... %>% gg, so, as the error shows, you are calling gg as a function. Calling a plot as a function on a data frame (i.e. gg(mtcars)) isn't a valid operation.
See @docendo discimus's comment on the question for how to use {} to add a layer to an existing ggplot object from a magrittr pipeline.
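A minimal sketch of the braces approach for the second problem, assuming magrittr's rule that a right-hand side wrapped in {} is evaluated as an expression with . bound to the piped data rather than being called as a function. The name p is only for illustration.

```r
library(dplyr)
library(ggplot2)

# The saved base plot from the question.
gg <- mtcars %>%
  mutate(group = ifelse(gear == 3, 1, 2)) %>%
  ggplot(aes(x = carb, y = drat)) +
  geom_point()

# Inside {}, `.` is the filtered data frame, so it can be handed to the
# new layer's data argument while gg is used as an ordinary object.
p <- mtcars %>%
  filter(vs == 0) %>%
  { gg + geom_point(data = ., size = 4) }

p
```

The new layer inherits the x and y mappings from gg, so only data and size need to be supplied.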

Related

purrr::pmap() output incompatible with what ggplot::aes() expects

Problem: purrr::pmap() output incompatible with ggplot::aes()
The following reprex boils down to a single question: is there any way to use quoted variable names inside ggplot2::aes() instead of the bare names? Example: we typically use ggplot(mpg, aes(displ, cyl)); how can we make aes() work normally with ggplot(mpg, aes("displ", "cyl"))?
If you understood my question, the remainder of this reprex really adds no information. However, I added it to draw the full picture of the problem.
More details: I want to use purrr functions to create a bunch of routine exploratory data analysis plots effortlessly. The problem is that purrr::pmap() passes the variable names as quoted strings, which ggplot2::aes() doesn't understand. As far as I know, the functions cat() and as.name() can take a quoted variable name and return it in the unquoted form that aes() understands. However, neither of them worked. The following reprex reproduces the problem. I commented the code to spare you the pain of figuring out what it does.
library(tidyverse)
# Divide the classes of variables into numeric and non-numeric. Goal: place a combination of numeric variables on the axes while encoding a non-numeric variable.
mpg_numeric <- map_lgl(.x = seq_along(mpg), .f = ~ mpg[[.x]] %>% class() %in% c("numeric","integer"))
mpg_factor <- map_lgl(.x = seq_along(mpg), .f = ~ mpg[[.x]] %>% class() %in% c("factor","character"))
# create all possible combinations of the variables
eda_routine_combinations <- expand_grid(num_1 = mpg[mpg_numeric] %>% names(),
num_2 = mpg[mpg_numeric] %>% names(),
fct = mpg[mpg_factor] %>% names()) %>%
filter(num_1 != num_2) %>% slice_head(n = 2) # for simplicity, keep only the first 2 combinations
# use purrr::pmap() to create all the plots we want in a single call
pmap(.l = list(eda_routine_combinations$num_1,
eda_routine_combinations$num_2,
eda_routine_combinations$fct) ,
.f = ~ mpg %>%
ggplot(aes(..1 , ..2, col = ..3)) +
geom_point() )
Next we pinpoint the problem using a typical ggplot2 call.
this is what we want purrr::pmap() to create in its iterations:
mpg %>%
ggplot(aes(displ , cyl, fill = drv)) +
geom_boxplot()
However, this is what purrr::pmap() renders; quoted variable names:
mpg %>%
ggplot(aes("displ" , "cyl", fill = "drv")) +
geom_boxplot()
Failing attempts
Using cat() to transform the quoted variable names from pmap() into unquoted form for aes() to understand fails.
mpg %>%
ggplot(aes(cat("displ") , cat("cyl"), fill = cat("drv"))) +
geom_boxplot()
Using as.name() to transform the quoted variable names from pmap() into unquoted form for aes() to understand fails.
mpg %>%
ggplot(aes(as.name("displ") , as.name("cyl"), fill = as.name("drv"))) +
geom_boxplot()
Bottom line
Is there a way to make ggplot(aes("quoted_var_name")) work properly?
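One possible approach, sketched here under the assumption of ggplot2 3.0.0 or later (where aes() supports tidy evaluation), is the .data pronoun: .data[["displ"]] looks a column up by its name given as a string. The combos data frame below is a hand-written stand-in for eda_routine_combinations, and the lambda is written as a regular function for readability.

```r
library(ggplot2)
library(purrr)

# Stand-in for eda_routine_combinations: one row per plot, names as strings.
combos <- data.frame(
  num_1 = c("displ", "displ"),
  num_2 = c("cty", "hwy"),
  fct   = c("drv", "drv"),
  stringsAsFactors = FALSE
)

# pmap() passes each row's values to the function by column name;
# .data[[...]] turns each string into a reference to that column.
plots <- pmap(combos, function(num_1, num_2, fct) {
  ggplot(mpg, aes(.data[[num_1]], .data[[num_2]], colour = .data[[fct]])) +
    geom_point()
})

plots[[1]]
```

Passing the strings directly, as in aes("displ", "cyl"), maps every point to the constant text rather than to the column, which is why the original attempt does not give the intended plot.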

Pipe a single variable (i.e. a vector) to a ggplot?

We can see how to plot a single variable (along with its index).
How can we pipe to a ggplot?
Example
Since
library(ggplot2)
qplot(seq_along(iris$Sepal.Length), iris$Sepal.Length)
yields
I expected
iris$Sepal.Length %>% { qplot(seq_along(.), .) }
to yield the same. But
Error: Discrete value supplied to continuous scale
Question
How do we pipe a single variable to a ggplot?

It seems that to get it working, you need to explicitly print the plot when it is created inside a chain.
library(magrittr)
library(ggplot2)
iris$Sepal.Length %>% {print(qplot(seq_along(.), .))}

You can use the following code
library(tidyverse)
iris %>% ggplot(aes(seq_along(Sepal.Length), Sepal.Length))+
geom_point() + theme_bw()+
labs(title="Plot of Sepal length",x="Sepal.Length seq", y = "Sepal.Length")
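Another option, shown here only as a sketch, is to turn the piped vector into a data frame inside {} and then continue the pipeline into ggplot(). This avoids qplot(), which has been deprecated in recent ggplot2 releases. The column names index and value are made up for this example.

```r
library(magrittr)
library(ggplot2)

# Inside {}, `.` is the vector; build a two-column data frame from it,
# then pipe that data frame into ggplot() as usual.
p <- iris$Sepal.Length %>%
  { data.frame(index = seq_along(.), value = .) } %>%
  ggplot(aes(index, value)) +
  geom_point()

p
```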

Ggplot subset data functions and dplyr

When doing data analysis, we often use dplyr to modify the dataframe further in specific geoms. This allows us to change the default dataframe of a ggplot later, and have everything still work.
template <- ggplot(db, aes(x=time, y=value)) +
geom_line(data=function(db){db %>% filter(event=="Bla")}) +
geom_ribbon(aes(ymin=low, ymax=up))
ggsave("global.png", plot = template)
for (i in unique(db$simulation))
  ggsave(paste0(i, ".png"), plot = template %+% subset(db, simulation == i))
Is there a nicer/shorter way to specify the filter command, e.g. using some magical .?
EDIT
To clarify some of the comments: By using geom_line(data = db %>% filter(event=="Bla")), the layer would not be updated when I change the default dataframe later using %+%. I am really aiming to use the data argument of geom_* as a function.

Upon reading the documentation of %>% more carefully, I have found the solution:
Using the dot-place holder as lhs
When the dot is used as lhs, the result will be a functional sequence, i.e. a function which applies the entire chain of right-hand sides in turn to its input. See the examples.
Therefore, the nicest way to formulate the above example, incorporating the suggestions from above as well:
db <- diamonds
template <- ggplot(db, aes(x=carat, y=price, color=cut)) +
geom_point() +
geom_smooth(data=. %>% filter(color=="J")) +
labs(caption="Smooths only for J color")
ggsave("global.png", plot = template)
db %>% group_by(cut) %>% do(
ggsave( paste0(.$cut[1], ".png"), plot=template %+% .)
)
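To illustrate the functional-sequence behaviour quoted above, the following sketch shows that a pipeline starting with . is an ordinary function, which is why it can be handed to a layer's data argument. The names keep_j and j_rows are illustrative, and method = "lm" is chosen here only to keep the example light.

```r
library(dplyr)
library(ggplot2)

# A pipeline that starts with the dot is not evaluated; it becomes a function.
keep_j <- . %>% filter(color == "J")

# Calling it is equivalent to running the pipeline on that input.
j_rows <- keep_j(diamonds)

# Because it is a function, ggplot2 applies it to the plot's default data
# when the layer is built, so the layer follows later changes of that data.
template <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1) +
  geom_smooth(data = keep_j, method = "lm")
```

A plain function(d) filter(d, color == "J") would work in the same place; the dot form is just shorter.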

Using dplyr functions in a pipe after ggplot

Is it possible to do a summarise after using ggplot in a pipe? The variable is not of great importance and I am just looking at the change for an exploratory purpose. Therefore, I don't really want to save the variable.
df %>%
mutate(change = t2 - t1) %>%
ggplot(aes(x = change)) +
geom_histogram() %>%
summarise(mean_change = mean(change))
Error in UseMethod("summarise_") : no applicable method for 'summarise_' applied to
an object of class "c('LayerInstance', 'Layer', 'ggproto')"
Is it possible to render ggplot output AND do a summarise (showing the mean) in the same pipe?

I don't know if this is exactly what you're looking for, but your question reminds me of the T-pipe (%T>%) in magrittr (the package whose %>% pipe is re-exported by dplyr and the tidyverse), which I found in the online "R for Data Science" book here: http://r4ds.had.co.nz/pipes.html#other-tools-from-magrittr .
With this T-pipe, you can ggplot and continue to summarise, as the T-pipe returns not the ggplot object but the object which was passed to ggplot.
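The T-pipe idea can be sketched as follows. This assumes magrittr is attached explicitly (dplyr re-exports only %>%) and uses a small made-up df with columns t1 and t2, since the question's data is not shown. The plot is wrapped in {} with an explicit print(), because the T-pipe discards the value of its right-hand side.

```r
library(dplyr)
library(ggplot2)
library(magrittr)

# Hypothetical data standing in for the question's df.
df <- data.frame(t1 = c(1, 2, 3, 4), t2 = c(2, 4, 5, 9))

result <- df %>%
  mutate(change = t2 - t1) %T>%
  # Side effect only: draw the histogram, then pass the data on unchanged.
  { print(ggplot(., aes(x = change)) + geom_histogram(bins = 5)) } %>%
  summarise(mean_change = mean(change))

result
```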

Using package ggfun this would work :
# devtools::install_github("moodymudskipper/ggfun")
library(tidyverse)
library(ggfun)
iris %>%
mutate(x = Sepal.Length * Sepal.Width) %>%
ggplot(aes(x)) +
geom_density() -
as_mapper(~{print(summarise(.,mean_x = mean(x)));.})
The - operator is used to apply a function on the data and still return a ggplot object, so we apply it to print but still return the original data to get our plot.

Use dplyr SE with ggplot2

I often combine dplyr with ggplot2 in wrapper functions for analysis. As I am moving to the new NSE / SE paradigm of v.0.7.1 with tidyeval, I am struggling to get this combination to work. I found that ggplot does not understand unquoted quosures (yet). The following does not work:
example_func <- function(col) {
col <- enquo(col)
mtcars %>% count(!!col) %>%
ggplot(aes((!!col), n)) +
geom_bar(stat = "identity")
}
example_func(cyl)
# Error in !col : invalid argument type
I currently use the following work-around. But I assume there must be a better way.
example_func2 <- function(col) {
col <- enquo(col)
mtcars %>% count(!!col) %>%
ggplot(aes_string(rlang::quo_text(col), "n")) +
geom_bar(stat = "identity")
}
Please show me the best way to combine these two. Thanks!

If you are already handling quosures it's easier to use aes_ which accepts inputs quoted as a formula: aes_(col, ~n).
This bit of code solves your problem:
library(tidyverse)
example_func <- function(col) {
col <- enquo(col)
mtcars %>% count(!!col) %>%
ggplot(aes_(col, ~n)) +
geom_bar(stat = "identity")
}
example_func(cyl)

There seem to be two ways of thinking about this.
Approach 1: Separation of concerns.
I like my plotting stuff to be very much separate from my wrangling stuff. Also, you can name your group, which feels like the easiest way to solve your problem (although you do lose the original column name). So one way of doing what you're trying to do is:
library(tidyverse)
concern1_data <- function(df, col) {
group <- enquo(col)
df %>%
group_by(group = !!group) %>%
summarise(n = n())
}
concern2_plotting <- function(df){
ggplot(data=df) +
geom_bar(aes(group, n), stat = "identity")
}
mtcars %>%
concern1_data(am) %>%
concern2_plotting()
This achieves what you're trying to do more or less and keeps concerns apart (which deserves a mention).
Approach 2: Accept and Wait
Thing is: tidyeval is not yet implemented in ggplot2.
- Colin Fay from link
I think this is support that is currently not in ggplot2 but I can't imagine that ggplot2 won't get this functionality. It's just not there yet.
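For readers on a later release: ggplot2 has supported tidy evaluation inside aes() since version 3.0.0, so the original wrapper can be written without aes_() or aes_string(). The sketch below assumes ggplot2 3.0.0 or later and rlang 0.4.0 or later (for the {{ }} shorthand, which replaces the enquo() and !! pair); geom_col() is used as the equivalent of geom_bar(stat = "identity").

```r
library(dplyr)
library(ggplot2)

example_func <- function(col) {
  mtcars %>%
    count({{ col }}) %>%          # one row per distinct value, count in n
    ggplot(aes({{ col }}, n)) +   # same column forwarded straight into aes()
    geom_col()
}

p <- example_func(cyl)
p
```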
