Multiple ggplots with magrittr tee operator

I am trying to figure out why the tee operator, %T>%, does not work when I pass the data to a ggplot command.
This works fine
library(ggplot2)
library(dplyr)
library(magrittr)
mtcars %T>%
qplot(x = cyl, y = mpg, data = ., geom = "point") %>%
qplot(x = mpg, y = cyl, data = ., geom = "point")
And this also works fine
mtcars %>%
{ggplot() + geom_point(aes(cyl, mpg)) ; . } %>%
ggplot() + geom_point(aes(mpg, cyl))
But when I use the tee operator, as below, it throws "Error: ggplot2 doesn't know how to deal with data of class protoenvironment".
mtcars %T>%
ggplot() + geom_point(aes(cyl, mpg)) %>%
ggplot() + geom_point(aes(mpg, cyl))
Can anyone explain why this final piece of code does not work?

Either
mtcars %T>%
{print(ggplot(.) + geom_point(aes(cyl, mpg)))} %>%
{ggplot(.) + geom_point(aes(mpg, cyl))}
or abandon the %T>% operator and use an ordinary pipe, with the "%T>%" operation made explicit as a new function, as suggested in this answer
techo <- function(x){
print(x)
x
}
mtcars %>%
{techo( ggplot(.) + geom_point(aes(cyl, mpg)) )} %>%
{ggplot(.) + geom_point(aes(mpg, cyl))}
As TFlick noted, the reason the %T>% operator doesn't work here is because of the precedence of operations: %any% is done before +.
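The precedence point can be checked without drawing anything. The sketch below only inspects how R parses the failing expression; quote() captures it unevaluated, so no packages need to be loaded. It shows that the second ggplot() call is piped a geom_point() layer rather than mtcars, which is consistent with the error about data of an unexpected class.

```r
# quote() captures the expression without evaluating it.
expr <- quote(
  mtcars %T>%
    ggplot() + geom_point(aes(cyl, mpg)) %>%
    ggplot() + geom_point(aes(mpg, cyl))
)

# The outermost call is `+`, not a pipe.
outer_op <- expr[[1]]

# The two pipes were grouped first, each with its immediate neighbours only.
first_pipe  <- expr[[2]][[2]]  # mtcars %T>% ggplot()
second_pipe <- expr[[2]][[3]]  # geom_point(aes(cyl, mpg)) %>% ggplot()

# The final layer is added to the sum of the two pipes.
last_layer <- expr[[3]]        # geom_point(aes(mpg, cyl))

print(outer_op)
print(first_pipe)
print(second_pipe)
print(last_layer)
```

Wrapping each plot in braces, as in the answers here, forces the whole ggplot(.) + geom_point(...) expression to be the right-hand side of a single pipe.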

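I think your problem has to do with order of operations. The %T>% operator binds more tightly than + (see the ?Syntax help page), so mtcars %T>% ggplot() is evaluated first and the + geom_point(...) is only applied afterwards; likewise geom_point(...) %>% ggplot() pipes a layer, not the data, into the second ggplot(). You need to build each complete plot, with its data, inside a single pipe step, otherwise things get messy. I think you want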
mtcars %T>%
{print(ggplot(.) + geom_point(aes(cyl, mpg)))} %>%
{ggplot(.) + geom_point(aes(mpg, cyl))}
which uses the functional "short-hand" notation

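Note that a ggplot object is a list with a $data field, and you can take advantage of that. Personally I think this style is cleaner :)
# print the plot, then hand its data on to the next pipe step
ggpass=function(pp){
print(pp)
return(pp$data)
}
# inside braces the data must be passed explicitly as `.`,
# otherwise ggplot() has no data and pp$data is empty
mtcars %>%
{ggplot(.) + geom_point(aes(cyl, mpg))} %>% ggpass() %>%
{ggplot(.) + geom_point(aes(mpg, cyl))}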

Related

ggplot and dplyr filter reference

Here is my code:
mtcars %>% filter(cyl == 4) %>%
ggplot(., aes(mpg, hp, color=hp)) +
geom_point() +
scale_color_gradient(low = "darkorange2", high = "darkred",
breaks=c(min(mtcars$hp), max(mtcars$hp)),
labels=c("Min","Max"))
What I would like to do is, include the breaks in the scale_color_gradient function in the filter I have called beforehand. I know that .$hp works in base R and only using the variable name in dplyr, but how do I use it in this case?

You can put all the plotting code in braces so that . refers to the "right" object (the filtered data) everywhere inside them. Also, if you want to go from min to max, you can use range(). For example
mtcars %>% filter(cyl == 4) %>%
{ggplot(., aes(mpg, hp, color=hp)) +
geom_point() +
scale_color_gradient(low = "darkorange2", high = "darkred",
breaks=range(.$hp),
labels=c("Min","Max"))}
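The effect of the braces can be seen on a small example that does not involve ggplot2. This is a minimal sketch, assuming only that magrittr is installed: without braces the left-hand side is still inserted as the first argument even when . appears nested inside another call, whereas with braces . is used only where it is written.

```r
library(magrittr)

x <- c(3, 1, 2)

# No braces: evaluated as paste(x, length(x)); x is inserted as the
# first argument because `.` only appears inside a nested call.
without_braces <- x %>% paste(length(.))

# Braces: evaluated as paste(length(x)); nothing is inserted automatically.
with_braces <- x %>% {paste(length(.))}

print(without_braces)
print(with_braces)
```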

unable to set xlim and ylim using min() and max() in ggplot

I am missing something crucial here and can't see it.
Why does min and max not work to set the axis limits?
mtcars %>%
select(mpg, cyl, disp, wt) %>%
filter(complete.cases(disp)) %>%
ggplot() +
geom_point(aes(x=mpg, y=disp, colour=cyl), size=3) +
xlim(min(mpg, na.rm=TRUE),max(mpg, na.rm=TRUE)) +
ylim(min(disp, na.rm=TRUE),max(disp, na.rm=TRUE)) +
scale_colour_gradient(low="red",high="green", name = "cyl")
This works:
mtcars %>%
select(mpg, cyl, disp, wt) %>%
filter(complete.cases(disp)) %>%
ggplot() +
geom_point(aes(x=mpg, y=disp, colour=cyl), size=3) +
# xlim(min(mpg, na.rm=TRUE),max(mpg, na.rm=TRUE)) +
# ylim(min(disp, na.rm=TRUE),max(disp, na.rm=TRUE)) +
scale_colour_gradient(low="red",high="green", name = "cyl")

ggplot cannot access the column values in the way that dplyr can.
You need to add in the data:
mtcars %>%
select(mpg, cyl, disp, wt) %>%
filter(complete.cases(disp)) %>%
ggplot() +
geom_point(aes(x=mpg, y=disp, colour=cyl), size=3) +
xlim(min(mtcars$mpg, na.rm=TRUE),max(mtcars$mpg, na.rm=TRUE)) +
ylim(min(mtcars$disp, na.rm=TRUE),max(mtcars$disp, na.rm=TRUE)) +
scale_colour_gradient(low="red",high="green", name = "cyl")

You can't reference column names in ggplot objects except inside aes() and in a formula or vars() in a facet_* function. But the helper function expand_scale is there to help you expand the scales in a more controlled way.
For example:
# add 1 unit to the x-scale in each direction
scale_x_continuous(expand = expand_scale(add = 1))
# have the scale exactly fit the data, no padding
scale_x_continuous(expand = expand_scale(0, 0))
# extend the scale by 10% in each direction
scale_x_continuous(expand = expand_scale(mult = .1))
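See ?scale_x_continuous and especially ?expand_scale for details. It's also possible to selectively pad just the top or just the bottom of each scale; there are examples in ?expand_scale. Note that in ggplot2 3.3.0 and later, expand_scale() is deprecated in favour of expansion(), which takes the same mult and add arguments.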
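If the limits really should be computed from the piped data rather than from the original mtcars, one option is to combine this with the brace technique from the filter question above. The following is a sketch under that assumption, not the only way to do it; xlim() and ylim() each accept a length-two vector, so range() can be passed directly.

```r
library(dplyr)
library(ggplot2)

p <- mtcars %>%
  select(mpg, cyl, disp, wt) %>%
  filter(complete.cases(disp)) %>%
  {
    # Inside the braces, `.` is the filtered data frame, so its columns
    # can be referenced with `.$` outside of aes().
    ggplot(., aes(x = mpg, y = disp, colour = cyl)) +
      geom_point(size = 3) +
      xlim(range(.$mpg, na.rm = TRUE)) +
      ylim(range(.$disp, na.rm = TRUE)) +
      scale_colour_gradient(low = "red", high = "green", name = "cyl")
  }

p
```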

Add titles to ggplots created with map()

What's the easiest way to add titles to each ggplot that I've created below using the map function? I want the titles to reflect the name of each data frame - i.e. 4, 6, 8 (cylinders).
Thanks :)
library(dplyr)
library(purrr)
library(ggplot2)
mtcars_split <-
mtcars %>%
split(mtcars$cyl)
plots <-
mtcars_split %>%
map(~ ggplot(data=.,mapping = aes(y=mpg,x=wt)) +
geom_jitter()
# + ggtitle(....)
)
plots

Use map2 with names.
plots <- map2(
mtcars_split,
names(mtcars_split),
~ggplot(data = .x, mapping = aes(y = mpg, x = wt)) +
geom_jitter() +
ggtitle(.y)
)
Edit: alistaire pointed out this is the same as imap
plots <- imap(
mtcars_split,
~ggplot(data = .x, mapping = aes(y = mpg, x = wt)) +
geom_jitter() +
ggtitle(.y)
)

Perhaps you'd be interested in using facet_wrap instead
ggplot(mtcars, aes(y=mpg, x=wt)) + geom_jitter() + facet_wrap(~cyl)

You can use purrr::map2(), passing the names of the split list as the titles:
mtcars_split <- mtcars %>% split(mtcars$cyl)
titles <- names(mtcars_split)
plots <- map2(mtcars_split, titles,
~ ggplot(data=.x, aes(x=wt, y=mpg)) + geom_jitter() + ggtitle(.y)
)
EDIT
Sorry duplicated with Paul's answer.

What is the difference between the "+" operator in ggplot2 and the "%>%" operator in magrittr?

What is the difference between the "+" operator in ggplot2 and the "%>%" operator in magrittr?
I was told that they are the same, however if we consider the following script.
library(magrittr)
library(ggplot2)
# 1. This works
ggplot(data = mtcars, aes(x=wt, y = mpg)) + geom_point()
# 2. This works
ggplot(data = mtcars) + aes(x=wt, y = mpg) + geom_point()
# 3. This works
ggplot(data = mtcars) + aes(x=wt, y = mpg) %>% geom_point()
# 4. But this doesn't
ggplot(data = mtcars) %>% aes(x=wt, y = mpg) %>% geom_point()

Piping is very different from ggplot2's addition. What the pipe operator, %>%, does is take the result of the left-hand side and put it as the first argument of the function on the right-hand side. For example:
1:10 %>% mean()
# [1] 5.5
This is exactly equivalent to mean(1:10). The pipe is most useful for replacing deeply nested function calls, e.g.,
x = factor(2008:2012)
x_num = as.numeric(as.character(x))
# could be rewritten to read from left-to-right as
x_num = x %>% as.character() %>% as.numeric()
but this is all explained nicely over at What does %>% mean in R?; you should read through that for a couple more examples.
Using this knowledge, we can re-write your pipe examples as nested functions and see that they still do the same things; but now it (hopefully) is obvious why #4 doesn't work:
# 3. This is acceptable ggplot2 syntax
ggplot(data = mtcars) + geom_point(aes(x=wt, y = mpg))
# 4. This is not
geom_point(aes(ggplot(data = mtcars), x=wt, y = mpg))
ggplot2 includes a special "+" method for ggplot objects, which it uses to add layers to plots. I didn't know until you asked your question that it also works with the aes() function, but apparently that's defined as well. These are all specially defined within ggplot2. The use of + in ggplot2 predates the pipe, and while the usage is similar, the functionality is quite different.
As an interesting side-note, Hadley Wickham (the creator of ggplot2) said that:
...if I'd discovered the pipe earlier, there never would've been a ggplot2, because you could write ggplot graphics as
ggplot(mtcars, aes(wt, mpg)) %>%
geom_point() %>%
geom_smooth()

dplyr + ggplot2: Plotting not working via piping

I want to plot a subset of my dataframe. I am working with dplyr and ggplot2. My code only works with version 1, not version 2 via piping. What's the difference?
Version 1 (plotting is working):
data <- dataset %>% filter(type=="type1")
ggplot(data, aes(x=year, y=variable)) + geom_line()
Version 2 with piping (plotting is not working):
data %>% filter(type=="type1") %>% ggplot(data, aes(x=year, y=variable)) + geom_line()
Error:
Error in ggplot.data.frame(., data, aes(x = year, :
Mapping should be created with aes or aes_string
Thanks for your help!

Solution for version 2: a dot . instead of data:
data %>%
filter(type=="type1") %>%
ggplot(., aes(x=year, y=variable)) +
geom_line()

I usually do this, which also dispenses with the need for the .:
library(dplyr)
library(ggplot2)
mtcars %>%
filter(cyl == 4) %>%
ggplot +
aes(
x = disp,
y = mpg
) +
geom_point()

When you pipe into ggplot and also pass the data name again (marked with asterisks below), the piped data frame becomes the first argument and data is pushed into the mapping position, so the function misreads the sequence of arguments.
data %>% filter(type=="type1") %>% ggplot(***data***, aes(x=year, y=variable)) + geom_line()
Hope it works for you.
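The argument shift can be reproduced without ggplot2. In the sketch below, show_args() is a hypothetical stand-in for ggplot(data, mapping) that simply reports the class of what each argument received, and df is made-up example data.

```r
library(magrittr)

# Hypothetical stand-in for ggplot(data, mapping): reports what each
# argument received instead of drawing anything.
show_args <- function(data, mapping = NULL) {
  c(data = class(data)[1], mapping = class(mapping)[1])
}

df <- data.frame(year = 2001:2003, variable = c(1, 2, 4))

# Piping and naming the data again: this is show_args(df, df),
# so the data frame lands in the mapping slot.
shifted <- df %>% show_args(df)

# Using the dot: this is show_args(df, "aes placeholder").
correct <- df %>% show_args(., "aes placeholder")

print(shifted)
print(correct)
```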
