dplyr + ggplot2: Plotting not working via piping - r

I want to plot a subset of my dataframe. I am working with dplyr and ggplot2. My code only works with version 1, not version 2 via piping. What's the difference?
Version 1 (plotting is working):
data <- dataset %>% filter(type=="type1")
ggplot(data, aes(x=year, y=variable)) + geom_line()
Version 2 with piping (plotting is not working):
data %>% filter(type=="type1") %>% ggplot(data, aes(x=year, y=variable)) + geom_line()
Error:
Error in ggplot.data.frame(., data, aes(x = year, :
Mapping should be created with aes or aes_string
Thanks for your help!

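Solution for version 2: use a dot . instead of data. The pipe already passes the filtered data frame as the first argument, so naming data again pushes it into the mapping position:
data %>%
  filter(type=="type1") %>%
  ggplot(., aes(x=year, y=variable)) +
  geom_line()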
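To see why the repeated data argument triggers the error, here is a minimal sketch. It assumes mtcars as a stand-in for the asker's data (the columns cyl, disp and mpg replace type, year and variable), and it assumes a ggplot2 version that validates the mapping argument when the plot is created.

```r
library(dplyr)
library(ggplot2)

# The piped value fills `data`, so the repeated data frame is matched
# to `mapping`, which is not an aes() object. Capture the error instead
# of stopping the script.
bad <- tryCatch(
  mtcars %>% filter(cyl == 4) %>% ggplot(mtcars, aes(x = disp, y = mpg)),
  error = function(e) e
)

# Passing the data only once (through the pipe) builds the plot.
good <- mtcars %>%
  filter(cyl == 4) %>%
  ggplot(aes(x = disp, y = mpg)) +
  geom_line()
```

Here bad holds the captured error condition and good is an ordinary plot object that can be printed.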

I usually do this, which also dispenses with the need for the .:
library(dplyr)
library(ggplot2)
mtcars %>%
  filter(cyl == 4) %>%
  ggplot +
  aes(
    x = disp,
    y = mpg
  ) +
  geom_point()

When piping, if you pass the data name again (shown in bold below), the function gets its arguments in the wrong order: the piped data frame is already the first argument, so the repeated data lands in the mapping position.
data %>% filter(type=="type1") %>% ggplot(***data***, aes(x=year, y=variable)) + geom_line()
Hope it works for you.

Related

How to exclude facet wrap/grid with less than n number of observations in R

I am using ggplot to create numerous dot charts (using geom_point and facet_wrap) for two variables with multiple categories. Several categories have only a few points, which I cannot use.
Is there a way to set a minimum number of observations for each facet (i.e. only show plots with 10 or more observations)?
by_AB <- group_by(df, A, B)
by_AB %>%
  ggplot(aes(X,Y)) +
  geom_point() +
  facet_wrap(A~B, scales="free") +
  geom_smooth(se = FALSE) +
  theme_bw()
It is best to remove the small groups from your data before plotting. This is very easy if you do
df %>%
  group_by(A, B) %>%
  filter(n() > 1) %>%
  ggplot(aes(X,Y)) +
  geom_point() +
  facet_wrap(A~B, scales="free") +
  geom_smooth(se = FALSE) +
  theme_bw()
Obviously, we don't have your data, so here is an example using the built-in mtcars data set. Suppose we want to plot mpg against wt, but facet by carb:
library(tidyverse)
mtcars %>%
  ggplot(aes(wt, mpg)) +
  geom_point() +
  facet_wrap(.~carb)
Two of our facets look out of place because they only have a single point. We can simply filter these groups out en route to ggplot:
mtcars %>%
  group_by(carb) %>%
  filter(n() > 1) %>%
  ggplot(aes(wt, mpg)) +
  geom_point() +
  facet_wrap(.~carb)
Created on 2022-08-25 with reprex v2.0.2
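The question asks for a minimum of 10 observations per facet, whereas the filter above only drops single-point groups. A minimal sketch of the same idea with the threshold made explicit follows; min_obs is a hypothetical name introduced here, and in mtcars only the carb groups 2 and 4 reach that size.

```r
library(dplyr)
library(ggplot2)

min_obs <- 10  # hypothetical name; smallest group size worth plotting

# Keep only the groups that have at least min_obs rows
kept <- mtcars %>%
  group_by(carb) %>%
  filter(n() >= min_obs) %>%
  ungroup()

p <- ggplot(kept, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(. ~ carb)
```

Printing p draws one facet per surviving group. For the original data, group_by(A, B) would replace group_by(carb).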

ggplot for each column in a data

I am missing some basics in R.
How do I make a plot for each column in a data frame?
I have tried making plots for each column separately. I was wondering if there was an easier way?
library(dplyr)
library(ggplot2)
data(economics)
#scatter plots
ggplot(economics,aes(x=pop,y=pce))+
  geom_point()
ggplot(economics,aes(x=pop,y=psavert))+
  geom_point()
ggplot(economics,aes(x=pop,y=uempmed))+
  geom_point()
ggplot(economics,aes(x=pop,y=unemploy))+
  geom_point()
#boxplots
ggplot(economics,aes(y=pce))+
  geom_boxplot()
ggplot(economics,aes(y=pop))+
  geom_boxplot()
ggplot(economics,aes(y=psavert))+
  geom_boxplot()
ggplot(economics,aes(y=uempmed))+
  geom_boxplot()
ggplot(economics,aes(y=unemploy))+
  geom_boxplot()
All I'm looking for is one 2*2 panel of box plots and one 2*2 panel of scatter plots with ggplot2. I understand there is facet_grid, which I have failed to understand how to implement. (I believe this can be achieved easily with par(mfrow()) and base R plots.) I saw somewhere else a suggestion about widening the data, which I didn't understand.
In cases like this the solution is almost always to reshape the data from wide to long format.
economics %>%
  select(-date) %>%
  tidyr::gather(variable, value, -pop) %>%
  ggplot(aes(x = pop, y = value)) +
  geom_point(size = 0.5) +
  facet_wrap(~ variable, scales = "free_y")
economics %>%
  tidyr::gather(variable, value, -date) %>%
  ggplot(aes(y = value)) +
  geom_boxplot() +
  facet_wrap(~ variable, scales = "free_y")
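In current tidyr, gather() is superseded by pivot_longer(). A sketch of the same wide-to-long reshape with the newer function might look like this; long_econ and scatter are names chosen here for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# One row per (pop, variable) pair: every column except pop is stacked
long_econ <- economics %>%
  select(-date) %>%
  pivot_longer(-pop, names_to = "variable", values_to = "value")

# Four scatter panels, one per stacked column, each with its own y scale
scatter <- ggplot(long_econ, aes(x = pop, y = value)) +
  geom_point(size = 0.5) +
  facet_wrap(~ variable, scales = "free_y")
```

Because four columns are stacked, long_econ has four times as many rows as economics, and facet_wrap lays the four panels out as a 2 by 2 grid by default.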

Unable to loop through ggplot histogram

I'm trying to loop through every column of the iris data set and plot a histogram in ggplot. So I'm expecting 5 different histograms to appear. However, my for loop below returns nothing. How can I fix this?
library(ggplot2)
for (i in colnames(iris)){
  ggplot(iris, aes(x = i))+
    geom_histogram()
}
Instead of using a for loop, the tidyverse/ggplot way would be to reshape the data from wide to long and then plot using facet_wrap:
library(tidyverse)
iris %>%
  gather(key, val, -Species) %>%
  ggplot(aes(val)) +
  geom_histogram(bins = 30) +
  facet_wrap(~key, scales = "free_x")
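If a loop is still preferred, two things in the original attempt need changing: a ggplot object created inside a for loop is not printed automatically, and aes(x = i) maps the string held in i rather than the column it names. A minimal sketch, assuming a ggplot2 version that provides the .data pronoun and restricting the loop to the numeric columns (Species is a factor, so a histogram does not apply to it):

```r
library(ggplot2)

# Only the numeric columns can be binned into a histogram
num_cols <- names(iris)[sapply(iris, is.numeric)]

plots <- list()
for (col in num_cols) {
  # .data[[col]] looks the column up by its name stored in `col`
  p <- ggplot(iris, aes(x = .data[[col]])) +
    geom_histogram(bins = 30) +
    labs(x = col)
  print(p)            # explicit print is required inside a loop
  plots[[col]] <- p   # keep the plot for later use
}
```

This produces four separate histograms, one per numeric column, and also stores them in the list plots.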
Using dplyr, tidyr and ggplot:
library(ggplot2)
library(dplyr)
library(tidyr)
iris %>%
  gather(Measure, Value, -Species) %>%
  ggplot(aes(x = Value)) +
  geom_histogram() +
  facet_grid(rows = vars(Species), cols = vars(Measure))

ggplot geom_boxplot and plotting last value with geom_point

I'm new to R. I was trying to plot the last value of each variable in a data frame on top of a boxplot. Without success I was trying:
ggplot(iris, aes(x=Species,y=Sepal.Length)) +
  geom_boxplot() +
  geom_point(iris, aes(x=unique(iris$Species), y=tail(iris,n=1)))
Thanks, Bill
One approach is
library(tidyverse)
iris1 <- iris %>%
  group_by(Species) %>%
  summarise(LastVal = last(Sepal.Length))
ggplot(iris, aes(x=Species,y=Sepal.Length)) +
  geom_boxplot() +
  geom_point(data = iris1, aes(x = Species, y = LastVal))
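Part of the reason the original attempt fails is that the first argument of geom_point is the mapping, so a data frame given to a layer has to be passed as data =. As an alternative sketch, assuming dplyr 1.0 or later for slice_tail(), the last row of each species can be kept whole so that the layer reuses the aesthetics of the main plot; last_rows is a name chosen here for illustration.

```r
library(dplyr)
library(ggplot2)

# Last row of each species, with all original columns kept
last_rows <- iris %>%
  group_by(Species) %>%
  slice_tail(n = 1) %>%
  ungroup()

# The point layer inherits x = Species and y = Sepal.Length from ggplot()
p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_point(data = last_rows, colour = "red", size = 3)
```

The colour and size settings are only there to make the three points stand out against the boxplots.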

Multiple ggplots with magrittr tee operator

I am trying to figure out why the tee operator, %T>%, does not work when I pass the data to a ggplot command.
This works fine
library(ggplot2)
library(dplyr)
library(magrittr)
mtcars %T>%
  qplot(x = cyl, y = mpg, data = ., geom = "point") %>%
  qplot(x = mpg, y = cyl, data = ., geom = "point")
And this also works fine
mtcars %>%
  {ggplot() + geom_point(aes(cyl, mpg)) ; . } %>%
  ggplot() + geom_point(aes(mpg, cyl))
But when I use the tee operator, as below, it throws "Error: ggplot2 doesn't know how to deal with data of class protoenvironment".
mtcars %T>%
  ggplot() + geom_point(aes(cyl, mpg)) %>%
  ggplot() + geom_point(aes(mpg, cyl))
Can anyone explain why this final piece of code does not work?
Either
mtcars %T>%
  {print(ggplot(.) + geom_point(aes(cyl, mpg)))} %>%
  {ggplot(.) + geom_point(aes(mpg, cyl))}
or abandon the %T>% operator and use an ordinary pipe, with the "%T>%" operation made explicit as a new function, as suggested in this answer:
# Print the value, then return it unchanged so the pipe can continue
techo <- function(x){
  print(x)
  x
}
mtcars %>%
  {techo( ggplot(.) + geom_point(aes(cyl, mpg)) )} %>%
  {ggplot(.) + geom_point(aes(mpg, cyl))}
As TFlick noted, the %T>% operator doesn't work here because of operator precedence: %any% is evaluated before +.
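The grouping can be checked without running any plotting code by inspecting how R parses the failing expression. This sketch uses only base R; quote() captures the expression unevaluated, so neither magrittr nor ggplot2 needs to be loaded, and full is a name chosen here for illustration.

```r
# Capture the failing pipeline as an unevaluated expression
full <- quote(
  mtcars %T>%
    ggplot() + geom_point(aes(cyl, mpg)) %>%
    ggplot() + geom_point(aes(mpg, cyl))
)

# The outermost call is `+`, not a pipe
as.character(full[[1]])    # "+"

# The pipes were grouped first, each with its immediate neighbours
deparse(full[[2]][[2]])    # "mtcars %T>% ggplot()"
deparse(full[[2]][[3]])    # "geom_point(aes(cyl, mpg)) %>% ggplot()"
```

The second piece shows the cause of the error: a geom_point layer, not a data frame, is piped into ggplot().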
I think your problem has to do with order of operations. The %T>% operator binds more tightly than + (according to the ?Syntax help page), so geom_point(aes(cyl, mpg)) %>% ggplot() is evaluated on its own and ggplot receives a layer instead of a data frame. You need to pass in the data= parameter to ggplot before you add the geom_point, otherwise things get messy. I think you want
mtcars %T>%
  {print(ggplot(.) + geom_point(aes(cyl, mpg)))} %>%
  {ggplot(.) + geom_point(aes(mpg, cyl))}
which uses the functional "short-hand" notation.
Note that a returned ggplot object is a list with a $data field. This can be taken advantage of. Personally I think the style is cleaner :)
# Print the plot, then pass its data on down the pipe
ggpass <- function(pp){
  print(pp)
  return(pp$data)
}
# ggplot(.) is needed inside braces: the pipe does not insert the data there
mtcars %>%
  {ggplot(.) + geom_point(aes(cyl, mpg))} %>% ggpass() %>%
  {ggplot(.) + geom_point(aes(mpg, cyl))}

Resources