iteratively apply ggplot function within a map function - r

I would like to generate a series of histograms for all variables in a dataset, but I am clearly not preparing the data correctly for use in the map function.
library(tidyverse)
mtcars %>%
select(wt, disp, hp) %>%
map(., function(x)
ggplot(aes(x = x)) + geom_histogram()
)
I can accomplish this task with a for loop, but I am trying to do the same thing within the tidyverse.
foo <- function(df) {
nm <- names(df)
for (i in seq_along(nm)) {
print(
ggplot(df, aes_string(x = nm[i])) +
geom_histogram())
}
}
mtcars %>%
select(wt, disp, hp) %>%
foo(.)
Any help is greatly appreciated.

Something like this would also work:
library(purrr)
library(dplyr)
mtcars %>%
select(wt, disp, hp) %>%
names() %>%
map(~ggplot(mtcars, aes_string(x = .)) + geom_histogram())
or:
mtcars %>%
select(wt, disp, hp) %>%
{map2(list(.), names(.), ~ ggplot(.x, aes_string(x = .y)) + geom_histogram())}
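A note on why the original attempt fails, with a hedged alternative: map() over a data frame hands each column to the function as a bare vector, and ggplot() expects a data frame as its first argument, so ggplot(aes(x = x)) has no data to plot. The sketch below maps over column names instead and looks each column up with the .data pronoun, which avoids aes_string() (deprecated in recent ggplot2 releases). It assumes ggplot2 3.0.0 or later; the names vars and plots, and the choice of bins = 30, are illustrative only.

```r
library(ggplot2)
library(purrr)

# Columns to plot (illustrative choice, same as in the question)
vars <- c("wt", "disp", "hp")

# One histogram per column name; .data[[v]] looks the column up by name
plots <- map(set_names(vars), function(v) {
  ggplot(mtcars, aes(x = .data[[v]])) +
    geom_histogram(bins = 30) +
    labs(x = v)
})

# The list is named after the columns, so a single plot can be printed by name
plots$wt
```

Naming the list with set_names() is optional, but it makes individual plots easier to retrieve later.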

To use purrr::map, you could melt your data frame and then split it by variable name into a list of data frames:
library(reshape2)
library(dplyr)
library(ggplot2)
library(purrr)
melt(mtcars) %>%
split(.$variable) %>%
map(., ~ggplot(.x, aes(x=value)) +
geom_histogram())
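The same melt-then-split idea can probably be written with tidyr instead of reshape2. This is a minimal sketch assuming tidyr 1.0 or later, where pivot_longer() plays the role of melt(); the names long and plots are illustrative.

```r
library(tidyr)
library(ggplot2)
library(purrr)

# Long format: one row per (variable, value) pair, like melt(mtcars)
long <- pivot_longer(mtcars, cols = everything(),
                     names_to = "variable", values_to = "value")

# One data frame per variable, then one histogram per data frame
plots <- map(split(long, long$variable), function(d) {
  ggplot(d, aes(x = value)) +
    geom_histogram(bins = 30) +
    labs(x = d$variable[1])
})
```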
You can also use ggplot2::facet_wrap to plot them all at once:
library(reshape2)
library(dplyr)
library(ggplot2)
melt(mtcars) %>%
ggplot(., aes(x=value, label=variable)) +
geom_histogram() +
facet_wrap(~variable, nrow=ncol(mtcars))

Related

how to slice data in lapply function

I want to arrange N ggplots (each one faceted) on a grid with grid.arrange.
library(tidyverse)
library(ggplot2)
library(gridExtra)
plots <- lapply(unique(mtcars$cyl), function(cyl) {
data <- mtcars %>% filter(cyl == cyl)
ggplot(data, aes(x=mpg, y=hp))+
geom_point(color = "blue")+
facet_wrap(.~carb)}) %>%
do.call(grid.arrange, .)
do.call(grid.arrange, plots )
The problem is that all the plots are based on the entire dataset and render the same plot, while they should be different because I filter the data in the line
data <- mtcars %>% filter(cyl == cyl).
Inside filter(), cyl refers to the column of mtcars rather than to the function argument, so cyl == cyl compares the column with itself and is TRUE for every row. You can solve this by unquoting the argument with !! or by using a different name for the function argument, e.g. x.
#Option 1
data <- mtcars %>% filter(cyl == !!cyl)
#Option 2
... function(x) {
data <- mtcars %>% filter(cyl == x)
...
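The name clash can be checked directly. This is a small sketch, assuming dplyr 1.0 or later; the variable names n_all, n_inject and n_env are illustrative. It also shows the .env pronoun, a third way to say "the value from outside the data frame".

```r
library(dplyr)

cyl <- 4  # a variable that shares its name with a column of mtcars

# Both sides resolve to the column, so no row is filtered out
n_all <- nrow(filter(mtcars, cyl == cyl))

# !! injects the outer value before filter() evaluates the expression
n_inject <- nrow(filter(mtcars, cyl == !!cyl))

# .env$cyl asks explicitly for the variable in the calling environment
n_env <- nrow(filter(mtcars, cyl == .env$cyl))

c(all = n_all, inject = n_inject, env = n_env)
```

Only the last two calls restrict the data to the four-cylinder cars.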
Here is a tidyverse approach:
library(tidyverse)
group_plots <- mtcars %>%
group_split(cyl) %>%
map(~ggplot(., aes(x = mpg, y = hp))+
geom_point(color = "blue") +
facet_wrap(.~carb))
do.call(gridExtra::grid.arrange, group_plots)
Try using split() first:
library(tidyverse)
library(gridExtra)
l <- split(mtcars, mtcars$cyl) # divide based on cyl in a list
plots <- lapply(l, function(x) {
ggplot(x, aes(x=mpg, y=hp)) +
geom_point(color = "blue") +
facet_wrap(.~carb)
}) # call lapply() on each element
do.call(grid.arrange, plots)

ggplot geom_boxplot color and group variables

I'm trying to make a straightforward boxplot in ggplot. I'm not sure how to get both a grouping variable and a color/fill variable. I've tried to gather, but that doesn't seem to work. Any thoughts?
library(tidyverse)
# Does not work
mtcars %>%
as_tibble() %>%
ggplot(aes(factor(gear),
mpg,
group = vs)) +
geom_boxplot(aes(fill = as.factor(gear)))
# Does not work either
mtcars %>%
as_tibble() %>%
select(gear, mpg, vs) %>%
gather(key, value, -vs) %>%
ggplot(aes(key,
value)) +
geom_boxplot(aes(color = vs))
I'm not sure this is your intended output (gear as x-axis and fill), but here's a working example:
mtcars %>%
ggplot(
aes(
x = factor(gear),
y = mpg,
color = factor(vs),
fill = factor(gear)
)
) + geom_boxplot()
I've found being explicit when declaring your aesthetic mappings can be helpful when learning ggplot2.
Alternatively:
mtcars %>%
as_tibble() %>%
group_by(vs) %>%
ggplot(aes(factor(gear),
mpg,
fill=as.factor(gear))) +
geom_boxplot()
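One point worth keeping in mind: ggplot2 does not look at dplyr groups, so group_by() before ggplot() has no effect on how the boxes are split. Grouping has to be expressed through aesthetics. The sketch below is one possible reading of the question (one box per gear/vs combination, filled by vs), not necessarily the asker's intended plot; the name p is illustrative.

```r
library(ggplot2)

# One box per combination of gear and vs; fill distinguishes vs
p <- ggplot(mtcars, aes(x = factor(gear), y = mpg,
                        fill = factor(vs),
                        group = interaction(gear, vs))) +
  geom_boxplot() +
  labs(x = "gear", fill = "vs")

p
```

Mapping a discrete variable to fill or color already splits the boxes, so the explicit group aesthetic mainly documents the intent.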

How to Reduce Multiple Repeating Lines of Tidyverse Code - for Multiple Charts with X by Different Y's - Using a Vector of Names

library(tidyverse)
Using the Iris dataset, the code below uses a tidyverse approach to create multiple charts and is ultimately what I want to achieve. However, it seems repetitive to write out the three lines for "gg1","gg2", and "gg3", so I'm attempting to rewrite this in a more efficient way. (See code below)
iris0 <- iris %>%
group_by(Species) %>%
nest() %>%
mutate(
gg1 = map(data, ~ ggplot(., aes(Sepal.Length, Sepal.Width)) + geom_point()),
gg2 = map(data, ~ ggplot(., aes(Sepal.Length, Petal.Width)) + geom_point()),
gg3 = map(data, ~ ggplot(., aes(Sepal.Length, Petal.Length)) + geom_point()),
g = pmap(list(gg1, gg2, gg3), ~ gridExtra::grid.arrange(..1, ..2, ..3))
)
Ignoring the final gridExtra part for now, as well as the names (gg1, gg2, and gg3), below is an attempt to produce the three charts from the code above in one line of code by using a vector of names called Y. But this doesn't seem to work. I've tried a few other variants, but help would be appreciated...
Y<-c("Sepal.Width","Petal.Width","Petal.Length")
iris0<-iris%>%
group_by(Species)%>%
nest()%>%
mutate(
map(data,~ggplot(.,aes(Sepal.Length))+geom_point(aes_string(Y))))
I think reshaping and using different grouping variables may be the easier-to-follow approach here, but you can attempt something along the lines of what you are currently working with, using a double map loop to loop through each species and each y variable.
Don't forget to define the y argument when mapping to the y aesthetic in geom_point. That will avoid at least some of the errors you are getting now.
I include code for sending the plots on to grid.arrange.
iris %>%
group_by(Species) %>%
nest() %>%
mutate(graphs = map(data, function(x) {
map(Y, function(y) ggplot(x, aes(Sepal.Length )) +
geom_point( aes_string(y = y) ) )
}),
grid.graphs = map(graphs, ~gridExtra::grid.arrange(grobs = .x) ) )
You can add species names to the multiplots using the top argument, but I had to group the nested dataset for this to work.
iris %>%
group_by(Species) %>%
nest() %>%
group_by(Species) %>%
mutate(graphs = map(data, function(x) {
map(Y, function(y) ggplot(x, aes(Sepal.Length )) +
geom_point( aes_string(y = y) ) )
}),
grid.graphs = map(graphs, ~gridExtra::grid.arrange(grobs = .x,
top = grid::textGrob(Species) ) ) )
I find nested loops relatively difficult to follow. A reshaping approach would take a fair amount more code, so ends up looking pretty complicated. Essentially you make your dataset long, then nest on both species and the y variable prior to making the plots.
I ended up doing a second nest for the grid.arrange, which seems pretty convoluted.
iris %>%
gather(variable, yvar, one_of(Y) ) %>%
group_by(Species, variable) %>%
nest() %>%
group_by(Species, variable) %>%
mutate(graphs = map(data, ~ggplot(.x, aes(Sepal.Length, yvar) ) +
geom_point() +
labs(y = unique(variable) ) ) ) %>%
group_by(Species) %>%
nest(graphs, .key = "graphs") %>%
mutate(grid.graphs = map(graphs, ~gridExtra::grid.arrange(grobs = flatten(.x)) ) )
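As a flatter alternative to nested maps or double nesting, every species/variable combination can be listed in one table and mapped over once. This is only a sketch, assuming tidyr 1.0 or later for expand_grid() and ggplot2 3.0.0 or later for the .data pronoun; the names combos, species and y are illustrative and not part of the answer above.

```r
library(tidyr)
library(purrr)
library(ggplot2)

Y <- c("Sepal.Width", "Petal.Width", "Petal.Length")

# One row per species / y-variable combination
combos <- expand_grid(species = levels(iris$Species), y = Y)

# pmap() passes each row's columns to the function by name
combos$plot <- pmap(combos, function(species, y) {
  ggplot(iris[iris$Species == species, ], aes(Sepal.Length, .data[[y]])) +
    geom_point() +
    labs(title = species, y = y)
})

# For example, the three plots for one species:
setosa_plots <- combos$plot[combos$species == "setosa"]
```

The resulting list for one species can be passed on to gridExtra::grid.arrange(grobs = ...) in the same way as in the answer.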

Using purrr to plot a mean line on a list of ggplot2 objects

I am playing around with purrr and the methods outlined in this post
My goal here is to use purrr::map2 (or some variant) to apply a function (mean in this case) by a group (cyl), then create some plots that use the result of that function. Or, put another way, I want to add a vertical line for the mean on each of these plots using the mean_list list column, all within a dplyr chain. Is this possible?
library(tidyverse)
mt_list <- mtcars %>%
group_by(cyl) %>%
nest() %>%
mutate(mean_list = map2(data, cyl, ~mean(.$disp))) %>%
mutate(plot = map2(data, cyl, ~ggplot(data = .x) +
geom_point(aes(y = drat, x = disp)) #+
#geom_vline(data = mean_list ,aes(xintercept) Unsure about this step
))
This is an example of one type of plot I'm after but this seems like a silly way to do this when the whole point is to have everything contained within a nice tibble like mt_list
mt_list$plot[[1]] +
geom_vline(aes(xintercept = mt_list$mean_list[[1]]))
This can be done by passing mean_list as the second argument to map2 rather than cyl, then using xintercept = .y in your geom_vline.
mt_list <- mtcars %>%
group_by(cyl) %>%
nest() %>%
mutate(mean_list = map(data, ~mean(.$disp))) %>%
mutate(plot = map2(data, mean_list, ~ ggplot(data = .x) +
geom_point(aes(y = drat, x = disp)) +
geom_vline(xintercept = .y)
))
Note that for this particular use case, you can also avoid having to create mean_list at all by using aes(xintercept = mean(disp)):
mt_list <- mtcars %>%
group_by(cyl) %>%
nest() %>%
mutate(plot = map(data, ~ ggplot(data = .) +
geom_point(aes(y = drat, x = disp)) +
geom_vline(aes(xintercept = mean(disp)))))
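If the per-group mean is worth keeping in the tibble, it can be stored as a plain numeric column rather than a list column. This is a minimal sketch of that variation using purrr::map_dbl; the column name mean_disp is illustrative, and it assumes tidyr 1.0 or later for nest() on a grouped data frame.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)

mt_list <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(
    # map_dbl() returns a numeric vector instead of a list
    mean_disp = map_dbl(data, ~ mean(.x$disp)),
    # .x is the nested data frame, .y the matching mean
    plot = map2(data, mean_disp, ~ ggplot(.x, aes(x = disp, y = drat)) +
                  geom_point() +
                  geom_vline(xintercept = .y))
  )

mt_list$plot[[1]]
```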

Apply a ggplot-function per group with dplyr and set title per group

I would like to create one separate plot per group in a data frame and include the group in the title.
With the iris dataset I can do this in base R and ggplot:
plots1 <- lapply(split(iris, iris$Species),
function(x)
ggplot(x, aes(x=Petal.Width, y=Petal.Length)) +
geom_point() +
ggtitle(x$Species[1]))
Is there an equivalent using dplyr?
Here's an attempt using facets instead of title.
p <- ggplot(data=iris, aes(x=Petal.Width, y=Petal.Length)) + geom_point()
plots2 = iris %>% group_by(Species) %>% do(plots = p %+% . + facet_wrap(~Species))
where I use %+% to replace the dataset in p with the subset for each call.
Or (working but complex) with ggtitle:
plots3 = iris %>%
group_by(Species) %>%
do(
plots = ggplot(data=.) +
geom_point(aes(x=Petal.Width, y=Petal.Length)) +
ggtitle(. %>% select(Species) %>% mutate(Species=as.character(Species)) %>% head(1) %>% as.character()))
The problem is that I can't seem to set the title per group with ggtitle in a very simple way.
Thanks!
Use .$Species to pull the species data into ggtitle:
iris %>% group_by(Species) %>% do(plots=ggplot(data=.) +
aes(x=Petal.Width, y=Petal.Length) + geom_point() + ggtitle(unique(.$Species)))
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
plots3 <- iris %>%
group_by(Species) %>%
group_map(~ ggplot(.) + aes(x=Petal.Width, y=Petal.Length) + geom_point() + ggtitle(.y[[1]]))
length(plots3)
#> [1] 3
# for example, the second plot :
plots3[[2]]
Created on 2021-11-19 by the reprex package (v2.0.1)
This is another option using rowwise:
plots2 = iris %>%
group_by(Species) %>%
do(plots = p %+% .) %>%
rowwise() %>%
do(x=.$plots + ggtitle(.$Species))
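Another possibility, sketched here with purrr rather than dplyr, is imap(), which passes each list element together with its name. Since split() names the pieces after the factor levels, the name can serve directly as the title. The names plots, df and species are illustrative.

```r
library(ggplot2)
library(purrr)

# split() returns a list named by species; imap() passes (element, name)
plots <- imap(split(iris, iris$Species), function(df, species) {
  ggplot(df, aes(x = Petal.Width, y = Petal.Length)) +
    geom_point() +
    ggtitle(species)
})

# For example, the second plot:
plots[[2]]
```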

Resources