I would like to create multiple plots using a for loop. However, my code does not work. Could anyone give me some guidance on this?
for i in 1:4 {
paste0("p_carb_",i) <- ggplot(mtcars%>% filter(carb==4), aes(x = wt, y = mpg, color = disp))
+ geom_point()
}
Perhaps this?
library(ggplot2)
library(dplyr)
ggs <- lapply(sort(unique(mtcars$carb)), function(crb) {
  ggplot(filter(mtcars, carb == crb), aes(x = wt, y = mpg, color = disp)) +
    geom_point()
})
This produces six plots, one for each distinct value of carb. Calling ggs[[1]] and then ggs[[2]] displays the first two, which can then be viewed side by side.
An alternative might be to facet the data, as in
ggplot(mtcars, aes(x = wt, y = mpg, color = disp)) +
facet_wrap(~ carb) +
geom_point()
A literal translation of your paste0(..) <- ... code into something syntactically correct would rely on assign(), which is an anti-pattern in R:
for (crb in sort(unique(mtcars$carb))) {
  gg <- ggplot(filter(mtcars, carb == crb), aes(x = wt, y = mpg, color = disp)) +
    geom_point()
  assign(paste0("carb_", crb), gg)
}
Again, this is not the preferred/best-practices way of doing things. It is generally considered much better to keep like-things in a list for uniform/consistent processing of them.
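A minimal sketch of the list-based approach, assuming you still want the carb_<n> names that the assign() loop would have created. The name ggs_named, the plot titles, and the theme step are illustrative additions, not part of the original answer.

```r
library(ggplot2)

carbs <- sort(unique(mtcars$carb))

# One plot per carb value, stored in a single list instead of separate variables
ggs_named <- lapply(carbs, function(crb) {
  ggplot(subset(mtcars, carb == crb), aes(x = wt, y = mpg, color = disp)) +
    geom_point() +
    ggtitle(paste("carb =", crb))
})
names(ggs_named) <- paste0("carb_", carbs)

# Uniform processing: apply the same change to every plot in one call
ggs_named <- lapply(ggs_named, function(p) p + theme_minimal())

# ggs_named$carb_4 (or ggs_named[["carb_4"]]) retrieves a single plot by name
```

Because the plots live in one object, any later step (restyling, printing, saving) is a single lapply() over the list rather than a series of get() calls on constructed variable names.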
Multiple IDs ... two ways:
Nested lapply:
carbs <- sort(unique(mtcars$carb))
ggs <- lapply(carbs, function(crb) {
  # unique() so that each gear value yields one plot, not one plot per row
  gears <- sort(unique(subset(mtcars, carb == crb)$gear))
  lapply(gears, function(gr) {
    ggplot(dplyr::filter(mtcars, carb == crb, gear == gr), aes(x = wt, y = mpg, color = disp)) +
      geom_point()
  })
})
Here ggs is a list of lists: ggs[[1]] is the list of plots for the first carb value, and ggs[[1]][[1]] is one plot.
split list, one-deep:
carbsgears <- split(mtcars, mtcars[,c("carb", "gear")], drop = TRUE)
ggs <- lapply(carbsgears, function(dat) {
ggplot(dat, aes(x = wt, y = mpg, color = disp)) + geom_point()
})
Here, ggs is a list only one-deep. The names are just concatenated strings of the two fields, so since we have mtcars$carb with values c(1,2,3,4,6,8) and mtcars$gear with values c(3,4,5), removing combinations without data we have names:
names(ggs)
# [1] "1.3" "2.3" "3.3" "4.3" "1.4" "2.4" "4.4" "2.5" "4.5" "6.5" "8.5"
where "1.3" is carb == 1 and gear == 3. When the values themselves contain dots (for example, non-integer numbers), these names might become ambiguous.
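One way to sidestep the dot ambiguity, sketched here as an assumption rather than as part of the original answer, is to build the grouping factor yourself with interaction() and pick a separator that cannot occur in the values. The name grp is illustrative.

```r
library(ggplot2)

# Build the grouping factor explicitly so the separator can be chosen;
# drop = TRUE removes carb/gear combinations that have no rows
grp <- interaction(mtcars$carb, mtcars$gear, sep = "_", drop = TRUE)
carbsgears <- split(mtcars, grp)

ggs <- lapply(carbsgears, function(dat) {
  ggplot(dat, aes(x = wt, y = mpg, color = disp)) + geom_point()
})

names(ggs)  # names now look like "1_3", "2_3", ... instead of "1.3", "2.3", ...
```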
How can you specify a facet parameter that is optional for facet_wrap(), without a weird additional label showing up?
For facet_wrap(), it works as expected when facets are specified. But if it's NULL, there is a weird (all) facet. Is it possible to get rid of that facet label without adding another parameter to the function?
foo_wrap <- function(x) {
ggplot(mtcars) +
aes(x = mpg, y = disp) +
geom_point() +
facet_wrap(vars({{ x }}))
}
foo_wrap (cyl) # cylinder facets
foo_wrap (NULL) # how to get rid of "(all)"?
How can you get rid of "(all)"?
Adding these examples as references in case people find this by searching
Below is an example function with optional facets for facet_grid(), where it works as expected:
foo_grid <- function(x) {
ggplot(mtcars) +
aes(x = mpg, y = disp) +
geom_point() +
facet_grid(rows=NULL, cols=vars({{ x }}))
}
foo_grid (cyl) # cylinder facets
foo_grid (NULL) # no facets, as expected
Here is an example with hard-coded row faceting. Note that you need to call vars():
foo_grid_am <- function(x) {
ggplot(mtcars) +
aes(x = mpg, y = disp) +
geom_point() +
facet_grid(rows=vars(am), cols=vars({{ x }}))
}
foo_grid_am (cyl) # automatic-manual x cylinder facets
One option would be to add a conditional facet layer where I use rlang::quo_is_null(rlang::enquo(x)) to check whether a faceting variable was provided or not:
Note: I made NULL the default.
library(ggplot2)
library(rlang)
foo_wrap <- function(x = NULL) {
  facet_layer <- if (!rlang::quo_is_null(rlang::enquo(x))) facet_wrap(vars({{ x }}))
  ggplot(mtcars) +
    aes(x = mpg, y = disp) +
    geom_point() +
    facet_layer
}
foo_wrap(cyl)
foo_wrap(NULL)
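The conditional layer works because an if without an else returns NULL when its condition is FALSE, and adding NULL to a ggplot leaves the plot unchanged. A small sketch of that mechanism on its own, using a hypothetical helper maybe_smooth() that is not part of the answer above:

```r
library(ggplot2)

# Hypothetical helper for illustration: returns a layer, or NULL when add is FALSE
maybe_smooth <- function(add) {
  if (add) geom_smooth(method = "lm", formula = y ~ x)
}

base <- ggplot(mtcars, aes(x = mpg, y = disp)) + geom_point()

p_without <- base + maybe_smooth(FALSE)  # same as base + NULL: nothing is added
p_with    <- base + maybe_smooth(TRUE)   # the smooth layer is added
```

The same pattern applies to any optional component (facets, scales, themes), which is why no extra parameter is needed in foo_wrap().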
I'm struggling to combine the use of variable labels (provided by the expss package) with a ggplot2 plot inside a function I've written so that I can reuse it several times.
In other words, the following code works as expected.
data(mtcars)
library(expss)
library(ggplot2)
mtcars <- apply_labels(mtcars,
mpg = "MPG",
cyl = "CYL",
wt = "WEIGHT")
use_labels(mtcars, {
# from the example of the package's vignette
ggplot(..data) +
geom_point(aes(y = mpg, x = wt))
})
If I want to write a function like
myplot <- function(x,y) {
ggplot(data=mtcars) +
geom_point(aes(y = {{y}}, x = {{x}}))
}
myplot(mpg, cyl)
myplot(mpg, wt)
This works as appropriate as well.
But if I use
myplot <- function(x,y) {
use_labels(data=mtcars, {
ggplot(..data) +
geom_point(aes(y = y, x = x))
})
}
myplot("mpg", "cyl")
This does not work anymore, i.e. the plot is not correct and the labels are not shown.
I've tried
myplot <- function(x,y) {
use_labels(data=mtcars, {
ggplot(data=mtcars) +
geom_point(aes(y = mtcars[[y]], x = mtcars[[x]]))
})
}
myplot("mpg", "cyl")
Then the plot is correct, but the labels are not shown...
Much easier solution: the ggeasy package (https://rdrr.io/cran/ggeasy/man/easy_labs.html)
The following works perfectly:
myplot <- function(x,y) {
ggplot(data=mtcars) +
geom_point(aes(y = {{y}}, x = {{x}}))+
ggeasy::easy_labs(teach=TRUE)
}
myplot(mpg, cyl)
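If you would rather not add another package, a rough alternative is to read the label yourself and pass it to labs(). This sketch assumes the labels are stored in each column's "label" attribute (set by hand here so the example runs without expss); label_or_name() and myplot_labelled() are hypothetical names.

```r
library(ggplot2)

df <- mtcars
# Assumption: variable labels live in the "label" attribute of each column
attr(df$mpg, "label") <- "MPG"
attr(df$wt, "label") <- "WEIGHT"

# Return the column's label if it has one, otherwise fall back to its name
label_or_name <- function(data, col) {
  lab <- attr(data[[col]], "label", exact = TRUE)
  if (is.null(lab)) col else lab
}

# Column names are passed as strings and looked up with .data[[ ]]
myplot_labelled <- function(data, x, y) {
  ggplot(data) +
    geom_point(aes(x = .data[[x]], y = .data[[y]])) +
    labs(x = label_or_name(data, x), y = label_or_name(data, y))
}

p <- myplot_labelled(df, "wt", "mpg")
```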
I'm looking for help in order to merge two plots and respect their respective scale size.
Here is a reproducible example:
data1<- subset(mtcars, cyl = 4)
data1$mpg <- data1$mp*5.6
data2<- subset(mtcars, cyl = 8)
p1 <- ggplot(data1, aes(wt, mpg, colour = cyl)) + geom_point()
p2 <- ggplot(data2, aes(wt, mpg, colour = cyl)) + geom_point()
grid.arrange(p1, p2, ncol = 2)
But what I'm looking for is to merge the two plots and respect the scale size and get something like :
It would be nice not to use a package that requires defining the ratio, since it's difficult to know how much I should shrink the second plot compared to the first one, and even more difficult when there are more than two plots.
I think what you are trying to achieve is something like this:
library(tidyverse)
mtcars %>%
filter(cyl %in% c(4, 8)) %>%
mutate(mpg = ifelse(cyl == 4, mpg * 5.6, mpg)) %>%
ggplot(aes(x = wt, y = mpg, col = as.factor(cyl))) +
geom_point(show.legend = FALSE) +
facet_wrap(~ cyl)
NOTE: I see some bugs in your original code. For example, if you want to use subset() to subset your data, you have to change your code from:
data1 <- subset(mtcars, cyl = 4)
to:
data1 <- subset(mtcars, cyl == 4)
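subset(mtcars, cyl = 4) does not filter anything: cyl = 4 is treated as a named argument rather than a condition, so it is ignored and every row is returned.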
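A quick check makes the difference visible by comparing row counts:

```r
n_all     <- nrow(mtcars)                     # 32
n_assign  <- nrow(subset(mtcars, cyl = 4))    # 32: `cyl = 4` is swallowed as an unused named argument
n_compare <- nrow(subset(mtcars, cyl == 4))   # 11: only the four-cylinder cars
```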
I'm trying to write the code to use my dataset and make a new graph for each column of a dataset, rather than have to write out a new value for y each time in the code.
I have a dataset where each row is a person and each column is a measurement in the blood (e.g., insulin, glucose, etc.). I have a few extra columns with descriptive categories that I'm using for my groups (e.g., lean, obese). I'd like to make a graph for each of those column measurements (i.e., one graph for insulin, another for glucose, etc.). I have 90 different variables to cycle through.
I've figured out how to make a boxplot for each of these, but can't figure out how to make the code "loop" so that I don't have to rewrite it for each variable.
Using the mtcars dataset as an example, I have it making a graph where the y is disp, and then another graph where y = hp, and then y = drat.
data("mtcars")
#boxplot with individual points - first y variable
ggplot(data = mtcars, aes(x = cyl, y = disp)) +
geom_boxplot()+
geom_point()
#boxplot with individual points - 2nd y variable
ggplot(data = mtcars, aes(x = cyl, y = hp)) +
geom_boxplot()+
geom_point()
#boxplot with individual points - 3rd y variable
ggplot(data = mtcars, aes(x = cyl, y = drat)) +
geom_boxplot()+
geom_point()
How do I set this up so my code will automatically cycle through all of the variables in the dataset (I have 90 of them)?
Here's a basic solution, where you would populate vector_of_yvals with your 90 variables to loop through:
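library(tidyverse)
plot_func <- function(yval) {
  # yval is a column name given as a string, so look it up with .data[[ ]];
  # a bare aes(y = yval) would map the constant string instead of the column
  ggplot(data = mtcars, aes(x = cyl, y = .data[[yval]], group = cyl)) +
    geom_boxplot() +
    geom_point() +
    labs(y = yval)
}
vector_of_yvals <- c("disp", "hp", "drat")
list_of_plots <- map(vector_of_yvals, plot_func)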
You can populate vector_of_yvals with all of the variables in your dataframe by doing:
vector_of_yvals <- colnames(mtcars)
This will give you a vector:
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"
If you don't want to include cyl in your vector, you can filter it out like so:
vector_of_yvals <- vector_of_yvals %>% .[. != "cyl"]
Here is a slightly different version that uses a for loop and !!sym() to turn each variable name string into a column reference:
library(rlang)
variables<-c("disp", "hp", "drat")
for (var in variables) {
  # print(var)
  p <- ggplot(data = mtcars, aes(x = cyl, y = !!sym(var), group = cyl)) +
    geom_boxplot() +
    geom_point()
  print(p)
}
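With 90 variables you will probably want files rather than 90 plots on screen. The following is a hedged sketch of the same loop writing one PDF per variable with ggsave(). The output directory, file naming scheme, and plot size are assumptions, and .data[[var]] is used in place of !!sym(var) only so that the block needs no extra package.

```r
library(ggplot2)

variables <- c("disp", "hp", "drat")
out_dir <- file.path(tempdir(), "boxplots")  # hypothetical output location
dir.create(out_dir, showWarnings = FALSE)

for (var in variables) {
  p <- ggplot(mtcars, aes(x = cyl, y = .data[[var]], group = cyl)) +
    geom_boxplot() +
    geom_point()
  # One file per variable, e.g. boxplot_disp.pdf
  ggsave(file.path(out_dir, paste0("boxplot_", var, ".pdf")),
         plot = p, width = 5, height = 4)
}
```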
I want to create multiple plots that have the same x but different y's using purrr package methodology. That is, I would like to use the map() or walk() functions to perform this.
Using mtcars dataset for simplicity.
ggplot(data = mtcars, aes(x = hp, y = mpg)) + geom_point()
ggplot(data = mtcars, aes(x = hp, y = cyl)) + geom_point()
ggplot(data = mtcars, aes(x = hp, y = disp)) + geom_point()
edit
So far I have tried
y <- list("mpg", "cyl", "disp")
mtcars %>% map(y, ggplot(., aes(hp, y)) + geom_point()
This is one possibility
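ys <- c("mpg", "cyl", "disp")
# aes_string() is deprecated in recent versions of ggplot2; .data[[y]] looks up the column by name
ys %>% map(function(y)
  ggplot(mtcars, aes(hp)) + geom_point(aes(y = .data[[y]])))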
It's just like any other map function, you just need to configure your aesthetics properly in the function.
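As a small extension, naming the input vector gives you a named list of plots, which makes individual plots easier to retrieve later. This is a sketch assuming purrr's set_names(); the variable name plots is illustrative.

```r
library(ggplot2)
library(purrr)

ys <- c("mpg", "cyl", "disp")

# set_names(ys) names each element after itself, and map() keeps those names
plots <- map(set_names(ys), function(y) {
  ggplot(mtcars, aes(x = hp, y = .data[[y]])) + geom_point()
})

names(plots)
# plots$mpg retrieves one plot; walk(plots, print) would draw each plot in turn
```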
I've written a somewhat more general function for this, because it's part of an EDA protocol (Zuur et al., 2010). This article from Ariel Muldoon helped me.
plotlist <- function(data, resp, efflist) {
  require(ggplot2)
  require(purrr)
  y <- enquo(resp)
  map(efflist, function(x)
    ggplot(data, aes(!!sym(x), !!y)) +
      geom_point(alpha = 0.25, color = "darkgreen") +
      ylab(NULL)
  )
}
where:
data is your dataframe
resp is response variable
efflist is a character vector naming the effects (independent variables)
Of course, you may change the geom and/or aesthetics as needed. The function returns a list of plots, which you can pass to e.g. cowplot or gridExtra, as in this example:
library(gridExtra)
library(dplyr) # just for pipes
plotlist(mtcars, hp, c("mpg","cyl","disp")) %>%
grid.arrange(grobs = ., left = "HP")