R ggplot2 for loop plots same data

I have put together a simple for loop to generate a series of plots and then use grid.arrange to plot them. I have two problems:
The axes of the plots change correctly to the column names, but the same data is plotted on each graph. Having put in a breakpoint and stepped through the code it appears to be incrementing correctly so I'm not sure why.
I have set the plot aesthetic to group on year, however this produces intermediate .5 years that appear in the legend. This hasn't happened to me before.
Should all be reproducible using mtcars.
library(ggplot2)
library(gridExtra)
result <- mtcars
for(i in 1:2) {
nam <- paste("p", i, sep = "")
assign(
nam, ggplot(result, aes(x = disp, y = results[i+4], group = gear, color = gear)) +
geom_line() +
geom_point() +
scale_colour_distiller(palette = "Dark2", direction = -1, guide = "legend") +
scale_y_continuous(name = colnames(results[i+4])) +
scale_x_continuous(name = "x")
)
}
plist <- mget(paste0("p", 1:2))
do.call(grid.arrange, plist)

I think trying to access the columns by their number in the aes mapping is confusing ggplot. This works:
for(i in 1:2) {
nam <- paste("p", i, sep = "")
assign(
nam, ggplot(result,aes_string(x="disp",y=colnames(result)[i+4], group="gear", color="gear")) +
geom_line() +
geom_point() +
scale_colour_distiller(palette = "Dark2", direction=-1, guide="legend") +
scale_y_continuous(name=colnames(result[i+4])) +
scale_x_continuous(name="x")
)
}
I would suggest iterating over the names though; this makes the code much clearer. Here's a version that does this and skips the detour around the environment:
plots <- lapply(c("drat", "wt"), function(column) {
ggplot(result,aes_string(x="disp",y=column, group="gear", color="gear")) +
geom_line() + geom_point() +
scale_colour_distiller(palette = "Dark2", direction=-1, guide="legend") +
scale_y_continuous(name=column) +
scale_x_continuous(name="x")})
do.call(grid.arrange, plots)
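aes_string() has been soft-deprecated in later ggplot2 releases, so a variant of the lapply approach that uses the .data pronoun may age better. The following is only a minimal sketch, not the answerer's code: make_plot is a hypothetical helper name, and mapping colour to factor(gear) with scale_colour_brewer() is an assumption about how to keep the intermediate .5 values out of the legend (a factor gives a discrete legend with only the values present in the data).

```r
library(ggplot2)
library(gridExtra)

result <- mtcars

# Hypothetical helper: one plot per column name, looked up via the .data pronoun
make_plot <- function(column) {
  ggplot(result, aes(x = disp, y = .data[[column]],
                     group = gear, colour = factor(gear))) +
    geom_line() +
    geom_point() +
    # gear is treated as a factor here, so the legend is discrete
    scale_colour_brewer(palette = "Dark2", name = "gear") +
    labs(x = "x", y = column)
}

plots <- lapply(c("drat", "wt"), make_plot)

# arrangeGrob() builds the layout without drawing it;
# grid.arrange(grobs = plots, ncol = 1) would draw the same layout
layout <- arrangeGrob(grobs = plots, ncol = 1)
```

Because the column name is passed as a function argument, each plot keeps its own value of column and nothing depends on a loop counter.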

You're using both results and result. You should also use aes_string and refer to the variables by their string names.
You should also avoid making lots of assignments; just put all the plots into a list().
library(ggplot2)
library(gridExtra)
result<-mtcars
for(i in 1:2) {
nam <- paste("p", i, sep = "")
assign(
nam, ggplot(result,aes_string(x="disp",y=names(result)[i+4], group="gear", color="gear")) +
geom_line() +
geom_point() +
scale_colour_distiller(palette = "Dark2", direction=-1, guide="legend") +
scale_y_continuous(name=colnames(result[i+4])) +
scale_x_continuous(name="x")
)
}
plist <- mget(paste0("p", 1:2))
do.call(grid.arrange, plist)

The problem is that the plot is generated in the for loop, but evaluated in the do.call. Since i has changed in the for loop, both are evaluated with i = 2. You can confirm this with:
i <- 3
do.call(grid.arrange, plist)
A small adjustment to your code fixes the issue:
for(i in 1:2) {
nam <- paste("p", i, sep = "")
coln <- colnames(result[i+4])
assign(
nam, ggplot(result,aes_(x=~disp,y=as.name(coln), group=~gear, color=~gear)) +
geom_line() +
geom_point() +
scale_colour_distiller(palette = "Dark2", direction=-1, guide="legend") +
scale_y_continuous(name=coln) +
scale_x_continuous(name="x")
)
}
plist <- mget(paste0("p", 1:2))
do.call(grid.arrange, plist)
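The delayed evaluation described in this answer can be checked without drawing anything. This is a minimal sketch under the assumption that the mapping refers to the loop counter directly (here mtcars[[i + 4]]); layer_data() is used to pull out the values ggplot2 computes when the plot is built.

```r
library(ggplot2)

plist <- list()
for (i in 1:2) {
  # The y mapping is stored as the unevaluated expression mtcars[[i + 4]]
  plist[[i]] <- ggplot(mtcars, aes(x = disp, y = mtcars[[i + 4]])) +
    geom_point()
}

# The expression is only evaluated now, when each plot is built,
# and by this point the loop has finished with i equal to 2
y_first  <- layer_data(plist[[1]])$y
y_second <- layer_data(plist[[2]])$y

same_data    <- isTRUE(all.equal(y_first, y_second))
from_column6 <- isTRUE(all.equal(y_first, mtcars$wt))
```

Both plots end up with the wt column (column 6), which matches the symptom in the question: different axis titles, identical data.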

You should take full advantage of ggplot2::facet_wrap
This means tidying your data to a single data frame that's interpretable to ggplot
Data
temp <- mtcars
Tidy data
library(purrr)
library(dplyr)
Names <- map_chr(1:2, ~names(temp)[.x+4])
# "drat" "wt"
data <- map_df(1:2, ~temp[,c("cyl", names(temp)[.x+4])] %>% setNames(c("cyl", "value")), .id="iteration") %>%
mutate(iteration = Names[as.numeric(iteration)])
plot with facet_wrap
ggplot(data=data, aes(x=cyl, y=value, label=iteration)) +
geom_line() +
geom_point() +
facet_wrap(~iteration)
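The same tidying step can also be written with tidyr::pivot_longer(). This is a sketch rather than a drop-in replacement for the code above: it keeps disp and gear from the original question instead of cyl, and the column names variable and value are illustrative choices.

```r
library(ggplot2)
library(tidyr)

# Keep the shared x variable and the grouping column, then stack drat and wt
long <- pivot_longer(
  mtcars[, c("disp", "gear", "drat", "wt")],
  cols = c(drat, wt),
  names_to = "variable",
  values_to = "value"
)

p <- ggplot(long, aes(x = disp, y = value, group = gear, colour = factor(gear))) +
  geom_line() +
  geom_point() +
  # free_y gives each variable its own y range, as separate plots would
  facet_wrap(~ variable, scales = "free_y") +
  labs(colour = "gear")
```

With the data in long form, a single ggplot() call replaces both the loop and grid.arrange.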

Related

for-loop to create ggplots

I am trying to make boxplots with ggplot2.
The code I have to make the boxplots with the format that I want is as follows:
p <- ggplot(mg_data, aes(x=Treatment, y=CD68, color=Treatment)) +
geom_boxplot(mg_data, mapping=aes(x=Treatment, y=CD68))
p+ theme_classic() + geom_jitter(shape=16, position=position_jitter(0.2))
I was able to use the following code to make looped boxplots:
variables <- mg_data %>%
select(10:17)
for(i in variables) {
print(ggplot(mg_data, aes(x = Treatment, y = i, color=Treatment)) +
geom_boxplot())
}
With this code I get the boxplots; however, they do not have a label naming the variable selected for the y-axis, unlike the original code without the for loop. I also do not know how to add the formatting code to the loop:
p + theme_classic() + geom_jitter(shape=16, position=position_jitter(0.2))
Here is a way. I have tested with built-in data set iris, just change the data name and selected columns and it will work.
suppressPackageStartupMessages({
library(dplyr)
library(ggplot2)
})
variables <- iris %>%
select(1:4) %>%
names()
for(i in variables) {
g <- ggplot(iris, aes(x = Species, y = get(i), color=Species)) +
geom_boxplot() +
ylab(i)
print(g)
}
Edit
Answering a comment by user TarJae, reproduced here because answers are deleted less often than comments:
Could you please expand with saving all four files. Many thanks.
The code above can be made to save the plots with a ggsave instruction at the loop end. The filename is the variable name and the plot is the default, the return value of last_plot().
for(i in variables) {
g <- ggplot(iris, aes(x = Species, y = get(i), color=Species)) +
geom_boxplot() +
ylab(i)
print(g)
ggsave(paste0(i, ".png"), device = "png")
}
Try this:
variables <- mg_data %>%
colnames() %>%
`[`(10:17)
for (i in variables) {
print(ggplot(mg_data, aes(
x = Treatment, y = .data[[i]], color = Treatment
)) +
geom_boxplot() +
ylab(i))
}
Another option is to use lapply. It's approximately the same as using a loop, but it hides the actual looping part and can make your code look a little cleaner.
variables = iris %>%
select(1:4) %>%
names()
lapply(variables, function(x) {
ggplot(iris, aes(x = Species, y = get(x), color=Species)) +
geom_boxplot() + ylab(x)
})
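None of the versions above includes the theme_classic() and geom_jitter() formatting the question asked about. A minimal sketch that folds it in, again on iris, could look like this; boxplot_for is a hypothetical helper name, and .data[[column]] is used in place of get() as an assumption about preferred style.

```r
library(ggplot2)

variables <- names(iris)[1:4]

# Hypothetical helper combining the boxplot with the asker's theme and jitter
boxplot_for <- function(column) {
  ggplot(iris, aes(x = Species, y = .data[[column]], colour = Species)) +
    geom_boxplot() +
    geom_jitter(shape = 16, position = position_jitter(0.2)) +
    theme_classic() +
    ylab(column)
}

# A named list makes it easy to print or save one plot at a time
plots <- lapply(variables, boxplot_for)
names(plots) <- variables
```

Each element can then be printed with print(plots[["Sepal.Length"]]) or passed to ggsave() through its plot argument, which avoids relying on last_plot().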

Handle ggplot2 axis text face programmatically

(x-posted to community.rstudio.com)
I'm wondering if it's possible to change the axis text in ggplot2 programmatically or if there is some native way to do this in ggplot2. In this reprex, the idea is that I want to bold the axis text for each value of y whose x has an absolute value over 1.5. I can add it in manually via theme(), and that works fine:
library(ggplot2)
library(dplyr)
library(forcats)
set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5))
df <- mutate(df, big_number = abs(x) > 1.5, face = ifelse(big_number, "bold",
"plain"))
p <- ggplot(df, aes(x = x, y = fct_inorder(y), col = big_number)) + geom_point() +
theme(axis.text.y = element_text(face = df$face))
p
Plot 1 with no facets
But if I facet it by group, y gets reordered and ggplot2 has no idea how face is connected to df and thus y, so it just bolds in the same order as the first plot.
p + facet_grid(group ~ .)
Plot 2 with facets
And it's worse if I use a different scale for each.
p + facet_grid(group ~ ., scales = "free")
Plot 3 with facets and different scales
What do you think? Is there a general way to handle this that would work consistently here?
Idea: Don't change theme, change y-axis labels. Create a call for every y with if/else condition and parse it with parse.
Not the most elegant solution (using for loop), but works (need loop as bquote doesn't work with ifelse). I always get confused when trying to work with multiple expressions (more on that here).
Code:
# Create data
library(tidyverse)
set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5)) %>%
mutate(yF = fct_inorder(y),
big_number = abs(x) > 1.5)
# Expressions for y-axis
# ifelse doesn't work
# ifelse(df$big_number, bquote(bold(1)), bquote(plain(2)))
yExp <- c() # Ignore terrible way of concatenating
for(i in 1:nrow(df)) {
if (df$big_number[i]) {
yExp <- c(yExp, bquote(bold(.(as.character(df$yF[i])))))
} else {
yExp <- c(yExp, bquote(plain(.(as.character(df$yF[i])))))
}
}
# Plot with facets
ggplot(df, aes(x, yF, col = big_number)) +
geom_point() +
scale_y_discrete(breaks = levels(df$yF),
labels = parse(text = yExp)) +
facet_grid(group ~ ., scales = "free")
Result:
Inspired by @PoGibas, I also used a function in scale_y_discrete(), which works, too.
bold_labels <- function(breaks) {
big_nums <- filter(df, y %in% breaks) %>%
pull(big_number)
labels <- purrr::map2(
breaks, big_nums,
~ if (.y) bquote(bold(.(.x))) else bquote(plain(.(.x)))
)
parse(text = labels)
}
ggplot(df, aes(x, fct_inorder(y), col = big_number)) +
geom_point() +
scale_y_discrete(labels = bold_labels) +
facet_grid(group ~ ., scales = "free")
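One detail worth guarding against in a labeller like this is the order of the lookup: filter() returns rows in data frame order, which need not be the order of breaks. The sketch below is a variation, not the answerer's code. It looks each break up with match() and builds the plotmath strings with sprintf(), which is an assumption about one way to avoid both the loop and the ordering issue.

```r
library(ggplot2)

set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5))
df$big_number <- abs(df$x) > 1.5
df$yF <- factor(df$y, levels = unique(df$y))

bold_labels <- function(breaks) {
  # match() returns one position per break, in the order of breaks
  is_big <- df$big_number[match(breaks, df$y)]
  labs <- ifelse(is_big,
                 sprintf("bold('%s')", breaks),
                 sprintf("plain('%s')", breaks))
  # One plotmath expression per break
  parse(text = labs)
}

p <- ggplot(df, aes(x, yF, colour = big_number)) +
  geom_point() +
  scale_y_discrete(labels = bold_labels) +
  facet_grid(group ~ ., scales = "free")
```

Since the function receives only the breaks shown in each panel, the same code should behave consistently with and without free scales.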

how to add superscript into facet labels of facet_wrap? [duplicate]

This question already has answers here:
Changing facet label to math formula in ggplot2
(5 answers)
Closed 9 years ago.
I have a dataset from which I would like to plot small multiples, specifically in a 2-by-2 array, like this:
mydf <- data.frame(letter = factor(rep(c("A", "B", "C", "D"), each = 20)), x = rnorm(80), y = rnorm(80))
ggplot(mydf, aes(x = x, y = y)) + geom_smooth(method = "lm") + geom_point() + facet_wrap(~ letter, ncol = 2)
However, I want each facet label to include an expression, such as
expression(paste("A or ", alpha))
I can make this happen using facet_grid() via
f_names <- list('A' = expression(paste("A or ", alpha)), 'B' = expression(paste("B or ", beta)), 'C' = expression(paste("C or ", gamma)), 'D' = expression(paste("D or ", delta)))
f_labeller <- function(variable, value){return(f_names[value])}
ggplot(mydf, aes(x = x, y = y)) + geom_smooth(method = "lm") + geom_point() + facet_grid(~ letter, labeller = f_labeller)
But then I lose the 2-by-2 array. How can I rename the facet_wrap() facet labels with an expression? Or, how can I solve this by recreating the 2-by-2 array using facet_grid(), but only faceting by a single variable?
(This question builds off of the parenthetical note in @baptiste's answer to this previous question.)
Thanks!
In order to do what I asked, first load this labeller function from @Roland, which first appeared here:
facet_wrap_labeller <- function(gg.plot,labels=NULL) {
#works with R 3.0.1 and ggplot2 0.9.3.1
require(gridExtra)
g <- ggplotGrob(gg.plot)
gg <- g$grobs
strips <- grep("strip_t", names(gg))
for(ii in seq_along(labels)) {
modgrob <- getGrob(gg[[strips[ii]]], "strip.text",
grep=TRUE, global=TRUE)
gg[[strips[ii]]]$children[[modgrob$name]] <- editGrob(modgrob,label=labels[ii])
}
g$grobs <- gg
class(g) = c("arrange", "ggplot",class(g))
g
}
Then save the original ggplot() object:
myplot <- ggplot(mydf, aes(x = x, y = y)) + geom_smooth(method = "lm") + geom_point() + facet_wrap(~ letter, ncol = 2)
Then call facet_wrap_labeller() and feed the expression labels as an argument:
facet_wrap_labeller(myplot, labels = c(expression(paste("A or ", alpha)), expression(beta), expression(gamma), expression(delta)))
The expressions should now appear as the facet_wrap() labels.
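The helper above was written for ggplot2 0.9.3.1 and edits the strip grobs directly. More recent versions of ggplot2 accept a labeller argument in facet_wrap(), so a sketch along the following lines may be enough there. The label column and its plotmath strings are illustrative names and values, not part of the original answer.

```r
library(ggplot2)

set.seed(1)
mydf <- data.frame(letter = factor(rep(c("A", "B", "C", "D"), each = 20)),
                   x = rnorm(80), y = rnorm(80))

# One plotmath string per level of letter, given in level order
mydf$label <- factor(
  mydf$letter,
  levels = c("A", "B", "C", "D"),
  labels = c("'A or ' * alpha", "'B or ' * beta",
             "'C or ' * gamma", "'D or ' * delta")
)

myplot <- ggplot(mydf, aes(x = x, y = y)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point() +
  # label_parsed turns each strip label string into a plotmath expression
  facet_wrap(~ label, ncol = 2, labeller = label_parsed)
```

This keeps the 2-by-2 layout of facet_wrap() and leaves the result an ordinary ggplot object that can still be extended with +.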

Ordering ggplot legend by the final value in a data frame

I would like to re-order the elements in a legend, as they appear top to bottom in an R ggplot. That is: I'd like the order dictated by comparing the Y values at the rightmost point on the X axis. In the following data, I'd like the legend to read from the top: bush, foo, baz, bar.
Update: following @alexwhan's comments, I have added the data to the script.
Update 2: this is now exactly what I was hoping for, thanks to @thomas-kern on #R (bosie) irc.freenode. The trick was to add both, i.e.
scale_linetype_discrete(breaks = ord$Variant) + scale_shape_discrete(breaks = ord$Variant)
Here's my R:
library(plyr)
library(ggplot2)
require(grid)
args <- commandArgs(trailingOnly = TRUE)
lines <- "
X,Variant,Y
1,foo,123
1,bar,134
1,baz,135
1,bush,136
2,foo,221
2,bar,104
2,baz,155
2,bush,336
"
con <- textConnection(lines)
DF <- read.csv(con, header=TRUE)
close(con)
cdata <- ddply(DF, .(Variant,X), summarise, N = length(Y), mean=round(mean(Y),2), sd=round(sd(Y),2), se=round(sd(Y)/sqrt(length(Y)),2))
ord <- cdata[cdata$X == max(cdata$X),]
ord <- ord[order(ord$mean, decreasing=T),]
pdf("out.pdf")
none <- element_blank()
bp <- ggplot(cdata, aes(x=X, y=mean, group=Variant)) + xlab("X label") + geom_line(aes(linetype=Variant)) + geom_point(aes(shape=Variant)) + ylab("Y Value") + labs(title = "mytitle") + scale_linetype_discrete(breaks = ord$Variant) + scale_shape_discrete(breaks = ord$Variant)
print(bp + theme(legend.justification=c(1,0), legend.position=c(1,0), legend.key.width=unit(3,"line"), legend.title=element_blank(), text = element_text(size=18)) + theme(panel.background = element_rect(fill='white', colour='black')) + theme(panel.grid.major = none, panel.grid.minor = none))
dev.off()
This generates exactly what I'm after:
It really helps if you provide the data your plot is made with. Here's an example of how to approach with some data I made up:
dat <- data.frame(x = c(1,2), y = rnorm(8), group = rep(c("bar", "baz", "bush", "foo"), each = 2))
ord <- dat[dat$x == max(dat$x),]
ord <- ord[order(ord$y, decreasing=T),]
ggplot(dat, aes(x, y)) + geom_point(aes(shape = group)) + geom_line(aes(group = group)) +
scale_shape_discrete(breaks = ord$group)
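Applied to the data from the question, the same idea can be sketched as follows. The names last and legend_order are illustrative; the data are copied from the question, and sorting by Y at the largest X is assumed to be the intended rule.

```r
library(ggplot2)

DF <- read.csv(text = "X,Variant,Y
1,foo,123
1,bar,134
1,baz,135
1,bush,136
2,foo,221
2,bar,104
2,baz,155
2,bush,336", stringsAsFactors = FALSE)

# Rows at the rightmost X, then variants sorted by their Y value there
last <- DF[DF$X == max(DF$X), ]
legend_order <- last$Variant[order(last$Y, decreasing = TRUE)]

bp <- ggplot(DF, aes(x = X, y = Y, group = Variant)) +
  geom_line(aes(linetype = Variant)) +
  geom_point(aes(shape = Variant)) +
  # Both scales need the same breaks so the merged legend stays in one order
  scale_linetype_discrete(breaks = legend_order) +
  scale_shape_discrete(breaks = legend_order)
```

Here legend_order comes out as bush, foo, baz, bar, which is the order requested in the question.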

