Multiple equally sized plots in an RMarkdown document - r

I would like to align the plot areas of several plots "nicely", each of them created by a separate chunk in an RMarkdown document (preferably .html). My problem: because the y-axis texts have different lengths, the plotted areas don't line up perfectly (a pity, because my actual x-axis is months).
Setting fig.width= and out.width= doesn't help here, as both include the axis text in the width they control.
Dummy Data chunk:
require(ggplot2)
df = expand.grid(y = LETTERS,
x = paste0('A', 1:10),
stringsAsFactors = FALSE)
set.seed(42)
df$fill = rnorm(nrow(df))
df2 = df
df2$y = unlist(lapply(lapply(df2$y, function(x) rep(x, 10)), paste0, collapse = ''))
Plot-Chunk1:
gg1 = ggplot(df, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg1
Plot-Chunk2:
gg2 = ggplot(df2, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg2
The plots in the RMarkdown document should look like that (red lines highlight the desired alignment):
I achieved this with the patchwork package. However, like this I can only use one chunk and not multiple.
Patchwork-Plot-Chunk:
require(patchwork)
gg1 / gg2 +
plot_annotation(tag_levels = 'A')

Edited (tidier?) solution: cowplot::align_plots
Having a bit of a play around with cowplot::align_plots, it would be possible to set a standard panel width to use across all graphs. But to do this across chunks when you're constructing each graph 'blind' to the forthcoming ones, you could create a 'template' plot with labels as wide as needed (gg_set below). Each subsequent graph would then adopt the sizing of this unused plot:
require(ggplot2)
df <- expand.grid(y = LETTERS,
x = paste0('A', 1:10),
stringsAsFactors = FALSE)
set.seed(42)
df$fill = rnorm(nrow(df))
df2 <- df
df2$y <-
unlist(lapply(lapply(df2$y, function(x)
rep(x, 5)), paste0, collapse = ''))
# df for setting the max size needed - may need some experimentation
dfset <- df
dfset$y <-
unlist(lapply(lapply(df$y, function(x)
rep(x, 10)), paste0, collapse = ''))
# 'template' plot
gg_set <- ggplot(dfset, aes(y = y, x = x, fill = fill)) +
geom_tile()
require(cowplot)
# Chunk 1
gg1 <- ggplot(df, aes(y = y, x = x, fill = fill)) +
geom_tile()
ggs <- align_plots(gg_set, gg1, align = "v")
# Only extracting relevant graph.
ggdraw(ggs[[2]])
# Chunk 2
gg2 <- ggplot(df2, aes(y = y, x = x, fill = fill)) +
geom_tile()
ggs <- align_plots(gg_set, gg2, align = "v")
ggdraw(ggs[[2]])
Created on 2021-12-17 by the reprex package (v2.0.1)
Untidy former solution
I've previously used an admittedly messy solution, which really just involves padding all labels with blank rows above and below to greater than the max length:
require(ggplot2)
#> Loading required package: ggplot2
df <- expand.grid(y = LETTERS,
x = paste0('A', 1:10),
stringsAsFactors = FALSE)
set.seed(42)
df$fill = rnorm(nrow(df))
df2 <- df
df2$y <-
unlist(lapply(lapply(df2$y, function(x)
rep(x, 10)), paste0, collapse = ''))
# pad each label with a 40-space line above and below
df$y <-
paste0(paste0(rep(" ", 40), collapse = ""), "\n", df$y, "\n", paste0(rep(" ", 40), collapse = ""))
df2$y <-
paste0(paste0(rep(" ", 40), collapse = ""), "\n", df2$y, "\n", paste0(rep(" ", 40), collapse = ""))
gg1 <- ggplot(df, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg1
gg2 <- ggplot(df2, aes(y = y, x = x, fill = fill)) +
geom_tile()
gg2
I would hope there is a more formal solution which allows a static panel sizing, and I look forward to hearing other answers. But I had used this as a quick fix!
Created on 2021-12-17 by the reprex package (v2.0.1)

The patchwork package also includes the function align_patches(), which works similarly to cowplot::align_plots().
gg_l = patchwork::align_patches(gg1,
gg2)
Plot-Chunk1:
gg_l[[1]]
Plot-Chunk2:
gg_l[[2]]
Data from question.

Related

Retaining editable text in exported vector formats in R/ggplot

It seems like there has to be a way to retain editable text when exporting graphics from R into a vector format (e.g., pdf, eps, svg), but I have not been able to locate it. My graphics are primarily created using ggplot2, and I am running R via RStudio on Windows.
# generate a plot
library(ggplot2)
set.seed(1)
df <- data.frame(
gp = factor(rep(letters[1:3], each = 10)),
y = rnorm(30)
)
ds <- do.call(rbind, lapply(split(df, df$gp), function(d) {
data.frame(mean = mean(d$y), sd = sd(d$y), gp = d$gp)
}))
ggplot(df, aes(gp, y)) +
geom_point() +
geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
#export
ggsave("plot.eps")
ggsave("plot.pdf")
ggsave("plot_cairo.pdf", device=cairo_pdf)
ggsave("plot.svg")
All of these options generate a vector file with text (axis labels, etc) converted to outlines, which are no longer editable as text - which defeats a major point of the vector format, at least for my use case.
OK, so in typical use cases the svglite package retains text (see the export of plot 1 below). But if you put two plots together using the patchwork package, the text is converted to outlines and is no longer editable.
library(ggplot2)
set.seed(1)
df <- data.frame(
gp = factor(rep(letters[1:3], each = 10)),
y = rnorm(30)
)
ds <- do.call(rbind, lapply(split(df, df$gp), function(d) {
data.frame(mean = mean(d$y), sd = sd(d$y), gp = d$gp)
}))
p1<-ggplot(df, aes(gp, y)) +
geom_point() +
geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
p2<-ggplot(df, aes(gp, y)) +
geom_point() +
geom_point(data = ds, aes(y = mean), colour = 'green', size = 3)
library(patchwork)
library(svglite)  # provides the svglite device passed to ggsave() below
p3 <- p1|p2
ggsave(plot = p1, "p1.svg", device = svglite)
ggsave(plot = p3, "p3.svg", device = svglite)
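One way to check whether a given export kept its text editable is to look for <text> elements in the SVG source. The following is only a minimal sketch, assuming the svglite package is installed; the helper name has_svg_text is hypothetical, and a base-graphics plot stands in for the ggplot objects above.

```r
library(svglite)

# Hypothetical helper: TRUE if the SVG file contains at least one <text>
# element, i.e. text was written as text rather than as outlined paths.
has_svg_text <- function(path) {
  any(grepl("<text", readLines(path, warn = FALSE), fixed = TRUE))
}

# Write a simple plot to a temporary SVG file with the svglite device
svg_file <- tempfile(fileext = ".svg")
svglite(svg_file, width = 4, height = 3)
plot(1:3, main = "Editable title")
dev.off()

has_svg_text(svg_file)
```

The same helper could be pointed at p1.svg and p3.svg from the example above to compare the two exports.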

R: Coloring Individual Bars in Barplots

I am working with the R programming language.
I simulated the following data:
set.seed(123)
myFun <- function(n = 5000) {
a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
}
name = myFun(400)
variable = rnorm(400, 50,10)
part1 = data.frame(name,variable)
name = myFun(10)
variable = rnorm(10, 130,10)
part2 = data.frame(name,variable)
final = rbind(part1, part2)
final$name = substr(final$name,1,3)
Then, I made a barplot of this data:
library(ggplot2)
p<-ggplot(data=final, aes(x=name, y=variable)) +
geom_bar(stat="identity") + ggtitle(" Title of Barplot")
Is there a way I can take the "names" with the ten largest values of "variable" and place them as labels on top of the corresponding bars?
I saw this link over here that shows how to do this for all bars (https://www.geeksforgeeks.org/how-to-add-labels-over-each-bar-in-barplot-in-r/) - but is there a way to do this only for the 10 largest bars?
Thanks!
Note: Is there some way to better "center" the labels and prevent them from overlapping?
Another option is geom_text_repel from the ggrepel package, which has a lot of options for positioning your labels. Here is a reproducible example:
set.seed(123)
myFun <- function(n = 5000) {
a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
}
name = myFun(400)
variable = rnorm(400, 50,10)
part1 = data.frame(name,variable)
name = myFun(10)
variable = rnorm(10, 130,10)
part2 = data.frame(name,variable)
final = rbind(part1, part2)
library(dplyr)
# Create subset dataframe
final_10 <- final %>%
arrange(desc(variable)) %>%
slice(1:10)
library(ggplot2)
library(ggrepel)
ggplot(data=final, aes(x=name, y=variable)) +
geom_bar(stat="identity") +
geom_text_repel(data = final_10, aes(x = name, y = variable, label = variable), size = 2, segment.color = "grey50") +
ggtitle(" Title of Barplot")
Created on 2022-08-24 with reprex v2.0.2
Here you go: just create a new variable that holds the name for the top 10 and NA otherwise.
# Setup
library(tidyverse)
# Create a label column
final <-
final %>%
arrange(desc(variable)) %>%
mutate(label = ifelse(row_number() <= 10, name, NA))
# Add geom_text() for label
p <-
ggplot(
data = final,
aes(
x = name,
y = variable,
label = label)) +
geom_bar(stat="identity") +
geom_text() +
ggtitle(" Title of Barplot")
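The label column can also be built without reordering the rows, by ranking the values. This is a minimal base-R sketch, assuming a small stand-in data frame rather than the full simulated data from the question:

```r
# Small stand-in for the `final` data frame from the question
set.seed(123)
final <- data.frame(
  name = sprintf("N%03d", 1:50),
  variable = c(rnorm(40, 50, 10), rnorm(10, 130, 10)),
  stringsAsFactors = FALSE
)

# rank(-x) gives rank 1 to the largest value; keep the name only for the
# ten largest values and use NA everywhere else
final$label <- ifelse(
  rank(-final$variable, ties.method = "first") <= 10,
  final$name,
  NA_character_
)

sum(!is.na(final$label))  # number of labelled rows
```

With this column in place, geom_text(aes(label = label), vjust = -0.5, na.rm = TRUE) should place each label just above its y value and drop the unlabelled rows silently.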

Using assign within ggplot loop gives incorrect plots

I'm creating three plots in a loop over i and using assign to save each plot. The y variable is scaled by the loop index. The scaling should ensure that each plot in the final panel has y going from 0 to 1. This isn't happening, and the plots seem to change as the loop runs. I'd be grateful if someone could explain this apparently odd behaviour.
library(dplyr)
library(ggplot2)
library(gridExtra)
loci = c(1,2,3)
x <- seq(0,1,0.01)
df <- expand.grid(x = x, loci = loci)
df <- df %>% mutate(y = loci * x)
cols = c("red", "blue", "green")
for (i in loci){
plot_this <- df %>% filter(loci == i)
my_plot = ggplot(plot_this) +
geom_point( aes( x = x, y = y/i), colour = cols[i]) +
ylim(0,3) + ggtitle(paste0("i = ", i))
assign(paste0("plot_", i), my_plot)
print(plot_1)
}
grid.arrange(plot_1, plot_2, plot_3, ncol = 3)
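This is due to ggplot2's lazy evaluation: the expression y/i inside aes() is only evaluated when a plot is drawn, so each plot uses whatever value i has at that moment rather than the value it had when the plot was created. More explanation can be found in this post.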
"Looping" with lapply avoids the problem.
Data
library(ggplot2)
library(gridExtra)
library(dplyr)
loci = c(1,2,3)
x <- seq(0,1,0.01)
df <- expand.grid(x = x, loci = loci)
df <- df %>% mutate(y = loci * x)
cols = c("red", "blue", "green")
Code
my_plot <- lapply(loci, function(i) {
df %>%
filter(loci == i) %>%
ggplot() +
geom_point(aes(x = x, y = y/i), colour = cols[i]) +
ylim(0,3) +
ggtitle(paste0("i = ", i))
})
grid.arrange(my_plot[[1]], my_plot[[2]], my_plot[[3]], ncol = 3)
Created on 2022-04-26 by the reprex package (v2.0.1)
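If the for loop has to stay, another way around the lazy evaluation is to inline the current value of i into the aesthetic. The following is a sketch rather than a definitive fix, assuming ggplot2 3.0.0 or later, where aes() supports the !! operator; the names plots and max_y are only illustrative.

```r
library(ggplot2)

loci <- c(1, 2, 3)
df <- expand.grid(x = seq(0, 1, 0.01), loci = loci)
df$y <- df$loci * df$x

plots <- list()
for (i in loci) {
  plot_this <- df[df$loci == i, ]
  # !!i inserts the current value of i into the expression when aes() is
  # called, so nothing is left to look up after the loop has moved on
  plots[[i]] <- ggplot(plot_this) +
    geom_point(aes(x = x, y = y / !!i)) +
    ylim(0, 3) +
    ggtitle(paste0("i = ", i))
}

# Largest plotted y value in each plot once it is actually built
max_y <- sapply(plots, function(p) max(ggplot_build(p)$data[[1]]$y))
max_y
```

Storing the plots in a list also avoids assign() and lets gridExtra::grid.arrange(grobs = plots, ncol = 3) take them directly.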

Applying ggplotly() on geom_bar with factors

I'm running into a problem with plotly right now. I'm translating ggplot2 graphs into interactive graphs. Here is some example code:
library(ggplot2)
library(plotly)
x <- sample(x = c(1,2), size = 100, replace = TRUE)
y <- sample(x = c(3,4), size = 100, replace = TRUE)
df <- data.frame(x, y)
df <- transform(df,
x = ifelse(x == 1,"cat","dog"),
y = ifelse(y == 3, "young","old")
)
p <- ggplot(df, aes(x= x, fill = y)) +
geom_bar(position = "stack") +
coord_flip() +
theme(legend.position="none")
ggplotly(p)
I would expect the ggplotly function to create an interactive version but it seems that it's not able to handle factor counts.
ggplot result:
plotly translation:
Any suggestions to solve the issue are welcome!

ggplot2: Splitting facet/strip text into two lines

Consider the following ggplot2 graph with long facet/strip text
broken in two lines.
The text goes outside the area devoted to facet titles.
library(ggplot2)
x <- c(1:3, 1:3)
y <- c(3:1, 1:3)
grp <- c(0, 0, 0, 1, 1, 1)
p <- qplot(x=x, y=y) + geom_line() + facet_wrap(~ grp)
grob <- ggplotGrob(p)
strip.elem.y <- grid.ls(getGrob(grob, "strip.text.x",
grep=TRUE, global=TRUE))$name
grob <- geditGrob(grob, strip.elem.y[1],
label="First line and\n second line" )
grid.draw(grob)
Is there a way to increase the height of the strip text area?
ggplot2 supports a built-in way of doing this using label_wrap_gen:
x <- c(1:3, 1:3)
y <- c(3:1, 1:3)
grp = c(rep("group 1 with a long name",3),rep("group 2 with a long name",3))
d = data.frame(x = x, y =y, grp = grp)
ggplot(d, aes(x=x,y=y)) + geom_line() + facet_wrap(~ grp, labeller = label_wrap_gen(width=10))
You can use a 2-line label:
grp <- c(rep("foo\nbar",3), 1, 1, 1)
qplot(x=x, y=y) + geom_line() + facet_wrap(~ grp)
I tried this in a variety of ways but was frustrated trying to get paste(strwrap(text, width=40), collapse=" \n") to give me a result for each single row of data, rather than concatenating every bit of text from the entire list.
I came up with a solution that worked best for me. I wrote a function like the one below. Given a data frame data with a column text:
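library(plyr)  # provides llply()
# Wrap a single string at 40 characters, joining the pieces with line breaks
wrapit <- function(text) {
  wtext <- paste(strwrap(text, width = 40), collapse = " \n ")
  return(wtext)
}
# Apply the wrapper to each row separately, then flatten the list
data$wrapped_text <- llply(data$text, wrapit)
data$wrapped_text <- unlist(data$wrapped_text)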
After I called this function, I just applied my labeller function to the wrapped_text column instead of the text column.
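The same per-row wrapping can be done without plyr. This is a minimal base-R sketch; the function name wrap_each and the example labels are hypothetical, not part of the original answer.

```r
# Wrap each element of a character vector separately. Calling strwrap() on
# the whole vector returns one flat vector of pieces, so it is applied per
# element and the pieces are joined back with line breaks.
wrap_each <- function(text, width = 40) {
  vapply(
    text,
    function(s) paste(strwrap(s, width = width), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
}

labels <- c(
  "a short label",
  "a considerably longer label that will not fit on a single line of forty characters"
)
wrapped <- wrap_each(labels)
length(wrapped)  # one wrapped string per input element
```

Because vapply() already returns a character vector, no separate unlist() step is needed before assigning the result to a data frame column.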
Expanding on the useful example from @groceryheist, we can use the argument multi_line = TRUE with label_wrap_gen() to get the desired effect without having to specify a width ourselves (the default, width = 25, is then used).
library(ggplot2)
x = c(1:3, 1:3)
y = c(3:1, 1:3)
grp = c(rep("group 1 with a very very very long name",3),
rep("group 2 with an even longer name",3))
df = data.frame(x = x, y =y, grp = grp)
ggplot(df, aes(x,y)) +
geom_line() +
facet_wrap(~ grp,
labeller = label_wrap_gen(multi_line = TRUE))
Ref: https://ggplot2.tidyverse.org/reference/labellers.html
