Line break in R plot legend - r

I create many plots in R with input data from another script, where each value is held in a separate variable. I put the variables in a string and force line breaks with \n. This works as intended, but the legend text is not justified at all; xjust and yjust do not seem to do anything. Also, when I place the legend in e.g. bottomright, it stretches beyond the margin of the plot. Any idea how I can place my legend properly justified in the corner of a plot?
Here is a reproducible code snippet:
plot(c(0,3), c(0,3), type="n", xlab="x", ylab="y")
a <- 2.3456
b <- 3.4567
c <- 4.5678
d <- 5.6789
Corner_text <- function(text, location = "bottomright"){
legend(location, legend = text, bty = "o", pch = NA, cex = 0.5, xjust = 0)
}
Corner_text(sprintf("a = %3.2f m\n b = %3.2f N/m\UB2\n c = %3.2f deg\n d = %3.2f perc", a, b, c, d))

legend is usually used to explain what the points or lines (and their colors) represent. Therefore, inside the legend box (bty) there is space reserved for the lines/points. This probably explains why your text does not look left-justified. You also have a problem with the space after each line break (\n): a space placed after a line break becomes the first character of the next line, so the text does not appear justified.
In your example, you don't have lines or points to explain, so I would use text rather than legend.
To find where "bottomright" is on your axes, you can use the graphical parameters par("xaxp") and par("yaxp"), which give the values of the first and last ticks and the number of intervals between ticks on each axis.
On the x-axis, starting from the last tick, you need to shift left to leave room for the widest line.
In R code, this gives:
# your plot
plot(c(0,3), c(0,3), type="n", xlab="x", ylab="y")
# your string (without the extra spaces)
text_to_put <- sprintf("a = %3.2f m\nb = %3.2f N/m\UB2\nc = %3.2f deg\nd = %3.2f perc", a, b, c, d)
# the width of widest line
max_str <- max(strwidth(strsplit(text_to_put, "\n")[[1]]))
# put the text
text(x=par("xaxp")[2]-max_str, y=par("yaxp")[1], labels=text_to_put, adj=c(0, 0))
# if really you need the box (here par("usr") is used to know the extreme values on both axes)
x_offset <- par("xaxp")[1]-par("usr")[1]
y_offset <- par("yaxp")[1]-par("usr")[3]
box_left <- par("xaxp")[2] - max_str - x_offset
box_top <- par("yaxp")[1] + strheight(text_to_put) + y_offset
# first segment: left edge of the box; second segment: top edge of the box
segments(x0 = c(box_left, box_left), y0 = c(par("usr")[3], box_top),
         x1 = c(box_left, par("usr")[2]), y1 = c(box_top, box_top))
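The same idea can be wrapped in a small helper. This is only a sketch: corner_text and its pad argument are hypothetical names, it anchors the text to the plot region edge (par("usr")) instead of the last tick, and it assumes linear (non-log) axes.

```r
# Hypothetical helper: draw multi-line text, left-justified, in the bottom-right corner.
corner_text <- function(txt, pad = 0.02) {
  usr <- par("usr")                               # c(x1, x2, y1, y2) of the plot region
  txt_lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  width <- max(strwidth(txt_lines))               # widest line, in user coordinates
  x <- usr[2] - width - pad * (usr[2] - usr[1])   # shift left by width plus padding
  y <- usr[3] + pad * (usr[4] - usr[3])           # small padding above the bottom edge
  text(x, y, labels = txt, adj = c(0, 0))         # adj = c(0, 0): left/bottom aligned
  invisible(c(x = x, y = y))
}

pdf(NULL)  # off-screen device so the sketch also runs non-interactively
plot(c(0, 3), c(0, 3), type = "n", xlab = "x", ylab = "y")
pos <- corner_text(sprintf("a = %3.2f m\nb = %3.2f deg", 2.3456, 3.4567))
dev.off()
```

Because the widest line determines the shift, every line starts at the same x position and the block never crosses the right edge of the plot region.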

An example of how to do this with ggplot2, where the legend is created automatically when you map a variable in aes:
library(ggplot2)
units <- c('m', 'N/m\UB2', 'deg', 'perc')
p <- ggplot() + geom_hline(aes(yintercept = 1:4, color = letters[1:4])) + #simple example
scale_color_discrete(name = 'legend title',
breaks = letters[1:4],
labels = paste(letters[1:4], '=', c(a, b, c, d), units))
Or inside the plot:
p + theme(legend.position = c(1, 0), legend.justification = c(1, 0))
Or closer to your aesthetic:
p + guides(col = guide_legend(keywidth = 0, override.aes = list(alpha = 0))) +
theme_bw() +
theme(legend.position = c(1, 0), legend.justification = c(1, 0),
legend.background = element_rect(colour = 'black'))

Related

How to arrange `ggplot2` objects side-by-side and ensure equal plotting areas?

I am trying to arrange two ggplot2 plots side by side, i.e., in a two-column
layout using the package gridExtra. I am interested in ensuring that both
plots have equal plotting area (i.e., the gray plot panel is the same for both
plots) regardless of the height of the x-axis labels. As you can see in the
example below, when longer x-axis labels are used, gridExtra::grid.arrange()
seems to compensate this by adjusting the plotting area (i.e., the grayed out
part of the plot).
# Dummy data.
data <- data.frame(x = 1:10, y = rnorm(10))
# Dummy labels.
x_labels_short <- 1:10
x_labels_long <- 100001:100010
# Common settings for both `ggplot2` plots.
layers <- list(
labs(
x = "Plot title"
),
theme(
axis.text.x = element_text(
angle = 90,
vjust = 0.5,
hjust = 1
)
)
)
# `ggplot2` plot (a).
plot_a <- ggplot(data, aes(x, y)) +
scale_x_continuous(breaks = 1:10, labels = x_labels_short) +
layers
# `ggplot2` plot (b).
plot_b <- ggplot(data, aes(x, y)) +
scale_x_continuous(breaks = 1:10, labels = x_labels_long) +
layers
# Showing the plots side by side.
gridExtra::grid.arrange(
plot_a,
plot_b,
ncol = 2
)
What I want is for both plots to (1) have equal plotting areas and (2) have the x-axis
title of plot_a aligned with that of plot_b (i.e., the x-axis title of
plot_a should be offset based on the length of the x-axis labels of plot_b).
If this is not clear, this is what I want to achieve, shown with base
R.
# Wrapper for convenience.
plot_gen <- function(data, labels) {
plot(
NULL,
xlim = c(1, 10),
ylim = c(min(data$y), max(data$y)),
xlab = "",
ylab = "y",
xaxt = "n"
)
axis(
side = 1,
at = 1:10,
labels = labels,
las = 2
)
title(
xlab = "Plot title",
line = 4.5
)
}
# Get `par`.
old_par = par(no.readonly = TRUE)
# Set the two-column layout.
layout(matrix(1:2, ncol = 2))
# Adjust margins.
par(mar = old_par$mar + c(1.5, 0, 0, 0))
# Plot base plot one.
plot_gen(data, x_labels_short)
# Plot base plot two.
plot_gen(data, x_labels_long)
# Restore `par`.
par(old_par)
# Restore layout.
layout(1:1)
Quick mention. I found a similar question on SO (i.e.,
How to specify the size of a graph in ggplot2 independent of axis labels), however I fail to see how the
answers address the problem. Also, the plots I am trying to arrange are based
on different data and I don't think I can use a facet_wrap approach.
One suggestion: the patchwork package.
library(patchwork)
plot_a + plot_b
It also works for more complex layouts, e.g.:
(plot_a | plot_b) / plot_a
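If adding patchwork is not an option, a commonly used alternative is to convert both plots to gtables and give them the same row heights before arranging them. The sketch below rebuilds the two plots from the question (with a geom_point layer added so something is drawn); make_plot is a hypothetical helper, and the approach assumes both plots share the same theme so their gtables have the same number of rows.

```r
library(ggplot2)
library(grid)
library(gridExtra)

set.seed(1)
data <- data.frame(x = 1:10, y = rnorm(10))

# Hypothetical helper that builds one plot for a given set of x-axis labels.
make_plot <- function(labels) {
  ggplot(data, aes(x, y)) +
    geom_point() +
    scale_x_continuous(breaks = 1:10, labels = labels) +
    labs(x = "Plot title") +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
}

g_a <- ggplotGrob(make_plot(1:10))
g_b <- ggplotGrob(make_plot(100001:100010))

# Use the larger height for every gtable row (axis text, axis title, ...),
# so the panels and the x-axis titles line up across both plots.
max_heights <- unit.pmax(g_a$heights, g_b$heights)
g_a$heights <- max_heights
g_b$heights <- max_heights

pdf(NULL)  # off-screen device; drop this line to draw on screen
grid.arrange(g_a, g_b, ncol = 2)
dev.off()
```

For plots stacked vertically, the same trick applies to the widths component instead of heights.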

Add axes to grid of ggplots

I have a grid composed of several ggplots and want to add an x-axis, where axis ticks and annotations are placed between the plots. I could not come up with a better solution than creating a custom plot for the axis and adding it below with arrangeGrob. But the labels do not align with the plots (I drew arrows where the numbers should be). There is also a large white space below, which I don't want.
I will also need an analogue for the y-axis.
library(ggplot2)
library(gridExtra)
library(ggpubr)
library(grid)
# Create a grid with several ggplots
p <-
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
theme_transparent() +
theme(plot.background = element_rect(color = "black"))
main.plot <- arrangeGrob(p, p, p, p, p, p, p, p, ncol = 4, nrow = 2)
# grid.draw(main.plot)
# Now add an x axis to the main plot
x.breaks <- c(0, 1, 2.5, 8, 10)
p.axis <- ggplot() +
ylim(-0.1, 0) +
xlim(1, length(x.breaks)) +
ggpubr::theme_transparent()
for (i in seq_along(x.breaks)) {
p.axis <- p.axis +
geom_text(aes_(x = i, y = -0.01, label = as.character(x.breaks[i])), color = "red")
}
# p.axis
final.plot <- arrangeGrob(main.plot, p.axis, nrow = 2)
grid.draw(final.plot)
Any help appreciated.
Note: In the code below, I assume each plot in your grid has equal width / height, & used equally spaced label positions. If that's not the case, you'll have to adjust the positions yourself.
Adding x-axis to main.plot:
library(gtable)
# create additional row below main plot
# height may vary, depending on your actual plot dimensions
main.plot.x <- gtable_add_rows(main.plot, heights = unit(20, "points"))
# optional: check results to verify position of the new row
# grid.newpage() clears the page; unlike dev.off(), it does not fail when no device is open
grid.newpage(); gtable_show_layout(main.plot.x)
# create x-axis labels as a text grob
x.axis.grob <- textGrob(label = x.breaks,
x = unit(seq(0, 1, length.out = length(x.breaks)), "npc"),
y = unit(0.75, "npc"),
just = "top")
# insert text grob
main.plot.x <- gtable_add_grob(main.plot.x,
x.axis.grob,
t = nrow(main.plot.x),
l = 1,
r = ncol(main.plot.x),
clip = "off")
# check results
grid.newpage(); grid.draw(main.plot.x)
You can do the same for the y-axis:
# create additional col
main.plot.xy <- gtable_add_cols(main.plot.x, widths = unit(20, "points"), pos = 0)
# create y-axis labels as a text grob
y.breaks <- c("a", "b", "c") # placeholder, since this wasn't specified in the question
y.axis.grob <- textGrob(label = y.breaks,
x = unit(0.75, "npc"),
y = unit(seq(0, 1, length.out = length(y.breaks)), "npc"),
just = "right")
# add text grob into main plot's gtable
main.plot.xy <- gtable_add_grob(main.plot.xy,
y.axis.grob,
t = 1,
l = 1,
b = nrow(main.plot.xy) - 1,
clip = "off")
# check results
grid.newpage(); grid.draw(main.plot.xy)
(Note that the above order of x-axis followed by y-axis should not be switched blindly. If you are adding rows / columns, it's a good habit to use gtable_show_layout() frequently to check the latest gtable object dimensions, & ensure that you are inserting new grobs into the right cells.)
Finally, let's add some buffer on all sides, so that the labels & plot borders don't get cut off:
final.plot <- gtable_add_padding(main.plot.xy,
padding = unit(20, "points"))
grid.newpage(); grid.draw(final.plot)

Colours across Plots / Heatmaps in R

I am creating a number of heatmaps in R, but I am having problems when it comes to keeping the colour scale consistent across graphs.
I find that the colours are scaled within each graph. Is there a way to make colours consistent across graphs, i.e. so that the colour difference between a value of 0.4 and 0.5 is always the same?
Code Example:
set.seed(123)
d1 = matrix(rnorm(9, mean = 0.2, sd = 0.1), ncol = 3)
d2 = matrix(rnorm(9, mean = 0.8, sd = 0.1), ncol = 3)
mat = list(d1, d2)
for(m in mat)
heatmap(m, Rowv = NA ,Colv = NA)
You'll note in the example that cell (2,3) in the first graph looks similar to cell (1,3) in the second, despite the values differing by ~0.8.
Here's a way to do it with ggplot2, if you're open to not using base graphics:
library(reshape2)
library(ggplot2)
# Set common limits for color scale
limits = range(unlist(mat))
Here's the code for two separate graphs. The last line of code for each graph ensures that they use the same z limits for setting the colors:
ggplot(melt(mat[[1]]), aes(Var1, Var2, fill=value)) +
geom_tile() +
scale_fill_continuous(limits=limits)
ggplot(melt(mat[[2]]), aes(Var1, Var2, fill=value)) +
geom_tile() +
scale_fill_continuous(limits=limits)
Another option is to plot both heatmaps in a single graph using facetting, which automatically ensures both graphs are on the same color scale:
ggplot(melt(mat), aes(Var1, Var2, fill=value)) +
geom_tile() +
facet_grid(. ~ L1)
I've used the default colors here, but for either approach you can set the color scale to be anything you wish. For example:
ggplot(melt(mat), aes(Var1, Var2, fill=value)) +
geom_tile() +
facet_grid(. ~ L1) +
scale_fill_gradient(low="red", high="green")
You could use the image function directly (heatmap uses image), though it will require some extra formatting to match the output of heatmap. You can use zlim to set the color range. Quoting from the ?image page:
the minimum and maximum z values for which colors should be plotted,
defaulting to the range of the finite values of z. Each of the given
colors will be used to color an equispaced interval of this range. The
midpoints of the intervals cover the range, so that values just
outside the range will be plotted.
# define zlim min and max for all the plots
minz = Reduce(min, mat)
maxz = Reduce(max, mat)
for(m in mat) {
image( m, zlim = c(minz, maxz), col = heat.colors(20))
}
To get closer to the formatting produced by heatmap, you can just reuse some code from the heatmap function:
for(m in mat) {
  nc = ncol(m)
  nr = nrow(m)
  # image() expects z with dim c(length(x), length(y)), so transpose m as heatmap does
  image(seq_len(nc), seq_len(nr), t(m), zlim = c(minz, maxz),
        col = heat.colors(20), axes = FALSE, xlab = "", ylab = "",
        xlim = 0.5 + c(0, nc), ylim = 0.5 + c(0, nr))
  axis(1, seq_len(nc), labels = seq_len(nc), las = 2, line = -0.5, tick = 0)
  axis(4, seq_len(nr), labels = seq_len(nr), las = 2, line = -0.5, tick = 0)
}
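It may also be possible to stay with heatmap itself. As far as I can tell from ?heatmap, the default is scale = "row", which rescales each row before colouring, and extra arguments are passed on to image. Under those assumptions, a minimal sketch that turns scaling off and shares one zlim looks like this:

```r
set.seed(123)
d1 <- matrix(rnorm(9, mean = 0.2, sd = 0.1), ncol = 3)
d2 <- matrix(rnorm(9, mean = 0.8, sd = 0.1), ncol = 3)
mat <- list(d1, d2)

# One colour range shared by every heatmap.
zlim <- range(unlist(mat))

pdf(NULL)  # off-screen device; drop this line to draw on screen
for (m in mat) {
  # scale = "none" keeps the raw values; zlim is forwarded to image() via '...'
  heatmap(m, Rowv = NA, Colv = NA, scale = "none", zlim = zlim)
}
dev.off()
```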
Using the breaks argument to image is another option. It allows more flexibility than zlim in setting the breakpoints for colors. Quoting from the help page, breaks is
a set of finite numeric breakpoints for the colours: must have one
more breakpoint than colour and be in increasing order. Unsorted
vectors will be sorted, with a warning.
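A minimal sketch of the breaks approach, assuming 20 colours and equally spaced breakpoints over the pooled range of all matrices (n_col, brks and cols are names chosen here for illustration):

```r
set.seed(123)
d1 <- matrix(rnorm(9, mean = 0.2, sd = 0.1), ncol = 3)
d2 <- matrix(rnorm(9, mean = 0.8, sd = 0.1), ncol = 3)
mat <- list(d1, d2)

n_col <- 20
rng <- range(unlist(mat))
# One more breakpoint than colours, in increasing order, as ?image requires.
brks <- seq(rng[1], rng[2], length.out = n_col + 1)
cols <- heat.colors(n_col)

pdf(NULL)  # off-screen device; drop this line to draw on screen
op <- par(mfrow = c(1, 2))
for (m in mat) image(m, breaks = brks, col = cols)
par(op)
dev.off()
```

Because brks and cols are computed once and reused, a given value maps to the same colour in every plot. Unequally spaced breakpoints can be used in the same way when more resolution is needed in part of the range.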

grid.arrange: manipulating labels, legends, spacing in parallel coordinate plots

I wrote a function that will plot a user-specified number of parallel coordinate subplots, all in one big plot with one column:
library(gridExtra)
library(GGally)
plotClusterPar = function(cNum){
plot_i = vector("list", length=cNum)
for (i in 1:cNum){
x = data.frame(a=runif(100,0,1),b=runif(100,0,1),c=runif(100,0,1),d=runif(100,0,1))
plot_i[[i]] = ggparcoord(x, columns=1:4, scale="globalminmax", alphaLines = 0.9)+ylab("Count")
}
p = do.call("grid.arrange", c(plot_i, ncol=1))
}
The user will call (to create 3 subplots):
plotClusterPar(3)
There are four things I am trying to do, and when I try them, I get errors, so I left it at its bare working syntax! Here is what I aim to do:
1) I desire to have one y-axis label "Count", rather than an individual one for each subplot.
2) I do not wish to have any x-axis label. As default (currently), there is the label "variable" indicated under each subplot. If extra space (in between subplots) is created after removing an x-axis label, then I would like to erase that newly-created horizontal space (in between subplots).
3) I hope to color all lines in each subplot the same color. For instance, the top subplot would have all red lines, the next subplot would have all blue lines, etc. I do not mind what the colors are!
4) I strive to have a color legend at the bottom of all the subplots. (Similar to the first answer here: Universal x axis label and legend at bottom using grid.arrange), but of course the number of colors is just equal to the number of subplots.
EDIT:
I have tried to change the color, using things like:
plot_i[[i]] = ggparcoord(x, columns=1:4, scale="globalminmax", alphaLines = 0.9, colour=i)+ylab("Count")
Or hardcoding, which would not even work, because I have a loop. But this still does not work:
plot_i[[i]] = ggparcoord(x, columns=1:4, scale="globalminmax", alphaLines = 0.9, colour="red")+ylab("Count")
I tried adding colour as a layer, but that does not work:
plot_i[[i]] = ggparcoord(x, columns=1:4, scale="globalminmax", alphaLines = 0.9)+ylab("Count")+colour("red")
I also tried to give a common plot title and y-axis:
plotClusterPar = function(cNum){
plot_i = vector("list", length=cNum)
for (i in 1:cNum){
x = data.frame(a=runif(100,0,1),b=runif(100,0,1),c=runif(100,0,1),d=runif(100,0,1))
plot_i[[i]] = ggparcoord(x, columns=1:4, scale="globalminmax", alphaLines = 0.9)
}
p = do.call("grid.arrange", c(plot_i, ncol=1, main = textGrob("Main Title", vjust = 1, gp = gpar(fontface = "bold", cex = 1.5)), left = textGrob("Global Y-axis Label", rot = 90, vjust = 1)))
}
But this led to an error:
Error in arrangeGrob(..., as.table = as.table, clip = clip, main = main, :
input must be grobs!
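One likely cause of this error: in gridExtra 2.0 and later, the title argument is called top rather than main, so main = textGrob(...) is no longer recognised. The sketch below is not the original function. It swaps ggparcoord for plain ggplot2 lines on made-up data (plot_cluster_sketch is a hypothetical name) to show a shared title, a single y-axis label, no x-axis labels, and one colour per subplot. The shared legend is not covered.

```r
library(ggplot2)
library(grid)
library(gridExtra)

# Hypothetical stand-in for plotClusterPar(), using geom_line instead of ggparcoord.
plot_cluster_sketch <- function(n) {
  cols <- grDevices::rainbow(n)                 # one colour per subplot
  plots <- lapply(seq_len(n), function(i) {
    d <- data.frame(id = rep(1:20, each = 4),
                    variable = rep(c("a", "b", "c", "d"), times = 20),
                    value = runif(80))
    ggplot(d, aes(variable, value, group = id)) +
      geom_line(colour = cols[i], alpha = 0.9) +
      labs(x = NULL, y = NULL)                  # drop per-plot axis titles
  })
  # 'top' and 'left' take grobs; 'grobs' takes the list of plots.
  arrangeGrob(grobs = plots, ncol = 1,
              top = textGrob("Main Title", gp = gpar(fontface = "bold", cex = 1.5)),
              left = textGrob("Count", rot = 90))
}

pdf(NULL)  # off-screen device; drop this line to draw on screen
g <- plot_cluster_sketch(3)
grid.draw(g)
dev.off()
```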

How to manipulate y-axis text labels in R varImpPlot?

The following sample resembles my dataset:
require(randomForest)
alpha = c(1,2,3,4,5,6)
bravo = c(2,3,4,5,6,7)
charlie = c(2,6,5,3,5,6)
mydata = data.frame(alpha,bravo,charlie)
myrf = randomForest(alpha~bravo+charlie, data = mydata, importance = TRUE)
varImpPlot(myrf, type = 2)
I cannot seem to control the placement of the y-axis labels in varImpPlot. I have tried altering the plot parameters (e.g. mar, oma), with no success. I need the y-axis labels shifted to the left in order to produce a PDF with proper label placement.
How can I shift the y-axis labels to the left?
I tried to use the adj parameter, but that did not work. Since varImpPlot uses dotchart behind the scenes, here is a version using lattice's dotplot instead. You can then customize your axes using the scales parameter.
library(lattice)
imp <- importance(myrf, class = NULL, scale = TRUE, type = 2)
dotplot(imp, scales = list(y = list(cex = 2,
                                    at = c(1, 2),
                                    col = 'red',
                                    rot = 20,
                                    axs = 'i'),
                           x = list(cex = 2, col = 'blue')))
You can extract the data needed to construct the plot from myrf and build the plot with ggplot. This gives you more freedom in tweaking the plot. Here are some examples:
library(ggplot2)
str(myrf)
str(myrf$importance)
data <- as.data.frame(cbind(rownames(myrf$importance),round(myrf$importance[,"IncNodePurity"],1)))
colnames(data) <- c("Parameters","IncNodePurity")
data$IncNodePurity <- as.numeric(as.character(data$IncNodePurity))
Standard plot:
(p <- ggplot(data) + geom_point(aes(IncNodePurity,Parameters)))
Rotate y-axis labels:
(p1 <- p+ theme(axis.text.y = element_text(angle = 90, hjust = 1)))
Some more tweaking (also first plot shown here):
(p2 <- p1 + scale_x_continuous(limits=c(3,7),breaks=3:7) + theme(axis.title.y = element_blank()))
Plot that looks like the varImpPlot (second plot shown here) :
(p3 <- p2+ theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.major.y = element_line(colour = 'gray', linetype = 'dashed'),
panel.background = element_rect(fill='white', colour='black')))
Saving to pdf is easy with ggplot:
ggsave("randomforestplot.pdf",p2)
or
ggsave("randomforestplot.png",p2)
p2
p3
Did I understand correctly that you want the texts charlie and bravo moved further to the left of the plot boundary? If so, here's one hack to achieve this, based on modifying the rownames used in plotting:
myrf = randomForest(alpha~bravo+charlie, data = mydata, importance = TRUE)
#add white spaces at the end of the rownames
rownames(myrf$importance)<-paste(rownames(myrf$importance), " ")
varImpPlot(myrf, type = 2)
The adj parameter in dotchart is hard-coded to 0 (left-justified), so it cannot be changed without modifying the code of dotchart:
mtext(labs, side = 2, line = loffset, at = y, adj = 0, col = color,
las = 2, cex = cex, ...)
(from dotchart)
EDIT:
You can make another type of hack also. Take the code of dotchart, change the above line to
mtext(labs, side = 2, line = loffset, at = y, adj = adjust_ylab, col = color,
las = 2, cex = cex, ...)
Then add an argument adjust_ylab to the argument list, and rename the function to, for example, dotchartHack. Now copy the code of varImpPlot, find the line that calls dotchart, change the function name to dotchartHack, and add the argument adjust_ylab=adjust_ylab to the function call. Rename the function to varImpPlotHack and add adjust_ylab to this function's argument list. Now you can change the alignment of charlie and bravo by changing the parameter adjust_ylab:
myrf = randomForest(alpha~bravo+charlie, data = mydata, importance = TRUE)
varImpPlotHack(myrf, type = 2,adjust_ylab=0.5)
From ?par:
The value of adj determines the way in which text strings are
justified in text, mtext and title. A value of 0 produces
left-justified text, 0.5 (the default) centered text and 1
right-justified text. (Any value in [0, 1] is allowed, and on most
devices values outside that interval will also work.)
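A lighter alternative to copying the function bodies is to draw the dot chart without labels and add them yourself with mtext, where adj can be chosen freely. This is only a sketch: the importance values below are made-up placeholders rather than randomForest output, and it assumes that an ungrouped dotchart places its rows at y = 1, 2, ..., n.

```r
# Placeholder importance values (made up; not output from randomForest).
imp <- c(charlie = 4.5, bravo = 12.3)

pdf(NULL)  # off-screen device; replace with pdf("file.pdf") to save the plot
op <- par(mar = c(5, 8, 2, 2))                 # widen the left margin for the labels
dotchart(unname(imp), xlab = "IncNodePurity")  # names dropped, so no built-in labels
y_pos <- seq_along(imp)                        # assumed row positions: 1, 2, ..., n
# adj = 1 right-justifies the labels; 'line' controls the distance from the axis.
mtext(names(imp), side = 2, at = y_pos, line = 1, las = 2, adj = 1)
par(op)
dev.off()
```

Changing line moves the labels further into the margin, and adj = 0 or 0.5 gives left-justified or centered labels.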

Resources