R: ggplot slight adjustment for clustering summary

Please check my reproducible example and the result chart.
X = t(USArrests)
plot_color_clust = function(X,N=N,
cols=c("red","blue", "orange", "darkgreen","green","yellow","grey","black","white")
){
library(ggplot2)
library(gridExtra)
library(gtable)
library(scales)
library(ggdendro)
library(grid)
library(plyr)
if(N>length(cols)) stop("N too big. Not enough colors in cols.")
if(N>ncol(X)) stop("N too big. Not enough columns in data.")
fit = ClustOfVar::hclustvar(X.quanti = X)
dd.row = as.dendrogram(fit)
ddata_x <- dendro_data(dd.row)
temp = cutree(fit,k=N)
lab <- ggdendro::label(ddata_x)
x=c()
for(i in 1:nrow(lab)){
x[i]= paste( "clust", as.vector(temp[ lab$label[i]==names(temp) ]) ,sep="")
}
lab$group <- x
p1 <- ggplot(segment(ddata_x)) +
geom_segment(aes(x=x, y=y, xend=xend, yend=yend))+coord_flip()+
geom_text(data=lab,
aes(label=label, x=x, y=0, colour=group),hjust=1) +
theme(legend.position="none",
axis.title.y=element_blank(),
axis.title.x=element_blank(),
axis.text.x = element_text(angle = 0, hjust = 0),
axis.title.x = element_text(angle = 0, hjust = 0))+
theme(axis.text = element_blank(), axis.title = element_blank(),
axis.ticks = element_blank(), axis.ticks.margin = unit(0, "lines"),
axis.ticks.length = unit(0, "cm"))+
scale_colour_manual(values=cols)+coord_flip()+
scale_y_continuous(limits = c(-0.1, 2.1))
df2<-data.frame(cluster=cutree(fit,N),states=factor(fit$labels,levels=fit$labels[fit$order]))
df3<-ddply(df2,.(cluster),summarise,pos=mean(as.numeric(states)))
p2 = ggplot(df2,aes(states,y=1,fill=factor(cluster)))+geom_tile()+
scale_y_continuous(expand=c(0,0))+
theme(axis.title=element_blank(),
axis.ticks=element_blank(),
axis.text=element_blank(),
legend.position="none")+coord_flip()+
geom_text(data=df3,aes(x=pos,label=cluster))+
scale_fill_manual(name = "This is my title", values = cols)
gp1<-ggplotGrob(p1)
gp2<-ggplotGrob(p2)
maxHeight = grid::unit.pmax(gp1$heights[2:5], gp2$heights[2:5])
gp1$heights[2:5] <- as.list(maxHeight)
gp2$heights[2:5] <- as.list(maxHeight)
#grid.arrange(gp2, gp1, ncol=2,widths=c(1/6,5/6))
R = arrangeGrob(gp2,gp1,ncol=2,widths=c(1/6,5/6))
R
}
plot_color_clust(X,6)
Questions:
1. These two parts (the colored tiles on the left and the clustering tree on the right) have inconsistent heights. How do we adjust their heights so that they match?
2. How can we make the tree on the right side shorter so that the state names (the clustered subjects) have more space to be fully displayed?
3. Is there a way to make the white space between those two parts smaller?
Your tweaking of the code is appreciated. Thanks.

One major change: rather than matching the heights of the two charts, I extract the plot panel from gp2, then insert it into column 2 of gp1. There are no margins surrounding the resulting gp2, which partly takes care of your point 3.
With respect to point 2: expand the limits of the axis to make room for the labels (see point 2 in the code below). The parameters for points 2 and 3 were set by trial and error; adjusting one means the other needs to be adjusted too.
With respect to point 1: expand the axis using the additive component of expand to add half a unit to each end of the axis (see point 1 in the code below).
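To see why half a unit is the right amount, a rough sketch of the arithmetic may help. It assumes that `expand = c(mult, add)` widens a continuous range by `mult` times its width plus `add` on each side, and that each tile on a discrete axis occupies half a unit either side of its integer position. The helper name `expand_range_sketch` is made up for illustration and is not part of ggplot2.

```r
# Hypothetical helper: mimic how expand = c(mult, add) widens a continuous range.
expand_range_sketch <- function(limits, mult = 0, add = 0) {
  width <- diff(limits)
  c(limits[1] - mult * width - add, limits[2] + mult * width + add)
}

n_leaves <- 50                 # e.g. the 50 states in USArrests
leaf_range <- c(1, n_leaves)   # dendrogram leaves sit at x = 1, 2, ..., 50

# Dendrogram axis with expand = c(0, .5)
tree_limits <- expand_range_sketch(leaf_range, mult = 0, add = 0.5)

# Tile axis with expand = c(0, 0): tiles span half a unit around each position
tile_limits <- c(1 - 0.5, n_leaves + 0.5)

print(tree_limits)
print(tile_limits)
```

Under these assumptions both axes cover the same interval, which is what lets the tiles line up with the dendrogram leaves.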
Minor edit: updating to ggplot2 2.2.0 and R 3.3.2
axis.ticks.margin is deprecated
X = t(USArrests)
plot_color_clust = function(X, N = N,
# cols=c("red","blue", "orange", "darkgreen","green","yellow","grey","black","white")
cols = rainbow(N) # Easier to pick colours
){
library(ggplot2)
library(gtable)
library(grid)
library(ggdendro)
library(plyr)
if(N > length(cols)) stop("N too big. Not enough colors in cols.")
if(N > ncol(X)) stop("N too big. Not enough columns in data.")
fit = ClustOfVar::hclustvar(X.quanti = X)
dd.row = as.dendrogram(fit)
ddata_x <- dendro_data(dd.row)
temp = cutree(fit, k = N)
lab <- ggdendro::label(ddata_x)
x = c()
for(i in 1:nrow(lab)){
x[i] = paste("clust", as.vector(temp[lab$label[i] == names(temp)]), sep = "")
}
lab$group <- x
p1 <- ggplot(segment(ddata_x)) +
geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
geom_text(data = lab, aes(label = label, x = x, y = -.05, colour = group), # y = -.05 adds a little space between label and tree
size = 4, hjust = 1) +
scale_x_continuous(expand = c(0, .5)) + # 1. Add half a unit to each end of the vertical axis
expand_limits(y = -0.4) + # 2. Make room for labels
theme_classic() +
scale_colour_manual(values = cols) +
coord_flip() +
theme(legend.position = "none", axis.line = element_blank(),
axis.text = element_blank(), axis.title = element_blank(),
axis.ticks = element_blank(),
axis.ticks.length = unit(0, "cm"))
df2 <- data.frame(cluster = cutree(fit, N),
states = factor(fit$labels, levels = fit$labels[fit$order]))
df3 <- ddply(df2, .(cluster),summarise,pos=mean(as.numeric(states)))
p2 <- ggplot(df2, aes(states, y = 1,
fill = factor(as.character(cluster)))) + # 'as.character' - so that colours match with 10 or more clusters
geom_tile() +
scale_y_continuous(expand = c(0, 0)) +
scale_x_discrete(expand = c(0, 0)) +
coord_flip() +
geom_text(data = df3,aes(x = pos, label = cluster, size = 12)) +
scale_fill_manual(values = cols)
gp1 <- ggplotGrob(p1) # Get ggplot grobs
gp2 <- ggplotGrob(p2)
gp2 <- gp2[6, 4] # 3. Grab plot panel only from tiles plot (thus, no margins)
gp1 <- gtable_add_grob(gp1, gp2, t = 6, l = 2, name = "tiles") # 3. Insert it into dendrogram plot
gp1$widths[2] = unit(1, "cm") # 3. Set width of column containing tiles
grid.newpage()
grid.draw(gp1)
}
plot_color_clust(X, 6)
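The hard-coded indices in `gp2[6, 4]` and `t = 6` depend on the gtable layout produced by a particular ggplot2 version. A possible variation, sketched here on a toy plot rather than on the dendrogram, looks up the panel's row and column by name in the `layout` data frame. The plot and the object names (`toy`, `panel_pos`, `panel_only`) are illustrative only.

```r
library(ggplot2)
library(gtable)
library(grid)

# Toy tile plot standing in for the cluster tiles
toy <- ggplot(mtcars, aes(factor(cyl), y = 1, fill = factor(cyl))) +
  geom_tile() +
  theme(legend.position = "none")

g <- ggplotGrob(toy)

# Locate the panel cell by name instead of by fixed row/column numbers
panel_pos <- g$layout[g$layout$name == "panel", c("t", "l")]

# Subsetting a gtable keeps only the requested cells (no margins, axes or titles)
panel_only <- g[panel_pos$t, panel_pos$l]

dim(panel_only)
```

The same lookup on the dendrogram's gtable would give the row to pass as `t` to `gtable_add_grob()`.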

Related

Alignment of y axis labels in facet_grid and ggplot?

Using ggplot and facet_grid, I'm trying to make a heatmap. I have a categorical y axis, and I want the y axis labels to be left-aligned. When I use theme(axis.text.y.left = element_text(hjust = 0)), each panel's labels are aligned independently. Here is the code:
#data
set.seed(1)
gruplar <- NA
for(i in 1:20) gruplar[i] <- paste(LETTERS[sample(c(1:20),sample(c(1:20),1),replace = T) ],
sep="",collapse = "")
gruplar <- cbind(gruplar,anagruplar=rep(1:4,each=5))
tarih <- data.frame(yil= rep(2014:2019,each=12) ,ay =rep_len(1:12, length.out = 72))
gruplar <- gruplar[rep(1:nrow(gruplar),each=nrow(tarih)),]
tarih <- tarih[rep_len(1:nrow(tarih),length.out = nrow(gruplar)),]
grouped <- cbind(tarih,gruplar)
grouped$value <- rnorm(nrow(grouped))
#plot
library(ggplot2)
p <- ggplot(grouped,aes(ay,gruplar,fill=value))
p <- p + facet_grid(anagruplar~yil,scales = "free",
space = "free",switch = "y")
p <- p + theme_minimal(base_size = 14) +labs(x="",y="") +
theme(strip.placement = "outside",
strip.text.y = element_text(angle = 90))
p <- p + geom_raster(aes(fill = value), na.rm = T)
p + theme(axis.text.y.left = element_text(hjust = 0, size=14))
I know that by putting spaces and using a mono-space font I can solve the problem, but I have to use the font 'Calibri Light'.
Digging into grobs isn't my favourite hack, but it can serve its purpose here:
# generate plot
# (I used a smaller base_size because my computer screen is small)
p <- ggplot(grouped,aes(ay,gruplar,fill=value)) +
geom_raster(aes(fill = value),na.rm = T) +
facet_grid(anagruplar~yil,scales = "free",space = "free",switch = "y") +
labs(x="", y="") +
theme_minimal(base_size = 10) +
theme(strip.placement = "outside",
strip.text.y = element_text(angle = 90),
axis.text.y.left = element_text(hjust = 0, size=10))
# examine ggplot object: alignment is off
p
# convert to grob object: alignment is unchanged (i.e. still off)
gp <- ggplotGrob(p)
dev.off(); grid::grid.draw(gp)
# change viewport parameters for left axis grobs
for(i in which(grepl("axis-l", gp$layout$name))){
gp$grobs[[i]]$vp$x <- unit(0, "npc") # originally 1npc
gp$grobs[[i]]$vp$valid.just <- c(0, 0.5) # originally c(1, 0.5)
}
# re-examine grob object: alignment has been corrected
dev.off(); grid::grid.draw(gp)
I guess one option is to draw the labels on the right-hand side, and move that column in the gtable,
p <-ggplot(grouped,aes(ay,gruplar,fill=value)) +
facet_grid(anagruplar~yil,scales = "free",space = "free",switch = "y") +
geom_raster(aes(fill = value),na.rm = T) +
theme_minimal(base_size = 12) + labs(x="",y="") +
scale_y_discrete(position='right') +
theme(strip.placement = "outside", strip.text.y = element_text(angle = 90))+
theme(axis.text.y.left = element_text(hjust = 0,size=14))
g <- ggplotGrob(p)
id1 <- unique(g$layout[grepl("axis-l", g$layout$name),"l"])
id2 <- unique(g$layout[grepl("axis-r", g$layout$name),"l"])
g2 <- gridExtra::gtable_cbind(g[,seq(1,id1-1)],g[,id2], g[,seq(id1+1, id2-1)], g[,seq(id2+1, ncol(g))])
library(grid)
grid.newpage()
grid.draw(g2)
This seems like a bug in ggplot2, or at least what I consider an undesirable / unexpected behavior. You may have seen the approach suggested here, which uses string padding on a mono-space font to achieve the alignment.
This is pretty hacky, but if you need to achieve alignment using a particular font, you might replace the axis labels altogether with geom_text. I have a mostly-working solution, but it is ugly, in that each step seems to break something else!
library(ggplot2); library(dplyr)
# To add a blank facet before 2014, I convert to character
grouped$yil = as.character(grouped$yil)
# I add some rows for the dummy facet, in year "", to use for labels
grouped <- grouped %>%
bind_rows(grouped %>%
group_by(gruplar) %>%
slice(1) %>%
mutate(yil = "",
value = NA_real_) %>%
ungroup())
p <- ggplot(grouped,
aes(ay,gruplar,fill=value)) +
geom_raster(aes(fill = value),na.rm = T) +
scale_x_continuous(breaks = 4*0:3) +
facet_grid(anagruplar~yil,
scales = "free",space = "free",switch = "y") +
theme_minimal(base_size = 14) +
labs(x="",y="") +
theme(strip.placement = "outside",
strip.text.y = element_text(angle = 90),
axis.text.y.left = element_blank(),
panel.grid = element_blank()) +
geom_text(data = grouped %>%
filter(yil == ""),
aes(x = -40, y = gruplar, label = gruplar), hjust = 0) +
scale_fill_continuous(na.value = "white")
p
(The last problem with this plot that I can see is that it shows an orphaned "0" on the x axis of the dummy facet. Need another hack to get rid of that!)

Adjust step scale with ggplot for slope graph in R

I'm drawing a slope graph with ggplot, but the labels get clustered together and are not shown properly because of the scale of the two axes.
Any ideas?
My code and the graph are below. Is there any way to adjust the step of the scale?
Thanks a lot!
#Read file as numeric data
betterlife<-read.csv("betterlife.csv",skip=4,stringsAsFactors = F)
num_data <- data.frame(data.matrix(betterlife))
numeric_columns <- sapply(num_data,function(x){mean(as.numeric(is.na(x)))<0.5})
final_data <- data.frame(num_data[,numeric_columns],
betterlife[,!numeric_columns])
## rescale selected columns data frame
final_data <- data.frame(lapply(final_data[,c(3,4,5,6,7,10,11)], function(x) scale(x, center = FALSE, scale = max(x, na.rm = TRUE)/100)))
## Add country names as indicator
final_data["INDICATOR"] <- NA
final_data$INDICATOR <- betterlife$INDICATOR
employment.data <- final_data[5:30,]
indicator <- employment.data$INDICATOR
## Melt data to draw graph
library(reshape2) # assumption: melt() comes from reshape2
employment.melt <- melt(employment.data)
#plot
library(ggplot2)
sg = ggplot(employment.melt, aes(factor(variable), value,
group = indicator,
colour = indicator,
label = indicator)) +
theme(legend.position = "none",
axis.text.x = element_text(size=5),
axis.text.y=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
axis.ticks=element_blank(),
axis.line=element_blank(),
panel.grid.major.x = element_line("black", size = 0.1),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.background = element_blank())
# plot the right-most labels
sg1 = sg + geom_line(size=0.15) +
geom_text(data = subset(employment.melt, variable == "Life.expectancy"),
aes(x = factor(variable), label=sprintf(" %2f %s",value,INDICATOR)), size = 1.75, hjust = 0)
# plot the left-most labels
sg1 = sg1 + geom_text(data = subset(employment.melt, variable == "Employment.rate"),
aes(x = factor(variable), label=sprintf("%s %2f ",INDICATOR,value)), size = 1.75, hjust = 1)
sg1
Have you tried setting up a scale, for example for x (though I think you should do it for y too)?
scale_x_continuous(breaks = seq(0, 100, 5))
where 0 - 100 is the range and 5 is the step size. You need to adjust these values according to your graph.
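As a small illustration of this suggestion, the sketch below sets evenly spaced breaks on both axes of a toy data set. The data frame `toy` is invented. Note that `scale_x_continuous()` and `scale_y_continuous()` only apply to an axis mapped to a numeric variable, so for the slope graph in the question (where x is a factor) the y scale is probably the relevant one.

```r
library(ggplot2)

# Invented data on a 0-100 range
toy <- data.frame(x = c(0, 25, 50, 75, 100), y = c(12, 48, 33, 90, 67))

brks <- seq(0, 100, 5)   # range 0 to 100, step size 5

p <- ggplot(toy, aes(x, y)) +
  geom_line() +
  scale_x_continuous(breaks = brks) +
  scale_y_continuous(breaks = brks)
p
```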

ggplot2 - custom grob over axis lines

I'm trying to generate an axis line break in ggplot2 (with a white segment over the axis lines) and I'm having some trouble.
Using the informative post annotate-ggplot-with-an-extra-tick-and-label I was able to generate the custom grobs in given location, while also turning the panel off to "draw" outside of the plotting area.
I'm also familiar with other packages such as plotrix and am able to replicate broken axis in base, but more than anything I'm interested in learning why the axis grobs I'm creating aren't overwriting the line. Here is some sample code:
library(ggplot2) # devtools::install_github("hadley/ggplot2")
library(grid)
library(scales)
data("economics_long")
econ <- economics_long
econ$value01 <- (econ$value01/2)
x <- ggplot(econ, aes(date, value01,group=1)) + scale_y_continuous(labels=c(0.0,0.1,0.2,0.3,0.4,0.5,1.0), breaks=c(0.0,0.1,0.2,0.3,0.4,0.5,0.6),limits = c(0,.6),expand = c(0, 0)) +
geom_smooth(colour="deepskyblue", show.legend = TRUE ) + theme_bw()
theme_white <- theme(panel.background=element_blank(),
panel.border=element_rect(color="white"),
plot.margin = unit(c(.2, 0, .2, .2), "cm"),
panel.grid.major.y=element_blank(),
panel.grid.major.x=element_blank(),
panel.grid.minor.x=element_blank(),
panel.grid.minor.y=element_blank(),
axis.title.y = element_blank(),
axis.line.x=element_line(color="gray", size=1),
axis.line.y=element_line(color="gray", size=1),
axis.text.x=element_text(size=12),
axis.text.y=element_text(size=12),
axis.ticks=element_line(color="gray", size=1),
legend.position="none"
)
x <- x + theme_white
gline = linesGrob(y = c(0, 1.5),x = c(-.015, .015), gp = gpar(col = "black", lwd = 2.5))
gline2 = linesGrob(y = c(-0.25, 0.5),x = c(0, 0), gp = gpar(col = "red", lwd = 5))
p = x + annotation_custom(gline, ymin=.55, ymax=.575, xmin=-Inf, xmax=Inf) +
annotation_custom(gline, ymin=.525, ymax=.55, xmin=-Inf, xmax=Inf) +
annotation_custom(gline2, ymin=.55, ymax=.575, xmin=-Inf, xmax=Inf)
# grobs are placed under the axis lines....
g = ggplotGrob(p)
g$layout$clip[g$layout$name=="panel"] <- "off"
grid.draw(g)
Which creates this image:
I'm curious why the annotation_custom grobs are placed under the axis lines and whether there is a better solution to adding custom grobs using ggplot2. There appears to be an order in which graphics are placed in the plotting windows - how might this be alternated so that the custom grobs are placed after the axis lines?
You were close. The layout data frame is where you turned off clipping. There is another column in the layout data frame, z, that gives the order in which the various plot elements are drawn. The plot panel (including the annotation) is drawn second (after the background), and the axes are drawn later. Change the value of z for the plot panel to something larger than the z values for the axes.
library(ggplot2) # devtools::install_github("hadley/ggplot2")
library(grid)
library(scales)
data("economics_long")
econ <- economics_long
econ$value01 <- (econ$value01/2)
x <- ggplot(econ, aes(date, value01,group=1)) + scale_y_continuous(labels=c(0.0,0.1,0.2,0.3,0.4,0.5,1.0), breaks=c(0.0,0.1,0.2,0.3,0.4,0.5,0.6),limits = c(0,.6),expand = c(0, 0)) +
geom_smooth(colour="deepskyblue", show.legend = TRUE ) + theme_bw()
theme_white <- theme(panel.background=element_blank(),
panel.border=element_rect(color="transparent"),
plot.margin = unit(c(.2, 0, .2, .2), "cm"),
panel.grid.major.y=element_blank(),
panel.grid.major.x=element_blank(),
panel.grid.minor.x=element_blank(),
panel.grid.minor.y=element_blank(),
axis.title.y = element_blank(),
axis.line.x=element_line(color="gray", size=1),
axis.line.y=element_line(color="gray", size=1),
axis.text.x=element_text(size=12),
axis.text.y=element_text(size=12),
axis.ticks=element_line(color="gray", size=1),
legend.position="none"
)
x <- x + theme_white
gline = linesGrob(y = c(0, 1.5),x = c(-.015, .015), gp = gpar(col = "black", lwd = 2.5))
gline2 = linesGrob(y = c(-0.25, 0.5),x = c(0, 0), gp = gpar(col = "red", lwd = 5))
p = x + annotation_custom(gline, ymin=.55, ymax=.575, xmin=-Inf, xmax=Inf) +
annotation_custom(gline, ymin=.525, ymax=.55, xmin=-Inf, xmax=Inf) +
annotation_custom(gline2, ymin=.55, ymax=.575, xmin=-Inf, xmax=Inf)
# grobs are placed under the axis lines....
g = ggplotGrob(p)
g$layout$clip[g$layout$name=="panel"] <- "off"
g$layout # Note that z for panel is 1. Change it to something bigger.
g$layout$z[g$layout$name=="panel"] = 17
grid.newpage()
grid.draw(g)
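The value 17 works for this plot's layout, but the number of layout entries can vary between plots and ggplot2 versions. A hedged variation, shown on a toy plot, sets the panel's `z` to one more than the current maximum so that the panel is drawn last whatever the layout looks like.

```r
library(ggplot2)
library(grid)

# Toy plot; any single-panel ggplot would do
g <- ggplotGrob(ggplot(mtcars, aes(wt, mpg)) + geom_point())

is_panel <- g$layout$name == "panel"

# Draw order follows z: put the panel above every other element
g$layout$z[is_panel] <- max(g$layout$z) + 1

# Inspect the resulting draw order
g$layout[order(g$layout$z), c("name", "z")]

grid.newpage()
grid.draw(g)
```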

R: add calibrated axes to PCA biplot in ggplot2

I am working on an ordination package using ggplot2. Right now I am constructing biplots in the traditional way, with loadings being represented with arrows. I would also be interested though to use calibrated axes and represent the loading axes as lines through the origin, and with loading labels being shown outside the plot region. In base R this is implemented in
library(OpenRepGrid)
biplot2d(boeker)
but I am looking for a ggplot2 solution. Would anybody have any thoughts how to achieve something like this in ggplot2? Adding the variable names outside the plot region could be done like here I suppose, but how could the line segments outside the plot region be plotted?
Currently what I have is
install.packages("devtools")
library(devtools)
install_github("fawda123/ggord")
library(ggord)
data(iris)
ord <- prcomp(iris[,1:4],scale=TRUE)
ggord(ord, iris$Species)
The loadings are in ord$rotation
PC1 PC2 PC3 PC4
Sepal.Length 0.5210659 -0.37741762 0.7195664 0.2612863
Sepal.Width -0.2693474 -0.92329566 -0.2443818 -0.1235096
Petal.Length 0.5804131 -0.02449161 -0.1421264 -0.8014492
Petal.Width 0.5648565 -0.06694199 -0.6342727 0.5235971
How could I add the lines through the origin, the outside ticks and the labels outside the axis region (possibly including the cool jittering that is applied above for overlapping labels)?
NB I do not want to turn off clipping, since some of my plot elements could sometimes go outside the bounding box
EDIT: Someone else apparently asked a similar question before, though the question is still without an answer. It points out that to do something like this in base R (though in an ugly way) one can do e.g.
plot(-1:1, -1:1, asp = 1, type = "n", xaxt = "n", yaxt = "n", xlab = "", ylab = "")
abline(a = 0, b = -0.75)
abline(a = 0, b = 0.25)
abline(a = 0, b = 2)
mtext("V1", side = 4, at = -0.75*par("usr")[2])
mtext("V2", side = 2, at = 0.25*par("usr")[1])
mtext("V3", side = 3, at = par("usr")[4]/2)
Minimal workable example in ggplot2 would be
library(ggplot2)
df <- data.frame(x = -1:1, y = -1:1)
dfLabs <- data.frame(x = c(1, -1, 1/2), y = c(-0.75, -0.25, 1), labels = paste0("V", 1:3))
p <- ggplot(data = df, aes(x = x, y = y)) + geom_blank() +
geom_abline(intercept = rep(0, 3), slope = c(-0.75, 0.25, 2)) +
theme_bw() + coord_cartesian(xlim = c(-1, 1), ylim = c(-1, 1)) +
theme(axis.title = element_blank(), axis.text = element_blank(), axis.ticks = element_blank(),
panel.grid = element_blank())
p + geom_text(data = dfLabs, mapping = aes(label = labels))
but as you can see no luck with the labels, and I am looking for a solution that does not require one to turn off clipping.
EDIT2: bit of a related question is how I could add custom breaks/tick marks and labels, say in red, at the top of the X axis and right of the Y axis, to show the coordinate system of the factor loadings? (in case I would scale it relative to the factor scores to make the arrows clearer, typically combined with a unit circle)
Maybe as an alternative, you could remove the default panel box and axes altogether, and draw a smaller rectangle in the plot region instead. Clipping the lines not to clash with the text labels is a bit tricky, but this might work.
df <- data.frame(x = -1:1, y = -1:1)
dfLabs <- data.frame(x = c(1, -1, 1/2), y = c(-0.75, -0.25, 1),
labels = paste0("V", 1:3))
p <- ggplot(data = df, aes(x = x, y = y)) +
geom_blank() +
geom_blank(data=dfLabs, aes(x = x, y = y)) +
geom_text(data = dfLabs, mapping = aes(label = labels)) +
geom_abline(intercept = rep(0, 3), slope = c(-0.75, 0.25, 2)) +
theme_grey() +
theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank()) +
theme()
library(grid)
element_grob.element_custom <- function(element, ...) {
rectGrob(0.5,0.5, 0.8, 0.8, gp=gpar(fill="grey95"))
}
panel_custom <- function(...){ # dummy wrapper
structure(
list(...),
class = c("element_custom","element_blank", "element")
)
}
p <- p + theme(panel.background=panel_custom())
clip_layer <- function(g, layer="segment", width=1, height=1){
id <- grep(layer, names(g$grobs[[4]][["children"]]))
newvp <- viewport(width=unit(width, "npc"),
height=unit(height, "npc"), clip=TRUE)
g$grobs[[4]][["children"]][[id]][["vp"]] <- newvp
g
}
g <- ggplotGrob(p)
g <- clip_layer(g, "segment", 0.85, 0.85)
grid.newpage()
grid.draw(g)
What about the following code?
If you also want the labels on the top and on the right, have a look at:
http://rpubs.com/kohske/dual_axis_in_ggplot2
require(ggplot2)
data(iris)
ord <- prcomp(iris[,1:4],scale=TRUE)
slope <- ord$rotation[,2]/ord$rotation[,1]
p <- ggplot() +
geom_point(data = as.data.frame(ord$x), aes(x = PC1, y = PC2)) +
geom_abline(data = as.data.frame(slope), aes(slope=slope))
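info <- ggplot_build(p)
# Note: this path matches the ggplot2 version used here; in later releases the
# ranges appear to be stored under info$layout$panel_params instead
x <- info$panel$ranges[[1]]$x.range[1]
y <- info$panel$ranges[[1]]$y.range[1]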
p +
scale_x_continuous(breaks=y/slope, labels=names(slope)) +
scale_y_continuous(breaks=x*slope, labels=names(slope)) +
theme(axis.text.x = element_text(angle=90, vjust=0.5),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank())
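The break positions used above follow from simple geometry: a loading axis through the origin with slope s = PC2/PC1 meets the left edge x = x_min at y = s * x_min, and the bottom edge y = y_min at x = y_min / s. Below is a minimal base R sketch using the loadings quoted in the question; the panel limits `x_min` and `y_min` are made-up values for illustration, and not every intersection necessarily falls inside the visible range.

```r
# Loadings on PC1 and PC2, copied from ord$rotation in the question
rotation <- rbind(
  Sepal.Length = c(PC1 = 0.5210659, PC2 = -0.37741762),
  Sepal.Width  = c(PC1 = -0.2693474, PC2 = -0.92329566),
  Petal.Length = c(PC1 = 0.5804131, PC2 = -0.02449161),
  Petal.Width  = c(PC1 = 0.5648565, PC2 = -0.06694199)
)

# Slope of each loading axis through the origin
slope <- rotation[, "PC2"] / rotation[, "PC1"]

# Assumed lower panel limits (illustrative values only)
x_min <- -3
y_min <- -3

y_at_left   <- x_min * slope   # where each line crosses the left edge
x_at_bottom <- y_min / slope   # where each line crosses the bottom edge

round(cbind(slope, y_at_left, x_at_bottom), 3)
```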

Plot multiple ggplot plots on a single image with left alignment of the plots and a single legend

I would like to place several different ggplot plots into a single image. After much exploring, I am finding that ggplot is fantastic at generating a single plot or a series of plots if the data is formatted correctly. However, when you want to combine multiple plots, there are so many different options for combining them that it gets confusing and quickly becomes convoluted. I have the following desires for my final plot:
The left axes of all the individual plots are aligned so that the plots can all share a common x-axis, presented by the bottom-most plot
There is a single common legend on the right of the plot (preferably positioned near the top of the plot)
The top two indicator plots do not have any y-axis ticks or numbers
There is a minimum amount of space between the plots
The indicator plots (isTraining and isTesting) take up a smaller amount of vertical space so that the remaining three plots can fill the space as needed
I have searched for solutions to meet the above requirements but it just is not working correctly. The following code does a lot of this (albeit in a possibly convoluted way) but falls short of satisfying my above listed requirements. The following are my specific issues:
The code that I found to align the left sides of the plots is not working for some reason
The method that I am currently using to get multiple plots on the same page seems difficult to use and there is most likely a better technique (I am open to suggestions)
The x-axis title is not showing up in the result
The legend is not aligned to the top of the plot (I do not know the easy way to do this at all, so I have not tried. Suggestions are welcome)
Any help in solving any of these issues would be greatly appreciated.
Self Contained Code Example
(It is a bit long but for this question I thought that there could be strange interactions)
# Load needed libraries ---------------------------------------------------
library(ggplot2)
library(caret)
library(grid)
rm(list = ls())
# Generate Sample Data ----------------------------------------------------
N = 1000
classes = c('A', 'B', 'C', 'D', 'E')
set.seed(37)
ind = 1:N
data1 = sin(100*runif(N))
data2 = cos(100*runif(N))
data3 = cos(100*runif(N)) * sin(100*runif(N))
data4 = factor(unlist(lapply(classes, FUN = function(x) {rep(x, N/length(classes))})))
data = data.frame(ind, data1, data2, data3, Class = data4)
rm(ind, data1, data2, data3, data4, N, classes)
# Separate into smaller datasets for training and testing -----------------
set.seed(1976)
inTrain <- createDataPartition(y = data$data1, p = 0.75, list = FALSE)
data_Train = data[inTrain,]
data_Test = data[-inTrain,]
rm(inTrain)
# Generate Individual Plots -----------------------------------------------
data1_plot = ggplot(data) + theme_bw() + geom_point(aes(x = ind, y = data1, color = Class))
data2_plot = ggplot(data) + theme_bw() + geom_point(aes(x = ind, y = data2, color = Class))
data3_plot = ggplot(data) + theme_bw() + geom_point(aes(x = ind, y = data3, color = Class))
isTraining = ggplot(data_Train) + theme_bw() + geom_point(aes(x = ind, y = 1, color = Class))
isTesting = ggplot(data_Test) + theme_bw() + geom_point(aes(x = ind, y = 1, color = Class))
# Set the desired legend properties before extraction to grob -------------
data1_plot = data1_plot + theme(legend.key = element_blank())
# Extract the legend from one of the plots --------------------------------
getLegend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend)}
leg = getLegend(data1_plot)
# Remove legend from other plots ------------------------------------------
data1_plot = data1_plot + theme(legend.position = 'none')
data2_plot = data2_plot + theme(legend.position = 'none')
data3_plot = data3_plot + theme(legend.position = 'none')
isTraining = isTraining + theme(legend.position = 'none')
isTesting = isTesting + theme(legend.position = 'none')
# Remove the grid from the isTraining and isTesting plots -----------------
isTraining = isTraining + theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank())
isTesting = isTesting + theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank())
# Remove the y-axis from the isTraining and the isTesting Plots -----------
isTraining = isTraining + theme(axis.ticks = element_blank(), axis.text = element_blank())
isTesting = isTesting + theme(axis.ticks = element_blank(), axis.text = element_blank())
# Remove the margin from the plots and set the XLab to null ---------------
tmp = theme(panel.margin = unit(c(0, 0, 0, 0), units = 'cm'), plot.margin = unit(c(0, 0, 0, 0), units = 'cm'))
data1_plot = data1_plot + tmp + labs(x = NULL, y = 'Data 1')
data2_plot = data2_plot + tmp + labs(x = NULL, y = 'Data 2')
data3_plot = data3_plot + tmp + labs(x = NULL, y = 'Data 3')
isTraining = isTraining + tmp + labs(x = NULL, y = 'Training')
isTesting = isTesting + tmp + labs(x = NULL, y = 'Testing')
# Add the XLabel back to the bottom plot ----------------------------------
data3_plot = data3_plot + labs(x = 'Index')
# Remove the X-Axis from all the plots but the bottom one -----------------
# data3 is to be the last plot...
data1_plot = data1_plot + theme(axis.ticks.x = element_blank(), axis.text.x = element_blank())
data2_plot = data2_plot + theme(axis.ticks.x = element_blank(), axis.text.x = element_blank())
isTraining = isTraining + theme(axis.ticks.x = element_blank(), axis.text.x = element_blank())
isTesting = isTesting + theme(axis.ticks.x = element_blank(), axis.text.x = element_blank())
# Store plots in a list for ease of processing ----------------------------
plots = list()
plots[[1]] = isTraining
plots[[2]] = isTesting
plots[[3]] = data1_plot
plots[[4]] = data2_plot
plots[[5]] = data3_plot
# Fix the widths of the plots so that the left side of the axes align ----
# Note: This does not seem to function correctly....
# I tried to adapt from:
# http://stackoverflow.com/questions/13294952/left-align-two-graph-edges-ggplot
plotGrobs = lapply(plots, ggplotGrob)
plotGrobs[[1]]$widths[2:5]
maxWidth = plotGrobs[[1]]$widths[2:5]
for(i in length(plots)) {
maxWidth = grid::unit.pmax(maxWidth, plotGrobs[[i]]$widths[2:5])
}
for(i in length(plots)) {
plotGrobs[[i]]$widths[2:5] = as.list(maxWidth)
}
plotAtPos = function(x = 0.5, y = 0.5, width = 1, height = 1, obj) {
pushViewport(viewport(x = x + 0.5*width, y = y + 0.5*height, width = width, height = height))
grid.draw(obj)
upViewport()
}
grid.newpage()
plotAtPos(x = 0, y = 0.85, width = 0.9, height = 0.1, plotGrobs[[1]])
plotAtPos(x = 0, y = 0.75, width = 0.9, height = 0.1, plotGrobs[[2]])
plotAtPos(x = 0, y = 0.5, width = 0.9, height = 0.2, plotGrobs[[3]])
plotAtPos(x = 0, y = 0.3, width = 0.9, height = 0.2, plotGrobs[[4]])
plotAtPos(x = 0, y = 0.1, width = 0.9, height = 0.2, plotGrobs[[5]])
plotAtPos(x = 0.9, y = 0, width = 0.1, height = 1, leg)
The visual result of the above is in the following image:
Aligning ggplots should be done with rbind.gtable; here it's fairly straight-forward since the gtables all have the same number of columns. Setting the panel heights and adding a legend on the side is also more straight-forward with gtable than with grid viewports, in my opinion.
The only slight annoyance is that rbind.gtable currently doesn't handle unit.pmax to set the widths as required. It's easy to fix though, see the rbind_max function below.
require(gtable)
rbind_max <- function(...){
gtl <- lapply(list(...), ggplotGrob)
bind2 <- function (x, y)
{
stopifnot(ncol(x) == ncol(y))
if (nrow(x) == 0)
return(y)
if (nrow(y) == 0)
return(x)
y$layout$t <- y$layout$t + nrow(x)
y$layout$b <- y$layout$b + nrow(x)
x$layout <- rbind(x$layout, y$layout)
x$heights <- gtable:::insert.unit(x$heights, y$heights)
x$rownames <- c(x$rownames, y$rownames)
x$widths <- grid::unit.pmax(x$widths, y$widths)
x$grobs <- append(x$grobs, y$grobs)
x
}
Reduce(bind2, gtl)
}
gp <- do.call(rbind_max, plots)
gp <- gtable_add_cols(gp, widths = sum(leg$widths))
panels <- gp$layout$t[grep("panel", gp$layout$name)]
# set the relative panel heights 1/3 for the top two
gp$heights[panels] <- lapply(c(1,1,3,3,3), unit, "null")
# set the legend justification to top (it's a gtable embedded in a gtable)
leg[["grobs"]][[1]][["vp"]] <- viewport(just = c(0.5,1))
gp <- gtable_add_grob(gp, leg, t = 1, l = ncol(gp))
grid.newpage()
grid.draw(gp)
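A side note on the width-matching loop in the question, which may be part of why the left edges did not align there: `for(i in length(plots))` visits only the last index, whereas `seq_along(plots)` visits every one. The sketch below shows the difference, with plain grid units standing in for the plot widths (the values are invented).

```r
library(grid)

# Stand-ins for the left-hand widths of three plots
widths <- list(unit(1, "cm"), unit(3, "cm"), unit(2, "cm"))

# length(widths) is a single number, so this loop body runs once
visited_last_only <- integer(0)
for (i in length(widths)) visited_last_only <- c(visited_last_only, i)

# seq_along(widths) yields every index
visited_all <- integer(0)
for (i in seq_along(widths)) visited_all <- c(visited_all, i)

print(visited_last_only)
print(visited_all)

# Element-wise maximum over all widths, as the alignment approach intends
max_width <- do.call(unit.pmax, widths)
convertWidth(max_width, "cm", valueOnly = TRUE)
```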
