R - how to allocate screen space to complex ggplot images - r

I am trying to write a script that produces four different plots in a single image. Specifically, I want to recreate this graphic as closely as possible:
My current script produces four plots similar to these but I cannot figure out how to allocate screen real-estate accordingly. I want to:
modify the height and width of the plots so that all four have uniform width, one is substantially taller than the others which have uniform height among them
define the position of the legends by coordinates so that I can use screen space effectively
modify the overall shape of my image explicitly as needed (maybe I will need it closer to square-shaped at some point)
# GENERATE SOME DATA TO PLOT
pt_id = c(1:279) # DEFINE PATIENT IDs
smoke = rbinom(279,1,0.5) # DEFINE SMOKING STATUS
hpv = rbinom(279,1,0.3) # DEFINE HPV STATUS
data = data.frame(pt_id, smoke, hpv) # PRODUCE DATA FRAME
# ADD ANATOMICAL SITE DATA
data$site = sample(1:4, 279, replace = T)
data$site[data$site == 1] = "Hypopharynx"
data$site[data$site == 2] = "Larynx"
data$site[data$site == 3] = "Oral Cavity"
data$site[data$site == 4] = "Oropharynx"
data$site_known = 1 # HACK TO FACILITATE PRODUCING BARPLOTS
# ADD MUTATION FREQUENCY DATA
data$freq = sample(1:1000, 279, replace = F)
# DEFINE BARPLOT
require(ggplot2)
require(gridExtra)
bar = ggplot(data, aes(x = pt_id, y = freq)) + geom_bar(stat = "identity") + theme(axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("Number of Mutations")
# DEFINE BINARY PLOTS
smoke_status = ggplot(data, aes(x=pt_id, y=smoke, fill = "red")) + geom_bar(stat="identity") + theme(legend.position = "none", axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("Smoking Status")
hpv_status = ggplot(data, aes(x=pt_id, y = hpv, fill = "red")) + geom_bar(stat="identity") + theme(legend.position = "none", axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("HPV Status")
site_status = ggplot(data, aes(x=pt_id, y=site_known, fill = site)) + geom_bar(stat="identity")
# PRODUCE FOUR GRAPHS TOGETHER
grid.arrange(bar, smoke_status, hpv_status, site_status, nrow = 4)
I suspect that the functions needed to accomplish these tasks are already included in ggplot2 and gridExtra but I have not been able to figure out how. Also, if any of my code is excessively verbose or there is a simpler, more-elegant way to do what I have already done - please feel free to comment on that as well.

Here are the steps to get the layout you describe:
1) Extract the legend as a separate grob ("graphical object"). We can then lay out the legend separately from the plots.
2) Left-align the edges of the four plots so that the left edges and the x-scales line up properly. The code to do that comes from this SO answer. That answer has a function to align an arbitrary number of plots, but I wasn't able to get that to work when I also wanted to change the proportional space allotted to each plot, so I ended up doing it the "long way" by adjusting each plot separately.
3) Lay out the plots and the legend using grid.arrange and arrangeGrob. The heights argument allocates different proportions of the total vertical space to each plot. We also use the widths argument to allocate horizontal space to the plots in one wide column and the legend in another narrow column.
4) Plot to a device in whatever size you desire. This is how you get a particular shape or aspect ratio.
library(gridExtra)
library(grid)
# Function to extract the legend from a ggplot graph as a separate grob
# Source: https://stackoverflow.com/a/12539820/496488
get_leg = function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
# Get legend as a separate grob
leg = get_leg(site_status)
# Add a theme element to change the plot margins to remove white space between the plots
thm = theme(plot.margin=unit(c(0,0,-0.5,0),"lines"))
# Left-align the four plots
# Adapted from: https://stackoverflow.com/a/13295880/496488
gA <- ggplotGrob(bar + thm)
gB <- ggplotGrob(smoke_status + thm)
gC <- ggplotGrob(hpv_status + thm)
gD <- ggplotGrob(site_status + theme(plot.margin=unit(c(0,0,0,0), "lines")) +
guides(fill="none")) # "none" replaces the deprecated fill=FALSE
maxWidth = grid::unit.pmax(gA$widths[2:5], gB$widths[2:5], gC$widths[2:5], gD$widths[2:5])
gA$widths[2:5] <- as.list(maxWidth)
gB$widths[2:5] <- as.list(maxWidth)
gC$widths[2:5] <- as.list(maxWidth)
gD$widths[2:5] <- as.list(maxWidth)
# Lay out plots and legend
p = grid.arrange(arrangeGrob(gA,gB,gC,gD, heights=c(0.5,0.15,0.15,0.21)),
leg, ncol=2, widths=c(0.8,0.2))
You can then determine the shape or aspect ratio of the final plot by setting the parameters of the output device. (You may have to adjust font sizes when you create the underlying plots to get the final layout to look the way you want.) The plot pasted in below is a png saved directly from the RStudio graph window. Here's how you would save the plot as a PDF file (there are many other "devices", such as png and jpeg, that save in other formats):
pdf("myPlot.pdf", width=10, height=5)
grid.draw(p) # draw the arranged gtable on the open PDF device
dev.off()
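As a hedged sketch of the same idea with a raster device: the example below builds a two-row layout with arrangeGrob (which assembles the layout without drawing it) and renders it to a PNG of a chosen size with grid.draw. The plots, the object names (p_top, p_bottom, combined, out_file) and the output dimensions are illustrative assumptions, not part of the original script.

```r
library(ggplot2)
library(grid)
library(gridExtra)

# Two stand-in plots (hypothetical; any ggplot objects would do)
p_top    <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_bottom <- ggplot(mtcars, aes(wt, hp)) + geom_point()

# arrangeGrob() builds the layout as a gtable without drawing it;
# heights gives the relative share of vertical space per row
combined <- arrangeGrob(p_top, p_bottom, nrow = 2, heights = c(0.7, 0.3))

# The device size sets the overall shape of the image
out_file <- file.path(tempdir(), "layout_demo.png")
png(out_file, width = 10, height = 5, units = "in", res = 150)
grid.draw(combined)
dev.off()
```

Changing width and height in the png() call (for example to 7 by 7) gives a square image without touching the layout code.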
You also asked about more efficient code. One thing you can do is create a list of plot elements that you use multiple times and then just add the name of the list object to each plot. For example:
my_gg = list(geom_bar(stat="identity", fill="red"),
             theme(legend.position = "none",
                   axis.title.x = element_blank(),
                   axis.ticks.x = element_blank(),
                   axis.text.x = element_blank(),
                   # plot.margin is a theme element, so it belongs inside theme()
                   plot.margin = unit(c(0,0,-0.5,0), "lines")))
smoke_status = ggplot(data, aes(x=pt_id, y=smoke)) +
labs(y="Smoking Status") +
my_gg

Related

Is it possible to specify the size / layout of a single plot to match a certain grid in R?

I use R for most of my data analysis. Until now I exported the results as a CSV and visualized them using Numbers on the Mac.
The reason: the graphs are embedded in documents, and there is a rather large border on the right side reserved for annotations (Tufte handout style). Between the actual text and the annotations column there is white space. The plot of the graphs needs to fit the width of the text, while the legend should be placed in the annotation column.
I would prefer to also create the plots within R for a better workflow and higher efficiency. Is it possible to create such a layout using plotting with R?
Here is an example of what I would like to achieve:
And here is some R Code as a starter:
library(tidyverse)
data <- midwest %>%
head(5) %>%
select(2,23:25) %>%
pivot_longer(cols=2:4,names_to="Variable", values_to="Percent") %>%
mutate(Variable=factor(Variable, levels=c("percbelowpoverty","percchildbelowpovert","percadultpoverty"),ordered=TRUE))
ggplot(data=data, mapping=aes(x=county, y=Percent, fill=Variable)) +
geom_col(position=position_dodge(width=0.85),width=0.8) +
labs(x="County") +
theme(text=element_text(size=9),
panel.background = element_rect(fill="white"),
panel.grid = element_line(color = "black",linetype="solid",size= 0.3),
panel.grid.minor = element_blank(),
panel.grid.major.x=element_blank(),
axis.line.x=element_line(color="black"),
axis.ticks= element_blank(),
legend.position = "right",
legend.title = element_blank(),
legend.box.spacing = unit(1.5,"cm") ) +
scale_y_continuous(breaks= seq(from=0, to=50,by=5),
limits=c(0,51),
expand=c(0,0)) +
scale_fill_manual(values = c("#CF232B","#942192","#000000"))
I know how to set a custom font, just left it out for easier saving.
Using ggsave
ggsave("Graph_with_R.jpeg",plot=last_plot(),device="jpeg",dpi=300, width=18, height=9, units="cm")
I get this:
This might resemble the result aimed for, but the layout and sizes do not fit exactly. Also note the different text sizes between the axis titles, the legend and the tick labels on the y-axis. In addition, I assume the legend width depends on the actual labels and is not fixed.
Update
Following the suggestion of tjebo I posted a follow-up question.
Can it be done? Yes. Is it convenient? No.
If you're working in ggplot2 you can translate the plot to a gtable, a sort of intermediate between the plot specification and the actual drawing. You can then manipulate this gtable, although it is messy to work with.
First, we need to figure out where the relevant bits of our plot are in the gtable.
library(ggplot2)
library(gtable)
library(grid)
plt <- ggplot(mtcars, aes(factor(cyl), fill = factor(vs))) +
geom_bar(position = position_dodge2(preserve = "single"))
# Making gtable
gt <- ggplotGrob(plt)
gtable_show_layout(gt)
Then, we can make a new gtable with prespecified dimensions and place the bits of our old gtable into it.
# Making a new gtable
new <- gtable(widths = unit(c(12.5, 1.5, 4), "cm"),
heights = unit(9, "cm"))
# Adding main panel and axes in first cell
new <- gtable_add_grob(
new,
gt[7:9, 3:5], # If you see the layout above as a matrix, the main bits are in these rows/cols
t = 1, l = 1
)
# Finding the legend
legend <- gt$grobs[gt$layout$name == "guide-box"][[1]]
legend <- legend$grobs[legend$layout$name == "guides"][[1]]
# Adding legend in third cell
new <- gtable_add_grob(
new, legend, t = 1, l = 3
)
# Saving as raster
ragg::agg_png("test.png", width = 18, height = 9, units = "cm", res = 300)
grid.newpage(); grid.draw(new)
dev.off()
#> png
#> 2
Created on 2021-04-02 by the reprex package (v1.0.0)
The created figure should match the dimensions you're looking for.
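The hard-coded indices in gt[7:9, 3:5] come from reading the output of gtable_show_layout(), and they can differ between ggplot2 versions. A minimal sketch, assuming a single-panel plot whose layout names include "panel", "axis-l" and "axis-b" (as in recent ggplot2 releases), looks the cells up by name instead. The names core, rows, cols and main_part are illustrative.

```r
library(ggplot2)

plt <- ggplot(mtcars, aes(factor(cyl), fill = factor(vs))) +
  geom_bar(position = position_dodge2(preserve = "single"))
gt <- ggplotGrob(plt)

# Layout rows for the panel and the left/bottom axes
# (assumes these names exist, which holds for a single-panel plot)
core <- gt$layout[gt$layout$name %in% c("panel", "axis-l", "axis-b"), ]

# Smallest block of rows and columns that contains all three pieces
rows <- range(c(core$t, core$b))
cols <- range(c(core$l, core$r))

# Subset the gtable to that block
main_part <- gt[rows[1]:rows[2], cols[1]:cols[2]]
```

The resulting main_part can be passed to gtable_add_grob() in place of the index-based subset.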
Another option is to draw the three components as separate plots and stitch them together in the desired ratio.
The below comes quite close to the desired ratio, but not exactly. I guess you'd need to fiddle around with the values given the exact saving dimensions. In the example I used figure dimensions of 7x3.5 inches (which is similar to 18x9cm), and have added the black borders just to demonstrate the component limits.
library(tidyverse)
library(patchwork)
data <- midwest %>%
head(5) %>%
select(2,23:25) %>%
pivot_longer(cols=2:4,names_to="Variable", values_to="Percent") %>%
mutate(Variable=factor(Variable, levels=c("percbelowpoverty","percchildbelowpovert","percadultpoverty"),ordered=TRUE))
p1 <-
ggplot(data=data, mapping=aes(x=county, y=Percent, fill=Variable)) +
geom_col() +
scale_fill_manual(values = c("#CF232B","#942192","#000000"))
p_legend <- cowplot::get_legend(p1)
p_main <- p1 <-
ggplot(data=data, mapping=aes(x=county, y=Percent, fill=Variable)) +
geom_col(show.legend = FALSE) +
scale_fill_manual(values = c("#CF232B","#942192","#000000"))
p_main + plot_spacer() + p_legend +
plot_layout(widths = c(12.5, 1.5, 4)) &
theme(plot.margin = margin(),
plot.background = element_rect(colour = "black"))
Created on 2021-04-02 by the reprex package (v1.0.0)
update
My solution is only semi-satisfactory as pointed out by the OP. The problem is that one cannot (to my knowledge) define the position of the grob in the third panel.
Other ideas for workarounds:
One could determine the space needed for the text (though this does not seem easy) and then size the device accordingly
Create a fake legend - however, this requires the tiles / text to be aligned to the left with no margin, and this can very quickly become very hacky.
In short, I think teunbrand's solution is probably the most straightforward one.
Update 2
The problem with the left alignment should be fixed with Stefan's suggestion in this thread

R: place geom_text() relative to plot borders rather than fixed position on the plot

I am creating a number of plots using ggplot2 in R and want a way to standardize implementation of a cutoff line. I have data on a number of different measures for four cities over a ~10 year time period. I've plotted them as line graphs with each city a different color within a given graph. I will be creating a plot for each of the different measures I have (around 20).
On each of these graphs, I need to put two cutoff lines (with a word next to them) representing implementation of some policy so that people reading the graphs can easily identify the difference between performance before and after the implementation. Below is approximately the code I'm currently using.
gg_plot1<- ggplot(data=ggdata, aes(x=Year, y=measure1, group=Area, color=Area)) +
geom_vline(xintercept=2011, color="#EE0000") +
geom_text(aes(x=2011, label="City1\n", y=0.855), color="#EE0000", angle=90, hjust=0, family="serif") +
geom_vline(xintercept=2007, color="#000099") +
geom_text(aes(x=2007, label="City2", y=0.855), color="#000099", angle=0, hjust=1, family="serif") +
geom_line(size=.75) +
geom_point(size=1.5) +
scale_y_continuous(breaks=round(seq(min(ggdata$measure1, na.rm=T), max(ggdata$measure1, na.rm=T), by=0.01), 2)) +
scale_x_continuous(breaks=min(ggdata$Year):max(ggdata$Year)) +
scale_color_manual(values=c("#EE0000", "#00DDFF", "#009900", "#000099")) +
theme(axis.text.x = element_text(angle=90, vjust=1),
panel.background = element_rect(fill="white", color="white"),
panel.grid.major = element_line(color="grey95"),
text = element_text(size=11, family="serif"))
The problem with this implementation is that it relies on placing the two geom_text() on a particular place on the specific graph. These different measures all have different ranges so in order to do this I'd need to go plot by plot and find a spot to place them. What I'd prefer to do is something like force the range of each plot down by X% and put the geom_text() aligned to the bottom of the range. The lines shouldn't need adjusting (same year in every plot), just the position of the text. I found some similar questions here but none that had to do with the specific problem of placing something in the same position on different graphs with different ranges.
Is there a way to do what I'm looking for? If I had to guess, it'd be something like using relative positioning rather than absolute, but I haven't been able to find a way to do that within ggplot. For the record, I'm aware the two geom_text()s are oriented differently. I did that to compare which we preferred but left it for you all. We will ultimately be going with the one that has the text rotated 90deg. Additionally, some of these will be faceted together so that might provide an extra layer of difficulty. Haven't gotten to that point yet.
Sidebar: an alternative way to visualize this would be to change the line from solid to dotted at the cutoff year. Is this possible? I'm not sure the client would want that but I'd love to present it as an option if anyone can point me in the direction of where to learn about how to do that.
Edit to add:
Sample data which shows what happens when running it with different y-ranges
ggdata <- data.frame(Area=rep(c("City1", "City2", "City3", "City4"), times=7),
Year=c(rep(2006,4), rep(2007,4), rep(2008,4), rep(2009,4), rep(2010,4), rep(2011,4), rep(2012,4)),
measure1=rnorm(28,10,2),
measure2=rnorm(28,50,10))
Sample plot which has the geom_text()s in the proper position, but this was done using the code above with a fixed position within the plot. When I replicate the code using a different measure that has a different y-range it ends up stretching the plot window.
You can use the y-range of the data to position the text labels. I've set the y-limits explicitly in the example below, but that's not absolutely necessary unless you want to change them from the defaults. You can also adjust the x-position of the text labels using the x-range of the data. The code below will position the labels at the bottom of the plot, regardless of the y-range of the data.
I've also switched from geom_text to annotate. geom_text overplots the text labels multiple times, once for each row in the data. annotate plots the label once.
ypos = min(ggdata$measure1) + 0.005*diff(range(ggdata$measure1))
xv = 0.02
xh = 0.01
xadj = diff(range(ggdata$Year))
ggplot(data=ggdata, aes(x=Year, y=measure1, group=Area, color=Area)) +
geom_vline(xintercept=2011, color="#EE0000") +
geom_vline(xintercept=2007, color="#000099") +
geom_line(size=.75) +
geom_point(size=1.5) +
annotate(geom="text", x=2011 - xv*xadj, label="City1", y=ypos, color="#EE0000", angle=90, hjust=0, family="serif") +
annotate(geom="text", x=2007 - xh*xadj, label="City2", y=ypos, color="#000099", angle=0, hjust=1, family="serif") +
scale_y_continuous(limits=range(ggdata$measure1),
breaks=round(seq(min(ggdata$measure1, na.rm=T), max(ggdata$measure1, na.rm=T), by=1), 0)) +
scale_x_continuous(breaks=min(ggdata$Year):max(ggdata$Year)) +
scale_color_manual(values=c("#EE0000", "#00DDFF", "#009900", "#000099")) +
theme(axis.text.x = element_text(angle=90, vjust=1),
panel.background = element_rect(fill="white", color="white"),
panel.grid.major = element_line(color="grey95"),
text = element_text(size=11, family="serif"))
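A different route, not taken in the code above, is to pass y = -Inf to annotate: ggplot2 places infinite positions at the edge of the panel, so the label sticks to the bottom border whatever the y-range is, and it does not stretch the axis. The sketch below assumes made-up data in a frame called demo; the hjust and vjust values and the leading space in the label are starting points to tune, not settled choices.

```r
library(ggplot2)

set.seed(4)
# Hypothetical stand-in data with the same shape as ggdata
demo <- data.frame(Year    = rep(2006:2012, each = 4),
                   Area    = rep(paste0("City", 1:4), 7),
                   measure = rnorm(28, 50, 10))

p <- ggplot(demo, aes(Year, measure, group = Area, color = Area)) +
  geom_vline(xintercept = 2011, color = "#EE0000") +
  geom_line() +
  geom_point() +
  # y = -Inf anchors the label at the bottom panel border;
  # with angle = 90 and hjust = 0 the text runs upward from there
  annotate("text", x = 2011, y = -Inf, label = " City1",
           color = "#EE0000", angle = 90, hjust = 0, vjust = -0.5)
```

Because the position is tied to the panel rather than to a data value, the same annotate call can be reused for every measure.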
UPDATE: To respond to your comment, here's how you can create a separate plot for each "measure" column in your data frame.
First, we create reproducible data with three measure columns:
library(ggplot2)
library(gridExtra)
library(scales)
set.seed(4)
ggdata <- data.frame(Year=rep(2006:2012,each=4),
Area=rep(paste0("City",1:4), 7),
measure1=rnorm(28,10,2),
measure2=rnorm(28,50,10),
measure3=rnorm(28,-50,5))
Now, we take the code from above and package it in a function. The function takes an argument called measure_var: the name of the data column, given as a character string, that provides the y-values for the plot. Note that we now use aes_string instead of aes inside ggplot.
plot_func = function(measure_var) {
ypos = min(ggdata[ , measure_var]) + 0.005*diff(range(ggdata[ , measure_var]))
xv = 0.02
xh = 0.01
xadj = diff(range(ggdata$Year))
ggplot(data=ggdata, aes_string(x="Year", y=measure_var, group="Area", color="Area")) +
geom_vline(xintercept=2011, color="#EE0000") +
geom_vline(xintercept=2007, color="#000099") +
geom_line(size=.75) +
geom_point(size=1.5) +
annotate(geom="text", x=2011 - xv*xadj, label="City1", y=ypos,
color="#EE0000", angle=90, hjust=0, family="serif") +
annotate(geom="text", x=2007 - xh*xadj, label="City2", y=ypos,
color="#000099", angle=0, hjust=1, family="serif") +
scale_y_continuous(limits=range(ggdata[ , measure_var]),
breaks=pretty_breaks(5)) +
scale_x_continuous(breaks=min(ggdata$Year):max(ggdata$Year)) +
scale_color_manual(values=c("#EE0000", "#00DDFF", "#009900", "#000099")) +
theme(axis.text.x = element_text(angle=90, vjust=1),
panel.background = element_rect(fill="white", color="white"),
panel.grid.major = element_line(color="grey95"),
text = element_text(size=11, family="serif")) +
ggtitle(paste("Plot of", measure_var))
}
We can now run the function once like this: plot_func("measure1"). However, let's run it on all the measure columns in one go by using lapply. We give lapply a vector with the names of the measure columns (names(ggdata)[grepl("measure", names(ggdata))]), and it runs plot_func on each of these columns in turn, storing the resulting plots in the list plot_list.
plot_list = lapply(names(ggdata)[grepl("measure", names(ggdata))], plot_func)
Now if we wish, we can lay them all out together using grid.arrange. In this case, we only need one legend, rather than a separate legend for each plot, so we extract the legend as a separate graphical object and lay it out beside the three plots.
# Function to get legend from a ggplot as a separate graphical object
# Source: https://github.com/tidyverse/ggplot2/wiki/Share-a-legend-between-two-ggplot2-graphs/047381b48b0f0ef51a174286a595817f01a0dfad
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend)
}
# Get legend
leg = g_legend(plot_list[[1]])
# Lay out all of the plots together with a single legend
grid.arrange(arrangeGrob(grobs=lapply(plot_list, function(x) x + guides(colour="none"))),
leg,
ncol=2, widths=c(10,1))
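aes_string is deprecated in recent ggplot2 releases. A minimal sketch of the same column-by-name idea using the .data pronoun inside aes is shown below; it is a simplified stand-in for plot_func (no cutoff lines or theme), and the names demo_data, plot_measure, measure_cols and plots are illustrative.

```r
library(ggplot2)

set.seed(4)
# Hypothetical data with two measure columns
demo_data <- data.frame(Year     = rep(2006:2012, each = 4),
                        Area     = rep(paste0("City", 1:4), 7),
                        measure1 = rnorm(28, 10, 2),
                        measure2 = rnorm(28, 50, 10))

# measure_var is a character string naming the y column
plot_measure <- function(df, measure_var) {
  ggplot(df, aes(x = Year, y = .data[[measure_var]],
                 group = Area, color = Area)) +
    geom_line() +
    geom_point() +
    labs(y = measure_var, title = paste("Plot of", measure_var))
}

# One plot per column whose name starts with "measure"
measure_cols <- grep("^measure", names(demo_data), value = TRUE)
plots <- lapply(measure_cols, plot_measure, df = demo_data)
names(plots) <- measure_cols
```

The list plots can be handed to arrangeGrob(grobs = plots) in the same way as plot_list.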

how to tweak geom_text() in a ggplot2 function in R

I am creating a step plot using a self-defined function "stepPlot", which works. I got stuck when I attempted to put a geom_text() inside this function. Interestingly, geom_text() works when it's not in the function. Can anyone help me tweak the geom_text()? There are two parts: (1) "labelPosiX", the horizontal position of the label text, and (2) the geom_text() call at the end of the function. "labelPosiY", the vertical position of the label, will be specified manually with a number. These two lines of code have been commented out. Thanks in advance.
stepPlot <- function(Data,xVar, yVar, LegendTitle="", GroupLabels, Plottitle="",labelPosiY, labelText="A",
# plot specifications remain the same over data subsets. Ignore these setting when calling the function
GroupColour=c("black","blue","orange"), LineTypeGroup=c("solid","solid","solid"), LineSize=1,
LegendPosition=c(0.5,0.8),
YaxisTitle="", YAxisTitleSize=element_blank(),
XAxisText=element_text(size=20),AxisTextSize=15,LegendTitleSize=10, LegendTextSize=10,LegendKeySize=10,
PlotTitleSize=15
){
# define x limits (Xmin, Xmax), x break increments (BreakIncreX),level of breaks (GroupBreaks),horizontal position of label text (labelPosiX)
Xmin <- min(Data[xVar])-1
Xmax <- max(Data[xVar])+1
BreakIncreX <- round((Xmax-Xmin)/6)
GroupBreaks <-unique(Data$trt_label)
#labelPosiX <-min(Data[xVar])+2
# define y maximal limit (limitYMax),y break increments (BreakIncreY)
library(plyr)
limitYMax <- round_any(max(Data[yVar]), 100, f = ceiling)
BreakIncreY <- round_any(max(Data[yVar])/5, 100, f = ceiling)
# step plot
ggplot(Data, aes_string(x=xVar, y=yVar, group='trt_label'))+
geom_step(aes(colour=trt_label, linetype=trt_label), direction='hv',size= LineSize)+ #specify step curve from different group with colours, colour by default
scale_y_continuous(YaxisTitle, limits=c(0,limitYMax), expand=c(0,0), breaks=seq(0,limitYMax,by=BreakIncreY))+
scale_x_continuous("Age of adults in days", limits=c(Xmin, Xmax), expand=c(0,0), breaks=seq(Xmin,Xmax,by=BreakIncreX)) +
scale_colour_manual(name=LegendTitle,
breaks=GroupBreaks,
labels=GroupLabels,
values=GroupColour
)+ # change default colours to manually specified grey scale
scale_linetype_manual(name =LegendTitle,
breaks=GroupBreaks,
labels=GroupLabels,
values=LineTypeGroup
)+
guides(colour = guide_legend(LegendTitle), linetype = guide_legend(LegendTitle))+ # merge two legends into a single one
theme_bw() + # make background theme black and white
theme(axis.title.x = element_blank(), #font size of x axis title
axis.title.y = YAxisTitleSize, #font size of y axis title
axis.text.x = XAxisText, #font size of x axis text
axis.text.y = element_text(size=AxisTextSize), #font size of y axis text
legend.position=LegendPosition,
legend.title=element_text(size=LegendTitleSize), #font size of legend title
legend.text = element_text(colour="black", size = LegendTextSize, face = "bold"), #font size of legend text
legend.key.size=unit(LegendKeySize,'points'), ## ben - added to shrink the legend
legend.background=element_blank(), ## ben - added to get rid of white background
panel.grid.major = element_line(size = 0.5, colour = '#FFFFFF'),
panel.grid.minor = element_line(colour = NA), # colour = NA to suppress gridlines, reappear if colour='black'
plot.title=element_text( face="bold", size=PlotTitleSize) # adjust plot title size
)+
ggtitle(Plottitle)
# add label text
# + geom_text(aes(labelPosiX, labelPosiY, label="test"), colour="black",size=5)
}
My old way to add text works but I am hoping to move the geom_text into the function.
source("C:/Now/R/Rfunction_stepPlot.R")
fig17b <-stepPlot(Data=df17b,xVar= "age", yVar='mean_cumSumDurLeftByBeeAge',
LegendTitle="Precocious topical",GroupLabels=c("acetone", "untreated", "methoprene"),
Plottitle="weighed hive"
)+
geom_text(aes((min(df17b$age)+2), 3700, label="A"), colour="black",size=5)
Here's one way you can pass the (x,y) positions of geom_text to a function:
In general, make it a part of the data frame that ggplot is plotting. (Using SimonO101's suggestion, here's how it works for mtcars.)
plotFunction <- function (df, labelPosiY) {
df$xPos = df$cyl #add columns to the data frame
df$yPos = labelPosiY
p <- ggplot(data=df, aes(x=cyl, y=mpg)) + geom_step(aes(colour=gear), direction='hv', size=2) # direction and size are fixed settings, so they go outside aes()
p <- p + geom_text(aes(xPos, y=yPos, label="test"), colour="black",size=5)
return (p)
}
Now, calling
plotFunction(mtcars, 17)
produces
You can try making the geom_text part of your code work, and then bringing in all the other aspects of your plot.
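Another option, offered here as a sketch rather than as the approach from the answer, is to compute the label position inside the function and draw it with annotate, which takes plain values instead of aesthetics mapped from the data. The function name step_with_label and its arguments are hypothetical, and mtcars stands in for the real data.

```r
library(ggplot2)

# x_var and y_var are column names given as strings;
# label_y is the vertical position chosen by the caller
step_with_label <- function(df, x_var, y_var, label_y, label_text = "A") {
  # Horizontal position: two units right of the smallest x value
  label_x <- min(df[[x_var]], na.rm = TRUE) + 2

  ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]])) +
    geom_step(direction = "hv") +
    # annotate() draws the label once, at fixed coordinates
    annotate("text", x = label_x, y = label_y, label = label_text,
             colour = "black", size = 5)
}

p <- step_with_label(mtcars, "wt", "mpg", label_y = 30)
```

Since label_x and label_y are ordinary function variables, nothing has to be added to the data frame.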

R, ggplot - Graphs sharing the same y-axis but with different x-axis scales

Context
I have some datasets/variables and I want to plot them, but I want to do this in a compact way. To do this I want them to share the same y-axis but distinct x-axis and, because of the different distributions, I want one of the x-axis to be log scaled and the other linear scaled.
Example
Suppose I have a long tailed variable (that I want the x-axis to be log-scaled when plotted):
library(PtProcess)
library(ggplot2)
set.seed(1)
lambda <- 1.5
a <- 1
pareto <- rpareto(1000,lambda=lambda,a=a)
x_pareto <- seq(from=min(pareto),to=max(pareto),length=1000)
y_pareto <- 1-ppareto(x_pareto,lambda,a)
df1 <- data.frame(x=x_pareto,cdf=y_pareto)
ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10()
And a normal variable:
set.seed(1)
mean <- 3
norm <- rnorm(1000,mean=mean)
x_norm <- seq(from=min(norm),to=max(norm),length=1000)
y_norm <- pnorm(x_norm,mean=mean)
df2 <- data.frame(x=x_norm,cdf=y_norm)
ggplot(df2,aes(x=x,y=cdf)) + geom_line()
I want to plot them side by side using the same y-axis.
Attempt #1
I can do this with facets, which looks great, but I don't know how to make each x-axis with a different scale (scale_x_log10() makes both of them log scaled):
df1 <- cbind(df1,"pareto")
colnames(df1)[3] <- 'var'
df2 <- cbind(df2,"norm")
colnames(df2)[3] <- 'var'
df <- rbind(df1,df2)
ggplot(df,aes(x=x,y=cdf)) + geom_line() +
facet_wrap(~var,scales="free_x") + scale_x_log10()
Attempt #2
Use grid.arrange, but I don't know how to keep both plot areas with the same aspect ratio:
library(gridExtra)
p1 <- ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10() +
theme(plot.margin = unit(c(0,0,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("pareto")
p2 <- ggplot(df2,aes(x=x,y=cdf)) + geom_line() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
plot.margin = unit(c(0,0,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("norm")
grid.arrange(p1,p2,ncol=2)
PS: The number of plots may vary so I'm not looking for an answer specifically for 2 plots
Extending your attempt #2, gtable might be able to help you out. If the margins are the same in the two charts, then the only widths that change in the two plots (I think) are the spaces taken by the y-axis tick mark labels and axis text, which in turn changes the widths of the panels. Using code from here, the spaces taken by the axis text should be the same, thus the widths of the two panel areas should be the same, and thus the aspect ratios should be the same. However, the result (no margin to the right) does not look pretty. So I've added a little margin to the right of p2, then taken away the same amount to the left of p2. Similarly for p1: I've added a little to the left but taken away the same amount to the right.
library(PtProcess)
library(ggplot2)
library(gtable)
library(grid)
library(gridExtra)
set.seed(1)
lambda <- 1.5
a <- 1
pareto <- rpareto(1000,lambda=lambda,a=a)
x_pareto <- seq(from=min(pareto),to=max(pareto),length=1000)
y_pareto <- 1-ppareto(x_pareto,lambda,a)
df1 <- data.frame(x=x_pareto,cdf=y_pareto)
set.seed(1)
mean <- 3
norm <- rnorm(1000,mean=mean)
x_norm <- seq(from=min(norm),to=max(norm),length=1000)
y_norm <- pnorm(x_norm,mean=mean)
df2 <- data.frame(x=x_norm,cdf=y_norm)
p1 <- ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10() +
theme(plot.margin = unit(c(0,-.5,0,.5), "lines"),
plot.background = element_blank()) +
ggtitle("pareto")
p2 <- ggplot(df2,aes(x=x,y=cdf)) + geom_line() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
plot.margin = unit(c(0,1,0,-1), "lines"),
plot.background = element_blank()) +
ggtitle("norm")
gt1 <- ggplotGrob(p1)
gt2 <- ggplotGrob(p2)
newWidth = unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
gt1$widths[2:3] = as.list(newWidth)
gt2$widths[2:3] = as.list(newWidth)
grid.arrange(gt1, gt2, ncol=2)
EDIT
To add a third plot to the right, we need to take more control over the plotting canvas. One solution is to create a new gtable that contains space for the three plots and an additional space for a right margin. Here, I let the margins in the plots take care of the spacing between the plots.
p1 <- ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10() +
theme(plot.margin = unit(c(0,-2,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("pareto")
p2 <- ggplot(df2,aes(x=x,y=cdf)) + geom_line() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
plot.margin = unit(c(0,-2,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("norm")
gt1 <- ggplotGrob(p1)
gt2 <- ggplotGrob(p2)
newWidth = unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
gt1$widths[2:3] = as.list(newWidth)
gt2$widths[2:3] = as.list(newWidth)
# New gtable with space for the three plots plus a right-hand margin
gt = gtable(widths = unit(c(1, 1, 1, .3), "null"), heights = unit(1, "null"))
# Insert gt1, gt2 and gt2 into the new gtable
gt <- gtable_add_grob(gt, gt1, 1, 1)
gt <- gtable_add_grob(gt, gt2, 1, 2)
gt <- gtable_add_grob(gt, gt2, 1, 3)
grid.newpage()
grid.draw(gt)
The accepted answer is exactly what makes people run when it comes to plotting with R! This is my solution:
library('grid')
g1 <- ggplot(...) # however you draw your 1st plot
g2 <- ggplot(...) # however you draw your 2nd plot
grid.newpage()
grid.draw(cbind(ggplotGrob(g1), ggplotGrob(g2), size = "last"))
This takes care of the y axis (minor and major) guide-lines to align in multiple plots, effortlessly.
Dropping some axis text, unifying the legends, ..., are other tasks that can be taken care of while creating the individual plots, or by using other means provided by grid or gridExtra packages.
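A runnable version of the snippet above might look like the following sketch. The two plots and their data are made up for illustration; it assumes both gtables have the same number of rows, which is the case for ordinary single-panel ggplots.

```r
library(ggplot2)
library(grid)

set.seed(1)
# Hypothetical data: one skewed variable, one roughly normal one
d1 <- data.frame(x = sort(rexp(200)) + 0.01, cdf = seq(0.005, 1, length.out = 200))
d2 <- data.frame(x = sort(rnorm(200, 3)), cdf = seq(0.005, 1, length.out = 200))

g1 <- ggplot(d1, aes(x, cdf)) + geom_line() + scale_x_log10() + ggtitle("skewed")
g2 <- ggplot(d2, aes(x, cdf)) + geom_line() + ggtitle("normal")

# cbind() on gtables joins them side by side; size = "last" takes
# the row heights from the last table so the panels line up
combined <- cbind(ggplotGrob(g1), ggplotGrob(g2), size = "last")

grid.newpage()
grid.draw(combined)
```

More plots can be added by passing further ggplotGrob() results to the same cbind call.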
The accepted answer looks a little too daunting to me, so I found two ways to get around it with less effort. Both are based on the grid.arrange() method from your Attempt #2.
1. Make plot 1 no y-axis as well
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank())
So all the plots will be the same. You won't have problems with different aspect ratios. You will need to generate a separate y-axis with R or your favorite image editing app.
2. Fix and respect the aspect ratio
Add aspect.ratio = 1 (or whatever ratio you desire) to the theme() of the individual plots. Then use respect=TRUE in your grid.arrange() call.
This way you can keep the y-axis in plot 1 and still maintain the aspect ratio in all plots. Inspired by this answer.
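A minimal sketch of this second option, with made-up plots standing in for the pareto and normal ones: each plot fixes its panel shape through theme(aspect.ratio = 1), and respect = TRUE is passed through to the layout. The object names and data are assumptions for illustration.

```r
library(ggplot2)
library(gridExtra)

# Fixing the panel aspect ratio in a shared theme object
square <- theme(aspect.ratio = 1)

p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_line() + scale_x_log10() +
  ggtitle("log x") + square
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_line() +
  theme(axis.title.y = element_blank()) +
  ggtitle("linear x") + square

# arrangeGrob() accepts respect; grid.arrange() takes the same argument
both <- arrangeGrob(p1, p2, ncol = 2, respect = TRUE)
grid::grid.newpage()
grid::grid.draw(both)
```

Both panels stay square even though only the first plot keeps its y-axis title.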
Hope you find these helpful!

Remove facet_wrap labels completely

I'd like to remove the labels for the facets completely to create a sort of sparkline effect, as for the audience the labels are irrelevant, the best I can come up with is:
library(MASS)
library(ggplot2)
qplot(week,y,data=bacteria,group=ID, geom=c('point','line'), xlab='', ylab='') +
facet_wrap(~ID) +
theme(strip.text.x = element_text(size=0))
So can I get rid of the (now blank) strip.background completely to allow more space for the "sparklines"?
Or alternatively is there a better way to get this "sparkline" effect for a large number of binary valued time-series like this?
For ggplot v2.1.0 or higher, use element_blank() to remove unwanted elements:
library(MASS) # To get the data
library(ggplot2)
qplot(
week,
y,
data = bacteria,
group = ID,
geom = c('point', 'line'),
xlab = '',
ylab = ''
) +
facet_wrap(~ ID) +
theme(
strip.background = element_blank(),
strip.text.x = element_blank()
)
In this case, the element you're trying to remove is called strip.
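qplot is deprecated in recent ggplot2 releases, so here is a hedged equivalent written with ggplot(). It assumes the MASS package is installed for the bacteria data; the object name p is illustrative.

```r
library(ggplot2)

# bacteria ships with the MASS package
data(bacteria, package = "MASS")

p <- ggplot(bacteria, aes(week, y, group = ID)) +
  geom_point() +
  geom_line() +
  facet_wrap(~ ID) +
  labs(x = NULL, y = NULL) +
  # Blank both the strip background and the strip text
  theme(strip.background = element_blank(),
        strip.text = element_blank())
```

Printing p draws the faceted panels without any strip labels.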
Alternative using ggplot grob layout
In older versions of ggplot (before v2.1.0), the strip text occupies rows in the gtable layout.
element_blank removes the text and the background, but it does not remove the space that the row occupied.
This code removes those rows from the layout:
library(MASS) # To get the data
library(ggplot2)
library(grid)
p <- qplot(
week,
y,
data = bacteria,
group = ID,
geom = c('point', 'line'),
xlab = '',
ylab = ''
) +
facet_wrap(~ ID)
# Get the ggplot grob
gt <- ggplotGrob(p)
# Locate the tops of the plot panels
panels <- grep("panel", gt$layout$name)
top <- unique(gt$layout$t[panels])
# Remove the rows immediately above the plot panel
gt = gt[-(top-1), ]
# Draw it
grid.newpage()
grid.draw(gt)
I'm using ggplot2 version 1 and the commands required have changed.
Instead of
ggplot() ... +
opts(strip.background = theme_blank(), strip.text.x = theme_blank())
you now use
ggplot() ... +
theme(strip.background = element_blank(), strip.text = element_blank())
For more detail see http://docs.ggplot2.org/current/theme.html
Sandy's updated answer seems good, but it has possibly been rendered obsolete by updates to ggplot. From what I can tell, the following code (a simplified version of Sandy's original answer) reproduces Sean's original graph without any extra space:
library(MASS) # To get the data
library(ggplot2)
library(grid)
qplot(week,y,data=bacteria,group=ID, geom=c('point','line'), xlab='', ylab='') +
facet_wrap(~ID) +
theme(strip.text.x = element_blank())
I am using ggplot 2.0.0.
As near as I can tell, Sandy's answer is correct, but I think it's worth mentioning that there seems to be a small difference between the width of a plot with no facets and the width of a plot with the facets removed.
It isn't obvious unless you're looking for it but, if you stack plots using the viewport layouts that Wickham recommends in his book, the difference becomes apparent.
