Constant width in ggplot barplots - r

How can I keep the bar width and the spacing between bars fixed across several ggplot bar plots when each plot has a different number of bars?
Here is a failed try:
m <- data.frame(x=1:10,y=runif(10))
ggplot(m, aes(x,y)) + geom_bar(stat="identity")
ggplot(m[1:3,], aes(x,y)) + geom_bar(stat="identity")
Adding width=1 to geom_bar(...) doesn't help either. I need the second plot to be narrower automatically, with the same bar width and spacing as the first one.

Edit:
It appears the OP simply wants this (with p1 and p2 defined as in the code further down):
library(gridExtra)
grid.arrange(p1,arrangeGrob(p2,widths=c(1,2),ncol=2), ncol=1)
I am not sure whether it's possible to pass absolute widths to geom_bar, so here is an ugly hack:
set.seed(42)
m <- data.frame(x=1:10,y=runif(10))
p1 <- ggplot(m, aes(x,y)) + geom_bar(stat="identity")
p2 <- ggplot(m[1:3,], aes(x,y)) + geom_bar(stat="identity")
g1 <- ggplotGrob(p1)
g2 <- ggplotGrob(p2)
I used str() to find the correct grob and child; the indices [[4]] and [[2]] below depend on the plot and on the ggplot2 version. You could use more sophisticated methods to generalize this if necessary.
#store the old widths
old.unit <- g2$grobs[[4]]$children[[2]]$width[[1]]
#change the widths
g2$grobs[[4]]$children[[2]]$width <- rep(g1$grobs[[4]]$children[[2]]$width[[1]],
length(g2$grobs[[4]]$children[[2]]$width))
#copy the attributes (units)
attributes(g2$grobs[[4]]$children[[2]]$width) <- attributes(g1$grobs[[4]]$children[[2]]$width)
#position adjustment (why are the bars justified left???)
d <- (old.unit-g2$grobs[[4]]$children[[2]]$width[[1]])/2
attributes(d) <- attributes(g2$grobs[[4]]$children[[2]]$x)
g2$grobs[[4]]$children[[2]]$x <- g2$grobs[[4]]$children[[2]]$x+d
#plot
grid.arrange(g1,g2)
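As an alternative to editing grobs, here is a minimal sketch of a different technique: putting both data sets into one plot and letting facet_grid() size the panels with space = "free_x". This is not the method used in the answer above, and it places the panels side by side rather than stacking them. The `plot` column and its labels are assumptions made for this illustration.

```r
library(ggplot2)

set.seed(42)
m <- data.frame(x = 1:10, y = runif(10))

# Stack the full data and the three-row subset, labelling each copy.
# The `plot` column is hypothetical; it only exists to drive the facets.
both <- rbind(
  cbind(m, plot = "all ten"),
  cbind(m[1:3, ], plot = "first three")
)

# scales = "free_x" lets each panel keep only its own bars;
# space = "free_x" makes panel width proportional to its x range,
# so bars end up the same width in both panels.
p <- ggplot(both, aes(factor(x), y)) +
  geom_col() +
  facet_grid(. ~ plot, scales = "free_x", space = "free_x")
p
```

Because the panel widths follow the number of bars, the three-bar panel is drawn narrower without touching any grob internals.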

I wrapped the suggestion above in a function that needs only a single graph (a gtable, as returned by ggplotGrob).
fixedWidth <- function(graph, width=0.1) {
  g2 <- graph
  #store the old widths
  old.unit <- g2$grobs[[4]]$children[[2]]$width[[1]]
  original.attributes <- attributes(g2$grobs[[4]]$children[[2]]$width)
  #change the widths
  g2$grobs[[4]]$children[[2]]$width <- rep(width,
                                           length(g2$grobs[[4]]$children[[2]]$width))
  #copy the attributes (units)
  attributes(g2$grobs[[4]]$children[[2]]$width) <- original.attributes
  #position adjustment (why are the bars justified left???)
  d <- (old.unit-g2$grobs[[4]]$children[[2]]$width[[1]])/2
  attributes(d) <- attributes(g2$grobs[[4]]$children[[2]]$x)
  g2$grobs[[4]]$children[[2]]$x <- g2$grobs[[4]]$children[[2]]$x+d
  return(g2)
}
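The hard-coded indices [[4]] and [[2]] are the fragile part of this approach. The sketch below shows one possible way to look the panel up by its layout name and to pick out candidate bar grobs by class. It assumes that the bars are drawn by a direct child of class "rect", which may not hold for every ggplot2 version or geom, so inspect the result before modifying anything.

```r
library(ggplot2)

set.seed(42)
m <- data.frame(x = 1:10, y = runif(10))
g <- ggplotGrob(ggplot(m[1:3, ], aes(x, y)) + geom_bar(stat = "identity"))

# Locate the panel by its layout name instead of assuming it is grobs[[4]].
panel_idx <- which(g$layout$name == "panel")
panel <- g$grobs[[panel_idx]]

# Assumption: the bars live in a direct child of class "rect".
is_rect <- vapply(panel$children, inherits, logical(1), what = "rect")

# Names of the candidate children; check these with str() before editing.
names(panel$children)[is_rect]
```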

Related

Fix Plot Size in ggplot2 relative to plot title

I'm using ggplot2 to create some figures with titles, but finding that when titles have a descender (e.g., lowercase p, q, g, y) the actual size of the plot shrinks slightly to accommodate the larger space needed by the title.
Are there ways within normal ggplot functionality to fix the plot size so that figures are in 100% consistent position regardless of title?
Here's some quick sample code that shows the issue; folks might need to run code locally to clearly see the differences in the images.
library(ggplot2)
# No letters with descenders in title
ggplot(data=mtcars,aes(x=disp,y=mpg)) +
geom_point() + ggtitle("Scatter Plot")
# Title has a descender (lowercase 'p')
ggplot(data=mtcars,aes(x=disp,y=mpg)) +
geom_point() + ggtitle("Scatter plot")
You can set the relevant height in the gtable:
library(ggplot2)
p1 <- ggplot() + ggtitle("a")
p2 <- ggplot() + ggtitle("a\nb")
gl <- lapply(list(p1,p2), ggplotGrob)
th <- do.call(grid::unit.pmax, lapply(gl, function(g) g$heights[3]))
gl <- lapply(gl, function(g) {g$heights[3] <- th; g})
gridExtra::grid.arrange(grobs = gl, nrow=1)
Edit: for simplicity, here's how to edit a single plot:
g = ggplotGrob(qplot(1,1) + ggtitle('title'))
g$heights[3] = grid::unit(3,"line")
grid::grid.draw(g)
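The row index 3 is specific to the gtable layout of the ggplot2 version in use. A hedged variant is to look the title row up by name in g$layout. The sketch below assumes the layout contains an entry called "title", which is the case in recent ggplot2 releases; the 2-line height is an arbitrary value chosen for illustration.

```r
library(ggplot2)
library(grid)

g <- ggplotGrob(
  ggplot(mtcars, aes(disp, mpg)) + geom_point() + ggtitle("Scatter plot")
)

# Find the gtable row that holds the title instead of hard-coding 3.
title_row <- g$layout$t[g$layout$name == "title"]

# Fix that row to a constant height so descenders cannot change it.
g$heights[title_row] <- unit(2, "lines")

grid.newpage()
grid.draw(g)
```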

ggplot: clipping lines between facets

Say I have a plot like this:
# Load libraries
library(ggplot2)
library(grid)
# Load data
data(mtcars)
# Plot results
p <- ggplot(data = mtcars)
p <- p + geom_bar(aes(cyl))
p <- p + coord_flip()
p <- p + facet_wrap(~am)
print(p)
Now, I want to plot lines all the way across both facets where the bars are. I add this:
p <- p + geom_vline(aes(xintercept = cyl))
which adds the lines, but they don't cross both facets. So, I try to turn off clipping using this solution:
# Turn off clipping
gt <- ggplot_gtable(ggplot_build(p))
gt$layout$clip[gt$layout$name == "panel"] <- "off"
# Plot results
grid.draw(gt)
but that doesn't solve the problem: the lines are still clipped. So, I wondered if this is specific to geom_vline and tried approaches with geom_abline and geom_line (the latter with values across ±Inf), but the results are the same. In other posts, the clipping solution seems to work for text and points, but presumably in this case the lines are only defined within the limits of the figure. (I even tried gt$layout$clip <- "off" to switch off all possible clipping, but that didn't solve the problem.) Is there a workaround?
library(grid)
library(gtable)
# Starting from your plot `p`
gb <- ggplot_build(p)
g <- ggplot_gtable(gb)
# Get position of y-axis tick marks
# (panel_ranges is the name used by ggplot2 2.2.x; from 3.0.0 on it is panel_params)
ys <- gb$layout$panel_ranges[[1]][["y.major"]]
# Add segments at these positions
# subset `ys` if you only want to add a few
# have a look at g$layout for relevant `l` and `r` positions
g <- gtable_add_grob(g, segmentsGrob(y0=ys, y1=ys,
gp=gpar(col="red", lty="dashed")),
t = 7, l = 4, r=8)
grid.newpage()
grid.draw(g)
See "ggplot, drawing multiple lines across facets" for how to rescale values for more general plotting, i.e.
data2npc <- function(x, panel = 1L, axis = "x") {
  # gb is the ggplot_build() result from above
  range <- gb$layout$panel_ranges[[panel]][[paste0(axis,".range")]]
  scales::rescale(c(range, x), c(0,1))[-c(1,2)]
}
start <- sapply(c(4,6,8), data2npc, panel=1, axis="y")
g <- gtable_add_grob(g, segmentsGrob(y0=start, y1=start),
                     t=7, l=4, r=8)
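The values t = 7, l = 4 and r = 8 come from inspecting g$layout and can differ between ggplot2 versions. A small sketch of reading them from the layout instead is shown below; it assumes the panel entries have names starting with "panel", and the `span` list is only a name chosen for this illustration.

```r
library(ggplot2)

p <- ggplot(mtcars) + geom_bar(aes(cyl)) + coord_flip() + facet_wrap(~am)
g <- ggplotGrob(p)

# Rows of the layout table that describe the facet panels.
panels <- g$layout[grepl("^panel", g$layout$name), c("name", "t", "l", "b", "r")]
print(panels)

# Cell extent that covers every panel, usable as t/b/l/r in gtable_add_grob().
span <- list(t = min(panels$t), b = max(panels$b),
             l = min(panels$l), r = max(panels$r))
```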

ggplot2 can't draw a correct plot with only two or three data points

I'm using R to generate some plots of some metrics and getting nice results like this for data that has > 3 data points:
However, I'm noticing that for data with only a few values - I get very poor results.
If I draw a plot with only two data points, I get a blank plot.
foo_two_points.dat
cluster,account,current_database,action,operation,count,day
cluster19,col0063,col0063,foo_two,two_bar,10,2016-10-04 00:00:00-07:00
cluster61,dwm4944,dwm4944,foo_two,two_bar,2,2016-12-14 00:00:00-08:00
If I draw one data point, it works.
foo_one_point.dat
cluster,account,current_database,action,operation,count,day
cluster1,foo0424,foo0424,fooone,,2,2016-11-01 00:00:00-07:00
Three, it almost works, but isn't accurate.
foo_three_points.dat
cluster,account,current_database,action,operation,count,day
cluster23,col2225,col2225,foo_three,bar,9,2016-12-22 00:00:00-08:00
cluster23,col2225,col2225,foo_three,bar,1,2016-12-29 00:00:00-08:00
cluster12,red1782,red1782,foo_three,bar,2,2016-10-25 00:00:00-07:00
4, 5, etc. all seem fine
But two or three points - nope.
Here is my plot.r file:
library(ggplot2)
library(scales)
args<-commandArgs(TRUE)
filename<-args[1]
n = nchar(filename) - 4
thetitle = substring(filename, 1, n)
print(thetitle)
png_filename <- stringi::stri_flatten(stringi::stri_join(c(thetitle,'.png')))
wide<-as.numeric(args[2])
high<-as.numeric(args[3])
legend_left<-as.numeric(args[4])
pos <- if(legend_left == 1) c(1,0) else c(0,1)
place <- if(legend_left == 1) 'left' else 'right'
print(wide)
print(high)
print(filename)
print(png_filename)
dat = read.csv(filename)
dat$account = as.character(dat$account)
dat$action=as.character(dat$action)
dat$operation = as.character(dat$operation)
dat$count = as.integer(dat$count)
dat$day = as.Date(dat$day)
dat[is.na(dat)]<-"N/A"
png(png_filename,width=wide,height=high)
p <- ggplot(dat, aes(x=day, y=count, fill=account, labels=TRUE))
p <- p + geom_histogram(stat="identity")
p <- p + scale_x_date(labels=date_format("%b-%Y"), limits=as.Date(c('2016-10-01','2017-01-01')))
p <- p + theme(legend.position="bottom")
p <- p + guides(fill=guide_legend(nrow=5, byrow=TRUE))
p <- p + theme(text = element_text(size=15))
p<-p+labs(title=thetitle)
print(p)
dev.off()
Here's the command I use to run it:
Rscript plot.r foo_five_points.dat 1600 800 0
What am I doing wrong?
I don't know whether this is a bug; I think it is actually by design, and the bars are being clipped because they spill over the limits.
I also think this is more a job for geom_bar than geom_histogram, since this doesn't look like distribution data, but that is irrelevant to the issue: both behave the same.
One solution is to set the width parameter explicitly in geom_histogram instead of letting it be calculated:
p <- ggplot(dat, aes(x=day, y=count, fill=account, labels=TRUE))
p <- p + geom_histogram(stat="identity",width=1)
p <- p + scale_x_date(labels=date_format("%b-%Y"), limits=as.Date(c('2016-10-1','2017-01-01')))
p <- p + theme(legend.position="bottom")
p <- p + guides(fill=guide_legend(nrow=5, byrow=TRUE))
p <- p + theme(text = element_text(size=15))
p<-p+labs(title=thetitle)
Then your two-point example, which is blank above, gives you this, which seems right:
I can't be sure that setting the width explicitly will work when you have a lot of data and the bars need to keep getting smaller, though; I suppose you could set it conditionally.
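One way to set the width conditionally can be sketched as follows. The helper capped_bar_width() is hypothetical, not part of ggplot2: it mimics the default of 90% of the smallest gap between distinct x values, but caps the result so that sparse data cannot produce bars wide enough to spill past the limits. The 7-day cap is an arbitrary assumption.

```r
# Hypothetical helper: bar width in days, capped at `max_width`.
capped_bar_width <- function(days, max_width = 7) {
  gaps <- diff(sort(unique(as.numeric(days))))
  if (length(gaps) == 0) return(1)      # single date: fall back to 1 day
  min(0.9 * min(gaps), max_width)       # 90% of the smallest gap, capped
}

two_points <- as.Date(c("2016-10-04", "2016-12-14"))
daily      <- as.Date("2016-10-01") + 0:30

capped_bar_width(two_points)  # gap of 71 days, so the cap applies
capped_bar_width(daily)       # gap of 1 day, so 0.9 is used
```

The result could then be passed as geom_histogram(stat="identity", width=capped_bar_width(dat$day)).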

ggplot, drawing multiple lines across facets

I drew two panels in a column using ggplot2 facet, and would like to add two vertical lines across the panels at x = 4 and 8. The following is the code:
library(ggplot2)
library(gtable)
library(grid)
dat <- data.frame(x=rep(1:10,2),y=1:20+rnorm(20),z=c(rep("A",10),rep("B",10)))
P <- ggplot(dat,aes(x,y)) + geom_point() + facet_grid(z~.) + xlim(0,10)
Pb <- ggplot_build(P);Pg <- ggplot_gtable(Pb)
for (i in c(4,8)){
Pg <- gtable_add_grob(Pg, moveToGrob(i/10,0),t=8,l=4)
Pg <- gtable_add_grob(Pg, lineToGrob(i/10,1),t=6,l=4)
}
Pg$layout$clip <- "off"
grid.newpage()
grid.draw(Pg)
The above code is modified from: ggplot, drawing line between points across facets.
There are two problems in this figure. First, only one vertical line is shown; it seems that moveToGrob only worked once. Second, the line that is shown is not exactly at x = 4. I couldn't find the Pb$panel$ranges variable, so is there a way to correct the range as well? Thanks a lot.
Updated to ggplot2 V3.0.0
In the simple scenario where panels have common axes and the lines extend across the full y range, you can draw lines over the whole gtable cells once you have found the correct npc coordinate conversion (cf. the previous post, updated because ggplot2 keeps changing):
library(ggplot2)
library(gtable)
library(grid)
dat <- data.frame(x=rep(1:10,2),y=1:20+rnorm(20),z=c(rep("A",10),rep("B",10)))
p <- ggplot(dat,aes(x,y)) + geom_point() + facet_grid(z~.) + xlim(0,10)
pb <- ggplot_build(p)
pg <- ggplot_gtable(pb)
data2npc <- function(x, panel = 1L, axis = "x") {
range <- pb$layout$panel_params[[panel]][[paste0(axis,".range")]]
scales::rescale(c(range, x), c(0,1))[-c(1,2)]
}
start <- sapply(c(4,8), data2npc, panel=1, axis="x")
pg <- gtable_add_grob(pg, segmentsGrob(x0=start, x1=start, y0=0, y1=1, gp=gpar(lty=2)), t=7, b=9, l=5)
grid.newpage()
grid.draw(pg)
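The rescaling also explains why using i/10 in the question put the line slightly off. The arithmetic can be sketched in base R, assuming the default continuous-scale expansion of 5% on each side, so that xlim(0, 10) yields a panel range of -0.5 to 10.5. The helper to_npc() is a name made up for this illustration.

```r
# Map data values to npc (0..1) coordinates for a given panel range.
to_npc <- function(x, range) (x - range[1]) / (range[2] - range[1])

limits   <- c(0, 10)
# Assumed default expansion: 5% of the limits' width on each side.
expanded <- limits + c(-1, 1) * 0.05 * diff(limits)

to_npc(c(4, 8), limits)    # naive i/10: 0.4 and 0.8
to_npc(c(4, 8), expanded)  # 4.5/11 and 8.5/11, about 0.409 and 0.773
```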
You can just use geom_vline and avoid the grid mess altogether:
ggplot(dat, aes(x, y)) +
geom_point() +
geom_vline(xintercept = c(4, 8)) +
facet_grid(z ~ .) +
xlim(0, 10)

How to decrease space from boxplot to edge with ggplot2?

I want to put a boxplot beneath a histogram. I already figured out how to do this, but the boxplot and the histogram are equally sized and I want to slim down the boxplot.
When I decrease the width, the spaces to the edges stay the same. However, I want to decrease the width of the whole thing.
Here is what I have so far:
library(ggplot2)
h = ggplot(mtcars, aes(x=hp)) +
geom_histogram(aes(y = ..density..)) +
scale_x_continuous(breaks=c(100, 200, 300), limits=c(0,400)) +
geom_density(linetype="dotted")
b <- ggplot(mtcars,aes(x=factor(0),hp))+geom_boxplot(width=0.1) +
coord_flip(ylim=c(0,400)) +
scale_y_continuous(breaks=c(100, 200, 300)) +
theme(axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank())
library(gridExtra)
plots <- list(h, b)
grobs <- list()
widths <- list()
for (i in 1:length(plots)){
grobs[[i]] <- ggplotGrob(plots[[i]])
widths[[i]] <- grobs[[i]]$widths[2:5]
}
maxwidth <- do.call(grid::unit.pmax, widths)
for (i in 1:length(grobs)){
grobs[[i]]$widths[2:5] <- as.list(maxwidth)
}
do.call("grid.arrange", c(grobs, ncol=1))
Edit:
If I use grid.arrange() like so:
grid.arrange(heights=c(4,1), h, b)
The proportions are exactly what I wanted, but I cannot figure out how to adjust my first example above so that the axes are aligned again.
Anyone?
You just need to use your width-corrected grobs, not the original plots, in the grid.arrange call.
grid.arrange(heights = c(4, 1), grobs[[1]], grobs[[2]])

Resources