Know proportions of ggplot2 plot - r

I usually save plots from ggplot2 using the png device. The width and height of the output are set by the arguments of the function. Blank zones are drawn when the "natural proportions" of the graph don't suit the proportions of the device. To avoid this and use the whole canvas, the proportions of the plot must be known. Is there a way to find this value without trial and error?
This code can be used as an example:
library(ggplot2)
x <- seq(from = 0, to = 1, by = 0.1)
y <- seq(from = 1, to = 2, by = 0.1)
df <- expand.grid(x = x, y = y)
# One random value per row of the grid (nrow, not ncol)
df <- cbind(df, z = rnorm(nrow(df), 0, 1))
p <- ggplot(df, aes(x,y, fill = z)) + geom_raster() + coord_fixed()
ppi <- 300
# The factor 0.4 approximately converts centimeters to inches (1 in = 2.54 cm)
png("plot.png", width = 16*0.4*ppi, height = 20*0.4*ppi, res = ppi)
print(p)
dev.off()
It can be seen that some blank space is added at the top and at the bottom to fill the png file. This could be easily corrected by using a proportion different from 20/16, which is not optimal.

You can modify the ratio arg inside coord_fixed():
p <- ggplot(df, aes(x,y, fill = z)) +
geom_raster() +
coord_fixed(ratio = 20/16)
Alternatively, you can specify aspect.ratio inside theme():
p <- ggplot(df, aes(x,y, fill = z)) +
geom_raster() +
theme(aspect.ratio = 20/16)
The result is the same in both cases.
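As a rough way to find the proportions without trial and error, the panel shape implied by coord_fixed() can be computed from the data spans. This is only a sketch: panel_aspect is a hypothetical helper, not a ggplot2 function, and it ignores the space taken by axes, the legend and plot margins, so the device proportions it suggests are approximate.

```r
# Hypothetical helper: panel height / width implied by coord_fixed(ratio),
# i.e. ratio * (y span) / (x span). Axes, legend and margins are ignored.
panel_aspect <- function(x_span, y_span, ratio = 1) {
  ratio * diff(range(y_span)) / diff(range(x_span))
}

# The raster tiles extend half a cell (0.05) beyond the data on each side.
x_span <- c(0 - 0.05, 1 + 0.05)
y_span <- c(1 - 0.05, 2 + 0.05)

asp_default <- panel_aspect(x_span, y_span)                 # coord_fixed()
asp_tall    <- panel_aspect(x_span, y_span, ratio = 20 / 16)

# Pick the device height from the chosen width (in inches).
width_in  <- 16 * 0.4
height_in <- width_in * asp_tall
```

With the default ratio the panel is square, which is why a 16 x 20 device leaves blank bands; with ratio = 20/16 the panel proportion matches the device.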

Related

How to plot chart by ggplot2 with fixed scale?

I need to draw different kinds of pictures using ggplot2 in R, and they have to be in the proper scale when I print them. That means I have to define the exact image and chart size, with the same scale on the x and y axes.
Could you please advise me which options I should use?
ggplot(Surface, aes(x, y, z = z)) +
geom_contour_filled(binwidth = 10, alpha = 0.6) +
geom_textcontour(binwidth = 20, size = 2.5)
Maybe you want to use coord_fixed() with its ratio argument, which the documentation describes as the "aspect ratio, expressed as y / x".
Here is your code with it added:
library(ggplot2)
library(geomtextpath)  # assumption: geom_textcontour() comes from this package
ggplot(Surface, aes(x, y, z = z)) +
geom_contour_filled(binwidth = 10, alpha = 0.6) +
geom_textcontour(binwidth = 20, size = 2.5) +
coord_fixed(ratio = 2/1)
If you meant geom_text_contour from the metR package, then we could do the following.
This example is taken from "How to add labels in a contour plot using ggplot2?" (credits to @bbiasi):
library(metR)
library(tidyverse)
library(data.table)
v <- data.table::melt(volcano)
ggplot(v, aes(Var1, Var2)) +
ggplot2::geom_contour(aes(z = value))+
metR::geom_text_contour(aes(z = value))
# adding ggsave with width and height
ggsave("test_myplot.png", width = 9, height = 7)

Is it possible to align x axis title to a value of the axis?

Having a tibble and a simple scatterplot:
p <- tibble(
x = rnorm(50, 1),
y = rnorm(50, 10)
)
ggplot(p, aes(x, y)) + geom_point()
I get something like this:
I would like to align (center, left, right, as the case may be) the title of the x-axis, here rather blandly x, with a specific value on the axis, say the off-center 0 in this case. Is there a way to do that declaratively, without having to resort to the dumb (as in "free of context") trial-and-error element_text(hjust=??)? The ?? are rather appropriate here because every value is a result of experimentation (my screen and PDF export in RStudio never agree on quite some plot elements). Any change in the data or the dimensions of the rendering may (or may not) invalidate the hjust value, and I am looking for a solution that gracefully repositions itself, much like the axes do.
Following the suggestions in the comments by @tjebo, I dug a little deeper into the coordinate spaces. hjust = 0.0 and hjust = 1.0 clearly align the label with the Cartesian coordinate system extent (but magically left-aligned and right-aligned, respectively), so when I set specific limits, calculating the exact value of hjust is straightforward (aiming for 0 gives hjust = (0 - -1.5) / (3.5 - -1.5) = 0.3):
ggplot(p, aes(x, y)) +
geom_point() +
coord_cartesian(ylim = c(8, 12.5), xlim = c(-1.5, 3.5), expand=FALSE) +
theme(axis.title.x = element_text(hjust = 0.3))
This gives an acceptable result for a label like x, but for longer labels the alignment is off again:
ggplot(p %>% mutate(`Longer X label` = x), aes(x = `Longer X label`, y = y)) +
geom_point() +
coord_cartesian(ylim = c(8, 12.5), xlim = c(-1.5, 3.5), expand=FALSE) +
theme(axis.title.x = element_text(hjust = 0.3))
Any further suggestions much appreciated.
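The hjust arithmetic used in the question can be wrapped in a small helper so it follows the limits instead of being typed by hand. This is a minimal sketch; hjust_for is a hypothetical name, and it assumes the limits are set explicitly with expand = FALSE, as in the code above. It does not address the drift seen with longer labels.

```r
# Hypothetical helper: fraction of the x extent at which `target` sits,
# suitable for element_text(hjust = ...) when limits are fixed and unexpanded.
hjust_for <- function(target, xlim) {
  (target - xlim[1]) / (xlim[2] - xlim[1])
}

xlim <- c(-1.5, 3.5)
h0 <- hjust_for(0, xlim)  # same value as the hand calculation in the question
```

It would then be used as theme(axis.title.x = element_text(hjust = hjust_for(0, xlim))).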
Another option (different enough hopefully to justify the second answer) is as already mentioned to create the annotation as a separate plot. This removes the range problem. I like {patchwork} for this.
library(tidyverse)
library(patchwork)
p <- tibble( x = rnorm(50, 1), y = rnorm(50, 10))
p1 <- tibble( x = rnorm(50, 1), y = 100*rnorm(50, 10))
## I like to define constants outside my ggplot call
mylab <- "longer_label"
x_demo <- c(-1, 2)
demo_fct <- function(p){
p1 <- ggplot(p, aes(x, y)) +
geom_point() +
labs(x = NULL) +
theme(plot.margin = margin())
p2 <- ggplot(p, aes(x, y)) +
## you need that for your correct alignment with the first plot
geom_blank() +
annotate(geom = "text", x = x_demo, y = 1,
label = mylab, hjust = 0) +
theme_void() +
# you need that for those annoying margin reasons
coord_cartesian(clip = "off")
p1 / p2 + plot_layout(heights = c(1, .05))
}
demo_fct(p) + plot_annotation(title = "demo1 with x at -1 and 2")
demo_fct(p1) + plot_annotation(title = "demo2 with larger data range")
Created on 2021-12-04 by the reprex package (v2.0.1)
I still think you will fare better, and more easily, with custom annotation. There are typically two ways to do that: either direct labelling with a text layer (for single labels I prefer annotate(geom = "text")), or creating a separate plot and stitching both together, e.g. with patchwork.
The biggest challenge is the positioning in the y dimension. For this I typically take a semi-automatic approach where I only need to define one constant and set the coordinates relative to the data range, so changes in range should in theory not matter much (they still do a bit, because the panel dimensions also change). Below are examples of exact label positioning for two different data ranges, using the same constant for both.
library(tidyverse)
# I only need patchwork for demo purpose, it is not required for the answer
library(patchwork)
p <- tibble( x = rnorm(50, 1), y = rnorm(50, 10))
p1 <- tibble( x = rnorm(50, 1), y = 100*rnorm(50, 10))
## I like to define constants outside my ggplot call
y_fac <- .1
mylab <- "longer_label"
x_demo <- c(-1, 2)
demo_fct <- function(df, x) {map(x, ~{
## I like to define constants outside my ggplot call
ylims <- range(df$y)
ggplot(df, aes(x, y)) +
geom_point() +
## set hjust = 0 for full positioning control
annotate(geom = "text", x = ., y = min(ylims) - y_fac*mean(ylims),
label = mylab, hjust = 0) +
coord_cartesian(ylim = ylims, clip = "off") +
theme(plot.margin = margin(b = .5, unit = "in")) +
labs(x = NULL)
})
}
demo_fct(p, x_demo) %>% wrap_plots() + plot_annotation(title = "demo 1, label at x = -1 and x = 2")
demo_fct(p1, x_demo) %>% wrap_plots() + plot_annotation(title = "demo 2 - different data range")
Created on 2021-12-04 by the reprex package (v2.0.1)

facet_zoom can't change breaks of zoomed plot

I currently have a plot and have used facet_zoom to focus on records between 0 and 10 in the x axis. The following code reproduces an example:
require(ggplot2)
require(ggforce)
require(dplyr)
x <- rnorm(10000, 50, 25)
y <- rexp(10000)
data <- data.frame(x, y)
ggplot(data, aes(x = x, y = y)) +
geom_point() +
facet_zoom(x = dplyr::between(x, 0, 10))
I want to change the breaks on the zoomed portion of the graph to be the equivalent of:
ggplot(data, aes(x = x, y = y)) +
geom_point() +
facet_zoom(x = dplyr::between(x, 0, 10)) +
scale_x_continuous(breaks = seq(0,10,2))
But this changes the breaks of the original plot as well. Is it possible to just change the breaks of the zoomed portion whilst leaving the original plot as default?
This works for your use case:
ggplot(data, aes(x = x, y = y)) +
geom_point() +
facet_zoom(x = between(x, 0, 10)) +
scale_x_continuous(breaks = pretty)
From ?scale_x_continuous, breaks would accept the following (emphasis added):
One of:
NULL for no breaks
waiver() for the default breaks computed by the transformation object
A numeric vector of positions
A function that takes the limits as input and returns breaks as output
pretty() is one such function. It doesn't offer very fine control, but does allow you to have some leeway to specify breaks across different facets with very different scales.
For illustration, here are two examples with different desired number of breaks. See ?pretty for more details on the other arguments this function accepts.
p <- ggplot(data, aes(x = x, y = y)) +
geom_point() +
facet_zoom(x = between(x, 0, 10))
cowplot::plot_grid(
p + scale_x_continuous(breaks = function(x) pretty(x, n = 3)),
p + scale_x_continuous(breaks = function(x) pretty(x, n = 10)),
labels = c("n = 3", "n = 10"),
nrow = 1
)
Of course, you can also define your own function to convert plot limits into desired breaks, (e.g. something like p + scale_x_continuous(breaks = function(x) seq(min(x), max(x), length.out = 5))), but I generally find these functions require more tweaking to get right, & pretty() is often good enough.
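As one possible custom function, the sketch below returns the fixed breaks seq(0, 10, 2) only when the limits it receives are narrow, and falls back to pretty() otherwise. zoom_breaks and its max_width threshold are hypothetical, and the approach relies on the same behaviour the pretty example depends on, namely that the function is called with each panel's own limits.

```r
# Hypothetical breaks function: fine breaks for a narrow (zoomed) range,
# default-style breaks for a wide range. max_width is an assumed threshold.
zoom_breaks <- function(limits, max_width = 15) {
  if (diff(range(limits)) <= max_width) {
    seq(0, 10, by = 2)
  } else {
    pretty(limits)
  }
}

zoomed <- zoom_breaks(c(-0.5, 10.5))  # limits similar to the zoomed panel
full   <- zoom_breaks(c(-50, 150))    # limits similar to the full panel
```

It would be passed as p + scale_x_continuous(breaks = zoom_breaks); the threshold needs adjusting to the zoom range in use.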

Automated way to prevent ggplot hexbin from cutting geoms off axes

This is a slightly different question from an earlier post (ggplot hexbin shows different number of hexagons in plot versus data frame).
I am using hexbin() to bin data into hexagon objects, and ggplot() to plot the results. I notice that, sometimes, the hexagons on the edge of the plot are cut in half. Below is an example.
library(hexbin)
library(ggplot2)
set.seed(1)
data <- data.frame(A=rnorm(100), B=rnorm(100), C=rnorm(100), D=rnorm(100), E=rnorm(100))
maxVal = max(abs(data))
maxRange = c(-1*maxVal, maxVal)
x = data[,c("A")]
y = data[,c("E")]
h <- hexbin(x=x, y=y, xbins=5, shape=1, IDs=TRUE, xbnds=maxRange, ybnds=maxRange)
hexdf <- data.frame(hcell2xy(h), hexID = h@cell, counts = h@count)
ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) +
geom_hex(stat = "identity") +
coord_cartesian(xlim = c(maxRange[1], maxRange[2]), ylim = c(maxRange[1], maxRange[2]))
This creates a graphic where one hexagon is cut off at the top and one hexagon is cut off at the bottom:
Another approach I can try is to hard-code a value (here 1.5) to be added to the limits of the x and y axis. Doing so does seem to solve the problem in that no hexagons are cut off anymore.
ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) +
geom_hex(stat = "identity") +
scale_x_continuous(limits = maxRange * 1.5) +
scale_y_continuous(limits = maxRange * 1.5)
However, even though the second approach solves the problem in this instance, the value of 1.5 is arbitrary. I am trying to automate this process for a variety of data and variety of bin sizes and hexagon sizes that could be used. Is there a solution to keeping all hexagons fully visible in the plot without having to hard-code an arbitrary value that may be too large or too small for certain instances?
Consider that you can skip the computation of hexbin, and let ggplot do the job.
Then, if you prefer to manually set the width of the bins you can set the binwidth and modify the limits:
bwd = 1
ggplot(data, aes(x = x, y = y)) +
geom_hex(binwidth = bwd) +
coord_cartesian(xlim = c(min(x) - bwd, max(x) + bwd),
ylim = c(min(y) - bwd, max(y) + bwd),
expand = T) +
geom_point(color = "red") +
theme_bw()
This way, hexagons should never be truncated (though you may end up with some "empty" space).
Result with bwd = 1:
Result with bwd = 3:
If instead you prefer to programmatically set the number of the bins, you can use:
nbins_x <- 4
nbins_y <- 6
range_x <- range(data$A, na.rm = T)
range_y <- range(data$E, na.rm = T)
bwd_x <- (range_x[2] - range_x[1])/nbins_x
bwd_y <- (range_y[2] - range_y[1])/nbins_y
ggplot(data, aes(x = A, y = E)) +
geom_hex(bins = c(nbins_x,nbins_y)) +
coord_cartesian(xlim = c(range_x[1] - bwd_x, range_x[2] + bwd_x),
ylim = c(range_y[1] - bwd_y, range_y[2] + bwd_y),
expand = T) +
geom_point(color = "red")+
theme_bw()
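The limit padding used above can be factored into a small function so that it adapts to any variable and bin count. This is a sketch only; padded_limits is a hypothetical helper that repeats the arithmetic of the answer (one bin width added on each side), not part of ggplot2 or hexbin.

```r
# Hypothetical helper: widen the data range by one bin width on each side.
padded_limits <- function(v, nbins) {
  r  <- range(v, na.rm = TRUE)
  bw <- diff(r) / nbins          # width of one bin along this axis
  c(r[1] - bw, r[2] + bw)
}

lims <- padded_limits(c(0, 2, 8, NA), nbins = 4)  # range 0..8, bin width 2
```

The two results would then go into coord_cartesian(xlim = padded_limits(data$A, nbins_x), ylim = padded_limits(data$E, nbins_y)).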

Horizontal barplot for comparison two data - based on ratio

I would like to create a horizontal barplot to compare two of my tables. I already did the comparison and created a table with ratio.
That's how the data looks like:
> dput(data)
structure(list(Name=c('Mazda RX4','Mazda RX4 Wag','Datsun 710','Hornet 4 Drive',
'Hornet Sportabout','Valiant','Duster 360','Merc 240D','Merc 230','Merc 280','Merc 280C',
'Merc 450SE','Merc 450SL','Merc 450SLC','Cadillac Fleetwood','Lincoln Continental',
'Chrysler Imperial','Fiat 128','Honda Civic','Toyota Corolla'),ratio=c(1.393319198903125,
0.374762569687951,0.258112791829808,0.250298480396529,1.272180366473129,0.318000456484454,
0.264074483447591,0.350798965144559,2.310541690719624,1.314300844213157,1.18061486696761,
0.281581177092538,0.270164442687919,2.335578882236703,2.362339701969396,1.307731925943769,
0.347550384302281,0.232276047899868,0.125643566969327,0.281209747680576),Freq=c(2L,9L,2L,2L,
4L,2L,2L,3L,3L,5L,2L,2L,2L,7L,2L,4L,4L,2L,2L,4L)),.Names=c('Name','ratio','Freq'),class=
'data.frame',row.names=c(NA,20L))
I would like to achieve something like that:
In the middle I would put 1. Based on the calculated ratio I would like to put the proper scale which goes up to 3 to the right for example and to 0 to the left (can be different of course).
Each of the cars should have a separate bar. It will give 20 bars on this plot.
Additional thing would be to put the numbers from column Freq on the plots. It's not obligatory but would help.
I don't really see how that plot makes much sense with your data, as there is no quantity that adds up to 1 (or a common total). It could make sense with proportions, not so much with ratios. I might be missing something... Perhaps you're looking for something like this?
library(ggplot2)
d <- data  # the data frame from the question
r <- range(d$ratio)
br <- seq(floor(r[1]), ceiling(r[2]), 0.5)
ggplot(d, aes(x = Name, y = ratio - 1)) +
geom_bar(stat = 'identity', position = 'identity') +
coord_flip() +
ylab('ratio') + xlab('car') +
scale_y_continuous(breaks = br - 1, labels = br) +
theme_bw()
Add geom_text(aes(label = Freq), y = r[2] - 0.95) for the labels on the right side.
Or if you want to center the value of 1 (a bit more tricky):
r <- range(d$ratio)
m <- ceiling(max(abs(range(d$ratio))))
br <- seq(-m + 1, m - 1, 0.25)
ggplot(d, aes(x = Name, y = ratio - 1)) +
geom_bar(stat = 'identity', position = 'identity') +
geom_text(aes(label = Freq), y = m - 1.1) +
coord_flip() +
ylab('ratio') + xlab('car') +
scale_y_continuous(breaks = br, labels = br + 1, limits = c(-m + 1, m - 1),
expand = c(0, 0)) +
theme_bw()
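To make the recentering arithmetic concrete, here is a sketch using only the smallest and largest ratios from the question's data plus the reference value 1. Bars are drawn at ratio - 1, so position 0 on the axis carries the label 1, and the limits are symmetric around it.

```r
# Smallest and largest ratios from the question, plus the reference value 1.
ratio <- c(0.125643566969327, 1, 2.362339701969396)

m      <- ceiling(max(abs(range(ratio))))  # upper bound used for the axis
br     <- seq(-m + 1, m - 1, 0.25)         # break positions on the shifted axis
labels <- br + 1                           # labels shown at those positions
limits <- c(-m + 1, m - 1)                 # symmetric around position 0
```

Here m is 3, so the shifted axis runs from -2 to 2 and is labelled from -1 to 3, with 1 in the middle.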
A base R graphics alternative:
## plot precomputations
yexpand <- 0.2;
barheight <- 0.8;
xlim <- c(0,3);
xticks <- seq(xlim[1L],xlim[2L],0.25);
ylim <- c(1-barheight/2-yexpand,nrow(data)+barheight/2+yexpand);
yticks <- seq_len(nrow(data));
cols <- c('#6F7EB3','#D05B5B');
## draw plot
par(mar=c(5,4,4,2)+0.1+c(0,3,0,0));
plot(NA,xlim=xlim,ylim=ylim,xaxs='i',yaxs='i',axes=F,ann=F);
segments(xlim[1L],ylim[1L],xlim[1L],ylim[2L],xpd=NA);
axis(1L,xticks,cex.axis=0.7);
axis(2L,yticks,data$Name,las=2L,cex.axis=0.7);
mtext(expression(italic(Ratio)),1L,3);
mtext(expression(italic(Car)),2L,5.5);
mtext(data$Freq,4L,0.75,at=yticks,las=2L,cex=0.7);
y1 <- seq_len(nrow(data))-barheight/2;
y2 <- seq_len(nrow(data))+barheight/2;
rect(xlim[1L],y1,data$ratio,y2,col=cols[1L],lwd=0.5);
rect(data$ratio,y1,xlim[2L],y2,col=cols[2L],lwd=0.5);
abline(v=1);
