Plot features cropped by Margin - r

When I compile the following MWE I observe that the maximum point (3,5) is significantly cut/cropped by the margins.
The following example is drastically reduced for simplicity.
In my actual data, the following are all affected when I limit coord_cartesian manually and the corresponding x-axis aesthetic sits at the maximum x value.
Point symbol
Error bars
Statistical symbols inserted by text annotation
MWE
library(ggplot2)
library("grid")
print("Program started")
n = c(0.1,2, 3, 5)
s = c(0,1, 2, 3)
df = data.frame(n, s)
gg <- ggplot(df, aes(x=s, y=n))
gg <- gg + geom_point(position=position_dodge(width=NULL), size = 1.5)
gg <- gg + geom_line(position=position_dodge(width=NULL))
gg <- gg + coord_cartesian( ylim = c(0, 5), xlim = c((-0.05)*3, 3));
print(gg)
print("Program complete - a graph should be visible.")
To show my data appropriately I would consider using any of the following that are possible (influenced by the observation that the x-axis labels themselves are never cut):
Make the margin transparent so the point isn't cut
unless the point is cut by the plot area and not the margin
Bring the panel with the plot area to the front
unless the point is cut by the plot area and not the margin so order is independent
Use xlim = c((-0.05)*3, (3*0.05)) to extend the axis range but implement some hack to not show the overhanging axis bar after the maximum point of 3?
this is how I had it originally but I was told to remove the overhang after the 3 as it was unacceptable.

Is this what you mean by option 1:
gg <- ggplot(df, aes(x=s, y=n)) +
geom_point(position=position_dodge(width=NULL), size = 3) +
geom_line(position=position_dodge(width=NULL)) +
coord_cartesian(xlim=c(0,3), ylim=c(0,5))
# Turn off clipping, so that point at (3,5) is not clipped by the panel grob
gg1 <- ggplot_gtable(ggplot_build(gg))
gg1$layout$clip[gg1$layout$name=="panel"] <- "off"
grid.draw(gg1)
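In ggplot2 3.0.0 and later, the same effect can be had without editing the gtable, because coord_cartesian() accepts a clip argument. The following is a minimal sketch under that version assumption; the plot margin values are illustrative only and are not taken from the question.

```r
library(ggplot2)

df <- data.frame(s = c(0, 1, 2, 3), n = c(0.1, 2, 3, 5))

gg <- ggplot(df, aes(x = s, y = n)) +
  geom_point(size = 3) +
  geom_line() +
  # clip = "off" lets the point at (3, 5) draw past the panel edge;
  # expand = FALSE keeps the axis ending exactly at the stated limits
  coord_cartesian(xlim = c(-0.15, 3), ylim = c(0, 5),
                  expand = FALSE, clip = "off") +
  # illustrative margin (assumption) so the overhanging point has room
  theme(plot.margin = margin(10, 10, 5, 5, unit = "pt"))

print(gg)
```

With clipping off, anything drawn outside the panel (points, error bars, text annotations) stays visible, so a small plot margin is usually needed to keep it inside the device.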

Related

ggplot2 - x axis transformation same scale range but must have different align

I have a problem drawing a plot with two x-axes (one of them is an axis transformation). My expected result is that, even though they have the same scale range, one of the axes is offset 8mm to the left (the second x-axis is shifted left relative to the main x-axis).
Expected output (like this):
But at the moment they have the same range and the same alignment.
This is my code:
library(ggplot2)
library(cowplot)
data <- data.frame(temp_c=runif(100, min=-5, max=30), outcome=runif(100))
plot <- ggplot(data) +
geom_point(aes(x=temp_c, y=outcome)) +
theme_classic() +
labs(x='Temperature (Celsius)')
x2plot <- ggplot(data) +
geom_point(aes(x=temp_c, y=outcome)) +
theme_classic() +
scale_x_continuous(label=function(x) x) +
labs(x='Temperature (Fahrenheit)')
x <- get_x_axis(x2plot)
xl <- get_plot_component(x2plot, "xlab-b")
plot_grid(plot, ggdraw(x), ggdraw(xl), align='v', axis='rl', ncol=1,
rel_heights=c(0.8, 0.05, 0.05))
Please help me. Thank you so much.
You cannot use axis='rl'; it forces both plots to be aligned on both the left and the right. Also, ggdraw(xl) will not work. If the offset is indeed 0.8mm, you can create the second plot with a 0.8mm left margin:
g = x2plot+theme(plot.margin=margin(0,0,0,0.8,unit="mm"))
x = ggdraw(get_x_axis(g))
Then call plot_grid :
plot_grid(plot, ggdraw(x), align='v', axis='l', ncol=1,
rel_heights=c(0.8, 0.05))+
draw_label("Temperature (Fahrenheit)",
x=0.5, y= 0, vjust=-0.1, angle= 0,size=11)

facet_zoom() while setting axis limits

I would like to use facet_zoom() to zoom in on part of an axis that has limits explicitly set. However, using scale_*(limits = *) and coord_cartesian(xlim = *) overrides the zoomed facet's scales as well such that both have the same limits. Is there a way around this? Maybe I could add some data points near the limits and then set their alpha = 0... Any other ideas?
library(ggplot2)
library(ggforce)
# works with no limits specified
ggplot(mpg, aes(x = hwy, y = cyl)) +
geom_point() +
facet_zoom(xlim = c(20, 25))
# fails with limits specified
ggplot(mpg, aes(x = hwy, y = cyl)) +
scale_x_continuous(limits = c(0, 50)) +
geom_point() +
facet_zoom(xlim = c(20, 25))
# fails with coord_cartesian()
ggplot(mpg, aes(x = hwy, y = cyl)) +
scale_x_continuous() +
coord_cartesian(xlim = c(0, 50)) +
geom_point() +
facet_zoom(xlim = c(20, 25))
I don't have enough knowledge of the underlying intricacies in FacetZoom, but you can check if the following workarounds provide a reasonable starting point.
Plot for demonstration
One of the key differences between setting limits in scale_* vs. coord_* is the clipping effect (screenshot taken from the ggplot2 cheatsheet found here). Since this effect isn't really clear in a scatterplot, I added a geom_line layer and adjusted the specified limits so that they extend beyond the data range at one end of the x-axis and clip the data at the other end.
p <- ggplot(mpg, aes(x = hwy, y = cyl)) +
geom_point() +
geom_line(aes(colour = fl), size = 2) +
facet_zoom(xlim = c(20, 25)) +
theme_bw()
# normal zoomed plot / zoomed plot with limits set in scale / coord
p0 <- p
p1 <- p + scale_x_continuous(limits = c(0, 35))
p2 <- p + coord_cartesian(xlim = c(0, 35))
We can see that while p0 behaves as expected, both p1 & p2 show both the original facet (top) & the zoomed facet (bottom) with the same range of c(0, 35).
In p1's case, the shaded box also expanded to cover the entire top facet. In p2's case, the zoom box stayed in exactly the same position as p0, & as a result no longer covers the zoomed range of c(20, 25).
Workaround for limits set in scale_*
# convert ggplot objects to form suitable for rendering
gp0 <- ggplot_build(p0)
gp1 <- ggplot_build(p1)
# re-set zoomed facet's limits to match zoomed range
k <- gp1$layout$layout$SCALE_X[gp1$layout$layout$name == "x"]
gp1$layout$panel_scales_x[[k]]$limits <- gp1$layout$panel_scales_x[[k]]$range$range
# re-set zoomed facet's panel parameters based on original version p0
k <- gp1$layout$layout$PANEL[gp1$layout$layout$name == "x"]
gp1$layout$panel_params[[k]] <- gp0$layout$panel_params[[k]]
# convert built ggplot object to gtable of grobs as usual & print result
gt1 <- ggplot_gtable(gp1)
grid::grid.draw(gt1)
The zoomed facet now shows the zoomed range c(20, 25) correctly, while the shaded box shrinks to cover the correct range in the original facet. Since this method removes unseen data points, all lines in the original facet stay within the confines of the facet.
Workaround for limits set in coord_*
# convert ggplot objects to form suitable for rendering
gp0 <- ggplot_build(p0)
gp2 <- ggplot_build(p2)
# apply coord limits to original facet's scale limits
k <- gp2$layout$layout$SCALE_X[gp2$layout$layout$name == "orig"]
gp2$layout$panel_scales_x[[k]]$limits <- gp2$layout$coord$limits$x
# re-set zoomed facet's panel parameters based on original version without setting
# limits in scale
k <- gp2$layout$layout$PANEL[gp2$layout$layout$name == "x"]
gp2$layout$panel_params[[k]] <- gp0$layout$panel_params[[k]]
# convert built ggplot object to gtable of grobs as usual,
# & print result
gt2 <- ggplot_gtable(gp2)
grid::grid.draw(gt2)
The zoomed facet now shows the zoomed range c(20, 25) correctly, while the shaded box shifts to cover the correct range in the original facet. Since this method includes unseen data points, some lines in the original facet extend beyond the facet's confines.
Note: These workarounds should work with zoom in y + limits set in y-axis as well, as long as all references to "x" / panel_scales_x / SCALE_X above are changed to "y" / panel_scales_y / SCALE_Y. I haven't tested this for other combinations such as zoom in both x & y, but the broad principle ought to be similar.
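The asker's own idea of adding invisible points near the desired limits can be sketched with expand_limits(), which adds a geom_blank() layer rather than setting scale or coord limits. This is an untested sketch, not a confirmed fix: it assumes the zoomed facet still trains only on the zoom range when the extra range comes from data, which may depend on the ggforce version, so check the rendered plot.

```r
library(ggplot2)
library(ggforce)

p <- ggplot(mpg, aes(x = hwy, y = cyl)) +
  geom_point() +
  # invisible geom_blank layer that widens the data range to 0..50
  # without touching scale limits or coord limits
  expand_limits(x = c(0, 50)) +
  facet_zoom(xlim = c(20, 25))

print(p)
```

Unlike coord_cartesian(), this approach can only widen the range beyond the data; it cannot clip data out of the original facet.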

How to plot histograms of raw data on the margins of a plot of interpolated data

I would like to show in the same plot interpolated data and a histogram of the raw data of each predictor. I have seen in other threads like this one, people explain how to do marginal histograms of the same data shown in a scatter plot, in this case, the histogram is however based on other data (the raw data).
Suppose we see how price is related to carat and table in the diamonds dataset:
library(ggplot2)
p = ggplot(diamonds, aes(x = carat, y = table, color = price)) + geom_point()
We can add a marginal frequency plot e.g. with ggMarginal
library(ggExtra)
ggMarginal(p)
How do we add something similar to a tile plot of predicted diamond prices?
library(mgcv)
model = gam(price ~ s(table, carat), data = diamonds)
newdat = expand.grid(seq(55,75, 5), c(1:4))
names(newdat) = c("table", "carat")
newdat$predicted_price = predict(model, newdat)
ggplot(newdat,aes(x = carat, y = table, fill = predicted_price)) +
geom_tile()
Ideally, the histograms go even beyond the margins of the tileplot, as these data points also influence the predictions. I would, however, be already very happy to know how to plot a histogram for the range that is shown in the tileplot. (Maybe the values that are outside the range could just be added to the extreme values in different color.)
PS. I managed to more or less align histograms to the margins of the sides of a tile plot, using the method of the accepted answer in the linked thread, but only if I removed all kind of labels. It would be particularly good to keep the color legend, if possible.
EDIT:
eipi10 provided an excellent solution. I tried to modify it slightly to add the sample size in numbers and to graphically show values outside the plotted range since they also affect the interpolated values.
I intended to include them in a different color in the histograms at the side. I hereby attempted to count them towards the lower and upper end of the plotted range. I also attempted to plot the sample size in numbers somewhere on the plot. However, I failed with both.
This was my attempt to graphically illustrate the sample size beyond the plotted area:
plot_data = diamonds
plot_data <- transform(plot_data, carat_range = ifelse(carat < 1 | carat > 4, "outside", "within"))
plot_data <- within(plot_data, carat[carat < 1] <- 1)
plot_data <- within(plot_data, carat[carat > 4] <- 4)
plot_data$carat_range = as.factor(plot_data$carat_range)
p2 = ggplot(plot_data, aes(carat, fill = carat_range)) +
geom_histogram() +
thm +
coord_cartesian(xlim=xrng)
I tried to add the sample size in numbers with geom_text. I tried fitting it in the far right panel but it was difficult (/impossible for me) to adjust. I tried to put it on the main graph (which would anyway probably not be the best solution), but it didn’t work either (it removed the histogram and legend, on the right side and it did not plot all geom_texts). I also tried to add a third row of plots and writing it there. My attempt:
n_table_above = nrow(subset(diamonds, table > 75))
n_table_below = nrow(subset(diamonds, table < 55))
n_table_within = nrow(subset(diamonds, table >= 55 & table <= 75))
text_p = ggplot()+
geom_text(aes(x = 0.9, y = 2, label = paste0("N(>75) = ", n_table_above)))+
geom_text(aes(x = 1, y = 2, label = paste0("N = ", n_table_within)))+
geom_text(aes(x = 1.1, y = 2, label = paste0("N(<55) = ", n_table_below)))+
thm
library(egg)
pobj = ggarrange(p2, ggplot(), p1, p3,
ncol=2, widths=c(4,1), heights=c(1,4))
grid.arrange(pobj, leg, text_p, ggplot(), widths=c(6,1), heights =c(6,1))
I would be very happy to receive help on either or both tasks (adding sample size as text & adding values outside plotted range in a different color).
Based on your comment, maybe the best approach is to roll your own layout. Below is an example. We create the marginal plots as separate ggplot objects and lay them out with the main plot. We also extract the legend and put it outside the marginal plots.
Set-up
library(ggplot2)
library(cowplot)
# Function to extract legend
#https://github.com/hadley/ggplot2/wiki/Share-a-legend-between-two-ggplot2-graphs
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend) }
thm = list(theme_void(),
guides(fill="none"),
theme(plot.margin=unit(rep(0,4), "lines")))
xrng = c(0.6,4.4)
yrng = c(53,77)
Plots
p1 = ggplot(newdat, aes(x = carat, y = table, fill = predicted_price)) +
geom_tile() +
theme_classic() +
coord_cartesian(xlim=xrng, ylim=yrng)
leg = g_legend(p1)
p1 = p1 + thm[-1]
p2 = ggplot(diamonds, aes(carat)) +
geom_line(stat="density") +
thm +
coord_cartesian(xlim=xrng)
p3 = ggplot(diamonds, aes(table)) +
geom_line(stat="density") +
thm +
coord_flip(xlim=yrng)
plot_grid(
plot_grid(plotlist=list(p2, ggplot(), p1, p3), ncol=2,
rel_widths=c(4,1), rel_heights=c(1,4), align="hv", scale=1.1),
leg, rel_widths=c(5,1))
UPDATE: Regarding your comment about the space between the plots: This is an Achilles heel of plot_grid and I don't know if there's a way to fix it. Another option is ggarrange from the experimental egg package, which doesn't add so much space between plots. Also, you need to save the output of ggarrange first and then lay out the saved object with the legend. If you run ggarrange inside grid.arrange you get two overlapping copies of the plot:
# devtools::install_github('baptiste/egg')
library(egg)
pobj = ggarrange(p2, ggplot(), p1, p3,
ncol=2, widths=c(4,1), heights=c(1,4))
grid.arrange(pobj, leg, widths=c(6,1))
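For the two follow-up tasks in the question's edit (colouring values outside the plotted range and showing sample sizes), one possible sketch is below. The range c(1, 4), the column names carat_clamped and carat_range, and the label placement are all assumptions chosen for illustration; this is not part of the answer above.

```r
library(ggplot2)

carat_lim <- c(1, 4)  # assumed plotted carat range

plot_data <- diamonds
# flag values outside the plotted range before clamping them
plot_data$carat_range <- factor(
  ifelse(plot_data$carat < carat_lim[1] | plot_data$carat > carat_lim[2],
         "outside", "within"),
  levels = c("within", "outside")
)
# move out-of-range values onto the nearest edge of the plotted range
plot_data$carat_clamped <- pmin(pmax(plot_data$carat, carat_lim[1]), carat_lim[2])

# sample sizes as a single text label
n_counts <- table(plot_data$carat_range)
n_label <- sprintf("N within = %d, N outside = %d",
                   n_counts[["within"]], n_counts[["outside"]])

p_top <- ggplot(plot_data, aes(carat_clamped, fill = carat_range)) +
  geom_histogram(bins = 30) +
  # x = Inf / y = Inf pins the label to the top-right corner of the panel
  annotate("text", x = Inf, y = Inf, label = n_label,
           hjust = 1.05, vjust = 1.5, size = 3) +
  theme_void() +
  theme(legend.position = "none")

print(p_top)
```

p_top could then take the place of p2 in the layout calls above; the same pattern applies to table for the side histogram.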

grid.draw Cutting Off Facet Grid Title and X&Y Labels

I am trying to create a facet_grid() in ggplot() and am having issues with the margins of my plot. I am using grid.draw() for my final plot, and cannot figure out how to adjust the margins for printing. When I save my plot, it appears fine (see below). However, when I actually print my plot out to hard copy, half of the X&Y labels and my plot title are cut off.
I've attempted using par() to no avail. Here is a reproducible example, similar to my actual plot. I need to keep the panel <- off part because in my actual plot, I have plotted numbers above each bar and they get cut off by the facet sides for days at the beginning/end of each month. I'm thinking this might be the root of the issue, but I'm not really sure to be honest.
data(airquality)
library(stats)
library(ggplot2)
library(gtable)
library(gridExtra)
library(grid)
library(dplyr)
library(scales)
facet <- ggplot() +
geom_bar(data=airquality,aes(y=Wind,x=Day,fill=Temp),colour="black",stat="identity",position='stack') +
#theme_bw() +
facet_grid(~Month) +
theme(axis.title.x=element_text(face="bold",size=14),axis.title.y=element_text(face="bold",size=14),axis.text.x=element_text(face="bold",size=10),axis.text.y=element_text(face="bold",size=10))+
ylab("Wind") +
theme(panel.margin = unit(5, "mm"),panel.border=element_rect(color="black",fill=NA),panel.background = element_rect(fill="grey84"),plot.title = element_text (size=20,face="bold"),legend.position="right",panel.grid.minor=element_blank(),strip.text.x=element_text(size=12,face="bold"),strip.background=element_rect(fill=NA,colour="black"),legend.title=element_text(size=14,face="bold")) +
ggtitle("Test")
gt <- ggplot_gtable(ggplot_build(facet))
gt$layout$clip[gt$layout$name=="panel"] <- "off"
grid.draw(gt)
Thanks for any and all help! Please, let me know if you need any clarification or have any questions.
You have two options:
set some margins in the ggplot theme
assign a viewport of specific size to the plot/gtable, smaller than the device window by some margin.
Here's an illustration of using both strategies at once:
library(ggplot2)
library(grid)
fig_size <- c(6, 4) # inches
margin <- unit(4, "line")
p <- ggplot() + theme(plot.background=element_rect(colour="red", size=2, fill="grey50"),
plot.margin = unit(1:4, "line"))
g <- ggplotGrob(p)
g$vp <- viewport(width = unit(fig_size[1], "in") - margin, height=unit(fig_size[2],"in")- margin)
ggsave("plot.pdf", g, width=fig_size[1], height=fig_size[2])
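Applied to a faceted plot like the one in the question, the first option (theme margins) might look like the sketch below. The 1 cm margin, the simplified plot, and the output file name are assumptions for illustration. The sketch matches panel names with grepl() because faceted panels are typically named "panel-1-1", "panel-2-1" and so on in recent ggplot2 versions rather than plain "panel".

```r
library(ggplot2)
library(grid)

p <- ggplot(airquality, aes(x = Day, y = Wind, fill = Temp)) +
  geom_col(colour = "black") +
  facet_grid(~Month) +
  ggtitle("Test") +
  # assumed 1 cm margin on every side so printed labels are not cut off
  theme(plot.margin = margin(1, 1, 1, 1, unit = "cm"))

gt <- ggplotGrob(p)
# turn clipping off for every facet panel
is_panel <- grepl("^panel", gt$layout$name)
gt$layout$clip[is_panel] <- "off"

# hypothetical output path; ggsave() accepts a gtable as well as a ggplot
out_file <- file.path(tempdir(), "facet_margins.pdf")
ggsave(out_file, gt, width = 8, height = 5)
```

If the printer still trims the edges, increasing the margin or shrinking the viewport as in the answer above are the two knobs to try.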

How to control ggplot's plotting area proportions instead of fitting them to devices in R?

By default, each plot in ggplot fits its device.
That's not always desirable. For instance, one may need to make tiles in geom_tile to be squares. Once you change the device or change the number of elements on x/y-axis, the tiles are no longer squares.
Is it possible to set hard proportions or size for a plot and fit the plot in its device's window (or make the device width and height proportional to those of the plot)?
You can specify the aspect ratio of your plots using coord_fixed().
> library(ggplot2)
> df <- data.frame(
+ x = runif(100, 0, 5),
+ y = runif(100, 0, 5))
If we just go ahead and plot these data then we get a plot which conforms to the dimensions of the output device.
> ggplot(df, aes(x=x, y=y)) + geom_point()
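If, however, we use coord_fixed() then we get a plot with a fixed aspect ratio. By default, one unit on the x-axis is drawn the same length as one unit on the y-axis, so here, where both variables span the same range, the x- and y-axes have the same length. The size of the plot will be determined by the shortest dimension of the output device.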
> ggplot(df, aes(x=x, y=y)) + geom_point() + coord_fixed()
Finally, we can adjust the fixed aspect ratio by specifying an argument to coord_fixed(), where the argument is the ratio of the length of one unit on the y-axis to the length of one unit on the x-axis. So, to get a plot that is twice as tall as it is wide (given the equal data ranges here), we would use:
> ggplot(df, aes(x=x, y=y)) + geom_point() + coord_fixed(2)
A cleaner way is to use the theme(aspect.ratio) argument e.g.
library(ggplot2)
d <- data.frame(x=rnorm(100),y=rnorm(100)*1000)
ggplot(d,aes(x,y))+
geom_point() +
theme(aspect.ratio=1/10) #Long and skinny
coord_fixed() fixes the ratio between the x and y data units, which isn't always the same thing (e.g. in this case, where the units of x and y are very different).
Here's an easy device to treat your plot with respect,
library(ggplot2)
p = qplot(1:10, (1:10)^3)
g = ggplotGrob(p)
g$respect = TRUE
library(grid)
grid.draw(g)
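For the square-tile case raised in the question, a minimal sketch with coord_fixed() is shown here. The 8 x 3 grid and the column names are made up for illustration. Because each tile spans one unit in x and one unit in y, a ratio of 1 keeps the tiles square whatever the device size or the number of rows and columns.

```r
library(ggplot2)

# illustrative 8 x 3 grid of tiles (assumed data)
tiles <- expand.grid(x = 1:8, y = 1:3)
tiles$value <- seq_len(nrow(tiles))

p_tiles <- ggplot(tiles, aes(x = x, y = y, fill = value)) +
  geom_tile(colour = "white") +
  # one x unit is drawn as long as one y unit, so unit tiles are square;
  # expand = FALSE removes the padding around the grid
  coord_fixed(ratio = 1, expand = FALSE)

print(p_tiles)
```

The plot then leaves empty space in the device along whichever dimension is too large, so choosing a device width and height roughly proportional to the column and row counts avoids wasted space.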

Resources