ggplotly with facet_wrap(scales = 'free_y') creates uneven panel widths - r

The following code creates a faceted Plotly plot using ggplotly, where each facet has the same height and width:
library(ggplot2)
library(plotly)
data <- expand.grid(
measure = paste0('m', 1:6),
x = seq(as.Date('2017-01-01'), as.Date('2017-10-01'), length.out = 10))
data$y <- runif(nrow(data))
test <- ggplot(
aes(
x = x,
y = y),
data = data) +
geom_point() +
facet_wrap(~measure)
g <- ggplotly(test)
print(g)
But for a new iteration, I want to allow the y-scales to vary, so I added scales = 'free_y'. This causes two problems:
The middle panel in each row becomes narrower than the outer panels, and the data points disappear in panels [2,1] and [2,2], for no clear reason.
test <- ggplot(
aes(
x = x,
y = y),
data = data) +
geom_point() +
facet_wrap(~measure, scales = 'free_y')
g <- ggplotly(test)
print(g)
If I change to scales = 'free' instead of 'free_y', the data points come back, but the middle panel in each row is still narrower, and I need to add panel spacing to keep the x-axis labels from overlapping:
test <- ggplot(
aes(
x = x,
y = y),
data = data) +
geom_point() +
facet_wrap(~measure, scales = 'free')
g <- ggplotly(test)
print(g)
Is there a good way to keep the panel widths equal with varying y scales?

Related

Automated way to prevent ggplot hexbin from cutting geoms off axes

This is a slightly different question from an earlier post (ggplot hexbin shows different number of hexagons in plot versus data frame).
I am using hexbin() to bin data into hexagon objects, and ggplot() to plot the results. I notice that, sometimes, the hexagons on the edge of the plot are cut in half. Below is an example.
library(hexbin)
library(ggplot2)
set.seed(1)
data <- data.frame(A=rnorm(100), B=rnorm(100), C=rnorm(100), D=rnorm(100), E=rnorm(100))
maxVal = max(abs(data))
maxRange = c(-1*maxVal, maxVal)
x = data[,c("A")]
y = data[,c("E")]
h <- hexbin(x=x, y=y, xbins=5, shape=1, IDs=TRUE, xbnds=maxRange, ybnds=maxRange)
hexdf <- data.frame(hcell2xy(h), hexID = h@cell, counts = h@count)
ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) +
geom_hex(stat = "identity") +
coord_cartesian(xlim = c(maxRange[1], maxRange[2]), ylim = c(maxRange[1], maxRange[2]))
This creates a graphic where one hexagon is cut off at the top and one hexagon is cut off at the bottom:
Another approach is to scale the x- and y-axis limits by a hard-coded factor (here 1.5). Doing so does seem to solve the problem, in that no hexagons are cut off anymore.
ggplot(hexdf, aes(x = x, y = y, fill = counts, hexID = hexID)) +
geom_hex(stat = "identity") +
scale_x_continuous(limits = maxRange * 1.5) +
scale_y_continuous(limits = maxRange * 1.5)
However, even though the second approach solves the problem in this instance, the value of 1.5 is arbitrary. I am trying to automate this process for a variety of data and variety of bin sizes and hexagon sizes that could be used. Is there a solution to keeping all hexagons fully visible in the plot without having to hard-code an arbitrary value that may be too large or too small for certain instances?
Consider that you can skip the computation of hexbin, and let ggplot do the job.
Then, if you prefer to manually set the width of the bins you can set the binwidth and modify the limits:
bwd = 1
ggplot(data, aes(x = x, y = y)) +
geom_hex(binwidth = bwd) +
coord_cartesian(xlim = c(min(x) - bwd, max(x) + bwd),
ylim = c(min(y) - bwd, max(y) + bwd),
expand = T) +
geom_point(color = "red") +
theme_bw()
This way, hexagons should never be truncated (though you may end up with some "empty" space).
Result with bwd = 1:
Result with bwd = 3:
If instead you prefer to programmatically set the number of the bins, you can use:
nbins_x <- 4
nbins_y <- 6
range_x <- range(data$A, na.rm = T)
range_y <- range(data$E, na.rm = T)
bwd_x <- (range_x[2] - range_x[1])/nbins_x
bwd_y <- (range_y[2] - range_y[1])/nbins_y
ggplot(data, aes(x = A, y = E)) +
geom_hex(bins = c(nbins_x,nbins_y)) +
coord_cartesian(xlim = c(range_x[1] - bwd_x, range_x[2] + bwd_x),
ylim = c(range_y[1] - bwd_y, range_y[2] + bwd_y),
expand = T) +
geom_point(color = "red")+
theme_bw()
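Both snippets above use the same padding rule: extend each axis by one bin width on either side of the data range. A minimal base-R sketch of that rule follows. The helper names pad_limits and bin_width are hypothetical (they are not part of ggplot2 or hexbin), and the one-bin-width margin is simply the choice made in this answer, not a guaranteed bound.

```r
# Hypothetical helper: extend the range of v by `pad` on both sides.
pad_limits <- function(v, pad) {
  r <- range(v, na.rm = TRUE)
  c(r[1] - pad, r[2] + pad)
}

# Hypothetical helper: bin width implied by a requested number of bins.
bin_width <- function(v, nbins) {
  r <- range(v, na.rm = TRUE)
  (r[2] - r[1]) / nbins
}

set.seed(1)
A <- rnorm(100)
E <- rnorm(100)

# Limits padded by one bin width, for 4 bins in x and 6 bins in y
xlim <- pad_limits(A, bin_width(A, 4))
ylim <- pad_limits(E, bin_width(E, 6))
```

The resulting xlim and ylim vectors can then be passed to coord_cartesian() in place of the inline arithmetic.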

How to apply separate coord_cartesian() to "zoom in" into individual panels of a facet_grid()?

Inspired by the Q Finding the elbow/knee in a curve I started to play around with smooth.spline().
In particular, I want to visualize how the parameter df (degrees of freedom) influences the approximation and the first and second derivative. Note that this Q is not about approximation but about a specific problem (or edge case) in visualisation with ggplot2.
First attempt: simple facet_grid()
library(ggplot2)
ggplot(ap, aes(x, y)) +
geom_point(data = dp, alpha = 0.2) +
geom_line() +
facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
theme_bw()
dp is a data.table containing the data points for which an approximation is sought and ap is a data.table with the approximated data plus the derivatives (data are given below).
For each row, facet_grid() with scales = "free_y" has chosen a scale which displays all data. Unfortunately, one panel has kind of "outliers" which make it difficult to see details in the other panels. So, I want to "zoom in".
"Zoom in" using coord_cartesian()
ggplot(ap, aes(x, y)) +
geom_point(data = dp, alpha = 0.2) +
geom_line() +
facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
theme_bw() +
coord_cartesian(ylim = c(-200, 50))
With the manually selected range, more details in the panels of row 3 have become visible. However, the limit has been applied to all panels of the grid, so in row 1 details can hardly be distinguished.
What I'm looking for is a way to apply coord_cartesian() with specific parameters separately to each individual panel (or group of panels, e.g., rowwise) of the grid. For instance, is it possible to manipulate the ggplot object afterwards?
Workaround: Combine individual plots with cowplot
As a workaround, we can create three separate plots and combine them afterwards using the cowplot package:
g0 <- ggplot(ap[deriv == 0], aes(x, y)) +
geom_point(data = dp, alpha = 0.2) +
geom_line() +
facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
theme_bw()
g1 <- ggplot(ap[deriv == 1], aes(x, y)) +
geom_line() +
facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
theme_bw() +
coord_cartesian(ylim = c(-50, 50))
g2 <- ggplot(ap[deriv == 2], aes(x, y)) +
geom_line() +
facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
theme_bw() +
coord_cartesian(ylim = c(-200, 100))
cowplot::plot_grid(g0, g1, g2, ncol = 1, align = "v")
Unfortunately, this solution
requires writing code to create three separate plots, and
duplicates strips and axes and adds whitespace that is no longer available for displaying the data.
Is facet_wrap() an alternative?
We can use facet_wrap() instead of facet_grid():
ggplot(ap, aes(x, y)) +
# geom_point(data = dp, alpha = 0.2) + # this line causes error message
geom_line() +
facet_wrap(~ deriv + df, scales = "free_y", labeller = label_both, nrow = 3) +
theme_bw()
Now, the y-axes of every panel are scaled individually exhibiting details of some of the panels. Unfortunately, we still can't "zoom in" into the bottom right panel because using coord_cartesian() would affect all panels.
In addition, the line
geom_point(data = dp, alpha = 0.2)
strangely causes
Error in gList(list(x = 0.5, y = 0.5, width = 1, height = 1, just = "centre", :
only 'grobs' allowed in "gList"
I had to comment this line out, so the data points which are to be approximated are not displayed.
Data
library(data.table)
# data points
dp <- data.table(
x = c(6.6260, 6.6234, 6.6206, 6.6008, 6.5568, 6.4953, 6.4441, 6.2186,
6.0942, 5.8833, 5.7020, 5.4361, 5.0501, 4.7440, 4.1598, 3.9318,
3.4479, 3.3462, 3.1080, 2.8468, 2.3365, 2.1574, 1.8990, 1.5644,
1.3072, 1.1579, 0.95783, 0.82376, 0.67734, 0.34578, 0.27116, 0.058285),
y = 1:32,
deriv = 0)
# approximated data points and derivatives
ap <- rbindlist(
lapply(seq(2, length(dp$x), length.out = 4),
function(df) {
rbindlist(
lapply(0:2,
function(deriv) {
result <- as.data.table(
predict(smooth.spline(dp$x, dp$y, df = df), deriv = deriv))
result[, c("df", "deriv") := list(df, deriv)]
})
)
})
)
Late answer, but the following hack just occurred to me. Would it work for your use case?
Step 1. Create an alternative version of the intended plot, limiting the range of y values such that scales = "free_y" gives a desired scale range for each facet row. Also create the intended facet plot with the full data range:
library(ggplot2)
library(dplyr)
# alternate plot version with truncated data range
p.alt <- ap %>%
group_by(deriv) %>%
mutate(upper = quantile(y, 0.75),
lower = quantile(y, 0.25),
IQR.multiplier = (upper - lower) * 10) %>%
ungroup() %>%
mutate(is.outlier = y < lower - IQR.multiplier | y > upper + IQR.multiplier) %>%
mutate(y = ifelse(is.outlier, NA, y)) %>%
ggplot(aes(x, y)) +
geom_point(data = dp, alpha = 0.2) +
geom_line() +
facet_grid(deriv ~ df, scales = "free_y", labeller = label_both) +
theme_bw()
# intended plot version with full data range
p <- p.alt %+% ap
Step 2. Use ggplot_build() to generate plot data for both ggplot objects. Apply the panel parameters of the alt version onto the intended version:
p <- ggplot_build(p)
p.alt <- ggplot_build(p.alt)
p$layout$panel_params <- p.alt$layout$panel_params
rm(p.alt)
Step 3. Build the intended plot from the modified plot data, & plot the result:
p <- ggplot_gtable(p)
grid::grid.draw(p)
Note: in this example, I truncated the data range by setting all values more than 10*IQR away from the upper / lower quartile in each facet row as NA. This can be replaced by any other logic for defining outliers.
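The outlier rule described in the note can be isolated into a small function, which makes it easier to swap in different logic. The sketch below is base R only. The name is_far_outlier and the argument k are hypothetical, and the default k = 10 just mirrors the multiplier used in this answer.

```r
# Hypothetical helper: TRUE where y lies more than k * IQR
# below the lower quartile or above the upper quartile.
is_far_outlier <- function(y, k = 10) {
  q <- quantile(y, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  margin <- (q[2] - q[1]) * k
  y < q[1] - margin | y > q[2] + margin
}

# Toy data: one value far away from the rest
y <- c(1:10, 1000)
flags <- is_far_outlier(y)

# Replace flagged values with NA, as in Step 1
y_trimmed <- ifelse(flags, NA, y)
```

Inside the dplyr pipeline of Step 1, the same function could presumably be applied per facet row with group_by(deriv) followed by mutate(y = ifelse(is_far_outlier(y), NA, y)).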

Align x axes of box plot and line plot using ggplot

I'm trying to align the x-axes of a bar plot and line plot in one window frame using ggplot. Here is the fake data I'm trying to do it with.
library(ggplot2)
library(gridExtra)
m <- as.data.frame(matrix(0, ncol = 2, nrow = 27))
colnames(m) <- c("x", "y")
for( i in 1:nrow(m))
{
m$x[i] <- i
m$y[i] <- ((i*2) + 3)
}
My_plot <- (ggplot(data = m, aes(x = x, y = y)) + theme_bw())
Line_plot <- My_plot + geom_line()
Bar_plot <- My_plot + geom_bar(stat = "identity")
grid.arrange(Line_plot, Bar_plot)
Thank you for your help.
@eipi10 answers this particular case, but in general you also need to equalize the plot widths. If, for example, the y labels on one of the plots take up more space than on the other, even if you use the same axis on each plot, they will not line up when passed to grid.arrange:
axis <- scale_x_continuous(limits=range(m$x))
Line_plot <- ggplot(data = m, aes(x = x, y = y)) + theme_bw() + axis + geom_line()
m2 <- within(m, y <- y * 1e7)
Bar_plot <- ggplot(data = m2, aes(x = x, y = y)) + theme_bw() + axis + geom_bar(stat = "identity")
grid.arrange(Line_plot, Bar_plot)
In this case, you have to equalize the plot widths:
Line_plot <- ggplot_gtable(ggplot_build(Line_plot))
Bar_plot <- ggplot_gtable(ggplot_build(Bar_plot))
Bar_plot$widths <- Line_plot$widths
grid.arrange(Line_plot, Bar_plot)
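A variant of the same idea takes the element-wise maximum of both width vectors rather than copying one onto the other, so that the wider of the two axis-label columns is used for both plots. The sketch below assumes that both plots produce gtables with the same number of columns, which should hold for two unfaceted plots sharing a theme. The object names and the shared coord_cartesian() range are illustrative choices, not part of the original answer.

```r
library(ggplot2)
library(grid)
library(gridExtra)

m  <- data.frame(x = 1:27, y = (1:27) * 2 + 3)
m2 <- within(m, y <- y * 1e7)   # much wider y-axis labels

# Same visible x range for both plots (illustrative choice)
shared_x <- coord_cartesian(xlim = c(0, 28))

line_plot <- ggplot(m,  aes(x, y)) + theme_bw() + shared_x + geom_line()
bar_plot  <- ggplot(m2, aes(x, y)) + theme_bw() + shared_x + geom_col()

g_line <- ggplotGrob(line_plot)
g_bar  <- ggplotGrob(bar_plot)

# Element-wise maximum of the column widths of both gtables
max_widths <- unit.pmax(g_line$widths, g_bar$widths)
g_line$widths <- max_widths
g_bar$widths  <- max_widths

grid.arrange(g_line, g_bar)
```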
The gridlines on the x axes will be aligned if you use scale_x_continuous to force ggplot to use limits you specify.
My_plot <- ggplot(data = m, aes(x = x, y = y)) + theme_bw() +
scale_x_continuous(limits=range(m$x))
Now, when you add the layers, the axes will share the common scaling.

R: Align plots with different x ranges

I have two dataframes dataA and dataB, both of which contain a time and a value column. Time columns are closely related, but non-identical. Now, I generate two plots with ggplot, e.g.:
plotA <- ggplot(dataA) + geom_line(aes(x = time, y = value))
plotB <- ggplot(dataB) + geom_line(aes(x = time, y = value))
How can I use grid.arrange or a similar function to display the two plots vertically and so that x-axis labels and grid lines align?
You could use facets to align the plots.
Firstly, both data sets need to be combined:
dataAB <- rbind(dataA[c("time", "value")], dataB[c("time", "value")])
A new column indicates the original data set:
dataAB$ind <- c(rep("A", nrow(dataA)), rep("B", nrow(dataB)))
Plot:
library(ggplot2)
ggplot(dataAB) +
geom_line(aes(x = time, y = value)) +
facet_wrap( ~ ind, ncol = 1, scales = "free_y")
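Since the question gives no sample data, the following sketch runs the same approach on two small made-up data frames. The values in dataA and dataB are hypothetical, and the indicator column is attached with cbind() before binding, which is just one alternative to the rep() call above.

```r
library(ggplot2)

# Hypothetical data: similar but non-identical time values
dataA <- data.frame(time = seq(0, 10, by = 2), value = c(1, 3, 2, 5, 4, 6))
dataB <- data.frame(time = seq(1, 9, by = 2), value = c(10, 30, 20, 50, 40))

# Combine both data sets, tagging each row with its origin
dataAB <- rbind(
  cbind(dataA[c("time", "value")], ind = "A"),
  cbind(dataB[c("time", "value")], ind = "B")
)

# One panel per data set, stacked vertically; the x axis is shared
# by default, while the y axes vary per panel
p <- ggplot(dataAB) +
  geom_line(aes(x = time, y = value)) +
  facet_wrap(~ ind, ncol = 1, scales = "free_y")
print(p)
```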

How to add different lines for facets

I have data where I look at the difference in growth between a monoculture and a mixed culture for two different species. Additionally, I made a graph to make my data clear.
I want a barplot with error bars. The whole dataset is of course bigger, but for this graph this is the data.frame with the means for the barplot:
plant          species    means
Mixed culture  Elytrigia  0.886625
Monoculture    Elytrigia  1.022667
Monoculture    Festuca    0.314375
Mixed culture  Festuca    0.078125
With this data I made a graph in ggplot2, where plant is on the x-axis and means on the y-axis, and I used a facet to divide the species.
This is my code:
limits <- aes(ymax = meansS$means + eS$se, ymin=meansS$means - eS$se)
dodge <- position_dodge(width=0.9)
myplot <- ggplot(data=meansS, aes(x=plant, y=means, fill=plant)) + facet_grid(. ~ species)
myplot <- myplot + geom_bar(stat = "identity", position = dodge) + geom_errorbar(limits, position = dodge, width = 0.25)
myplot <- myplot + scale_fill_manual(values=c("#6495ED","#FF7F50"))
myplot <- myplot + labs(x = "Plant treatment", y = "Shoot biomass (gr)")
myplot <- myplot + labs(title = "Plant competition")
myplot <- myplot + theme(legend.position = "none")
myplot <- myplot + theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank())
So far it is fine. However, I want to add two different horizontal lines in the two facets. For that, I used this code:
hline.data <- data.frame(z = c(0.511,0.157), species = c("Elytrigia","Festuca"))
myplot <- myplot + geom_hline(aes(yintercept = z), hline.data)
However, if I do that, I get a plot where there are two extra facets, in which the two horizontal lines are plotted. Instead, I want the horizontal lines to be plotted in the facets with the bars, not in two new facets. Does anyone have an idea how to solve this?
I think it makes it clearer if I put the graph I create now:
Make sure that the variable species is identical in both datasets. If it is a factor in one of them, then it must be a factor in the other too:
library(ggplot2)
dummy1 <- expand.grid(X = factor(c("A", "B")), Y = rnorm(10))
dummy1$D <- rnorm(nrow(dummy1))
dummy2 <- data.frame(X = c("A", "B"), Z = c(1, 0))
ggplot(dummy1, aes(x = D, y = Y)) + geom_point() + facet_grid(~X) +
geom_hline(data = dummy2, aes(yintercept = Z))
dummy2$X <- factor(dummy2$X)
ggplot(dummy1, aes(x = D, y = Y)) + geom_point() + facet_grid(~X) +
geom_hline(data = dummy2, aes(yintercept = Z))
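Applied to the data in the question, the check might look like the sketch below. Whether species is actually stored as a factor in the asker's meansS is not shown, so that is an assumption here, and the data frame is rebuilt from the printed table only for illustration.

```r
# Plot data rebuilt from the table in the question;
# species as a factor is an assumption
meansS <- data.frame(
  plant   = c("Mixed culture", "Monoculture", "Monoculture", "Mixed culture"),
  species = factor(c("Elytrigia", "Elytrigia", "Festuca", "Festuca")),
  means   = c(0.886625, 1.022667, 0.314375, 0.078125)
)

# Line data with species stored as plain character strings
hline.data <- data.frame(
  z = c(0.511, 0.157),
  species = c("Elytrigia", "Festuca"),
  stringsAsFactors = FALSE
)

# Give the line data the same type and the same levels as the plot data
hline.data$species <- factor(hline.data$species,
                             levels = levels(meansS$species))
```

After this step both data frames describe species with the same factor levels, so the hline layer should be assigned to the existing facets.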
