ggplot facet_wrap_paginate pages grouped by variable - r

I'm working on creating some harvest plots for a paper and am stumbling a bit with the code. I have recreated the important bits using the 'diamonds' dataset so that the problem is easy to reproduce.
Part 1
The aim is to create a bar chart that will facet by multiple variables, e.g. 'carat' and 'color', as these will act as the titles for the plots. I've used ggforce's paginate to allow me to spread it over multiple pages, as I'd like each page to show results by a group - here I've added values '1', '2', or '3' to each row of the dataframe. Whilst I could subset the dataframe and create the plots individually, the issue is that the widths of the bars aren't consistent between pages, even when I add width = x to geom_bar (though the widths are the same within each page).
Does anyone have an idea of how I can accomplish this? I was wondering if aes_string would help, but wasn't sure it'd work with the multiple facets I need.
Part 2
When I try and add in some code to save the images it overrides the grid.arrange ... command to specify plot size (so they are all consistent) and adjusts to fill the white space. Is this easily fixed?
Thanks,
Cal
library(ggplot2)
library(ggforce)
library(plyr)
library(dplyr)
library(grid)
library(egg)
df <- diamonds
df$Group <- rep(1:3, length.out = nrow(df))
# iterate over the three group values rather than over every row of df$Group
for (i in unique(df$Group)) {
  p <- ggplot(data = df, aes(x = cut, y = clarity, fill = price)) +
    # preserve = 'single' keeps all bars the same width, rather than adjusting to
    # the space
    geom_bar(position = position_dodge2(preserve = 'single'),
             stat = "identity", color = "black", size = 0.2) +
    # paginate allows the chart to be printed on multiple pages
    # strip.position adds facet title to top of page
    facet_wrap_paginate(c("carat", "color"), ncol = 3, nrow = 3,
                        scales = "fixed", strip.position = "top", page = i)
  # manually adjust the size of the plot panels
  grid.arrange(grobs = lapply(
    list(p),
    set_panel_size,
    width = unit(8, "cm"),
    height = unit(5, "cm")
  ))
}

Related

How to alter distances between plots in a 4 X 4 graph panel?

I am trying to create a graph panel with 8 graphs in total (4 x 4). Each graph corresponds to a different gene, whereby there are three lines (one for control, one for UC disease and one for Crohns), representing the average change in expression comparing a first measurement and a second.
The code I am using to run each of the plots is;
s <- ggplot(X876, aes(x=Timepoint, y=value, group=Group)) +
geom_line(aes(color=Group), size=1)+
geom_point(aes(color=Group), size=2.5) +
labs(y="X876") + ylim(0.35, 0.55) +
theme_classic() +
scale_color_manual(values=c("darkmagenta", "deepskyblue4", "dimgrey"))
Using grid.arrange(l, m, n, o, p, q, r, s, nrow=4, ncol=4) creates a graph panel where the y-axis names overlap.
I have seen on here about changing the plot margins via,
pl = replicate(3, ggplot(), FALSE)
grid.arrange(grobs = pl)
margin = theme(plot.margin = unit(c(2,2,2,2), "cm"))
grid.arrange(grobs = lapply(pl, "+", margin))
However, I am unsure how this can be applied to increase the vertical height between the plots on the top and bottom rows. For each of the graphs l, m, n, o, p, q, r, s do I need to include
+ theme(plot.margin=unit(c(t,r,b,l),"cm"))
and then run the grid.arrange(l, m, n, o, p, q, r, s, nrow=4, ncol=4)
Please could somebody suggest which values I need to include for top (t), right (r), bottom (b) and left (l) to increase only the distance between the top and bottom rows (by about 3 cm)? I am trying different values and I'm not getting a decent graph panel yet.
Thank-you
Probably the easiest way is to create your own theme based on the theme_classic theme and then modify the plotting margins (and anything else) the way that you prefer.
theme_new <- theme_classic() +
theme(plot.margin=unit(c(1,0,1,0), "cm")) # t,r,b,l
Then set the theme (will revert back to the default on starting a new R session).
theme_set(theme_new)
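If only the gap between the two rows should grow, the margins can also be set per plot rather than through one global theme. The following is a minimal sketch, assuming the eight plots sit in two rows of four; the mtcars plots are hypothetical stand-ins for l, m, n, o, p, q, r, s, and the 1.5 cm values are just one way of splitting a gap of roughly 3 cm.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical stand-ins for the eight plots l, m, ..., s from the question
plots <- replicate(
  8,
  ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() + theme_classic(),
  simplify = FALSE
)

# Split a ~3 cm gap between the rows: 1.5 cm below each top-row plot
# and 1.5 cm above each bottom-row plot (margin() takes t, r, b, l)
top_margin    <- theme(plot.margin = margin(t = 0.2, r = 0.2, b = 1.5, l = 0.2, unit = "cm"))
bottom_margin <- theme(plot.margin = margin(t = 1.5, r = 0.2, b = 0.2, l = 0.2, unit = "cm"))

top_row    <- lapply(plots[1:4], function(p) p + top_margin)
bottom_row <- lapply(plots[5:8], function(p) p + bottom_margin)

panel <- c(top_row, bottom_row)
grid.arrange(grobs = panel, nrow = 2, ncol = 4)
```

Because the extra space is added inside each plot's own margin, the panels in both rows shrink by the same amount and stay the same size as each other.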
The alternative is to use grid.arrange and modify the margins using the grobs as you've already mentioned.
Once the panels have been arranged, you can then modify the top and bottom margins (or left and right) by specifying the vp argument of grid.arrange, which allows you to modify the viewport of multiple grobs on a single page. You can specify the height and width using the viewport function from the grid package.
For example, if you have a list of ggplot() grobs called g.list that contains your individual plots (l,m,n,o,p,q,r,s), then the following would reduce the height of the viewport to 90% of the page, which effectively increases the top and bottom margins equally, by 5% each.
library(grid)
library(gridExtra)
grid.arrange(grobs = g.list, vp=viewport(height=0.9))
Without your data, I can't test it, especially to see if the y-axis labels overlap. And I don't know why you think increasing the top and bottom margins can solve that problem, since the y-axes are, by default, on the left-hand side of the graph.
Anyway, I'll use the txhousing dataset from the ggplot2 package to see if I can reproduce your problem.
library(ggplot2)
data(txhousing)
theme_new <- theme_classic() +
theme(plot.margin=unit(c(0.1,0.1,0.1,0.1), "cm"), text=element_text(size=8))
theme_set(theme_new)
tx.list <- split(txhousing, txhousing$year)
g.list <- lapply(tx.list, function(data) {
  ggplot(data, aes(x = listings, y = sales)) +
    geom_point(size = 0.5)
})
grid.arrange(grobs = g.list, vp=viewport(height=0.9))
I don't see any overlapping. And I don't see why increasing the top and bottom margins would make much difference.
The question was asked a couple of years ago, but I bumped into it only now and thought that I might share a quick and dirty tip for this, which works well enough in many cases.
In some situations the theme is already so complex that this trick might be the easiest way: adding a few \n's (newlines) to the x and y axis names, as this will affect the distances between the plots in the panel. I've learned this trick for a slightly different purpose from here (originally from here).
I'll use the same logic for the example dataset (in this case: Orange from R built-in data sets) as in the excellent code by the previous answerer.
library(ggplot2)
library(gridExtra)
or.list <- split(Orange, Orange$Tree)
g.list <- lapply(or.list, function(data) {
  ggplot(data, aes(x = age, y = circumference)) +
    theme_classic() +
    geom_point(size = 0.5) +
    scale_x_continuous(name = "Age\n\n") +
    scale_y_continuous(name = "\n\n\nCircumference")
})
grid.arrange(grobs = g.list)

ggplot geom_text_repel text exceeding the limit of plot

How can I prevent geom_text_repel() from displaying only part of a label when the label is close to the plot boundary? Here is an example with facet_grid: in the chr3 facet, the label "ZNF717" at the top is not completely displayed.
Example with mtcars, forcing 20 facets and long labels:
mtcars %>%
rowwise() %>%
mutate(label="test_label") %>%
mutate(facet=runif(n = n(),min = 1,max=20)) %>%
ggplot(aes(x=disp,y=hp,label=label)) +
geom_text_repel() +
facet_grid(~facet)
Each panel is self contained and by default plotting is limited to the plotting area. This can be overridden by modifying the default coordinates. With this extreme example, using facet_wrap() with two rows was needed. I also decreased the font size of the labels, and restricted repulsion so that it moves labels only vertically. (Obviously tick labels and panel names would need to be tweaked further in real use.)
library(ggplot2)
library(ggrepel)
library(dplyr)
mtcars %>%
rowwise() %>%
mutate(label="test_label") %>%
mutate(facet=runif(n = n(),min = 1,max=20)) %>%
ggplot(aes(x=disp,y=hp,label=label)) +
geom_text_repel(direction = "y", hjust = 0.5, size = 2) +
facet_wrap(~facet, nrow = 2) +
coord_cartesian(clip = "off")
The code above answers the question but creates a new problem, at least in the mtcars example: because geoms work on a panel-by-panel basis, the repulsion cannot prevent overlap of labels that extend into neighbouring panels. Surprisingly, some unexpected clipping also takes place on the left side when saving to bitmap formats, but not when saving to PDF (at least within RStudio).
A further option is to make sure that the labels fit in the available space, either by using the angle aesthetic to rotate the labels or by abbreviating the text used for labels.
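As a rough sketch of that last option, the labels can be shortened with base R's abbreviate() and rotated by passing angle to geom_text_repel(). The data frame cars_df, the minlength of 6 and the faceting by cyl are assumptions chosen only to keep the example small; they are not taken from the question.

```r
library(ggplot2)
library(ggrepel)

# Assumed example data: car names as labels, shortened with abbreviate()
cars_df <- mtcars
cars_df$label <- abbreviate(rownames(mtcars), minlength = 6)

p <- ggplot(cars_df, aes(x = disp, y = hp, label = label)) +
  # angle = 90 rotates every label; direction = "y" repels vertically only
  geom_text_repel(angle = 90, direction = "y", size = 2) +
  facet_wrap(~cyl)

print(p)
```

Rotated labels take up less horizontal room, so they are less likely to run past the left and right edges of narrow panels, at the cost of needing more vertical space.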

ggplot2 multi-variable scatterplot, Changing Labels and View in Margins

I am trying to create a scatterplot based on four values. My data is just lists of prices (BASIC,VALUE,DELUXE,ULTIMATE). I want VALUE and DELUXE to be the two axis (x,y) and then have the size and color of the points represent the data for the other two columns.
It is hard to set up a reproducible example, because it is only an issue when I get a lot of values listed. I have about 300 points, with about 30 different color/value labels (for ULTIMATE) and 20 size/value labels (for BASIC).
> gg <- ggplot(d, aes(x=DELUXE_PRICE, y=VALUE_PRICE,color=ULTIMATE_PRICE,size=BASIC_PRICE)) + geom_point(alpha = 1)
> plot(gg)
My code does this well, and lists the colors/size with the corresponding value on the side. This is great, but I would like to alter how that is displayed, so that it is not cut off. I would like to be able to "wrap" the values into more columns, or shrink the display size of those so that they fit.
Currently, this lists ULTIMATE in three columns, to the right of the plot area, but cuts off the top of the labels (it extends well above the plot area)
This lists BASIC size/value labels to the right of the plot area, below ULTIMATE labels, in one column, so about half are cut off at the bottom.
I can increase the margins with:
> gg <- ggplot(d, aes(x=DELUXE_PRICE, y=VALUE_PRICE,color=ULTIMATE_PRICE,size=BASIC_PRICE)) + geom_point(alpha = 1) +theme(plot.margin = unit(c(4,2,4,2), "cm"))
> plot(gg)
This gets more of it in, but creates lots of white area and a smaller view of the plot. I would like to be able to just increase the right margin if necessary, and "wrap" the labels in more columns extending to the right (i.e. put ULTIMATE into 4 columns instead of 3, and put BASIC into 3-4 columns instead of 1), so that they are shorter and don't run out of the plot area.
I found some built-in functionality that does what I need. It lies in adding a guides() call to the plot, specifying whether I am dealing with the color or size legend, and specifying the number of columns with "ncol = " (you can also specify rows). Giving each legend an order value also lets you control the order in which the legends appear, so my resulting code was:
> gg <- ggplot(Table, aes(x=DELUXE_PRICE, y=VALUE_PRICE,color=ULTIMATE_PRICE,size=BASIC_PRICE)) + geom_point(alpha = 1) + guides(color = guide_legend(order = 0,ncol = 4),size = guide_legend(order = 1,ncol = 4))
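Since the original data is not available, here is a minimal reproducible sketch of the same idea on the mpg dataset that ships with ggplot2. The column choices (manufacturer for color, cylinders for size) and the theme values are assumptions for illustration only; the theme() line additionally shrinks the legend keys and text, which was the other thing the question asked about.

```r
library(ggplot2)

# Assumed stand-in data: discrete color and size variables with many levels
plot_data <- mpg
plot_data$cyl_f <- factor(plot_data$cyl)

p <- ggplot(plot_data, aes(x = displ, y = hwy,
                           color = manufacturer, size = cyl_f)) +
  geom_point(alpha = 1) +
  # wrap each legend into several columns and fix their order
  guides(color = guide_legend(order = 1, ncol = 3),
         size  = guide_legend(order = 2, ncol = 2)) +
  # shrink legend keys and text so more entries fit
  theme(legend.key.size = unit(0.4, "cm"),
        legend.text = element_text(size = 7))

print(p)
```

ggplot2 warns that mapping size to a discrete variable is not advised; the warning is harmless here and only reflects the choice of stand-in data.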

sjp.frq - different colors for bars in r

I am preparing a series of plots using the sjPlot package. For simple frequency presentation I use sjp.frq. I would like to use a different color for each bar. I found the option to choose a color, but it works only for the whole series: the geom.colors argument changes the color of all bars. Even the combination geom.colors=c("color1","color2","color3") doesn't work.
Is there any solution to achieve something similar to this:
data(mpg)
sjp.frq(mpg$year,title = "", axis.title = "",
show.prc = TRUE, show.n = FALSE,
show.axis.values = FALSE)
I'm not sure, but I think I recall that ggplot2 used this color scheme by default for plots if no color aesthetic was specified. However, later versions of ggplot2 now use a single color for simple frequencies (without grouping/colour aesthetics):
library(ggplot2)
library(sjmisc)
data(efc)
ggplot(efc, aes(e42dep)) + geom_bar()
That's why the image you posted has different colors, while sjp.frq now prints bars in one color only. Since you don't have a grouping aesthetic for simple frequency bars, you can't provide different colors for each geom / bar in sjp.frq. In this case, you have to find your own solution and add a group aesthetic, like:
ggplot(efc, aes(e42dep, fill = to_label(e42dep))) +
geom_bar() +
labs(y = NULL, x = get_label(efc$e42dep), fill = get_label(efc$e42dep)) +
scale_x_continuous(breaks = c(1:4), labels = get_labels(efc$e42dep))
However, to me it does not make much sense to give each bar a separate color and also provide axis labels. Using a legend instead of axis labels (dropping the axis labels) would work, but this makes the graph less intuitive, because you have to switch between legend and bars to find out which bar represents which category. For simple frequency plots, this is unnecessary complexity.
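For the mpg$year example from the question, the same approach can be sketched without sjmisc. This is only an illustration in plain ggplot2, not an sjp.frq feature; the two colors are arbitrary choices, and the legend is switched off because the axis labels already identify the bars.

```r
library(ggplot2)

# Arbitrary example colors, one per value of mpg$year
bar_colours <- c("1999" = "steelblue", "2008" = "darkorange")

p <- ggplot(mpg, aes(x = factor(year), fill = factor(year))) +
  # fill is mapped to the same variable as x, so every bar gets its own color
  geom_bar(show.legend = FALSE) +
  scale_fill_manual(values = bar_colours) +
  labs(x = NULL, y = NULL)

print(p)
```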

Equal size for multiple panels with different y-axis scales in lattice

I have multiple variables of a time series that differ in their scales. I want to plot each variable over time in a single-page, and each plot will have its own y-axis.
Seems to be easy, but I have a symmetry problem, since the plots that have higher values for y-axis were flattened to the right compared with the ones with smaller values for y-axis. Another problem with the panel size appeared when I decided to keep the x-axis only in two plots. These panels became more flattened than the others.
I'm relatively new to lattice and I have searched a lot with no success. First I tried to arrange the plots with grid.arrange, but I can't modify a specific panel with this function. So I tried to arrange the plots with print and then use panel.widths and panel.heights, but that doesn't give exactly equal sizes for all panels.
Any suggestions to get multiple panels with equal sizes considering different y-axis and x-axis presence/absence? Example below:
library(lattice)
#Data
a<-c(1058.2557,821.2002,1004.5201,296.8243,374.3730,746.0718,954.6511,264.7352)
b<-c(100,60,40,36,42,32,42,32)
c<-c(116.610418,164.462337,47.862511,12.613479,4.253702,39.868584,21.591731,6.037917)
d<-c(4,10,3,2,1,5,11,13)
e<-c(20,30,10,50,21,60,20,70)
est1<-c("16:00","19:00","22:00","01:00","04:00","07:00","10:00","13:00")
newest1<-factor(est1,levels=unique(est1))
mysettings<-list(layout.heights=list(top.padding=-1,bottom.padding=-1),
layout.widths=list(right.padding=-2))
plo1<-barchart(a~newest1,scales=list(x=list(alternating=0)),par.settings=mysettings)
plo2<-barchart(b~newest1,scales=list(x=list(alternating=0)),par.settings=mysettings)
plo3<-barchart(c~newest1,scales=list(x=list(alternating=0)),par.settings=mysettings)
plo4<-barchart(d~newest1,scales=list(x=list(rot=45)),par.settings=mysettings)
plo5<-barchart(e~newest1,scales=list(x=list(rot=45)),par.settings=mysettings)
trellis.device(windows, height=6, width=7)
print(plo1, split=c(1,1,2,3),more=T)
print(plo2, split=c(2,1,2,3),more=T)
print(plo3, split=c(1,2,2,3),more=T)
print(plo4, split=c(2,2,2,3),more=T)
print(plo5, split=c(1,3,2,3),more=F)
Generally you wouldn't layout related plots like that in lattice. You would typically use a grouping variable. For this to work, you need all your data in one data.frame
dd <- data.frame(make.groups(a=a,b=b,c=c,d=d,e=e), newest1=newest1)
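To see what this produces, here is a minimal sketch on two toy vectors (the names a_toy, b_toy, time_toy and toy are made up for the illustration). make.groups() stacks the vectors into a `data` column and records their origin in a `which` factor; data.frame() then recycles the shorter time factor alongside.

```r
library(lattice)

# Toy vectors standing in for a, b, ... and newest1
a_toy    <- c(1, 2, 3)
b_toy    <- c(10, 20, 30)
time_toy <- factor(c("t1", "t2", "t3"))

# Stack the vectors; time_toy (length 3) is recycled to the 6 stacked rows
toy <- data.frame(make.groups(a = a_toy, b = b_toy), time = time_toy)

str(toy)
```

The `which` column is what the conditioning term `| which` in the barchart formula refers to.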
And to make things look a bit nicer I'll define a custom axis function
axis.yout <- function(side, ...) {
  if (side %in% c("left", "right")) {
    if (panel.number() %% 2 == which(c("right", "left") == side) - 1) {
      panel.axis(side = side, outside = TRUE)
    }
  } else {
    axis.default(side = side, ...)
  }
}
Now I plot with
now I plot with
barchart(data~newest1 | which, dd, layout=c(2,3),
scales=list(alternating=T, y=list(relation="free")),
par.settings=list(layout.widths=list(right.padding=5, axis.panel = c(1, 0))),
axis=axis.yout
)
which results in a set of panels that all share a common x-axis while allowing for free and independently labeled y-axes. The spacing/padding is all consistent because we used a single call to lattice. Normally you wouldn't bother with a custom axis function like this, but when the scales relation is "free", lattice gets a bit grumpy about alternating labels.
I am sure someone will post a nice lattice solution. Meanwhile, you may consider a ggplot alternative.
library(reshape2)
library(ggplot2)
First, collect your vectors in a data frame, and reshape data from a wide to a long format:
df <- data.frame(newest1, a, b, c, d, e)
df2 <- melt(df, id.var = "newest1")
Plot the data in separate facets, one facet for each of the original vectors (which in the melted data ("df2") appear as different levels of the "variable" variable). We allow independent ("free") y axis scales in each facet:
ggplot(data = df2, aes(x = newest1, y = value)) +
geom_bar(stat = "identity") +
facet_wrap(~ variable, scales = "free_y") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
