Adjusting the relative space of facets (without regard to coordinate space) - r

I have a primary graph and some secondary information that I want to facet in another graph below it. Faceting works great, except that I do not know how to control the relative space used by one facet versus another. I am aware of space='free', but this is only useful if the ranges correspond to the desired relative sizing.
So for instance, I may want a graph where the first facet occupies 80% and the second 20%. Here is an example:
library(ggplot2)
data <- rbind(
data.frame(x=1:500, y=rnorm(500,sd=1), type='A'),
data.frame(x=1:500, y=rnorm(500,sd=5), type='B'))
ggplot() +
geom_line(aes(x=x, y=y, colour=type), data=data) +
facet_grid(type ~ ., scales='free_y')
The above creates 2 facets of equal vertical dimension. Adding in space='free' in the facet_grid function changes the dimensions such that the lower facet is roughly 5x larger than the upper (as expected).
Supposing I want the upper to be 2x as large, with the same data set and ordering of facets. How can I accomplish this?
Is the only way to do this with some trickery in rescaling the data set and manually overriding axis labels (and if so, how)?
Alternative
As indicated below, I can use viewports to render the output as multiple graphs. I had considered this and in fact implemented this approach in the past with standard plot and viewports.
The problem is that it is very difficult to get the x-axes to align with this approach. So if there is a way to fix the size of the y-axis label region and the size of the legend region, I can produce 2 graphs that have the same rendering area.

You don't need to use facets for this - you can also do this by using the viewport function.
library(grid)     # provides viewport()
library(ggplot2)  # provides qplot()
ratio <- 1/3
v1 <- viewport(width=1, height=ratio, y=1-ratio/2)      # top panel
v2 <- viewport(width=1, height=1-ratio, y=(1-ratio)/2)  # bottom panel
print(qplot(1:10, 11:20, geom="point"), vp=v1)
print(qplot(1:10, 11:20, geom="line"), vp=v2)
Here ratio is the proportion of the page height given to the top panel. Try 2/3 and 4/5 as well.
This approach can get ugly if the legends or axis labels in the two plots are different sizes, but for a fix, see the align.plots function in the ggExtra package and ggplot2 author Hadley Wickham's notes on this very topic.
There's no easy way to do this with facets currently, although if you are prepared to go down to editing the Grid, you can modify the ggplot graph after it has been plotted to get this effect.
See also this question on using grid and ggplot2 to create join plots using R.
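As a rough illustration of the "edit the grid objects" route mentioned above, the sketch below converts the faceted plot to a gtable and rescales the rows that hold the panels. It assumes a ggplot2 version in which ggplotGrob() returns a gtable whose panel entries have names starting with "panel"; the variable names and the 2:1 ratio are only illustrative.

```r
library(ggplot2)
library(grid)

set.seed(1)  # illustrative data, same shape as in the question
data <- rbind(
  data.frame(x = 1:500, y = rnorm(500, sd = 1), type = "A"),
  data.frame(x = 1:500, y = rnorm(500, sd = 5), type = "B"))

p <- ggplot(data, aes(x = x, y = y, colour = type)) +
  geom_line() +
  facet_grid(type ~ ., scales = "free_y")

# Convert the plot to a gtable and find the layout rows that hold the panels
gt <- ggplotGrob(p)
panel_rows <- sort(unique(gt$layout$t[grepl("^panel", gt$layout$name)]))

# Give the upper panel row twice the relative height of the lower one
gt$heights[panel_rows] <- unit(c(2, 1), "null")

grid.newpage()
grid.draw(gt)
```

Because both panels stay in one plot, the x-axes remain aligned and the legend is shared, which avoids the alignment problem of the viewport approach.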

Kohske Takahashi posted a patch to facet_grid that allows specification of the relative sizing of facets. See the thread:
http://groups.google.com/group/ggplot2/browse_thread/thread/7c5454dcc04bc7b8
With luck we'll see this in a future version of ggplot2.

Related

Formatting legend to fit 45+ legend items in R

I need help formatting a legend in ggplot2. I have approximately 45 legend items. When I display the legend, my graph shrinks because the graph and legend items don't fit. I'm wondering how I can get all my legend items to display, but also have a reasonably sized graph. Is there a way to make my longer legend items go over multiple lines? Or is there a way to make some legend items occupy more of the white space above/below the page? Any help will be super appreciated! Below is a screenshot of my current plot, along with my code.
guild_chart <-
ggplot(chart, aes(x=factor(Site,levels=level_order1), y=`Row 1`, fill=Label)) +
geom_bar(stat="identity") +
scale_fill_manual(values =colfundose) +
theme_bw()+ ylab("# of reads") +
xlab("Location")
A frame challenge, if I may:
This is probably a bad way to visualise this data. The groups are impossible to distinguish from one another, and very difficult to compare. What is the purpose of this graph? What information do you wish to convey to the viewer? With that question in mind, think about how you can design the graph in a legible way.
To increase legibility, I would consider combining factors into groups and visualising these instead of the individual levels you are currently displaying.
As others have noted, presenting your data in this stacked bar layout is difficult to interpret. In addition to the challenge in discriminating between different groups, it's also tough to estimate the number of reads for any sort of comparison.
As an alternative presentation, would it make sense to visualize these read counts as a heatmap? You could have a column (or row) for each of your seven locations containing 45 squares, colored to indicate # of reads. Now your legend is a color gradient with the range of read counts across the dataset. An advantage here is you can keep your 45 categories, if this is important, but have them right next to their respective rows, minimizing lookups in a legend.
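A minimal sketch of the heatmap idea, using made-up data because the original chart data frame is not shown. The column names Site, Label and Reads, the seven sites and the 45 labels are assumptions for illustration only.

```r
library(ggplot2)

set.seed(42)  # made-up data: 7 sites x 45 labels
reads <- expand.grid(
  Site  = paste("Site", 1:7),
  Label = sprintf("Taxon %02d", 1:45))
reads$Reads <- rpois(nrow(reads), lambda = 200)

# One tile per site/label pair; the legend becomes a single colour gradient
heatmap_chart <- ggplot(reads, aes(x = Site, y = Label, fill = Reads)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "white", high = "steelblue", name = "# of reads") +
  theme_bw() +
  xlab("Location") + ylab(NULL)

print(heatmap_chart)
```

The 45 category names now sit on the y-axis next to their rows, so no per-category legend is needed.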

ggplot draw multiple plots by levels of a variable

I have a sample dataset
d=data.frame(n=rep(c(1,1,1,1,1,1,2,2,2,3),2),group=rep(c("A","B"),each=20),stringsAsFactors = F)
And I want to draw two separate histograms based on group variable.
I tried this method suggested by @jenesaisquoi in a separate post here
Generating Multiple Plots in ggplot by Factor
ggplot(data=d)+geom_histogram(aes(x=n,y=..count../sum(..count..)),binwidth = 1)+facet_wrap(~group)
It did the trick but if you look closely, the proportions are wrong. It didn't calculate the proportion for each group but rather a grand proportion. I want the proportion to be 0.6 for number 1 for each group, not 0.3.
Then I tried the dplyr package, and it didn't even create two graphs. It ignored the group_by command, although the proportion is right this time.
d%>%group_by(group)%>%ggplot(data=.)+geom_histogram(aes(x=n,y=..count../sum(..count..)),binwidth = 1)
Finally I tried factoring with color
ggplot(data=d)+geom_histogram(aes(x=n,y=..count../sum(..count..),color=group),binwidth = 1)
But the result is far from ideal. I was going to accept one output but with the bins side by side, not on top of each other.
In conclusion, I want to draw two separate histograms with correct proportions calculated within each group. If there is no easy way to do this, I can live with one graph but having the bins side by side, and with correct proportions for each group. In this example, number 1 should have 0.6 as its proportion.
Changing ..count../sum(..count..) to ..density.. gives you the desired proportion (with binwidth = 1, the density of each bin equals its proportion within the group):
ggplot(data=d)+geom_histogram(aes(x=n,y=..density..),binwidth = 1)+facet_wrap(~group)
You actually have the separation of charts by variable correct! Especially with ggplot, you sometimes need to consider the scales of the graph separately from the shape. Facet_wrap applies a new layer to your data, regardless of scale. It will behave the same, no matter what your axes are. You could also try adding scale_y_log10() as a layer, and you'll notice that the overall shape and style of your graph is the same, you've just changed the axes.
What you actually need is a fix to your scales. Understandable: frequency plots can be confusing. ..count../sum(..count..) treats each bin as an independent unit, regardless of its value, and in your faceted plot it gave the share of the grand total (0.3) rather than the share within each group (0.6). See a good explanation of this here: Show % instead of counts in charts of categorical variables
What you want is ..density.., which is the count divided by the total count and by the bin width; with binwidth = 1 it is simply the proportion. The difference is subtle in principle, but the important bit is that the bin width, and therefore the scale of the x-axis, matters. For an extreme case of this, see here: Normalizing y-axis in histograms in R ggplot to proportion, where tiny x-axis values produced huge densities.
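To make the role of the bin width concrete, here is a small check using base R's hist(), which reports both counts and densities. The vector x is a made-up rescaling of the question's n values (divided by 100), chosen only to mimic the "tiny x-axis values" case.

```r
# Same shape as n in the question, but on a much smaller x scale
x <- c(rep(0.01, 6), rep(0.02, 3), 0.03)

# Three bins of width 0.01, each centred on one of the values
h <- hist(x, breaks = c(0.005, 0.015, 0.025, 0.035), plot = FALSE)

proportion <- h$counts / sum(h$counts)  # share of observations per bin
width      <- diff(h$breaks)            # bin widths

print(proportion)
print(h$density)

# Density is the proportion divided by the bin width
stopifnot(isTRUE(all.equal(h$density, proportion / width)))
```

With bins of width 0.01 the densities come out roughly 100 times larger than the proportions, whereas with a bin width of 1 the two coincide.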
Your original code will still work, just substituting the aesthetics I described above.
ggplot(data=d)+geom_histogram(aes(x=n,y=..density..),binwidth = 1)+facet_wrap(~group)
If you're still confused about density, so are lots of people. Hadley Wickham wrote a long piece about it; you can find it here: http://vita.had.co.nz/papers/density-estimation.pdf
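For the fallback mentioned in the question (one graph with the bars side by side), one possible approach is to compute the within-group proportions first and plot them with geom_col. This is only a sketch that swaps the histogram stat for a precomputed table; the names props and side_by_side are illustrative.

```r
library(ggplot2)

d <- data.frame(n = rep(c(1,1,1,1,1,1,2,2,2,3), 2),
                group = rep(c("A", "B"), each = 20),
                stringsAsFactors = FALSE)

# Row-wise proportions: each group's counts divided by that group's total
props <- as.data.frame(prop.table(table(group = d$group, n = d$n), margin = 1))
names(props)[names(props) == "Freq"] <- "proportion"

# Dodged bars place the groups next to each other for every value of n
side_by_side <- ggplot(props, aes(x = n, y = proportion, fill = group)) +
  geom_col(position = "dodge")

print(side_by_side)
```

Because the proportions are computed per group before plotting, the value 1 gets a proportion of 0.6 in both groups regardless of how the plot is faceted or coloured.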

How to preserve plot sizes when stacking with common x-axes in ggarrange?

I am trying to stack plots with common x- and y-axes in ggplot. What I want to do is have only the bottom plot show the x-axis labels and title. But I've never been able to figure out how to do this cleanly in ggplot2 without the bottom plot being squished by virtue of carrying the x-axis labels/title. There must be an easy way to do this. Everyone wants to stack graphs, right?!
I'm currently trying with ggarrange. Example code below. Note that the bottom plot gets compressed vertically because it has the tick and axis labels. I could just have the top two have white font labels/title, but then there is an unseemly amount of margin space between the three if you use that hack.
I'm definitely open to packages other than ggpubr, but I am hoping for something not too elaborate that I can use in subsequent situations, as I'm sure I'll encounter this again...
Help, please!! -Ryan
#
require(ggplot2); require(ggpubr)
X=data.frame(seq(as.Date("2001-01-01"),as.Date("2001-12-31"),by='days')); colnames(X)='date'
X$Y1=sample(80:100,size=nrow(X),replace=T)
X$Y2=sample(100:120,size=nrow(X),replace=T)
X$Y3=sample(50:70,size=nrow(X),replace=T)
plot.Y1= ggplot(X, aes(x=date,y=Y1))+
geom_line()+lims(y=c(50,150))+
theme(axis.title.x = element_blank(),axis.text.x=element_blank())
plot.Y2= ggplot(X, aes(x=date,y=Y2))+
geom_line()+lims(y=c(50,150))+
theme(axis.title.x = element_blank(),axis.text.x=element_blank())
plot.Y3= ggplot(X, aes(x=date,y=Y3))+
geom_line()+lims(y=c(50,150))
x11(10,8)
ggarrange(plot.Y1,plot.Y2,plot.Y3,nrow=3,ncol=1)
Bottom plot is squished!
Try ggarrange from the egg package instead, which lays out the plots by aligning their panels:
egg::ggarrange(plot.Y1,plot.Y2,plot.Y3,ncol=1)

How to enlarge the size of facet in R

I'm working on the following dataset where each facet shows the bleaching for one kind of coral at one site across the time period. My problem is how to enlarge the size of each facet to see the trend more clearly, as in current facets, it is hard to see the trend because of the small change in bleaching....
Here is my code:
cb1<-aggregate(cb$latitude, list(Site=cb$site), mean)
cb$site=factor(cb$site, levels=cb1$Site[order(cb1$x)])
ggplot(cb,aes(year,bleaching)) +
geom_point() +
facet_grid(site~kind) +
geom_smooth(method="lm",color="grey") +
coord_cartesian(ylim=c(0,1))
Due to the current size of the grid of facets, some lines seem flat but actually they are not.
You cannot really increase the size of the facets unless you increase the size of the plot overall. One option would be to save a large version of the plot:
p<-ggplot(cb,aes(year,bleaching))+geom_point()+facet_grid(site~kind)+geom_smooth(method="lm",color="grey")+coord_cartesian(ylim=c(0,1))
ggsave("file_name.jpg", plot = p, width = 24, height = 24, units = "in")
If you have limited space (e.g. the plot has to go on an A4 sheet) then the facet_grid_paginate function from ggforce would be a good option. It allows you to split faceted plots over multiple pages. You can define the number of rows and columns per page. See this link.
Alternatively, if you want to show more clearly that the lines are not flat, you can try toying with a couple of the arguments to facet_grid. facet_grid allows you to set scales to free, free_x or free_y. Setting free_y would mean that each row of facets has its own y-axis, not necessarily between 0 and 1 (assuming you also removed the ylim=c(0,1)). This would, however, make the facets more difficult to compare with each other.
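A minimal sketch of the free y-axis option, using a made-up stand-in for the cb data frame since the original data is not shown. The site and kind values and the size of the trend are assumptions for illustration.

```r
library(ggplot2)

set.seed(7)  # made-up stand-in for the cb data frame
cb_demo <- expand.grid(year = 2000:2010,
                       site = c("North", "South"),
                       kind = c("hard", "soft"))
cb_demo$bleaching <- 0.2 + 0.005 * (cb_demo$year - 2000) +
  rnorm(nrow(cb_demo), sd = 0.01)

# scales = "free_y" lets the y-axis adapt to the data; note that
# coord_cartesian(ylim = c(0, 1)) is deliberately left out
p_free <- ggplot(cb_demo, aes(year, bleaching)) +
  geom_point() +
  geom_smooth(method = "lm", color = "grey") +
  facet_grid(site ~ kind, scales = "free_y")

print(p_free)
```

With the fixed 0 to 1 range removed, a small slope such as the one simulated here becomes visible, at the cost of y-axes that no longer match across rows.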

How does one plot a 3D stacked histogram in R?

I want to plot stacked histograms in R; i.e. stack individual histograms in the third dimension.
Thank you all for your suggestions, especially the one by Shane.
@hadley, I agree with your points; however, my situation is different: the main point I'm trying to convey by plotting four stacked histograms is that the tails vary significantly. The part that will get obscured is of no consequence in the data I'm presenting. Also, being able to read the frequency axis is not important since I'll be plotting the relative frequencies.
One doesn't. This is a terrible display of data because the front histograms obscure the rear histograms and the perspective makes it just about impossible to read the values off the y-axis.
You could try using either rgl (see here) or scatterplot3d (as in this example). Lattice also supports this:
library(lattice)
library(latticeExtra)
?panel.3dbars
You can see an example of this on the Learnr blog.
I don't believe that's technically a stacked histogram (a stacked histogram stacks the bars on top of each other). Moreover, a different kind of histogram could be more informative: look at the ggplot2 documentation here for some examples.
library(ggplot2)
hist_cut <- ggplot(diamonds, aes(x=price, fill=cut))
hist_cut + geom_histogram() # bins price; defaults to stacking
Another option is to use faceting (small multiples) instead, with facet_wrap in ggplot2 (see this post as an example).
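A short sketch of the faceted alternative, using the diamonds data set that ships with ggplot2. It assumes ggplot2 3.3.0 or later for after_stat(); the bin count of 40 is an arbitrary choice.

```r
library(ggplot2)

# One histogram per cut, stacked vertically on a shared x-axis so the
# tails can be compared; density puts the panels on a relative scale
tails <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(aes(y = after_stat(density)), bins = 40) +
  facet_wrap(~ cut, ncol = 1)

print(tails)
```

Nothing is obscured in this layout, and differences in the tails show up as differences in how far each panel's bars extend to the right.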
