Auto fit labels in R boxplot - r

We are currently using R to automatically generate various kinds of boxplots.
The problem we have is that the length of our labels varies considerably between different plots and classes in one plot.
Is there a way to automatically adjust the plot so that all the labels will fit it nicely?
Specifying a worst case mar isn't feasible because in some plots the labels are considerably shorter than in others.

Lattice is the graphics library most likely to be helpful here. I say that for two reasons: (i) lattice is based on the grid system, and by accessing grid's graphical primitives you get much finer control over, among other things, the location of your panel output; and (ii) there's more to work with: the standard R graphics package has 70 different parameters, while lattice has 371 by my count (length(names(unlist(trellis.par.get())))). Those 371 are not in a flat structure as in the base package, but are collected in a hierarchy (with 30 or so parameter groups at the top level).
What you want is relative positioning of your axis labels. I would recommend going down one level for this sort of task. So to do what you want, just change the relevant grob slots then just redraw the two grobs (using the R interactive prompt):
library(lattice)
library(grid)
bwplot(~runif(200, 10, 99), xlab="x-axis label", ylab="y-axis label")
# move the x-axis label to the far left
grid.edit("[.]xlab$", grep=T, x=unit(0, "npc"), just="left", redraw=T)
# move it to the far right
grid.edit("[.]xlab$", grep=T, x=unit(1, "npc"), just="right", redraw=T)
# move it to the center
grid.edit("[.]xlab$", grep=T, x=unit(0.5, "npc"), just="center", redraw=T)
# same for y-axis
grid.edit("[.]ylab$", grep=T, y=unit(0.5, "npc"), just="center", redraw=T)
"[.]xlab$": grid.edit takes a gPath object (just a path traversing a gTree, which is just a grob that contains other grobs); because I didn't know where in the gPath my object of interest (the x-axis/y-axis label) resides, I used a regular expression for the object name;
"grep=T": just tells grid.edit to treat the previous parameter as a regular expression;
"x=unit(0.5, 'npc')": specifies viewport coordinates (in this case, just the x value); 'npc' ('normalized parent coordinates', which is the default) treats the viewport origin as (0,0) and assigns the viewport a width and height of 1 unit each. Hence, I've specified the center of the viewport along the x axis.

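If the regular expression in the grid.edit() calls above matches nothing, it can help to look at the grob names first. A minimal sketch, assuming the plot is printed explicitly (needed outside the interactive prompt) and that lattice gives its label grobs names ending in .xlab and .ylab, as the patterns above imply; the variable names are illustrative:

```r
library(lattice)
library(grid)

# Draw the plot; print() is required when this runs from a script.
print(bwplot(~runif(200, 10, 99), xlab = "x-axis label", ylab = "y-axis label"))

# List every grob on the current display list without printing the tree.
grob_names <- grid.ls(print = FALSE)$name

# Keep only the axis-label grobs that the grid.edit() patterns target.
label_grobs <- grep("[.][xy]lab$", grob_names, value = TRUE)
label_grobs
```

Any name returned here can be passed to grid.edit() directly, without grep=T.
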
With the base plotting system, a quick solution could be to rotate the x-labels to be vertical, using las=2 or las=3. Of course this also only works if your labels are not extremely long, but beyond a certain label length you will run into trouble with any type of plot anyway (shortening labels would be the way to go then).
But I agree with @doug that for more fine grained control, lattice or ggplot2 should be considered.
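For base graphics, one possible way to avoid a fixed worst-case mar is to measure the labels and size the margin from the longest one. This is only a sketch under some assumptions: the labels are drawn vertically with las=2, the padding of 0.3 inches and the other three margins are arbitrary choices, and the data are made up for illustration.

```r
set.seed(1)
labels <- c("short", "a considerably longer class label", "mid-length label")
values <- list(rnorm(50), rnorm(50, 1), rnorm(50, 2))

# strwidth() needs an initialised plot, so start an empty one first.
plot.new()

# Width of the longest label in inches; with las = 2 this is how far the
# label extends into the bottom margin.
label_in <- max(strwidth(labels, units = "inches"))
bottom_in <- label_in + 0.3  # assumed padding for ticks and spacing

# Set the margins in inches and draw on the page started by plot.new().
op <- par(mai = c(bottom_in, 0.8, 0.4, 0.2), new = TRUE)
boxplot(values, names = labels, las = 2)
par(op)
```

Because the margin is recomputed for each set of labels, plots with short labels keep a small margin.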

Related

Decorana on R and using xlim

So I've been running a DCA analysis on a species/site count spreadsheet (DCA file made using Vegan and decorana command). I'm having a bit of overlap with my points, so I'm trying to extend DCA 1 axis. I keep trying to use the xlim value to narrow it down to -2,2, but it just won't do it. For some reason, it seems tied to the ylim value. If I drop the ylim to -1,1, that will force the xlim to -2,2, but I can't actually have the ylim that small.
> plot(DCA, type = "n", xlim = c(-2,2))
First plot shows result of this command. Trying to include a ylim of (-2,2) didn't change it either. Second plot shows result of this command:
> plot(DCA, type = "n", xlim = c(-2,2), ylim = c(-2,2))
I'm not exactly an expert at this, and I feel like I might be making a stupid mistake. Anyone got any ideas?
The plot is enforcing equal aspect ratio, and if you insist on having full range on y-axis (you do not set ylim), the x-axis will be set to accommodate the range you want to show on y-axis. You must either change the shape of your graphics display for a shorter x-axis, or you must also set the range on y-axis with ylim. Your choice. If you draw on a square graphics window, the output will be square and it will be taken care that both the x and y scale (xlim, ylim) will fit the square. Changing the shape of the graphics window or setting both limits will help. Function locator() can be used to query coordinates in an existing plot, and these can be used to set up the new limits.
It is better to start again from a clean table. My intention was not to be rude, but trying to complement the answer in comments leads into terse messages where details and special cases are hard to handle. So I try to explain here how you can control the display and the axis limits in vegan ordination graphics. This answer has two main parts: (1) why we insist on equal aspect ratio, and (2) how to survive with equal aspect ratio. The latter can indeed be tricky, but we have not made it tricky because we are evil, but it is necessarily tricky.
First about the equal aspect ratio. This means that the numeric scales are equal on the vertical (y) and horizontal (x) axes. In ordination the "importance" of an axis is normally reflected by its length: important axes are long, minor axes are short. In eigenvector methods this is determined by eigenvalues (which actually define the scatter of points along axes). In DCA (decorana) it should be the SD scaling so that one unit is equal to the average width of species responses. We pay great attention to scaling axes accordingly, and we want to show this in the plot so that a long x axis remains long and a short y axis remains short even when the graph is designed for a portrait printer page shape. In this way the axis tick values are equally spaced on both axes, and distances in the graph are equal horizontally, vertically or diagonally. Also, if you draw a circle in the plot, it will remain a circle and not be flattened or elongated into an ellipse. So this is something that we insist on as a necessary feature of an ordination graph.
Insistence on equal scaling of axes comes with a price. One obvious price is that the ordination plot may not fill the graph area, and there can be plenty of empty space in the graph for shorter axes. You can get rid of this by adjusting the graph shape, typically making it flatter. Another price that must be paid is the one that bit you: setting axis limits is tricky. However, this is not a vegan invention: we use the base R plot command with option asp = 1 for an equal aspect ratio of axes.
To see how we can set axis limits with equal ratio, let us generate a regular rectangular grid and plot it:
x <- seq(-2, 2, by=0.2)
xy <- expand.grid(x, x)
plot(xy, asp=1)
This is a square grid on a square plot and nothing very special. However, if we plot the same grid on a rectangle, the aspect ratio remains equal, numeric scales are equal and the points remain equidistant on a square, but there is a lot of empty space and the x-axis has a longer numeric scale (but the points have unchanged numeric scale). If we try to reduce only the x-scale, we face a disappointment:
plot(xy, asp=1, xlim=c(-1,1), main="xlim=c(-1,1)")
The graph is essentially the same as the first, unlimited case: the x-axis range is unchanged. Setting xlim does not remove any points; it only tells plot not to reserve space beyond that range on the x-axis. However, the y-scale is still longer, and with an equal aspect ratio it also sets the scale for the x-axis. Since that leaves empty space in the graph, the points are plotted there (this is analogous to having empty space even when there is nothing to plot there). To really limit the x-axis, you must simultaneously limit the y-axis accordingly:
plot(xy, asp=1, xlim=c(-1,1), ylim=c(-1,1), main="xlim=c(-1,1), ylim=c(-1,1)")
This gives the desired x-limits.
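The difference between the two calls can also be checked numerically with par("usr"), which reports the extent of the plotting region in user coordinates. A small sketch, assuming a square 7 x 7 inch off-screen device (pdf(NULL) writes no file) and the default margins; the variable names are illustrative:

```r
x <- seq(-2, 2, by = 0.2)
xy <- expand.grid(x, x)

pdf(NULL, width = 7, height = 7)  # off-screen square device

# Only xlim is set: the y-range of the data still drives the common scale.
plot(xy, asp = 1, xlim = c(-1, 1))
usr_x_only <- par("usr")  # c(x1, x2, y1, y2)

# Both limits are set: the x-axis is now restricted as requested.
plot(xy, asp = 1, xlim = c(-1, 1), ylim = c(-1, 1))
usr_both <- par("usr")

dev.off()

usr_x_only[1:2]  # extends beyond the data range of -2 to 2
usr_both[1:2]    # stays close to the requested -1 to 1
```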
Like I wrote, we did not invent that nastiness, but this is base R plot(..., asp=1) behaviour. I know this can be tricky (I have used that myself and sometimes I get irritated). I have been thinking could we be more user-friendly and by-pass base R. It is pretty easy to do this in a way that works in many normal use cases, but it is much harder to do this so that it works in most cases, and I don't know how to do this in all possible cases. If anybody knows, pull requests are welcome in vegan.
Finally, there is one tool that may help: vegan has an interactive dynamic plotting tool orditkplot where you can zoom into a plot by selecting a rectangle with left mouse button. However, this function may not work in all R systems, but if it works it gives an easy way of studying details of the plot (but if you have Mac with one-button mouse, don't ask me how it works: I don't know). You can start this with
orditkplot(mod, display="sites") # or "species", but only one
Even without orditkplot you can use the base R function locator(): click the diagonally opposite corners of the rectangle you want to focus on, and this gives you the xlim and ylim you need to set to zoom into this rectangle.
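The locator() step can be wrapped in a small function. This is a sketch only: limits_from_corners is a hypothetical helper, not part of vegan or base R, and the interactive lines are commented out because they need a mouse click on an open plot.

```r
# Hypothetical helper: turn two clicked corners (a list with x and y, as
# returned by locator(2)) into limits that can be passed to plot().
limits_from_corners <- function(corners) {
  list(xlim = range(corners$x), ylim = range(corners$y))
}

# Interactive use (not run here): click two opposite corners, then replot.
# lims <- limits_from_corners(locator(2))
# plot(mod, xlim = lims$xlim, ylim = lims$ylim)

# Non-interactive check with made-up corner coordinates.
lims <- limits_from_corners(list(x = c(1.5, -0.5), y = c(-1, 1)))
lims
```

Because both limits are set together, the equal aspect ratio no longer overrides the requested x-range.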

R partykit::ctree offset labels on edges

I am working with ctree and my data set has a covariate of factors that create a node. There are enough factors for that covariate and their names are long enough that they overlap on each other in the edges created at the node. I want to find a way to stop this overlap.
I checked other questions and found one answer that supplies some help. The plot for ctree relies on the grid package and I can use functions to write new labels on the edge. My problem now is that I don't know how to suppress the labels that are printed as default when I plot the tree. I don't know enough about grid or plot.party to figure out which object needs to be suppressed.
An example of my problem in the following image:
Code for my example problem:
library(partykit)
library(tidyverse) #this is here for the mpg data set in next line. not required for partykit
data(mpg)
irt <- ctree(hwy~as.factor(class),data=mpg)
plot(irt)
The resulting 1st node has one edge with "2seater, compact, midsize, subcompact" and the other edge with "minivan, pickup, suv". What I end up seeing in the plot is "2seater, compact, midsize, subcompaminivan, pickup, suv". I've already made the graphics device full screen. (I have other trees that only have one node and so that makes those look odd at the full screen dimension, so I don't want to go back and forth.)
The partial solution I have is
plot(irt, pop=FALSE)
seekViewport("edge1-1")
grid.text("2seater, compact,\n midsize, subcompact")
This stacks "2seater, compact" on top of "midsize, subcompact" and would keep them from overlapping "minivan, pickup, suv". But now, I have the original too-long label still in the plot. And the edge that the label I'm trying to fix is attached to has a break in a place that doesn't work with the new stacked label. It would be nice to fix that edge, but the real problem is suppressing the original, too-long label on edge1-1.
The edge labels are drawn by the function edge_simple() which offers various kinds of justifications for the edge labels, see ?edge_simple. The justification is only applied if the edge labels are on average longer than justmin, defaulting to Inf (i.e.: no justification). Various justifications are possible (alternating, increasing, decreasing, or equal).
Thus, in your case the simplest solution is probably to set justmin to a small enough finite value. Alternatively (or additionally) you could also decrease the font size by setting gpar(fontsize = ...). For illustration both examples below have been generated on a 6in x 8in PNG device:
library("partykit")
data("mpg", package = "ggplot2")
irt <- ctree(hwy ~ factor(class), data = mpg)
plot(irt, ep_args = list(justmin = 15))
plot(irt, ep_args = list(justmin = 15), gp = gpar(fontsize = 10))

Save Filled Area of Polygon in R

I am plotting a polygon in R and saving it. The problem I am facing is that the whole plot is saved as a PNG file, but I want to save only the filled area of the polygon.
Is there a way for that ?
x<-c(0.000000000,0.010986328,0.006351471,-0.004634857)
y<-c(0.000000000,0.007232612,0.012841203,0.006199415)
file_name = paste("~/Downloads/Plot", ".png", sep="")
png(file_name,width=1280,height = 720,units="px",res=200)
plot(x,y,axes=FALSE,ylab='',xlab='')+polygon(x,y,col="#FF0000FF")
dev.off()
If you're drawing a monofigure plot (which is the default), then I believe there are three possible sources of spacing that can cause a plot element to not extend to the edges of the graphics device:
1: data coordinate limits that are larger than the extent of the plot element.
2: "internal spacing", which is best thought of as an expansion of the plot area that sits inside the margins.
3: margins. This is normally where axes, ticks, tick labels, axis labels, titles, and sometimes legends are drawn.
All of these sources of spacing can be eliminated with the following customizations:
1: set the xlim and ylim graphics parameters to perfectly fit the target plot element.
2: set xaxs='i',yaxs='i', which can be done with either a preemptive par() call or on the initial plot() call.
3: zero the margins with mar=c(0,0,0,0). This must be done with par() prior to the initial plot() call.
Example:
## generate data
pts <- data.frame(x=c(0.2,0.4,0.9,0.7),y=c(0.5,0.4,0.5,0.6));
## precompute plot parameters
xlim <- range(pts$x);
ylim <- range(pts$y);
## draw plot
par(mar=c(0,0,0,0));
plot(NA,xlim=xlim,ylim=ylim,xaxs='i',yaxs='i',axes=F,ann=F);
points(pts$x,pts$y,pch=21L);
polygon(pts$x,pts$y,col='red',pch=21L);
Multifigure plots can incur one additional source of spacing, namely outer margins, but it looks like that doesn't concern you for this problem. In any case, I'm pretty sure outer margins always default to zero anyway.
See par() for the relevant documentation.
It looks like I misunderstood the question. What you want is a transparent background, which is different from simply fitting the image size to the plot element.
You can use the png() function to set the background to be transparent by passing bg='transparent', as explained on the documentation page.
For example, here's my fitted image saved with a transparent background:
Note that not all image viewers will correctly detect and/or clearly depict the transparency of the background. I would highly recommend GIMP, which is basically a free Photoshop knockoff, albeit markedly lighter in features. GIMP depicts transparent regions as a kind of checkerboard of grey squares, which looks like this:
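Putting the pieces together for the data in the question, a minimal sketch might look as follows. It assumes a PNG device is available in your R build and writes to a temporary file rather than ~/Downloads; the out variable is illustrative.

```r
x <- c(0.000000000, 0.010986328, 0.006351471, -0.004634857)
y <- c(0.000000000, 0.007232612, 0.012841203, 0.006199415)

out <- tempfile(fileext = ".png")  # assumed output location
png(out, width = 1280, height = 720, units = "px", res = 200,
    bg = "transparent")            # everything outside the polygon stays clear

par(mar = c(0, 0, 0, 0))           # no margins
# Empty plot whose limits fit the polygon exactly, with no internal spacing.
plot(NA, xlim = range(x), ylim = range(y), xaxs = "i", yaxs = "i",
     axes = FALSE, ann = FALSE)
polygon(x, y, col = "#FF0000FF")

dev.off()
```

The polygon then touches the image edges at its extreme points, and the corners it does not cover are transparent.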

R, ggplot2, size of plot area

I use R and ggplot2 to produce graphs for my thesis. I can port them to tikz objects using the tikzdevice or to a pdf using the pdf device very easily, and in each case, specifying the width and height of the overall plot is straightforward.
However, I am actually more interested in specifying the width of the plot AREA (ie the inner box), since differences in this (particularly for plots on the same page) are more easily detected by the eye, even if they are a couple of points different in the final document.
Of course, the source of this issue can be easily put down to the axis labels that vary depending on the content.
My question is, How can I define 'fixed' axis label widths as a global option, or define the width to be exported as the inner plotting area for ggplot2 objects....

Adjusting the relative space of facets (without regard to coordinate space)

I have a primary graph and some secondary information that I want to facet in another graph below it. Facetting works great except I do not know how to control the relative space used by one facet versus another. Am aware of space='free' but this is only useful if the ranges correspond to the desired relative sizing.
So for instance, I may want a graph where the first facet occupies 80% and the second 20%. Here is an example:
data <- rbind(
data.frame(x=1:500, y=rnorm(500,sd=1), type='A'),
data.frame(x=1:500, y=rnorm(500,sd=5), type='B'))
ggplot() +
geom_line(aes(x=x, y=y, colour=type), data=data) +
facet_grid(type ~ ., scale='free_y')
The above creates 2 facets of equal vertical dimension. Adding in space='free' in the facet_grid function changes the dimensions such that the lower facet is roughly 5x larger than the upper (as expected).
Supposing I want the upper to be 2x as large, with the same data set and ordering of facets. How can I accomplish this?
Is the only way to do this with some trickery in rescaling the data set and manually overriding axis labels (and if so, how)?
Alternative
As indicated below can use viewports to render as multiple graphs. I had considered this and in-fact had implemented using this approach in the past with standard plot and viewports.
The problem is that it is very difficult to get x-axis to align with this approach. So if there is a way to fix the size of the y-axis label region and the size of the legend region, can produce 2 graphs that have the same rendering area.
You don't need to use facets for this - you can also do this by using the viewport function.
> library(ggplot2)
> library(grid)
> ratio = 1/3
> v1 = viewport(width=1,height=ratio,y=1-ratio/2)
> v2 = viewport(width=1,height=1-ratio,y=(1-ratio)/2)
> print(qplot(1:10,11:20,geom="point"),vp=v1)
> print(qplot(1:10,11:20,geom="line"),vp=v2)
Ratio is the proportion of the top panel to the whole page. Try 2/3 and 4/5 as well.
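The same arithmetic can be checked with grid alone, for instance for the 80%/20% split asked about in the question. This is a sketch, not part of the answer above: panel_layout is a hypothetical helper, and the rectangles only stand in for the two plots.

```r
library(grid)

# Hypothetical helper: heights and centre positions (in npc units) for a
# top panel taking `ratio` of the page and a bottom panel taking the rest.
panel_layout <- function(ratio) {
  list(
    top    = list(height = ratio,     y = 1 - ratio / 2),
    bottom = list(height = 1 - ratio, y = (1 - ratio) / 2)
  )
}

lay <- panel_layout(0.8)

grid.newpage()
for (panel in names(lay)) {
  p <- lay[[panel]]
  # One full-width viewport per panel, stacked without overlap.
  pushViewport(viewport(width = 1, height = p$height, y = p$y))
  grid.rect(gp = gpar(col = "grey40"))
  grid.text(sprintf("%s panel: %.0f%% of the page", panel, 100 * p$height))
  popViewport()
}
```

To place real plots, the same height and y values can be given to viewport() and passed to print(..., vp=) as in the answer above.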
This approach can get ugly if your legend or axis labels in the two plots are different sizes, but for a fix, see the align.plots function in the ggExtra package and ggplot2 author Hadley Wickham's notes on this very topic.
There's no easy way to do this with facets currently, although if you are prepared to go down to editing the Grid, you can modify the ggplot graph after it has been plotted to get this effect.
See also this question on using grid and ggplot2 to create join plots using R.
Kohske Takahashi posted a patch to facet_grid that allows specification of the relative sizing of facets. See the thread:
http://groups.google.com/group/ggplot2/browse_thread/thread/7c5454dcc04bc7b8
With luck we'll see this in a future version of ggplot2.
