Align text to a plot with variable size in R - r

I am very new in using the power of R to create graphical output.
I use the forest()-function in the metafor-package to create Forest plots of my meta-analyses. I generate several plots using a loop and then save them via png().
for (i in 1:ncol(df)-2)){
dat <- escalc(measure="COR", ri=ri, ni=ni, data=df) # Calcultes Effect Size
res_re <- rma.uni(yi, vi, data=dat, method="DL", slab=paste(author)) # Output of meta-analysis
png(filename=path, width=8.27, height=11.69, units ="in", res = 210)
forest(res_re, showweight = T, addfit= T, cex = .9)
text(-1.6, 18, "Author(s) (Year)", pos=4)
text( 1.6, 18, "Correlation [95% CI]", pos=2)
dev.off()
}
This works great if the size of the plot is equal. However, each iteration of the loop integrates a different number of studies in the forest plot. Thus, the text-elements are not on the right place and the forest-plot with many studies looks a bit strange. I have two questions:
How can I align the "Author(s) (Year)" and "Correlation [95%CI]" automatically to the changing size of the forest-plot such that the headings are above the upper line of the forest-table?
How can I scale the size of the forest plot such that the width and the size of the text-elements is the same for all plots and for each additional study just a new line will be added (changing height)?
Each forest-plot should look like this:

Here is what you will have to do to get this to work:
I would fix xlim across plots, so that there is a fixed place to place the "Author(s) (Year)" and "Correlation [95%CI]" headings. After you have generated a forest plot, take a look at par()$usr[1:2]. Use these values as a starting point to adjust xlim so that it is appropriate for all your plots. Then use those two values for the two calls to text().
There are k rows in each plot. The headings should go two rows above that. So, use text(<first xlim value>, res_re$k+2, "Author(s) (Year)", pos=4) and text(<second xlim value>, res_re$k+2, "Correlation [95% CI]", pos=2)
Set cex in text() to the same value you specified in your call to forest().
The last part is tricky. You have fixed cex, so the size of the text-elements should be the same across plots. But if there are more studies, then the k rows get crammed into less space, so they become less separated. If I understand you correctly, you want to keep the spacing between rows equal across plots by adjusting the actual height of the plot. Essentially, this will require making height in the call to png() a function of k. For each extra study, an additional amount needs to be added to height so that the row spacing stays constant, so something along the lines of height=<some factor> + res_re$k * <some factor>. But the increase in height as a function of k may also be non-linear. Getting this right would take a lot of try and error. There may be a clever way of determining this programmatically (digging into ?par and maybe ?strheight).
So make it easier for others to chime in, the last part of your question comes down to this: How do I have to adjust the height value of a plotting device, so that the absolute spacing between the rows in plot(1:10) and plot(1:20) stays equal? This is an interesting question in itself, so I am going to post this as a separate question.

ad 4.: In Wolfgangs question (Constant Absolute Spacing of Row in R Plots) you will find how to make plot-height depending on the amount of rows in it.
For forest() it would work a little different, since this function internally modifies the par("mar")-values.
However, if you set margins to zero, you only need to include the attribute yaxs="i" in your forest()-function, so that the y-axis will be segmented for the range of the data and nothing else. The device than needs to be configured to have the height (length(ma$yi)+4.5)*fact*res with fact as inches/line (see below) and res as pixels/inch (resolution).
The 4.5 depends if you have left addfit=T and intercept=T in your meta-analysis model (in that case forest() internally sets ylim <- c(-1.5, k + 3)). Otherwise you'd have to use 2.5 (than it would be ylim <- c(0.5, k + 3)).
If you feel like using margins you would do the following (I edited the following part, after I recognized some mistake):
res <- 'your desired resolution' # pixels per inch
fact <- par("mai")[1]/par("mar")[1] # calculate inches per line
### this following part is copied from inside the forest()-function.
# forest() modifies the margin internally in the same way.
par.mar <- par("mar")
par.mar.adj <- par.mar - c(0, 3, 1, 1)
par.mar.adj[par.mar.adj < 0] <- 0
###
ylim <- c(-1.5, length(ma$yi)+3) # see above
ylim.abs <- abs(ylim[1])+abs(ylim[2])-length(ma$yi) # calculate absolute distance of ylim-argument
pixel.bottom <- (par.mar.adj[1])*fact*res # calculate pixels to add to bottom and top based on the margin that is internally used by forest().
pixel.top <- (par.mar.adj[3])*fact*res
png(filename='path',
width='something meaningful',
height=((length(ma$yi)+ylim.abs)*fact*res) + pixel.bottom + pixel.top,
res=res)
par(mar=par.mar) # make sure that inside the new device the margins you want to define are actually used.
forest(res_re, showweight = T, addfit= T, cex = .9, yaxs="i")
...
dev.off()

Related

How can I eliminate extra space in a stripchart in base R?

I have created a stripchart in R using the code below:
oldFaithful <- read.table("http://www.isi-stats.com/isi/data/prelim/OldFaithful1.txt", header = TRUE)
par(bty = "n") #Turns off plot border
stripchart(oldFaithful, #Name of the data frame we want to graph
method = "stack", #Stack the dots (no overlap)
pch = 20, #Use dots instead of squares (plot character)
at = 0, #Aligns dots along axis
xlim = c(40,100)) #Extends axis to include all data
The plot contains a large amount of extra space or whitespace at the top of the graph, as shown below.
Is there a way to eliminate the extra space at the top?
Short Answer
Add the argument offset=1, as in
stripchart(oldFaithful, offset=1, ...)
Long Answer
You really have to dig into the code of stripchart to figure this one out!
When you set a ylim by calling stripchart(oldFaithful, ylim=c(p,q)) or when you let stripchart use its defaults, it does in fact set the ylim when it creates the empty plotting area.
However, it then has to plot the points on that empty plotting area. When it does so, the y-values for the points at one x-value are specified as (1:n) * offset * csize. Here's the catch, csize is based on ylim[2], so the smaller you make the upper ylim, the smaller is csize, effectively leaving the space at the top of the chart no matter the value of ylim[2].
As a quick aside, notice that you can "mess with" ylim[1]. Try this:
stripchart(oldFaithful, ylim=c(2,10), pch=20, method="stack")
OK, back to solving your problem. There is a second reason that there is space at the top of the plot, and that second reason is offset. By default, offset=1/3 which (like csize) is "shrinking" down the height of the y-values of the points being plotted. You can negate this behavior be setting offset closer or equal to one, as in offset=0.9.

Alter axes in R with NMDS plot

I'm trying to change the axes in my NMDS plot to zoom into where my sites are plotted. I assume that the space chosen in a product of the species points which I do not have plotted. I have tried adding xlim to my code to no avail and was wondering if I have it in the wrong place or if another action is needed. Below is a copy of my code.
#NMDS on pooled abundance with NA's omitted
NMDS_HPA<-metaMDS(HP_Abundance_omit[,-1],k=2, trymax=1000)
plot(NMDS_HPA, type="n", display="sites", xlim=c(-1.5,1.5))
with(descriptors, levels(T))
colorvec<-c("seagreen4", "tan4", "mediumblue")
plot(NMDS_HPA, type="n", xlim=c(-1.5,1.5))
title(main="NMDS using Abundance with Bray-Curtis", sub="Habitats Pooled")
ordihull(NMDS_HPA, groups=treat, draw="polygon", col="grey90", label=F)
with(descriptors, points(NMDS_HPA, display="sites", col=colorvec[T], pch=21, bg=colorvec[T]))
with(descriptors, legend("topright", legend=levels(T), bty="n", col=colorvec, pch=21, pt.bg=colorvec))
Thanks
If you don't set the ylim too, then vegan has no choice but to show more (or less) of the x-axis than you want because the scaling of the axis must be retained; a unit change along one axis must match the same unit change along the other. Otherwise, how would you know how to represent Euclidean distances (easily) on the figure? As those Euclidean distances are supposed to reflect the rank ordering of the original dissimilarities, maintaining the aspect ratio or relative scaling of the axes to one another is important.
You can see this in action just by using your mouse to rescale the size of the device window on screen. R replots the figure using different axis limits all the time in order to maintain an aspect ratio of 1.
Consider this reproducible example:
library("vegan")
data(dune)
set.seed(56)
sol <- metaMDS(dune)
Choosing a section in both the x and y axes works as expected
## zoom in on the section (-0.5,0.5)(-0.5,0.5)
plot(sol, xlim = c(-0.5, 0.5), ylim = c(-0.5,0.5))
If you want to retain the full y-axis but only show say the middle 50% of the x axis then you have to plot on a device whose width is ~ 50% that of the height (approximately because R's default is to use different sized margins on the top/bottom left/right margins.)
png("~/mds-zoom2.png", height = 700, width = 350, res = 100, pointsize = 16)
plot(sol, xlim = c(-0.5, 0.5))
dev.off()
which produces
This is almost right. You could solve the problem exactly by setting the margins equal around the plot using par(mar = rep(4, 4) + 0.1) and then work out the ratio of the range of the scores on the x and y axes (get the scores(sol) and compute the range() on both columns then compute the ratio of the two ranges), then use that to give you the desired height of the plot for the width you want to state.
If you just plotted the points rather than the NMDS object than xlim works just fine
plot(NMDS_HPA$points, xlim=c(-1.5,1.5))

Set margins to cater for large legend

I'm trying to figure out a way to calculate the height of a legend for a plot prior to setting the margins of the plot. I intend to place the legend below the plot below the x-axis labels and title.
As it is part of a function which plots a range of things the legend can grow and shrink in size to cater for 2 items, up to 15 or more, so I need to figure out how I can do this dynamically rather that hard-coding. So, in the end I need to dynamically set the margin and some other bits and pieces.
The key challenge is to figure out the height of the legend to feed into par(mar) prior to drawing the plot, but after dissecting the base codes for legend however, it seems impossible to get a solid estimate of the height value unless the plot is actually drawn (chicken and egg anyone?)
Here's what I've tried already:
get a height using the legend$rect$h output from the base legend function (which seems to give a height value which is incorrect unless the plot is actually drawn)
calculate the number of rows in the legend (easy) and multiply this by the line height (in order to do this, seems you'd need to translate into inches (the base legend code uses yinch and I've also tried grconvertY but neither of those work unless a plot has been drawn).
Another challenge is to work out the correct y value for placement of the legend - I figure that once I've solved the first challenge, the second will be easy.
EDIT:
After a day of sweating over how this is (not) working. I have a couple of insights and a couple of questions. For the sake of clarity, this is what my function essentially does:
step 1) set the margins
step 2) create the barplot on the left axis
step 3) re-set the usr coordinates - this is necessary to ensure alignment of the right axis otherwise it plots against the x-axis scale. Not good when they are markedly different.
step 4) create the right axis
step 5) create a series of line charts on the right axis
step 6) do some labelling of the two axes and the x-axis
step 7) add in the legend
Here are the questions
Q1) What units are things reported in? I'm interested in margin lines and coordinates (user-coordinates), inches is self explanatory. - I can do some conversions using grconvertY() but I'm not sure what I'm looking at and what I should be converting to - the documentation isn't so great.
Q2) I need to set the margin in step 1 so that there is enough room at the bottom of the chart for the legend. I think I'm getting that right, however I need to set the legend after the right axis and line charts are set, which means that the user coordinates (and the pixel value of an inch, has changed. Because of Q1 above I'm not sure how to translate one system to the other. Any ideas in this regard would be appreciated.
After another day of sweating over this here's what solved it mostly for me.
I pulled apart the code for the core legend function and compiled this:
#calculate legend buffer
cin <- par("cin")
Cex <- par("cex")
yc <- Cex * cin[2L] #cin(inches) * maginfication
yextra <- 0
ymax <- yc * max(1, strheight("Example", units = "inches", cex = Cex)/yc)
ychar <- yextra + ymax #coordinates
legendHeight <- (legendLines * ychar) + yc # in
Which is essentially mimicking the way the core function calculates legend height but returns the height in inches rather than in user coordinates. legendLines is the number of lines in the legend.
After that, it's a doddle to work out how to place the legend, and to set the lower margin correctly. I'm using:
#calculate inches per margin line
inchesPerMarLine<-par("mai")[1]/par("mar")[1]
To calculate the number of inches per margin line, and the following to set the buffers (for the axis labels and title, and the bottom of the chart), and the margin of the plot.
#set buffers
bottomBuffer = 1
buffer=2
#calculate legend buffer
legBuffer <- legendHeight/inchesPerMarLine
#start the new plot
plot.new()
# set margin
bottomMargin <- buffer + legBuffer + bottomBuffer
par(mar=c(bottomMargin,8,3,5))
The plot is made
barplot(data, width=1, col=barCol, names.arg=names, ylab="", las=1 ,axes=F, ylim=c(0,maxL), axis.lty=1)
And then the legend is placed. I've used a different method to extract the legend width which does have some challenges when there is a legend with 1 point, however, it works ok for now. Putting the legend into a variable allows you to access the width of the box like l$rect$w. trace=TRUE and plot=FALSE stop the legend being written to the plot just yet.
ycoord <- -1*(yinch(inchesPerMarLine*buffer)*1.8)
l<-legend(x=par("usr")[1], y=ycoord, inset=c(0,-0.25), legendText, fill=legendColour, horiz=FALSE, bty = "n", ncol=3, trace=TRUE,plot=FALSE)
lx <- mean(par("usr")[1:2]-(l$rect$w/2))
legend(x=lx, y=ycoord, legendText, fill=legendColour, horiz=FALSE, bty = "n", ncol=3)
For completeness, this is how I calculate the number of lines in the legend. Note - the number of columns in the legend is 3. labelSeries is the list of legend labels.
legendLines <- ceiling(nrow(labelSeries)/3)

asp is producing unnecessary whitespace within the axes of my R plot. How can I reformat the graph?

I'm trying to create a scatter plot + linear regression line in R 3.0.3. I originally tried to create it with the following simple call to plot:
plot(hops$average.temperature, hops$percent.alpha.acids)
This created this first plot:
As you can see, the scales of the Y and X axes differ. I tried fixing this using the asp parameter, as follows:
plot(hops$average.temperature, hops$percent.alpha.acids, asp=1, xaxp=c(13,18,5))
This produced this second plot:
Unfortunately, setting asp to 1 appears to have compressed the X axis while using the same amount of space, leaving large areas of unused whitespace on either side of the data. I tried using xlim to constrain the size of the X-axis, but asp seemed to overrule it as it didn't have any effect on the plot.
plot(hops$average.temperature, hops$percent.alpha.acids, xlim=c(13,18), asp=1, xaxp=c(13,18,5))
Any suggestions as to how I could get the axes to be on the same scale without creating large amounts of whitespace?
Thanks!
One solution would be to use par parameter pty and set it to "s". See ?par:
pty
A character specifying the type of plot region to be used; "s"
generates a square plotting region and "m" generates the maximal
plotting region.
It forces the plot to be square (thus conteracting the side effect of asp).
hops <- data.frame(a=runif(100,13,18),b=runif(100,2,6))
par(pty="s")
plot(hops$a,hops$b,asp=1)
I agree with plannapus that the issue is with your plotting area. You can also fix this within the device size itself by ensuring that you plot to a square region. The example below opens a plotting device with square dimension; then the margins are also set to maintain these proportions:
Example:
n <- 20
x <- runif(n, 13, 18)
y <- runif(n, 2, 6)
png("plot.png", width=5, height=5, units="in", res=200)
par(mar=c(5,5,1,1))
plot(x, y, asp=1)
dev.off()

Binary spark lines with R

I'm looking to plot a set of sparklines in R with just a 0 and 1 state that looks like this:
Does anyone know how I might create something like that ideally with no extra libraries?
I don't know of any simple way to do this, so I'm going to build up this plot from scratch. This would probably be a lot easier to design in illustrator or something like that, but here's one way to do it in R (if you don't want to read the whole step-by-step, I provide my solution wrapped in a reusable function at the bottom of the post).
Step 1: Sparklines
You can use the pch argument of the points function to define the plotting symbol. ASCII symbols are supported, which means you can use the "pipe" symbol for vertical lines. The ASCII code for this symbol is 124, so to use it for our plotting symbol we could do something like:
plot(df, pch=124)
Step 2: labels and numbers
We can put text on the plot by using the text command:
text(x,y,char_vect)
Step 3: Alignment
This is basically just going to take a lot of trial and error to get right, but it'll help if we use values relative to our data.
Here's the sample data I'm working with:
df = data.frame(replicate(4, rbinom(50, 1, .7)))
colnames(df) = c('steps','atewell','code','listenedtoshell')
I'm going to start out by plotting an empty box to use as our canvas. To make my life a little easier, I'm going to set the coordinates of the box relative to values meaningful to my data. The Y positions of the 4 data series will be the same across all plotting elements, so I'm going to store that for convenience.
n=ncol(df)
m=nrow(df)
plot(1:m,
seq(1,n, length.out=m),
# The following arguments suppress plotting values and axis elements
type='n',
xaxt='n',
yaxt='n',
ann=F)
With this box in place, I can start adding elements. For each element, the X values will all be the same, so we can use rep to set that vector, and seq to set the Y vector relative to Y range of our plot (1:n). I'm going to shift the positions by percentages of the X and Y ranges to align my values, and modified the size of the text using the cex parameter. Ultimately, I found that this works out:
ypos = rev(seq(1+.1*n,n*.9, length.out=n))
text(rep(1,n),
ypos,
colnames(df), # These are our labels
pos=4, # This positions the text to the right of the coordinate
cex=2) # Increase the size of the text
I reversed the sequence of Y values because I built my sequence in ascending order, and the values on the Y axis in my plot increase from bottom to top. Reversing the Y values then makes it so the series in my dataframe will print from top to bottom.
I then repeated this process for the second label, shifting the X values over but keeping the Y values the same.
text(rep(.37*m,n), # Shifted towards the middle of the plot
ypos,
colSums(df), # new label
pos=4,
cex=2)
Finally, we shift X over one last time and use points to build the sparklines with the pipe symbol as described earlier. I'm going to do something sort of weird here: I'm actually going to tell points to plot at as many positions as I have data points, but I'm going to use ifelse to determine whether or not to actually plot a pipe symbol or not. This way everything will be properly spaced. When I don't want to plot a line, I'll use a 'space' as my plotting symbol (ascii code 32). I will repeat this procedure looping through all columns in my dataframe
for(i in 1:n){
points(seq(.5*m,m, length.out=m),
rep(ypos[i],m),
pch=ifelse(df[,i], 124, 32), # This determines whether to plot or not
cex=2,
col='gray')
}
So, piecing it all together and wrapping it in a function, we have:
df = data.frame(replicate(4, rbinom(50, 1, .7)))
colnames(df) = c('steps','atewell','code','listenedtoshell')
BinarySparklines = function(df,
L_adj=1,
mid_L_adj=0.37,
mid_R_adj=0.5,
R_adj=1,
bottom_adj=0.1,
top_adj=0.9,
spark_col='gray',
cex1=2,
cex2=2,
cex3=2
){
# 'adJ' parameters are scalar multipliers in [-1,1]. For most purposes, use [0,1].
# The exception is L_adj which is any value in the domain of the plot.
# L_adj < mid_L_adj < mid_R_adj < R_adj
# and
# bottom_adj < top_adj
n=ncol(df)
m=nrow(df)
plot(1:m,
seq(1,n, length.out=m),
# The following arguments suppress plotting values and axis elements
type='n',
xaxt='n',
yaxt='n',
ann=F)
ypos = rev(seq(1+.1*n,n*top_adj, length.out=n))
text(rep(L_adj,n),
ypos,
colnames(df), # These are our labels
pos=4, # This positions the text to the right of the coordinate
cex=cex1) # Increase the size of the text
text(rep(mid_L_adj*m,n), # Shifted towards the middle of the plot
ypos,
colSums(df), # new label
pos=4,
cex=cex2)
for(i in 1:n){
points(seq(mid_R_adj*m, R_adj*m, length.out=m),
rep(ypos[i],m),
pch=ifelse(df[,i], 124, 32), # This determines whether to plot or not
cex=cex3,
col=spark_col)
}
}
BinarySparklines(df)
Which gives us the following result:
Try playing with the alignment parameters and see what happens. For instance, to shrink the side margins, you could try decreasing the L_adj parameter and increasing the R_adj parameter like so:
BinarySparklines(df, L_adj=-1, R_adj=1.02)
It took a bit of trial and error to get the alignment right for the result I provided (which is what I used to inform the default values for BinarySparklines), but I hope I've given you some intuition about how I achieved it and how moving things using percentages of the plotting range made my life easier. In any event, I hope this serves as both a proof of concept and a template for your code. I'm sorry I don't have an easier solution for you, but I think this basically gets the job done.
I did my prototyping in Rstudio so I didn't have to specify the dimensions of my plot, but for posterity I had 832 x 456 with the aspect ratio maintained.

Resources