Automatically choose "line" in axis with stacked axis labels in R - r

Long story short, I'm creating a type of plot where on the x-axis, I have two different variables whose values are crossed with one another (in this minimal example, I have a=1,2 and b=1:5). I want to show how another variable (the one on the y axis) varies as a function of both a and b. What I'm trying to do is figure out a way to automatically place the second group of labels (in this case, the "b" variable). Here's a minimal example:
set.seed(1)
par(mar=c(7,3.5,1,1))
plot(1:10, 1:10+runif(10, -1, 1), xaxt="n", xlab="", ylab="", type="o")
axis(1, 1:10, rep(c("a=1", "a=2"), 5), las=2)
axis(1, seq(from=1.5, to=9.5, by=2), paste0("b=", 1:5), las=2, line=2, lwd=0)
Which produces the following graphic:
In this case, I got lucky and chose "line=2" as the correct placement. But if I modify things a bit (and make my variable labels a little more "bulky"):
set.seed(1)
par(mar=c(7,3.5,1,1))
plot(1:10, 1:10+runif(10, -1, 1), xaxt="n", xlab="", ylab="", type="o")
axis(1, 1:10, rep(c("Control", "Treatment"), 5), las=2)
axis(1, seq(from=1.5, to=9.5, by=2), paste0("b=", 1:5), las=2, line=2, lwd=0)
Now the groups of labels overlap:
Is there a way to automatically determine what "line" the axis is at so I can place my second group of labels without overlapping the first?

Here's something you could try. Instead of using line, which can be unreliable as you noted, you could pad the labels in your second axis call with white space using sprintf. Below, I first calculate the maximum number of characters in the labels of the first axis call and multiply it by 2.5, which seems to work well in this case. You might want to tweak this factor. Then I leave line at its default, and the white space "pushes" the second set of labels away from the axis line. BTW, you might want to do something like max(nchar(first axis)) + max(nchar(second axis)) to calculate the padding. Finally, note that the "-" sign inside sprintf left-justifies the text, i.e. it adds the white space to the right.
set.seed(1)
par(mar=c(7,3.5,1,1))
max_nchar <- ceiling(max(nchar(rep(c("Control", "Treatment"), 5))) * 2.5) # round up: a width like 22.5 would be read by sprintf as width 22, precision 5
plot(1:10, 1:10+runif(10, -1, 1), xaxt="n", xlab="", ylab="", type="o")
axis(1, 1:10, rep(c("Control", "Treatment"), 5), las=2)
axis(1, seq(from=1.5, to=9.5, by=2),
sprintf(paste0("%-",max_nchar,"s"), paste0("b=", 1:5)), las=2, lwd=0)
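If you would rather compute the line value than pad with white space, one possible approach is to measure the widest first-level label with strwidth and convert that width from inches to margin lines. This is only a sketch under a few assumptions: the names first_labels, inches_per_line, label_lines and second_line are made up for this example, the extra gap of 0.5 lines is an arbitrary choice, and the bottom margin is assumed to be large enough to hold both sets of labels.

```r
set.seed(1)
par(mar = c(7, 3.5, 1, 1))
plot(1:10, 1:10 + runif(10, -1, 1), xaxt = "n", xlab = "", ylab = "", type = "o")

first_labels <- rep(c("Control", "Treatment"), 5)
axis(1, 1:10, first_labels, las = 2)

# Height of one margin line in inches (margin in inches / margin in lines)
inches_per_line <- par("mai")[1] / par("mar")[1]

# Length of the widest first-level label, converted from inches to margin lines
label_lines <- max(strwidth(first_labels, units = "inches",
                            cex = par("cex.axis"))) / inches_per_line

# Axis labels start mgp[2] lines beyond the axis line, so shifting the second
# axis by the label length (plus a small, arbitrary gap) clears the first labels
second_line <- label_lines + 0.5

axis(1, seq(from = 1.5, to = 9.5, by = 2), paste0("b=", 1:5),
     las = 2, line = second_line, lwd = 0)
```

Because the width is measured on the current device, the call to strwidth has to come after the plot has been started, and the result changes with the device size and cex.axis.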

Related

R Barplot: Y-axis cut off at the top?

I'm trying to use R to do a barplot. Values I'm plotting range from 0 to 5.0, but are decimal values (such as 4.87) so I don't want to just use the default Y axis, because it just goes up in increments of 1.
I've created a custom Y axis, which works, but if I set the maximum value greater than about 4.5, it cuts off the tickmark at the top of the axis. This looks untidy so I want a way to ensure this tickmark will always appear, but I don't want to shorten my axis as it looks stupid if I do this.
My R code is as follows:
# Bar plot of mean SUS question scores
barplot(meanSUSQuestions$Mean,
main="Mean SUS Question Scores",
cex.main="0.8",
cex.axis="0.8",
cex.lab="0.8",
#names=c("q1", "q2", "q3","q4","q5","q6","q7","q8","q9","q10"),
names=c(1:10),
yaxt="n",
col="red")
axis(2, cex.axis="0.8", at=seq(0, 5, 0.5)) # Create custom Y axis
mtext(text="Mean Score", side=2, line=2, cex=0.8)
mtext(text="Question", side=1, line=2, cex=0.8)
The bar plot that this produces looks like this:
As you can see from the picture, the top tickmark is missing.
How can I get this top tickmark to appear?
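barplot sets the height of the plot region based on the data. The range of your manual y-axis is considerably larger than the plot region, so the top of the axis is cut off.
The easiest way to solve the issue in your specific case is to pass yaxp = c(0, 5, 10) to barplot instead of using yaxt = "n" and axis. The third value of yaxp is the number of intervals between the tick marks, so 10 intervals give a tick every 0.5.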
A self-contained example:
# Bad
x <- 1:5
barplot(x, yaxt = "n")
axis(2, at = seq(0, 6, 2)) # Create custom Y axis
# Good
barplot(x, yaxp = c(0, 6, 3)) # 3 intervals: ticks at 0, 2, 4, 6
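Another option, not part of the answer above, is to widen the plot region itself with ylim so that the custom axis fits inside it. A minimal sketch, assuming the same toy data x as in the example:

```r
x <- 1:5

# Extend the y range of the plot region to the top tick of the custom axis
barplot(x, ylim = c(0, 6), yaxt = "n")
axis(2, at = seq(0, 6, 2)) # every tick now lies inside the plot region

# Upper y limit of the plot region in user coordinates
top <- par("usr")[4]
```

For the data in the question this would correspond to ylim = c(0, 5) together with the existing axis(2, at = seq(0, 5, 0.5)) call.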

How do I find the correct coordinates to align labels with barplot bars?

I'm creating a graphic that has a few different graph elements, using layout() to define plotting regions. I have a separate region for labels that need to align to bars on a barplot in an adjacent plotting region.
I can take a guess at where to plot the labels so that they line up - but the number of these locations will vary so this is not an ideal solution.
Here's an example of what I'm trying to do:
labs <- paste("Some text", letters[1:9])
datA <- table(sample(letters[1:9], size=200, replace=TRUE, prob=rep(c(0.1,0.2,0.3),3)))
layout(matrix(c(1,2,3,3), 2, 2, byrow=TRUE), widths=c(1,2), heights=c(6,1))
plot.new()
text(x=1, y=seq(0.05,1.0,0.111), labels=labs, adj=1, cex=1.4)
barplot(datA, horiz=TRUE, las=1, axes=F, yaxt="n")
How can I find the correct values to plot the labels?
(I'm aware that it looks like this can be solved by just plotting the labels with the barplot - this is not a viable solution for what I'm doing).
barplot returns the midpoints of the bars (here their y positions, because horiz=TRUE), so:
bp <- barplot(datA, horiz=TRUE, las=1, axes=F, yaxt="n")
text(0*bp, bp, labs, col = "blue", pos = 4)
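To place the labels in the separate layout region, as the question asks, the label panel needs the same vertical scale as the barplot panel. The following is a sketch under some assumptions: the midpoints are obtained up front with plot = FALSE, both panels get the same top and bottom margins, and the same explicit ylim (here the range of the midpoints plus half a bar width on each side, assuming the default bar width of 1) is passed to both panels. The seed and the factor() call are only there to make the example reproducible.

```r
set.seed(1)
labs <- paste("Some text", letters[1:9])
datA <- table(factor(sample(letters[1:9], size = 200, replace = TRUE,
                            prob = rep(c(0.1, 0.2, 0.3), 3)),
                     levels = letters[1:9]))

# Bar midpoints without drawing anything
bp <- barplot(datA, horiz = TRUE, plot = FALSE)
ylim <- range(bp) + c(-0.5, 0.5) # shared vertical range for both panels

layout(matrix(c(1, 2, 3, 3), 2, 2, byrow = TRUE), widths = c(1, 2), heights = c(6, 1))

# Panel 1: labels only, using the same y scale and vertical margins as panel 2
par(mar = c(2, 0, 2, 0))
plot.new()
plot.window(xlim = c(0, 1), ylim = ylim)
text(x = 1, y = bp, labels = labs, adj = 1, cex = 1.4)

# Panel 2: the bars
par(mar = c(2, 1, 2, 1))
barplot(datA, horiz = TRUE, ylim = ylim, las = 1, axes = FALSE, yaxt = "n")
```

The alignment relies on the two panels having equal heights and equal top and bottom margins; if either differs, the labels drift away from the bars.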

How can I have the full range in the x- and y-axis labels in a plot?

I have two variables, x and y
x = runif(8, 0, runif(1, 1, 5))
y = x^2
that I want to plot. Note that the range of x (and hence y=x^2) is not always the same.
So, the command
plot(x, y, pch=19, col='red')
produces
However, I don't want the borders around the graph, so I use the bty='n' parameter for plot:
plot(x, y, pch=19, col='red', bty='n')
which produces
This is a bit unfortunate, imho, since I'd like the y-axis to go all the way up to 4 and the x-axis all the way to 2.
So, I use the xaxp and yaxp parameters in the plot command:
plot(x, y, pch=19, col='red', bty='n',
xaxp=c(
floor (min(x)),
ceiling(max(x)),
5
),
yaxp=c(
floor (min(y)),
ceiling(max(y)),
5
)
)
which produces
This is a bit better, but it still doesn't show the full range. Also, I liked that the default axis labeling uses steps like 1,2,3,4 or 0.5,1,1.5,2, not just some arbitrary fractions.
I guess R has some parameter or mechanism to plot the full range in the axis in a "humanly" fashion (0.5,1,1.5 ...) but I didn't find it. So, what could I try?
Try:
plot(x, y, pch=19, col='red', bty='n', xlim=c(min(x),max(x)),
ylim=c(min(y),max(y)), axes=FALSE)
axis(1, at=seq(floor(min(x)), ceiling(max(x)), 0.5))
axis(2, at=seq(floor(min(y)), ceiling(max(y)), 0.5))
Or if you'd prefer to hard-code those axis ranges:
axis(1, at=seq(0, 2, 0.5))
axis(2, at=seq(0, 4, 0.5))
Is that what you were after?
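A variation on the same idea, offered as a sketch rather than as part of the answer above: let pretty() choose round tick positions that cover the data, and then use the range of those ticks as the axis limits, so the first and last tick always fall inside the plot region. The names x_ticks and y_ticks are made up for this example, and the seed is only there for reproducibility.

```r
set.seed(42)
x <- runif(8, 0, runif(1, 1, 5))
y <- x^2

# Round tick positions that cover the range of the data
x_ticks <- pretty(x)
y_ticks <- pretty(y)

# Use the tick range as the axis limits so that no tick is cut off
plot(x, y, pch = 19, col = "red", bty = "n",
     xlim = range(x_ticks), ylim = range(y_ticks), axes = FALSE)
axis(1, at = x_ticks)
axis(2, at = y_ticks)
```

Since the ticks are recomputed from the data, this keeps working when the range of x changes between runs.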

Adding label to secondary axis in R

I have this code:
# Plotting everything
plot( p1, col= "lightgreen", xlim=c(-2.5,4.5), ylim=c(0, 700), main="Daily Total Precipitation for AR and Oct-May", xlab="ln(x)" , ylab="Frequency", xaxt = "n") # first histogram
plot( p2, col="red", xlim=c(-2.5,4.5), ylim=c(0, 700), xaxt = "n" , add=T)
# Adding in text labels on top of the bars
text(x, y, paste(round(percents,2),"%"), cex=0.50, pos=3, offset=0.3, col="black")
axis(side=1, at=breaks) # new x-axis
# parameter that needs to be set to add a new graph on top of the other ones
par(new=T)
plot(x, percents, xlim=c(-2.5,4.5), type="l", col="yellow", lwd=3.0, axes=F, ylab=NA, xlab=NA)
axis(side=4, at=seq(0,100,by=10), col="yellow", col.axis="yellow") # additional y-axis
mtext("Percent", side=4, col="yellow")
# legend settings
legend("topleft", c("AR", "Oct-May"), lwd=10, col=c("red", "lightgreen"))
Which produces this graph:
And I can't seem to figure out how to get the secondary y-axis label to show up in the correct position. Any help or suggestions are greatly appreciated.
Edit: Using RStudio.
One option is to specify the line argument to mtext(). In the example below I add a couple more lines to the right (side = 4) margin of the plot using par(), and then I draw three labels using mtext() at the default (line = 0), line 3 (line = 3), and line -3 (line = -3):
op <- par(mar = c(5,4,4,4) + 0.1)
plot(1:10)
mtext("line0", side = 4)
mtext("line3", side = 4, line = 3)
mtext("line-3", side = 4, line = -3)
par(op)
Note that line numbers increase away from the plot region and that negative line values move into the plot region, or to the left of the right boundary of the plot region.
It takes a little playing with the number of margin lines (as set in par(mar = x)) and which line you want to draw on using mtext(), but a little trial and error should get you what you want.
Note also that you don't need to specify integer values for the line argument. You can specify fractions of lines too: line = 2.5.
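Applied to a plot with a secondary axis, this could look like the sketch below. The data are made up for illustration (they are not the histograms from the question), and line = 2.5 is just one value that happens to clear the tick labels with these margins.

```r
# Widen the right margin so there is room for the axis and its label
op <- par(mar = c(5, 4, 4, 4) + 0.1)

# First plot (placeholder data)
plot(1:10, (1:10)^2, type = "h", xlab = "x", ylab = "Frequency")

# Overlay a second plot on its own y scale
par(new = TRUE)
plot(1:10, seq(10, 100, by = 10), type = "l", col = "blue",
     axes = FALSE, xlab = NA, ylab = NA, ylim = c(0, 100))
axis(side = 4, at = seq(0, 100, by = 20), col = "blue", col.axis = "blue")

# Put the label a few lines out, beyond the tick labels
mtext("Percent", side = 4, line = 2.5, col = "blue")

par(op)
```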

Change the number of tick marks on a figure in R

I created a figure of two plots (two years) of climate data (temp and precip) that looks exactly like I want it, except that one of my axes has too many tick marks. With everything I have going on with this figure, I can't find a way to specify fewer tick marks without messing up other parts. I would also like to specify where the tick marks are. Here is the figure:
You can see that the tick marks for the top axis just blur together and the numbers chosen are not very meaningful to me. How can I tell R what I really want?
Here are the datasets I am using: cobs10 and cobs11.
And here is my code:
par(mfrow=c(2,1))
par(mar = c(5,4,4,4) + 0.3)
plot(cobs10$day, cobs10$temp, type="l", col="red", yaxt="n", xlab="", ylab="",
ylim=c(-25, 30))
axis(side=3, col="black", at=cobs10$day, labels=cobs10$gdd)
at = axTicks(3)
mtext("Thermal Units", side=3, las=0, line = 3)
axis(side=2, col='red', labels=FALSE)
at= axTicks(2)
mtext(side=2, text= at, at = at, col = "red", line = 1, las=0)
mtext("Temperature (C)", side=2, las=0, line=3)
par(new=TRUE)
plot(cobs10$gdd, cobs10$precip, type="h", col="blue", yaxt="n", xaxt="n", ylab="",
xlab="")
axis(side=4, col='blue', labels=FALSE)
at = axTicks(4)
mtext(side = 4, text = at, at = at, col = "blue", line = 1,las=0)
mtext("Precipitation (cm)", side=4, las=0, line = 3)
par(mar = c(5,4,4,4) + 0.3)
plot(cobs11$day, cobs11$temp, type="l", col="red", yaxt="n", xlab="Day of Year",
ylab="", ylim=c(-25, 30))
axis(side=3, col="black", at=cobs11$day, labels=cobs11$gdd)
at = axTicks(3)
mtext("", side=3, las=0, line = 3)
axis(side=2, col='red', labels=FALSE)
at= axTicks(2)
mtext(side=2, text= at, at = at, col = "red", line = 1, las=0)
mtext("Temperature (C)", side=2, las=0, line=3)
par(new=TRUE)
plot(cobs11$gdd, cobs11$precip, type="h", col="blue", yaxt="n", xaxt="n", ylab="",
xlab="", ylim=c(0,12))
axis(side=4, col='blue', labels=FALSE)
at = axTicks(4)
mtext(side = 4, text = at, at = at, col = "blue", line = 1,las=0)
mtext("Precipitation (cm)", side=4, las=0, line = 3)
Thanks for thinking about it.
You've pretty much got the solution already:
axis(side=3, col="black", at=cobs10$day, labels=cobs10$gdd)
Except, you are asking to have ticks and labels at every single entry.
Take a look at the function pretty:
at <- pretty(cobs10$day)
at
# [1] 0 100 200 300 400
These are where the ticks should be placed on the x-axis. Now you need to find the corresponding labels. This is not straightforward, but we can look them up like this:
lbl <- which(cobs10$day %in% at)
lbl
# [1] 100 200 300
lbl <- c(0, cobs10$gdd[lbl])
axis(side=3, at=at[-5], labels=lbl)
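Since the original data files are not included here, the same lookup can be sketched on made-up data. Everything about the data frame cobs below is hypothetical (a day column running from 1 to 365 and an invented cumulative gdd column); the point is only the pattern of keeping the pretty() positions that occur in the data and using match() to fetch their labels.

```r
# Hypothetical stand-in for cobs10: day of year and a cumulative heat sum
cobs <- data.frame(day = 1:365)
cobs$gdd <- cumsum(pmax(0, 15 * sin((cobs$day - 100) * pi / 200)))

at <- pretty(cobs$day)      # candidate tick positions
at <- at[at %in% cobs$day]  # keep only positions that occur in the data
lbl <- round(cobs$gdd[match(at, cobs$day)]) # gdd value at each kept position

plot(cobs$day, cobs$gdd, type = "l", xlab = "Day of Year", ylab = "")
axis(side = 3, at = at, labels = lbl)
```

Filtering at before the lookup keeps the tick positions and labels the same length, so there is no need to drop elements by hand.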
Update
I've been a bit annoyed by your use of three different series in a single plot. There are many reasons this is troublesome.
Having two y-axes is always troublesome; see this article from Stephen Few (go to page 5 for my favorite example). In your case it is not that serious due to the nature of the plots and your use of colours to indicate which y-axis the values belong to. But still, on principle.
Axis ticks should have a fixed function, e.g. linear or logarithm. With your Thermal Units, they appear "randomly" (I know that is not the case, but for an outsider they do).
We gotta do something about your x-axis ticks that just refer to "day of year".
First up, we take a look at your data and see what can be done naively. We recognize that your ''date'' variable is actual dates. Let's exploit it and make R aware of it!
cobs10 <- read.table('cobs10.txt',as.is=TRUE)
cobs10$date <- as.Date(cobs10$date)
plot(temp ~ date, data=cobs10, type='l')
Here, I really like the x-axis ticks and had some trouble replicating it. ''pretty'' on dates insisted on either 4 ticks or 12 ticks. But we will come back to that later.
Next, we can do something about the overlay plotting. Here I use ''par(mfrow=c(3,1))'' to instruct R to stack three plots in a single window; with these multiple plots we can differentiate between inner and outer margins. The ''mar'' and ''oma'' arguments refer to the inner and outer margins.
Let's put all three variables together!
par(mfrow=c(3,1), mar=c(0.6, 5.1, 0, 0.6), oma=c(5.1, 0, 1, 0))
plot(temp ~ date, data=cobs10, type='l', ylab='Temperature (C)')
plot(precip ~ date, data=cobs10, type='l', ylab='Precipitation (cm)')
plot(gdd ~ date, data=cobs10, type='l', ylab='Thermal units')
This looks okay, except that the x-axis ticks are drawn on top of the plots. Not good. Naturally, we can disable the ticks in the first two plots (with ''plot(..., xaxt='n')''), but this will distort the bottom plot. So you will need to do so for all three plots and then add the axis to the outer plotting region.
par(mfrow=c(3,1), mar=c(0.6, 5.1, 0, 0.6), oma=c(5.1, 0, 1, 0))
plot(temp ~ date, data=cobs10, type='l', xaxt='n', ylab='Temperature (C)')
plot(precip ~ date, data=cobs10, type='l', xaxt='n', ylab='Precipitation (cm)')
plot(gdd ~ date, data=cobs10, type='l', xaxt='n', ylab='Thermal units')
ticks <- seq(from=min(cobs10$date), by='2 months', length=7)
lbl <- strftime(ticks, '%b')
axis(side=1, outer=TRUE, at=ticks, labels=lbl)
mtext('2010', side=1, outer=TRUE, line=3, cex=0.67)
Since ''pretty'' doesn't behave as we want it to, we use ''seq'' to make the sequence of x-axis ticks. Then we format the dates to display just an abbreviation of the month name; note that this depends on your locale settings (I live in Denmark), see ''locale''.
To add the axis-ticks and a label to the outer region, we must remember to specify ''outer=TRUE''; otherwise it is added to the last subplot.
Also note that I specified ''cex=0.67'' to match the font size of the x-axis to the y-axis.
Now I agree that displaying the thermal units in an individual subplot is not optimal, although it is the correct way of displaying it. But there was the issue with the ticks. What we really want is to display some nice values that clearly show that they are not linear. But your data does not necessarily contain these nice values, so we will have to interpolate them ourselves.
For this, I use ''splinefun'':
lbl <- c(0, 2, 200, 1000, 2000, 3000, 4000)
thermals <- splinefun(cobs10$gdd, cobs10$date) # thermals is a function that returns the date (as an integer) for a requested value
thermals(lbl)
## [1] 14649.00 14686.79 14709.55 14761.28 14806.04 14847.68 14908.45
ticks <- as.Date(thermals(lbl), origin='1970-01-01') # remember to specify an origin when converting an integer to a Date.
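The inverse lookup can be tried out on made-up data as well. In the sketch below, dates and gdd are invented (gdd is strictly increasing by construction) and inv is a name chosen for this example. It also uses method = "monoH.FC", which is an assumption on my part rather than what the code above does; it keeps the spline monotone so that larger thermal values always map to later dates.

```r
# Hypothetical data: one year of dates and a strictly increasing heat sum
dates <- seq(as.Date("2010-01-01"), by = "day", length.out = 365)
gdd <- cumsum(seq(0.1, 20, length.out = 365))

# Inverse function: thermal units -> date (as a number of days since 1970-01-01)
inv <- splinefun(gdd, as.numeric(dates), method = "monoH.FC")

lbl <- c(200, 1000, 2000, 3000) # "nice" thermal values to label
ticks <- as.Date(inv(lbl), origin = "1970-01-01")

plot(dates, gdd, type = "l", xlab = "", ylab = "Thermal units")
axis(side = 3, at = as.numeric(ticks), labels = lbl)
```

With real data that contain flat stretches (several days with the same gdd), the interpolation has to deal with tied x values, so it is worth checking the resulting tick dates by eye.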
Now that the thermal ticks are in place, let's try it.
par(mfrow=c(2,1), mar=c(0.6, 5.1, 0, 0.6), oma=c(5.1, 0, 4, 0))
plot(temp ~ date, data=cobs10, type='l', xaxt='n', ylab='Temperature (C)')
plot(precip ~ date, data=cobs10, type='l', xaxt='n', ylab='Precipitation (cm)')
usr <- par('usr')
x.pos <- (usr[2]+usr[1])/2
ticks <- seq(from=min(cobs10$date), by='2 months', length=7)
lbl <- strftime(ticks, '%b')
axis(side=1, outer=TRUE, at=ticks, labels=lbl)
mtext('2010', side=1, at=x.pos, line=3)
lbl <- c(0, 2, 200, 1000, 2000, 3000, 4000)
thermals <- splinefun(cobs10$gdd, cobs10$date) # thermals is a function that returns the date (as an integer) for a requested value
ticks <- as.Date(thermals(lbl), origin='1970-01-01') # remember to specify an origin when converting an integer to a Date.
axis(side=3, outer=TRUE, at=ticks, labels=lbl)
mtext('Thermal units', side=3, line=15, at=x.pos)
Update: I changed the mtext function calls in the last code block to ensure that the x-axis texts are centred on the plotting region, not the entire region. You might want to tweak the vertical position by changing the line argument.
