I'm creating a histogram in R which displays the frequency of several events in a vector. Each event is represented by an integer in the range [1, 9]. I'm displaying the label for each count vertically below the chart. Here's the code:
hist(vector, axes = FALSE, breaks = chartBreaks)
axis(1, at = tickMarks, labels = eventTypes, las = 2, tick = FALSE)
Unfortunately, the labels are too long, so they are cut off by the bottom of the window. How can I make them visible? Am I even using the right chart?
Look at help(par), in particular fields mar (for the margin) and oma (for outer margin).
It may be as simple as
par(mar=c(10,3,1,1)) # extra large bottom margin (the default is 5.1 lines)
hist(vector, axes = FALSE, breaks = chartBreaks)
axis(1, at = tickMarks, labels = eventTypes, las = 2, tick = FALSE)
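If you'd rather not guess the margin size, one option is to measure the labels first. This is only a sketch with made-up data: events, event_types and label_in are illustrative names, not from the question. It relies on strwidth() with units = "inches" and on par(mai = ...), which sets margins in inches.

```r
# Hypothetical data and labels, for illustration only
set.seed(1)
events <- sample(1:9, 100, replace = TRUE)
event_types <- paste("My long event name", 1:9)

# Width of the longest label in inches, plus some padding
label_in <- max(strwidth(event_types, units = "inches")) + 0.3

# Margins in inches: bottom, left, top, right
op <- par(mai = c(label_in, 0.8, 0.4, 0.2))
hist(events, breaks = seq(0.5, 9.5, by = 1), axes = FALSE, main = "", xlab = "")
axis(1, at = 1:9, labels = event_types, las = 2, tick = FALSE)
axis(2)
par(op)  # restore the previous margins
```

The padding of 0.3 inches is a guess that covers the gap between the axis and the start of the labels at default settings; adjust it if you change cex.axis or mgp.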
This doesn't sound like a job for a histogram - the event is not a continuous variable. A barplot or dotplot may be more suitable.
Some dummy data
set.seed(123)
vec <- sample(1:9, 100, replace = TRUE)
vec <- factor(vec, labels = paste("My long event name", 1:9))
A barplot is produced via the barplot() function - we provide it the counts of each event using the table() function for convenience. Here we need to rotate the labels using las = 2 and create some extra space for the labels in the margin.
## lots of extra space in the margin for side 1
op <- par(mar = c(10,4,4,2) + 0.1)
barplot(table(vec), las = 2)
par(op) ## reset
A dotplot is produced via function dotchart() and has the added convenience of sorting out the plot margins for us
dotchart(table(vec))
The dotplot has the advantage over the barplot of using much less ink to display the same information and focuses on the differences in counts across groups rather than the magnitudes of the counts.
Note how I've set the data up as a factor. This allows us to store the event labels as the labels for the factor - thus automating the labelling of the axes in the plots. It also is a natural way of storing data like I understand you to have.
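If your data are currently stored as integer codes, the conversion might look something like this. It's a sketch only: the codes and the event names are made up, and I'm assuming the codes run 1, 2, 3.

```r
# Hypothetical integer codes and event names, for illustration only
codes <- c(3, 1, 2, 3, 3, 1)
event_names <- c("Login", "Upload", "Logout")

# levels = the codes as stored; labels = the names to display
events <- factor(codes, levels = 1:3, labels = event_names)

counts <- table(events)  # the counts carry the labels as names
barplot(counts, las = 2)
```

Giving levels explicitly means an event that never occurs still gets a (zero-height) bar rather than silently dropping out of the plot.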
Perhaps adding \n into your labels so they will wrap onto 2 lines? It's not optimal, but it may work.
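Rather than inserting the line breaks by hand, base R's strwrap() can do it. A rough sketch, where wrap_labels is a hypothetical helper and the labels and bar heights are made up:

```r
# Hypothetical helper: wrap each label onto multiple lines
wrap_labels <- function(x, width = 12) {
  vapply(x, function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

long_labels <- paste("My long event name", 1:3)
wrapped <- wrap_labels(long_labels)

# Wrapped labels take more vertical room, so allow a deeper bottom margin
op <- par(mar = c(7, 4, 2, 1) + 0.1)
barplot(c(5, 3, 8), names.arg = wrapped)
par(op)
```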
You might want to look at this post from Cross Validated
I am using the native R plot function to generate graphics and am looking to add a double x-axis on the same plot. One x-axis holds doubles and the other holds Date objects. I am using the following commands, but they don't seem to work.
First x-axis:
axis.Date(1,at=seq(min(x$Date),na.rm=TRUE,max(x$Date),na.rm=TRUE,by="2 years"),format ="%Y-%m-%d",col.axis="white", cex=1)
Second x-axis:
axis(1,at=seq(min(f), max(f), by = 0.1), col.axis="white", cex=1)
The parameters for the R native plot:
x11()
par(mfrow=c(1,1),oma = c(0, 0, 2, 0) )
Result is only Dates on x-axis.
Up front: dual axes can easily be mis-used by mis-representing the data and/or ranges. It's easy for eyes to misconstrue correlation or relationships based on imperfect axis decisions. For scatter plots (such as below), I'm not a fan and tend to avoid them ... but I do use them under very controlled circumstances, as they can provide visual correlation of relative trends.
When I must do it, I'm a fan of using color as a way to more strongly tie points (or lines) with particular axes, though of course this does not work as well with color-impaired readers.
Given that preface ...
I believe the easiest way to handle multiple axes in base-R is to use par(new=TRUE) between plots. Here's an example:
par(mar = c(4,4,4,4) + 0.1)
plot(disp ~ mpg, data = mtcars, las = 1)
par(new = TRUE)
dat <- data.frame(dat = Sys.Date() + 0:5, y = 1:6)
plot(y ~ dat, data = dat, ann = FALSE, yaxt = "n", xaxt = "n", pch = 16, col = "red")
axis.Date(3, dat$dat[1], col = "red", line = 1)
axis(4, col = "red", line = 1, las = 1)
Other differentiating techniques include shapes or line-types (if lines) specific to each side, and adding those as clear markers on the secondary axes.
The use of par(new=TRUE) simply allows the next plot command to not reset/clear the canvas before starting over. This means that the subsequent plotting functions have no knowledge of what is existing. From ?par:
'new' logical, defaulting to 'FALSE'. If set to 'TRUE', the next
high-level plotting command (actually 'plot.new') should _not
clean_ the frame before drawing _as if it were on a *_new_*
device_. It is an error (ignored with a warning) to try to
use 'new = TRUE' on a device that does not currently contain
a high-level plot.
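To see what that means in practice, you can compare the user coordinates before and after the second plot. A small sketch (usr_first and usr_second are just illustrative names):

```r
plot(disp ~ mpg, data = mtcars)
usr_first <- par("usr")    # x/y limits of the first coordinate system

par(new = TRUE)
plot(1:6, 1:6, ann = FALSE, axes = FALSE, pch = 16, col = "red")
usr_second <- par("usr")   # limits set by the second plot

rbind(first = usr_first, second = usr_second)
```

Anything drawn after the second plot() (points, lines, axis) is placed in the second coordinate system, so the primary axes have to be drawn before par(new = TRUE).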
It doesn't work well with all plotting mechanisms (certainly nothing grid or ggplot2), and anything that might be sensitive to margins or oma or other parameters should be tested carefully with various ranges of data.
I intentionally used line=1 to "bump out" the top/right axes, another way to set them apart. Frankly, I often do that for the bottom/left (primary) axes as well, it can be aesthetically preferred ... but it's an option and not required for this technique to at least start the process.
Trying to produce both a stripchart and a boxplot of the same (transformed) data but (because the boxplot is shifted down a tad) I don't want the axis labels twice:
set.seed(3121975)
bee = list(x1=rnbinom(50, mu = 4, size = .1),
x2=rnbinom(30,mu=6,size=.1),
x3=rnbinom(40,mu=2,size=.1))
f = function(x) asinh(sqrt(4*x+1.5))
stripchart(lapply(bee,f),method="stack",offset=.13,ylim=c(.8,3.9))
boxplot(lapply(bee,f),horizontal=TRUE,boxwex=.05,at=(1:3)-.1,add=TRUE,ann=FALSE)
Other things that don't work include: (i) leaving ann to take its default value of !add, (ii) specifying labels for ylab.
I presume I have missed something obvious but I am not seeing what it might be.
Just add yaxt = 'n' into boxplot() to suppress plotting of the y-axis. The argument ann controls axis titles and overall titles, not the axis itself.
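Applied to the code from the question, that would look roughly like this. The only change is yaxt = "n" in place of ann = FALSE; bp is just an illustrative name for the statistics that boxplot() returns.

```r
set.seed(3121975)
bee <- list(x1 = rnbinom(50, mu = 4, size = 0.1),
            x2 = rnbinom(30, mu = 6, size = 0.1),
            x3 = rnbinom(40, mu = 2, size = 0.1))
f <- function(x) asinh(sqrt(4 * x + 1.5))

stripchart(lapply(bee, f), method = "stack", offset = 0.13, ylim = c(0.8, 3.9))
# yaxt = "n" stops boxplot() from drawing the group labels a second time
bp <- boxplot(lapply(bee, f), horizontal = TRUE, boxwex = 0.05,
              at = (1:3) - 0.1, add = TRUE, yaxt = "n")
```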
When I manually add the following labels with (axis(1, at=1:27, labels=labs[0:27])):
> labs[0:27]
[1] "0\n9.3%" "1\n7.6%" "2\n5.6%" "3\n5.1%" "4\n5.7%" "5\n6.5%" "6\n7.3%" "7\n7.6%" "8\n7.5%" "9\n7%" "10\n6.2%" "11\n5.2%"
[13] "12\n4.2%" ........
I get the following:
How do I force all labels to be drawn so 1,3,5,6, and 11 are not skipped? (also, for extra credit, how do I shift the whole thing down a few pixels?)
If you want to force all labels to display, even when they are very close or overlapping, you can "trick" R into displaying them by adding odd and even axis labels with separate calls to the axis function, as follows:
labs <-c("0\n9.3%","1\n7.6%","2\n5.6%","3\n5.1%","4\n5.7%","5\n6.5%","6\n7.3%",
"7\n7.6%","8\n7.5%","9\n7%", "10\n6.2%","11\n5.2%","12\n4.2%",13:27)
n=length(labs)
plot(1:28, xaxt = "n")
axis(side=1, at=seq(1,n,2), labels=labs[seq(1,n,2)], cex.axis=0.6)
axis(side=1, at=seq(2,n,2), labels=labs[seq(2,n,2)], cex.axis=0.6)
You can play with cex.axis to get the text size that you want. Note, also, that you may have to adjust the number of values in at= and/or labels= so that they are equal.
I agree with @PLapointe and @joran that it's generally better not to tamper with R's default behavior regarding overlap. However, I've had a few cases where axis labels looked fine even when they weren't quite a full "m-width" apart, and I hit on the trick of alternating odd and even labels as a way to get the behavior I wanted.
?axis tells you that:
The code tries hard not to draw overlapping tick labels, and so will omit labels where they would abut or overlap previously drawn labels. This can result in, for example, every other tick being labelled. (The ticks are drawn left to right or bottom to top, and space at least the size of an ‘m’ is left between labels.)
Play with cex.axis so that labels are small enough to fit without overlapping
labs <-c("0\n9.3%","1\n7.6%","2\n5.6%","3\n5.1%","4\n5.7%","5\n6.5%","6\n7.3%",
"7\n7.6%","8\n7.5%","9\n7%", "10\n6.2%","11\n5.2%","12\n4.2%",13:27)
plot(1:27,xaxt = "n")
axis(side=1, at=1:27, labels=labs[1:27],cex.axis=0.35)
If you widen your graph (manually by dragging or programmatically), you can increase the size of your labels.
Although there are some good answers here, the OP didn't want to resize the labels or change anything about the plot besides fitting all of the axis labels. It's annoying, since often there appears to be plenty of room to fit all of the axis labels.
Here's another solution. Draw the plot without the axis, then add ticks with empty labels. Store the positions of the ticks in an object, so then you can go through each one and place it in the correct position on the axis.
plot(1:10, 1:10, yaxt = "n")
axis_ticks = axis(2, axTicks(2), labels = rep("", length(axTicks(2))))
for(i in axis_ticks) axis(2, i)
@PLapointe just posted what I was going to say, but omitted the bonus answer.
Set padj = 0.5 in axis to move the labels down slightly.
Perhaps draw and label one tick at a time, by calling axis repeatedly using mapply...
For example, consider the following data:
x = runif(100)*20
y = 10^(runif(100)*3)
The formula for y might look a bit odd; it gives random numbers distributed across three orders of magnitude such that the data will be evenly distributed on a plot where the y axis is on a log scale. This will help demonstrate the utility of axTicks() by calculating nice tick locations for us on a logged axis.
By default:
plot(x, y, log = "y")
returns:
Notice that 100 and 1000 labels are missing.
We can instead use:
plot(x, y, log = "y", yaxt = "n")
mapply(axis, side = 2, at = axTicks(2), labels = axTicks(2))
which calls axis() once for each tick location returned by axTicks(), thus plotting one tick at a time. The result:
What I like about this solution is that it uses only one line of code for drawing the axis, it prints exactly the default axis R would have made, except all ticks are labeled, and the labels don't go anywhere when the plot is resized:
I can't say the axis is useful in the resized example, but it makes the point about axis labels being permanent!
For the first (default) plot, note that R will recalculate tick locations when resizing.
For the second (always labeled) plot, the number and location of tick marks are not recalculated when the image is resized. The axis ticks calculated by axTicks depend upon the size of the display window when the plot is first drawn.
If you want to force specific tick locations, try something like:
plot(x, y, log = "y", yaxt = "n")
mapply(axis, side = 2, at = c(1,10,100, 1000), labels = c("one", "ten", "hundred", "thousand"))
which yields:
axis() includes a gap.axis parameter that controls when labels are omitted. Setting this to a very negative number will force all labels to display, even if they overlap.
The padj parameter of axis() controls the y offset whilst plotting an individual axis.
par(mgp = c(3, 2, 0)) will adjust the position of all axis labels for the duration of a plotting session: the second value (here 2, default 1) controls the position of the labels.
# Set axis text position, including for Y axis
par(mgp = c(3, 2, 0))
# Plot
plot(1:12, 1:12, log = 'x', ann = FALSE, axes = FALSE)
# Some numbers not plotted:
axis(1, 1:12)
# All numbers plotted, with manual offset
axis(1, 1:12, gap.axis = -100, padj = 0.5)
I had a similar problem where I wanted to stagger the labels and get them all to print without losing any. I created two sets of ticks, drawing the second set of labels below the first so that they appear staggered.
xaxis_stagger = function(positions,labels) {
  odd=labels[seq(1,length(labels),2)]
  odd_pos=positions[seq(1,length(positions),2)]
  even=labels[seq(2,length(labels),2)]
  even_pos=positions[seq(2,length(positions),2)]
  axis(side=1,at=odd_pos,labels=odd)
  axis(side=1,at=even_pos,labels=even,padj=1.5)
}
So you give the positions where you want the ticks to be and the labels for those ticks and this would then re-organise it into two sets of axis and plot them on the original plot. Original plot would be done with xaxt="n".
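A usage sketch, with the function repeated in a slightly tidied form so the snippet runs on its own. The data are made up, and month.name is used only because it gives twelve labels of uneven length.

```r
xaxis_stagger <- function(positions, labels) {
  odd  <- seq(1, length(labels), by = 2)
  even <- seq(2, length(labels), by = 2)
  axis(side = 1, at = positions[odd],  labels = labels[odd])
  # padj = 1.5 pushes the even-numbered labels below the odd ones
  axis(side = 1, at = positions[even], labels = labels[even], padj = 1.5)
}

# Made-up data: one value per month
set.seed(42)
values <- runif(12)

plot(1:12, values, xaxt = "n", xlab = "")
shown <- xaxis_stagger(1:12, month.name)
```

Within each of the two rows, axis() still applies its usual overlap check, so on a very narrow device some labels can be dropped even with this approach.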
I'm working with TraMineR to do a sequence analysis of educational data. I can get R to produce a plot of the 10 most frequent sequences in the data using code similar to the following:
library(TraMineR)
##Loading the data
data(actcal)
##Creating the labels and defining the sequence object
actcal.lab <- c("> 37 hours", "19-36 hours", "1-18 hours", "no work")
actcal.seq <- seqdef(actcal, 13:24, labels=actcal.lab)
## 10 most frequent sequences in the data
actcal.freq <- seqtab(actcal.seq)
actcal.freq
## Plotting the object
seqfplot(actcal.seq, pbarw=FALSE, yaxis="pct", tlim=10:1, cex.legend=.75, withlegend="right")
However, I'd also like to have the frequencies of each sequence (which are in the object actcal.freq) along the right side of the plot. For example, the first sequence in the plot created by the code above represents 37.9% of the data (as the plot currently shows). Per the seqtab, this is 757 subjects. I'd like the number 757 to appear on the right y-axis (and so on for the other sequences).
Is this possible? I've played around with axis(side=4, ...) but never been able to get it to reproduce the spacing of the left y-axis.
OK. This is a bit of a mess, but the function resets the par setting if you include a legend by default, so you need to turn that off. Then you can set the axis a bit more easily, and then we can go back for the legend. This should work with your test data above.
#add padding to the right for axis and legend
par("mar"=c(5,4,4,8)+.1)
#plot w/o axis
seqfplot(actcal.seq, pbarw=FALSE, yaxis="pct", tlim=10:1, withlegend=FALSE)
#plot right axis with freqs
axis(4, at = seq(.7, by=1.2, length.out=length(attr(actcal.freq,"freq")$Freq)),
labels = rev(attr(actcal.freq,"freq")$Freq),
mgp = c(1.5, 0.5, 0), las = 1, tick = FALSE)
#now put the legend on
legend("right", legend=attr(actcal.seq, "labels"),
fill=attr(actcal.seq, "cpal"),
inset=-.3, bty="o", xpd=NA, cex=.75)
You may need to play a bit with the margins and especially the inset= parameter of the legend to get it placed correctly. I hope your real data isn't too different from this, because you really have to dig through the function to see how it does the formatting to get things to match up.
I'm constructing a plot using bargraph.CI from sciplot. The x-axis represents a categorical variable, so the values of this variable are the names for the different positions on the x-axis. Unfortunately these names are long, so at default settings, some of them just disappear. I solved this problem by splitting them into multiple lines by injecting "\n" where needed. This basically worked, but because the names are now multi-line, they look too close to the x-axis. I need to move them farther away. How?
I know I can do this with mgp, but that affects the y-axis too.
I know I can set axisnames=FALSE in my call to bargraph.CI, then use axis to create a separate x-axis. (In fact, I'm already doing that, but only to make the x-axis extend farther than it would by default; see my code below.) Then I could give the x-axis its own mgp parameter that would not affect the y-axis. But as far as I can tell, axis() is well set up for ordinal or continuous variables and doesn't seem to work great for categorical variables. After some fiddling, I couldn't get it to put the names in the right locations (i.e. right under their corresponding bars).
Finally, I tried using mgp.axis.labels from Hmisc to set ONLY the x-axis mgp, which is precisely what I want, but as far as I could tell it had no effect on anything.
Ideas? Here's my code.
ylim = c(0.5,0.8)
yticks = seq(ylim[1],ylim[2],0.1)
ylab = paste(100*yticks,"%",sep="")
bargraph.CI(
response = D$accuracy,
ylab = "% Accuracy on Test",
ylim = ylim,
x.factor = D$training,
xlab = "Training Condition",
axes = FALSE
)
axis(
side = 1,
pos = ylim[1],
at = c(0,7),
tick = TRUE,
labels = FALSE
)
axis(
side = 2,
tick = TRUE,
at = yticks,
labels = ylab,
las = 1
)
axis works fine with categories, but you need to set the right tick values and play with the pos parameter to offset the labels. Here I use xvals, part of the return value of bargraph.CI, to set the axis tick marks.
Here is a reproducible example:
library(sciplot)
# I am using some sciplot data
dat <- ToothGrowth
### I create long labels
labels <- c('aaaaaaaaaa\naaaaaaaaaaa\nhhhhhhhhhhhhhhh',
'bbbbbbbbbb\nbbbbbbbbbbb\nhhhhhhhhhhhhhh',
'cccccccccc\nccccccccccc\ngdgdgdgdgd')
## I change factor labels
dat$dose <- factor(dat$dose,labels=labels)
ll <- bargraph.CI(x.factor = dose, response = len, data = dat,axisnames=FALSE)
## set at to xvals
axis(side=1,at=ll$xvals,labels=labels,pos=-2,tick=FALSE)
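An alternative that leaves par() alone: axis() itself accepts mgp, so the offset can be set for the x-axis only. The sketch below uses base barplot() as a stand-in for bargraph.CI so that it runs without sciplot; the values and labels are made up, and with bargraph.CI you would pass ll$xvals as at instead of mids.

```r
# Made-up values and labels, for illustration only
accuracy <- c(0.62, 0.71, 0.68)
cond_labels <- c("first\ntraining\ncondition",
                 "second\ntraining\ncondition",
                 "third\ntraining\ncondition")

op <- par(mar = c(6, 4, 2, 1) + 0.1)   # room for three-line labels
mids <- barplot(accuracy, axisnames = FALSE, ylim = c(0, 0.8), las = 1)

# mgp given to axis() applies to this axis only; the second value is
# the distance (in margin lines) between the axis and its labels
axis(1, at = mids, labels = cond_labels, tick = FALSE, mgp = c(3, 2.5, 0))
par(op)
```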