I'd like to place four plots onto a single page. Axis labels should be printed only at the very rim, i.e. x axis labels for the bottom diagrams only, and y axis labels for the left diagrams only. This goes both for the name of the axis as a whole and the individual tick marks. I can generate something along these lines using the following code:
pdf(file = "ExampleOutput.pdf",
width = 6.61,
height = 6.61,
pointsize = 10
)
set.seed(42)
catA <- factor(c("m100", "m500", "m1000", "m2000", "m3000", "m5000"))
catB <- factor(20:28)
samples <- 100
rsample <- function(v) v[ceiling(runif(samples, max=length(v)))]
Tab <- data.frame(catA = rsample(catA),
catB = rsample(catB),
valA = rnorm(samples, 150, 8),
valB = pmin(1,pmax(0,rnorm(samples, 0.5, 0.3))))
par(mfrow = c(2,2))
for (i in 0:3) {
x <- Tab[[1 + i %% 2]]
plot(x, Tab[[3 + i %/% 2]],
xlab = if (i %/% 2 == 1) "Some Categories" else NULL,
ylab = if (i %% 2 == 0) "Some Values" else NULL,
axes = FALSE
)
axis(side = 1,
at=1:nlevels(x),
labels = if (i %/% 2 == 1) levels(x) else FALSE)
axis(side = 2, labels = (i %% 2 == 0))
box(which = "plot", bty = "l")
}
par(mfrow = c(1,1))
dev.off()
I'll welcome suggestions for how to improve my ploting commands, perhaps avoid draing the axes and the L in the lower left corner manually. But that's only a besides.
The result of this sequence looks like this:
The problem here is the huge amount of wasted whitespace. I have the impression that R reserves space for axis and tick labels even if they are not used. As a consequence of this wasted space, for the left bottom diagram, only every second x tick actually gets labeled, which is really bad here.
I'd like to generate a similar plot without that much white space. The actual plots should be the same size, so they line up properly, but the space for the labels should be only at the outside. I imagine a layout like this (mockup created in GIMP):
How can I achieve such a layout?
Here is a slight modification of the general plot you show, assuming that the y and x axis labels pertain to all plots. It uses an outer margin to contain the axis labelling, which we add with title() using argument outer = TRUE. The effect is somewhat like the labelling in ggplot2 or lattice plots.
The key line here is:
op <- par(mfrow = c(2,2),
oma = c(5,4,0,0) + 0.1,
mar = c(0,0,1,1) + 0.1)
which sets plot parameters (the values in place prior to the call are stored in op). We use 5 and 4 lines on sides 1 and 2 for the outer margin, which is the usual number for the mar parameter. Plot region margins (mar) of 1 line each are added to the top and right sides, to give a little room between plots.
The axis labels are added after the for() loop with
title(xlab = "Some Categories",
ylab = "Some Values",
outer = TRUE, line = 3)
The entire script is:
set.seed(42)
catA <- factor(c("m100", "m500", "m1000", "m2000", "m3000", "m5000"))
catB <- factor(20:28)
samples <- 100
rsample <- function(v) v[ceiling(runif(samples, max=length(v)))]
Tab <- data.frame(catA = rsample(catA),
catB = rsample(catB),
valA = rnorm(samples, 150, 8),
valB = pmin(1,pmax(0,rnorm(samples, 0.5, 0.3))))
op <- par(mfrow = c(2,2),
oma = c(5,4,0,0) + 0.1,
mar = c(0,0,1,1) + 0.1)
for (i in 0:3) {
x <- Tab[[1 + i %% 2]]
plot(x, Tab[[3 + i %/% 2]], axes = FALSE)
axis(side = 1,
at=1:nlevels(x),
labels = if (i %/% 2 == 1) levels(x) else FALSE)
axis(side = 2, labels = (i %% 2 == 0))
box(which = "plot", bty = "l")
}
title(xlab = "Some Categories",
ylab = "Some Values",
outer = TRUE, line = 3)
par(op)
which produces
Building heavily on the answer from Gavin Simpson, I now use the following solution:
par(mfrow = c(2, 2), # 2x2 layout
oma = c(2, 2, 0, 0), # two rows of text at the outer left and bottom margin
mar = c(1, 1, 0, 0), # space for one row of text at ticks and to separate plots
mgp = c(2, 1, 0), # axis label at 2 rows distance, tick labels at 1 row
xpd = NA) # allow content to protrude into outer margin (and beyond)
The result looks like this:
As you can see, this is enough to allow printing of all the tick labels as well. If it were not, then according to Gavin's comment, adding cex.axis with a value smaller than 1 to the parameter list should help reduce the font size there.
Just manipulate your parameters, in par. The argument mar controls margin size for individual plot. Change your par to this:
par(mfrow = c(2,2), mar=c(1, 4, 1, 1) + 0.1)#it goes c(bottom, left, top, right)
You need a conditional evaluation that assigns to par('mar') values that are appropriate to the positioning; Here is an example of code (inside your loop) that checks for the "x-layout-position":
pdf(file = "ExampleOutput2.pdf",
width = 6.61,
height = 6.61,
pointsize = 10
)
set.seed(42)
catA <- factor(c("m100", "m500", "m1000", "m2000", "m3000", "m5000"))
catB <- factor(20:28)
samples <- 100
rsample <- function(v) v[ceiling(runif(samples, max=length(v)))]
Tab <- data.frame(catA = rsample(catA),
catB = rsample(catB),
valA = rnorm(samples, 150, 8),
valB = pmin(1,pmax(0,rnorm(samples, 0.5, 0.3))))
par(mfrow = c(2,2), mar= c(3, 4, 1, 1) + 0.1)
for (i in 0:3) {
x <- Tab[[1 + i %% 2]]
plot(x, Tab[[3 + i %/% 2]], mar= if(i %/%2 == 0) {c(4, 4, 1, 1) + 0.1
}else{c(1, 1, 1, 1) + 0.1},
xlab = if (i %/% 2 == 1) "Some Categories" else NULL,
ylab = if (i %% 2 == 0) "Some Values" else NULL,
axes = FALSE
)
axis(side = 1,
at=1:nlevels(x),
labels = if (i %/% 2 == 1) levels(x) else FALSE)
axis(side = 2, labels = (i %% 2 == 0))
box(which = "plot", bty = "l")
}
par(mfrow = c(1,1))
dev.off()
You will need to adjust this to suit you needs, since it only handles two margin conditions andy you really have 4 separate conditions (2 below both needing more bottom-space, with the right one needing less left-space and two above (also with different requirements) . If you shrink the 'mar' value globally it will cut off your x and y labels as can be seen in the loss of the xlab values when you only drop this code into your loop.
Related
I have the following code (Thanks to an answer from #Rawr in this question):
labes1 <- c("P(LNG)","","Volume(LNG)","","P(oil)","","Can.GDP","","US GDP","")
titles <- c("Levels","","","","","Log Difference","","","","")
par(mfrow = c(5, 2), mar = c(0.3, 6, 0, 2), oma = c(5, 0, 3, 2))
lapply(1:10, function(ii) {
x <- plotdata1[, ii, drop = FALSE]
plot(x, xlab = "Quarter", ylab = labes1[ii], axes = FALSE)
axis(2, las = 1)
box()
if (ii %in% 9:10) {
axis(1)
title(xlab = 'Quarter', xpd = NA)
}
if (ii %in% 1:2)
title(main = c('Levels', 'Log Difference')[ii], xpd = NA, line = 1)
})
This produces the following plot:
The obvious issue is the overlaying of the y-axis labels with the y-axis values. I have tried playing around with the mar() and oma() but these just change the margins around, I was hoping this would move things out of the way. How can I move the y-axis labels as separate from the plot? I will also be moving the margins a bit so that the white space between the two columns of plots will be closer together.
You can define the ylab separately, like what you're doing for the xlab, and set the line parameter to define its distance from the plot (as stated in this post).
I got a running example from combining your code and #rawr's from your previous question.
set.seed(1)
z <- ts(matrix(rt(200 * 10, df = 3), 200, 10), start = c(1961, 1), frequency = 12)
z <- z * 1e5 # to make "wide" y-axis labels
## vectors of x, y, and main labels
xl <- sprintf('x label %s', 1:10)
yl <- sprintf('y label %s', 1:10)
ml <- sprintf('main label %s', 1:10)
labes1 <- c("P(LNG)","","Volume(LNG)","","P(oil)","","Can.GDP","","US GDP","")
titles <- c("Levels","","","","","Log Difference","","","","")
par(mfrow = c(5, 2), mar = c(0.3, 6, 0, 2), oma = c(5, 0, 3, 2))
lapply(1:10, function(ii) {
x <- z[, ii, drop = FALSE]
plot(x, xlab = "Quarter", ylab = "", axes = FALSE) # set ylab to ""
axis(2, las = 1)
title(ylab = labes1[ii], line = 4) # set the line at an appropriate distance
box()
if (ii %in% 9:10) {
axis(1)
title(xlab = 'Quarter', xpd = NA)
}
if (ii %in% 1:2)
title(main = c('Levels', 'Log Difference')[ii], xpd = NA, line = 1)
})
The code above outputs the following graph for line = 4 :
and this plot for line = 3 :
I want to plot a confusion matrix, but, I don't want to just use a heatmap, because I think they give poor numerical resolution. Instead, I want to also plot the frequency in the middle of the square. For instance, I like the output of this:
library(mlearning);
data("Glass", package = "mlbench")
Glass$Type <- as.factor(paste("Glass", Glass$Type))
summary(glassLvq <- mlLvq(Type ~ ., data = Glass));
(glassConf <- confusion(predict(glassLvq, Glass, type = "class"), Glass$Type))
plot(glassConf) # Image by default
However, 1.) I don't understand that the "01, 02, etc" means along each axis. How can we get rid of that?
2.) I would like 'Predicted' to be as the label of the 'y' dimension, and 'Actual' to be as the label for the 'x' dimension
3.) I would like to replace absolute counts by frequency / probability.
Alternatively, is there another package that will do this?
In essence, I want this in R:
http://www.mathworks.com/help/releases/R2013b/nnet/gs/gettingstarted_nprtool_07.gif
OR:
http://c431376.r76.cf2.rackcdn.com/8805/fnhum-05-00189-HTML/image_m/fnhum-05-00189-g009.jpg
The mlearning package seems quite inflexible with plotting confusion matrices.
Starting with your glassConf object, you probably want to do something like this:
prior(glassConf) <- 100
# The above rescales the confusion matrix such that columns sum to 100.
opar <- par(mar=c(5.1, 6.1, 2, 2))
x <- x.orig <- unclass(glassConf)
x <- log(x + 0.5) * 2.33
x[x < 0] <- NA
x[x > 10] <- 10
diag(x) <- -diag(x)
image(1:ncol(x), 1:ncol(x),
-(x[, nrow(x):1]), xlab='Actual', ylab='',
col=colorRampPalette(c(hsv(h = 0, s = 0.9, v = 0.9, alpha = 1),
hsv(h = 0, s = 0, v = 0.9, alpha = 1),
hsv(h = 2/6, s = 0.9, v = 0.9, alpha = 1)))(41),
xaxt='n', yaxt='n', zlim=c(-10, 10))
axis(1, at=1:ncol(x), labels=colnames(x), cex.axis=0.8)
axis(2, at=ncol(x):1, labels=colnames(x), las=1, cex.axis=0.8)
title(ylab='Predicted', line=4.5)
abline(h = 0:ncol(x) + 0.5, col = 'gray')
abline(v = 0:ncol(x) + 0.5, col = 'gray')
text(1:6, rep(6:1, each=6),
labels = sub('^0$', '', round(c(x.orig), 0)))
box(lwd=2)
par(opar) # reset par
The above code uses bits and pieces of the confusionImage function called by plot.confusion.
Here is a function for plotting confusion matrices I developed from jbaums excellent answer.
It is similar, but looks a bit nicer (IMO), and does not transpose the confusion matrix you feed it, which might be helpful.
### Function for plotting confusion matrices
confMatPlot = function(confMat, titleMy, shouldPlot = T) {
#' Function for plotting confusion matrice
#'
#' #param confMat: confusion matrix with counts, ie integers.
#' Fractions won't work
#' #param titleMy: String containing plot title
#' #return Nothing: It only plots
## Prepare data
x.orig = confMat; rm(confMat) # Lazy conversion to function internal variable name
n = nrow(x.orig) # conf mat is square by definition, so nrow(x) == ncol(x)
opar <- par(mar = c(5.1, 8, 3, 2))
x <- x.orig
x <- log(x + 0.5) # x<1 -> x<0 , x>=1 -> x>0
x[x < 0] <- NA
diag(x) <- -diag(x) # change sign to give diagonal different color
## Plot confusion matrix
image(1:n, 1:n, # grid of coloured boxes
# matrix giving color values for the boxes
# t() and [,ncol(x):1] since image puts [1,1] in bottom left by default
-t(x)[, n:1],
# ylab added later to avoid overlap with tick labels
xlab = 'Actual', ylab = '',
col = colorRampPalette(c("darkorange3", "white", "steelblue"),
bias = 1.65)(100),
xaxt = 'n', yaxt = 'n'
)
# Plot counts
text(rep(1:n, each = n), rep(n:1, times = n),
labels = sub('^0$', '', round(c(x.orig), 0)))
# Axis ticks but no lables
axis(1, at = 1:n, labels = rep("", n), cex.axis = 0.8)
axis(2, at = n:1, labels = rep("", n), cex.axis = 0.8)
# Tilted axis lables
text(cex = 0.8, x = (1:n), y = -0.1, colnames(x), xpd = T, srt = 30, adj = 1)
text(cex = 0.8, y = (n:1), x = +0.1, colnames(x), xpd = T, srt = 30, adj = 1)
title(main = titleMy)
title(ylab = 'Predicted', line = 6)
# Grid and box
abline(h = 0:n + 0.5, col = 'gray')
abline(v = 0:n + 0.5, col = 'gray')
box(lwd = 1, col = 'gray')
par(opar)
}
Example of output:
I would like to write a plot function for my specific purposes and put the y labels on the left margin. The length of these labels, however, can differ dramatically and depends on the model terms the user comes up with. For this reason, I would like to measure the width of the longest label and set the left margin width accordingly. I found the strwidth function, but I don't understand how to convert its output unit to the unit of the mar argument. An example:
label <- paste(letters, collapse = " ") # create a long label
par(mar = c(5, 17, 4, 2) + 0.1) # 17 is the left margin width
plot(1:2, axes = FALSE, type = "n") # stupid plot example
# if we now draw the axis label, 17 seems to be a good value:
axis(side = 2, at = 1, labels = label, las = 2, tck = 0, lty = 0)
# however, strwidth returns 0.59, which is much less...
lab.width <- strwidth(label) # so how can I convert the units?
You can use mai instead of mar to specify a distance in inches
(instead of "lines").
par(mai = c(1, strwidth(label, units="inches")+.25, .8, .2))
plot(1:2, axes=FALSE)
axis(side = 2, at = 1, labels = label, las = 2, tck = 0, lty = 0)
You can compute the conversion factor between lines and inches
by dividing mar by mai.
inches_to_lines <- ( par("mar") / par("mai") )[1] # 5
lab.width <- strwidth(label, units="inches") * inches_to_lines
par(mar = c(5, 1 + lab.width, 4, 2) + 0.1)
plot(1:2, axes=FALSE)
axis(side = 2, at = 1, labels = label, las = 2, tck = 0, lty = 0)
I'd like to do a vertical histogram. Ideally I should be able to put multiple on a single plot per day.
If this could be combined with quantmod experimental chart_Series or some other library capable of drawing bars for a time series that would be great. Please see the attached screenshot. Ideally I could plot something like this.
Is there anything built in or existing libraries that can help with this?
I wrote something a year or so ago to do vertical histograms in base graphics. Here it is, with a usage example.
VerticalHist <- function(x, xscale = NULL, xwidth, hist,
fillCol = "gray80", lineCol = "gray40") {
## x (required) is the x position to draw the histogram
## xscale (optional) is the "height" of the tallest bar (horizontally),
## it has sensible default behavior
## xwidth (required) is the horizontal spacing between histograms
## hist (required) is an object of type "histogram"
## (or a list / df with $breaks and $density)
## fillCol and lineCol... exactly what you think.
binWidth <- hist$breaks[2] - hist$breaks[1]
if (is.null(xscale)) xscale <- xwidth * 0.90 / max(hist$density)
n <- length(hist$density)
x.l <- rep(x, n)
x.r <- x.l + hist$density * xscale
y.b <- hist$breaks[1:n]
y.t <- hist$breaks[2:(n + 1)]
rect(xleft = x.l, ybottom = y.b, xright = x.r, ytop = y.t,
col = fillCol, border = lineCol)
}
## Usage example
require(plyr) ## Just needed for the round_any() in this example
n <- 1000
numberOfHists <- 4
data <- data.frame(ReleaseDOY = rnorm(n, 110, 20),
bin = as.factor(rep(c(1, 2, 3, 4), n / 4)))
binWidth <- 1
binStarts <- c(1, 2, 3, 4)
binMids <- binStarts + binWidth / 2
axisCol <- "gray80"
## Data handling
DOYrange <- range(data$ReleaseDOY)
DOYrange <- c(round_any(DOYrange[1], 15, floor),
round_any(DOYrange[2], 15, ceiling))
## Get the histogram obects
histList <- with(data, tapply(ReleaseDOY, bin, hist, plot = FALSE,
breaks = seq(DOYrange[1], DOYrange[2], by = 5)))
DOYmean <- with(data, tapply(ReleaseDOY, bin, mean))
## Plotting
par(mar = c(5, 5, 1, 1) + .1)
plot(c(0, 5), DOYrange, type = "n",
ann = FALSE, axes = FALSE, xaxs = "i", yaxs = "i")
axis(1, cex.axis = 1.2, col = axisCol)
mtext(side = 1, outer = F, line = 3, "Length at tagging (mm)",
cex = 1.2)
axis(2, cex.axis = 1.2, las = 1, line = -.7, col = "white",
at = c(75, 107, 138, 169),
labels = c("March", "April", "May", "June"), tck = 0)
mtext(side = 2, outer = F, line = 3.5, "Date tagged", cex = 1.2)
box(bty = "L", col = axisCol)
## Gridlines
abline(h = c(60, 92, 123, 154, 184), col = "gray80")
biggestDensity <- max(unlist(lapply(histList, function(h){max(h[[4]])})))
xscale <- binWidth * .9 / biggestDensity
## Plot the histograms
for (lengthBin in 1:numberOfHists) {
VerticalHist(binStarts[lengthBin], xscale = xscale,
xwidth = binWidth, histList[[lengthBin]])
}
Violin plots might be close enough to what you want. They are density plots that have been mirrored through one axis, like a hybrid of a boxplot and a density plot. (Much easier to understanding by example than description. :-) )
Here is a simple (somewhat ugly) example of the ggplot2 implementation of them:
library(ggplot2)
library(lubridate)
data(economics) #sample dataset
# calculate year to group by using lubridate's year function
economics$year<-year(economics$date)
# get a subset
subset<-economics[economics$year>2003&economics$year<2007,]
ggplot(subset,aes(x=date,y=unemploy))+
geom_line()+geom_violin(aes(group=year),alpha=0.5)
A prettier example would be:
ggplot(subset,aes(x=date,y=unemploy))+
geom_violin(aes(group=year,colour=year,fill=year),alpha=0.5,
kernel="rectangular")+ # passes to stat_density, makes violin rectangular
geom_line(size=1.5)+ # make the line (wider than normal)
xlab("Year")+ # label one axis
ylab("Unemployment")+ # label the other
theme_bw()+ # make white background on plot
theme(legend.position = "none") # suppress legend
To include ranges instead of or in addition to the line, you would use geom_linerange or geom_pointrange.
If you use grid graphics then you can create rotated viewports whereever you want them and plot to the rotated viewport. You just need a function that will plot using grid graphics into a specified viewport, I would suggest ggplot2 or possibly lattice for this.
In base graphics you could write your own function to plot the rotated histogram (modify the plot.histogram function or just write your own from scratch using rect or other tools). Then you can use the subplot function from the TeachingDemos package to place the plot wherever you want on a larger plot.
Alright I decided to waste the evening making a hangman game in R. Got everything looking pretty good except an unknown number of letters that get plotted as seen here:
a
b
f
d
g
Here's an example/attempt using text and mtext:
FUN <- function(n) {
plot.new()
mtext("wrong", side = 3, cex=1.5, adj = 0, padj = 1, col = "red")
wrong <- letters[1:n]
text(0, .8, paste(wrong, collapse = "\n"), offset=.3, cex=1.5)
}
FUN(5)
FUN(10)
FUN2 <- function(n) {
plot.new()
mtext("wrong", side = 3, cex=1.5, adj = 0, padj = 1, col = "red")
wrong <- letters[1:n]
mtext(paste(wrong, collapse = "\n"), side = 3, cex=1.5,
adj = 0, padj = 2.5)
}
FUN2(5)
FUN2(10)
How can I make it so the a in both FUN(5) and FUN(10) plots in the same location?
First - way to go! R Games! You should totally make a package out of it so I can play ;)
For text you can use the adj argument, if you set it to 1. Then the (x,y) coordinates provided to text refer to the top-left corner of the rectangle that contains the text.
adj: one or two values in [0, 1] which specify the x (and
optionally y) adjustment of the labels. On most devices
values outside that interval will also work.
FUN <- function(n) {
plot.new()
mtext("better?", side = 3, cex=1.5, adj = 0, padj = 1, col = "red")
wrong <- letters[1:n]
text(0, .8, paste(wrong, collapse = "\n"), offset=.3, cex=1.5, adj=c(0,1))
}
Note adj=c(0,1), 0 being x alignment and 1 being y alignment (the documentation doesn't really make this clear but since it's a value in [0,1] I assume it to be an adjustment of position as a fraction of the label length in that dimension).
Similarly for mtext you need to use padj=1, being top alignment according to the documentation (since your text direction is left to right). The adj argument is the left-right alignment.
FUN2 <- function(n) {
plot.new()
mtext("better?", side = 3, cex=1.5, adj = 0, padj = 1, col = "red")
wrong <- letters[1:n]
mtext(paste(wrong, collapse = "\n"), side = 3, cex=1.5,
adj = 0, padj = 1) # adj=1 means text on right side instead of left.
}
(Are you going to do humorous stick man figures?? This sounds so fun :D)