I have found that the beanplot is the best way to represent my data. I want to look at multiple beanplots together to visualize my data. Each of my plots contains 3 variables, so each one looks something like what would be generated by this code:
library(beanplot)
a <- rnorm(100)
b <- rnorm(100)
c <- rnorm(100)
beanplot(a, b, c, ylim = c(-4, 4), main = "Beanplot",
         col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
(Would have just included an image but my reputation score is not high enough, sorry)
I have 421 of these that I want to put into one long PDF (EDIT: One plot per page is fine, this was just poor wording on my part). The approach I have taken was to first generate the beanplots in a for loop and store them in a list at each iteration. Then I will use the multiplot function (from the R Cookbook page on multiplot) to display all of my plots on one long column so I can begin my analysis.
The problem is that the beanplot function does not appear to be set up to assign plot objects as a variable. Example:
library(beanplot)
a <- rnorm(100)
b <- rnorm(100)
plot1 <- beanplot(a, b, ylim = c(-5,5), main = "Beanplot",
col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
plot1
If you then type plot1 into the R console, you will get back two of the plot parameters but not the plot itself. This means that when I store the plots in the list, I am unable to graph them with multiplot. It will simply return the plot parameters and a blank plot.
qplot, by contrast, returns a plot object, so the plot is drawn again when you print the stored variable. Example:
library(ggplot2)
a <- rnorm(100)
b <- rnorm(100)
plot2 <- qplot(a,b)
plot2
There is no equivalent to the beanplot that I know of in ggplot. Is there some sort of workaround I can use for this issue?
Thank you.
You can simply open a PDF device with pdf() and keep the default parameter onefile=TRUE. Then call beanplot() once per plot, one after the other, and close the device with dev.off(). All the plots will be in one PDF document, each one on a separate page.
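A minimal sketch of that approach, assuming simulated data and only 5 plots instead of 421. The output file name and the fallback to boxplot() when the beanplot package is not installed are illustrative choices, not part of the original answer:

```r
set.seed(42)
n_plots <- 5  # stand-in for the 421 plots in the question

# Use beanplot() when available; otherwise fall back to boxplot() (assumption,
# only so that the sketch runs without the beanplot package)
plot_fun <- if (requireNamespace("beanplot", quietly = TRUE)) {
  beanplot::beanplot
} else {
  graphics::boxplot
}

out_file <- file.path(tempdir(), "all_beanplots.pdf")  # hypothetical file name
pdf(out_file, onefile = TRUE)  # onefile = TRUE is the default
for (i in seq_len(n_plots)) {
  a <- rnorm(100)
  b <- rnorm(100)
  d <- rnorm(100)
  # each high-level plot call starts a new page in the PDF
  plot_fun(a, b, d, ylim = c(-4, 4), main = paste("Beanplot", i),
           col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
}
dev.off()
```

Because every call draws straight to the open device, nothing needs to be stored in a list first.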
In a scatterplot, I would like to use the identify function to label the top-right point.
I did this:
identify(x, y, labels=name, plot=TRUE)
*I have a named vector.
Then, while it is running, I click on the point I want. After I stop it, it shows me the label of that point.
Do I have to click the point that I want to label each time? Can I save it?
# Here is an example
x = 1:10
y = x^2
name = letters[1:10]
plot(x, y)
identify(x, y, labels = name, plot=TRUE)
# Now you have to click on the points and select finish at the end
# The output will be the labels you have corresponding to the dots.
Regarding saving it:
I couldn't do it using
pdf()
# plotting code
dev.off()
However, in RStudio it was possible to copy and paste the plot. If you need only one plot, I guess this would work.
You can use the return value of the identify function to reproduce the labelling:
labels <- rep(letters, length.out=nrow(cars))
plot(cars)
p <- identify(cars$speed, cars$dist, labels, plot=TRUE)
# p holds the indices of the clicked points, so the labelling can be reproduced
plot(cars)
text(cars$speed[p], cars$dist[p], labels[p], pos=3)
To save the plot after using identify, you can use dev.copy:
labels <- rep(letters, length.out=nrow(cars))
plot(cars)
identify(cars$speed, cars$dist, labels, plot=TRUE)
# select your points here, then copy the screen device to a PNG file
dev.copy(png, 'myplot.png', width=600, height=600)
dev.off()
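The two ideas can also be combined to write straight to a PDF: keep the indices that identify() returned and redraw the labels with text() inside a pdf() device. A rough sketch, where the indices in p are hard-coded stand-ins for a real identify() result and the file name is made up:

```r
labels <- rep(letters, length.out = nrow(cars))

# Hypothetical indices; in an interactive session these would come from
# p <- identify(cars$speed, cars$dist, labels)
p <- c(1, 25, 50)

out_file <- file.path(tempdir(), "identified_points.pdf")  # assumed name
pdf(out_file, width = 6, height = 6)
plot(cars)
# redraw the labels non-interactively at the stored indices
text(cars$speed[p], cars$dist[p], labels[p], pos = 3)
dev.off()
```

Saving p (for example with saveRDS) would let the same labelled plot be regenerated later without clicking again.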
I need to draw my boxplots in R in black and white only. I would like to fill the boxes with patterns such as lines and dots instead of color.
I imagine ggplot2 could do that but I can't find any way to do it.
Thank you in advance for your help!
I thought this was a great question and pondered if it was possible to do this in base R and to obtain the checkered look. So I put together some code that relies on boxplot.stats and polygon (which can draw angled lines). Here's the solution, which is really not ready for primetime, but is a solution that could be tinkered with to make more general.
boxpattern <-
function(y, xcenter, boxwidth, angle=NULL, angle.density=10, ...) {
# draw an individual box
bstats <- boxplot.stats(y)
bxmin <- bstats$stats[1]
bxq2 <- bstats$stats[2]
bxmedian <- bstats$stats[3]
bxq4 <- bstats$stats[4]
bxmax <- bstats$stats[5]
bleft <- xcenter-(boxwidth/2)
bright <- xcenter+(boxwidth/2)
# boxplot
polygon(c(bleft,bright,bright,bleft,bleft),
c(bxq2,bxq2,bxq4,bxq4,bxq2), angle=angle[1], density=angle.density)
polygon(c(bleft,bright,bright,bleft,bleft),
c(bxq2,bxq2,bxq4,bxq4,bxq2), angle=angle[2], density=angle.density)
# lines
segments(bleft,bxmedian,bright,bxmedian,lwd=3) # median
segments(bleft,bxmin,bright,bxmin,lwd=1) # min
segments(xcenter,bxmin,xcenter,bxq2,lwd=1)
segments(bleft,bxmax,bright,bxmax,lwd=1) # max
segments(xcenter,bxq4,xcenter,bxmax,lwd=1)
# outliers
if(length(bstats$out)>0){
for(i in 1:length(bstats$out))
points(xcenter,bstats$out[i])
}
}
drawboxplots <- function(y, x, boxwidth=1, angle=NULL, ...){
# figure out all the boxes and start the plot
groups <- split(y,as.factor(x))
len <- length(groups)
bxylim <- c((min(y)-0.04*abs(min(y))),(max(y)+0.04*abs(max(y))))
xcenters <- seq(1,max(2,(len*(1.4))),length.out=len)
if(is.null(angle)){
angle <- seq(-90,75,length.out=len)
angle <- lapply(angle,function(x) c(x,x))
}
else if(!length(angle)==len)
stop("angle must be a vector or list of two-element vectors")
else if(!is.list(angle))
angle <- lapply(angle,function(x) c(x,x))
# draw plot area
plot(0, xlim=c(.97*(min(xcenters)-1), 1.04*(max(xcenters)+1)),
ylim=bxylim,
xlab="", xaxt="n",
ylab=names(y),
col="white", las=1)
axis(1, at=xcenters, labels=names(groups))
# draw boxplots
plots <- mapply(boxpattern, y=groups, xcenter=xcenters,
boxwidth=boxwidth, angle=angle, ...)
}
Some examples in action:
mydat <- data.frame(y=c(rnorm(200,1,4),rnorm(200,2,2)),
x=sort(rep(1:2,200)))
drawboxplots(mydat$y, mydat$x)
mydat <- data.frame(y=c(rnorm(200,1,4),rnorm(200,2,2),
rnorm(200,3,3),rnorm(400,-2,8)),
x=sort(rep(1:5,200)))
drawboxplots(mydat$y, mydat$x)
drawboxplots(mydat$y, mydat$x, boxwidth=.5, angle.density=30)
drawboxplots(mydat$y, mydat$x, # specify list of two-element angle parameters
angle=list(c(0,0),c(90,90),c(45,45),c(45,-45),c(0,90)))
EDIT: I wanted to add that one could also obtain dots as a fill by drawing a pattern of dots, then covering them with a "donut"-shaped polygon, like so:
x <- rep(1:10,10)
y <- sort(x)
plot(y~x, xlim=c(0,11), ylim=c(0,11), pch=20)
outerbox.x <- c(2.5,0.5,10.5,10.5,0.5,0.5,2.5,7.5,7.5,2.5)
outerbox.y <- c(2.5,0.5,0.5,10.5,10.5,0.5,2.5,2.5,7.5,7.5)
polygon(outerbox.x,outerbox.y, col="white", border="white") # donut
polygon(c(2.5,2.5,7.5,7.5,2.5),c(2.5,2.5,2.5,7.5,7.5)) # inner box
Mixing that with angled lines in a single plotting function would be more challenging, but this starts to get you there.
I think it is hard to do this with ggplot2 since it does not support shaded (hatched) polygons (a grid limitation). But you can use the shading-line feature of base graphics, controlled by the density and angle arguments of some plotting functions (polygon, barplot, ...).
The problem is that boxplot doesn't use this feature. So I hacked it, or rather I hacked bxp, which boxplot uses internally. The hack consists of adding two arguments (angle and density) to the bxp function and passing them on in the calls to xypolygon (this occurs in two lines).
my.bxp <- function (all.bxp.argument,angle,density, ...) {
.....#### bxp code
xypolygon(xx, yy, lty = boxlty[i], lwd = boxlwd[i],
border = boxcol[i], angle = angle[i], density = density[i])
.......## bxp code after
xypolygon(xx, yy, lty = "blank", col = boxfill[i], angle = angle[i], density = density[i])
......
}
Here is an example. Note that it is entirely the responsibility of the user to ensure that the legend corresponds to the plot, so I added some code to arrange the legend next to the boxplot.
require(stats)
set.seed(753)
(bx.p <- boxplot(split(rt(100, 4), gl(5, 20))))
layout(matrix(c(1,2),nrow=1),
widths=c(4,1))
angles=c(60,30,40,50,60)
densities=c(50,30,40,50,30)
par(mar=c(5,4,4,0)) #Get rid of the margin on the right side
my.bxp(bx.p,angle=angles,density=densities)
par(mar=c(5,0,4,2)) #No margin on the left side
plot(c(0,1),type="n", axes=F, xlab="", ylab="")
legend("top", paste("region", 1:5),
angle=angles,density=densities)
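A lighter-weight variant of the same hatching idea, without rewriting bxp, is to draw an ordinary boxplot and overlay hatched rectangles on the boxes with rect(), which accepts density and angle. This is only a sketch on made-up data; the half-width of 0.3 is an assumption chosen to match boxwex = 0.6 and may need adjusting for other settings:

```r
set.seed(753)
# made-up example data: three groups
groups <- list(A = rnorm(50), B = rnorm(50, mean = 1), C = rnorm(50, mean = 2))

# draw plain white boxes and keep the computed statistics
bp <- boxplot(groups, col = "white", boxwex = 0.6)

angles <- c(45, 135, 0)      # one hatching angle per box
densities <- c(15, 15, 20)   # hatching lines per inch
half_width <- 0.3            # assumed to match boxwex = 0.6

for (i in seq_along(groups)) {
  # rows 2 and 4 of bp$stats are the lower and upper hinges of box i
  rect(i - half_width, bp$stats[2, i], i + half_width, bp$stats[4, i],
       density = densities[i], angle = angles[i])
}

legend("topleft", legend = names(groups), angle = angles,
       density = densities, bty = "n")
```

The hatching is drawn over the median line, so very dense patterns may obscure it; a segments() call after the loop could redraw the medians if needed.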
A minor question about plotting stacked barplot in R.
The stacked bars represent the series bottom-to-top.
But the legend always shows the series top-to-bottom. I think that is also true with ggplot2::geom_bar
Is there any nicer idiom than using rev(...) twice inside either legend() or barplot() as in:
exports <- data.frame(100*rbind('Americas'=runif(6),'Asia'=runif(6),'Other'=runif(6)))
colnames(exports) <- 2004:2009
series_we_want <- c(1,2,3)
barplot( as.matrix(exports[series_we_want,]), col=mycolors, ...)
legend(x="topleft", legend=rev(rownames(exports)[series_we_want]), col=rev(mycolors), ...)
(If you omit one of the rev()'s the output is obviously meaningless. This seems like a case for an enhancement: a single flag such as yflip=TRUE or yreverse=TRUE.)
This is what I got using your code:
exports <- data.frame(100*rbind('Americas'=runif(6),'Asia'=runif(6),'Other'=runif(6)))
colnames(exports) <- 2004:2009
series_we_want <- c(1,2,3)
barplot( as.matrix(exports[series_we_want,]))
legend(x="topleft", legend=rev(rownames(exports)[series_we_want]))
Try this:
exports <- data.frame(100*rbind('Americas'=runif(6),'Asia'=runif(6),'Other'=runif(6)))
colnames(exports) <- 2004:2009
series_we_want <- c(1,2,3)
test_data<-as.matrix(exports[series_we_want,])
barplot( test_data,
legend.text=as.character(rev(rownames(exports)[series_we_want])),
args.legend = list(x="topleft"))
This seems to produce the legend in the opposite order of what you have.
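As far as I can tell, barplot() itself reverses both the legend labels and the fill colours for vertical stacked bars when it draws the legend through legend.text, so no rev() should be needed at all. A small sketch under that assumption, with made-up greyscale colours standing in for mycolors:

```r
set.seed(1)
exports <- 100 * rbind(Americas = runif(6), Asia = runif(6), Other = runif(6))
colnames(exports) <- 2004:2009

mycolors <- c("grey20", "grey55", "grey85")  # hypothetical colours

# legend.text = TRUE takes the labels from the row names of the matrix;
# barplot() pairs each label with its own fill colour
mids <- barplot(exports, col = mycolors,
                legend.text = TRUE,
                args.legend = list(x = "topleft"))
```

If the order still looks wrong for a particular layout (for example with beside = TRUE), it is worth comparing against an explicit legend() call before relying on this.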
I am trying to figure out how to use grconvertX/grconvertY with ggplot2. My ultimate goal is to add annotations to a ggplot2 figure (and possibly lattice) with grid.text and grid.lines by going from user coordinates to device coordinates. I know it can be done with grobs, but I am wondering if there is an easier way.
The following code allows me to pass values from user coordinates to ndc coordinates and use those values to annotate the plot with grid.text.
graphics.off() # close graphics windows
library(grid)
library(gridBase)
test= data.frame(
x = c(1,2,3),
y = c(12,10,3),
n = c(75,76,73)
)
par(mar = c(13,5,2,3))
plot(test$y ~ test$x,type="b", ann=F)
for (i in 1:nrow(test))
{
X=grconvertX(i , from="user", to="ndc")
grid.text(x=X, y =0.2, label=paste("GRID.text at\nuser.x=", i, "\n", "ndc.x=", (signif( X, 5)) ) )
grid.lines(x=c(X, X), y = c(0.28, 0.33) )
}
#add some code to save as PDF ...
The code is based on the solution from one of my previous posts: Mixing X and Y coordinate systems. You can see how the x coordinates from the original plot were converted to ndc. The advantage of this approach is that I can use device coordinates for Y.
I assumed I could easily do the same in ggplot2 (and possibly in lattice).
library(ggplot2)
graphics.off() # close graphics windows
qplot(x=x, y=y, data=test)+geom_line()+ theme(plot.margin = unit(c(1,3,8,1), "lines"))
for (i in 1:nrow(test))
{
X=grconvertX(i , from="user", to="ndc")
grid.text(x=X, y =0.2, label=paste("GRID.text at\nuser.x=", i, "\n", "ndc.x=", (signif( X, 5)) ) )
grid.lines(x=c(X, X), y = c(0.28, 0.33) )
}
#add some code to save as PDF...
However, it does not work correctly. The coordinates seem to be a bit off. The vertical lines and text don't correspond to the tick labels on the plot. Can anybody tell me how to fix it? Thanks a lot in advance.
The grconvertX and grconvertY functions work with base graphics, while ggplot2 uses grid graphics. In general the two graphics systems don't play nicely together (though you have demonstrated using gridBase to help). Your first example works because you started with a base graphic, so the user coordinate system exists and grconvertX converts from it. In the second case the user coordinate system was never set in base graphics, so it looks like it might use the default coordinates of 0,1, which are similar but not identical to the top viewport coordinates; you get something similar but not exactly correct (I am actually surprised that you did not get an error or warning).
Generally for grid graphics the equivalent for converting between coordinates is to just create a new viewport with the coordinate system of interest (or push/pop to an existing viewport with the correct coordinate system), then add your annotations in that viewport.
Here is an example that creates your plot, then moves down to the viewport containing the main plot, creates a new viewport with the same dimensions but with clipping turned off, the x scale is based on the data and the y scale is 0,1, then adds some text accordingly:
library(ggplot2)
library(grid)
test= data.frame( x = c(1,2,3), y = c(12,10,3), n = c(75,76,73) )
qplot(x=x, y=y, data=test)+geom_line()+ theme(plot.margin = unit(c(1,3,8,1), "lines"))
current.vpTree() # lists the viewports; the panel viewport name depends on the ggplot2 version
downViewport('panel-3-4')
pushViewport(dataViewport( test$x, clip='off',yscale=c(0,1)))
for (i in 1:nrow(test)) {
grid.text(x=i, y = -0.2, default.units='native',
label=paste("GRID.text at\nuser.x=", i, "\n" ) )
grid.lines(x=c(i, i), y = c(-0.1, 0), default.units='native' )
}
One of the tricky things here is that ggplot2 does not set the viewport scales to match the data being plotted, but does the conversions itself. In this case setting the scale based on the x data worked, but if ggplot2 does something fancier, this might not work. What we would need is some way to get the back-transformed coordinates from ggplot2 to use in the call to grid.text.
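The viewport idea can be seen in isolation with plain grid, without ggplot2. The sketch below is illustrative only: the viewport position, size and the xscale of 0 to 4 are made-up values. It pushes a viewport with a native x scale and clipping off, annotates below it in native units, and uses convertX to show where those native x values fall in the viewport's npc coordinates:

```r
library(grid)

grid.newpage()
# viewport with its own x scale; clip = "off" allows drawing below it
pushViewport(viewport(x = 0.5, y = 0.6, width = 0.7, height = 0.5,
                      xscale = c(0, 4), yscale = c(0, 1), clip = "off"))
grid.rect()  # outline of the "panel"

x_vals <- c(1, 2, 3)
for (xv in x_vals) {
  # tick mark and label placed in native units, below the panel (y < 0)
  grid.lines(x = unit(c(xv, xv), "native"), y = unit(c(-0.1, 0), "native"))
  grid.text(paste("x =", xv), x = unit(xv, "native"), y = unit(-0.2, "native"))
}

# grid's counterpart to grconvertX: convert native x to npc within this viewport
npc_x <- convertX(unit(x_vals, "native"), "npc", valueOnly = TRUE)
popViewport()
```

Because the annotations are positioned in the viewport's own native units, no manual conversion to device coordinates is needed for the drawing itself; convertX is only used here to inspect the mapping.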