What does autoplot.microbenchmark actually plot? - r

According to the docs, microbenchmark:::autoplot "Uses ggplot2 to produce a more legible graph of microbenchmark timings."
Cool! Let's try the example code:
library("microbenchmark")
library("ggplot2")
tm <- microbenchmark(rchisq(100, 0),
rchisq(100, 1),
rchisq(100, 2),
rchisq(100, 3),
rchisq(100, 5), times=1000L)
autoplot(tm)
I don't see anything about the...squishy undulations in the documentation, but my best guess from this answer by the function creator is that this is like a smoothed series of boxplots of the time taken to run, with the upper and lower quartiles connected over the body of the shape. Maybe? These plots look too interesting not to find out what is going on here.
What is this a plot of?

The short answer is a violin plot:
It is a box plot with a rotated kernel density plot on each side.
The longer more interesting(?) answer. When you call the autoplot function, you are actually calling
## class(tm) is microbenchmark
autoplot.microbenchmark
We can then inspect the actual function call via
R> getS3method("autoplot", "microbenchmark")
function (object, ..., log = TRUE, y_max = 1.05 * max(object$time))
{
y_min <- 0
object$ntime <- convert_to_unit(object$time, "t")
plt <- ggplot(object, ggplot2::aes_string(x = "expr", y = "ntime"))
## Another ~6 lines or so after this
The key line is + stat_ydensity(). Looking at ?stat_ydensity, you come to the help page on violin plots.
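To make the violin shape concrete, here is a minimal base-R sketch of what a violin encodes: a kernel density estimate of the values, mirrored around a centre line. The simulated `timings` vector and all plotting choices are assumptions for illustration; this is not the code that autoplot.microbenchmark runs.

```r
# Simulated, right-skewed "timings" (hypothetical data, not microbenchmark output)
set.seed(1)
timings <- rlnorm(1000, meanlog = 3, sdlog = 0.4)

# Kernel density estimate of the timings
d <- density(timings)

# Scale the density so the widest point of the violin has half-width 0.4
half_width <- 0.4 * d$y / max(d$y)

# Draw the density mirrored around x = 1: up the left side, back down the right
plot(NULL, xlim = c(0, 2), ylim = range(d$x), xaxt = "n",
     xlab = "", ylab = "time")
polygon(x = c(1 - half_width, rev(1 + half_width)),
        y = c(d$x, rev(d$x)), col = "grey80")

# Overlay the quartiles that a boxplot would show
q <- quantile(timings, c(0.25, 0.5, 0.75))
segments(1 - 0.1, q, 1 + 0.1, q)
```

The wide parts of the shape are where most timings fall; the long thin tail corresponds to the occasional slow run.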

Related

plot function in R producing legend without legend() being called

I'm trying to produce a cumulative incidence plot for a competing hazards survival analysis using plot() in R. For some reason, the plot that is produced has a legend that I have not called. The legend is intersecting with the lines on my graph and I can't figure out how to get rid of it. Please help!
My code is as follows:
CompRisk2 <- cuminc(ftime=ADI$time_DeathTxCensor, fstatus=ADI$status, group=ADI$natADI_quart)
cols <- c("darkorange","coral1","firebrick1","firebrick4","lightskyblue","darkturquoise","dodgerblue","dodgerblue4")
par(bg="white")
plot(CompRisk2,
col=cols,
xlab="Years",
ylab="Probability of Mortality or Transplant",
xlim=c(0,10),
ylim=c(0,0.6))
Which produces the following plot:
I tried adding the following code to move the legend out of the frame, but I got an error:
legend(0,5, legend=c(11,21,31,41,12,22,32,42),
col=c("darkorange","coral1","firebrick1","firebrick4","lightskyblue","darkturquoise","dodgerblue","dodgerblue4"),
lty=1:2, cex=0.8, text.font=4, box.lty=0)
Error: Error in title(...) : invalid graphics parameter
Any help would be much appreciated!
You are using the cuminc function from the cmprsk package. This produces an object of class cuminc, which has an S3 plot method. ?plot.cuminc shows you the documentation and typing plot.cuminc shows you the code.
There is some slightly obscure code that suggests a workaround:
u <- list(...)
if (length(u) > 0) {
i <- pmatch(names(u), names(formals(legend)), 0)
do.call("legend", c(list(x = wh[1], y = wh[2], legend = curvlab,
col = color, lty = lty, lwd = lwd, bty = "n", bg = -999999),
u[i > 0]))
}
This says that any additional arguments passed in ... whose names match the names of arguments to legend will be passed to legend(). legend() has a plot argument:
plot: logical. If ‘FALSE’, nothing is plotted but the sizes are returned.
So it looks like adding plot=FALSE to your plot() command will work.
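The filtering step in that snippet can be tried on its own. A minimal sketch, where the list `u` stands in for whatever is passed through `...` (the particular arguments are made up for illustration):

```r
# Stand-in for list(...) inside plot.cuminc; these arguments are hypothetical
u <- list(plot = FALSE, xlab = "Years", cex = 0.8)

# Match the argument names against the formal arguments of legend();
# pmatch() returns 0 (the nomatch value) for names that do not match
i <- pmatch(names(u), names(formals(legend)), 0)

# Only the matching arguments would be forwarded to legend()
forwarded <- u[i > 0]
names(forwarded)  # "plot" "cex"
```

Here xlab is dropped because legend() has no such argument, while plot and cex pass through.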
In principle you could try looking at the other arguments to legend() and see if any of them will adjust the legend position/size as you want. Unfortunately the x argument to legend (which would determine the horizontal position) is masked by the first argument to plot.cuminc.
I don't think that the ellipsis arguments are intended for the legend call inside plot.cuminc. The code offered in Ben's answer suggests that there might be a wh argument that determines the location of the legend. It is not named within the parameters as "x" in the code he offered, but is rather given as a positionally-defined argument. If you look at the plot.cuminc function you do in fact find that wh is documented.
I cannot test this because you have not offered us access to the ADI-object but my suggestion would be to try:
opar <- par(xpd=TRUE) # xpd lets graphics be placed 'outside'
plot(CompRisk2,
col=cols, wh=c(-.5, 7),
xlab="Years",
ylab="Probability of Mortality or Transplant",
xlim=c(0,10),
ylim=c(0,0.6))
par(opar) # restores original graphics parameters
It's always a bit risky to put out a code chunk without testing, but I'm happy to report that I did find a suitable test case and it seems to work reasonably as predicted. I used the code below on the object from a prior SO question about using the gg-packages with cmprsk:
library(cmprsk)
# some simulated data to get started
comp.risk.data <- data.frame("tfs.days" = rweibull(n = 100, shape = 1, scale = 1)*100,
"status.tfs" = c(sample(c(0,1,1,1,1,2), size=50, replace=T)),
"Typing" = sample(c("A","B","C","D"), size=50, replace=T))
# fitting a competing risks model
CR <- cuminc(ftime = comp.risk.data$tfs.days,
fstatus = comp.risk.data$status.tfs,
cencode = 0,
group = comp.risk.data$Typing)
opar <- par(xpd=TRUE) # xpd lets graphics be placed 'outside'
plot(CR,
wh=c(-15, 1.1), # obviously different than the OP's coordinates
xlab="Years",
ylab="Probability of Mortality or Transplant",
xlim=c(0,400),
ylim=c(0,1))
par(opar) # restores graphics parameters
I get the legend to move up and leftward from its original position.

replicating an rgl viewpoint in lattice

It would be convenient to interactively select a decent viewpoint using rgl and then adopt the same orientation in a lattice 3d-plot. For example, given the following plot using a non-informative viewpoint.
library(lattice)
wireframe(volcano, screen = list(x=0, y=0, z=0))
The same can be opened in rgl by
library(rgl)
persp3d(volcano)
view3d(0, 0)
Interactively it is easy to rotate the plot to an informative view.
The matrix giving the current rgl viewpoint can be extracted by
p <- par3d()
p$userMatrix
How can this matrix be converted into corresponding x,y,z screen parameters to replicate the view in lattice?
UPDATE 1
I tried out 42's conversion below. The code shows the rgl plot and the corresponding lattice plot per row. If I implemented it correctly (see code below), there appears to still be an issue.
# convert rgl viewpoint into lattice
# screen orientation
rgl_to_lattice_viewpoint <- function()
{
p <- par3d()
rotm <- p$userMatrix
B = 360*atan(rotm[1,2]/rotm[2,2])/(2*pi)
P = 360*asin(-rotm[3,2])/(2*pi)
H = 360*atan(rotm[3,1]/rotm[3,3])/(2*pi)
list(x=-B, y=-P, z=-H)
}
# read and plot PNG image
plot_png <- function(f)
{
img <- readPNG(f)
rimg <- as.raster(img) # raster multilayer object
plot(NULL, xlim=c(0,1), ylim=c(0,1), xlab = "", ylab = "",
asp=1, frame=F, xaxt="n", yaxt="n")
rasterImage(rimg, 0, 0, 1, 1)
}
# create rgl snapshot with random rotation and
# corresponding lattice wireframe plot
lattice_plus_rgl_plot <- function()
{
# rgl plot random rotation
persp3d(volcano, col = "green3")
theta <- sample(-180:180, 1)
phi <- sample(-90:90, 1)
view3d(theta, phi, fov=40)
v <- rgl_to_lattice_viewpoint()
f <- tempfile(fileext = ".png")
rgl.snapshot(f)
rgl.close()
# lattice plot
f2 <- tempfile(fileext = ".png")
png(f2)
print(wireframe(volcano, screen = v))
dev.off()
# plot both
plot_png(f)
plot_png(f2)
}
# CREATE SOME PLOTS
library(rgl)
library(lattice)
library(png)
par(mfrow=c(3,2), mar=c(0,0,0,0))
replicate(3, lattice_plus_rgl_plot())
I used the answer to this question for conversion from a rotation matrix to angles: Conversion euler to matrix and matrix to euler. I admit to concern that I see another somewhat different answer here: How to calculate the angle from Roational matrix. (My linear algebra is not good enough to determine which of these is correct.)
p <- par3d()
rotm <- p$userMatrix
B = 360*atan(rotm[1,2]/rotm[2,2])/(2*pi)
P = 360*asin(-rotm[3,2])/(2*pi)
H = 360*atan(rotm[3,1]/rotm[3,3])/(2*pi)
> print(list(B,P,H))
[[1]]
[1] 41.54071
[[2]]
[1] 40.28412
[[3]]
[1] 41.24902
At that point I had already rotated the RGL-object to roughly the "viewing point" that you had suggested. I discovered by experimentation that the negative values supplied to the wireframe call delivered apparently correct results. "Viewer rotation angles" are plausibly seen as the negative for "object rotation angles".
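One way to sanity-check formulas like these without rgl is a round trip on a matrix built from known angles. The sketch below assumes the composition Rz(b) %*% Rx(p) %*% Ry(h) with right-handed rotation matrices; that convention is an assumption for illustration and is not confirmed to be the one rgl or lattice uses. Under it, the three formulas return the negated angles, provided each angle lies strictly between -90 and 90 degrees.

```r
deg2rad <- function(d) d * pi / 180
rad2deg <- function(r) r * 180 / pi

# Elementary right-handed rotation matrices (angles in radians)
rot_x <- function(a) matrix(c(1, 0,       0,
                              0, cos(a), -sin(a),
                              0, sin(a),  cos(a)), 3, 3, byrow = TRUE)
rot_y <- function(a) matrix(c( cos(a), 0, sin(a),
                               0,      1, 0,
                              -sin(a), 0, cos(a)), 3, 3, byrow = TRUE)
rot_z <- function(a) matrix(c(cos(a), -sin(a), 0,
                              sin(a),  cos(a), 0,
                              0,       0,      1), 3, 3, byrow = TRUE)

# Known angles in degrees (arbitrary test values)
b <- 30; p <- 20; h <- 40
rotm <- rot_z(deg2rad(b)) %*% rot_x(deg2rad(p)) %*% rot_y(deg2rad(h))

# The same three formulas, written with rad2deg()
B <- rad2deg(atan(rotm[1, 2] / rotm[2, 2]))
P <- rad2deg(asin(-rotm[3, 2]))
H <- rad2deg(atan(rotm[3, 1] / rotm[3, 3]))

recovered <- c(B = B, P = P, H = H)
recovered  # approximately -30, -20, -40 under the assumed convention
```

Because atan() only returns values between -90 and 90 degrees, angles outside that range would not survive this round trip.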
png(); print(wireframe(volcano, screen = list(x=-B, y=-P, z=-H)) ); dev.off()
There is a rotate.wireframe function in the TeachingDemos package but it does not play well with concurrently running rgl plots. (No plot was apparent until I closed the rgl device.) It also seemed kind of buggy when running on a Mac (thick black line across the lattice plot). It uses the X11/XQuartz facilities to manage interaction via tk/tcl functions and I was unable to reproduce the plots from the angles being displayed. Looking at the code I'm not able to understand why that should be so. But your mileage may vary.
This version of your function uses conversions from the orientlib package, and makes the rotation matrix an argument:
rgl_to_lattice_viewpoint <- function(rotm = par3d("userMatrix"))
{
e <- -orientlib::eulerzyx(orientlib::rotmatrix(rotm[1:3, 1:3]))@x*180/pi
list(z = e[1], y = e[2], x = e[3])
}
Note that the z, y, x order is essential.
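The order matters because rotations about different axes do not commute. A small base-R check, using hand-written rotation matrices (the helper names `rz_deg` and `rx_deg` are made up here, not taken from rgl, lattice or orientlib):

```r
# Right-handed rotation about the z axis, angle in degrees
rz_deg <- function(deg) {
  a <- deg * pi / 180
  matrix(c(cos(a), -sin(a), 0,
           sin(a),  cos(a), 0,
           0,       0,      1), 3, 3, byrow = TRUE)
}

# Right-handed rotation about the x axis, angle in degrees
rx_deg <- function(deg) {
  a <- deg * pi / 180
  matrix(c(1, 0,       0,
           0, cos(a), -sin(a),
           0, sin(a),  cos(a)), 3, 3, byrow = TRUE)
}

zx <- rz_deg(90) %*% rx_deg(90)
xz <- rx_deg(90) %*% rz_deg(90)

# The two products differ, so swapping the order changes the final orientation
isTRUE(all.equal(zx, xz))  # FALSE
```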
Using it in place of your function, I get this output:
These get the rotation right. I don't know if it's also possible to get the perspective to match.
Edited to add: rgl version 0.95.1468, so far available only on R-forge, contains a version of this function and one for base graphics as well.

ggplot2: easy way to plot integral over independent variable?

I'm integrating a function f(t) = 2t (just an example) and would like to plot the integral as a function of time t using
awesome_thing <- function(t) {2*t}
integrate(awesome_thing, lower=0, upper=10)
However, I would like to plot the integral as a function of time in ggplot2, so for this example the plotted points would be (1,1), (2,4), (3,9), ..., (10,100).
Is there an easy way to do this in ggplot (e.g., something similar to how functions are plotted)? I understand I can "manually" evaluate and plot the data for each t, but I thought I'd see if anyone could recommend a simpler way.
Here is a ggplot solution using stat_function:
# create a function that is vectorized over the "upper" limit of your
# integral
int_f <- Vectorize(function(f = awesome_thing, lower=0,upper,...){
integrate(f,lower,upper,...)[['value']] },'upper')
ggplot(data.frame(x = c(0,10)),aes(x=x)) +
stat_function(fun = int_f, args = list(f = awesome_thing, lower=0))
Not ggplot2, but it shouldn't be difficult to adapt by creating a dataframe to pass to that paradigm:
plot(x=seq(0.1,10, by=0.1),
y= sapply(seq(0.1,10, by=0.1) ,
function(x) integrate(awesome_thing, lower=0, upper=x)$value ) ,
type="l")
The trick with the integrate function is that it returns a list, and you need to extract the 'value' element for each change in the upper limit.
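Following that suggestion, a minimal sketch of building the data frame first. It re-defines `awesome_thing` from the question so the block is self-contained; the column names `t` and `integral` are arbitrary choices. The ggplot2 call is guarded so the block also runs where ggplot2 is not installed.

```r
awesome_thing <- function(t) 2 * t

# Integrate from 0 to each upper limit and keep only the 'value' element
upper <- 1:10
integral <- sapply(upper, function(u) {
  integrate(awesome_thing, lower = 0, upper = u)$value
})

df <- data.frame(t = upper, integral = integral)

# The integral of 2t from 0 to t is t^2, which gives a quick check
all.equal(df$integral, df$t^2)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = t, y = integral)) +
    ggplot2::geom_point() +
    ggplot2::geom_line()
  print(p)
}
```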

contour plot of a custom function in R

I'm working with some custom functions and I need to draw contours for them based on multiple values for the parameters.
Here is an example function:
I need to draw such a contour plot:
Any idea?
Thanks.
First you construct a function, fourvar that takes those four parameters as arguments. In this case you could have done it with 3 variables one of which was lambda_2 over lambda_1. Alpha1 is fixed at 2 so alpha_1/alpha_2 will vary over 0-10.
fourvar <- function(a1,a2,l1,l2){
a1* integrate( function(x) {(1-x)^(a1-1)*(1-x^(l2/l1) )^a2} , 0 , 1)$value }
The trick is to realize that the integrate function returns a list and you only want the 'value' part of that list so it can be Vectorize()-ed.
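Why Vectorize() matters for the next step: outer() calls its function once with full-length vectors, whereas integrate() works with scalar limits. A minimal sketch with a simpler integrand than the one in the question (`area_under_power` is a made-up helper):

```r
# Integral of x^a from 0 to b; written for scalar a and b only
area_under_power <- function(a, b) {
  integrate(function(x) x^a, lower = 0, upper = b)$value
}

# Passing the scalar function straight to outer() fails, because outer()
# hands it whole vectors and expects a result of the same length
direct <- try(outer(1:3, 1:2, area_under_power), silent = TRUE)
inherits(direct, "try-error")

# Wrapping it in Vectorize() makes it loop over the element pairs
m <- outer(1:3, 1:2, Vectorize(area_under_power))
m  # entry [a, b] is b^(a + 1) / (a + 1)
```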
Second you construct a matrix using that function:
mat <- outer( seq(.01, 10, length=100),
seq(.01, 10, length=100),
Vectorize( function(x,y) fourvar(a1=2, x/2, l1=2, l2=y/2) ) )
Then the task of creating the plot with labels in those positions can only be done easily with lattice::contourplot. After doing a reasonable amount of searching it does appear that the solution to geom_contour labeling is still a work in progress in ggplot2. The only labeling strategy I found is in an external package. However, the 'directlabels' package's function directlabel does not seem to have sufficient control to spread the labels out correctly in this case. In other examples that I have seen, it does spread the labels around the plot area. I suppose I could look at the code, but since it depends on the 'proto'-package, it will probably be weirdly encapsulated so I haven't looked.
require(reshape2)
mmat <- melt(mat)
str(mmat) # to see the names in the melted matrix
g <- ggplot(mmat, aes(x=Var1, y=Var2, z=value) )
g <- g+stat_contour(aes(col = ..level..), breaks=seq(.1, .9, .1) )
g <- g + scale_colour_continuous(low = "#000000", high = "#000000") # make black
install.packages("directlabels", repos="http://r-forge.r-project.org", type="source")
require(directlabels)
direct.label(g)
Note that these are the index positions from the matrix rather than the ratios of parameters, but that should be pretty easy to fix.
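One possible fix, sketched with a small stand-in matrix so the block runs on its own (`grid_x`, `grid_y` and `small_mat` are hypothetical names): build the long data frame with expand.grid() so the columns hold the actual grid values rather than matrix indices.

```r
# Stand-ins for the parameter grids and the matrix computed with outer()
grid_x <- seq(0.01, 10, length.out = 5)
grid_y <- seq(0.01, 10, length.out = 4)
small_mat <- outer(grid_x, grid_y, function(x, y) x / (x + y))

# expand.grid() varies its first argument fastest, which matches the
# column-major order of as.vector() on a matrix
long <- expand.grid(x = grid_x, y = grid_y)
long$value <- as.vector(small_mat)

head(long)
```

A data frame like `long` can then be used in place of the melted matrix, with aes(x = x, y = y, z = value).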
This, on the other hand, is how easily one can construct it in lattice (and I think it looks "cleaner"):
require(lattice)
contourplot(mat, at=seq(.1,.9,.1))
As I think the question is still relevant, there have been some developments in the contour plot labeling in the metR package. Adding to the previous example will give you nice contour labeling also with ggplot2
require(metR)
g + geom_text_contour(rotate = TRUE, nudge_x = 3, nudge_y = 5)

How to plot a violin scatter boxplot (in R)?

I just came by the following plot:
And wondered how it can be done in R (or other software)?
Update 10.03.11: Thank you to everyone who participated in answering this question - you gave wonderful solutions! I've compiled all the solutions presented here (as well as some others I've come by online) in a post on my blog.
Make.Funny.Plot does more or less what I think it should do. To be adapted according to your own needs, and might be optimized a bit, but this should be a nice start.
Make.Funny.Plot <- function(x){
unique.vals <- length(unique(x))
N <- length(x)
N.val <- min(N/20,unique.vals)
if(unique.vals>N.val){
x <- ave(x,cut(x,N.val),FUN=min)
x <- signif(x,4)
}
# construct the outline of the plot
outline <- as.vector(table(x))
outline <- outline/max(outline)
# determine some correction to make the V shape,
# based on the range
y.corr <- diff(range(x))*0.05
# Get the unique values
yval <- sort(unique(x))
plot(c(-1,1),c(min(yval),max(yval)),
type="n",xaxt="n",xlab="")
for(i in 1:length(yval)){
n <- sum(x==yval[i])
x.plot <- seq(-outline[i],outline[i],length=n)
y.plot <- yval[i]+abs(x.plot)*y.corr
points(x.plot,y.plot,pch=19,cex=0.5)
}
}
N <- 500
x <- rpois(N,4)+abs(rnorm(N))
Make.Funny.Plot(x)
EDIT : corrected so it always works.
I recently came upon the beeswarm package, that bears some similarity.
The bee swarm plot is a one-dimensional scatter plot like "stripchart", but with closely-packed, non-overlapping points.
Here's an example:
library(beeswarm)
beeswarm(time_survival ~ event_survival, data = breast,
method = 'smile',
pch = 16, pwcol = as.numeric(ER),
xlab = '', ylab = 'Follow-up time (months)',
labels = c('Censored', 'Metastasis'))
legend('topright', legend = levels(breast$ER),
title = 'ER', pch = 16, col = 1:2)
(source: eklund at www.cbs.dtu.dk)
I have come up with code similar to Joris's; still, I think this is more than a stem plot. By that I mean that the y value in each series is an absolute value of the distance to the in-bin mean, and the x value is more about whether the value is lower or higher than that mean.
Example code (sometimes throws warnings but works):
px<-function(x,N=40,...){
x<-sort(x);
#Cutting in bins
cut(x,N)->p;
#Calculate the means over bins
sapply(levels(p),function(i) mean(x[p==i]))->meansl;
means<-meansl[p];
#Calculate the mins over bins
sapply(levels(p),function(i) min(x[p==i]))->minl;
mins<-minl[p];
#Each dot is one value.
#X is an order of a value inside bin, moved so that the values lower than bin mean go below 0
X<-rep(0,length(x));
for(e in levels(p)) X[p==e]<-(1:sum(p==e))-1-sum((x-means)[p==e]<0);
#Y is the bin minimum + absolute value of the difference between a value and its bin mean
plot(X,mins+abs(x-means),pch=19,cex=0.5,...);
}
Try the vioplot package:
library(vioplot)
vioplot(rnorm(100))
(with awful default color ;-)
There is also wvioplot() in the wvioplot package, for weighted violin plot, and beanplot, which combines violin and rug plots. They are also available through the lattice package, see ?panel.violin.
Since this hasn't been mentioned yet, there is also ggbeeswarm, a relatively new R package based on ggplot2.
It adds another geom to ggplot, to be used instead of geom_jitter or the like.
In particular, geom_quasirandom (see the second example below) produces really good results, and I have in fact adopted it as my default plot.
Noteworthy is also the package vipor (VIolin POints in R) which produces plots using the standard R graphics and is in fact also used by ggbeeswarm behind the scenes.
set.seed(12345)
install.packages('ggbeeswarm')
library(ggplot2)
library(ggbeeswarm)
ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()

Resources