Using grconvertX/grconvertY in ggplot2 - r

I am trying to figure out how to use grconvertX/grconvertY in ggplot. My ultimate goal is to add annotation to a ggplot2 figure (and possibly lattice) with grid.text and grid.lines by going from user coordinates to device coordinates. I know it can be done with grobs but I am wondering if there is an easier way.
The following code allows me to pass values from user coordinates to ndc coordinates and use those values to annotate the plot with grid.text.
graphics.off() # close graphics windows
library(grid)
library(gridBase)
test= data.frame(
x = c(1,2,3),
y = c(12,10,3),
n = c(75,76,73)
)
par(mar = c(13,5,2,3))
plot(test$y ~ test$x,type="b", ann=F)
for (i in 1:nrow(test))
{
X=grconvertX(i , from="user", to="ndc")
grid.text(x=X, y =0.2, label=paste("GRID.text at\nuser.x=", i, "\n", "ndc.x=", (signif( X, 5)) ) )
grid.lines(x=c(X, X), y = c(0.28, 0.33) )
}
#add some code to save as PDF ...
The code is based on the solution from one of my previous posts: Mixing X and Y coordinate systems . You can see how x coordinates from the original plot were converted to ndc. The advantage of this approach is that I can use device coordinates for Y.
I assumed I could easily do the same in ggplot2 (and possibly in lattice).
library(ggplot2)
graphics.off() # close graphics windows
qplot(x=x, y=y, data=test)+geom_line()+ theme(plot.margin = unit(c(1,3,8,1), "lines")) # theme() replaces the removed opts()
for (i in 1:nrow(test))
{
X=grconvertX(i , from="user", to="ndc")
grid.text(x=X, y =0.2, label=paste("GRID.text at\nuser.x=", i, "\n", "ndc.x=", (signif( X, 5)) ) )
grid.lines(x=c(X, X), y = c(0.28, 0.33) )
}
#add some code to save as PDF...
However, it does not work correctly. The coordinates seem to be a bit off. The vertical lines and text don't correspond to the tick labels on the plot. Can anybody tell me how to fix it? Thanks a lot in advance.

The grconvertX and grconvertY functions work with base graphics, while ggplot2 uses grid graphics. In general the two graphics engines don't play nicely together (though you have demonstrated using gridBase to help). Your first example works because you started with a base graphic, so the user coordinate system exists with the base graph and grconvertX converts from it. In the second case the user coordinate system was never set in base graphics, so it looks like it might use the default coordinates of 0,1, which are similar but not identical to the top viewport coordinates. That is why you get something similar but not exactly correct (I am actually surprised that you did not get an error or warning).
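To make the base-graphics half of this concrete, here is a minimal sketch (the null PDF device and the variable names are only assumptions for illustration). Once plot() has set up the user coordinate system, grconvertX can map user coordinates to ndc and back again:

```r
# Open a throwaway device so the sketch runs without a screen
pdf(NULL)
plot(c(1, 2, 3), c(12, 10, 3), type = "b")

# user -> ndc: positions expressed as fractions of the device width
x_ndc <- grconvertX(1:3, from = "user", to = "ndc")

# ndc -> user: the round trip recovers the original values
x_user <- grconvertX(x_ndc, from = "ndc", to = "user")

print(cbind(user = 1:3, ndc = x_ndc, back = x_user))
dev.off()
```

The round trip only makes sense because a base plot exists on the device; without it there is no meaningful user coordinate system to convert from.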
Generally for grid graphics the equivalent for converting between coordinates is to just create a new viewport with the coordinate system of interest (or push/pop to an existing viewport with the correct coordinate system), then add your annotations in that viewport.
Here is an example that creates your plot, then moves down to the viewport containing the main plot, creates a new viewport with the same dimensions but with clipping turned off, the x scale is based on the data and the y scale is 0,1, then adds some text accordingly:
library(ggplot2)
library(grid)
test= data.frame( x = c(1,2,3), y = c(12,10,3), n = c(75,76,73) )
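qplot(x=x, y=y, data=test)+geom_line()+ theme(plot.margin = unit(c(1,3,8,1), "lines")) # theme() replaces the removed opts()
current.vpTree() # lists the viewport names; the panel name depends on the ggplot2 version
downViewport('panel-3-4') # use the panel viewport name reported above
pushViewport(dataViewport( test$x, clip='off',yscale=c(0,1)))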
for (i in 1:nrow(test)) {
grid.text(x=i, y = -0.2, default.units='native',
label=paste("GRID.text at\nuser.x=", i, "\n" ) )
grid.lines(x=c(i, i), y = c(-0.1, 0), default.units='native' )
}
One of the tricky things here is that ggplot2 does not set the viewport scales to match the data being plotted, but does the conversions itself. In this case setting the scale based on the x data worked, but if ggplot2 does something fancier then this might not work. What we would need is some way to get the back-transformed coordinates from ggplot2 to use in the call to grid.text.
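For reference, the viewport idea can be tried in isolation, without ggplot2. The following is only a sketch in plain grid (the margins and variable names are made up for illustration): it pushes a data viewport, draws in native units below the plotting region, and uses convertX to see where the native x values land in npc units of that viewport.

```r
library(grid)

pdf(NULL)            # throwaway device, so the sketch runs anywhere
grid.newpage()
xs <- c(1, 2, 3)

# Leave room below the plot region, then set up an x scale from the data
pushViewport(plotViewport(c(8, 4, 2, 2)))
pushViewport(dataViewport(xs, yscale = c(0, 1), clip = "off"))
grid.rect()
grid.xaxis()

# Tick marks below the panel, positioned in native (data) x units
for (i in xs) {
  grid.lines(x = c(i, i), y = c(-0.1, 0), default.units = "native")
}

# grid's counterpart of grconvertX: convert within the current viewport
x_npc  <- convertX(unit(xs, "native"), "npc", valueOnly = TRUE)
x_back <- convertX(unit(x_npc, "npc"), "native", valueOnly = TRUE)

popViewport(2)
dev.off()
```

Note that convertX is relative to the current viewport, not to the whole device, which is why the annotations are drawn while the data viewport is still pushed.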

Related

Plot a table with box size changing

Does anyone have an idea how is this kind of chart plotted? It seems like heat map. However, instead of using color, size of each cell is used to indicate the magnitude. I want to plot a figure like this but I don't know how to realize it. Can this be done in R or Matlab?
Try scatter:
scatter(x,y,sz,c,'s','filled');
where x and y are the positions of each square, sz is the size (must be a vector of the same length as x and y), and c is a length(x)-by-3 matrix with the RGB color value for each entry. The labels for the plot can be set with set(gca,properties) or xticklabels:
X=30;
Y=10;
[x,y]=meshgrid(1:X,1:Y);
x=reshape(x,[size(x,1)*size(x,2) 1]);
y=reshape(y,[size(y,1)*size(y,2) 1]);
sz=50;
sz=sz*(1+rand(size(x)));
c=[1*ones(length(x),1) repmat(rand(size(x)),[1 2])];
scatter(x,y,sz,c,'s','filled');
xlab={'ACC';'BLCA';etc}
xticks(1:X)
xticklabels(xlab)
xtickangle(90) % rotate the tick labels (same releases as xticks)
ylab={'RAPGEB6';etc}
yticks(1:Y)
yticklabels(ylab)
EDIT: yticks and friends are only available from R2016b onward; if you have an older version you should use set instead:
set(gca,'XTick',1:X,'XTickLabel',xlab,'XTickLabelRotation',90) %rotation only available from R2014b onward
set(gca,'YTick',1:Y,'YTickLabel',ylab)
In R, you can use ggplot2, which allows you to map your values (gene expression in your case?) onto the size variable. Here, I did a simulation that resembles your data structure:
my_data <- matrix(rnorm(8*26,mean=0,sd=1), nrow=8, ncol=26,
dimnames = list(paste0("gene",1:8), LETTERS))
Then, you can process the data frame to be ready for ggplot2 data visualization:
library(reshape)
dat_m <- melt(my_data, varnames = c("gene", "cancer"))
Now, use ggplot2::geom_tile() to map the values onto the size variable. You may update additional features of the plot.
library(ggplot2)
ggplot(data=dat_m, aes(cancer, gene)) +
geom_tile(aes(size=value, fill="red"), color="white") +
scale_fill_discrete(guide=FALSE) + ##hide scale
scale_size_continuous(guide=FALSE) ##hide another scale
In R, the corrplot package can be used. Specifically, you have to use method = 'square' when creating the plot.
Try this as an example:
library(corrplot)
corrplot(cor(mtcars), method = 'square', col = 'red')

Setting equal xlim and ylim in plot function

Is there a way to get the plot function to generate equal xlim and ylim automatically?
I do not want to define a fixed range beforehand; I want the plot function to decide on the range itself. However, I expect it to pick the same range for x and y.
A possible solution is to define a wrapper to the plot function:
plot.Custom <- function(x, y, ...) {
.limits <- range(x, y)
plot(x, y, xlim = .limits, ylim = .limits, ...)
}
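As a quick check of the wrapper idea, here is a hedged sketch using the built-in cars data. The name plot_equal is hypothetical, and finite = TRUE is an added assumption so that NA or infinite values do not break the limits. It draws on a null device and then reads back par("usr") to confirm both axes span the same range.

```r
# Variant of the wrapper above that also returns the limits it used
plot_equal <- function(x, y, ...) {
  limits <- range(x, y, finite = TRUE)   # common range over both variables
  plot(x, y, xlim = limits, ylim = limits, ...)
  invisible(limits)
}

pdf(NULL)                                # throwaway device
lims <- plot_equal(cars$speed, cars$dist)
usr <- par("usr")                        # c(x1, x2, y1, y2) actually used
dev.off()

print(lims)
print(usr)
```

Equal limits do not by themselves give equal unit lengths on screen; that depends on the shape of the plotting region.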
One way is to manipulate interactively and then choose the right one. A slider will appear once you run the following code.
library(manipulate)
manipulate(
plot(cars, xlim=c(x.min,x.max)),
x.min=slider(0,15),
x.max=slider(15,30))
I'm not aware of any way to do this using plot (which doesn't mean there isn't one). ggplot might be the way to go; it lends itself more to being changed retroactively, since it is designed around a layer system.
library(ggplot2)
#Creating our ggplot object
loop_plot <- ggplot(cars, aes(x = speed, y = dist)) +
geom_point()
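#pulling out the 'auto' x & y axis limits
#(the $panel$ranges path is specific to older ggplot2; in ggplot2 3.x the same ranges sit under $layout$panel_params[[1]])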
rangepull <- t(cbind(
ggplot_build(loop_plot)$panel$ranges[[1]]$x.range,
ggplot_build(loop_plot)$panel$ranges[[1]]$y.range))
#taking the max and min(so we don't cut out data points)
newrange <- list(cor.min = min(rangepull[,1]), cor.max = max(rangepull[,2]))
#changing our plot size to be nice and symmetric
loop_plot <- loop_plot +
xlim(newrange$cor.min, newrange$cor.max) +
ylim(newrange$cor.min, newrange$cor.max)
Note that the loop_plot object is of ggplot class, and won't actually print until it's called.
I used the cars dataset in the code above to show what's going on, but just sub in your data set[s] and then do whatever post-processing your end goal requires.
You'll also be able to add in titles and the like based off of the dataset name et cetera which will likely end up producing a clearer visualization out of your loop.
Hopefully this works for your needs.

R ggplot: geom_tile lines in pdf output

I'm constructing a plot that uses geom_tile and then outputting it to .pdf (using pdf("filename",...)). However, when I do, the .pdf result has tiny lines (striations, as one person put it) running through it. I've attached an image showing the problem.
Googling led to this thread, but the only real advice in there was to try passing size=0 to geom_tile, which I did with no effect. Any suggestions on how I can fix these? I'd like to use this as a figure in a paper, but it's not going to work like this.
Minimal code:
require(ggplot2)
require(scales)
require(reshape)
volcano3d <- melt(volcano)
names(volcano3d) <- c("x", "y", "z")
v <- ggplot(volcano3d, aes(x, y, z = z))
pdf("mew.pdf")
print(v + geom_tile(aes(fill=z)) + stat_contour(size=2) + scale_fill_gradient("z"))
dev.off() # close the device so the PDF is actually written
This happens because the default colour of the tiles in geom_tile seems to be white.
To fix this, you need to map the colour to z in the same way as fill.
print(v +
geom_tile(aes(fill=z, colour=z), size=1) +
stat_contour(size=2) +
scale_fill_gradient("z")
)
Try to use geom_raster:
pdf("mew.pdf")
print(v + geom_raster(aes(fill=z)) + stat_contour(size=2) + scale_fill_gradient("z"))
dev.off()
This gives good quality in my environment.
I cannot reproduce the problem on my computer (Windows 7), but I remember it was a problem discussed on the list for certain configurations. Brian Ripley (if I remember) recommended
CairoPDF("mew.pdf") # Package Cairo
to get around this.
In the interests of skinning this cat, and going into waaay too much detail, this code decomposes the R image into a mesh of quads (as used by rgl), and then shows the difference between a raster plot and a "tile" or "rect" plot.
library(raster)
im <- raster::raster(volcano)
## this is the image in rgl corner-vertex form
msh <- quadmesh::quadmesh(im)
## manual labour for colour scaling
dif <- diff(range(values(im)))
mn <- min(values(im))
scl <- function(x) (x - mn)/dif
This is the traditional R 'image', which draws a little tile or 'rect()' for every pixel.
list_image <- list(x = xFromCol(im), y = rev(yFromRow(im)), z = t(as.matrix(im)[nrow(im):1, ]))
image(list_image)
It's slow, and though it calls the source of 'rect()' under the hood, we can't also set the border colour. Use 'useRaster = TRUE' to use 'rasterImage' for more efficient drawing time, control over interpolation, and ultimately - file size.
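To compare the two drawing modes directly, the following sketch writes the same image to two temporary PDF files, once with per-pixel rectangles and once with useRaster = TRUE, and prints the resulting file sizes. The file names are temporary and purely illustrative, and the sizes will depend on your R version and device settings.

```r
# Two temporary output files, one per drawing mode
f_rect   <- tempfile(fileext = ".pdf")
f_raster <- tempfile(fileext = ".pdf")

# One rectangle per cell (the traditional behaviour)
pdf(f_rect)
image(volcano, useRaster = FALSE)
dev.off()

# A single raster image for the whole matrix
pdf(f_raster)
image(volcano, useRaster = TRUE)
dev.off()

sizes <- file.size(c(f_rect, f_raster))
names(sizes) <- c("rect", "raster")
print(sizes)
```

Opening both files in the same viewer is also a simple way to check whether the thin lines between cells show up in only one of them.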
Now let's plot the image again, but by explicitly calling rect for every pixel. ('quadmesh' probably not the easiest way to demonstrate, it's just fresh in my mind).
## worker function to plot rect from vertex index
rectfun <- function(x, vb, ...) rect(vb[1, x[1]], vb[2,x[1]], vb[1,x[3]], vb[2,x[3]], ...)
## draw just the borders on the original, traditional image
apply(msh$ib, 2, rectfun, msh$vb, border = "white")
Now try again with 'rect'.
## redraw the entire image, with rect calls
##(not efficient, but essentially the same as what image does with useRaster = FALSE)
cols <- heat.colors(12)
## just to clear the plot, and maintain the plot space
image(im, col = "black")
for (i in seq(ncol(msh$ib))) {
rectfun(msh$ib[,i], msh$vb, col = cols[scl(im[i]) * (length(cols)-1) + 1], border = "dodgerblue")
}

How can I overlay two dense scatter plots so that I can see the outlines of each in R or Matlab?

See this example
This was created in matlab by making two scatter plots independently, creating images of each, then using the imagesc to draw them into the same figure and then finally setting the alpha of the top image to 0.5.
I would like to do this in R or matlab without using images, since creating an image does not preserve the axis scale information, nor can I overlay a grid (e.g. using 'grid on' in matlab). Ideally I would like to do this properly in matlab, but would also be happy with a solution in R. It seems like it should be possible but I can't for the life of me figure it out.
So generally, I would like to be able to set the alpha of an entire plotted object (i.e. of a matlab plot handle in matlab parlance...)
Thanks,
Ben.
EDIT: The data in the above example is actually 2D. The plotted points are from a computer simulation. Each point represents 'amplitude' (y-axis) (an emergent property specific to the simulation I'm running), plotted against 'performance' (x-axis).
EDIT 2: There are 1796400 points in each data set.
Using ggplot2 you can add together two geom_point layers and make them transparent using the alpha parameter. ggplot2 also adds up transparency, and I think this is what you want. This should work, although I haven't run it.
library(ggplot2)
dat = data.frame(x = runif(1000), y = runif(1000), cat = rep(c("A","B"), each = 500))
ggplot(aes(x = x, y = y, color = cat), data = dat) + geom_point(alpha = 0.3)
ggplot2 is awesome!
This is an example of calculating and drawing a convex hull:
library(automap)
library(ggplot2)
library(plyr)
loadMeuse()
theme_set(theme_bw())
meuse = as.data.frame(meuse)
chull_per_soil = ddply(meuse, .(soil),
function(sub) sub[chull(sub$x, sub$y),c("x","y")])
ggplot(aes(x = x, y = y), data = meuse) +
geom_point(aes(size = log(zinc), color = ffreq)) +
geom_polygon(aes(color = soil), data = chull_per_soil, fill = NA) +
coord_equal()
which leads to the following illustration:
You could first export the two data sets as bitmap images, re-import them, add transparency:
library(grid)
N <- 1e7 # Warning: slow
d <- data.frame(x1=rnorm(N),
x2=rnorm(N, 0.8, 0.9),
y=rnorm(N, 0.8, 0.2),
z=rnorm(N, 0.2, 0.4))
v <- with(d, dataViewport(c(x1,x2),c(y, z)))
png("layer1.png", bg="transparent")
with(d, grid.points(x1,y, vp=v,default="native",pch=".",gp=gpar(col="blue")))
dev.off()
png("layer2.png", bg="transparent")
with(d, grid.points(x2,z, vp=v,default="native",pch=".",gp=gpar(col="red")))
dev.off()
library(png)
i1 <- readPNG("layer1.png", native=FALSE)
i2 <- readPNG("layer2.png", native=FALSE)
ghostize <- function(r, alpha=0.5)
matrix(adjustcolor(rgb(r[,,1],r[,,2],r[,,3],r[,,4]), alpha.f=alpha), nrow=dim(r)[1])
grid.newpage()
grid.rect(gp=gpar(fill="white"))
grid.raster(ghostize(i1))
grid.raster(ghostize(i2))
You can then add these as layers in, say, ggplot2.
Use the transparency capability of color descriptions. You can define a color as a sequence of four two-digit hexadecimal pairs: muddy <- "#888888FF". The first three pairs set the RGB values (00 to FF); the final pair sets the alpha (opacity) level, where FF is fully opaque and 00 is fully transparent.
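A small base-R sketch of this idea follows. The point counts, seeds and the 0.2 alpha are arbitrary choices for illustration. It inspects the alpha channel of the color string and then overlays two dense point clouds using semi-transparent colors, so the axes and scales are kept.

```r
muddy <- "#888888FF"

# Decode the four channels; the "alpha" row holds the opacity (0 to 255)
print(col2rgb(muddy, alpha = TRUE))

# Two ways of getting a half-transparent version of the same grey
half      <- adjustcolor(muddy, alpha.f = 0.5)
also_half <- rgb(136, 136, 136, 128, maxColorValue = 255)

# Overlay two dense scatter plots with transparent points
pdf(NULL)                                  # throwaway device
set.seed(1)
plot(rnorm(2000), rnorm(2000), pch = 16,
     col = adjustcolor("blue", alpha.f = 0.2))
points(rnorm(2000, mean = 1), rnorm(2000, mean = 1), pch = 16,
       col = adjustcolor("red", alpha.f = 0.2))
grid()                                     # a grid can still be drawn on top
dev.off()
```

Not every graphics device supports semi-transparent colors, so the output device matters; PDF and the Cairo-based devices do.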
AFAIK, your best option with Matlab is to just make your own plot function. The scatter plot points unfortunately do not yet have a transparency attribute so you cannot affect it. However, if you create, say, most crudely, a bunch of loops which draw many tiny circles, you can then easily give them an alpha value and obtain a transparent set of data points.

Annotate ggplot2 graphs using tikzAnnotate in tikzDevice

I would like to use tikzDevice to include annotated ggplot2 graphs in a Latex document.
tikzAnnotate help has an example of how to use it with base graphics, but how to use it with a grid-based plotting package like ggplot2? The challenge seems to be the positioning of the tikz node.
playwith package has a function convertToDevicePixels (http://code.google.com/p/playwith/source/browse/trunk/R/gridwork.R) that seems to be similar to grconvertX/grconvertY, but I am unable to get this to work either.
Would appreciate any pointers on how to proceed.
tikzAnnotate example using base graphics
library(tikzDevice)
library(ggplot2)
options(tikzLatexPackages = c(getOption('tikzLatexPackages'),
"\\usetikzlibrary{shapes.arrows}"))
tikz(standAlone=TRUE)
print(plot(15:20, 5:10))
#print(qplot(15:20, 5:10))
x <- grconvertX(17,,'device')
y <- grconvertY(7,,'device')
#px <- playwith::convertToDevicePixels(17, 7)
#x <- px$x
#y <- px$y
tikzAnnotate(paste('\\node[single arrow,anchor=tip,draw,fill=green] at (',
x,',',y,') {Look over here!};'))
dev.off()
Currently, tikzAnnotate only works with base graphics. When tikzAnnotate was first written, the problem with grid graphics was that we needed a way of specifying the x,y coordinates relative to the absolute lower left corner of the device canvas. grid thinks in terms of viewports and for many cases it seems the final coordinate system of the graphic is not known until it is heading to the device by means of the print function.
It would be great to have this functionality, but I could not figure out a good way to implement it and so the feature got shelved. If anyone has details on a good implementation, feel free to start a discussion on the mailing list (which now has an alternate portal on Google Groups) and it will get on the TODO list.
Even better, implement the functionality and open a pull request to the project on GitHub. This is guaranteed to get the feature into a release over 9000 times faster than if it sits on my TODO list for months.
Update
I have had some time to work on this, and I have come up with a function for converting grid coordinates in the current viewport to absolute device coordinates:
gridToDevice <- function(x = 0, y = 0, units = 'native') {
# Converts a coordinate pair from the current viewport to an "absolute
# location" measured in device units from the lower left corner. This is done
# by first casting to inches in the current viewport and then using the
# current.transform() matrix to obtain inches in the device canvas.
x <- convertX(unit(x, units), unitTo = 'inches', valueOnly = TRUE)
y <- convertY(unit(y, units), unitTo = 'inches', valueOnly = TRUE)
transCoords <- c(x,y,1) %*% current.transform()
transCoords <- (transCoords / transCoords[3])
return(
# Finally, cast from inches to native device units
c(
grconvertX(transCoords[1], from = 'inches', to ='device'),
grconvertY(transCoords[2], from = 'inches', to ='device')
)
)
}
Using this missing piece, one can use tikzAnnotate to mark up a grid or lattice plot:
require(tikzDevice)
require(grid)
options(tikzLatexPackages = c(getOption('tikzLatexPackages'),
"\\usetikzlibrary{shapes.arrows}"))
tikz(standAlone=TRUE)
xs <- 15:20
ys <- 5:10
pushViewport(plotViewport())
pushViewport(dataViewport(xs,ys))
grobs <- gList(grid.rect(),grid.xaxis(),grid.yaxis(),grid.points(xs, ys))
coords <- gridToDevice(17, 7)
tikzAnnotate(paste('\\node[single arrow,anchor=tip,draw,fill=green,left=1em]',
'at (', coords[1],',',coords[2],') {Look over here!};'))
dev.off()
This gives the following output:
There is still some work to be done, such as:
Creation of a "annotation grob" that can be added to grid graphics.
Determine how to add such an object to a ggplot.
These features are scheduled to appear in release 0.7 of the tikzDevice.
I have made up a small example based on @Andrie's suggestion with geom_text and geom_polygon:
Initializing your data:
df <- structure(list(x = 15:20, y = 5:10), .Names = c("x", "y"), row.names = c(NA, -6L), class = "data.frame")
And the point you are to annotate is the 4th row in the dataset, the text should be: "Look over here!"
point <- df[4,]
ptext <- "Look over here!"
Make a nice arrow calculated from the coords of the point given above:
arrow <- data.frame(
x = c(point$x-0.1, point$x-0.3, point$x-0.3, point$x-2, point$x-2, point$x-0.3, point$x-0.3, point$x-0.1),
y = c(point$y, point$y+0.3, point$y+0.2, point$y+0.2, point$y-0.2, point$y-0.2, point$y-0.3, point$y)
)
And also make some calculations for the position of the text:
ptext <- data.frame(label=ptext, x=point$x-1, y=point$y)
No more to do besides plotting:
ggplot(df, aes(x,y)) + geom_point() + geom_polygon(aes(x,y), data=arrow, fill="green") + geom_text(aes(x, y, label=label), ptext) + theme_bw()
Of course, this is a rather hackish solution, but could be extended:
compute the size of arrow based on the x and y ranges,
compute the position of the text based on the length of the text (or by the real width of the string with textGrob),
define a shape which does not overlap your points :)
Good luck!

Resources