How can I overlay two dense scatter plots so that I can see the outlines of each in R or Matlab? - r

See this example
This was created in matlab by making two scatter plots independently, creating images of each, then using the imagesc to draw them into the same figure and then finally setting the alpha of the top image to 0.5.
I would like to do this in R or matlab without using images, since creating an image does not preserve the axis scale information, nor can I overlay a grid (e.g. using 'grid on' in matlab). Ideally I wold like to do this properly in matlab, but would also be happy with a solution in R. It seems like it should be possible but I can't for the life of me figure it out.
So generally, I would like to be able to set the alpha of an entire plotted object (i.e. of a matlab plot handle in matlab parlance...)
Thanks,
Ben.
EDIT: The data in the above example is actually 2D. The plotted points are from a computer simulation. Each point represents 'amplitude' (y-axis) (an emergent property specific to the simulation I'm running), plotted against 'performance' (x-axis).
EDIT 2: There are 1796400 points in each data set.

Using ggplot2 you can add together two geom_point's and make them transparent using the alpha parameter. ggplot2 als adds up transparency, and I think this is what you want. This should work, although I haven't run this.
dat = data.frame(x = runif(1000), y = runif(1000), cat = rep(c("A","B"), each = 500))
ggplot(aes(x = x, y = y, color = cat), data = dat) + geom_point(alpha = 0.3)
ggplot2 is awesome!
This is an example of calculating and drawing a convex hull:
library(automap)
library(ggplot2)
library(plyr)
loadMeuse()
theme_set(theme_bw())
meuse = as.data.frame(meuse)
chull_per_soil = ddply(meuse, .(soil),
function(sub) sub[chull(sub$x, sub$y),c("x","y")])
ggplot(aes(x = x, y = y), data = meuse) +
geom_point(aes(size = log(zinc), color = ffreq)) +
geom_polygon(aes(color = soil), data = chull_per_soil, fill = NA) +
coord_equal()
which leads to the following illustration:

You could first export the two data sets as bitmap images, re-import them, add transparency:
library(grid)
N <- 1e7 # Warning: slow
d <- data.frame(x1=rnorm(N),
x2=rnorm(N, 0.8, 0.9),
y=rnorm(N, 0.8, 0.2),
z=rnorm(N, 0.2, 0.4))
v <- with(d, dataViewport(c(x1,x2),c(y, z)))
png("layer1.png", bg="transparent")
with(d, grid.points(x1,y, vp=v,default="native",pch=".",gp=gpar(col="blue")))
dev.off()
png("layer2.png", bg="transparent")
with(d, grid.points(x2,z, vp=v,default="native",pch=".",gp=gpar(col="red")))
dev.off()
library(png)
i1 <- readPNG("layer1.png", native=FALSE)
i2 <- readPNG("layer2.png", native=FALSE)
ghostize <- function(r, alpha=0.5)
matrix(adjustcolor(rgb(r[,,1],r[,,2],r[,,3],r[,,4]), alpha.f=alpha), nrow=dim(r)[1])
grid.newpage()
grid.rect(gp=gpar(fill="white"))
grid.raster(ghostize(i1))
grid.raster(ghostize(i2))
you can add these as layers in, say, ggplot2.

Use the transparency capability of color descriptions. You can define a color as a sequence of four 2-byte words: muddy <- "#888888FF" . The first three pairs set the RGB colors (00 to FF); the final pair sets the transparency level.

AFAIK, your best option with Matlab is to just make your own plot function. The scatter plot points unfortunately do not yet have a transparency attribute so you cannot affect it. However, if you create, say, most crudely, a bunch of loops which draw many tiny circles, you can then easily give them an alpha value and obtain a transparent set of data points.

Related

how to mimic histogram plot from flowjo in R using flowCore?

I'm new to flowCore + R. I would like to mimic a histogram plot after gating that can be manually done in FlowJo software. I got something similar but it doesn't look quite right because it is a "density" plot and is shifted. How can I get the x axis to shift over and look similar to how FlowJo outputs the plot? I tried reading this document but couldn't find a plot similar to the one in FlowJo: howtoflowcore Appreciate any guidance. Thanks.
code snippet:
library(flowCore)
parentpath <- "/parent/path"
subfolder <- "Sample 1"
fcs_files <- list.files(paste0(parentpath, subfolder), pattern = ".fcs")
fs <- read.flowSet(fcs_files)
rect.g <- rectangleGate(filterId = "main",list("FSC-A" = c(1e5, 2e5), "SSC-A" = c(3e4,1e5)))
fs_sub <- Subset(fs, rect.g)
p <- ggcyto(fs_sub[[15]], aes(x= `UV-379-A`)) +
geom_density(fill='black', alpha = 0.4) +
ggcyto_par_set(limits = list(x = c(-1e3, 5e4), y = c(0, 6e-5)))
p
FlowJo output:
R FlowCore output:
The reason that for the "shift" is that the x axis is logarithmic (base 10) in the flowJo graph. To achieve the same result in R, add
+ scale_x_log10()
after the existing code. This might interact weirdly with the axis limits you've set, so bare that in mind.
To make the y-axis "count" rather than density, you can change the first line of your ggcyto() call to:
aes(x= `UV-379-A`, y = after_stat(count))
Let me know if that works - I don't have your data to hand so that's all from memory!
For any purely aesthetic changes, they are relatively easy to look up.

3D pipe/tube plots in R - creating plots of tree roots

I'm trying to create 3D plots of simulated tree roots in R. Here is an example of a root system growing over time:
This is essentially a 3D network of cylinders, where the cylinder diameter (and, optionally, color) represents the size of the root. The available data includes:
x, y, z of the root centroid
direction of "parent" root (e.g. +x, -x, +y, -y, +z, -z), although this information could be captured in several different ways, including by calculating the x, y, z of the parent directly prior to plotting.
size of root
Example 3D data is here, but here is my first attempt at it in just 2D using ggplot2::geom_spoke:
dat <- data.frame(x = c(0,1,-1,0,1,-1),
y = c(-1,-1,-1,-2,-2,-2),
biomass = c(3,1.5,1.5,1,1,1),
parent.dir = c("+y","-x","+x","+y","+y","+y"))
dat$parent.dir <- as.numeric(as.character(factor(dat$parent.dir,
levels = c("-x", "+x", "-y", "+y"),
labels = c(pi, 0, pi*3/2, pi/2))))
ggplot(dat, aes(x = x, y = y)) +
geom_point(x = 0, y = 0, size = 20) +
geom_spoke(radius = 1,
aes(angle = parent.dir,
size = biomass)) +
coord_equal()
I prefer a solution based in the ggplot2 framework, but I realize that there are not a ton of 3D options for ggplot2. One interesting approach could be to creatively utilize the concept of network graphs via the ggraph and tidygraph packages. While those packages only operate in 2D as far as I know, their developer has also had some interesting related ideas in 3D that could also be applied.
The rgl library in seems to be the go-to for 3D plots in R, but an rgl solution just seems so much more complex and lacks the other benefits of ggplot2, such as faceting by year as in the example, easily adjusting scales, etc.
Example data is here:
I don't understand the format of your data so I'm sure this isn't the display you want, but it shows how to draw a bunch of cylinders in rgl:
root <- read.csv("~/temp/root.csv")
segments <- data.frame(row.names = unique(root$parent.direction),
x = c(-1,0,1,0,0),
y = c(0,1,0,0,-1),
z = c(0,0,0,0.2,0))
library(rgl)
open3d()
for (i in seq_len(nrow(root))) {
rbind(root[i,2:4],
root[i,2:4] - segments[root$parent.direction[i],]) %>%
cylinder3d(radius = root$size[i]^0.3, closed = -2, sides = 20) %>%
shade3d(col = "green")
}
decorate3d()
This gives the following display (rotatable in the original):
You can pass each cylinder through addNormals if you want it to look smooth, or use sides = <some big number> in the cylinder3d to make them look rounder.

Plot a table with box size changing

Does anyone have an idea how is this kind of chart plotted? It seems like heat map. However, instead of using color, size of each cell is used to indicate the magnitude. I want to plot a figure like this but I don't know how to realize it. Can this be done in R or Matlab?
Try scatter:
scatter(x,y,sz,c,'s','filled');
where x and y are the positions of each square, sz is the size (must be a vector of the same length as x and y), and c is a 3xlength(x) matrix with the color value for each entry. The labels for the plot can be input with set(gcf,properties) or xticklabels:
X=30;
Y=10;
[x,y]=meshgrid(1:X,1:Y);
x=reshape(x,[size(x,1)*size(x,2) 1]);
y=reshape(y,[size(y,1)*size(y,2) 1]);
sz=50;
sz=sz*(1+rand(size(x)));
c=[1*ones(length(x),1) repmat(rand(size(x)),[1 2])];
scatter(x,y,sz,c,'s','filled');
xlab={'ACC';'BLCA';etc}
xticks(1:X)
xticklabels(xlab)
set(get(gca,'XLabel'),'Rotation',90);
ylab={'RAPGEB6';etc}
yticks(1:Y)
yticklabels(ylab)
EDIT: yticks & co are only available for >R2016b, if you don't have a newer version you should use set instead:
set(gca,'XTick',1:X,'XTickLabel',xlab,'XTickLabelRotation',90) %rotation only available for >R2014b
set(gca,'YTick',1:Y,'YTickLabel',ylab)
in R, you should use ggplot2 that allows you to map your values (gene expression in your case?) onto the size variable. Here, I did a simulation that resembles your data structure:
my_data <- matrix(rnorm(8*26,mean=0,sd=1), nrow=8, ncol=26,
dimnames = list(paste0("gene",1:8), LETTERS))
Then, you can process the data frame to be ready for ggplot2 data visualization:
library(reshape)
dat_m <- melt(my_data, varnames = c("gene", "cancer"))
Now, use ggplot2::geom_tile() to map the values onto the size variable. You may update additional features of the plot.
library(ggplot2)
ggplot(data=dat_m, aes(cancer, gene)) +
geom_tile(aes(size=value, fill="red"), color="white") +
scale_fill_discrete(guide=FALSE) + ##hide scale
scale_size_continuous(guide=FALSE) ##hide another scale
In R, corrplotpackage can be used. Specifically, you have to use method = 'square' when creating the plot.
Try this as an example:
library(corrplot)
corrplot(cor(mtcars), method = 'square', col = 'red')

contour plot of a custom function in R

I'm working with some custom functions and I need to draw contours for them based on multiple values for the parameters.
Here is an example function:
I need to draw such a contour plot:
Any idea?
Thanks.
First you construct a function, fourvar that takes those four parameters as arguments. In this case you could have done it with 3 variables one of which was lambda_2 over lambda_1. Alpha1 is fixed at 2 so alpha_1/alpha_2 will vary over 0-10.
fourvar <- function(a1,a2,l1,l2){
a1* integrate( function(x) {(1-x)^(a1-1)*(1-x^(l2/l1) )^a2} , 0 , 1)$value }
The trick is to realize that the integrate function returns a list and you only want the 'value' part of that list so it can be Vectorize()-ed.
Second you construct a matrix using that function:
mat <- outer( seq(.01, 10, length=100),
seq(.01, 10, length=100),
Vectorize( function(x,y) fourvar(a1=2, x/2, l1=2, l2=y/2) ) )
Then the task of creating the plot with labels in those positions can only be done easily with lattice::contourplot. After doing a reasonable amount of searching it does appear that the solution to geom_contour labeling is still a work in progress in ggplot2. The only labeling strategy I found is in an external package. However, the 'directlabels' package's function directlabel does not seem to have sufficient control to spread the labels out correctly in this case. In other examples that I have seen, it does spread the labels around the plot area. I suppose I could look at the code, but since it depends on the 'proto'-package, it will probably be weirdly encapsulated so I haven't looked.
require(reshape2)
mmat <- melt(mat)
str(mmat) # to see the names in the melted matrix
g <- ggplot(mmat, aes(x=Var1, y=Var2, z=value) )
g <- g+stat_contour(aes(col = ..level..), breaks=seq(.1, .9, .1) )
g <- g + scale_colour_continuous(low = "#000000", high = "#000000") # make black
install.packages("directlabels", repos="http://r-forge.r-project.org", type="source")
require(directlabels)
direct.label(g)
Note that these are the index positions from the matrix rather than the ratios of parameters, but that should be pretty easy to fix.
This, on the other hand, is how easilyy one can construct it in lattice (and I think it looks "cleaner":
require(lattice)
contourplot(mat, at=seq(.1,.9,.1))
As I think the question is still relevant, there have been some developments in the contour plot labeling in the metR package. Adding to the previous example will give you nice contour labeling also with ggplot2
require(metR)
g + geom_text_contour(rotate = TRUE, nudge_x = 3, nudge_y = 5)

R ggplot: geom_tile lines in pdf output

I'm constructing a plot that uses geom_tile and then outputting it to .pdf (using pdf("filename",...)). However, when I do, the .pdf result has tiny lines (striations, as one person put it) running through it. I've attached an image showing the problem.
Googling let to this thread, but the only real advice in there was to try passing size=0 to geom_tile, which I did with no effect. Any suggestions on how I can fix these? I'd like to use this as a figure in a paper, but it's not going to work like this.
Minimal code:
require(ggplot2)
require(scales)
require(reshape)
volcano3d <- melt(volcano)
names(volcano3d) <- c("x", "y", "z")
v <- ggplot(volcano3d, aes(x, y, z = z))
pdf("mew.pdf")
print(v + geom_tile(aes(fill=z)) + stat_contour(size=2) + scale_fill_gradient("z"))
This happens because the default colour of the tiles in geom_tile seems to be white.
To fix this, you need to map the colour to z in the same way as fill.
print(v +
geom_tile(aes(fill=z, colour=z), size=1) +
stat_contour(size=2) +
scale_fill_gradient("z")
)
Try to use geom_raster:
pdf("mew.pdf")
print(v + geom_raster(aes(fill=z)) + stat_contour(size=2) + scale_fill_gradient("z"))
dev.off()
good quality in my environment.
I cannot reproduce the problem on my computer (Windows 7), but I remember it was a problem discussed on the list for certain configurations. Brian Ripley (if I remember) recommended
CairoPDF("mew.pdf") # Package Cairo
to get around this
In the interests of skinning this cat, and going into waaay too much detail, this code decomposes the R image into a mesh of quads (as used by rgl), and then shows the difference between a raster plot and a "tile" or "rect" plot.
library(raster)
im <- raster::raster(volcano)
## this is the image in rgl corner-vertex form
msh <- quadmesh::quadmesh(im)
## manual labour for colour scaling
dif <- diff(range(values(im)))
mn <- min(values(im))
scl <- function(x) (x - mn)/dif
This the the traditional R 'image', which draws a little tile or 'rect()' for every pixel.
list_image <- list(x = xFromCol(im), y = rev(yFromRow(im)), z = t(as.matrix(im)[nrow(im):1, ]))
image(list_image)
It's slow, and though it calls the source of 'rect()' under the hood, we can't also set the border colour. Use 'useRaster = TRUE' to use 'rasterImage' for more efficient drawing time, control over interpolation, and ultimately - file size.
Now let's plot the image again, but by explicitly calling rect for every pixel. ('quadmesh' probably not the easiest way to demonstrate, it's just fresh in my mind).
## worker function to plot rect from vertex index
rectfun <- function(x, vb, ...) rect(vb[1, x[1]], vb[2,x[1]], vb[1,x[3]], vb[2,x[3]], ...)
## draw just the borders on the original, traditional image
apply(msh$ib, 2, rectfun, msh$vb, border = "white")
Now try again with 'rect'.
## redraw the entire image, with rect calls
##(not efficient, but essentially the same as what image does with useRaster = FALSE)
cols <- heat.colors(12)
## just to clear the plot, and maintain the plot space
image(im, col = "black")
for (i in seq(ncol(msh$ib))) {
rectfun(msh$ib[,i], msh$vb, col = cols[scl(im[i]) * (length(cols)-1) + 1], border = "dodgerblue")
}

Resources