I have some data to which I would like to add "stippling" to show where it is "important", as they do in the IPCC plots.
At the moment I am really struggling with trying to do this in R.
If I make up some test data and plot it:
data <- array(runif(12*6), dim=c(12,6) )
over <- ifelse(data > 0.5, 1, 0 )
image(1:12, 1:6, data)
What I would like to finally do is over-plot some points based on the array "over" on top of the current image.
Any suggestions?
This should help - I had to do a similar thing before and wrote a function that I posted here.
#required function from www.menugget.blogspot.com
matrix.poly <- function(x, y, z=mat, n=NULL){
if(missing(z)) stop("Must define matrix 'z'")
if(missing(n)) stop("Must define at least 1 grid location 'n'")
if(missing(x)) x <- seq(0,1,,dim(z)[1])
if(missing(y)) y <- seq(0,1,,dim(z)[2])
poly <- vector(mode="list", length(n))
for(i in seq(length(n))){
ROW <- ((n[i]-1) %% dim(z)[1]) +1
COL <- ((n[i]-1) %/% dim(z)[1]) +1
dist.left <- (x[ROW]-x[ROW-1])/2
dist.right <- (x[ROW+1]-x[ROW])/2
if(ROW==1) dist.left <- dist.right
if(ROW==dim(z)[1]) dist.right <- dist.left
dist.down <- (y[COL]-y[COL-1])/2
dist.up <- (y[COL+1]-y[COL])/2
if(COL==1) dist.down <- dist.up
if(COL==dim(z)[2]) dist.up <- dist.down
xs <- c(x[ROW]-dist.left, x[ROW]-dist.left, x[ROW]+dist.right, x[ROW]+dist.right)
ys <- c(y[COL]-dist.down, y[COL]+dist.up, y[COL]+dist.up, y[COL]-dist.down)
poly[[i]] <- data.frame(x=xs, y=ys)
}
return(poly)
}
#make vector of grids for hatching
incl <- which(over==1)
#make polygons for each grid for hatching
polys <- matrix.poly(1:12, 1:6, z=over, n=incl)
#plot
png("hatched_image.png")
image(1:12, 1:6, data)
for(i in seq(polys)){
polygon(polys[[i]], density=10, angle=45, border=NA)
polygon(polys[[i]], density=10, angle=-45, border=NA)
}
box()
dev.off()
Or, an alternative with "stipples":
png("hatched_image2.png")
image(1:12, 1:6, data)
for(i in seq(polys)){
xran <- range(polys[[i]]$x)
yran <- range(polys[[i]]$y)
xs <- seq(xran[1], xran[2],,5)
ys <- seq(yran[1], yran[2],,5)
grd <- expand.grid(xs,ys)
points(grd, pch=19, cex=0.5)
}
box()
dev.off()
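For a small matrix like the test data in the question, the marked cells can also be located directly with which(..., arr.ind = TRUE), which avoids building one polygon per cell. This is only a sketch under that assumption: it puts a single point at each cell centre rather than a 5 x 5 stipple grid, and the set.seed() call is added here purely to make the example repeatable.

```r
# Test data as in the question (seed added for repeatability)
set.seed(1)
data <- array(runif(12 * 6), dim = c(12, 6))
over <- ifelse(data > 0.5, 1, 0)

# Row/column indices of the flagged cells; with x = 1:12 and y = 1:6
# these indices are also the plotting coordinates of the cell centres
idx <- which(over == 1, arr.ind = TRUE)

image(1:12, 1:6, data)
points(idx[, "row"], idx[, "col"], pch = 19, cex = 0.5)
box()
```

If the axes are not simple 1:n sequences, index the coordinate vectors instead, e.g. x[idx[, "row"]] and y[idx[, "col"]].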
Update:
In (very late) response to Paul Hiemstra's comment, here are two more examples with a matrix of higher resolution. The hatching maintains a nice regular pattern, but it is not nice to look at when broken up. The stippled example is much nicer:
n <- 100
x <- 1:n
y <- 1:n
M <- list(x=x, y=y, z=outer(x, y, FUN = function(x,y){x^2 * y * rlnorm(n^2,0,0.2)}))
image(M)
range(M$z)
incl <- which(M$z>5e5)
polys <- matrix.poly(M$x, M$y, z=M$z, n=incl)
png("hatched_image.png", height=5, width=5, units="in", res=400)
op <- par(mar=c(3,3,1,1))
image(M)
for(i in seq(polys)){
polygon(polys[[i]], density=10, angle=45, border=NA, lwd=0.5)
polygon(polys[[i]], density=10, angle=-45, border=NA, lwd=0.5)
}
box()
par(op)
dev.off()
png("stippled_image.png", height=5, width=5, units="in", res=400)
op <- par(mar=c(3,3,1,1))
image(M)
grd <- expand.grid(x=x, y=y)
points(grd$x[incl], grd$y[incl], pch=".", cex=1.5)
box()
par(op)
dev.off()
Do it using the coordinate positioning mechanism of ?image [1].
data(volcano)
m <- volcano
dimx <- nrow(m)
dimy <- ncol(m)
d1 <- list(x = seq(0, 1, length = dimx), y = seq(0, 1, length = dimy), z = m)
With your 'image' constructed that way you keep the structure with the object, and its
coordinates intact. You can collect multiple matrices into a 3D array or as multiple
elements, but you need to augment image() in order to handle that, so I keep them
separate here.
Make a copy of the data to specify an interesting area.
d2 <- d1
d2$z <- d2$z > 155
Use the coordinates to specify which cells are interesting. This is expensive if you have a very big raster, but it's super easy to do.
pts <- expand.grid(x = d2$x, y = d2$y)
pts$over <- as.vector(d2$z)
Set up the plot.
op <- par(mfcol = c(2, 1))
image(d1)
image(d1)
points(pts$x[pts$over], pts$y[pts$over], cex = 0.7)
par(op)
Don't forget to modify the plotting of points to get different effects. In particular, a very dense grid with lots of points will take ages to draw all those little circles, so pch = "." is a good choice.
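One way to keep the number of drawn points down on a dense raster is to stipple only every k-th cell. The following is a rough sketch of that idea on the volcano data; the thinning factor k = 3 and the variable names keep and sel are choices made for this illustration, not part of the approach above.

```r
m <- volcano
x <- seq(0, 1, length.out = nrow(m))
y <- seq(0, 1, length.out = ncol(m))

mask <- m > 155   # cells considered interesting
k <- 3            # assumed thinning factor: keep every 3rd row and column

# TRUE only where both the row index and the column index are multiples of k
keep <- outer(seq_len(nrow(m)) %% k == 0, seq_len(ncol(m)) %% k == 0, "&")

# Indices of cells that are both interesting and on the thinned grid
sel <- which(mask & keep, arr.ind = TRUE)

image(x, y, m)
points(x[sel[, 1]], y[sel[, 2]], pch = ".", cex = 2)
```

The stipples stay on a regular lattice, so the pattern looks even while the point count drops by roughly a factor of k^2.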
Now, do you have some real data to plot on that nice projection? See examples here for some of the options: http://spatial-analyst.net/wiki/index.php?title=Global_datasets
[1] R has classes for more sophisticated handling of raster data, see package sp and raster
for two different approaches.
This is a solution in the spirit of @mdsummer's comment using ggplot2. I first draw the grid, and then draw +'es at the locations where a certain value has been exceeded. Note that ggplot2 works with data.frames, not with multi-dimensional arrays or matrices. You can use melt from the reshape package to convert from an array / matrix to a flat data.frame structure.
Here is a concrete example using the example data from the geom_tile documentation:
pp <- function (n,r=4) {
x <- seq(-r*pi, r*pi, len=n)
df <- expand.grid(x=x, y=x)
df$r <- sqrt(df$x^2 + df$y^2)
df$z <- cos(df$r^2)*exp(-df$r/6)
df
}
require(ggplot2)
dat = pp(200)
over = dat[,c("x","y")]
over$value = with(dat, ifelse(z > 0.5, 1, 0))
ggplot(aes(x = x, y = y), data = dat) +
geom_raster(aes(fill = z)) +
scale_fill_gradient2() +
geom_point(data = subset(over, value == 1), shape = "+", size = 1)
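The matrix-to-data.frame step mentioned at the start of this answer can be sketched as follows. This version uses base R (row() and col()) as a stand-in for melt, so it is not the reshape call itself; the matrix m and the data.frame flat are hypothetical names, and the ggplot2 part only runs if that package happens to be installed.

```r
# Hypothetical 12 x 6 matrix standing in for the question's `data`
set.seed(42)
m <- matrix(runif(12 * 6), nrow = 12)

# One row per cell: row index, column index and the cell value
flat <- data.frame(
  x = as.vector(row(m)),
  y = as.vector(col(m)),
  z = as.vector(m)
)
flat$over <- flat$z > 0.5   # cells to mark

# Plot only if ggplot2 is available (treated as optional in this sketch)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(flat, aes(x = x, y = y)) +
    geom_raster(aes(fill = z)) +
    geom_point(data = subset(flat, over), shape = "+", size = 3)
  print(p)
}
```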
This is probably coming too late, but I'd like to post my answer as a reference too.
One nice option for spatial data is to use the rasterVis package. Once you have a "base" raster object, and the "mask" object, which you will use to draw the stippling, you can do something like:
require(raster)
require(rasterVis)
# Scratch raster objects
data(volcano)
r1 <- raster(volcano)
# Here we are selecting only values from 160 to 180.
# This will be our "mask" layer.
over <- ifelse(volcano >=160 & volcano <=180, 1, NA)
r2 <- raster(over)
# And this is the key step:
# Converting the "mask" raster to spatial points
r.mask <- rasterToPoints(r2, spatial=TRUE)
# Plot
levelplot(r1, margin=FALSE) +
layer(sp.points(r.mask, pch=20, cex=0.3, alpha=0.8))
which resembles the map that the OP was looking for. Parameters of the points such as color, size and type can be fine tuned. ?sp.points provides all the arguments that can be used to do that.
Is it possible to add the key for, e.g., an sp.points layer, to the colorkey generated by levelplot?
Take the following example:
library(rasterVis)
library(latticeExtra)
library(sp)
r <- as.factor(raster(matrix(rbinom(100, 1, 0.5), 10)))
levels(r)[[1]] <- data.frame(ID=0:1, z=c('a', 'b'))
p <- SpatialPoints(matrix(runif(20), 10))
levelplot(r, margin=list(draw=FALSE), scales=list(draw=FALSE),
col.regions=c('white', 'gray90')) +
latticeExtra::layer(sp.points(p, pch=20, col=1))
I would like to add a key entry for the points, below the existing colorkey.
A kludgy solution is to add a key to the levelplot call as follows, adjusting the x and y values until it's in the desired location, but (1) finding the right x and y values is a pain, requiring interaction, (2) the right padding doesn't resize to accommodate the key, and (3) the font size is not automatically scaled to be consistent with the colorkey.
k <- list(x = 1.02, y = 0.4, corner = c(0, 0), points=list(pch=20, col=1),
text=list('foo', cex=0.9))
levelplot(r, margin=list(draw=FALSE), scales=list(draw=FALSE),
col.regions=c('white', 'gray90'), key=k) +
latticeExtra::layer(sp.points(p, pch=20, col=1))
Assuming I need to stick with lattice graphics, what's the best way to overcome the issues I listed above?
Although it does not solve all the issues you raised, maybe the latticeExtra::mergedTrellisLegendGrob function is useful for you:
p1 <- levelplot(r, scales=list(draw=FALSE),
col.regions=c('white', 'gray90'))
myPoints <- SpatialPoints(matrix(runif(20), 10))
p2 <- spplot(myPoints, pch=20)
## Merge graphics
p <- p1 + p2
## Merge legends
l1 <- p1$legend$right
l2 <- p2$legend$bottom
ll <- mergedTrellisLegendGrob(l1, l2)
p$legend$right$fun <- ll
p
I want to plot a 3D plot using R. My data set is independent, which means the values of x, y, and z are not dependent on each other. The plot I want is given in this picture:
This plot was drawn by someone using MATLAB. How can I do the same kind of plot using R?
Since you posted your image file, it appears you are not trying to make a 3d scatterplot, rather a 2d scatterplot with a continuous color scale to indicate the value of a third variable.
Option 1: For this approach I would use ggplot2
library(ggplot2)
# make data
mydata <- data.frame(x = rnorm(100, 10, 3),
y = rnorm(100, 5, 10),
z = rpois(100, 20))
ggplot(mydata, aes(x,y)) + geom_point(aes(color = z)) + theme_bw()
Which produces:
Option 2: To make a 3d scatterplot, use the cloud function from the lattice package.
library(lattice)
# make some data
x <- runif(20)
y <- rnorm(20)
z <- rpois(20, 5) / 5
cloud(z ~ x * y)
I usually do these kinds of plots with the base plotting functions and some helper functions for the color levels and color legend from the sinkr package (you need the devtools package to install it from GitHub).
Example:
#library(devtools)
#install_github("marchtaylor/sinkr")
library(sinkr)
# example data
grd <- expand.grid(
x=seq(nrow(volcano)),
y=seq(ncol(volcano))
)
grd$z <- c(volcano)
# plot
COL <- val2col(grd$z, col=jetPal(100))
op <- par(no.readonly = TRUE)
layout(matrix(1:2,1,2), widths=c(4,1), heights=4)
par(mar=c(4,4,1,1))
plot(grd$x, grd$y, col=COL, pch=20)
par(mar=c(4,1,1,4))
imageScale(grd$z, col=jetPal(100), axis.pos=4)
mtext("z", side=4, line=3)
par(op)
Result:
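If installing sinkr is not an option, the value-to-colour mapping can be approximated with base R alone. This is a sketch, not a reimplementation of val2col: the palette colours and the choice of 100 equal-width bins are assumptions made here, and no colour legend is drawn.

```r
# Same example data as above
grd <- expand.grid(
  x = seq_len(nrow(volcano)),
  y = seq_len(ncol(volcano))
)
grd$z <- c(volcano)

# 100-colour palette (colours chosen only for illustration)
pal <- colorRampPalette(c("blue", "cyan", "yellow", "red"))(100)

# Assign each z value to one of 100 equal-width bins (integer codes 1..100)
bin <- cut(grd$z, breaks = 100, labels = FALSE)

plot(grd$x, grd$y, col = pal[bin], pch = 20)
```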
I would like to draw the external outer envelope (contour) of a distribution graph which includes multiple density graphs and extract its value.
X <- c(1,2,1,4,3,1,2,8,9,0,5,4,2,2,5,5,7,8,8,9,5,6,5,6,3,4,5,3,4,5,4)
Y <- c(0,3,1,1,3,2,7,1,2,1,9,2,1,3,6,1,9,5,2,9,1,1,2,1,3,4,6,9,4,5,2)
Z <- c(1,4,9,5,7,8,2,8,9,0,5,4,2,2,5,5,7,8,8,9,5,6,5,6,9,9,9,2,6,7,1)
W <- c(1,8,9,7,8,9,12,3,11,21,5,4,8,8,2,3,2,1,2,3,4,5,6,3,1,9,2,1,8,4,1)
Q <- c(1,8,9,7,8,9,12,3,16,30,2,3,4,4,4,3,7,7,2,3,2,5,9,3,2,1,1,1,1,1,0)
n <- data.frame(X,Y,Z,W,Q)
plot((density(X)))
fun <- function(x)lines((density(n[[x]])))
t <- seq(1:length(n))
lapply(t,fun)
I have searched the web and found contour, but it cannot be applied to a distribution graph generated as above.
I suspect you are asking for what might be called the "outer envelope". My first effort had a couple of problems as you can see:
lines( x= density(X)$x,
y=apply( do.call( cbind,
lapply(n, function(x){ density(x)$y})), 1, max), col="red", lwd=3)
I think that also highlights a problem with your own attempt: the density estimates are not computed on a common grid, so their y values cannot be compared point by point.
So establish limits and re-do:
from=min(X,Y,Z,W,Q); to = max(X,Y,Z,W,Q)
png()
plot( Xd <- density(X, from=from, to=to))
fun <- function(x) lines(density(n[[x]],from=from, to = to ) )
t <- seq(1:length(n))
lapply(t,fun)
lines( x= density(X, from=from, to=to )$x,
y=apply( do.call( cbind, lapply(n,
function(x){ density(x, from=from, to = to)$y})), 1, max), col="red", lwd=3)
dev.off()
Here's an answer very similar to BondedDust's (and based on his from/to approach) that I think is easier to read and possibly faster if the dataset is very large, because it doesn't calculate the densities twice.
from <- min(n)
to <- max(n)
t <- seq(1:length(n))
ds <- lapply(t, function(i) density(n[[i]], from=from, to=to)) #Densities
maxd <- apply(sapply(ds, "[[", "y"), 1, max) #Max y of each x
plot(density(X), type="n", ylim=c(0, max(maxd)), xlim=c(0, 15))
for (i in t) lines(ds[[i]])
lines(seq(from, to, length.out = length(maxd)), maxd, col="red", lwd=3)
I manually set the plot xlim for better visualization.
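Since the question also asks to extract the envelope values, the same common-grid idea can return them as a data.frame. A minimal sketch, using made-up data rather than the vectors from the question; the names n, ds and env are just for this example.

```r
# Made-up data: three numeric columns with different centres
set.seed(123)
n <- data.frame(a = rnorm(50), b = rnorm(50, mean = 2), c = rnorm(50, mean = 4))

# Common evaluation range so all densities share the same x grid
rng <- range(unlist(n))
ds <- lapply(n, density, from = rng[1], to = rng[2])

# Pointwise maximum across all density curves gives the outer envelope
env <- data.frame(
  x = ds[[1]]$x,
  y = do.call(pmax, lapply(ds, `[[`, "y"))
)

plot(env, type = "n", xlab = "value", ylab = "density")
for (d in ds) lines(d)
lines(env, col = "red", lwd = 3)
head(env)
```

env then holds the envelope as plain x/y pairs, which can be written out or passed to approx() for lookups at other x values.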
I'm trying to use ggplot or base R to produce something like the following:
I know how to do histograms with ggplot2, and can easily separate them using facet_grid or facet_wrap. But I'd like to "stagger" them vertically, such that they have some overlap, as shown below. Sorry, I'm not allowed to post my own image, and it's quite difficult to find a simpler picture of what I want. If I could, I would only post the top-left panel.
I understand that this is not a particularly good way to display data -- but that decision does not rest with me.
A sample dataset would be as follows:
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) , cbind( rnorm(1e3)+2, 2), cbind( rnorm(1e3)+3, 3), cbind( rnorm(1e3)+4, 4)))
And I can plot it with geom_histogram as follows:
ggplot(my.data) + geom_histogram(aes(x=V1,fill=as.factor(V2))) + facet_grid( V2~.)
But I'd like the y-axes to overlap.
require(ggplot2)
require(plyr)
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) , cbind( rnorm(1e3)+2, 2), cbind( rnorm(1e3)+3, 3), cbind( rnorm(1e3)+4, 4)))
my.data$V2=as.factor(my.data$V2)
calculate the density depending on V2
res <- dlply(my.data, .(V2), function(x) density(x$V1))
dd <- ldply(res, function(z){
data.frame(Values = z[["x"]],
V1_density = z[["y"]],
V1_count = z[["y"]]*z[["n"]])
})
add an offset depending on V2
dd$offset=-as.numeric(dd$V2)*0.2 # adapt the 0.2 value as you need
dd$V1_density_offset=dd$V1_density+dd$offset
and plot
ggplot(dd, aes(Values, V1_density_offset, color=V2)) +
geom_line()+
geom_ribbon(aes(Values, ymin=offset,ymax=V1_density_offset, fill=V2),alpha=0.3)+
scale_y_continuous(breaks=NULL)
densityplot() from the Bioconductor flowViz package is one option for stacked densities.
from: http://www.bioconductor.org/packages/release/bioc/manuals/flowViz/man/flowViz.pdf :
For flowSets the idea is to horizontally stack plots of density estimates for all frames in the
flowSet for one or several flow parameters. In the latter case, each parameter will be plotted
in a separate panel, i.e., we implicitely condition on parameters.
you can see example visuals here:
http://www.bioconductor.org/packages/release/bioc/vignettes/flowViz/inst/doc/filters.html
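# biocLite() has been retired; current Bioconductor releases install via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("flowViz")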
Using the ggridges package:
library(ggplot2)
library(ggridges)
ggplot(my.data, aes(x = V1, y = factor(V2), fill = factor(V2), color = factor(V2))) +
geom_density_ridges(alpha = 0.5)
I think it's going to be difficult to get ggplot to offset the histograms like that. At least with faceting it makes new panels, and really, this transformation makes the y-axis meaningless. (The value is in the comparison from row to row). Here's one attempt at using base graphics to try to accomplish a similar thing.
#plotting function
plotoffsethists <- function(vals, groups, freq=F, overlap=.25, alpha=.75, colors=apply(floor(rbind(col2rgb(scales:::hue_pal(h = c(0, 360) + 15, c = 100, l = 65)(nlevels(groups))),alpha=alpha*255)),2,function(x) {paste0("#",paste(sprintf("%02X",x),collapse=""))}), ...) {
print(colors)
if (!is.factor(groups)) {
groups<-factor(groups)
}
offsethist <- function (x, col = NULL, offset=0, freq=F, ...) {
y <- if (freq) x$counts else x$density
nB <- length(x$breaks)
rect(x$breaks[-nB], 0+offset, x$breaks[-1L], y+offset, col = col, ...)
}
hh<-tapply(vals, groups, hist, plot=F)
ymax<-if(freq)
sapply(hh, function(x) max(x$counts))
else
sapply(hh, function(x) max(x$density))
offset<-(mean(ymax)*overlap) * (length(ymax)-1):0
ylim<-range(c(0,ymax+offset))
xlim<-range(sapply(hh, function(x) range(x$breaks)))
plot.new()
plot.window(xlim, ylim, "")
box()
axis(1)
Map(offsethist, hh, colors, offset, freq=freq, ...)
invisible(hh)
}
#sample call
par(mar=c(3,1,1,1)+.1)
plotoffsethists(my.data$V1, factor(my.data$V2), overlap=.25)
Complementing Axeman's answer, you can add the option stat="binline" to the geom_density_ridges geom. This results in a histogram-like plot instead of a density line.
library(ggplot2)
library(ggridges)
my.data <- as.data.frame(rbind( cbind( rnorm(1e3), 1) ,
cbind( rnorm(1e3)+2, 2),
cbind( rnorm(1e3)+3, 3),
cbind( rnorm(1e3)+4, 4)))
my.data$V2 <- as.factor(my.data$V2)
ggplot(my.data, aes(x=V1, y=factor(V2), fill=factor(V2))) +
geom_density_ridges(alpha=0.6, stat="binline", bins=30)
Resulting image:
I would like to create a waterfall plot in R (XYYY) from my data.
So far, I use this code:
load("myData.RData")
ls()
dim(data)
##matrix to xyz coords
library(reshape2)
newData <- melt(data, id="Group.1")
dim(newData)
head(newData)
tail(newData)
newDataO <- newData[c(2,1,3)]
head(newDataO)
##color scale for z axis
myColorRamp <- function(colors, values) {
v <- (values - min(values))/diff(range(values))
x <- colorRamp(colors)(v)
rgb(x[,1], x[,2], x[,3], maxColorValue = 255)
}
cols <- myColorRamp(c("darkblue","yellow","darkorange","red","darkred"),newDataO$value)
##3D scatter
library(rgl)
plot3d(newDataO$variable, newDataO$Group.1, newDataO$value, xlab="", ylab="", zlab="", type="p", col=cols, box=FALSE, axes=FALSE)
rgl.postscript("persptrial_060514.eps","eps")
to get this plot:
https://dl.dropboxusercontent.com/u/14906265/persptrial_060514.jpg
I have also tried this in 2D with polygon, but the result does not properly show the differential effect between both plots (left vs right).
I do not know whether something like persp3d could do the job but I am not familiar enough with writing code to achieve it. Any help will be very much appreciated.
It seems to me that the simplest way of doing a waterfall plot in R is to add all the lines manually in a loop.
library(rgl)
# Function to plot
f <- function(x, y) sin(10 * x * y) * cos(4 * y^3) + x
nx <- 30
ny <- 100
x <- seq(0, 1, length = nx)
y <- seq(0, 1, length = ny)
z <- outer(x, y, FUN = f)
# Plot function and add lines manually
surface3d(x, y, z, alpha = 0.4)
axes3d()
for (i in 1:nx) lines3d(x[i], y, z[i, ], col = 'white', lwd = 2)
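If rgl is not available, a similar waterfall effect can be approximated in base graphics by shifting each curve and drawing from back to front. This is a rough 2D sketch of that idea, not an equivalent of the rgl version: the shifts dx and dy are arbitrary values chosen here, and the white polygons simply hide whatever lies behind each curve.

```r
# Same test function and grid as above
f <- function(x, y) sin(10 * x * y) * cos(4 * y^3) + x
nx <- 30
ny <- 100
x <- seq(0, 1, length.out = nx)
y <- seq(0, 1, length.out = ny)
z <- outer(x, y, FUN = f)

dy <- 0.15                        # assumed vertical shift per curve
dx <- 0.01                        # assumed horizontal shift per curve
off_y <- (seq_len(nx) - 1) * dy
off_x <- (seq_len(nx) - 1) * dx

# Empty plot sized to hold all shifted curves
plot(NA, xlim = range(y) + c(0, max(off_x)), ylim = range(z + off_y),
     xlab = "y", ylab = "z (offset)")

# Draw the most shifted (rear) curve first so nearer curves cover it
for (i in nx:1) {
  yy <- z[i, ] + off_y[i]
  polygon(c(y[1], y, y[ny]) + off_x[i], c(min(z), yy, min(z)),
          col = "white", border = NA)
  lines(y + off_x[i], yy)
}
```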