Plotting 4D compositional data in a 3D tetrahedron in R

I would like to know how to connect the dots in the plot below.
I have four-variable compositional data, in which each row represents a sample, and each sample consists of varying proportions of four components (4 columns).
Reproducible example:
library(compositions); library(rgl)
TimeSeries <- cbind(runif(10),runif(10),runif(10),runif(10))
TimeSeries <- TimeSeries/rowSums(TimeSeries)
Acomp <- acomp(TimeSeries)
plot3D(Acomp_TS, cex=10, col="red", log=FALSE, coors=T, bbox=F, scale=F, center=F, axis.col=1, axes=TRUE)
Ideally, I'd like to connect the dots in the order that they appear in the data frame.
I guess this might be accomplished with something like lines3d or segments3d (library rgl), but I can't see how to extract the (x,y,z) coordinates from Acomp.

You don't have a variable named Acomp_TS. I guess you meant Acomp.
The best way to do this is to look at the source of plot3D.acomp, and do what it did. You might also want to suggest to the package maintainer that plot3D invisibly return the 3D coordinates it computes, to make tasks like this easier.
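To inspect that method's source (plot3D.acomp isn't exported, so a quick way is getAnywhere):
library(compositions)
getAnywhere("plot3D.acomp")   # prints the unexported S3 method's source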
But here's a hack that may work: after plotting the points, read their locations and use those as coordinates. For example,
library(compositions); library(rgl)
TimeSeries <- cbind(runif(10),runif(10),runif(10),runif(10))
TimeSeries <- TimeSeries/rowSums(TimeSeries)
Acomp <- acomp(TimeSeries)
plot3D(Acomp, cex=10, col="red", log=FALSE, coors=T, bbox=F, scale=F, center=F, axis.col=1, axes=TRUE)
ids <- rgl.ids()                      # list the objects in the current scene
pts <- ids$id[ids$type == "points"]   # id of the points just drawn
lines3d(rgl.attrib(pts, "vertices"))  # read back their coordinates and join them
This produced a plot with the points joined in the order they appear in the data.
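Alternatively, since a 4-part composition is just a set of barycentric coordinates in a tetrahedron, you can compute the 3D positions yourself and skip reading the scene back. A minimal sketch; the vertex placement below is one arbitrary choice and may not match the projection plot3D.acomp uses:
# Vertices of a regular tetrahedron (an arbitrary placement)
V <- rbind(c(0, 0, 0),
           c(1, 0, 0),
           c(0.5, sqrt(3)/2, 0),
           c(0.5, sqrt(3)/6, sqrt(6)/3))
# Each row of TimeSeries sums to 1, so this is a barycentric combination
xyz <- TimeSeries %*% V
lines3d(xyz)   # joins the samples in data-frame order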


3D interactive surface plot with spatial data

I would like to create an interactive 3D surface plot of depths in a lake, ideally using the plotly or rgl libraries. I have extracted my data from a SpatialLinesDataFrame of contour lines in Gauss-Krueger/EPSG:31468 CRS, i.e. metric units. Now each contour line produces a set of coordinates with the same depth value. The resulting data frame is rather large, but looks something like this:
set.seed(41)
xx <- rnorm(100,4448929,100)
yy <- rnorm(100,5308097,100)
zz <- c(rep(-10,10),rep(-20,10),rep(-30,10),rep(-40,10),rep(-50,10),rep(-60,10),rep(-70,10),rep(-80,10),rep(-90,10),rep(-100,10))
df <- data.frame(xx,yy,zz)
I have tried plotting the data with plotly as in this example and with rgl as in this post. In both cases I get error messages relating to my data not being in a matrix format, i.e. where x- and y-values are represented as row- and column-numbers.
What does work is using the add_trace command in plotly:
plot_ly() %>% add_trace(data = df, x = ~xx, y = ~yy, z = ~zz, type = "mesh3d")
However, the resulting graph not only lacks the fancy colour legend of the add_surface command, but more importantly, warps the x- and y-values in relation to the z-values. The z-values are shown much too large, although all have the same metric unit.
I have also tried reshaping the data frame to a matrix as in this post, but it either doesn't work at all, or gives me a matrix consisting almost entirely of NAs. I can only speculate that the number of coordinates that have depth values attached is very small in comparison to all x-y-combinations of coordinates in that range?
Any suggestions will be much appreciated - thanks!
Those are randomly located points, so rgl::persp3d can't handle them directly. However, you can follow the example in ?rgl::persp3d.deldir to triangulate them and then plot. For example,
library(rgl)
dxyz <- deldir::deldir(df$xx, df$yy, z = df$zz, suppressMsgs = TRUE)
persp3d(dxyz, col = "lightblue")
This results in a pretty ugly picture, but with some work (e.g. fixing the axis labels, using real data) you should get something reasonable.
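Part of that work is probably the aspect ratio: rgl scales the bounding box to a cube by default, so the 100 m depth range gets stretched to match the much larger x/y extents, which is likely the same distortion you saw with plotly. A one-line sketch, assuming the persp3d plot above is still open:
aspect3d("iso")   # use the data's own metric ratios on all three axes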

Stacked single row heatmaps in R

I have 10 matrices, where each one represents the eigenvalues of the correlation matrices of wavelet coefficients of a number of time series.
I would like to generate a heatmap of my data to see if there are any overlapping significant events across the different scales.
So far I have hacked together the following image using the graphics package, as described here by @josliber.
I am still getting to grips with R, so excuse the nasty code, but it works for me for now. I have yet to mess around with the labels and formatting, but it's a quick and dirty representation.
#plotting the eigenvalues for each of the wavelet coefficient scales
w1mat <- matrix(w1eigen[1,])
w2mat <- matrix(w2eigen[1,])
w3mat <- matrix(w3eigen[1,])
w4mat <- matrix(w4eigen[1,])
w5mat <- matrix(w5eigen[1,])
w6mat <- matrix(w6eigen[1,])
w7mat <- matrix(w7eigen[1,])
w8mat <- matrix(w8eigen[1,])
w9mat <- matrix(w9eigen[1,])
w10mat <- matrix(w10eigen[1,])
#plots the eigenvalues for each of the scales
par(mfrow=c(10,1))
imageW1 <- image(w1mat)
imageW2 <- image(w2mat)
imageW3 <- image(w3mat)
imageW4 <- image(w4mat)
imageW5 <- image(w5mat)
imageW6 <- image(w6mat)
imageW7 <- image(w7mat)
imageW8 <- image(w8mat)
imageW9 <- image(w9mat)
imageW10 <- image(w10mat)
As you can see, I used the image function in the graphics package to create this, and I am sure I can achieve the same in ggplot2 with greater control.
Ultimately I wish to create a plot similar to the one below, where there is a common y axis and the rows sit right on top of one another with no white space.
What I would like to know is whether using the stacked image function in graphics is the best approach, or whether there is a better way to visualise the data in an alternative package.
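For what it's worth, here is a minimal sketch of the ggplot2 route: stack the ten rows into one long data frame and draw them with geom_tile, which gives a common axis and no white space between rows. Fake data stands in for w1eigen[1,] ... w10eigen[1,], which aren't reproduced here:
library(ggplot2)
set.seed(1)
eig <- lapply(1:10, function(i) runif(50))   # stand-ins for the ten eigenvalue rows
long <- data.frame(scale = factor(rep(1:10, each = 50)),
                   idx   = rep(1:50, times = 10),
                   value = unlist(eig))
ggplot(long, aes(idx, scale, fill = value)) +
  geom_tile() +                    # one heatmap row per scale, no gaps
  scale_y_discrete(limits = rev)   # put scale 1 at the top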

How can I recreate this 2d surface + contour + glyph plot in R?

I've run a 2d simulation in some modelling software, from which I've got an export of x,y point locations with a set of 6 attributes. I wish to recreate a figure that combines the data, like this:
The ellipses and the background are shaded according to attribute 1 (and the borders of these are of course representing the model geometry, but I don't think I can replicate that), the isolines are contours of attribute 2, and the arrow glyphs are from attributes 3 (x magnitude) and 4 (y magnitude).
The x,y points are centres of the triangulated mesh I think, and look like this:
I want to know how I can recreate a plot like this with R. To start with, I have irregularly-spaced data due to it being exported from an irregular mesh. That's immediately where I get stuck with R, having only ever used it for producing box-and-whisker plots and the like.
Here's the data:
https://dl.dropbox.com/u/22417033/Ellipses_noheader.txt
Edit: fields: x, y, heat flux (x), heat flux (y), thermal conductivity, Temperature, gradT (x), gradT (y).
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
It's quite easy to make the lower plot (assuming a data frame named 'edat' has been read in with):
edat <- read.table(file=file.choose())
with(edat, plot(V1, V2, cex=0.2))
Things get a bit more beautiful with:
with(edat, plot(V1,V2, cex=0.2, col=V5))
So I do not think your original figure is faithfully represented by the data. The contour lines are NOT straight across the "conductors". I call them "conductors" because this looks somewhat like iso-potential lines in electrostatics. I'm adding some text here to serve as a search handle for others who might be searching for plotting problems in real-world physics: vector field (the arrows), heat equations, gradient, potential lines.
You can then overlay the vector field with:
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
You could"zoom in" with xlim and ylim:
with(edat, plot(V1,V2, cex=0.3, col=V5, xlim=c(0, 10000), ylim=c(-8000, -2000) ))
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
Guessing that the contour requested is for the Temps variable. Take your pick of contour plots.
require(akima)
intflow<- with(edat, interp(x=x, y=y, z=Temps, xo=seq(min(x), max(x), length = 410),
yo=seq(min(y), max(y), length = 410), duplicate="mean", linear=FALSE) )
require(lattice)
contourplot(intflow$z)
filled.contour(intflow)
with( intflow, contour(x=x, y=y, z=z) )
The last one will mix with the other plotting examples since those were using base plotting functions. You may need to switch to points instead of plot.
There are several parts to your plot so you will probably need several tools to make the different parts.
The background and ellipses can be created with polygon (once you figure where they should be).
The contourLines function can calculate the contour lines for you, which you can add with the lines function (or contour has an add argument and could probably be used to add the lines directly); see the sketch after this list.
The akima package has a function interp which can estimate values on a grid given the values ungridded.
The my.symbols function along with ms.arrows, both from the TeachingDemos package, can be used to draw the vector field.
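A minimal sketch of the contourLines route, assuming a grid like the intflow object interpolated in the earlier answer:
cl <- contourLines(intflow$x, intflow$y, intflow$z)   # list of contour segments
plot(range(intflow$x), range(intflow$y), type = "n", xlab = "x", ylab = "y")
for (l in cl) lines(l$x, l$y, col = "blue")           # add each segment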
@DWin is right to say that your graph doesn't faithfully represent your data, so I would advise following his answer. However, here is how to reproduce your graph (the closest I could get):
Ellipses <- read.table(file.choose())
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
require(splancs)
require(akima)
First preparing the data:
#First the background layer (the 'kxx' layer):
# Here the regular grid on which we're gonna do the interpolation
E.grid <- with(Ellipses,
expand.grid(seq(min(x),max(x),length=200),
seq(min(y),max(y),length=200)))
names(E.grid) <- c("x","y") # Without this step, function inout throws an error
E.grid$Value <- rep(0,nrow(E.grid))
#Split the dataset according to unique values of kxx
E.k <- split(Ellipses,Ellipses$kxx)
# Find the convex hull delimiting each of those values domain
E.k.ch <- lapply(E.k,function(X){X[chull(X$x,X$y),]})
for(i in unique(Ellipses$kxx)){ # Pick the value for each coordinate in our regular grid
E.grid$Value[inout(E.grid[,1:2],E.k.ch[names(E.k.ch)==i][[1]],bound=TRUE)]<-i
}
# Then the regular grid for the second layer (Temp)
T.grid <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=200),
yo=seq(min(y),max(y),length=200),
duplicate="mean", linear=FALSE))
# The regular grids for the arrow layer (gradT)
dx <- with(Ellipses,
interp(x,y,gradTx,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
dy <- with(Ellipses,
interp(x,y,gradTy,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
T.grid2 <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
gradTgrid<-expand.grid(dx$x,dx$y)
And then the plotting:
palette(grey(seq(0.5,0.9,length=5)))
par(mar=rep(0,4))
plot(E.grid$x, E.grid$y, col=E.grid$Value,
axes=F, xaxs="i", yaxs="i", pch=19)
contour(T.grid, add=TRUE, col=colorRampPalette(c("blue","red"))(15), drawlabels=FALSE)
arrows(gradTgrid[,1], gradTgrid[,2], # Here I multiply the values so you can see them
gradTgrid[,1]-dx$z*40*T.grid2$z, gradTgrid[,2]-dy$z*40*T.grid2$z,
col="yellow", length=0.05)
To understand in detail how this code works, I advise you to read the following help pages: ?inout, ?chull, ?interp, ?expand.grid and ?contour.

Plotting predefined density functions using ggplot and R

I have three data sets of different lengths and I would like to plot density functions of all three on the same plot. This is straightforward with base graphics:
n <- c(rnorm(10000), rnorm(10000))
a <- c(rnorm(10001), rnorm(10001, 0, 2))
p <- c(rnorm(10002), rnorm(10002, 2, .5))
plot(density(n))
lines(density(a))
lines(density(p))
Which gives me something like this:
(density plot image: http://www.cerebralmastication.com/wp-content/uploads/2009/10/density.png)
But I really want to do this with ggplot2 because I want to add other features that are only available in ggplot2. It seems that ggplot2 really wants to take my empirical data and calculate the density for me, and it gives me a bunch of lip because my data sets are of different lengths. So how do I get these three densities to plot in ggplot2?
The secret to happiness in ggplot2 is to put everything in the "long" (or what I guess matrix-oriented people would call "sparse") format:
df <- rbind(data.frame(x="n",value=n),
data.frame(x="a",value=a),
data.frame(x="p",value=p))
qplot(value, colour=x, data=df, geom="density")
If you don't want colors:
qplot(value, group=x, data=df, geom="density")
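For reference, the same plot in ggplot() syntax rather than qplot():
library(ggplot2)
ggplot(df, aes(value, colour = x)) + geom_density()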

Maximum plot points in R?

I have come across a number of situations where I want to plot more points than I really ought to be -- the main holdup is that when I share my plots with people or embed them in papers, they occupy too much space. It's very straightforward to randomly sample rows in a dataframe.
If I want a truly random sample for a point plot, it's easy to say:
qplot(x, y, data = myDf[sample(1:nrow(myDf), 1000), ])
However, I was wondering if there were more effective (ideally canned) ways to specify the number of plot points such that your actual data is accurately reflected in the plot. So here is an example.
Suppose I am plotting something like the CCDF of a heavy tailed distribution, e.g.
ccdf <- function(myList, density=FALSE)
{
  # generates the CCDF of a list or vector
  freqs <- table(myList)
  X <- rev(as.numeric(names(freqs)))
  Y <- cumsum(rev(as.numeric(freqs)))
  data.frame(x=X, count=Y)
}
qplot(x,count,data=ccdf(rlnorm(10000,3,2.4)),log='xy')
This will produce a plot where the x & y axis become increasingly dense. Here it would be ideal to have fewer samples plotted for large x or y values.
Does anybody have any tips or suggestions for dealing with similar issues?
Thanks,
-e
I tend to use png files rather than vector based graphics such as pdf or eps for this situation. The files are much smaller, although you lose resolution.
If it's a more conventional scatterplot, then using semi-transparent colours also helps, as well as solving the over-plotting problem. For example,
library(ggplot2)
x <- rnorm(10000); y <- rnorm(10000)
qplot(x, y, colour = I(alpha("blue", 1/25)))
Beyond Rob's suggestions, one plot function I like as it does the 'thinning' for you is hexbin; an example is at the R Graph Gallery.
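A minimal hexbin sketch (the hexbin package is on CRAN; ggplot2's geom_hex is similar):
library(hexbin)
x <- rnorm(10000); y <- rnorm(10000)
plot(hexbin(x, y, xbins = 40))   # hexagonal binning replaces dense point clouds with counts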
Here is one possible solution for downsampling a plot with respect to the x-axis, if it is log transformed: it log-transforms the x values, rounds them, and picks the row with the median x value in each bin:
downsampled_qplot <- function(x, y, data, rounding=0, ...) {
  # assumes we are doing log='xy' or log='x'
  # bin rows by rounded log(x), keep the median-x row from each bin
  group <- factor(round(log(data$x), rounding))
  d <- do.call(rbind, by(data, group,
       function(X) X[order(X$x)[ceiling(nrow(X)/2)], ]))
  qplot(x, count, data=d, ...)
}
Using the definition of ccdf() from above, we can then compare the original plot of the CCDF of the distribution with the downsampled version:
myccdf=ccdf(rlnorm(10000,3,2.4))
qplot(x,count,data=myccdf,log='xy',main='original')
downsampled_qplot(x,count,data=myccdf,log='xy',rounding=1,main='rounding = 1')
downsampled_qplot(x,count,data=myccdf,log='xy',rounding=0,main='rounding = 0')
In PDF format, the original plot takes up 640K, and the downsampled versions occupy 20K and 8K, respectively.
I'd either make image files (png or jpeg devices) as Rob already mentioned, or I'd make a 2D histogram. An alternative to the 2D histogram is a smoothed scatterplot; it makes a similar graphic but has a smoother cutoff from dense to sparse regions of space.
If you've never seen addictedtor before, it's worth a look. It has some very nice graphics generated in R with images and sample code.
Here's the sample code from the addictedtor site:
2-d histogram:
require(gplots)
# example data, bivariate normal, no correlation
x <- rnorm(2000, sd=4)
y <- rnorm(2000, sd=1)
# separate scales for each axis, this looks circular
hist2d(x,y, nbins=50, col = c("white",heat.colors(16)))
rug(x,side=1)
rug(y,side=2)
box()
smoothscatter:
library("geneplotter") ## from BioConductor
require("RColorBrewer") ## from CRAN
x1 <- matrix(rnorm(1e4), ncol=2)
x2 <- matrix(rnorm(1e4, mean=3, sd=1.5), ncol=2)
x <- rbind(x1,x2)
layout(matrix(1:4, ncol=2, byrow=TRUE))
op <- par(mar=rep(2,4))
smoothScatter(x, nrpoints=0)
smoothScatter(x)
smoothScatter(x, nrpoints=Inf,
colramp=colorRampPalette(brewer.pal(9,"YlOrRd")),
bandwidth=40)
colors <- densCols(x)
plot(x, col=colors, pch=20)
par(op)
