Related
I have a perspective plot of a locfit model and I wish to add two things to it
Predictor variables as points in the 3D space
Color the surface according to the Z axis value
For the first, I have tried to use the trans3d function. But I get the following error even though my variables are in vector format:
Error in cbind(x, y, z, 1) %*% pmat : requires numeric/complex matrix/vector arguments
Here is a snippet of my code
library(locfit)
X <- as.matrix(loc1[,1:2])
Y <- as.matrix(loc1[,3])
zz <- locfit(Y~X,kern="bisq")
pmat <- plot(zz,type="persp",zlab="Amount",xlab="",ylab="",main="Plains",
phi = 30, theta = 30, ticktype="detailed")
x1 <- as.vector(X[,1])
x2 <- as.vector(X[,2])
Y <- as.vector(Y)
points(trans3d(x1,x2,Y,pmat))
My "loc1" data can be found here - https://www.dropbox.com/s/0kdpd5hxsywnvu2/loc1_amountfreq.txt?dl=0
TL,DR: not really in plot.locfit, but you can reconstruct it.
I don't think plot.locfit has good support for this sort of customisation. Supposedly get.data=T in your plot call will plot the original data points (point 1), and it does seem to do so, except if type="persp". So no luck there. Alternatively you can points(trans3d(...)) as you have done, except you need the perspective matrix returned by persp, and plot.locfit.3d does not return it. So again, no luck.
For colouring, typically you make a colour scale (http://r.789695.n4.nabble.com/colour-by-z-value-persp-in-raster-package-td4428254.html) and assign each z facet the colour that goes with it. However, you need the z-values of the surface (not the z-values of your original data) for this, and plot.locfit does not appear to return this either.
So to do what you want, you'll essentially be recoding plot.locfit yourself (not hard, though just cludgy).
You could put this into a function so you can reuse it.
We:
make a uniform grid of x-y points
calculate the value of the fit at each point
use these to draw the surface (with a colour scale), saving the perspective matrix so that we can
plot your original data
so:
# make a grid of x and y coords, calculate the fit at those points
n.x <- 20 # number of x points in the x-y grid
n.y <- 30 # number of y points in the x-y grid
zz <- locfit(Total ~ Mex_Freq + Cal_Freq, data=loc1, kern="bisq")
xs <- with(loc1, seq(min(Mex_Freq), max(Mex_Freq), length.out=20))
ys <- with(loc1, seq(min(Cal_Freq), max(Cal_Freq), length.out=30))
xys <- expand.grid(Mex_Freq=xs, Cal_Freq=ys)
zs <- matrix(predict(zz, xys), nrow=length(xs))
# generate a colour scale
n.cols <- 100 # number of colours
palette <- colorRampPalette(c('blue', 'green'))(n.cols) # from blue to green
# palette <- colorRampPalette(c(rgb(0,0,1,.8), rgb(0,1,0,.8)), alpha=T)(n.cols) # if you want transparency for example
# work out which colour each z-value should be in by splitting it
# up into n.cols bins
facetcol <- cut(zs, n.cols)
# draw surface, with colours (col=...)
pmat <- persp(x=xs, y=ys, zs, theta=30, phi=30, ticktype='detailed', main="plains", xlab="", ylab="", zlab="Amount", col=palette[facetcol])
# draw your original data
with(loc1, points(trans3d(Mex_Freq,Cal_Freq,Total,pmat), pch=20))
Note - doesn't look that pretty! might want to adjust say your colour scale colours, or the transparency of the facets, etc. Re: adding legend, there are some other questions that deal with that.
(PS: what a shame ggplot doesn't do 3D scatter plots.)
I am trying to construct a graph consisting of 2-3 filled.contour plots next to each other. The color scale is the same for all plots, and I would like only one z value key plot. I am having difficulties to do this with par(mfrow=c(1,3))
Example code:
x <- 1:5
y <- 1:5
z <- matrix(outer(x,y,"+"),nrow=5)
filled.contour(x,y,z)
filled.contour(x,y,z,color.palette=rainbow)
z2 <- z
z2[5,5] <- Inf
filled.contour(x,y,z2,col=rainbow(200),nlevels=200)
Is it possible to stack 2-3 of these plots next to each other with only one z value color key? I can do this in GIMP but I was wondering if it is natively in R possible.
No I do not think this is possible in filled.contour.
Although extensions have been written for you already. To be found here, here and here and a legend code here.
[If you are using the filled.contour3 function referred to on those sites, and using a more recent version, then you need to use the upgrade fix referred to in this SO post].
Using those codes I produced:
It can be solved with cowplot.
The clue is the cowplot::draw_plot function which successfully represent the plot (cowplot::draw_grob(~filled.contour(x,y,z))).
Later the grob representations can be binded.
The possible problem is that each plot will have own legend.
x <- 1:5
y <- 1:5
z <- matrix(outer(x,y,"+"),nrow=5)
z2 <- z
z2[5,5] <- Inf
cowplot::plot_grid(~filled.contour(x,y,z), ~filled.contour(x,y,z,color.palette=rainbow), ~filled.contour(x,y,z2,col=rainbow(200),nlevels=200))
Created on 2023-02-14 with reprex v2.0.2
I'd like to make a true heat map in R, much like a weather map, except my data is much more simple.
Consider this 3d data:
x <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
y <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
z <- rnorm(20)
The z would be color.
Here is what a discrete looking heatmap would like for this data:
How can I make a heatmap such that the colors are smooth and the full 2d space is filled with smoothed out colors based on the z values.
Please include sample code, not just a link that will probably confuse me even more, and I've probably already visited that site anyhow. Thanks :)
Use the following:
interp in the akima package
image.plot in the fields package
x <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
y <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
z <- rnorm(20)
library(fields)
library(akima)
s <- interp(x,y,z)
image.plot(s)
smooth.2d in the fields package does a good job (and it is much faster than interp from akima package for larger number of input points.
library(fields)
x <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
y <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
z <- rnorm(20)
s = smooth.2d(z, x=cbind(x,y), theta=0.5)
image.plot(s)
I have this problem. I got a heatmap, (but i suppose this applies to every plot) but I need to mirror my y-axis.
I got here some example code:
library(gstat)
x <- seq(1,50,length=50)
y <- seq(1,50,length=50)
z <- rnorm(1000)
df <- data.frame(x=x,y=y,z=z)
image(df,col=heat.colors(256))
This will generate the following heatmap
But I need the y-axis mirrored. Starting with 0 on the top and 50 on the bottom. Does anybody has a clue as to what I must do to change this?
See the help page for ?plot.default, which specifies
xlim: the x limits (x1, x2) of the plot. Note that ‘x1 > x2’ is
allowed and leads to a ‘reversed axis’.
library(gstat)
x <- seq(1,50,length=50)
y <- seq(1,50,length=50)
z <- rnorm(1000)
df <- data.frame(x=x,y=y,z=z)
So
image(df,col=heat.colors(256), ylim = rev(range(y)))
Does this work for you (it's a bit of a hack, though)?
df2<-df
df2$y<-50-df2$y #reverse oredr
image(df2,col=heat.colors(256),yaxt="n") #avoid y axis
axis(2, at=c(0,10,20,30,40,50), labels=c(50,40,30,20,10,0)) #draw y axis manually
The revaxis function in the plotrix package "reverses the sense of either or both the ‘x’ and ‘y’ axes". It doesn't solve your problem (Nick's solution is the correct one) but can be useful when you need to plot a scatterplot with reversed axes.
I would use rev like so:
df <- data.frame(x=x,y=rev(y),z=z)
In case you were not aware, notice that df is actually a function. You might want to be careful when overwriting. If you rm(df), things will go back to normal.
Don't forget to relabel the y axis as Nick suggests.
For the vertical axis increasing in the downward direction, I provided two ways (two different answers) for the following question:
R - image of a pixel matrix?
I have come across a number of situations where I want to plot more points than I really ought to be -- the main holdup is that when I share my plots with people or embed them in papers, they occupy too much space. It's very straightforward to randomly sample rows in a dataframe.
if I want a truly random sample for a point plot, it's easy to say:
ggplot(x,y,data=myDf[sample(1:nrow(myDf),1000),])
However, I was wondering if there were more effective (ideally canned) ways to specify the number of plot points such that your actual data is accurately reflected in the plot. So here is an example.
Suppose I am plotting something like the CCDF of a heavy tailed distribution, e.g.
ccdf <- function(myList,density=FALSE)
{
# generates the CCDF of a list or vector
freqs = table(myList)
X = rev(as.numeric(names(freqs)))
Y =cumsum(rev(as.list(freqs)));
data.frame(x=X,count=Y)
}
qplot(x,count,data=ccdf(rlnorm(10000,3,2.4)),log='xy')
This will produce a plot where the x & y axis become increasingly dense. Here it would be ideal to have fewer samples plotted for large x or y values.
Does anybody have any tips or suggestions for dealing with similar issues?
Thanks,
-e
I tend to use png files rather than vector based graphics such as pdf or eps for this situation. The files are much smaller, although you lose resolution.
If it's a more conventional scatterplot, then using semi-transparent colours also helps, as well as solving the over-plotting problem. For example,
x <- rnorm(10000); y <- rnorm(10000)
qplot(x, y, colour=I(alpha("blue",1/25)))
Beyond Rob's suggestions, one plot function I like as it does the 'thinning' for you is hexbin; an example is at the R Graph Gallery.
Here is one possible solution for downsampling plot with respect to the x-axis, if it is log transformed. It log transforms the x-axis, rounds that quantity, and picks the median x value in that bin:
downsampled_qplot <- function(x,y,data,rounding=0, ...) {
# assumes we are doing log=xy or log=x
group = factor(round(log(data$x),rounding))
d <- do.call(rbind, by(data, group,
function(X) X[order(X$x)[floor(length(X)/2)],]))
qplot(x,count,data=d, ...)
}
Using the definition of ccdf() from above, we can then compare the original plot of the CCDF of the distribution with the downsampled version:
myccdf=ccdf(rlnorm(10000,3,2.4))
qplot(x,count,data=myccdf,log='xy',main='original')
downsampled_qplot(x,count,data=myccdf,log='xy',rounding=1,main='rounding = 1')
downsampled_qplot(x,count,data=myccdf,log='xy',rounding=0,main='rounding = 0')
In PDF format, the original plot takes up 640K, and the downsampled versions occupy 20K and 8K, respectively.
I'd either make image files (png or jpeg devices) as Rob already mentioned, or I'd make a 2D histogram. An alternative to the 2D histogram is a smoothed scatterplot, it makes a similar graphic but has a more smooth cutoff from dense to sparse regions of space.
If you've never seen addictedtor before, it's worth a look. It has some very nice graphics generated in R with images and sample code.
Here's the sample code from the addictedtor site:
2-d histogram:
require(gplots)
# example data, bivariate normal, no correlation
x <- rnorm(2000, sd=4)
y <- rnorm(2000, sd=1)
# separate scales for each axis, this looks circular
hist2d(x,y, nbins=50, col = c("white",heat.colors(16)))
rug(x,side=1)
rug(y,side=2)
box()
smoothscatter:
library("geneplotter") ## from BioConductor
require("RColorBrewer") ## from CRAN
x1 <- matrix(rnorm(1e4), ncol=2)
x2 <- matrix(rnorm(1e4, mean=3, sd=1.5), ncol=2)
x <- rbind(x1,x2)
layout(matrix(1:4, ncol=2, byrow=TRUE))
op <- par(mar=rep(2,4))
smoothScatter(x, nrpoints=0)
smoothScatter(x)
smoothScatter(x, nrpoints=Inf,
colramp=colorRampPalette(brewer.pal(9,"YlOrRd")),
bandwidth=40)
colors <- densCols(x)
plot(x, col=colors, pch=20)
par(op)