Binning data for use in matrix and image() or heatmap() - r

I have a data frame with three columns and I'd like to make an image/heatmap of the data.
The three columns are pe, vix, and ret, with pe and vix being x and y and ret being z.
There are 220 rows in the data frame, so I'd like to bin the data if possible; the ranges are below.
Any suggestions for how to bin the x and y data and also create a matrix for use in image()?
> range(matr$pe)
[1] 13.32 44.20
> range(matr$vix)
[1] 10.42 59.89
> range(matr$ret)
[1] -0.09274936 0.04693118
> class(matr)
[1] "data.frame"
> head(matr)
pe vix ret
1 20.86 13.16 -0.002931561
2 20.46 12.53 -0.003546889
3 20.52 12.42 0.006339165
4 20.61 13.47 0.009683174
5 20.57 11.26 -0.002666668
6 20.81 11.73 0.002895003

Here's what I ended up doing. I used the interp() function in the akima package to create the appropriately binned matrix object; it does the work of both binning and 'matricizing' the data frame. On a side note, in order to make the heatmap WITH a legend, I ended up using the image.plot() function from the fields package. Here's the code:
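library(akima)   # provides interp()
library(fields)  # provides image.plot()
# s is the interp() output described above; the original call was not shown, so this one is assumed
s <- interp(matr$pe, matr$vix, matr$ret)
par(bg = 3)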
image.plot(s,xlab="P/E Ratio", ylab="VIX",
main="Contour Map of SPY Returns vs P/E Ratio and Vix")
abline(v=(seq(0,100,5)), col=6, lty="dotted")
abline(h=(seq(0,100,5)), col=6, lty="dotted")
contour(s, add=TRUE)
and resulting product for anyone interested:
Thanks to everyone for their help and suggestions.

You could use e.g. cut() like this:
matr$binnedpe<-cut(matr$pe, breaks=10)
matr$binnedvix<-cut(matr$vix, breaks=10)
Next you can use e.g. ddply (from package plyr) to get the mean return per bin:
library(plyr)
binneddata <- ddply(matr, .(binnedpe, binnedvix), summarise, meanret = mean(ret))
Finally, you use this last data.frame to draw your heat map. I haven't tested any of the above, but it should be close enough to get you going.
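For the matrix that image() expects, the same cut-and-average idea can be sketched in base R alone. This is only an illustration: the matr below is simulated stand-in data using the ranges from the question, and the choice of 10 bins per axis is an assumption.

```r
# Simulated stand-in for the question's `matr` (same columns and ranges; not the real data)
set.seed(42)
matr <- data.frame(
  pe  = runif(220, 13.32, 44.20),
  vix = runif(220, 10.42, 59.89),
  ret = rnorm(220, 0, 0.01)
)

# Bin both axes into 10 equal-width intervals (10 is an arbitrary choice)
pe_bin  <- cut(matr$pe,  breaks = 10)
vix_bin <- cut(matr$vix, breaks = 10)

# Mean return per (pe, vix) cell; cells with no observations come out as NA
zmat <- tapply(matr$ret, list(pe_bin, vix_bin), mean)

# image() needs increasing x and y coordinates, so the bin indices are used here
image(x = 1:10, y = 1:10, z = zmat, xlab = "P/E bin", ylab = "VIX bin")
```

Empty cells are left blank by image(), which makes sparse regions of the pe/vix plane easy to spot.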

You should take a spin through the raster package. In particular, the function rasterFromXYZ() should do most of what you want. It's pretty easy, with either the base graphics tools or the raster package, to set up a 'heatmap' color range for the raster object.

Related

finding different y values along curve in r

So I have plotted a curve, and have had a look in both my book and on Stack Overflow, but cannot seem to find any code to make R tell me the value of y on the curve at x = 70.
curve(
20*1.05^x,
from=0, to=140,
xlab='Time passed since 1890',
ylab='Population of Salmon',
main='Growth of Salmon since 1890'
)
So in short, I would like to know how to command R to give me the number of salmon at 70 years, and at other times.
Edit:
To clarify, I was curious how to make R show multiple y values for x at increments of 5.
salmon <- data.frame(curve(
20*1.05^x,
from=0, to=140,
xlab='Time passed since 1890',
ylab='Population of Salmon',
main='Growth of Salmon since 1890'
))
salmon$y[salmon$x==70]
[1] 608.5285
This salmon data.frame gives you all of the data.
head(salmon)
x y
1 0.0 20.00000
2 1.4 21.41386
3 2.8 22.92768
4 4.2 24.54851
5 5.6 26.28392
6 7.0 28.14201
You can also use inequalities to check the number of salmon in given ranges using the syntax above.
It's also simple to answer the 2nd part of your question using this object:
salmon$z <- salmon$y*5 # I am using * instead of + to make the plot more clear
plot(x=salmon$x,y=salmon$z, xlab='Time passed since 1890', ylab='Population of Salmon',type="l")
lines(salmon$x,salmon$y, col="blue")
curve is plotting the function 20*1.05^x
so just plug any value you want in that function instead of x, e.g.
> 20*1.05^70
[1] 608.5285
>
20*1.05^(seq(from=0, to=70, by=10))
That was all I had to do. I had forgotten, until Ed posted his reply, that I could type an expression directly into R.
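The direct-evaluation idea can also be wrapped in a function so the same formula serves both the plot and the lookups. A minimal sketch; salmon_pop is a made-up name, and the step of 5 comes from the edit to the question.

```r
# Hypothetical helper name; the formula is the one passed to curve() in the question
salmon_pop <- function(x) 20 * 1.05^x

# Population at year 70
salmon_pop(70)

# Population every 5 years from 0 to 140, collected in a small table
years <- seq(from = 0, to = 140, by = 5)
pops <- data.frame(year = years, population = salmon_pop(years))
head(pops)

# The same function can be passed to curve() for plotting
curve(salmon_pop, from = 0, to = 140,
      xlab = "Time passed since 1890",
      ylab = "Population of Salmon",
      main = "Growth of Salmon since 1890")
```

Evaluating the function avoids depending on which x values curve() happened to sample.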

3D Scatterplot function in R with groups

So I've been working on a scatter plot for some data that I have. I used to be able to get the scatter plot function to work, but now I can't and I don't understand what my error is. My data has 5 variables and a column that assigns each row to a cluster (I used k-means in this particular case).
closedmi uncertin certknow sourknow justknow fit3.cluster
1 3.166667 6.125 2.571429 4.500 3.375 1
2 3.666667 4.250 3.428571 4.000 4.750 2
3 1.833333 5.750 1.428571 3.375 2.125 2
4 3.500000 4.500 1.857143 4.250 3.125 3
I'm trying to plot my data in 3 dimensions using the first three principal components and see the clusters. Here is my code to find the principal components and then attach the cluster column to them in a new data frame.
#Find the 5 principal components of the data matrix
pcdf <- princomp(pre2, cor=T, score=T)
pre4 <- data.frame(pcdf$scores, cluster=fit3$cluster)
#Making a 3D plot of the Solution
scatter3d(pre4$Comp.1, pre4$Comp.2, pre4$Comp.3, groups=pre4$cluster,
surface=FALSE, grid=FALSE, ellipsoid=TRUE)
I then try to use scatter3d to plot the individuals using the cluster column as a grouping factor, and I end up with an error. I've been using this source for the code to get the right syntax, but I still end up with the error.
Error in scatter3d.default(pre4$Comp.1, pre4$Comp.2, pre4$Comp.3, groups = pre4$cluster: groups variable must be a factor
but it is. It's in the data frame, I can call the column using pre4$cluster. Is there some formatting or syntax error I can't see? Am I just going mad?
I was able to get this to work just last week and now I'm not able to. I know I can use plot3d to get the visualization, but I like the visualization better using scatter3d and would like to be able to use it.
Try this:
scatter3d(pre4$Comp.1, pre4$Comp.2, pre4$Comp.3, groups=as.factor(pre4$cluster),
surface=FALSE, grid=FALSE, ellipsoid=TRUE)
That will solve the error message regarding factors. Beyond that, just make sure that your leading minor is positive definite.
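The reason the conversion is needed can be checked directly: kmeans() stores the cluster assignments as an integer vector, and putting that vector into a data frame does not make it a factor. A small sketch on simulated stand-in data (the pre2 here is random, not the questioner's data; 3 clusters is assumed from the sample rows):

```r
# Simulated stand-in data: 5 numeric columns, as in the question
set.seed(1)
pre2 <- as.data.frame(matrix(rnorm(200), ncol = 5))
fit3 <- kmeans(pre2, centers = 3)

class(fit3$cluster)      # "integer"
is.factor(fit3$cluster)  # FALSE

# Convert once when building the data frame, so every later call sees a factor
pcdf <- princomp(pre2, cor = TRUE, scores = TRUE)
pre4 <- data.frame(pcdf$scores, cluster = factor(fit3$cluster))
is.factor(pre4$cluster)  # TRUE
```

With the column stored as a factor, groups=pre4$cluster can be passed without wrapping it in as.factor() each time.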

Creating a 3D Surface Plot from a matrix in R

I have been searching for this quite a while, but cannot find an answer to my problem or a minimum example. I would like to make a 3D-plot of a matrix.
An extract of my data looks like this. There are the years, which I would like to use as the x-axis; y, which I would like to use as the y-axis; and z, the value I would like to plot.
Year y z
2000 1 467
2000 2 10678
2000 2 25
...
How can I make this a surface plot?
Best
Have you tried searching for how to plot a surface plot in R? It turns out there's at least a persp function, a package called plot3D, wireframe in lattice and plotly.
For starters, try (from the plot3D package vignette)
library(plot3D)
example(persp3D)
example(surf3D)
example(slice3D)
example(scatter3D)
example(segments3D)
example(image2D)
example(image3D)
example(contour3D)
example(colkey)
example(jet.col)
example(perspbox)
example(mesh)
example(trans3D)
example(plot.plist)
example(ImageOcean)
example(Oxsat)
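Most of these functions want z as a matrix, so the long Year/y/z layout from the question has to be reshaped first. A rough base-R sketch using persp(); the data frame d is invented for illustration, and averaging duplicate (Year, y) pairs is an assumption about how repeats such as the two "2000 2" rows should be handled.

```r
# Invented long-format data in the same shape as the question (Year, y, z)
d <- data.frame(
  Year = rep(2000:2004, each = 4),
  y    = rep(1:4, times = 5)
)
d$z <- (d$Year - 1999) * d$y

# Reshape to a Year-by-y matrix, averaging any duplicated (Year, y) pairs
zmat <- tapply(d$z, list(d$Year, d$y), mean)

# persp() takes the increasing x and y coordinates plus the z matrix
persp(x = as.numeric(rownames(zmat)),
      y = as.numeric(colnames(zmat)),
      z = zmat,
      theta = 30, phi = 25, ticktype = "detailed",
      xlab = "Year", ylab = "y", zlab = "z")
```

Combinations of Year and y that never occur would produce NA cells in zmat, which would need filling or interpolating before plotting a surface.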

R dotPlot() coordinates locator()

I drew a dot plot (using dotPlot() from the seqinr package) of 2 FASTA sequences and I need to extract some (x, y) values from the plot.
The dotPlot() output is an image.
A generic dot plot may look like this one.
For example, I need the start and end values of the local alignments, which are represented by the purple lines.
So here is an example:
l=30
seq1 <- paste(sample(c("A","G","T","C"), l, repl=TRUE))
seq2 <- paste(sample(c("A","G","T","C"), l, repl=TRUE))
dotPlot(seq1,seq2, wsize = 2, wstep = 1, nmatch = 2, col = c("white", "green"), xlab = deparse(substitute(seq1)), ylab = deparse(substitute(seq2)))
locator(n=2, type="p")
$x
[1] 27.18720 31.23263
$y
[1] 20.45222 24.65726
So I want the exact positions of the 2 circled points and, as you can see, locator() gives decimal values.
I could use ceiling() or round(), but that may introduce an approximation error.
I need the integer coordinates of the point I clicked on, basically the plotted point nearest to the click.
It would be perfect to use identify(), which works with "normal" plots and gives back a vector with the plotted points closest to your clicks, but it doesn't work on the dotPlot() output (the problem seems to be that, unlike locator(), it doesn't work on image output).
Any possible solution would be welcome, including using dotter in the shell or Python. Thanks
As you mentioned, identify() doesn't work because it needs a plot, not an image. One solution may be to call image() after plot(type="n", ...), but this requires changing the dotPlot() source code. Another, more elegant solution is to use the lattice package and panel.identify(), the grid equivalent of identify().
Here an example, where I select some points ( 6 -> 15):
library(lattice)
dotplot(y~x,data.frame(x=letters,y=letters))
trellis.focus("panel", 1, 1)
> panel.identify()
[1] 6 7 8 9 10 11 12 13 14 15
Have a look at evolvedmicrobe/dotplot on github
https://github.com/evolvedmicrobe/dotplot/blob/master/R/plotters.R
It provides mkDotPlotDataFrame. With this you can better get coordinates between matches, like with identify.
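If staying with locator() is acceptable, the clicked coordinates can be snapped to the nearest plotted position afterwards. The sketch below assumes the dot plot cells sit at integer positions 1 to the sequence length (30 in the example); snap_to_nearest is a hypothetical helper, and the click values are the ones printed in the question.

```r
# Hypothetical helper: map each clicked coordinate to the closest allowed position
snap_to_nearest <- function(clicked, positions) {
  positions <- as.numeric(positions)
  vapply(clicked,
         function(v) positions[which.min(abs(positions - v))],
         FUN.VALUE = numeric(1))
}

# locator() output copied from the question
clicks <- list(x = c(27.18720, 31.23263), y = c(20.45222, 24.65726))

# Assumed cell positions for two sequences of length 30
grid_pos <- 1:30

snap_to_nearest(clicks$x, grid_pos)
snap_to_nearest(clicks$y, grid_pos)
```

Unlike a plain round(), this clamps clicks that land outside the plotted range (such as x = 31.23) to the last valid position.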

3d plot or contourplot of 3-tuples where x&y are NOT in a grid and NOT equally spaced in R

I am trying to visualize 3-tuple points that are NOT on a grid, and x and y are NOT equally spaced. Thus I cannot make a matrix, as is mostly required, nor can I meet the requirements of the lattice contourplot, which accepts vectors, but only in a pretty restrictive form (x and y must form a grid and be equally spaced).
I don't care whether the result is a 3D surface or a 2D contour plot, but in some way I'd like to visualize a (probably interpolated) surface of my 3-tuples.
Data will look like this:
myX myY myZ
1 458 4 0.54
2 101 5 0.46
3 390 0 0.45
4 186 2 0.84
5 241 3 0.50
6 495 2 0.67
I have tried several plotting functions from graphics, rgl and lattice packages.
I understand that the connecting of x,y pairs at arbitrary positions is everything but trivial - but is there any plotting function in any package, which can handle this? Or can I fill (interpolate) my data beforehand easily in order to have a full matrix? (I have fitted models visualized, but I want to see the raw data...)
Any help or hint is appreciated!
Cheers,
Niko
It's a bit hard to understand the question, but I will try to show how one interpolates to a full matrix. I usually use the interp function from the akima package:
set.seed(1)
x <- runif(20)
y <- runif(20)
z <- x^3 + sin(y)
require(akima)
F <- interp(x,y,z)
image(F)
points(x,y)
Here's an example of extrapolation:
F <- interp(x,y,z, linear=FALSE, extrap=TRUE)
image(F)
points(x,y)
