Make density cloud from point cloud - r

My question consists of two sub questions.
I have a graphical illustration presenting (some virtual) worst case scenarios sampled from history organized based on two parameters.
Image:
At this moment I have a point cloud. I would like to create a nicely splined density cloud from my results. I would like the 3D spline to take the density of points into account when approximating (so approximate more loosely where fewer samples are available and more exactly in denser regions of the space).
Then, having that density cloud, I would be able to scale the density along each vertical line specified by the two input parameters, which would make it a likelihood function of each outcome (the worst case scenario).
The second part is that I would like to plot it, ideally as semi-transparent 3D regions forming something like a fog around the densest region.
Uh,wow.. that wasn't easy to explain. Sigh. :)
Thanks for reading that far.

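So here is a way to generate 3D density plots using the ks package. Since you provided no data, this example is taken directly from the documentation for plot(...) in the ks package:
library(MASS)
library(ks)
# first three numeric columns of iris serve as 3-D sample data
x <- iris[,1:3]
# plug-in bandwidth matrix
H.pi <- Hpi(x, pilot="samse")
# kernel density estimate; compute.cont=TRUE also computes contour levels
fhat <- kde(x, H=H.pi, compute.cont=TRUE)
# 3-D contour plot with the data points overlaid
plot(fhat, drawpoints=TRUE)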
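
For the rescaling step described in the question (turning the density along each vertical line into a likelihood over outcomes), a minimal base-R sketch follows. It assumes the density has already been evaluated on a regular 3-D grid, for example the estimate array of a kde() fit. The toy array est and the grid z below are hypothetical stand-ins, not output of the code above.

```r
# Toy stand-in for a density evaluated on a regular 4 x 5 x 6 grid
# (assumption: with ks this would come from the estimate array of the fit).
set.seed(1)
est <- array(runif(4 * 5 * 6), dim = c(4, 5, 6))
z <- seq(0, 1, length.out = 6)   # grid along the "outcome" axis
dz <- diff(z)[1]                 # constant grid spacing

# Approximate the integral along z for every (x, y) pair with a Riemann sum.
col_mass <- apply(est, c(1, 2), sum) * dz

# Divide each vertical line by its own mass.
cond <- sweep(est, c(1, 2), col_mass, "/")

# Every vertical line now integrates to (approximately) 1.
check <- apply(cond, c(1, 2), sum) * dz
```

Each slice cond[i, j, ] can then be read as a discretised conditional density of the outcome given the two input parameters at grid cell (i, j).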

Related

Overlapping data contour on a map

I have gone through a few tutorials and answers here on Stack Overflow, such as:
Overlap image plot on a Google Map background in R,
Plotting contours on an irregular grid,
Geographical heat map of a custom property in R with ggmap,
How to overlay global map on filled contour in R language, and
https://blog.dominodatalab.com/geographic-visualization-with-rs-ggmaps/
They either don't serve my purpose or use the density of the data to create the image, which is not what I need.
I am looking for a way to plot contours of some data on a map, and would expect the image to look something like this:
or something like this taken from https://dsparks.wordpress.com/2012/07/18/mapping-public-opinion-a-tutorial/:
I have data here that gives a contour plot like this in plot_ly, but I want this over the map given by the latitudes and longitudes.
Please guide me on how this can be done. Any links to potential answers or code would be helpful.
OK, I did some digging and figured out that to plot the data, which in this case are point values randomly distributed across latitude and longitude, one has to make it continuous rather than discretely distributed. To do this I interpolated the data to fill in the gaps; the method is given in Plotting contours on an irregular grid, and I took it from there. The interpolation there is done using a linear regression; one can use other methods such as IDW, kriging, nearest neighbour etc., for which R packages are easily available. These methods are widely used in climatology and topographic analysis. To read more about interpolation methods see this paper.
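
As an illustration of one of the alternatives mentioned, here is a minimal base-R sketch of inverse distance weighting (IDW). The data frame obs, its columns and the grid ranges are hypothetical; distances are computed directly in degrees, which ignores the earth's curvature and is only a rough assumption for small areas.

```r
# Inverse distance weighting: predict a value at (x0, y0) from scattered samples.
idw <- function(x, y, z, x0, y0, power = 2) {
  d <- sqrt((x - x0)^2 + (y - y0)^2)
  if (any(d == 0)) return(mean(z[d == 0]))  # exact hit on a sample point
  w <- 1 / d^power
  sum(w * z) / sum(w)
}

# Hypothetical scattered observations (lon, lat, value).
set.seed(1)
obs <- data.frame(lon = runif(50, 70, 80), lat = runif(50, 10, 20))
obs$value <- sin(obs$lon / 2) + cos(obs$lat / 2)

# Regular grid covering the observations.
lon_seq <- seq(70, 80, length.out = 40)
lat_seq <- seq(10, 20, length.out = 40)

# Interpolated value at every grid node (rows = lon, columns = lat).
grid_val <- outer(
  lon_seq, lat_seq,
  Vectorize(function(x0, y0) idw(obs$lon, obs$lat, obs$value, x0, y0))
)
```

The resulting matrix can be passed to contour(lon_seq, lat_seq, grid_val), or reshaped to a long data frame and drawn as contours on top of a map layer.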

cluster::clusplot axis in wrong direction

I'm trying to plot the cluster obtained from fuzzy c-means clustering.
The plot should look like this.
code for the plot
plot(data$Longitude, data$Latitude, main="Fuzzy C-Means",col=data$Revised, pch=16, cex=.6,
xlab="Longitude",ylab="Latitude")
library(maps)
map("state", add=TRUE)
However, when I tried to use clusplot, the plot is displayed flipped (both top to bottom and left to right), as below.
I'd like to know if there's a way to reverse the plot so that it is oriented like the picture above.
Also, in the very dense area, it's hard to find the ellipse labels. I'd like to know if there's a way to show each label inside its ellipse instead of outside.
code for 2nd pic
library(cluster)
clusplot(cbind(Geocode$Longitude, Geocode$Latitude), cluster, color=TRUE,shade=TRUE,
labels=4, lines=0,col.p=cluster,
xlab="Longitude",ylab="Latitude",cex=1)
clusplot is a function that performs a lot of magic for you. In particular it projects the data set - which happens in a way you don't like, unfortunately. (Also note the scales - it centered and scaled the data, too)
clusplot.default: Creates a bivariate plot visualizing a partition (clustering) of the data. All observation are represented by points in the plot, using principal components or multidimensional scaling.
As far as I can tell, clusplot doesn't have map support, but you will want such a map I guess...
While maybe you can use the s.x.2d parameter to specify the exact projection (and this way disable automatic scaling), it probably is still difficult to add the map. Maybe look at the source of clusplot instead, and take only the parts you want?
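
One way to "take only the parts you want" is to skip clusplot and draw on the raw coordinates. The sketch below is an assumption-laden substitute, not what clusplot does: it draws a covariance-based ellipse per cluster (clusplot draws spanning ellipses) and puts the label at the cluster centroid. The vectors lon, lat and cluster are hypothetical stand-ins for the question's data.

```r
# Points on the ellipse {p : mahalanobis(p, centre, S) == qchisq(level, 2)}.
ellipse_points <- function(xy, level = 0.95, n = 100) {
  centre <- colMeans(xy)
  S <- cov(xy)
  radius <- sqrt(qchisq(level, df = 2))
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  # chol(S) returns an upper-triangular R with t(R) %*% R == S
  sweep(radius * circle %*% chol(S), 2, centre, "+")
}

# Hypothetical stand-ins for the longitude, latitude and cluster vectors.
set.seed(1)
lon <- c(rnorm(60, -100, 3), rnorm(60, -85, 2))
lat <- c(rnorm(60, 35, 2), rnorm(60, 40, 1.5))
cluster <- rep(1:2, each = 60)

# Raw coordinates, so the axes keep their geographic orientation.
plot(lon, lat, col = cluster, pch = 16, cex = 0.6,
     xlab = "Longitude", ylab = "Latitude")
for (k in unique(cluster)) {
  xy <- cbind(lon, lat)[cluster == k, ]
  lines(ellipse_points(xy), col = k)
  text(mean(xy[, 1]), mean(xy[, 2]), labels = k, font = 2)  # label inside the ellipse
}
```

Because the axes are plain longitude and latitude, a call such as map("state", add=TRUE) from the question can be added after plot().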

How to create variable sized square polygons to use for a choropleth map?

I have asked this question in the GIS part of stack exchange https://gis.stackexchange.com/questions/95265/r-how-to-create-a-pre-determined-number-of-identical-square-polygons-to-use-fo - I am asking it here as well as it has also topics of wider interest (e.g. calculation of density) - I hope not to be penalised for this! :)
I am trying to plot crime data density (again!) over a city map, say of NY. As this is a well-known problem, there are plenty of examples of it (http://www.obscureanalytics.com/2012/12/07/visualizing-baltimore-with-r-and-ggplot2-crime-data/). These methods plot the crime density through isoclines, while I need to represent it through identical squares of a pre-determined area (and the area / side length may change from one iteration to the next). This is actually done in commercially available COTS packages like PredPol (see http://www.predpol.com). The reason for representing crime density through squares is that the squares are the hotspot areas to be patrolled. Their size will influence the overall number of police officers required.
This is what I am trying to achieve:
I would like to be able to create identical square polygons of a pre-determined size to superimpose on the map (is it a raster? apologies, but I've just started to learn to spell GIS!)
I would like to use the above squares as items to colour as in a choropleth map (i.e. different colouring in relation to frequency of crime in the area), probably using ggplot2 or similar.
This should allow me to see how the density of crimes per square kilometre varies when changing the size (i.e. the area) of the square, proposing different patrolling areas.
I do not have a clue whether it is possible to use R to create regular square polygons of a pre-defined size for this (as the code snippet below attests). Any help or links to examples are welcome.
I would be glad to get some indication of alternative ways to calculate the density. I have used stat_density2d (part of ggplot2) but maybe there are better / faster ways?
(In hindsight, do I need a density function at all? I just need to count the crimes in a cell and colour-plot it accordingly...)
This is where I got to:
library(rgdal)
library(raster)
library(sp)
#NY boroughs shapefile downloaded from NY website
shp <- readOGR(dsn = "nybb_14a_av", layer = "nybb")
r <- raster(extent(shp))
res(r)=0.05
# using BoroCode as an experiment...
r <- rasterize(shp, field="BoroCode", r)
plot(r)
plot(shp,lwd=10,add=TRUE)
#don't know the result of the above: the laptop basically hangs processing
#plot(r) :)
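
Following the "just count the crimes in a cell" idea, here is a minimal base-R sketch that bins points into squares of a chosen side length. It assumes the coordinates are already projected into metres; crime_x, crime_y and count_in_squares are hypothetical names. As a side note on the snippet above: if the shapefile's coordinates are in feet or metres rather than degrees, a resolution of 0.05 map units would imply an enormous grid, which may be why the machine hangs.

```r
# Count points falling in each square cell of a given side length.
count_in_squares <- function(x, y, side) {
  x_breaks <- seq(floor(min(x) / side) * side, max(x) + side, by = side)
  y_breaks <- seq(floor(min(y) / side) * side, max(y) + side, by = side)
  # right = FALSE gives half-open cells [a, b), so each point lands in exactly one cell
  table(x = cut(x, x_breaks, right = FALSE),
        y = cut(y, y_breaks, right = FALSE))
}

# Hypothetical crime locations in projected coordinates (metres).
set.seed(1)
crime_x <- runif(500, 0, 10000)
crime_y <- runif(500, 0, 8000)

# Same data, two candidate patrol-square sizes.
counts_1km <- count_in_squares(crime_x, crime_y, side = 1000)
counts_2km <- count_in_squares(crime_x, crime_y, side = 2000)

# Crimes per square kilometre in each 2 km x 2 km cell.
density_2km <- counts_2km / (2 * 2)
```

The count table can be inspected with image(), or turned into a long data frame with as.data.frame() and drawn as coloured tiles.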

Simple Contour map

I just discovered ggmap and I've been playing around with plotting earthquake data from the USGS. I get the data in the form of Lat and Lon, depth and magnitude. I can easily plot the earthquakes as points with different colors based on depth but what I would like to do is take that depth data (just a single number) and generate contours to overlay on the map.
This seems like it should be MUCH more simple than the "Houston Crime" example I keep coming up on since I'm not doing any statistical "density" calculation or anything like that. Basically it's just a contour map on top of the google map of an area.
How do I do this (Presumably) simple, simple thing?
Thanks!
The problem of plotting a 3D surface using only a small sample of unequally spaced lat/long points and a height z (or equivalent) variable is non-trivial -- you have to estimate the values of z for all of the lat-long grid coordinates you do not have, for example using loess() or kriging to create a smooth surface.
Take a look at Methods for doing heatmaps, level / contour plots, and hexagonal binning, case #5. For a geoR example see http://www4.stat.ncsu.edu/~reich/CUSP/Ordinary_Kriging_in_R.pdf
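
As a rough sketch of the loess() route mentioned above: fit a smooth surface to the scattered points, predict on a regular grid, and extract contour lines. The data frame eq and its columns are made up for illustration, and the span value is an arbitrary choice.

```r
# Hypothetical earthquake sample: irregular lon/lat points with a depth value.
set.seed(42)
eq <- data.frame(lon = runif(200, -120, -115), lat = runif(200, 33, 37))
eq$depth <- 10 + 2 * (eq$lon + 117.5)^2 + (eq$lat - 35)^2 + rnorm(200, sd = 0.5)

# Smooth surface: local regression of depth on both coordinates.
fit <- loess(depth ~ lon * lat, data = eq, span = 0.5)

# Regular grid kept inside the sampled area (loess does not extrapolate by default).
lon_seq <- seq(-119.5, -115.5, length.out = 50)
lat_seq <- seq(33.5, 36.5, length.out = 50)
grid <- expand.grid(lon = lon_seq, lat = lat_seq)

# expand.grid varies lon fastest, so rows of z index lon and columns index lat.
z <- matrix(as.numeric(predict(fit, newdata = grid)), nrow = length(lon_seq))

# Contour lines as a list of (level, x, y) pieces, ready to draw on any map.
cl <- contourLines(lon_seq, lat_seq, z)
```

Each element of cl holds the coordinates of one contour piece, so it can be drawn with lines() on a base plot or converted to a data frame for a path layer on a ggmap.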

R: update plot [xy]lims with new points() or lines() additions?

Background:
I'm running a Monte Carlo simulation to show that a particular process (a cumulative mean) does not converge over time, and often diverges wildly in simulation (the expectation of the random variable = infinity). I want to plot about 10 of these simulations on a line chart, where the x axis has the iteration number, and the y axis has the cumulative mean up to that point.
Here's my problem:
I'll run the first simulation (each simulation having 10,000 iterations), and build the main plot based on its range. But often one of the later simulations will have a range a few orders of magnitude larger than the first one, so the plot flies outside of the original range. So, is there any way to dynamically update the ylim or xlim of a plot upon adding a new set of points or lines?
I can think of two workarounds for this:
1. Store each simulation, then pick the one with the largest range, and build the base graph off of that (not elegant, and I'd have to store a lot of data in memory, but would probably be laptop-friendly [[EDIT: as Marek points out, this is not a memory-intense example, but if you know of a nice solution that'd support far more iterations such that it becomes an issue (think high dimensional walks that require much, much larger MC samples for convergence) then jump right in]]).
2. Find a seed that appears to build a nice looking version of it, and set the ylim manually, which would make the demonstration reproducible.
Naturally I'm holding out for something more elegant than my workarounds. Hoping this isn't too pedestrian a problem, since I imagine it's not uncommon with simulations in R. Any ideas?
I'm not sure if this is possible using base graphics; if someone has a solution I'd love to see it. However, graphics systems based on grid (lattice and ggplot2) allow the graphics object to be saved and updated. It's insanely easy in ggplot2.
library(ggplot2)
make some data:
foo <- data.frame(data=rnorm(100), numb=seq_len(100))
make an initial ggplot object and plot it:
p <- ggplot(foo, aes(numb, data)) + geom_line()
p
make some more data and add it to the plot (the axis limits are recomputed from all layers when the plot is drawn):
foo <- data.frame(data=rnorm(200), numb=seq_len(200))
p <- p + geom_line(aes(numb, data), data=foo, colour="red")
plot the new object:
p
I think (1) is the best option. I actually don't think it is inelegant. I think it would be more computationally intensive to redraw every time you hit a point outside xlim or ylim.
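
Option (1) could be sketched in base graphics with matplot(), which takes its axis limits from the whole matrix of stored simulations. The choice of absolute Cauchy draws as the infinite-expectation variable is an assumption for illustration, as are the variable names.

```r
set.seed(1)
n_iter <- 10000
n_sim <- 10

# |Cauchy| draws have infinite expectation, so the cumulative mean does not settle.
draws <- matrix(abs(rcauchy(n_iter * n_sim)), nrow = n_iter, ncol = n_sim)

# Cumulative mean of each simulation (one column per simulation).
cum_means <- apply(draws, 2, cumsum) / seq_len(n_iter)

# matplot() sets xlim/ylim from the whole matrix, so no line leaves the plot.
matplot(cum_means, type = "l", lty = 1, log = "y",
        xlab = "Iteration", ylab = "Cumulative mean")
```

Storing 10,000 x 10 doubles takes roughly 0.8 MB, so memory is not a concern at this scale.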
Also, I saw in Peter Hoff's book about Bayesian statistics a cool use of ts() instead of lines() for cumulative sums/means. It looks pretty spiffy:
