Plot points on a sphere in R

Could you help me make a plot similar to this in R?
I would like it to be interactive so that I can rotate the sphere. I guess I should use rgl. I found an example similar to what I need here; however, I couldn't find a way to draw a grid instead of a filled sphere.
UPD: A reproducible dataset that could help answer the question (I took it from here):
u <- runif(1000,0,1)
v <- runif(1000,0,1)
theta <- 2 * pi * u
phi <- acos(2 * v - 1)
x <- sin(theta) * cos(phi)
y <- sin(theta) * sin(phi)
z <- cos(theta)
library("lattice")
cloud(z ~ x + y)

Start with
library("rgl")
spheres3d(0,0,0,lit=FALSE,color="white")
spheres3d(0,0,0,radius=1.01,lit=FALSE,color="black",front="lines")
to create a "wireframe" sphere (I'm cheating a little bit here by drawing two spheres, one a little bit larger than the other ... there may be a better way to do this, but I couldn't easily/quickly figure it out).
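Alternatively (this is my own variant, not part of the original answer), you could skip the two-sphere trick and draw the grid explicitly as circles of latitude and meridians with lines3d():
library(rgl)
th <- seq(0, 2 * pi, length.out = 100)
for (lat in seq(-75, 75, by = 15) * pi / 180) {
  # circles of latitude at fixed z
  lines3d(cos(lat) * cos(th), cos(lat) * sin(th), rep(sin(lat), length(th)), col = "grey")
}
for (lon in seq(0, 165, by = 15) * pi / 180) {
  # meridians: each full great circle covers longitudes lon and lon + 180
  lines3d(cos(th) * cos(lon), cos(th) * sin(lon), sin(th), col = "grey")
}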
From the Wolfram web page on sphere point picking (the source of your picture) we get:
Similarly, we can pick u=cos(phi) to be uniformly distributed (so we have du=sin phi dphi) and obtain the points x = sqrt(1-u^2)*cos(theta); y = sqrt(1-u^2)*sin(theta); z=u with theta in [0,2pi) and u in [-1,1], which are also uniformly distributed over S^2.
So:
set.seed(101)
n <- 50
theta <- runif(n,0,2*pi)
u <- runif(n,-1,1)
x <- sqrt(1-u^2)*cos(theta)
y <- sqrt(1-u^2)*sin(theta)
z <- u
spheres3d(x,y,z,col="red",radius=0.02)
The spheres take a little more effort to render but are prettier than the results of points3d() (flat squares) ...

Wandering in late, I might suggest looking at the packages sphereplot and, if you're feeling really brave, gensphere for highly configurable general placement of points in 3-space.
sphereplot includes simple functions such as (quoting from the man pages)
pointsphere: Random sphere pointing
Description: Randomly generates data points within a sphere that are uniformly distributed.
Usage:
pointsphere(N = 100, longlim = c(0, 360), latlim = c(-90, 90), rlim = c(0, 1))
Arguments:
N        Number of random points.
longlim  Limits of longitude in degrees.
latlim   Limits of latitude in degrees.
rlim     Limits of radius.
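A hedged sketch of how this might be combined with the rgl wireframe trick from the first answer (my own illustration; I'm assuming pointsphere() returns longitude, latitude and radius in its first three columns, in degrees, so check str() of the result on your machine):
library(sphereplot)
library(rgl)
pts <- pointsphere(N = 200, rlim = c(1, 1))   # radius fixed at 1: points on the unit sphere
long <- pts[, 1] * pi / 180                   # degrees -> radians (assumed column order)
lat  <- pts[, 2] * pi / 180
spheres3d(0, 0, 0, lit = FALSE, color = "white")
spheres3d(0, 0, 0, radius = 1.01, lit = FALSE, color = "black", front = "lines")
spheres3d(cos(lat) * cos(long), cos(lat) * sin(long), sin(lat), col = "red", radius = 0.02)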

Related

Creating a Circle of Known Area

I want to create a circle of area 100 as an sf object. I thought st_buffer() would do it, but the area is
slightly less than 100.
pt.df <- data.frame(pt = 1, x = 20, y = 20)
pt.sf <- st_as_sf(pt.df, coords = c("x", "y"))
circle1 <- st_buffer(pt.sf, dist = sqrt(100 / pi))
st_area(circle1) # 99.95431 on my PC
I can use a fudge factor to multiply the radius and I get what I want.
fudge <- sqrt( 100 / st_area(circle1) )
circle2 <- st_buffer(pt.sf, dist = fudge * sqrt(100 / pi))
st_area(circle2) # 100
But it seems silly to use a fudge factor.
Is there a way to create a circle of known area within the sf package without
a fudge factor in st_buffer ?
It's a floating-point precision issue when calculating the radius with a truncated form of pi. This will result in a circle that is slightly smaller than your desired output. You can see that pi can only be stored to machine precision:
.Machine$double.eps
# 2.22044604925031e-16
pi
# 3.14159265358979
If you want to correct it, you can use a linear correction on the area that you want. Note that this is still an approximation, but it should get you much closer to your desired result.
radius <- function(area) {
  A <- area + (area * 0.000457099999999997)
  return(sqrt(A / pi))
}
library(sf)
library(magrittr)  # for the %>% pipe
system.file("shape/nc.shp", package = "sf") %>%
  st_read() %>%
  st_centroid() %>%
  st_transform(st_crs(5070)) %>%
  st_buffer(radius(100)) %>%
  st_area()
The main issue is that st_buffer works internally with polygons, not with circles. Increasing the nQuadSegs argument (default=30) allows you to use a better approximation to a circle, at the cost of memory and computation time (don't know if this is important to you):
library(sf)
pt.df <- data.frame(pt = 1, x = 20, y = 20)
pt.sf <- st_as_sf(pt.df, coords = c("x", "y"))
get_area <- function(nq) {
  circle1 <- st_buffer(pt.sf, dist = sqrt(100 / pi), nQuadSegs = nq)
  st_area(circle1)
}
sapply(c(30,100,300,1000), get_area)
## [1] 99.95431 99.99589 99.99954 99.99996
If you really want an area of exactly 100, then the 'fudge' that your question (and #AdamTrevisan's answer) suggest is the way to go (as increasing the number of segments to a million still only gets you to an area of 99.99999999997200461621). To be really clever, you might be able to use the formula for the area of an inscribed polygon to come up with a correction factor ...
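For what it's worth, here is a sketch of that inscribed-polygon correction (my own addition, with my own helper name exact_radius; it assumes the buffer vertices lie exactly on the circle of radius dist and that nQuadSegs = nq yields a regular polygon with n = 4*nq vertices). A regular n-gon inscribed in a circle of radius R has area (1/2)*n*R^2*sin(2*pi/n), so we can solve for the radius that makes the polygon area exact:
library(sf)
exact_radius <- function(area, nQuadSegs = 30) {
  n <- 4 * nQuadSegs                      # vertices produced by st_buffer on a point
  sqrt(2 * area / (n * sin(2 * pi / n)))  # radius whose inscribed n-gon has the target area
}
circle3 <- st_buffer(pt.sf, dist = exact_radius(100))
st_area(circle3)   # 100, up to floating-point error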

R, rgl, plotting points and ellipses

I am using R to visualize some data. I have found rgl to be a great library for plotting points.
points3d(x,y,z)
where x = c(x1,x2, ...), y = c(y1,y2,...), z = c(z1,z2, ...) and x,y,z have the same length, is a great function for plotting large sets of data.
Now, I would like to plot ellipses, mixed in with the data. I have a characterization of each ellipse by a center point C, a vector describing the major axis U, and a vector describing the minor axis V. I obtain points P on the boundary of the ellipse by
P = C + U*cos(t) + V*sin(t) (t ranges between 0 and 2*pi)
obtaining vectors, xt, yt, and zt. Then I can plot the ellipse with
polygon3d(xt,yt,zt)
It works fine, but I'm guessing everyone reading is cringing, and will tell me that this is a bad way to do this. Indeed it takes a couple seconds to render each ellipse this way.
I don't think the ellipse3d function from the rgl package works here; at the very least, I am not working with a covariance matrix, nor do I understand how to get the ellipse I want from this function. Also, it returns an ellipsoid, not an ellipse.
EDIT:
For a concrete example that takes a while:
library(rgl)
open3d()
td <- c(0:359)
t <- td*pi/180
plotEllipseFromVector <- function(c, u, v) {
  xt <- c[1] + u[1]*cos(t) + v[1]*sin(t)
  yt <- c[2] + u[2]*cos(t) + v[2]*sin(t)
  zt <- c[3] + u[3]*cos(t) + v[3]*sin(t)
  polygon3d(xt, yt, zt)
}
Call it with the center point, major axis, and minor axis you want. It takes just over 2 seconds for me.
On the other hand, if I change t to be 0,20,40,... 340, then it works quite fast.
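For completeness, a usage sketch (my own, with made-up center and axis vectors):
plotEllipseFromVector(c(0, 0, 0), c(1, 0, 0), c(0, 0.5, 0.5))
I believe most of the time goes into polygon3d() triangulating the filled polygon; if only the outline is needed, replacing that call with lines3d(xt, yt, zt) draws a simple polyline and renders essentially instantly.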

Greatest distance between set of longitude/latitude points

I have a set of lng/lat coordinates. What would be an efficient method of calculating the greatest distance between any two points in the set (the "maximum diameter" if you will)?
A naive way is to use the Haversine formula to calculate the distance between each pair of points and take the maximum, but this obviously doesn't scale well.
Edit: the points are located within a sufficiently small area; they describe the area in which a person carrying a mobile device was active in the course of a single day.
Theorem #1: The ordering of any two great circle distances along the surface of the earth is the same as the ordering of the straight-line distances between the points when you tunnel through the earth.
Hence turn your lat-long into x,y,z based either on a spherical earth of arbitrary radius or an ellipsoid of given shape parameters. That's a couple of sines/cosines per point (not per pair of points).
Now you have a standard 3-d problem that doesn't rely on computing Haversine distances. The distance between points is just Euclidean (Pythagoras in 3d). Needs a square-root and some squares, and you can leave out the square root if you only care about comparisons.
There may be fancy spatial tree data structures to help with this. Or algorithms such as http://www.tcs.fudan.edu.cn/rudolf/Courses/Algorithms/Alg_ss_07w/Webprojects/Qinbo_diameter/2d_alg.htm (click 'Next' for 3d methods). Or C++ code here: http://valis.cs.uiuc.edu/~sariel/papers/00/diameter/diam_prog.html
Once you've found your maximum distance pair, you can use the Haversine formula to get the distance along the surface for that pair.
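A minimal R sketch of this idea (my own illustration, on a unit sphere, assuming lng and lat are vectors in degrees; the full pairwise comparison is fine for up to a few thousand points):
rad <- pi / 180
x <- cos(lat * rad) * cos(lng * rad)
y <- cos(lat * rad) * sin(lng * rad)
z <- sin(lat * rad)
d2 <- as.matrix(dist(cbind(x, y, z)))^2            # squared chord (tunnel) distances
ind <- which(d2 == max(d2), arr.ind = TRUE)[1, ]   # indices of the farthest pair
# then report the surface distance for that pair with the Haversine formula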
I think that the following could be a useful approximation, which scales linearly instead of quadratically with the number of points, and is quite easy to implement:
1. calculate the center of mass M of the points
2. find the point P0 that has the maximum distance to M
3. find the point P1 that has the maximum distance to P0
4. approximate the maximum diameter with the distance between P0 and P1
This can be generalized by repeating step 3 N times, taking the distance between P(N-1) and P(N).
Step 1 can be carried out efficiently approximating M as the average of longitudes and latitudes, which is OK when distances are "small" and the poles are sufficiently far away. The other steps could be carried out using the exact distance formula, but they are much faster if the points' coordinates can be approximated as lying on a plane. Once the "distant pair" (hopefully the pair with the maximum distance) has been found, its distance can be re-calculated with the exact formula.
An example of approximation could be the following: if φ(M) and λ(M) are latitude and longitude of the center of mass calculated as Σφ(P)/n and Σλ(P)/n,
x(P) = (λ(P) - λ(M) + C) cos(φ(P))
y(P) = φ(P) - φ(M) [ this is only for clarity, it can also simply be y(P) = φ(P) ]
where C is usually 0, but can be ± 360° if the set of points crosses the λ=±180° line. To find the maximum distance you simply have to find
max((x(P(N)) - x(P(N-1)))^2 + (y(P(N)) - y(P(N-1)))^2)
(you don't need the square root because it is monotonic)
The same coordinate transformation could be used to repeat step 1 (in the new coordinate system) in order to have a better starting point. I suspect that if some conditions are met, the above steps (without repeating step 3) always lead to the "true distant pair" (my terminology). If I only knew which conditions...
EDIT:
I hate building on others' solutions, but someone will have to.
Still keeping the above 4 steps, with the optional (but probably beneficial, depending on the typical distribution of points) repetition of step 3, and following Spacedman's solution, doing the calculations in 3D removes the limitation that the points be close together and far from the poles:
x(P) = sin(φ(P))
y(P) = cos(φ(P)) sin(λ(P))
z(P) = cos(φ(P)) cos(λ(P))
(the only approximation is that this holds only for a perfect sphere)
The center of mass is given by x(M) = Σx(P)/n, etc.,
and the maximum one has to look for is
max((x(P(N)) - x(P(N-1)))^2 + (y(P(N)) - y(P(N-1)))^2 + (z(P(N)) - z(P(N-1)))^2)
So: you first transform spherical to cartesian coordinates, then start from the center of mass, to find, in at least two steps (steps 2 and 3), the farthest point from the preceding point. You could repeat step 3 as long as the distance increases, perhaps with a maximum number of repetitions, but this won't take you away from a local maximum. Starting from the center of mass is not of much help, either, if the points are spread all over the Earth.
EDIT 2:
I learned enough R to write down the core of the algorithm (nice language for data analysis!)
For the plane approximation, ignoring the problem around the λ=±180° line:
# input: lng, lat (vectors)
rad = pi / 180;
x = (lng - mean(lng)) * cos(lat * rad)
y = (lat - mean(lat))
i = which.max((x - mean(x))^2 + (y )^2)
j = which.max((x - x[i] )^2 + (y - y[i])^2)
# output: i, j (indices)
On my PC it takes less than a second to find the indices i and j for 1000000 points. The following 3D version is a bit slower, but works for any distribution of points (and does not need to be amended when the λ=±180° line is crossed):
# input: lng, lat
rad = pi / 180
x = sin(lat * rad)
f = cos(lat * rad)
y = sin(lng * rad) * f
z = cos(lng * rad) * f
i = which.max((x - mean(x))^2 + (y - mean(y))^2 + (z - mean(z))^2)
j = which.max((x - x[i] )^2 + (y - y[i] )^2 + (z - z[i] )^2)
k = which.max((x - x[j] )^2 + (y - y[j] )^2 + (z - z[j] )^2) # optional
# output: j, k (or i, j)
The calculation of k can be left out (i.e., the result could be given by i and j), depending on the data and on the requirements. On the other hand, my experiments have shown that calculating a further index is useless.
It should be remembered that, in any case, the distance between the resulting points is an estimate which is a lower bound of the "diameter" of the set, although it very often will be the diameter itself (how often depends on the data.)
EDIT 3:
Unfortunately the relative error of the plane approximation can, in extreme cases, be as much as 1-1/√3 ≅ 42.3%, which may be unacceptable, even if very rare. The algorithm can be modified in order to have an upper bound of approximately 20%, which I have derived by compass and straight-edge (the analytic solution is cumbersome). The modified algorithm finds a pair of points with a locally maximal distance, then repeats the same steps, but this time starting from the midpoint of the first pair, possibly finding a different pair:
# input: lng, lat
rad = pi / 180
x = (lng - mean(lng)) * cos(lat * rad)
y = (lat - mean(lat))
i.n_1 = 1 # n_1: n-1
x.n_1 = mean(x)
y.n_1 = 0 # = mean(y)
s.n_1 = 0 # s: square of distance
repeat {
s = (x - x.n_1)^2 + (y - y.n_1)^2
i.n = which.max(s)
x.n = x[i.n]
y.n = y[i.n]
s.n = s[i.n]
if (s.n <= s.n_1) break
i.n_1 = i.n
x.n_1 = x.n
y.n_1 = y.n
s.n_1 = s.n
}
i.m_1 = 1
x.m_1 = (x.n + x.n_1) / 2
y.m_1 = (y.n + y.n_1) / 2
s.m_1 = 0
m_ok = TRUE
repeat {
s = (x - x.m_1)^2 + (y - y.m_1)^2
i.m = which.max(s)
if (i.m == i.n || i.m == i.n_1) { m_ok = FALSE; break }
x.m = x[i.m]
y.m = y[i.m]
s.m = s[i.m]
if (s.m <= s.m_1) break
i.m_1 = i.m
x.m_1 = x.m
y.m_1 = y.m
s.m_1 = s.m
}
if (m_ok && s.m > s.n) {
i = i.m
j = i.m_1
} else {
i = i.n
j = i.n_1
}
# output: i, j
The 3D algorithm can be modified in a similar way. It is possible (both in the 2D and in the 3D case) to start over once again from the midpoint of the second pair of points (if found). The upper bound in this case is "left as an exercise for the reader" :-).
Comparison of the modified algorithm with the (too) simple algorithm has shown, for normal and for square uniform distributions, a near doubling of processing time, and a reduction of the average error from 0.6% to 0.03% (an order of magnitude). A further restart from the midpoint results in a just slightly better average error, but an almost equal maximum error.
EDIT 4:
I have yet to study this article, but it looks like the 20% I found with compass and straight-edge is in fact 1 - 1/√(5 - 2√3) ≅ 19.3%.
Here's a naive example that doesn't scale well (as you say), but it might help with building a solution in R.
## lonlat points
n <- 100
d <- cbind(runif(n, -180, 180), runif(n, -90, 90))
library(sp)
## distances on WGS84 ellipsoid
x <- spDists(d, longlat = TRUE)
## row, then column index of furthest points
ind <- c(row(x)[which.max(x)], col(x)[which.max(x)])
## maps
library(maptools)
data(wrld_simpl)
plot(as(wrld_simpl, "SpatialLines"), col = "grey")
points(d, pch = 16, cex = 0.5)
## draw the points and a line between on the page
points(d[ind, ], pch = 16)
lines(d[ind, ], lwd = 2)
## for extra credit, draw the great circle on which the furthest points lie
library(geosphere)
lines(greatCircle(d[ind[1], ], d[ind[2], ]), col = "firebrick")
The geosphere package provides more options for distance calculation if that's needed. See ?spDists in sp for the details used here.
You don't tell us whether these points will be located in a sufficiently small part of the globe. For truly global sets of points, my first guess would be running a naive O(n^2) algorithm, possibly getting a performance boost from some spatial indexing (R*-trees, octal-trees, etc.). The idea is to pre-generate the n*(n-1)/2 pairs of the triangle of the distance matrix and feed them in chunks to a fast distance library to minimize I/O and process churn. Haversine is fine; you could also do it with Vincenty's method (the greatest contributor to running time is the quadratic complexity, not the (fixed number of) iterations in Vincenty's formula). As a side note, in fact, you don't need R for this stuff.
EDIT #2: The Barequet-Har-Peled algorithm (as pointed at by Spacedman in his reply) has O((n + 1/eps^3) * log(1/eps)) complexity for eps > 0, and is worth exploring.
For the quasi-planar problem, this is known as the "diameter of the convex hull" and has three parts:
1. Computing the convex hull with Graham's scan, which is O(n*log(n)) - in fact, one should first try transforming the points into a transverse Mercator projection (using the centroid of the points in the data set).
2. Finding the antipodal points with the Rotating Calipers algorithm - linear, O(n).
3. Finding the largest distance among all antipodal pairs - a linear search, O(n).
The link with pseudo-code and discussion: http://fredfsh.com/2013/05/03/convex-hull-and-its-diameter/
See also the discussion on a related question here: https://gis.stackexchange.com/questions/17358/how-can-i-find-the-farthest-point-from-a-set-of-existing-points
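A rough R sketch of parts 1 and 3 (my own addition, with my own helper name hull_diameter; planar approximation only, so project the points first, e.g. to a transverse Mercator centred on them). It skips rotating calipers and simply brute-forces the hull vertices, which are usually few enough that this is not the bottleneck:
hull_diameter <- function(x, y) {
  h <- chull(x, y)                          # convex hull vertex indices, O(n log n)
  d <- as.matrix(dist(cbind(x[h], y[h])))   # pairwise distances between hull vertices only
  ij <- which(d == max(d), arr.ind = TRUE)[1, ]
  h[ij]                                     # indices (into x, y) of the farthest pair
}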
EDIT: Spacedman's solution pointed me to the Malandain-Boissonnat algorithm (see the paper in PDF here). However, this is no better than the brute-force naive O(n^2) algorithm.

Generating random point in a cylinder

What is the best way, or an algorithm, for generating a random 3D point [x, y, z] inside the volume of a circular cylinder, if the radius r and height h of the cylinder are given?
How about -- in Python pseudocode, letting R be the radius and H be the height:
import random
from math import sqrt, cos, sin, pi

s = random.uniform(0, 1)
theta = random.uniform(0, 2*pi)
z = random.uniform(0, H)
r = sqrt(s)*R
x = r * cos(theta)
y = r * sin(theta)
z = z # .. for symmetry :-)
The problem with simply taking x = r * cos(angle) and y = r * sin(angle) is that when r is small, i.e. at the centre of the circle, a tiny change in r doesn't change the x and y positions very much. In other words, it leads to a nonuniform distribution in Cartesian coordinates, and the points get concentrated toward the centre of the circle. Taking the square root of s corrects this, at least if I've done my arithmetic correctly.
[Ah, it looks like the sqrt was right.]
(Note that I assumed without thinking about it that the cylinder is aligned with the z-axis and the cylinder centre is located at (0,0,H/2). It'd be less arbitrary to set (0,0,0) at the cylinder centre, in which case z should be chosen to be between -H/2 and H/2, not 0,H.)
Generate a random point inside the rectangular solid circumscribing the cylinder; if it's inside the cylinder (probability pi/4), keep it, otherwise discard it and try again.
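A small R sketch of this rejection approach (my own illustration, with my own helper name rcylinder_reject; cylinder of radius r and height h, axis along z, base at z = 0):
rcylinder_reject <- function(n, r, h) {
  out <- matrix(numeric(0), ncol = 3)
  while (nrow(out) < n) {
    m <- ceiling((n - nrow(out)) / (pi / 4)) + 10      # oversample: acceptance rate is pi/4
    x <- runif(m, -r, r); y <- runif(m, -r, r); z <- runif(m, 0, h)
    keep <- x^2 + y^2 <= r^2                           # keep points inside the circle
    out <- rbind(out, cbind(x, y, z)[keep, , drop = FALSE])
  }
  out[seq_len(n), ]
}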
Generate a random angle (optionally less than 2π), a random r less than the radius, and a random z less than the height.
x = r * cos(angle)
y = r * sin(angle)
The z axis is easy: -0.5 * h <= z <= 0.5 * h
The x and y must fall within a circle:
x^2 + y^2 <= r^2
But math was a long time ago for me :-)

How to generate random shapes given a specified area (R language)?

My question is this: I am working on some clustering algorithms, and to start I am experimenting with 2D shapes.
Given a particular area, say 500 sq units, I need to generate random shapes with that area,
say a rectangle, square, or triangle of 500 sq units, etc. Any suggestions on how I should go about this problem? I am using the R language.
It's fairly straightforward to do this for a regular polygon.
The area of an n-sided regular polygon, with a circumscribed circle of radius R is
A = 1/2 nR^2 * sin((2pi)/n)
Therefore, knowing n and A you can easily find R
R = sqrt((2*A)/(n*sin((2*pi)/n)))
So, you can pick the center, go at distance R and generate n points at 2pi/n angle increments.
In R:
regular.poly <- function(nSides, area)
{
  # Find the radius of the circumscribed circle
  radius <- sqrt((2*area)/(nSides*sin((2*pi)/nSides)))
  # Assume the center is at (0, 0); vertices lie at angles 2*pi*k/nSides, k = 1..nSides
  points <- list(x=NULL, y=NULL)
  angles <- (2*pi)/nSides * 1:nSides
  points$x <- cos(angles) * radius
  points$y <- sin(angles) * radius
  return(points)
}
# Some examples
par(mfrow=c(3,3))
for (i in 3:11)
{
  p <- regular.poly(i, 100)
  plot(0, 0, "n", xlim=c(-10, 10), ylim=c(-10, 10), xlab="", ylab="", main=paste("n=", i))
  polygon(p)
}
We can extrapolate to a generic convex polygon.
The area of a convex polygon can be found as:
A = 1/2 * [(x1*y2 + x2*y3 + ... + xn*y1) - (y1*x2 + y2*x3 + ... + yn*x1)]
We generate the polygon as above, but deviate angles and radii from those of the regular polygon.
We then scale the points to get the desired area.
convex.poly <- function(nSides, area)
{
  # Find the radius of the circumscribed circle, and the angle of each point if this was a regular polygon
  radius <- sqrt((2*area)/(nSides*sin((2*pi)/nSides)))
  angle <- (2*pi)/nSides
  # Randomize the radii/angles
  radii <- rnorm(nSides, radius, radius/10)
  angles <- rnorm(nSides, angle, angle/10) * 1:nSides
  angles <- sort(angles)
  points <- list(x=NULL, y=NULL)
  points$x <- cos(angles) * radii
  points$y <- sin(angles) * radii
  # Find the area of the polygon
  m <- matrix(unlist(points), ncol=2)
  m <- rbind(m, m[1,])
  current.area <- 0.5 * (sum(m[1:nSides,1]*m[2:(nSides+1),2]) - sum(m[1:nSides,2]*m[2:(nSides+1),1]))
  points$x <- points$x * sqrt(area/current.area)
  points$y <- points$y * sqrt(area/current.area)
  return(points)
}
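A quick usage check (my own addition), mirroring the regular-polygon example above; the limits are slightly wider because the randomized radii can poke past the regular-polygon radius:
par(mfrow=c(3,3))
for (i in 3:11)
{
  p <- convex.poly(i, 100)
  plot(0, 0, "n", xlim=c(-12, 12), ylim=c(-12, 12), xlab="", ylab="", main=paste("n=", i))
  polygon(p)
}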
A random square of area 500m^2 is easy - it's a square of side sqrt(500) m. Do you care about rotations? Then rotate it by runif(x,0,2*pi). Do you care about its location? Add an (x,y) offset computed from runif or whatever.
Rectangle? Given the length of any one pair of sides you only have the freedom to choose the length of the other two. How do you choose the length of the first pair of sides? Well, you might want to use runif() between some 'sensible' limits for your application. You could use rnorm() but that might give you negative lengths, so maybe rnorm-squared. Then once you've got that side, the other side length is 500/L. Rotate, translate, and add salt and pepper to taste.
For triangles, the area formula is half-base-times-height. So generate a base length - again, runif, rnorm etc etc - then choose another point giving the required height. Rotate, etc.
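A small R sketch of the rectangle and triangle recipes above (my own illustration, with my own helper names and arbitrary side/base limits):
random_rect <- function(area, side_range = c(1, 50)) {
  a <- runif(1, side_range[1], side_range[2])    # pick one side freely
  b <- area / a                                  # the other side is then fixed by the area
  theta <- runif(1, 0, 2 * pi)                   # random rotation
  R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
  corners <- rbind(c(0, 0), c(a, 0), c(a, b), c(0, b))
  corners %*% t(R)                               # translate afterwards if needed
}
random_triangle <- function(area, base_range = c(1, 20)) {
  b <- runif(1, base_range[1], base_range[2])    # random base length
  h <- 2 * area / b                              # height forced by the area
  apex_x <- runif(1, 0, b)                       # the apex can slide along the base
  list(x = c(0, b, apex_x), y = c(0, 0, h))
}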
In summary, a shape has a number of "degrees of freedom", and constraining the area to be fixed will remove at least one of those freedoms[1], so if you start building a shape with random numbers you'll come to a point where you have to put in a computed value.
[1] exactly one? I'm not sure - these aren't degrees of freedom in the statistical sense...
I would suggest coding a random walk of adjacent tiny squares, so that the aggregation of the tiny squares could be of arbitrary shape with known area.
http://en.wikipedia.org/wiki/File:Random_walk_in2D.png
It would be very tough to make a generic method.
But you could code up examples for 3-, 4-, and 5-sided objects.
Here is an example of a random triangle (in C#):
class Triangle
{
    public double Angle1;    // base angle at one end, in degrees
    public double Angle2;    // base angle at the other end, in degrees
    // third angle is 180 - Angle1 - Angle2
    public double Base;      // length of the base
}

Triangle randomTriangle(double area)
{
    // A = (base * height) / 2.0
    var rng = new Random();
    double angle1 = rng.NextDouble() * 177.0 + 1.0;              // random angle in (1, 178)
    double angle2 = rng.NextDouble() * (178.0 - angle1) + 1.0;   // keeps angle1 + angle2 < 180
    // use trig to get the height in terms of the angles and the base:
    // height = base * tan(angle1) * tan(angle2) / (tan(angle1) + tan(angle2))
    double t1 = Math.Tan(angle1 * Math.PI / 180.0);
    double t2 = Math.Tan(angle2 * Math.PI / 180.0);
    double heightPerBase = t1 * t2 / (t1 + t2);
    double baseLen = Math.Sqrt(area * 2.0 / heightPerBase);      // "base" is a C# keyword, hence baseLen
    return new Triangle { Angle1 = angle1, Angle2 = angle2, Base = baseLen };
}

Resources