Calculating the distance between a polygon and a point in R

I have a (not necessarily convex) polygon without self-intersections and a point outside this polygon. I'm wondering how to calculate the Euclidean distance most efficiently in a 2-dimensional space. Is there a standard method in R?
My first idea was to calculate the minimum distance over all the lines of the polygon (extended infinitely, so they are lines rather than line segments), computing the distance from the point to each individual line using the start of the line segment and Pythagoras.
Do you know about a package that implements an efficient algorithm?

You could use the rgeos package and the gDistance function. This will require you to prepare your geometries, creating spgeom objects from the data you have (I assume it is a data.frame or something similar). The rgeos documentation is very detailed (see the PDF manual of the package on the CRAN page); here is one relevant example from the gDistance documentation:
library(rgeos)
pt1 = readWKT("POINT(0.5 0.5)")
pt2 = readWKT("POINT(2 2)")
p1 = readWKT("POLYGON((0 0,1 0,1 1,0 1,0 0))")
p2 = readWKT("POLYGON((2 0,3 1,4 0,2 0))")
gDistance(pt1,pt2)
gDistance(p1,pt1)
gDistance(p1,pt2)
gDistance(p1,p2)
readWKT is included in rgeos as well.
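For these geometries, the four calls return about 2.1213 (point to point), 0 (pt1 lies inside p1), about 1.4142 (pt2 to the square's nearest corner at (1,1)), and 1 (the gap between the square and the triangle).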
rgeos is based on the GEOS library, one of the de facto standards in geometric computing. If you don't feel like reinventing the wheel, this is a good way to go.

I decided to return and write up a theoretical solution, just for posterity. This isn't the most concise example, but it is fully transparent for those who want to know how to go about solving a problem like this by hand.
The theoretical algorithm
First, our assumptions.
We assume that the polygon's vertices are given in rotational order (clockwise or counter-clockwise) and that the polygon's edges do not intersect. This means we have a normal geometric polygon, and not some strangely-defined vector-graphic shape.
We assume this is a set of Cartesian coordinates, using 'x' and 'y' values that represent location on a 2-dimensional plane.
We assume the point must be outside the internal area of the polygon.
Finally, we assume that the distance desired is the minimum distance between the point and all of the infinite number of points on the perimeter of the polygon.
Now, before coding, we should write out in basic terms what we want to do. The point of the polygon closest to the target point will always be one of two things: a vertex of the polygon, or a point on an edge between two adjacent vertices. With this in mind, we perform the following steps:
1. Calculate the distances between all vertices and the target point.
2. Find the two vertices closest to the target point.
3. If either
(a) the two closest vertices are not adjacent, or
(b) the inside angle at either of the two closest vertices is greater than or equal to 90 degrees,
then the closest vertex is the closest point. Calculate the distance between that vertex and the target point.
4. Otherwise, calculate the height of the triangle formed by the two closest vertices and the target point.
We're basically just looking to see if a vertex is closest to the point or if a point on a line is closest to the point. We have to use a few trig functions to make this work.
The code
To make this work properly, we want to avoid any 'for' loops and want to only use vectorized functions when looking at the entire list of polygon vertices. Luckily, this is pretty easy in R. We accept a data frame with 'x' and 'y' columns for our polygon's vertices, and we accept a vector with one 'x' and 'y' value for the point's location.
get_Point_Dist_from_Polygon <- function(.polygon, .point){
  # Calculate all vertex distances from the target point.
  vertex_Distance <- sqrt((.point[1] - .polygon$x)^2 + (.point[2] - .polygon$y)^2)
  # Select the two closest vertices. Masking the closest entry with Inf
  # (rather than dropping it) keeps the second index valid in the original vector.
  min_1_Index <- which.min(vertex_Distance)
  min_2_Index <- which.min(replace(vertex_Distance, min_1_Index, Inf))
  # Calculate the lengths of the triangle sides formed by
  # the target point and the two closest vertices.
  a <- vertex_Distance[min_1_Index]
  b <- vertex_Distance[min_2_Index]
  c <- sqrt(diff(.polygon$x[c(min_1_Index, min_2_Index)])^2 +
            diff(.polygon$y[c(min_1_Index, min_2_Index)])^2)
  # The two vertices are adjacent if their indices differ by 1,
  # or if they are the first and last vertices of the closed polygon.
  adjacent <- abs(min_1_Index - min_2_Index) %in% c(1, nrow(.polygon) - 1)
  if(!adjacent ||
     acos((b^2 + c^2 - a^2)/(2*b*c)) >= pi/2 ||
     acos((a^2 + c^2 - b^2)/(2*a*c)) >= pi/2
  ){
    # Step 3 of the algorithm: the closest vertex is the closest point.
    return(vertex_Distance[min_1_Index])
  } else {
    # Step 4 of the algorithm: the height of the triangle over side c,
    # obtained via Heron's formula (area), not the law of cosines.
    return(sqrt((a+b-c) * (a-b+c) * (-a+b+c) * (a+b+c)) / (2 * c))
  }
}
Demo
polygon <- read.table(text="
x, y
0, 1
1, 0.8
2, 1.3
3, 1.4
2.5,0.3
1.5,0.5
0.5,0.1", header=TRUE, sep=",")
point <- c(3.2, 4.1)
get_Point_Dist_from_Polygon(polygon, point)
# 2.707397

Alternatively:
p2poly <- function(pt, poly){
  # Close the polygon if needed
  if(!identical(poly[1,], poly[nrow(poly),])){ poly <- rbind(poly, poly[1,]) }
  # A simple distance function
  dis <- function(x0, x1, y0, y1){ sqrt((x0-x1)^2 + (y0-y1)^2) }
  d <- c()  # Your distance vector
  for(i in 1:(nrow(poly)-1)){
    ba <- c((pt[1] - poly[i,1]), (pt[2] - poly[i,2]))              # Vector BA
    bc <- c((poly[i+1,1] - poly[i,1]), (poly[i+1,2] - poly[i,2]))  # Vector BC
    dbc <- dis(poly[i+1,1], poly[i,1], poly[i+1,2], poly[i,2])     # Length of BC
    dp <- (ba[1]*bc[1] + ba[2]*bc[2]) / dbc                        # Projection of A on BC
    if(dp <= 0){           # If projection is outside of BC on the B side
      d[i] <- dis(pt[1], poly[i,1], pt[2], poly[i,2])
    } else if(dp >= dbc){  # If projection is outside of BC on the C side
      d[i] <- dis(poly[i+1,1], pt[1], poly[i+1,2], pt[2])
    } else {               # If projection is inside of BC
      d[i] <- sqrt(abs((ba[1]^2 + ba[2]^2) - dp^2))
    }
  }
  min(d)
}
Example:
pt <- c(3,2)
triangle <- matrix(c(1,3,2,3,4,2),byrow=T, nrow=3)
p2poly(pt,triangle)
[1] 0.3162278

I used the distm() function in the geosphere package to calculate the distance when the point and the vertices are given as longitude/latitude coordinates. You can also easily adapt the planar solution above by substituting distm() for dis <- function(x0,x1,y0,y1){sqrt((x0-x1)^2 +(y0-y1)^2)}.
library(geosphere)

algo.p2poly <- function(pt, poly){
  # Close the polygon if needed.
  if(!identical(poly[1,], poly[nrow(poly),])){ poly <- rbind(poly, poly[1,]) }
  n <- nrow(poly) - 1
  pa <- distm(pt, poly[1:n, ])                     # point to each segment start
  pb <- distm(pt, poly[2:(n+1), ])                 # point to each segment end
  ab <- diag(distm(poly[1:n, ], poly[2:(n+1), ]))  # segment lengths
  # Height of each triangle over its segment, via Heron's formula.
  p <- (pa + pb + ab) / 2
  d <- 2 * sqrt(p * (p - pa) * (p - pb) * (p - ab)) / ab
  # Where the projection falls outside a segment (obtuse angle at an
  # endpoint), use the distance to the nearer endpoint instead.
  cosa <- (pa^2 + ab^2 - pb^2) / (2 * pa * ab)
  cosb <- (pb^2 + ab^2 - pa^2) / (2 * pb * ab)
  d[which(cosa <= 0)] <- pa[which(cosa <= 0)]
  d[which(cosb <= 0)] <- pb[which(cosb <= 0)]
  return(min(d))
}
Example:
poly <- matrix(c(114.33508, 114.33616,
114.33551, 114.33824,
114.34629, 114.35053,
114.35592, 114.35951,
114.36275, 114.35340,
114.35391, 114.34715,
114.34385, 114.34349,
114.33896, 114.33917,
30.48271, 30.47791,
30.47567, 30.47356,
30.46876, 30.46851,
30.46882, 30.46770,
30.47219, 30.47356,
30.47499, 30.47673,
30.47405, 30.47723,
30.47872, 30.48320),
byrow = F, nrow = 16)
pt1 <- c(114.33508, 30.48271)
pt2 <- c(114.6351, 30.98271)
algo.p2poly(pt1, poly)
algo.p2poly(pt2, poly)
Outcome:
> algo.p2poly(pt1, poly)
[1] 0
> algo.p2poly(pt2, poly)
[1] 62399.81
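Note that distm() defaults to distHaversine(), so the distances are in metres; the second result is therefore about 62.4 km.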

Related

Arc length between points on Archimedean spiral using MATLAB or R

I need to derive xy values and calculate the arc length between each pair of consecutive xy values, i.e. a length for every value of i generated by the code below (excluding the origin). The points follow an Archimedean spiral path. I don't have MATLAB and am using R, but the closest example I've found that I can interpret is a MATLAB one found here, with credit to Jos. Below is a modified version of that MATLAB script to generate the xy data:
r = 938; %outer radius
a = 0; %inner radius
b = 7; %increment per rev
n = (r - a)./(b); %number of revolutions
th = 2*n*pi; %angle
i = linspace(0,n,n*1000);
x = (a+b*i).* cos(2*pi*i);
y = (a+b*i).* sin(2*pi*i);
and the R equivalent:
r <- 938 # outer radius
a <- 0 # inner radius
b <- 7 # increment per revolution
n <- (r - a)/b # number of revolutions
th <- 2*n*pi # angle
i <- seq(0, n, length.out = n*1000) # number of points per revolution
x <- (a+b*i) * cos(2*pi*i)
y <- (a+b*i) * sin(2*pi*i)
My assumption is that the easiest way to derive the arc length between every point is to coerce i, x, and y into a MATLAB table (a data frame in R). The closest thing I've found for calculating arc length is this formula for the total length. I'm unable to interpret math notation, so I'm not sure how to implement it, or how to modify it to calculate the arc length between every point. Using the example of the first spiral in the link above for calculating the total length, I tried:
sqrt((5 + 0.1289155 * 47.12389)^2 + (0.1289155)^2) * 47.12389
The link above says the result should be 378.8 but my attempt returns 521.9324. So in sum, how is the arc length between points derived in MATLAB or R?
The exact formula for the length, with your notations for a (start radius), r (end radius) and b (increment per revolution), reduces to

len = ∫ √(1 + A²ρ²) dρ, evaluated from ρ = a to ρ = r, i.e.

len = (A·r·√(1 + A²r²) + asinh(A·r)) / (2A) − (A·a·√(1 + A²a²) + asinh(A·a)) / (2A), where A = 2π/b

(asinh(t) = log(t + √(1+t²)), which appears as the −log(−t + √(1+t²)) term in the code below; note also that, to preserve the OP's notation, the same symbol r carries two different meanings here, which some might frown upon).
That formula can be implemented this way:
r <- 938 # outer radius
a <- 0 # inner radius
b <- 7 # increment per revolution
A <- 2 * pi / b
fa <- sqrt(1 + A^2 * a^2)
fr <- sqrt(1 + A^2 * r^2)
int_r <- (A*r*fr - log(-(A*r)+fr))/(2*A)
int_a <- (A*a*fa - log(-(A*a)+fa))/(2*A)
spiralLen <- int_r - int_a #exact formula
394877.5
You can also use numerical (approximate) integration, via R's stats::integrate, to evaluate the integral:
integrate(function(r){sqrt(4*pi^2*r^2/b^2+1)}, a, r)
394877.3 with absolute error < 5.8
Another method gives a rather rough approximation, but is a very good verification because it doesn't use any theoretical considerations; it just takes the data you generated and sums up the lengths of the segments between all consecutive points:
dx <- diff(x)
dy <- diff(y)
len_approx <- sum(sqrt(dx^2 + dy^2))
394876.8
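Since the question also asks for the arc length between every pair of consecutive points, the same differences give a per-point version directly (at 1000 points per revolution, each chord is a very good approximation of its short arc):
seg_len <- sqrt(dx^2 + dy^2)  # length of each consecutive segment
cum_len <- cumsum(seg_len)    # arc length from the start to each point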
As for plotting in R, since you already have the set of points, the basic plot function does the job:
plot(x, y, type="l")

R terra calculate area moment of inertia OR how to get (weighted) raster-cell distance from patch-centroid

I'm trying to calculate a measure akin to the moment of inertia using a raster layer, and I am struggling to figure out how to get the distance of each cell to a patch's centroid and then extract both that distance and the cell's value.
I want to calculate the moment of inertia (get the squared distance of each cell to its patch's centroid, multiply it by the value of the cell, sum these values by patch, and then divide by the sum of all values per patch). I provide a simplified set-up below. The code creates a simple raster layer, patches clusters of cells, and gets their centroids. I know that the function to use next is probably terra::distance (maybe in combination with terra::zonal?!) -- how do I calculate the distance by patch?
#lonlat
library(terra)
r <- rast(ncols=36, nrows=18, crs="+proj=longlat +datum=WGS84")
r[498:500] <- 1
r[3:6] <- 1
r[111:116] <- 8
r[388:342] <- 1
r[345:349] <- 3
r_patched <- patches(r, directions = 8, allowGaps = F)
testvector <- terra::as.polygons(r_patched, trunc=T, dissolve = T)
p_centr <- geom(centroids(testvector), df=T)
##next steps
#1. get distance of each cell from patch's centroid
#r <- distance(r)
#2. multiply cell value by squared distance to centroid
I think you need to loop over the patches. Something like this:
p_centr <- centroids(testvector)
v <- rep(NA, length(p_centr))
for (i in 1:length(p_centr)) {
  # isolate patch i and trim the raster to its extent
  x <- ifel(r_patched == p_centr$patches[i], i, NA)
  x <- trim(x)
  # distance from every cell of the patch to the patch centroid
  d <- distance(x, p_centr[i,])
  d <- mask(d, x)
  # square distance and multiply with cell values
  d <- d^2 * crop(r, d)
  v[i] <- global(d, "sum", na.rm=TRUE)[[1]]
}
v / sum(v)
#[1] 1.213209e-05 1.324495e-02 9.864759e-01 2.669833e-04

Angles Between Continuous 2D Vectors

I'm having some trouble calculating the clockwise angles between continuous 2D vectors. My computed angles do not seem right when I compare them by eye on a plot. What follows is my process in R.
If necessary, install and enable the "circular" package:
install.packages('circular')
library(circular)
Generate a small data frame of 2D coordinates:
functest <- data.frame(x=c(2,8,4,9,10,7),y=c(6,8,2,5,1,4))
Plot the points for reference:
windows(height=8,width=8)
par(pty="s")
plot(functest, main = "Circular Functions Test")
## draw arrows from point to point :
s <- seq(length(functest$x)-1) # one shorter than data
arrows(functest$x[s], functest$y[s], functest$x[s+1], functest$y[s+1], col = 1:3)
Create a function that computes the angle between two vectors:
angle <- function(m) {
  # m is a matrix
  dot.prod <- crossprod(m[, 1], m[, 2])
  norm.x <- norm(m[, 1], type = "2")
  norm.y <- norm(m[, 2], type = "2")
  theta <- acos(dot.prod / (norm.x * norm.y))
  as.numeric(theta)  # returns the angle in radians
}
Generate a vector of compass angles in degrees (clockwise rotation):
functest_matrix <- cbind(x = functest$x,y = functest$y)
moves <- apply(functest_matrix, 2, diff)
tst <- lapply(seq(nrow(moves) - 1), function(idx) moves[c(idx, idx + 1), ])
functest_angles <- vapply(tst, angle, numeric(1))
functest_object <- circular(functest_angles, type="angles", units="radians", zero=0, rotation = "counter")
functest_convert <- conversion.circular(functest_object, type = "angles", units = "degrees", rotation = "clock", zero = pi/2)
functest_compass <- lapply(functest_convert, function(x) {if (x < 0) x+360 else x}) # converts any negative rotations to positive
I suspect something may be going wrong in my last three lines of code, where I try to convert "normal" counterclockwise angles in radians to clockwise compass angles in degrees. Any help would be greatly appreciated!
I don't know R, but I see that you calculate the angle between vectors using the scalar product.
Note that the resulting angle is not directed: it is neither clockwise nor counterclockwise (the scalar product is insensitive to exchanging the vectors).
If you really need the directed angle (the angle needed to rotate the first vector to make it collinear with the second one), you have to apply the ArcTan2 (atan2) approach (the result range is usually -Pi..Pi):
Theta = ArcTan2(CrossProduct(v1,v2), DotProduct(v1,v2))
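In R, a minimal sketch of that atan2 approach for 2D vectors (the helper name is mine):
# directed angle (radians, counterclockwise positive) rotating v1 onto v2;
# the 2D cross product supplies the sign, the dot product the magnitude
signed_angle <- function(v1, v2) {
  atan2(v1[1]*v2[2] - v1[2]*v2[1], v1[1]*v2[1] + v1[2]*v2[2])
}
signed_angle(c(1, 0), c(0, 1))  # pi/2; negate and convert to degrees for clockwise compass angles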
A note on this line:
dot.prod <- crossprod(m[, 1], m[, 2])
It looks like a cross product assigned to a variable named dot product, and those are two very different things: the dot product produces a scalar, while the vector cross product produces a vector orthogonal to the other two. In R, however, crossprod(x, y) actually computes t(x) %*% y, which for two plain vectors is exactly the dot product, so the variable name is accurate even though the function name invites confusion. Just keep in mind that the dot product alone can only give you an undirected angle.

Greatest distance between set of longitude/latitude points

I have a set of lng/lat coordinates. What would be an efficient method of calculating the greatest distance between any two points in the set (the "maximum diameter" if you will)?
A naive way is to use the Haversine formula to calculate the distance between each pair of points and take the maximum, but this obviously doesn't scale well.
Edit: the points are located in a sufficiently small area, measuring the area in which a person carrying a mobile device was active in the course of a single day.
Theorem #1: The ordering of any two great circle distances along the surface of the earth is the same as the ordering as the straight line distance between the points where you tunnel through the earth.
Hence turn your lat-long into x,y,z based either on a spherical earth of arbitrary radius or an ellipsoid of given shape parameters. That's a couple of sines/cosines per point (not per pair of points).
Now you have a standard 3-d problem that doesn't rely on computing Haversine distances. The distance between points is just Euclidean (Pythagoras in 3d). Needs a square-root and some squares, and you can leave out the square root if you only care about comparisons.
There may be fancy spatial tree data structures to help with this. Or algorithms such as http://www.tcs.fudan.edu.cn/rudolf/Courses/Algorithms/Alg_ss_07w/Webprojects/Qinbo_diameter/2d_alg.htm (click 'Next' for 3d methods). Or C++ code here: http://valis.cs.uiuc.edu/~sariel/papers/00/diameter/diam_prog.html
Once you've found your maximum distance pair, you can use the Haversine formula to get the distance along the surface for that pair.
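A minimal R sketch of the tunnel-distance idea, assuming lon and lat vectors in degrees (brute force over the pairwise matrix, so for modest n only):
rad <- pi / 180
xyz <- cbind(cos(lat * rad) * cos(lon * rad),  # unit-sphere x
             cos(lat * rad) * sin(lon * rad),  # unit-sphere y
             sin(lat * rad))                   # unit-sphere z
D <- as.matrix(dist(xyz))                      # straight-line (chord) distances
ind <- arrayInd(which.max(D), dim(D))          # indices of the farthest pair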
I think that the following could be a useful approximation, which scales linearly instead of quadratically with the number of points, and is quite easy to implement:
calculate the center of mass M of the points
find the point P0 that has the maximum distance to M
find the point P1 that has the maximum distance to P0
approximate the maximum diameter with the distance between P0 and P1
This can be generalized by repeating step 3 N times, and taking the distance between P(N-1) and P(N).
Step 1 can be carried out efficiently approximating M as the average of longitudes and latitudes, which is OK when distances are "small" and the poles are sufficiently far away. The other steps could be carried out using the exact distance formula, but they are much faster if the points' coordinates can be approximated as lying on a plane. Once the "distant pair" (hopefully the pair with the maximum distance) has been found, its distance can be re-calculated with the exact formula.
An example of approximation could be the following: if φ(M) and λ(M) are latitude and longitude of the center of mass calculated as Σφ(P)/n and Σλ(P)/n,
x(P) = (λ(P) - λ(M) + C) cos(φ(P))
y(P) = φ(P) - φ(M) [ this is only for clarity, it can also simply be y(P) = φ(P) ]
where C is usually 0, but can be ± 360° if the set of points crosses the λ=±180° line. To find the maximum distance you simply have to find
max((x(P(N)) - x(P(N-1)))² + (y(P(N)) - y(P(N-1)))²)
(you don't need the square root because it is monotonic)
The same coordinate transformation could be used to repeat step 1 (in the new coordinate system) in order to have a better starting point. I suspect that if some conditions are met, the above steps (without repeating step 3) always lead to the "true distant pair" (my terminology). If I only knew which conditions...
EDIT:
I hate building on others' solutions, but someone will have to.
Still keeping the above 4 steps, with the optional (but probably beneficial, depending on the typical distribution of points) repetition of step 3,
and following the solution of Spacedman,
doing calculations in 3D overcomes the limitations of closeness and distance from poles:
x(P) = sin(φ(P))
y(P) = cos(φ(P)) sin(λ(P))
z(P) = cos(φ(P)) cos(λ(P))
(the only approximation is that this holds only for a perfect sphere)
The center of mass is given by x(M) = Σx(P)/n, etc.,
and the maximum one has to look for is
max((x(P(N)) - x(P(N-1)))² + (y(P(N)) - y(P(N-1)))² + (z(P(N)) - z(P(N-1)))²)
So: you first transform spherical to cartesian coordinates, then start from the center of mass, to find, in at least two steps (steps 2 and 3), the farthest point from the preceding point. You could repeat step 3 as long as the distance increases, perhaps with a maximum number of repetitions, but this won't take you away from a local maximum. Starting from the center of mass is not of much help, either, if the points are spread all over the Earth.
EDIT 2:
I learned enough R to write down the core of the algorithm (nice language for data analysis!)
For the plane approximation, ignoring the problem around the λ=±180° line:
# input: lng, lat (vectors)
rad = pi / 180;
x = (lng - mean(lng)) * cos(lat * rad)
y = (lat - mean(lat))
i = which.max((x - mean(x))^2 + (y )^2)
j = which.max((x - x[i] )^2 + (y - y[i])^2)
# output: i, j (indices)
On my PC it takes less than a second to find the indices i and j for 1000000 points. The following 3D version is a bit slower, but works for any distribution of points (and does not need to be amended when the λ=±180° line is crossed):
# input: lng, lat
rad = pi / 180
x = sin(lat * rad)
f = cos(lat * rad)
y = sin(lng * rad) * f
z = cos(lng * rad) * f
i = which.max((x - mean(x))^2 + (y - mean(y))^2 + (z - mean(z))^2)
j = which.max((x - x[i] )^2 + (y - y[i] )^2 + (z - z[i] )^2)
k = which.max((x - x[j] )^2 + (y - y[j] )^2 + (z - z[j] )^2) # optional
# output: j, k (or i, j)
The calculation of k can be left out (i.e., the result could be given by i and j), depending on the data and on the requirements. On the other hand, my experiments have shown that calculating a further index is useless.
It should be remembered that, in any case, the distance between the resulting points is an estimate which is a lower bound of the "diameter" of the set, although it very often will be the diameter itself (how often depends on the data.)
EDIT 3:
Unfortunately the relative error of the plane approximation can, in extreme cases, be as much as 1-1/√3 ≅ 42.3%, which may be unacceptable, even if very rare. The algorithm can be modified in order to have an upper bound of approximately 20%, which I have derived by compass and straight-edge (the analytic solution is cumbersome). The modified algorithm finds a pair of points with a locally maximal distance, then repeats the same steps, but this time starting from the midpoint of the first pair, possibly finding a different pair:
# input: lng, lat
rad = pi / 180
x = (lng - mean(lng)) * cos(lat * rad)
y = (lat - mean(lat))
i.n_1 = 1        # n_1: n-1
x.n_1 = mean(x)
y.n_1 = 0        # = mean(y)
s.n_1 = 0        # s: square of distance
repeat {
  s = (x - x.n_1)^2 + (y - y.n_1)^2
  i.n = which.max(s)
  x.n = x[i.n]
  y.n = y[i.n]
  s.n = s[i.n]
  if (s.n <= s.n_1) break
  i.n_1 = i.n
  x.n_1 = x.n
  y.n_1 = y.n
  s.n_1 = s.n
}
i.m_1 = 1
x.m_1 = (x.n + x.n_1) / 2
y.m_1 = (y.n + y.n_1) / 2
s.m_1 = 0
m_ok = TRUE
repeat {
  s = (x - x.m_1)^2 + (y - y.m_1)^2
  i.m = which.max(s)
  if (i.m == i.n || i.m == i.n_1) { m_ok = FALSE; break }
  x.m = x[i.m]
  y.m = y[i.m]
  s.m = s[i.m]
  if (s.m <= s.m_1) break
  i.m_1 = i.m
  x.m_1 = x.m
  y.m_1 = y.m
  s.m_1 = s.m
}
if (m_ok && s.m > s.n) {
  i = i.m
  j = i.m_1
} else {
  i = i.n
  j = i.n_1
}
# output: i, j
The 3D algorithm can be modified in a similar way. It is possible (both in the 2D and in the 3D case) to start over once again from the midpoint of the second pair of points (if found). The upper bound in this case is "left as an exercise for the reader" :-).
Comparison of the modified algorithm with the (too) simple algorithm has shown, for normal and for square uniform distributions, a near doubling of processing time, and a reduction of the average error from 0.6% to 0.03% (an order of magnitude). A further restart from the midpoint results in a just slightly better average error, and an almost equal maximum error.
EDIT 4:
I have yet to study this article, but it looks like the 20% bound I found by compass and straight-edge is in fact 1-1/√(5-2√3) ≅ 19.3%.
Here's a naive example that doesn't scale well (as you say), but might help with building a solution in R.
## lonlat points
n <- 100
d <- cbind(runif(n, -180, 180), runif(n, -90, 90))
library(sp)
## distances on WGS84 ellipsoid
x <- spDists(d, longlat = TRUE)
## row, then column index of furthest points
ind <- c(row(x)[which.max(x)], col(x)[which.max(x)])
## maps
library(maptools)
data(wrld_simpl)
plot(as(wrld_simpl, "SpatialLines"), col = "grey")
points(d, pch = 16, cex = 0.5)
## draw the points and a line between on the page
points(d[ind, ], pch = 16)
lines(d[ind, ], lwd = 2)
## for extra credit, draw the great circle on which the furthest points lie
library(geosphere)
lines(greatCircle(d[ind[1], ], d[ind[2], ]), col = "firebrick")
The geosphere package provides more options for distance calculation if that's needed. See ?spDists in sp for the details used here.
You don't tell us whether these points will be located in a sufficiently small part of the globe. For truly global sets of points, my first guess would be running a naive O(n^2) algorithm, possibly getting a performance boost from some spatial indexing (R*-trees, octal trees, etc.). The idea is to pre-generate the list of point pairs from the triangle of the distance matrix and feed it in chunks to a fast distance library, to minimize I/O and process churn. Haversine is fine; you could also do it with Vincenty's method (the greatest contributor to running time is the quadratic complexity, not the (fixed number of) iterations in Vincenty's formula). As a side note, you don't in fact need R for this stuff.
EDIT #2: The Barequet-Har-Peled algorithm (as pointed at by Spacedman in his reply) has O((n+1/(e^3))log(1/e)) complexity for e>0, and is worth exploring.
For the quasi-planar problem, this is known as "diameter of convex hull" and has three parts:
Computing the convex hull with Graham's scan, which is O(n*log(n)); in fact, one should first try transforming the points into a transverse Mercator projection (using the centroid of the points in the data set).
Finding antipodal points by Rotating Calipers algorithm - linear O(n).
Finding the largest distance among all antipodal pairs - linear search, O(n).
The link with pseudo-code and discussion: http://fredfsh.com/2013/05/03/convex-hull-and-its-diameter/
See also the discussion on a related question here: https://gis.stackexchange.com/questions/17358/how-can-i-find-the-farthest-point-from-a-set-of-existing-points
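In R, the hull step is available out of the box as chull() (for points already projected to a plane, as in step 1); because the hull is usually tiny compared to n, a brute-force pass over the hull vertices can stand in for rotating calipers:
h <- chull(x, y)                         # convex hull vertex indices, O(n*log(n))
D <- as.matrix(dist(cbind(x[h], y[h])))  # pairwise distances among hull vertices only
far <- arrayInd(which.max(D), dim(D))
pair <- h[far]                           # the diameter pair, as indices into the original data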
EDIT: Spacedman's solution pointed me to the Malandain-Boissonnat algorithm (see the paper in pdf here). However, it is no better than the brute-force naive O(n^2) algorithm.

R: Converting cartesian coordinates to polar coordinates, and then calculating distance from origin

I've been looking for a solution to convert cartesian coordinates (lat, long) that I have to polar coordinates in order to facilitate a simulation that I want to run, but I haven't found any questions or answers here for doing this in R. There are a number of options, including the built in function cart2pol in Matlab, but all of my data are in R and I'd like to continue getting comfortable working in this framework.
Question:
I have lat/long coordinates from tagging data, and I want to convert these to polar coordinates (meaning jump size and angle: http://en.wikipedia.org/wiki/Polar_coordinate_system) so that I can then shuffle or bootstrap them (haven't decided which) about 1,000 times, and calculate the straight-line distance of each simulated track from the starting point. I have a true track, and I'm interested in determining if this animal is exhibiting site affinity by simulating 1,000 random tracks with the same jump sizes and turning angles, but in completely different orders and combinations. So I need 1,000 straight-line distances from the origin to create a distribution of distances and then compare this to my true data set's straight-line distance.
I'm comfortable doing the bootstrapping, but I'm stuck at the very first step, which is converting my cartesian lat/long coordinates to polar coordinates (jump size and turning angle). I know there are built in functions to do this in other programs such as Matlab, but I can't find any way to do it in R. I could do it manually by hand in a for-loop, but if there's a package out there or any easier way to do it, I'd much prefer that.
Ideally I'd like to convert the data to polar coordinates, run the simulation, and then for each random track output an end point as cartesian coordinates, lat/long, so I can then calculate the straight-line distance traveled.
I didn't post any sample data, as it would just be a two-column data frame of lat and long coordinates.
Thanks for any help you can provide! If there's an easy explanation somewhere on this site or others that I missed, please point me in that direction! I couldn't find anything.
Cheers
For x-y coordinates that are in the same units (e.g. meters rather than degrees of latitude and degrees of longitude), you can use this function to get a data.frame of jump sizes and turning angles (in degrees).
getSteps <- function(x, y) {
  d <- diff(complex(real = x, imaginary = y))
  data.frame(size = Mod(d),
             angle = c(NA, diff(Arg(d)) %% (2*pi)) * 360/(2*pi))
}
## Try it out
set.seed(1)
x <- rnorm(10)
y <- rnorm(10)
getSteps(x, y)
# size angle
# 1 1.3838360 NA
# 2 1.4356900 278.93771
# 3 2.9066189 101.98625
# 4 3.5714584 144.00231
# 5 1.6404354 114.73369
# 6 1.3082132 135.76778
# 7 0.9922699 74.09479
# 8 0.2036045 141.67541
# 9 0.9100189 337.43632
## A plot helps check that this works
plot(x, y, type = "n", asp = 1)
text(x, y, labels = 1:10)
You can do a transformation between cartesian and polar coordinates this way:
polar2cart <- function(r, theta) {
  data.frame(x = r * cos(theta), y = r * sin(theta))
}
cart2polar <- function(x, y) {
  data.frame(r = sqrt(x^2 + y^2), theta = atan2(y, x))
}
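A quick round-trip check of these two helpers:
p <- cart2polar(3, 4)      # r = 5, theta = atan2(4, 3)
polar2cart(p$r, p$theta)   # recovers x = 3, y = 4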
Since it is fairly straightforward, you can write your own function. A Matlab-like cart2pol function in R:
cart2pol <- function(x, y)
{
  r <- sqrt(x^2 + y^2)
  t <- atan2(y, x)  # atan2 gets the quadrant right, unlike atan(y/x)
  c(r, t)
}
I used Josh O'Brien's code and got what appear to be reasonable jumps and angles; they match up pretty well to eyeballing the rough distance and heading between points. I then used a formula from his suggestions to create a function that turns the polar coordinates back into cartesian coordinates, and a for loop to apply the function to the data frame of all the polar coordinates. The loop appears to work, and the outputs are in the correct units, but I don't believe the values it outputs correspond to my data. So either I made a miscalculation in my formula, or there's something else going on. More details below:
Here's the head of my lat long data:
> head(Tag1SSM[,3:4])
lon lat
1 130.7940 -2.647957
2 130.7873 -2.602994
3 130.7697 -2.565903
4 130.7579 -2.520757
5 130.6911 -2.704841
6 130.7301 -2.752182
When I plot the full dataset just as values, the track looks exactly the same as if I were to plot it using any spatial or mapping package in R.
I then used Josh's function to convert my data to polar coordinates:
x<-Tag1SSM$lon
y<-Tag1SSM$lat
getSteps <- function(x, y) {
  d <- diff(complex(real = x, imaginary = y))
  data.frame(size = Mod(d),
             angle = c(NA, diff(Arg(d)) %% (2*pi)) * 360/(2*pi))
}
which produced the following polar coordinates appropriately:
> polcoords<-getSteps(x,y)
> head(polcoords)
size angle
1 0.04545627 NA
2 0.04103718 16.88852
3 0.04667590 349.38153
4 0.19581350 145.35439
5 0.06130271 59.37629
6 0.01619242 31.86359
Again, these look right to me, and correspond well to the actual angles and relative distances between points. So far so good.
Now I want to convert these back to cartesian coordinates and calculate a euclidian distance from the origin. These don't have to be in true lat/long, as I'm just comparing them amongst themselves. So I'm happy for the origin to be set as (0,0) and for distances to be calculated in reference x,y values instead of kilometers or something like that.
So, I used this function with Josh's help and a bit of web searching:
polar2cart <- function(x, y, size, angle){
  # convert degrees to radians (multiply by pi/180)
  angle <- angle * pi / 180
  if(is.na(x)) { x <- 0 }  # for the purpose of the for loop below
  if(is.na(y)) { y <- 0 }
  newx <- x + size * sin(angle)  ## X: this is how you convert back to cartesian coordinates
  newy <- y + size * cos(angle)  ## Y
  return(c("x" = newx, "y" = newy))  # output the new x and y coordinates
}
And then plugged it into this for loop:
u <- polcoords$size
v <- polcoords$angle
n <- 162  # I want 162 new coordinates, starting from 0
N <- cbind(rep(NA, 163), rep(NA, 163))  # 163 rows: the first row is NA, which sets the starting point to 0,0
for(i in 1:n){
  # use the polar2cart function above to jump from the previous coordinate in N
  jump <- polar2cart(N[i,1], N[i,2], u[i+1], v[i+1])
  N[i+1,1] <- jump[1]  # new coords are calculated from each previous N entry
  N[i+1,2] <- jump[2]
}
Dist <- sqrt(N[163,1]^2 + N[163,2]^2)  # straight-line distance from the origin
And then I can take a look at N, with my new coordinates based on those jumps:
> N
[,1] [,2]
[1,] NA NA
[2,] 0.011921732 0.03926732
[3,] 0.003320851 0.08514394
[4,] 0.114640605 -0.07594871
[5,] 0.167393509 -0.04472125
[6,] 0.175941466 -0.03096891
This is where the problem is... the x,y coordinates from N get progressively larger; there's a bit of variation in there, but if you scroll down the list, y goes from 0.39 to 11.133, with very few backward steps to lower values. This isn't what my lat/long data do, and if I had calculated the cart->pol and pol->cart conversions properly, these new values from N should match my lat/long data, just in a different coordinate system. Plotted, the N values look nothing like the original track: the last point in N is the farthest point from the origin, while in my lat/long data the last point is actually quite close to the first point, and definitely not the farthest point away. I think the issue must be in my conversion from polar coordinates back to cartesian coordinates, but I'm not sure how to fix it...
Any help in solving this would be much appreciated!
Cheers
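One possible explanation, as a sketch (not tested against your data): getSteps() returns turning angles, i.e. differences between successive headings, while the polar2cart() loop above treats each angle as an absolute compass heading. Re-integrating the turns into headings before stepping should reproduce the track's shape, up to an arbitrary rotation and translation:
turns <- polcoords$angle[-1] * pi / 180       # drop the leading NA, convert to radians
headings <- cumsum(c(0, turns))               # absolute heading of each step; the first is arbitrary
steps <- complex(modulus = polcoords$size, argument = headings)
track <- cumsum(c(complex(real = 0), steps))  # positions, starting from the origin
plot(Re(track), Im(track), type = "l", asp = 1)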
I think this code I wrote converts to polar coordinates:
# example data
x<-runif(30)
y<-runif(30)
# center example around 0
x<-x-mean(x)
y<-y-mean(y)
# function to convert to polar coordinates
topolar <- function(x, y){
  # calculate angles
  alphas <- atan(y/x)
  # correct angles per quadrant
  quad2 <- which(x < 0 & y > 0)
  quad3 <- which(x < 0 & y < 0)
  quad4 <- which(x > 0 & y < 0)
  alphas[quad2] <- alphas[quad2] + pi
  alphas[quad3] <- alphas[quad3] + pi
  alphas[quad4] <- alphas[quad4] + 2*pi
  # calculate distances to 0,0
  r <- sqrt(x^2 + y^2)
  # return the output explicitly
  data.frame(alphas = alphas, r = r)
}
# call function
polar_out<-topolar(x,y)
# get out angles
the_angles<-polar_out$alphas
Another option, taking the angle in degrees:
pol2car <- function(angle, dist){
  rad <- angle * pi / 180  # sin/cos expect radians
  co <- dist * sin(rad)
  ca <- dist * cos(rad)
  return(list(x = ca, y = co))
}
pol2car(angle = 45, dist = sqrt(2))
The cart2sph function in the pracma package transforms between cartesian, spherical, polar, and cylindrical coordinate systems in two and three dimensions.
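A minimal usage sketch, assuming pracma's single-vector calling convention (input c(x, y), output c(phi, r)):
library(pracma)
cart2pol(c(1, 1))  # approximately c(0.7853982, 1.4142136)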
