Approaches for spatial geodesic latitude longitude clustering in R -- Follow-Up - r

These are follow-ups to the question and answer in "Approaches for spatial geodesic latitude longitude clustering in R with geodesic or great circle distances".
I would like to better understand:
Question #1: If all the lat / long values are within the same city, is it necessary to use either fossil or distHaversine(...) to first calculate great circle distances ?
or, within a single city, is it OK to run clustering on the lat/long values themselves ?
Question #2: jlhoward suggests that :
It's worth noting that these methods require that all points must go into some cluster. If you just ask which points are close together, and allow that some cities don't go into any cluster, you get very different results.
In my case I would like to just ask "which points are close together", without forcing every point into a cluster. How can I do this ?
Question #3: To include one or two factor variables into the clustering (in addition to lat/long), is it as easy as including those factor variables in the df upon which the clustering is run ?
Please confirm.
Thanks!

"within a single city, is it OK to run clustering on the lat/long values themselves ?"
Yes, as long as your city is on the equator, where a degree of longitude is the same distance as a degree of latitude.
I'm standing very close to the north pole. One degree of longitude is 1/360 of the circumference of the circle round the pole from me. Someone ten degrees east of me might only be ten feet away. Someone one degree south of me is miles away. A clustering algorithm based on lat-long would think that guy miles away was closer to me than the guy I can wave to ten degrees east of me.
The solution for small areas, to save having to compute great-circle or ellipsoid distances, is to project to a coordinate system that is near enough to Cartesian that you can use Pythagoras' theorem for distance without too much error. Typically you would use a UTM zone transform, which is essentially a coordinate system that puts its equator through your study area.
The spTransform function in sp and rgdal will sort this out for you.
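To illustrate why raw degrees mislead away from the equator, here is a minimal Python sketch (the thread is about R, so this is only an illustration, not the answerer's code). It uses a simple local equirectangular approximation as a stand-in for a proper UTM transform: longitude differences are scaled by the cosine of a reference latitude. The helper names and the mean Earth radius of 6371 km are assumptions made for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def to_local_xy_km(lat, lon, ref_lat, ref_lon):
    """Approximate planar (x east, y north) offsets in km from a reference point.

    Hypothetical helper: a local equirectangular approximation, reasonable
    only for small areas such as a single city. It is not a UTM transform.
    """
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_KM
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_KM
    return x, y


def planar_distance_km(p, q):
    """Pythagorean distance between two projected points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# At 60 degrees north, one degree of longitude is about half as long
# as one degree of latitude, so clustering on raw degrees distorts distances.
ref = (60.0, 10.0)
origin = to_local_xy_km(60.0, 10.0, *ref)
east = to_local_xy_km(60.0, 11.0, *ref)    # one degree east
north = to_local_xy_km(61.0, 10.0, *ref)   # one degree north
d_east = planar_distance_km(origin, east)
d_north = planar_distance_km(origin, north)
```

The projected x/y columns, rather than the raw lat/long values, would then be what gets passed to the clustering routine.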

Related

A* algorithm when the heuristic can not be calculated for some nodes

I am working on a dataset of cities and towns spread across North America with the objective of finding the shortest path between a starting point and an ending point. I decided to use the Haversine distance as my heuristic function. But, my dataset doesn't have the latitude and longitude coordinates for some of the towns that could lie in the shortest distance path. How am I supposed to calculate the heuristic in this case? Would taking the average of the heuristics of the neighboring towns make sense?
It is given that a town/city without its corresponding coordinates can't be the starting point or the ending point.
Is there a different heuristic I should be considering instead of the Haversine distance?
If I remember correctly (don’t trust me on this!) a heuristic that returns zero for some nodes is still "legal" (in the sense that when you get to the end node, you know it’s optimal), so that would be a brutal solution. Obviously, doing this for too many nodes would wreck your search performance!
I think that interpolating between neighbour locations risks creating an inadmissible heuristic.
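The zero-fallback idea can be sketched as follows. This is a minimal illustration under assumptions, not a tested routing engine: the graph layout, the function names, and the toy data are all hypothetical, and edge costs are assumed to be road distances in km that are never shorter than the straight-line distance. Because a heuristic with zeros mixed in may be inconsistent, the sketch lets a node be re-queued whenever a cheaper route to it is found.

```python
import heapq
import math


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def a_star(graph, coords, start, goal):
    """graph: {node: [(neighbor, edge_km), ...]}; coords: {node: (lat, lon)}, possibly with gaps."""
    def heuristic(node):
        if node in coords:
            return haversine_km(coords[node], coords[goal])
        return 0.0  # unknown location: zero never overestimates

    best = {start: 0.0}
    frontier = [(heuristic(start), 0.0, start, [start])]
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if g > best.get(node, math.inf):
            continue  # stale entry, a cheaper route was already found
        if node == goal:
            return g, path
        for nxt, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best.get(nxt, math.inf):
                best[nxt] = new_g
                heapq.heappush(frontier, (new_g + heuristic(nxt), new_g, nxt, path + [nxt]))
    return math.inf, []


# Toy data: town "B" has no coordinates, so its heuristic falls back to zero.
coords = {"A": (0.0, 0.0), "C": (0.0, 2.0)}
graph = {
    "A": [("B", 120.0), ("C", 300.0)],
    "B": [("A", 120.0), ("C", 120.0)],
    "C": [("A", 300.0), ("B", 120.0)],
}
cost, path = a_star(graph, coords, "A", "C")
```

In this toy graph the search still returns the route through the town without coordinates; the price of the zero heuristic is only that such nodes are explored more eagerly.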

Returning distance in miles or kilometers from manhattan distance formula output

I am running a taxicab distance function on a list of coordinates and I would like to convert the resulting value to a mile or km quantity. For example:
0.0117420 = |40.721319 - 40.712278| + |-73.844311 - -73.841610|
Where 0.0117420 is the output I would like to convert to mi/km. How could I go about this?
This appears to be a situation where you are trying to navigate from (40.721319, -73.844311) to (40.712278, -73.841610) where these are lat / lon pairs, and you want to navigate using a "Manhattan" routing rather than a direct great circle route.
It looks like you are considering these points as opposite corners of a "rectangle" where travel is only allowed along north, south, east and west headings to move from one point to another and where travel along the path always brings the traveler closer to the destination point.
An approximation of this is to find one of the corners of the bounding rectangle for all such paths. There are two of them, one at (40.721319, -73.841610) and the other at (40.712278, -73.844311). So, you can pick one of these and use it as a waypoint for approximating the length of each possible "Manhattan route" between the two points. If we choose the first, you will need to calculate the distance from the starting point to the waypoint and then to the destination point, such as:
l(0) = (40.721319, -73.844311)
l(1) = (40.721319, -73.841610)
l(2) = (40.712278, -73.841610)
Using the Haversine equations we see the distance from l(0) to l(1) is approximately 0.2276km and the distance from l(1) to l(2) is approximately 1.005km making the entire route about 1.2326km.
This is approximately the length of any "Manhattan route" you pick where the distance is strictly decreasing along the path taken between the two points. There are also some errors due to the curvature of the Earth, but for points this close to each other and so distant from either of the poles, this should be good enough for most applications.
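The waypoint calculation above can be reproduced with a short Python sketch. The haversine helper and the mean Earth radius of 6371 km are assumptions of this sketch; a different radius shifts the result slightly.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


start = (40.721319, -73.844311)      # l(0)
waypoint = (40.721319, -73.841610)   # l(1): same latitude as the start
end = (40.712278, -73.841610)        # l(2): same longitude as the waypoint

leg_east_west_km = haversine_km(*start, *waypoint)
leg_north_south_km = haversine_km(*waypoint, *end)
manhattan_km = leg_east_west_km + leg_north_south_km
manhattan_mi = manhattan_km / 1.609344  # international mile
```

With this radius the two legs come out at roughly 0.228 km and 1.005 km, in line with the figures quoted above.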

How to get the radius given Min and Max Latitude and Longitude?

Given coordinates:
min_lat=25.862491496700553
max_lat=26.358213103705367
min_lng=-80.790159828186
max_lng=-79.78628409576413
Is it possible to determine a radius?
"Given [two geographic points] [i]s it possible to determine a radius?"
As asked, no.
A radius is the distance from the center of a circle or sphere to its perimeter. It is used for 2D circles and 3D spheres.
That said, I suspect you may in fact be trying to ask a different question that can be answered.
You may be asking for either the "great circle distance" between two points (each point with a latitude and longitude value);
I would suggest reading Calculate distance, bearing and more between Latitude/Longitude points by Chris Veness and the StackOverflow question here: "How do I calculate distance between two latitude-longitude points?"
If you need an unusually high degree of accuracy (far more than almost everyone using Google Earth needs) you can use Vincenty's formulae, which take into account an ellipsoidal Earth model, as the Earth is not a perfect sphere.
Or you may have meant that you wish to find the mid-point of the four line segments along the perimeter of max and min latitude and longitude, e.g. (min_lat, min_lon), (min_lat, max_lon), (max_lat, min_lon), and (max_lat, max_lon), a less frequent request.
This is covered in more detail in the StackOverflow question "Mid point of two point where latitude and longitude given."
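If what is wanted is the radius of a circle that covers the bounding box, one possible reading (an assumption, the question does not say this) is the great-circle distance from the centre of the box to its farthest corner. The sketch below takes the centre as the plain average of the min and max values, which is only reasonable for a small box away from the poles and the antimeridian, and assumes a spherical Earth of radius 6371 km.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km on a sphere of the given radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    h = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


min_lat, max_lat = 25.862491496700553, 26.358213103705367
min_lng, max_lng = -80.790159828186, -79.78628409576413

# Assumed centre: simple average of the bounds (fine for a small box).
center_lat = (min_lat + max_lat) / 2
center_lng = (min_lng + max_lng) / 2

corners = [(min_lat, min_lng), (min_lat, max_lng),
           (max_lat, min_lng), (max_lat, max_lng)]

# "Radius" here means the distance from the centre to the farthest corner.
covering_radius_km = max(haversine_km(center_lat, center_lng, lat, lng)
                         for lat, lng in corners)
```

The sphere radius is an explicit parameter, so the same bounds give a different answer on a different sphere.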
No. You could mark out that region on a billiard ball, or on Jupiter.

Getting a handle on GIS math, where do I start?

I am in charge of a program that is used to create a set of nodes and paths for consumption by an autonomous ground vehicle. The program keeps track of the locations of all items in its map by indicating the item's position as being x meters north and y meters east of an origin point of 0,0. In the real world, the vehicle knows the location of the origin's lat and long, as it is determined by a dgps system and is accurate down to a couple centimeters. My program is ignorant of any lat long coordinates.
It is one of my goals to modify the program to keep track of lat long coords of items in addition to an origin point and items' x,y position in relation to that origin. At first blush, it seems that I am going to modify the program to allow the lat long coords of the origin to be passed in, and after that I desire that the program will automatically calculate the lat long of every item currently in a map. From what I've researched so far, I believe that I will need to figure out the math behind converting to lat long coords from a UTM like projection where I specify the origin points and meridians etc as opposed to whatever is defined already for UTM.
I've come to ask of you GIS programmers, am I on the right track? It seems to me like there is so much to wrap one's head around, and I'm not sure if the answer isn't something as simple as, "oh yeah, there's a conversion from meters to lat long, here"
Currently, due to the nature of DGPS, the system really doesn't need to care about locations more than oh, what... 40 km? radius away from the origin. Given this, and the fact that I need to make sure that the error on my coordinates is not greater than .5 meters, do I need anything more complex than a simple lat/long to meters conversion constant?
I'm knee deep in materials here. I could use some pointers about what concepts to research.
Thanks much!
Given a start point in lat/long and a distance and bearing, finding the end point is a geodesic calculation. There's a great summary of geodesic calculations and errors on the proj.4 website. They come to the conclusion that using a spherical model can get results for distance between points with at most 0.51% error. That, combined with a formula to translate between WGS-84 and ECEF (see the "LLA to ECEF" and "ECEF to LLA" sections), seems like it gets you what you need.
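For reference, the forward half of that translation (geodetic latitude, longitude and height to ECEF) is a closed-form expression. The Python sketch below uses the standard WGS-84 semi-major axis and flattening; the function name lla_to_ecef is just a label chosen here, and the inverse direction is not shown.

```python
import math

WGS84_A = 6378137.0                    # semi-major axis in metres
WGS84_F = 1 / 298.257223563            # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)     # first eccentricity squared


def lla_to_ecef(lat_deg, lon_deg, h_m=0.0):
    """Convert geodetic latitude/longitude (degrees) and ellipsoidal height (m) to ECEF metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h_m) * math.cos(lat) * math.cos(lon)
    y = (n + h_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + h_m) * math.sin(lat)
    return x, y, z


equator_point = lla_to_ecef(0.0, 0.0)
pole_point = lla_to_ecef(90.0, 0.0)
```

A point on the equator at the prime meridian lands on the x axis at the semi-major axis, and the pole lands on the z axis at the semi-minor axis, which is a quick sanity check for any implementation.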
If you want to really get the errors nailed down by inverse projecting your flat map to WGS-84, proj.4 is a projection software package. It has source code, and comes with three command line utilities - proj, which converts to/from cartographic projection and cartesian data; cs2cs, which converts between different cartographic projections; and geod, which calculates geodesic relationships.
The USGS publishes a very comprehensive treatment of map projections.
I'd do a full-up calculation if you can. That way you'll always be as accurate as you can be.
If you happen to be using C++, GDAL is a very good library.
For a range of 40km, you may find that approximating the world to a 2D flat surface may work, although a UTM transform would be the ideal way to go - in any case, I'd advocate using the actual WGS84 co-ordinates & ellipsoid for calculations such as great circle distance, or calculating bearings.
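As a rough sketch of the flat-surface approximation, the snippet below converts a north/east offset in metres to lat/long using the WGS-84 radii of curvature at the origin. This is a first-order approximation and an assumption of this sketch, not a UTM transform: its error grows with distance from the origin, so it should be checked against a full projection before relying on it at the 40 km range and 0.5 m tolerance mentioned in the question. The function name offset_to_latlon is hypothetical.

```python
import math

WGS84_A = 6378137.0             # semi-major axis in metres
WGS84_E2 = 0.00669437999014     # first eccentricity squared


def offset_to_latlon(origin_lat_deg, origin_lon_deg, north_m, east_m):
    """First-order conversion of a local north/east offset (metres) to lat/long (degrees)."""
    lat0 = math.radians(origin_lat_deg)
    s2 = math.sin(lat0) ** 2
    # Radius of curvature along the meridian (north-south direction).
    meridional = WGS84_A * (1 - WGS84_E2) / (1 - WGS84_E2 * s2) ** 1.5
    # Radius of curvature in the prime vertical (east-west direction).
    prime_vertical = WGS84_A / math.sqrt(1 - WGS84_E2 * s2)
    dlat = north_m / meridional
    dlon = east_m / (prime_vertical * math.cos(lat0))
    return (origin_lat_deg + math.degrees(dlat),
            origin_lon_deg + math.degrees(dlon))


# Example: an item 100 m north and 50 m east of an origin at 45 N, 7 E.
item_lat, item_lon = offset_to_latlon(45.0, 7.0, 100.0, 50.0)
```

Comparing its output with a proper projection over the intended working area is the simplest way to find out where the approximation stops being good enough.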
If you get bored, you could go down a similar line to something I've been working on, that can be used as a base class for differing datums such as OSGB36 or WGS84...

Determine the centroid of multiple points

I'm writing a mapping application in Python and I need to get the lat/lon centroid of N points.
Say I have two locations
a.lat = 101
a.lon = 230
b.lat = 146
b.lon = 200
Getting the center of two points is fairly easy using a Euclidean formula. I would like to be able to do it for more than two points.
Fundamentally I'm looking to do something like http://a.placebetween.us/ where one can enter multiple addresses and find the spot that is equidistant for everyone.
Have a look at the pdf document linked below. It explains how to apply the plane figure algorithm that Bill the Lizard mentions, but on the surface of a sphere.
poster thumbnail and some details http://img51.imageshack.us/img51/4093/centroidspostersummary.jpg
Source: http://www.jennessent.com/arcgis/shapes_poster.htm
There is also a 25 MB full-size PDF available for download.
Credit goes to mixdev for finding the link to the original source, and of course to Jenness Enterprises for making the information available. Note: I am in no way affiliated with the author of this material.
Adding to Andrew Rollings' answer.
You will also need to make sure that if you have points on either side of the 0/360 longitude line that you are measuring in the "right direction"
Is the center of (0,359) and (0, 1) at (0,0) or (0,180)?
If you are averaging angles and have to deal with them crossing the 0/360 then it is safer to sum the sin and cos of each value and then Average = atan2(sum of sines,sum of cosines)
(be careful of the argument order in your atan2 function)
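A minimal Python sketch of that sine/cosine averaging, using only the standard library (the function name circular_mean_deg is a label chosen for this example). In Python, math.atan2 takes the sine sum first and the cosine sum second.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of a list of angles in degrees, returned in the range (-180, 180]."""
    sum_sin = sum(math.sin(math.radians(a)) for a in angles_deg)
    sum_cos = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2(y, x): the sine sum is the y argument, the cosine sum is x.
    return math.degrees(math.atan2(sum_sin, sum_cos))


naive_mean = (359 + 1) / 2                      # 180.0, on the wrong side of the globe
wrapped_mean = circular_mean_deg([359, 1])      # very close to 0
```

The result is undefined in practice when the sines and cosines both sum to (nearly) zero, for example for 0 and 180, so callers may want to check for that case.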
The math is pretty simple if the points form a plane figure. There's no guarantee, however, that a set of latitudes and longitudes are that simple, so it may first be necessary to find the convex hull of the points.
EDIT: As eJames points out, you have to make corrections for the surface of a sphere. My fault for assuming (without thinking) that this was understood. +1 to him.
The below PDF has a bit more detail than the poster from Jenness Enterprises. It also handles conversion in both directions and for a spheroid (such as the Earth) rather than a perfect sphere.
Converting between 3-D Cartesian and ellipsoidal latitude, longitude and height coordinates
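The Cartesian route can be sketched in a few lines of Python. This is a simplified version that assumes a perfect sphere rather than the spheroid handled in the document above: each point becomes a 3-D unit vector, the vectors are averaged, and the mean vector is converted back to latitude and longitude. The function name spherical_centroid is hypothetical, and the result is a centre of mass projected onto the surface, which is not necessarily equidistant from every point.

```python
import math


def spherical_centroid(points):
    """Centroid of (lat, lon) pairs in degrees, assuming a spherical Earth."""
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        # Unit vector for this point.
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    # Convert the mean vector back to angles; its length does not matter.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


# Points either side of the prime meridian average to longitude 0, not 180.
center_lat, center_lon = spherical_centroid([(0.0, -1.0), (0.0, 1.0)])
```

If the unit vectors cancel out (for example two antipodal points), the mean vector has zero length and the answer is meaningless, so that case needs separate handling.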
Separately average the latitudes and longitudes.

Resources