Related
I have a PhoneGap app and have shifted everything to a local DB, I now need to calculate the users location form the points in the DB.
Does anyone have any pointed how to do this, I was using the Google API but now focusing on this working offline so have to query the DB.
Thanks.
You want to calculate the distance from longtitude and lattitude (offline), right??
Look at this link
var lat1 = //start latitude
var lon1 = //start longitude
var lat2 = //end latitude
var lon2 = //end longitude
var R = 6371; // now in km (change for get miles)
var dLat = (lat2-lat1) * Math.PI / 180;
var dLon = (lon2-lon1) * Math.PI / 180;
var a = Math.sin(dLat/2) * Math.sin(dLat/2) + Math.cos(lat1 * Math.PI / 180 ) * Math.cos(lat2 * Math.PI / 180 ) * Math.sin(dLon/2) * Math.sin(dLon/2);
var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
var d = R * c;
if (d>1)
return Math.round(d);//+"km";
else if (d<=1)
return Math.round(d*1000)+"m";
return d;
I need to port an excel spreadsheet that uses the solver plugin to an ASP.NET c# site, I'm using the Solver Foundation 3.1 to no avail.
The equation is iterative, and two conditions need to be met in order to determine the two decision values.
The two decision values on the spreadsheet are set to 1
Ng = 1
Nv = 1
Then the goal is to find the MAX of Formula1 when the following conditions are met:
Decision 1: Formula1 = _na
Decision 2: Formula2 = Formula3
All formulas use Ng and Nv, hitting solve changes Ng and Nv to satisfy each equation, it takes and average of 22 iterations on the spreadsheet.
I've implemented it in c# as follows:
public class FluxSolver
{
public FluxSolver(double _na, double _ygo, double _k, double _alpha, double _inletVoidFraction)
{
var solver = SolverContext.GetContext();
solver.ClearModel();
var model = solver.CreateModel();
var decisionNG = new Decision(Domain.RealNonnegative, "NG");
var decisionNV = new Decision(Domain.RealNonnegative, "NV");
model.AddDecision(decisionNG);
model.AddDecision(decisionNV);
Term formula1 = (_ygo * decisionNG) + ((1 - _ygo) * decisionNV);
model.AddGoal("Goal", GoalKind.Maximize, formula1);
model.AddConstraint("Constraint1", ((_inletVoidFraction / _k) * (1 / decisionNG)) == (_alpha * ((1 / decisionNV) - 1)));
model.AddConstraint("Constraint2", ( _ygo * decisionNG) + ((1 - _ygo) * decisionNV) == _na);
var solution = solver.Solve();
NV = decisionNV.GetDouble();
NG = decisionNG.GetDouble();
Quality = solution.Quality.ToString();
}
public double NG { get; set; }
public double NV { get; set; }
public string Quality { get; set; }
}
I've been at it for 3 days now and still not making any progress, the solver just loads and and disappears, never times out, doesn't return the values etc. Is there something fundamentally wrong in my code?
Any help appreciated!
I want to latitude/longitude into X,Y coordinates in flash, So i tried these methods by googling.
Please can someone tell me some other way to calculate. because its not showing the expected places
public function getPoint(lat, lon) {
// First method
/*stationX = (180+lon) * (mapwidth / 360);
stationY = (90-lat) * (mapheight / 180);*/
// Second method
lat = (lat * -1) + 90;
lon+=180;
stationX = p.TnMap.x+Math.round(lon * (mapwidth / 360))+ 50;
stationY = p.TnMap.y+Math.round(lat * (mapheight / 180));
// Thirdmethod
/*stationX = radius_of_world * Math.cos(lon) * Math.cos(lat);
stationY = radius_of_world * Math.sin(lon) * Math.cos(lat);
stationZ = radius_of_world * Math.sin(lat);
stationX = stationX * 150 / (150 + stationZ);
stationY = stationY * 150 / (150 + stationZ);*/
}
I'm looking for a string similarity algorithm that yields better results on variable length strings than the ones that are usually suggested (levenshtein distance, soundex, etc).
For example,
Given string A: "Robert",
Then string B: "Amy Robertson"
would be a better match than
String C: "Richard"
Also, preferably, this algorithm should be language agnostic (also works in languages other than English).
Simon White of Catalysoft wrote an article about a very clever algorithm that compares adjacent character pairs that works really well for my purposes:
http://www.catalysoft.com/articles/StrikeAMatch.html
Simon has a Java version of the algorithm and below I wrote a PL/Ruby version of it (taken from the plain ruby version done in the related forum entry comment by Mark Wong-VanHaren) so that I can use it in my PostgreSQL queries:
CREATE FUNCTION string_similarity(str1 varchar, str2 varchar)
RETURNS float8 AS '
str1.downcase!
pairs1 = (0..str1.length-2).collect {|i| str1[i,2]}.reject {
|pair| pair.include? " "}
str2.downcase!
pairs2 = (0..str2.length-2).collect {|i| str2[i,2]}.reject {
|pair| pair.include? " "}
union = pairs1.size + pairs2.size
intersection = 0
pairs1.each do |p1|
0.upto(pairs2.size-1) do |i|
if p1 == pairs2[i]
intersection += 1
pairs2.slice!(i)
break
end
end
end
(2.0 * intersection) / union
' LANGUAGE 'plruby';
Works like a charm!
marzagao's answer is great. I converted it to C# so I thought I'd post it here:
Pastebin Link
/// <summary>
/// This class implements string comparison algorithm
/// based on character pair similarity
/// Source: http://www.catalysoft.com/articles/StrikeAMatch.html
/// </summary>
public class SimilarityTool
{
/// <summary>
/// Compares the two strings based on letter pair matches
/// </summary>
/// <param name="str1"></param>
/// <param name="str2"></param>
/// <returns>The percentage match from 0.0 to 1.0 where 1.0 is 100%</returns>
public double CompareStrings(string str1, string str2)
{
List<string> pairs1 = WordLetterPairs(str1.ToUpper());
List<string> pairs2 = WordLetterPairs(str2.ToUpper());
int intersection = 0;
int union = pairs1.Count + pairs2.Count;
for (int i = 0; i < pairs1.Count; i++)
{
for (int j = 0; j < pairs2.Count; j++)
{
if (pairs1[i] == pairs2[j])
{
intersection++;
pairs2.RemoveAt(j);//Must remove the match to prevent "GGGG" from appearing to match "GG" with 100% success
break;
}
}
}
return (2.0 * intersection) / union;
}
/// <summary>
/// Gets all letter pairs for each
/// individual word in the string
/// </summary>
/// <param name="str"></param>
/// <returns></returns>
private List<string> WordLetterPairs(string str)
{
List<string> AllPairs = new List<string>();
// Tokenize the string and put the tokens/words into an array
string[] Words = Regex.Split(str, #"\s");
// For each word
for (int w = 0; w < Words.Length; w++)
{
if (!string.IsNullOrEmpty(Words[w]))
{
// Find the pairs of characters
String[] PairsInWord = LetterPairs(Words[w]);
for (int p = 0; p < PairsInWord.Length; p++)
{
AllPairs.Add(PairsInWord[p]);
}
}
}
return AllPairs;
}
/// <summary>
/// Generates an array containing every
/// two consecutive letters in the input string
/// </summary>
/// <param name="str"></param>
/// <returns></returns>
private string[] LetterPairs(string str)
{
int numPairs = str.Length - 1;
string[] pairs = new string[numPairs];
for (int i = 0; i < numPairs; i++)
{
pairs[i] = str.Substring(i, 2);
}
return pairs;
}
}
Here is another version of marzagao's answer, this one written in Python:
def get_bigrams(string):
"""
Take a string and return a list of bigrams.
"""
s = string.lower()
return [s[i:i+2] for i in list(range(len(s) - 1))]
def string_similarity(str1, str2):
"""
Perform bigram comparison between two strings
and return a percentage match in decimal form.
"""
pairs1 = get_bigrams(str1)
pairs2 = get_bigrams(str2)
union = len(pairs1) + len(pairs2)
hit_count = 0
for x in pairs1:
for y in pairs2:
if x == y:
hit_count += 1
break
return (2.0 * hit_count) / union
if __name__ == "__main__":
"""
Run a test using the example taken from:
http://www.catalysoft.com/articles/StrikeAMatch.html
"""
w1 = 'Healed'
words = ['Heard', 'Healthy', 'Help', 'Herded', 'Sealed', 'Sold']
for w2 in words:
print('Healed --- ' + w2)
print(string_similarity(w1, w2))
print()
A shorter version of John Rutledge's answer:
def get_bigrams(string):
'''
Takes a string and returns a list of bigrams
'''
s = string.lower()
return {s[i:i+2] for i in xrange(len(s) - 1)}
def string_similarity(str1, str2):
'''
Perform bigram comparison between two strings
and return a percentage match in decimal form
'''
pairs1 = get_bigrams(str1)
pairs2 = get_bigrams(str2)
return (2.0 * len(pairs1 & pairs2)) / (len(pairs1) + len(pairs2))
Here's my PHP implementation of suggested StrikeAMatch algorithm, by Simon White. the advantages (like it says in the link) are:
A true reflection of lexical similarity - strings with small differences should be recognised as being similar. In particular, a significant substring overlap should point to a high level of similarity between the strings.
A robustness to changes of word order - two strings which contain the same words, but in a different order, should be recognised as being similar. On the other hand, if one string is just a random anagram of the characters contained in the other, then it should (usually) be recognised as dissimilar.
Language Independence - the algorithm should work not only in English, but in many different languages.
<?php
/**
* LetterPairSimilarity algorithm implementation in PHP
* #author Igal Alkon
* #link http://www.catalysoft.com/articles/StrikeAMatch.html
*/
class LetterPairSimilarity
{
/**
* #param $str
* #return mixed
*/
private function wordLetterPairs($str)
{
$allPairs = array();
// Tokenize the string and put the tokens/words into an array
$words = explode(' ', $str);
// For each word
for ($w = 0; $w < count($words); $w++)
{
// Find the pairs of characters
$pairsInWord = $this->letterPairs($words[$w]);
for ($p = 0; $p < count($pairsInWord); $p++)
{
$allPairs[] = $pairsInWord[$p];
}
}
return $allPairs;
}
/**
* #param $str
* #return array
*/
private function letterPairs($str)
{
$numPairs = mb_strlen($str)-1;
$pairs = array();
for ($i = 0; $i < $numPairs; $i++)
{
$pairs[$i] = mb_substr($str,$i,2);
}
return $pairs;
}
/**
* #param $str1
* #param $str2
* #return float
*/
public function compareStrings($str1, $str2)
{
$pairs1 = $this->wordLetterPairs(strtoupper($str1));
$pairs2 = $this->wordLetterPairs(strtoupper($str2));
$intersection = 0;
$union = count($pairs1) + count($pairs2);
for ($i=0; $i < count($pairs1); $i++)
{
$pair1 = $pairs1[$i];
$pairs2 = array_values($pairs2);
for($j = 0; $j < count($pairs2); $j++)
{
$pair2 = $pairs2[$j];
if ($pair1 === $pair2)
{
$intersection++;
unset($pairs2[$j]);
break;
}
}
}
return (2.0*$intersection)/$union;
}
}
This discussion has been really helpful, thanks. I converted the algorithm to VBA for use with Excel and wrote a few versions of a worksheet function, one for simple comparison of a pair of strings, the other for comparing one string to a range/array of strings. The strSimLookup version returns either the last best match as a string, array index, or similarity metric.
This implementation produces the same results listed in the Amazon example on Simon White's website with a few minor exceptions on low-scoring matches; not sure where the difference creeps in, could be VBA's Split function, but I haven't investigated as it's working fine for my purposes.
'Implements functions to rate how similar two strings are on
'a scale of 0.0 (completely dissimilar) to 1.0 (exactly similar)
'Source: http://www.catalysoft.com/articles/StrikeAMatch.html
'Author: Bob Chatham, bob.chatham at gmail.com
'9/12/2010
Option Explicit
Public Function stringSimilarity(str1 As String, str2 As String) As Variant
'Simple version of the algorithm that computes the similiarity metric
'between two strings.
'NOTE: This verision is not efficient to use if you're comparing one string
'with a range of other values as it will needlessly calculate the pairs for the
'first string over an over again; use the array-optimized version for this case.
Dim sPairs1 As Collection
Dim sPairs2 As Collection
Set sPairs1 = New Collection
Set sPairs2 = New Collection
WordLetterPairs str1, sPairs1
WordLetterPairs str2, sPairs2
stringSimilarity = SimilarityMetric(sPairs1, sPairs2)
Set sPairs1 = Nothing
Set sPairs2 = Nothing
End Function
Public Function strSimA(str1 As Variant, rRng As Range) As Variant
'Return an array of string similarity indexes for str1 vs every string in input range rRng
Dim sPairs1 As Collection
Dim sPairs2 As Collection
Dim arrOut As Variant
Dim l As Long, j As Long
Set sPairs1 = New Collection
WordLetterPairs CStr(str1), sPairs1
l = rRng.Count
ReDim arrOut(1 To l)
For j = 1 To l
Set sPairs2 = New Collection
WordLetterPairs CStr(rRng(j)), sPairs2
arrOut(j) = SimilarityMetric(sPairs1, sPairs2)
Set sPairs2 = Nothing
Next j
strSimA = Application.Transpose(arrOut)
End Function
Public Function strSimLookup(str1 As Variant, rRng As Range, Optional returnType) As Variant
'Return either the best match or the index of the best match
'depending on returnTYype parameter) between str1 and strings in rRng)
' returnType = 0 or omitted: returns the best matching string
' returnType = 1 : returns the index of the best matching string
' returnType = 2 : returns the similarity metric
Dim sPairs1 As Collection
Dim sPairs2 As Collection
Dim metric, bestMetric As Double
Dim i, iBest As Long
Const RETURN_STRING As Integer = 0
Const RETURN_INDEX As Integer = 1
Const RETURN_METRIC As Integer = 2
If IsMissing(returnType) Then returnType = RETURN_STRING
Set sPairs1 = New Collection
WordLetterPairs CStr(str1), sPairs1
bestMetric = -1
iBest = -1
For i = 1 To rRng.Count
Set sPairs2 = New Collection
WordLetterPairs CStr(rRng(i)), sPairs2
metric = SimilarityMetric(sPairs1, sPairs2)
If metric > bestMetric Then
bestMetric = metric
iBest = i
End If
Set sPairs2 = Nothing
Next i
If iBest = -1 Then
strSimLookup = CVErr(xlErrValue)
Exit Function
End If
Select Case returnType
Case RETURN_STRING
strSimLookup = CStr(rRng(iBest))
Case RETURN_INDEX
strSimLookup = iBest
Case Else
strSimLookup = bestMetric
End Select
End Function
Public Function strSim(str1 As String, str2 As String) As Variant
Dim ilen, iLen1, ilen2 As Integer
iLen1 = Len(str1)
ilen2 = Len(str2)
If iLen1 >= ilen2 Then ilen = ilen2 Else ilen = iLen1
strSim = stringSimilarity(Left(str1, ilen), Left(str2, ilen))
End Function
Sub WordLetterPairs(str As String, pairColl As Collection)
'Tokenize str into words, then add all letter pairs to pairColl
Dim Words() As String
Dim word, nPairs, pair As Integer
Words = Split(str)
If UBound(Words) < 0 Then
Set pairColl = Nothing
Exit Sub
End If
For word = 0 To UBound(Words)
nPairs = Len(Words(word)) - 1
If nPairs > 0 Then
For pair = 1 To nPairs
pairColl.Add Mid(Words(word), pair, 2)
Next pair
End If
Next word
End Sub
Private Function SimilarityMetric(sPairs1 As Collection, sPairs2 As Collection) As Variant
'Helper function to calculate similarity metric given two collections of letter pairs.
'This function is designed to allow the pair collections to be set up separately as needed.
'NOTE: sPairs2 collection will be altered as pairs are removed; copy the collection
'if this is not the desired behavior.
'Also assumes that collections will be deallocated somewhere else
Dim Intersect As Double
Dim Union As Double
Dim i, j As Long
If sPairs1.Count = 0 Or sPairs2.Count = 0 Then
SimilarityMetric = CVErr(xlErrNA)
Exit Function
End If
Union = sPairs1.Count + sPairs2.Count
Intersect = 0
For i = 1 To sPairs1.Count
For j = 1 To sPairs2.Count
If StrComp(sPairs1(i), sPairs2(j)) = 0 Then
Intersect = Intersect + 1
sPairs2.Remove j
Exit For
End If
Next j
Next i
SimilarityMetric = (2 * Intersect) / Union
End Function
I'm sorry, the answer was not invented by the author. This is a well known algorithm that was first present by Digital Equipment Corporation and is often referred to as shingling.
http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-TN-1997-015.pdf
I translated Simon White's algorithm to PL/pgSQL. This is my contribution.
<!-- language: lang-sql -->
create or replace function spt1.letterpairs(in p_str varchar)
returns varchar as
$$
declare
v_numpairs integer := length(p_str)-1;
v_pairs varchar[];
begin
for i in 1 .. v_numpairs loop
v_pairs[i] := substr(p_str, i, 2);
end loop;
return v_pairs;
end;
$$ language 'plpgsql';
--===================================================================
create or replace function spt1.wordletterpairs(in p_str varchar)
returns varchar as
$$
declare
v_allpairs varchar[];
v_words varchar[];
v_pairsinword varchar[];
begin
v_words := regexp_split_to_array(p_str, '[[:space:]]');
for i in 1 .. array_length(v_words, 1) loop
v_pairsinword := spt1.letterpairs(v_words[i]);
if v_pairsinword is not null then
for j in 1 .. array_length(v_pairsinword, 1) loop
v_allpairs := v_allpairs || v_pairsinword[j];
end loop;
end if;
end loop;
return v_allpairs;
end;
$$ language 'plpgsql';
--===================================================================
create or replace function spt1.arrayintersect(ANYARRAY, ANYARRAY)
returns anyarray as
$$
select array(select unnest($1) intersect select unnest($2))
$$ language 'sql';
--===================================================================
create or replace function spt1.comparestrings(in p_str1 varchar, in p_str2 varchar)
returns float as
$$
declare
v_pairs1 varchar[];
v_pairs2 varchar[];
v_intersection integer;
v_union integer;
begin
v_pairs1 := wordletterpairs(upper(p_str1));
v_pairs2 := wordletterpairs(upper(p_str2));
v_union := array_length(v_pairs1, 1) + array_length(v_pairs2, 1);
v_intersection := array_length(arrayintersect(v_pairs1, v_pairs2), 1);
return (2.0 * v_intersection / v_union);
end;
$$ language 'plpgsql';
A version in beautiful Scala:
def pairDistance(s1: String, s2: String): Double = {
def strToPairs(s: String, acc: List[String]): List[String] = {
if (s.size < 2) acc
else strToPairs(s.drop(1),
if (s.take(2).contains(" ")) acc else acc ::: List(s.take(2)))
}
val lst1 = strToPairs(s1.toUpperCase, List())
val lst2 = strToPairs(s2.toUpperCase, List())
(2.0 * lst2.intersect(lst1).size) / (lst1.size + lst2.size)
}
String Similarity Metrics contains an overview of many different metrics used in string comparison (Wikipedia has an overview as well). Much of these metrics is implemented in a library simmetrics.
Yet another example of metric, not included in the given overview is for example compression distance (attempting to approximate the Kolmogorov's complexity), which can be used for a bit longer texts than the one you presented.
You might also consider looking at a much broader subject of Natural Language Processing. These R packages can get you started quickly (or at least give some ideas).
And one last edit - search the other questions on this subject at SO, there are quite a few related ones.
A faster PHP version of the algorithm:
/**
*
* #param $str
* #return mixed
*/
private static function wordLetterPairs ($str)
{
$allPairs = array();
// Tokenize the string and put the tokens/words into an array
$words = explode(' ', $str);
// For each word
for ($w = 0; $w < count($words); $w ++) {
// Find the pairs of characters
$pairsInWord = self::letterPairs($words[$w]);
for ($p = 0; $p < count($pairsInWord); $p ++) {
$allPairs[$pairsInWord[$p]] = $pairsInWord[$p];
}
}
return array_values($allPairs);
}
/**
*
* #param $str
* #return array
*/
private static function letterPairs ($str)
{
$numPairs = mb_strlen($str) - 1;
$pairs = array();
for ($i = 0; $i < $numPairs; $i ++) {
$pairs[$i] = mb_substr($str, $i, 2);
}
return $pairs;
}
/**
*
* #param $str1
* #param $str2
* #return float
*/
public static function compareStrings ($str1, $str2)
{
$pairs1 = self::wordLetterPairs(mb_strtolower($str1));
$pairs2 = self::wordLetterPairs(mb_strtolower($str2));
$union = count($pairs1) + count($pairs2);
$intersection = count(array_intersect($pairs1, $pairs2));
return (2.0 * $intersection) / $union;
}
For the data I had (approx 2300 comparisons) I had a running time of 0.58sec with Igal Alkon solution versus 0.35sec with mine.
Posting marzagao's answer in C99, inspired by these algorithms
double dice_match(const char *string1, const char *string2) {
//check fast cases
if (((string1 != NULL) && (string1[0] == '\0')) ||
((string2 != NULL) && (string2[0] == '\0'))) {
return 0;
}
if (string1 == string2) {
return 1;
}
size_t strlen1 = strlen(string1);
size_t strlen2 = strlen(string2);
if (strlen1 < 2 || strlen2 < 2) {
return 0;
}
size_t length1 = strlen1 - 1;
size_t length2 = strlen2 - 1;
double matches = 0;
int i = 0, j = 0;
//get bigrams and compare
while (i < length1 && j < length2) {
char a[3] = {string1[i], string1[i + 1], '\0'};
char b[3] = {string2[j], string2[j + 1], '\0'};
int cmp = strcmpi(a, b);
if (cmp == 0) {
matches += 2;
}
i++;
j++;
}
return matches / (length1 + length2);
}
Some tests based on the original article:
#include <stdio.h>
void article_test1() {
char *string1 = "FRANCE";
char *string2 = "FRENCH";
printf("====%s====\n", __func__);
printf("%2.f%% == 40%%\n", dice_match(string1, string2) * 100);
}
void article_test2() {
printf("====%s====\n", __func__);
char *string = "Healed";
char *ss[] = {"Heard", "Healthy", "Help",
"Herded", "Sealed", "Sold"};
int correct[] = {44, 55, 25, 40, 80, 0};
for (int i = 0; i < 6; ++i) {
printf("%2.f%% == %d%%\n", dice_match(string, ss[i]) * 100, correct[i]);
}
}
void multicase_test() {
char *string1 = "FRaNcE";
char *string2 = "fREnCh";
printf("====%s====\n", __func__);
printf("%2.f%% == 40%%\n", dice_match(string1, string2) * 100);
}
void gg_test() {
char *string1 = "GG";
char *string2 = "GGGGG";
printf("====%s====\n", __func__);
printf("%2.f%% != 100%%\n", dice_match(string1, string2) * 100);
}
int main() {
article_test1();
article_test2();
multicase_test();
gg_test();
return 0;
}
Here is the R version:
get_bigrams <- function(str)
{
lstr = tolower(str)
bigramlst = list()
for(i in 1:(nchar(str)-1))
{
bigramlst[[i]] = substr(str, i, i+1)
}
return(bigramlst)
}
str_similarity <- function(str1, str2)
{
pairs1 = get_bigrams(str1)
pairs2 = get_bigrams(str2)
unionlen = length(pairs1) + length(pairs2)
hit_count = 0
for(x in 1:length(pairs1)){
for(y in 1:length(pairs2)){
if (pairs1[[x]] == pairs2[[y]])
hit_count = hit_count + 1
}
}
return ((2.0 * hit_count) / unionlen)
}
Building on Michael La Voie's awesome C# version, as per the request to make it an extension method, here is what I came up with. The primary benefit of doing it this way is that you can sort a Generic List by the percent match. For example, consider you have a string field named "City" in your object. A user searches for "Chester" and you want to return results in descending order of match. For example, you want literal matches of Chester to show up before Rochester. To do this, add two new properties to your object:
public string SearchText { get; set; }
public double PercentMatch
{
get
{
return City.ToUpper().PercentMatchTo(this.SearchText.ToUpper());
}
}
Then on each object, set the SearchText to what the user searched for. Then you can sort it easily with something like:
zipcodes = zipcodes.OrderByDescending(x => x.PercentMatch);
Here's the slight modification to make it an extension method:
/// <summary>
/// This class implements string comparison algorithm
/// based on character pair similarity
/// Source: http://www.catalysoft.com/articles/StrikeAMatch.html
/// </summary>
public static double PercentMatchTo(this string str1, string str2)
{
List<string> pairs1 = WordLetterPairs(str1.ToUpper());
List<string> pairs2 = WordLetterPairs(str2.ToUpper());
int intersection = 0;
int union = pairs1.Count + pairs2.Count;
for (int i = 0; i < pairs1.Count; i++)
{
for (int j = 0; j < pairs2.Count; j++)
{
if (pairs1[i] == pairs2[j])
{
intersection++;
pairs2.RemoveAt(j);//Must remove the match to prevent "GGGG" from appearing to match "GG" with 100% success
break;
}
}
}
return (2.0 * intersection) / union;
}
/// <summary>
/// Gets all letter pairs for each
/// individual word in the string
/// </summary>
/// <param name="str"></param>
/// <returns></returns>
private static List<string> WordLetterPairs(string str)
{
List<string> AllPairs = new List<string>();
// Tokenize the string and put the tokens/words into an array
string[] Words = Regex.Split(str, #"\s");
// For each word
for (int w = 0; w < Words.Length; w++)
{
if (!string.IsNullOrEmpty(Words[w]))
{
// Find the pairs of characters
String[] PairsInWord = LetterPairs(Words[w]);
for (int p = 0; p < PairsInWord.Length; p++)
{
AllPairs.Add(PairsInWord[p]);
}
}
}
return AllPairs;
}
/// <summary>
/// Generates an array containing every
/// two consecutive letters in the input string
/// </summary>
/// <param name="str"></param>
/// <returns></returns>
private static string[] LetterPairs(string str)
{
int numPairs = str.Length - 1;
string[] pairs = new string[numPairs];
for (int i = 0; i < numPairs; i++)
{
pairs[i] = str.Substring(i, 2);
}
return pairs;
}
My JavaScript implementation takes a string or array of strings, and an optional floor (the default floor is 0.5). If you pass it a string, it will return true or false depending on whether or not the string's similarity score is greater than or equal to the floor. If you pass it an array of strings, it will return an array of those strings whose similarity score is greater than or equal to the floor, sorted by score.
Examples:
'Healed'.fuzzy('Sealed'); // returns true
'Healed'.fuzzy('Help'); // returns false
'Healed'.fuzzy('Help', 0.25); // returns true
'Healed'.fuzzy(['Sold', 'Herded', 'Heard', 'Help', 'Sealed', 'Healthy']);
// returns ["Sealed", "Healthy"]
'Healed'.fuzzy(['Sold', 'Herded', 'Heard', 'Help', 'Sealed', 'Healthy'], 0);
// returns ["Sealed", "Healthy", "Heard", "Herded", "Help", "Sold"]
Here it is:
(function(){
var default_floor = 0.5;
function pairs(str){
var pairs = []
, length = str.length - 1
, pair;
str = str.toLowerCase();
for(var i = 0; i < length; i++){
pair = str.substr(i, 2);
if(!/\s/.test(pair)){
pairs.push(pair);
}
}
return pairs;
}
function similarity(pairs1, pairs2){
var union = pairs1.length + pairs2.length
, hits = 0;
for(var i = 0; i < pairs1.length; i++){
for(var j = 0; j < pairs2.length; j++){
if(pairs1[i] == pairs2[j]){
pairs2.splice(j--, 1);
hits++;
break;
}
}
}
return 2*hits/union || 0;
}
String.prototype.fuzzy = function(strings, floor){
var str1 = this
, pairs1 = pairs(this);
floor = typeof floor == 'number' ? floor : default_floor;
if(typeof(strings) == 'string'){
return str1.length > 1 && strings.length > 1 && similarity(pairs1, pairs(strings)) >= floor || str1.toLowerCase() == strings.toLowerCase();
}else if(strings instanceof Array){
var scores = {};
strings.map(function(str2){
scores[str2] = str1.length > 1 ? similarity(pairs1, pairs(str2)) : 1*(str1.toLowerCase() == str2.toLowerCase());
});
return strings.filter(function(str){
return scores[str] >= floor;
}).sort(function(a, b){
return scores[b] - scores[a];
});
}
};
})();
The Dice coefficient algorithm (Simon White / marzagao's answer) is implemented in Ruby in the
pair_distance_similar method in the amatch gem
https://github.com/flori/amatch
This gem also contains implementations of a number of approximate matching and string comparison algorithms: Levenshtein edit distance, Sellers edit distance, the Hamming distance, the longest common subsequence length, the longest common substring length, the pair distance metric, the Jaro-Winkler metric.
A Haskell version—feel free to suggest edits because I haven't done much Haskell.
import Data.Char
import Data.List
-- Convert a string into words, then get the pairs of words from that phrase
wordLetterPairs :: String -> [String]
wordLetterPairs s1 = concat $ map pairs $ words s1
-- Converts a String into a list of letter pairs.
pairs :: String -> [String]
pairs [] = []
pairs (x:[]) = []
pairs (x:ys) = [x, head ys]:(pairs ys)
-- Calculates the match rating for two strings
matchRating :: String -> String -> Double
matchRating s1 s2 = (numberOfMatches * 2) / totalLength
where pairsS1 = wordLetterPairs $ map toLower s1
pairsS2 = wordLetterPairs $ map toLower s2
numberOfMatches = fromIntegral $ length $ pairsS1 `intersect` pairsS2
totalLength = fromIntegral $ length pairsS1 + length pairsS2
Clojure:
(require '[clojure.set :refer [intersection]])
(defn bigrams [s]
(->> (split s #"\s+")
(mapcat #(partition 2 1 %))
(set)))
(defn string-similarity [a b]
(let [a-pairs (bigrams a)
b-pairs (bigrams b)
total-count (+ (count a-pairs) (count b-pairs))
match-count (count (intersection a-pairs b-pairs))
similarity (/ (* 2 match-count) total-count)]
similarity))
Here is another version of Similarity based in Sørensen–Dice index (marzagao's answer), this one written in C++11:
/*
* Similarity based in Sørensen–Dice index.
*
* Returns the Similarity between _str1 and _str2.
*/
double similarity_sorensen_dice(const std::string& _str1, const std::string& _str2) {
// Base case: if some string is empty.
if (_str1.empty() || _str2.empty()) {
return 1.0;
}
auto str1 = upper_string(_str1);
auto str2 = upper_string(_str2);
// Base case: if the strings are equals.
if (str1 == str2) {
return 0.0;
}
// Base case: if some string does not have bigrams.
if (str1.size() < 2 || str2.size() < 2) {
return 1.0;
}
// Extract bigrams from str1
auto num_pairs1 = str1.size() - 1;
std::unordered_set<std::string> str1_bigrams;
str1_bigrams.reserve(num_pairs1);
for (unsigned i = 0; i < num_pairs1; ++i) {
str1_bigrams.insert(str1.substr(i, 2));
}
// Extract bigrams from str2
auto num_pairs2 = str2.size() - 1;
std::unordered_set<std::string> str2_bigrams;
str2_bigrams.reserve(num_pairs2);
for (unsigned int i = 0; i < num_pairs2; ++i) {
str2_bigrams.insert(str2.substr(i, 2));
}
// Find the intersection between the two sets.
int intersection = 0;
if (str1_bigrams.size() < str2_bigrams.size()) {
const auto it_e = str2_bigrams.end();
for (const auto& bigram : str1_bigrams) {
intersection += str2_bigrams.find(bigram) != it_e;
}
} else {
const auto it_e = str1_bigrams.end();
for (const auto& bigram : str2_bigrams) {
intersection += str1_bigrams.find(bigram) != it_e;
}
}
// Returns similarity coefficient.
return (2.0 * intersection) / (num_pairs1 + num_pairs2);
}
Why not for a JavaScript implementation, I also explained the algorithm.
Algorithm
Input : France and French.
Map them both to their upper case characters (making the algorithm insensitive to case differences), then split them up into their character pairs:
FRANCE: {FR, RA, AN, NC, CE}
FRENCH: {FR, RE, EN, NC, CH}
Find there intersection:
Result:
Implementation
function similarity(s1, s2) {
const
set1 = pairs(s1.toUpperCase()), // [ FR, RA, AN, NC, CE ]
set2 = pairs(s2.toUpperCase()), // [ FR, RE, EN, NC, CH ]
intersection = set1.filter(x => set2.includes(x)); // [ FR, NC ]
// Tips: Instead of `2` multiply by `200`, To get percentage.
return (intersection.length * 2) / (set1.length + set2.length);
}
function pairs(input) {
const tokenized = [];
for (let i = 0; i < input.length - 1; i++)
tokenized.push(input.substring(i, 2 + i));
return tokenized;
}
console.log(similarity("FRANCE", "FRENCH"));
Ranking Results By ( Word - Similarity )
Sealed - 80%
Healthy - 55%
Heard - 44%
Herded - 40%
Help - 25%
Sold - 0%
From same original source.
What about Levenshtein distance, divided by the length of the first string (or alternatively divided my min/max/avg length of both strings)? That has worked for me so far.
Hey guys i gave this a try in javascript, but I'm new to it, anyone know faster ways to do it?
function get_bigrams(string) {
// Takes a string and returns a list of bigrams
var s = string.toLowerCase();
var v = new Array(s.length-1);
for (i = 0; i< v.length; i++){
v[i] =s.slice(i,i+2);
}
return v;
}
function string_similarity(str1, str2){
/*
Perform bigram comparison between two strings
and return a percentage match in decimal form
*/
var pairs1 = get_bigrams(str1);
var pairs2 = get_bigrams(str2);
var union = pairs1.length + pairs2.length;
var hit_count = 0;
for (x in pairs1){
for (y in pairs2){
if (pairs1[x] == pairs2[y]){
hit_count++;
}
}
}
return ((2.0 * hit_count) / union);
}
var w1 = 'Healed';
var word =['Heard','Healthy','Help','Herded','Sealed','Sold']
for (w2 in word){
console.log('Healed --- ' + word[w2])
console.log(string_similarity(w1,word[w2]));
}
I was looking for pure ruby implementation of the algorithm indicated by #marzagao's answer. Unfortunately, link indicated by #marzagao is broken. In #s01ipsist answer, he indicated ruby gem amatch where implementation is not in pure ruby. So I searchd a little and found gem fuzzy_match which has pure ruby implementation (though this gem use amatch) at here. I hope this will help someone like me.
**I've converted marzagao's answer to Java.**
import org.apache.commons.lang3.StringUtils; //Add a apache commons jar in pom.xml
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class SimilarityComparator {
public static void main(String[] args) {
String str0 = "Nischal";
String str1 = "Nischal";
double v = compareStrings(str0, str1);
System.out.println("Similarity betn " + str0 + " and " + str1 + " = " + v);
}
private static double compareStrings(String str1, String str2) {
List<String> pairs1 = wordLetterPairs(str1.toUpperCase());
List<String> pairs2 = wordLetterPairs(str2.toUpperCase());
int intersection = 0;
int union = pairs1.size() + pairs2.size();
for (String s : pairs1) {
for (int j = 0; j < pairs2.size(); j++) {
if (s.equals(pairs2.get(j))) {
intersection++;
pairs2.remove(j);
break;
}
}
}
return (2.0 * intersection) / union;
}
private static List<String> wordLetterPairs(String str) {
List<String> AllPairs = new ArrayList<>();
String[] Words = str.split("\\s");
for (String word : Words) {
if (StringUtils.isNotBlank(word)) {
String[] PairsInWord = letterPairs(word);
Collections.addAll(AllPairs, PairsInWord);
}
}
return AllPairs;
}
private static String[] letterPairs(String str) {
int numPairs = str.length() - 1;
String[] pairs = new String[numPairs];
for (int i = 0; i < numPairs; i++) {
try {
pairs[i] = str.substring(i, i + 2);
} catch (Exception e) {
pairs[i] = str.substring(i, numPairs);
}
}
return pairs;
}
}
Here's another c++ implementation that follows the original article, that minimizes dynamic memory allocations.
It obtains the same matching values in the examples, but I think it's better to take into account also the single character words.
//---------------------------------------------------------------------------
// Similarity based on Sørensen–Dice index
double calc_similarity( const std::string_view s1, const std::string_view s2 )
{
// Check banal cases
if( s1.empty() || s2.empty() )
{// Empty string is never similar to another
return 0.0;
}
else if( s1==s2 )
{// Perfectly equal
return 1.0;
}
else if( s1.length()==1 || s2.length()==1 )
{// Single (not equal) characters have zero similarity
return 0.0;
}
/////////////////////////////////////////////////////////////////////////
// Represents a pair of adjacent characters
class charpair_t final
{
public:
charpair_t(const char a, const char b) noexcept : c1(a), c2(b) {}
[[nodiscard]] bool operator==(const charpair_t& other) const noexcept { return c1==other.c1 && c2==other.c2; }
private:
char c1, c2;
};
/////////////////////////////////////////////////////////////////////////
// Collects and access a sequence of adjacent characters (skipping spaces)
class charpairs_t final
{
public:
charpairs_t(const std::string_view s)
{
assert( !s.empty() );
const std::size_t i_last = s.size()-1;
std::size_t i = 0;
chpairs.reserve(i_last);
while( i<i_last )
{
// Accepting also single-character words (the second is a space)
//if( !std::isspace(s[i]) ) chpairs.emplace_back( std::tolower(s[i]), std::tolower(s[i+1]) );
// Skipping single-character words (as in the original article)
if( std::isspace(s[i]) ) ; // Skip
else if( std::isspace(s[i+1]) ) ++i; // Skip also next
else chpairs.emplace_back( std::tolower(s[i]), std::tolower(s[i+1]) );
++i;
}
}
[[nodiscard]] auto size() const noexcept { return chpairs.size(); }
[[nodiscard]] auto cbegin() const noexcept { return chpairs.cbegin(); }
[[nodiscard]] auto cend() const noexcept { return chpairs.cend(); }
auto erase(std::vector<charpair_t>::const_iterator i) { return chpairs.erase(i); }
private:
std::vector<charpair_t> chpairs;
};
charpairs_t chpairs1{s1},
chpairs2{s2};
const double orig_avg_bigrams_count = 0.5 * static_cast<double>(chpairs1.size() + chpairs2.size());
std::size_t matching_bigrams_count = 0;
for( auto ib1=chpairs1.cbegin(); ib1!=chpairs1.cend(); ++ib1 )
{
for( auto ib2=chpairs2.cbegin(); ib2!=chpairs2.cend(); )
{
if( *ib1==*ib2 )
{
++matching_bigrams_count;
ib2 = chpairs2.erase(ib2); // Avoid to match the same character pair multiple times
break;
}
else ++ib2;
}
}
return static_cast<double>(matching_bigrams_count) / orig_avg_bigrams_count;
}
How do I calculate distance between two GPS coordinates (using latitude and longitude)?
Calculate the distance between two coordinates by latitude and longitude, including a Javascript implementation.
West and South locations are negative.
Remember minutes and seconds are out of 60 so S31 30' is -31.50 degrees.
Don't forget to convert degrees to radians. Many languages have this function. Or its a simple calculation: radians = degrees * PI / 180.
function degreesToRadians(degrees) {
return degrees * Math.PI / 180;
}
function distanceInKmBetweenEarthCoordinates(lat1, lon1, lat2, lon2) {
var earthRadiusKm = 6371;
var dLat = degreesToRadians(lat2-lat1);
var dLon = degreesToRadians(lon2-lon1);
lat1 = degreesToRadians(lat1);
lat2 = degreesToRadians(lat2);
var a = Math.sin(dLat/2) * Math.sin(dLat/2) +
Math.sin(dLon/2) * Math.sin(dLon/2) * Math.cos(lat1) * Math.cos(lat2);
var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
return earthRadiusKm * c;
}
Here are some examples of usage:
distanceInKmBetweenEarthCoordinates(0,0,0,0) // Distance between same
// points should be 0
0
distanceInKmBetweenEarthCoordinates(51.5, 0, 38.8, -77.1) // From London
// to Arlington
5918.185064088764
Look for haversine with Google; here is my solution:
#include <math.h>
#include "haversine.h"
#define d2r (M_PI / 180.0)
//calculate haversine distance for linear distance
double haversine_km(double lat1, double long1, double lat2, double long2)
{
double dlong = (long2 - long1) * d2r;
double dlat = (lat2 - lat1) * d2r;
double a = pow(sin(dlat/2.0), 2) + cos(lat1*d2r) * cos(lat2*d2r) * pow(sin(dlong/2.0), 2);
double c = 2 * atan2(sqrt(a), sqrt(1-a));
double d = 6367 * c;
return d;
}
double haversine_mi(double lat1, double long1, double lat2, double long2)
{
double dlong = (long2 - long1) * d2r;
double dlat = (lat2 - lat1) * d2r;
double a = pow(sin(dlat/2.0), 2) + cos(lat1*d2r) * cos(lat2*d2r) * pow(sin(dlong/2.0), 2);
double c = 2 * atan2(sqrt(a), sqrt(1-a));
double d = 3956 * c;
return d;
}
C# Version of Haversine
double _eQuatorialEarthRadius = 6378.1370D;
double _d2r = (Math.PI / 180D);
private int HaversineInM(double lat1, double long1, double lat2, double long2)
{
return (int)(1000D * HaversineInKM(lat1, long1, lat2, long2));
}
private double HaversineInKM(double lat1, double long1, double lat2, double long2)
{
double dlong = (long2 - long1) * _d2r;
double dlat = (lat2 - lat1) * _d2r;
double a = Math.Pow(Math.Sin(dlat / 2D), 2D) + Math.Cos(lat1 * _d2r) * Math.Cos(lat2 * _d2r) * Math.Pow(Math.Sin(dlong / 2D), 2D);
double c = 2D * Math.Atan2(Math.Sqrt(a), Math.Sqrt(1D - a));
double d = _eQuatorialEarthRadius * c;
return d;
}
Here's a .NET Fiddle of this, so you can test it out with your own Lat/Longs.
Java Version of Haversine Algorithm based on Roman Makarov`s reply to this thread
public class HaversineAlgorithm {
static final double _eQuatorialEarthRadius = 6378.1370D;
static final double _d2r = (Math.PI / 180D);
public static int HaversineInM(double lat1, double long1, double lat2, double long2) {
return (int) (1000D * HaversineInKM(lat1, long1, lat2, long2));
}
public static double HaversineInKM(double lat1, double long1, double lat2, double long2) {
double dlong = (long2 - long1) * _d2r;
double dlat = (lat2 - lat1) * _d2r;
double a = Math.pow(Math.sin(dlat / 2D), 2D) + Math.cos(lat1 * _d2r) * Math.cos(lat2 * _d2r)
* Math.pow(Math.sin(dlong / 2D), 2D);
double c = 2D * Math.atan2(Math.sqrt(a), Math.sqrt(1D - a));
double d = _eQuatorialEarthRadius * c;
return d;
}
}
This is very easy to do with geography type in SQL Server 2008.
SELECT geography::Point(lat1, lon1, 4326).STDistance(geography::Point(lat2, lon2, 4326))
-- computes distance in meters using eliptical model, accurate to the mm
4326 is SRID for WGS84 elipsoidal Earth model
Here's a Haversine function in Python that I use:
from math import pi,sqrt,sin,cos,atan2
def haversine(pos1, pos2):
lat1 = float(pos1['lat'])
long1 = float(pos1['long'])
lat2 = float(pos2['lat'])
long2 = float(pos2['long'])
degree_to_rad = float(pi / 180.0)
d_lat = (lat2 - lat1) * degree_to_rad
d_long = (long2 - long1) * degree_to_rad
a = pow(sin(d_lat / 2), 2) + cos(lat1 * degree_to_rad) * cos(lat2 * degree_to_rad) * pow(sin(d_long / 2), 2)
c = 2 * atan2(sqrt(a), sqrt(1 - a))
km = 6367 * c
mi = 3956 * c
return {"km":km, "miles":mi}
I needed to calculate a lot of distances between the points for my project, so I went ahead and tried to optimize the code, I have found here. On average in different browsers my new implementation runs 2 times faster than the most upvoted answer.
function distance(lat1, lon1, lat2, lon2) {
var p = 0.017453292519943295; // Math.PI / 180
var c = Math.cos;
var a = 0.5 - c((lat2 - lat1) * p)/2 +
c(lat1 * p) * c(lat2 * p) *
(1 - c((lon2 - lon1) * p))/2;
return 12742 * Math.asin(Math.sqrt(a)); // 2 * R; R = 6371 km
}
You can play with my jsPerf and see the results here.
Recently I needed to do the same in python, so here is a python implementation:
from math import cos, asin, sqrt
def distance(lat1, lon1, lat2, lon2):
p = 0.017453292519943295
a = 0.5 - cos((lat2 - lat1) * p)/2 + cos(lat1 * p) * cos(lat2 * p) * (1 - cos((lon2 - lon1) * p)) / 2
return 12742 * asin(sqrt(a))
And for the sake of completeness: Haversine on wiki.
It depends on how accurate you need it to be. If you need pinpoint accuracy, it is best to look at an algorithm which uses an ellipsoid, rather than a sphere, such as Vincenty's algorithm, which is accurate to the mm.
Here it is in C# (lat and long in radians):
double CalculateGreatCircleDistance(double lat1, double long1, double lat2, double long2, double radius)
{
return radius * Math.Acos(
Math.Sin(lat1) * Math.Sin(lat2)
+ Math.Cos(lat1) * Math.Cos(lat2) * Math.Cos(long2 - long1));
}
If your lat and long are in degrees then divide by 180/PI to convert to radians.
PHP version:
(Remove all deg2rad() if your coordinates are already in radians.)
$R = 6371; // km
$dLat = deg2rad($lat2-$lat1);
$dLon = deg2rad($lon2-$lon1);
$lat1 = deg2rad($lat1);
$lat2 = deg2rad($lat2);
$a = sin($dLat/2) * sin($dLat/2) +
sin($dLon/2) * sin($dLon/2) * cos($lat1) * cos($lat2);
$c = 2 * atan2(sqrt($a), sqrt(1-$a));
$d = $R * $c;
A T-SQL function, that I use to select records by distance for a center
Create Function [dbo].[DistanceInMiles]
( #fromLatitude float ,
#fromLongitude float ,
#toLatitude float,
#toLongitude float
)
returns float
AS
BEGIN
declare #distance float
select #distance = cast((3963 * ACOS(round(COS(RADIANS(90-#fromLatitude))*COS(RADIANS(90-#toLatitude))+
SIN(RADIANS(90-#fromLatitude))*SIN(RADIANS(90-#toLatitude))*COS(RADIANS(#fromLongitude-#toLongitude)),15))
)as float)
return round(#distance,1)
END
I. Regarding "Breadcrumbs" method
Earth radius is different on different Lat. This must be taken into consideration in Haversine algorithm.
Consider Bearing change, which turns straight lines to arches (which are longer)
Taking Speed change into account will turn arches to spirals (which are longer or shorter than arches)
Altitude change will turn flat spirals to 3D spirals (which are longer again). This is very important for hilly areas.
Below see the function in C which takes #1 and #2 into account:
double calcDistanceByHaversine(double rLat1, double rLon1, double rHeading1,
double rLat2, double rLon2, double rHeading2){
double rDLatRad = 0.0;
double rDLonRad = 0.0;
double rLat1Rad = 0.0;
double rLat2Rad = 0.0;
double a = 0.0;
double c = 0.0;
double rResult = 0.0;
double rEarthRadius = 0.0;
double rDHeading = 0.0;
double rDHeadingRad = 0.0;
if ((rLat1 < -90.0) || (rLat1 > 90.0) || (rLat2 < -90.0) || (rLat2 > 90.0)
|| (rLon1 < -180.0) || (rLon1 > 180.0) || (rLon2 < -180.0)
|| (rLon2 > 180.0)) {
return -1;
};
rDLatRad = (rLat2 - rLat1) * DEGREE_TO_RADIANS;
rDLonRad = (rLon2 - rLon1) * DEGREE_TO_RADIANS;
rLat1Rad = rLat1 * DEGREE_TO_RADIANS;
rLat2Rad = rLat2 * DEGREE_TO_RADIANS;
a = sin(rDLatRad / 2) * sin(rDLatRad / 2) + sin(rDLonRad / 2) * sin(
rDLonRad / 2) * cos(rLat1Rad) * cos(rLat2Rad);
if (a == 0.0) {
return 0.0;
}
c = 2 * atan2(sqrt(a), sqrt(1 - a));
rEarthRadius = 6378.1370 - (21.3847 * 90.0 / ((fabs(rLat1) + fabs(rLat2))
/ 2.0));
rResult = rEarthRadius * c;
// Chord to Arc Correction based on Heading changes. Important for routes with many turns and U-turns
if ((rHeading1 >= 0.0) && (rHeading1 < 360.0) && (rHeading2 >= 0.0)
&& (rHeading2 < 360.0)) {
rDHeading = fabs(rHeading1 - rHeading2);
if (rDHeading > 180.0) {
rDHeading -= 180.0;
}
rDHeadingRad = rDHeading * DEGREE_TO_RADIANS;
if (rDHeading > 5.0) {
rResult = rResult * (rDHeadingRad / (2.0 * sin(rDHeadingRad / 2)));
} else {
rResult = rResult / cos(rDHeadingRad);
}
}
return rResult;
}
II. There is an easier way which gives pretty good results.
By Average Speed.
Trip_distance = Trip_average_speed * Trip_time
Since GPS Speed is detected by Doppler effect and is not directly related to [Lon,Lat] it can be at least considered as secondary (backup or correction) if not as main distance calculation method.
If you need something more accurate then have a look at this.
Vincenty's formulae are two related iterative methods used in geodesy
to calculate the distance between two points on the surface of a
spheroid, developed by Thaddeus Vincenty (1975a) They are based on the
assumption that the figure of the Earth is an oblate spheroid, and
hence are more accurate than methods such as great-circle distance
which assume a spherical Earth.
The first (direct) method computes the location of a point which is a
given distance and azimuth (direction) from another point. The second
(inverse) method computes the geographical distance and azimuth
between two given points. They have been widely used in geodesy
because they are accurate to within 0.5 mm (0.020″) on the Earth
ellipsoid.
If you're using .NET don't reivent the wheel. See System.Device.Location. Credit to fnx in the comments in another answer.
using System.Device.Location;
double lat1 = 45.421527862548828D;
double long1 = -75.697189331054688D;
double lat2 = 53.64135D;
double long2 = -113.59273D;
GeoCoordinate geo1 = new GeoCoordinate(lat1, long1);
GeoCoordinate geo2 = new GeoCoordinate(lat2, long2);
double distance = geo1.GetDistanceTo(geo2);
here is the Swift implementation from the answer
func degreesToRadians(degrees: Double) -> Double {
return degrees * Double.pi / 180
}
func distanceInKmBetweenEarthCoordinates(lat1: Double, lon1: Double, lat2: Double, lon2: Double) -> Double {
let earthRadiusKm: Double = 6371
let dLat = degreesToRadians(degrees: lat2 - lat1)
let dLon = degreesToRadians(degrees: lon2 - lon1)
let lat1 = degreesToRadians(degrees: lat1)
let lat2 = degreesToRadians(degrees: lat2)
let a = sin(dLat/2) * sin(dLat/2) +
sin(dLon/2) * sin(dLon/2) * cos(lat1) * cos(lat2)
let c = 2 * atan2(sqrt(a), sqrt(1 - a))
return earthRadiusKm * c
}
This is version from "Henry Vilinskiy" adapted for MySQL and Kilometers:
CREATE FUNCTION `CalculateDistanceInKm`(
fromLatitude float,
fromLongitude float,
toLatitude float,
toLongitude float
) RETURNS float
BEGIN
declare distance float;
select
6367 * ACOS(
round(
COS(RADIANS(90-fromLatitude)) *
COS(RADIANS(90-toLatitude)) +
SIN(RADIANS(90-fromLatitude)) *
SIN(RADIANS(90-toLatitude)) *
COS(RADIANS(fromLongitude-toLongitude))
,15)
)
into distance;
return round(distance,3);
END;
This Lua code is adapted from stuff found on Wikipedia and in Robert Lipe's GPSbabel tool:
local EARTH_RAD = 6378137.0
-- earth's radius in meters (official geoid datum, not 20,000km / pi)
local radmiles = EARTH_RAD*100.0/2.54/12.0/5280.0;
-- earth's radius in miles
local multipliers = {
radians = 1, miles = radmiles, mi = radmiles, feet = radmiles * 5280,
meters = EARTH_RAD, m = EARTH_RAD, km = EARTH_RAD / 1000,
degrees = 360 / (2 * math.pi), min = 60 * 360 / (2 * math.pi)
}
function gcdist(pt1, pt2, units) -- return distance in radians or given units
--- this formula works best for points close together or antipodal
--- rounding error strikes when distance is one-quarter Earth's circumference
--- (ref: wikipedia Great-circle distance)
if not pt1.radians then pt1 = rad(pt1) end
if not pt2.radians then pt2 = rad(pt2) end
local sdlat = sin((pt1.lat - pt2.lat) / 2.0);
local sdlon = sin((pt1.lon - pt2.lon) / 2.0);
local res = sqrt(sdlat * sdlat + cos(pt1.lat) * cos(pt2.lat) * sdlon * sdlon);
res = res > 1 and 1 or res < -1 and -1 or res
res = 2 * asin(res);
if units then return res * assert(multipliers[units])
else return res
end
end
private double deg2rad(double deg)
{
return (deg * Math.PI / 180.0);
}
private double rad2deg(double rad)
{
return (rad / Math.PI * 180.0);
}
private double GetDistance(double lat1, double lon1, double lat2, double lon2)
{
//code for Distance in Kilo Meter
double theta = lon1 - lon2;
double dist = Math.Sin(deg2rad(lat1)) * Math.Sin(deg2rad(lat2)) + Math.Cos(deg2rad(lat1)) * Math.Cos(deg2rad(lat2)) * Math.Cos(deg2rad(theta));
dist = Math.Abs(Math.Round(rad2deg(Math.Acos(dist)) * 60 * 1.1515 * 1.609344 * 1000, 0));
return (dist);
}
private double GetDirection(double lat1, double lon1, double lat2, double lon2)
{
//code for Direction in Degrees
double dlat = deg2rad(lat1) - deg2rad(lat2);
double dlon = deg2rad(lon1) - deg2rad(lon2);
double y = Math.Sin(dlon) * Math.Cos(lat2);
double x = Math.Cos(deg2rad(lat1)) * Math.Sin(deg2rad(lat2)) - Math.Sin(deg2rad(lat1)) * Math.Cos(deg2rad(lat2)) * Math.Cos(dlon);
double direct = Math.Round(rad2deg(Math.Atan2(y, x)), 0);
if (direct < 0)
direct = direct + 360;
return (direct);
}
private double GetSpeed(double lat1, double lon1, double lat2, double lon2, DateTime CurTime, DateTime PrevTime)
{
//code for speed in Kilo Meter/Hour
TimeSpan TimeDifference = CurTime.Subtract(PrevTime);
double TimeDifferenceInSeconds = Math.Round(TimeDifference.TotalSeconds, 0);
double theta = lon1 - lon2;
double dist = Math.Sin(deg2rad(lat1)) * Math.Sin(deg2rad(lat2)) + Math.Cos(deg2rad(lat1)) * Math.Cos(deg2rad(lat2)) * Math.Cos(deg2rad(theta));
dist = rad2deg(Math.Acos(dist)) * 60 * 1.1515 * 1.609344;
double Speed = Math.Abs(Math.Round((dist / Math.Abs(TimeDifferenceInSeconds)) * 60 * 60, 0));
return (Speed);
}
private double GetDuration(DateTime CurTime, DateTime PrevTime)
{
//code for speed in Kilo Meter/Hour
TimeSpan TimeDifference = CurTime.Subtract(PrevTime);
double TimeDifferenceInSeconds = Math.Abs(Math.Round(TimeDifference.TotalSeconds, 0));
return (TimeDifferenceInSeconds);
}
i took the top answer and used it in a Scala program
import java.lang.Math.{atan2, cos, sin, sqrt}
def latLonDistance(lat1: Double, lon1: Double)(lat2: Double, lon2: Double): Double = {
val earthRadiusKm = 6371
val dLat = (lat2 - lat1).toRadians
val dLon = (lon2 - lon1).toRadians
val latRad1 = lat1.toRadians
val latRad2 = lat2.toRadians
val a = sin(dLat / 2) * sin(dLat / 2) + sin(dLon / 2) * sin(dLon / 2) * cos(latRad1) * cos(latRad2)
val c = 2 * atan2(sqrt(a), sqrt(1 - a))
earthRadiusKm * c
}
i curried the function in order to be able to easily produce functions that have one of the two locations fixed and require only a pair of lat/lon to produce distance.
Here's a Kotlin variation:
import kotlin.math.*
class HaversineAlgorithm {
companion object {
private const val MEAN_EARTH_RADIUS = 6371.008
private const val D2R = Math.PI / 180.0
}
private fun haversineInKm(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double {
val lonDiff = (lon2 - lon1) * D2R
val latDiff = (lat2 - lat1) * D2R
val latSin = sin(latDiff / 2.0)
val lonSin = sin(lonDiff / 2.0)
val a = latSin * latSin + (cos(lat1 * D2R) * cos(lat2 * D2R) * lonSin * lonSin)
val c = 2.0 * atan2(sqrt(a), sqrt(1.0 - a))
return MEAN_EARTH_RADIUS * c
}
}
you can find a implementation of this (with some good explanation) in F# on fssnip
here are the important parts:
let GreatCircleDistance<[<Measure>] 'u> (R : float<'u>) (p1 : Location) (p2 : Location) =
let degToRad (x : float<deg>) = System.Math.PI * x / 180.0<deg/rad>
let sq x = x * x
// take the sin of the half and square the result
let sinSqHf (a : float<rad>) = (System.Math.Sin >> sq) (a / 2.0<rad>)
let cos (a : float<deg>) = System.Math.Cos (degToRad a / 1.0<rad>)
let dLat = (p2.Latitude - p1.Latitude) |> degToRad
let dLon = (p2.Longitude - p1.Longitude) |> degToRad
let a = sinSqHf dLat + cos p1.Latitude * cos p2.Latitude * sinSqHf dLon
let c = 2.0 * System.Math.Atan2(System.Math.Sqrt(a), System.Math.Sqrt(1.0-a))
R * c
I needed to implement this in PowerShell, hope it can help someone else.
Some notes about this method
Don't split any of the lines or the calculation will be wrong
To calculate in KM remove the * 1000 in the calculation of $distance
Change $earthsRadius = 3963.19059 and remove * 1000 in the calculation of $distance the to calulate the distance in miles
I'm using Haversine, as other posts have pointed out Vincenty's formulae is much more accurate
Function MetresDistanceBetweenTwoGPSCoordinates($latitude1, $longitude1, $latitude2, $longitude2)
{
$Rad = ([math]::PI / 180);
$earthsRadius = 6378.1370 # Earth's Radius in KM
$dLat = ($latitude2 - $latitude1) * $Rad
$dLon = ($longitude2 - $longitude1) * $Rad
$latitude1 = $latitude1 * $Rad
$latitude2 = $latitude2 * $Rad
$a = [math]::Sin($dLat / 2) * [math]::Sin($dLat / 2) + [math]::Sin($dLon / 2) * [math]::Sin($dLon / 2) * [math]::Cos($latitude1) * [math]::Cos($latitude2)
$c = 2 * [math]::ATan2([math]::Sqrt($a), [math]::Sqrt(1-$a))
$distance = [math]::Round($earthsRadius * $c * 1000, 0) #Multiple by 1000 to get metres
Return $distance
}
Scala version
def deg2rad(deg: Double) = deg * Math.PI / 180.0
def rad2deg(rad: Double) = rad / Math.PI * 180.0
def getDistanceMeters(lat1: Double, lon1: Double, lat2: Double, lon2: Double) = {
val theta = lon1 - lon2
val dist = Math.sin(deg2rad(lat1)) * Math.sin(deg2rad(lat2)) + Math.cos(deg2rad(lat1)) *
Math.cos(deg2rad(lat2)) * Math.cos(deg2rad(theta))
Math.abs(
Math.round(
rad2deg(Math.acos(dist)) * 60 * 1.1515 * 1.609344 * 1000)
)
}
Here's my implementation in Elixir
defmodule Geo do
#earth_radius_km 6371
#earth_radius_sm 3958.748
#earth_radius_nm 3440.065
#feet_per_sm 5280
#d2r :math.pi / 180
def deg_to_rad(deg), do: deg * #d2r
def great_circle_distance(p1, p2, :km), do: haversine(p1, p2) * #earth_radius_km
def great_circle_distance(p1, p2, :sm), do: haversine(p1, p2) * #earth_radius_sm
def great_circle_distance(p1, p2, :nm), do: haversine(p1, p2) * #earth_radius_nm
def great_circle_distance(p1, p2, :m), do: great_circle_distance(p1, p2, :km) * 1000
def great_circle_distance(p1, p2, :ft), do: great_circle_distance(p1, p2, :sm) * #feet_per_sm
#doc """
Calculate the [Haversine](https://en.wikipedia.org/wiki/Haversine_formula)
distance between two coordinates. Result is in radians. This result can be
multiplied by the sphere's radius in any unit to get the distance in that unit.
For example, multiple the result of this function by the Earth's radius in
kilometres and you get the distance between the two given points in kilometres.
"""
def haversine({lat1, lon1}, {lat2, lon2}) do
dlat = deg_to_rad(lat2 - lat1)
dlon = deg_to_rad(lon2 - lon1)
radlat1 = deg_to_rad(lat1)
radlat2 = deg_to_rad(lat2)
a = :math.pow(:math.sin(dlat / 2), 2) +
:math.pow(:math.sin(dlon / 2), 2) *
:math.cos(radlat1) * :math.cos(radlat2)
2 * :math.atan2(:math.sqrt(a), :math.sqrt(1 - a))
end
end
Dart Version
Haversine Algorithm.
import 'dart:math';
class GeoUtils {
static double _degreesToRadians(degrees) {
return degrees * pi / 180;
}
static double distanceInKmBetweenEarthCoordinates(lat1, lon1, lat2, lon2) {
var earthRadiusKm = 6371;
var dLat = _degreesToRadians(lat2-lat1);
var dLon = _degreesToRadians(lon2-lon1);
lat1 = _degreesToRadians(lat1);
lat2 = _degreesToRadians(lat2);
var a = sin(dLat/2) * sin(dLat/2) +
sin(dLon/2) * sin(dLon/2) * cos(lat1) * cos(lat2);
var c = 2 * atan2(sqrt(a), sqrt(1-a));
return earthRadiusKm * c;
}
}
In Python, you can use the geopy library to compute the geodesic distance using the WGS84 ellipsoid:
from geopy.distance import geodesic
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
print(geodesic(newport_ri, cleveland_oh).km)
TypeScript Version
export const degreeToRadian = (degree: number) => {
return degree * Math.PI / 180;
}
export const distanceBetweenEarthCoordinatesInKm = (lat1: number, lon1: number, lat2: number, lon2: number) => {
const earthRadiusInKm = 6371;
const dLat = degreeToRadian(lat2 - lat1);
const dLon = degreeToRadian(lon2 - lon1);
lat1 = degreeToRadian(lat1);
lat2 = degreeToRadian(lat2);
const a = Math.sin(dLat / 2) * Math.sin(dLat / 2) + Math.sin(dLon / 2) * Math.sin(dLon / 2) * Math.cos(lat1) * Math.cos(lat2);
const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
return earthRadiusInKm * c;
}
I think a version of the algorithm in R is still missing:
gpsdistance<-function(lat1,lon1,lat2,lon2){
# internal function to change deg to rad
degreesToRadians<- function (degrees) {
return (degrees * pi / 180)
}
R<-6371e3 #radius of Earth in meters
phi1<-degreesToRadians(lat1) # latitude 1
phi2<-degreesToRadians(lat2) # latitude 2
lambda1<-degreesToRadians(lon1) # longitude 1
lambda2<-degreesToRadians(lon2) # longitude 2
delta_phi<-phi1-phi2 # latitude-distance
delta_lambda<-lambda1-lambda2 # longitude-distance
a<-sin(delta_phi/2)*sin(delta_phi/2)+
cos(phi1)*cos(phi2)*sin(delta_lambda/2)*
sin(delta_lambda/2)
cc<-2*atan2(sqrt(a),sqrt(1-a))
distance<- R * cc
return(distance) # in meters
}
For java
public static double degreesToRadians(double degrees) {
return degrees * Math.PI / 180;
}
public static double distanceInKmBetweenEarthCoordinates(Location location1, Location location2) {
double earthRadiusKm = 6371;
double dLat = degreesToRadians(location2.getLatitude()-location1.getLatitude());
double dLon = degreesToRadians(location2.getLongitude()-location1.getLongitude());
double lat1 = degreesToRadians(location1.getLatitude());
double lat2 = degreesToRadians(location2.getLatitude());
double a = Math.sin(dLat/2) * Math.sin(dLat/2) +
Math.sin(dLon/2) * Math.sin(dLon/2) * Math.cos(lat1) * Math.cos(lat2);
double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
return earthRadiusKm * c;
}
For anyone searching for a Delphi/Pascal version:
function GreatCircleDistance(const Lat1, Long1, Lat2, Long2: Double): Double;
var
Lat1Rad, Long1Rad, Lat2Rad, Long2Rad: Double;
const
EARTH_RADIUS_KM = 6378;
begin
Lat1Rad := DegToRad(Lat1);
Long1Rad := DegToRad(Long1);
Lat2Rad := DegToRad(Lat2);
Long2Rad := DegToRad(Long2);
Result := EARTH_RADIUS_KM * ArcCos(Cos(Lat1Rad) * Cos(Lat2Rad) * Cos(Long1Rad - Long2Rad) + Sin(Lat1Rad) * Sin(Lat2Rad));
end;
I take no credit for this code, I originally found it posted by Gary William on a public forum.