Reverse Geo Coding in R - r

I would like to reverse geocode coordinates in R to get the address and PIN code (postal code).
These are the columns:
A B C
15.3859085 74.0314209 7J7P92PJ+9H77QGCCCC
I took the first four rows (columns A, B and C) out of thousands of rows:
df<-ga.data[1:4,]
df <- cbind(df,do.call(rbind,
lapply(1:nrow(df),
function(i)
revgeocode(as.numeric(
df[i,3:1]), output = "more")
[c("administrative_area_level_1","locality","postal_code","address")])))
Error in revgeocode(as.numeric(df[i, 3:1]), output = "more") :
is.numeric(location) && length(location) == 2 is not TRUE
Any other package or approach for finding the address and PIN code would also be welcome.
I also tried the following.
When I used ggmap with named columns, I got this warning:
In revgeocode(as.numeric(df[i, c("Latitude", "Longitude")]), output = "address") :
HTTP 400 Bad Request
I also tried this:
revgeocode(c(df$B[1], df$A[1]))
Warning message:
In revgeocode(c(df$Longitude[1], df$Latitude[1])) : HTTP 400 Bad Request
I am in India, and it does not work when I search with coordinates in India. With coordinates in the US it returns the exact address, which seems fishy.
data <- read.csv(text="ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")
library(ggmap)
result <- do.call(rbind,
lapply(1:nrow(data),
function(i)revgeocode(as.numeric(data[i,3:2]))))
data <- cbind(data,result)

The current CRAN version of revgeo (0.15) does not have a revgeocode function. If you upgrade to this version, you'll find a revgeo function, which takes longitude and latitude arguments. Your column C should not be passed into the function.
revgeo::revgeo(latitude=df[, 'A'], longitude=df[, 'B'], output='frame')
[1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=74.0314209&lat=15.3859085"
             housenumber           street  city state                zip country
1 House Number Not Found Street Not Found Borim   Goa Postcode Not Found   India
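The error in the question comes from passing three columns (including the non-numeric column C) where a numeric vector of length 2 is expected. A minimal sketch of building that vector follows. The helper name make_location is hypothetical, and the c(longitude, latitude) order is the one ggmap::revgeocode() expects; the actual lookup is left commented out because it assumes a registered Google API key.

```r
# Hypothetical helper: build the c(lon, lat) vector for one row of a data frame.
make_location <- function(df, i, lat_col, lon_col) {
  loc <- c(as.numeric(df[[lon_col]][i]), as.numeric(df[[lat_col]][i]))
  # Fail early if anything is missing or non-numeric.
  stopifnot(length(loc) == 2, !anyNA(loc))
  loc
}

# Sample row from the question: A is latitude, B is longitude, C is not a coordinate.
df <- data.frame(A = 15.3859085, B = 74.0314209, C = "7J7P92PJ+9H77QGCCCC")
loc <- make_location(df, 1, lat_col = "A", lon_col = "B")

# Assumption: ggmap is installed and a Google API key has been registered.
# address <- ggmap::revgeocode(loc, output = "address")
```

Selecting the two coordinate columns by name avoids depending on column positions such as 3:1.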

Related

Geocoding with R: Errors stopping program altogether

I have a working program which pulls addresses from a list in Excel and geocodes them using a Google API, but anytime it gets to an address with an apartment, unit, or unfindable address, it stops the program.
I can't get a workable tryCatch routine going inside my loop. :(
Here is the Code:
library("readxl")
library(ggplot2)
library(ggmap)
fileToLoad <- file.choose(new = TRUE)
origAddress <- read_excel(fileToLoad, sheet = "Sheet1")
geocoded <- data.frame(stringsAsFactors = FALSE)
for(i in 1:nrow(origAddress))
{
# Print("Working...")
result <- geocode(origAddress$addresses[i], output = "latlona", source = "google")
origAddress$lon[i] <- as.numeric(result[1])
origAddress$lat[i] <- as.numeric(result[2])
origAddress$geoAddress[i] <- as.character(result[3])
}
write.csv(origAddress, "geocoded1.csv", row.names=FALSE)
And here is the Error message:
Warning: Geocoding "[removed address]" failed with error:
You must use an API key to authenticate each request to Google Maps Platform APIs. For additional information, please refer to http://g.co/dev/maps-no-account
Error: Can't subset columns that don't exist.
x Location 3 doesn't exist.
i There are only 2 columns.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: Unknown or uninitialised column: `lon`.
2: Unknown or uninitialised column: `lat`.
3: Unknown or uninitialised column: `geoAddress`.
Now, this is not an API key error because the key works in calls after the error -- and it stops at any address that ends in a number after the street name.
I'm going to be processing batches of thousands of addresses every month and they are not all going to be perfect, so what I need is to be able to skip these bad addresses, put "NA" in the lon/lat columns, and move on.
I'm new to R and can't write a workable error-handling routine for these kinds of failures. Can anyone point me in the right direction? Thanks in advance.
When geocode fails to find an address and output = "latlona", the address field is not returned. Your code can be made to work with the following modification.
#
# example data
#
origAddress <- data.frame(addresses = c("white house, Washington",
"white house, # 100, Washington",
"white hose, Washington",
"Washington Apartments, Washington, DC 20001",
"1278 7th st nw, washington, dc 20001") )
#
# simple fix for fatal error
#
for(i in 1:nrow(origAddress))
{
result <- geocode(origAddress$addresses[i], output = "latlona",
source = "google")
origAddress$lon[i] <- result$lon[1]
origAddress$lat[i] <- result$lat[1]
origAddress$geoAddress[i] <- ifelse( is.na(result$lon[1]), NA, result$address[1] )
}
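Since the question asks for a tryCatch routine, here is a minimal sketch of one. The names safe_geocode and fake_geocode are hypothetical: fake_geocode is a toy stand-in that fails on addresses containing "#" so the example runs without an API key. In real use you would pass ggmap::geocode (or any function returning a data frame with lon and lat columns) instead.

```r
# Wrap a geocoding function so that an error yields an NA row instead of stopping.
# `geocode_fun` is an assumption: any function that takes one address and
# returns a data frame with `lon` and `lat` columns.
safe_geocode <- function(address, geocode_fun) {
  empty <- data.frame(lon = NA_real_, lat = NA_real_)
  tryCatch({
    res <- geocode_fun(address)
    if (is.null(res) || nrow(res) == 0) {
      empty
    } else {
      data.frame(lon = as.numeric(res$lon[1]), lat = as.numeric(res$lat[1]))
    }
  }, error = function(e) empty)
}

# Toy geocoder for demonstration only: errors on addresses containing "#".
fake_geocode <- function(address) {
  if (grepl("#", address, fixed = TRUE)) stop("cannot geocode: ", address)
  data.frame(lon = -77.0365, lat = 38.8977)
}

addresses <- c("white house, Washington", "white house, # 100, Washington")
coords <- do.call(rbind, lapply(addresses, safe_geocode, geocode_fun = fake_geocode))
result <- cbind(data.frame(addresses = addresses), coords)
```

Every input address keeps its row, and failed lookups carry NA in lon and lat, so the batch can continue and the bad addresses can be reviewed afterwards.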
However, you mention that some of your addresses may not be exact. Google's geocoding will try to interpret every address you supply. Sometimes it fails and returns NA, but other times its interpretation may simply be wrong, so you should always check geocode results.
A simple method which will catch many errors is to set output = "more" in geocode and then examine the values returned in the loctype column. If loctype != "rooftop", you may have a problem. Examining the type column will give you more information. This check isn't complete. To do a more complete check, you could use output = "all" to return all data supplied by Google for an address, but this requires parsing a moderately complex list. You should read more about the data returned by Google geocoding at https://developers.google.com/maps/documentation/geocoding/overview
Also, geocode will take at least tens of minutes to return results for thousands of addresses. To minimize the response time, you should supply the addresses to geocode as a character vector. A data frame of results is then returned, which you can use to update your origAddress data frame and check for errors as shown below.
#
# Solution should check for wrongly interpreted addresses
#
# see https://developers.google.com/maps/documentation/geocoding/overview
# for more information on fields returned by google geocoding
#
# return all addresses in single call to geocode
#
origAddress <- data.frame(addresses = c("white house, Washington", # identified by name
"white hose, Washington", # misspelling
"Washington Apartments, apt 100, Washington, DC 20001", # identified by name of apartment building
"Washington Apartments, # 100, Washington, DC 20001", # invalid apartment number specification
"1206 7th st nw, washington, dc 20001") ) # address on street but no structure with that address
result <- suppressWarnings(geocode(location = origAddress$addresses,
output = "more",
source = "google") )
origAddress <- cbind(origAddress, result[, c("address", "lon","lat","type", "loctype")])
#
# Addresses which need to be checked
#
check_addresses <- origAddress[ origAddress$loctype != "rooftop" |
is.na(origAddress$loctype), ]

Geocoding Data Locations With Google in R

I am trying to follow the very well written instructions from this blog: https://www.jessesadler.com/post/geocoding-with-r/ to geocode location data in R, including specific sites and cities in Hawaii. I am having issues pulling information from Google. When I run mutate_geocode, the code runs but no output is returned. I bypassed this for the time being by manually entering lat and lon for just one location of my dataset, in an attempt to troubleshoot. Now, when I use get_googlemap, I get the error message "Error in Download File"
I have tried using mutate_geocode as well as running a loop using geocode. I either get no output or I get the OVER_QUERY_LIMIT error (which seems to be very common). I checked my quota and I am nowhere near the limit.
Method 1:
BH <- rename(location, place = Location)
BH_df <- as.data.frame(BH)
location_df <- mutate_geocode(HB, Location)
Method 2:
origAddress <- read.csv("HSMBH.csv", stringsAsFactors = FALSE)
geocoded <- data.frame(stringsAsFactors = FALSE)
for(i in 1:nrow(origAddress))
{
result <- geocode(HB$Location[i], output = "latlona", source = "google")
HB$lon[i] <- as.character(result[1])
HB$lat[i] <- as.character(result[2])
HB$geoAddress[i] <- as.character(result[3])
}
After manually entering the lon and lat points, I run into this error:
map <- get_googlemap(center = c(-158.114, 21.59), zoom = 4)
I am hoping to gather lat and lon points for my locations, and then be able to use get_googlemap to draft a map with which I can plot density points of occurrences (I have the code for the points already).
Alternatively, you can use a one-liner for rapid geocoding via tmaptools::geocode_OSM():
Data
library(tmaptools)
addresses <- data.frame(address = c("New York", "Berlin", "Huangpu Qu",
"Vienna", "St. Petersburg"),
stringsAsFactors = FALSE)
Code
result <- lapply(addresses[, 1], geocode_OSM)
> result
$address
query lat lon lat_min lat_max lon_min lon_max
1 New York 40.73086 -73.98716 40.47740 40.91618 -74.25909 -73.70018
2 Berlin 52.51704 13.38886 52.35704 52.67704 13.22886 13.54886
3 Huangpu Qu 31.21823 121.48030 31.19020 31.24653 121.45220 121.50596
4 Vienna 48.20835 16.37250 48.04835 48.36835 16.21250 16.53250
5 St. Petersburg 27.77038 -82.66951 27.64364 27.91390 -82.76902 -82.54062
This way, you have both the centroids (lon, lat) that are important for Google Maps and the bounding boxes (lon_min, lat_min, lon_max, lat_max) that mapping services like OSM or Stamen need.

Reverse geocode search (country names) for many locations, output to dataframe issues when country is missing

I am using the geonames package in R to do a reverse geocode search (GNcountryCode) to find the nearest country to my inputs. My inputs are not very precise and are located in water near land. geonames allows for a search within a buffer (km) of the input location.
I was trying to use mapply to expedite retrieving country names from a long list of input locations. However, the limits on buffer size still leave some input locations without a country. To permit mapply to continue running I used tryCatch to prevent mapply from stopping.
However, this results in a non-list entry ("Error") in the overall list of lists (output below). As such, when trying to use data.table::rbindlist I get the following error: "Item n of list input is not a data.frame, data.table or list"
How can I otherwise loop or vectorize GNcountryCode to get the nearest country name to the input location and then add this name back (cbind) to the original data frame (with the understanding that some locations will not be matched to a country)?
library(geonames)# requires a username for some functionality
Latitude <- c("32.75", "33.75", "33.75", "34.25", "34.25", "36.75")
Longitude <- c("-17.25", "-52.25", "-51.75", "-52.25", "-51.75", "-25.25")
# df <- cbind.data.frame(Latitude, Longitude)
MyFun <- function(x,y) {
MyRes <- tryCatch(GNcountryCode(lat=x, lng=y, radius=250), error = function(e) paste("Error"))
#print(MyRes)
return(MyRes)
}
MyResult <- mapply(MyFun, Latitude, Longitude)
data.table::rbindlist(MyResult, fill = TRUE)
#cbind(df, data.table::rbindlist(MyResult, fill = TRUE))
#Output
$`32.75`
$`32.75`$`languages`
[1] "pt-PT,mwl"
$`32.75`$distance
[1] "1.96436"
$`32.75`$countryCode
[1] "PT"
$`32.75`$countryName
[1] "Portuguese Republic"
$`33.75`
[1] "Error"
$`33.75`
[1] "Error"
$`34.25`
[1] "Error"
$`34.25`
[1] "Error"
$`36.75`
$`36.75`$`languages`
[1] "pt-PT,mwl"
$`36.75`$distance
[1] "22.63538"
$`36.75`$countryCode
[1] "PT"
$`36.75`$countryName
[1] "Portuguese Republic"
Set the error handler to return NA (you might also want to pull out just the country name from the results that do work):
library(geonames)# requires a username for some functionality
Latitude <- c("32.75", "33.75", "33.75", "34.25", "34.25", "36.75")
Longitude <- c("-17.25", "-52.25", "-51.75", "-52.25", "-51.75", "-25.25")
df <- cbind.data.frame(Latitude, Longitude)
MyFun <- function(x,y) {
tryCatch(GNcountryCode(lat = x, lng = y, radius = 250)$countryName, error = function(e) NA_character_)
}
df$countryname <- mapply(MyFun, Latitude, Longitude)
df
# Latitude Longitude countryname
# 1 32.75 -17.25 Portuguese Republic
# 2 33.75 -52.25 <NA>
# 3 33.75 -51.75 <NA>
# 4 34.25 -52.25 <NA>
# 5 34.25 -51.75 <NA>
# 6 36.75 -25.25 Portuguese Republic
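If you want to keep more than one field per location, another option is to have the error handler return a list with the same field names, all NA, so that every element can be bound into one data frame. The sketch below is only an illustration: fake_lookup is a hypothetical stand-in for GNcountryCode so it runs without a geonames username, and base R's do.call(rbind, ...) is used in place of data.table::rbindlist.

```r
# Hypothetical stand-in for geonames::GNcountryCode (no network access needed):
# returns a list for one known point and signals an error otherwise.
fake_lookup <- function(lat, lng) {
  if (lat == "32.75") {
    list(countryCode = "PT", countryName = "Portuguese Republic", distance = "1.96436")
  } else {
    stop("no country found")
  }
}

# On error, return a list with the same fields, all NA, instead of a bare string.
safe_lookup <- function(lat, lng) {
  tryCatch(
    fake_lookup(lat, lng),
    error = function(e) list(countryCode = NA_character_,
                             countryName = NA_character_,
                             distance = NA_character_)
  )
}

Latitude <- c("32.75", "33.75")
Longitude <- c("-17.25", "-52.25")

# SIMPLIFY = FALSE keeps a list of lists; each one becomes a one-row data frame.
res <- mapply(safe_lookup, Latitude, Longitude, SIMPLIFY = FALSE)
countries <- do.call(rbind, lapply(res, as.data.frame, stringsAsFactors = FALSE))
rownames(countries) <- NULL
out <- cbind(data.frame(Latitude, Longitude), countries)
```

Because every list element has the same shape, the binding step no longer hits a non-list "Error" entry.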

geocode result different from google maps

I'm trying to geocode different IATA airport codes in Italy, with the following (rudimentary) code in ggmap (version 2.4)
#list of all IATA codes
geo_apt <- c("AOI", "BGY", "BLQ", "BRI", "CTA", "FCO", "LIN", "MXP", "NAP",
"PMF", "PSA", "PSR", "RMI", "TRN", "VCE", "VRN")
#preparing an empty dataframe to store the geocodes
apt_geo <- data.frame(IATA=rep(NA,16), lon=rep(NA,16), lat=rep(NA,16))
#geocoding the codes
for (i in seq_along(geo_apt)) {
apt_geo[i,1] <- geo_apt[i]
apt_geo[i,2] <- (geocode(paste(geo_apt[i],"airport")))[1]
apt_geo[i,3] <- (geocode(paste(geo_apt[i],"airport")))[2]
}
and the geocode function of ggmap works perfectly fine with all of these codes except "PSR"
IATA lon lat
1 AOI 13.363752 43.61654
2 BGY 9.703631 45.66957
3 BLQ 11.287859 44.53452
4 BRI 16.765202 41.13751
5 CTA 15.065775 37.46730
6 FCO 12.246238 41.79989
7 LIN 9.276308 45.45218
8 MXP 8.725531 45.63006
9 NAP 14.286579 40.88299
10 PMF 10.295935 44.82326
11 PSA 10.397884 43.68908
12 PSR -81.117259 33.94855 #<- doesn't work
13 RMI 12.618819 44.02289
14 TRN 7.647867 45.19654
15 VCE 12.339771 45.50506
16 VRN 10.890141 45.40000
I've tried to use revgeocode and those coordinates correspond to the following address:
revgeocode(as.numeric(apt_geo[12,2:3]))
#Information from URL : http://maps.googleapis.com/maps/api/geocode/json?latlng=33.948545,-81.1172588&sensor=false
[1] "Kentucky Avenue, West Columbia, SC 29170, USA"
By contrast, if I search for it in Google Maps, it works perfectly fine.
Does anybody have a clue on this apparently strange phenomenon?
EDIT
Following a suggestion in the comments below, I tried geocode("italy PSR airport") on version 2.4 again, and instead of a more accurate result or even the same result, this is the warning I got:
geocode("italy PSR airport")
lon lat
1 NA NA
Warning message:
geocode failed with status ZERO_RESULTS, location = "italy PSR airport"
With the query "airport PSR", the coordinates are different again from those of "PSR airport" (at least this time it's an actual airport, although its IATA code is LEX instead of PSR).
revgeocode(as.numeric(geocode("airport PSR")))
Information from URL : http://maps.googleapis.com/maps/api/geocode/json?latlng=38.0381454,-84.5970727&sensor=false
[1] "3895 Terminal Drive, Lexington, KY 40510, USA"
The whole question is a possible duplicate
Nonetheless, I don't understand why the API and Google Maps use different datasets...

Applying revgeocode to a list of longitude-latitude coordinates

I'm trying to get the zip codes for a (long) list of longitude-latitude coordinates by using the revgeocode function in the ggmap library.
My question and data are the same as in "Using revgeocode function in a FOR loop. Help required", but the accepted answer does not work for me.
My data (.csv):
ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439
I follow the same steps:
data <- read.csv(file.choose())
dset <- as.data.frame(data[,2:3])
location = dset
locaddr <- lapply(seq(nrow(location)), function(i){
revgeocode(location[i,],
output = c("address"),
messaging = FALSE,
sensor = FALSE,
override_limit = FALSE)
})
... and get the error message: "Error: is.numeric(location) && length(location) == 2 is not TRUE"
Specifically, is.numeric(location) is FALSE, which seems strange because I can multiply by 2 and get the expected answer.
Any help would be appreciated.
There are lots of things wrong here.
First, you have latitude and longitude reversed. All the locations in your dataset, as specified, are in Antarctica.
Second, revgeocode(...) expects a numeric vector of length 2 containing the longitude and latitude in that order. You are passing a data.frame object (this is the reason for the error), and as per (1) it's in the wrong order.
Third, revgeocode(...) uses the Google Maps API, which limits you to 2500 queries a day. So if you really do have a large dataset, good luck with that.
This code works with your sample:
data <- read.csv(text="ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")
library(ggmap)
result <- do.call(rbind,
lapply(1:nrow(data),
function(i)revgeocode(as.numeric(data[i,3:2]))))
data <- cbind(data,result)
data
# ID Longitude Latitude result
# 1 311175 41.29844 -72.92918 16 Church Street South, New Haven, CT 06519, USA
# 2 292058 41.93694 -87.66984 1632 West Nelson Street, Chicago, IL 60657, USA
# 3 12979 37.58096 -77.47144 2077-2199 Seddon Way, Richmond, VA 23230, USA
This extracts the zipcodes:
library(stringr)
data$zipcode <- substr(str_extract(data$result," [0-9]{5}, .+"),2,6)
data[,-4]
# ID Longitude Latitude zipcode
# 1 311175 41.29844 -72.92918 06519
# 2 292058 41.93694 -87.66984 60657
# 3 12979 37.58096 -77.47144 23230
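If you would rather not depend on stringr, the same extraction can be sketched in base R. This assumes US-style formatted addresses that end in a five-digit zip code followed by ", USA", as in the results above; anything else is left as NA.

```r
addresses <- c("16 Church Street South, New Haven, CT 06519, USA",
               "1632 West Nelson Street, Chicago, IL 60657, USA",
               "2077-2199 Seddon Way, Richmond, VA 23230, USA",
               "Borim, Goa, India")  # no US zip code here

# Find five digits that are immediately followed by ", USA" at the end of the string.
m <- regexpr("[0-9]{5}(?=, USA$)", addresses, perl = TRUE)

# regmatches() returns only the successful matches, so fill a vector of NAs.
zipcode <- rep(NA_character_, length(addresses))
zipcode[m > 0] <- regmatches(addresses, m)
```

Keeping the zip codes as character strings preserves leading zeros such as "06519".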
I've written the package googleway to access the Google Maps API with a valid API key. So if your data has more than 2,500 items, you can pay for an API key and then use googleway::google_reverse_geocode().
For example
data <- read.csv(text="ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")
library(googleway)
key <- "your_api_key"
res <- apply(data, 1, function(x){
google_reverse_geocode(location = c(x["Latitude"], x["Longitude"]),
key = key)
})
## Everything contained in 'res' is all the data returned from Google Maps API
## for example, the geometry section of the first lat/lon coordinates
res[[1]]$results$geometry
bounds.northeast.lat bounds.northeast.lng bounds.southwest.lat bounds.southwest.lng location.lat location.lng
1 -61.04904 180 -90 -180 -75.25097 -0.071389
location_type viewport.northeast.lat viewport.northeast.lng viewport.southwest.lat viewport.southwest.lng
1 APPROXIMATE -61.04904 180 -90 -180
To extract the zip code just write down:
>data$postal_code
