Geocoding Data Locations With Google in R - r

I am trying to use the very well written instructions from this blog: https://www.jessesadler.com/post/geocoding-with-r/ to geocode locational data in R, including specific sites and cities in Hawaii. I am having issues pulling information from Google. When I run mutate_geocode, the code runs but no output is returned. I bypassed this for the time being by manually entering lat and lon for just one location in my dataset, in an attempt to troubleshoot. Now, when I use get_googlemap, I get the error message "Error in Download File".
I have tried using mutate_geocode as well as running a loop using geocode. I either do not get output or I get the OVER_QUERY_LIMIT error (which seems to be very classic). After checking my query limit I am nowhere near the limit.
Method 1:
BH <- rename(location, place = Location)
BH_df <- as.data.frame(BH)
location_df <- mutate_geocode(HB, Location)
Method 2:
origAddress <- read.csv("HSMBH.csv", stringsAsFactors = FALSE)
geocoded <- data.frame(stringsAsFactors = FALSE)
for(i in 1:nrow(origAddress))
{
result <- geocode(HB$Location[i], output = "latlona", source = "google")
HB$lon[i] <- as.character(result[1])
HB$lat[i] <- as.character(result[2])
HB$geoAddress[i] <- as.character(result[3])
}
After manually entering the lon and lat points, I run into this error:
map <- get_googlemap(center = c(-158.114, 21.59), zoom = 4)
I am hoping to gather lat and lon points for my locations, and then be able to use get_googlemap to draft a map with which I can plot density points of occurrences (I have the code for the points already).

Alternatively, you can use a one-liner for rapid geocoding via tmaptools::geocode_OSM():
Data
library(tmaptools)
addresses <- data.frame(address = c("New York", "Berlin", "Huangpu Qu",
"Vienna", "St. Petersburg"),
stringsAsFactors = FALSE)
Code
result <- lapply(addresses[, 1], geocode_OSM)
> result
$address
query lat lon lat_min lat_max lon_min lon_max
1 New York 40.73086 -73.98716 40.47740 40.91618 -74.25909 -73.70018
2 Berlin 52.51704 13.38886 52.35704 52.67704 13.22886 13.54886
3 Huangpu Qu 31.21823 121.48030 31.19020 31.24653 121.45220 121.50596
4 Vienna 48.20835 16.37250 48.04835 48.36835 16.21250 16.53250
5 St. Petersburg 27.77038 -82.66951 27.64364 27.91390 -82.76902 -82.54062
This way, you have both
the centroids (lon, lat) that are important for Google Maps and
bounding boxes (lon_min, lat_min, lon_max, lat_max) that mapping services like OSM or Stamen need.
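As a minimal sketch of how those two pieces might be pulled apart, the snippet below rebuilds one row of the output above by hand (so it runs without a network call) and extracts a centroid and a bounding box. The helper names row_centroid and row_bbox are made up for illustration, and the left/bottom/right/top naming is an assumption about what your mapping function expects, so check its documentation.

```r
# One row copied by hand from the geocode_OSM() output above (New York).
ny <- data.frame(query = "New York",
                 lat = 40.73086, lon = -73.98716,
                 lat_min = 40.47740, lat_max = 40.91618,
                 lon_min = -74.25909, lon_max = -73.70018,
                 stringsAsFactors = FALSE)

# Hypothetical helper: centroid as c(lon, lat), the same order used for
# get_googlemap(center = ...) in the question.
row_centroid <- function(row) c(lon = row$lon, lat = row$lat)

# Hypothetical helper: bounding box as c(left, bottom, right, top).
row_bbox <- function(row) {
  c(left = row$lon_min, bottom = row$lat_min,
    right = row$lon_max, top = row$lat_max)
}

centroid <- row_centroid(ny)
bbox <- row_bbox(ny)
```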

Related

Geocoding with R: Errors stopping program altogether

I have a working program which pulls addresses from a list in Excel and geocodes them using a Google API, but anytime it gets to an address with an apartment, unit, or unfindable address, it stops the program.
I can't get a workable tryCatch routine going inside my loop. :(
Here is the Code:
library("readxl")
library(ggplot2)
library(ggmap)
fileToLoad <- file.choose(new = TRUE)
origAddress <- read_excel(fileToLoad, sheet = "Sheet1")
geocoded <- data.frame(stringsAsFactors = FALSE)
for(i in 1:nrow(origAddress))
{
# Print("Working...")
result <- geocode(origAddress$addresses[i], output = "latlona", source = "google")
origAddress$lon[i] <- as.numeric(result[1])
origAddress$lat[i] <- as.numeric(result[2])
origAddress$geoAddress[i] <- as.character(result[3])
}
write.csv(origAddress, "geocoded1.csv", row.names=FALSE)
And here is the Error message:
Warning: Geocoding "[removed address]" failed with error:
You must use an API key to authenticate each request to Google Maps Platform APIs. For additional information, please refer to http://g.co/dev/maps-no-account
Error: Can't subset columns that don't exist.
x Location 3 doesn't exist.
i There are only 2 columns.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: Unknown or uninitialised column: `lon`.
2: Unknown or uninitialised column: `lat`.
3: Unknown or uninitialised column: `geoAddress`.
Now, this is not an API key error because the key works in calls after the error -- and it stops at any address that ends in a number after the street name.
I'm going to be processing batches of thousands of addresses every month and they are not all going to be perfect, so what I need is to be able to skip these bad addresses, put "NA" in the lon/lat columns, and move on.
I'm new to R and can't make a workable error handling routine to handle these types of mistakes. Can anyone point me in the right direction? Thanks in advance.
When geocode fails to find an address and output = "latlona", the address field is not returned. Your code can be made to work with the following modification.
#
# example data
#
origAddress <- data.frame(addresses = c("white house, Washington",
"white house, # 100, Washington",
"white hose, Washington",
"Washington Apartments, Washington, DC 20001",
"1278 7th st nw, washington, dc 20001") )
#
# simple fix for fatal error
#
for(i in 1:nrow(origAddress))
{
result <- geocode(origAddress$addresses[i], output = "latlona",
source = "google")
origAddress$lon[i] <- result$lon[1]
origAddress$lat[i] <- result$lat[1]
origAddress$geoAddress[i] <- ifelse( is.na(result$lon[1]), NA, result$address[1] )
}
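Since the question asks about tryCatch specifically, here is a rough sketch of how a wrapper could return NA values instead of stopping the loop. The names safe_geocode and fake_geocoder are hypothetical; fake_geocoder is only a stand-in with placeholder coordinates so the example runs offline, and in real use you would pass something like function(a) geocode(a, output = "latlona", source = "google") instead.

```r
# Hypothetical wrapper: `geocoder` is any function that takes one address and
# returns a data frame with lon, lat and (possibly) address columns.
safe_geocode <- function(address, geocoder) {
  empty <- data.frame(lon = NA_real_, lat = NA_real_,
                      address = NA_character_, stringsAsFactors = FALSE)
  tryCatch({
    res <- as.data.frame(geocoder(address))
    # Fill in any column the geocoder did not return (e.g. a missing address).
    for (col in setdiff(names(empty), names(res))) res[[col]] <- empty[[col]]
    res[1, names(empty), drop = FALSE]
  }, error = function(e) empty)  # any error becomes a row of NAs
}

# Stand-in geocoder so the sketch runs offline; it is NOT a real API call
# and the coordinates are placeholders.
fake_geocoder <- function(address) {
  if (grepl("#", address, fixed = TRUE)) stop("simulated failure")
  if (address == "white hose, Washington") {
    return(data.frame(lon = NA_real_, lat = NA_real_))  # no address column
  }
  data.frame(lon = -77.0365, lat = 38.8977, address = tolower(address),
             stringsAsFactors = FALSE)
}

addresses <- c("white house, Washington",
               "white house, # 100, Washington",
               "white hose, Washington")
results <- do.call(rbind, lapply(addresses, safe_geocode, geocoder = fake_geocoder))
results$input <- addresses
```

The wrapper always returns one row with the same three columns, so the rows can be bound together no matter which addresses failed.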
However, you mention that some of your addresses may not be exact. Google's geocoding will try to interpret every address you supply. Sometimes it fails and returns NA, but other times its interpretation may not be correct, so you should always check geocode results.
A simple method which will catch many errors is to set output = "more" in geocode and then examine the values returned in the loctype column. If loctype != "rooftop", you may have a problem. Examining the type column will give you more information. This check isn't complete. To do a more complete check, you could use output = "all" to return all data supplied by Google for an address, but this requires parsing a moderately complex list. You should read more about the data returned by Google geocoding at https://developers.google.com/maps/documentation/geocoding/overview
Also, geocode will take at least tens of minutes to return results for thousands of addresses. To minimize the response time, you should supply the addresses to geocode as a single character vector. A data frame of results is then returned, which you can use to update your origAddress data frame and check for errors as shown below.
#
# Solution should check for wrongly interpreted addresses
#
# see https://developers.google.com/maps/documentation/geocoding/overview
# for more information on fields returned by google geocoding
#
# return all addresses in single call to geocode
#
origAddress <- data.frame(addresses = c("white house, Washington", # identified by name
"white hose, Washington", # misspelling
"Washington Apartments, apt 100, Washington, DC 20001", # identified by name of apartment building
"Washington Apartments, # 100, Washington, DC 20001", # invalid apartment number specification
"1206 7th st nw, washington, dc 20001") ) # address on street but no structure with that address
result <- suppressWarnings(geocode(location = origAddress$addresses,
output = "more",
source = "google") )
origAddress <- cbind(origAddress, result[, c("address", "lon","lat","type", "loctype")])
#
# Addresses which need to be checked
#
check_addresses <- origAddress[ origAddress$loctype != "rooftop" |
is.na(origAddress$loctype), ]

Having problem with ggmap's mapdist() function

I have this code. I have my google API set up already, registered as well in R, Distance Matrix API has been initiated as well in the Google Cloud console.
Here is the dataframe I have, random 25 postal codes FROM and TO postal codes.
Dataset_test = data.frame(
FROM_POSTAL = c("V8A 0E5","T4G 6M4","V1N 8X3",
"C1B 5G1","R5H 2L4","H9S 8L4","L8E 4Y0","H2Y 7N6",
"K1B 7C0","G4A 5B0","E4P 3T2","E4V 5P4","H3J 1R5",
"G0B 4J7","E7A 6E7","E5B 2Y9","S4H 1T8","A2V 4G5",
"V8L 2A9","T9E 1M5","A5A 5M2","E4T 5B4","S2V 6C4",
"S9H 5P8","B1Y 0V0"),
TO_POSTAL = c("G0J 0B8","N0H 9N4","J9B 4Y4",
"L3Z 2Y7","E8K 4R4","B4P 7X9","S4H 2M0","A1Y 0B8",
"A1W 1E9","P9N 7X1","E4R 4B0","N0P 0M8","E1W 9Y7",
"T9W 8E2","G6X 4S9","A0E 0V4","J5X 7N8","N4N 8A1",
"V9K 0B9","L4G 3H7","E1W 0T2","G5R 9G3","L7C 9S2",
"E8P 2X6","E2A 2M1")
)
Here is the simple script I have to try to calculate the distance between the two postal codes by driving using Google's Distance Matrix API.
Driving_Distance = mapdist(from = Dataset_test[["FROM_POSTAL"]], to = Dataset_test[["TO_POSTAL"]], mode = c("driving")) %>% distinct()
When I run this, it throws an error in the Driving_Distance - says
Error: Argument 1 is a list, must contain atomic vectors
Your Canadian postal codes work with the code below, which calls googleway's google_distance() rather than mapdist().
The number of addresses was reduced for the sake of brevity.
A tibble was used instead of a data frame so that the variables are character rather than factor data types. The actual Google API key has been replaced with placeholder text.
This was a good mapping question. The working code is below:
library(ggmap)
library(plyr)
library(googleway)
library(tidyverse)
df = tibble(
FROM_POSTAL = c("V8A 0E5","T4G 6M4","V1N 8X3",
"C1B 5G1","R5H 2L4","H9S 8L4"),
TO_POSTAL = c("G0J 0B8","N0H 9N4","J9B 4Y4",
"L3Z 2Y7","E8K 4R4","B4P 7X9"))
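dd <- apply(df, 1, function(x){
# x is a named character vector holding one row; index it by the tibble's column names
google_distance(origins = list(x[["FROM_POSTAL"]]),
destinations = list(x[["TO_POSTAL"]]),
key="My_secret_key")
})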
dd
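The factor-versus-character point can be checked without any API call. The sketch below is only an illustration with made-up variable names (as_factor, as_text); it does not claim that factors were the cause of the original error. Note that data.frame() has defaulted to stringsAsFactors = FALSE since R 4.0.0, so the factor behaviour mainly shows up in older versions or when it is requested explicitly.

```r
# stringsAsFactors = TRUE reproduces the default of R versions before 4.0.0.
as_factor <- data.frame(FROM_POSTAL = c("V8A 0E5", "T4G 6M4"),
                        stringsAsFactors = TRUE)
as_text <- data.frame(FROM_POSTAL = c("V8A 0E5", "T4G 6M4"),
                      stringsAsFactors = FALSE)

class(as_factor$FROM_POSTAL)  # "factor"
class(as_text$FROM_POSTAL)    # "character"

# Converting an existing factor column back to plain text
as_factor$FROM_POSTAL <- as.character(as_factor$FROM_POSTAL)
```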

Reverse Geo Coding in R

I would like to reverse geocode coordinates in R to get the address and pin code.
These are the columns:
A B C
15.3859085 74.0314209 7J7P92PJ+9H77QGCCCC
I have taken the first four rows, with columns A, B and C, out of thousands of rows:
df<-ga.data[1:4,]
df <- cbind(df,do.call(rbind,
lapply(1:nrow(df),
function(i)
revgeocode(as.numeric(
df[i,3:1]), output = "more")
[c("administrative_area_level_1","locality","postal_code","address")])))
Error in revgeocode(as.numeric(df[i, 3:1]), output = "more") :
is.numeric(location) && length(location) == 2 is not TRUE
Suggestions for any other package or approach to find the address and pin code are also most welcome.
I also tried the following
When I tried using ggmap I got this error
In revgeocode(as.numeric(df[i, c("Latitude", "Longitude")]), output = "address") :
HTTP 400 Bad Request
Also i tried this
revgeocode(c(df$B[1], df$A[1]))
Warning message: In revgeocode(c(df$Longitude[1],
df$Latitude[1])) : HTTP 400 Bad Request
Also, I am from India, and it does not work for me if I search for lat/long coordinates in India. If I use lat/long coordinates in the US, it gives me the exact address, which
seems fishy.
data <- read.csv(text="ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")
library(ggmap)
result <- do.call(rbind,
lapply(1:nrow(data),
function(i)revgeocode(as.numeric(data[i,3:2]))))
data <- cbind(data,result)
The current CRAN version of revgeo_0.15 does not have the revgeocode function. If you upgrade to this version, you'll find a revgeo function, which takes longitude, latitude arguments. Your column C should not be passed into the function.
revgeo::revgeo(latitude=df[, 'A'], longitude=df[, 'B'], output='frame')
[1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=74.0314209&lat=15.3859085"
housenumber street city state zip country
1 House Number Not Found Street Not Found Borim Goa Postcode Not Found India

Determining the distance between multiple ZIP codes from one point

I've been using the mapdist() function to determine the distance between two ZIP codes. I am not very good with loops yet and was wondering how to loop over multiple ZIP codes so that I don't have to rerun the code every time.
Code posted below.
library(ggmap)
mapdist('95077','06473', mode = 'driving')
library(ggmap)
Building an example data.frame
geoData <- data.frame(FROM = c('95077', 'Manchester Deaf Institute'),
TO = c('06473', 'Birmingham O2 Academy 1'),
stringsAsFactors = FALSE)
passing columns as args
mapdist(from = geoData[['FROM']],
to = geoData[['TO']],
mode = 'driving')
result
from to m km miles seconds minutes hours
1 95077 06473 4932333 4932.333 3064.95173 161558 2692.6333 44.877222
2 Manchester Deaf Institute Birmingham O2 Academy 1 141330 141.330 87.82246 6569 109.4833 1.824722
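For the "one point to many ZIP codes" case in the title, one possible approach is to repeat the origin once per destination and then pass the two columns to mapdist() exactly as above. This is a sketch only: pairs_from_origin is a hypothetical helper, the destination list is arbitrary sample data, and the mapdist() call is left commented out because it needs ggmap and a registered Google API key.

```r
# Hypothetical helper: pair a single origin with every destination.
pairs_from_origin <- function(origin, destinations) {
  data.frame(FROM = rep(origin, length(destinations)),
             TO = destinations,
             stringsAsFactors = FALSE)
}

# ZIP codes are kept as character strings so leading zeros survive.
geoData <- pairs_from_origin("95077", c("06473", "60657", "23230"))

# The columns can then be passed to mapdist() as in the answer above:
# mapdist(from = geoData[["FROM"]], to = geoData[["TO"]], mode = "driving")
```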

Applying revgeocode to a list of longitude-latitude coordinates

I'm trying to get the ZIP codes for a (long) list of longitude/latitude coordinates by using the revgeocode function in the ggmap library.
My question and data are the same as in "Using revgeocode function in a FOR loop. Help required", but the accepted answer does not work for me.
My data (.csv):
ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439
I follow the same steps:
data <- read.csv(file.choose())
dset <- as.data.frame(data[,2:3])
location = dset
locaddr <- lapply(seq(nrow(location)), function(i){
revgeocode(location[i,],
output = c("address"),
messaging = FALSE,
sensor = FALSE,
override_limit = FALSE)
})
... and get the error message: "Error: is.numeric(location) && length(location) == 2 is not TRUE"
Specifically, is.numeric(location) is FALSE, which seems strange because I can multiply by 2 and get the expected answer.
Any help would be appreciated.
There are lots of things wrong here.
First, you have latitude and longitude reversed. All the locations in your dataset, as specified, are in Antarctica.
Second, revgeocode(...) expects a numeric vector of length 2 containing the longitude and latitude in that order. You are passing a data.frame object (this is the reason for the error), and as per (1) it's in the wrong order.
Third, revgeocode(...) uses the google maps api, which limits you to 2500 queries a day. So if you really do have a large dataset, good luck with that.
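The second point can be verified without calling the API at all. The following is a small sketch using the sample from the question (row_df and row_vec are illustrative names): a one-row slice of a data frame is still a data frame, so is.numeric() is FALSE even though arithmetic such as multiplying by 2 works on it, while as.numeric() on the swapped columns gives the length-2 numeric vector that the check in the error message asks for.

```r
# Same sample as in the question; column names are kept as given, even though
# the "Longitude" column actually holds latitudes.
data <- read.csv(text = "ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")

row_df <- data[1, 2:3]               # still a one-row data.frame
is.numeric(row_df)                   # FALSE: a data.frame is a list

row_vec <- as.numeric(data[1, 3:2])  # numeric vector, columns swapped
is.numeric(row_vec) && length(row_vec) == 2  # TRUE
```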
This code works with your sample:
data <- read.csv(text="ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")
library(ggmap)
result <- do.call(rbind,
lapply(1:nrow(data),
function(i)revgeocode(as.numeric(data[i,3:2]))))
data <- cbind(data,result)
data
# ID Longitude Latitude result
# 1 311175 41.29844 -72.92918 16 Church Street South, New Haven, CT 06519, USA
# 2 292058 41.93694 -87.66984 1632 West Nelson Street, Chicago, IL 60657, USA
# 3 12979 37.58096 -77.47144 2077-2199 Seddon Way, Richmond, VA 23230, USA
This extracts the zipcodes:
library(stringr)
data$zipcode <- substr(str_extract(data$result," [0-9]{5}, .+"),2,6)
data[,-4]
# ID Longitude Latitude zipcode
# 1 311175 41.29844 -72.92918 06519
# 2 292058 41.93694 -87.66984 60657
# 3 12979 37.58096 -77.47144 23230
I've written the package googleway to access google maps API with a valid API key. So if your data is greater than 2,500 items you can pay for an API key, and then use googleway::google_reverse_geocode()
For example
data <- read.csv(text="ID, Longitude, Latitude
311175, 41.298437, -72.929179
292058, 41.936943, -87.669838
12979, 37.580956, -77.471439")
library(googleway)
key <- "your_api_key"
res <- apply(data, 1, function(x){
google_reverse_geocode(location = c(x["Latitude"], x["Longitude"]),
key = key)
})
## 'res' contains all the data returned from the Google Maps API,
## for example the geometry section for the first lat/lon coordinates
res[[1]]$results$geometry
bounds.northeast.lat bounds.northeast.lng bounds.southwest.lat bounds.southwest.lng location.lat location.lng
1 -61.04904 180 -90 -180 -75.25097 -0.071389
location_type viewport.northeast.lat viewport.northeast.lng viewport.southwest.lat viewport.southwest.lng
1 APPROXIMATE -61.04904 180 -90 -180
To extract the zip code just write down:
>data$postal_code

Resources