How to geocode an address given two streets and a number with the HERE API (specific case: Colombian addresses)

How can I geocode, with the HERE API, an address given by two streets and a number? Colombian addresses are given by a main street, the intersecting street, and a house number.
Example:
Calle 27 Carrera 25 24, Bogota, Colombia is at lat: 4.62331, lng: -74.07728
https://geocode.search.hereapi.com/v1/geocode?q=Calle+27+Calle+25-34+Bogota&apiKey=*****
This gives the wrong latitude and longitude: the right location is (4.62331, -74.07728), but it returns (4.62638, -74.08522).
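One thing that may be worth ruling out first: the request above asks for "Calle 27 Calle 25-34", while the example address is "Calle 27 Carrera 25 24". Below is a minimal sketch in R (my choice of language, the question does not name one) that builds the request URL from the address string itself, so the two cannot drift apart. The key is a placeholder, and this only builds the URL. Whether HERE then resolves the intersection-style address correctly is not something the sketch can confirm.

```r
# Address exactly as written in the example above.
address <- "Calle 27 Carrera 25 24, Bogota, Colombia"
api_key <- "YOUR_API_KEY"  # placeholder, not a real key

# reserved = TRUE percent-encodes spaces and commas as well.
query <- utils::URLencode(address, reserved = TRUE)

url <- paste0("https://geocode.search.hereapi.com/v1/geocode",
              "?q=", query,
              "&apiKey=", api_key)
print(url)
```

Decoding the q parameter again with utils::URLdecode(query) should give back the original address, which is a quick way to check what was actually sent.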

Related

Incorrect NYC Subway data from HERE

Trying to get the specific subway lines that service a subway station in NYC given a lat/long. HERE is returning some data, but it's incomplete.
I'm using the following endpoint:
https://transit.hereapi.com/v8/stations?apiKey=xxxxxxxxxxxxxx&in=40.734376,-73.990714&return=transport
It's returning the bus stations as well, but the only one I care about is:
{"place":{"name":"Union Sq - 14 St","type":"station","location":{"lat":40.734789,"lng":-73.99073},"id":"717081137"},"transports":[{"mode":"subway","name":"L","color":"#A7A9AC","textColor":"#000000","headsign":"8 Av"},{"mode":"subway","name":"L","color":"#A7A9AC","textColor":"#000000","headsign":"Canarsie - Rockaway Pkwy"},{"mode":"subway","name":"L","color":"#A7A9AC","textColor":"#000000","headsign":"Myrtle - Wyckoff Avs"}]}
The Union Square 14th st subway station has the L/N/Q/R/W/4/5/6 subway lines. Is this an error with HERE data or am I missing something in my query?
Your coordinates for the Union Square 14th St subway station seem to be slightly off. You can get all of the subway lines (L/N/Q/R/W/4/5/6) with the query below:
https://transit.hereapi.com/v8/stations?apiKey=YOUR_API_KEY&return=transport&in=40.735088,-73.989952

How to split address column in R

I have an address column in a dataframe like below:
Address
101 Marietta Street NorthWest Atlanta GA 30303
Now I want to split it into four different columns, like this:
Address                        City     State  Zip
101 Marietta Street NorthWest  Atlanta  GA     30303
It is guaranteed that the last value in the address column is the zip code, the second to last is the state, the third to last is the city, and everything before that is the address. So I am thinking I can split the address column values on spaces and extract the values from the end.
How can I do this?
We can use tidyr::extract to get the last 3 words into separate columns and keep the remaining text as Address:
tidyr::extract(df, Address, c("Address", "City", "State", "Zip"),
               regex = "(.+) (\\w+) (\\w+) (\\w+)")
#                        Address    City State   Zip
#1 101 Marietta Street NorthWest Atlanta    GA 30303
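The approach described in the question (split on spaces, then take values from the end) can also be sketched in base R without tidyr. This is only a sketch under the question's own guarantee that city, state and zip are each a single word. A two-word city such as "San Ramon" would end up partly in Address. The helper name split_one is made up for illustration.

```r
df <- data.frame(
  Address = "101 Marietta Street NorthWest Atlanta GA 30303",
  stringsAsFactors = FALSE
)

# Split every address into words on one or more spaces.
parts <- strsplit(df$Address, " +")

# Hypothetical helper: last three words are City/State/Zip, the rest is Address.
# Assumes each address has at least four words.
split_one <- function(w) {
  n <- length(w)
  c(Address = paste(w[seq_len(n - 3)], collapse = " "),
    City    = w[n - 2],
    State   = w[n - 1],
    Zip     = w[n])
}

out <- as.data.frame(do.call(rbind, lapply(parts, split_one)),
                     stringsAsFactors = FALSE)
print(out)
```

All four columns come back as character, so Zip keeps any leading zeros.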

Having trouble with R join/merge/lookup

I am new to R and am trying to put together a data set then create a map.
I have two data frames:
airports_clean
Airport Name    City    Country           IATA  ICAO  Lat        Long
Goroka Airport  Goroka  Papua New Guinea  GKA   AYGA  -6.081690  145.39200
Madang Airport  Madang  Papua New Guinea  MAG   AYMD  -5.207080  145.78900
routes_clean
Source_IATA source_country source_lat source_long Destination_IATA destination_country destination_lat destination_long
The only columns in routes_clean that are filled in are Source_IATA and Destination_IATA (something like AER and KZN respectively), which are the origin and destination. I want to fill the rest of the columns with data from airports_clean.
I have tried join/merge/lookup to no avail. I think join is the correct method, using the IATA code as the common variable between the two datasets (the columns are named differently).
I used the following code to try to fill in source_country, but I received an error message:
clean_routes_01<-join(subset(routes_clean$source_country), subset(airports_clean$Country), by = ("Source_IATA"=="IATA"))
Any advice or tips would be greatly appreciated.
Regards
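A minimal sketch of the lookup described above, using base R merge() rather than join(). The data frames below are cut-down stand-ins built from the rows shown in the question plus one made-up route (AER to KZN) that has no match, and I am assuming the route columns are named Source_IATA and Destination_IATA. The idea is to rename the airport columns once for the source side and once for the destination side, then merge twice.

```r
# Stand-in data: two airports from the question, two example routes.
airports_clean <- data.frame(
  IATA    = c("GKA", "MAG"),
  Country = c("Papua New Guinea", "Papua New Guinea"),
  Lat     = c(-6.081690, -5.207080),
  Long    = c(145.39200, 145.78900),
  stringsAsFactors = FALSE
)
routes_clean <- data.frame(
  Source_IATA      = c("GKA", "AER"),
  Destination_IATA = c("MAG", "KZN"),
  stringsAsFactors = FALSE
)

# Two renamed copies of the lookup table, one per side of the route.
cols <- c("IATA", "Country", "Lat", "Long")
src <- setNames(airports_clean[cols],
                c("Source_IATA", "source_country", "source_lat", "source_long"))
dst <- setNames(airports_clean[cols],
                c("Destination_IATA", "destination_country",
                  "destination_lat", "destination_long"))

# all.x = TRUE keeps routes whose airport code is not in the lookup (as NA).
routes_full <- merge(routes_clean, src, by = "Source_IATA", all.x = TRUE)
routes_full <- merge(routes_full, dst, by = "Destination_IATA", all.x = TRUE)

# Restore the column order listed in the question.
routes_full <- routes_full[c(names(src), names(dst))]
print(routes_full)
```

Note that merge() takes column names in by, by.x and by.y as strings. An expression such as "Source_IATA" == "IATA" just evaluates to FALSE, which is probably part of why the original call failed.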

How do I convert city names to time zones?

Sorry if this is repetitive, but I've looked everywhere and can't seem to find anything that addresses my specific problem in R. I have a column with city names:
cities <-data.frame(c("Sydney", "Dusseldorf", "LidCombe", "Portland"))
colnames(cities)[1]<-"CityName"
Ideally I'd like to attach a column with either the lat/long for each city or the time zone. I have tried using the "ggmap" package in R, but my request exceeds the maximum number of requests they allow per day. I found the "geonames" package that converts lat/long to timezones, so if I get the lat/long for the city I should be able to take it from there.
Edit to address potential duplicate question: I would like to do this without using the ggmap package, as I have too many rows and they have a maximum # of requests per day.
You can get at least the major cities from the world.cities data in the maps package.
## Changing your data to a vector
cities <- c("Sydney", "Dusseldorf", "LidCombe", "Portland")
## Load up data
library(maps)
data(world.cities)
world.cities[match(cities, world.cities$name), ]
            name country.etc     pop    lat   long capital
36817     Sydney   Australia 4444513 -33.87 151.21       0
10026 Dusseldorf     Germany  573521  51.24   6.79       0
NA          <NA>        <NA>      NA     NA     NA      NA
29625   Portland   Australia    8757 -38.34 141.59       0
Note: LidCombe was not included.
Warning: For many names, there is more than one world city. For example,
world.cities[grep("Portland", world.cities$name), ]
          name country.etc    pop    lat    long capital
29625 Portland   Australia   8757 -38.34  141.59       0
29626 Portland         USA 542751  45.54 -122.66       0
29627 Portland         USA  62882  43.66  -70.28       0
Of course the two in the USA are Portland, Maine and Portland, Oregon.
match is just giving the first one on the list. You may need to use more information than just the name to get a good result.
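As a rough sketch of what "more information than just the name" could look like: if a country is available for each city (a hypothetical Country column, which the question does not have), you can merge on name and country together and break any remaining ties by population. To keep this self-contained, the wc data frame below is a small stand-in built from the world.cities rows printed above. With the maps package loaded you would use world.cities itself.

```r
# Stand-in for maps::world.cities, using rows shown in the output above.
wc <- data.frame(
  name        = c("Sydney", "Portland", "Portland", "Portland"),
  country.etc = c("Australia", "Australia", "USA", "USA"),
  pop         = c(4444513, 8757, 542751, 62882),
  lat         = c(-33.87, -38.34, 45.54, 43.66),
  long        = c(151.21, 141.59, -122.66, -70.28),
  stringsAsFactors = FALSE
)

# Country is an assumed extra column, not part of the original data.
cities <- data.frame(
  CityName = c("Sydney", "LidCombe", "Portland"),
  Country  = c("Australia", "Australia", "USA"),
  stringsAsFactors = FALSE
)

# Match on name AND country; all.x = TRUE keeps unmatched cities as NA.
hits <- merge(cities, wc,
              by.x = c("CityName", "Country"),
              by.y = c("name", "country.etc"),
              all.x = TRUE)

# Several matches can remain (two Portlands in the USA):
# keep the most populous one per city/country pair.
hits <- hits[order(hits$CityName, -hits$pop), ]
hits <- hits[!duplicated(hits[c("CityName", "Country")]), ]
print(hits)
```

Picking the largest city is only a heuristic. It chooses Portland, Oregon here, which may or may not be the one you mean.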

How do I preserve preexisting identifiers when geocoding a list of addresses in R?

I'm currently working with an R script set up to use RDSTK, a wrapper for the Data Science Toolkit API based on this, to geocode a list of addresses from a CSV.
The script appears to work, but the list of addresses has a preexisting unique identifier which isn't preserved in the process - the input file has two columns: id, and address. The id column, for the purposes of the geocoding process, is meaningless, but I'd like the output to retain it - that is, I'd like the output, which has three columns (address, long, and lat) to have four - id being the first.
The issue is that:
1. The output is not in the same order as the input addresses, or doesn't appear to be, so I cannot simply tack on the column of addresses at the end, and
2. The output does not include nulls, so the two would not have the same number of rows in any case, even if they were in the same order, and
3. I am not sure how to effectively tie the id column in such that it becomes a part of the geocoding process, which obviously would be the ideal solution.
Here is the script:
require("RDSTK")
library(httr)
library(rjson)
dff = read.csv("C:/Users/name/Documents/batchtestv2.csv")
data <- paste0("[",paste(paste0("\"",dff$address,"\""),collapse=","),"]")
url <- "http://www.datasciencetoolkit.org/street2coordinates"
response <- POST(url,body=data)
json <- fromJSON(content(response,type="text"))
geocode <- do.call(rbind,lapply(json, function(x) c(long=x$longitude,lat=x$latitude)))
geocode
write.csv(geocode, file = "C:/Users/name/Documents/geocodetest.csv")
And here is a sample of the output:
2633 Camino Ramon Suite 500 San Ramon California 94583 United States -121.96208 37.77027
555 Lordship Boulevard Stratford Connecticut 6615 United States -73.14098 41.16542
500 West 13th Street Fort Worth Texas 76102 United States -97.33288 32.74782
50 North Laura Street Suite 2500 Jacksonville Florida 32202 United States -81.65923 30.32733
7781 South Little Egypt Road Stanley North Carolina 28164 United States -81.00597 35.44482
Maybe the solution is extraordinarily simple and I'm just being dense - it's entirely possible (I don't have extensive experience with any particular language, so I sometimes miss obvious things) but I haven't been able to solve it.
Thanks in advance!
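One possible way to keep the id, sketched under an assumption: the sample output above is labelled by address, which suggests that the parsed response (json in the script) is a list named by the input address. If that holds, the address can be carried along as a real column and the result merged back onto the input by address. The json list and the id values below are made-up stand-ins so the sketch runs without calling the API. The third address is invented to show an unmatched row.

```r
# Stand-in input; ids are invented, first two addresses are from the sample output.
dff <- data.frame(
  id = c(101, 102, 103),
  address = c("2633 Camino Ramon Suite 500 San Ramon California 94583 United States",
              "555 Lordship Boulevard Stratford Connecticut 6615 United States",
              "1 Nowhere Road Unknown"),
  stringsAsFactors = FALSE
)

# Stand-in for fromJSON(content(response, type = "text")):
# named by address, in a different order, with NULL for a failed lookup.
json <- list(
  "555 Lordship Boulevard Stratford Connecticut 6615 United States" =
    list(longitude = -73.14098, latitude = 41.16542),
  "2633 Camino Ramon Suite 500 San Ramon California 94583 United States" =
    list(longitude = -121.96208, latitude = 37.77027),
  "1 Nowhere Road Unknown" = NULL
)

# Keep only addresses that were actually geocoded.
found <- !vapply(json, is.null, logical(1))
geocode <- data.frame(
  address = names(json)[found],
  long = unname(vapply(json[found], function(x) x$longitude, numeric(1))),
  lat  = unname(vapply(json[found], function(x) x$latitude, numeric(1))),
  stringsAsFactors = FALSE
)

# Merge back by address; all.x = TRUE keeps failed lookups with NA coordinates.
result <- merge(dff, geocode, by = "address", all.x = TRUE)
result <- result[c("id", "address", "long", "lat")]
print(result)
```

Because the join is by address rather than by position, neither the changed order nor the dropped nulls matters. It does assume the addresses come back spelled exactly as they were sent.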
