Convert lat/lon to zipcode / neighborhood name - r

I have a large collection of pictures with GPS locations, encoded as lat/lon coordinates, mostly in Los Angeles. I would like to convert these to (1) zipcodes, and (2) neighborhood names. Are there any free web services or databases to do so?
The best I can come up with so far is to scrape the neighborhood polygons from the LA Times page and find which polygon contains each coordinate. However, this might be quite a lot of work, and not all of my coordinates are in LA. As for the zipcodes, this 2004 database is the best I can find, but it encodes each zipcode as a single coordinate instead of a polygon. So the best I can do is find, for a given coordinate, the zipcode coordinate at minimum distance, which is not optimal.
I was under the impression that google-maps or open-street-maps should be able to do this (as they seem to 'know' exactly where every neighborhood and zipcode is), however I cannot find any APIs to do the lookups / queries.
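For reference, the nearest-coordinate fallback described in the question can be sketched in base R as below. This is only an illustration: the `zip_centroids` table, its labels and its coordinates are made-up placeholders (not real ZIP data), and `haversine_km` and `nearest_zip` are hypothetical helper names. It assumes a spherical Earth of radius 6371 km.

```r
# Great-circle (haversine) distance in km from one point to a vector of points.
# Assumes a spherical Earth; radius_km is an assumed mean radius.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical centroid table: labels and coordinates are placeholders only.
zip_centroids <- data.frame(
  zip = c("zip_a", "zip_b", "zip_c"),
  lat = c(34.10, 34.05, 33.95),
  lon = c(-118.33, -118.25, -118.40),
  stringsAsFactors = FALSE
)

# Return the label of the centroid closest to the given coordinate.
nearest_zip <- function(lat, lon, centroids) {
  d <- haversine_km(lat, lon, centroids$lat, centroids$lon)
  centroids$zip[which.min(d)]
}

nearest_zip(34.11, -118.34, zip_centroids)
```

As the question notes, the closest centroid is not always the containing zipcode, so this is a rough fallback rather than a substitute for polygon lookups.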

You can now do this directly within R itself thanks to the rather awesome ggmap package.
Like others mention, you'll be reverse geocoding using the google maps API (and therefore limited to 2,500 queries daily), but it's as simple as:
library("ggmap")
# generate a single example address
lonlat_sample <- as.numeric(geocode("the hollywood bowl"))
lonlat_sample # note the order is longitude, latitude
res <- revgeocode(lonlat_sample, output="more")
# can then access zip and neighborhood where populated
res$postal_code
res$neighborhood

Use Reverse Geocoding to convert your lat/lon to addresses. It has some limit on the number of queries per day though.

Here is a nice blog post with examples of how to geocode and reverse geocode using google-maps.

Try this one:
http://www.usnaviguide.com/zip.htm
There is some limit as to how many queries per day you can do on the site, but they also sell the complete database, which changes every few months.
Sorry that I don't know of any free resources.

As others suggested, reverse geocoding the points to street addresses should work fine for zip codes. I am not so sure about neighborhoods, because you may have to check whether the street number is odd or even to tell which side of a road a point is on, and that can determine the neighborhood.
An alternative is to prepare GIS polygon features (an ESRI shapefile, for example) and test each point against this set of polygons to see which one it intersects.
Zip codes are very straightforward: you can download a shapefile from the Census.
http://www.census.gov/cgi-bin/geo/shapefiles2010/main
Neighborhoods are harder, I'd guess. In another part of the US I had to create my own shapefile by combining definitions from the municipal government, real-estate websites, newspapers, etc., so that it matches what people think the neighborhoods in the city are, without any overlaps or gaps. It can take some time to compose such a set of polygons. You may grab Census "block groups", or even Census "blocks", from the above page and merge them.
Once you have prepared the polygon features, there are a couple of GIS tools in different environments (stand-alone executables, GUI programs, C/Python/SQL APIs, and probably R as well) to intersect polygons and points.
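To illustrate what the point-against-polygon test does, here is a minimal base R sketch using the ray-casting (even-odd) rule. The function names `point_in_polygon` and `locate`, and the two square "neighborhoods", are hypothetical and only for illustration. Real boundary data would come from a shapefile, and a GIS library would normally handle this (including holes, multi-part polygons and edge cases that this sketch ignores).

```r
# Ray casting: count how many polygon edges a horizontal ray from the point
# crosses; an odd count means the point is inside.
# poly_x / poly_y are the vertex coordinates of a single ring (no holes).
point_in_polygon <- function(px, py, poly_x, poly_y) {
  n <- length(poly_x)
  inside <- FALSE
  j <- n  # index of the previous vertex
  for (i in seq_len(n)) {
    # Does the edge (j -> i) straddle the horizontal line y = py?
    if ((poly_y[i] > py) != (poly_y[j] > py)) {
      # x coordinate where the edge crosses that line
      x_cross <- poly_x[i] +
        (py - poly_y[i]) * (poly_x[j] - poly_x[i]) / (poly_y[j] - poly_y[i])
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Hypothetical neighborhoods: two adjacent squares, purely illustrative.
neighborhoods <- list(
  west = list(x = c(0, 2, 2, 0), y = c(0, 0, 2, 2)),
  east = list(x = c(2, 4, 4, 2), y = c(0, 0, 2, 2))
)

# Return the name of the first polygon containing the point, or NA.
locate <- function(px, py, polygons) {
  for (name in names(polygons)) {
    p <- polygons[[name]]
    if (point_in_polygon(px, py, p$x, p$y)) return(name)
  }
  NA_character_
}

locate(1, 1, neighborhoods)
locate(3, 1, neighborhoods)
```

Points that fall outside every polygon come back as NA, which is useful for the coordinates that are not in the covered area.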

Related

Mapping how many points are within a radius of every location in R

In R, I am trying to create a choropleth map. I have built a database of businesses, some are part of chains (e.g. McDonalds) and others are independent. I want to calculate how many businesses are within 30km of each point on the map, but treat the different locations of chains as a single business.
For example, if a point is:
5km from a McDonalds,
10km from Taco Bell
15km from Chick-Fil-A
20km from KFC
25km from McDonalds
35km from Five Guys
The colour will show that there are 4 fast food outlets within 30km.
I am happy to use any R package but I am mostly familiar with tmap and ggplot2 maps.
At this stage the best approach I can think of is to create polygons for each chain and stack them as transparent layers of the same colour. I don't think this would be very efficient and wouldn't create a very nice looking choropleth.
The other answers I could find were either counting points (e.g https://gis.stackexchange.com/questions/229066/counting-how-many-times-a-point-is-inside-a-set-of-intersecting-polygons-in-r) or for GIS software.
EDIT:
I have managed to create a 30km radius from every location of every chain (using rgeos gIntersection). I now have a series of polygons.
To solve my question the additional thing I need to do is create polygons for where:
Only one polygon covers the area,
Two polygons cover the area,
etc.
To try to visualise this I used the answer from https://gis.stackexchange.com/questions/229066/counting-how-many-times-a-point-is-inside-a-set-of-intersecting-polygons-in-r
In the linked question they are trying to count how many polygons cover the numbered points (the image on the right). What I am trying to do is to create the image on the left, where there are polygons of no overlap (1), two overlapping polygons (2) and so on.
I think what you are trying to accomplish would be best approached using a raster rather than a choropleth. To make a choropleth, you define a set of (generally irregular) polygons, summarize something within each polygon, then color the polygons based on the attributes. This would be a good approach if you wanted to show how many fast food restaurants are within each state or county, or how many fast food joints there are per capita by state.
From your description, however, you are looking for how many fast food joints within a set radius for all points. This is more of a raster question, since you can represent your data on a regular grid.
The raster package is a good start for working with raster data and works well with the sf package.
You need to determine what density you need to accomplish your goal, then use this to determine the resolution of your raster. Once you've got that you can use raster::rasterize() to summarize your (I'm assuming) point data.
I'm assuming you have an object that has the locations of each restaurant, I'll call this object "points".
library(raster)
library(sf)
# create a raster template with 30km resolution (I'm assuming your projection is in meters)
# the CRS is passed as a proj4 string, which raster() accepts
raster_template = raster(extent(points),
                         resolution = 30000,
                         crs = st_crs(points)$proj4string)
# rasterize your point data
r = rasterize(points, raster_template, fun = "count")
This should create a grid where each cell holds the number of points that fall within that 30km cell. You should then be able to plot the raster, but may want to either clip or mask it to just show the parts that are within New Zealand.
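Note that counting points per cell is not quite the same as counting distinct businesses within 30km of a location. A rough base R sketch of that variant is below, treating every location of a chain as a single business as the question asks. The `outlets` table, its coordinates and the helper name `count_chains_within` are hypothetical, and it assumes projected coordinates in meters.

```r
# Hypothetical input: projected coordinates in meters plus a chain label.
outlets <- data.frame(
  x = c(0, 12000, 50000, 58000),
  y = c(0, 0, 0, 0),
  chain = c("McDonalds", "Taco Bell", "McDonalds", "KFC"),
  stringsAsFactors = FALSE
)

# Number of distinct chains with at least one location within radius_m
# of the point (cx, cy). Uses planar Euclidean distance.
count_chains_within <- function(cx, cy, outlets, radius_m = 30000) {
  d <- sqrt((outlets$x - cx)^2 + (outlets$y - cy)^2)
  length(unique(outlets$chain[d <= radius_m]))
}

# Evaluate the count at the centers of a regular grid (20km spacing here).
grid <- expand.grid(x = seq(0, 60000, by = 20000), y = 0)
grid$n_chains <- mapply(count_chains_within, grid$x, grid$y,
                        MoreArgs = list(outlets = outlets))
grid
```

The resulting `x`, `y`, `n_chains` table has the shape of a regular grid, so it could then be turned into a raster for plotting. For large data sets this brute-force loop over every grid cell and outlet would be slow.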

Feature engineering of X,Y coordinates in neighborhoods of San Francisco

I am participating in a starter Kaggle competition (Crimes in San Francisco) in which I want to predict the category of a crime using a bunch of predictor variables, including the X and Y coordinates of the crime. As I doubt the predictive power of the raw coordinates, I want to transform these variables into something more relevant to the crime category.
So I am thinking that if I had the neighbourhood of San Francisco in which the crime took place, it would be more informative than the actual coordinates of the crime. I can find the neighbourhoods online, but of course I can't simply use the borders of each neighbourhood to classify the corresponding crime, because their shapes are not rectangular or anything like that.
Does anyone have any idea about how I could solve this one?
Thanks guys
Well that's interesting, AntoniosK, and it's getting close to what I want to accomplish. The problem is that the information "south-east and 2km from city center" can match more than one neighborhood.
I am still thinking that the partition of the city in neighborhoods is valuable because the socio-economic and structural differences between them ( there is a reason why the neighborhoods of each city are separated as such, right?) can lead to a higher probability for a certain category crime and a lower one for another.
That said, your idea made me think of using the south-east etc. mapping and then using the angle of the segment (point to city center) with the x axis to map the point to the appropriate neighborhood. I am on it right now. Thanks
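A minimal base R sketch of the direction idea is below. It is an assumption-laden illustration: `compass_sector` is a hypothetical helper, it measures the angle clockwise from north (a bearing) rather than from the x axis, and it treats the coordinates as planar, which is only a rough approximation for raw longitude/latitude.

```r
# Classify points into 8 compass sectors relative to a center (cx, cy).
# x is the east-west coordinate, y the north-south coordinate.
compass_sector <- function(x, y, cx, cy) {
  # atan2(dx, dy) is the clockwise angle from north, mapped into [0, 360)
  bearing <- (atan2(x - cx, y - cy) * 180 / pi) %% 360
  labels <- c("N", "NE", "E", "SE", "S", "SW", "W", "NW")
  # Each sector spans 45 degrees and is centered on its compass direction
  labels[(floor((bearing + 22.5) / 45) %% 8) + 1]
}

# Straight-line distance from the center, the other half of the feature.
distance_from_center <- function(x, y, cx, cy) {
  sqrt((x - cx)^2 + (y - cy)^2)
}

compass_sector(c(0, 1, 1, 0, -1), c(1, 1, 0, -1, 0), cx = 0, cy = 0)
```

As the post above points out, a sector plus a distance can still match more than one neighborhood, so these are features for a model rather than a neighborhood lookup.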
After some time on the problem I found that the procedure I want to perform is called "reverse geocoding". It also turns out that there are some APIs that do this. The best, in my opinion, is the revgeocode() function in the ggmap package (Google's edition). It has a daily query limit (2500 queries) though, unless you pay for extra.
The one that I turned to is the GNneighbourhood function in the geonames package, which turns coordinates into neighbourhoods. It is free, though I have experienced some errors (keep in mind that this one only covers US and Canadian cities).
revgeocode function - ggmap package
GNneighbourhood - geonames package

Divide a city into regions in Google Maps

I am trying to divide a certain city into several blocks, each representing North, North-West, North-East, South... and so on. I just need the coordinates of the region boundaries (e.g.: North is between X and Y latitude and between Z and T longitude), so that I can check in my app whether a point belongs to one region or another. The regions should not depend on a certain zoom level's boundaries and they don't need to be the same size (maybe the North part of a city is a little bit larger than the South one).
Any idea how I can "draw" these region boundaries? Thank you!
For boundary data, you would have to do a search. It depends on the city and country. In the US, many municipalities provide this data directly through a city or county web site. Generally it will be in a GIS data format such as a shapefile. You have a number of different options for working programmatically with GIS data formats. I recommend using the GDAL libraries, particularly ogr2ogr. Once you've got the boundary data, you can draw it on the map using polyline overlays or create raster images of the data, say using gdal_rasterize. Or you can convert the data to KML using ogr2ogr, upload it to Google Fusion Tables using Google Docs, and overlay it using a FusionTablesLayer.

How to find all the roads / streets / highways within a polygon

This is a maps related question.
Summary: given a polygon, I want to find all the roads / streets / highways within it.
The 'bounds' methodology does not seem to offer a way to ask for all the roads within those bounds.
Similarly, the co-ordinate+radius methodology gives all the places like schools, but there is no way to get all the streets within that range.
We have tried to do something by creating our own polygon using 4 coordinates, and then trying to estimate the roads, but we are far from the result. So we feel we are heading in the wrong direction altogether.
The URL of my experiment is here: http://prototype.nextgeni.us/polygon/
I don't think this data is exposed by the API. What you're looking at is a collection of images with place names embedded into them. The closest thing I can think of would be to use the DirectionsService, which does give you street names (but not in a useful format). No idea how you could extend that to cover a whole polygon though, as it would just give you 1 route between markers, not all possible routes (and therefore not necessarily all streets in the polygon).

Getting a handle on GIS math, where do I start?

I am in charge of a program that is used to create a set of nodes and paths for consumption by an autonomous ground vehicle. The program keeps track of the locations of all items in its map by indicating the item's position as being x meters north and y meters east of an origin point of 0,0. In the real world, the vehicle knows the location of the origin's lat and long, as it is determined by a dgps system and is accurate down to a couple centimeters. My program is ignorant of any lat long coordinates.
It is one of my goals to modify the program to keep track of lat long coords of items in addition to an origin point and items' x,y position in relation to that origin. At first blush, it seems that I am going to modify the program to allow the lat long coords of the origin to be passed in, and after that I desire that the program will automatically calculate the lat long of every item currently in a map. From what I've researched so far, I believe that I will need to figure out the math behind converting to lat long coords from a UTM like projection where I specify the origin points and meridians etc as opposed to whatever is defined already for UTM.
I've come to ask you GIS programmers: am I on the right track? It seems to me like there is so much to wrap one's head around, and I'm not sure if the answer isn't something as simple as, "oh yeah, there's a conversion from meters to lat long, here".
Currently, due to the nature of DGPS, the system really doesn't need to care about locations more than oh, what... 40 km? radius away from the origin. Given this, and the fact that I need to make sure that the error on my coordinates is not greater than .5 meters, do I need anything more complex than a simple lat/long to meters conversion constant?
I'm knee deep in materials here. I could use some pointers about what concepts to research.
Thanks much!
Given a start point in lat/long and a distance and bearing, finding the end point is a geodesic calculation. There's a great summary of geodesic calculations and errors on the proj.4 website. They come to the conclusion that using a spherical model can get results for distance between points with at most 0.51% error. That, combined with a formula to translate between WGS-84 and ECEF (see the "LLA to ECEF" and "ECEF to LLA" sections), seems like it gets you what you need.
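As a rough illustration of the spherical version of that calculation, the base R sketch below finds the end point from a start point, a bearing and a distance. It uses the standard spherical "destination point" formula, not the ellipsoidal calculation that geod performs; `destination_point` is a hypothetical name and the mean Earth radius is an assumed value.

```r
# End point on a sphere, given start lat/lon (degrees), bearing (degrees
# clockwise from north) and distance (meters). Spherical model only.
destination_point <- function(lat, lon, bearing_deg, distance_m,
                              radius_m = 6371008.8) {
  to_rad <- pi / 180
  phi1 <- lat * to_rad
  lambda1 <- lon * to_rad
  theta <- bearing_deg * to_rad
  delta <- distance_m / radius_m  # angular distance in radians

  phi2 <- asin(sin(phi1) * cos(delta) +
               cos(phi1) * sin(delta) * cos(theta))
  lambda2 <- lambda1 + atan2(sin(theta) * sin(delta) * cos(phi1),
                             cos(delta) - sin(phi1) * sin(phi2))

  # Convert back to degrees and wrap longitude into [-180, 180)
  c(lat = phi2 / to_rad,
    lon = ((lambda2 / to_rad + 540) %% 360) - 180)
}

# A quarter of the way around the sphere, heading due east from (0, 0)
destination_point(0, 0, 90, 6371008.8 * pi / 2)
```

Given the 0.5 meter error budget in the question, a spherical model like this is better suited to sanity checks than to the final conversion.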
If you want to really get the errors nailed down by inverse projecting your flat map to WGS-84, proj.4 is a projection software package. It has source code, and comes with three command line utilities - proj, which converts to/from cartographic projection and cartesian data; cs2cs, which converts between different cartographic projections; and geod, which calculates geodesic relationships.
The USGS publishes a very comprehensive treatment of map projections.
I'd do a full-up calculation if you can. That way you'll always be as accurate as you can be.
If you happen to be using C++, GDAL is a very good library.
For a range of 40km, you may find that approximating the world to a 2D flat surface may work, although a UTM transform would be the ideal way to go - in any case, I'd advocate using the actual WGS84 co-ordinates & ellipsoid for calculations such as great circle distance, or calculating bearings.
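For comparison, the flat-surface approximation could look like the base R sketch below. It converts meters north/east of the origin into lat/long using the WGS84 radii of curvature at the origin. `offset_to_latlon` is a hypothetical name, and the approach is a first-order approximation whose error grows roughly with the square of the distance. At tens of kilometers from the origin that error can be far larger than 0.5 meters, depending on how the local x/y frame is defined, which is why a proper projection is the safer choice.

```r
# WGS84 ellipsoid constants
WGS84_A  <- 6378137.0         # semi-major axis, meters
WGS84_E2 <- 0.00669437999014  # first eccentricity squared

# First-order conversion of local offsets (meters north / east of the origin)
# to latitude / longitude in degrees. Accurate only near the origin.
offset_to_latlon <- function(north_m, east_m, origin_lat, origin_lon) {
  phi <- origin_lat * pi / 180
  w <- sqrt(1 - WGS84_E2 * sin(phi)^2)
  m_radius <- WGS84_A * (1 - WGS84_E2) / w^3  # meridional radius of curvature
  n_radius <- WGS84_A / w                     # prime-vertical radius of curvature

  lat <- origin_lat + (north_m / m_radius) * 180 / pi
  lon <- origin_lon + (east_m / (n_radius * cos(phi))) * 180 / pi
  data.frame(lat = lat, lon = lon)
}

# Items 100 m north and 250 m east of a hypothetical origin
offset_to_latlon(north_m = 100, east_m = 250, origin_lat = 34, origin_lon = -118)
```

Comparing this output against a full projection library for a few test points at the edge of the working area is a simple way to see whether the approximation stays within the error budget.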
If you get bored, you could go down a similar line to something I've been working on, that can be used as a base class for differing datums such as OSGB36 or WGS84...

Resources