Flow mapping in R - r

I'm trying to plot trips between zipcodes in R. Specifically, I'm trying to create an interactive map where you can click on each zipcode and see the other zipcodes colored according to how many people traveled from the zip you clicked on to each of them. Sort of like this: https://www.forbes.com/special-report/2011/migration.html
But less fancy; just showing "out-migration" would be super.
I've been messing with this in R using the leaflet package, but I haven't managed to figure it out. Could someone with better R skills help me out? Any insight would be much appreciated.
I've downloaded a shapefile of zipcodes in LA county from here:
https://data.lacounty.gov/Geospatial/ZIP-Codes/65v5-jw9f
Then I used the code below to create some toy data.
You can find the zipcode shapefiles here:
https://drive.google.com/file/d/0B2a3BZ6nzEGJNk55dmdrdVI2MTQ/view?usp=sharing
And you can find the toy data here:
https://drive.google.com/open?id=0B2a3BZ6nzEGJR29EOFdjR1NPR3c
Here's the code I've got so far:
require(rgdal)
setwd("~/Downloads/ZIP Codes")
# Read SHAPEFILE.shp from the current working directory (".")
shape <- readOGR(dsn = ".", layer = "geo_export_89ff0f09-a580-4844-988a-c4808d510398")
plot(shape) #Should look like zip codes in LA county
#get a unique list of zipcodes
zips <- as.numeric(as.character(unique(shape@data$zipcode)))
#create a dataframe with all the possible combination of origin and destination zipcodes
zips.df <- data.frame(expand.grid(as.character(zips), as.character(zips)), rpois(length(zips)^2, 10))
#give the dataframe some helpful variable names
names(zips.df) <- c("origin_zip", "destination_zip","number_of_trips")
Like I said, any help would be much appreciated. Thanks!
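As a starting point, here is a minimal sketch of how one origin zip's out-migration could be drawn with leaflet, assuming the shape object and zips.df from the code above; the origin value 90001 and the merge logic are my own guesses, and the full click-to-update behavior would need Shiny on top of this:

```r
library(leaflet)
library(sp)

# Pick one origin zip to visualize (assumption: "90001" exists in the data)
origin <- "90001"
trips <- zips.df[zips.df$origin_zip == origin, ]

# Attach trip counts to the polygons by zipcode
shape@data$number_of_trips <-
  trips$number_of_trips[match(shape@data$zipcode, trips$destination_zip)]

pal <- colorNumeric("YlOrRd", domain = shape@data$number_of_trips)

leaflet(shape) %>%
  addTiles() %>%
  addPolygons(fillColor = ~pal(number_of_trips),
              fillOpacity = 0.7, weight = 1, color = "#444444",
              label = ~paste0(zipcode, ": ", number_of_trips, " trips"))
```

To make the map react to clicks (re-coloring on each selected origin), you would wrap this in a Shiny app and redraw the polygons with leafletProxy() inside an observer on the map's shape-click event.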

Related

Tracking animal movements, and exporting tracks as a shapefile

I have a series of lat/long coordinates for capture sites and roost trees of bats. I'd like to connect the dots between the captures and roosts in order of date and individual (each bat has a unique ID). I've found numerous ways of plotting the tracks via either the "move" or "moveHMM" packages, but I haven't found a way to export the tracks as a shapefile. Here's an example of what I'd like to do using data and code from the "moveHMM" package:
install.packages("moveHMM")
install.packages("rgdal")
library(moveHMM)
library(rgdal)
elk_data$Easting <- elk_data$Easting/1000
elk_data$Northing <- elk_data$Northing/1000
data <- prepData(elk_data,type = "UTM",coordNames = c("Easting","Northing"))
utmcoord <- SpatialPoints(cbind(data$x*1000,data$y*1000),proj4string=CRS("+proj=utm +zone=17"))
llcoord <- spTransform(utmcoord,CRS("+proj=longlat"))
lldata <- data.frame(ID=data$ID, x=attr(llcoord,"coords")[,1], y=attr(llcoord,"coords")[,2])
plotSat(lldata,zoom=8)
I'd like to have the tracks for the 4 elk displayed in this plot to all be within one shapefile. Thanks for any help you can provide.
Keith
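For the export step, here is a hedged sketch of turning the lldata track points into a one-line-per-animal shapefile with sp and rgdal; it assumes the lldata data frame built in the code above, with points already in date order within each ID:

```r
library(sp)
library(rgdal)

# Build one Lines object per individual, keeping points in data order
ids <- unique(lldata$ID)
track_lines <- lapply(ids, function(id) {
  pts <- lldata[lldata$ID == id, c("x", "y")]
  Lines(list(Line(as.matrix(pts))), ID = as.character(id))
})

tracks <- SpatialLines(track_lines, proj4string = CRS("+proj=longlat"))
tracks_df <- SpatialLinesDataFrame(
  tracks,
  data = data.frame(ID = ids, row.names = as.character(ids)))

# Write all tracks into a single shapefile
writeOGR(tracks_df, dsn = ".", layer = "elk_tracks",
         driver = "ESRI Shapefile")
```

The result is one shapefile whose attribute table has one row (the ID) per track, which most GIS applications will display as separate line features.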

twitteR search geocode argument in R

I want to run a simple search using twitteR but only return tweets located in the U.S. I know twitteR has a geocode argument for lat/long and miles within that lat/long, but this way of locating tweets for an entire country seems hard.
What would I input into the argument to only get US tweets?
Thanks,
I did a brief search around and it looks like twitteR does not have a built-in country argument. But since you have lat/long, it's very straightforward to do a spatial join to a US country shapefile (i.e. point in polygon).
In this example, I'm using the shapefile from Census.gov and the package spatialEco for its point.in.polygon() function. It's a very fast spatial-join function compared to what other packages offer, even if you have hundreds of thousands of coordinates and dozens of polygons. If you have millions of tweets -- or if you decide later on to join to multiple polygons, e.g. all world countries -- then it could be a lot slower. But for most purposes, it's very fast.
(Also, I don't have a Twitter API set up, so I'm going to use an example data frame with tweet_ids and lat/long.)
library(maptools) # to read the shapefile
library(spatialEco)
# First, use setwd() to set working directory to the folder called cb_2015_us_nation_20m
us <- readShapePoly(fn = "cb_2015_us_nation_20m")
# Alternatively, you can use file.choose() and choose the .shp file like so:
us <- readShapePoly(file.choose())
# Create data frame with sample tweets
# Btw, tweet_id 1 is St. Louis, 2 is Toronto, 3 is Houston
tweets <- data.frame(tweet_id = c(1, 2, 3),
latitude = c(38.610543, 43.653226, 29.760427),
longitude = c(-90.337189, -79.383184, -95.369803))
# Use point.in.poly to keep only tweets that are in the US
coordinates(tweets) <- ~longitude+latitude
tweets_in_us <- point.in.poly(tweets, us)
tweets_in_us <- as.data.frame(tweets_in_us)
Now, if you look at tweets_in_us you should see only the tweets whose lat/long fall within the area of the US.
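If you do end up calling the API directly, twitteR's geocode argument takes a single "latitude,longitude,radius" string. A rough sketch follows; the center point and radius are my own guesses for loosely covering the contiguous US, and this over-captures (parts of Canada and Mexico), so the point-in-polygon filter above is still needed afterwards:

```r
library(twitteR)

# One large radius centered near the geographic center of the
# contiguous US; refine with the spatial join shown above.
tweets_raw <- searchTwitter("rstats", n = 1000,
                            geocode = "39.8283,-98.5795,1500mi")
```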

R Program map of World Leader's Age

I am working on R program to plot color-coded map of world leader's age.
This is my source, https://en.wikipedia.org/wiki/List_of_state_leaders_in_2015
Question: How do I know which countries are included in R program? I am searching documentation of rworldmap but they do not give list.
Please advise
I looked at the structure of the value returned by the getMap function. It appears that column "NAME" contains the name of an area in the map while "SOVEREIGNT" is the name of the corresponding sovereign state.
The code to extract the unique values in both columns is the following:
library(rworldmap)
world_map <- getMap()
areas <- levels(world_map[["NAME"]])
sovereign_states <- levels(world_map[["SOVEREIGNT"]])
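As a follow-up, once you have leader ages keyed by country name, rworldmap's joinCountryData2Map() and mapCountryData() can draw the choropleth directly. A sketch, where the country list and ages are made-up placeholders (the real values would come from the Wikipedia list):

```r
library(rworldmap)

# Placeholder data; substitute ages scraped from the Wikipedia list
leaders <- data.frame(country    = c("United States", "Germany", "Japan"),
                      leader_age = c(54, 61, 58))

# Join on the "NAME" column identified above
map_data <- joinCountryData2Map(leaders,
                                joinCode = "NAME",
                                nameJoinColumn = "country")

mapCountryData(map_data, nameColumnToPlot = "leader_age",
               mapTitle = "World leaders' ages")
```

joinCountryData2Map() also reports which rows failed to match, which is the quickest way to find country-name spellings that differ between your source and the map.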

Raster Merging in R

I need a little help with some R syntax to complete what (I think) is a fairly straightforward task- hopefully someone can assist!
I have a raster map of the UK which is split into postcode areas (e.g. DE, NG, NR etc. 127 postcodes in total).
I have installed the package 'raster' and have successfully plotted the .img in R. All working and looks correct with the raster.
I also have a comma delimited CSV file containing the same postcodes as the raster with another column next to it containing revenue for each postcode.
I was wondering if someone could help me merge/bind the revenue figures into the correct postcode in the raster so that I can plot revenue per postcode.
I feel I should be using cbind and reclassify to do this, but I can't work it out on my own.
Any help would be appreciated. Thanks in advance!
This is the code I have so far...not rocket science just yet.
setwd("C:\\Users\\[username]\\Documents\\GIS\\Test Data")
require(raster)
revenue<-read.table("revenue.csv",header=T,row.names=1,sep=",")
postcodes<-raster("C:\\Users\\[username]\\Documents\\GIS\\Test Data\\rasters\\postcodes\\postcodes.img")
postcodes <- trim(postcodes)
plot(postcodes)
You should be able to do this with the 'subs' method. You do not show us much about your data (e.g. head(revenue)), but it should work like this:
library(raster)
setwd("C:\\Users\\[username]\\Documents\\GIS\\Test Data")
postcodes <- raster("rasters\\postcodes\\postcodes.img")
revenue <- read.csv("revenue.csv")
subs(postcodes, revenue, by='code', which='rev')
where 'code' and 'rev' would be the column names in data.frame revenue that identify the postcode and revenue fields.
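To see what subs() does in isolation, here is a tiny self-contained stand-in; the postcode IDs and revenue figures are invented for illustration:

```r
library(raster)

# Toy stand-in: a 3x3 raster of postcode IDs 1-3
r <- raster(matrix(c(1, 1, 2,
                     2, 3, 3,
                     1, 2, 3), nrow = 3))

revenue <- data.frame(code = c(1, 2, 3),
                      rev  = c(100, 250, 75))

# Replace each postcode ID cell with its revenue figure
rev_raster <- subs(r, revenue, by = "code", which = "rev")
plot(rev_raster)
```

Each cell that held a postcode ID now holds that postcode's revenue, which is exactly the per-postcode surface you want to plot.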

Choropleth Maps in R - TIGER Shapefile issue

Have a Question on Mapping with R, specifically around the choropleth maps in R.
I have a dataset of ZIP codes assigned to an area and some associated data (dataset is here).
My final data format is: Area ID, ZIP, Probability Value, Customer Count, Area Probability and Area Customer Total. I am attempting to present this data by plotting area probability and Area Customer Total on a Map. I have tried to do this by using the census TIGER Shapefiles but I guess R cannot handle the complete country.
I am comfortable with the Statistical capabilities and now I am moving all my Mapping from third party GIS focused applications to doing all my Mapping in R. Does anyone have any pointers to how to achieve this from within R?
To be a little more detailed, here's the point where R stops working -
shapes <- readShapeSpatial("tl_2013_us_zcta510.shp")
(where the .shp file is the census/TIGER shapefile).
Edit - Providing further details. I am trying to first read the TIGER shapefiles, hoping to combine this spatial dataset with my data and eventually plot. I am having an issue at the very beginning when attempting to read the shape file. Below is the code with the output
require(maptools)
shapes<-readShapeSpatial("tl_2013_us_zcta510.shp")
Error: cannot allocate vector of size 317 Kb
There are several examples and tutorials on making maps using R, but most are very general and, unfortunately, most map projects have nuances that create inscrutable problems. Yours is a case in point.
The biggest issue I came across was that the US Census Bureau zip code tabulation area shapefile for the whole US is huge: ~800MB. When loaded using readOGR(...), the R SpatialPolygonsDataFrame object is about 913MB. Trying to process a file this size (e.g., converting to a data frame using fortify(...)), at least on my system, resulted in errors like the one you identified above. So the solution is to subset the file based on the zip codes that are actually in your data.
The map was made from your data using the following code.
library(rgdal)
library(ggplot2)
library(stringr)
library(RColorBrewer)
setwd("<directory containing shapefiles and sample data>")
data <- read.csv("Sample.csv",header=T) # your sample data, downloaded as csv
data$ZIP <- str_pad(data$ZIP,5,"left","0") # convert ZIP to char(5) w/leading zeros
zips <- readOGR(dsn=".","tl_2013_us_zcta510") # import zip code polygon shapefile
map <- zips[zips$ZCTA5CE10 %in% data$ZIP,] # extract only zips in your Sample.csv
map.df <- fortify(map) # convert to data frame suitable for plotting
# merge data from Sample.csv into map data frame
map.data <- data.frame(id=rownames(map@data), ZIP=map@data$ZCTA5CE10)
map.data <- merge(map.data,data,by="ZIP")
map.df <- merge(map.df,map.data,by="id")
# load state boundaries
states <- readOGR(dsn=".","gz_2010_us_040_00_5m")
states <- states[states$NAME %in% c("New York","New Jersey"),] # extract NY and NJ
states.df <- fortify(states) # convert to data frame suitable for plotting
ggMap <- ggplot(data = map.df, aes(long, lat, group = group))
ggMap <- ggMap + geom_polygon(aes(fill = Probability_1))
ggMap <- ggMap + geom_path(data=states.df, aes(x=long,y=lat,group=group))
ggMap <- ggMap + scale_fill_gradientn(name="Probability",colours=brewer.pal(9,"Reds"))
ggMap <- ggMap + coord_equal()
ggMap
Explanation:
The rgdal package facilitates the creation of R Spatial objects from ESRI shapefiles. In your case we are importing a polygon shapefile into a SpatialPolygonsDataFrame object in R. The latter has two main parts: a polygon section, which contains the latitude and longitude points that will be joined to create the polygons on the map, and a data section which contains information about the polygons (so, one row for each polygon). If, e.g., we call the Spatial object map, then the two sections can be referenced as map@polygons and map@data. The basic challenge in making choropleth maps is to associate data from your Sample.csv file with the relevant polygons (zip codes).
So the basic workflow is as follows:
1. Load polygon shapefiles into Spatial object ( => zips)
2. Subset if appropriate ( => map).
3. Convert to data frame suitable for plotting ( => map.df).
4. Merge data from Sample.csv into map.df.
5. Draw the map.
Step 4 is the one that causes all the problems. First we have to associate zip codes with each polygon. Then we have to associate Probability_1 with each zip code. This is a three step process.
Each polygon in the Spatial data file has a unique ID, but these ID's are not the zip codes. The polygon ID's are stored as row names in map#data. The zip codes are stored in map#data, in column ZCTA5CE10. So first we must create a data frame that associates the map#data row names (id) with map#data$ZCTA5CE10 (ZIP). Then we merge your Sample.csv with the result using the ZIP field in both data frames. Then we merge the result of that into map.df. This can be done in 3 lines of code.
Drawing the map involves telling ggplot what dataset to use (map.df), which columns to use for x and y (long and lat) and how to group the data by polygon (group=group). The columns long, lat, and group in map.df are all created by the call to fortify(...). The call to geom_polygon(...) tells ggplot to draw polygons and fill using the information in map.df$Probability_1. The call to geom_path(...) tells ggplot to create a layer with state boundaries. The call to scale_fill_gradientn(...) tells ggplot to use a color scheme based on the color brewer "Reds" palette. Finally, the call to coord_equal(...) tells ggplot to use the same scale for x and y so the map is not distorted.
NB: The state boundary layer, uses the US States TIGER file.
I would advise the following.
Use readOGR from the rgdal package rather than readShapeSpatial.
Consider using ggplot2 for good-looking maps - many of the examples use this.
Refer to one of the existing examples of creating a choropleth such as this one to get an overview.
Start with a simple choropleth and gradually add your own data; don't try to get it all right at once.
If you need more help, create a reproducible example with a SMALL fake dataset and with links to the shapefiles in question. The idea is that you make it easy to help us help you rather than discourage us by not supplying code and data in your question.