Merge neighboring regions in R (aggregate spatial data)?

I guess I needed to rephrase my awfully worded previous question (deleted it). Here's another try. I want to join two adjacent regions so that their common border disappears and only their outer outline can be seen.
Here's a reproducible example:
require(maptools) # readShapeSpatial() is in maptools, not shapefiles
require(sp)
xx <- readShapeSpatial(system.file("shapes/sids.shp", package="maptools")[1],
                       IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
# show all the subregions
plot(xx)
Now let's consider only regions 3 and 5:
plot(xx[c(3,5),])
How can I aggregate these two regions into one? In practice, what I want to do is like taking a map of the whole continent that shows all countries and producing a map that shows only North America and South America.
To me this looks like a pretty common task, but I can't find the right function for it so far. Am I just missing a function, or do I have to do it manually?

The rgeos package provides a number of excellent tools for handling Spatial* data that can be used in this case.
For example:
library(rgeos)
regionOfInterest <- gUnion(xx[3,], xx[5,])
The following gives the same result, and may be more useful for multiple polygons:
regionOfInterest <- gUnionCascaded(xx[c(3,5), ])
The result from plot(regionOfInterest):
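Since rgeos has since been retired, here is a minimal sketch of the same dissolve using sf instead (my addition, not part of the original answer; it assumes the xx object loaded above):
library(sf)
xx_sf <- st_as_sf(xx)                           # convert the sp object to sf
regionOfInterest <- st_union(xx_sf[c(3, 5), ])  # dissolve the shared border
plot(regionOfInterest)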

Related

How to get started on creating choropleth map

The task at hand is mapping the prevyearpct value to a county map. The sample data is below.
library(tidyverse)
library(tigris)
countyname <- c("Carson City","Churchill County","Clark County","Douglas County","Elko County","Esmeralda County","Eureka County","Humboldt County","Lander County","Lincoln County","Lyon County","Mineral County","Nye County","Pershing County","Storey County","Washoe County","White Pine County")
prevyearpct <- c(.545,.541,.539,.401,.301,.201,.101,.001,.664,.604,.704,.123,.129,.130,.085,.015,.099)
data2 <- data.frame(countyname, prevyearpct)
Here is the code that I use with tigris to get the shapefile.
NV_counties <- counties(32) # 32 is the Nevada state FIPS code
I do not need the work done for me. If I were to map the prevyearpct values onto the counties, where would I start? Do I need to join the data so that NV_counties and data2 are one consolidated object? I have read quite a few articles/tutorials, but nothing that uses tigris.
You can use geo_join() to join the two datasets together. After that, you can use geom_sf() to map it out (this guide may help).
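For instance, a minimal sketch of that approach, assuming the NAMELSAD field in the tigris output matches countyname (check names(NV_counties) to confirm the join field):
library(tidyverse)
library(tigris)
options(tigris_class = "sf")   # return sf objects so geom_sf() can draw them

NV_counties <- counties(32)

# join the attribute data onto the county geometries
nv_map <- geo_join(NV_counties, data2,
                   by_sp = "NAMELSAD", by_df = "countyname")

ggplot(nv_map) +
  geom_sf(aes(fill = prevyearpct)) +
  labs(fill = "prevyearpct")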

How can I split / clip a polygon by lines in R?

I want to separate CO (a polygon) into sections (also polygons) that are not split by roads (linestrings). Which is to say I want the sections of smaller polygons to be bounded by roads or state borders, and not to contain any roads that enter and exit the polygon.
I was able to use lwgeom::st_split to generate a geometry collection, but I am not sure if that helps me; I am stuck with this solution because I am not sure how to extract the geometries in the collection and, for instance, assign them unique IDs.
My end goal is to make sure that my points (separate data) are not separated by roads. So if you have a solution to this that may be more direct I am all ears as well.
library(tidyverse)
library(tigris)
library(sf)
library(lwgeom)
co <- states(cb = TRUE) %>%
  filter(NAME == "Colorado")
roads <- primary_secondary_roads(state = 'Colorado')
cosplit <- st_split(co, roads)
Has anyone found or seen a solution to this?
I think I figured it out... but I'd definitely love to hear anyone else's ideas!
cosplitpoly <- cosplit %>%
  st_collection_extract("POLYGON")
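And for the unique-ID part of the question, something like this should work on the extracted polygons (a sketch; poly_id is just an illustrative column name):
cosplitpoly <- cosplitpoly %>%
  mutate(poly_id = row_number())   # one ID per resulting polygon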

Extracting raster values from inside polygons in R

I'm trying to find the mean daily temperature for counties in South Dakota from raster grids ('bil' files) found at http://prism.oregonstate.edu/. I am getting county boundaries from the 'maps' package.
library(maps)
library(raster)
sd_counties <- map('county','south dakota')
sd_raster <- raster('file_path')
How do I extract the grid cells within each county? I think I need to turn each county into its own polygon to do this, but how? Then, I should be able to do something like the following. Any help would be greatly appreciated.
values <- extract(raster, list of polygons)
polygon_means <- unlist(lapply(values, FUN=mean))
I'm not familiar with the maps package or the map function, but it looks like it's mainly for visualization rather than geospatial operations.
While there might be a way to convert the map object to actual polygons, here's an easy way using raster's getData function that works:
library(raster)
usa_adm2 <- getData('GADM', country='USA', level=2)
sd_counties <- usa_adm2[grepl('South Dakota',usa_adm2$NAME_1),]
plot(sd_counties)
Now you can extract pixels for each county using extract(r, sd_counties), where r is your desired raster.
Note that, depending on the number of pixels (and layers) you need to extract, this can take some time.
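For the county means specifically, extract() can apply a summary function directly, so something along these lines should work (a sketch reusing the 'file_path' placeholder from the question):
r <- raster('file_path')   # one of the PRISM 'bil' grids
county_means <- extract(r, sd_counties, fun = mean, na.rm = TRUE)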

twitteR search geocode argument in R

I want to run a simple search using twitteR but only return tweets located in the U.S. I know twitteR has a geocode argument for lat/long and miles within that lat/long, but this way of locating tweets for an entire country seems hard.
What would I input into the argument to only get US tweets?
Thanks,
I did a brief search around and it looks like twitteR does not have a built-in country argument. But since you have lat/long, it's very straightforward to do a spatial join to a US country shapefile (i.e. point in polygon).
In this example, I'm using the shapefile from Census.gov and the package spatialEco for its point.in.poly() function. It's a very fast spatial-join function compared to what other packages offer, even if you have hundreds of thousands of coordinates and dozens of polygons. If you have millions of tweets -- or if you decide later on to join to multiple polygons, e.g. all world countries -- then it could be a lot slower. But for most purposes, it's very fast.
(Also, I don't have a Twitter API set up, so I'm going to use an example data frame with tweet_ids and lat/long.)
library(maptools) # for readShapePoly()
library(spatialEco)
# First, use setwd() to set working directory to the folder called cb_2015_us_nation_20m
us <- readShapePoly(fn = "cb_2015_us_nation_20m")
# Alternatively, you can use file.choose() and choose the .shp file like so:
us <- readShapePoly(file.choose())
# Create data frame with sample tweets
# Btw, tweet_id 1 is St. Louis, 2 is Toronto, 3 is Houston
tweets <- data.frame(tweet_id = c(1, 2, 3),
                     latitude = c(38.610543, 43.653226, 29.760427),
                     longitude = c(-90.337189, -79.383184, -95.369803))
# Use point.in.poly to keep only tweets that are in the US
coordinates(tweets) <- ~longitude+latitude
tweets_in_us <- point.in.poly(tweets, us)
tweets_in_us <- as.data.frame(tweets_in_us)
Now, if you look at tweets_in_us you should see only the tweets whose lat/long fall within the area of the US.
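If you do get API access later, the same join should work on real search results; a sketch (the geocode string is 'latitude,longitude,radius', and only geotagged tweets carry coordinates, so expect NAs):
library(twitteR)
raw <- searchTwitter("rstats", n = 1000, geocode = "39.8,-98.6,1500mi")
tweets <- twListToDF(raw)
tweets <- tweets[!is.na(tweets$latitude), ]        # keep geotagged tweets only
tweets$latitude <- as.numeric(tweets$latitude)     # stored as character
tweets$longitude <- as.numeric(tweets$longitude)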

Choropleth world map

I have read so many threads and articles and I keep getting errors. I am trying to make a choropleth map of the world using data I have from the Global Terrorism Database. I want to color countries by nkill or just by the number of attacks in that country... I don't care at this point. Because there are so many countries with data, ordinary plots are unreasonable for showing it.
Help is strongly appreciated and if I did not ask this correctly I sincerely apologize, I am learning the rules of this website as I go.
My code (so far):
library(maps)
library(ggplot2)
map("world")
world <- map_data("world")
gtd <- data.frame(gtd)
names(gtd) <- tolower(names(gtd))
gtd$country_txt <- tolower(gtd$country_txt)
demo <- merge(world, gtd, sort = FALSE, by = "country_txt")
In the gtd data frame, the name of the countries column is "country_txt", so I thought I would use that, but I get: Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column.
If that were to work, I would plot as I have seen on a few websites..
I have honestly been working on this for so long, and I have read so many code samples, similar questions, websites, and R handbooks. I will gladly admit that I am incompetent when it comes to R in exchange for some help.
Something like this? This is a solution using rgdal and ggplot. I long ago gave up on using base R for this type of thing.
library(rgdal)        # for readOGR(...)
library(RColorBrewer) # for brewer.pal(...)
library(ggplot2)
setwd("<directory with all files>")
gtd <- read.csv("globalterrorismdb_1213dist.csv")
gtd.recent <- gtd[gtd$iyear > 2009, ]
gtd.recent <- aggregate(nkill ~ country_txt, gtd.recent, sum)
world <- readOGR(dsn = ".",
                 layer = "world_country_admin_boundary_shapefile_with_fips_codes")
countries <- world@data
countries <- cbind(id = rownames(countries), countries)
countries <- merge(countries, gtd.recent,
                   by.x = "CNTRY_NAME", by.y = "country_txt", all.x = TRUE)
map.df <- fortify(world)
map.df <- merge(map.df, countries, by = "id")
ggplot(map.df, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = nkill)) +
  geom_path(colour = "grey50") +
  scale_fill_gradientn(name = "Deaths",
                       colours = rev(brewer.pal(9, "Spectral")),
                       na.value = "white") +
  coord_fixed() + labs(x = "", y = "")
There are several versions of the Global Terrorism Database. I used the full dataset available here, and then subsetted for year > 2009. So this map shows total deaths due to terrorism, by country, from 2010-01-01 to 2013-01-01 (the last data available from this source). The files are available as MS Excel download, which I converted to csv for import into R.
The world map is available as a shapefile from the GeoCommons website.
The tricky part of making choropleth maps is associating your data with the correct polygons (countries). This is generally a four-step process:
1. Find a field in the shapefile attribute table that maps (no pun intended) to a corresponding field in your data. In this case, it appears that the field "CNTRY_NAME" in the shapefile maps to the field "country_txt" in the gtd database.
2. Create an association between polygon IDs (stored in the row names of the attribute table) and the CNTRY_NAME field.
3. Merge the result with your data using CNTRY_NAME and country_txt.
4. Merge the result of that with the data frame created using fortify(world) - this associates polygons with deaths (nkill).
Building on the nice work by @jlhoward, you could instead use rworldmap, which already has a world map in R and functions to aid joining data to the map. The default map is deliberately low resolution to create a 'cleaner' look. The map can be customised (see the rworldmap documentation), but here is a start:
library(rworldmap)
# 3 lines from @jlhoward
gtd <- read.csv("globalterrorismdb_1213dist.csv")
gtd.recent <- gtd[gtd$iyear > 2009, ]
gtd.recent <- aggregate(nkill ~ country_txt, gtd.recent, sum)
# join data to a map
gtdMap <- joinCountryData2Map(gtd.recent,
                              nameJoinColumn = "country_txt",
                              joinCode = "NAME")
mapDevice('x11') # create a world-shaped window
# plot the map
mapCountryData(gtdMap,
               nameColumnToPlot = 'nkill',
               catMethod = 'fixedWidth',
               numCats = 100)
Following a comment from @hk47, you can also add the points to the map, sized by the number of casualties.
deaths <- subset(x = gtd, nkill > 0)
mapBubbles(deaths,
           nameX = 'longitude',
           nameY = 'latitude',
           nameZSize = 'nkill',
           nameZColour = 'black',
           fill = FALSE,
           addLegend = FALSE,
           add = TRUE)
