Extracting raster values from inside polygons in R

I'm trying to find the mean daily temperature for counties in South Dakota from raster grids ('bil' files) found at http://prism.oregonstate.edu/. I am getting county boundaries from the 'maps' package.
library(maps)
library(raster)
sd_counties <- map('county','south dakota')
sd_raster <- raster('file_path')
How do I extract the grid cells within each county? I think I need to turn each county into its own polygon to do this, but how? Then I should be able to do something like the following. Any help would be greatly appreciated.
values <- extract(raster, list of polygons)
polygon_means <- unlist(lapply(values, FUN=mean))

I'm not familiar with the maps package or the map function, but it looks like it's solely for visualization, rather than geospatial operations.
While there might be a way to convert the map object to actual polygons, here's an easy approach that works, using raster's getData function:
library(raster)
usa_adm2 <- getData(country='USA',level=2)
sd_counties <- usa_adm2[grepl('South Dakota',usa_adm2$NAME_1),]
plot(sd_counties)
Now you can extract pixels for each county using extract(r,sd_counties), where r is your desired raster.
Note that, depending on the number of pixels (and layers) you need to extract, this can take some time.
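As a minimal sketch of the extraction step, the following uses a small synthetic raster and two hand-made rectangles standing in for counties. The grid, the rectangle shapes, and the IDs county_a and county_b are assumptions for illustration, not PRISM data or real county boundaries. Passing fun = mean to extract() returns one mean per polygon, so the separate lapply() step is not needed.

```r
library(raster)
library(sp)

# Synthetic 4 x 4 raster covering x, y in [0, 4] (hypothetical data).
# Cells are filled row by row starting from the top-left corner.
r <- raster(nrows = 4, ncols = 4, xmn = 0, xmx = 4, ymn = 0, ymx = 4)
values(r) <- 1:16

# Helper (hypothetical) that builds a rectangle spanning the full height.
rect_poly <- function(xmin, xmax, id) {
  coords <- cbind(c(xmin, xmin, xmax, xmax, xmin), c(0, 4, 4, 0, 0))
  Polygons(list(Polygon(coords)), ID = id)
}

# Two rectangles standing in for counties: left half and right half.
counties <- SpatialPolygons(list(rect_poly(0, 2, "county_a"),
                                 rect_poly(2, 4, "county_b")))

# One mean per polygon; a cell is used if its centre falls inside the polygon.
county_means <- extract(r, counties, fun = mean, na.rm = TRUE)
print(county_means)
```

With real data, r would be the raster read from the 'bil' file and counties would be sd_counties from the answer above.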

Related

Creating polygons from sparse data in R

I have a data frame of coordinates from which I want to create polygons. The usual result is a single polygon like the one in the first image:
But I'm looking for something different, more like this:
As you can see, if the points are far enough apart, another polygon is created. Is this possible in R? Thanks!
Here is the data: csv
I think you would find the concept of concave/alpha hulls relevant. There's an R package alphahull that may cover your need.
install.packages("alphahull")
library(alphahull)
fff <- readr::read_csv("data.csv")
dddd <- ahull(fff[,2:3],alpha = 0.01)
plot(dddd)
And in case you need to convert this output into a spatial data format, please see the following:
https://babichmorrowc.github.io/post/2019-03-18-alpha-hull/
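For comparison, here is a rough base-R sketch of a different technique from the alpha hull above: group the points by single-linkage clustering, then draw one convex hull per group. It is not equivalent to an alpha hull (the hulls are always convex), but it does produce a separate polygon when points are far enough apart. The simulated points and the max_gap threshold are assumptions for illustration.

```r
set.seed(1)

# Two well-separated clouds of hypothetical points.
pts <- rbind(
  cbind(x = runif(30, 0, 1), y = runif(30, 0, 1)),
  cbind(x = runif(30, 5, 6), y = runif(30, 5, 6))
)

# Assumed threshold: points further apart than this start a new polygon.
max_gap <- 1.5

# Single-linkage clustering, cut at the distance threshold.
groups <- cutree(hclust(dist(pts), method = "single"), h = max_gap)

# One convex hull (matrix of vertex coordinates) per group.
hulls <- lapply(split(seq_len(nrow(pts)), groups), function(idx) {
  p <- pts[idx, , drop = FALSE]
  p[chull(p), , drop = FALSE]
})

plot(pts, asp = 1)
for (h in hulls) polygon(h, border = "red")
```

The threshold plays a role similar to alpha: a smaller value splits the points into more polygons.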

Create a vector footprint of a multi-part raster [R]

I've read a raster into my R session, using this code:
raster <- stack("raster.tif")
and now I'd like to make a simple feature (sf) object representing the outline of that raster. I can't use a bounding box because the raster is multi-part so the bounding box would be much larger than the raster. So the footprint also needs to be a multi-part feature (sf multipolygon).
I'd appreciate any help with this. Thanks!
Mark
If you want a footprint for each raster in the stack, you need to loop over the layers with lapply. This returns a list of polygon layers. You then need to convert each component of the list to an sf multipolygon. Lastly, you need to concatenate the features (note that c is the c() function). shp should be your multipolygon. You may not want to dissolve the polygons; you didn't really make clear what you wanted.
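library(raster)
library(sf)
a <- lapply(as.list(raster), rasterToPolygons, dissolve=TRUE) # one polygon layer per raster layer
b <- lapply(a, function(p) st_geometry(st_as_sf(p))) # convert to sf geometries (c() combines geometry columns, not sf data frames)
shp <- Reduce(c, b) # combine all polygons into one geometry set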
As a side note, it's probably not great to use raster as a variable name because the raster package has a function called raster.
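A small self-contained sketch of the footprint idea, assuming a single-layer raster in which cells outside the data are NA. The toy matrix is hypothetical, and st_union() is used here for the dissolve step instead of dissolve=TRUE.

```r
library(raster)
library(sf)

# Hypothetical raster with two separate parts: columns 3 and 4 are empty (NA).
vals <- matrix(1, nrow = 4, ncol = 6)
vals[, 3:4] <- NA
r <- raster(vals, xmn = 0, xmx = 6, ymn = 0, ymx = 4)

# One square polygon per non-NA cell (NA cells are dropped by default).
cells <- rasterToPolygons(r)

# Dissolve all cell polygons into a single multi-part footprint.
footprint <- st_union(st_as_sf(cells))
footprint <- st_cast(footprint, "MULTIPOLYGON")

plot(footprint)
```

For a large raster, converting every cell to a polygon can be slow, so it may be worth aggregating the raster to a coarser resolution first.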

How to subset a large SpatialPolygonsDataFrame

I want to calculate the area of a wildfire. I tried this by subtracting the NDVI calculated on a Landsat image taken before the fire from the NDVI of an image taken after it, to see where the NDVI was reduced. However, the NDVI changed not only in the burned areas; there are also many random differences. I used rasterToPolygons to create a large SpatialPolygonsDataFrame containing all areas where NDVI after - NDVI before < 0.
Now I want to remove all the polygons with an area below a certain threshold value. However, I cannot find a way to subset the large SpatialPolygonsDataFrame.
I found an example on how to get a list of the polygons with an area above the threshold (where burned_poly is the large SpatialPolygonsDataFrame):
pols <- lapply(burned_poly@polygons, slot, "Polygons")
pols_areas <- lapply(pols[[2]], function(x) slot(x, "area"))
However, accessing the large SpatialPolygonsDataFrame like this
bp <- burned_poly@polygons[[1]]@Polygons[pols_areas >= 9000]
gives me a list which I am currently unable to coerce into a SpatialPolygonsDataFrame.
Can someone tell me how to do this last step (I have trouble with the Sr argument of the SpatialPolygonsDataFrame function, which I don't understand), or is there perhaps a different and better approach to extract the fire extent as a polygon?
Alright, I think I have found a way, thanks to Orlando's suggestion to use sf.
I transformed my large SpatialPolygonsDataFrame object to an sf object via st_as_sf(), which gave me a multipolygon. This sfc_MULTIPOLYGON object can be subdivided into single polygons using st_cast(), and the resulting object is subsettable like a data.frame.
bp_sf <- st_as_sf(burned_poly)
bps_sf <- st_cast(bp_sf, "POLYGON")
BpSf <- bps_sf[as.numeric(st_area(bps_sf))>=10000,]
If you are using the simple features library sf, you can use functions from the tidyverse. Filtering data is a matter of using the filter() function. Note that you can convert your objects to sf using st_as_sf(). See: https://r-spatial.github.io/sf/reference/st_as_sf.html and "How to filter an R simple features collection using sf methods like st_intersects()?"
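If you would rather stay with sp objects, a possible sketch is to read the area slot of each Polygons entry and subset the SpatialPolygonsDataFrame by row with a logical vector. The three squares, their IDs, and the ndvi_diff column below are hypothetical stand-ins for burned_poly, and the areas are planar, in squared coordinate units.

```r
library(sp)

# Helper (hypothetical) that builds one square feature with a given side length.
square <- function(side, id) {
  coords <- cbind(c(0, 0, side, side, 0), c(0, side, side, 0, 0))
  Polygons(list(Polygon(coords)), ID = id)
}

# Toy SpatialPolygonsDataFrame with three features of different sizes.
sp_polys <- SpatialPolygons(list(square(50, "a"),
                                 square(100, "b"),
                                 square(200, "c")))
burned <- SpatialPolygonsDataFrame(
  sp_polys,
  data.frame(ndvi_diff = c(-0.1, -0.3, -0.2), row.names = c("a", "b", "c"))
)

# Area of each feature, taken from the "area" slot of its Polygons object.
areas <- sapply(burned@polygons, slot, "area")

# Row subsetting keeps geometry and attributes together.
large <- burned[areas >= 10000, ]
print(large$ndvi_diff)
```

This filters whole features. If one feature contains many rings, as in the question, splitting it into single polygons first (for example with st_cast() as above) is still needed.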

Choropleth Maps in R - TIGER Shapefile issue

I have a question on mapping with R, specifically about choropleth maps.
I have a dataset of ZIP codes assigned to an area and some associated data (dataset is here).
My final data format is: Area ID, ZIP, Probability Value, Customer Count, Area Probability and Area Customer Total. I am attempting to present this data by plotting Area Probability and Area Customer Total on a map. I have tried to do this using the Census TIGER shapefiles, but I guess R cannot handle the complete country.
I am comfortable with R's statistical capabilities, and now I am moving all my mapping from third-party GIS-focused applications into R. Does anyone have any pointers on how to achieve this from within R?
To be a little more detailed, here's the point where R stops working:
shapes <- readShapeSpatial("tl_2013_us_zcta510.shp")
(where the .shp file is the Census TIGER shapefile).
Edit: providing further details. I am trying to first read the TIGER shapefiles, hoping to combine this spatial dataset with my data and eventually plot. I am having an issue at the very beginning, when attempting to read the shapefile. Below is the code with the output:
require(maptools)
shapes<-readShapeSpatial("tl_2013_us_zcta510.shp")
Error: cannot allocate vector of size 317 Kb
There are several examples and tutorials on making maps using R, but most are very general and, unfortunately, most map projects have nuances that create inscrutable problems. Yours is a case in point.
The biggest issue I came across was that the US Census Bureau ZIP code tabulation area shapefile for the whole US is huge: ~800MB. When loaded using readOGR(...), the R SpatialPolygonsDataFrame object is about 913MB. Trying to process a file this size (e.g., converting to a data frame using fortify(...)), at least on my system, resulted in errors like the one you identified above. So the solution is to subset the file based on the zip codes that are actually in your data.
This map was made from your data using the following code.
library(rgdal)
library(ggplot2)
library(stringr)
library(RColorBrewer)
setwd("<directory containing shapefiles and sample data>")
data <- read.csv("Sample.csv",header=T) # your sample data, downloaded as csv
data$ZIP <- str_pad(data$ZIP,5,"left","0") # convert ZIP to char(5) w/leading zeros
zips <- readOGR(dsn=".","tl_2013_us_zcta510") # import zip code polygon shapefile
map <- zips[zips$ZCTA5CE10 %in% data$ZIP,] # extract only zips in your Sample.csv
map.df <- fortify(map) # convert to data frame suitable for plotting
# merge data from Sample.csv into map data frame
map.data <- data.frame(id=rownames(map@data),ZIP=map@data$ZCTA5CE10)
map.data <- merge(map.data,data,by="ZIP")
map.df <- merge(map.df,map.data,by="id")
# load state boundaries
states <- readOGR(dsn=".","gz_2010_us_040_00_5m")
states <- states[states$NAME %in% c("New York","New Jersey"),] # extract NY and NJ
states.df <- fortify(states) # convert to data frame suitable for plotting
ggMap <- ggplot(data = map.df, aes(long, lat, group = group))
ggMap <- ggMap + geom_polygon(aes(fill = Probability_1))
ggMap <- ggMap + geom_path(data=states.df, aes(x=long,y=lat,group=group))
ggMap <- ggMap + scale_fill_gradientn(name="Probability",colours=brewer.pal(9,"Reds"))
ggMap <- ggMap + coord_equal()
ggMap
Explanation:
The rgdal package facilitates the creation of R Spatial objects from ESRI shapefiles. In your case we are importing a polygon shapefile into a SpatialPolygonsDataFrame object in R. The latter has two main parts: a polygon section, which contains the latitude and longitude points that will be joined to create the polygons on the map, and a data section, which contains information about the polygons (so, one row for each polygon). If, e.g., we call the Spatial object map, then the two sections can be referenced as map@polygons and map@data. The basic challenge in making choropleth maps is to associate data from your Sample.csv file with the relevant polygons (zip codes).
So the basic workflow is as follows:
1. Load polygon shapefiles into Spatial object ( => zips)
2. Subset if appropriate ( => map).
3. Convert to data frame suitable for plotting ( => map.df).
4. Merge data from Sample.csv into map.df.
5. Draw the map.
Step 4 is the one that causes all the problems. First we have to associate zip codes with each polygon. Then we have to associate Probability_1 with each zip code. This is a three-step process.
Each polygon in the Spatial data file has a unique ID, but these IDs are not the zip codes. The polygon IDs are stored as row names in map@data. The zip codes are stored in map@data, in column ZCTA5CE10. So first we must create a data frame that associates the map@data row names (id) with map@data$ZCTA5CE10 (ZIP). Then we merge your Sample.csv with the result using the ZIP field in both data frames. Then we merge the result of that into map.df. This can be done in 3 lines of code.
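To see the three-step merge in isolation, here is a toy sketch in base R that needs no shapefile. The data frames poly_attrs, sample_data, and map_df are hypothetical stand-ins for map@data, Sample.csv, and the fortified map.df, and all values are made up.

```r
# Stand-in for map@data: row names are polygon IDs, ZCTA5CE10 holds the ZIP.
poly_attrs <- data.frame(ZCTA5CE10 = c("07030", "10001", "10002"),
                         row.names = c("0", "1", "2"))

# Stand-in for Sample.csv: only two of the three ZIP codes have data.
sample_data <- data.frame(ZIP = c("10001", "07030"),
                          Probability_1 = c(0.8, 0.3))

# Stand-in for the fortified data frame: several vertices per polygon ID.
map_df <- data.frame(
  id   = rep(c("0", "1", "2"), each = 3),
  long = c(0, 1, 1, 2, 3, 3, 4, 5, 5),
  lat  = c(0, 0, 1, 0, 0, 1, 0, 0, 1)
)

# Step 1: associate polygon IDs with ZIP codes.
map_data <- data.frame(id = rownames(poly_attrs), ZIP = poly_attrs$ZCTA5CE10)
# Step 2: attach the sample data by ZIP.
map_data <- merge(map_data, sample_data, by = "ZIP")
# Step 3: attach the result to every vertex row by polygon ID.
map_df <- merge(map_df, map_data, by = "id")

print(map_df)
```

Because merge() performs an inner join by default, polygon "2", which has no row in sample_data, drops out of the result; all.x = TRUE in the last merge would keep it, with NA values.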
Drawing the map involves telling ggplot what dataset to use (map.df), which columns to use for x and y (long and lat) and how to group the data by polygon (group=group). The columns long, lat, and group in map.df are all created by the call to fortify(...). The call to geom_polygon(...) tells ggplot to draw polygons and fill using the information in map.df$Probability_1. The call to geom_path(...) tells ggplot to create a layer with state boundaries. The call to scale_fill_gradientn(...) tells ggplot to use a color scheme based on the color brewer "Reds" palette. Finally, the call to coord_equal(...) tells ggplot to use the same scale for x and y so the map is not distorted.
NB: The state boundary layer uses the US States TIGER file.
I would advise the following:
1. Use readOGR from the rgdal package rather than readShapeSpatial.
2. Consider using ggplot2 for good-looking maps; many of the examples use this.
3. Refer to one of the existing examples of creating a choropleth, such as this one, to get an overview.
4. Start with a simple choropleth and gradually add your own data; don't try to get it all right at once.
5. If you need more help, create a reproducible example with a SMALL fake dataset and with links to the shapefiles in question. The idea is that you make it easy for us to help you, rather than discouraging us by not supplying code and data in your question.

Merge neighboring regions in R (aggregate spatial data)?

I guess I needed to rephrase my awfully worded previous question (I deleted it). Here's another try. I want to join two adjacent regions in such a way that their common border disappears and only their outer line can be seen.
Here's a reproducible example:
require(maptools) # provides readShapeSpatial
require(sp)
xx <- readShapeSpatial(system.file("shapes/sids.shp", package="maptools")[1],
                       IDvar="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
# show all the subregions
plot(xx)
Now let's consider only regions 3 and 5:
plot(xx[c(3,5),])
How can I aggregate these regions? In practice, what I want to do is like having a map of the whole continent showing all countries, and producing from it a map that shows only North America and South America.
To me this looks like a pretty common task, but I haven't found the right function for it so far. Am I just missing a function, or do I have to do it manually?
The rgeos package provides a number of excellent tools for handling Spatial* data, that can be used in this case.
For example:
library(rgeos)
regionOfInterest <- gUnion(xx[3,], xx[5,])
This also has the same result, and may be more useful for multiple polygons:
regionOfInterest <- gUnionCascaded(xx[c(3,5), ])
The result from plot(regionOfInterest):
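The same idea can be sketched with the sf package, shown here as an alternative to the rgeos functions above rather than a drop-in replacement. The two unit squares and the unit_square helper are hypothetical stand-ins for two adjacent regions; st_union() with a single argument dissolves all geometries into one, so the shared border disappears.

```r
library(sf)

# Helper (hypothetical) that builds a unit square whose left edge is at x0.
unit_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Two adjacent regions sharing the edge x = 1.
regions <- st_sfc(unit_square(0), unit_square(1))

# Union of all geometries: one polygon with only the outer boundary left.
merged <- st_union(regions)

plot(regions)
plot(merged, border = "red", add = TRUE)
```

For Spatial* objects such as xx, st_as_sf(xx[c(3,5),]) would give an sf object that can be passed to st_union() in the same way.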
