Extracting points that belong to a certain area of a RasterLayer (raster) - r

As per title.
I have a "classified" RasterLayer object which has (apart from NAs) two fixed values, 0 and 1. It is a kind of logical image.
I also have a data frame of points with their coordinates, in form of a SpatialPointsDataFrame.
How can I extract points belonging to a certain area (0 or 1)? Been searching into raster-package help but I couldn't find a solution.

You can use extract from the raster package:
"Extract values from a Raster* object at the locations of other
spatial data (that is, perform a spatial query). You can use
coordinates (points), lines, polygons or an Extent (rectangle) object.
You can also use cell numbers to extract values."
values <- extract(x = YourRasterLayer, y = YourSpatialPointsDataFrame)
For more information type:
?raster::extract
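To then keep only the points in one class (0 or 1), subset the SpatialPointsDataFrame with the extracted values. A small self-contained sketch with toy data (the real objects would be your classified raster and your points):

```r
library(raster)
library(sp)

# Toy classified raster: alternating 0/1 values on a 10x10 grid
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) <- rep(c(0, 1), length.out = ncell(r))

# Toy points standing in for your SpatialPointsDataFrame
pts <- SpatialPointsDataFrame(coords = cbind(c(1.5, 5.5), c(2.5, 7.5)),
                              data = data.frame(id = 1:2))

# vals[i] is the raster value (0, 1, or NA) in the cell under the i-th point
vals <- extract(r, pts)

# Keep only the points falling on cells classified as 1
pts_in_1 <- pts[!is.na(vals) & vals == 1, ]
```

The same logical indexing with `vals == 0` gives the points in the other class.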

Related

In R, how can I join data by similar, but not identical, centroid (or numeric argument)?

I'm building a transition matrix of land use change (state) over the years.
I'm therefore comparing shapefiles year after year and building a dataframe with:
Landuse year1 - Landuse year2 - ....- ID- centroid
with the following function :
full_join(landuse1, landuse2, by="centroid")
where centroid is the actual centroid of the polygons. A centroid is basically a vector of two numeric values.
However, the centroid can slightly shift from year to year (because the polygon itself changes a little), leading to incomplete data gathering through the full_join function, because centroids must match exactly.
I'd like to include a "more or less" argument, so that any centroid close enough to the one from the year before can be joined to the dataframe for that particular polygon.
But I'm not sure how.
Thank you in advance.
The general term for what you are trying to do is fuzzy matching. I'm not sure exactly how it would work for the coordinates of a centroid. My idea would be to calculate the distance between the coordinates and set a margin of error, say 0.5%; if two centroids deviate from each other by less than that, you could declare them a match. Basically, loop through your list of locations and give each match a unique ID, which you can then use for the join.
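That nearest-within-tolerance idea can be sketched in base R. The column names `x`/`y`, the toy coordinates, and the tolerance value are all made up; substitute your centroid coordinates and a tolerance in your data's units:

```r
# Two years of centroids; the year-2 positions are slightly shifted
year1 <- data.frame(id1 = 1:3, x = c(0, 10, 20), y = c(0, 10, 20))
year2 <- data.frame(id2 = 1:3, x = c(0.1, 19.9, 10.2), y = c(0.05, 20.1, 9.8))

tol <- 0.5  # maximum distance at which two centroids count as "the same"

# Pairwise distances between all year-1 and year-2 centroids
d <- as.matrix(dist(rbind(year1[, c("x", "y")], year2[, c("x", "y")])))
d <- d[seq_len(nrow(year1)), nrow(year1) + seq_len(nrow(year2))]

# For each year-1 centroid, take the nearest year-2 centroid if within tol
nearest  <- apply(d, 1, which.min)
match_id <- ifelse(d[cbind(seq_len(nrow(year1)), nearest)] <= tol, nearest, NA)

# match_id now gives a shared key (the matching year-2 row, or NA) to join on
year1$match <- match_id
```

From there you can join on the new key instead of the raw centroid, e.g. `full_join(year1, year2, by = c("match" = "id2"))`.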

st_intersection returning a dataframe with 0 observations [duplicate]

This question already has an answer here:
Why use st_intersection rather than st_intersects?
I have a dataframe that shows bus stop locations in Glasgow and another dataframe that shows Datazone polygons for Glasgow. I am using the sf package and have made both dataframes spatial. I want to do a spatial join to create a new dataframe (joined_ds) to match each bus stop location to a Datazone polygon and its associated characteristics (deprivation score). I'm using st_intersection which gives me a new dataframe with all the correct columns but 0 observations.
joined_ds <- st_intersection(st_buffer(bus_stop_data,0), st_buffer(datazones,0))
Both datasets are using the appropriate CRS (EPSG: 27700 for the British National Grid) and I know that the points and polygons overlap because I have successfully plotted them on a map using ggplot, so no idea why my dataframe is showing 0 obs. I've also tried loading in the datasets from scratch and no luck.
Any suggestions welcome, thanks!
Look at the differences between st_intersection and st_intersects here:
Why use st_intersection rather than st_intersects?
Since you are only interested in whether a point intersects with a polygon, you need st_intersects. If I understand you correctly, you don't need st_buffer at all; simply use st_join in combination with st_intersects. Something like:
st_join(bus_stop_data, datazones, join = st_intersects)
Keep in mind issues that might arise when using spatial joins, for example when a point intersects with two polygons.
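To illustrate that last caveat, here is a self-contained sketch with toy data (two unit-square "Datazones" and two stops; `stop_id` and `zone` are invented column names): a stop sitting exactly on a shared boundary would intersect both polygons and appear twice after the join, so one simple fix is to keep the first match per stop.

```r
library(sf)

# Toy data standing in for datazones (two adjacent unit squares) and stops
sq <- function(x0) st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0),
                                         c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
datazones     <- st_sf(zone = c("A", "B"), geometry = st_sfc(sq(0), sq(1)))
bus_stop_data <- st_sf(stop_id = 1:2,
                       geometry = st_sfc(st_point(c(0.5, 0.5)),
                                         st_point(c(1.5, 0.5))))

# Join each stop to the polygon it falls in
joined_ds <- st_join(bus_stop_data, datazones, join = st_intersects)

# A stop on a shared boundary matches both polygons and is duplicated;
# keep only the first match per stop
joined_ds <- joined_ds[!duplicated(joined_ds$stop_id), ]
```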

extracting pixel values above a value per polygon in R

I have a shapefile containing 38 polygons i.e. 38 states of a country. This shapefile is overlaid on a raster. I need to extract/reclassify pixel above a certain value, specific to each polygon.
For example, I need to extract the raster pixels > 120 for state/polygon 1, pixels > 189 for polygon 2, etc., with the resulting raster containing the extracted pixels with value 1 and everything else as NoData. Hence, it seems like I need to extract first and then reclassify.
I have the values for extraction saved as a data frame, with a column of names matching the names of the states, which are stored as an attribute "Name" in the shapefile.
Any suggestion on how I could go about this?
Should I extract the raster for each state into 38 separate rasters, then reclassify() each and mosaic them back into one raster, i.e. the country?
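A possible alternative to 38 extract/reclassify/mosaic rounds: rasterize the per-state thresholds onto the same grid, then do the comparison in a single overlay(). A sketch with toy data (the `Name`/`thresh` column names mirror the setup described above, but are assumptions):

```r
library(raster)
library(sp)

# Toy stand-ins: r is the country raster, states_shp two square "states"
set.seed(42)
r <- raster(nrows = 10, ncols = 20, xmn = 0, xmx = 2, ymn = 0, ymx = 1)
values(r) <- runif(ncell(r), 0, 255)
p1 <- Polygons(list(Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))), "s1")
p2 <- Polygons(list(Polygon(cbind(c(1, 2, 2, 1, 1), c(0, 0, 1, 1, 0)))), "s2")
states_shp <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(p1, p2)),
  data.frame(Name = c("s1", "s2"), row.names = c("s1", "s2")))
threshold_df <- data.frame(Name = c("s1", "s2"), thresh = c(120, 189))

# Attach each state's threshold, rasterize it onto the same grid,
# then compare the two rasters cell-by-cell in one overlay()
states  <- merge(states_shp, threshold_df, by = "Name")
thr_ras <- rasterize(states, r, field = "thresh")
out     <- overlay(r, thr_ras, fun = function(v, t) ifelse(v > t, 1, NA))
```

The result `out` is 1 where a pixel exceeds its own state's threshold and NA everywhere else, in one pass over the whole country.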

Given a vector of coordinates, identify the polygon from a shapefile it falls into

I have my polygons stored in a SpatialPolygonsDataFrame and my coordinates in a data frame.
The output I want is to just have an additional column on my data frame that tags the OBJECTID (id of the polygon from the shapefile) that the coordinates fall into.
My problem is kind of the same with this
But its output is a little different. Also, it's quite slow: tagging just 4 coordinates took more than 5 minutes, and I'll be tagging 16k coordinates, so would it be possible to do it faster?
The current methods I know about wouldn't do that exactly (i.e., produce one polygon id per coordinate), because they're generalized for the case where one point is contained in multiple (overlapping) polygons.
See sp::over(), which used to be called overlay().
Example:
over(sr, geometry(meuse), returnList = TRUE)
over(sr, meuse, returnList = TRUE)
Possible duplicates (it's hard to tell without seeing your example data):
Extracting points with polygon in R
Intersecting Points and Polygons in R
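As a side note on the speed concern: the sf package (the successor to sp for this kind of query) does the same tagging with a spatial index and handles tens of thousands of points quickly. A sketch with toy data (the polygons, coordinates, and `lon`/`lat` names are made up; `OBJECTID` matches the question):

```r
library(sf)

# Toy polygons standing in for the shapefile, with an OBJECTID attribute
sq <- function(x0) st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0),
                                         c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
polys <- st_sf(OBJECTID = 1:2, geometry = st_sfc(sq(0), sq(1)))

# Coordinates as a plain data frame, as in the question
coords_df <- data.frame(lon = c(0.5, 1.5, 3.0), lat = c(0.5, 0.5, 0.5))
pts <- st_as_sf(coords_df, coords = c("lon", "lat"), remove = FALSE)

# Adds an OBJECTID column to each point; points outside all polygons get NA
tagged <- st_join(pts, polys, join = st_intersects)
```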

Counting species occurrence in a grid

I have about 500,000 points in R of occurrence data of a migratory bird species throughout the US.
I am attempting to overlay a grid on these points, and then count the number of occurrences in each grid. Once the counts have been tallied, I then want to reference them to a grid cell ID.
In R, I've used the over() function to just get the points within the range map, which is a shapefile.
#Read in occurrence data
data=read.csv("data.csv", header=TRUE)
coordinates(data)=c("LONGITUDE","LATITUDE")
#Get shapefile of the species' range map
range=readOGR(".",layer="data")
proj4string(data)=proj4string(range)
#Get points within the range map
inside.range=!is.na(over(data,as(range,"SpatialPolygons")))
The above worked exactly as I hoped, but does not address my current problem: how to deal with points of type SpatialPointsDataFrame and a grid that is a raster. Would you recommend polygonizing the raster grid and using the same method I indicated above? Or would another process be more efficient?
First of all, your R code doesn't work as written. I would suggest copy-pasting it into a clean session, and if it errors out for you as well, correcting syntax errors or including add-on libraries until it runs.
That said, I assume that you are supposed to end up with a data.frame of two-dimensional numeric coordinates. So, for the purposes of binning and counting them, any such data will do, so I took the liberty of simulating such a dataset. Please correct me if this doesn't capture a relevant aspect of your data.
## Skip this line if you are the OP, and substitute the real data instead.
data<-data.frame(LATITUDE=runif(100,1,100),LONGITUDE=runif(100,1,100));
## Add the latitudes and longitudes between which each observation is located
## You can substitute any number of breaks you want. Or, a vector of fixed cutpoints
## LATgrid and LONgrid are going to be factors. With ugly level names.
data$LATgrid<-cut(data$LATITUDE,breaks=10,include.lowest=T);
data$LONgrid<-cut(data$LONGITUDE,breaks=10,include.lowest=T);
## Create a single factor that gives the lat,long of each observation.
data$IDgrid<-with(data,interaction(LATgrid,LONgrid));
## Now, create another factor based on the above one, with shorter IDs and no empty levels
data$IDNgrid<-factor(data$IDgrid);
levels(data$IDNgrid)<-seq_along(levels(data$IDNgrid));
## If you want total grid-cell count repeated for each observation falling into that grid cell, do this:
data$count<- ave(data$LATITUDE,data$IDNgrid,FUN=length);
## You could have also used data$LONGITUDE, doesn't matter in this case
## If you want just a table of counts at each grid-cell, do this:
aggregate(data$LATITUDE,data[,c('LATgrid','LONgrid','IDNgrid')],FUN=length);
## I included the LATgrid and LONgrid vectors so there would be some
## sort of descriptive reference accompanying the anonymous numbers in IDNgrid,
## but only IDNgrid is actually necessary
## If you want a really minimalist table, you could do this:
table(data$IDNgrid);
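If, as in the question, the grid is already a RasterLayer, there is no need to polygonize it: raster can tally points per cell directly, and cellFromXY() gives the grid-cell ID for each point. A sketch with simulated points (not the OP's data):

```r
library(raster)
library(sp)

# Toy stand-ins: 500 occurrence points and a 10x10 grid raster covering them
set.seed(1)
pts <- SpatialPoints(cbind(runif(500, 0, 10), runif(500, 0, 10)))
grd <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)

# Cell-by-cell tally of points; cells containing no points become NA
counts <- rasterize(pts, grd, field = 1, fun = "count")

# Cell number of the grid cell each point falls in (the grid-cell ID)
cell_id <- cellFromXY(grd, pts)
```

This avoids the intermediate factor bookkeeping entirely and references the counts back to raster cell numbers.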
