R - find point farthest from set of points on rasterized USA map - r

New to spatial analysis on R here. I have a shapefile for the USA that I downloaded from HERE. I also have a set of lat/long points (half a million) that lie within the contiguous USA.
I'd like to find the "most remote spot" -- the spot within the contiguous USA that's farthest from the set of points.
I'm using the rgdal, raster and sp packages. Here's a reproducible example with a random sample of 10 points:
library(rgdal)
library(raster)
library(sp)
# Set wd to the folder tl_2010_us_state_10
usa <- readOGR(dsn = ".", layer = "tl_2010_us_state10")
# Sample 10 points in USA
sample <- spsample(usa, 10, type = "random")
# Set extent for contiguous united states
ext <- extent(-124.848974, -66.885444, 24.396308, 49.384358)
# Rasterize USA
r <- raster(ext, nrow = 500, ncol = 500)
rr <- rasterize(usa, r)
# Find distance from sample points to cells of USA raster
D <- distanceFromPoints(object = rr, xy = sample)
# Plot distances and points
plot(D)
points(sample)
After the last two lines of code, I get this plot.
However, I'd like it to be over the rasterized map of the USA. And, I'd like it to only consider distances from cells that are in the contiguous USA, not all cells in the bounding box. How do I go about doing this?
I'd also appreciate any other tips regarding the shape file I'm using -- is it the best one? Should I be worried about using the right projection, since my actual dataset is lat/long? Will distanceFromPoints be able to efficiently process such a large dataset, or is there a better function?

To limit raster D to the contiguous USA you could find the elements of rr assigned values of NA (i.e. raster cells within the bounding box but outside of the usa polygons), and assign these same elements of D a value of NA.
D[which(is.na(rr[]))] <- NA
plot(D)
lines(usa)
You can use 'proj4string(usa)' to find the projection info for the usa shapefile. If your coordinates of interest are based on a different projection, you can transform them to match the usa shapefile projection as follows:
my_coords_xform <- spTransform(my_coords, CRS(proj4string(usa)))
Not sure about the relative efficiency of distanceFromPoints, but it only took ~ 1 sec to run on my computer using your example with 10 points.

I think you were looking for the mask function.
library(raster)
usa <- getData('GADM', country='USA', level=1)
# exclude Alaska and Hawaii
usa <- usa[!usa$NAME_1 %in% c( "Alaska" , "Hawaii"), ]
# get the extent and create raster with preferred resolution
r <- raster(floor(extent(usa)), res=1)
# rasterize polygons
rr <- rasterize(usa, r)
set.seed(89)
sample <- spsample(usa, 10, type = "random")
# Find distance from sample points to cells of USA raster
D <- distanceFromPoints(object = rr, xy = sample)
# remove areas outside of polygons
Dm <- mask(D, rr)
# an alternative would be mask(D, usa)
# cell with highest value
mxd <- which.max(Dm)
# coordinates of that cell
pt <- xyFromCell(r, mxd)
plot(Dm)
points(pt)
The distances should be fine, even when using long/lat data. But distanceFromPoints could indeed be a bit slow with a large data set, as it uses a brute-force algorithm.
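To make the brute-force remark concrete, here is a minimal base R sketch of what the distanceFromPoints / mask / which.max pipeline computes conceptually. It is not the raster package's implementation: it assumes planar (Euclidean) coordinates, and the grid, the `inside` flag and the points are all made up for illustration.

```r
# Toy 5 x 5 grid of cell centres (planar units, hypothetical)
cells <- expand.grid(x = 1:5, y = 1:5)

# Hypothetical "inside the polygon" flag: everything except the top row
inside <- cells$y < 5

# Two made-up sample points
pts <- data.frame(x = c(1, 2), y = c(1, 1))

# Brute force: for every cell, compute the distance to every point
# and keep the smallest one (cost grows as n_cells * n_points)
nearest <- apply(cells, 1, function(cell) {
  min(sqrt((pts$x - cell["x"])^2 + (pts$y - cell["y"])^2))
})

# Mask cells outside the region, then pick the farthest remaining cell
nearest[!inside] <- NA
farthest <- cells[which.max(nearest), ]
farthest
```

Every cell is compared against every point, which is why the cost becomes noticeable with half a million points on a fine grid.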

Related

Find distance by sea from coastal point A to coastal point B

I am trying to adapt the solution to this question for a slightly different purpose.
During my search for a solution, I found J. Win's nice solution here for finding a path via sea from A to B.
So I tried to adapt the code to find distances as well but did not get the expected output.
Problem 1: Scaling issue or incorrect use of function? E.g. the coordinates go from the north to the south of Spain. The expected distance should be about 1600 km, but the function outputs 66349.24.
Problem 2: The required application needs to work around small islands. Is there another data model that will simply slot into this when required?
library(raster)
library(gdistance)
library(maptools)
library(rgdal)
library(maps)
# make a raster of the world, low resolution for simplicity
# with all land having "NA" value
# use your own shapefile or raster if you have it
# the wrld_simpl data set is from maptools package
data(wrld_simpl)
# a typical world projection
world_crs <- crs(wrld_simpl)
world <- wrld_simpl
worldshp <- spTransform(world, world_crs)
ras <- raster(nrow=300, ncol=300)
# rasterize will set ocean to NA so we just inverse it
# and set water to "1"
# land is equal to zero because it is "NOT" NA
worldmask <- rasterize(worldshp, ras)
worldras <- is.na(worldmask)
# originally I set land to "NA"
# but that would make some ports impossible to visit
# so at 999 land (ie everything that was zero)
# becomes very expensive to cross but not "impossible"
worldras[worldras==0] <- 999
# each cell in worldras now has a value of 1 (water) or 999 (land), nothing else
# create a Transition object from the raster
# this calculation took a bit of time
tr <- transition(worldras, function(x) 1/mean(x), 16)
tr = geoCorrection(tr, scl=FALSE)
# distance matrix excluding the land, must be calculated
# from a point of origin, specified in the CRS of your raster
# let's start with latlong in Black Sea as a challenging route
port_origin <- structure(c(44.206963342648436, -4.350866822197724), .Dim = 1:2)
port_origin <- project(port_origin, crs(world_crs, asText = TRUE))
points(port_origin)
# function accCost uses the transition object and point of origin
A <- accCost(tr, port_origin)
# now A still shows the expensive travel over land
# so we mask it out to display sea travel only
A <- mask(A, worldmask, inverse=TRUE)
# and calculation of a travel path, let's say to South Africa
port_destination <- structure(c(43.83115853071615, -5.134767965894603), .Dim = 1:2)
port_destination <- project(port_destination, crs(world_crs, asText = TRUE))
path <- shortestPath(tr, port_origin, port_destination, output = "SpatialLines")
t_path <-shortestPath(tr, port_origin, port_destination)
distance <-costDistance(tr, port_origin, port_destination)
# make a demonstration plot
plot(A)
points(rbind(port_origin, port_destination))
lines(path)
# we can wrap all this in a customized function
# if you have two points in the projected coordinate system,
# and a raster whose cells are weighted
# according to ease of shipping
RouteShip <- function(from_port, to_port, cost_raster, land_mask) {
tr <- transition(cost_raster, function(x) 1/mean(x), 16)
tr = geoCorrection(tr, scl=FALSE)
A <- accCost(tr, from_port)
A <- mask(A, land_mask, inverse=TRUE)
path <- shortestPath(tr, from_port, to_port, output = "SpatialLines")
plot(A)
points(rbind(from_port, to_port))
lines(path)
}

calculating road density raster from road shapefile

I'm looking to turn a shapefile with roads (which includes a column of length per road) in the Eastern half of the USA into a raster of 1x1km of road density, using R.
I can't find a straightforward way in Arcmap (Line density works with a radius from the cell center instead of just the cell).
Here is a solution that creates polygons from the raster cells (adapted from my answer here). You may need to do this for subsets of your dataset and then combine the results.
Example data
library(terra)
v <- vect(system.file("ex/lux.shp", package="terra"))
roads <- as.lines(v)
rs <- rast(v)
Solution
values(rs) <- 1:ncell(rs)
names(rs) <- "rast"
rsp <- as.polygons(rs)
rp <- intersect(roads, rsp)
rp$length <- perim(rp) / 1000 #km
x <- tapply(rp$length, rp$rast, sum)
r <- rast(rs)
r[as.integer(names(x))] <- as.vector(x)
plot(r)
lines(roads)
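The aggregation step (tapply followed by indexed assignment) can be looked at in isolation with plain vectors. This is a minimal sketch with made-up segment lengths and cell numbers; `seg_len`, `seg_cell` and the 9-cell grid are hypothetical stand-ins for `rp$length`, `rp$rast` and `r`.

```r
# Hypothetical road segments after the intersection: length (km) and
# the number of the raster cell each segment falls in
seg_len  <- c(1.2, 0.8, 2.5, 0.5)
seg_cell <- c(3, 3, 7, 1)

# Total road length per cell; names(x) are the cell numbers as strings
x <- tapply(seg_len, seg_cell, sum)

# Stand-in for an empty 9-cell raster: cells without roads stay NA
density <- rep(NA_real_, 9)
density[as.integer(names(x))] <- as.vector(x)
density
```

With 1 x 1 km cells, the summed length in km per cell can be read directly as km of road per square km.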

Convert a column value(s) in SpatialpolygonDataframe into raster image

I need help with converting a variable or column values in a spatial polygon into a raster image. I have spatial data of administrative units with income(mean) information for each unit. I want to convert this information into raster for further analysis.
I tried the code below but it didn't work.
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK,r,field=DK@data[,2],fun=mean)
Where DK is the spatial polygon and the mean income for each spatial unit is stored in column 2 of the SpatialPolygonsDataFrame. Can anyone help with a function or code to rasterize the values in the column of interest? An example of the SpatialPolygonsDataFrame (created) and my attempt to rasterize the data are below.
suppressPackageStartupMessages(library(tidyverse))
url = "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile = tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
DK@data = subset(DK@data, select = c(navn))
DK@data$inc = runif(11, min=5000, max=80000)
require(raster)
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK,r,field=DK@data[,2],fun=mean)
plot(r.inc)
Thank you.
Acknowledgement: The code for creating the sample SPDF was sourced from Mikkel Freltoft Krogsholm (link below).
https://www.linkedin.com/pulse/easy-maps-denmark-r-mikkel-freltoft-krogsholm/?trk=read_related_article-card_title
Here's something that makes a raster.
library(tidyverse)
library(rgdal)
library(raster)
url <- "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile <- tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
r_dk <- raster(DK, nrows = 100, ncols = 100) # Make a raster of the same size as the spatial polygon with many cells
DK$inc <- runif(nrow(DK), min=5000, max=80000) # Add some fake income data
rr <- rasterize(DK, r_dk, field='inc') # Rasterize the polygon into the raster - fun = 'mean' won't make any difference
plot(rr)
The original raster was the size of the whole Earth, so I think Denmark was being averaged to nothing. I resolved this by making an empty raster based on the extent of the DK spatial polygons with 100x100 cells. I also simplified the code. Generally, if you find yourself using @ with spatial data manipulation, it's a sign that there might be a simpler way. Because the raster cells are much smaller than each DK region, taking the average doesn't make much difference.
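The effect of the default extent can be checked directly. A small sketch, assuming the raster package is installed; the bounding box used for `r_local` is a rough illustrative box, not the exact extent of the DK polygons.

```r
library(raster)

# With no extent given, raster() creates a template covering the whole globe
r_global <- raster(ncol = 5, nrow = 15)
extent(r_global)  # -180, 180, -90, 90
res(r_global)     # each cell is 72 x 12 degrees

# A template tied to a (hypothetical, approximate) regional bounding box
r_local <- raster(extent(8, 15, 54, 58), nrows = 100, ncols = 100)
res(r_local)      # each cell is 0.07 x 0.04 degrees
```

With cells 72 degrees wide, a whole country falls inside a single cell, which is consistent with the explanation above.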

I would like to work out the distance of data points (lat/long) from the edges of a shape file in R and then apply a criterion to the data points?

I have data points of a species observed using camera traps and would like to measure the distance of each camera trap site (CameraStation) to the edge of a national park using R. I have a shapefile of the park (shp) and want to apply a criterion to CameraStation(s) which are <5km from the edge. My data frame (df) consists of multiple events/observations (EventID) per CameraStation. The aim is to analyse when events near the park edge are most frequent given other environmental factors such as Season, Moon Phase and DayNight (also columns in DF).
I found a package called distance in R but this is for distance sampling and not what I want to do. Which package is relevant in this situation?
I expect the following outcome:
EventID CameraStation Distance(km) Within 5km
0001 Station 1 4.3 Yes
0002 Station 1 4.3 Yes
0003 Station 2 16.2 No
0004 Station 3 0.5 Yes
...
Here's a general solution, adapted from Spacedman's answer to this question at gis.stackexchange. Note: This solution requires working in a projected coordinate system. You can transform to a projected CRS if needed using spTransform.
The gDistance function of the rgeos package calculates the distance between geometries, but for the case of points inside a polygon the distance is zero. The trick is to create a new "mask" polygon where the original polygon is a hole cut out from the mask. Then we can measure the distance between points in the hole and the mask, which is the distance to the edge of the original polygon that we really care about.
We'll use the shape file of the Yellowstone National Park Boundary found on this page.
library(sp) # for SpatialPoints and proj4string
library(rgdal) # to read shapefile with readOGR
library(rgeos) # for gDistance, gDifference, and gBuffer
# ab67 was the name of the shape file I downloaded.
yellowstone.shp <- readOGR("ab67")
# gBuffer enlarges the boundary of the polygon by the amount specified by `width`.
# The units of `width` (meters in this case) can be found in the proj4string
# for the polygon.
yellowstone_buffer <- gBuffer(yellowstone.shp, width = 5000)
# gDifference calculates the difference between the polygons, i.e. what's
# in one and not in the other. That's our mask.
mask <- gDifference(yellowstone_buffer, yellowstone.shp)
# Some points inside the park
pts <- list(x = c(536587.281264245, 507432.037861251, 542517.161278414,
477782.637790409, 517315.171218198),
y = c(85158.0056377799, 77251.498952222, 15976.0721391485,
40683.9055315169, -3790.19457474617))
# Sanity checking the mask and our points.
plot(mask)
points(pts)
# Put the points in a SpatialPointsDataFrame with camera id in a data field.
spts.df <- SpatialPointsDataFrame(pts, data = data.frame(Camera = ordered(1:length(pts$x))))
# Give our SpatialPointsDataFrame the same spatial reference as the polygon.
proj4string(spts.df) <- proj4string(yellowstone.shp)
# Calculate distances (km) from points to edge and put in a new column.
spts.df$km_to_edge <- apply(gDistance(spts.df, mask, byid=TRUE),2,min)/1000
# Determine which records are within 5 km of an edge and note in new column.
spts.df$edge <- ifelse(spts.df$km_to_edge < 5, TRUE, FALSE)
# Results
spts.df
# coordinates Camera km_to_edge edge
# 1 (536587.3, 85158.01) 1 1.855010 TRUE
# 2 (507432, 77251.5) 2 9.762755 FALSE
# 3 (542517.2, 15976.07) 3 11.668700 FALSE
# 4 (477782.6, 40683.91) 4 4.579638 TRUE
# 5 (517315.2, -3790.195) 5 8.211961 FALSE
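As an alternative sketch, the same distance-to-edge idea can be expressed with the sf package, which the answer above does not use: measure the distance from each point to the polygon's boundary line instead of building a mask polygon. The square "park", the camera coordinates and the EPSG code 32612 (a UTM zone, in meters) are assumptions chosen only for illustration.

```r
library(sf)

# Hypothetical 10 km x 10 km square park in a projected CRS (meters)
park <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(10000, 0), c(10000, 10000),
                        c(0, 10000), c(0, 0)))),
  crs = 32612
)

# Two hypothetical camera stations inside the park
cams <- st_sfc(st_point(c(1000, 5000)), st_point(c(5000, 5000)), crs = 32612)

# The boundary is a line, so points inside the polygon get a non-zero distance
edge <- st_boundary(park)
km_to_edge <- as.numeric(st_distance(cams, edge)) / 1000
within_5km <- km_to_edge < 5

data.frame(Camera = 1:2, km_to_edge, within_5km)
```

As with gDistance, this relies on a projected coordinate system so that the distances come out in meters.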
Here's a quick solution.
Simplify the outline of your shapefile into N points. Then calculate the minimum distance for each camera trap to every point in the outline of the national park.
library(geosphere)
n <- 500 ##The number of points summarizing the shapefile
NPs <- ##Your shapefile goes here
NP.pts <- spsample(NPs, n = n, type = "regular")
CP.pts <- ## Coordinates for a single trap
distances<-distm(coordinates(CP.pts), coordinates(NP.pts), fun = distHaversine)/1000
##Distance in Km between the trap to each point in the perimeter of the shapefile:
distances
Use distances to find the minimum distance between the shapefile and that given trap. This approach can easily be generalized using for loops or apply functions.
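One way to generalize this to several traps is to pass all trap coordinates to distm at once and take the row-wise minimum. A minimal sketch, assuming the geosphere package is installed; the `outline` and `traps` coordinates are made up and stand in for coordinates(NP.pts) and the real trap locations.

```r
library(geosphere)

# Hypothetical outline points and trap locations as lon/lat matrices
outline <- cbind(lon = c(1, 2, 3),  lat = c(0, 0, 0))
traps   <- cbind(lon = c(0, 2.5),   lat = c(0, 0))

# distm returns one row per trap and one column per outline point (meters)
d_km <- distm(traps, outline, fun = distHaversine) / 1000

# Smallest distance per trap, and the 5 km criterion
min_km <- apply(d_km, 1, min)
within_5km <- min_km < 5

data.frame(trap = seq_len(nrow(traps)), min_km, within_5km)
```

The result is only as precise as the spacing of the outline points, so n should be large enough that neighbouring points are much closer together than the 5 km threshold.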
I had a problem with the points data frame and shape file being projected so instead I used the example in this link to answer my question
https://gis.stackexchange.com/questions/225102/calculate-distance-between-points-and-nearest-polygon-in-r
Basically, I used this code:
df2 # my data frame with points
shp # my shapefile (non-projected)
coordinates(df2)<-~Longitude+Latitude # Longitude and Latitude are columns in my df2
dist.mat <- geosphere::dist2Line(p = df2, line = shp) # distance (m) from each point to the nearest edge
dmat<-data.frame(dist.mat) # turned it into a data frame
dmat$km5 <- dmat$distance < 5000 # TRUE if within 5 km (distances are in meters)
coordinates(dmat)<-~lon+lat
df2$distance <- dmat$distance # added new Distance column to my df2

How to subset a raster based on grid cell values

My following question builds on the solution proposed by @jbaums on this post: Global Raster of geographic distances
For the purpose of reproducing the example, I have a raster dataset of distances to the nearest coastline:
library(rasterVis); library(raster); library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys.
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate distance to nearest non-NA pixel
d <- distance(r3) # if calculating distances on land instead of ocean: d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
The data looks like this:
From here, I need to subset the distance raster (d), or create a new raster, that only contains cells for which the distance to the coastline is less than 200 km. I have tried using getValues() to identify the cells for which the value is <= 200 (as shown below), but so far without success. Can anyone help? Am I on the right track?
#vector of desired cell numbers
my.pts <- which(getValues(d) <= 200)
# create raster the same size as d filled with NAs
bar <- raster(ncols=ncol(d), nrows=nrow(d), res=res(d))
bar[] <- NA
# replace the values with those in d
bar[my.pts] <- d[my.pts]
I think this is what you are looking for. You can treat a raster like a matrix here, right after your d <- d*r2 line:
d[d>=200000]<-NA
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
(in case you forgot: the unit is in meters so the threshold should be 200000, not 200)
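If the original d should be kept and a new raster created instead, the same thresholding can be applied to a copy. A small sketch on a toy raster, assuming the raster package is installed; the four distance values (in meters) are made up.

```r
library(raster)

# Toy 2 x 2 distance raster with hypothetical values in meters
d_toy <- raster(nrows = 2, ncols = 2,
                vals = c(50000, 150000, 200000, 350000))

# Work on a copy so the original distances are preserved
near_coast <- d_toy
near_coast[near_coast >= 200000] <- NA

values(near_coast)  # cells at 200 km or more are now NA
values(d_toy)       # unchanged
```

The index-based attempt in the question follows the same logic; it selects the wrong cells only because the threshold is given in km while the raster values are in meters.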

Resources