NA handling in raster terrain function - raster

In calculating the slope from an elevation raster using the terrain function of the raster package, there is a border effect where NAs are returned for cells having one or more NA neighbors.
library(raster)
elevation <- getData('alt', country='ITA')
x <- terrain(elevation, 'slope', neighbors = 8)
e <- elevation
e[!is.na(e)] <- 1
e[is.na(e)] <- 2
x[!is.na(x)] <- 1
x[is.na(x)] <- 2
y <- e-x
plot(y)
I'm looking for possible ways (or alternative functions/packages) for overriding this border effect and calculating the slope for all non-NA cells based on the number of available neighbors?
While I consider this effect appropriate where the border is an artifact of the raster extent (e.g. northern Italy disconnected from Austria, Switzerland...), in other cases the border is legitimate (e.g. coastal cells).
Passing na.rm = TRUE to terrain does not change the result.
Many thanks!

This would be an easy workaround:
First, download the unmasked elevation data (you can mask it later if needed):
elevation <- getData('alt', country='ITA', mask=FALSE)
Now you can assume that every NA elevation is sea/ocean surface and therefore has a value of 0.
So you can set your NAs to 0:
elevation[is.na(elevation)] <- 0
This should remove all border issues due to NA values.
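If zero-filling is not appropriate (for example at land borders cut off by the raster extent), another option might be to pad the NA cells with the mean of whatever neighbours they have, compute the slope, and then mask back to the original cells. The following is only a rough sketch on a made-up 5 x 5 raster, not a method taken from the raster documentation. The padded elevations are an assumption, so slopes in border cells are approximations.

```r
library(raster)

# Toy elevation raster (hypothetical values; default lon/lat CRS)
elev <- raster(nrows = 5, ncols = 5, xmn = 0, xmx = 5, ymn = 0, ymx = 5)
values(elev) <- (1:25) * 10

# Plain terrain(): cells without a full set of neighbours come back as NA
slope_raw <- terrain(elev, opt = "slope", neighbors = 8)

# 1. Add a one-cell ring of NA cells around the raster
elev_ext <- extend(elev, 1)

# 2. Fill NA cells only, using the mean of the neighbours that exist
elev_fill <- focal(elev_ext, w = matrix(1, 3, 3), fun = mean,
                   na.rm = TRUE, pad = TRUE, NAonly = TRUE)

# 3. Compute slope on the padded surface, then cut back to the original cells
slope_ext <- terrain(elev_fill, opt = "slope", neighbors = 8)
slope_all <- mask(crop(slope_ext, elev), elev)
```

The final mask() keeps cells that were NA in the input (e.g. sea) as NA in the output, so only the real cells receive a slope value.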

Related

Find distance by sea from coastal point A to coastal point B

I am trying to adapt the solution to this question for a slightly different purpose.
During my search for a solution, I found J. Win's nice solution here for finding a path via sea from A to B.
So I tried to adapt the code to find distances as well but did not get the expected output.
Problem 1: Scaling issue or incorrect use of the function? E.g. the coordinates go from the north to the south of Spain. The expected distance is about 1600 km, but the function outputs 66349.24.
Problem 2: The application requires routing around small islands. Is there another data model that would simply slot into this when required?
library(raster)
library(gdistance)
library(maptools)
library(rgdal)
library(maps)
# make a raster of the world, low resolution for simplicity
# with all land having "NA" value
# use your own shapefile or raster if you have it
# the wrld_simpl data set is from maptools package
data(wrld_simpl)
# a typical world projection
world_crs <- crs(wrld_simpl)
world <- wrld_simpl
worldshp <- spTransform(world, world_crs)
ras <- raster(nrow=300, ncol=300)
# rasterize will set ocean to NA so we just inverse it
# and set water to "1"
# land is equal to zero because it is "NOT" NA
worldmask <- rasterize(worldshp, ras)
worldras <- is.na(worldmask)
# originally I set land to "NA"
# but that would make some ports impossible to visit
# so at 999 land (ie everything that was zero)
# becomes very expensive to cross but not "impossible"
worldras[worldras==0] <- 999
# each cell in worldras now has a value of 1 (water) or 999 (land), nothing else
# create a Transition object from the raster
# this calculation took a bit of time
tr <- transition(worldras, function(x) 1/mean(x), 16)
tr = geoCorrection(tr, scl=FALSE)
# distance matrix excluding the land, must be calculated
# from a point of origin, specified in the CRS of your raster
# let's start with latlong in Black Sea as a challenging route
port_origin <- structure(c(44.206963342648436, -4.350866822197724), .Dim = 1:2)
port_origin <- project(port_origin, crs(world_crs, asText = TRUE))
points(port_origin)
# function accCost uses the transition object and point of origin
A <- accCost(tr, port_origin)
# now A still shows the expensive travel over land
# so we mask it out to display sea travel only
A <- mask(A, worldmask, inverse=TRUE)
# and calculation of a travel path, let's say to South Africa
port_destination <- structure(c(43.83115853071615, -5.134767965894603), .Dim = 1:2)
port_destination <- project(port_destination, crs(world_crs, asText = TRUE))
path <- shortestPath(tr, port_origin, port_destination, output = "SpatialLines")
t_path <-shortestPath(tr, port_origin, port_destination)
distance <-costDistance(tr, port_origin, port_destination)
# make a demonstration plot
plot(A)
points(rbind(port_origin, port_destination))
lines(path)
# we can wrap all this in a customized function
# if you have two points in the projected coordinate system,
# and a raster whose cells are weighted
# according to ease of shipping
RouteShip <- function(from_port, to_port, cost_raster, land_mask) {
tr <- transition(cost_raster, function(x) 1/mean(x), 16)
tr = geoCorrection(tr, scl=FALSE)
A <- accCost(tr, from_port)
A <- mask(A, land_mask, inverse=TRUE)
path <- shortestPath(tr, from_port, to_port, output = "SpatialLines")
plot(A)
points(rbind(from_port, to_port))
lines(path)
}
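One way to sanity-check the expected magnitude is to compute the straight-line great-circle distance between the two ports, which any sea route must equal or exceed. This is a minimal sketch, assuming the two coordinate pairs in the question are written as (latitude, longitude) and so need swapping into (x = longitude, y = latitude) order. pointDistance with lonlat = TRUE returns metres.

```r
library(raster)

# Assumption: the question's pairs are (lat, lon); swap them to (lon, lat)
from_port <- cbind(-4.350866822197724, 44.206963342648436)
to_port   <- cbind(-5.134767965894603, 43.83115853071615)

# Great-circle distance in metres, converted to kilometres
d_km <- pointDistance(from_port, to_port, lonlat = TRUE) / 1000
d_km
```

If the swapped order is what was intended, it may also be worth checking the order passed to project(), since a 1 x 2 matrix is read there as (x, y).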

How to subset a raster based on grid cell values

My question builds on the solution proposed by @jbaums in this post: Global Raster of geographic distances
For the purpose of reproducing the example, I have a raster dataset of distances to the nearest coastline:
library(rasterVis); library(raster); library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys.
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate distance to nearest non-NA pixel
d <- distance(r3) # if calculating distances on land instead of ocean: d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
The data looks like this:
From here, I need to subset the distance raster (d), or create a new raster, that only contains cells for which the distance to the coastline is less than 200 km. I have tried using getValues() to identify the cells whose value is <= 200 (as shown below), but so far without success. Can anyone help? Am I on the right track?
#vector of desired cell numbers
my.pts <- which(getValues(d) <= 200)
# create raster the same size as d filled with NAs
bar <- raster(ncols=ncol(d), nrows=nrow(d), res=res(d))
bar[] <- NA
# replace the values with those in d
bar[my.pts] <- d[my.pts]
I think this is what you are looking for. You can treat a raster like a matrix here, right after your d <- d*r2 line:
d[d >= 200000] <- NA
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
(in case you forgot: the unit is in meters so the threshold should be 200000, not 200)
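If you would rather keep d intact and put the subset in a new raster, the same indexing works on a copy. Here is a small sketch on a hypothetical 3 x 3 raster of distances in metres (the values are made up for illustration):

```r
library(raster)

# Hypothetical distances to coast, in metres
d_toy <- raster(nrows = 3, ncols = 3, xmn = 0, xmx = 3, ymn = 0, ymx = 3)
values(d_toy) <- c(50, 150, 250, 100, 200, 300, 10, 500, 199) * 1000

# Copy first, then blank out everything at or beyond 200 km
near_coast <- d_toy
near_coast[near_coast >= 200000] <- NA

# The copy keeps the extent and resolution of the original
compareRaster(d_toy, near_coast)
```

Copying the raster also avoids having to rebuild the extent by hand, which the raster(ncols=..., nrows=..., res=...) call in the question does not carry over.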

Crop, change values, and merge rasters with overlapping extent

I am trying to take a raster of soils data for one state, crop it by county, change the cell values in each county (to the county fips code), and then re-merge the county rasters back into the state raster.
Here I read in the state soils raster (which by default has the map unit key associated with each soil type as the cell value) and a polygon layer of US counties. I then select the polygons for just one state, transform them to the same coordinate system as the soils raster, and then select the soils rasters and polygons for two example counties.
state_soils_raster <- raster("MapunitRaster_IL_10m.tif")
us_county_polygons <- readOGR("cb_2016_us_county_500k/cb_2016_us_county_500k.shp")
IL_county_polygons <- us_county_polygons[us_county_polygons$STATEFP == 17,]
IL_county_polygons <- spTransform(IL_county_polygons, CRS = crs(state_soils_raster))
county1 <- "Douglas"
county2 <- "Coles"
county1_polygon <- IL_county_polygons[IL_county_polygons$NAME %in% county1,]
county2_polygon <- IL_county_polygons[IL_county_polygons$NAME %in% county2,]
county1_raster <- crop(state_soils_raster, county1_polygon)
county2_raster <- crop(state_soils_raster, county2_polygon)
If I plot each county by itself, you can see that the extent of the cropped region is rectangular and extends beyond the area of the county itself. The coloring is crazy because the mukey values are all over the place (although typically grouped by county). County1 lies just to the north of County2.
plot(county1_raster)
plot(county1_polygon, add = T)
plot(county2_raster)
plot(county2_polygon, add = T)
If I leave the values as is and merge the two county rasters back together, everything is fine. Even though the extents of the two rasters do overlap, the cell values are identical regardless of which raster merge is pulling from. I'm not actually sure which raster merge pulls from in this case, but it doesn't really matter. Everything fits back together nicely and the cell values are correct.
both_counties_raster <- merge(county1_raster, county2_raster)
plot(both_counties_raster)
plot(county1_polygon, add = T)
plot(county2_polygon, add = T)
However, what I want to do is to change the cell values by county prior to recombining the county rasters.
values(county1_raster) <- 1
values(county2_raster) <- 2
both_counties_raster_new <- merge(county1_raster, county2_raster)
Everything merges just fine, but when I now plot the new combined raster it is clear that, for cells that were contained in both county rasters, merge just took the cell values from one of the rasters. Clearly merge prioritizes the first input raster by default.
plot(both_counties_raster_new)
plot(county1_polygon, add = T)
plot(county2_polygon, add = T)
What I'm looking for is to just change the cell values within the boundaries of each county and then merge all the counties back together again.
I am aware of the raster::mask function, which can turn anything outside the county boundary to NA, but with a 10m cell resolution (described here) this takes an insane amount of time!
I have also tried an alternative approach using the raster::rasterize function to turn the county boundary polygons into a raster with the same cell size and extent of the state soils raster. Again, with a 10m cell resolution this takes forever. I was able to process one county on each of my 8 cores in 1.5 hours. And I've got a whole country to do!
I am not aware of any 10m raster US county dataset, although that would be amazing if someone pointed me to that.
The soils data is gSSURGO data. I'm also not aware whether gSSURGO has a county attribute within its many tables. If it's there, I can't find it. That would also be an easy solution.
It may not be quicker, but have you tried raster::cellFromPolygon?
Here is a simple example:
# Create a raster with zero values
r <- raster(ncols=30, nrows=30, res = 1/3)
values(r) <- 0
# Create polygons
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
pols <- SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1), Polygons(list(Polygon(cds2)), 2)))
plot(r)
plot(pols, add = TRUE)
r2 <- r
# Find which cells are in which polygons
cellpol <- cellFromPolygon(r, pols)
# Assign each polygon's index to the cells it covers
for (i in seq_along(cellpol)) if (length(cellpol[[i]]) > 0) r2[cellpol[[i]]] <- i
plot(r2)
plot(pols, add = TRUE)
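To see why the overlapping rectangles clash in merge, and why cells outside each county need to be NA before merging, here is a small sketch with two made-up overlapping rasters. The "county" boundary (columns 1 and 2 of the first raster) is purely hypothetical.

```r
library(raster)

# Two rasters that overlap between x = 2 and x = 4
a <- raster(xmn = 0, xmx = 4, ymn = 0, ymx = 2, res = 1)
b <- raster(xmn = 2, xmx = 6, ymn = 0, ymx = 2, res = 1)
values(a) <- 1
values(b) <- 2

# Plain merge: the first raster wins wherever both have values
m_plain <- merge(a, b)

# Blank out the part of 'a' that lies outside its hypothetical county
a_masked <- a
a_masked[cellFromCol(a_masked, 3:4)] <- NA

# Now the NA cells of 'a' are filled from 'b'
m_masked <- merge(a_masked, b)
```

The same idea applies with cellFromPolygon: setting only the cells returned for a county and leaving the rest of the cropped rectangle NA lets merge fill the gaps from the neighbouring county.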

R - find point farthest from set of points on rasterized USA map

New to spatial analysis on R here. I have a shapefile for the USA that I downloaded from HERE. I also have a set of lat/long points (half a million) that lie within the contiguous USA.
I'd like to find the "most remote spot" -- the spot within the contiguous USA that's farthest from the set of points.
I'm using the rgdal, raster and sp packages. Here's a reproducible example with a random sample of 10 points:
# Set wd to the folder tl_2010_us_state_10
usa <- readOGR(dsn = ".", layer = "tl_2010_us_state10")
# Sample 10 points in USA
sample <- spsample(usa, 10, type = "random")
# Set extent for contiguous united states
ext <- extent(-124.848974, -66.885444, 24.396308, 49.384358)
# Rasterize USA
r <- raster(ext, nrow = 500, ncol = 500)
rr <- rasterize(usa, r)
# Find distance from sample points to cells of USA raster
D <- distanceFromPoints(object = rr, xy = sample)
# Plot distances and points
plot(D)
points(sample)
After the last two lines of code, I get this plot.
However, I'd like it to be over the rasterized map of the USA. And, I'd like it to only consider distances from cells that are in the contiguous USA, not all cells in the bounding box. How do I go about doing this?
I'd also appreciate any other tips regarding the shape file I'm using -- is it the best one? Should I be worried about using the right projection, since my actual dataset is lat/long? Will distanceFromPoints be able to efficiently process such a large dataset, or is there a better function?
To limit raster D to the contiguous USA you could find the elements of rr assigned values of NA (i.e. raster cells within the bounding box but outside of the usa polygons), and assign these same elements of D a value of NA.
D[which(is.na(rr[]))] <- NA
plot(D)
lines(usa)
You can use 'proj4string(usa)' to find the projection info for the usa shapefile. If your coordinates of interest are based on a different projection, you can transform them to match the usa shapefile projection as follows:
my_coords_xform <- spTransform(my_coords, CRS(proj4string(usa)))
Not sure about the relative efficiency of distanceFromPoints, but it only took ~ 1 sec to run on my computer using your example with 10 points.
I think you were looking for the mask function.
library(raster)
usa <- getData('GADM', country='USA', level=1)
# exclude Alaska and Hawaii
usa <- usa[!usa$NAME_1 %in% c( "Alaska" , "Hawaii"), ]
# get the extent and create raster with preferred resolution
r <- raster(floor(extent(usa)), res=1)
# rasterize polygons
rr <- rasterize(usa, r)
set.seed(89)
sample <- spsample(usa, 10, type = "random")
# Find distance from sample points to cells of USA raster
D <- distanceFromPoints(object = rr, xy = sample)
# remove areas outside of polygons
Dm <- mask(D, rr)
# an alternative would be mask(D, usa)
# cell with highest value
mxd <- which.max(Dm)
# coordinates of that cell
pt <- xyFromCell(r, mxd)
plot(Dm)
points(pt)
The distances should be fine, also when using long/lat data. But distanceFromPoints could indeed be a bit slow with a large data set, as it uses a brute-force algorithm.
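For anyone who wants to try the distance, mask and which.max steps without downloading GADM data, here is a self-contained sketch on a toy grid. The "land" mask and the two points are made up for illustration.

```r
library(raster)

# Toy 10 x 10 grid (default lon/lat CRS); columns 8 to 10 are "sea"
land <- raster(xmn = 0, xmx = 10, ymn = 0, ymx = 10, res = 1)
values(land) <- 1
land[cellFromCol(land, 8:10)] <- NA

# Two hypothetical points in the upper-left corner
pts <- cbind(x = c(1.5, 2.5), y = c(8.5, 7.5))

# Distance from every cell to its nearest point, then drop the sea cells
D  <- distanceFromPoints(land, pts)
Dm <- mask(D, land)

# Cell with the largest remaining distance, and its coordinates
far_cell <- which.max(Dm)
far_xy   <- xyFromCell(Dm, far_cell)
```

Without the mask step, the farthest cell would fall in the sea columns, which is the bounding-box problem described in the question.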

Gap fill for a raster using another in R

I have two rasters for the same day but with different swaths. I want to combine them but I am conscious that retrieval algorithms may be different. Both rasters are of the same dimension. What's the easiest way to do this in R please? I will be running this on a list.
library(raster)
A <- raster(nrows=108, ncols=21, xmn=0, xmx=10)
A[] <- 1:ncell(A)
xy <- matrix(rnorm(ncell(A)),108,21)
B<- raster(xy)
## Induce NAs in raster B:
B[sample(1:ncell(B), 1000)] <- NA
## Confirm we have 1000 NAs:
sum(is.na(B[]))
If there were NA pixels in raster B that had values in the other raster A, how do I fill the raster B based on the correlation between points with values in both rasters A and B, please?
As long as your rasters are of equal dimension and spatially coincident:
## Create indices for pixels that are NA in B and not NA in A:
indices <- is.na(B)[] & !is.na(A)[]
B[indices] <- A[indices]
If they are not already of the same dimension and spatially coincident then use the resample() function first to match the rasters to the desired dimensions and extent (projecting first if necessary using projectRaster()).
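If you want the fill to reflect the relationship between the two rasters rather than copy A directly, one option might be to fit a simple linear model on the pixels where both have values and predict B in the gaps. This is only a sketch under the assumption that a linear relationship is adequate. The toy data below is constructed so that B is exactly 2 * A + 1.

```r
library(raster)
set.seed(1)

# Toy rasters on the same grid; B is a linear function of A with gaps
A <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(A) <- runif(ncell(A))
B <- A * 2 + 1
B[sample(ncell(B), 20)] <- NA

# Fit B ~ A on pixels where both rasters have values
both <- !is.na(values(A)) & !is.na(values(B))
fit <- lm(b ~ a, data = data.frame(a = values(A)[both], b = values(B)[both]))

# Predict B where it is missing but A is available
gaps <- which(is.na(values(B)) & !is.na(values(A)))
B_filled <- B
B_filled[gaps] <- predict(fit, newdata = data.frame(a = values(A)[gaps]))
```

With real swath data the fit will not be exact, so it is worth inspecting summary(fit) before trusting the filled values.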

Resources