Finding minimum distance between two raster layer pixels in R

I have two thematic raster layers, r1 and r2, covering the same area. Both follow the same classification scheme with 16 classes. For each cell of r1, I need the minimum distance to a cell of r2 that has the same value. E.g. the nth cell in r1 has value 10 and coordinates x1,y1, and r2 has 2 cells with value 10, at coordinates x1+2,y1+2 and x1-0.5,y1-0.5. The value I need for this cell would then be 0.5,0.5.
I tried distance from the raster package, but it gives, for every cell that is NA, the distance to the nearest cell that is not NA. I am not sure how to bring the second raster layer into this.

You can use knn from the class package to find, for each cell of r1, the index of the nearest cell of r2 with the same category:
library(class)
library(raster)
#example of two rasters
r1 <- raster(ncol = 600, nrow = 300)
r2 <- raster(ncol = 600, nrow = 300)
#fill each with categories that range from 1 to 16
r1[] <- sample(1:16, ncell(r1), T)
r2[] <- sample(1:16, ncell(r2), T)
# coordinates of cells extracted
xy = xyFromCell(r1, 1:ncell(r1))
#multiply the raster values by a relatively large number so that cells of the
#same category are always closer to each other than to cells of other categories.
v1 = values(r1) * 1000000
v2 = values(r2) * 1000000
# knn returns, as a factor, the index of the nearest r2 cell for each r1 cell
out = knn(cbind(v2, xy) ,cbind(v1, xy) ,1:ncell(r1), k=1)
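The factor returned by knn can be converted back into cell indices and from there into distances. The following is a minimal, self-contained sketch on a small synthetic grid rather than on the rasters above; the grid size, the four classes, and the names idx and nearest_dist are assumptions made for illustration.

```r
library(class)  # recommended package that ships with R

set.seed(42)
n  <- 20
xy <- as.matrix(expand.grid(x = 1:n, y = 1:n))  # cell-centre coordinates
v1 <- sample(1:4, n * n, replace = TRUE)        # classes of the first layer
v2 <- sample(1:4, n * n, replace = TRUE)        # classes of the second layer

# same trick as above: inflate the class value so that cells of another
# class are always farther away than any cell of the same class
out <- knn(cbind(v2 * 1e6, xy), cbind(v1 * 1e6, xy), cl = seq_len(n * n), k = 1)

# factor labels back to integer cell indices into the second layer
idx <- as.integer(as.character(out))

# Euclidean distance from every cell to its nearest same-class cell
nearest_dist <- sqrt(rowSums((xy - xy[idx, , drop = FALSE])^2))
```

A distance of 0 means the second layer has the same class at that very cell. The trick relies on every class of the first layer also occurring in the second one.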

Use rasterToPoints to extract a SpatialPoints object for each unique thematic class, then use the sp::spDists function to find the distances between the points.
library(raster)
r1 <- raster( nrow=10,ncol=10)
r2 <- raster( nrow=10,ncol=10)
set.seed(1)
r1[] <- ceiling(runif(100,0,10))
r2[] <- ceiling(runif(100,0,10))
classes <- sort(unique(values(r1)))
dist.class <- numeric(length(classes))
for (k in seq_along(classes)) {
  i <- classes[k]
  # cells of class i in each raster, as SpatialPoints
  p1 <- rasterToPoints(r1, fun=function(xx) xx==i, spatial=TRUE)
  p2 <- rasterToPoints(r2, fun=function(xx) xx==i, spatial=TRUE)
  # smallest distance between any class-i cell of r1 and any class-i cell of r2
  dist.class[k] <- min(spDists(p1, p2))
}
cbind(class = classes, dist.class)
The loop may not be efficient for you. If that is a problem, wrap the body in a function and lapply it over the classes. Also be careful with your classes: the results are indexed by position rather than by class value, so the classes do not have to be 1:10. If your projection is in degrees, you will probably need the geosphere package to get accurate results, but I think the best option in that case is to use a projection in meters.

A memory-safe approach using the raster package would be to use the layerize() function to split your raster values into a stack of binary rasters (16 in your case), then use the distance() function to compute distances in the layers of r2 and mask them with the respective layers of r1. Something like this:
layers1 <- layerize(r1, falseNA=TRUE)
layers2 <- layerize(r2, falseNA=TRUE)
# now you can loop over the layers (use foreach loop if you want
# to speed things up using parallel processing)
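dist.stack <- layers1
# assumes both rasters contain the same classes, so the layers line up
for (i in 1:nlayers(layers1)) {
  # distance from every cell to the nearest cell of class i in r2
  dist.i <- distance(layers2[[i]])
  # keep those distances only where r1 has class i
  dist.mask.i <- mask(dist.i, layers1[[i]])
  dist.stack[[i]] <- dist.mask.i
}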
# if you want pairwise distances for all classes in one layer, simply
# combine them using sum()
dist.combine <- sum(dist.stack, na.rm=TRUE)

Related

Line density function in R equivalent to Line density tool in ArcMap (arcpy)

I need to calculate the magnitude-per-unit area of polylines that fall within a radius around each cell. Essentially I need to calculate a km/km2 road density within a 500m pixel search radius. ArcMap has a quick and easy tool that handles this, but I need a pure R solution.
Here is a link on how line density works: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-line-density-works.htm
And this is how to use it in a python (arcpy) script: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/line-density.htm
I currently execute a backwards approach using raster::focal function, calculating a density of burned in road features. I then convert the km2/km2 output to km/km2.
#Import libraries
library(raster)
library(rgdal)
library(gdalUtils)
#Read-in an already created raster mask (cells are all set to 0)
mask <- raster("x://path to raster mask...")
#Make a copy of the mask file to burn features in, keeping the original untouched
#(file.copy needs file paths, not a RasterLayer, and returns TRUE on success)
roads_mask <- file.copy(filename(mask), "x://output path ...//roads.tif")
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Rasterize spatial lines data frame ie. burn road features into mask
#Where road features get a value of 1, mask extent gets a value of 0
roads_raster <- gdalUtils::gdal_rasterize(src_datasource = roads_sldf,
dst_filename = "x://output path ...//roads.tif", b = 1,
burn = 1, l = "roads", output_Raster = TRUE)
#Run a 1km circular radius density function (be mindful of edge effects)
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
#R object names cannot start with a digit, hence rdDensity_1km
rdDensity_1km <- raster::focal(roads_raster, weight, fun=sum, filename = '',
na.rm=TRUE, pad=TRUE, NAonly=FALSE, overwrite=TRUE)
#Convert km2/km2 road density to km/km2
#Count how many records in each column of the moving window are > 0
columnCount <- apply(weight,2,function(x) sum(x > 0))
#Get the sum of the column count
number_of_cells <- sum(columnCount)
#multiply km2/km2 density by number of cells in the moving window
step1 <- rdDensity_1km * number_of_cells
#Rescale step1 output with respect to cell size(30m) and radius of a circle
final_rdDensity <- (step1*0.03)/pi
#Write out final km/km2 road density raster
writeRaster(final_rdDensity,"X://path to output...", datatype = 'FLT4S', overwrite = TRUE)
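The conversion above comes down to a few lines of arithmetic. Here is a worked sketch with made-up numbers (n_cells and frac_road are hypothetical values, not results from the data), assuming that each road cell contributes roughly one cell edge of road length:

```r
cell_km   <- 0.03   # 30 m cell edge, in km
radius_km <- 1      # search radius, in km
n_cells   <- 3500   # hypothetical number of cells inside the circular window
frac_road <- 0.02   # hypothetical focal output: share of window cells that are road

road_km  <- frac_road * n_cells * cell_km  # approximate road length in the window
area_km2 <- pi * radius_km^2               # area of the circular window
density  <- road_km / area_km2             # km of road per km2
```

The cell-edge assumption underestimates diagonal road segments, so treat the result as an approximation of what a true line density tool would report.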
After some more research I think I may be able to use a kernel function; however, I don't want to apply the smoothing algorithm. Also, the output is an 'im' object, which I would need to write out as a 'tif'.
#Import libraries
library(spatstat)
library(rgdal)
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Convert roads spatial lines data frame to psp object
psp_roads <- as.psp(roads_sldf)
#Apply kernel density, however this is where I am unsure of the arguments
road_density <- spatstat::density.psp(psp_roads, sigma = 0.01, eps = 500)
Cheers.
See this question: https://gis.stackexchange.com/questions/138861/calculating-road-density-in-r-using-kernel-density
I tried to mark this as a duplicate, but that doesn't work because the other question is on GIS Stack Exchange.
The short answer is to use spatstat.geom::pixellate().
I also needed spatstat.geom::as.psp(sf::st_geometry(x)) to convert an sf lines object to the correct format and maptools::as.im.RasterLayer(r) to convert a raster. I was able to convert the result to RasterLayer with raster::raster(pix_res)
Perhaps you can use terra::rasterizeGeom which is available in the development version that you can install with install.packages('terra', repos='https://rspatial.r-universe.dev')
Example data
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f) |> as.lines()
r <- rast(v, res=.1)
Solution
x <- rasterizeGeom(v, r, fun="length", "km")
And then use focal sum, but you would not have a perfect circle.
What you could do instead, if your dataset is not too large, is create a circle for each grid cell and use intersect. Something like this:
p <- xyFromCell(r, 1:ncell(r)) |> vect(crs="+proj=longlat")
p$id <- 1:ncell(r)
b <- buffer(p, 10000)
values(v) <- NULL
i <- intersect(v, b)
x <- aggregate(perim(i), list(id=i$id), sum)
r[x$id] <- x[,2]

Calculating the distance between points in R

I looked through the questions that have been asked about dealing with coordinates but couldn't find anything that helps with my problem.
I have a dataset that contains ID, Speed, Time, and a list of Latitude & Longitude values (the dataset can be found at the link below).
https://drive.google.com/file/d/1MJUvM5WEhua7Rt0lufCyugBdGSKaHMGZ/view?usp=sharing
I want to measure the distance between each point of Latitude & Longitude.
For example;
Latitude has: x1 ,x2 ,x3 ,...x1000
Longitude has: y1 ,y2 ,y3 ,..., y100
I want to measure the distance between (x1,y1) to all the points , and (x2,y2) to all the points, and so on.
The reason I'm doing this is to find out which points are close to each other and to assign an index to each location based on distance.
If (x1, y1) is close to (x4,y4), then (x1, y1) gets index A, for example, and (x4,y4) gets labeled B; that is, the points are sorted in order of distance.
I tried the gDistance function, but it showed the error message: "package ‘gDistance’ is not available (for R version 3.4.3)"
and if I change the version to 3.3, library(rgeos) won't work!
Any suggestions?
Here's what I tried:
#requiring necessary packages:
library(sp) # vector data
library(rgeos) # geometry ops
#Read the data and transform them to spatial objects
d <- read.csv("ReadyData.csv")
sp.ReadyData <- d
coordinates(sp.ReadyData) <- ~Longitude + Latitude
d <- gDistance(sp.ReadyData, byid= TRUE)
Here's an update: I created a spatial object and made a spatial data frame as follows:
#Create spatial object:
lonlat <- cbind(spatial$Longitude, spatial$Latitude)
#Create a SpatialPoints object:
library(sp)
pts <- SpatialPoints(lonlat)
crdref <- CRS('+proj=longlat +datum=WGS84')
pts <- SpatialPoints(lonlat, proj4string=crdref)
# make spatial data frame
ptsdf <- SpatialPointsDataFrame(pts, data=spatial)
Now I'm trying to measure the distance for longitude/latitude coordinates. I tried the dist method, but it does not seem to work for me, so I tried the pointDistance method:
gdis <- pointDistance(pts, lonlat=TRUE)
It is still not clear to me how this function measures the distance. I need to work out the distances so I can locate the point in the middle and number each point based on its position relative to the middle point.
You can use raster::pointDistance or geosphere::distm, among other functions.
Part of your example data (please avoid files in your questions):
d <- read.table(sep=",", text='
"OBU ID","Time Received","Speed","Latitude","Longitude"
"1",20,1479171686325,0,38.929596,-77.2478813
"2",20,1479171686341,0,38.929596,-77.2478813
"3",20,1479171698485,1.5,38.9295887,-77.2478945
"4",20,1479171704373,1,38.9295048,-77.247922
"5",20,1479171710373,0,38.9294865,-77.2479055
"6",20,1479171710373,0,38.9294865,-77.2479055
"7",20,1479171710373,0,38.9294865,-77.2479055
"8",20,1479171716373,2,38.9294773,-77.2478712
"9",20,1479171716374,2,38.9294773,-77.2478712
"10",20,1479171722373,1.32,38.9294773,-77.2477417')
Solution:
library(raster)
m <- pointDistance(d[, c("Longitude", "Latitude")], lonlat=TRUE)
To get the nearest point to each point, you can do
mm <- as.matrix(as.dist(m))
diag(mm) <- NA
i <- apply(mm, 1, which.min)
The point pairs
p <- cbind(1:nrow(mm), i)
To get the distances, you can do:
mm[p]
Or do this:
apply(mm, 1, min, na.rm=TRUE)
Note that rgeos::gDistance is for planar data, not for longitude/latitude data.
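To see why planar distance on raw degrees is misleading, here is a minimal base-R sketch of the haversine great-circle formula. The function name haversine_m and the mean earth radius of 6371000 m are assumptions for illustration; packages such as geosphere may use a different radius or an ellipsoidal method by default.

```r
# Great-circle (haversine) distance in metres between lon/lat points in degrees.
# earth_radius is an assumed mean radius of the earth.
haversine_m <- function(lon1, lat1, lon2, lat2, earth_radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * earth_radius * asin(pmin(1, sqrt(a)))
}

# one degree of longitude at the equator versus at 60 degrees north
d_equator <- haversine_m(0, 0, 1, 0)
d_60north <- haversine_m(0, 60, 1, 60)
# a planar calculation on the raw degrees would treat both as a distance of 1
```

The second distance is about half the first, which is exactly the kind of difference a planar function applied to longitude/latitude values cannot see.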
Here is a similar question/answer with some illustration.
Your data set is too large to make a single distance matrix. You can process your data in chunks to deal with that. Here I am showing that with a rather small chunk size of 4 rows. Make this number much bigger to speed up processing.
library(geosphere)
chunk <- 4 # rows
start <- seq(1, nrow(d), chunk)
# each chunk ends one row before the next one starts, so chunks do not overlap
end <- c(start[-1] - 1, nrow(d))
x <- d[, c("Longitude", "Latitude")]
r <- list()
for (i in 1:length(start)) {
  y <- x[start[i]:end[i], , drop=FALSE]
  m <- distm(y, x)
  # set each point's distance to itself to NA
  m[cbind(1:nrow(m), start[i]:end[i])] <- NA
  r[[i]] <- apply(m, 1, which.min)
}
r <- unlist(r)
r
# [1] 2 1 1 5 6 5 5 9 8 8
So for your data:
d <- read.csv("ReadyData.csv")
chunk <- 100 # rows
# etc
This will take a long time.
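The chunking idea can be checked on a small case against the full distance matrix. The sketch below uses made-up planar coordinates and plain Euclidean distance as a stand-in for distm; the names pts, rows, and nearest are assumptions for illustration.

```r
set.seed(1)
pts <- cbind(x = runif(10), y = runif(10))  # made-up planar coordinates

chunk <- 4
start <- seq(1, nrow(pts), by = chunk)
end   <- pmin(start + chunk - 1, nrow(pts))  # non-overlapping chunk ends

nearest <- integer(0)
for (i in seq_along(start)) {
  rows <- start[i]:end[i]
  # distances from this chunk's points to all points (length(rows) x n matrix)
  m <- sqrt(outer(pts[rows, "x"], pts[, "x"], "-")^2 +
            outer(pts[rows, "y"], pts[, "y"], "-")^2)
  m[cbind(seq_along(rows), rows)] <- NA  # blank out each point's distance to itself
  nearest <- c(nearest, apply(m, 1, which.min))
}
nearest <- unname(nearest)
```

Only a chunk-by-n block of distances is held in memory at any time, which is the point of the approach; a larger chunk means fewer loop iterations at the cost of a larger block.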
An alternative approach:
library(spdep)
x <- as.matrix(d[, c("Longitude", "Latitude")])
k <- as.vector(knearneigh(x, k=1, longlat=TRUE)$nn)
Assuming you have p1 as spatialpoints of x and p2 as spatialpoints of y, to get the index of the nearest other point:
ReadyData$cloDist <- apply(gDistance(p1, p2, byid=TRUE), 1, which.min)
If the same coordinate appears in both lists, you will get the index of the point itself, since the closest place to a point is the point itself. An easy trick to avoid that is to use the second-smallest distance as the reference, with a quick function:
f_which.min <- function(vec, idx) sort(vec, index.return = TRUE)$ix[idx]
ReadyData$cloDist2 <- apply(gDistance(p1, p2, byid=TRUE), 1, f_which.min,
idx = 2)

Raster in R: Create Zonal Count of specific cell values without reclassification

I would like to know if there is way to create zonal statistics for RasterLayerObjects, specifically the count of a given cell value (e.g. a land-use class) in R without having to reclassify the whole raster. The solution should be memory efficient in order to work on large raster files i.e. no extraction of the values into a matrix in R is desired.
Below an example of how I handle it until now. In this case I reclassify the original raster to hold only 1 for the value of interest and missings for all other values.
My proposed solution creates both, redundant data and additional processing steps to get me to my initial goal. I thought something like zonal(r1[r1==6],r2,"count") would work but obviously it does not (see below).
# generate reproducible Raster
library("raster")
## RASTER 1 (e.g. land-use classes)
r1 <- raster( crs="+proj=utm +zone=31")
extent(r1) <- extent(0, 100, 0, 100)
res(r1) <- c(5, 5)
values(r1) <- sample(10, ncell(r1), replace=TRUE)
plot(r1)
## RASTER 2 (containing zones of interest)
r2 <- raster( crs="+proj=utm +zone=31")
extent(r2) <- extent(0, 100, 0, 100)
res(r2) <- c(5, 5)
values(r2) <- c(rep(1,100),rep(2,100),rep(3,100),rep(4,100))
plot(r2)
# (1) ZONAL STATISTICS
# a. how many cells per zone (independent of specific cell value)
zonal(r1,r2,"count")
# b. how many cells per zone of specific value 6
zonal(r1[r1==6],r2,"count")
# -> fails
# with reclassification
r1.reclass<-
reclassify(r1,
matrix(c(1,5,NA,
5.5,6.5,1, #class of interest
6.5,10,NA),
ncol=3,
byrow = T),
include.lowest=T # include the lowest value from the table.
)
zonal(r1.reclass,r2,"count")
You can use raster::match.
zonal(match(r1, 6),r2, "count")
As you can see from plot(match(r1, 6)), it only returns raster cells which hold the desired value(s). All other cells are NA.
r1[r1==6], as used in your attempt, unfortunately returns a plain vector rather than a raster and therefore can no longer be used in zonal.
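For intuition, the quantity asked for is a per-zone count of the cells that hold the value of interest. On plain vectors it can be sketched as follows; this is only an illustration with made-up stand-ins for the raster values (landuse, zones, and count6 are hypothetical names), not a memory-efficient replacement for zonal on large files.

```r
set.seed(1)
landuse <- sample(10, 400, replace = TRUE)  # stand-in for the values of r1
zones   <- rep(1:4, each = 100)             # stand-in for the values of r2

# number of cells equal to 6 within each zone
count6 <- tapply(landuse == 6, zones, sum)
```

The counts over all zones add up to the total number of cells with value 6, which is a quick sanity check for any zonal count.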

Retrieving output raster comparing two raster layers in R

I have two raster layers of dimension (7801, 7651). I want to compare each pixel of one raster layer with the other and create a new raster that holds the minimum pixel value of the two input rasters. That is, if the i,j pixel of raster 1 has value 25 and the same i,j pixel of raster 2 has value 20, then the i,j pixel of the output raster should be 20.
You can just use min with two raster layers.
Let's start with a reproducible example:
library(raster)
r1 <- raster(ncol = 5, nrow = 5)
r1[] <- 1:ncell(r1)
plot(r1)
r2 <- raster(ncol = 5, nrow = 5)
r2[] <- ncell(r2):1
par(mfrow = c(1,3))
plot(r1)
plot(r2)
Now we calculate the min of each overlapping cell within the two raster layers very easily with the implemented cell statistics:
r3 <- min(r2, r1)
plot(r3)
Furthermore, you can also apply statistics like mean, max, etc.
If the implemented statistics somehow fail, or you want to use your own statistics, you can also directly access the data per pixel. That is, you first copy one of the raster layers.
r3 <- r1
Afterwards, you can apply a function over the values.
r3[] <- apply(cbind(r1[], r2[]), 1, min)
Using @loki's example, you have three more options to calculate the minimum value of both layers:
library(raster)
calc(stack(r1,r2),fun=min,na.rm=T)
stackApply(stack(r1,r2),indices = c(1,1),fun='min',na.rm=T)
overlay(r1,r2,fun=min,na.rm=T)
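All of these amount to an element-wise minimum over the cell values. In base R terms, and only as an illustration on plain vectors (v1 and v2 mirror the values of the example rasters; the NA vectors are made up), that is what pmin does:

```r
v1 <- 1:25  # values of r1 in the example
v2 <- 25:1  # values of r2 in the example

v_min <- pmin(v1, v2)  # element-wise minimum

# with na.rm = TRUE an NA is ignored wherever the other value exists
v_min_na <- pmin(c(3, NA, 5), c(4, 2, NA), na.rm = TRUE)
```

Without na.rm = TRUE, any position with an NA in either input stays NA, which matters when the two rasters have different no-data areas.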

Set single raster to NA where values of raster stack are NA

I have two 30m x 30m raster files which I would like to sample points from. Prior to sampling, I would like to remove the clouded areas from the images. I turned to R and Hijmans's raster package for the task.
Using the drawPoly(sp=TRUE) command, I drew 18 different polygons. The function did not seem to allow 18 polygons as one sp object, so I drew them all separately. I then gave the polygons a proj4string matching the rasters', and put them into a list. I ran the list through lapply to convert them to rasters (rasterize function in the raster package), with the polygon areas set to NA and the rest of the image set to 1.
My end goal is one raster layer with the 18 areas set to NA. I have tried stacking the list of rasterized polygons and subsetting it to set a new raster to NA in the same areas. My reproducible code is below.
library(raster)
r1 <- raster(nrow=50, ncol = 50)
r1[] <- 1
r1[4:10,] <- NA
r2 <- raster(nrow=50, ncol = 50)
r2[] <- 1
r2[9:15,] <- NA
r3 <- raster(nrow=50, ncol = 50)
r3[] <- 1
r3[24:39,] <- NA
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
s <- stack(r1, r2, r3)
test.a.cool <- calc(s, function(x){r4[is.na(x)==1] <- NA})
For whatever reason, the darn testacool is a blank plot, where I'm aiming to have it as a raster with all values except for the NAs in the stack, s, equal to 1.
Any tips?
Thanks.
Doing sum(s) will work, as sum() returns NA for any grid cell with even one NA value in the stack.
To see that it works, compare the figures produced by the following:
plot(s)
plot(sum(s))
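The same NA propagation can be seen on plain vectors. A minimal sketch, with three made-up four-cell layers standing in for the stack (all names are hypothetical):

```r
l1 <- c(1, NA, 1, 1)
l2 <- c(1, 1, NA, 1)
l3 <- c(1, 1, 1, 1)

layer_sum <- l1 + l2 + l3  # NA wherever at least one layer is NA

r4_vals <- rep(1, 4)                # target layer, all cells set to 1
r4_vals[is.na(layer_sum)] <- NA     # blank out the cells flagged by the sum
```

Cells 2 and 3 end up NA because one of the layers is NA there; the other cells keep their value of 1.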
I posted this question on the R-Sig-Geo forum, as well, and received a response from the package author. The two simplest solutions:
Use the sp package to rbind my polygons into one, then rasterize the polygon.
p <- rbind(p1, p2, p3...etc., makeUniqueIDs = TRUE)
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
mask <- rasterize(p, r4)
mask[mask %in% 1:18] <- 1
#The above code produces a single raster file with
#my polygons as unique values, ready for masking.
And the second simple solution, as just pointed out by Josh O'Brien:
m <- sum(s)
test <- mask(r4, m)
The R community rocks. Problem solved (twice) within an hour. Thanks.
I'm not familiar with the package you are using, however looking at the final line in your code, I think the issue might be here:
function(x){r4[is.na(x)==1] <- NA})
It doesn't look like calc will do much with that. It is setting the values of r4 indexed by the NAs of x and setting those to NA.
What then? If anything, maybe:
function(x){r4[is.na(x)==1] <- NA; return(r4) })
Although, it's not clear if that is even what you are after.
You were on the right track. The [ operator is defined for rasters and raster stacks, so you could just use the single line:
r4[ any(is.na(s) ) ] <- NA
plot(r4)
If you wanted to use calc you could have used it like this:
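# calc returns 1 where no layer is NA and 0 where at least one layer is NA
r4 <- calc( s, function(x){ ( ! any( is.na(x) ) ) } )
# turn the 0 cells into NA
r4[r4 == 0] <- NA
plot(r4)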

Resources