How to use distHaversine function? - r

I am trying to use the distHaversine function in R, inside a loop, to calculate the distance between latitude and longitude coordinates for several hundred rows. In my loop I have this code:
if ((distHaversine(c(file[i,"long"], file[i,"lat"]),
c(file[j,"long"], file[j,"lat"]))) < 50 )
after which, if the distance is less than 50 meters, I want it to record those rows. The latitude and longitude coordinates it is referencing look like:
0.492399367 30.42530045
and
0.496899361 30.42497045
but I get this error:
Error in .pointsToMatrix(p1) : latitude > 90

i get this error "Error in .pointsToMatrix(p1) : latitude > 90". Can
anyone explain why and how to solve?
The error tells you that you passed latitude values greater than 90, which is outside the valid range:
library(geosphere)
distHaversine(c(4,52), c(13,52))
# [1] 616422
distHaversine(c(4,52), c(1,91))
# Error in .pointsToMatrix(p2) : latitude > 90
You can solve this issue by only feeding distHaversine with coordinates inside the accepted ranges.
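One way to find the offending rows before calling distHaversine is a plain range check. The sketch below is base R only; the data frame pts and its column names long and lat are assumptions for illustration, reusing the coordinates from the example above. Since geosphere expects points in (longitude, latitude) order, a swapped column order is a common cause of this error and worth ruling out.

```r
# Hypothetical input: the third row has an impossible latitude (91)
pts <- data.frame(long = c(4, 13, 1), lat = c(52, 52, 91))

# TRUE for rows whose coordinates are present and inside the accepted ranges
valid <- !is.na(pts$long) & !is.na(pts$lat) &
  abs(pts$lat) <= 90 & abs(pts$long) <= 180

pts[!valid, ]          # rows to inspect (possibly swapped lon/lat)
pts_ok <- pts[valid, ] # rows that are safe to pass on
```

Only pts_ok would then be fed to the distance calculation.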
I am trying to use the distHaversine function in R, inside a loop to
calculate the distance between some latitude and longitude coordinates
for several hundred rows. (...) if the distance is less than 50
meters i want it to record those rows
Have a look at the distm function, which calculates a distance matrix for your few hundred rows easily (i.e. without loops). It uses distHaversine by default. For example, to get the data frame rows that are closer than 650000 meters to each other:
df <- read.table(sep=",", col.names=c("lon", "lat"), text="
4,52
13,52
116,39")
(d <- distm(df))
# [,1] [,2] [,3]
# [1,] 0 616422 7963562
# [2,] 616422 0 7475370
# [3,] 7963562 7475370 0
d[upper.tri(d, T)] <- NA
( idx <- which(d < 650000, arr.ind = T) )
# row col
# [1,] 2 1
cbind(df[idx[, 1], ], df[idx[, 2], ])
# lon lat lon lat
# 2 13 52 4 52
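For readers who want to see what happens under the hood, here is a minimal base R sketch of the same idea: a vectorised haversine formula, a full distance matrix, and a threshold on one triangle of it. The helper name haversine_m is made up for this sketch, and the radius of 6378137 m is assumed to match the geosphere default.

```r
# Haversine great-circle distance in metres; inputs in decimal degrees.
# r = 6378137 m is an assumption chosen to match geosphere's default radius.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

df <- data.frame(lon = c(4, 13, 116), lat = c(52, 52, 39))
n <- nrow(df)

# n x n distance matrix; outer() expands every (i, j) index pair
d <- outer(seq_len(n), seq_len(n), function(i, j)
  haversine_m(df$lon[i], df$lat[i], df$lon[j], df$lat[j]))

# keep each pair once (lower triangle) and apply the threshold
close_pairs <- which(lower.tri(d) & d < 650000, arr.ind = TRUE)
close_pairs
```

For the 50 meter case in the question, only the threshold changes.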

Related

extract mean value of raster with buffer condition on second layer/attribute

I have this stars object (could be also formatted to raster):
stars object with 2 dimensions and 2 attributes
attribute(s):
LST_mean elevation
Min. :14.98 Min. :296.0
1st Qu.:16.89 1st Qu.:346.9
Median :17.64 Median :389.3
Mean :17.52 Mean :389.2
3rd Qu.:18.18 3rd Qu.:428.3
Max. :20.11 Max. :521.6
dimension(s):
from to offset delta refsys point values
x 71 83 4387654 860.241 DHDN / 3-degree Gauss-Kru... FALSE NULL [x]
y 33 41 5598885 -860.241 DHDN / 3-degree Gauss-Kru... FALSE NULL [y]
Which has 2 attributes (layers in the case of raster): temperature and elevation.
Using temperature, I would like to select the pixels that fall within a buffer around each pixel and return their mean, but only for those pixels whose elevation differs from that of the focal pixel by less than 90 meters.
Any ideas how to do this?
Calculating the averages of the pixels that fall within the buffer is very easy, but I couldn't find a way to set any condition on them.
I will be immensely grateful for your help and suggestions. Approaches using packages other than stars are also very welcome :)
Please see below a solution using terra. The code uses terra::extract to create two corresponding lists:
The pixel values
The surrounding buffer values
Subsequently the values are processed pairwise, using mapply, with a function similar to the one you suggested.
It's the first time I'm using terra but seems like terra::extract is much faster than raster::extract, therefore this solution may be feasible even for a large raster.
Creating sample data:
library(sf)
library(terra)
r = rast(ncol = ncol(volcano), nrow = nrow(volcano), xmin = 0, xmax = ncol(volcano), ymin = 0, ymax = nrow(volcano))
values(r) = volcano
s = r
s[] = rnorm(ncell(s))
r = c(r, s)
crs(r) = ""
plot(r)
Calculating buffers:
pnt = as.points(r, values = FALSE)
pol = buffer(pnt, 10)
Extracting raster values from points:
x = extract(r, pnt)
head(x)
## ID lyr.1 lyr.1
## [1,] 1 100 -0.03525223
## [2,] 2 100 0.31525467
## [3,] 3 101 0.94054608
## [4,] 4 101 0.37209238
## [5,] 5 101 -0.38388234
## [6,] 6 101 -0.03120593
Extracting raster values from buffers:
y = extract(r, pol)
head(y)
## ID lyr.1 lyr.1
## [1,] 1 100 -0.03525223
## [2,] 1 100 0.31525467
## [3,] 1 101 0.94054608
## [4,] 1 101 0.37209238
## [5,] 1 101 -0.38388234
## [6,] 1 101 -0.03120593
Now, the extracted values can be processed sequentially using mapply.
First, we convert the objects to lists:
x = as.data.frame(x)
x = split(x, x$ID)
y = as.data.frame(y)
y = split(y, y$ID)
Next, we use mapply to make the necessary calculation, each time
considering the current focal point value x and surrounding buffer
values y:
f = function(x, y) {
d = abs(x[, 2] - y[, 2]) ## differences
values = y[, 3] ## values
mean(values[d < 5], na.rm = TRUE) ## Mean of subset
}
result = mapply(f, x, y)
Finally, putting the results back into the raster template:
u = r[[1]]
values(u) = result
plot(u)
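To make the logic independent of any spatial package, the same conditional mean can be sketched on plain matrices. This is only an illustration: the function name cond_focal_mean is invented here, it uses a square window of radius cells instead of a circular buffer, and it loops cell by cell, so it is not meant for large rasters.

```r
# Mean of `value` in a square window around each cell, restricted to the
# neighbours whose `elev` differs from the focal cell by less than max_diff.
# `value` and `elev` are matrices of identical dimensions.
cond_focal_mean <- function(value, elev, radius = 1, max_diff = 90) {
  nr <- nrow(value)
  nc <- ncol(value)
  out <- matrix(NA_real_, nr, nc)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      rows <- max(1, i - radius):min(nr, i + radius)  # clip window at edges
      cols <- max(1, j - radius):min(nc, j + radius)
      keep <- abs(elev[rows, cols] - elev[i, j]) < max_diff
      out[i, j] <- mean(value[rows, cols][keep], na.rm = TRUE)
    }
  }
  out
}

# toy data: one cell sits 200 m above the others
elev  <- matrix(c(100, 100, 300, 100), 2, 2)
value <- matrix(c(1, 2, 3, 4), 2, 2)
res <- cond_focal_mean(value, elev, radius = 1, max_diff = 90)
```

The focal cell always passes its own condition, so every window contains at least one value.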

Find nearest distance from spatial point with direction specified

I would like to calculate the nearest distance from a spatial point to spatial lines (or polygons) for predetermined bearings (0,45,90,135,180,225,270,315).
The idea is to calculate an exposure index for a number of bays along a coastline. A simple example is provided below:
Create lines
library(sp)
coords<-structure(list(lon = c(-6.1468506, -3.7628174, -3.24646,
-3.9605713, -4.4549561, -4.7955322, -4.553833, -5.9710693, -6.1468506),
lat = c(53.884916, 54.807017, 53.46189, 53.363665, 53.507651, 53.363665, 53.126998, 53.298056,53.884916)), class = "data.frame", row.names = c(NA,-9L))
l<-Line(coords)
sl<-SpatialLines(list(Lines(list(l),ID="a")),proj4string=CRS("+init=epsg:4326"))
Create point
pt<-SpatialPoints(coords[5,]+0.02,proj4string=CRS("+init=epsg:4326"))
Plot
plot(sl)
plot(pt,add=T)
I'm having trouble finding examples of what the next step might be and need help.
Example of what distance I would like to calculate
You can use the geosphere library to accomplish this. You'll need to add a CRS to your points though:
library(geosphere)
pt <- SpatialPoints(coords[5,]+0.02,
proj4string=CRS("+init=epsg:4326"))
And then use the dist2Line function:
dist2Line(pt, sl)
# distance lon lat ID
#[1,] 2580.843 -4.451901 53.50677 1
Alternatively, you can convert your polylines to points using the sf package and then get a matrix of distances (you'll need to convert your objects to class sf):
library(sf)
sl <- SpatialLines(list(Lines(list(l),ID="a")),
proj4string=CRS("+init=epsg:4326")) %>%
st_as_sf()
pt <- SpatialPoints(coords[5,]+0.02,
proj4string=CRS("+init=epsg:4326")) %>%
st_as_sf()
st_distance(st_cast(sl, "POINT"), pt)
#Units: [m]
# [,1]
# [1,] 119833.165
# [2,] 149014.814
# [3,] 79215.071
# [4,] 36422.390
# [5,] 2591.267
# [6,] 30117.701
# [7,] 45287.637
# [8,] 105289.230
# [9,] 119833.165
As a heads-up: I'm no hero when it comes to geo-data in R.
Also: I have not automated the calculation for all bearings, but manually performed the operations to get the distance to the intersect on the 45-degree bearing.
You will have to figure out the looping by yourself, as I do not have the time. Feel free to provide/post your final findings/code here when you are done.
Here is my crack at this problem, step-by-step.
#load libraries used
library(geosphere)
library(tidyverse)
library(sf)
#get bearings of lines of the polygon
df.poly <- coords %>%
mutate( lon_next = lead(lon), lat_next = lead(lat) ) %>%
mutate( bearing_to_next = ifelse( !is.na( lon_next ),
unlist( pmap( list( a = lon, b = lat, x = lon_next, y = lat_next ),
~ round( geosphere::bearing( c(..1, ..2), c(..3, ..4) ) )
)
),
NA )
) %>%
filter( !is.na( lon_next ) )
# lon lat bearing_to_next
# 1 -6.146851 53.88492 56
# 2 -3.762817 54.80702 167
# 3 -3.246460 53.46189 -103
# 4 -3.960571 53.36366 -64
# 5 -4.454956 53.50765 -125
# 6 -4.795532 53.36366 148
# 7 -4.553833 53.12700 -78
# 8 -5.971069 53.29806 -10
#find intersection point based on the intersection of two 'great circles'
#from two points with a bearing
gcIntersectBearing(
#coordinates 2nd point of polyline, with bearing to third point
c( -3.7628174, 54.807017 ), 167,
#coordinates point, with bearing of 45
c( -4.454956, 53.50765 ), 45 )
# lon lat lon lat
# [1,] -3.476074 54.07798 176.5239 -54.07798
let's see what we have got so far
p_intersect <- data.frame( lon = -3.476074, lat = 54.07798 ) %>%
st_as_sf( coords = c( "lon", "lat" ), crs = 4326 )
startpoint <- coords %>% slice(5) %>% mutate( lon = lon + 0.02, lat = lat + 0.02 ) %>%
st_as_sf( coords = c("lon","lat"), crs = 4326 )
poly <- coords %>%
as.matrix() %>%
list() %>%
st_polygon() %>%
st_sfc() %>%
st_set_crs( 4326 )
mapview::mapview( list(poly, startpoint, p_intersect) )
The location of the intersection point p_intersect on the polygon poly from the startpoint with a 45-degrees bearing looks correct.
Now you can calculate the distance as follows:
#calculate distance
st_distance( startpoint, p_intersect )
# Units: [m]
# [,1]
# [1,] 87993.3
Google Maps seems to agree on the distance (there is a bit of a margin due to mouse-clicking around the points, but it looks ok to me).
Now you will have to figure out some clever looping/vectorisation and you are done :)
I have to get back to my real job.
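For what the looping could look like, here is a rough base R sketch. It swaps the great-circle intersection for a planar ray/segment intersection, so it only makes sense for projected coordinates or small areas; the function name ray_distance and the toy square are assumptions for illustration, not part of the approach above.

```r
# Distance from `origin` along a compass bearing to the nearest edge of a
# closed ring (two-column matrix x, y). Planar geometry only.
ray_distance <- function(origin, bearing_deg, ring) {
  theta <- bearing_deg * pi / 180
  d <- c(sin(theta), cos(theta))       # bearing 0 = north (+y), 90 = east (+x)
  n <- nrow(ring)
  a <- ring[-n, , drop = FALSE]        # segment start points
  e <- ring[-1, , drop = FALSE] - a    # segment direction vectors
  w <- sweep(a, 2, origin)             # segment starts relative to the origin
  denom <- d[1] * e[, 2] - d[2] * e[, 1]
  t <- (w[, 1] * e[, 2] - w[, 2] * e[, 1]) / denom  # distance along the ray
  u <- (w[, 1] * d[2] - w[, 2] * d[1]) / denom      # position on the segment
  hit <- is.finite(t) & is.finite(u) & t >= 0 & u >= 0 & u <= 1
  if (any(hit)) min(t[hit]) else NA_real_           # NA when nothing is hit
}

ring <- cbind(x = c(0, 2, 2, 0, 0), y = c(0, 0, 2, 2, 0))  # toy closed square
origin <- c(1, 0.5)
bearings <- seq(0, 315, 45)
exposure <- sapply(bearings, function(b) ray_distance(origin, b, ring))
```

A ray that passes exactly through a vertex can be missed because of floating point rounding, so real data may need a small tolerance on u.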
Thank you to @patL and @Wimpel, I've used your suggestions to come up with a solution to this problem.
First I create spatial lines of a set distance and bearing from an origin point using geosphere::destPoint. I then use rgeos::gIntersection to obtain the spatial points where each transect intersects the coastline. Finally I calculate the distance from the origin point to all intersect points for each transect line (geosphere::distGeo in the code below) and take the minimum value, i.e. the nearest intersect.
load packages
pkgs=c("sp","rgeos","geosphere","rgdal") # list packages
lapply(pkgs,require,character.only=T) # load packages
create data
coastline
coords<-structure(list(lon =c(-6.1468506,-3.7628174,-3.24646,
-3.9605713,-4.4549561,-4.7955322,-4.553833,-5.9710693,-6.1468506),
lat=c(53.884916,54.807017,53.46189,53.363665,53.507651,53.363665,53.126998,53.298056,53.884916)), class = "data.frame", row.names = c(NA,-9L))
l=Line(coords)
sl=SpatialLines(list(Lines(list(l),ID="a")),proj4string=CRS("+init=epsg:4326"))
point
sp=SpatialPoints(coords[5,]+0.02,proj4string=CRS("+init=epsg:4326"))
p=coordinates(sp) # needed for geosphere::destPoint
create transect lines
b=seq(0,315,45) # list bearings
tr=list() # container for transect lines
for(i in 1:length(b)){
tr[[i]]<-SpatialLines(list(Lines(list(Line(list(rbind(p,destPoint(p=p,b=b[i],d=200000))))),ID="a")),proj4string=CRS("+init=epsg:4326")) # create spatial lines 200km to bearing i from origin
}
calculate distances
minDistance=list() # container for distances
for(j in 1:length(tr)){ # for transect j
intersects=gIntersection(sl,tr[[j]]) # intersect with coastline
minDistance[[j]]=min(distGeo(sp,intersects)) # calculate distances and use minimum
}
do.call(rbind,minDistance)
In reality the origin point is a spatial points data frame and this process is looped multiple times for a number of sites. There are also a number of NULL results when carrying out the intersect, so the loop includes an if statement.
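The NULL handling might look roughly like the sketch below. The list hits is a made-up stand-in for the per-transect distances, with NULL marking a transect that never meets the coastline; the actual if statement used in the real loop is not shown in this post.

```r
# Hypothetical per-transect distances; NULL = transect had no intersect
hits <- list(c(1200, 3400), NULL, 560)

# Minimum distance per transect, with NA where there was no intersect
min_distance <- vapply(
  hits,
  function(h) if (is.null(h)) NA_real_ else min(h),
  numeric(1)
)
min_distance
```

Returning NA instead of skipping the transect keeps the result aligned with the vector of bearings.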

Find the nearest neighbor of coordinates in 2 separate matrices and the distance between the 2 in r

I have 2 dataframes that are simply matrices of 2 dimensions (lat/long). Both dataframes would look like the input below:
latitude longitude
27.78833 -82.28197
27.79667 -82.29294
Let's call them "dfref" and "dfnew". I would like to find the nearest point in dfnew for each point in dfref and the distance between the 2 points in meters.
The output would look like this:
dr.latitude dr.longitude dn.latitude dn.longitude dist
27.78833 -82.28197 27.54345 -82.33233 162.34
27.79667 -82.29294 27.56543 -82.12323 232.23
I have tried using the knn function in the class package and the SearchTrees package, but my script only found the nearest points in the dfref matrix and I am not sure how to add the distance measurement.
knn1(train=cbind(dfref), test=cbind(dfnew), cl=seq_len(nrow(dfnew)))
Is there a function that does both efficiently and how can I get this into one script?
I am not expert on Geo math, but it seems that you can start with something like this:
library(data.table) # needed for data.table() and the := syntax below
dfref <- read.table(text =
"latitude longitude
27.78833 -82.28197
27.79667 -82.29294", header = T)
dtref <- data.table(dfref)
dfnew <- read.table(text =
"latitude longitude
27.54345 -82.33233", header = T)
dtnew <- data.table(dfnew)
# Make cartesian product of the two tables.
dtref$fake <- 1
dtnew$fake <- 1
dtall <- merge(dtref, dtnew, by = "fake", allow.cartesian = T)
# Calculate distance.
library(geosphere)
dtall[, distance := distVincentyEllipsoid(c(longitude.x, latitude.x), c(longitude.y, latitude.y)), by = 1:nrow(dtall)]
# Print results.
dtall[, .(latitude.x, longitude.x, latitude.y, longitude.y, distance)]
# latitude.x longitude.x latitude.y longitude.y distance
# 1: 27.78833 -82.28197 27.54345 -82.33233 27587.29
# 2: 27.79667 -82.29294 27.54345 -82.33233 28328.19
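The table above holds every pair, so the remaining step is to keep the closest dfnew point for each dfref point. Below is a base R sketch of that step. It uses a haversine helper (haversine_m, a name invented here) rather than distVincentyEllipsoid, so the distances differ slightly from the ones printed above, and the second dfnew row is taken from the desired output in the question.

```r
# Haversine distance in metres, decimal degrees in; r is an assumed radius
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6378137) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

dfref <- data.frame(latitude = c(27.78833, 27.79667),
                    longitude = c(-82.28197, -82.29294))
dfnew <- data.frame(latitude = c(27.54345, 27.56543),
                    longitude = c(-82.33233, -82.12323))

# rows = dfref points, columns = dfnew points
d <- outer(seq_len(nrow(dfref)), seq_len(nrow(dfnew)), function(i, j)
  haversine_m(dfref$latitude[i], dfref$longitude[i],
              dfnew$latitude[j], dfnew$longitude[j]))

nearest <- apply(d, 1, which.min)  # index of the closest dfnew row per dfref row
result <- data.frame(dr = dfref, dn = dfnew[nearest, ],
                     dist = d[cbind(seq_len(nrow(dfref)), nearest)],
                     row.names = NULL)
result
```

The full matrix grows with nrow(dfref) * nrow(dfnew), so for large inputs a spatial index would be the better tool.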

Get closest point based on coordinates

I'd like to snap single points to other points based on their distance in R.
In detail, I have a bunch of points defined by X and Y coordinate pairs.
In addition I have single different points, that I want to snap to
the closest neighbors (euclidean distance).
# target points
df <- data.frame(X=c(1,2,2,2,3,4),Y=c(1,2,2,3,3,4))
# points that need snapping
point1 <- data.frame(X=2.1, Y=2.3)
point2 <- data.frame(X=2.5, Y=2.5)
plot(df)
points(point1$X,point1$Y,pch=20,col="red")
points(point2$X,point2$Y,pch=20,col="blue")
But how to proceed with the snapping of the points?
How can I snap the points and assign new coordinate pairs to the single points?
Is there a simple function available in R? Or do I need to apply
the dist() function to obtain a distance matrix and search for
the closest distance? Maybe there is a more straight forward way.
This is how it should look:
1) snap to the closest (euclidean distance) point (clear solution for point 1)
point1$X_snap <- 2
point1$Y_snap <- 2
2) if two or more points are similarly close, then
snap to the one which is more "north-east"
a) snap first to the one which is more north (Y direction)
b) if more than one is similarly distant in the Y direction, snap
to the one that is more east
point2$X_snap <- 3
point2$Y_snap <- 3
For a graphical illustration of how the results should look:
#plot snapped points:
points(point1$X_snap,point1$Y_snap,pch=8,col="red")
points(point2$X_snap,point2$Y_snap,pch=8,col="blue")
It's possible using dist:
dist(rbind(point1,df))
          1         2         3         4         5         6
2 1.7029386
3 0.3162278 1.4142136
4 0.3162278 1.4142136 0.0000000
5 0.7071068 2.2360680 1.0000000 1.0000000
6 1.1401754 2.8284271 1.4142136 1.4142136 1.0000000
7 2.5495098 4.2426407 2.8284271 2.8284271 2.2360680 1.4142136
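So the row with the minimum value (distance) in the first column identifies the point in df which is closest to point1. Note that in the output the row numbers are shifted by one, because point1 itself is row 1. In your example you have a repeated location, which is why two rows tie. Repeat this for each of your points.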
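Pulling the nearest row out programmatically could look like this minimal sketch. It handles only the distance criterion; ties are resolved by which.min, which returns the first minimum, not by the "north-east" rule from the question.

```r
df <- data.frame(X = c(1, 2, 2, 2, 3, 4), Y = c(1, 2, 2, 3, 3, 4))
point1 <- data.frame(X = 2.1, Y = 2.3)

# first column of the full distance matrix, minus the point's distance to itself
d <- as.matrix(dist(rbind(point1, df)))[-1, 1]

nearest <- which.min(d)  # on ties, which.min() returns the first index
point1$X_snap <- df$X[nearest]
point1$Y_snap <- df$Y[nearest]
point1
```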
I found another solution using the matchpt() function
from Biobase (Bioconductor):
# target points
df <- data.frame(X=c(1,2,2,2,3,4),Y=c(1,2,2,3,3,4))
# points that need snapping
point1 <- data.frame(X=2.1, Y=2.3)
point2 <- data.frame(X=2.5, Y=2.5)
snap <- function(df,point){
require(Biobase)
d <- matchpt(as.matrix(df),
as.matrix(data.frame(X=point$X+0.0001,Y=point$Y+0.0001))) # small offset so that ties resolve towards the "north-east"
min_row <- as.numeric(rownames(d[d$distance==min(d$distance),]))
point$X_snap <- unique(df[min_row,"X"])
point$Y_snap <- unique(df[min_row,"Y"])
point
}
snap(df,point2)
You might also want to try the RANN package for Fast Nearest Neighbour Search:
# your data
df <- data.frame(X=c(1,2,2,2,3,4),Y=c(1,2,2,3,3,4))
pts <- data.frame(X=c(2.1, 2.5), Y=c(2.3, 2.5))
library(RANN)
# for each point in pts, find the nearest neighbor from df
closest <- RANN::nn2(data = df, query = pts, k = 1)
# argument k sets the number of nearest neighbours, here 1 (the closest)
closest
# $nn.idx
# [,1]
# [1,] 3
# [2,] 5
#
# $nn.dists
# [,1]
# [1,] 0.3162278
# [2,] 0.7071068
# Get coordinates of nearest neighbor
pts$X_snap <- df[closest$nn.idx, "X"]
pts$Y_snap <- df[closest$nn.idx, "Y"]
pts
# X Y X_snap Y_snap
# 1 2.1 2.3 2 2
# 2 2.5 2.5 3 3
I would put the criteria (distance, "southness", "westness") in a dataframe and then sort this dataframe along these criteria:
# input data
df <- data.frame(X=c(1,2,2,2,3,4),Y=c(1,2,2,3,3,4))
point1 <- data.frame(X=2.1, Y=2.3)
point2 <- data.frame(X=2.5, Y=2.5)
# function that sorts all potential snapping points according to distance, "southness", "westness"
snap.xy <- function(point, other.points) {
dx <- point$X - other.points$X # "westness" (the higher, the more "west")
dy <- point$Y - other.points$Y # "southness" (the higher, the more "south")
dst <- sqrt(dx^2 + dy^2) # distance
df.res <- data.frame(X = other.points$X, Y = other.points$Y, # coordinates to snap to
dx = dx, dy = dy, dst = dst)
# sort by distance, then prefer the most "north" (smallest dy), then the most "east" (smallest dx)
return(df.res[order(dst, dy, dx), ][1, c("X", "Y")]) # return only the X/Y coordinates
}
# examples
snap.xy(point1, df) # 2/2
snap.xy(point2, df) # 3/3
snap.xy(point2, df)$X # 3
snap.xy(point2, df)$Y # 3

Calculating weighted polygon centroids in R

I need to calculate the centroids of a set of spatial zones based on a separate population grid dataset. Grateful for a steer on how to achieve this for the example below.
Thanks in advance.
require(raster)
require(spdep)
require(maptools)
dat <- raster(volcano) # simulated population data
polys <- readShapePoly(system.file("etc/shapes/columbus.shp",package="spdep")[1])
# set consistent coordinate ref. systems and bounding boxes
proj4string(dat) <- proj4string(polys) <- CRS("+proj=longlat +datum=NAD27")
extent(dat) <- extent(polys)
# illustration plot
plot(dat, asp = TRUE)
plot(polys, add = TRUE)
Three steps:
First, find all the cells in each polygon, return a list of 2-column matrices with the cell number and the value:
require(plyr) # for llply, laply in a bit...
cell_value = extract(dat, polys,cellnumbers=TRUE)
head(cell_value[[1]])
cell value
[1,] 31 108
[2,] 32 108
[3,] 33 110
[4,] 92 110
[5,] 93 110
[6,] 94 111
Second, turn into a list of similar matrices but add the x and y coords:
cell_value_xy = llply(cell_value, function(x)cbind(x,xyFromCell(dat,x[,"cell"])))
head(cell_value_xy[[1]])
cell value x y
[1,] 31 108 8.581164 14.71973
[2,] 32 108 8.669893 14.71973
[3,] 33 110 8.758623 14.71973
[4,] 92 110 8.581164 14.67428
[5,] 93 110 8.669893 14.67428
[6,] 94 111 8.758623 14.67428
Third, compute the weighted mean coordinate. This neglects any edge effects and assumes all grid cells are the same size:
centr = laply(cell_value_xy, function(m){c(weighted.mean(m[,3],m[,2]), weighted.mean(m[,4],m[,2]))})
head(centr)
1 2
[1,] 8.816277 14.35309
[2,] 8.327463 14.02354
[3,] 8.993655 13.82518
[4,] 8.467312 13.71929
[5,] 9.011808 13.28719
[6,] 9.745000 13.47444
Now centr is a 2-column matrix. In your example it's very close to coordinates(polys), so I'd make a contrived example with some extreme weights to make sure it's working as expected.
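Such a contrived check can be done without any spatial objects at all. In the sketch below, the four cell centres and the pop weights are made up: three cells carry a weight of 1 and the top-right cell carries 1000, so the weighted centroid should land almost on top of that cell while the unweighted one stays in the middle.

```r
# Four cell centres on a unit square; the top-right cell has an extreme weight
cells <- data.frame(x = c(0, 1, 0, 1),
                    y = c(0, 0, 1, 1),
                    pop = c(1, 1, 1, 1000))

# population-weighted centroid vs. plain (unweighted) centroid
weighted_centroid <- c(x = weighted.mean(cells$x, cells$pop),
                       y = weighted.mean(cells$y, cells$pop))
plain_centroid <- c(x = mean(cells$x), y = mean(cells$y))

weighted_centroid
plain_centroid
```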
Another alternative.
I like it for its compactness, but it will likely only make sense if you're fairly familiar with the full family of raster functions:
## Convert polygons to a raster layer
z <- rasterize(polys, dat)
## Compute weighted x and y coordinates within each rasterized region
xx <- zonal(init(dat, v="x")*dat, z) / zonal(dat,z)
yy <- zonal(init(dat, v="y")*dat, z) / zonal(dat,z)
## Combine results in a matrix
res <- cbind(xx[,2],yy[,2])
head(res)
# [,1] [,2]
# [1,] 8.816277 14.35309
# [2,] 8.327463 14.02354
# [3,] 8.993655 13.82518
# [4,] 8.467312 13.71929
# [5,] 9.011808 13.28719
# [6,] 9.745000 13.47444
The answers by Spacedman and Josh are really great, but I'd like to share two other alternatives which are relatively fast and simple.
library(data.table)
library(spatialEco)
library(raster)
library(rgdal)
using a data.table approach:
# get centroids of raster data
data_points <- rasterToPoints(dat, spatial=TRUE)
# intersect with polygons
grid_centroids <- point.in.poly(data_points, polys)
# calculate weighted centroids
grid_centroids <- as.data.frame(grid_centroids)
w.centroids <- setDT(grid_centroids)[, lapply(.SD, weighted.mean, w=layer), by=POLYID, .SDcols=c('x','y')]
using wt.centroid{spatialEco} :
# get a list of the ids from each polygon
poly_ids <- unique(grid_centroids@data$POLYID)
# use lapply to calculate the weighted centroids of each individual polygon
w.centroids.list <- lapply(poly_ids, function(i){wt.centroid( subset(grid_centroids, grid_centroids@data$POLYID ==i)
, 'layer', sp = TRUE)} )
My own less elegant solution below. Gives exactly the same results as Spacedman and Josh.
library(dplyr) # provides %>%, mutate, filter, bind_rows
# raster to pixels
p = rasterToPoints(dat) %>% as.data.frame()
coordinates(p) = ~ x + y
crs(p) = crs(polys)
# overlay pixels on polygons
ol = over(p, polys) %>% mutate(pop = p$layer) %>% cbind(coordinates(p)) %>%
filter(COLUMBUS_ %in% polys$COLUMBUS_) %>%  # i.e. a unique identifier
dplyr::select(x, y, pop, COLUMBUS_) %>% as_data_frame()
# weighted means of x/y values, by pop
pwcs = split(ol, ol$COLUMBUS_) %>% lapply(function(g){
data.frame(x = weighted.mean(g$x, g$pop), y = weighted.mean(g$y, g$pop))
}) %>% bind_rows() %>% as_data_frame()
