Calculating the distance between points in R - r

I looked through the questions that have been asked about dealing with coordinates, but couldn't find anything that helps with my problem.
I have a dataset that contains ID, Speed, Time, and lists of Latitude and Longitude values (the dataset can be found at the link below).
https://drive.google.com/file/d/1MJUvM5WEhua7Rt0lufCyugBdGSKaHMGZ/view?usp=sharing
I want to measure the distance between every pair of points given by Latitude and Longitude.
For example:
Latitude has: x1, x2, x3, ..., x1000
Longitude has: y1, y2, y3, ..., y1000
I want to measure the distance from (x1,y1) to all the other points, from (x2,y2) to all the other points, and so on.
The reason I'm doing this is to find out which points are close to each other and to assign an index to each location based on distance.
If (x1,y1) is close to (x4,y4), then (x1,y1) gets the index A, for example, and (x4,y4) gets labeled B, so that the points are sorted in order of distance.
I tried the gDistance function, but I got the error message: "package ‘gDistance’ is not available (for R version 3.4.3)",
and if I change the R version to 3.3, library(rgeos) won't work!
Any suggestions?
Here's what I tried:
#requiring necessary packages:
library(sp) # vector data
library(rgeos) # geometry ops
#Read the data and transform them to spatial objects
d <- read.csv("ReadyData.csv")
sp.ReadyData <- d
coordinates(sp.ReadyData) <- ~Longitude + Latitude
d <- gDistance(sp.ReadyData, byid= TRUE)
Here's an update on my solution. I created a spatial object and made a spatial data frame as follows:
#Create spatial object:
lonlat <- cbind(spatial$Longitude, spatial$Latitude)
#Create a SpatialPoints object:
library(sp)
pts <- SpatialPoints(lonlat)
crdref <- CRS('+proj=longlat +datum=WGS84')
pts <- SpatialPoints(lonlat, proj4string=crdref)
# make spatial data frame
ptsdf <- SpatialPointsDataFrame(pts, data=spatial)
Now I'm trying to measure the distance for longitude/latitude coordinates. I tried the dist method, but it does not seem to work for me, so I tried the pointDistance method:
gdis <- pointDistance(pts, lonlat=TRUE)
It is still not clear to me how this function measures the distance. I need to work out the distances so I can locate the point in the middle and assign a number to each point based on its position relative to that middle point.

You can use raster::pointDistance or geosphere::distm, among other functions.
Part of your example data (please avoid files in your questions):
d <- read.table(sep=",", text='
"OBU ID","Time Received","Speed","Latitude","Longitude"
"1",20,1479171686325,0,38.929596,-77.2478813
"2",20,1479171686341,0,38.929596,-77.2478813
"3",20,1479171698485,1.5,38.9295887,-77.2478945
"4",20,1479171704373,1,38.9295048,-77.247922
"5",20,1479171710373,0,38.9294865,-77.2479055
"6",20,1479171710373,0,38.9294865,-77.2479055
"7",20,1479171710373,0,38.9294865,-77.2479055
"8",20,1479171716373,2,38.9294773,-77.2478712
"9",20,1479171716374,2,38.9294773,-77.2478712
"10",20,1479171722373,1.32,38.9294773,-77.2477417')
Solution:
library(raster)
m <- pointDistance(d[, c("Longitude", "Latitude")], lonlat=TRUE)
To get the nearest point to each point, you can do
mm <- as.matrix(as.dist(m))
diag(mm) <- NA
i <- apply(mm, 1, which.min)
The point pairs
p <- cbind(1:nrow(mm), i)
To get the distances, you can do:
mm[p]
Or do this:
apply(mm, 1, min, na.rm=TRUE)
Note that rgeos::gDistance is for planar data, not for longitude/latitude data.
Here is a similar question/answer with some illustration.
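To illustrate what a longitude/latitude distance function computes, here is a minimal base R sketch of the haversine (great-circle) formula. The helper name haversine and the spherical Earth radius of 6378137 m are assumptions made for this illustration; pointDistance and distm use their own, more careful implementations, so their results can differ slightly.

```r
# Great-circle distance in meters between points given in degrees.
# Assumes a spherical Earth with radius r (an approximation).
haversine <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Three of the points from the example data above
lon <- c(-77.2478813, -77.2478945, -77.2477417)
lat <- c(38.929596, 38.9295887, 38.9294773)

# Full pairwise distance matrix: m[i, j] is the distance from point i to j
m <- outer(seq_along(lon), seq_along(lon),
           function(i, j) haversine(lon[i], lat[i], lon[j], lat[j]))
m
```

Unlike a planar (Euclidean) distance on the raw degrees, this accounts for the fact that a degree of longitude covers less ground the further you are from the equator.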
Your data set is too large to make a single distance matrix. You can process your data in chunks to deal with that. Here I am showing that with a rather small chunk size of 4 rows. Make this number much bigger to speed up processing.
library(geosphere)
chunk <- 4 # rows
start <- seq(1, nrow(d), chunk)
# each chunk ends one row before the next one starts
end <- c(start[-1] - 1, nrow(d))
x <- d[, c("Longitude", "Latitude")]
r <- list()
for (i in seq_along(start)) {
  y <- x[start[i]:end[i], , drop=FALSE]
  m <- distm(y, x)
  # set the distance from each point to itself to NA
  m[cbind(1:nrow(m), start[i]:end[i])] <- NA
  r[[i]] <- apply(m, 1, which.min)
}
r <- unlist(r)
r
# [1] 2 1 1 5 6 5 5 9 8 8
So for your data:
d <- read.csv("ReadyData.csv")
chunk <- 100 # rows
# etc
This will take a long time.
An alternative approach:
library(spdep)
x <- as.matrix(d[, c("Longitude", "Latitude")])
k <- as.vector(knearneigh(x, k=1, longlat=TRUE)$nn)
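The question also asks how to find "the point in the middle" and number the other points by their distance from it. One possible reading, and it is only an assumption about what "middle" means, is the medoid: the point with the smallest total distance to all others. The sketch below uses a toy planar distance matrix built with dist; with real data you would substitute a longitude/latitude distance matrix such as the one returned by pointDistance or distm.

```r
# Toy coordinates on a line (hypothetical data, planar distances)
xy <- cbind(c(0, 1, 3, 7, 12), 0)
m <- as.matrix(dist(xy))

# Medoid: the point whose summed distance to all other points is smallest
middle <- which.min(rowSums(m))

# Points ordered from nearest to farthest from the middle point
ord <- order(m[middle, ])

# Label for each point: 1 = the middle point itself, 2 = nearest, ...
label <- match(seq_len(nrow(m)), ord)
data.frame(point = seq_len(nrow(m)), dist_to_middle = m[middle, ], label = label)
```

This needs the full distance matrix, so for a very large data set it would have to be combined with the chunked approach.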

Assuming you have p1 as spatialpoints of x and p2 as spatialpoints of y, to get the index of the nearest other point:
ReadyData$cloDist <- apply(gDistance(p1, p2, byid=TRUE), 1, which.min)
If the same coordinate appears in both sets of points, you will get the index of the point itself, since the closest place to a point is the point itself. An easy trick to avoid that is to use the second nearest distance as the reference, with a quick function:
f_which.min <- function(vec, idx) sort(vec, index.return = TRUE)$ix[idx]
ReadyData$cloDist2 <- apply(gDistance(p1, p2, byid=TRUE), 1, f_which.min,
idx = 2)
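As a small check of how this helper behaves, here is a sketch on a toy planar distance matrix (made-up coordinates, built with dist rather than gDistance so that it runs without extra packages):

```r
f_which.min <- function(vec, idx) sort(vec, index.return = TRUE)$ix[idx]

# Four hypothetical points on a line at positions 0, 1, 3 and 7
dm <- as.matrix(dist(cbind(c(0, 1, 3, 7), 0)))

# idx = 1 would return each point itself (distance 0);
# idx = 2 returns the nearest *other* point
nearest_other <- apply(dm, 1, f_which.min, idx = 2)
nearest_other
```

Note that if the data contain exact duplicates, several entries in a row are 0, and the second position may then be either the duplicate or the point itself, depending on how the ties are ordered.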

Related

Calculate Topographic Position Index for coordinates stored in a dataframe in R

I'm trying to calculate the Topographic Position Index (TPI) for 177 points of interest. Their coordinates are stored in a data.frame, and the elevation is in a raster with a spatial resolution of 7.5 arc seconds. The TPI I'm calculating is basically the elevation of the point of interest minus the average elevation of its surrounding cells; this intermediate result is then divided by the spatial resolution of the raster (resolution(dem)) to account for differences in the spatial scale of the DEM and the TPI values.
And since studies usually calculate two TPIs (a small scale + a large scale), I'm also using two windows, where in the small one the surrounding 55 cells are used, and in the large one the surrounding 1010 cells are used.
I am getting this error message
Error in .focal_fun(v, w, as.integer(c(tr$nrows[1] + addr, nc)), runfun, :
Evaluation error: could not find function "resolution"
Code:
library(raster)
library(sp)
library(terra)
library(haven)
# Make a dataframe with longitudes and latitudes
df <- data.frame(lon = coords$longitude, lat = coords$latitude)
# Convert the dataframe to a SpatialPointsDataFrame
coordinates(df) <- c("lon", "lat")
proj4string(df) <- CRS("+proj=longlat +datum=WGS84")
# Extract the elevation values at the points
elev <- extract(dem, df)
# Define the scales for TPI calculation
scales <- c(5, 10)
# Loop over the scales and calculate TPI
tpi_list <- list()
for (scale in scales) {
  # Define the size of the moving window
  win_size <- scale * 5
  # Calculate TPI
  tpi <- focal(dem, w = matrix(1, win_size, win_size), fun = function(x) {
    (elev - mean(x)) / resolution(dem) * 5
  })
  # Extract TPI values at the points
  tpi_vals <- extract(tpi, df)
  # Store the TPI values in a list
  tpi_list[[as.character(scale)]] <- tpi_vals
}
#Error in .focal_fun(v, w, as.integer(c(tr$nrows[1] + addr, nc)), runfun, : Evaluation error: could not find function "resolution"
# Combine the TPI values for different scales into a dataframe
tpi_df <- data.frame(tpi_list, row.names = rownames(df))
The error you get is clear:
Evaluation error: could not find function "resolution"
You are using a function resolution, but R does not know about that function. It does not exist in the current workspace. I suppose you were looking for res.
Here is a working example.
library(terra)
dem <- rast(system.file("ex/elev.tif", package="terra"))
df <- data.frame(lon=c(5.9, 6.0, 6.2), lat=c(49.9, 49.6, 49.7))
tpifun <- \(x, f) x[f] - mean(x[-f], na.rm=TRUE)
scales <- c(5, 11)
tpilst <- vector("list", length(scales))
for (i in seq_along(scales)) {
  win_size <- scales[i] * 5
  mid <- ceiling(win_size^2 / 2)
  tpi <- focal(dem, w=win_size, fun=tpifun, f=mid, wopt=list(names="tpi"))
  tpilst[[i]] <- data.frame(scale=scales[i], extract(tpi, df))
}
tpi <- do.call(rbind, tpilst)
tpi$tpi <- tpi$tpi / (mean(res(dem)) * 5)
tpi
#   scale ID        tpi
# 1     5  1 -1330.6184
# 2     5  2   340.1538
# 3     5  3  -135.3077
# 4    11  1  -585.7952
# 5    11  2   255.3344
# 6    11  3   292.0155
A couple of things:
res returns two numbers, the x and y resolution. In the example above I take the mean.
What you were doing in the function supplied to focal is not possible. You supplied a data.frame with elevation data for a few points. How can focal understand what that is all about? Instead, you can compute the TPI for each cell and extract these values.
You cannot use a value of 10 for scale because the weights matrix must have odd size. Otherwise it is not clear how it should be centered on the focal cell.
You say that if scale is 5, the "surrounding 55 cells are used". But that is not the case. The number of surrounding cells used is 624.
scale <- 5
(scale * 5)^2 - 1
#[1] 624
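To make the role of the center index concrete, here is a tiny base R sketch of the same TPI function applied to a single 3 x 3 window. The window values are made up for illustration; focal passes the cells of each window to the function as a vector in the same way.

```r
# Same idea as tpifun above, written with function() for older R versions
tpifun <- function(x, f) x[f] - mean(x[-f], na.rm = TRUE)

win_size <- 3                     # must be odd so the window has a center
mid <- ceiling(win_size^2 / 2)    # index of the focal (center) cell

# Values of one hypothetical 3 x 3 window, row by row
window_vals <- c(1, 1,  1,
                 1, 10, 1,
                 1, 1,  1)

tpi_center <- tpifun(window_vals, mid)   # center minus mean of the neighbors
n_surrounding <- win_size^2 - 1          # number of surrounding cells used
```

With an even window size there is no single center cell, which is why ceiling(win_size^2 / 2) only identifies the focal cell for odd sizes.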

Generate random points from a raster with a pre-established distance between the points in R

I want to generate random points over a raster, but I need these points to be a certain distance apart, for example 10000 meters. I've seen that it is possible to set a distance between points using the package 'spatstat', but I didn't understand how to use this package to generate points based on a raster.
This is what I did to generate the points without the distance criteria:
# Number of points
n.points = 63
# Generate random points from a raster
sampling = raster::sampleRandom(myraster, size=n.points, na.rm=TRUE,
                                cells=FALSE, xy=TRUE, sp=FALSE, asRaster=FALSE)
# Select just the coordinates and convert them to a data frame
xy = as.data.frame(sampling[, c(1, 2)])
# Turn these points into a spatial object
spdf = sp::SpatialPointsDataFrame(coords = xy,
                                  data = as.data.frame(xy),
                                  proj4string =
                                    sp::CRS("+proj=longlat +datum=WGS84 +no_defs"))
Here's some code that tries to select points according to your rule that the closest point to any point is exactly 10000m from it. This code ignores any issues of map projection and curvature of the Earth; that should be fine in a relatively small area, but not over a very large one.
r <- 10000 # distance between points
n.points <- 63
x <- matrix(NA, nrow = n.points, ncol = 2)
# Sample one point randomly in the region. I'll assume the region is
# +/- 100000 in each coordinate; if your region is some other shape,
# change this code:
x[1,] <- runif(2, -100000, 100000)
for (i in 2:n.points) {
  # Count how many tries to find the next point
  tries <- 0
  repeat {
    # Pick an existing point
    j <- sample(1:(i-1), 1)
    # Pick a direction from it
    theta <- runif(1, 0, 2*pi)
    # Find the point at distance r in that direction
    y <- x[j,] + r*c(cos(theta), sin(theta))
    # Is the point in the region?
    if (any(y < -100000) || any(y > 100000))
      next
    # Calculate the distances to all other points
    dists <- apply(x[1:(i-1), , drop=FALSE],
                   1,
                   function(row) sqrt(sum((row - y)^2)))
    # If this point is far enough from existing points, keep it
    if (all(dists >= r))
      break
    # If not, try again, but not forever...
    tries <- tries + 1
    if (tries > 100000)
      stop("failed")
  }
  x[i,] <- y
}
plot(x)
Created on 2022-10-10 with reprex v2.0.2
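If the requirement is instead read as a minimum spacing (every pair of points at least 10000 m apart, rather than each point having a nearest neighbor at exactly that distance), a simple rejection scheme over candidate cell centers is another option. The sketch below is only an illustration under that assumption: sample_min_dist is a hypothetical helper, the candidate grid stands in for the non-NA cell coordinates of a raster, and coordinates are assumed to be planar and in meters.

```r
set.seed(42)

# Draw candidates at random and keep one only if it is at least r
# away from every point kept so far (planar distances)
sample_min_dist <- function(candidates, n, r, max_tries = 10000) {
  kept <- matrix(NA_real_, nrow = n, ncol = 2)
  k <- 0
  tries <- 0
  while (k < n && tries < max_tries) {
    tries <- tries + 1
    p <- candidates[sample(nrow(candidates), 1), ]
    far_enough <- k == 0 ||
      all(sqrt((kept[1:k, 1] - p[1])^2 + (kept[1:k, 2] - p[2])^2) >= r)
    if (far_enough) {
      k <- k + 1
      kept[k, ] <- p
    }
  }
  kept[seq_len(k), , drop = FALSE]   # may hold fewer than n rows
}

# Hypothetical candidate cell centers: a 100 km x 100 km grid, 1 km spacing
cells <- as.matrix(expand.grid(x = seq(0, 100000, by = 1000),
                               y = seq(0, 100000, by = 1000)))
pts <- sample_min_dist(cells, n = 20, r = 10000)
```

The function returns fewer than n points if the region is too crowded to fit them within max_tries attempts, so it is worth checking nrow(pts) afterwards.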

How do I find the overlap between two shapefiles?

I have two shapefiles (sf), one with polygons and one with points. As output I want a df showing which points fall within which polygons, something like this:
polygon overlap geometry
polygon1 point34 c(3478,234872)
polygon1 point56 c(23423,234982)
polygon2 point23 c(23498,2334)
polygon3 point45 c(872348,23847)
polygon3 point87 c(234982,1237)
polygon3 point88 c(234873,2873)
I assume I'll have to do something with st_intersection() but up to now I did not manage to get the desired output.
After fiddling around I came up with this solution, but I'm pretty sure it is not the most elegant. x and y are shapefiles, x with points and y with polygons.
count_overlap <- function(x, y){
  f1 <- function(z){
    r <- st_intersects(x,y[z,])
    return(r)
  }
  l1 <- c(1:nrow(y))
  l2 <- lapply(l1, f1)
  l3 <- lapply(l2, unlist)
  r <- sapply(l3, sum)
  y$overlap <- r
  return(y)
}
The result is the original y sf/dataframe with an added column called 'overlap' that shows the counts of points from x that fall within the polygon. Not exactly what I asked for in the question but a good outcome for me personally.
Try using over in sp:
library(sp)
out = over(pnt,plgn)
from ?over:
x = "SpatialPoints", y = "SpatialPolygons"
returns a numeric vector of length equal to the number of points; the number is the index (number) of the polygon of y in which a point falls; NA denotes the point does not fall in a polygon; if a point falls in multiple polygons, the last polygon is recorded.
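Since the question works with sf objects, a spatial join may get closer to the table asked for. The following is a minimal sketch with made-up polygons and points (the names polys, pnts and the helper sq are hypothetical), assuming the sf package is installed; it pairs each point with the polygon it falls within.

```r
library(sf)

# Helper to build a square polygon (illustration only)
sq <- function(x0, y0, size = 1) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + size, y0),
                        c(x0 + size, y0 + size), c(x0, y0 + size),
                        c(x0, y0))))
}

# Two hypothetical polygons and three hypothetical points
polys <- st_sf(polygon = c("polygon1", "polygon2"),
               geometry = st_sfc(sq(0, 0), sq(2, 0)))
pnts <- st_as_sf(data.frame(point = c("point1", "point2", "point3"),
                            x = c(0.5, 2.5, 5), y = c(0.5, 0.5, 5)),
                 coords = c("x", "y"))

# Attach the polygon attributes to every point that lies within a polygon
joined <- st_join(pnts, polys, join = st_within)

# Drop points that fall in no polygon; keep polygon, point and geometry
res <- joined[!is.na(joined$polygon), c("polygon", "point")]
res
```

With real data, both layers need to be in the same coordinate reference system before joining.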

Repeat for loop for all rows of a spatial points data frame

I want to calculate the shortestPath distance (using the gdistance package) between a set of geographic coordinates, using a transition layer of the ocean to prevent 'movement' across land.
Here is how I created the transition layer:
library(raster); library(gdistance); library(maptools); library(rgdal); library(sp)
mapcrs <- "+proj=longlat +datum=WGS84 +no_defs"
data(wrld_simpl)
world <- wrld_simpl
worldshp <- spTransform(world, mapcrs)
ras <- raster(nrow=300,ncol=300)
crs(ras) <- crs(worldshp)
extent(ras) <- extent(worldshp)
landmask <- rasterize(worldshp, ras)
landras <- is.na(landmask)
tr <- transition(landras, transitionFunction = mean, directions = 8, symm = FALSE)
tr = geoCorrection(tr, scl=FALSE)
I then want to calculate the shortestPath distance between every coordinate in my dataset i.e. location 1 to location n, location 2 to location n etc.
Let's produce some hypothetical geographic coordinates and convert to spatial points
x <- rnorm(10, mean = -40, sd=5)
y <- rnorm(10, mean = 20, sd=5)
xy <- cbind(x,y); colnames(xy) <- c("lon","lat")
xy <- SpatialPoints(xy); projection(xy) <- projection(mapcrs)
Using the shortestPath function in gdistance, I can calculate the distance from the first coordinate (i.e. xy[1]) to all other xy coordinates, like so.
dist <- shortestPath(tr, origin = xy, goal = xy, output="SpatialLines")
I then tried to apply a for loop to sequentially calculate distance from location 1 to all other locations, and then calculating distance from location 2 to all other locations etc., which I wrote as follows:
for(i in seq_along(xy)){
AtoB <- shortestPath(tr, origin = xy[i,], goal=xy, output="SpatialLines")
i <- i+1
}
This, however, still only calculates the distances relative to the first xy spatial point and does not 'loop' for all subsequent rows. I don't know what I'm doing wrong. It's probably super-easy, but I'm struggling. Any help would be appreciated.
Thanks in advance,
Tony
---- UPDATE ----
We have come up with a bit of a workaround (thanks Charley Clubley), but it still won't produce outputs for every spatial line. It does generate a matrix of distances.
The workaround is as follows:
Using xy as a matrix, not spatial points
distances <- matrix(ncol=nrow(xy), nrow=nrow(xy))
xy_b <- xy ## Coords needs to be as a matrix (not spatial points)
## This generates an error indicating there are no more rows to delete once complete, but the computation works
for (i in 1:nrow(xy_b)) {
  AtoB <- shortestPath(tr, xy_b, xy, output="SpatialLines")
  length <- SpatialLinesLengths(AtoB)
  distances[i, ] <- length
  xy_b <- xy_b[-1,]
}
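One possible reading of the original loop is that AtoB is overwritten on every iteration, so only one result survives, and that the manual i <- i + 1 is redundant inside a for loop. A common pattern is to index the origin by i and write each result into a pre-allocated row. The sketch below shows only that pattern: path_lengths_from is a hypothetical stand-in that returns straight-line distances, and in the real case its body would be the shortestPath() call followed by SpatialLinesLengths().

```r
set.seed(1)
# Hypothetical coordinates, kept as a plain matrix
xy <- cbind(lon = rnorm(5, mean = -40, sd = 5),
            lat = rnorm(5, mean = 20, sd = 5))

# Stand-in for the path computation: straight-line distance from one
# origin to every goal (NOT a least-cost path, illustration only)
path_lengths_from <- function(origin, goals) {
  sqrt((goals[, 1] - origin[1])^2 + (goals[, 2] - origin[2])^2)
}

n <- nrow(xy)
distances <- matrix(NA_real_, nrow = n, ncol = n)
for (i in seq_len(n)) {
  # Use row i as the origin and store the result in row i;
  # nothing is overwritten and no rows need to be deleted
  distances[i, ] <- path_lengths_from(xy[i, ], xy)
}
distances
```

To keep the line objects as well, the per-origin results could be stored in a list in the same way, one element per i.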

Row ordering for polygons

My question is simple. Is there an automatic way to order your data so that it makes "clean" polygons? I have functions that generate rings (specifically the ahull function), and I would like a way to cleanly produce polygons using such functions. Here is an example.
x <- c(1:3, 3:1, 1)
y <- c(1,1,1,3,3,2, 1)
xy <- cbind(x,y)
Sr1 <- Polygon(xy)
Srs1 = Polygons(list(Sr1), "s1")
SpP = SpatialPolygons(list(Srs1))
plot(SpP)
z <- runif(7)
xyz <- cbind(x,y,z)
xyz <- xyz[order(z),]
xy <- xyz[,-3]
xy <- rbind(xy, xy[1,])
Sr1 <- Polygon(xy)
Srs1 = Polygons(list(Sr1), "s1")
SpP = SpatialPolygons(list(Srs1))
plot(SpP)
Here is my real data: https://drive.google.com/file/d/0B8QG4cbDqH0UOUlobnlWaDgwOWs/edit?usp=sharing
In a sense, you have answered your own question.
Assuming you have a set of points, and you use ahull(...) in the alphahull package to generate the α-convex hull, you can extract the points on the boundary, in the correct order, directly from the ahull object. Here is an example:
library(sp)
library(alphahull)
set.seed(1) # for reproducible example
X <- rnorm(100)
Y <- rnorm(100)
plot(X,Y)
XY <- cbind(X,Y)
hull <- ahull(XY,alpha=1)
plot(hull)
# extract the row numbers of the boundary points, in convex order.
indx=hull$arcs[,"end1"]
points <- XY[indx,] # extract the boundary points from XY
points <- rbind(points,points[1,]) # add the closing point
# create the SpatialPolygons object
SpP = SpatialPolygons(list(Polygons(list(Polygon(points)),ID="s1")))
plot(SpP)
points(XY)
EDIT Response to OP's providing their dataset.
ahull(...) seems to fail, without warning, with your dataset: it does not produce any hulls. After a bit of experimentation, it looks like the problem has to do with the magnitude of the x,y values. If I divide everything by 1000, it works. No idea what's going on with that (perhaps someone else will provide an insight?). Anyway, here's the code and the result:
library(sp)
library(alphahull)
df <- read.csv("ahull problem.csv")
hull <- ahull(df[2:3]/1000,alpha=2)
plot(hull)
# extract the row numbers of the boundary points, in convex order.
indx=hull$arcs[,"end1"]
points <- df[indx,2:3] # extract the boundary points from df
points <- rbind(points,points[1,]) # add the closing point
# create the SpatialPolygons object
SpP = SpatialPolygons(list(Polygons(list(Polygon(points)),ID="s1")))
plot(SpP)
points(df[2:3])
Note also that alpha=2. Setting alpha=1 with this dataset actually generates 2 hulls, one with 1 point and one with all the other points. Setting alpha=2 creates 1 hull.
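For simple rings like the toy example in the question, a different and much cruder option is to sort the vertices by their angle around the centroid. This is a sketch of that technique only, not of what ahull does, and it is only reliable when the shape is convex or at least star-shaped as seen from its centroid.

```r
# Ring vertices from the question (without the closing point), shuffled
x <- c(1, 2, 3, 3, 2, 1)
y <- c(1, 1, 1, 3, 3, 2)
set.seed(1)
shuffled <- cbind(x, y)[sample(6), ]

# Angle of every vertex as seen from the centroid
ctr <- colMeans(shuffled)
ang <- atan2(shuffled[, "y"] - ctr["y"], shuffled[, "x"] - ctr["x"])

# Sort counter-clockwise by angle and close the ring
ring <- shuffled[order(ang), ]
ring <- rbind(ring, ring[1, ])
ring
```

The resulting matrix can be passed to Polygon() in the same way as xy in the question.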

Resources