I'm working with two dataframes in R: a "red" dataframe and a "black" dataframe. In both there are two columns representing the coordinates.
I used a plot to explain what I want to do.
I would like to select all the points from the "red" dataframe that are beyond the "black" line, i.e. all the points that lie outside the area of the polygon delimited by the black points.
My previous answer only showed how not to draw the points outside the polygon. To actually identify the points outside the polygon, you can use the function pip2d from the package ptinpoly. It returns negative values for points outside the polygon.
Example:
library(ptinpoly)
poly.vertices <- data.frame(x=c(20,40,80,50,40,30), y=c(30,20,70,80,50,60))
p <- data.frame(x=runif(100, min=0, max=100), y=runif(100, min=0, max=100))
outside <- (pip2d(as.matrix(poly.vertices), as.matrix(p)) < 0)
plot(p$x, p$y, col=ifelse(outside, "red", "black"))
polygon(poly.vertices$x, poly.vertices$y, border="blue", col=NA)
The same can be achieved with the function PtInPoly from the package DescTools, which returns zero for points outside the polygon. The implementation in ptinpoly, however, has the advantage of implementing the particularly efficient algorithm described in
J. Liu, Y.Q. Chen, J.M. Maisog, G. Luta: "A new point containment test algorithm based on preprocessing and determining triangles." Computer-Aided Design, Volume 42, Issue 12, December 2010, Pages 1143-1150
Edit: Out of curiosity, I have compared the runtime of ptinpoly::pip2d and DescTools::PtInPoly with microbenchmark and N=50000 points, and pip2d turned out to be considerably faster:
> microbenchmark(outside.pip2d(), outside.PtInPoly())
Unit: milliseconds
               expr       min        lq      mean   median        uq      max neval
    outside.pip2d()  3.375084  3.421631  4.459051  3.48939  4.251395 65.97793   100
 outside.PtInPoly() 27.537927 27.666688 28.739288 27.97984 28.514595 90.11313   100
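For readers who want to see what such a test does internally, here is a minimal base R sketch of the classic even-odd (ray casting) rule. This is not the algorithm implemented in ptinpoly or DescTools, the function name point_in_polygon is made up for this illustration, and points lying exactly on the boundary are not treated specially.

```r
# Even-odd (ray casting) point-in-polygon test in base R.
# px, py: coordinates of the query points (vectors of equal length)
# vx, vy: polygon vertices in drawing order (the polygon is closed implicitly)
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- rep(FALSE, length(px))
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i cross the horizontal line y = py?
    straddles <- (vy[i] > py) != (vy[j] > py)
    # x coordinate at which the edge meets that horizontal line
    x_cross <- (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]
    # Count the crossing only if it lies to the right of the query point
    hit <- straddles & (px < x_cross)
    inside <- xor(inside, hit)
    j <- i
  }
  inside
}

# Same polygon as in the example above
poly.vertices <- data.frame(x = c(20, 40, 80, 50, 40, 30),
                            y = c(30, 20, 70, 80, 50, 60))
p <- data.frame(x = c(50, 5, 90, 35), y = c(50, 5, 50, 65))
outside <- !point_in_polygon(p$x, p$y, poly.vertices$x, poly.vertices$y)
p[outside, ]
```

A point is inside when a ray drawn from it to the right crosses the polygon outline an odd number of times, which is why the loop toggles `inside` once per crossing.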
You could use the sf package to define a convex hull and intersect your target points with that polygon.
Define a convex hull based on black:
library(sf)
set.seed(99)
red <- data.frame(x = runif(100,-10,10), y = runif(100,-4,4))
black <- data.frame(x = runif(100,-8,8), y = runif(100,-4,3))
# Convert df to point feature
blk <- st_as_sf(black, coords = c("x", "y"))
# Convert to multipoint
blk_mp <- st_combine(blk)
# Define convex hull
blk_poly <- st_convex_hull(blk_mp)
plot(black)
points(red, col = "red")
plot(blk_poly, add = TRUE)
Intersecting red with the convex hull returns red within that polygon:
rd <- st_as_sf(red, coords = c("x", "y"))
rd_inside <- st_intersection(rd, blk_poly)
plot(black)
points(red)
plot(blk_poly, add = TRUE)
plot(rd_inside, pch = 24, col = "red", bg = "red", add = TRUE)
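Since the question asks for the red points outside the black outline, the intersection can also be turned around. The following is only a sketch under the same assumption as above (the black outline is approximated by its convex hull); it uses st_intersects with sparse = FALSE to get a logical flag per red point, and the names inside, outside and rd_outside are made up for this example.

```r
library(sf)

set.seed(99)
red   <- data.frame(x = runif(100, -10, 10), y = runif(100, -4, 4))
black <- data.frame(x = runif(100, -8, 8),   y = runif(100, -4, 3))

# Convex hull of the black points, built as in the answer above
blk_poly <- st_convex_hull(st_combine(st_as_sf(black, coords = c("x", "y"))))
rd <- st_as_sf(red, coords = c("x", "y"))

# One row per red point, one column per polygon (here a single hull)
inside  <- st_intersects(rd, blk_poly, sparse = FALSE)[, 1]
outside <- !inside

# Red points that lie beyond the hull
rd_outside <- rd[outside, ]

plot(black)
points(red)
plot(blk_poly, add = TRUE)
plot(rd_outside, pch = 24, col = "red", bg = "red", add = TRUE)
```

If the black outline is not convex, the hull will include area that is actually outside the outline, so some red points would be wrongly flagged as inside.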
One possible solution is to draw the polygon after the points and fill its outer area white. This cannot be done directly with polygon or polypath, because these functions can only fill the interior of a polygon. You can, however, fill the area between two polygons with polypath. Thus you can add a second polygon that encompasses (or goes beyond) the borders of your plot.
Here is an example that works in base R:
p.outer <- list(x=c(0,100,100,0), y=c(0,0,100,100))
p.inner <- list(x=c(20,40,80,50,40,30), y=c(30,20,70,80,50,60))
plot(p.outer, type="n")
points(runif(100, min=0, max=100), runif(100, min=0, max=100))
polypath(x=c(p.outer$x, NA, p.inner$x), y = c(p.outer$y, NA, p.inner$y), col ="white", rule="evenodd")
Using the function PtInPoly from the package DescTools, as suggested by @cdalitz, I solved the problem. This function returns a data frame with the coordinates of the points (in my case the red coordinates) and a third column "pip" that is 1 if the point is within the polygon and 0 if it is outside.
I will use another dataset to show you the result:
library(ggplot2)
try <- DescTools::PtInPoly(pnts = red[,c("x","y")], poly.pnts = black[,c("x","y")])
ggplot()+
geom_point(try, mapping = aes(x = x, y = y, color = as.character(pip))) +
geom_polygon(data = black, mapping = aes(x,y))
Seems that you can reconstruct the edges of the black polygon by simply joining every point to its nearest neighbor and its nearest neighbor in the opposite direction. Then perform point-in-polygon tests.
I have 3D meshes representing closed surfaces not necessarily convex for which I would like to get orthographic projections onto arbitrary directions (to put in context, the 3D meshes represent satellites and the end goal is to use the projections to calculate atmospheric drag).
As a first step, I am just aiming to compute the surface area of the resulting projection. Is there any way to perform such operation with rgl? Since the meshes represent closed surfaces, the projections will not contain multiple disconnected polygons.
I believe I can get the set of triangles/quads visible from a given direction by using the facing3d() function, specifying the direction in the up argument. But I am unsure on how to proceed from there.
You can do the projections using the rgl::shadow3d() function, and calculate area using geometry::polyarea(). For example,
library(rgl)
library(geometry)
satellite <- translate3d(icosahedron3d(), x = 0, y = 0, z = 5)
vertices <- asEuclidean2(satellite$vb)
xrange <- range(vertices[1,])
yrange <- range(vertices[2,])
floor <- mesh3d(x = c(2*xrange, 2*rev(xrange)),
y = rep(2*yrange, each = 2),
z = 0, quads = 1:4)
open3d()
#> glX
#> 1
shadow <- shadow3d(floor, satellite, plot = FALSE,
minVertices=1000 # Need this to get a good shadow
)
shade3d(satellite, col= "red")
shade3d(floor, col = "white", polygon_offset = 1, alpha = 0.1)
shade3d(shadow, col = "gray")
vertices <- unique(t(asEuclidean2(shadow$vb)))[,1:2]
hull <- chull(vertices)
hullx <- vertices[hull,1]
hully <- vertices[hull,2]
plot(c(hullx, hullx[1]), c(hully, hully[1]), type = "l")
polyarea(hullx, hully)
#> [1] 3.266855
Created on 2022-12-13 with reprex v2.0.2
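To make the last step less of a black box: the area of a simple polygon such as the convex hull can be computed with the shoelace formula. Below is a base R sketch of that idea; shoelace_area is a name invented for this illustration and is not part of rgl or geometry, and the toy points stand in for the projected shadow vertices.

```r
# Shoelace formula: area of a simple polygon from its ordered vertices
shoelace_area <- function(x, y) {
  n <- length(x)
  nxt <- c(2:n, 1)                      # index of the following vertex
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2 # abs() makes vertex orientation irrelevant
}

# Toy "projected vertices": corners of a unit square plus one interior point
pts <- cbind(x = c(0, 1, 1, 0, 0.5), y = c(0, 0, 1, 1, 0.5))
hull <- chull(pts)                      # indices of the hull vertices, in order
area <- shoelace_area(pts[hull, 1], pts[hull, 2])
area
```

Note that the hull area only equals the shadow area when the projection is convex; for a non-convex shadow the hull overestimates it.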
I am new to R and (unsupervised) machine learning. I'm trying to find out the best cluster solution for my data in R.
What is my data about?
I have a dataset with +/- 800 long / lat WGS84 coordinates in one city.
Long is in the range 6.90 - 6.95
lat is in the range 52.29 - 52.33
What do I want?
I want to find "hotspots" based on their density. For example: a minimum of 5 long/lat points within a range of 50 meters. This is a point plot example:
Why do I want this?
For example: let's assume that every single point is a car accident. By clustering the points I hope to see which areas need attention (an area with at least n points within a range of x meters needs attention).
What have I found?
The following clustering algorithms seem possible for my solution:
DBscan (https://cran.r-project.org/web/packages/dbscan/dbscan.pdf)
HDBscan(https://cran.r-project.org/web/packages/dbscan/vignettes/hdbscan.html)
OPTICS (https://www.rdocumentation.org/packages/dbscan/versions/0.9-8/topics/optics)
City Clustering Algorithm (https://cran.r-project.org/web/packages/osc/vignettes/paper.pdf)
My questions
What is the best solution or algorithm for my case in R?
Is it true that I have to convert my long/lat to a distance / Haversine matrix first?
I found something interesting at: https://gis.stackexchange.com/questions/64392/finding-clusters-of-points-based-distance-rule-using-r
I changed this code a bit, using the outliers as places where a lot happens.
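library(sp)        # SpatialPointsDataFrame, CRS
library(geosphere) # distm
library(dplyr)     # %>%, count, filter, select
# 1. Make spatialpointsdataframe #
xy <- SpatialPointsDataFrame(
matrix(c(x,y), ncol=2), data.frame(ID=seq_along(x)),
proj4string=CRS("+proj=longlat +ellps=WGS84 +datum=WGS84"))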
# 2. Use DISTM function to generate distance matrix.#
mdist <- distm(xy)
# 3. Use hierarchical clustering with complete method#
hc <- hclust(as.dist(mdist), method="complete")
# 4. Show dendrogram#
plot(hc, labels = input$street, xlab="", sub="",cex=0.7)
# 5. Set distance: in my case 300 meter#
d=300
# 6. define clusters based on a tree "height" cutoff "d" and add them to the SpDataFrame
xy$clust <- cutree(hc, h=d)
# 7. Add clusters to dataset#
input$cluster <- xy@data[["clust"]]
# 8. Plot clusters #
plot(input$long, input$lat, col=input$cluster, pch=20)
text(input$long, input$lat, labels =input$cluster)
# 9. Count n in cluster#
selection2 <- input %>% count(cluster)
# 10. Make a boxplot #
boxplot(selection2$n)
#11. Get first outlier#
outlier <- boxplot.stats(selection2$n)$out
outlier <- sort(outlier)
outlier <- as.numeric(outlier[1])
#12. Filter clusters greater than outlier#
selectie3 <- selection2 %>% filter(n >= outlier) %>% select(cluster)
#13. Make a new DF with all outlier clusters#
heatclusters <- input %>% filter(cluster%in% c(selectie3$cluster))
#14. Plot outlier clusters#
plot(heatclusters$long, heatclusters$lat, col=heatclusters$cluster)
#15. Plot on density map ##
googlemap + geom_point(aes(x=long , y=lat), data=heatclusters, color="red", size=0.1, shape=".") +
stat_density2d(data=heatclusters,
aes(x =long, y =lat, fill= ..level..), alpha = .2, size = 0.1,
bins = 10, geom = "polygon") + scale_fill_gradient(low = "green", high = "red")
I don't know if this is a good solution, but it seems to work. Maybe someone has another suggestion?
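On the Haversine question, and as a more direct reading of the rule "at least 5 points within 50 meters", here is a base R sketch. Everything in it is an assumption made for illustration: the coordinates are synthetic, the helper haversine_m is hand-written (Earth radius taken as 6371 km), and the thresholds are the ones from the question.

```r
# Great-circle distance in meters between lon/lat points (degrees)
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Synthetic data in the coordinate range given in the question
set.seed(1)
lon <- runif(200, 6.90, 6.95)
lat <- runif(200, 52.29, 52.33)
# Add six points less than a meter apart so that one dense spot surely exists
lon <- c(lon, 6.92 + (0:5) * 1e-5)
lat <- c(lat, rep(52.31, 6))

# Full pairwise distance matrix in meters
idx <- seq_along(lon)
d <- outer(idx, idx, function(i, j) haversine_m(lon[i], lat[i], lon[j], lat[j]))

# Number of points within 50 m of each point (the point itself is counted)
n_within <- rowSums(d <= 50)
hotspot <- n_within >= 5

plot(lon, lat, pch = 20, col = ifelse(hotspot, "red", "grey"))
```

With about 800 points a full distance matrix is small enough to compute this way. The same matrix, wrapped in as.dist(), could presumably also be handed to a density-based method such as DBSCAN with eps = 50 and minPts = 5, which would then group neighbouring hotspot points into clusters.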
I have a series of polygons and points, with each polygon containing a point. In R, I want to determine the maximum distance from each point to the edge of the polygon that contains it.
I looked at using the rgeos gDistance function but this returns 0 for points within polygons.
Using an example polygon and a point that falls within it, this is what I've coded so far, but I'm getting a distance of 0 rather than the distance from the point to the polygon edges.
library(rgeos)
pt1 = readWKT("POINT(0.5 0.25)")
p1 = readWKT("POLYGON((0 0,1 0,1 1,0 1,0 0))")
gDistance(pt1, p1)
# 0
Does a function exist in R or an R package that can determine distances for points within polygons to the polygon edge?
Much appreciated in advance.
Solution using spatstat and the built-in dataset chorley:
library(spatstat)
W <- Window(chorley) # Polygonal window of the chorley dataset
p <- list(x = c(350, 355), y = c(415, 425)) # Two points in polygon
plot(W, main = "")
points(p, col = c("red", "blue"), cex = 1.5)
v <- vertices(W) # Polygon vertices
d <- crossdist(v$x, v$y, p$x, p$y) # 2-column matrix of cross distances
i1 <- which.max(d[,1]) # Index of max dist for first (red) point
i2 <- which.max(d[,2]) # Index of max dist for second (blue) point
plot(W, main = "")
points(p, col = c("red", "blue"), cex = 1.5)
points(v$x[c(i1,i2)], v$y[c(i1,i2)], col = c("red", "blue"), cex = 1.5)
d[i1,1] # Max dist for first (red) point
#> [1] 21.35535
d[i2,2] # Max dist for second (blue) point
#> [1] 15.88226
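Note that the code above measures distances to the polygon vertices. If the distance to the edges themselves is needed (for instance the nearest edge), a point-to-segment distance can be written in a few lines of base R. This is only a sketch using the unit square and the point from the question; point_segment_dist is a helper name invented here.

```r
# Distance from point (px, py) to the segments (x1, y1)-(x2, y2), vectorised
point_segment_dist <- function(px, py, x1, y1, x2, y2) {
  dx <- x2 - x1
  dy <- y2 - y1
  len2 <- dx^2 + dy^2
  # Position of the perpendicular foot along each segment, clamped to [0, 1]
  t <- ifelse(len2 > 0, ((px - x1) * dx + (py - y1) * dy) / len2, 0)
  t <- pmin(1, pmax(0, t))
  sqrt((px - (x1 + t * dx))^2 + (py - (y1 + t * dy))^2)
}

# Unit square POLYGON((0 0,1 0,1 1,0 1,0 0)) and POINT(0.5 0.25)
vx <- c(0, 1, 1, 0)
vy <- c(0, 0, 1, 1)
nx <- c(vx[-1], vx[1])  # end point of each edge
ny <- c(vy[-1], vy[1])

edge_d <- point_segment_dist(0.5, 0.25, vx, vy, nx, ny)
nearest_edge <- min(edge_d)                                # 0.25
farthest_vertex <- max(sqrt((vx - 0.5)^2 + (vy - 0.25)^2)) # about 0.901
```

For a point inside a polygon, the farthest point of the outline is always a vertex, so the maximum over the vertices (as in the spatstat code) and the maximum over the edges coincide; only the minimum needs the segment distance.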
I have a grid of rectangles, whose coordinates are stored in the variable say, 'gridPoints' as shown below:
gridData.Grid=GridTopology(c(min(data$LATITUDE),min(data$LONGITUDE)),c(0.005,0.005),c(32,32));
gridPoints = as.data.frame(coordinates(gridData.Grid))[1:1000,];
names(gridPoints) = c("LATITUDE","LONGITUDE");
plot(gridPoints,col=4);
points(data,col=2);
When plotted, these are the black points in the image,
Now, I have another data set of points called, say, 'data', which when plotted are the blue points above.
I would want a count of how many blue points fall within each rectangle in the grid. Each rectangle can be represented by the center of the rectangle, along with the corresponding count of blue points within it in the output. Also, if the blue point lies on any of the sides of the rectangle, it can be considered as lying within the rectangle while making the count. The plot has the blue and black points looking like circles, but they are just standard points/coordinates and hence, much smaller than the circles. In a special case, the rectangle can also be a square.
Try this,
x <- seq(0,10,by=2)
y <- seq(0, 30, by=10)
grid <- expand.grid(x, y)
N <- 100
points <- cbind(runif(N, 0, 10), runif(N, 0, 30))
plot(grid, t="n", xaxs="i", yaxs="i")
points(points, col="blue", pch="+")
abline(v=x, h=y)
binxy <- data.frame(x=findInterval(points[,1], x),
y=findInterval(points[,2], y))
(results <- table(binxy))
d <- as.data.frame.table(results)
xx <- x[-length(x)] + 0.5*diff(x)
d$x <- xx[d$x]
yy <- y[-length(y)] + 0.5*diff(y)
d$y <- yy[d$y]
with(d, text(x, y, label=Freq))
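A small variation on the code above, sketched under my own assumptions about the desired edge handling: findInterval has a rightmost.closed argument, so points lying exactly on the top or right border of the grid are still counted, and wrapping the bin indices in factor() keeps empty rectangles in the table with a count of zero. The five points are made up so that the result can be checked by hand.

```r
x_breaks <- seq(0, 10, by = 2)   # 5 columns of rectangles
y_breaks <- seq(0, 30, by = 10)  # 3 rows of rectangles

# Hand-picked points, several of them exactly on rectangle borders
pts <- cbind(x = c(1, 2, 10, 9.5, 0), y = c(5, 10, 30, 29, 0))

# rightmost.closed = TRUE puts x == 10 and y == 30 into the last bin
ix <- findInterval(pts[, "x"], x_breaks, rightmost.closed = TRUE)
iy <- findInterval(pts[, "y"], y_breaks, rightmost.closed = TRUE)

# Fixed factor levels so that empty rectangles show up as 0
counts <- table(
  x_bin = factor(ix, levels = seq_len(length(x_breaks) - 1)),
  y_bin = factor(iy, levels = seq_len(length(y_breaks) - 1))
)
counts
```

A point on an interior border shared by two rectangles is assigned to the rectangle on its right or above it, so it is counted once rather than in both neighbours.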
A more general approach (may be overkill for this case, but if you generalize to arbitrary polygons it will still work) is to use the over function in the sp package. This will find which polygon each point is contained in (then you can count them up).
You will need to do some conversions up front (to spatial objects) but this method will work with more complicated polygons than rectangles.
If all the rectangles are exactly the same size, then you could use k nearest neighbor techniques using the centers of the rectangles, see the knn and knn1 functions in the class package.
I am trying to find the orthogonal distance between a set of location coordinates and a set of lines (roads or rivers). The set of points are in the form of latitude/longitude pairs, and the lines are in a shapefile (.shp). Plotting them on a map is not a problem, using either maptools or PBSmapping. But my basic problem is to find the minimum distance one has to travel from a location to reach a road or a river. Is there any way to do this in R?
If I understand correctly, you can do this simply enough with gDistance in the rgeos package.
Read in the lines as SpatialLines/DataFrame and points as SpatialPoints/DataFrame and then loop over each point calculating the distance each time:
require(rgeos)
## untested code
shortest.dists <- numeric(nrow(sp.pts))
for (i in seq_len(nrow(sp.pts))) {
shortest.dists[i] <- gDistance(sp.pts[i,], sp.lns)
}
Here sp.pts is the Spatial points object, and sp.lns is the Spatial lines object.
You must loop so that you only compare a single coordinate in sp.pts with the entirety of all lines geometries in sp.lns, otherwise you get the distance from an aggregate value across all points.
Since your data are in latitude/longitude you should transform both the lines and points to a suitable projection since the gDistance function assumes Cartesian distance.
MORE DISCUSSION AND EXAMPLE (edit)
It would be neat to get the nearest point on the line/s rather than just the distance, but this opens another option which is whether you need the nearest coordinate along a line, or an actual intersection with a line segment that is closer than any existing vertex. If your vertices are dense enough that the difference doesn't matter, then use spDistsN1 in the sp package. You'd have to extract all the coordinates from every line in the set (not hard, but a bit ugly) and then loop over each point of interest calculating the distance to the line vertices - then you can find which is the shortest and select that coordinate from the set of vertices, so you can have the distance and the coordinate easily. There's no need to project either since the function can use ellipsoidal distances with longlat = TRUE argument.
library(maptools)
## simple global data set, which we coerce to Lines
data(wrld_simpl)
wrld_lines <- as(wrld_simpl, "SpatialLinesDataFrame")
## get every coordinate as a simple matrix (scary but quick)
wrld_coords <- do.call("rbind", lapply(wrld_lines@lines, function(x1) do.call("rbind", lapply(x1@Lines, function(x2) x2@coords[-nrow(x2@coords), ]))))
Check it out interactively, you'll have to modify this to save the coords or minimum distances. This will plot up the lines and wait for you to click anywhere in the plot, then it will draw a line from your click to the nearest vertex on a line.
## no out of bounds clicking . . .
par(mar = c(0, 0, 0, 0), xaxs = "i", yaxs = "i")
plot(wrld_lines, asp = "")
n <- 5
for (i in seq_len(n)) {
xy <- matrix(unlist(locator(1)), ncol = 2)
all.dists <- spDistsN1(wrld_coords, xy, longlat = TRUE)
min.index <- which.min(all.dists)
points(xy, pch = "X")
lines(rbind(xy, wrld_coords[min.index, , drop = FALSE]), col = "green", lwd = 2)
}
The geosphere package has the dist2Line function that does this for lon/lat data. It can use Spatial* objects or matrices.
library(geosphere)
line <- rbind(c(-180,-20), c(-150,-10), c(-140,55), c(10, 0), c(-140,-60))
pnts <- rbind(c(-170,0), c(-75,0), c(-70,-10), c(-80,20), c(-100,-50),
c(-100,-60), c(-100,-40), c(-100,-20), c(-100,-10), c(-100,0))
d <- dist2Line(pnts, line)
d
Illustration of the results
plot( makeLine(line), type='l')
points(line)
points(pnts, col='blue', pch=20)
points(d[,2], d[,3], col='red', pch='x')
for (i in 1:nrow(d)) lines(gcIntermediate(pnts[i,], d[i,2:3], 10), lwd=2)
Looks like this can be done in the sf package using the st_distance function.
You pass your two sf objects to the function. As with the other solutions, you iterate over your points so that, for each point, the function returns its distance to each roadway feature. Then take the minimum of the resulting vector for the shortest distance.
# Solution for one point
min(st_distance(roads_sf, points_sf[1, ]))
# Iterate over all points using sapply
sapply(seq_len(nrow(points_sf)), function(x) min(st_distance(roads_sf, points_sf[x, ])))
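As a possible alternative to the explicit loop: st_distance also accepts the two complete sf objects and returns a matrix with one row per point and one column per line, so the shortest distance is the row-wise minimum. The toy geometries below are invented for the sketch and carry no CRS, so distances are plain planar units; for real lon/lat data you would set or transform the CRS first.

```r
library(sf)

# Two toy "roads": horizontal lines at y = 0 and y = 5
roads_sf <- st_sf(geometry = st_sfc(
  st_linestring(rbind(c(0, 0), c(10, 0))),
  st_linestring(rbind(c(0, 5), c(10, 5)))
))

# Three toy locations
points_sf <- st_as_sf(data.frame(x = c(0, 3, 12), y = c(1, 4, 0)),
                      coords = c("x", "y"))

# 3 x 2 matrix: distance from every point to every road
d <- st_distance(points_sf, roads_sf)

# Shortest distance per point and the index of the road it belongs to
shortest <- as.numeric(apply(d, 1, min))
nearest_road <- apply(d, 1, which.min)
shortest
```

The third point lies beyond the end of both lines, so its distance is measured to the end point of the first road rather than to a perpendicular foot.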