st_join() inconsistent behaviour - should I trust the result from an older sf version? - r

I am trying to match points with long/lat coordinates to their associated regions, using a map from GADM and st_join(). The code is very simple:
x <- st_read(my_map)
y <- st_as_sf(my_points, coords = c('long', 'lat'))
st_crs(y) <- 4326
y <- st_join(y, x)
where both x and y are sf objects (x with polygons, y with points).
The issue is that of my home and office computers, one runs the code and the other throws an error.
My home computer, which runs the code without any issues, uses R 4.0.4 and sf 0.9-7 (which I realise is older) on Windows 10 x64.
My office computer runs R 4.0.5 with sf 1.0-2 on Windows 10 x64.
Here's the error code:
Error in s2_geography_from_wkb(x, oriented = oriented, check = check) :
Evaluation error: Found 4 features with invalid spherical geometry.
[201] Loop 2 is not valid: Edge 2 crosses edge 4
[935] Loop 16 edge 390 crosses loop 64 edge 6
[2667] Loop 20 edge 835 crosses loop 37 edge 24843
[3401] Loop 1 is not valid: Edge 818 crosses edge 820
Do you know if I can trust the output of the join from the older version of the package? I briefly looked at the issues on GitHub but couldn't find anything alarming; it might just be that they changed one default to FALSE. But I'm wondering whether this is something I should be wary of.
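For reference, a minimal sketch of the two usual ways around this error, assuming the difference comes from sf >= 1.0 switching to the s2 spherical geometry engine by default (sf 0.9-7 used planar GEOS geometry throughout):
library(sf)
# Option 1: repair the invalid polygons, then join on the sphere
x <- st_make_valid(x)
y <- st_join(y, x)
# Option 2: reproduce the old planar behaviour of sf 0.9-7
sf_use_s2(FALSE)
y <- st_join(y, x)
sf_use_s2(TRUE)  # restore the default afterwards
With option 2 the results should match the older machine, since planar intersection is exactly what sf 0.9-7 computed.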

Related

Trying to find neighbouring buffers in R with poly2nb()

I want to get a list of neighbours for each buffer. However, the returned nb list is empty.
require(sf)
require(spdep)
require(magrittr) # for %>%
us <- st_sf(tidycensus::county_laea) %>% st_transform(crs = 3857)
us$cent <- st_as_sf(st_centroid(us$geometry))
us$buff <- st_as_sf(st_buffer(us$cent, dist = 100000)) # buffer radius is 100 km
plot(st_geometry(us$geometry))
plot(st_geometry(us$cent), pch =4, add=T)
plot(st_geometry(us$buff), add=T, border = "red")
nb <- spdep::poly2nb(us$buff, queen = FALSE)
> nb
Neighbour list object:
Number of regions: 3143
Number of nonzero links: 0
Percentage nonzero weights: 0
Average number of links: 0
When I run poly2nb() on us$geometry, everything is fine:
us <- st_sf(tidycensus::county_laea) %>% st_transform(crs = 3857)
nb <- spdep::poly2nb(us$geometry, queen = FALSE)
> nb
Neighbour list object:
Number of regions: 3143
Number of nonzero links: 17572
Percentage nonzero weights: 0.1778822
Average number of links: 5.590837
7 regions with no links:
2788 2836 2995 3135 3140 3141 3143
Since you are running a rook (not queen) style neighbourhood list on a bunch of circular buffers, getting an empty list is expected behaviour.
This is what your us$buff object looks like when zoomed in:
Now think again about the definition of a rook style neighbourhood: two polygons are neighbours if they share a boundary consisting of more than one point (one point would be sufficient for queen). When in doubt I always think of Colorado and Arizona: they are queen type neighbours, but not rook type ones.
Given that all your buffer objects are circles, they can touch a neighbour in at most a single point. Overlap does not make a shared boundary, and touching in a single point is ruled out by the rook setting.
On the other hand, when you look at the original counties / the us$geometry object, you will see plenty of shared boundary lines, and a few rare occasions of touching points.
This is why the queen vs. rook setting rarely makes a noticeable difference for organically grown admin areas, but a big one for grid based ones.
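If what you actually want is "neighbours = overlapping buffers", here is a hedged sketch of two alternatives, both assuming the us object built above:
# 1) distance-based: two 100 km buffers overlap exactly when their
#    centres are less than 2 * 100 km apart
nb_dist <- spdep::dnearneigh(st_coordinates(us$cent), d1 = 0, d2 = 200000)
# 2) overlap-based: list the buffers each buffer intersects,
#    dropping the self-match
ov <- st_intersects(us$buff)
nb_ov <- lapply(seq_along(ov), function(i) setdiff(ov[[i]], i))
The distance-based version keeps you inside spdep's nb machinery, which is convenient if you need spatial weights downstream.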

Drawing GRTS points within polygons with spsurvey

In previous versions of the spsurvey package, it was possible to draw random points within polygons in a shapefile using a somewhat complicated design specification. (See here for an example).
The newly updated version of spsurvey (5.0.1) appears very user-friendly, except that I cannot figure out how to perform a GRTS draw of more than one point within a polygon. Below is an example:
Suppose I want to draw 10 random points using GRTS within the states of Montana and Wyoming. The grts() call requires an sf object, so we can get an sf object first.
library(raster)
library(sf)
library(spsurvey)
## Get state outlines
US <- raster::getData("GADM", country = "United States", level = 1)
States.outline <- US[US$NAME_1 %in% c("Montana","Wyoming"),]
# convert to sf object
states.out <- st_as_sf(States.outline)
Then, if we want to stratify by state, and we want ten points from each, we need:
# Define the number of points to draw from each state
strata_n <- c(Montana = 10, Wyoming = 10)
The strata_n object then gets fed into the grts() call, with the NAME_1 variable being the state name.
# Attempt to make grts draw
grts(
  sframe = states.out,
  stratum_var = "NAME_1",
  n_base = strata_n
)
This returns an error message:
During the check of the input to grtspts, one or more errors were
identified. Enter the following command to view all input error
messages: stopprnt() To view a subset of the errors (e.g., errors 1
and 5) enter stopprnt(m=c(1,5))
Running stopprnt() gives the following message:
Input Error Message n_base : Each stratum must have a sample
size no larger than the number of rows in 'sframe' representing that
stratum
This is a wonderfully clear message -- we can't draw more than one point from each polygon because the sf object only has a single row per state.
So: with the new and improved spsurvey package, how does one draw multiple points from within a polygon? Any tips or direction would be appreciated.
This is a bug. I have updated the development version, which can be installed (after installing the remotes package) by running
remotes::install_github("USEPA/spsurvey", ref = "develop")
It will likely be a few weeks before the changes in spsurvey are reflected on CRAN. Thanks for finding this.
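For what it's worth, a quick sanity check once the development version is installed (the sites_base / stratum names below follow my reading of the spsurvey 5.x output object; treat them as an assumption):
set.seed(42)
pts <- grts(sframe = states.out, stratum_var = "NAME_1", n_base = strata_n)
table(pts$sites_base$stratum) # expect 10 sites for Montana and 10 for Wyoming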

Error building network from vertices in spatstat

I have a very large road network (a shapefile) to read as a linear network in spatstat, so I am trying to build a basic network by reading vertices and edges, as discussed in Chapter 17 of the book Spatial Point Patterns by Baddeley et al.
I attach my data here
Using the code below I get the error Error: length(x0) == length(x1) is not TRUE. It is not clear to me what x0 and x1 are, so I cannot locate the problem.
library(maptools)
library(spatstat)
setwd("~/documents/rwork/traced/a")
pt <- readShapePoints("collected.shp") #read vertices from a shapefile.
edgeRecords<-read.delim("edgelist.txt") #read edge connectivity list
ed<-data.frame(from=edgeRecords$from,to=edgeRecords$to)
xx <- pt@bbox[1,] # x bounds of owin
yy <- pt@bbox[2,] # y bounds of owin
v <- ppp(x = pt@coords[,1], y = pt@coords[,2], xx, yy) # build point pattern of vertices
edg<-as.matrix(ed) # read node pairs as matrix
built_network<-linnet(v,edges = edg)
This results in error
Error: length(x0) == length(x1) is not TRUE
As noted in one of the comments above, GIS indexing starts from 0 while R indexing starts from 1.
So to solve this problem, I just added +1 to the edge matrix. If you collected your edge matrix from GIS software, it will contain references to node zero in either from_node or to_node. If your edge matrix in R is em, then add +1, like so: em + 1. A sample could look like this:
edgelist <- read.delim("edgelist.txt")
em <- matrix(c(edgelist$from, edgelist$to), ncol = 2) + 1
net <- linnet(v, edges = em) # v is the ppp object of vertices built above
plot(net)
This solved the problem for me. Hope it helps someone. If someone has another solution, please feel free to share.
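One extra guard that would surface this mismatch immediately: after the +1 shift, every edge index must point at an existing vertex, i.e. lie between 1 and npoints(v). A one-line check (just a suggestion, not part of the original answer):
stopifnot(min(em) >= 1, max(em) <= npoints(v))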

Error using getNOAA.bathy, are there restrictions on coordinates values?

I am trying to execute getNOAA.bathy from package marmap.
I can successfully execute the following (from here):
library(marmap)
getNOAA.bathy(lon1=-20,lon2=-90,lat1=50,lat2=20, resolution=10) -> a
plot(a, image=TRUE, deep=-6000, shallow=0, step=1000)
However, when I execute the following:
getNOAA.bathy(lon1=-80,lon2=-79.833333,lat1=32.7,lat2=32.833333, resolution=10) -> a
plot(a, image=TRUE, deep=-6000, shallow=0, step=1000)
I get the error:
Error in getNOAA.bathy(lon1 = -80, lon2 = -79.833333, lat1 = 32.7, lat2 = 32.833333, : The NOAA server cannot be reached
Questions:
Are there special restrictions on LAT/LON values? Am I miscalculating something here?
Are there "better" packages that support my LAT/LON values?
As stated in the help file for getNOAA.bathy(), the resolution argument is expressed in minutes. So resolution=10 means that the cells of your grid will have a dimension of 10 minutes in longitude by 10 minutes in latitude. The bigger the number, the worse the resolution. Given the size of your region, you need the highest resolution available for the ETOPO1 dataset (i.e. the database fetched by getNOAA.bathy()):
getNOAA.bathy(lon1=-80, lon2=-79.833333, lat1=32.7, lat2=32.833333, res=1)
That's definitely not hi-res (you get a grid of roughly 80 values: 8 latitude values by 10 longitude values), but it is the maximum you can get with getNOAA.bathy().
If you need a higher resolution, you should try other sources such as the GEBCO database and the readGEBCO() function of marmap. You should also have a look at sections 2 and 3 of the marmap-ImportExport vignette where other sources are listed.
vignette("marmap-ImportExport")
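For illustration, a hedged sketch of the GEBCO route; the file name is hypothetical, and you download the netCDF grid for your bounding box yourself from the GEBCO website:
library(marmap)
geb <- readGEBCO("gebco_subset.nc") # hypothetical file downloaded from GEBCO
plot(geb, image = TRUE)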

Location calculation- is UTM appropriate?

I want to calculate the location of point D from the locations of points A, B and C, where I know the bearing from A to D, from B to D, and from C to D.
In real terms, points A, B and C are three locations I have marked with my GPS, and point D is the location of a radio-collared animal I'm attempting to get a GPS location for. I get the angles by knowing the direction of the radio-collared animal relative to north.
I've written the algorithm, but I know I can't put GPS coordinates straight into it and will have to convert them in and then out again. I've been googling and I'm a bit confused: are Cartesian or UTM coordinates more appropriate for this?
How do I go about converting GPS coordinates to UTM? I've searched and I'm a bit confused. Some conversions talk of degrees, minutes and seconds; my GPS appears to give me an additional number, so it reads N 68.21.446 and W 12.14.284.
In case it's relevant, I've assumed that the area is 2D in my calculations to make things a bit simpler.
Here is the code though I'm not sure it's needed:
#10/09/2013
#Enter your points for locations A B and C
#AN and AW is your first GPS points AA is the angle
AN<-10
AW<-0
AA<-45
#BN and BW are your second
BN<-10
BW<-0
BA<-0
#CN and CW are your third
CN<-0
CW<-10
CA<-90
#Convert these to ?
#work out distance
#For each co-ordinate and angle, you need to calculate y=mx+c to make a line
#From these 3 lines, you can work out where they intersect
#If the angle is 0 it won't work, so make it very close to 0.
if(AA==0) {AA<-0.00001}
if(BA==0) {BA<-0.00001}
if(CA==0) {CA<-0.00001}
#Convert all angles to radians
AAr<-(AA*pi)/180
BAr<-(BA*pi)/180
CAr<-(CA*pi)/180
#Calculate M which is 1/tan(b)
AM<-1/tan(AAr)
BM<-1/tan(BAr)
CM<-1/tan(CAr)
#Calculate C the equation constant
#c=y-m*x
AC<-AW-AM*AN
BC<-BW-BM*BN
CC<-CW-CM*CN
#Calculate intersections
#A and B
XAB<-(AC-BC)/(BM-AM)
YAB<-(AM*XAB+AC)
#B and C
XBC<-(BC-CC)/(CM-BM)
YBC<-(BM*XBC+BC)
#C and A
XAC<-(CC-AC)/(AM-CM)
YAC<-(CM*XAC+CC)
#Work out average of these 3 points
(XofABC<-(XAB+XBC+XAC)/(3))
(YofABC<-(YAB+YBC+YAC)/(3))
#Convert this back into GPS coordinate
UTM coordinates are handy for this sort of operation, as they form a flat 2D x-y Cartesian system on a regular square grid.
But beware of their limitations, especially towards higher latitudes. And be careful that the zone and datum you choose are relevant to your location: the wrong choice will warp your coordinates badly.
Not sure why this is tagged R, though? The code looks like it should be fine.
Coordinate system transformations are done using the spTransform function in the rgdal package. You'll need to convert your coordinates to decimal degrees before you can convert them to UTM coords.
So, what is your "N 68.21.446" in decimal degrees? Well, I'm not sure. It's 68 + (21/60) plus something, but you need to find out what that last number is. It might be a) thousandths of a minute (and if its first digit is ever 6 or more, that interpretation would seem likely), or b) two digits for seconds followed by tenths of a second.
For a) N 68.21.446 is then 68 + (21/60) + (446/1000)/60 decimal degrees.
For b) N 68.21.446 is then 68 + (21/60) + (44/3600) + (6/36000) decimal degrees.
You'll have to use some string matching functions to split it up.
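A minimal parsing sketch, assuming interpretation a) above (thousandths of a minute); the function name is made up for illustration:
to_decimal_degrees <- function(s) {
  # "N 68.21.446" -> 68 + (21 + 446/1000)/60
  parts <- strsplit(sub("^[NSEW]\\s*", "", s), "\\.")[[1]]
  dd <- as.numeric(parts[1]) +
    (as.numeric(parts[2]) + as.numeric(parts[3]) / 1000) / 60
  if (grepl("^[SW]", s)) dd <- -dd # south/west are negative
  dd
}
to_decimal_degrees("N 68.21.446") # 68.35743
to_decimal_degrees("W 12.14.284") # -12.23807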
Once you've got decimal degrees, create a spatial points data frame with those numbers, set its CRS to your GPS coordinate system (probably EPSG code 4326) and then use spTransform to convert to your UTM zone: use the one appropriate for your longitude.
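And a sketch of that conversion with sp/rgdal, assuming the point parsed above and UTM zone 28 (which covers longitudes 18W to 12W; substitute the zone for your own longitude):
library(sp)
library(rgdal)
pts <- data.frame(lon = -12.238067, lat = 68.357433)
coordinates(pts) <- ~ lon + lat # promote to SpatialPoints
proj4string(pts) <- CRS("+proj=longlat +datum=WGS84") # i.e. EPSG:4326
utm <- spTransform(pts, CRS("+proj=utm +zone=28 +datum=WGS84"))
coordinates(utm) # easting / northing in metres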
Unless it's polar bears or emperor penguins, or the distances run to tens of km, the UTM coordinates should be a good approximation to a regular square grid. The bigger source of error is going to be your angular measurements!
On that subject, I did start writing an R package for doing location finding from radio direction finding equipment, implementing some of the statistical methods in the literature. You'll find that here: https://github.com/barryrowlingson/telemetr
If you have any comments on that package, address them to me via that github site, and not here on StackOverflow.
