In previous versions of the spsurvey package, it was possible to draw random points within polygons in a shapefile using a somewhat complicated design specification. (See here for an example).
The newly updated version of spsurvey (5.0.1) appears very user-friendly, except I cannot figure out how to perform a GRTS draw of more than one point within polygons. Below is an example:
Suppose I want to draw 10 random points using GRTS within the states of Montana and Wyoming. The grts() call requires an sf object, so we first need to build one.
library(raster)
library(sf)
library(spsurvey)
## Get state outlines
US <- raster::getData(country = "United States", level = 1)
States.outline <- US[US$NAME_1 %in% c("Montana","Wyoming"),]
# convert to sf object
states.out <- st_as_sf(States.outline)
Then, if we want to stratify by state, and we want ten points from each, we need:
# Define the number of points to draw from each state
strata_n <- c(Montana = 10, Wyoming = 10)
The strata_n object then gets fed into the grts() call, with the NAME_1 variable being the state name.
# Attempt to make grts draw
grts(sframe = states.out,
     stratum_var = "NAME_1",
     n_base = strata_n)
This returns an error message:
During the check of the input to grtspts, one or more errors were
identified. Enter the following command to view all input error
messages: stopprnt() To view a subset of the errors (e.g., errors 1
and 5) enter stopprnt(m=c(1,5))
Running stopprnt() gives the following message:
Input Error Message n_base : Each stratum must have a sample
size no larger than the number of rows in 'sframe' representing that
stratum
This is a wonderfully clear message -- we can't draw more than one point from each polygon because the sf object only has a single row per state.
So: with the new and improved spsurvey package, how does one draw multiple points from within a polygon? Any tips or direction would be appreciated.
This is a bug. I have updated the development version, which can be installed (after installing the remotes package) by running
remotes::install_github("USEPA/spsurvey", ref = "develop")
It will likely be a few weeks before the changes in spsurvey are reflected on CRAN. Thanks for finding this.
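Once the development version is installed, the stratified call above should run as written. Here is a minimal sketch of what that looks like (the sites_base element follows the spsurvey 5.x help pages; treat the exact output structure as an assumption until you check ?grts):
# after installing the development version
library(spsurvey)
set.seed(51)  # GRTS draws are random; a seed makes this reproducible
eqprob <- grts(sframe = states.out,
               stratum_var = "NAME_1",
               n_base = strata_n)
eqprob$sites_base  # the selected sites, returned as an sf object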
First post :)
I've been transitioning my R code from sp to sf/stars, and one thing I'm still trying to grasp is how to account for the area of each grid cell.
Here's some example code to explain what I mean.
library(stars)
library(tidyverse)
# Reading in an example tif file, from stars() vignette
tif = system.file("tif/L7_ETMs.tif", package = "stars")
x = read_stars(tif)
x
# Get the area of each grid cell of the x object. Returns a stars object with "area" in units of [m^2]
x_area <- st_area(x)
x_area
I tried loosely adopting code from this vignette (https://github.com/r-spatial/stars/blob/master/vignettes/stars5.Rmd) to divide each value in x by its grid-cell area, but it's not working as expected (perhaps because my objects are stars and not sf?).
x$test1 = x$L7_ETMs.tif / x_area # Some computationally intensive calculation seems to happen, but doesn't produce the results I expect?
x$test1 = x$L7_ETMs.tif / x_area$area # Throws error, "non-conformable arrays"
What does seem to work is the following.
x %>%
  mutate(test1 = L7_ETMs.tif / units::set_units(as.numeric(x_area$area), m^2))
Here are the concerns I have with this code.
I worry that as I turn x_area$area (a matrix of areas in lat/lon) into a numeric vector, I may mess up the matching between each grid cell and its area. I did some rough testing to see if the areas match up the way I expect them to, but I can't escape the worry that this could lead to errors that are difficult to catch.
It just doesn't seem clean that I start with x_area in the correct units, only to remove and then set the units again during the computation.
Can someone suggest a "cleaner" implementation for what I'm trying to do, i.e. multiplying or dividing grid values by their cell areas while maintaining units throughout? Or convince me that the code I have is fine?
Thanks!
I do not know how to improve the stars code, but you can compare the results you get with this terra equivalent:
tif <- system.file("tif/L7_ETMs.tif", package = "stars")
library(terra)
r <- rast(tif)
a <- cellSize(r, sum=FALSE)
x <- r / a
With planar data, when it is safe to assume there is no distortion (generally not the case, but it can be), you could simply divide by the constant cell area computed from the resolution:
y <- r / prod(res(r))
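As a quick sanity check (a sketch; x here is the terra result r / a from above), the two versions should be nearly identical when the grid really is undistorted:
# compare cell by cell; large differences indicate real distortion
summary(values(x - y))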
I want to assess the degree of spatial proximity of each point to other equivalent points by looking at the number of others within 400m (5 minute walk).
I have some points on a map.
I can draw a simple 400 m buffer around them.
I want to determine which buffers overlap and then count the number of overlaps.
This number of overlaps should relate back to the original point, so I can see which point has the highest number of overlaps: in other words, if I were to walk 400 m from that point, how many other points could I reach?
I've asked this question on GIS StackExchange, but I'm not sure it's going to get answered for ArcGIS, and I think I'd prefer to do the work in R.
This is what I'm aiming for
https://www.newham.gov.uk/Documents/Environment%20and%20planning/EB01.%20Evidence%20Base%20-%20Cumulative%20Impact%20V2.pdf
To simplify, here's some code:
# load packages
library(easypackages)
needed<-c("sf","raster","dplyr","spData","rgdal",
"tmap","leaflet","mapview","tmaptools","wesanderson","DataExplorer","readxl",
"sp" ,"rgisws","viridis","ggthemes","scales","tidyverse","lubridate","phecharts","stringr")
easypackages::libraries(needed)
## read in csv data; first column is assumed to be Easting and second Northing
polls<-st_as_sf(read.csv(url("https://www.caerphilly.gov.uk/CaerphillyDocs/FOI/Datasets_polling_stations_csv.aspx")),
coords = c("Easting","Northing"),crs = 27700)
polls_buffer_400 <- st_buffer(polls, 400)
polls_intersection <- st_intersection(x = polls_buffer_400, y = polls_buffer_400)
plot(polls_intersection$geometry)
That should show the overlapping buffers around the polling stations.
What I'd like to do is count the number of overlaps which is done here:
polls_intersection_grouped <- polls_intersection %>%
  group_by(Ballot.Box.Polling.Station) %>%
  count()
And this is the bit I'm not sure about: to get to the output I want (which will show "hotspots" of polling stations in this case), how do I colour things? How can I:
assess the degree of spatial proximity of each point to other equivalent points by looking at the number of others within 400m (5 minute walk)?
It's probably terribly bad form but here's my original GIS question
https://gis.stackexchange.com/questions/328577/buffer-analysis-of-points-counting-intersects-of-resulting-polygons
Edit:
This gives the intersections different colours, which is great:
plot(polls_intersection$geometry,col = sf.colors(categorical = TRUE, alpha = .5))
summary(lengths(st_intersects(polls_intersection)))
What am I colouring here? I mean it looks nice but I really don't know what I'm doing.
How can I: assess the degree of spatial proximity of each point to other equivalent points by looking at the number of others within 400m (5 minute walk)?
Here is how to add a column to your initial sf object of polling stations that tells you how many polling stations are within 400m of each feature:
Note that the minimum value is 1 because a polling station is always within 400m of itself.
# n_neighbors shows how many polling stations are within 400m
polls %>%
  mutate(n_neighbors = lengths(st_is_within_distance(polls, dist = 400)))
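A quick way to see those counts on a map (a sketch using sf's base plot method; polls_n is just a name I'm introducing here):
polls_n <- polls %>%
  mutate(n_neighbors = lengths(st_is_within_distance(polls, dist = 400)))
plot(polls_n["n_neighbors"])  # points coloured by neighbour count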
Similarly, for your sfc collection of intersecting polygons, you could add a column that counts the number of buffer polygons that contain each intersection polygon:
polls_intersection %>%
  mutate(n_overlaps = lengths(st_within(geometry, polls_buffer_400)))
And this is the bit I'm not sure about, to get to the output I want (which will show "Hotspots" of polling stations in this case) how do I colour things?
If you want to plot these things I highly recommend using ggplot2. It makes it very clear how you associate an attribute like colour with a specific variable.
For example, here is a plot mapping the alpha (transparency) of each polygon to a scaled version of the n_overlaps column:
library(ggplot2)
polls_intersection %>%
  mutate(n_overlaps = lengths(st_covered_by(geometry, polls_buffer_400))) %>%
  ggplot() +
  geom_sf(aes(alpha = 0.2 * n_overlaps), fill = "red")
Lastly, there should be a better way to generate your intersecting polygons that already counts overlaps. This is built into the st_intersection function for finding intersections of sfc objects with themselves.
However, your data in particular generates an error when you try to do this:
st_intersection(polls_buffer_400)
# > Error in CPL_nary_intersection(x) :
#> Evaluation error: TopologyException: side location conflict at 315321.69159061194 199694.6971799387.
I don't know what a "side location conflict" is. Maybe @edzer could help with that. However, most subsets of your data do not contain that conflict. For example:
# this version adds an n.overlaps column automatically:
st_intersection(polls_buffer_400[1:10, ]) %>%
  ggplot() + geom_sf(aes(alpha = 0.2 * n.overlaps), fill = "red")
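As an aside, one workaround you could try for the side location conflict (an assumption on my part, not verified against this dataset) is repairing the buffer geometries before the self-intersection:
# st_make_valid() is in sf (>= 1.0); older versions need the lwgeom package
st_intersection(st_make_valid(polls_buffer_400))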
I hope you can help me with this problem, which I can't find a way to overcome. Sorry if I make some mistakes while writing this post; my English is a bit rusty right now.
Here is the question. I have .shp data that I want to analyze in R. The .shp files can be either lines, representing the lines of traps we set to catch octopuses, or points located directly over those lines, representing where we captured one.
The question I'm trying to answer is: are the octopuses statistically grouped (clustered) or not?
After a bit of investigation, it seems to me that I need to use R and its linearK function to answer that question, using the libraries spatstat, maptools and sp.
Here is the code I'm using in RStudio:
Loading the libraries
library(spatstat)
library(maptools)
library(sp)
Creating a linnet object with the track
t1<- as.linnet(readShapeSpatial("./20170518/t1.shp"))
I get the following warning but it seems to work
Warning messages:
1: use rgdal::readOGR or sf::st_read
2: use rgdal::readOGR or sf::st_read
Plotting it to be sure everything is ok
plot(t1)
Creating a ppp object with the points
p1<- as.ppp(readShapeSpatial("./20170518/p1.shp"))
I get the same warning here, but the real problems start when I try to plot it:
> plot(p1)
Error in if (!is.vector(xrange) || length(xrange) != 2 || xrange[2L] < :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: Interpretation of arguments maxsize and markscale has changed (in spatstat version 1.37-0 and later). Size of a circle is now measured by its diameter.
2: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
3: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
4: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
5: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
6: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
7: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
Now what is left is to join the objects into an lpp object and analyze it with the linearK function:
> pt1 <- lpp(p1,t1)
> linearK(pt1)
Function value object (class ‘fv’)
for the function r -> K[L](r)
......................................
Math.label Description
r r distance argument r
est {hat(K)[L]}(r) estimated K[L](r)
......................................
Default plot formula: .~r
where “.” stands for ‘est’
Recommended range of argument r: [0, 815.64]
Available range of argument r: [0, 815.64]
This is my situation right now. What I don't know is why the plot function is not working with my ppp object, and how to interpret the return value of the linearK function. help(linearK) didn't provide any clue. Since I have a lot of tracks, each with its own set of points, my desired outcome would be some kind of summary like: x tracks analyzed, a grouped, b dispersed and c unknown.
Thank you for your time; I'll greatly appreciate it if you can help me solve this problem.
Edit: Here is a link to a zip file containing all the shp files of one day, both tracks and points, and a txt file with my code. https://drive.google.com/open?id=0B0uvwT-2l4A5ODJpOTdCekIxWUU
First two pieces of general advice: (1) each time you create a complicated object, print it at the terminal, to see if it is what you expected. (2) When you get an error, immediately type traceback() and copy the output. This will reveal exactly where the error is detected.
A ppp object must include a specification of the study region (window). In your code, the object p1 is created by converting data of class SpatialPointsDataFrame, which do not include a specification of the study region, into an object of class ppp via the function as.ppp.SpatialPointsDataFrame; the window is then guessed by taking the bounding box of the coordinates. Unfortunately, in your example there is only one data point in p1, so the default bounding box is a rectangle of width 0 and height 0. [This would have been revealed by printing p1.] Such objects can usually be handled by spatstat, but this particular object triggers a bug in the function plot.solist, which expects windows to have non-zero size. I will fix the bug, but...
In your case, I suggest you do
Window(p1) <- Window(t1)
immediately after creating p1. This will ensure that p1 has the window that you probably intended.
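After that, a quick check (a sketch, following the printing advice above):
p1               # printing should now show the window inherited from t1
plot(p1)         # the earlier plotting error should be gone
pt1 <- lpp(p1, t1)
linearK(pt1)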
If all else fails, read the spatstat vignette on shapefiles...
I have managed to find a solution. As Adrian Baddeley noticed, there was a problem with the owin object. That problem seems to be bypassed (not really solved) if I create the ppp object manually instead of converting my set of points.
I have also replaced the readShapeSpatial function with rgdal::readOGR, since the former is deprecated; that was the reason for the warnings I was getting.
This is the R script I'm using right now, commented to clarify:
# first install spatstat, maptools and sp, then load them
library(spatstat)
library(maptools)
library(sp)
# create an array of folders; more will be added once everything works fine
folders <- c("20170518")
for (f in folders) {
  # read all shp files from that folder, both points and tracks
  pointfiles <- list.files(paste("./", f, "/points", sep = ""), pattern = "*.shp$")
  trackfiles <- list.files(paste("./", f, "/tracks", sep = ""), pattern = "*.shp$")
  # for each point/track pair
  for (i in seq_along(pointfiles)) {
    # create a linnet object from the track
    t <- as.linnet(rgdal::readOGR(paste("./", f, "/tracks/", trackfiles[i], sep = "")))
    # plot(t)
    # read the corresponding set of points
    pre_p <- rgdal::readOGR(paste("./", f, "/points/", pointfiles[i], sep = ""))
    # plot(pre_p)
    # obtain the coordinates of the current set of points
    crds <- coordinates(pre_p)
    xc <- c()  # vector of x coords
    yc <- c()  # vector of y coords
    # not a very good way to fill my vectors, but it works for my study area
    for (v in crds) {
      if (v > 4000000) {
        yc <- c(yc, v)
      } else if (v < 4000000 && v > 700000) {
        xc <- c(xc, v)
      }
    }
    print(xc)
    print(yc)
    # create a ppp object from the x and y coords, with the window
    # extracted from the set of points
    p <- ppp(xc, yc, Window(as.ppp(pre_p)))
    # join them into an lpp object
    pt <- lpp(p, t)
    # plot(pt)
    # analyze it with the linearK function; nsim = 9 for testing purposes
    # envelope.lpp is the method for analyzing linear point patterns
    assign(paste("results", f, i, sep = "_"), envelope.lpp(pt, nsim = 9, fun = linearK))
  } # end for each points & track pair
}   # end for each day of study
So, as you can see, this script tests each points/track pair for each day for CSR, and it is working fine right now. Unfortunately I have not managed to create a report (or anything report-like) with the results yet, or even to fully understand them; I'll keep working on that. Of course I can use any advice you have, since this is my first try with R and many newbie mistakes will happen.
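A small sketch of one way to gather the envelope objects created with assign() above into a single list for later summarising (the name pattern matches the names built in the loop):
# collect all objects whose names start with "results_" into one list
results <- mget(ls(pattern = "^results_"))
length(results)  # how many point/track pairs were analyzed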
The script and the shp files with the updated folder structure can be found here (113 KB).
I am trying to solve the following problem in R :
I have a polygon object defined by a list l with two components x and y. The order defines the edges of the polygon.
For instance :
l=list(
x=c(-1.93400738955091,0.511747161547164,1.85047596846401,-1.4963460488281,-1.31613255558929,-0.0803828876660542,1.721752044722,-0.724002506376074,-2.08847609804132,2.13366860069641),
y=c(-1.02967154136169,1.53216851658359,-1.39564869249673,-1.21266011692921,1.6419616619241,-1.87141898897228,0.946605074767527,1.49557080147009,0.324443917837958,-0.517303529772633)
)
plot(l,type="b",pch=16)
points(l$x[c(10,1)],l$y[c(10,1)],type="b",pch=16)
Now what I am interested in is keeping only the outer boundary (but not the convex hull) of this polygon. The following picture highlights the points I'd like to keep:
points(
x=c(-1.13927707377209,-1.31613255249992,-1.3598262571216,0.511747159281619,0.264900107013767,0.671727215417383,-0.724002505140328,-1.93400738893304,-1.4811931364624,-1.45298543105533,-2.08847609804132,-1.40787406113029,-1.3598262571216,0.278826441754518,1.85047596733123,1.48615105742673,1.48615105742673,2.13366860069641,1.38016944537233,1.38016944537233,1.17232981688283,1.17232981688283,1.72175204307433,0.671727215417383,-1.496346, -0.08038289, -0.2824999),
y=c(1.13914087952916,1.64196166071069,0.949843643913108,1.53216851597378,1.27360509238768,1.18229006681548,1.49557080106148,-1.02967154055378,-0.972634663817139,-0.525818314106921,0.324443915423533,0.188755761926866,0.949843643913108,-1.30971824545964,-1.3956486896768,-0.59886540309968,-0.59886540309968,-0.517303527559411,-0.367082245352325,-0.367082245352325,0.0874657083966551,0.0874657083966551,0.94660507315481,1.18229006681548,-1.21266,-1.871419,-1.281255),
pch=16,
col="red",
cex=0.75
)
I am really clueless about whether there are tools to easily do that. The closest I have found is the polysimplify function in the polyclip package, which identifies all the points I need, but also outputs some points I do not need (inner points where segments intersect).
I actually found a solution (below). The following function does what I want, but I am unsure why it works (and whether it may fail).
Actually, the function below correctly identifies the points I want but outputs them in the wrong order, so it is still useless to me...
polygon.clean <- function(poly) {
  require(polyclip)
  poly.cleaned <- polysimplify(poly)
  x <- unlist(sapply(poly.cleaned, function(x) x$x))
  y <- unlist(sapply(poly.cleaned, function(x) x$y))
  x.src <- x[!x %in% x[duplicated(x)]]
  y.src <- y[!y %in% y[duplicated(y)]]
  poly.cleaned <- poly.cleaned[sapply(poly.cleaned, function(poly.sub, x, y) {
    any(poly.sub$x %in% x & poly.sub$y %in% y)
  }, x = x.src, y = y.src)]
  x <- unlist(sapply(poly.cleaned, function(x) {
    res <- x$x
    if (length(res) == 4) {
      res <- vector()
    }
    res
  }))
  y <- unlist(sapply(poly.cleaned, function(x) {
    res <- x$y
    if (length(res) == 4) {
      res <- vector()
    }
    res
  }))
  x <- c(x, x.src)
  y <- c(y, y.src)
  tester <- duplicated(x) & duplicated(y)
  x <- x[!tester]
  y <- y[!tester]
  list(x = x, y = y)
}
plot(l,type="b",pch=16)
points(l$x[c(10,1)],l$y[c(10,1)],type="b",pch=16)
points(polygon.clean(l),pch=16,cex=0.75,col="red")
Using rgeos routines, you first "node" your linestring to create all the intersections, then "polygonize" it, then "union" it to dissolve its insides.
First make a SpatialLines version of your data with duplicated first/last point:
library(sp)
library(rgeos)
coords = cbind(l$x, l$y); coords=rbind(coords,coords[1,])
s = SpatialLines(list(Lines(list(Line(coords)),ID=1)))
Then:
s_outer = gUnaryUnion(gPolygonize(gNode(s)))
Plot it thus:
plot(s,lwd=5)
plot(s_outer, lwd=2,border="red",add=TRUE)
If you want the coordinates of the surrounding polygon they are in the returned object and can be extracted with:
s_outer@polygons[[1]]@Polygons[[1]]@coords
# x y
# [1,] 0.27882644 -1.30971825
# [2,] -0.08038289 -1.87141899
# [3,] -0.28886517 -1.27867953
This assumes there's only one polygon, which might not be the case: suppose your line traces a figure-of-eight; then you'll get two polygons touching at a point. We don't know how free your jaggly line is to do things like that...
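If that happens, you can inspect how many pieces came back before extracting coordinates (a sketch):
# number of component polygons after the union
length(s_outer@polygons[[1]]@Polygons)
# coordinate matrices of every piece
lapply(s_outer@polygons[[1]]@Polygons, slot, "coords")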
I'm trying to do the same thing asked in this question, Cartogram + choropleth map in R, but starting from a SpatialPolygonsDataFrame and hoping to end up with the same type of object.
I could save the object as a shapefile, use scapetoad, reopen it and convert back, but I'd rather have it all within R so that the procedure is fully reproducible, and so that I can code dozens of variations automatically.
I've forked the Rcartogram code on github and added my efforts so far here.
Essentially what this demo does is create a SpatialGrid over the map, look up the population density at each point of the grid and convert this to a density matrix in the format required for cartogram() to work on. So far so good.
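For reference, here is a condensed sketch of that first stage (spdf and the density column are hypothetical names; the real code is in the fork linked above):
library(sp)
bb <- bbox(spdf)  # spdf: the SpatialPolygonsDataFrame
cs <- c(diff(bb[1, ]), diff(bb[2, ])) / 64  # 64 x 64 grid cells
gt <- GridTopology(cellcentre.offset = bb[, 1] + cs / 2,
                   cellsize = cs, cells.dim = c(64, 64))
pts <- SpatialPoints(coordinates(SpatialGrid(gt)),
                     proj4string = CRS(proj4string(spdf)))
# density of the polygon each grid point falls in; note the row/column
# ordering expected by cartogram() needs care here
dens <- matrix(over(pts, spdf)$density, nrow = 64)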
But, how to interpolate the original map points based on the output of cartogram()?
There are two problems here. The first is to get the map and grid into the same units to allow interpolation. The second is to access every point of every polygon, interpolate it, and keep them all in right order.
The grid is in grid units and the map is in projected units (long-lat in the example's case). Either the grid must be projected into long-lat, or the map into grid units. My thought is to make a fake CRS and use it with the spTransform() function in the rgdal package, since that handles every point in the object with minimal fuss.
Accessing every point is difficult because they are several layers down in the SpatialPolygonsDataFrame object: object > polygons > Polygons > lines > coords, I think. Any ideas on how to access these while keeping the structure of the overall map intact?
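For the second problem, here is a sketch of walking the slot hierarchy in place. (For polygons the path is @polygons[[i]]@Polygons[[j]]@coords, with no lines level; warp_points is a hypothetical stand-in for the interpolation step.)
for (i in seq_along(spdf@polygons)) {
  for (j in seq_along(spdf@polygons[[i]]@Polygons)) {
    crds <- spdf@polygons[[i]]@Polygons[[j]]@coords
    # interpolate the coordinates and write them back in place
    spdf@polygons[[i]]@Polygons[[j]]@coords <- warp_points(crds)
    # (a real implementation would also refresh area/labpt slots as needed)
  }
}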
This problem can be solved with the getcartr package, available on Chris Brunsdon's GitHub, as beautifully explicated in this blog post.
The quick.carto function does exactly what you want -- takes a SpatialPolygonsDataFrame as input and has a SpatialPolygonsDataFrame as output.
Reproducing the essence of the example in the blog post here in case the link goes dead, with my own style mixed in & typos fixed:
(Shapefile; World Bank population data)
library(getcartr)
library(maptools)
library(data.table)
world <- readShapePoly("TM_WORLD_BORDERS-0.3.shp")
#I use data.table, see blog post if you want a base approach;
# data.table wonks may be struck by the following step as seeming odd;
# see here: http://stackoverflow.com/questions/32380338
# and here: https://github.com/Rdatatable/data.table/issues/1310
# for some background on what's going on.
world@data <- setDT(world@data)
world.pop <- fread("sp.pop.totl_Indicator_en_csv_v2.csv",
select = c("Country Code", "2013"),
col.names = c("ISO3", "pop"))
world@data[world.pop, Population := as.numeric(i.pop), on = "ISO3"]
#calling quick.carto has internal calls to the
# necessary functions from Rcartogram
world.carto <- quick.carto(world, world$Population, blur = 0)
#plotting with a color scale
x <- world@data[!is.na(Population), log10(Population)]
ramp <- colorRampPalette(c("navy", "deepskyblue"))(21L)
xseq <- seq(from = min(x), to = max(x), length.out = 21L)
#annoying to deal with NAs...
cols <- ramp[sapply(x, function(y)
if (length(z <- which.min(abs(xseq - y)))) z else NA)]
plot(world.carto, col = cols,
main = paste0("Cartogram of the World's",
" Population by Country (2013)"))