Why does dput()/dput2() not work with Polygons / SpatialPolygons - r

I would like to ask another question, which includes SpatialPolygons. In order to make it reproducible I wanted to use dput() for the SpatialPolygons object, but it's not outputting a reproducible structure.
Why can I use dput() with SpatialPoints, but not with Lines/SpatialLines, Polygons/SpatialPolygons?
Is the only workaround to export the coordinates and recreate the SpatialPolygons in the example?
Test Data:
library(sp)
df = data.frame(lon=runif(10, 15,19), lat=runif(10,40,45))
dput(SpatialPoints(coordinates(df)))
dput(Lines(list(Line(coordinates(df))), 1))
dput(SpatialLines(list(Lines(list(Line(coordinates(df))), 1))))
dput(Polygons(list(Polygon(df)), 1))
dput(SpatialPolygons(list(Polygons(list(Polygon(df)), 1))))
dput(SpatialPolygons(list(Polygons(list(Polygon(df)), 1))), control="all")
The dput2() method from this answer works for Lines/SpatialLines but not for Polygons/SpatialPolygons, where this error occurs:
Error in validityMethod(object) : object 'Polygons_validate_c' not found
So how to make a SpatialPolygons-object reproducible?
A workaround would be to convert the objects to simple features and then use dput(). They can obviously be deparsed.
Example using LINESTRING and POLYGON:
library(sp)
library(sf)
df = data.frame(lon=runif(10, 15,19), lat=runif(10,40,45))
SLi = SpatialLines(list(Lines(list(Line(coordinates(df))), 1)))
SPo = SpatialPolygons(list(Polygons(list(Polygon(df)), 1)))
dput(st_as_sf(SLi))
dput(st_as_sf(SPo))
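The other workaround raised in the question, exporting the coordinates and rebuilding the object from them, can be sketched as follows. This is a minimal sketch, assuming a single closed ring; the coordinate values and the ID "1" are made up for illustration and are not taken from the question's random data.

```r
library(sp)

# Hypothetical coordinates, listed clockwise; first and last rows are equal,
# so the ring is closed
coords <- cbind(lon = c(15, 15, 19, 19, 15),
                lat = c(40, 45, 45, 40, 40))

# A plain numeric matrix deparses cleanly; paste this output into the example
dput(coords)

# Rebuild the SpatialPolygons object from the matrix
SPo <- SpatialPolygons(list(Polygons(list(Polygon(coords)), ID = "1")))
```

Readers then only need the deparsed matrix and the last line to recreate the object.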

After running the code I mentioned in the comments, I decided to offer a tentative solution so you can check a) whether you get the same results on your system, and b) whether it addresses the issues you were having.
newSpPa <- dput(SpatialPolygons(list(Polygons(list(Polygon(df)), 1))), control="all")
oldSpPa <- SpatialPolygons(list(Polygons(list(Polygon(df)), 1)))
identical(oldSpPa, newSpPa)
#[1] TRUE
It wasn't clear from my reading of your question whether the return value of a call to new("SpatialPolygons", ...) was deemed unsatisfactory. The assignment step I did differs from your code, and it's possible that my assignment only succeeds when previously defined objects are already in the workspace at the time of creation. If that's the case, then I think the typical suggestion would be to do this in the setting of package creation.

Related

Check if lat/lon in polygon (R)

I have a lat/lon combination and want to check whether the point is inside a polygon (sp::Polygon class)
Consider this example:
UKJ32 <- sp::Polygon(cbind(c(-1.477037449999955, -1.366895449999959, -1.365159449999965, -1.477037449999955),
c(50.923958250000027, 50.94686525000003, 50.880069750000018, 50.923958250000027))) %>%
list() %>%
sp::Polygons(ID="UKJ32 - Southampton")
I would now like to test whether the points in df are in this polygon (and if so, return the Polygon ID).
tibble(lon = c(-1.4, 10), lat = c(50.9, 10))
Can someone tell me how I get to the result
tibble(lon = c(-1.4, 10), lat = c(50.9, 10), polyg_ID = 'UKJ32')
If you wish to stick to sp, there is a point.in.polygon() function in the sp package:
UKJ32 <- sp::Polygon(cbind(c(-1.477037449999955, -1.366895449999959, -1.365159449999965, -1.477037449999955),
c(50.923958250000027, 50.94686525000003, 50.880069750000018, 50.923958250000027))) |>
list() |>
sp::Polygons(ID="UKJ32 - Southampton")
a <- tibble::tibble(lon = c(-1.4, 10), lat = c(50.9, 10))
sp::point.in.polygon(a$lon, a$lat, UKJ32@Polygons[[1]]@coords[,1], UKJ32@Polygons[[1]]@coords[,2])
#> [1] 1 0
Created on 2022-10-16 with reprex v2.0.2
The {sp} package is by now somewhat dated - after having lived a long & fruitful life - and most of current action happens in context of its successor, the {sf} package.
Assigning some kind of a polygon feature - either an id or a metric - to a points dataset is a frequent use case. It is at present often done via a sf::st_join() call. For an example in action consider this earlier answer https://stackoverflow.com/a/64704624/7756889
I suggest that you try to move your workflow to the more current {sf} package; you will find it easier to keep up with recent development.
And even if this were not possible for whatever reason - use sp::Polygons() with utmost caution. It carries no information about the coordinate reference system - which is a fancy way of saying it has no way of interpreting the coordinate numbers. Are they decimal degrees, or meters? Could be feet or fathoms for all that I know.
Strictly speaking you should not be allowed to proceed with a point-in-polygon calculation without this information.
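For illustration, the sf::st_join() route could look roughly like this. It is a sketch, not a definitive recipe: the CRS (EPSG:4326, i.e. decimal degrees on WGS84) is an assumption, since the original objects carry none, and the short ID "UKJ32" is used as the attribute value.

```r
library(sf)

# Polygon from the question; the CRS is an assumption (WGS84 decimal degrees)
ring <- cbind(
  c(-1.477037449999955, -1.366895449999959, -1.365159449999965, -1.477037449999955),
  c(50.923958250000027, 50.94686525000003, 50.880069750000018, 50.923958250000027)
)
poly <- st_sf(polyg_ID = "UKJ32",
              geometry = st_sfc(st_polygon(list(ring)), crs = 4326))

# Points from the question, keeping the lon/lat columns alongside the geometry
pts <- st_as_sf(data.frame(lon = c(-1.4, 10), lat = c(50.9, 10)),
                coords = c("lon", "lat"), crs = 4326, remove = FALSE)

# Left join: points that fall in no polygon get NA in polyg_ID
res <- st_join(pts, poly, join = st_within)
res$polyg_ID
```

Because the join is a left join by default, every point is kept and only the matching ones receive a polygon ID.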

post-processing in mice, replace one variable with another

I'm trying to perform multiple imputation on a dataset in R where I have two variables, one of which needs to be the same or greater than the other one. I have set up the method and the predictive matrix, but I am having trouble understanding how to configure the post-processing. The manual (or main paper - van Buuren and Groothuis-Oudshoorn, 2011) states (section 3.5): "The mice() function has an argument post that takes a vector of strings of R commands. These commands are parsed and evaluated just after the univariate imputation function returns, and thus provide a way to post-process the imputed values." There are a couple of examples, of which the second one seems most useful:
R> post["gen"] <- "imp[[j]][p$data$age[!r[,j]]<5,i] <- levels(boys$gen)[1]"
this suggests to me that I could do:
R> ini <- mice(cbind(boys), max = 0, print = FALSE)
R> post["A"] <- "imp[[j]][p$data$B[!r[,j]]>p$data$A[!r[,j]],i] <- levels(boys$A)[boys$B]"
However, this doesn't work (when I plot A v B, I get random scatter rather than the points being confined to one half of the graph where A >= B).
I have also tried using the ifdo() function, as suggested in another sx post:
post["A"] <- "ifdo(A < B), B"
However, it seems the ifdo() function is not yet implemented. I tried running the code suggested for inspiration, but I'm afraid my R programming skills are not that brilliant.
So, in summary, has anyone any advice about how to implement post-processing in mice such that value A >= value B in the final imputed datasets?
Ok, so I've found an answer to my own question - but maybe this isn't the best way to do it.
In FIMD, there is a suggestion to do this kind of thing outside the imputation process, which thus gives:
R> long <- mice::complete(imp, "long", include = TRUE)
R> long$A <- with(long, ifelse(A < B, B, A))
This seems to work, so I'm happy.
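The constraint A >= B can also be enforced with pmax(), which raises A to B wherever A falls below it. A minimal sketch on a toy data frame (the values are made up for illustration and are not output from mice):

```r
# Toy stand-in for the stacked imputations; the values are made up
long <- data.frame(A = c(5, 2, 7, 1), B = c(3, 4, 7, 6))

# Raise A to B wherever A < B, so that A >= B holds in every row
long$A <- pmax(long$A, long$B)
long
```

If a mids object is needed afterwards, mice::as.mids(long) should convert the modified long data back, given that it was exported with include = TRUE.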

Markov Model diagram directly from data (markovchain or heemod package?)

I want to read a bunch of factor data and create a transition matrix from it that I can visualise nicely. I found a very sweet package, called 'heemod' which, together with 'diagram' does a decent job.
For my first quick-and-dirty approach, I ran a piece of Python code to get to the matrix, then used this R snippet to draw the graph. Note that the transition probabilities come from that undisclosed and less important Python code, but you can also just assume that I calculated them on paper.
library('heemod')
library('diagram')
mat_dim <- define_transition(
state_names = c('State_A', 'State_B', 'State_C'),
.18, .73, .09,
.22, .0, .78,
.58, .08, .33);
plot(mat_dim)
However, I would like to integrate all in R and generate the transition matrix and the graph within R and from the sequence data directly.
This is what I have so far:
library(markovchain)
library('heemod')
library('diagram')
# the data --- this is normally read from a file
data = c(1,2,1,1,1,2,3,1,3,1,2,3,1,2,1,2,3,3,3,1,2,3,2,3,1,2,3,3,1,2,3,3,1)
fdata = factor(data)
rdata = factor(data,labels=c("State_A","State_B","State_C"))
# create transition matrix
dimMatrix = createSequenceMatrix(rdata, toRowProbs = TRUE)
dimMatrix
QUESTION: how can I transfer dimMatrix so that define_transition can process it?
mat_dim <- define_transition( ??? );
plot(mat_dim)
Any ideas? Are there better/easier solutions?
The input to define_transition seems to be quite awkward. Perhaps this is due to my inexperience with the heemod package but it seems the only way to input transitions is element by element.
Here is a workaround
library(heemod)
library(diagram)
First convert the transition matrix to a list. I rounded the values, which is optional. The list elements correspond to the ... arguments of define_transition:
lis <- as.list(round(dimMatrix, 3))
Now add to the list all other named arguments you wish:
lis$state_names = colnames(dimMatrix)
And now pass these arguments to define_transition using do.call:
plot(do.call(define_transition, lis))
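Update, in response to the question in the comments: as.list() flattens a matrix column by column, whereas define_transition reads its arguments row by row, so transpose the matrix first: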
lis <- as.list(t(round(dimMatrix, 3)))
lis$state_names = colnames(dimMatrix)
plot(do.call(define_transition, lis))
The reasoning behind do.call
The most obvious way (which does not work here) is to do:
define_transition(dimMatrix, state_names = colnames(dimMatrix))
However, this throws an error, since define_transition expects each transition probability to be supplied as a separate argument and not as a matrix or a list. In order to avoid typing:
define_transition(0.182, 0.222,..., state_names = colnames(dimMatrix))
one can put all the arguments in a list and then call do.call on that list as I have done.
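The mechanics can be checked without heemod. In the sketch below, collect() is a hypothetical stand-in for any function that takes its values one by one through `...`; it is not part of heemod, and the 2 x 2 matrix is made up.

```r
# Hypothetical stand-in for a function that takes its values one by one via `...`
collect <- function(..., state_names) {
  vals <- c(...)
  # Fill row by row, mirroring how the cells are listed in the call
  matrix(vals, nrow = length(state_names), byrow = TRUE,
         dimnames = list(state_names, state_names))
}

m <- matrix(c(0.2, 0.8,
              0.6, 0.4), nrow = 2, byrow = TRUE,
            dimnames = list(c("X", "Y"), c("X", "Y")))

# as.list() walks a matrix column by column, so transpose first to get row order
arg_list <- as.list(t(m))
arg_list$state_names <- colnames(m)

# Unnamed list elements fill `...`; the named one matches state_names
rebuilt <- do.call(collect, arg_list)
```

Here rebuilt has the same cells as m, which shows that the transposed list hands the values over in row order.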

Find nearest features using sf in R

I want to find the polygons in one simple features data frame that are nearest to a set of points in another simple features data frame, using the sf package in R. I've been using 'st_is_within_distance' in 'st_join' statements, but this returns everything within a given distance, not simply the closest features.
Previously I used 'gDistance' from the 'rgeos' package with 'sp' features like this:
m = gDistance(a, b, byid = TRUE)
row = apply(m, 2, function(x) which(x == min(x)))
labels = unlist(b@data[row, ]$NAME)
a$NAME <- labels
I want to translate this approach of finding the nearest features for a set of points from rgeos and sp to sf. Any advice or suggestions greatly appreciated.
It looks like the solution to my question was already posted -- https://gis.stackexchange.com/questions/243994/how-to-calculate-distance-from-point-to-linestring-in-r-using-sf-library-and-g -- this approach gets just what I need given an sf point feature 'a' and sf polygon feature 'b':
closest <- list()
for(i in seq_len(nrow(a))){
closest[[i]] <- b[which.min(
st_distance(b, a[i,])),]
}
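As an alternative to the loop, sf also provides st_nearest_feature(), which returns, for each feature of its first argument, the index of the nearest feature in the second. A minimal sketch on hypothetical toy data (two points and two unit squares in planar coordinates, with made-up names):

```r
library(sf)

# Hypothetical toy data: two points (a) and two unit squares (b), no CRS set
a <- st_sf(id = 1:2,
           geometry = st_sfc(st_point(c(0, 0)), st_point(c(10, 0))))
square <- function(x0) {
  st_polygon(list(cbind(c(x0, x0 + 1, x0 + 1, x0, x0), c(2, 2, 3, 3, 2))))
}
b <- st_sf(NAME = c("west", "east"),
           geometry = st_sfc(square(0), square(9)))

# Index of the nearest feature in b for every feature in a
idx <- st_nearest_feature(a, b)

# Carry the attribute of the nearest polygon over to the points
a$NAME <- b$NAME[idx]
```

This mirrors the last two lines of the rgeos version: look up the row of the nearest feature, then copy its NAME.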

get an empty SpatialPolygonsDataFrame via subset?

I'm looking to subset a SpatialPolygonsDataFrame by an attribute, but I want to allow it to return an empty SpatialPolygonsDataFrame.
If we are to treat objects of type SpatialPolygonsDataFrame like data.frames, as discussed here, we should be able to get and work with empty objects.
I'm interested because I want to incorporate this into a function that may try to subset by an attribute that will essentially pick no features.
owd <- getwd()
setwd(system.file("shapes", package = "maptools"))
library(maptools)
nc90 <- readShapeSpatial("co37_d90")
setwd(owd)
nc90@data[nc90@data$AREA>0.15,] # returns data.frame
bigctys <- nc90[nc90@data$AREA>0.15,] # SpatialPolygonsDataFrame
nc90@data[nc90@data$AREA>0.25,] # returns empty data.frame
bigestctys <- nc90[nc90@data$AREA>0.25,] # ERROR
Is there a way to make this work? If not, is there a way to initialize an empty SpatialPolygonsDataFrame object? The future actions I want to perform on such an object involve overplotting on an existing map, so I'd like the image to be produced anyway, even if blank.
Right now you can't. This is somewhat inconsistent, as for SpatialPointsDataFrame objects you can:
library(sp)
demo(meuse, ask = FALSE)
x = meuse[F,]
although with warnings; also, validObject(x) returns FALSE, so such objects are intended not to be allowed!
It's a bit abstract what such objects should represent, but I can see the analogy with data.frame objects with zero rows: it is useful that they can exist.
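Until empty objects are supported, one way to use this inside a function is to guard the subset and treat "nothing selected" as NULL. The sketch below is only an illustration: subset_or_null() is a hypothetical helper, not part of sp, and the two-square object with its AREA values is made up.

```r
library(sp)

# Hypothetical helper: return NULL instead of failing when nothing is selected
subset_or_null <- function(x, keep) {
  keep <- !is.na(keep) & keep
  if (!any(keep)) return(NULL)
  x[keep, ]
}

# Toy SpatialPolygonsDataFrame: two unit squares (clockwise rings), made-up AREA
sq <- function(x0, id) {
  ring <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(ring)), ID = id)
}
polys <- SpatialPolygons(list(sq(0, "a"), sq(2, "b")))
spdf <- SpatialPolygonsDataFrame(
  polys, data.frame(AREA = c(0.1, 0.2), row.names = c("a", "b"))
)

big  <- subset_or_null(spdf, spdf$AREA > 0.15)  # one polygon selected
none <- subset_or_null(spdf, spdf$AREA > 0.25)  # nothing selected, so NULL

plot(spdf)                                       # the base map is always drawn
if (!is.null(none)) plot(none, add = TRUE, col = "red")  # overlay only if non-empty
```

The base map is produced either way, and the overlay is simply skipped when the selection is empty.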

Resources