Normalizing an R stars object by grid area? - r

first post :)
I've been transitioning my R code from sp to sf/stars, and one thing I'm still trying to grasp is how to account for the grid cell areas.
Here's some example code to explain what I mean.
library(stars)
library(tidyverse)
# Reading in an example tif file, from stars() vignette
tif = system.file("tif/L7_ETMs.tif", package = "stars")
x = read_stars(tif)
x
# Get areas for each grid of the x object. Returns stars object with "area" in units of [m^2]
x_area <- st_area(x)
x_area
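For what it's worth, inspecting the dimensions shows where the direct division is likely to go wrong: x carries a band dimension that x_area does not (this is just me poking at the objects, not something from the vignette).
dim(x)              # x 349, y 352, band 6 for this example file
dim(x_area)         # x 349, y 352 -- no band dimension
dim(x$L7_ETMs.tif)  # 349 352 6
dim(x_area$area)    # 349 352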
I tried loosely adapting code from this vignette (https://github.com/r-spatial/stars/blob/master/vignettes/stars5.Rmd) to divide each value in x by its grid cell area, but it's not working as expected (perhaps because my objects are stars and not sf?)
x$test1 = x$L7_ETMs.tif / x_area # Some computationally intensive calculation seems to happen, but doesn't produce the results I expect?
x$test1 = x$L7_ETMs.tif / x_area$area # Throws error, "non-conformable arrays"
What does seem to work is the following.
x %>%
mutate(test1 = L7_ETMs.tif / units::set_units(as.numeric(x_area$area), m^2))
Here are the concerns I have with this code.
I worry that as I turn x_area$area (a matrix of cell areas laid out over the grid) into a numeric vector, I may mess up the matching between each grid cell and its area. I did some rough testing to check that the areas line up the way I expect, but I can't escape the worry that this could lead to errors that are difficult to catch.
It also doesn't seem clean that I start with x_area in the correct units, only to drop the units and then set them again during the computation.
Can someone suggest a cleaner implementation for what I'm trying to do, i.e. multiplying or dividing a grid by its cell areas while maintaining units throughout? Or convince me that the code I have is fine?
Thanks!
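EDIT: For reference, here is a more explicit variant of the same idea that keeps the array layout intact instead of flattening to a numeric vector. This is only a sketch: the rep()/array() recycling over the band dimension is my own workaround, and array() drops the units attribute, so I reset it afterwards.
library(stars)
library(units)
tif <- system.file("tif/L7_ETMs.tif", package = "stars")
x <- read_stars(tif)
x_area <- st_area(x)            # dims (x, y), cell areas in m^2
dims <- dim(x)                  # named: x, y, band
# Recycle the 2-D area matrix along the band dimension so both arrays are
# conformable; array() strips the units, so they are restored afterwards
area_rep <- array(rep(x_area$area, dims["band"]), dim = dims)
area_rep <- set_units(area_rep, m^2)
# Dividing the bare numeric band values by a units array gives 1/m^2, and the
# (x, y, band) layout is never flattened, so the cell-to-area matching is
# preserved by construction
x$per_area <- x$L7_ETMs.tif / area_rep
x["per_area"]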

I do not know how to improve the stars code, but you can compare the results you get with this
tif <- system.file("tif/L7_ETMs.tif", package = "stars")
library(terra)
r <- rast(tif)
a <- cellSize(r, sum=FALSE)
x <- r / a
With planar data you could do the following, when it is safe to assume there is no distortion (generally that is not the case, but it can be):
y <- r / prod(res(r))
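To put numbers on that caveat, a quick comparison (just a sketch, reusing the objects from above): cellSize() estimates the true area of each cell, which generally varies across the grid, while prod(res(r)) assumes a single constant planar area.
summary(values(a))   # true cell areas, which vary across the grid
prod(res(r))         # the one constant area the planar shortcut assumes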

Related

R: Is it possible to plot a grid from x, y spatial coordinates?

I've been working with a spatial model which contains 21,000 grid cells of unequal size (i by j, where i is [1:175] and j is [1:120]). I have the latitude and longitude values in two separate arrays (lat_array, lon_array) of i and j dimensions.
Plotting the coordinates:
> plot(lon_array, lat_array, main='Grid Coordinates')
Result: a scatter plot showing the coordinates as individual points.
My question: Is it possible to plot these spatial coordinates as a grid rather than as points? Does anyone know of a package or function that might be able to do this? I haven't been able to find anything online along these lines.
Thanks.
First of all, it is always a bit dangerous to plot inherently spherical coordinates (lat, long) directly in the plane. Usually you should project them in some way; I will leave it to you to explore the sp package and the function spTransform, or something like that.
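For example, the projection step could look roughly like this (a sketch only; the target CRS is just an illustrative choice, and the NAs have to be dropped first):
library(sp)
library(rgdal)   # historically supplied the spTransform methods for sp objects
ok  <- !is.na(lon_array) & !is.na(lat_array)
pts <- SpatialPoints(cbind(lon_array[ok], lat_array[ok]),
                     proj4string = CRS("+proj=longlat +datum=WGS84"))
pts_utm <- spTransform(pts, CRS("+proj=utm +zone=33 +datum=WGS84"))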
I guess in principle you could simply use the deldir package to calculate the Dirichlet tessellation of your points, which would give you a nice grid. However, you need a bounding region for this to avoid large cells radiating out from the border of your region. I personally use spatstat to call deldir, so I can't give you the direct commands in deldir, but in spatstat I would do something like:
library(spatstat)
plot(lon_array, lat_array, main='Grid Coordinates')
W <- clickpoly(add = TRUE) # Now click the region that contains your grid
i_na <- is.na(lon_array) | is.na(lat_array) # Index of NAs
X <- ppp(lon_array[!i_na], lat_array[!i_na], window = W)
grid <- dirichlet(X)
plot(grid)
I have not tested this yet and I will update this answer once I get the chance to test it with some artificial data. A major problem is the size of your dataset, which may make the Dirichlet tessellation take a long time to calculate. I have only tried calling dirichlet on datasets of up to about 3000 points...
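In the meantime, here is a self-contained sketch with artificial data (so no interactive clickpoly step), just to show the shape of the approach; the jittered 20 x 15 grid stands in for your lat/long arrays:
library(spatstat)
set.seed(1)
# fake, slightly irregular grid of 20 x 15 cell centres
lon <- rep(seq(0, 10, length.out = 20), times = 15) + runif(300, -0.1, 0.1)
lat <- rep(seq(40, 45, length.out = 15), each = 20) + runif(300, -0.05, 0.05)
W <- owin(xrange = range(lon), yrange = range(lat))   # bounding region
X <- ppp(lon, lat, window = W)
grid <- dirichlet(X)                                  # Dirichlet/Voronoi cells
plot(grid, main = "Dirichlet tessellation of the artificial grid")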

R: How do I loop through spatial points with a specific buffer?

My problem is quite difficult to describe, so I hope I can make my question as clear as possible.
I use the rLiDAR package to load a .las file into R and afterwards convert it into a SpatialPointsDataFrame using the sp package.
So my SpatialPointsDataFrame is quite dense.
Now I want to define a buffer of 0.5 meters and iterate with it (the buffer) through the points, always choosing the point with the highest Z value within the buffer as the next point to jump to. This should be repeated until there isn't any point within the buffer with a higher Z value than the current one. All values (or perhaps just the X and Y values) of this "found" point should then be written into a list/dataframe, and the process should be repeated until all such highest points are found.
That's the code I've got so far:
library(rLiDAR)
library(sp)
rLAS <- readLAS("Test.las", short = FALSE)
PointCloud <- data.frame(rLAS)
coordinates(PointCloud) <- c("X", "Y")
Well, I googled extensively but I could not find any clues on how to proceed further...
I don't even know which packages could be of help; I guess perhaps spatstat, as my question probably falls under spatial point pattern analysis.
Does anyone have some ideas how to achieve something like that in R? Or is something like that not possible? (Do I perhaps have to switch to Python to make something like this work?)
Help would gladly be appreciated.
If you want to get the set of points which are the local maxima within a 0.5m radius circle around each point, this should work. The gist of it is:
Convert the LAS points to a SpatialPointsDataFrame
Create a buffered polygon set with overlapping polygons
Loop through all buffered polygons and find the desired element within the buffer -- in your case, it's the one with the maximum height.
Code below:
library(rLiDAR)
library(sp)
library(rgeos)
rLAS <- readLAS("Test.las",short=FALSE)
PointCloud <- data.frame(rLAS)
coordinates(PointCloud) <- c("X", "Y")
Finish creating the SpatialPointsDataFrame from the LAS source. I'm assuming the field with the point height is PointCloud$value
pointCloudSpdf <- PointCloud   # coordinates()<- above already made this a SpatialPointsDataFrame
Use rgeos library for intersection. It's important to have byid=TRUE or the polygons will get merged where they intersect
bufferedPoints <- gBuffer(pointCloudSpdf, width = 0.5, byid = TRUE)
# Save our local maxima state (this will be updated)
localMaxes <- rep(FALSE, nrow(pointCloudSpdf))
for (i in seq_along(bufferedPoints@polygons)) {
  bufPolygon <- SpatialPolygons(list(bufferedPoints@polygons[[i]]))
  # over() gives, for each point, the index of the polygon it falls in (or NA)
  ptsInBuffer <- which(!is.na(over(pointCloudSpdf, bufPolygon)))
  # I'm assuming `value` is the field name containing the point height;
  # map the winner back to its index in the full point set
  localMax <- ptsInBuffer[which.max(pointCloudSpdf@data$value[ptsInBuffer])]
  localMaxes[localMax] <- TRUE
}
localMaxPointCloudDf <- pointCloudSpdf@data[localMaxes, ]
Now localMaxPointCloudDf should contain the data from the original points that are a local maximum. Just a warning -- this isn't going to be super fast if you have a lot of points. If that ends up being a concern, you can be smarter by pre-filtering your points using a coarse grid and extract() from the raster package.
That would look something like this:
Make the cell size small enough so that each 0.5m buffer will intersect at least 4 raster cells -- err on smaller since we are comparing circles to squares.
library(raster)
numRows <- ceiling((extent(pointCloudSpdf)@ymax - extent(pointCloudSpdf)@ymin) / 0.2)
numCols <- ceiling((extent(pointCloudSpdf)@xmax - extent(pointCloudSpdf)@xmin) / 0.2)
emptyRaster <- raster(extent(pointCloudSpdf), nrows = numRows, ncols = numCols)
rasterize will create a grid with the maximum value of the given field within a cell. Because of the square/circle mismatch this is only a starting point to filter out obvious non-maxima. After this we will have a raster in which all the local maxima are represented by cells. However, we won't know which cells are maxima in the 0.5m radius and we don't know which point in the original feature layer they came from.
r <- rasterize(pointCloudSpdf, emptyRaster, "value", fun = max)
extract will give us, for each point, the raster value (i.e., the highest value) of the cell that the point falls in. Recall from above that all the local maxima will be in this set, although some values will not be 0.5m-radius local maxima.
rasterMaxes <- extract(r,pointCloudSpdf)
To match up the original points with the raster maxes, just subtract the raster value at each point from that point's value. If the value is 0, then the values are the same and we have a point with a potential maximum. Note that at this point we are only merging the points back to the raster -- we will have to throw some of these out because they are "under" a 0.5m radius with a higher local max even though they are the max in their 0.2m x 0.2m cell.
potentialMaxima <- which(pointCloudSpdf@data$value - rasterMaxes == 0)
Next, just subset the original SpatialPointsDataFrame; we'll do the more exhaustive and accurate iteration over this subset of points, since we should have thrown out a bunch of points which could not have been maxima.
potentialMaximaCoords <- coordinates(pointCloudSpdf)[potentialMaxima, ]
# using the data.frame() constructor because my example has only one column
potentialMaximaDf <- data.frame(pointCloudSpdf@data[potentialMaxima, ])
potentialMaximaSpdf <- SpatialPointsDataFrame(potentialMaximaCoords, potentialMaximaDf)
The rest of the algorithm is the same but we are buffering the smaller dataset and iterating over it:
bufferedPoints <- gBuffer(potentialMaximaSpdf, width = 0.5, byid = TRUE)
# Save our local maxima state (this will be updated)
localMaxes <- rep(FALSE, nrow(pointCloudSpdf))
for (i in seq_along(bufferedPoints@polygons)) {
  bufPolygon <- SpatialPolygons(list(bufferedPoints@polygons[[i]]))
  ptsInBuffer <- which(!is.na(over(pointCloudSpdf, bufPolygon)))
  # map the highest point within the buffer back to the full point set
  localMax <- ptsInBuffer[which.max(pointCloudSpdf@data$value[ptsInBuffer])]
  localMaxes[localMax] <- TRUE
}
localMaxPointCloudDf <- pointCloudSpdf@data[localMaxes, ]
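For what it's worth, the same "local maximum within a 0.5 m radius" idea can be written much more compactly with sf, which has largely replaced sp/rgeos. This is only a sketch, not tested on LiDAR data; it assumes a plain data frame pts with columns X, Y and a height column value, like PointCloud above before the coordinates() call:
library(sf)
p  <- st_as_sf(pts, coords = c("X", "Y"))
# for every point, the indices of all points within 0.5 m (including itself)
nb <- st_is_within_distance(p, p, dist = 0.5)
# a point is a local maximum if no neighbour is strictly higher
isLocalMax <- vapply(seq_along(nb),
                     function(i) p$value[i] >= max(p$value[nb[[i]]]),
                     logical(1))
localMaxPoints <- p[isLocalMax, ]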

Get summary vectors of raster cell centers in R

I want to extract summary vectors that contain the coordinates for the centers of the different cells in a raster. The following code works, but I believe it involves an n-squared comparison operation. Is there a more efficient method? I'm not seeing anything obvious in {raster}'s guidance.
require(raster)
r = raster(volcano)
pts = rasterToPoints(r)
x_centroids = unique(pts[,1])
y_centroids = unique(pts[,2])
To get the centers of the raster cells, you should use the functions xFromCol, yFromRow and friends (see also the help pages)
In this case, you get exactly the same result as follows:
require(raster)
r <- raster(volcano)
x_centers <- xFromCol(r)
y_centers <- yFromRow(r)
Note that these functions actually don't do much more than look at the extent and the resolution of the raster. From those values, they calculate the sequences of centers as follows:
xmin(r) + (seq_len(ncol(r)) - 0.5) * xres(r)
ymax(r) - (seq_len(nrow(r)) - 0.5) * yres(r)
But you had better use the functions mentioned above, as they do a few more safety checks.
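A quick check (sketch) that the helpers and the manual formulas agree for the volcano example:
library(raster)
r <- raster(volcano)
all.equal(xFromCol(r), xmin(r) + (seq_len(ncol(r)) - 0.5) * xres(r))
all.equal(yFromRow(r), ymax(r) - (seq_len(nrow(r)) - 0.5) * yres(r))
# both comparisons should return TRUE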

Use Rcartogram on a SpatialPolygonsDataFrame object

I'm trying to do the same thing asked in this question, Cartogram + choropleth map in R, but starting from a SpatialPolygonsDataFrame and hoping to end up with the same type of object.
I could save the object as a shapefile, use scapetoad, reopen it and convert back, but I'd rather have it all within R so that the procedure is fully reproducible, and so that I can code dozens of variations automatically.
I've forked the Rcartogram code on github and added my efforts so far here.
Essentially what this demo does is create a SpatialGrid over the map, look up the population density at each point of the grid and convert this to a density matrix in the format required for cartogram() to work on. So far so good.
But, how to interpolate the original map points based on the output of cartogram()?
There are two problems here. The first is to get the map and grid into the same units to allow interpolation. The second is to access every point of every polygon, interpolate it, and keep everything in the right order.
The grid is in grid units and the map is in map units (long/lat in the case of the example). Either the grid must be transformed into long/lat, or the map into grid units. My thought is to make a fake CRS and use it together with the spTransform() function in the rgdal package, since that handles every point in the object with minimal fuss.
Accessing every point is difficult because the coordinates sit several layers down in the SpPDF object: object > polygons > Polygons > coords, I think. Any ideas how to access these while keeping the structure of the overall map intact?
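For reference, this is the nesting I mean, and a sketch of how I would walk it (warp() is just an identity placeholder standing in for whatever interpolation the cartogram() output implies; the label points and areas stored alongside each ring would presumably need updating afterwards too):
# spdf is a SpatialPolygonsDataFrame: each element of spdf@polygons is a
# Polygons object whose @Polygons slot holds the individual rings, and each
# ring stores its vertices in @coords
warp <- function(xy) xy   # placeholder: replace with the real interpolation
for (i in seq_along(spdf@polygons)) {
  for (j in seq_along(spdf@polygons[[i]]@Polygons)) {
    crds <- spdf@polygons[[i]]@Polygons[[j]]@coords
    spdf@polygons[[i]]@Polygons[[j]]@coords <- warp(crds)
  }
}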
This problem can be solved with the getcartr package, available on Chris Brunsdon's GitHub, as beautifully explicated in this blog post.
The quick.carto function does exactly what you want -- takes a SpatialPolygonsDataFrame as input and has a SpatialPolygonsDataFrame as output.
Reproducing the essence of the example in the blog post here in case the link goes dead, with my own style mixed in & typos fixed:
(Shapefile; World Bank population data)
library(getcartr)
library(maptools)
library(data.table)
world <- readShapePoly("TM_WORLD_BORDERS-0.3.shp")
#I use data.table, see blog post if you want a base approach;
# data.table wonks may be struck by the following step as seeming odd;
# see here: http://stackoverflow.com/questions/32380338
# and here: https://github.com/Rdatatable/data.table/issues/1310
# for some background on what's going on.
world@data <- setDT(world@data)
world.pop <- fread("sp.pop.totl_Indicator_en_csv_v2.csv",
                   select = c("Country Code", "2013"),
                   col.names = c("ISO3", "pop"))
world@data[world.pop, Population := as.numeric(i.pop), on = "ISO3"]
#calling quick.carto has internal calls to the
# necessary functions from Rcartogram
world.carto <- quick.carto(world, world$Population, blur = 0)
#plotting with a color scale
x <- world@data[!is.na(Population), log10(Population)]
ramp <- colorRampPalette(c("navy", "deepskyblue"))(21L)
xseq <- seq(from = min(x), to = max(x), length.out = 21L)
#annoying to deal with NAs...
cols <- ramp[sapply(x, function(y)
  if (length(z <- which.min(abs(xseq - y)))) z else NA)]
plot(world.carto, col = cols,
     main = paste0("Cartogram of the World's",
                   " Population by Country (2013)"))

R/ImageJ: Measuring shortest distance between points and curves

I have some experience with R as a statistics platform, but am inexperienced in image-based maths. I have a series of photographs (tiff format, px/µm is known) with holes and irregular curves. I'd like to measure the shortest distance between a hole and the closest curve for that particular hole, and I'd like to do this for each hole in a photograph. The holes are not regular either, so maybe I'd need to tell the program what are holes and what are curves (ImageJ has point and segmented-line functions).
Any ideas how to do this? Which package should I use in R? Would you recommend another program for this kind of task?
EDIT: Doing this is now possible using the sclero package. The package is currently available on GitHub and the procedure is described in detail in the tutorial. Just to illustrate, here is an example from the tutorial:
library(devtools)
install_github("MikkoVihtakari/sclero", dependencies = TRUE)
library(sclero)
path <- file.path(system.file("extdata", package = "sclero"), "shellspots.zip")
dat <- read.ijdata(path, scale = 0.7812, unit = "um")
shell <- convert.ijdata(dat)
aligned <- spot.dist(shell)
plot(aligned)
It is also possible to add sample spot sizes using the functions provided by the sclero package. Please see Section 2.5 in the tutorial.
There's a tool for edge detection written for ImageJ that might help you first find the holes and the lines, and clarify them. You can find it at
http://imagejdocu.tudor.lu/doku.php?id=plugin:filter:edge_detection:start
Playing around with the settings for the thresholding and the hysteresis can help in getting the lines and holes found. It's difficult to tell whether this has much chance of working without seeing your actual photographs, but a colleague of mine had good results using this tool on FRAP images. I programmed an ImageJ tool that can calculate recoveries in FRAP analysis based on those images. You might get some ideas for yourself by looking at the code (see: http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:frap_normalization:start )
The only way I know of to work with images in R is EBImage, which is part of the Bioconductor system. The Rimage package is orphaned, so it is no longer maintained.
To find the shortest distance: once you have the coordinates of the lines and holes, you can go for the shotgun approach: calculate the distances between all points of the hole and all points of the line, and then take the minimum. An illustration in R:
x <- -100:100
x2 <- seq(-70, -50, length.out = length(x)/4)
a.line <- list(x = x,
               y = 4*x + 5)
a.hole <- list(
  x = c(x2, rev(x2)),
  y = c(200 + sqrt(100 - (x2 + 60)^2),
        rev(200 - sqrt(100 - (x2 + 60)^2)))
)
plot(a.line, type = 'l')
lines(a.hole, col = 'red')
calc.distance <- function(line, hole){
  mline <- matrix(unlist(line), ncol = 2)
  mhole <- matrix(unlist(hole), ncol = 2)
  id1 <- rep(1:nrow(mline), nrow(mhole))
  id2 <- rep(1:nrow(mhole), each = nrow(mline))
  min(
    sqrt(
      (mline[id1, 1] - mhole[id2, 1])^2 +
      (mline[id1, 2] - mhole[id2, 2])^2
    )
  )
}
Then:
> calc.distance(a.line,a.hole)
[1] 95.51649
You can check this mathematically by deriving the equations of the circle and the line. The brute-force approach runs fast enough as long as you don't have millions of points describing thousands of lines and holes.
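For this example the check is straightforward (a sketch): the hole is a circle of radius 10 centred at (-60, 200) and the line is y = 4x + 5, i.e. 4x - y + 5 = 0, so the shortest distance is the centre-to-line distance minus the radius.
abs(4 * (-60) - 200 + 5) / sqrt(4^2 + (-1)^2) - 10
# roughly 95.503, consistent with the discretised estimate above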
