R raster: extent conditional on cell value - r

I would like to obtain the extent of raster layer conditional on certain cell values. Consider the following example:
raster1 is a large raster object, filled with values between 1 and 1000. However, I only want to obtain the extent for pixels with value 100. Since this subset of cells should crowd in a small region, the extent should be rather narrow. Once I know the coordinates of that box, I can crop this minor area.
My approach so far is to replace all values != 100 with NA, as suggested in related questions. Given the raster object's overall size, this step takes an enormous amount of time and spends a lot of computation on regions that I want to crop away anyway.
Does anyone know how to obtain the extent conditional on a certain pixel value without having to reclassify the entire object beforehand?

Here is an alternative way to do that
Example data:
library(raster)
r <- raster(ncol=18,nrow=18)
values(r) <- 1
r[39:45] <- 100
r[113:115] <- 100
r[200] <- 100
"Standard" way:
x <- r == 100
s <- trim(x, values=FALSE)
Alternate route by creating an extent:
xy <- rasterToPoints(r, function(x){ x ==100 })
e <- extent(xy[,1:2])
e <- alignExtent(e, r, snap='out')
v <- crop(r, e)
Either way, all cells need to be looked at, but at least you do not need to create another large raster.
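For intuition, the same idea can be sketched without any package on a plain matrix. This is only an illustration under assumptions: the matrix `m` stands in for the raster, the result is in row/column indices rather than map coordinates, and the names `hits`, `row_range`, `col_range` and `sub` are made up for this example.

```r
# Toy 18 x 18 grid mirroring the example data: all 1s, a few cells set to 100.
# Raster cell numbers run row by row, so the matrix is filled by row.
vals <- rep(1, 18 * 18)
vals[c(39:45, 113:115, 200)] <- 100
m <- matrix(vals, nrow = 18, ncol = 18, byrow = TRUE)

# Row/column indices of the matching cells
hits <- which(m == 100, arr.ind = TRUE)
row_range <- range(hits[, "row"])
col_range <- range(hits[, "col"])

# Crop the matrix to the bounding box of the matching cells
sub <- m[row_range[1]:row_range[2], col_range[1]:col_range[2], drop = FALSE]
dim(sub)
```

The bounding box is derived only from the positions of the matching cells, which is the same role `extent(xy[,1:2])` plays in the raster version.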

Related

How to create inner buffer for a specific raster class in R?

I have a large raster with 3 values (1,2,3).
I want to create a zone of 20 meters for areas with value 3, but I want the buffer to be not outside (around) the areas of value 3 but inside these areas.
I have tried to use
my_zones<- buffer(my_raster, width=20)
but this creates a buffer of 20 m around and outside of all classes.
How can I transform this? my raster includes the entire Europe, so I would also like a relatively fast way to do the zones.
Can anyone help me?
EDIT1: I have also tried to create a negative buffer like
buffer(my_raster, width=-20) but width cannot be negative.
EDIT2: I am not sure how to create a sample raster, so I tried the following with the terra package
my_raster <- rast(xmin=1, xmax=3, ymin=1, ymax=3, res=1, val=sample(1:4, 100^2, replace=T))
There is a negative buffer for polygons, but not for rasters. However, you can invert the process yourself.
Example data (you can always start with ?buffer for inspiration)
library(terra)
r <- rast(ncols=20, nrows=20, xmin=0, xmax=20, ymin=0, ymax=20, crs="local")
r[, 1:10] <- 1
A standard buffer
b <- buffer(r, width=5)
plot(b)
To get the negative buffer, first flip the cells that are NA, and then use buffer. The ! is to make the buffered area TRUE instead of FALSE
x <- ifel(is.na(r), 1, NA)
bb <- !buffer(x, width=5)
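To see why flipping works, here is a package-free sketch of the same trick on a logical matrix. It rests on assumptions: `dilate()` is a hypothetical helper written for this example, it grows a mask by `k` cells with a square window rather than a true distance buffer, and the matrix border is not treated as "outside".

```r
# Grow a logical mask by k cells in every direction (square neighbourhood).
dilate <- function(mask, k) {
  nr <- nrow(mask); nc <- ncol(mask)
  out <- mask
  for (dr in -k:k) for (dc in -k:k) {
    rows <- max(1, 1 + dr):min(nr, nr + dr)   # destination rows
    cols <- max(1, 1 + dc):min(nc, nc + dc)   # destination columns
    out[rows, cols] <- out[rows, cols] | mask[rows - dr, cols - dc]
  }
  out
}

# Class map: left half is class 3, right half is class 1
cls <- matrix(1, nrow = 20, ncol = 20)
cls[, 1:10] <- 3

target  <- cls == 3
outside <- !target                    # flip: everything that is NOT class 3
grown   <- dilate(outside, k = 2)     # buffer the outside, which reaches inwards
inner_zone <- target & grown          # class-3 cells within 2 cells of the edge
core       <- target & !grown         # class-3 cells deeper inside
```

Here `core` plays the role of `bb` above, and `inner_zone` is the strip just inside the class boundary.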

Plotting a big raster file results in a white frame

I am currently working with an ASCII matrix of 256x256 pixels. I correctly imported it into R, rasterized it and the values are what I would expect (i.e., correct x and y boundaries and min and max "z" values). However, while plotting it I get a blank raster, like every value in the matrix is zero.
I tried by creating another file as a 5x5 matrix and I get no problem with that. Am I missing something?
Files and screenshots below:
my 256x256 raster
https://gofile.io/d/JGApXI ascii matrix link
Your raster is simply almost empty, in the sense that only about 2% of its values are nonzero. However, if you export the raster and view it in GIS software (such as QGIS or ArcMap) with 100% transparency set for the 0 values, you can see the remaining values.
Here is an example:
library(raster)
x <- read.table("D:/muon sideways0000.txt")
x <- as.matrix(x)
r <- raster(x)
writeRaster(r,"D:/r.tif")
z <- apply(x, 1, function(x)sum(x!=0))
sum(z)/ncell(r)*100
To aid visualization, you can do
library(terra)
x <- read.table("muon sideways0000.txt")
x <- as.matrix(x)
r <- rast(x)
plot(r > 1)
Or some other transformation like
rr <- clamp(r, 0, 100)
plot(rr)
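The effect is easy to reproduce on synthetic data. The sketch below does not use the asker's file: it assumes a made-up 256 x 256 matrix with roughly 2% nonzero cells, and uses base R `pmin()`/`pmax()` as a stand-in for `clamp()`.

```r
set.seed(1)
# Synthetic stand-in for the ASCII matrix: mostly zeros, a few large values
x <- matrix(0, nrow = 256, ncol = 256)
idx <- sample(length(x), size = round(0.02 * length(x)))
x[idx] <- runif(length(idx), min = 1, max = 1000)

# Share of nonzero cells, in percent
pct_nonzero <- mean(x != 0) * 100

# Limit values to the range 0..100 so a few large values
# do not stretch the colour scale
clamped <- pmin(pmax(x, 0), 100)
range(clamped)
```

With so few nonzero cells, each one covers a tiny part of the plot, which is why the image looks blank until the values are thresholded or clamped.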

Reduce memory usage for mosaic on large list of rasters

I am using the mosaic function in the raster package to combine a long (11,000 files) list of rasters using the approach suggested by @RobertH here.
# assuming list_names holds the raster file paths
rlist <- lapply(list_names, raster)
rlist$fun <- mean
rlist$na.rm <- TRUE
x <- do.call(mosaic, rlist)
As you might imagine, this eventually overruns my available memory (on several different machines and computing clusters). My question is: Is there a way to reduce the memory usage of either mosaic or do.call? I've tried altering maxmemory in rasterOptions(), but that does not seem to help. Processing the rasters in smaller batches seems problematic because the rasters may be spatially disjunct (i.e., sequential raster files may be located very far from each other). Thanks in advance for any help you can give.
Rather than loading all rasters into memory at once (in the mosaic() call), can you process them one at a time? That way, you have your mosaic that updates each time you bring one more raster into memory, but then you can get rid of the new raster and just keep the continuously updating mosaic raster.
Assuming that your rlist object is a list of rasters, I'm thinking of something like:
Pseudocode
Initialize an updating_raster object as the first raster in the list
Loop through each raster in the list in turn, starting from the 2nd raster
Read the ith raster into memory called next_raster
Update the updating_raster object by overwriting it with the mosaic of itself and the next raster using a weighted mean
R code
Testing with the code in the mosaic() help file example...
First generate some rasters and use the standard mosaic method.
library(raster)
r <- raster(ncol=100, nrow=100)
r1 <- crop(r, extent(-10, 11, -10, 11))
r2 <- crop(r, extent(0, 20, 0, 20))
r3 <- crop(r, extent(9, 30, 9, 30))
r1[] <- 1:ncell(r1)
r2[] <- 1:ncell(r2)
r3[] <- 1:ncell(r3)
m1 <- mosaic(r1, r2, r3, fun=mean)
Put the rasters in a list so they are in a similar format as I think you have.
rlist <- list(r1, r2, r3)
Because of the NA handling of the weighted.mean() function, I opted to create the same effect by breaking down the summation and the division into distinct steps...
First initialize the summation raster:
updating_sum_raster <- rlist[[1]]
Then initialize the "counter" raster. This will represent the number of rasters that went into mosaicking at each pixel. It starts as a 1 in all cells that aren't NA. It should properly handle NAs such that it only will increment for a given pixel if a non-NA value was added to the updating sum.
updating_counter_raster <- updating_sum_raster
updating_counter_raster[!is.na(updating_counter_raster)] <- 1
Here's the loop that doesn't require all rasters to be in memory at once. The counter raster for the raster being added to the mosaic has a value of 1 only in the cells that aren't NA. The counter is updated by summing the current counter raster and the updating counter raster. The total sum is updated by summing the current raster values and the updating raster values.
for (i in 2:length(rlist)) {
next_sum_raster <- rlist[[i]]
next_counter_raster <- next_sum_raster
next_counter_raster[!is.na(next_counter_raster)] <- 1
updating_sum_raster <- mosaic(x = updating_sum_raster, y = next_sum_raster, fun = sum)
updating_counter_raster <- mosaic(updating_counter_raster, next_counter_raster, fun = sum)
}
m2 <- updating_sum_raster / updating_counter_raster
The values here seem to match the use of the mosaic() function
identical(values(m1), values(m2))
> TRUE
But the rasters themselves aren't identical:
identical(m1, m2)
> FALSE
Not totally sure why, but maybe this gets you closer?
Perhaps compareRaster() is a better way to check:
compareRaster(m1, m2)
> TRUE
Hooray!
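The sum-and-count bookkeeping can also be checked without any rasters. In this minimal sketch, plain numeric vectors stand in for aligned raster layers (an assumption made for the example; real mosaicking also has to align extents), and NA marks cells a layer does not cover.

```r
# Three "layers" over the same 6 cells
layers <- list(
  c(1, 2, NA, NA, NA, NA),
  c(NA, 4, 6, 8, NA, NA),
  c(NA, NA, 3, NA, 10, NA)
)

# Start from the first layer; NA contributes 0 to the sum and 0 to the count
running_sum   <- ifelse(is.na(layers[[1]]), 0, layers[[1]])
running_count <- as.numeric(!is.na(layers[[1]]))

# Only one extra layer is needed at a time
for (i in 2:length(layers)) {
  nxt <- layers[[i]]
  running_sum   <- running_sum + ifelse(is.na(nxt), 0, nxt)
  running_count <- running_count + !is.na(nxt)
}

incremental_mean <- running_sum / running_count   # 0/0 gives NaN where no layer had data
all_at_once <- rowMeans(do.call(cbind, layers), na.rm = TRUE)
```

Both routes give the same per-cell mean, but the incremental one never holds more than the running totals and one layer at once.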
Here's a plot!
plot(m1)
text(m1, digits = 2)
plot(m2)
text(m2, digits = 2)
A bit more digging in the weeds...
From the mosaic.R file:
It looks like the mosaic() function initializes a matrix called v to populate with the values from all the cells in all the rasters in the list. The number of rows in matrix v is the number of cells in the output raster (based on the full mosaicked extent and resolution), and the number of columns is the number of rasters to be mosaicked (11,000) in your case. Maybe you're running into the limits of matrix creation in R?
With a 1000 x 1000 raster (1e6 pixels), the v matrix of NAs takes up 41 GB. How big do you expect your final mosaicked raster to be?
r <- raster(ncol=1e3, nrow=1e3)
x <- 11000
v <- matrix(NA, nrow=ncell(r), ncol=x)
format(object.size(v), units = "GB")
[1] "41 Gb"
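The 41 GB figure can be reproduced with plain arithmetic, without allocating anything. The sketch assumes 4 bytes per cell, which is what a logical NA matrix uses in R; once the matrix holds doubles, each cell takes 8 bytes instead.

```r
n_cells   <- 1e3 * 1e3   # cells in a 1000 x 1000 output raster
n_rasters <- 11000       # one column per input raster

# Size in GiB (1024^3 bytes), which is the unit object.size() reports as "Gb"
size_gib_logical <- n_cells * n_rasters * 4 / 1024^3   # matrix of logical NA
size_gib_double  <- n_cells * n_rasters * 8 / 1024^3   # same matrix holding doubles

round(size_gib_logical, 1)
round(size_gib_double, 1)
```

So the estimate grows linearly with both the number of output cells and the number of input rasters.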

R: How to extract values from contiguous raster cells that are not touched by SpatialLines?

I've been trying to extract values from a single attribute raster (area, in m2) that overlaps with lines (that is, a .shp SpatialLines).
The problem is that, along these lines, my raster sometimes goes from one to several contiguous cells in all directions. Using the extract function only values from cells that are touched by the lines are extracted. Thus, when I add up the extracted values from all lines a significant amount of area (m2) is lost due to cells that were not touched by the line and therefore values were not extracted.
I tried to work it around by:
Step 1 - first aggregating my raster to a lower resolution (i.e. increasing the fact argument) and then
Step 2 - rasterizing the lines using this aggregated raster (created in step 1) as a mold to make sure the rasterized lines would get thick enough to cover the horizontal spread of cells in my original resolution raster.
Step 3 - Then I resample the rasterized lines (created in step 2) back to the original resolution I started with.
Step 4 - Finally, extracted the values from the resampled rasterized lines (created in step 3).
However, it didn't quite work as now the total area (m2) varies according to the fact="" value I use when first aggregating the raster (in step 1).
I would really appreciate it if anyone who has already dealt with a similar problem could help me out here. Here is the code I've been running to try to get it to work:
# input raster file
g.025 <- raster("ras.asc")
g.1 <- aggregate(g.025, fact=2, fun=sum)
# input SpatialLines
Spline1 <- readOGR("/Users/xxxxx.shp")
Spline2 <- readOGR("/Users/xxxxx.shp")
Spline3 <- readOGR("/Users/xxxxx.shp")
# rasterizing using low resolution raster (aggregated)
c1 <- rasterize(Spline1, g.1, field=Spline1$type, fun=sum)
c2 <- rasterize(Spline2, g.1, field=Spline2$type, fun=sum)
c3 <- rasterize(Spline3, g.1, field=Spline3$type, fun=sum)
# resampling back to higher resolution
c1 <- resample(c1, g.025)
c2 <- resample(c2, g.025)
c3 <- resample(c3, g.025)
# preparing to extract area (m2) values from raster “g.025”
c1tab <- as.data.frame(c1, xy=T)
c2tab <- as.data.frame(c2, xy=T)
c3tab <- as.data.frame(c3, xy=T)
c1tab <- c1tab[which(is.na(c1tab$layer)!=T),]
c2tab <- c2tab[which(is.na(c2tab$layer)!=T),]
c3tab <- c3tab[which(is.na(c3tab$layer)!=T),]
# extracting area (m2) values from raster “g.025”
c1tab[,4] <- extract(g.025, c1tab[,1:2])
c2tab[,4] <- extract(g.025, c2tab[,1:2])
c3tab[,4] <- extract(g.025, c3tab[,1:2])
names(c1tab)[4] <- "area_m2"
names(c2tab)[4] <- "area_m2"
names(c3tab)[4] <- "area_m2"
# sum total area (m2)
c1_area <- sum(c1tab$area_m2)
c2_area <- sum(c2tab$area_m2)
c3_area <- sum(c3tab$area_m2)
tot_area <- sum(c1_area, c2_area, c3_area)
Thanks!
Andre

R: How do I loop through spatial points with a specific buffer?

So my problem is quite difficult to describe so I hope I can make my question as clear as possible.
I use the rLiDAR package to load a .las file into R and afterwards convert it into a SpatialPointsDataFrame using the sp package.
So my SpatialPointsDataFrame is quite dense.
Now I want to define a buffer of 0.5 meters and iterate with it through the points, always choosing the point with the highest Z value within the buffer as the next point to jump to. This should be repeated until there is no point within the buffer with a higher Z value than the current one. All values (or perhaps just the X and Y values) of this "found" point should then be written into a list/dataframe, and the process should be repeated until all such highest points are found.
That's the code I have so far:
library(rLiDAR)
library(sp)
rLAS<-readLAS("Test.las",short=FALSE)
PointCloud<- data.frame(rLAS)
coordinates(PointCloud) <- c("X", "Y")
Well I googled extensively but I could not find any clues how to proceed further...
I don't even know which packages could be of help. I guess perhaps spatstat, as my question would probably fall under spatial point pattern analysis.
Does anyone have some ideas how to achieve something like that in R? Or is something like that not possible? (Do I perhaps have to switch to Python to make something like this work?)
Help would gladly be appreciated.
If you want to get the set of points which are the local maxima within a 0.5m radius circle around each point, this should work. The gist of it is:
Convert the LAS points to a SpatialPointsDataFrame
Create a buffered polygon set with overlapping polygons
Loop through all buffered polygons and find the desired element within the buffer -- in your case, it's the one with the maximum height.
Code below:
library(rLiDAR)
library(sp)
library(rgeos)
rLAS <- readLAS("Test.las",short=FALSE)
PointCloud <- data.frame(rLAS)
coordinates(PointCloud) <- c("X", "Y")
Finish creating the SpatialPointsDataFrame from the LAS source. I'm assuming the field with the point height is PointCloud$value
# coordinates<- above has already promoted PointCloud to a SpatialPointsDataFrame
pointCloudSpdf <- PointCloud
Use rgeos library for intersection. It's important to have byid=TRUE or the polygons will get merged where they intersect
bufferedPoints <- gBuffer(pointCloudSpdf,width=0.5,byid=TRUE)
# Save our local maxima state (this will be updated)
localMaxes <- rep(FALSE,nrow(PointCloud))
for (i in seq_len(nrow(bufferedPoints@data))) {
  # Wrap the i-th buffer as a one-polygon SpatialPolygons object
  bufPolygons <- bufferedPoints@polygons[[i]]
  bufSpPolygons <- SpatialPolygons(list(bufPolygons), proj4string=pointCloudSpdf@proj4string)
  # over() returns NA for points outside the polygon
  ptsInBuffer <- which(!is.na(over(pointCloudSpdf, bufSpPolygons)))
  # I'm assuming `value` is the field name containing the point height
  # Map the position within the buffer back to an index into the full point set
  localMax <- ptsInBuffer[which.max(pointCloudSpdf@data$value[ptsInBuffer])]
  localMaxes[localMax] <- TRUE
}
localMaxPointCloudDf <- pointCloudSpdf@data[localMaxes,]
Now localMaxPointCloudDf should contain the data from the original points if they are a local maximum. Just a warning: this isn't going to be super fast if you have a lot of points. If that ends up being a concern, you could be smarter and pre-filter your points using a small grid and extract from the raster package.
That would look something like this:
Make the cell size small enough so that each 0.5m buffer will intersect at least 4 raster cells -- err on smaller since we are comparing circles to squares.
library(raster)
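# Parenthesize the differences before dividing, and round up to whole cells
numRows <- ceiling((extent(pointCloudSpdf)@ymax - extent(pointCloudSpdf)@ymin) / 0.2)
numCols <- ceiling((extent(pointCloudSpdf)@xmax - extent(pointCloudSpdf)@xmin) / 0.2)
# Give the raster the extent of the points; otherwise it defaults to a global extent
emptyRaster <- raster(extent(pointCloudSpdf), nrows=numRows, ncols=numCols)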
rasterize will create a grid with the maximum value of the given field within a cell. Because of the square/circle mismatch this is only a starting point to filter out obvious non-maxima. After this we will have a raster in which all the local maxima are represented by cells. However, we won't know which cells are maxima in the 0.5m radius and we don't know which point in the original feature layer they came from.
r <- rasterize(pointCloudSpdf,emptyRaster,"value",fun="max")
extract will give us raster values (i.e., the highest value for each cell) that each point intersects. Recall from above that all the local maxima will be in this set, although some values will not be 0.5m radius local maxima.
rasterMaxes <- extract(r,pointCloudSpdf)
To match up the original points with the raster maxes, just subtract the raster value at each point from that point's value. If the value is 0, then the values are the same and we have a point with a potential maximum. Note that at this point we are only merging the points back to the raster -- we will have to throw some of these out because they are "under" a 0.5m radius with a higher local max even though they are the max in their 0.2m x 0.2m cell.
potentialMaxima <- which(pointCloudSpdf@data$value-rasterMaxes==0)
Next, just subset the original SpatialPointsDataFrame and we'll do the more exhaustive and accurate iteration over this subset of points since we should have thrown out a bunch of points which could not have been maxima.
potentialMaximaCoords <- coordinates(pointCloudSpdf)[potentialMaxima, , drop=FALSE]
# using the data.frame() constructor because my example has only one column
potentialMaximaDf <- data.frame(pointCloudSpdf@data[potentialMaxima,])
# match.ID=FALSE: rows are already in the same order, so skip row-name matching
potentialMaximaSpdf <- SpatialPointsDataFrame(potentialMaximaCoords, potentialMaximaDf, match.ID=FALSE)
The rest of the algorithm is the same but we are buffering the smaller dataset and iterating over it:
bufferedPoints <- gBuffer(potentialMaximaSpdf, width=0.5, byid=TRUE)
# Save our local maxima state (this will be updated)
localMaxes <- rep(FALSE, nrow(PointCloud))
for (i in seq_len(nrow(bufferedPoints@data))) {
  # Wrap the i-th buffer as a one-polygon SpatialPolygons object
  bufPolygons <- bufferedPoints@polygons[[i]]
  bufSpPolygons <- SpatialPolygons(list(bufPolygons), proj4string=pointCloudSpdf@proj4string)
  # Still compare against ALL original points, not only the candidates
  ptsInBuffer <- which(!is.na(over(pointCloudSpdf, bufSpPolygons)))
  # Map the position within the buffer back to an index into the full point set
  localMax <- ptsInBuffer[which.max(pointCloudSpdf@data$value[ptsInBuffer])]
  localMaxes[localMax] <- TRUE
}
localMaxPointCloudDf <- pointCloudSpdf@data[localMaxes,]
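The two-stage idea (cheap grid pre-filter, then an exact radius check) can be tried out on synthetic points with base R only. This is a rough sketch under assumptions, not the sp/rgeos code above: the points are random, the height column is called `Z`, `is_max_within_radius()` is a hypothetical helper, and a point counts as a local maximum if no point within 0.5 m is higher.

```r
set.seed(42)
# Synthetic point cloud: 300 points in a 3 m x 3 m square
pts <- data.frame(X = runif(300, 0, 3), Y = runif(300, 0, 3), Z = runif(300, 0, 30))
radius <- 0.5
cell_size <- 0.2   # cell diagonal (about 0.28) is below the radius

# TRUE if no point within `radius` of point i is higher than point i
is_max_within_radius <- function(i, pts, radius) {
  d <- sqrt((pts$X - pts$X[i])^2 + (pts$Y - pts$Y[i])^2)
  pts$Z[i] >= max(pts$Z[d <= radius])
}

# Step 1: cheap pre-filter, keep only the highest point of each grid cell
cell_id <- paste(floor(pts$X / cell_size), floor(pts$Y / cell_size))
candidates <- which(pts$Z == ave(pts$Z, cell_id, FUN = max))

# Step 2: exact radius check against all points, but only for the candidates
keep <- vapply(candidates, is_max_within_radius, logical(1),
               pts = pts, radius = radius)
local_maxima <- pts[candidates[keep], ]

# Reference: exact check on every point, to confirm nothing was lost
brute <- which(vapply(seq_len(nrow(pts)), is_max_within_radius, logical(1),
                      pts = pts, radius = radius))
identical(candidates[keep], brute)
```

The pre-filter is safe because any two points in the same 0.2 m cell are closer than 0.5 m, so a point that is not the highest in its cell cannot be a maximum within the radius.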

Resources