I would like to change the resolution of a raster. For example, let's take this Landsat 7 image at ~30 m resolution.
library(terra)
#> terra 1.5.21
f <- system.file("tif/L7_ETMs.tif", package = "stars")
r <- rast(f)
# ~30 m x 30 m resolution
res(r)
#> [1] 28.5 28.5
plot(r, 1)
I can use aggregate() with an integer factor such as:
# 10 * 28.5
r2 <- aggregate(r, fact = 10)
res(r2)
#> [1] 285 285
plot(r2, 1)
My question is: how can I specify an exact resolution? For example, I would like to have a pixel resolution of 1.234 km (1234 m).
fact <- 1234 / 28.5
fact
#> [1] 43.29825
r3 <- aggregate(r, fact = fact)
res(r3)
#> [1] 1225.5 1225.5
plot(r3, 1)
The documentation says that fact should be an integer, so here fact is being floored to 43:
res(aggregate(r, 43))
#> [1] 1225.5 1225.5
Is there any way to get an exact resolution of 1234 m?
Created on 2022-04-28 by the reprex package (v2.0.1)
I came up with this solution, which seems to give me what I need.
library(terra)
#> terra 1.5.21
f <- system.file("tif/L7_ETMs.tif", package = "stars")
r <- rast(f)
plot(r, 1)
r2 <- r
res(r2) <- 1234
r2 <- resample(r, r2)
plot(r2, 1)
res(r2)
#> [1] 1234 1234
Created on 2022-04-28 by the reprex package (v2.0.1)
I would also suggest (as described in the terra vignette) that you first aggregate the raster as close as possible to the target resolution and then resample. Resampling can be done, e.g., using a template raster to guarantee the correct CRS, dimensions, etc.
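For example, here is a minimal sketch of that two-step approach (my own illustration, reusing the raster and the 1234 m target from the question):
library(terra)
f <- system.file("tif/L7_ETMs.tif", package = "stars")
r <- rast(f)
# 1) aggregate by the nearest integer factor
fact <- round(1234 / res(r)[1])
r_agg <- aggregate(r, fact = fact)
# 2) resample onto a template raster with the exact target resolution
tmpl <- rast(r)
res(tmpl) <- 1234
r_out <- resample(r_agg, tmpl)
res(r_out)
#> [1] 1234 1234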
I have a raster and a shapefile. The raster contains NAs, and I am filling them using the focal function:
library(terra)
v <- vect(system.file("ex/lux.shp", package="terra"))
r <- rast(system.file("ex/elev.tif", package="terra"))
r[45:60, 45:60] <- NA
r_fill <- terra::focal(r, 5, mean, na.policy="only", na.rm=TRUE)
However, some NAs are still left, so I do this:
na_count <- terra::freq(r_fill, value = NA)
while (na_count$count != 0) {
  r_fill <- terra::focal(r_fill, 5, mean, na.policy = "only", na.rm = TRUE)
  na_count <- terra::freq(r_fill, value = NA)
}
Once all NAs are filled, I clip the raster again using the shapefile:
r_fill <- terra::crop(r_fill, v, mask = T, touches = T)
This is what my before and after looks like:
I wondered if the while loop is an efficient way to fill the NAs or basically determine how many times I have to run focal to fill all the NAs in the raster.
Perhaps we can, or want to, dispense with the while() loop altogether by making a better estimate of focal()'s w argument in a world where r, as ground truth, isn't available. Were it available, we could readily derive a direct value for w:
r <- rast(system.file("ex/elev.tif", package="terra"))
# and its variants
r2 <- r
r2[45:60, 45:60] <- NA
freq(r2, value=NA) - freq(r, value=NA)
layer value count
1 0 NA 256
sqrt((freq(r2, value=NA) - freq(r, value=NA))$count)
[1] 16
which might be a good value for w. Now, introducing another variant:
r3 <- r
r3[40:47, 40:47] <- NA
r3[60:67, 60:67] <- NA
r3[30:37, 30:37] <- NA
r3[70:77, 40:47] <- NA
rm(r)
We no longer have our ground truth. How might we estimate an edge for w? Turning to boundaries() with its default values (inner):
r2_bi <- boundaries(r2)
r3_bi <- boundaries(r3)
# examining some properties of r2_bi, r3_bi
freq(r2_bi, value=1)$count
[1] 503
freq(r3_bi, value=1)$count
[1] 579
freq(r2_bi, value=1)$count/freq(r2_bi, value = 0)$count
[1] 0.1306833
freq(r3_bi, value=1)$count/freq(r3_bi, value = 0)$count
[1] 0.1534588
sum(freq(r2_bi, value=1)$count,freq(r2_bi, value = 0)$count)
[1] 4352
sum(freq(r3_bi, value=1)$count,freq(r3_bi, value = 0)$count)
[1] 4352
Taken in reverse order, the sums and freqs suggest that while the total area of the holes (let's call them that) is the same, they differ in number, and r2's hole is larger than r3's. This is also clear from the first pair of freqs.
Now we drift into some voodoo, hocus pocus, in pursuit of a better edge estimate:
sum(freq(r2)$count) - sum(freq(r2, value = NA)$count)
[1] 154
sum(freq(r3)$count) - sum(freq(r3, value = NA)$count)
[1] 154
sqrt(sum(freq(r3)$count) - sum(freq(r3, value = NA)$count))
[1] 12.40967
freq(r2_bi, value=1)$count/freq(r2_bi, value = 0)$count
[1] 0.1306833
freq(r2_bi, value=0)$count/freq(r2_bi, value = 1)$count
[1] 7.652087
freq(r3_bi, value=1)$count/freq(r3_bi, value = 0)$count
[1] 0.1534588
Taking the larger, i.e. the freq(r2_bi ratio of 7.652087:
7.652087/0.1306833
[1] 58.55444
154+58
[1] 212
sqrt(212)
[1] 14.56022
round(sqrt(212)+1)
[1] 16
Well, except for that +1 part, maybe still a decent estimate for w, to be used on both r2 and r3 if called upon to find a better w, perhaps obviating the need for while().
Another approach to looking for squares and their edges:
wtf3 <- values(r3_bi$elevation)
wtf2 <- values(r2_bi$elevation)
wtf2_tbl_df2 <- as.data.frame(table(rle(as.vector(is.na(wtf2)))$lengths))
wtf3_tbl_df2 <- as.data.frame(table(rle(as.vector(is.na(wtf3)))$lengths))
names(wtf2_tbl_df2)
[1] "Var1" "Freq"
wtf2_tbl_df2[which(wtf2_tbl_df2$Var1 == wtf2_tbl_df2$Freq), ]
Var1 Freq
14 16 16
wtf3_tbl_df2[which(wtf3_tbl_df2$Freq == max(wtf3_tbl_df2$Freq)), ]
Var1 Freq
7 8 35
35/8
[1] 4.375 # 4 squares of 8 with 3 8 length vectors
Bringing in v finally and filling:
v <- vect(system.file("ex/lux.shp", package="terra"))
r2_fill_17 <- focal(r2, 16 + 1 , mean, na.policy='only', na.rm = TRUE)
r3_fill_9 <- focal(r3, 8 + 1 , mean, na.policy='only', na.rm = TRUE)
r2_fill_17_cropv <- crop(r2_fill_17, v, mask = TRUE, touches = TRUE)
r3_fill_9_cropv <- crop(r3_fill_9, v, mask = TRUE, touches = TRUE)
And I now appreciate your while() approach, as your r2 looks better, more naturally transitioned, though the r3 looks fine. In my few brief experiments with windows smaller than the 'hole', e.g. focal(r2, 9), I got the sense it would take 2 passes to fill, which suggests focal(r2, 5) would take 4.
I guess further determining the proportion of fill:hole:raster for deciding when to deploy a while loop would be worthwhile.
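As a rough sketch (my addition) of one way to look at those proportions, using r2 and r3 from above (note that freq(..., value = NA) also counts the NA cells outside the Luxembourg border, not just the artificial holes):
prop_na <- function(x) freq(x, value = NA)$count / ncell(x)
c(r2 = prop_na(r2), r3 = prop_na(r3))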
So I've combined those 2 rasters and made them into one DEM raster, which contains elevation values:
library(stars)
library(sf)
dem1 = read_stars("srtm_43_06.tif")
dem2 = read_stars("srtm_44_06.tif")
pol = st_read("israel_borders.shp")
dem = st_mosaic(dem1, dem2)
dem = dem[, 5687:6287, 2348:2948]
names(dem) = "elevation"
dem = st_warp(src = dem, crs = 32636, method = "near", cellsize = 90)
Now I need to calculate a point geometry for the peak of the mountain by finding the centroid of the pixel that has the highest elevation in the image. Does anyone know what functions I can use?
Building on Grzegorz Sapijaszko's example, here is an alternative path to the top of the mountain.
library(terra)
f <- system.file("ex/elev.tif", package="terra")
x <- rast(f)
If there is a single maximum, you can do
g <- global(x, which.max)
xyFromCell(x, g[,1])
# x y
#[1,] 6.020833 50.17917
Now, consider a situation with multiple maxima. I add three more cells with the maximum value.
x[c(1000, 2500, 5000)] <- 547
We can find the four highest peaks with:
g <- global(x, which.max)[[1]]
v <- x[g] |> unlist()
y <- ifel(x == v, v, NA)
p <- as.points(y)
crds(p)
#[1,] 6.020833 50.17917
#[2,] 6.154167 50.10417
#[3,] 5.987500 49.97083
#[4,] 6.237500 49.75417
You should not warp (project with terra) the raster data first because that changes the cell values and potentially the location of the highest peak. You should find the peaks with the original data, but then you can transform the results like this.
pp <- project(p, "EPSG:32636")
crds(pp)
# x y
#[1,] -1411008 5916157
#[2,] -1404896 5904422
#[3,] -1422145 5894509
#[4,] -1413735 5864236
With your files, you could start with something like
ff <- c("srtm_43_06.tif", "srtm_44_06.tif")
v <- vrt(ff)
g <- global(v, which.max)
And then continue as in the examples above.
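For instance, a hedged sketch of that continuation (assuming the two srtm files are in the working directory, as in the question):
xy <- xyFromCell(v, g[, 1])      # lon/lat of the highest cell
pt <- vect(xy, crs = crs(v))     # turn it into a SpatVector point
crds(project(pt, "EPSG:32636"))  # and project afterwards, as recommended above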
Let's use terra; however, a similar approach can be applied with the raster package as well. For testing purposes we will use a raster supplied with the terra package.
library(terra)
#> terra 1.5.12
f <- system.file("ex/elev.tif", package="terra")
v <- rast(f)
plot(v)
You can check the details of your raster by just typing the raster object's name and pressing Enter; you can check the min and max values with the minmax() function from terra:
minmax(v)
#> elevation
#> [1,] 141
#> [2,] 547
Let's create another raster by copying the original one, but checking whether each value equals the maximum elevation:
w <- v == minmax(v)[2]
plot(w)
Let's create a substitution matrix, and substitute all FALSE with NA and TRUE with 1:
mx <- matrix(c(FALSE, NA, TRUE, 1), ncol = 2, byrow = TRUE)
w <- classify(w, mx)
plot(v)
plot(as.polygons(w), add=TRUE)
Let's find the centroids of those polygon(s):
pts <- centroids(as.polygons(w))
plot(pts, add=TRUE)
Let's see our coordinates:
as.data.frame(pts, geom = "WKT")
#> elevation geometry
#> 1 1 POINT (6.020833 50.179167)
Created on 2022-01-29 by the reprex package (v2.0.1)
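As a side note (my own addition, assuming a reasonably recent terra version), terra also has a where.max() helper that returns the cell of the maximum directly, which can replace the classify/polygon steps when you only need the coordinates; the "cell" column name is an assumption on my part:
w <- where.max(v)       # matrix with the layer, cell number and value of the maximum
xyFromCell(v, w[, "cell"])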
Preamble
I've looked through other questions (1, 2, 3) describing the use and function of set.seed() and .Random.seed and can't find this particular issue documented, so here it is as a question:
Initial Observation
When I inspect the .Random.seeds generated as a result of set.seed(1) and set.seed(2), I find that the first two elements are always the same (10403 & 624) while the rest appear not to be. See the example below.
My questions
Is that expected?
Why does it happen?
Will this have any untoward consequences for any random simulation I might do based on it?
Reproducible Example
f <- function(s1, s2){
set.seed(s1)
r1 <- .Random.seed
set.seed(s2)
r2 <- .Random.seed
print(r1[1:3])
print(r2[1:3])
plot(r1, r2)
}
f(1, 2)
#> [1] 10403 624 -169270483
#> [1] 10403 624 -1619336578
Created on 2022-01-04 by the reprex package (v2.0.1)
Note that the first two elements of each .Random.seed are identical but the remainder is not. You can see in the scatterplot that it's just a random cloud as expected.
Expanding helpful comments from @r2evans and @Dave2e into an answer.
1) .Random.seed[1]
From ?.Random.seed, it says:
".Random.seed is an integer vector whose first element codes the
kind of RNG and normal generator. The lowest two decimal digits are in
0:(k-1) where k is the number of available RNGs. The hundreds
represent the type of normal generator (starting at 0), and the ten
thousands represent the type of discrete uniform sampler."
Therefore the first value doesn't change unless one changes the generator method (RNGkind).
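As a small illustration (my own addition, not from the linked docs), you can decode .Random.seed[1] by hand and compare it with RNGkind():
set.seed(1)               # defaults: Mersenne-Twister / Inversion / Rejection
code <- .Random.seed[1]   # 10403 on a recent R version
code %% 100               # 3 -> fourth RNG kind (0-based): "Mersenne-Twister"
(code %/% 100) %% 100     # 4 -> normal generator: "Inversion"
code %/% 10000            # 1 -> discrete uniform sampler: "Rejection"
RNGkind()                 # should agree with the decoded values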
Here is a small demonstration of this for each of the available RNGkinds:
library(tidyverse)
# available RNGkind options
kinds <- c(
"Wichmann-Hill",
"Marsaglia-Multicarry",
"Super-Duper",
"Mersenne-Twister",
"Knuth-TAOCP-2002",
"Knuth-TAOCP",
"L'Ecuyer-CMRG"
)
# test over multiple seeds
seeds <- c(1:3)
f <- function(kind, seed) {
# set seed with simulation parameters
set.seed(seed = seed, kind = kind)
# check value of first element in .Random.seed
return(.Random.seed[1])
}
# run on simulated conditions and compare value over different seeds
expand_grid(kind = kinds, seed = seeds) %>%
pmap(f) %>%
unlist() %>%
matrix(
ncol = length(seeds),
byrow = T,
dimnames = list(kinds, paste0("seed_", seeds))
)
#> seed_1 seed_2 seed_3
#> Wichmann-Hill 10400 10400 10400
#> Marsaglia-Multicarry 10401 10401 10401
#> Super-Duper 10402 10402 10402
#> Mersenne-Twister 10403 10403 10403
#> Knuth-TAOCP-2002 10406 10406 10406
#> Knuth-TAOCP 10404 10404 10404
#> L'Ecuyer-CMRG 10407 10407 10407
Created on 2022-01-06 by the reprex package (v2.0.1)
2) .Random.seed[2]
At least for the default "Mersenne-Twister" method, .Random.seed[2] is an index that indicates the current position in the random set. From the docs:
The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current
position in that set.
This is updated when random processes using the seed are executed. However, for the other methods the documentation doesn't mention anything like this, and there doesn't appear to be a clear trend in the same way.
See below for an example of the changes in .Random.seed[2] over an iterative random process after set.seed().
library(tidyverse)
# available RNGkind options
kinds <- c(
"Wichmann-Hill",
"Marsaglia-Multicarry",
"Super-Duper",
"Mersenne-Twister",
"Knuth-TAOCP-2002",
"Knuth-TAOCP",
"L'Ecuyer-CMRG"
)
# create function to run random process and report .Random.seed[2]
t <- function(n = 1) {
p <- .Random.seed[2]
runif(n)
p
}
# create function to set seed and iterate a random process
f2 <- function(kind, seed = 1, n = 5) {
set.seed(seed = seed,
kind = kind)
replicate(n, t())
}
# set simulation parameters
trials <- 5
seeds <- 1:2
x <- expand_grid(kind = kinds, seed = seeds, n = trials)
# evaluate and report
x %>%
pmap_dfc(f2) %>%
mutate(n = paste0("trial_", 1:trials)) %>%
pivot_longer(-n, names_to = "row") %>%
pivot_wider(names_from = "n") %>%
select(-row) %>%
bind_cols(x[,1:2], .)
#> # A tibble: 14 x 7
#> kind seed trial_1 trial_2 trial_3 trial_4 trial_5
#> <chr> <int> <int> <int> <int> <int> <int>
#> 1 Wichmann-Hill 1 23415 8457 23504 2.37e4 2.28e4
#> 2 Wichmann-Hill 2 21758 27800 1567 2.58e4 2.37e4
#> 3 Marsaglia-Multicarry 1 1280795612 945095059 14912928 1.34e9 2.23e8
#> 4 Marsaglia-Multicarry 2 -897583247 -1953114152 2042794797 1.39e9 3.71e8
#> 5 Super-Duper 1 1280795612 -1162609806 -1499951595 5.51e8 6.35e8
#> 6 Super-Duper 2 -897583247 224551822 -624310 -2.23e8 8.91e8
#> 7 Mersenne-Twister 1 624 1 2 3 4
#> 8 Mersenne-Twister 2 624 1 2 3 4
#> 9 Knuth-TAOCP-2002 1 166645457 504833754 504833754 5.05e8 5.05e8
#> 10 Knuth-TAOCP-2002 2 967462395 252695483 252695483 2.53e8 2.53e8
#> 11 Knuth-TAOCP 1 1050415712 999978161 999978161 1.00e9 1.00e9
#> 12 Knuth-TAOCP 2 204052929 776729829 776729829 7.77e8 7.77e8
#> 13 L'Ecuyer-CMRG 1 1280795612 -169270483 -442010614 4.71e8 1.80e9
#> 14 L'Ecuyer-CMRG 2 -897583247 -1619336578 -714750745 2.10e9 -9.89e8
Created on 2022-01-06 by the reprex package (v2.0.1)
Here you can see that, for the Mersenne-Twister method, .Random.seed[2] wraps from its maximum of 624 back to 1 and then increases by the size of each random draw, and that this is the same for set.seed(1) and set.seed(2). However, the same trend is not seen in the other methods. To illustrate the last point, see that runif(1) increments .Random.seed[2] by 1 while runif(2) increments it by 2:
# create function to run random process and report .Random.seed[2]
t <- function(n = 1) {
p <- .Random.seed[2]
runif(n)
p
}
set.seed(1, kind = "Mersenne-Twister")
replicate(9, t(1))
#> [1] 624 1 2 3 4 5 6 7 8
set.seed(1, kind = "Mersenne-Twister")
replicate(5, t(2))
#> [1] 624 2 4 6 8
Created on 2022-01-06 by the reprex package (v2.0.1)
3) Sequential Randoms
Because the index or state of .Random.seed (apparently for all the RNG methods) advances according to the size of the 'random draw' (the number of random values generated from the .Random.seed), it is possible to generate the same series of random numbers from the same seed in different-sized increments. Furthermore, as long as you run the same random process at the same point in the sequence after setting the same seed, it seems that you will get the same result. Observe the following example:
# draw 3 at once
set.seed(1, kind = "Mersenne-Twister")
sample(100, 3, T)
#> [1] 68 39 1
# repeat single draw 3 times
set.seed(1, kind = "Mersenne-Twister")
sample(100, 1)
#> [1] 68
sample(100, 1)
#> [1] 39
sample(100, 1)
#> [1] 1
# draw 1, do something else, draw 1 again
set.seed(1, kind = "Mersenne-Twister")
sample(100, 1)
#> [1] 68
runif(1)
#> [1] 0.5728534
sample(100, 1)
#> [1] 1
Created on 2022-01-06 by the reprex package (v2.0.1)
4) Correlated Randoms
As we saw above, two random processes run at the same point after setting the same seed are expected to give the same result. However, even when you provide constraints on how similar the results can be (e.g. by changing the mean of rnorm(), or even by providing different functions), it seems that the results are still perfectly correlated within their respective ranges.
# same function with different constraints
set.seed(1, kind = "Mersenne-Twister")
a <- runif(50, 0, 1)
set.seed(1, kind = "Mersenne-Twister")
b <- runif(50, 10, 100)
plot(a, b)
# different functions
set.seed(1, kind = "Mersenne-Twister")
d <- rnorm(50)
set.seed(1, kind = "Mersenne-Twister")
e <- rlnorm(50)
plot(d, e)
Created on 2022-01-06 by the reprex package (v2.0.1)
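As a quick numeric check of that claim (my own addition): runif(n, min, max) is just min + (max - min) times the same underlying uniforms, so the two draws below are an exact linear transformation of each other.
set.seed(1, kind = "Mersenne-Twister")
a <- runif(50, 0, 1)
set.seed(1, kind = "Mersenne-Twister")
b <- runif(50, 10, 100)
cor(a, b)   # numerically 1, since b = 10 + 90 * a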
I have a script using the R package concaveman, but due to issues on the Ubuntu platform that I need to run the code on, I cannot install this package (I have spent three days trying to solve it). So I am looking for an alternative.
I have a random set of points, ranging from 3 to 1000s of points. I want to draw a convex hull/polygon around the outermost points (the step after would be to rasterize it). I have been trying to do this by converting the points to a raster and then using rasterToPolygons(), but on rare occasions points fall in the same raster cell, resulting in only two unique points. concaveman would turn this into a linear polygon (which is what I want, without using concaveman). Here is input data that would be problematic:
x <- structure(list(x = c(166.867, 166.867, 167.117, 166.8667), y = c(-20.6333,
-20.633, -20.833, -20.6333)), row.names = c(NA, -4L), class = c("tbl_df",
"tbl", "data.frame"))
This is what I tried now (with the error I get):
library(sp)
library(raster)
SP_pt <- SpatialPoints(x, proj4string = crs("+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs"))
gridded(SP_pt) <- TRUE
SP_pt_R <- raster(SP_pt)
SP_poly <- rasterToPolygons(SP_pt_R, dissolve = TRUE)
suggested tolerance minimum: 0.333333
Error in points2grid(points, tolerance, round) :
dimension 1 : coordinate intervals are not constant
You can use chull in base R:
sp::Polygon(x[c(chull(x), chull(x)[1]), ])
#> An object of class "Polygon"
#> Slot "labpt":
#> [1] 166.95023 -20.69977
#>
#> Slot "area":
#> [1] 6.75e-05
#>
#> Slot "hole":
#> [1] FALSE
#>
#> Slot "ringDir":
#> [1] 1
#>
#> Slot "coords":
#> x y
#> [1,] 167.1170 -20.8330
#> [2,] 166.8667 -20.6333
#> [3,] 166.8670 -20.6330
#> [4,] 167.1170 -20.8330
Or if you want to use the sf package:
sf::st_polygon(list(as.matrix(x[c(chull(x), chull(x)[1]),])))
#> POLYGON ((167.117 -20.833, 166.8667 -20.6333, 166.867 -20.633, 167.117 -20.833))
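If you are already using sf, an alternative sketch (my addition, not part of the answer above) is to build the hull with sf's own function, assuming x is the tibble of coordinates from the question:
library(sf)
pts <- st_as_sf(x, coords = c("x", "y"), crs = 4326)
hull <- st_convex_hull(st_union(pts))
hull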
You can use dismo::convHull and then use predict or rasterize
library(dismo)
xy <- cbind(x=c(1,1,2,2), y=c(3,2,1,2))
# must be matrix or data.frame, not a tbl
ch <- convHull(xy)
plot(ch)
# predict
r <- raster(xmn=0, xmx=5, ymn=0, ymx=5, res=.25)
p <- predict(ch, r)
# Or rasterize
sp <- polygons(ch)
x <- rasterize(sp, r)
For faster rasterization you can use terra
library(terra)
v <- vect(sp)
rr <- rast(r)
y <- rasterize(v, rr)
To cast sp to sf:
library(sf)
sf <- as(sp, "sf")
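If you prefer to stay entirely in terra (my own sketch, not part of the answer), terra also provides convHull() for point SpatVectors; rr is the template raster defined above and xy the coordinate matrix from the dismo example:
pv <- vect(xy, crs = "+proj=longlat")
hullv <- convHull(pv)
y2 <- rasterize(hullv, rr)
plot(y2); lines(hullv)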
I'm not sure if it's a meaningful question, but I don't understand how (and if) it is possible to combine a list of ppp objects into a single ppp object. For example:
library(spatstat)
#> Loading required package: spatstat.data
#> Loading required package: nlme
#> Loading required package: rpart
#>
#> spatstat 1.62-2 (nickname: 'Shape-shifting lizard')
#> For an introduction to spatstat, type 'beginner'
ppp1 <- ppp(runif(20), runif(20), c(0,1), c(0,1))
ppp2 <- ppp(runif(20), runif(20), c(0,1), c(0,1))
do.call("rbind", list(ppp1, ppp2))
#> window n x y markformat
#> [1,] List,4 20 Numeric,20 Numeric,20 "none"
#> [2,] List,4 20 Numeric,20 Numeric,20 "none"
do.call("ppp", list(ppp1, ppp2))
#> Error in ppp(structure(list(window = structure(list(type = "rectangle", : is.numeric(x) is not TRUE
Created on 2020-01-30 by the reprex package (v0.3.0)
I think that the result should be a ppp object created by rbinding the coordinates and marks, with a window that is the union of the owin objects. Is that a reasonable idea? Is it already coded or documented somewhere?
It was a stupid question; it's all documented here: https://rdrr.io/cran/spatstat/man/superimpose.html.
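A minimal sketch using superimpose() (my addition, following that documentation; I believe the resulting window defaults to the union of the input windows):
library(spatstat)
ppp1 <- ppp(runif(20), runif(20), c(0, 1), c(0, 1))
ppp2 <- ppp(runif(20), runif(20), c(0, 2), c(0, 2))
combined <- superimpose(ppp1, ppp2)
combined   # a single ppp; its window should cover both input windows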