I have a set of raster files (in this case downloaded from http://www.paleoclim.org/) that I am reading into R using the stars package.
library("tidyverse")
library("fs")
library("stars")
data_path <- "./paleoclim"
(data_files <- list.files(data_path, pattern = "\\.tif$"))
#> [1] "BA_v1_2_5m_bio_1_badia.tif"
#> [2] "BA_v1_2_5m_bio_10_badia.tif"
#> [3] "BA_v1_2_5m_bio_11_badia.tif"
#> [...]
#> [39] "EH_v1_2_5m_bio_1_badia.tif"
#> [40] "EH_v1_2_5m_bio_10_badia.tif"
#> [41] "EH_v1_2_5m_bio_11_badia.tif"
#> [...]
#> [58] "HS1_v1_2_5m_bio_1_badia.tif"
#> [59] "HS1_v1_2_5m_bio_10_badia.tif"
#> [60] "HS1_v1_2_5m_bio_11_badia.tif"
#> [...]
(paleoclim <- read_stars(path(data_path, data_files)))
#> stars object with 2 dimensions and 133 attributes
#> attribute(s):
#> BA_v1_2_5m_bio_1_badia.tif BA_v1_2_5m_bio_10_badia.tif
#> Min. :101.0 Min. :213.0
#> 1st Qu.:166.0 1st Qu.:278.0
#> Median :173.0 Median :298.0
#> Mean :171.8 Mean :290.3
#> 3rd Qu.:180.0 3rd Qu.:304.0
#> Max. :200.0 Max. :325.0
#> [...]
#> dimension(s):
#> from to offset delta refsys point values
#> x 1 72 36 0.0416667 WGS 84 FALSE NULL [x]
#> y 1 48 33 -0.0416667 WGS 84 FALSE NULL [y]
Created on 2020-12-07 by the reprex package (v0.3.0)
The filenames contain two pieces of information that I would like to represent as dimensions of the stars object, e.g. HS1_v1_2_5m_bio_1_badia.tif refers to period "HS1" and bioclimatic variable "bio_1".
I've got as far as using st_redimension() to create the new dimensions and levels:
periods <- str_extract(names(paleoclim), "[^_]+")
biovars <- str_extract(names(paleoclim), "bio_[0-9]+")
paleoclim %>%
  merge() %>%
  st_redimension(
    new_dims = st_dimensions(x = 1:72, y = 1:48,
                             period = unique(periods),
                             biovar = unique(biovars))
  )
#> stars object with 4 dimensions and 1 attribute
#> attribute(s):
#> X
#> Min. : -91.0
#> 1st Qu.: 26.0
#> Median : 78.0
#> Mean : 588.2
#> 3rd Qu.: 256.0
#> Max. :11275.0
#> dimension(s):
#> from to offset delta refsys point values
#> x 1 72 1 1 NA FALSE NULL [x]
#> y 1 48 1 1 NA FALSE NULL [y]
#> period 1 7 NA NA NA FALSE BA,...,YDS
#> biovar 1 19 NA NA NA FALSE bio_1,...,bio_9
But this doesn't actually map the values of the attributes (filenames) to the levels of the new dimensions. Also, most of the information (e.g. the CRS) about the original x and y dimensions is lost because I have to recreate them manually.
How do you properly define new dimensions of a stars object based on another dimension or attribute?
I don't see a straightforward way to split one dimension into two after all the files have been read into a three-dimensional stars object. An alternative approach you could use is to:
read one folder at a time, with all files in that folder going into a third dimension (variable), storing the results as separate stars objects in a list,
then combine the resulting stars objects, with each object going into a fourth dimension (period).
For this example, I downloaded the following two products and unzipped into two separate folders:
http://sdmtoolbox.org/paleoclim.org/data/BA/BA_v1_10m.zip
http://sdmtoolbox.org/paleoclim.org/data/HS1/HS1_v1_10m.zip
Here is the code:
library(stars)
# Directories with GeoTIFF files
paths = c(
"/home/michael/Downloads/BA_v1_10m",
"/home/michael/Downloads/HS1_v1_10m"
)
# Read the files and set 3rd dimension
r = list()
for(i in paths) {
files = list.files(path = i, pattern = "\\.tif$", full.names = TRUE)
r[[i]] = read_stars(files)
names(r[[i]]) = basename(files)
r[[i]] = st_redimension(r[[i]])
}
# Combine the list
r = do.call(c, r)
# Attributes to 4th dimension
names(r) = basename(paths)
r = st_redimension(r)
# Clean dimension names
r = st_set_dimensions(r, names = c("x", "y", "variable", "period"))
r
and the printout of the result:
## stars object with 4 dimensions and 1 attribute
## attribute(s), summary of first 1e+05 cells:
## BA_v1_10m.HS1_v1_10m
## Min. :-344.0
## 1st Qu.:-290.0
## Median :-274.0
## Mean :-264.8
## 3rd Qu.:-252.0
## Max. :-128.0
## NA's :94073
## dimension(s):
## from to offset delta refsys point values x/y
## x 1 2160 -180 0.166667 WGS 84 FALSE NULL [x]
## y 1 1072 88.6667 -0.166667 WGS 84 FALSE NULL [y]
## variable 1 19 NA NA NA NA bio_1.tif,...,bio_9.tif
## period 1 2 NA NA NA NA BA_v1_10m , HS1_v1_10m
The result is a stars object with four dimensions, including x, y, variable, and period.
Here are plots, separately for each of the two levels in the period dimension:
plot(r[,,,,1,drop=TRUE])  # first period level ("BA_v1_10m"); drop=TRUE removes the length-1 period dimension
plot(r[,,,,2,drop=TRUE])  # second period level ("HS1_v1_10m")
I have a dataset ind with two populations, fr2100 and nr, where each individual has a unique number. Each individual has coordinates: a Dim.1 and a Dim.2 value, as you can see here:
> ind <- get_pca_ind(res_acp)
> ind
Principal Component Analysis Results for individuals
===================================================
Name Description
1 "$coord" "Coordinates for the individuals"
2 "$cos2" "Cos2 for the individuals"
3 "$contrib" "contributions of the individuals"
# isolate the population 'fr2100'
> fr2100 <- ind$coord[substr(rownames(ind$coord), 1, 7) == 'fr2100_', ]
> str(fr2100)
'data.frame': 6873 obs. of 3 variables:
$ rowname: chr "fr2100_72" "fr2100_73" "fr2100_74" "fr2100_75" ...
$ Dim.1 : num 1.37 1.3 1.25 1.25 1.18 ...
$ Dim.2 : num -1.249 -1.028 -0.835 -0.624 -0.483 ...
# isolate the population 'nr'
> nr <- ind$coord[substr(rownames(ind$coord), 1, 3) == 'nr_', ]
> str(nr)
'data.frame': 4897 obs. of 3 variables:
$ rowname: chr "nr_174" "nr_175" "nr_176" "nr_177" ...
$ Dim.1 : num -3.74 -3.44 -3.26 -2.97 -3.88 ...
$ Dim.2 : num 1.26 1.55 1.7 1.91 1.3 ...
My question: I am trying to understand how I can select, among the 6873 fr2100 individuals, only those whose Dim.1 AND Dim.2 values lie within a distance of roughly 0.01 of one of the 4897 nr individuals, represented in this cloud of points:
In other words, every fr2100 individual that falls within the perimeter (of radius 0.01) of an nr individual, as theoretically represented here.
I'm interested in any answers. I can provide more information if needed. Thank you in advance.
I guess distance_semi_join() from the fuzzyjoin package would be a rather straightforward and compact way to filter by Euclidean distance. Other variants like distance_left_join() are also worth considering, as those can add an optional distance column to the resulting data frame (see the short sketch after the plot below).
library(fuzzyjoin)
library(ggplot2)
# example datasets
set.seed(1)
nr <- data.frame(rowname = paste0("nr_", 1:100), Dim.1 = rnorm(100, -0.05, 0.03), Dim.2 = rnorm(100, 0, 0.02))
fr <- data.frame(rowname = paste0("fr_", 1:100), Dim.1 = rnorm(100, 0.05, 0.03), Dim.2 = rnorm(100, 0, 0.02))
# fr points within distance of closest nr point:
fr_in_dist <- distance_semi_join(fr, nr,
                                 by = c("Dim.1", "Dim.2"),
                                 max_dist = 0.01)
fr_in_dist
#> rowname Dim.1 Dim.2
#> 5 fr_5 -0.018557066 3.308291e-02
#> 14 fr_14 0.008893764 -1.311564e-02
#> 18 fr_18 0.012401307 -2.420202e-03
#> 25 fr_25 0.015302829 9.640590e-03
#> 28 fr_28 0.001834598 3.409789e-03
#> 32 fr_32 -0.036667620 -3.138164e-02
#> 38 fr_38 0.014406241 8.797409e-05
#> 46 fr_46 -0.010004948 -2.817701e-02
#> 57 fr_57 -0.022092886 -2.347154e-02
#> 68 fr_68 0.014326601 1.135904e-02
#> 77 fr_77 -0.018673719 2.577108e-03
#> 79 fr_79 0.010512645 -3.278219e-03
#> 84 fr_84 0.028963050 3.286837e-03
#> 86 fr_86 0.019967835 -1.130428e-03
#> 94 fr_94 0.007212280 6.132097e-03
ggplot() +
geom_point(data = nr, aes(x = Dim.1, y = Dim.2, color = "nr"))+
geom_point(data = fr, aes(x = Dim.1, y = Dim.2, color = "fr"))+
geom_point(data = fr_in_dist, aes(x = Dim.1, y = Dim.2), shape = 1, size = 5 )+
coord_fixed() +
theme_bw()
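For reference, a minimal sketch (not part of the original example) showing how distance_left_join() can keep the distance itself, assuming the distance_col argument of current fuzzyjoin versions:
# left join that also records the Euclidean distance to each matched nr point;
# the column name "dist" is arbitrary
fr_with_dist <- distance_left_join(fr, nr,
                                   by = c("Dim.1", "Dim.2"),
                                   max_dist = 0.01,
                                   distance_col = "dist")
head(fr_with_dist)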
The original answer was about a single reference point vs. a point cloud; in that case dist() from base R is also quite straightforward:
library(ggplot2)
# sample data, add point fr2100_xx that would fall outside of the perimeter
df <- read.csv(text = "rowname, Dim.1, Dim.2
fr2100_72, 0.003810163, 0.006935450
fr2100_73, 0.003433946, 0.004698691
fr2100_74, 0.003168248, 0.003097222
fr2100_xx, 0.015, 0.015")
# nr and threshold distance
nr <- c(0.0035, 0.005)
thr_dist <- 0.01
# insert nr point to first position to use it in distance matrix calculation
dist_m <- rbind(nr, df[,c("Dim.1", "Dim.2")]) |> dist() |> as.matrix()
# distances:
as.dist(dist_m)
#> 1 2 3 4
#> 2 0.0019601448
#> 3 0.0003084643 0.0022681777
#> 4 0.0019314822 0.0038915356 0.0016233602
#> 5 0.0152397507 0.0137930932 0.0154884012 0.0167829223
# extract first column, distances from point "nr" ([1,1] = 0)
df$dist <- dist_m[-1,1]
# flag points that fall within the perimeter
df$in_dist = df$dist <= thr_dist
df
#> rowname Dim.1 Dim.2 dist in_dist
#> 1 fr2100_72 0.003810163 0.006935450 0.0019601448 TRUE
#> 2 fr2100_73 0.003433946 0.004698691 0.0003084643 TRUE
#> 3 fr2100_74 0.003168248 0.003097222 0.0019314822 TRUE
#> 4 fr2100_xx 0.015000000 0.015000000 0.0152397507 FALSE
Viz - https://i.imgur.com/jiqHmXn.png
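A side note (just a small sketch, equivalent to the matrix approach above): for a single reference point, the Euclidean distance can also be computed directly, without building the full distance matrix:
# direct distance of every row of df to the single reference point nr
df$dist_direct <- sqrt((df$Dim.1 - nr[1])^2 + (df$Dim.2 - nr[2])^2)
all.equal(df$dist_direct, df$dist)  # should be TRUE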
I have two stars objects that are read in to R as tifs:
tif1 <- stars::read_stars("/data.tif")
tif2 <- stars::read_stars("/data2.tif")
They cover the same extent and have the same resolution. I know that I can do algebra with the objects -- for example, to create a new object that is the average of the values of the first two, I can use:
tif.avg <- (tif1 + tif2)/2
However, I want to know if it's possible to create a new object that extracts the minimum value from them instead. I've tried it a couple of different ways but I've hit a brick wall with this. Does anybody know if this is even possible?
O.K., thanks for the clarification @Ben Lee.
So, as a follow-up to your comment, please find below (cf. reprex) one solution to your problem:
REPREX:
library(raster)
#> Loading required package: sp
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
# 1. Creating two stars objects
r1 <- raster(ncols = 3, nrows = 3)
values(r1) <- seq(length(r1))
r2 <- raster(ncols = 3, nrows = 3)
values(r2) <- rev(seq(length(r2)))
r_stack <- stack(r1, r2)
writeRaster(r_stack, "raster.tif",
bylayer = TRUE, suffix = 1:nlayers(r_stack))
tif1 <- read_stars("raster_1.tif")
tif2 <- read_stars("raster_2.tif")
# Array of the first stars object
tif1[[1]]
#> [,1] [,2] [,3]
#> [1,] 1 4 7
#> [2,] 2 5 8
#> [3,] 3 6 9
# Array of the second stars object
tif2[[1]]
#> [,1] [,2] [,3]
#> [1,] 9 6 3
#> [2,] 8 5 2
#> [3,] 7 4 1
# 2. Creating a 'stars' object with the minimum values
# of the two previous stars objects
# 2.1. retrieving the min values between the two stars object
tif_min <- pmin(tif1[[1]], tif2[[1]])
# 2.2. converting the resulting array 'tif_min' into a stars object
tif_min <- st_as_stars(tif_min)
# 2.3. retrieving the dimensions from one of the two previous
# stars object (here, tif1) and setting a name
st_dimensions(tif_min) <- st_dimensions(tif1)
setNames(tif_min, "tif_min")
#> stars object with 2 dimensions and 1 attribute
#> attribute(s):
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> tif_min 1 2 3 2.777778 4 5
#> dimension(s):
#> from to offset delta refsys point values x/y
#> x 1 3 -180 120 WGS 84 FALSE NULL [x]
#> y 1 3 90 -60 WGS 84 FALSE NULL [y]
# 2.4 a little check!
tif_min[[1]]
#> [,1] [,2] [,3]
#> [1,] 1 4 3
#> [2,] 2 5 2
#> [3,] 3 4 1
#> These are the minimum values of the two "stars" input objects
Created on 2021-09-20 by the reprex package (v2.0.1)
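A small follow-up sketch (my own variation, not part of the answer above): since a stars object is essentially a list of arrays plus dimension metadata, steps 2.2 and 2.3 can presumably be shortened by copying one of the inputs and swapping in the minimum values:
# reuse tif1's dimensions (offset, delta, CRS) and only replace the array
tif_min2 <- tif1
tif_min2[[1]] <- pmin(tif1[[1]], tif2[[1]])
names(tif_min2) <- "tif_min"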
Please confirm that this is what you were looking for (and if so, please do not forget to validate the answer to make it easier for other users to find this solution)
In the code below, I've simulated dice rolls at increasing sample sizes and computed the average roll at each sample size. My lapply() call works, but I'm uncomfortable with it since sample_n() has been superseded by slice_sample(). I would like to make my code better with a dplyr solution that uses slice_sample() rather than sample_n() within the lapply(). I think I may have other syntactical errors within the lapply(). Here is the code:
#Dice
dice <- c(1,2,3,4,5,6) #the set of possible outcomes of a dice roll
dice_probs <- c(1/6,1/6,1/6,1/6,1/6,1/6) #the probability of each option per roll
dice_df <- data.frame(dice,dice_probs)
#Simulate dice rolls for each of these sample sizes and record the average of the rolls
sample_sizes <- c(10,25,50,100,1000,10000,100000,1000000,100000000) #compute at each sample size
output <- lapply(X = sample_sizes, FUN = function(var){
  obs = sample_n(dice_df, var, replace = TRUE)
  sample_mean = mean(obs$dice)
  new.df <- data.frame(sample_mean, var)
  return(new.df)
})
The final step is computing the difference compared to the expected value, 3.5. I want a column that shows the difference between 3.5 and the sample mean. We should see the difference decreasing as the sample size increases.
output <- output %>%
mutate(difference = across(sample_mean, ~3.5 - .x))
When I run this, it's throwing this error:
Error in UseMethod("mutate") :
no applicable method for 'mutate' applied to an object of class "list"
I've tried using sapply but I get a similar error: no applicable method for 'mutate' applied to an object of class "c('matrix', 'array', 'list')"
If it helps, here was my failed attempt at using slice_sample:
output <- lapply(X=sample_sizes, FUN = function(...){
obs = slice_sample(dice_df, ..., .preserve=TRUE)
sample_mean = mean(obs$dice)
new.df <- data.frame(sample_mean, ...)
return(new.df)
})
I got this error: Error: '...' used in an incorrect context
The output is a list of single-row data.frames. We can bind them with bind_rows() and simply subtract once instead of doing this multiple times.
library(dplyr)
bind_rows(output) %>%
mutate(difference = 3.5 - sample_mean )
sample_mean var difference
1 3.500000 10 0.00000000
2 2.800000 25 0.70000000
3 3.440000 50 0.06000000
4 3.510000 100 -0.01000000
5 3.495000 1000 0.00500000
6 3.502200 10000 -0.00220000
7 3.502410 100000 -0.00241000
8 3.498094 1000000 0.00190600
9 3.500183 100000000 -0.00018332
The n argument of slice_sample() corresponds to sample_n()'s size argument.
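For example, a minimal sketch of the equivalence (the full code below uses the slice_sample() form):
# these two calls draw the same kind of sample; only the argument name differs
sample_n(dice_df, size = 10, replace = TRUE)
slice_sample(dice_df, n = 10, replace = TRUE)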
To calculate the difference for each element of your output list, we can use purrr::map() instead of dplyr::across().
library(dplyr)
library(purrr)
set.seed(123)
#Dice
dice <- c(1,2,3,4,5,6) #the set of possible outcomes of a dice roll
dice_probs <- c(1/6,1/6,1/6,1/6,1/6,1/6) #the probability of each option per roll
dice_df <- data.frame(dice,dice_probs)
#Simulate dice rolls for each of these sample sizes and record the average of the rolls
sample_sizes <- c(10,25,50,100,1000,10000,100000,1000000,100000000) #compute at each sample size
output <- lapply(X = sample_sizes, FUN = function(var){
  obs = slice_sample(dice_df, n = var, replace = TRUE)
  sample_mean = mean(obs$dice)
  new.df <- data.frame(sample_mean, var)
  return(new.df)
})
output %>%
map(~ 3.5 - .x$sample_mean)
#> [[1]]
#> [1] -0.5
#>
#> [[2]]
#> [1] 0.42
#>
#> [[3]]
#> [1] -0.04
#>
#> [[4]]
#> [1] -0.34
#>
#> [[5]]
#> [1] 0.025
#>
#> [[6]]
#> [1] 0.0317
#>
#> [[7]]
#> [1] 0.00416
#>
#> [[8]]
#> [1] -2.6e-05
#>
#> [[9]]
#> [1] -4.405e-05
Created on 2021-08-02 by the reprex package (v0.3.0)
Alternatively, we can use purrr::map_df and add a row diff inside each tibble as proposed by Martin Gal in the comments:
output %>%
map_df(~ tibble(.x, diff = 3.5 - .x$sample_mean))
#> # A tibble: 9 x 3
#> sample_mean var diff
#> <dbl> <dbl> <dbl>
#> 1 2.6 10 0.9
#> 2 3.28 25 0.220
#> 3 3.66 50 -0.160
#> 4 3.5 100 0
#> 5 3.53 1000 -0.0270
#> 6 3.50 10000 -0.00180
#> 7 3.50 100000 -0.00444
#> 8 3.50 1000000 -0.000226
#> 9 3.50 100000000 -0.0000669
Here is a base R way -
transform(do.call(rbind, output), difference = 3.5 - sample_mean)
# sample_mean var difference
#1 3.80 10 -0.300000
#2 3.44 25 0.060000
#3 3.78 50 -0.280000
#4 3.30 100 0.200000
#5 3.52 1000 -0.015000
#6 3.50 10000 -0.004200
#7 3.50 100000 -0.004370
#8 3.50 1000000 0.002696
#9 3.50 100000000 0.000356
If you just need the difference value you can do -
3.5 - sapply(output, `[[`, 'sample_mean')
I have two multiband rasters of class stars. They have the same resolution and extent in their first two dimensions (x and y). Each raster has multiple bands. I would like to take all pairwise combinations of bands from each of the rasters and find the product of each of those combinations. Is there a way to do this with a function like outer() or possibly st_apply(), without having to use nested for-loops?
Hoping that it is not too late and that my answer will still be useful for you @qdread, I suggest the following solution (see the reprex below).
As you wished, I used st_apply() to compute the products of all pairwise combinations of bands of the two rasters of class stars.
For your convenience, I have built a function (named crossBandsProducts()) that wraps the whole process.
This function has the following features:
Input:
Two stars objects or two stars_proxy objects
the number of bands can be the same or different between the two rasters
Output:
One stars object with a dimension named "bandsProducts" containing all pairwise combinations of products between the bands of the two rasters.
Each product has a name (i.e. a value - see Reprex below) of the form r2bX*r1bY where X and Y are respectively the band numbers of rasters 2 and 1.
REPREX:
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
# 1. Importing two stars objects with 6 and 3 bands respectively
tif <- system.file("tif/L7_ETMs.tif", package = "stars")
tif1 <- read_stars(tif, proxy = FALSE)
tif2 <- read_stars(tif, proxy = FALSE, RasterIO = list(bands = c(1, 3, 4)))
# 2. Building the 'crossBandsProducts' function
crossBandsProducts <- function(r1, r2) {
  # multiply each band of r2 by every band of r1; r1 is flattened to a data frame
  # with one column per band (the x/y coordinate columns are dropped)
  products <-
    st_apply(r2, 3, function(x)
      x * as.data.frame(split(r1, "band"))[, -grep("x|y", colnames(as.data.frame(split(r1, "band"))))])
  if (class(r1)[1] == "stars_proxy"){
    products <- st_as_stars(products)
  }
  # reshape into a data frame with one column per pairwise product, named "r2bX*r1bY"
  products <- as.data.frame(split(products[[1]], "band"))
  colnames(products) <-
    paste0(rep(paste0("r2b", 1:dim(r2)["band"]), each = dim(r1)["band"]),
           rep(paste0("*r1b", 1:dim(r1)["band"]), times = dim(r2)["band"]))
  # add the x/y coordinates back and rebuild a stars object with r1's x/y dimensions
  products <-
    cbind(as.data.frame(split(r1, "band"))[, grep("x|y", colnames(as.data.frame(split(r1, "band"))))], products)
  products <- st_as_stars(products, dims = c("x", "y"))
  st_dimensions(products) <- st_dimensions(r1)[c("x", "y")]
  # merge the product attributes into a single "bandsProducts" dimension
  products <- st_set_dimensions(merge(products),
                                names = c("x", "y", "bandsProducts"))
  return(products)
}
# 3. Use of the function 'crossBandProducts'
(products <- crossBandsProducts(r1=tif1, r2=tif2))
#> stars object with 3 dimensions and 1 attribute
#> attribute(s), summary of first 1e+05 cells:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> X 2209 4225 5776 6176.508 7569 65025
#> dimension(s):
#> from to offset delta refsys point
#> x 1 349 288776 28.5 UTM Zone 25, Southern Hem... FALSE
#> y 1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE
#> bandsProducts 1 18 NA NA NA NA
#> values x/y
#> x NULL [x]
#> y NULL [y]
#> bandsProducts r2b1*r1b1,...,r2b3*r1b6
#>
#> NB: The dimension 'bandsProducts' has 18 values, which is consistent since the
#> rasters tif1 and tif2 have 6 and 3 bands respectively.
# 4. Example of extraction : two possibilities
# 4.1. via the number(s) of 'bandsProducts'
products[,,,4]
#> stars object with 3 dimensions and 1 attribute
#> attribute(s):
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> X 649 3904 4788 4528.267 5525 65025
#> dimension(s):
#> from to offset delta refsys point
#> x 1 349 288776 28.5 UTM Zone 25, Southern Hem... FALSE
#> y 1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE
#> bandsProducts 4 4 NA NA NA NA
#> values x/y
#> x NULL [x]
#> y NULL [y]
#> bandsProducts r2b1*r1b4
# 4.2. via the name(s) (i.e. value(s)) of 'bandsProducts'
products[,,, values = c("r2b1*r1b4", "r2b3*r1b6")]
#> stars object with 3 dimensions and 1 attribute
#> attribute(s), summary of first 1e+05 cells:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> X 2209 4225 5776 6176.508 7569 65025
#> dimension(s):
#> from to offset delta refsys point
#> x 1 349 288776 28.5 UTM Zone 25, Southern Hem... FALSE
#> y 1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE
#> bandsProducts 1 18 NA NA NA NA
#> values x/y
#> x NULL [x]
#> y NULL [y]
#> bandsProducts r2b1*r1b1,...,r2b3*r1b6
# 5. Example of visualization
# 5.1. All pairwise combinations of bands products
plot(products, axes = TRUE, key.pos = NULL)
#> downsample set to c(2,2,1)
# 5.2. Selected pairwise combinations of bands products (selection by names/values)
plot(products[,,, values = c("r2b1*r1b4", "r2b3*r1b6")], axes = TRUE, key.pos = NULL)
#> downsample set to c(2,2,1)
#>
#> NB: Don't know why this second figure doesn't appear in the reprex, but in any
#> case it displays without any problem on my computer, so you shouldn't have any
#> problem to display it when running this reprex locally.
Created on 2021-09-24 by the reprex package (v2.0.1)
I'm performing temporal aggregations on netcdf rasters using the stars package in R. Typically the objects have (X, Y, Time) dimensions and, after doing the temporal averages, I get an object with dimensions ordered as (Time, X, Y). How would one go about changing the order of the dimensions back to the original (X, Y, Time)?
I've been searching through the package vignettes and the likely function examples for a method, but haven't had any luck. I feel like I'm probably missing something simple and obvious...
Here's a reprex:
library(stars)
#> Loading required package: abind
#> Warning: package 'abind' was built under R version 4.0.3
#> Loading required package: sf
#> Warning: package 'sf' was built under R version 4.0.4
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
tif = system.file("tif/L7_ETMs.tif", package = "stars")
x = read_stars(c(tif, tif, tif), along = "band")
x
#> stars object with 3 dimensions and 1 attribute
#> attribute(s), summary of first 1e+05 cells:
#> L7_ETMs.tif
#> Min. : 47.00
#> 1st Qu.: 65.00
#> Median : 76.00
#> Mean : 77.34
#> 3rd Qu.: 87.00
#> Max. :255.00
#> dimension(s):
#> from to offset delta refsys point values x/y
#> x 1 349 288776 28.5 UTM Zone 25, Southern Hem... FALSE NULL [x]
#> y 1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE NULL [y]
#> band 1 18 NA NA NA NA NULL
time = as.Date("2021-03-24") + 1:18
x = st_set_dimensions(x, "band", values = time)
y = aggregate(x, by = "3 days", "mean")
y
#> stars object with 3 dimensions and 1 attribute
#> attribute(s):
#> L7_ETMs.tif
#> Min. : 5.00
#> 1st Qu.: 56.67
#> Median : 71.67
#> Mean : 68.91
#> 3rd Qu.: 84.00
#> Max. :255.00
#> dimension(s):
#> from to offset delta refsys point values x/y
#> time 1 6 2021-03-25 3 days Date NA NULL
#> x 1 349 288776 28.5 UTM Zone 25, Southern Hem... FALSE NULL [x]
#> y 1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE NULL [y]
Created on 2021-03-24 by the reprex package (v0.3.0)
The answer was simple and I should have known it! Because stars objects are, in part, a list of arrays, the dimensions can be reordered using aperm() in the same way you would with a normal n-dimensional array.
aperm(y, c(2, 3, 1))
#> stars object with 3 dimensions and 1 attribute
#> attribute(s):
#> L7_ETMs.tif
#> Min. : 5.00
#> 1st Qu.: 56.67
#> Median : 71.67
#> Mean : 68.91
#> 3rd Qu.: 84.00
#> Max. :255.00
#> dimension(s):
#> from to offset delta refsys point values x/y
#> x 1 349 288776 28.5 UTM Zone 25, Southern Hem... FALSE NULL [x]
#> y 1 352 9120761 -28.5 UTM Zone 25, Southern Hem... FALSE NULL [y]
#> time 1 6 2021-03-25 3 days Date NA NULL
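As a possible convenience (an assumption on my part, not something the answer above relies on): aperm() on a stars object should also accept the dimension names, which avoids counting positions:
# same reordering, but by dimension name (assumed to be supported by aperm.stars)
aperm(y, c("x", "y", "time"))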