How to import sinusoidal remotely sensed data into a raster in R?

I am trying to read a NetCDF file from the Climate Change Initiative (CCI) into R with the terra package.
Given that the data is not on a regular grid, I am trying to find the proper way to project such data onto a regular grid.
library(terra)
#> terra 1.6.47
# Read the data
r <- rast("/vsicurl/https://dap.ceda.ac.uk/neodc/esacci/ocean_colour/data/v5.0-release/sinusoidal/netcdf/chlor_a/daily/v5.0/2007/ESACCI-OC-L3S-CHLOR_A-MERGED-1D_DAILY_4km_SIN_PML_OCx-20070104-fv5.0.nc?download=1", lyrs = "chlor_a")
#> Warning: [rast] unknown extent
As we can see here, there is no projection associated with the raster.
r
#> class : SpatRaster
#> dimensions : 1, 23761676, 1 (nrow, ncol, nlyr)
#> resolution : 1, 1 (x, y)
#> extent : 0, 23761676, 0, 1 (xmin, xmax, ymin, ymax)
#> coord. ref. :
#> source : https://ESACCI-OC-L3S-CHLOR_A-MERGED-1D_DAILY_4km_SIN_PML_OCx-20070104-fv5.0.nc?download=1://chlor_a
#> varname : chlor_a
#> name : chlor_a
I am guessing this should be the “original” projection to use.
sincrs <- "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m"
At this point, I am not sure how to proceed to properly project the data on a regular grid. Any help would be appreciated.
Created on 2022-12-06 with reprex v2.0.2
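For reference, the proj string above describes a sinusoidal projection on a sphere of radius 6371007.181 m. The following is a minimal base-R sketch of the forward formula, assuming the spherical case with lon_0 = 0 and no false easting or northing. The helper name sinu_forward is made up for illustration and is not part of terra or PROJ.

```r
# Forward sinusoidal projection on a sphere (lon_0 = 0, x_0 = y_0 = 0).
# `sinu_forward` is a hypothetical helper, not a terra or PROJ function.
sinu_forward <- function(lon, lat, radius = 6371007.181) {
  lam <- lon * pi / 180  # longitude in radians
  phi <- lat * pi / 180  # latitude in radians
  cbind(x = radius * lam * cos(phi), y = radius * phi)
}

# Three example points: the origin, the antimeridian on the equator,
# and a point at 60 degrees south
xy <- sinu_forward(lon = c(0, 180, 120), lat = c(0, 0, -60))
xy
```

Because x is scaled by cos(latitude), rows of equal-area cells get shorter toward the poles, so the cells do not form a rectangular lon/lat grid.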

I get
url = "https://dap.ceda.ac.uk/neodc/esacci/ocean_colour/data/v5.0-release/sinusoidal/netcdf/chlor_a/daily/v5.0/2007/ESACCI-OC-L3S-CHLOR_A-MERGED-1D_DAILY_4km_SIN_PML_OCx-20070104-fv5.0.nc"
# local file name for the download
f = basename(url)
download.file(url, f, mode="wb")
library(terra)
r = rast(f, "chlor_a" )
#[1] "vobjtovarid4: **** WARNING **** I was asked to get a varid for dimension named bin_index BUT this dimension HAS NO DIMVAR! Code will probably fail at this point"
#Warning messages:
#1: In min(rs) : no non-missing arguments to min; returning Inf
#2: In max(rs) : no non-missing arguments to max; returning -Inf
#3: In min(rs) : no non-missing arguments to min; returning Inf
#4: [rast] cells are not equally spaced; extent is not defined
This suggests that the data are not on a regular raster.

Based on (i.e. entirely copied from) @mdsumner's suggestion:
library(terra)
#> terra 1.6.47
library(tidync)
library(palr)
url <- "https://dap.ceda.ac.uk/neodc/esacci/ocean_colour/data/v5.0-release/sinusoidal/netcdf/chlor_a/daily/v5.0/2007/ESACCI-OC-L3S-CHLOR_A-MERGED-1D_DAILY_4km_SIN_PML_OCx-20070704-fv5.0.nc?download=1"
f <- curl::curl_download(url, tempfile(fileext = ".nc"))
bins <- tidync(f) |>
activate("D1") |>
hyper_tibble()
bins
#> # A tibble: 23,761,676 × 3
#> lat lon bin_index
#> <dbl> <dbl> <int>
#> 1 -90.0 -120 1
#> 2 -90.0 0 2
#> 3 -90.0 120 3
#> 4 -89.9 -160 4
#> 5 -89.9 -120 5
#> 6 -89.9 -80 6
#> 7 -89.9 -40 7
#> 8 -89.9 0 8
#> 9 -89.9 40 9
#> 10 -89.9 80 10
#> # … with 23,761,666 more rows
## the bin is for joining on which bin_index is used here
d <- tidync::tidync(f) |>
activate("D1,D0") |>
hyper_tibble(select_var = c("chlor_a")) |>
dplyr::inner_join(bins, "bin_index")
d
#> # A tibble: 3,953,869 × 5
#> chlor_a bin_index time lat lon
#> <dbl> <int> <dbl> <dbl> <dbl>
#> 1 0.202 3028676 13698 -48.1 -173.
#> 2 0.246 3028677 13698 -48.1 -173.
#> 3 0.268 3028678 13698 -48.1 -173.
#> 4 0.370 3034441 13698 -48.1 -173.
#> 5 0.370 3034442 13698 -48.1 -173.
#> 6 0.322 3034443 13698 -48.1 -173.
#> 7 0.108 3035236 13698 -48.1 -123.
#> 8 0.166 3035237 13698 -48.1 -123.
#> 9 0.131 3035238 13698 -48.1 -123.
#> 10 0.119 3035239 13698 -48.1 -123.
#> # … with 3,953,859 more rows
d$time <- NULL
plot(d$lon, d$lat, pch = ".", col = palr::chl_pal(d$chlor_a))
## define a raster
r <- terra::rast(
terra::ext(c(-180, 180, -90, 90)),
nrows = 900,
ncols = 1800,
crs = "OGC:CRS84"
)
r
#> class : SpatRaster
#> dimensions : 900, 1800, 1 (nrow, ncol, nlyr)
#> resolution : 0.2, 0.2 (x, y)
#> extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
cells <- tibble::tibble(
cell = terra::cellFromXY(r, cbind(d$lon, d$lat)),
chlor_a = d$chlor_a
) |>
dplyr::group_by(cell) |>
dplyr::summarize(chlor_a = mean(chlor_a))
r[cells$cell] <- cells$chlor_a
pal <- palr::chl_pal(palette = TRUE)
image(r, col = pal$cols[-1], breaks = pal$breaks)
Created on 2022-12-12 with reprex v2.0.2
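The binning step above (assign each point to a cell, then average per cell) can be sketched without any packages. This is an illustrative base-R version on made-up points, assuming a global 0.2 degree grid whose cells are numbered row by row from the top-left corner, as terra does. All object names here are hypothetical.

```r
# Grid definition: 0.2 degree cells covering -180..180 by -90..90
res   <- 0.2
ncols <- 1800
nrows <- 900

# Toy points; the values are invented for illustration only
pts <- data.frame(
  lon = c(-179.95, -179.85, 0.05, 0.15, 100.1),
  lat = c(89.95, 89.85, 0.05, 0.1, -45.1),
  val = c(1, 3, 10, 20, 5)
)

# Zero-based column and row of each point (clamped so that
# lon = 180 or lat = -90 still falls in the last cell)
col0 <- pmin(floor((pts$lon + 180) / res), ncols - 1)
row0 <- pmin(floor((90 - pts$lat) / res), nrows - 1)

# One-based, row-major cell number
pts$cell <- row0 * ncols + col0 + 1

# Mean value per occupied cell
agg <- aggregate(val ~ cell, data = pts, FUN = mean)

# Write the means into an otherwise empty grid
grid  <- matrix(NA_real_, nrow = nrows, ncol = ncols)
r_idx <- (agg$cell - 1) %/% ncols + 1
c_idx <- (agg$cell - 1) %% ncols + 1
grid[cbind(r_idx, c_idx)] <- agg$val
```

Cells that receive no points stay NA. A coarser grid leaves fewer gaps at the cost of spatial detail.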

Related

Error in UseMethod: no applicable method for 'track_resample' applied to an object of class c('nested_track', 'tbl_df', 'tbl', 'data.frame')

I am having trouble first using mutate and then using track_resample
Does someone have an idea what the problem is?
# Spatial data frame: accuracy_filter_locations
# class : SpatialPointsDataFrame
# features : 608
# extent : 388981.9, 448684.2, 5709170, 5741781 (xmin, xmax, ymin, ymax)
# crs : +proj=utm +zone=33 +ellps=GRS80 +units=m +no_defs
# variables : 19
# row.names group X_Coordinate Y_Coordinate Error DateShort
# 17 01.08.2022_22:26_009 13.57511 51.62247 original 01.08
# 30 01.08.2022_22:46_051 13.56865 51.62562 original 01.08
# 34 01.08.2022_22:55_051 13.55621 51.62076 original 01.08
# 48 02.08.2022_21:43_143 13.54523 51.61013 original 02.08
# 49 02.08.2022_21:45_113 13.54526 51.61234 original 02.08
# 50 02.08.2022_21:46_143 13.54439 51.61403 original 02.08
# 51 02.08.2022_21:48_113 13.54542 51.61100 original 02.08
# 52 02.08.2022_21:49_143 13.54564 51.61192 original 02.08
# 53 02.08.2022_21:51_113 13.54549 51.61143 original 03.08
# 148 03.08.2022_22:33_116 13.64297 51.69857 original 03.08
# 149 03.08.2022_22:34_116 13.64543 51.69026 original 03.08
# 150 03.08.2022_22:35_116 13.64581 51.69024 original 03.08
# 151 03.08.2022_22:36_116 13.64323 51.69730 original 03.08
# tracks of all individuals
# create a track object with all individuals
tracks <- make_track(accuracy_filter_locations@data, .x = X_Coordinate, .y = Y_Coordinate, id = group, crs = sp::CRS("+init=epsg:25833"))
tracks
# A tibble: 608 x 3
# x_ y_ id
# * <dbl> <dbl> <chr>
# 1 13.6 51.6 01.08.2022_22:26_009
# 2 13.6 51.6 01.08.2022_22:46_051
# 3 13.6 51.6 01.08.2022_22:55_051
# 4 13.5 51.6 02.08.2022_21:43_143
# 5 13.5 51.6 02.08.2022_21:45_113
# 6 13.5 51.6 02.08.2022_21:46_143
# 7 13.5 51.6 02.08.2022_21:48_113
# 8 13.5 51.6 02.08.2022_21:49_143
# 9 13.5 51.6 02.08.2022_21:51_113
# 10 13.5 51.6 02.08.2022_21:54_113
# ... with 598 more rows
## create a list of the tracks grouped by individuals
trks<-tracks %>% nest(data=-"id")
trks #have a look
# id data
# <chr> <list>
# 1 01.08.2022_22:26_009 <track_xy [1 x 2]>
# 2 01.08.2022_22:46_051 <track_xy [1 x 2]>
# 3 01.08.2022_22:55_051 <track_xy [1 x 2]>
# 4 02.08.2022_21:43_143 <track_xy [1 x 2]>
# with 598 more rows
# divide all tracks into bursts (rate is your set sampling rate and tolerance is the time buffer within which a step is still assigned to the burst), keep only bursts with a minimum of 2 points, and create the steps based on the assigned bursts
trks2<-trks %>% mutate(steps=map(data, function(x)
x %>% track_resample(rate=seconds(secres), tolerance = minutes(gaptol)) %>% filter_min_n_burst(min_n=minstep) %>% steps_by_burst() )) # divide the track into bursts
# Error: Problem with `mutate()` column `steps`.
# i `steps = map(...)`.
# x no applicable method for 'track_resample' applied to an object of class "c('track_xy',
# 'tbl_df', 'tbl', 'data.frame')
# 'Error in UseMethod("track_resample", x) :
# no applicable method for 'track_resample' applied to an object of class` `"c('nested_track', 'tbl_df', 'tbl', 'data.frame')'

How to use osrmTable using the distance and source parameters

I want to compute distances from a data frame source to a data frame destination in R. Both data frames have coordinates but my code returns the following error:
Error: "src" should contain coordinates.
Below is my code
randompoints_df <- randompts %>%
as.data.frame()
bus_stops_df <- bus_stops %>%
dplyr::select(id = stop_name, stop_lon, stop_lat) %>%
as.data.frame()
t0 <- Sys.time()
distancetable <- osrm::osrmTable(src = randompoints_df, dst = bus_stops_df)
I tried selecting only the id, lon and lat columns in my data to ensure it had coordinates but the src parameter still brought the same error. Any help in fixing this would be appreciated.
osrmTable() accepts sf & sp objects or dataframes and matrices with exactly 2 columns, WGS84 lon & lat.
library(sf)
library(osrm)
# some sample data
stops <- read.csv(text = "lon,lat,name
24.7653725,59.4426444,Reisisadama D-terminal
24.7514720,59.4438342,Linnahall
24.7601144,59.4446286,Reisisadama A-terminal
24.7655886,59.4425595,Reisisadama D-terminal
24.7696412,59.4413391,Uus-Sadama
24.7593333,59.4401405,Siimeoni", header = TRUE)
# 3 columns, numeric lon/lat
str(stops)
#> 'data.frame': 6 obs. of 3 variables:
#> $ lon : num 24.8 24.8 24.8 24.8 24.8 ...
#> $ lat : num 59.4 59.4 59.4 59.4 59.4 ...
#> $ name: chr "Reisisadama D-terminal" "Linnahall" "Reisisadama A-terminal" "Reisisadama D-terminal" ...
# osrmTable() with 3-column tables - fails
osrmTable(stops[1:3,], stops[4:6,])
#> Error: "src" should contain coordinates.
# osrmTable() with 2-column tables - OK
osrmTable(stops[1:3,c("lon", "lat")], stops[4:6,c("lon", "lat")])
#> $durations
#> 4 5 6
#> 1 0.0 1.1 1.0
#> 2 3.1 3.0 2.3
#> 3 3.1 3.1 2.3
#>
#> $sources
#> lon lat
#> 1 24.76541 59.44259
#> 2 24.75145 59.44378
#> 3 24.76006 59.44473
#>
#> $destinations
#> lon lat
#> 4 24.76555 59.44261
#> 5 24.76971 59.44126
#> 6 24.75928 59.44008
# osrmTable() with sf objects - OK
stops_sf <- st_as_sf(stops, coords = c("lon", "lat"), crs = "WGS84")
osrmTable(stops_sf[1:3,], stops_sf[4:6,])
#> $durations
#> 4 5 6
#> 1 0.0 1.1 1.0
#> 2 3.1 3.0 2.3
#> 3 3.1 3.1 2.3
#>
#> $sources
#> lon lat
#> 1 24.76541 59.44259
#> 2 24.75145 59.44378
#> 3 24.76006 59.44473
#>
#> $destinations
#> lon lat
#> 4 24.76555 59.44261
#> 5 24.76971 59.44126
#> 6 24.75928 59.44008
Created on 2023-01-31 with reprex v2.0.2
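Applied to the data frames in the question, this means dropping the id column before calling osrmTable(). The sketch below uses a made-up bus_stops table with the question's column names. Moving the stop names into the row names is an assumption based on the output above, where the row names appear as labels in the result.

```r
# Toy stand-in for the question's bus_stops table (values are illustrative)
bus_stops <- data.frame(
  stop_name = c("Linnahall", "Siimeoni", "Linnahall"),
  stop_lon  = c(24.7514720, 24.7593333, 24.7514720),
  stop_lat  = c(59.4438342, 59.4401405, 59.4438342)
)

# Keep exactly two columns, longitude first and latitude second
dst <- bus_stops[, c("stop_lon", "stop_lat")]

# Row names must be unique, so duplicated stop names get a suffix
rownames(dst) <- make.unique(bus_stops$stop_name)
dst

# With a 2-column source table `src` built the same way, the call would be
# (needs network access to an OSRM server, so it is not run here):
# distancetable <- osrm::osrmTable(src = src, dst = dst)
```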

Create a value associated with a member of each list and store it with that member in R

I have a list of many dataframes, all of the same format. For each member of this list, I would like to generate a spatial extent, and store it with that dataframe (this data is all lat/long data, and I am using functions from the terra package to analyze it). I am not super experienced with working with lists, and so I took the following stab at trying to generate it:
library(terra)
library(dplyr)
lat_1 <- c(23.2, 14.5, 28.6)
lon_1 <- c(12.1, 8.5, 2.2)
lat_2 <- c(89.3, 94.4, 72.3)
lon_2 <- c(45.2, 47, 48.5)
coords_1 <- data.frame(lon_1, lat_1)
coords_2 <- data.frame(lon_2, lat_2)
list_coords <- list(coords_1, coords_2)
write_extent <- function(lon, lat) {
max_lat <- ceiling(max(lat))
min_lat <- floor(min(lat))
max_lon <- ceiling(max(lon))
min_lon <- floor(min(lon))
extent <- extent(x = c(max_lat, min_lat, max_lon, min_lon))
}
However, this function has errors, and I can't conceptualize how I can store the spatial extent that corresponds with each member of the list with that specific list- should I be using mutate()? Should I not be designing a function and rather be using lapply?
You can do this in a couple of different ways. First, give the data frames the same column names for longitude and latitude (here lon and lat, though the names are arbitrary). Once you've done that, one way is to produce a new list where each element has both a data frame and an extent object:
library(terra)
library(raster)
library(dplyr)
lat_1 <- c(23.2, 14.5, 28.6)
lon_1 <- c(12.1, 8.5, 2.2)
lat_2 <- c(89.3, 94.4, 72.3)
lon_2 <- c(45.2, 47, 48.5)
coords_1 <- data.frame(lon = lon_1, lat = lat_1)
coords_2 <- data.frame(lon = lon_2, lat = lat_2)
list_coords <- list(coords_1, coords_2)
write_extent <- function(lon, lat) {
max_lat <- ceiling(max(lat))
min_lat <- floor(min(lat))
max_lon <- ceiling(max(lon))
min_lon <- floor(min(lon))
# x is longitude, y is latitude
extent <- extent(x = min_lon, xmax = max_lon, ymin = min_lat, ymax = max_lat)
extent
}
res <- lapply(list_coords, function(x){
list(data=x, extent = write_extent(x$lon, x$lat))
})
res
#> [[1]]
#> [[1]]$data
#> lon lat
#> 1 12.1 23.2
#> 2 8.5 14.5
#> 3 2.2 28.6
#>
#> [[1]]$extent
#> class : Extent
#> xmin : 2
#> xmax : 13
#> ymin : 14
#> ymax : 29
#>
#>
#> [[2]]
#> [[2]]$data
#> lon lat
#> 1 45.2 89.3
#> 2 47.0 94.4
#> 3 48.5 72.3
#>
#> [[2]]$extent
#> class : Extent
#> xmin : 45
#> xmax : 49
#> ymin : 72
#> ymax : 95
In the output above, you could get the data for the first object with res[[1]]$data and the extent for the first object with res[[1]]$extent. Or you could get a list of all the extents with lapply(res, function(x)x$extent). Another option would be to store the extent as an attribute of the data. This way, it always follows the data around:
res <- lapply(list_coords, function(x){
e <- write_extent(x$lon, x$lat)
attr(x, "extent") <- e
x
})
res
#> [[1]]
#> lon lat
#> 1 12.1 23.2
#> 2 8.5 14.5
#> 3 2.2 28.6
#>
#> [[2]]
#> lon lat
#> 1 45.2 89.3
#> 2 47.0 94.4
#> 3 48.5 72.3
You don't see the extent when you print the data frame, but you can retrieve it either for a single data frame with:
attr(res[[1]], "extent")
#> class : Extent
#> xmin : 2
#> xmax : 13
#> ymin : 14
#> ymax : 29
Or for all of them with:
lapply(res, function(x)attr(x, "extent"))
#> [[1]]
#> class : Extent
#> xmin : 2
#> xmax : 13
#> ymin : 14
#> ymax : 29
#>
#> [[2]]
#> class : Extent
#> xmin : 45
#> xmax : 49
#> ymin : 72
#> ymax : 95
Created on 2022-12-08 by the reprex package (v2.0.1)
While I think it's a bit less conventional to set object attributes, this answer suggests it is not bad practice to do so.
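If you would rather not load raster alongside terra, the same idea can be sketched with plain numeric vectors. This is a minimal base-R version, taking longitude as x and latitude as y. The helper bbox_vec is a made-up name, and the commented terra::ext() call assumes it accepts a length-4 vector in the order xmin, xmax, ymin, ymax.

```r
# Data from the question, with common column names
list_coords <- list(
  data.frame(lon = c(12.1, 8.5, 2.2),   lat = c(23.2, 14.5, 28.6)),
  data.frame(lon = c(45.2, 47, 48.5),   lat = c(89.3, 94.4, 72.3))
)

# Hypothetical helper: bounding box rounded outward to whole degrees,
# returned in the order xmin, xmax, ymin, ymax
bbox_vec <- function(df) {
  c(xmin = floor(min(df$lon)), xmax = ceiling(max(df$lon)),
    ymin = floor(min(df$lat)), ymax = ceiling(max(df$lat)))
}

# One bounding box per data frame, in the same order as the list
boxes <- lapply(list_coords, bbox_vec)
boxes

# Each vector could then be turned into a terra extent, for example:
# e1 <- terra::ext(unname(boxes[[1]]))
```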

How can I calculate how many pixels share the same value in two raster layers?

I have made a stacked raster of two layers with the same extent, which contain information about the presence of two different things across the area. Both are binary; the values are either 1 (presence) or 0 (absence) for each pixel.
This is the raster's description:
> stacked
class : SpatRaster
dimensions : 166, 1622, 2 (nrow, ncol, nlyr)
resolution : 0.1666667, 0.1666667 (x, y)
extent : -131.5, 138.8333, 36.33333, 64 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
sources : memory
memory
names : lyr1, lyr1
min values : 0, 0
max values : 1, 1
How can I use conditions to extract the number of pixels where value = 1 in both layers?
I tried a few things when the two areas were separate rasters, but had no success with that, so I used extend() to enlarge both their extents so that they could share a common area. I think this means there will be NAs in there as well as 1s and 0s.
I have tried the following:
> freq(stacked, value = 1)
layer value count
[1,] 1 1 12243
[2,] 2 1 14804
but this just counts how many pixels have a value of 1 in each of the layers separately, whereas what I need is the number of pixels in which the values for BOTH layers =1, so they match.
Many thanks for any tips!
If I fully understand your request, please find below one possible solution.
Reprex
library(terra)
# Build a dummy SpatRaster with two layers containing only 1 and 0
set.seed(0) # for reproducibility
r <- rast(nrows=10, ncols=10, nlyrs=2)
values(r) <- sample(c(1,0), 200, replace = TRUE)
r
#> class : SpatRaster
#> dimensions : 10, 10, 2 (nrow, ncol, nlyr)
#> resolution : 36, 18 (x, y)
#> extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> source : memory
#> names : lyr.1, lyr.2
#> min values : 0, 0
#> max values : 1, 1
# If you use terra::freq(), you get the number of cells with 0 and 1 for each layer
terra::freq(r)
#> layer value count
#> [1,] 1 0 52
#> [2,] 1 1 48
#> [3,] 2 0 47
#> [4,] 2 1 53
# So, one approach is to sum the two layers and retrieve the number of cells with 2
# (here 24 cells have the value 1 in the two layers)
terra::freq(sum(r))
#> layer value count
#> [1,] 1 0 23
#> [2,] 1 1 53
#> [3,] 1 2 24
# OR more directly:
terra::freq(sum(r), value = 2)[, 'count']
#> count
#> 24
Created on 2022-03-15 by the reprex package (v2.0.1)
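Since the question mentions that extending the rasters may have introduced NAs, here is a small base-R sketch of how the count behaves with missing values. It uses plain vectors as stand-ins for the two layers, and the values are invented.

```r
# Stand-ins for the cell values of the two layers (NA = no data)
a <- c(1, 0, 1, NA, 1, 0)
b <- c(1, 1, 0,  1, NA, 0)

# Cells where both layers equal 1; a cell that is NA in either layer
# yields NA here and is dropped by na.rm = TRUE
both_logical <- sum(a == 1 & b == 1, na.rm = TRUE)

# Same count via the sum of the layers, as in the approach above
both_sum <- sum((a + b) == 2, na.rm = TRUE)

c(both_logical, both_sum)
```

A cell that is NA in either layer is never counted as a match. On the SpatRaster itself, something like terra::global(stacked[[1]] == 1 & stacked[[2]] == 1, "sum", na.rm = TRUE) should give the same count, though that call is untested here.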

Finding the shortest distance between two sets of points (latitude and longitude) in R

I have two sets of geo location points i.e. latitude and longitude.
control <-
data.frame(
id = c("110000308033", "110000308042", "110000308060", "110000308346", "110000308505",
"110000308541", "110000308612", "110000309684", "110000309773", "110000309835"),
latitude = c(42.20227, 42.19802, 42.19251, 42.13690, 42.65253, 42.24066, 42.60008,
42.62743, 42.72361, 42.70060),
longitude = c(-72.606052, -72.600913, -72.609814, -72.542205, -73.110606, -73.358297,
-72.378388, -71.156079, -70.9629610, -71.16304)
)
treatment <-
data.frame(
id = c("110000308881", "110000310556", "110000314570", "110000316024", "110000324845"),
latitude = c(42.61366, 42.16657, 45.36801, 41.62371, 43.30851 ),
longitude = c(-71.633782, -71.212503, -68.510184, -72.043135, -73.63481 )
)
I want to find the shortest distance from each control unit to a treatment unit. How can I also show them on a US map? I would appreciate any help.
Please find one possible solution using sf, units, dplyr and tmap libraries
Reprex
Computing the distance matrix (distance in km)
library(sf)
library(units)
library(dplyr)
library(tmap)
# Convert the two dataframes into 'sf' objects
control_sf <- st_as_sf(control, coords = c("longitude", "latitude"), crs = 4326)
treatment_sf <- st_as_sf(treatment, coords = c("longitude", "latitude"), crs = 4326)
# Compute a distance matrix (distance in km)
# rows = control
# columns = treatment
Distances_df <- control_sf %>%
st_distance(., treatment_sf) %>%
set_units(., km) %>%
drop_units() %>%
round(., 1) %>%
as.data.frame() %>%
setNames(., treatment$id) %>%
mutate(control = control$id) %>%
relocate(control)
Distances_df
#> control 110000308881 110000310556 110000314570 110000316024
#> 1 110000308033 92.0 114.9 481.6 79.4
#> 2 110000308042 91.9 114.5 481.6 78.8
#> 3 110000308060 92.8 115.2 482.6 78.7
#> 4 110000308346 91.5 109.7 483.6 70.5
#> 5 110000308505 120.9 164.9 475.8 144.3
#> 6 110000308541 147.5 176.9 521.6 128.6
#> 7 110000308612 61.0 107.2 436.4 112.0
#> 8 110000309684 39.1 51.5 371.0 133.4
#> 9 110000309773 56.2 65.2 353.4 151.3
#> 10 110000309835 39.7 59.5 364.6 140.0
#> 110000324845
#> 1 148.9
#> 2 149.6
#> 3 149.7
#> 4 157.9
#> 5 84.5
#> 6 120.9
#> 7 129.1
#> 8 215.4
#> 9 226.7
#> 10 212.0
Visualization (control in blue and treatment in red)
tmap_mode("view")
#> tmap mode set to interactive viewing
tm_shape(control_sf) +
tm_dots(col = "blue")+
tm_shape(treatment_sf) +
tm_dots(col = "red")
Created on 2022-01-14 by the reprex package (v2.0.1)
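The distance matrix gives every pairwise distance, while the question asks for the nearest treatment unit per control unit. The sketch below shows that last step in base R on a subset of the question's points. Note that it swaps in a haversine formula on a sphere of radius 6371 km rather than sf::st_distance(), so results may differ slightly. The names haversine_km and nearest are made up for illustration.

```r
# Great-circle distance in km (haversine, spherical Earth assumed)
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Subset of the question's data
control <- data.frame(
  id        = c("110000308033", "110000309684"),
  latitude  = c(42.20227, 42.62743),
  longitude = c(-72.606052, -71.156079)
)
treatment <- data.frame(
  id        = c("110000308881", "110000310556", "110000316024"),
  latitude  = c(42.61366, 42.16657, 41.62371),
  longitude = c(-71.633782, -71.212503, -72.043135)
)

# Distance matrix: rows = control, columns = treatment
d <- outer(
  seq_len(nrow(control)), seq_len(nrow(treatment)),
  function(i, j) haversine_km(control$latitude[i], control$longitude[i],
                              treatment$latitude[j], treatment$longitude[j])
)

# Column index of the closest treatment unit for each control unit
nearest <- apply(d, 1, which.min)

result <- data.frame(
  control           = control$id,
  nearest_treatment = treatment$id[nearest],
  distance_km       = round(d[cbind(seq_len(nrow(d)), nearest)], 1)
)
result
```

With the sf objects from the answer, sf::st_nearest_feature(control_sf, treatment_sf) returns the same kind of index directly.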
