Can I 'save' a SpatRaster in an R workspace (RData) file?

I noticed that if I load a raster from a file stored on the local hard disk using the terra package, save the workspace, and load it later, the SpatRaster object no longer has any data associated with it. Is there a way to keep all the information associated with the SpatRaster object while saving and loading the workspace?
Here is example code to illustrate the issue:
library(terra)
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)
#This produces the following output
r
#class : SpatRaster
#dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#resolution : 0.008333333, 0.008333333 (x, y)
#extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (EPSG:4326)
#source : elev.tif
#name : elevation
#min value : 141
#max value : 547
sources(r) # this works
save.image("delete_if_found.RData")
rm(list = ls())
load("delete_if_found.RData")
r
#which returns the SpatRaster as
#class : SpatRaster
#Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, :
#NULL value passed as symbol address
I am currently importing all the relevant files again after loading the workspace; is there another way to go about it?

Hello Aniruddha Marathe and welcome to SO!
If you take a look at the terra package documentation, you will see this:
[...] They cannot be recovered from a saved R session either or directly passed to nodes on a computer cluster. Generally, you should use writeRaster to save SpatRaster objects to disk (and pass a filename or cell values to cluster nodes)
So, you will have to load the SpatRaster each time you want to use it by executing terra::rast(system.file("ex/elev.tif", package="terra")), instead of restoring it with the load() function.
Hope this helps 😀

You can use writeRaster and then rast and you can also use saveRDS and readRDS, but you cannot use save and load.
As far as I am concerned, that is a good thing, because saving a session is generally a bad idea (and I wish R would not prod you to do that). It is bad because you should not start an analysis with data coming from nowhere. Instead, save your intermediate data to files and read them again in the next step.
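For example, here is a minimal sketch of both approaches, based on the sample file from the question (the output file names are placeholders):

```r
library(terra)
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)

# Option 1: write the cell values to a raster file, and re-create the
# SpatRaster from that file in the next session
writeRaster(r, "elev_copy.tif", overwrite=TRUE)
r1 <- rast("elev_copy.tif")

# Option 2: serialize with saveRDS (terra stores a packed, pointer-free
# representation) and unpack it with rast() after readRDS
saveRDS(r, "elev.rds")
r2 <- rast(readRDS("elev.rds"))
```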

Related

Simple R functions not working on new laptop with win 11

I want to load raster files and then mask out the clouded parts. I have done this many times and the code used to work fine. Now I have upgraded to Windows 11 on a new laptop, and installed R 4.2, then RStudio, and then RTools 4.2. Here is my code:
library(raster)
library(rgdal)
setwd('D:/Test/')
masks <- list.files('D:/Test', 'mask_C2.tif$', full.names = T)
ndvi <- list.files('D:/Test', 'NDVI_C2.tif$', full.names = T)
rm <- lapply(masks, raster)
rn <- lapply(ndvi, raster)
When I call the 1st element on the NEW MACHINE, the 'names' property doesn't have the file name; it just says layer:
> rn[[1]]
class : RasterLayer
dimensions : 174, 164, 28536 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 323895, 328815, 3724335, 3729555 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=43 +datum=WGS84 +units=m +no_defs
source : cr_LC81500372020261LGN00_NDVI_C2.tif
names : layer
values : -0.1510984, 0.8510309 (min, max)
When I call the 1st element on the OLD MACHINE, the 'names' property has the file name:
> rn[[1]]
class : RasterLayer
dimensions : 174, 164, 28536 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 323895, 328815, 3724335, 3729555 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=43 +datum=WGS84 +units=m +no_defs
source : cr_LC81500372020261LGN00_NDVI_C2.tif
names : cr_LC81500372020261LGN00_NDVI_C2
values : -0.1510984, 0.8510309 (min, max)
I was using the names property to name output rasters, which I'm unable to do now. I know basename can do this, but I really want to know what's wrong here. The same code works fine on the old machine with the same versions of R, RStudio and RTools.
I'd really appreciate it if someone could help me here. Thanks.
I reinstalled Windows (a clean install), then installed the same versions of R, RStudio and RTools as on my old computer. Since the names property has the file name on the old computer, I want it to be the same on the new one.
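As a stopgap, here is a sketch of the basename workaround the question already mentions (illustrative only; it does not explain the difference between the machines): set the layer names explicitly from the source file paths.

```r
library(raster)

ndvi <- list.files('D:/Test', 'NDVI_C2.tif$', full.names = TRUE)
rn <- lapply(ndvi, raster)

# set each layer's name from its source file name, minus the extension
rn <- lapply(rn, function(x) {
  names(x) <- tools::file_path_sans_ext(basename(filename(x)))
  x
})
```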

Can knitr's cache work with pointer objects?

I'm using knitr to make a document that uses the terra package to draw maps. Here's my minimal .Rmd file:
```{r init, echo=FALSE}
library(knitr)
opts_chunk$set(cache=TRUE)
```
```{r maps}
library(terra)
r = rast(matrix(1:12,3,4))
plot(r)
```
```{r test2}
print(r)
plot(r)
```
On the first run (via rmarkdown::render(...)) this works and creates a test_cache folder. Running it a second time (with no changes) also works fine. If I make a minor change to chunk 2 (e.g. add a comment) and run it again, I get:
Quitting from lines 14-17 (cache.Rmd)
Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, :
NULL value passed as symbol address
I've also had this from another Rmd file, probably related:
Quitting from lines 110-115 (work.Rmd)
Error in x@ptr$readStart() : external pointer is not valid
Clearing the cache or running with cache=FALSE works, but then what's the point of the cache?
I think it's because r is some sort of reference class that exists via memory allocated by Rcpp, and knitr is only caching a reference, so when it tries to read the cached version it gets a reference to memory that no longer holds the object created to go with it. So it fails.
FWIW a terra raster object looks like this:
> str(r)
Formal class 'SpatRaster' [package "terra"] with 1 slot
..@ ptr:Reference class 'Rcpp_SpatRaster' [package "terra"] with 20 fields
.. ..$ depth : num 0
.. ..$ extent :Reference class 'Rcpp_SpatExtent' [package "terra"] with 2 fields
and
> r@ptr
C++ object <0x55ce6fdf2bd0> of class 'SpatRaster' <0x55ce5a6750b0>
Is there a way to make the knitr cache work with these objects? I know I could exclude just these from the cache, but 90% of my document works with these sorts of objects, and that's the reason I want to use the cache to speed things up. As it is, every time I get this error I have to stop, clear the cache, and start again, and I don't know if that time is worth the speedup I get from the cache.
R 4.1.1 with
> packageVersion("knitr")
[1] ‘1.34’
> packageVersion("rmarkdown")
[1] ‘2.11’
terra does some work to allow serialization, for example:
library(terra)
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)
saveRDS(r, "test.rds")
readRDS("test.rds")
[1] "This is a PackedSpatRaster object. Use 'terra::rast()' to unpack it"
readRDS("test.rds") |> rast()
#class : SpatRaster
#dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#resolution : 0.008333333, 0.008333333 (x, y)
#extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (EPSG:4326)
#source : memory
#name : elevation
#min value : 141
#max value : 547
But it seems that knitr caches with "save" to an RData file; in that case, the raw object is stored, which won't be valid when reloaded.
I think it may be possible to work around this with some clever use of hook functions; but that would be so involved that it would defeat the purpose of caching.
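One possible workaround (a sketch, not from the original answer): keep the serializable PackedSpatRaster in the cached chunks yourself via terra::wrap(), and rebuild the SpatRaster with rast() in the chunks that use it, so that nothing knitr caches holds a raw pointer:
```{r maps}
library(terra)
r <- rast(matrix(1:12, 3, 4))
pr <- wrap(r)   # PackedSpatRaster: safe to serialize into the knitr cache
plot(r)
```
```{r test2}
r <- rast(pr)   # rebuild the SpatRaster from the cached packed object
print(r)
plot(r)
```
Note that wrap() brings the cell values into memory, so this only helps for rasters that fit in RAM.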

read NASADEM .slope file into R and change format to .tif file

I downloaded some tiles using NASA EARTHDATA from the NASADEM Slope and Curvature Global 1 arc second V001 data set. When I extracted the files, I saw the file names followed the pattern "TileLocation.Variable". For example, a tile with slope data located on the eastern Mediterranean is called "n36e035.slope". I was surprised the file extension was ".slope" and not ".tif".
When I tried reading the file into R with r <- raster::raster("Filepath/n36e035.slope") I get an error, because the file structure is not a tif or geotiff. I want to read multiple tiles for each variable (slope, curvature, etc.), merge, crop to my study area, then write out the combined raster to my local device as a .tif file. That way the DEM file format and data structure will match the other rasters I have.
My preferred language is R, but if there's another open-source way to change the file format from this NASA-specific extension to a .tif, then I can use that. I tried looking online, but all the tutorials used Google Earth Engine or Arc and didn't allow me to save the combined .tif files locally.
You can download the n36e35 zip here and the n35e35 zip here. You may need to log in to NASA EARTHDATA to view and download the DEM tiles. The overview is here and the user guide is available here, but the user guide is more about how the data set was made, not how to read it in or change the data format. One strange note: the DEM this data set is based on has an .hgt file extension, which I can easily read into R with the raster::raster function.
Regrettably, NASA does not provide header files so you need to create them yourself.
To help with that, I added a function makeVRT to terra version 1.5-9. That is currently the development version; you can install it with install.packages('terra', repos='https://rspatial.r-universe.dev'). I use that function in demhdr below (after looking at this GitHub repo by David Shean), which sets the specific parameters for these files. Parameters are also given here, but note that on that page the datatype for SWB is incorrectly given as 8-bit signed integer, whereas it should be 8-bit unsigned integer.
demhdr <- function(filename, ...) {
  f <- basename(filename)
  stopifnot(tools::file_ext(f) != "vrt")
  sign <- ifelse(substr(f, 1, 1) == "n", 1, -1)
  lat <- sign * as.numeric(substr(f, 2, 3))
  sign <- ifelse(substr(f, 4, 4) == "e", 1, -1)
  lon <- sign * as.numeric(substr(f, 5, 7))
  if (grepl("aspect", f) || grepl("slope", f)) {
    datatype <- "INT2U"
  } else if (grepl("swb", f)) {
    datatype <- "INT1U"
  } else {
    datatype <- "FLT4S"
  }
  name <- unlist(strsplit(f, "\\."))[2]
  terra::makeVRT(filename, 3601, 3601, 1, xmin=lon, ymin=lat, xres=1/3600,
                 lyrnms=name, datatype=datatype, byteorder="MSB", ...)
}
For a folder with these files that do not end in ".vrt":
ff <- grep(list.files("."), pattern="\\.vrt$", invert=TRUE, value=TRUE)
ff
#[1] "n37e037.aspect" "n37e037.planc" "n37e037.profc" "n37e037.slope"
#[5] "n37e037.swb"
You can use the demhdr function like this:
fvrt <- sapply(ff, demhdr, USE.NAMES=FALSE)
fvrt
#"n37e037.aspect.vrt" "n37e037.planc.vrt" "n37e037.profc.vrt"
# "n37e037.slope.vrt" "n37e037.swb.vrt"
And then, with files for a single tile, you can do
library(terra)
r <- rast(fvrt)
r
#class : SpatRaster
#dimensions : 3601, 3601, 5 (nrow, ncol, nlyr)
#resolution : 0.0002777778, 0.0002777778 (x, y)
#extent : 36.99986, 38.00014, 36.99986, 38.00014 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +datum=WGS84 +no_defs
#sources : n37e037.aspect.vrt
# n37e037.planc.vrt
# n37e037.profc.vrt
# ... and 2 more source(s)
#names : aspect, planc, profc, slope, swb
Note the very unfortunate georeferencing of NASA SRTM data. The data would have lined up with other lon/lat raster data, and would have been much more usable, if the extent had been 37.0, 38.0, 37.0, 38.0 and the number of rows and columns had been 3600. But that is not the case.
plot(r)
This tile did not seem to have NoData values; in other tiles you may need to set them with the NAflag argument to makeVRT, or by using NAflag(x) <- on a single-layer SpatRaster.
For aspect, it looks like you could use scale=0.01 to get values between 0 and 360 degrees.
To merge many tiles for say aspect, you should be able to do something like
fasp <- grep("aspect", fvrt, value=TRUE)
x <- src(lapply(fasp, rast))
m <- merge(x)
or make a new VRT file that combines the tiles, like this
vrt(fasp, "aspect.vrt")
m <- rast("aspect.vrt")
To read the files with raster
library(raster)
s <- stack(fvrt)
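Finally, a hedged sketch of the remaining steps from the question: merge tiles, crop to a study area, and write a GeoTIFF. The extent values and output file name below are placeholders, not from the original post.

```r
library(terra)

# merge all aspect tiles
fasp <- grep("aspect", fvrt, value=TRUE)
m <- merge(src(lapply(fasp, rast)))

# hypothetical study area: xmin, xmax, ymin, ymax in lon/lat
study_area <- ext(36.5, 37.5, 36.5, 37.5)
m_crop <- crop(m, study_area)

# write the combined, cropped raster to a local GeoTIFF
writeRaster(m_crop, "aspect_studyarea.tif", overwrite=TRUE)
```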

Weird R problem: Internet keeps disconnecting whenever I run any raster calculation using the "raster" package

I ran into what seems to be a fairly unique problem with R's "raster" package.
I am not sure how to provide a reproducible example, but the long and short of it is that whenever I perform calculations involving my raster objects, the internet shuts down for the whole house. While I can't provide the raster images themselves because they are so big, I can provide their general description. Objects like these are about a gigabyte each:
class : RasterLayer
dimensions : 15336, 19016, 291629376 (nrow, ncol, ncell)
resolution : 30.85642, 30.85642 (x, y)
extent : 610995.9, 1197762, 9526801, 10000015 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=52 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
source : C:/Users/tug74077/AppData/Local/Temp/RtmpcVf7pt/raster/r_tmp_2020-07-21_082952_22832_46671.grd
names : layer
values : 6378.035, 2016403 (min, max)
And sometimes I use raster stacks, which are much larger (about 10 gigabytes)
class : RasterStack
dimensions : 15336, 19016, 291629376, 30 (nrow, ncol, ncell, nlayers)
resolution : 30.85642, 30.85642 (x, y)
extent : 610995.9, 1197762, 9526801, 10000015 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=52 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
names : ap_hp_stack.1, ap_hp_stack.2, ap_hp_stack.3, ap_hp_stack.4, ap_hp_stack.5, ap_hp_stack.6, ap_hp_stack.7, ap_hp_stack.8, ap_hp_stack.9, ap_hp_stack.10, ap_hp_stack.11, ap_hp_stack.12, ap_hp_stack.13, ap_hp_stack.14, ap_hp_stack.15, ...
min values : 426.50653, 403.31589, 381.38617, 360.64886, 341.03912, 322.49561, 304.96039, 288.37863, 272.69846, 257.87088, 243.84953, 230.59058, 218.05255, 206.19627, 194.98465, ...
max values : 134839.22, 127507.53, 120574.50, 114018.44, 107818.85, 101956.36, 96412.63, 91170.34, 86213.09, 81525.38, 77092.55, 72900.77, 68936.89, 65188.55, 61644.02, ...
The type of calculation that disconnects the internet ranges from a simple stacking operation of 30 rasters, like this:
ann.cost_hp_t_stack <- stack(ann.cost_hp_t)
(ann.cost_hp_t is a list of 30 rasters, each resembling the RasterLayer described first above, which are stacked to create ann.cost_hp_t_stack, which resembles the RasterStack described second above),
to an operation that looks like this:
for (i in c(1:30)) {
  ann.cost_hp_t[[i]] <- ann.cost_t_im / ((1 + 0.05)^i)
}
where ann.cost_t_im is another raster layer resembling the raster layer described above.
In addition to the internet cutting out for the whole router/house, my local disk gets filled up too, and I have to regularly restart R to free up about 140 gigabytes of disk space.
If you have read this far, thank you very much for your time. Also, sorry if the formatting is confusing.
TL;DR: My internet keeps cutting out when I use the "raster" package in R to create gigabytes upon gigabytes of data.
The raster package does not use your network for normal computations (it only uses it to download data when you use the getData function), so the problem cannot be directly related to the package itself. It has to be related to what your computer does when files are created.
All I can think of is that you have a system that automatically copies your data to the cloud. So if you create a bunch of big files, that would slow down the Internet.
As for your disk filling up: that is because you are using very large files with functions that then need to save their results to disk (in a temp file). You can use functions such as calc that accept a filename argument. There would still be files, of course, but you may cut out intermediate files, if there are any. See raster::removeTmpFiles for removing them without exiting R.
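A minimal sketch of both suggestions, assuming the objects named in the question (the output file names are placeholders): write results straight to named files with calc(), and clean up temp files that do accumulate without restarting R.

```r
library(raster)

# discounting loop from the question, rewritten with calc() and an explicit
# filename so each result goes to a known file instead of a temp .grd
for (i in 1:30) {
  ann.cost_hp_t[[i]] <- calc(ann.cost_t_im,
                             fun = function(x) x / (1 + 0.05)^i,
                             filename = paste0("ann_cost_year", i, ".tif"),
                             overwrite = TRUE)
}

# remove raster temp files older than 0 hours (i.e. all of them)
removeTmpFiles(h = 0)
```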

Cannot open created raster in R

I have two issues related to the error:
First: I have one merged DEM layer and multiple shapefiles. I created a list of rasters masked by the shapefile boundaries, and I was able to plot all of them except one (the first one), which is the biggest:
> plot(DEM_masked_list[[1]])
Error in file(fn, "rb") : cannot open the connection
In addition: Warning message:
In file(fn, "rb") :
cannot open file '/private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-29_014745_982_20879.gri': No such file or directory
I noticed that the data source of the first DEM differs from all the others; that might be due to its larger size (509,141,570 cells).
DEM_masked_list
[[1]]
class : RasterLayer
dimensions : 20015, 25438, 509141570 (nrow, ncol, ncell)
resolution : 9.259259e-05, 9.259259e-05 (x, y)
extent : -70.43231, -68.07694, 45.98676, 47.84 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
data source : /private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-29_014745_982_20879.grd
names : layer
values : 121.4266, 856.6606 (min, max)
[[2]]
class : RasterLayer
dimensions : 9043, 9896, 89489528 (nrow, ncol, ncell)
resolution : 9.259259e-05, 9.259259e-05 (x, y)
extent : -69.76269, -68.84639, 46.23528, 47.07259 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
data source : in memory
names : layer
values : 187.9911, 650.0044 (min, max)
Second: I merged 25 separate DEMs into one layer (DEM_merge); its data source is also not stored in memory. I was able to plot it and work with it for one day, 2018-01-28 (the date appears in the data source path), then the same error appeared.
> DEM_merge
class : RasterLayer
dimensions : 75612, 75612, 5717174544 (nrow, ncol, ncell)
resolution : 9.259259e-05, 9.259259e-05 (x, y)
extent : -74.00056, -66.99944, 40.99944, 48.00056 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
data source : /private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-28_163201_982_66674.grd
names : layer
values : -81.04944, 1915.734 (min, max)
> plot(DEM_merge)
Error in file(fn, "rb") : cannot open the connection
In addition: Warning message:
In file(fn, "rb") :
cannot open file '/private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-28_163201_982_66674.gri': No such file or directory
Is there any way to fix that? I feel there is some issue with the raster package and the way it stores the data. I tried reinstalling the raster package, reinstalling R, and even using a different computer after posting here, but I still have the same issue. I appreciate your help!
The values of large Raster* objects are written to file, to avoid memory limitation problems. If you do not explicitly provide a filename, they are stored in the temporary data folder that will be removed when the R session ends.
I am guessing you created the RasterLayers and saved the list to disk, and closed R? Or perhaps you reloaded your session when opening R again?
Just guessing, but if so, the values of the large raster should indeed have disappeared. To avoid that from happening, you can either try to force all values into memory with readAll (not recommended), or write them to a permanent file with writeRaster.
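A minimal sketch, assuming the DEM_masked_list from the question (the output file names are placeholders): writeRaster returns a RasterLayer that points at the permanent file, so the values survive the end of the R session.

```r
library(raster)

DEM_masked_list <- lapply(seq_along(DEM_masked_list), function(i) {
  writeRaster(DEM_masked_list[[i]],
              filename = paste0("dem_masked_", i, ".tif"),
              overwrite = TRUE)
})
```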
