I have two issues related to this error.
First: I have one merged DEM layer and multiple shapefiles. I created a list of DEMs masked to the shapefile boundaries, and I was able to plot all of them except one, the first one, which is the biggest:
> plot(DEM_masked_list[[1]])
Error in file(fn, "rb") : cannot open the connection
In addition: Warning message:
In file(fn, "rb") :
cannot open file '/private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-29_014745_982_20879.gri': No such file or directory
I noticed that the data source of the first DEM differs from all the others, which might be due to its larger size (509,141,570 cells)!
DEM_masked_list
[[1]]
class : RasterLayer
dimensions : 20015, 25438, 509141570 (nrow, ncol, ncell)
resolution : 9.259259e-05, 9.259259e-05 (x, y)
extent : -70.43231, -68.07694, 45.98676, 47.84 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
data source : /private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-29_014745_982_20879.grd
names : layer
values : 121.4266, 856.6606 (min, max)
[[2]]
class : RasterLayer
dimensions : 9043, 9896, 89489528 (nrow, ncol, ncell)
resolution : 9.259259e-05, 9.259259e-05 (x, y)
extent : -69.76269, -68.84639, 46.23528, 47.07259 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
data source : in memory
names : layer
values : 187.9911, 650.0044 (min, max)
Second: I merged 25 separate DEMs into one layer (DEM_merge); its data source is also not stored in memory. I was able to plot it and work with it for one day, 2018-01-28 (the date appears in the data source path), and then the same error appeared:
> DEM_merge
class : RasterLayer
dimensions : 75612, 75612, 5717174544 (nrow, ncol, ncell)
resolution : 9.259259e-05, 9.259259e-05 (x, y)
extent : -74.00056, -66.99944, 40.99944, 48.00056 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
data source : /private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-28_163201_982_66674.grd
names : layer
values : -81.04944, 1915.734 (min, max)
> plot(DEM_merge)
Error in file(fn, "rb") : cannot open the connection
In addition: Warning message:
In file(fn, "rb") :
cannot open file '/private/var/folders/2w/rjzwcrbn3pg0jmsrfkz7n52h0000gn/T/RtmpkL8Ot5/raster/r_tmp_2018-01-28_163201_982_66674.gri': No such file or directory
Is there any way to fix this? I feel there is some issue with the raster package and the way it stores data. I tried reinstalling the raster package, reinstalling R, and even using a different computer after I posted here, but the issue persists. I appreciate your help!
The values of large Raster* objects are written to file, to avoid memory limitation problems. If you do not explicitly provide a filename, they are stored in the temporary data folder that will be removed when the R session ends.
I am guessing you created the RasterLayers, saved the list to disk, and closed R? Or perhaps you reloaded your session when opening R again?
Just guessing, but if so, the values of the large raster would indeed have disappeared. To avoid that, you can either try to force all values into memory with readAll (not recommended) or write them to a permanent file with writeRaster.
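A minimal sketch of the writeRaster route, assuming DEM_merge still exists in the current session with its temp file intact (the filename "DEM_merge.grd" is a placeholder):

```r
library(raster)

# Give the values a permanent home on disk instead of the session temp folder;
# "DEM_merge.grd" is a placeholder path.
DEM_merge <- writeRaster(DEM_merge, filename = "DEM_merge.grd", overwrite = TRUE)

# In a later R session, point a RasterLayer at the permanent file:
DEM_merge <- raster("DEM_merge.grd")
```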
Related
I noticed that if I load a raster from a file stored on the local hard disk using the terra package, and then load the workspace later, the SpatRaster object does not have any data associated with it. Is there a way to keep all the information associated with the SpatRaster object while saving and loading the workspace?
Here is example code to illustrate the issue:
library(terra)
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)
#This produces the following output
r
#class : SpatRaster
#dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#resolution : 0.008333333, 0.008333333 (x, y)
#extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (EPSG:4326)
#source : elev.tif
#name : elevation
#min value : 141
#max value : 547
sources(r) # this works
save.image("delete_if_found.RData")
rm(list = ls())
load("delete_if_found.RData")
r
#which returns the SpatRaster as
#class : SpatRaster
#Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, :
#NULL value passed as symbol address
I am currently importing all the relevant files again after loading the workspace. Is there any other way to go about it?
Hello Aniruddha Marathe and welcome to SO!
If you take a look at the terra package documentation, you will see this:
[...] They cannot be recovered from a saved R session either or directly passed to nodes on a computer cluster. Generally, you should use writeRaster to save SpatRaster objects to disk (and pass a filename or cell values to cluster nodes)
So you will have to load the SpatRaster each time you want to use it by executing terra::rast(system.file("ex/elev.tif", package="terra")), instead of recovering it with the load() function.
Hope this helps 😀
You can use writeRaster and then rast, and you can also use saveRDS and readRDS, but you cannot use save and load.
As far as I am concerned, that is a good thing, because saving a session is generally a bad idea (and I wish that R would not prod you to do it). It is bad because you should not start an analysis with data coming from nowhere. Instead, save your intermediate data to files and read them again in the next step.
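A short sketch of both routes, reusing the sample file from above (the output file names are placeholders):

```r
library(terra)
f <- system.file("ex/elev.tif", package = "terra")
r <- rast(f)

# Route 1: write to a permanent file and re-open it in the next session
writeRaster(r, "elev_copy.tif", overwrite = TRUE)
r2 <- rast("elev_copy.tif")

# Route 2: saveRDS stores a PackedSpatRaster; unpack it with rast()
saveRDS(r, "elev.rds")
r3 <- rast(readRDS("elev.rds"))
```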
I want to load raster files and then mask out the clouded parts. I have done this many times and the code used to work fine. Now I have upgraded to Windows 11 on a new laptop, and installed R 4.2, then RStudio, and then RTools 4.2. Here is my code:
library(raster)
library(rgdal)
setwd('D:/Test/')
masks <- list.files('D:/Test', 'mask_C2.tif$', full.names = T)
ndvi <- list.files('D:/Test', 'NDVI_C2.tif$', full.names = T)
rm <- lapply(masks, raster)
rn <- lapply(ndvi, raster)
When I call the 1st element on the NEW MACHINE, the 'names' property doesn't have the file name; it just says layer:
> rn[[1]]
class : RasterLayer
dimensions : 174, 164, 28536 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 323895, 328815, 3724335, 3729555 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=43 +datum=WGS84 +units=m +no_defs
source : cr_LC81500372020261LGN00_NDVI_C2.tif
names : layer
values : -0.1510984, 0.8510309 (min, max)
When I call the 1st element on the OLD MACHINE, the 'names' property has the file name:
> rn[[1]]
class : RasterLayer
dimensions : 174, 164, 28536 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 323895, 328815, 3724335, 3729555 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=43 +datum=WGS84 +units=m +no_defs
source : cr_LC81500372020261LGN00_NDVI_C2.tif
names : cr_LC81500372020261LGN00_NDVI_C2
values : -0.1510984, 0.8510309 (min, max)
I was using the names property to name output rasters, which I'm unable to do now. I know basename can do this, but I really want to know what's wrong here: the same code works fine on the old machine with the same versions of R, RStudio, and RTools.
I'd really appreciate it if someone can help me here. Thanks.
Update: I reinstalled Windows with a clean install, then installed the same versions of R, RStudio, and RTools as on my old computer. Since the names property has the file name on the old computer, I want it to be the same on the new one.
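In the meantime, since basename is already mentioned above, here is a hedged sketch of that workaround: set the names explicitly from the source paths rather than relying on what raster() assigns.

```r
# Derive layer names from the file paths; ndvi is the vector of file
# paths from the question above.
rn <- lapply(ndvi, function(f) {
  r <- raster(f)
  names(r) <- tools::file_path_sans_ext(basename(f))
  r
})
```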
I'm using knitr to make a document that uses the terra package to draw maps. Here's my minimal .Rmd file:
```{r init, echo=FALSE}
library(knitr)
opts_chunk$set(cache=TRUE)
```
```{r maps}
library(terra)
r = rast(matrix(1:12,3,4))
plot(r)
```
```{r test2}
print(r)
plot(r)
```
On the first run (via rmarkdown::render(...)), this works and creates a test_cache folder. Running it a second time (with no changes) also works fine. If I make a minor change to chunk 2 (e.g. add a comment) and run again, I get:
Quitting from lines 14-17 (cache.Rmd)
Error in .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: (nil)>, :
NULL value passed as symbol address
I've also had this from another Rmd file, probably related:
Quitting from lines 110-115 (work.Rmd)
Error in x@ptr$readStart() : external pointer is not valid
Clearing the cache or running with cache=FALSE works, but then what's the point of the cache?
I think it's because r is some sort of reference class that exists via memory allocated by Rcpp, and knitr is only caching a reference; when it tries to read the cached version, it gets a reference to memory that no longer holds the object created to go with it. So it fails.
FWIW a terra raster object looks like this:
> str(r)
Formal class 'SpatRaster' [package "terra"] with 1 slot
..@ ptr:Reference class 'Rcpp_SpatRaster' [package "terra"] with 20 fields
.. ..$ depth : num 0
.. ..$ extent :Reference class 'Rcpp_SpatExtent' [package "terra"] with 2 fields
and
> r@ptr
C++ object <0x55ce6fdf2bd0> of class 'SpatRaster' <0x55ce5a6750b0>
Is there a way to make the knitr cache work with these objects? I know I could exclude just these from the cache, but 90% of my document works with these sorts of objects, and they are the reason I want to use the cache to speed things up. As it is, every time I get this error I have to stop, clear the cache, and start again, and I don't know whether that time is worth the speedup the cache gives me.
R 4.1.1 with
> packageVersion("knitr")
[1] ‘1.34’
> packageVersion("rmarkdown")
[1] ‘2.11’
terra does some work to allow serialization, for example:
library(terra)
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)
saveRDS(r, "test.rds")
readRDS("test.rds")
[1] "This is a PackedSpatRaster object. Use 'terra::rast()' to unpack it"
readRDS("test.rds") |> rast()
#class : SpatRaster
#dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#resolution : 0.008333333, 0.008333333 (x, y)
#extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (EPSG:4326)
#source : memory
#name : elevation
#min value : 141
#max value : 547
But it seems that knitr caches with "save" to an RData file; in that case, the raw object is stored, which won't be valid when reloaded.
I think it may be possible to work around this with some clever use of hook functions; but that would be so involved that it would defeat the purpose of caching.
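One lighter-weight pattern you could try (my suggestion, not something the answer above endorses) is to keep only the packed form in cached chunks and unpack at the point of use, since a PackedSpatRaster survives serialization:

```r
library(terra)

# In a cached chunk: store the serializable PackedSpatRaster,
# not the live SpatRaster with its external pointer.
r_packed <- wrap(rast(matrix(1:12, 3, 4)))

# In any chunk that uses it: unpack first (cheap for small rasters).
r <- unwrap(r_packed)  # rast(r_packed) also works in older terra versions
plot(r)
```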
I ran into what seems to be a fairly unusual problem with R's raster package.
I am not sure how to provide a reproducible example, but the long and short of it is that whenever I perform calculations involving my raster objects, the internet shuts down for the whole house. I can't provide the raster images themselves because they are so big, but I can provide their general description. Objects like these are about a gigabyte each:
class : RasterLayer
dimensions : 15336, 19016, 291629376 (nrow, ncol, ncell)
resolution : 30.85642, 30.85642 (x, y)
extent : 610995.9, 1197762, 9526801, 10000015 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=52 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
source : C:/Users/tug74077/AppData/Local/Temp/RtmpcVf7pt/raster/r_tmp_2020-07-21_082952_22832_46671.grd
names : layer
values : 6378.035, 2016403 (min, max)
And sometimes I use raster stacks, which are much larger (about 10 gigabytes)
class : RasterStack
dimensions : 15336, 19016, 291629376, 30 (nrow, ncol, ncell, nlayers)
resolution : 30.85642, 30.85642 (x, y)
extent : 610995.9, 1197762, 9526801, 10000015 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=52 +south +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
names : ap_hp_stack.1, ap_hp_stack.2, ap_hp_stack.3, ap_hp_stack.4, ap_hp_stack.5, ap_hp_stack.6, ap_hp_stack.7, ap_hp_stack.8, ap_hp_stack.9, ap_hp_stack.10, ap_hp_stack.11, ap_hp_stack.12, ap_hp_stack.13, ap_hp_stack.14, ap_hp_stack.15, ...
min values : 426.50653, 403.31589, 381.38617, 360.64886, 341.03912, 322.49561, 304.96039, 288.37863, 272.69846, 257.87088, 243.84953, 230.59058, 218.05255, 206.19627, 194.98465, ...
max values : 134839.22, 127507.53, 120574.50, 114018.44, 107818.85, 101956.36, 96412.63, 91170.34, 86213.09, 81525.38, 77092.55, 72900.77, 68936.89, 65188.55, 61644.02, ...
The type of calculation that disconnects the internet ranges from a simple stacking operation on 30 rasters like this:
ann.cost_hp_t_stack <- stack(ann.cost_hp_t)
(ann.cost_hp_t is a list of 30 rasters, each resembling the single RasterLayer described above, that are stacked to create ann.cost_hp_t_stack, which resembles the RasterStack described above)
to an operation that looks like this:
for (i in 1:30) {
  ann.cost_hp_t[[i]] <- ann.cost_t_im / ((1 + 0.05)^i)
}
where ann.cost_t_im is another raster layer resembling the raster layer described above.
In addition to the internet cutting out for the whole router/house, my local disk gets filled up too, and I have to regularly restart R to free up about 140 gigabytes of disk space.
If you have read this far, thank you very much for your time. Also, sorry if the formatting is confusing.
TL;DR: My internet keeps cutting out when I use the "raster" package in R to create gigabytes upon gigabytes of data.
The raster package does not use your network for normal computations (it only uses it to download data when you use the getData function), so it is hard to see how raster itself could be responsible. It has to be related to what your computer does when files are created.
All I can think of is that you have a system that automatically copies your data to the cloud. So if you create a bunch of big files, that would slow down the Internet.
As for the filling up of your disk: that is because you are working with very large files using functions that then need to save the results to disk (in a temp file). You can instead use functions such as calc that take a filename argument. There would still be files, of course, but you may cut out intermediate files, if there are any. See raster::removeTmpFiles for removing them without exiting R.
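A hedged sketch of that suggestion, reusing the objects from the question (the output file names are placeholders):

```r
library(raster)

# Write each yearly layer straight to a named file so results do not pile up
# as temp files; ann.cost_t_im is the RasterLayer from the question.
ann.cost_hp_t <- vector("list", 30)
for (i in 1:30) {
  ann.cost_hp_t[[i]] <- calc(
    ann.cost_t_im,
    fun = function(x) x / (1 + 0.05)^i,
    filename = paste0("ann_cost_hp_t_", i, ".tif"),
    overwrite = TRUE
  )
}

# Remove raster temp files older than one hour without restarting R:
removeTmpFiles(h = 1)
```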
I created a .dbf file to be read with all its associated files (.shp, .shx, .prj, .cpg) using the R package 'rgdal'. But after reading it, I noticed that some data was ignored.
When I print the input variable (which I called 'brs'), the values shown as the min and max of the column 'med' are completely wrong:
> brs<-readOGR(dsn='/home/luis/Downloads/br_municipios',layer='BRMUE250GC_SIR')
> brs
class : SpatialPolygonsDataFrame
features : 5152
extent : -73.99045, -28.83591, -33.75118, 5.271841 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=GRS80 +no_defs
variables : 3
names : NM_MUNICIP, CD_GEOCMU, med
min values : ABADIA DE GOIÁS, 1200013, 0.00
max values : ZORTÉA, 5300108, 999.85
If I try to get those values by hand, I get:
> min(brs@data$med)
[1] 0
> max(brs@data$med)
[1] 1128036
Performing some tests, adding new and different columns to my .dbf file, I realized that the maximum value accepted is around 999. Is this correct?
I need the input variable to show the correct values, because I'm trying to generate some maps using the 'tmap' or 'ggmap' packages.
If someone could help, I would appreciate it.
Thanks in advance.