Reading Sentinel-2 bands into R returns high values

When I read the jp2 band files directly into R, I get unusually high values compared with what I see when I open the files in SNAP (version 9). To read the bands into R I use the terra package (the raster package would also work), and the values range from roughly 0 to 18000. I was wondering whether SNAP is doing some conversion that I am not aware of, since it shows values that range from roughly 0 to 0.15.
> r_10
class : SpatRaster
dimensions : 10980, 10980, 4 (nrow, ncol, nlyr)
resolution : 10, 10 (x, y)
extent : 499980, 609780, 6690240, 6800040 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 32N (EPSG:32632)
sources : T32VNN_20181018T105031_B02_10m.jp2
T32VNN_20181018T105031_B03_10m.jp2
T32VNN_20181018T105031_B04_10m.jp2
... and 1 more source(s)
names : B02_10m_m10_2018, B03_10m_m10_2018, B04_10m_m10_2018, B08_10m_m10_2018
min values : 0, 0, 0, 0
max values : 18815, 17880, 17023, 15608
>
I have tried exporting the bands from SNAP to TIF to see if it is a format problem, but it takes forever. I was hoping that there is a conversion factor that gives the actual values I need for my analysis.

The band files store reflectance multiplied by a scaling factor of 1.0E4 so that values can be stored more efficiently as integers, and SNAP undoes this scaling when it displays them. You will need to either divide the values by this scaling factor in R, or else work in the scaled units to benefit from more efficient integer arithmetic. See https://step.esa.int/docs/tutorials/Exporting%20data%20from%20SNAP.pdf for more details.
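Dividing by the factor is ordinary raster arithmetic in terra. A minimal sketch, using a small made-up raster in place of the jp2 bands (the values are illustrative, not read from the product):

```r
library(terra)

# Stand-in for a band read with rast("...jp2"): integer digital numbers
dn <- rast(nrows = 2, ncols = 2, vals = c(0, 1500, 10000, 18815))

# Divide by the 1.0E4 scaling factor to get reflectance
refl <- dn / 10000
values(refl)
```

The same division works on a multi-layer SpatRaster such as r_10 in the question, since arithmetic is applied to every layer.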

Legacy formats ([Geo]TIFF, JPEG) have incomplete metadata support. You may lose missing value codes, offset and scale factors, processing history, etc. NetCDF4-CF has good metadata support for applications that actually use what is available. The R terra package can import NetCDF4-CF format, but is selective about what metadata are imported. For the files I have tested, missing data, scale, and offset values are used by the rast() function, but other metadata are lost.

With terra you can also set a scale/offset if you want:
library(terra)
#terra 1.6.21
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)
r
#class : SpatRaster
#dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#resolution : 0.008333333, 0.008333333 (x, y)
#extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (EPSG:4326)
#source : elev.tif
#name : elevation
#min value : 141
#max value : 547
scoff(r) <- cbind(10, 0)
r
#class : SpatRaster
#dimensions : 90, 95, 1 (nrow, ncol, nlyr)
#resolution : 0.008333333, 0.008333333 (x, y)
#extent : 5.741667, 6.533333, 49.44167, 50.19167 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (EPSG:4326)
#source : elev.tif
#name : elevation
#min value : 1410
#max value : 5470
Essentially this is a way to delay the evaluation of value * scale + offset, but the computation may then need to be done multiple times (each time r is used), so it is not something I would generally recommend.
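If the rescaled values are needed more than once, one alternative is to apply the arithmetic a single time and keep the result. A minimal sketch with the same example file (r10 is just an illustrative name):

```r
library(terra)

f <- system.file("ex/elev.tif", package = "terra")
r <- rast(f)

# Multiply once; the result is a new SpatRaster that holds the scaled values
r10 <- r * 10
minmax(r10)
```

For large rasters, the result could also be written to disk with writeRaster so that it is computed only once.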

Related

How to avoid memory issues when creating a mosaic of raster tiles using terra?

I have a list containing 375 raster tiles that I would like to mosaic into one raster:
filelist_lc <- list.files("Northern_Land_Cover_2000/")
lc_2000_tiles <- lapply(filelist_lc, rast)
> lc_2000_tiles[[1]]
class : SpatRaster
dimensions : 4962, 5049, 1 (nrow, ncol, nlyr)
resolution : 30, 30 (x, y)
extent : 1614258, 1765728, 8094905, 8243765 (xmin, xmax, ymin, ymax)
coord. ref. : LCC E008
source : 014M.tif
name : 014M
lc_2000_tiles[[2]]
class : SpatRaster
dimensions : 4791, 4747, 1 (nrow, ncol, nlyr)
resolution : 30, 30 (x, y)
extent : 1462288, 1604698, 8381835, 8525565 (xmin, xmax, ymin, ymax)
coord. ref. : LCC E008
source : 015L.tif
name : 015L
....
I was trying to figure out a way to mosaic them. This was the solution I came up with. However, after 7 tiles are combined, the computer runs out of disk space.
Error: [merge] internal error: insufficient disk space (perhaps from temporary files?)
Is there a more efficient way to do this?
You do not show what you do with lc_2000_tiles. Do you provide a filename argument? If not, the output goes to the temp folder, which may be on a disk without much space (and that is inefficient). You can set the temp folder with terraOptions(tempdir = "....")
Also, do you need mosaic, or can you use merge (equivalent if the tiles do not overlap)? If merge is enough, you can also use vrt:
v <- vrt(filelist_lc)
A "VRT" is a virtual raster. It can treat different files (typically adjacent non-overlapping tiles, but that is not required) as a single data source (file). With that, you could then do
x <- writeRaster(v, "combined.tif")
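To make the vrt approach concrete, here is a small self-contained sketch. It writes two made-up, adjacent tiles to a temporary folder first (the tile values, file names, and CRS are assumptions for illustration only), then combines them. Note that list.files needs full.names = TRUE unless the working directory is the tile folder.

```r
library(terra)

# Hypothetical tile folder inside the session temp directory
td <- file.path(tempdir(), "tiles_demo")
dir.create(td, showWarnings = FALSE)

# Two adjacent, non-overlapping tiles with made-up values
a <- rast(xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          nrows = 10, ncols = 10, vals = 1, crs = "EPSG:32632")
b <- rast(xmin = 10, xmax = 20, ymin = 0, ymax = 10,
          nrows = 10, ncols = 10, vals = 2, crs = "EPSG:32632")
writeRaster(a, file.path(td, "a.tif"), overwrite = TRUE)
writeRaster(b, file.path(td, "b.tif"), overwrite = TRUE)

# Full paths, so the files are found regardless of the working directory
files <- list.files(td, pattern = "\\.tif$", full.names = TRUE)

# Treat the tiles as one virtual raster, then write a single file
v <- vrt(files)
out <- writeRaster(v, file.path(tempdir(), "combined.tif"), overwrite = TRUE)
out
```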

Unable to do raster operations in R

Hi my raster values for a Raster Layer are the following:
dimensions : 2225, 2286, 5086350 (nrow, ncol, ncell)
resolution : 0.03333146, 0.03333146 (x, y)
extent : -20.86612, 55.32961, -35.40306, 38.75945 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : solar.tif
names : solar
values : 0, 2855 (min, max)
However whenever I try to do simple raster operations such as:
plot(solar)
It returns this error:
Error in setValues(outras, m) :
could not find symbol "values" in environment of the generic function
Thanks for any help

Missing Values After Applying Mask Function

I have a raster called 'xx', which is the output of the 'distance' function in the R raster package. As part of running the distance function, the ocean/sea surrounding my island has been set to a value of 0. To change this value from 0 back to NA, I used the 'mask' function like so:
asdf<-mask(xx,jamaica.raster, filename="distancesurface.asc", prj=TRUE, keepres=TRUE, overwrite=TRUE)
In the above example, jamaica.raster is being used to replace the 0 values with NA. I run the command and plot the result. As expected, it has left my distance surface in place and set the ocean to NA, exactly as I wanted.
Afterwards, when I attempt to writeRaster the object 'asdf', I get the error:
Error in .local(.Object, ...) :
Couldn't find data values in ASCII Grid file.
If I get a summary of the objects 'xx' and 'jamaica.raster' before applying 'mask' function, both have a 'values' section like so:
class : RasterLayer
dimensions : 135, 306, 41310 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -78.49359, -75.94359, 17.59729, 18.72229 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +no_defs
data source : D:\Dropbox\\...
names : jamaicapc1
values : -3.419589, 14.17305 (min, max)
If I do the same for my new object 'asdf', there are no values. They seem to have gone missing since using the mask function. For example:
class : RasterLayer
dimensions : 135, 306, 41310 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -78.49359, -75.94359, 17.59729, 18.72229 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +no_defs
data source : C:\Users\Simon\Desktop\distancesurface.asc
names : distancesurface
I assume this is expected behaviour so my question is, how can I write my new raster (asdf) to disk? Is there a step I need to carry out before I can use the writeRaster function?
Thanks in advance for any help you may be able to provide.

Using 'disaggregate' with GCM data

I have data from various Global Circulation Models (GCM) that I need at a finer resolution to perturb climate observations that have 0.5 degree pixels. I saw that I could use disaggregate because this function won't change pixel values, as 'resample' does when using, e.g., the bilinear method. But still, the output doesn't match my fine-resolution grids.
Here an example with the dimensions of the files I'm dealing with:
r = raster(ncols=720, nrows=360) #fine resolution grid
r[] = runif(1:100)
> r
class : RasterLayer
dimensions : 360, 720, 259200 (nrow, ncol, ncell)
resolution : 0.5, 0.5 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : 0.0159161, 0.9876637 (min, max)
s = raster(ncols=192, nrows=145) #dimensions of one of the GCM
s[] = runif(1:10)
> s
class : RasterLayer
dimensions : 145, 192, 27840 (nrow, ncol, ncell)
resolution : 1.875, 1.241379 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : 0.03861309, 0.9744665 (min, max)
d=disaggregate(s, fact=c(3.75,2.482759)) #fact equals r/s for cols and rows
> d
class : RasterLayer
dimensions : 290, 768, 222720 (nrow, ncol, ncell)
resolution : 0.46875, 0.6206897 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : 0.03861309, 0.9744665 (min, max)
The dimensions of 'd' are not equal to the dimensions of 'r', so I can't do operations with the 2 grids. And I'm not meant to be interpolating the pixel values. So, what's the best method to achieve the disaggregation with GCM data?
Thanks in advance.
The code below should help. It uses disaggregate with the closest integer factors possible, then resample to match the other raster's spatial characteristics exactly:
library(raster)
r = raster(ncols=720, nrows=360) #fine resolution grid
r[] = runif(1:100)
s = raster(ncols=192, nrows=145) #dimensions of one of the GCM
s[] = runif(1:10)
# fact is given as c(horizontal, vertical), i.e. columns first, then rows
d=disaggregate(s, fact=c(round(ncol(r)/ncol(s)), round(nrow(r)/nrow(s))), method='')
e=resample(d, r, method="ngb")
But there are a few caveats/warnings. If you want to keep the same values as the original raster, use disaggregate with method='', or else it will interpolate. More importantly, the ratios between your r and s rasters are not the same for rows and columns: dim(r)[1]/dim(s)[1] != dim(r)[2]/dim(s)[2]. I would double check the original data, because if there is a difference in resolution, projection, or extent you will not get what you want from the steps above.
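To see the mismatch in the ratios concretely, the arithmetic can be done in base R with the dimensions from the question (a small sketch; the variable names are mine):

```r
# Dimensions taken from the question: rows and columns of each grid
fine   <- c(nrow = 360, ncol = 720)   # r, the 0.5 degree grid
coarse <- c(nrow = 145, ncol = 192)   # s, the GCM grid

ratio <- fine / coarse
ratio         # rows and columns need different, non-integer factors
round(ratio)  # the integer factors that disaggregate can actually use
```

Because neither ratio is an integer, disaggregate alone cannot reproduce the fine grid, which is why the resample step is needed afterwards.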

Calculating area of occupancy from a binary unprojected raster

I have a series of binary raster layers (ascii file) showing presence/absence of a species in Europe and Africa. The file is based on unprojected lat/long (WGS84) data. My aim is to calculate the area of presence using R (I don't have access to ArcGIS).
I know that the raster package has a function for calculating area, but I'm worried that this won't be accurate for unprojected data. I have also looked at the cellStats function in the raster package, and can use this to "sum" the number of cells occupied, but I feel this has the same problem.
jan<-raster("/filelocation/file.asc")
jan
class : RasterLayer
dimensions : 13800, 9600, 132480000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -20, 60, -40, 75 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : "/filelocation"
names : file.asc
values : -2147483648, 2147483647 (min, max)
area(jan)
class : RasterLayer
dimensions : 13800, 9600, 132480000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -20, 60, -40, 75 (xmin, xmax, ymin, ymax)
coord. ref. : NA
names : layer
values : 6.944444e-05, 6.944444e-05 (min, max)
Warning messages:
1: In .local(x, ...) :
This function is only useful for Raster* objects with a longitude/latitude coordinates
2: In .rasterFromRasterFile(grdfile, band = band, objecttype, ...) :
size of values file does not match the number of cells (given the data type)
cellStats(jan,"sum")
[1] 3559779
Anybody know of a way to calculate the presence area accurately, accounting for the earth curvature?
Thanks!
I do not know what is going on with your file (why you get warning #2), but here is a workaround:
r <- raster(nrow=13800, ncol=9600, xmn=-20, xmx=60, ymn=-40, ymx=75)
# equivalent to r <- raster(jan)
x = area(r)
x
class : RasterLayer
dimensions : 13800, 9600, 132480000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -20, 60, -40, 75 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84
data source : c:\temp\R_raster_Robert\2015-01-26_213612_1208_85354.grd
names : layer
values : 0.2227891, 0.8605576 (min, max)
Now you have the area of each cell in km2. By multiplying these values with Raster objects with presence/absence values and then using cellStats( , 'sum') you can obtain the total area with presence.
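The multiplication step might look like the sketch below. It uses a coarse 1 degree stand-in grid with the same extent and made-up presence/absence values (both are assumptions for illustration), so it runs quickly; with real data, pres would be the layer read from file.

```r
library(raster)

# Small stand-in grid (1 degree cells) with the extent from the question
r <- raster(nrow = 115, ncol = 80, xmn = -20, xmx = 60, ymn = -40, ymx = 75)

# Made-up presence/absence values: alternating 0 and 1
pres <- setValues(r, rep(c(0, 1), length.out = ncell(r)))

# Area of each cell in km2, which varies with latitude
cell_km2 <- area(r)

# Total area of the cells where the species is present
total_km2 <- cellStats(cell_km2 * pres, "sum")
total_km2
```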
