R Raster Merge Changes Values - r

I have a series of GTiff images that I am trying to merge into a single larger extent. 6 small tiles need to be combined to generate my larger extent. My original 6 tiles have values which range from 0 to 255.
For example:
> tiff.list[[1]]
class : RasterLayer
dimensions : 1200, 1200, 1440000 (nrow, ncol, ncell)
resolution : 926.6254, 926.6254 (x, y)
extent : -10007555, -8895604, 2223901, 3335852 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs
data source : D:\Scratch\Data\MOD15A2.A2016153.h09v06.005.2016166083754.tif
names : MOD15A2.A2016153.h09v06.005.2016166083754
values : 0, 255 (min, max)
However, when merging the tiles using the code detailed here, I get a new image file and the values have changed:
> xx
class : RasterLayer
dimensions : 2400, 3600, 8640000 (nrow, ncol, ncell)
resolution : 926.6254, 926.6254 (x, y)
extent : -10007555, -6671703, 1111951, 3335852 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs
data source : D:\Scratch\Modis\A2016161.tif
names : A2016161
values : 0, 25 (min, max)
Does anyone know why this is happening? I've tried changing the file format and dataType ('INT1U') but it keeps happening. It's important the values don't change from 0 to 255 as the original data comes from NASAs MODIS satellite and certain values (i.e. 248-255) have specific fill values associated with them (for example, land cover assigned as water or snow). This change from a max value of 255 to 25 is removing important information from the original files.
Any assistance provided would be most welcome.

This suggests that these values are absent in the original files. The min and max values reported for the original files are based on the metadata provided therein. The metadata was likely wrong (showing the range of possible, not the actual values). To investigate do
setMinMax(tiff.list[[1]])
or
tiff.list[[1]] * 1

Related

Merge (mosaic) of rasters changes resolution

I'm merging two MODIS DSR tiles using a R script that I developed, these are the products:
https://drive.google.com/drive/folders/1RG3JkXlbaotBax-h5lEMT7lEn-ObwWsD?usp=sharing
So, I open both products (tile h15v05 and tile h16v05) from same date (2019180), then I open each SDS and merge them together (00h from h15v05 with 00h from h16v05 and so on...)
Visualisation on Panoply (using the merge option) of the two products:
Purple square is the location of the division line that separates the two tiles.
With my code I obtain a plot with pixels with different resolution (and different min/max values) and I don't understand why:
I suspect that the results obtained are due to:
1- Changing from Sinusoidal CRS to longlat WGS84 CRS;
2- Using resample (method ngb) to work with mosaic.
My code is extensive, but here are some parts of it:
# Open scientific dataset as raster
SDSs <- sds(HDFfile)
SDS <- SDSs[SDSnumber]
crs(SDS) <- crs("+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs")
SDSreprojected <- project(SDS, DesiredCRS)
SDSasRaster <- as(SDSreprojected, "Raster")
# Resample SDS based on a reference SDS (SDS GMT_1200_DSR of a first product), I need to do this to be able to use mosaic
SDSresampled <- resample(SDSasRaster,ResampleReference_Raster,method='ngb')
# Create mosaic of same SDS, but first convert stack to list to use mosaic
ListWith_SameSDS_OfGroupFiles <- as.list(StackWith_SameSDS_OfGroupFiles)
ListWith_SameSDS_OfGroupFiles.mosaicargs <- ListWith_SameSDS_OfGroupFiles
ListWith_SameSDS_OfGroupFiles.mosaicargs$fun <- mean
SDSmosaic <- do.call(mosaic, ListWith_SameSDS_OfGroupFiles.mosaicargs)
# Save SDSs mosaic stack to netCDF
writeRaster(StackWith_AllMosaicSDSs_OfGroupFiles, NetCDFpath, overwrite=TRUE, format="CDF", varname= "DSR", varunit="w/m2", longname="Downward Shortwave Radiation", xname="Longitude", yname="Latitude", zname="TimeGMT", zunit="GMT")
Does anyone have an idea of what could be the cause of this mismatch between results?
print(ResampleReference_Raster)
class : RasterLayer
dimensions : 1441, 897, 1292577 (nrow, ncol, ncell)
resolution : 0.01791556, 0.006942043 (x, y)
extent : -39.16222, -23.09196, 29.99652, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : memory
names : MCD18A1.A2019180.h15v05.061.2020343034815
values : 227.5543, 970.2346 (min, max)
print(SDSasRaster)
class : RasterLayer
dimensions : 1399, 961, 1344439 (nrow, ncol, ncell)
resolution : 0.01515284, 0.007149989 (x, y)
extent : -26.10815, -11.54627, 29.99717, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : memory
names : MCD18A1.A2019180.h16v05.061.2020343040755
values : 0, 0 (min, max)
print(SDSmosaic)
class : RasterLayer
dimensions : 1441, 897, 1292577 (nrow, ncol, ncell)
resolution : 0.01791556, 0.006942043 (x, y)
extent : -39.16222, -23.09196, 29.99652, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : memory
names : layer
values : 0, 62.7663 (min, max)
Also, some of the islands were ignored by the script (bottom right)...
sorry I didn't reply earlier. So I think you're right that this issue is extent to which you are resampling. I think you might be able to get around this by creating a dummy raster that has the extent of the raster you want to resample, but has the resolution of the raster you want to mosaic to.Try:
dummy<-raster(ext = SDSasRaster#extent, resolution=ResampledReference_Raster#res, crs=SDSasRaster#crs)
SDS2<-resample(SDSasRaster, dummy, method="ngb")
Final<-moasic(SDS2, ResampledReference_Raster, fun=mean)

Speed up R Focal Function

I am running a Focal function in R calculating the mode within my moving window. This is being run on a large raster with a cell size of 56m (see details below).
class : RasterLayer
dimensions : 63091, 52410, 3306599310 (nrow, ncol, ncell)
resolution : 56, 56 (x, y)
extent : -1575288, 1359672, -1486356, 2046740 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=44.75 +lat_2=55.75 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : D:dataPath\rasterFile.tif
names : int17_5001
values : 0, 500 (min, max)
This has worked great when using smaller window sizes like 29 or 15x see below. These completed after about 48hrs.
29xFocal <- focal(
myRasterInput,
w=matrix(1,nrow=29,ncol=29),
fun=modal
)
15xFocal <- focal(
myRasterInput,
w=matrix(1,nrow=15,ncol=15),
fun=modal
)
The issue is when I try and run this for larger windows.. Specifically, I need to use a window size of ~ 180x and 300x. These have been running for almost a week and have not finished.
Any suggestions for a better way to run a focal function with larger windows on these datasets?

resampled raster values out of range

I would like to resample a high resolution raster to a coarser resolution, but in such a way that the maximum values of cells are retained for the coarser grid cells.
As there is no fun argument in the resample function in R's raster package, I have put together a simply custom function:
resampleCustom <- function(r1, r2) {
resRatio <- as.integer(res(r2) / res(r1))
ret <- aggregate(r1, fact = resRatio, fun = max)
if (!compareRaster(ret, r2, stopiffalse = FALSE)) {
ret <- resample(ret, r2, method = 'bilinear')
}
return(ret)
}
Basically, I use aggregate, where I can provide a custom function, to get close to the target raster, and then I use resample to apply some final adjustments.
I applied this to a raster that represents the projected distribution of a species of fish (where cell values represent suitability scores ranging from 0 to 1), and the odd thing is that the resulting raster has values that are greater than the max values in the original rasters.
The two rasters can be downloaded here and here.
library(raster)
# read in species raster and template
sp <- raster('Abalistes_filamentosus.tif')
template <- raster('rasterTemplate.tif')
> sp
class : RasterLayer
dimensions : 360, 720, 259200 (nrow, ncol, ncell)
resolution : 48243.14, 40790.17 (x, y)
extent : -17367530, 17367530, -7342230, 7342230 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs
data source : /Users/pascaltitle/Dropbox/Abalistes_filamentosus.tif
names : Abalistes_filamentosus
values : -5.684342e-14, 1 (min, max)
> template
class : RasterLayer
dimensions : 49, 116, 5684 (nrow, ncol, ncell)
resolution : 3e+05, 3e+05 (x, y)
extent : -17367530, 17432470, -7357770, 7342230 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs
data source : /Users/pascaltitle/Dropbox/rasterTemplate.tif
names : rasterTemplate
values : 1, 1 (min, max)
> resampleCustom(sp, template)
class : RasterLayer
dimensions : 49, 116, 5684 (nrow, ncol, ncell)
resolution : 3e+05, 3e+05 (x, y)
extent : -17367530, 17432470, -7357770, 7342230 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs
data source : in memory
names : Abalistes_filamentosus
values : -0.2061382, 1.206138 (min, max)
The max value is 1.2, but how can this be when the bilinear method should essentially be taking averages of cell values? I would expect all values of the resulting raster to be within the bounds of the original raster values.
The extreme values are for cells at the edge of the raster, where values are extrapolated, as there are no neighbors at one side. This shows where these values are:
x <- resampleCustom(sp, template)
a <- xyFromCell(x, which.max(x))
b <- xyFromCell(x, which.min(x))
plot(x)
points(a)
points(b)
Or
plot(Which(x < 0))
plot(Which(round(x, 15) > 0))
To remove these extreme values, you can use raster::clamp.
xc <- clamp(x, 0, 1)
By the way, what you do, first aggregate then resampling, is also what is done within raster::resample.
The fundamental problem is that your high-res raster data do not line up with the low resolution aggregation you are seeking. That suggests a mistake earlier on in your work flow. The best way to avoid this problem is probably to make the habitat suitability predictions with predictor raster data that are aligned with the high resolution raster. You perhaps did not consider that when you projected the predictor variables to +proj=cea?

rasters with same crs, extent, dimension, resolution do not align

I am finding the average production days per year for maple syrup. My maple distribution data is in an ascii file. I have a raster (created from NetCDF files) called brick.Tmax. I want to match the specs of brick.Tmax to my maple distribution data.
## These are the specs I want to use for my maple distribution
brick.Tmax
class : RasterBrick
dimensions : 222, 462, 102564, 366 (nrow, ncol, ncell, nlayers)
resolution : 0.125, 0.125 (x, y)
extent : -124.75, -67, 25.125, 52.875 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : E:\all_files\gridded_obs.daily.Tmax.1980.nc
names : X1980.01.01, X1980.01.02, X1980.01.03, X1980.01.04, X1980.01.05, X1980.01.06, X1980.01.07, X1980.01.08, X1980.01.09, X1980.01.10, X1980.01.11, X1980.01.12, X1980.01.13, X1980.01.14, X1980.01.15, ...
Date : 1980-01-01, 1980-12-31 (min, max)
varname : Tmax
## reading in red maple data from ascii file into rasterLayer
red_raster <- raster("E:/all_files/Maple_Data/redmaple.asc")
red_raster
class : RasterLayer
dimensions : 140, 150, 21000 (nrow, ncol, ncell)
resolution : 20000, 20000 (x, y)
extent : -1793092, 1206908, -1650894, 1149106 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : E:\all_files\Maple_Data\redmaple.asc
names : redmaple
values : -2147483648, 2147483647 (min, max)
How can I project all specs (dimension, crs, resoluion, and extent) from brick.Tmax onto red_raster, while still preserving the values of red_raster? It seems like the two are mutually exclusive.
NOTE: In an attempt to streamline my question, I edited my question quite a bit from my original post, so apologies if the comments below are confusing in the current context. (I removed the raster prodavg_rastwhich was acting like a middleman).
The two rasters clearly do not have the same extent. In fact are in different universes (coordinate reference systems). brick.Tmax has angular (longitude/latitude) coordinates: +proj=longlat +datum=WGS84
but red_raster clearly does not given extent : -1793092, 1206908, -1650894, 1149106. So to use these data together, one of the two needs to be transformed (projected into the the coordinate reference system of the other). The problem is that we do not know what the the crs of red_raster is (esri ascii files do not store that information!). So you need to find out what it is from your data source, or by guessing giving the area covered and conventions. After you find out, you could do something like:
library(raster)
tmax <- raster(nrow=222, ncol=462, xmn=-124.75, xmx=-67, ymn=25.125, ymx=52.875, crs="+proj=longlat +datum=WGS84")
red <- raster(nrow=140, ncol=150, xmn=-1793092, xmx=1206908, ymn=-1650894, ymx=1149106, crs=NA)
crs(red) <- " ?????? "
redLL <- projectRaster(red, tmax)
Projectiong rasters takes time. A good way to test whether you figured out the crs would be to transform some polygons that can show whether things align.
library(rgdal)
states <- shapefile('states.shp')
sr <- spTransform(states, crs(red)
plot(red)
plot(sr, add=TRUE)

Slow extraction from a large raster, even after using crop to reduce size?

I have a large raster file (245295396) cells and stacks of rasters having 4 layers each which lie in the extent of this large raster. To start with, I am trying to get value from one stack (3 channels) and for the same zone from the large raster. Every things works fine, just the extraction from large raster takes 5 mins. So, if I repeat this process for 4000 more times it will take 13 days.
cld<- raster("cdl_30m_r_il_2014_albers.tif") #this is the large raster
r<- stack(paste(path,"/data_robin/", fl,sep="")) #1 stack,I have 4000 similar
mat<-as.data.frame(getValues(r)) # getting values from the stack
xy<-xyFromCell(r,c(1:ncell(r)),spatial = TRUE)
clip1 <- crop(cld, extent(r)) # Tried to crop it to a smaller size
cells<-cellFromXY(clip1,xy)
mat$landuse<- NA
# mat$landuse<-cld[cells]
mat$landuse<- extract(clip1,cells) #this line takes 5 mins based on profiling
> cld
class : RasterLayer
dimensions : 20862, 11758, 245295396 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 378585, 731325, 1569045, 2194905 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0
data source : /Users/kaswani/R/Image/cdl_30m_r_il_2014_albers.tif
names : cdl_30m_r_il_2014_albers
values : 0, 255 (min, max)
> r
class : RasterStack
dimensions : 9230, 7502, 69243460, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.7995722, 0.7995722 (x, y)
extent : 589084.4, 595082.8, 1564504, 1571884 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
names : m_3608906_ne_16.1, m_3608906_ne_16.2, m_3608906_ne_16.3, m_3608906_ne_16.4
min values : 0, 0, 0, 0
max values : 255, 255, 255, 255
My data is in .tiff format and I am new to geospatial coding. Will really appreciate any suggestions to improve the speed. I have also tried this approach but during the masking part it gives the error:
Error in compareRaster(x, mask) : different extent.

Resources