Speed up R Focal Function

I am running a Focal function in R calculating the mode within my moving window. This is being run on a large raster with a cell size of 56m (see details below).
class : RasterLayer
dimensions : 63091, 52410, 3306599310 (nrow, ncol, ncell)
resolution : 56, 56 (x, y)
extent : -1575288, 1359672, -1486356, 2046740 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=44.75 +lat_2=55.75 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : D:\dataPath\rasterFile.tif
names : int17_5001
values : 0, 500 (min, max)
This has worked well with smaller window sizes such as 29x29 or 15x15 (see below); those runs completed in about 48 hours.
focal29 <- focal(
  myRasterInput,
  w = matrix(1, nrow = 29, ncol = 29),
  fun = modal
)
focal15 <- focal(
  myRasterInput,
  w = matrix(1, nrow = 15, ncol = 15),
  fun = modal
)
The issue arises when I try to run this with larger windows. Specifically, I need window sizes of roughly 180x180 and 300x300. Those runs have been going for almost a week and have not finished.
Any suggestions for a better way to run a focal function with larger windows on these datasets?
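One way to attack this (a sketch only; the aggregation factor and window sizes below are illustrative assumptions, not values from your workflow) is to trade a little spatial precision for a lot of speed: a 300x300 modal window on 56 m cells covers roughly the same ground as a ~31x31 window on cells aggregated by a factor of 10, and the cost of modal grows with window area.
library(raster)
# Aggregate by a factor of 10 (56 m -> 560 m cells), taking the mode of each block
agg <- aggregate(myRasterInput, fact = 10, fun = modal)
# A ~300-cell window at 56 m corresponds to roughly a 31x31 window at 560 m
focal300approx <- focal(agg, w = matrix(1, nrow = 31, ncol = 31), fun = modal)
# If the result must be on the original grid, resample back;
# nearest neighbour preserves the class values
focal300full <- resample(focal300approx, myRasterInput, method = "ngb")
The same idea applies to the ~180-cell window (roughly 19x19 after aggregating by 10); whether the loss of precision is acceptable depends on your application.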

Related

Merge (mosaic) of rasters changes resolution

I'm merging two MODIS DSR tiles using an R script that I developed. These are the products:
https://drive.google.com/drive/folders/1RG3JkXlbaotBax-h5lEMT7lEn-ObwWsD?usp=sharing
I open both products (tile h15v05 and tile h16v05) from the same date (2019180), then open each SDS and merge them together (00h from h15v05 with 00h from h16v05, and so on).
Visualisation in Panoply (using its merge option) of the two products:
The purple square marks the division line that separates the two tiles.
With my code I obtain a plot whose pixels have a different resolution (and different min/max values), and I don't understand why:
I suspect that the results obtained are due to:
1. Changing from the sinusoidal CRS to the longlat WGS84 CRS;
2. Using resample (method ngb) to work with mosaic.
My code is extensive, but here are some parts of it:
# Open scientific dataset as raster
SDSs <- sds(HDFfile)
SDS <- SDSs[SDSnumber]
crs(SDS) <- crs("+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs")
SDSreprojected <- project(SDS, DesiredCRS)
SDSasRaster <- as(SDSreprojected, "Raster")
# Resample SDS based on a reference SDS (SDS GMT_1200_DSR of a first product), I need to do this to be able to use mosaic
SDSresampled <- resample(SDSasRaster,ResampleReference_Raster,method='ngb')
# Create mosaic of same SDS, but first convert stack to list to use mosaic
ListWith_SameSDS_OfGroupFiles <- as.list(StackWith_SameSDS_OfGroupFiles)
ListWith_SameSDS_OfGroupFiles.mosaicargs <- ListWith_SameSDS_OfGroupFiles
ListWith_SameSDS_OfGroupFiles.mosaicargs$fun <- mean
SDSmosaic <- do.call(mosaic, ListWith_SameSDS_OfGroupFiles.mosaicargs)
# Save SDSs mosaic stack to netCDF
writeRaster(StackWith_AllMosaicSDSs_OfGroupFiles, NetCDFpath, overwrite=TRUE, format="CDF", varname= "DSR", varunit="w/m2", longname="Downward Shortwave Radiation", xname="Longitude", yname="Latitude", zname="TimeGMT", zunit="GMT")
Does anyone have an idea of what could be the cause of this mismatch between results?
print(ResampleReference_Raster)
class : RasterLayer
dimensions : 1441, 897, 1292577 (nrow, ncol, ncell)
resolution : 0.01791556, 0.006942043 (x, y)
extent : -39.16222, -23.09196, 29.99652, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : memory
names : MCD18A1.A2019180.h15v05.061.2020343034815
values : 227.5543, 970.2346 (min, max)
print(SDSasRaster)
class : RasterLayer
dimensions : 1399, 961, 1344439 (nrow, ncol, ncell)
resolution : 0.01515284, 0.007149989 (x, y)
extent : -26.10815, -11.54627, 29.99717, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : memory
names : MCD18A1.A2019180.h16v05.061.2020343040755
values : 0, 0 (min, max)
print(SDSmosaic)
class : RasterLayer
dimensions : 1441, 897, 1292577 (nrow, ncol, ncell)
resolution : 0.01791556, 0.006942043 (x, y)
extent : -39.16222, -23.09196, 29.99652, 40 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : memory
names : layer
values : 0, 62.7663 (min, max)
Also, some of the islands were ignored by the script (bottom right)...
Sorry I didn't reply earlier. I think you're right that the issue is the extent to which you are resampling. You may be able to get around this by creating a dummy raster that has the extent of the raster you want to resample, but the resolution of the raster you want to mosaic to. Try:
dummy <- raster(ext = extent(SDSasRaster), resolution = res(ResampleReference_Raster), crs = crs(SDSasRaster))
SDS2 <- resample(SDSasRaster, dummy, method = "ngb")
Final <- mosaic(SDS2, ResampleReference_Raster, fun = mean)
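As a quick sanity check (my addition, not part of the original answer): mosaic() in the raster package only requires the layers to share resolution and origin, not extent, so you can verify that the resampling lined things up with:
# Both pairs should agree (up to tiny floating point differences) before mosaic()
res(SDS2);    res(ResampleReference_Raster)
origin(SDS2); origin(ResampleReference_Raster)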

Calculating a CHM with dtm and dsm with different resolutions

I have a DTM and DSM with different resolutions.
Here are the summaries of each Raster layer.
> raster_dsm
class : RasterLayer
dimensions : 2001, 2501, 5004501 (nrow, ncol, ncell)
resolution : 0.5, 0.5 (x, y)
extent : -112500.2, -111249.8, 388999.8, 390000.2 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=tmerc +lat_0=0 +lon_0=16.33333333333333 +k=1 +x_0=0 +y_0=-5000000 +ellps=bessel +units=m +no_defs
data source : D:/Test_Raster/DSM/dsm.asc
names : dsm
>raster_dtm
class : RasterLayer
dimensions : 1001, 1251, 1252251 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : -112500.5, -111249.5, 388999.5, 390000.5 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=tmerc +lat_0=0 +lon_0=16.33333333333333 +k=1 +x_0=0 +y_0=-5000000 +ellps=bessel +units=m +no_defs
data source : D:/Test_Raster/DTM/dtm.asc
names : dtm
As you can see, the resolution of the dtm is 1 m and the resolution of the dsm is 0.5m.
I want to calculate a Canopy Height Model (CHM).
The easiest way is to
CHM = dsm - dtm
But when I try this in R, the following error appears:
Error in compareRaster(e1, e2, extent = FALSE, rowcol = FALSE, crs = TRUE, :
different resolution
Is there a simple way to ignore the resolution, or must I resample the data before further calculation?
In ArcGIS you can do this kind of raster calculation easily, because you don't have to resample the data first.
Any suggestions will be appreciated!
Yes, Arc*** will do this for you, but what does it actually do? I think it is better to avoid that kind of ambiguity. In this case you cannot use dis/aggregate because the extents are different, so you need to use resample.
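A minimal sketch of that, using the layer names from the question (the choice of bilinear is my assumption; it is the usual method for continuous elevation values):
library(raster)
# Resample the 1 m DTM onto the 0.5 m DSM grid so extents and resolutions match
raster_dtm_05 <- resample(raster_dtm, raster_dsm, method = "bilinear")
# Now the grids agree and the subtraction works
chm <- raster_dsm - raster_dtm_05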

Sentinel-2 R gdal raster

I would like to make a piece of code I have more efficient.
I download Sentinel-2 data in jp2 format via the Open Access Hub. The jp2 files that I download have, for some reason, a wrong extent. I currently correct this in the following way (where file is the filename of the jp2):
r = raster(file)
extent(r) = new_extent
writeRaster(r, file)
This method, however, rewrites the entire raster (which takes ages), whereas I only changed a minor detail.
Is there a neat way, using GDAL or the raster package, to do this more efficiently?
If I print the raster I see:
class : RasterLayer
dimensions : 1830, 1830, 3348900 (nrow, ncol, ncell)
resolution : 60, 60 (x, y)
extent : 499980, 609780, 6690240, 6800040 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=55 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : /home/daniel/R/farmhack/x.jp2
names : x
values : 0, 65535 (min, max)
I do not know what this extent means.
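One possibility (a sketch; it assumes gdal_edit.py is on your PATH and that the JP2 driver on your system allows the header to be updated in place) is to let GDAL rewrite only the georeferencing rather than the pixel data:
library(raster)  # for xmin()/xmax()/ymin()/ymax() on the Extent object
# new_extent and file are the objects from the code above.
# -a_ullr takes ulx uly lrx lry; gdal_edit.py only touches the georeferencing,
# so the pixel data are not decompressed and rewritten.
cmd <- sprintf("gdal_edit.py -a_ullr %f %f %f %f %s",
               xmin(new_extent), ymax(new_extent),
               xmax(new_extent), ymin(new_extent),
               file)
system(cmd)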

R Raster Merge Changes Values

I have a series of GeoTIFF images that I am trying to merge into a single larger extent. Six small tiles need to be combined to generate my larger extent. My original six tiles have values ranging from 0 to 255.
For example:
> tiff.list[[1]]
class : RasterLayer
dimensions : 1200, 1200, 1440000 (nrow, ncol, ncell)
resolution : 926.6254, 926.6254 (x, y)
extent : -10007555, -8895604, 2223901, 3335852 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs
data source : D:\Scratch\Data\MOD15A2.A2016153.h09v06.005.2016166083754.tif
names : MOD15A2.A2016153.h09v06.005.2016166083754
values : 0, 255 (min, max)
However, when merging the tiles using the code detailed here, I get a new image file and the values have changed:
> xx
class : RasterLayer
dimensions : 2400, 3600, 8640000 (nrow, ncol, ncell)
resolution : 926.6254, 926.6254 (x, y)
extent : -10007555, -6671703, 1111951, 3335852 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs
data source : D:\Scratch\Modis\A2016161.tif
names : A2016161
values : 0, 25 (min, max)
Does anyone know why this is happening? I've tried changing the file format and the dataType ('INT1U'), but it keeps happening. It is important that the 0-255 range is preserved: the original data come from NASA's MODIS satellite, and certain values (i.e. 248-255) have specific fill meanings associated with them (for example, land cover assigned as water or snow). This change from a maximum of 255 to 25 removes important information from the original files.
Any assistance provided would be most welcome.
This suggests that these values are absent in the original files. The min and max values reported for the original files are based on the metadata provided therein, and that metadata was likely wrong (showing the range of possible values, not the actual values). To investigate, run:
setMinMax(tiff.list[[1]])
or
tiff.list[[1]] * 1
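For example (a small sketch using tiff.list[[1]] from the question), you can compare the reported range against the range computed from the actual cell values:
# Scan the cell values and store the true range, ignoring the file's metadata
r1 <- setMinMax(tiff.list[[1]])
minValue(r1)   # actual minimum of the stored values
maxValue(r1)   # actual maximum of the stored values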

Slow extraction from a large raster, even after using crop to reduce size?

I have a large raster file (245,295,396 cells) and raster stacks, each with 4 layers, that lie within the extent of this large raster. To start with, I am trying to get values from one stack (3 channels) and, for the same zone, from the large raster. Everything works fine, except that the extraction from the large raster takes 5 minutes. If I repeat this process 4000 more times, it will take about 13 days.
cld <- raster("cdl_30m_r_il_2014_albers.tif")          # this is the large raster
r <- stack(paste(path, "/data_robin/", fl, sep = ""))  # 1 stack; I have 4000 similar
mat <- as.data.frame(getValues(r))                     # getting values from the stack
xy <- xyFromCell(r, c(1:ncell(r)), spatial = TRUE)
clip1 <- crop(cld, extent(r))                          # tried to crop it to a smaller size
cells <- cellFromXY(clip1, xy)
mat$landuse <- NA
# mat$landuse <- cld[cells]
mat$landuse <- extract(clip1, cells)                   # this line takes 5 mins based on profiling
> cld
class : RasterLayer
dimensions : 20862, 11758, 245295396 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 378585, 731325, 1569045, 2194905 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0
data source : /Users/kaswani/R/Image/cdl_30m_r_il_2014_albers.tif
names : cdl_30m_r_il_2014_albers
values : 0, 255 (min, max)
> r
class : RasterStack
dimensions : 9230, 7502, 69243460, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.7995722, 0.7995722 (x, y)
extent : 589084.4, 595082.8, 1564504, 1571884 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
names : m_3608906_ne_16.1, m_3608906_ne_16.2, m_3608906_ne_16.3, m_3608906_ne_16.4
min values : 0, 0, 0, 0
max values : 255, 255, 255, 255
My data are in .tiff format and I am new to geospatial coding. I would really appreciate any suggestions to improve the speed. I have also tried this approach, but during the masking part it gives the error:
Error in compareRaster(x, mask) : different extent.
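One thing worth trying (a sketch, assuming the cropped raster comfortably fits in memory, which the dimensions in the question suggest): read the cropped values into a plain vector once and index it directly, rather than calling extract() with millions of cell numbers.
clip1 <- crop(cld, extent(r))          # as before: crop the big raster to the stack's extent
clip1 <- readAll(clip1)                # force the (small) cropped raster into memory
vals <- getValues(clip1)               # values as a plain vector, ordered by cell number
cells <- cellFromXY(clip1, xy)         # as before
mat$landuse <- vals[cells]             # direct indexing instead of extract(clip1, cells)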
