Raster conversion from WGS84 to Mollweide alters values?

I have a question regarding conversion of rasters with categorical values from WGS84 to Mollweide projections. It looks like the conversion alters the dataset values. Unfortunately, I struggle to provide a reproducible example, so I'll give some details about my approach instead. You may have some tips on where my issue could come from, as this may be a common problem. The EU website https://ghsl.jrc.ec.europa.eu/download.php?ds=bu gives me access to the following rasters:
The SMOD layer provides human settlement info. I transformed the SMOD dataset into 1) an urban mask with "1" for urban areas and "NA" for non-urban areas and 2) a rural mask with "1" for rural areas and "NA" for non-rural areas. SMOD is available in Mollweide projection only.
The POP layer provides human population counts (number of people per grid cell). POP is available in both Mollweide and WGS84.
I tried two approaches to estimate rural and urban human population numbers.
My challenge is that I get different numbers for each of these approaches. I wonder why this is the case:
Approach 1) Change SMOD Mollweide projections to WGS84
# layers as provided by website
SMOD_MollweideProj <- raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif")
POP_WGSproj <- raster("./GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.tif")
SMOD_WGSproj <- projectRaster(from=SMOD_MollweideProj, to= POP_WGSproj, method='ngb' , over=T )
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural domain".
SMOD_rur_mask_1K <- SMOD_WGSproj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] = 1
SMOD_urb_mask_1K <- SMOD_WGSproj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_WGSproj <- POP_WGSproj * SMOD_rur_mask_1K
POP_urb_1K_WGSproj <- POP_WGSproj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_WGSproj, sum, na.rm=T)
#2024108119 which is different to the value I get with the second approach
cellStats(POP_urb_1K_WGSproj, sum, na.rm=T)
#5321638069 which is different to the value I get with the second approach
> SMOD_MollweideProj
class : RasterLayer
dimensions : 18000, 36082, 649476000 (nrow, ncol, ncell)
resolution : 1000, 1000 (x, y)
extent : -18041000, 18041000, -9e+06, 9e+06 (xmin, xmax, ymin, ymax)
crs : +proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs
values : 10, 30 (min, max)
> SMOD_WGSproj
class : RasterLayer
dimensions : 21600, 43200, 933120000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
values : 10, 30 (min, max)
> POP_WGSproj
class : RasterLayer
dimensions : 21600, 43200, 933120000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
values : 0, 459434.6 (min, max)
Approach 2) Work with SMOD and POP Mollweide projections
# layers as provided by website
SMOD_MollweideProj <- raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif")
POP_MollweideProj <- raster("./GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0.tif")
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural domain".
SMOD_rur_mask_1K <- SMOD_MollweideProj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] = 1
SMOD_urb_mask_1K <- SMOD_MollweideProj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_MollweideProj <- POP_MollweideProj * SMOD_rur_mask_1K
POP_urb_1K_MollweideProj <- POP_MollweideProj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_MollweideProj, sum, na.rm=T)
# 1726372189 which is different to the value I get with the first approach
cellStats(POP_urb_1K_MollweideProj, sum, na.rm=T)
# 5622956252 which is different to the value I get with the first approach
Thank you very much for your suggestions

Following on from the comments above, here's my suggested code for your Approach 2, i.e. working with Mollweide projection for both SMOD and POP. I downloaded data for a single cell rather than the global layer, simply to reduce execution time.
Note I've used raster::mask() to mask POP to urban & rural. With this method there is no need to set a value of 1 in the masks, you can just retain the original values after setting to NA the cells you wish to mask. See ?raster::mask.
library(raster)
smod <- raster("data/GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
pop <- raster("data/GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
# produce rural mask
maskRural <- smod
maskRural[maskRural > 14] <- NA
# produce urban mask
maskUrban <- smod
maskUrban[maskUrban < 20] <- NA
# use raster::mask to produce rural and urban population layers
popRural <- raster::mask(pop, maskRural)
popUrban <- raster::mask(pop, maskUrban)
# check that the total population of rural + urban is equal to the original
sum(pop[], na.rm = TRUE)
sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
The final couple of lines just check that the sum of rural and urban populations equals the original total population. So for the single cell I used the results are:
> sum(pop[], na.rm = TRUE)
[1] 67487835
> sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
[1] 67487835
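The same consistency check can be tried without downloading any data. The following is only a minimal sketch on synthetic rasters: the 10 x 10 grid, the random class and population values, and the names total and split_total are assumptions made for illustration, not GHSL data.

```r
library(raster)

set.seed(1)

# Hypothetical stand-in for SMOD: random rural (10-13) and urban (21-30) classes
smod <- raster(nrows = 10, ncols = 10)
values(smod) <- sample(c(10, 11, 12, 13, 21, 22, 23, 30), ncell(smod), replace = TRUE)

# Hypothetical stand-in for POP on the same grid
pop <- raster(smod)
values(pop) <- round(runif(ncell(pop), 0, 1000))

# Masks keep the original class values; only the unwanted cells become NA
maskRural <- smod
maskRural[maskRural > 14] <- NA
maskUrban <- smod
maskUrban[maskUrban < 20] <- NA

popRural <- mask(pop, maskRural)
popUrban <- mask(pop, maskUrban)

# Every cell is either rural or urban, so the two parts should add up to the total
total <- cellStats(pop, sum)
split_total <- cellStats(popRural, sum) + cellStats(popUrban, sum)
```

If total and split_total differ on real data, some cells carry a class value that neither mask keeps, which is worth checking before comparing projections.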
To project to WGS84 you could do something like this:
popUrbanWgs84 <- projectRaster(popUrban, crs = crs("+init=epsg:4326"), method = "bilinear")
You can also specify a res parameter here, otherwise it will choose one for you, but the question would be what resolution to use? The original resolution of 1km is roughly 0.008 degrees but in terms of longitude this varies across the globe.
My suggestion is to stick to Mollweide if at all possible.

Related

disaggregate raster value to finer cells equally

I have a SpatRaster in the following dimension
layer1
class : SpatRaster
dimensions : 3600, 3600, 1 (nrow, ncol, nlyr)
resolution : 0.0002777778, 0.0002777778 (x, y)
extent : -81.00014, -80.00014, 24.99986, 25.99986 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : nn.tif
name : nn_1
I have another SpatRaster which has data for revenue (in dollar) at a different resolution than layer1
layer2
class : SpatRaster
dimensions : 21600, 43200, 1 (nrow, ncol, nlyr)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source : XX.nc
varname : XX
name : XX
unit : US dollar
What I want to do is to take the average of layer1 using layer2 as weights. This is my approach:
1. Give them the same projection.
layer2 <- project(layer2, crs(layer1))
2. Clip layer2 to match the extent of layer1.
r <- terra::crop(layer2, ext(layer1))
r <- terra::mask(r, layer1) # I get an error in this step
[mask] number of rows and/or columns do not match
3. Suppose step 2 worked. Now I want to disaggregate the clipped layer2 to match the resolution of layer1, i.e. 30 times finer.
This means the revenue of each disaggregated cell should be 1/30 of the revenue of the parent cell.
Is the below the right way to do this?
r_disagg <- disaggregate(r, fact = 30)/30
4. Divide each cell value in r_disagg by the total sum of r_disagg to get a weight raster r_disagg_wt.
5. Multiply r_disagg_wt by layer1 and sum the resulting raster to get the weighted average of layer1.
I am stuck on steps 2 and 3.
When asking an R question, please include example data like this
Example data
library(terra)
r1 <- rast(nrow=360, ncol=360, ext=c(-81.00014, -80.00014, 24.99986, 25.99986))
r2 <- rast(nrow=2160, ncol=4320)
values(r1) <- 1:ncell(r1)
set.seed(0)
values(r2) <- runif(ncell(r2))
Solution
re2 <- resample(r2, r1)
global(r1, fun="mean", weights=re2)
# weighted_mean
#lyr.1 66190.15
In cases where the two input rasters align, it could be preferable to use disagg instead of resample (in this case, the extent of r1 is a bit odd; it seems to be shifted by half a cell).
There is no need to adjust the dollar values if they are only used as weights. In cases where the sum of the cell values should remain constant when using resample or project, you could compute the density ($/km2) before resampling by dividing the values by the area of each cell (see cellSize), and then multiply the values by the area of the new cells after resampling.
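To connect this with the steps in the question, here is a small sketch under stated assumptions. The grids are hypothetical and smaller than the real data, and the names wm_global, wm_manual, coarse and fine are made up for the example. It first writes out the weighted mean by hand (normalise the weights, multiply, sum) next to global(), and then shows that a disaggregation by a factor of 30 creates 30 x 30 = 900 child cells per parent, so an equal split of a total would divide by 900 rather than 30.

```r
library(terra)

# Hypothetical example data, similar in spirit to r1 and r2 above
r1 <- rast(nrows = 60, ncols = 60, xmin = -81, xmax = -80, ymin = 25, ymax = 26)
r2 <- rast(nrows = 180, ncols = 360)
values(r1) <- 1:ncell(r1)
set.seed(0)
values(r2) <- runif(ncell(r2))

# Weighted mean via global(), as in the solution above
re2 <- resample(r2, r1)
wm_global <- global(r1, fun = "mean", weights = re2)[1, 1]

# The same quantity written out: normalise the weights, multiply, sum
w <- re2 / global(re2, "sum")[1, 1]
wm_manual <- global(r1 * w, "sum")[1, 1]

# Separate point: equal split of totals when disaggregating by a factor of 30
coarse <- rast(nrows = 4, ncols = 4, xmin = 0, xmax = 4, ymin = 0, ymax = 4)
values(coarse) <- (1:16) * 900
fine <- disagg(coarse, fact = 30) / 900   # 30 * 30 = 900 children per parent

total_coarse <- global(coarse, "sum")[1, 1]
total_fine <- global(fine, "sum")[1, 1]
```

The equal split treats all child cells as the same size. On a lon/lat grid the cell areas change with latitude, which is why the density approach with cellSize described above is the more careful option when totals matter.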

How to run different formula on different layers in R

I downloaded WorldClim BIO climatic data, which has 16 layers. Layers 1-11 are temperature data; the rest are precipitation data. According to the documentation, I should convert the units of the temperature layers using different conversion factors: layers 1-2 and 5-11 should be divided by 10 to convert to degrees Celsius, and layers 3-4 by 100. To do this, I wrote the following code:
temp1<-clim[[1:2]]/10
temp2 <-clim[[5:11]]/10
temp3<-clim[[3:4]]/100
Then I stack them back in the same order as in the original data:
clim <-stack(temp1,temp3,temp2)
My question is how to apply different formulas to different layers and stack them back in the original order. Is there another way to do these steps?
Thank you!
The easiest way could be to define a vector of "dividing factors" and then simply divide the stack by that vector. This way, you do not need to put the bands back in the "original" order:
library(raster)
a <- raster::raster(ncols = 10, nrows = 10)
a <- raster::init(a, runif)
# create a stack
b <- raster::stack(a,a,a,a,a,a)
# define vector of dividing factors
divs <- c(1,1,10,10,100,100)
# compute
c <- b / divs
c
class : RasterBrick
dimensions : 10, 10, 100, 6 (nrow, ncol, ncell, nlayers)
resolution : 36, 18 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6
min values : 5.919103e-03, 5.919103e-03, 5.919103e-04, 5.919103e-04, 5.919103e-05, 5.919103e-05
max values : 0.99532098, 0.99532098, 0.09953210, 0.09953210, 0.00995321, 0.00995321
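Applied to the layout described in the question, the divisor vector could look like the sketch below. This is an illustration under assumptions: the stack is a hypothetical 16-layer dummy filled with the constant 100, and the factor of 1 for layers 12-16 assumes the precipitation layers need no conversion.

```r
library(raster)

# Hypothetical 16-layer stand-in for the WorldClim stack, every cell set to 100
r <- raster(ncols = 5, nrows = 5)
values(r) <- 100
clim <- stack(lapply(1:16, function(i) r))

# One divisor per layer: layers 1-2 and 5-11 by 10, layers 3-4 by 100,
# layers 12-16 left unchanged (assumption)
divs <- c(10, 10, 100, 100, rep(10, 7), rep(1, 5))
stopifnot(length(divs) == nlayers(clim))

# Division is applied layer by layer, so the layer order is preserved
clim_scaled <- clim / divs
layer_max <- cellStats(clim_scaled, max)
```

Checking that length(divs) equals nlayers(clim) before dividing guards against silent recycling of a divisor vector that is too short.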

Can't change raster's extent

I want to crop an elevation raster to add it to a raster stack. It should be easy: I did this before without problems when adding an ecoregions raster to the same stack. But with the elevation raster it just doesn't work. There are several questions here on Stack Overflow addressing this issue, and I have tried a lot of things...
First of all, we need this:
library(rgdal)
library(raster)
My stack is predictors2:
#Downloading the stack
predictors2_full<-getData('worldclim', var='bio', res=10)
#Cropping it, I don't need the whole world
xmin=-120; xmax=-35; ymin=-60; ymax=35
limits <- c(xmin, xmax, ymin, ymax)
predictors2 <- crop(predictors2_full,limits)
Then I downloaded the terr_ecoregions shapefile here: http://maps.tnc.org/files/shp/terr-ecoregions-TNC.zip
setwd("~/ORCHIDACEAE/Ecologicos/w2/layers/terr-ecoregions-TNC")
ecoreg = readOGR("tnc_terr_ecoregions.shp") # I've loaded...
ecoreg2 <- crop(ecoreg,extent(predictors2)) # cropped...
ecoreg2 <- rasterize(ecoreg2, predictors2) # made the shapefile a raster
predictors4<-addLayer(predictors2,ecoreg2) # and added the raster
# to my stack
With elevation, I just can't. The Digital elevation model is based in GMTED2010, which can be downloaded here: http://edcintl.cr.usgs.gov/downloads/sciweb1/shared/topo/downloads/GMTED/Grid_ZipFiles/mn30_grd.zip
elevation<-raster("w001001.adf") #I've loaded
elevation<-crop(elevation,predictors2) # and cropped
But elevation gets a slightly different extent instead of predictors2's extent:
> extent(elevation)
class : Extent
xmin : -120.0001
xmax : -35.00014
ymin : -60.00014
ymax : 34.99986
>
I tried to make them equal by every means I read about in questions here...
I tried to extend so elevation's ymax would meet predictors2's ymax
elevation<-extend(elevation,predictors2) #didn't work, extent remains the same
I tried the opposite... making predictors2 extent meet elevation's extent... nothing either.
But then I read that
You might not want to play with setExtent() or extent() <- extent(), as you could end with wrong geographic coordinates of your rasters - @ztl, Jun 29 '15
And I tried to get the minimal common extent of my rasters, following @ztl's answer in another extent question, by doing this
# Summing your rasters will only work where they are not NA
r123 = r1+r2+r3 # r123 has the minimal common extent
r1 = crop(r1, r123) # crop to that minimal extent
r2 = crop(r2, r123)
r3 = crop(r3, r123)
For that, first I had to set the resolutions:
res(elevation)<-res(predictors2) #fixing the resolutions... This one worked.
But then, r123 = r1+r2+r3 didn't work:
> r123=elevation+ecoreg2+predictors2
Error in elevation + ecoreg2 : first Raster object has no values
Can anyone give me a hint on this? I really would like to add my elevation to the raster. Funny thing is, I have another stack named predictors1 with the exact same elevation's extent... And I was able to crop ecoreg and add ecoreg to both predictors1 and predictors2... Why can't I just do the same to elevation?
I'm quite new to this world and have run out of ideas... I appreciate any tips.
EDIT: Solution, thanks to @Val
I got to this:
#Getting the factor to aggregate (rasters are multiples of each other)
res(ecoreg2)/res(elevation)
[1] 20 20 #The factor is 20
elevation2<-aggregate(elevation, fact=20)
elevation2 <- crop(elevation2,extent(predictors2))
#Finally adding the layer:
predictors2_eco<-addLayer(predictors2,elevation2,ecoreg)
New problem, though...
I can't write stack to a geotiff
writeRaster(predictors2_eco, filename="cropped_predictors2_eco.tif", options="INTERLEAVE=BAND", overwrite=TRUE)
Error in .checkLevels(levs[[j]], value[[j]]) :
new raster attributes (factor values) should be in a data.frame (inside a list)
I think you're having issues because you're working with rasters of different spatial resolutions. When you crop both rasters to the same extent, their actual extents will still differ slightly, because the cell edges of the two grids do not line up.
So if you want to stack rasters, you need to bring them to the same resolution. Either you disaggregate the raster with the coarser resolution (i.e. increase the resolution by resampling or other methods) or you aggregate the raster with the finer resolution (i.e. decrease the resolution, for instance by taking the mean over n pixels).
Please note that changing the extent or resolution with setExtent(x), extent(x) <-, res(x) <- or similar will NOT work, since you're just changing slots in the raster object, not the actual underlying data.
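A quick way to see why res(x) <- is not a substitute for resampling is the sketch below. It uses a hypothetical 10 x 10 dummy raster; the names has_before and has_after are made up for the example. To my understanding, the raster package drops the cell values when the new resolution changes the number of rows or columns, which would also explain the "first Raster object has no values" error in the question.

```r
library(raster)

# Hypothetical dummy raster with values (default global lon/lat extent)
r <- raster(nrows = 10, ncols = 10)
values(r) <- 1:ncell(r)
has_before <- hasValues(r)

# Assigning a new resolution only redefines the grid geometry (now 5 x 5);
# the old values cannot be mapped onto it and are expected to be dropped
res(r) <- c(72, 36)
has_after <- hasValues(r)

dims_after <- dim(r)[1:2]
```

Comparing has_before and has_after makes the loss of data visible before it surfaces as an error in later raster arithmetic.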
So to bring the rasters to a common resolution, you need to change the data. You can use the functions aggregate, disaggregate and resample (amongst others) for that purpose. But since you're changing data, you need to be clear about what you and the function you use are doing.
The handiest way for you should be resample, which resamples one raster to another so that they match in extent and resolution. This is done using a defined method. By default raster::resample uses bilinear interpolation (method = 'bilinear'), which suits continuous data such as elevation; for categorical data, use nearest neighbour (method = 'ngb') instead. With bilinear interpolation you're actually creating "new measurements", something to be aware of.
If your two resolutions are multiples of each other, you could look into aggregate and disaggregate. In the case of disaggregate you would split a raster cell by a factor to get a higher resolution (e.g. if your first resolution is 10 degrees and your desired resolution is 0.05 degrees, you could disaggregate with a factor of 200, giving you 200 x 200 cells of 0.05 degrees for every 10 degree cell). This method would avoid interpolation.
Here's a little working example:
library(raster)
library(rgeos)
shp <- getData(country='AUT',level=0)
# get centroid for downloading eco and dem data
centroid <- coordinates(gCentroid(shp))
# download 10 arc-minute tmin
ecovar <- getData('worldclim', var='tmin', res=10, lon=centroid[,1], lat=centroid[,2])
ecovar_crop <- crop(ecovar,shp)
# output
> ecovar_crop
class : RasterBrick
dimensions : 16, 46, 736, 12 (nrow, ncol, ncell, nlayers)
resolution : 0.1666667, 0.1666667 (x, y)
extent : 9.5, 17.16667, 46.33333, 49 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : tmin1, tmin2, tmin3, tmin4, tmin5, tmin6, tmin7, tmin8, tmin9, tmin10, tmin11, tmin12
min values : -126, -125, -102, -77, -33, -2, 19, 20, 5, -30, -74, -107
max values : -31, -21, 9, 51, 94, 131, 144, 137, 106, 60, 18, -17
# download SRTM elevation - 90m resolution at the equator
elev <- getData('SRTM',lon=centroid[,1], lat=centroid[,2])
elev_crop <- crop(elev, shp)
# output
> elev_crop
class : RasterLayer
dimensions : 3171, 6001, 19029171 (nrow, ncol, ncell)
resolution : 0.0008333333, 0.0008333333 (x, y)
extent : 9.999584, 15.00042, 46.37458, 49.01708 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : srtm_39_03
values : 198, 3865 (min, max)
# won't work because of different resolutions (stack is equal to addLayer)
ecoelev <- stack(ecovar_crop,elev_crop)
# resample
elev_crop_RS <- resample(elev_crop,ecovar_crop,method = 'bilinear')
# works now
ecoelev <- stack(ecovar_crop,elev_crop_RS)
# output
> ecoelev
class : RasterStack
dimensions : 16, 46, 736, 13 (nrow, ncol, ncell, nlayers)
resolution : 0.1666667, 0.1666667 (x, y)
extent : 9.5, 17.16667, 46.33333, 49 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
names : tmin1, tmin2, tmin3, tmin4, tmin5, tmin6, tmin7, tmin8, tmin9, tmin10, tmin11, tmin12, srtm_39_03
min values : -126.0000, -125.0000, -102.0000, -77.0000, -33.0000, -2.0000, 19.0000, 20.0000, 5.0000, -30.0000, -74.0000, -107.0000, 311.7438
max values : -31.000, -21.000, 9.000, 51.000, 94.000, 131.000, 144.000, 137.000, 106.000, 60.000, 18.000, -17.000, 3006.011
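Because the example above depends on downloads, here is a self-contained variant of the same idea. It is a sketch on synthetic data: the extents, the 6 x 6 and 59 x 61 grids and the object names are assumptions chosen so that the two rasters deliberately do not line up.

```r
library(raster)

set.seed(0)

# Hypothetical coarse "predictor" grid: 6 x 6 cells of 1 degree
coarse <- raster(nrows = 6, ncols = 6, xmn = 0, xmx = 6, ymn = 0, ymx = 6)
values(coarse) <- 1:ncell(coarse)

# Hypothetical fine "elevation" grid: 0.1 degree cells, slightly shifted extent
fine <- raster(nrows = 59, ncols = 61, xmn = 0.03, xmx = 6.13, ymn = -0.02, ymx = 5.88)
values(fine) <- runif(ncell(fine), 200, 3000)

# Stacking the mismatched rasters fails
direct <- try(stack(coarse, fine), silent = TRUE)
direct_failed <- inherits(direct, "try-error")

# Resample the fine raster onto the coarse grid, then stack
fine_rs <- resample(fine, coarse, method = "bilinear")
same_grid <- compareRaster(coarse, fine_rs)
s <- stack(coarse, fine_rs)
```

compareRaster is a convenient check before stacking, since it reports whether extent, dimensions and CRS agree.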

Resample raster

I am trying to resample a forest cover raster with high resolution (25 meters) and categorical data (1 to 13) to a new RasterLayer with a lower resolution (~ 1 km). My idea is to combine the forest cover data with other lower-resolution raster data :
I tried raster::resample(), but since the data is categorical I lost a lot of information:
summary(as.factor(df$loss_year_mosaic_30m))
0 1 2 3 4 5 6 7 8 9 10 11 12 13
3777691 65 101 50 151 145 159 295 291 134 102 126 104 91
As you can see, the new raster has the desired resolution but has lots of zeros as well. I suppose that is normal since I used the 'ngb' option in resample.
The second strategy was using raster::aggregate(), but I find it difficult to define an integer factor since the change of resolution is not straightforward (like doubling the cell size or similar).
My high-resolution raster has the following resolution, and I want to aggregate it to a 0.008333333, 0.008333333 (x, y) resolution over the same extent.
loss_year
class : RasterLayer
dimensions : 70503, 59566, 4199581698 (nrow, ncol, ncell)
resolution : 0.00025, 0.00025 (x, y)
extent : -81.73875, -66.84725, -4.2285, 13.39725 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : /Volumes/LaCie/Deforestacion/Hansen/loss_year_mosaic_30m.tif
names : loss_year_mosaic_30m
values : 0, 13 (min, max)
I have tried a factor of ~33.33 following the description of the aggregate help: "The number of cells is the number of cells of x divided by fact*fact (when fact is a single number)." Nonetheless, the resulting raster data do not seem to have the same number of rows and columns as my other low-resolution rasters.
I have never used such high-resolution data, and I am also computationally limited (some of these commands can be parallelized using clusterR, but sometimes they take as long as the non-parallelized commands, especially since clusterR does not work for nearest neighbor calculations).
I am short of ideas; maybe I can try layerize to obtain a count raster, but then I have to aggregate and the factor problem arises again. Since these processes are taking me days to run, I want to know the most efficient way to create a lower-resolution raster without losing much information.
A reproducible example could be the following:
r_hr <- raster(nrow=70, ncol=70) #High resolution raster with categorical data
set.seed(0)
r_hr[] <- round(runif(1:ncell(r_hr), 1, 5))
r_lr <- raster(nrow=6, ncol=6) #Low resolution raster
First strategy: loss of information
r <- resample(r_hr, r_lr, method = "ngb") #The raster data is categorical
Second strategy: difficult to define an aggregate factor
r <- aggregate(r_hr, factor) #How to define a factor to get exactly the same number of cells as r_lr?
Another option: layerize
r_brick <- layerize(r_hr)
aggregate(r_brick, factor) #How to define factor to coincide with the r_lr dimensions?
Thanks for your help!
r_hr <- raster(nrow=70, ncol=70) #High resolution raster with categorical data
set.seed(0)
r_hr[] <- round(runif(1:ncell(r_hr), 1, 5))
r_lr <- raster(nrow=6, ncol=6)
r_hr
#class : RasterLayer
#dimensions : 70, 70, 4900 (nrow, ncol, ncell)
#resolution : 5.142857, 2.571429 (x, y)
#extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
#data source : in memory
#names : layer
#values : 1, 5 (min, max)
r_lr
#class : RasterLayer
#dimensions : 6, 6, 36 (nrow, ncol, ncell)
#resolution : 60, 30 (x, y)
#extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
Direct aggregate is not possible, because 70/6 is not an integer.
dim(r_hr)[1:2] / dim(r_lr)[1:2]
#[1] 11.66667 11.66667
Nearest neighbor resampling is not a good idea either as the results would be arbitrary.
Here is a by layer approach that you suggested and dww also showed already.
b <- layerize(r_hr)
fact <- round(dim(r_hr)[1:2] / dim(r_lr)[1:2])
a <- aggregate(b, fact)
x <- resample(a, r_lr)
Now you have proportions. If you want a single class you could do
y <- which.max(x)
In that case, another approach would be to aggregate the classes
ag <- aggregate(r_hr, fact, modal)
agx <- resample(ag, r_lr, method='ngb')
Note that agx and y are the same. But they can both be problematic as you might have cells with 5 classes with each about 20%, making it rather unreasonable to pick one winner.
It is pretty standard practice to aggregate land cover maps into layers of %cover, i.e. you should aim to produce 13 layers, each holding something like the %cover of one class in that grid cell. Doing this allows you to reduce the resolution while retaining as much information as possible. N.B. if you require a different summary statistic than %, it should be easy to adapt the following method to whatever statistic you want by changing the fun = function in aggregate.
The following method is pretty fast (it takes just a few seconds on my laptop to process raster with 100 million cells):
First, let's create some dummy rasters to use
Nhr <- 1e4 # number of rows and columns of the high-res raster
Nlr <- 333 # number of rows and columns of the low-res raster
r.hr <- raster(ncols=Nhr, nrows=Nhr)
r.lr <- raster(ncols=Nlr, nrows=Nlr)
r.hr[] <- sample(1:13, Nhr^2, replace=T)
Now, we begin by aggregating the high res raster to almost the same resolution as the low res one (to nearest integer number of cells). Each resulting layer contains the fraction of area within that cell in which value of original raster is N.
Nratio <- as.integer(Nhr/Nlr) # ratio of high to low resolutions, to nearest integer value for aggregation
layer1 <- aggregate(r.hr, Nratio, fun=function(x, na.rm=T) {mean(x==1, na.rm=na.rm)})
layer2 <- aggregate(r.hr, Nratio, fun=function(x, na.rm=T) {mean(x==2, na.rm=na.rm)})
And finally, resample the aggregated rasters to the desired resolution
layer1 <- resample(layer1, r.lr, method = "ngb")
layer2 <- resample(layer2, r.lr, method = "ngb")
Repeat for each class, and build your layers into a stack or a multi-band raster.
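The repetition over classes can be written as a loop. The sketch below is one possible way to do it, not the only one; it uses much smaller dummy rasters than above (120 and 11 cells per side are assumptions chosen to keep it fast), and the names cover_list, cover and cover_sum are made up for the example.

```r
library(raster)

set.seed(42)

# Small hypothetical rasters so the sketch runs quickly
Nhr <- 120   # rows and columns of the high-res raster
Nlr <- 11    # rows and columns of the low-res raster
r.hr <- raster(ncols = Nhr, nrows = Nhr)
r.lr <- raster(ncols = Nlr, nrows = Nlr)
r.hr[] <- sample(1:13, Nhr^2, replace = TRUE)

Nratio <- as.integer(Nhr / Nlr)   # nearest integer aggregation factor
classes <- 1:13

# One fractional-cover layer per class: aggregate first, then resample
cover_list <- lapply(classes, function(k) {
  frac <- aggregate(r.hr, Nratio,
                    fun = function(x, na.rm = TRUE) mean(x == k, na.rm = na.rm))
  resample(frac, r.lr, method = "ngb")
})

cover <- stack(cover_list)
names(cover) <- paste0("class_", classes)

# Sanity check: the class fractions in each cell should add up to 1
cover_sum <- sum(cover)
```

Because every layer is resampled with the same nearest neighbor mapping, the fractions in each output cell still come from a single aggregated cell and therefore sum to 1.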

Converting shapefile to raster

I'm having an issue rasterizing a shapefile to produce points on a 0.5*0.5 grid. The shapefile represents classifications of risk level (Low-0, Medium-100, High-1000, Very High-1500) of global coral reefs to integrated threats.
I pulled the code from another example that works fine, but when I try it for my data I get nothing from the plot function. See below for the link to the shapefile and my code:
Reefs At Risk: Global Integrated Threats
# Read shapefile into R
library(rgdal)
library(raster)
int.threat.2030 <- readOGR(dsn = "Global_Threats/Integrated_Future",
layer = "rf_int_2030_poly")
## Set up a raster "template" for a 0.5 degree grid
ext <- extent(-110, -50, 0, 35)
gridsize <- 0.5
r <- raster(ext, res=gridsize)
## Rasterize the shapefile
rr <- rasterize(int.threat.2030, r)
## Plot raster
plot(rr)
Any ideas where I might be going wrong? Is it an issue with the shapefile itself?
Please and thanks!
You assumed that the polygons were in lon/lat coordinates, but they are not:
library(raster)
library(rgdal)
p <- shapefile('Global_Threats/Integrated_Future/rf_int_2030_poly.shp')
p
#class : SpatialPolygonsDataFrame
#features : 63628
#extent : -18663508, 14601492, -3365385, 3410115 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=cea +lon_0=-160 +lat_ts=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
#variables : 3
#names : ID, THREAT, THREAT_TXT
#min values : 1, 0, Critical
#max values : 63628, 2000, Very High
You can either change the projection
pgeo <- spTransform(p, CRS('+proj=longlat +datum=WGS84'))
and then do something like:
ext <- floor(extent(pgeo))
rr <- raster(ext, res=0.5)
rr <- rasterize(pgeo, rr, field=1)
Or keep the original CRS and do something like:
ext <- extent(p)
r <- raster(ext, res=50000)
r <- rasterize(p, r, field=1)
plot(r)
Note that you are rasterizing very small polygons to large raster cells. A polygon is considered 'inside' if it covers the center of a cell (i.e. assuming a case where polygons cover multiple cells). So for these data you would need to use a much higher resolution (and then perhaps aggregate the results). Alternatively you could rasterize polygon centroids.
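The cell-center rule can be illustrated with a toy case. This is a sketch under assumptions: the polygon is a made-up 0.2 x 0.2 degree square, and the two grids (1 degree and 0.05 degree cells) are hypothetical, chosen so that the square misses every coarse cell center.

```r
library(raster)

lonlat <- "+proj=longlat +datum=WGS84"

# Hypothetical small square polygon that covers no center of a 1 x 1 degree cell
p <- spPolygons(rbind(c(0.1, 0.1), c(0.3, 0.1), c(0.3, 0.3), c(0.1, 0.3), c(0.1, 0.1)),
                crs = lonlat)

# Coarse grid: cell centers at 0.5 and 1.5, all outside the polygon
coarse <- raster(xmn = 0, xmx = 2, ymn = 0, ymx = 2, res = 1, crs = lonlat)
r_coarse <- rasterize(p, coarse, field = 1)
n_coarse <- sum(!is.na(values(r_coarse)))

# Fine grid: several cell centers fall inside the polygon
fine <- raster(xmn = 0, xmx = 2, ymn = 0, ymx = 2, res = 0.05, crs = lonlat)
r_fine <- rasterize(p, fine, field = 1)
n_fine <- sum(!is.na(values(r_fine)))
```

Here the coarse result contains no cells for the polygon while the fine one does, which is the situation described above for small reef polygons on a 0.5 degree grid.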
But none of the above is relevant really, as you are doing this all backwards. The polygons are clearly derived from a raster (look how blocky they are) and the raster is available in the dataset you point to!
So instead of rasterizing, do:
x <- raster('Global_Threats/Integrated_Future/rf_int_2030')
x
#class : RasterLayer
#dimensions : 25456, 80150, 2040298400 (nrow, ncol, ncell)
#resolution : 500, 500 (x, y)
#extent : -20037508, 20037492, -6363885, 6364115 (xmin, xmax, ymin, ymax)
#coord. ref. : NA
#data source : C:\temp\Global_Threats\Integrated_Future\rf_int_2030
#names : rf_int_2030
#values : 0, 2000 (min, max)
#attributes :
# ID COUNT THREAT_TXT
# 0 80971 Low
# 100 343535 Medium
# 1000 322231 High
# 1500 168518 Very High
# 2000 83598 Critical
Here plotting a part of Palawan:
e <- extent(c(-8990636, -8929268, 1182946, 1256938))
plot(x, ext=e)
plot(p, add=TRUE)
If you need a lower resolution see raster::aggregate. For a different coordinate reference system, see raster::projectRaster.
