Project MaxENT fitted model into geographic space - r

A similar question was answered here; however, my problem is a bit different, so I cannot apply that solution. I have fitted a Maxent model using the site-with-data format. The problem is that I cannot project the fitted model. The output of m1 in the path D:/maxent looks fine. I suspect the two errors below are related to rJava, but I don't know the solution. Please see my code below:
> m1 <- maxent(x = d, p = id, path = "D:/maxent",
args = c("-P", "noautofeature", "nolinear", "noquadratic", "nothreshold",
"noproduct", "betamultiplier=1", "replicates=10", "crossvalidate"))
# here d is a data frame with 11213 rows and 20 numeric predictor columns; id is a numeric vector of 1s and 0s (species presence and absence)
Loading required namespace: rJava
> plot(m1, xlim=c(0,100))
Error in as.double(y) :
cannot coerce type 'S4' to vector of type 'double'
> ras <- raster("E:/bio12.tif") # raster to project the fitted model 'm1'
> pred.m1 <- raster::predict(m1, ras)
Error in .local(object, ...) : missing layers (or wrong names)
Here are the properties of the raster file:
> ras
class : RasterLayer
dimensions : 4292, 4936, 21185312 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : 112.8917, 154.025, -43.75833, -7.991667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : E:/Predictors_grasshoppers/selected.predictors/bio12.tif
names : bio12
values : 79, 7625 (min, max)
Update: I have tried using single quotes in the maxent call, and the problem is still there.
> m2 <- maxent(x = d, p = id, path = 'D:/PhD related/Historic climate data Australia/maxent2',
args = c('-P', 'noautofeature', 'nolinear', 'noquadratic', 'nothreshold',
'noproduct', 'betamultiplier=1', 'replicates=10', 'crossvalidate'))

Following the comment by @Bappa Das, I found the solution. To project a fitted Maxent model into geographic space, use a raster stack (not a single raster) containing the variables that were used to fit the model. The names and order of the predictors in the stack should be the same as in the fitted model.
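As a rough sketch of that check, the snippet below builds a small stack and reorders its layers to match the model's predictor names before prediction. The layer names, the two dummy rasters, and the model_vars vector are assumptions for illustration; in practice they would come from the GeoTIFFs and from colnames(d).

```r
library(raster)

# Hypothetical stand-ins for two predictor layers; real layers would be
# read from the same GeoTIFFs that were used to build the data frame `d`.
bio1  <- raster(matrix(runif(100), 10, 10))
bio12 <- raster(matrix(runif(100), 10, 10))
predictors <- stack(bio12, bio1)
names(predictors) <- c("bio12", "bio1")

# Predictor names as used when fitting (assumed; normally colnames(d)).
model_vars <- c("bio1", "bio12")

# Make sure every model variable has a layer, then put layers in model order.
missing_vars <- setdiff(model_vars, names(predictors))
stopifnot(length(missing_vars) == 0)
predictors <- predictors[[model_vars]]
names(predictors)

# With a fitted model `m1`, the projection would then be:
# pred.m1 <- predict(m1, predictors)
```

The predict call is left commented out because it needs the fitted model and the full set of 20 layers.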

Related

stars::read_stars and raster::raster have different projections when reading the same .nc file

I'm trying to read a .nc raster file in R. The older function raster::raster() reads the data perfectly fine. I'd like to reproduce the results using a newer function stars::read_stars(), but somehow it does not work for me. The data CHAP_PM2.5_Y1K_2020_V4.nc can be downloaded from here (~6.1 MB). Below is a minimal reproducible example:
library(stars)
library(raster)
pm_raster = raster::raster('CHAP_PM2.5_Y1K_2020_V4.nc')
pm_stars = stars::read_stars('CHAP_PM2.5_Y1K_2020_V4.nc')
# Warning messages:
# 1: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
# GDAL Message 1: The dataset has several variables that could be identified as vector fields,
# but not all share the same primary dimension. Consequently they will be ignored.
# 2: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
# GDAL Message 1: The dataset has several variables that could be identified as vector fields,
# but not all share the same primary dimension. Consequently they will be ignored.
Reading the file appears to work, but when I plot the two objects, the one read by stars::read_stars looks wrong:
plot(pm_raster, main = 'raster::raster')
plot(pm_stars, main = 'stars::read_stars')
It looks like the projection is wrong for stars::read_stars(), but I have no idea why. Any suggestion or comment would be appreciated.
This is not a full answer, but it shows slightly different read options. It looks like stars::read_ncdf is the better choice here.
library(terra)
library(stars)
pm_terra <- rast('~/Downloads/CHAP_PM2.5_Y1K_2020_V4.nc')
pm_terra
class : SpatRaster
dimensions : 3571, 6148, 1 (nrow, ncol, nlyr)
resolution : 0.009999999, 0.01 (x, y)
extent : 73.45216, 134.9322, 17.9697, 53.67971 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source : CHAP_PM2.5_Y1K_2020_V4.nc
varname : PM2.5
name : PM2.5
unit : µg/m3
plot(flip(pm_terra))
# and then using stars::read_ncdf
pm_stars_ncdf <- read_ncdf('~/Downloads/CHAP_PM2.5_Y1K_2020_V4.nc')
no 'var' specified, using PM2.5
other available variables:
lat, lon
Will return stars object with 21954508 cells.
Warning message:
In .get_nc_projection(meta$attribute, rep_var, cv) :
No projection information found in nc file.
[Figure: plot of flip(pm_terra)]
[Figure: plot of the stars::read_ncdf result]

calculating centroid of raster

I have a list (s) containing information on the probable locations of many animals in South America. For example, this is the type of stored information and what it looks like when plotted for the first individual.
Example:
> s[1]
[[1]]
class : RasterLayer
dimensions : 418, 313, 130834 (nrow, ncol, ncell)
resolution : 0.16666, 0.16666 (x, y)
extent : -86.333, -34.16842, -55.91633, 13.74755 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : in memory
names : layer
values : 0, 1 (min, max)
> plot(s[[1]])
Note: the green areas are all "likely" locations and the grey areas are "unlikely" locations.
I would like to calculate the centroid of this probable location (i.e., centroid of the green area).
Below, @dww suggested the following solution, which works for the simulated data but leads to an error message with my data.
colMeans(xyFromCell(s, which(s[]==1)))
Error in xyFromCell(s[1], which(s[] == 1)) :
trying to get slot "extent" from an object of a basic class ("list") with no slots
To find the centroid of the cells where a raster r has the value 1, you can use
colMeans(xyFromCell(r, which(r[]==1)))
Essentially, the centroid is at the mean of the latitudes/longitudes of the subsetted locations.
Here's some reproducible dummy data to test on:
r <- raster(matrix(sample(0:1, 10000, replace = TRUE), nrow = 100, ncol = 100))
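The error in the question arises because s is a list of rasters, while xyFromCell expects a single raster. A minimal sketch of applying the same calculation to each list element follows; the two dummy rasters and the helper name centroid_of_ones are assumptions for illustration.

```r
library(raster)

set.seed(1)
# Hypothetical list of two binary rasters standing in for `s`.
s <- list(
  raster(matrix(sample(0:1, 100, replace = TRUE), 10, 10)),
  raster(matrix(sample(0:1, 100, replace = TRUE), 10, 10))
)

# Centroid (mean x, mean y) of the cells equal to 1 in a single raster.
centroid_of_ones <- function(r) {
  colMeans(xyFromCell(r, which(r[] == 1)))
}

# One row per raster, with columns x and y.
centroids <- t(sapply(s, centroid_of_ones))
centroids
```

For a single element, the equivalent call would use double brackets, e.g. centroid_of_ones(s[[1]]), since s[1] is still a list.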

How to get values for a pixel from a geoTIFF in R?

I'm trying to get RGB components from a geoTIFF file in R. The colours on the image correspond to different land classification types and I have a legend for each classification type in RGB components.
I'm using the raster library. My code so far is
library(raster)
my.map = raster("mygeoTIFFfile.tif")
Here is the information on the file once it has been read in:
> my.map[[1]]
class : RasterLayer
dimensions : 55800, 129600, 7231680000 (nrow, ncol, ncell)
resolution : 0.002777778, 0.002777778 (x, y)
extent : -180.0014, 179.9986, -64.99861, 90.00139 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : filepah/filename.tif
names : filename.tif
values : 11, 230 (min, max)
The specific geoTIFF file I'm working on can be found here:
http://due.esrin.esa.int/page_globcover.php
(just click on "Globcover2009_V2.3_Global_.zip")
Can someone please help me get the value at a single pixel location from this file?
The rasterToPoints() function will convert your raster data to a matrix containing x, y, and value for each point. This will be very large, but may be what you're looking for if you want to do a broad analysis of the data.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
data <- rasterToPoints(map, progress="text")
head(data)
Another option is to use the extract() function to return a single point by passing a SpatialPoints object with latitude/longitude. If you only want a few individual data points, this will be a lot faster than loading the entire thing into a matrix.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
extract(map, SpatialPoints(cbind(-123.3680884, 48.4252848)))
It seems that you are asking the wrong question.
To get a value for a single pixel (grid cell), you can use indexing. For example, for cell numbers 10,000 and 10,001 you can use r[10000:10001].
You could get all values by doing values(r). But that will fail for a very large raster like this (unless you have lots of RAM).
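As a small illustration of the indexing options, the sketch below uses a coarse dummy raster (an assumption, standing in for the GlobCover file) whose values are simply its cell numbers, so the lookups are easy to verify.

```r
library(raster)

# Hypothetical 10-degree global raster; each cell holds its own cell number.
r <- raster(nrow = 18, ncol = 36)
values(r) <- 1:ncell(r)

# By cell number
r[10]

# By row and column
r[2, 5]

# By coordinate: find the cell that contains the point, then index it
cell <- cellFromXY(r, cbind(-123.3680884, 48.4252848))
r[cell]
```

cellFromXY expects x (longitude) first and y (latitude) second, the same order as in the extract() example above.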
However, the question you need answered, it seems, is how to make a map by matching integer cell values with RGB colors.
Let's set up an example raster
library(raster)
r <- raster(nrow=4, ncol=4)
values(r) <- rep(c(11, 14, 20, 30), each=4)
And some matching RGB values
legend <- read.csv(text="Value,Label,Red,Green,Blue
11,Post-flooding or irrigated croplands (or aquatic),170,240,240
14,Rainfed croplands,255,255,100
20,Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%),220,240,100
30,Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%) ,205,205,102")
Compute the color code
legend$col <- rgb(legend$Red, legend$Green, legend$Blue, maxColorValue=255)
Set up a "color table"
# start with white for all values (1 to 255)
ct <- rep(rgb(1,1,1), 255)
# fill in where necessary
ct[legend$Value+1] <- legend$col
colortable(r) <- ct
Plot the raster
plot(r)
You can also try:
tb <- legend[, c('Value', 'Label')]
colnames(tb)[1] = "ID"
tb$Label <- substr(tb$Label, 1,10)
levels(r) <- tb
library(rasterVis)
levelplot(r, col.regions=legend$col, at=0:length(legend$col))

Random sampling from large rasterlayer

I have a large Rasterlayer with integers ranging from 0 to 44.
class : RasterLayer
dimensions : 29800, 34470, 1027206000 (nrow, ncol, ncell)
resolution : 10, 10 (x, y)
extent : 331300, 676000, 5681995, 5979995 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=32 +ellps=GRS80 +units=m +no_defs
data source : /home/mkoehler/stk_rast_whz
names : stk_rast_whz
values : 0, 44 (min, max)
I want to do a stratified sampling of 5000 points per stratum.
I get the following error:
POINTS<-sampleStratified(b, size=5000, na.rm=T, xy=F)
(Error in ys[[i]] <- y : attempt to select less than one element)
Here is code that reproduces the problem (even when selecting only 1 item per stratum):
set.seed(10)
r <- raster(ncol=5000, nrow=5000)
names(r) <- 'stratum'
r[] <- round((runif(ncell(r)))*44)
sampleStratified(r, size=1,xy=T)
Error in ys[[i]] <- y : attempt to select less than one element
Trying that with fewer strata, or changing the settings of "size" or "exp", has no effect.
R version: [64-bit] C:\Program Files\R\R-3.1.1
Any ideas?
thanks in advance!
This appears to be a bug (as of raster 2.3-12). It occurs when (1) your raster contains cells with value 0, and (2) the raster can't be processed in memory (i.e. canProcessInMemory(r) is FALSE).
The function loops over the unique cell values produced by freq(r), and then indexes a list by each of these values in turn. If one of those values is zero, the error will be triggered since the 0th element does not exist. For example:
list()[[0]]
# Error in list()[[0]] : attempt to select less than one element
You'll notice that the error doesn't occur if you fill r with, e.g., r[] <- sample(44, ncell(r), replace=TRUE), since it won't have any zeroes.
When the raster can be processed in memory, the function loops over the row numbers of freq(r), and so the subsequent list indexing is sensible.
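The difference between the two loops can be reproduced in base R without the raster package. This is only a sketch of the indexing pattern described above, not the package's actual code; the frequency table f is a made-up stand-in for the output of freq(r).

```r
# Hypothetical frequency table shaped like freq(r) output:
# first column = cell value, second column = count.
f <- cbind(value = c(0, 1, 2), count = c(5, 7, 3))

# Looping over row numbers works, even though one of the values is 0.
ys <- list()
for (i in seq_len(nrow(f))) {
  ys[[i]] <- f[i, "value"]
}
length(ys)

# Looping over the values themselves fails at once, because the first
# value is 0 and a list has no 0th element.
ys_bad <- list()
res <- try(for (i in f[, "value"]) ys_bad[[i]] <- i, silent = TRUE)
inherits(res, "try-error")
```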
I've contacted the maintainer to report this bug.
Meanwhile, as a temporary fix, you could use something like the following to make a corrected copy of the function (which will remain available in the current R session).
sampleStratified2 <-
eval(parse(text=sub('sr\\[, 2\\] == i', 'sr[, 2] == f[i, 1]',
sub('i in f\\[, 1\\]', 'i in seq_len(nrow(f))',
deparse(getMethod(sampleStratified,
signature='RasterLayer')@.Data))
)))
sampleStratified2(r, size=1, xy=TRUE)

Getting Data out of raster file in R

I'm new to raster files, but they seem to be the best way to open up the large gov't files that have all the weather data, so I'm trying to figure out how to use them. For reference, I'm downloading the files located here (just some run of the mill weather stuff). When I use the raster package of R to import the file like this
> r <- raster("/path/to/file.grb")
Everything works fine. I can even get a little metadata when I type in
> r
class : RasterLayer
band : 1 (of 37 bands)
dimensions : 224, 464, 103936 (nrow, ncol, ncell)
resolution : 0.125, 0.125 (x, y)
extent : -125.0005, -67.0005, 25.0005, 53.0005 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +a=6371200 +b=6371200 +no_defs
data source : /path/to/file.grb
names : NLDAS_MOS0125_H.A20140629.0100.002
All I've managed to do at this point is index the raster in a very obvious way.
> r[100,100]
267.1
So, I guess I can "index" it, but I have no idea what the number 267.1 means. It's certainly not all there is in the cell. There should be a bunch of variables including, but not limited to, soil moisture, surface runoff, and evaporation.
How can I access this information in the same way using R?
# create two rasters
r1 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
r2 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
# creates a raster stack -- the stack (or brick) function allows you
# to use multi-band rasters
# http://www.inside-r.org/packages/cran/raster/docs/stack
st_r <- stack(r1, r2)
# extract values -- will create a matrix with 100 rows and two columns
vl <- getValues(st_r)
r <- raster("/path/to/file.grb")
values <- getValues(r)
You can read about the function here:
http://www.inside-r.org/packages/cran/raster/docs/values
I believe that the problem is that you are using raster and not stack. The raster function results in a single layer (matrix) whereas stack or brick read an array with all of the raster layers. Here is an example that demonstrates extracting values using an [i,j,z] index.
library(raster)
setwd("D:/TMP")
download.file("ftp://hydro1.sci.gsfc.nasa.gov/data/s4pa/NLDAS/NLDAS_MOS0125_H.002/2014/180/NLDAS_MOS0125_H.A20140629.0000.002.grb",
destfile="NLDAS_MOS0125_H.A20140629.0000.002.grb", mode="wb")
r <- stack("NLDAS_MOS0125_H.A20140629.0000.002.grb")
names(r) <- paste0("L", seq_len(nlayers(r)))
class(r)
# Values for [i,j]
i=100
j=100
r[i,j]
# Values for i,j and z at layer(s) 1, 5 and 10
z=c(1,5,10)
r[i,j][z]
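For readers without the GRIB file, the same access patterns can be tried on a small in-memory stack. The three random layers and the layer names below are assumptions standing in for the 37-band file.

```r
library(raster)

# Hypothetical three-layer stack standing in for the multi-band GRIB file.
set.seed(42)
layers <- lapply(1:3, function(k) raster(matrix(runif(100), 10, 10)))
st <- stack(layers)
names(st) <- paste0("L", seq_len(nlayers(st)))

# Values of all layers at row 4, column 7 (one value per layer).
st[4, 7]

# A single layer selected by name, then the same cell.
st[["L2"]][4, 7]

# Every cell of every layer as a matrix: ncell rows, nlayers columns.
vals <- getValues(st)
dim(vals)
```

Selecting the layer first, as in st[["L2"]][4, 7], avoids reading cell values from layers that are not needed.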

Resources