How can I subset a raster by conditional statement in R using `terra`? - r

I am trying to plot only certain values from a categorical land cover raster I am working with. I have loaded it into R using the terra package and it plots fine. However, since the original data did not come with a legend, I am trying to find out which raster value corresponds to which class on the map.
Similar to the answer provided here: How to subset a raster based on grid cell values
I have tried using the following line:
> landcover
class : SpatRaster
dimensions : 20057, 63988, 1 (nrow, ncol, nlyr)
resolution : 0.0005253954, 0.0005253954 (x, y)
extent : -135.619, -102, 59.99989, 70.53775 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : spat_n5WpgzBuVAV3Ijm.tif
name : CAN_LC_2015_CAL_wgs
min value : 1
max value : 18
> plot(landcover[landcover == 18])
Error: cannot allocate vector of size 9.6 Gb
However, this line takes a very long time to run and then fails with a memory allocation error, even though the object is only 1.3 KB in the global environment and the original tif is about 300 MB.

You can use cats to find out which values correspond to which categories.
library(terra)
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE) - 1
cls <- c("forest", "water", "urban")
levels(r) <- cls
names(r) <- "land cover"
cats(r)[[1]]
# ID category
#1 0 forest
#2 1 water
#3 2 urban
To plot a logical (Boolean) layer for one category, you can do
plot(r == "water")
And from the above you can see that, in this case, that is equivalent to
plot(r == 1)

I think I found the solution: write the condition directly inside the plot call, as below:
plot(landcover == 18)
For those looking for a reproducible example, just load the R logo that ships with terra:
s <- rast(system.file("ex/logo.tif", package="terra"))
s <- s$red
plot(s == 255)
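The difference between the two calls in this thread can be sketched as follows. Indexing a raster with a condition pulls cell values into R's memory, while a comparison such as s == 255 returns another raster. If the goal is to keep the original values of one class and blank out everything else, terra's ifel() is one way to do it. This is a minimal sketch on the logo file used above; the name only255 is made up for illustration.

```r
library(terra)

# Small example raster that ships with terra
s <- rast(system.file("ex/logo.tif", package = "terra"))$red

# Logical raster: TRUE where the cell value is 255, FALSE elsewhere
is255 <- s == 255

# Keep the original value where the condition holds, NA everywhere else
only255 <- ifel(s == 255, s, NA)

plot(only255)
```

For a large file such as the land cover raster in the question, the same call would be landcover == 18 or ifel(landcover == 18, landcover, NA), so the raster is never converted to a plain vector.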

Related

Separating raster by land use attribute in R

I am new to R and trying to extract a subset of values from a raster file. I am using the Ontario Land Cover Compilation (OLCC) v.2.0 and want to only extract impervious cover values within my buffer regions. According to the Data Specifications there are classification names for land use classes and associated codes. I only want to extract data from the Community/Infrastructure name (code 27). I have uploaded the entire raster into R. Is there a way to separate the raster by code name/class? If I get the separated raster subset I know how to extract within my buffer region from there.
I have tried the raster brick function to see if it would recognize the code names and separate them into different layers automatically but this didn't work. I saw another post where raster attributes were extracted by class, but I am not sure how the land use classes are being separated and defined in R here.
Here is some example data
library(terra)
#terra 1.5.6
set.seed(0)
x <- rast(nrows=10, ncols=10, names="cover")
values(x) <- sample(3, ncell(x), replace=TRUE) - 1
levels(x) <- c("forest", "water", "urban")
Inspect
x
#class : SpatRaster
#dimensions : 10, 10, 1 (nrow, ncol, nlyr)
#resolution : 36, 18 (x, y)
#extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84
#source : memory
#name : cover
#min value : forest
#max value : water
levels(x)[[1]]
#[1] "forest" "water" "urban"
cats(x)[[1]]
# ID category
#1 0 forest
#2 1 water
#3 2 urban
So if you were interested in extracting the "urban" areas only, you can see that the ID (cell value) for that class is 2. And you can do
urban <- x == 2
plot(urban)
text(x)
It should also have been possible to do
urb <- x == "urban"
But in some versions the result may be offset by one class. To use this safely, you need terra 1.5-7 (currently the development version).
Also, if there are multiple categories, you may first need to activate the category you are interested in; like so:
activeCat(x) <- "cover"
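If comparing by label is not reliable in the installed terra version, the ID can be looked up from the category table and used in a numeric comparison instead. The sketch below rebuilds the example data, but sets the levels with an explicit data.frame (an assumption made here to avoid depending on how a plain character vector is handled). The names lut and id are illustrative.

```r
library(terra)

set.seed(0)
x <- rast(nrows = 10, ncols = 10, names = "cover")
values(x) <- sample(3, ncell(x), replace = TRUE) - 1

# Explicit ID/label table for the categories
levels(x) <- data.frame(ID = 0:2, cover = c("forest", "water", "urban"))

# Look up the cell value (first column) that belongs to the label "urban" (second column)
lut <- cats(x)[[1]]
id <- lut[[1]][lut[[2]] == "urban"]

# Numeric comparison gives a logical raster for that class
urban <- x == id

# Number of urban cells
global(urban, "sum")
```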

How to get values for a pixel from a geoTIFF in R?

I'm trying to get RGB components from a geoTIFF file in R. The colours on the image correspond to different land classification types and I have a legend for each classification type in RGB components.
I'm using the raster library. My code so far is
library(raster)
my.map = raster("mygeoTIFFfile.tif")
Here is the information on the file once it has been read in:
> my.map[[1]]
class : RasterLayer
dimensions : 55800, 129600, 7231680000 (nrow, ncol, ncell)
resolution : 0.002777778, 0.002777778 (x, y)
extent : -180.0014, 179.9986, -64.99861, 90.00139 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : filepah/filename.tif
names : filename.tif
values : 11, 230 (min, max)
The specific geoTIFF file I'm working on can be found here:
http://due.esrin.esa.int/page_globcover.php
(just click on "Globcover2009_V2.3_Global_.zip")
Can someone please help me get the value at a single pixel location from this file?
The rasterToPoints() function will convert your raster data to a matrix containing x, y, and value for each point. This will be very large, but may be what you're looking for if you want to do a broad analysis of the data.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
data <- rasterToPoints(map, progress="text")
head(data)
Another option is to use the extract() function to return a single point by passing a SpatialPoints object with latitude/longitude. If you only want a few individual data points, this will be a lot faster than loading the entire thing into a matrix.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
extract(map, SpatialPoints(cbind(-123.3680884, 48.4252848)))
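A small in-memory sketch of the same lookup, since the GlobCover file is large. It assumes a 4 x 4 global raster with made-up values, and shows that extract() with a coordinate matrix gives the same result as converting the coordinates to a cell number and indexing with it.

```r
library(raster)

# 4 x 4 raster covering the globe; the values are just the cell numbers
r <- raster(nrow = 4, ncol = 4)
values(r) <- 1:ncell(r)

# One location as a (longitude, latitude) matrix
xy <- cbind(-123.3680884, 48.4252848)

# Option 1: let extract() do the lookup
v1 <- extract(r, xy)

# Option 2: find the cell number first, then index the raster with it
cell <- cellFromXY(r, xy)
v2 <- r[cell]

v1
v2
```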
It seems that you are asking the wrong question.
To get a value for a single pixel (grid cell), you can use indexing. For example, for cell numbers 10,000 and 10,001 you can do r[10000:10001].
You could get all values by doing values(r). But that will fail for a very large raster like this (unless you have lots of RAM).
However, the question you need answered, it seems, is how to make a map by matching integer cell values with RGB colors.
Let's set up an example raster
library(raster)
r <- raster(nrow=4, ncol=4)
values(r) <- rep(c(11, 14, 20, 30), each=4)
And some matching RGB values
legend <- read.csv(text="Value,Label,Red,Green,Blue
11,Post-flooding or irrigated croplands (or aquatic),170,240,240
14,Rainfed croplands,255,255,100
20,Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%),220,240,100
30,Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%) ,205,205,102")
Compute the color code
legend$col <- rgb(legend$Red, legend$Green, legend$Blue, maxColorValue=255)
Set up a "color table"
# start with white for all possible values (0 to 255)
ct <- rep(rgb(1,1,1), 256)
# fill in where necessary
ct[legend$Value+1] <- legend$col
colortable(r) <- ct
Plot the raster
plot(r)
You can also try:
tb <- legend[, c('Value', 'Label')]
colnames(tb)[1] = "ID"
tb$Label <- substr(tb$Label, 1,10)
levels(r) <- tb
library(rasterVis)
levelplot(r, col.regions=legend$col, at=0:length(legend$col))

time and geographical subset of netcdf raster stack or raster brick using R

For the following netcdf file with daily global sea surface temperatures for 2016, I'm trying to (i) subset temporally, (ii) subset geographically, (iii) then take long-term means for each pixel and create a basic plot.
Link to file: here
library(raster)
library(ncdf4)
open the netcdf after setting my working directory
nc_data <- nc_open('sst.day.mean.2016.v2.nc')
change the time variable so it's easy to interpret
time <- ncdf4::ncvar_get(nc_data, varid="time")
head(time)
change to dates that I can interpret
time_d <- as.Date(time, format="%j", origin=as.Date("1800-01-01"))
Now I'd like to subset only September 1 to October 15, but can't figure that out...
Following temporal subset, create raster brick (or stack) and geographical subset
b <- brick('sst.day.mean.2016.v2.nc') # I would change this name to my file with the time subset
subset geographically
b <- crop(b, extent(144, 146, 14, 16))
Finally, I'd like to take the average for each pixel across all my days of data, assign this to a single raster, and make a simple plot...
Thanks for any help and guidance.
After b <- brick('sst.day.mean.2016.v2.nc'), we can type b to see information of the raster brick.
b
# class : RasterBrick
# dimensions : 720, 1440, 1036800, 366 (nrow, ncol, ncell, nlayers)
# resolution : 0.25, 0.25 (x, y)
# extent : 0, 360, -90, 90 (xmin, xmax, ymin, ymax)
# coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
# data source : C:\Users\basaw\Downloads\sst.day.mean.2016.v2.nc
# names : X2016.01.01, X2016.01.02, X2016.01.03, X2016.01.04, X2016.01.05, X2016.01.06, X2016.01.07, X2016.01.08, X2016.01.09, X2016.01.10, X2016.01.11, X2016.01.12, X2016.01.13, X2016.01.14, X2016.01.15, ...
# Date : 2016-01-01, 2016-12-31 (min, max)
# varname : sst
Notice that the Date slot runs from 2016-01-01 to 2016-12-31, which means the Z values already hold the date information and we can use them to subset the raster brick.
We can use the getZ function to access the Z values. Typing getZ(b) returns a series of dates.
head(getZ(b))
# [1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"
class(getZ(b))
# [1] "Date"
We can thus use the following code to subset the raster brick.
b2 <- b[[which(getZ(b) >= as.Date("2016-09-01") & getZ(b) <= as.Date("2016-10-15"))]]
We can then crop the image based on the code you provided.
b3 <- crop(b2, extent(144, 146, 14, 16))
To calculate the average, just use the mean function.
b4 <- mean(b3, na.rm = TRUE)
Finally, we can plot the average.
plot(b4)
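The same steps can be tried without downloading the NetCDF file. The sketch below assumes a tiny synthetic stack of ten daily layers, where layer i holds the constant value i, with dates attached through setZ. The object names are illustrative.

```r
library(raster)

# Ten layers on a 2 x 2 grid; layer i holds the constant value i
r <- raster(nrow = 2, ncol = 2)
s <- stack(lapply(1:10, function(i) setValues(r, i)))

# Attach one date per layer as the Z values
dates <- seq(as.Date("2016-08-28"), by = "day", length.out = 10)
s <- setZ(s, dates)

# Keep only the layers that fall inside the date window
keep <- which(getZ(s) >= as.Date("2016-09-01") & getZ(s) <= as.Date("2016-09-03"))
sub <- s[[keep]]

# Per-pixel mean across the selected days
m <- mean(sub, na.rm = TRUE)
```

Here the window selects layers 5 to 7, so every pixel of m is the mean of 5, 6 and 7.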
The subsetting and averaging task is easy to do in CDO:
cdo timmean -sellonlatbox,lon1,lon2,lat1,lat2 -seldate,date1,date2 in.nc out.nc
where the lon1,lon2 etc define the lon-lat area to cut out and date1,date2 are the date bounds.
You can call this command directly from R using the climate operators package as per this question.
So, for example, without the piping, the same task as three separate calls in R would be:
cdo("seldate,date1,date2",in.fname,out1.fname,debug=TRUE)
cdo("sellonlatbox,lon1,lon2,lat1,lat2", out1.fname,out2.fname,debug=TRUE)
cdo("timmean",out2.fname,out.fname,debug=TRUE)

Draw polygon from raster after occurrence modeling

I want to draw polygons for species occurrence using the same methods BIEN uses, so I can use both my polygons and theirs. They use Maxent to model species occurrence when they have more than a certain number of occurrence points.
So, this is, for example, a BIEN polygon:
library(BIEN)
Mormolyca_ringens<- BIEN_ranges_load_species(species = "Mormolyca ringens")
#And this is a polygon, yes. A SpatialPolygonsDataFrame.
plot(wrld_simpl, xlim=c(-100,-40), ylim=c(-30,30), axes=TRUE,col="light yellow", bg="light blue")
plot(Mormolyca_ringens, col="green", add=TRUE)
[Figure: Mormolyca ringens polygon]
Ok, then I'm trying to draw my polygons because BIEN lacks some for species I need.
# first, you need to download the Maxent software here: http://biodiversityinformatics.amnh.org/open_source/maxent/
#and paste the "maxent.jar" file in the ’java’ folder of the ’dismo’ package, which is here:
system.file("java", package="dismo")
#You have to do this **before** loading the libraries
#install.packages("rJava")
library(rJava)
#If you get the message that cannot load this library, it's possible that your version of java is not 64bit.
#Go to Oracle and install Java for windows 64bit.
#If library still doesn't load: Look in your computer for the path where the java's jre file is and paste in the code below
Sys.setenv(JAVA_HOME="your\\path\\for\\jre") #mine is "C:\\Program Files\\Java\\jre1.8.0_144", for example
library(rJava)
library(dismo)
library(maptools)
#Giving credits: I wrote the following code based on this tutorial: https://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf
#Preparing the example data - the map
data(wrld_simpl)
ext = extent(-90, -32, -33, 23)
#Preparing the example data - presence data for Bradypus variegatus
file <- paste(system.file(package="dismo"), "/ex/bradypus.csv", sep="")
bradypus <- read.table(file, header=TRUE, sep=',')
bradypus <- bradypus[,-1] #don't need the first column
#Getting the predictors (the variables)
files <- list.files(path=paste(system.file(package="dismo"),
'/ex', sep=''), pattern='grd', full.names=TRUE )
predictors <- stack(files)
#making a training and a testing set.
group <- kfold(bradypus, 5)
pres_train <- bradypus[group != 1, ]
pres_test <- bradypus[group == 1, ]
#Creating the background
backg <- randomPoints(predictors, n=1000, ext=ext, extf = 1.25)
colnames(backg) = c('lon', 'lat')
group <- kfold(backg, 5)
backg_train <- backg[group != 1, ]
backg_test <- backg[group == 1, ]
# Running maxent
xm <- maxent(predictors, pres_train, factors='biome')
plot(xm)
#A response plot:
response(xm)
# Evaluating and predicting
e <- evaluate(pres_test, backg_test, xm, predictors)
px <- predict(predictors, xm, ext=ext, progress='text', overwrite=TRUE)
#Checking result of the prediction
par(mfrow=c(1,2))
plot(px, main='Maxent, raw values')
plot(wrld_simpl, add=TRUE, border='dark grey')
tr <- threshold(e, 'spec_sens')
plot(px > tr, main='presence/absence')
plot(wrld_simpl, add=TRUE, border='dark grey')
points(pres_train, pch='+')
At this point, I have the following image:
[Figure: Prediction for example's occurrence]
And I'm trying to make a polygon from this raster with this code:
predic_pol<-rasterToPolygons(px )
And also:
px_rec<-reclassify(px, rcl=0.5, include.lowest=FALSE)
px_pol<-rasterToPolygons(px_rec)
But I keep getting a pixelated version of my whole extent.
Can you please give me a hint so I can extract a polygon out of this raster, like the BIEN's one? (Also I'm new to modeling and to R... any tips are welcome)
EDIT: this is the px console output:
> px
class : RasterLayer
dimensions : 172, 176, 30272 (nrow, ncol, ncell)
resolution : 0.5, 0.5 (x, y)
extent : -120, -32, -56, 30 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : C:\Users\thai\Documents\ORCHIDACEAE\Ecologicos\w2\predictions\Trigonidiumobtusum_prediction.grd
names : layer
values : 6.705387e-06, 0.9999983 (min, max)
Thank you in advance
Edit 2: Solution
Thanks to @Val I got to this:
#Getting only the values>tr to make the polygon
#"tr" is what gives me the green raster instead of the multicolour one
pol <- rasterToPolygons(px>tr,function(x) x == 1,dissolve=TRUE)
#Plotting
plot(wrld_simpl, xlim=c(-120,-20), ylim=c(-60,10), axes=TRUE,col="light yellow", bg="light blue")
plot(pol, add=TRUE, col="green")
And now I have what I wanted! Thank you!
(The polygon is not the same in the figures only because I used a different data set that I had in my environment at the moment I got @Val's answer)
Bonus question:
Do you know how to smooth the edges so I get a non pixelized polygon?
I don't know BIEN, so I didn't really look at this part of your example. I just generalized your problem/question down to the following:
You have a binary raster (with 0 for absence and 1 for presence) and you want to convert all areas with 1 to a polygon.
As for your px raster, it's a bit odd that your values are not exactly 0 and 1 but rather close to 0 and close to 1. If that's a problem, it should be easy to fix.
So I tried to recreate your example with just the area of Brasil:
library(raster)
library(rgeos)
# get Brasil borders
shp <- getData(country = 'BRA',level=0)
#create binary raster
r <- raster(extent(shp),resolution=c(0.5,0.5))
r[] <- NA # values have to be NA for the buffering
# take centroid of Brasil as center of species presence
cent <- gCentroid(shp)
# set to 1
r[cellFromXY(r,cent)] <- 1
# buffer presence
r <- buffer(r,width=1000000)
# set rest 0
r[is.na(r)] <- 0
# mask by borders
r <- mask(r,shp)
This is close enough to your raster I guess:
So now to the conversion to the polygon:
pol <- rasterToPolygons(r,function(x) x == 1,dissolve=T)
I use a function to only get pixels with value 1. Also I dissolve the polygons to not have single pixel polygons but rather an area. See rasterToPolygons for other options.
And now plot the borders and the new polygon together:
plot(shp)
plot(pol,col='red',add=T)
And there you have it, a polygon of the distribution. This is the console output:
> pol
class : SpatialPolygonsDataFrame
features : 1
extent : -62.98971, -43.48971, -20.23512, -1.735122 (xmin, xmax, ymin, ymax)
coord. ref. : NA
variables : 1
names : layer
min values : 1
max values : 1
Hope that helps!
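For readers working in terra, as in the main question, a rough equivalent of the conversion above is sketched here. It is not the code from this answer: it assumes a small made-up presence/absence matrix, sets the absence cells to NA with ifel(), and relies on as.polygons() merging cells with the same value by default.

```r
library(terra)

# Made-up binary raster: a 4 x 4 block of presence (1) in a field of absence (0)
m <- matrix(0, nrow = 10, ncol = 10)
m[3:6, 3:6] <- 1
r <- rast(m)

# Keep only the presence cells; absence becomes NA so it produces no polygon
pres <- ifel(r == 1, 1, NA)

# Convert to polygons; cells with the same value are merged by default
pol <- as.polygons(pres)

plot(r)
lines(pol, col = "red", lwd = 2)
```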
Edit: Bonus answer
Be aware that the pixelized boundaries of your polygon(s) are an accurate representation of your data, so any change to them means a loss of precision. Depending on your purpose, that might not matter.
There are multiple ways to achieve it: either on the raster side, by disaggregating and smoothing/filtering, or on the polygon side, where you can apply specific filters to the polygons.
If it's purely aesthetic, you can try gSimplify from the rgeos package:
# adjust tol for smoothness
pol_sm <- gSimplify(pol,tol=0.5)
plot(pol)
lines(pol_sm,col='red',lwd=2)

Getting Data out of raster file in R

I'm new to raster files, but they seem to be the best way to open up the large gov't files that have all the weather data, so I'm trying to figure out how to use them. For reference, I'm downloading the files located here (just some run of the mill weather stuff). When I use the raster package of R to import the file like this
> r <- raster("/path/to/file.grb")
Everything works fine. I can even get a little metadata when I type in
> r
class : RasterLayer
band : 1 (of 37 bands)
dimensions : 224, 464, 103936 (nrow, ncol, ncell)
resolution : 0.125, 0.125 (x, y)
extent : -125.0005, -67.0005, 25.0005, 53.0005 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +a=6371200 +b=6371200 +no_defs
data source : /path/to/file.grb
names : NLDAS_MOS0125_H.A20140629.0100.002
All I've managed to do at this point is index the raster in a very obvious way.
> r[100,100]
267.1
So, I guess I can "index" it, but I have no idea what the number 267.1 means. It's certainly not all there is in the cell. There should be a bunch of variables including, but not limited to, soil moisture, surface runoff, and evaporation.
How can I access this information in the same way using R?
# create two rasters
r1 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
r2 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
# creates a raster stack -- the stack (or brick) function allows you
# to use multilayer (multi-band) rasters
# http://www.inside-r.org/packages/cran/raster/docs/stack
st_r <- stack(r1, r2)
# extract values -- will create a matrix with 100 rows and two columns
vl <- getValues(st_r)
r <- raster("/path/to/file.grb")
values <- getValues(r)
You can read about the function here:
http://www.inside-r.org/packages/cran/raster/docs/values
I believe that the problem is that you are using raster and not stack. The raster function results in a single layer (matrix) whereas stack or brick read an array with all of the raster layers. Here is an example that demonstrates extracting values using an [i,j,z] index.
library(raster)
setwd("D:/TMP")
download.file("ftp://hydro1.sci.gsfc.nasa.gov/data/s4pa/NLDAS/NLDAS_MOS0125_H.002/2014/180/NLDAS_MOS0125_H.A20140629.0000.002.grb",
destfile="NLDAS_MOS0125_H.A20140629.0000.002.grb", mode="wb")
r <- stack("NLDAS_MOS0125_H.A20140629.0000.002.grb")
names(r) <- paste0("L", seq_len(nlayers(r)))
class(r)
# Values for [i,j]
i=100
j=100
r[i,j]
# Values for i,j and z at layer(s) 1, 5 and 10
z=c(1,5,10)
r[i,j][z]
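The same indexing can be checked without the GRIB download. The sketch below assumes three made-up layers built from matrices, so the values at a given row and column are easy to predict. It also shows subsetting the layers first with [[ ]] as an alternative to indexing the result afterwards.

```r
library(raster)

# Three layers with known values: 1..100, 101..200, 201..300
r1 <- raster(matrix(1:100, nrow = 10, ncol = 10))
r2 <- r1 + 100
r3 <- r1 + 200
st <- stack(r1, r2, r3)
names(st) <- paste0("L", seq_len(nlayers(st)))

# Values of every layer at row 5, column 5 (a 1 x 3 matrix)
all_layers <- st[5, 5]

# Select layers 1 and 3 first, then index the same cell
some_layers <- st[[c(1, 3)]][5, 5]

all_layers
some_layers
```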
