Time and geographical subset of netcdf raster stack or raster brick using R

For the following netcdf file with daily global sea surface temperatures for 2016, I'm trying to (i) subset temporally, (ii) subset geographically, (iii) then take long-term means for each pixel and create a basic plot.
Link to file: here
library(raster)
library(ncdf4)
open the netcdf after setting my working directory
nc_data <- nc_open('sst.day.mean.2016.v2.nc')
change the time variable so it's easy to interpret
time <- ncdf4::ncvar_get(nc_data, varid="time")
head(time)
change to dates that I can interpret
time_d <- as.Date(time, format="%j", origin=as.Date("1800-01-01"))
Now I'd like to subset only September 1 to October 15, but can't figure that out...
Following temporal subset, create raster brick (or stack) and geographical subset
b <- brick('sst.day.mean.2016.v2.nc') # I would change this name to my file with time subset
subset geographically
b <- crop(b, extent(144, 146, 14, 16))
Finally, I'd like to take the average for each pixel across all my days of data, assign this to a single raster, and make a simple plot...
Thanks for any help and guidance.

After b <- brick('sst.day.mean.2016.v2.nc'), we can type b to see information of the raster brick.
b
# class : RasterBrick
# dimensions : 720, 1440, 1036800, 366 (nrow, ncol, ncell, nlayers)
# resolution : 0.25, 0.25 (x, y)
# extent : 0, 360, -90, 90 (xmin, xmax, ymin, ymax)
# coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
# data source : C:\Users\basaw\Downloads\sst.day.mean.2016.v2.nc
# names : X2016.01.01, X2016.01.02, X2016.01.03, X2016.01.04, X2016.01.05, X2016.01.06, X2016.01.07, X2016.01.08, X2016.01.09, X2016.01.10, X2016.01.11, X2016.01.12, X2016.01.13, X2016.01.14, X2016.01.15, ...
# Date : 2016-01-01, 2016-12-31 (min, max)
# varname : sst
Notice that the Date slot has information from 2016-01-01 to 2016-12-31, which means the Z values already hold the dates and we can use them to subset the raster brick.
We can use the getZ function to access the Z values. Typing getZ(b) shows a series of dates.
head(getZ(b))
# [1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"
class(getZ(b))
# [1] "Date"
We can thus use the following code to subset the raster brick.
b2 <- b[[which(getZ(b) >= as.Date("2016-09-01") & getZ(b) <= as.Date("2016-10-15"))]]
We can then crop the image based on the code you provided.
b3 <- crop(b2, extent(144, 146, 14, 16))
To calculate the average, just use the mean function.
b4 <- mean(b3, na.rm = TRUE)
Finally, we can plot the average.
plot(b4)
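For anyone who wants to try the steps above without downloading the file, here is a minimal sketch on synthetic data. The 4 x 4 grid, its extent, and the cell values (each layer is filled with its day of the year) are assumptions made up for illustration; only the subset, crop, and mean calls mirror the answer.

```r
library(raster)

# Synthetic stand-in for the SST file: one layer per day of 2016,
# each layer filled with its day-of-year number (made-up values)
dates <- seq(as.Date("2016-01-01"), as.Date("2016-12-31"), by = "day")
r <- raster(nrows = 4, ncols = 4, xmn = 144, xmx = 148, ymn = 14, ymx = 18)
b <- stack(lapply(seq_along(dates), function(i) setValues(r, rep(i, ncell(r)))))
b <- setZ(b, dates)

# Temporal subset using the dates stored in the Z slot
keep <- which(getZ(b) >= as.Date("2016-09-01") & getZ(b) <= as.Date("2016-10-15"))
b2 <- b[[keep]]

# Spatial subset, then the per-pixel mean across the remaining layers
b3 <- crop(b2, extent(144, 146, 14, 16))
b4 <- mean(b3, na.rm = TRUE)

print(nlayers(b2))
print(b4)
```

Subsetting in time before cropping keeps the crop from touching layers that would be discarded anyway.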

The subsetting and averaging task is easy to do in CDO:
cdo timmean -sellonlatbox,lon1,lon2,lat1,lat2 -seldate,date1,date2 in.nc out.nc
where lon1, lon2, lat1, lat2 define the lon-lat area to cut out and date1, date2 are the date bounds.
You can call this command directly from R using the climate operators package as per this question.
For example, without the piping, the same operation takes three calls in R:
cdo("seldate,date1,date2",in.fname,out1.fname,debug=TRUE)
cdo("sellonlatbox,lon1,lon2,lat1,lat2", out1.fname,out2.fname,debug=TRUE)
cdo("timmean",out2.fname,out.fname,debug=TRUE)
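If the wrapper package is not available, the piped command can also be assembled in plain R and handed to the cdo binary. This is only a sketch: it assumes CDO is installed and on the PATH, the output file name is hypothetical, and the bounds are the ones from the question. The call that would actually run CDO is left commented out.

```r
in_file  <- "sst.day.mean.2016.v2.nc"
out_file <- "sst_mean.nc"  # hypothetical output name

# Operators are chained right to left: select dates, cut the box, then average in time
args <- c("timmean",
          sprintf("-sellonlatbox,%s,%s,%s,%s", 144, 146, 14, 16),
          sprintf("-seldate,%s,%s", "2016-09-01", "2016-10-15"),
          in_file, out_file)

cmd <- paste("cdo", paste(args, collapse = " "))
print(cmd)

# Uncomment to run it, assuming the cdo binary is available:
# system2("cdo", args)
```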

Related

How can I subset a raster by conditional statement in R using `terra`?

I am trying to plot only certain values from a categorical land cover raster I am working with. I have loaded it in to R using the terra package and it plots fine. However, since the original data did not come with a legend, I am trying to find out which raster value corresponds to what on the map.
Similar to the answer provided here: How to subset a raster based on grid cell values
I have tried using the following line:
> landcover
class : SpatRaster
dimensions : 20057, 63988, 1 (nrow, ncol, nlyr)
resolution : 0.0005253954, 0.0005253954 (x, y)
extent : -135.619, -102, 59.99989, 70.53775 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : spat_n5WpgzBuVAV3Ijm.tif
name : CAN_LC_2015_CAL_wgs
min value : 1
max value : 18
> plot(landcover[landcover == 18])
Error: cannot allocate vector of size 9.6 Gb
However, this line takes a very long time to run and produces a vector memory error. The object is 1.3 kb in the global environment and the original tif is about 300 mb.
You can use cats to find out which values correspond to which categories.
library(terra)
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE) - 1
cls <- c("forest", "water", "urban")
levels(r) <- cls
names(r) <- "land cover"
cats(r)[[1]]
# ID category
#1 0 forest
#2 1 water
#3 2 urban
To plot a logical (Boolean) layer for one category, you can do
plot(r == "water")
And from the above you can see that, in this case, that is equivalent to
plot(r == 1)
I think I found the solution to write the conditional within the plot function as below:
plot(landcover == 18)
For those looking for a reproducible example, just load the R logo:
s <- rast(system.file("ex/logo.tif", package="terra"))
s <- s$red
plot(s == 255)
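As far as I understand it, the difference is that indexing with a condition pulls the matching cell values out of the raster into an ordinary R object, while the comparison on its own returns another SpatRaster that terra can process without holding everything in memory. A small sketch with the logo file that ships with terra; the ifel step is an extra option, not something from the question, for keeping the matching cells and setting everything else to NA.

```r
library(terra)

s <- rast(system.file("ex/logo.tif", package = "terra"))$red

# The comparison yields a raster of TRUE/FALSE with the same geometry as s
is_white <- s == 255

# Keep the matching cells and set all other cells to NA
only_white <- ifel(s == 255, 255, NA)

plot(is_white)
```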

Separating raster by land use attribute in R

I am new to R and trying to extract a subset of values from a raster file. I am using the Ontario Land Cover Compilation (OLCC) v.2.0 and want to only extract impervious cover values within my buffer regions. According to the Data Specifications there are classification names for land use classes and associated codes. I only want to extract data from the Community/Infrastructure name (code 27). I have uploaded the entire raster into R. Is there a way to separate the raster by code name/class? If I get the separated raster subset I know how to extract within my buffer region from there.
I have tried the raster brick function to see if it would recognize the code names and separate them into different layers automatically but this didn't work. I saw another post where raster attributes were extracted by class, but I am not sure how the land use classes are being separated and defined in R here.
Here is some example data
library(terra)
#terra 1.5.6
set.seed(0)
x <- rast(nrows=10, ncols=10, names="cover")
values(x) <- sample(3, ncell(x), replace=TRUE) - 1
levels(x) <- c("forest", "water", "urban")
Inspect
x
#class : SpatRaster
#dimensions : 10, 10, 1 (nrow, ncol, nlyr)
#resolution : 36, 18 (x, y)
#extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84
#source : memory
#name : cover
#min value : forest
#max value : water
levels(x)[[1]]
#[1] "forest" "water" "urban"
cats(x)[[1]]
# ID category
#1 0 forest
#2 1 water
#3 2 urban
So if you were interested in extracting the "urban" areas only, you can see that the ID (cell value) for that class is 2. And you can do
urban <- x == 2
plot(urban)
text(x)
It should also have been possible to do
urb <- x == "urban"
But that may be offset by one class in some cases. To use this safely, you need terra 1.5-7 (currently the development version)
Also, if there are multiple categories, you may first need to activate the category you are interested in; like so:
activeCat(x) <- "cover"
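Once the class is isolated as a TRUE/FALSE raster, summarising it inside a region is a short step. The sketch below is only an illustration: the rectangular extent is a hypothetical stand-in for a real buffer polygon, and the raster holds plain numeric codes (0, 1, 2) rather than the OLCC data. The mean of a TRUE/FALSE raster is the fraction of cells in the class.

```r
library(terra)

set.seed(0)
x <- rast(nrows = 10, ncols = 10, names = "cover")
values(x) <- sample(3, ncell(x), replace = TRUE) - 1  # codes 0, 1, 2

# TRUE where the cell belongs to the class of interest (code 2 here)
urban <- x == 2

# Hypothetical region of interest standing in for a buffer
region <- ext(-180, 0, -90, 90)
urban_in_region <- crop(urban, region)

# Fraction of cells in the region that belong to the class
frac <- global(urban_in_region, "mean")[1, 1]
print(frac)
```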

How to get values for a pixel from a geoTIFF in R?

I'm trying to get RGB components from a geoTIFF file in R. The colours on the image correspond to different land classification types and I have a legend for each classification type in RGB components.
I'm using the raster library. My code so far is
library(raster)
my.map = raster("mygeoTIFFfile.tif")
Here is the information on the file once it has been read in:
> my.map[[1]]
class : RasterLayer
dimensions : 55800, 129600, 7231680000 (nrow, ncol, ncell)
resolution : 0.002777778, 0.002777778 (x, y)
extent : -180.0014, 179.9986, -64.99861, 90.00139 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : filepah/filename.tif
names : filename.tif
values : 11, 230 (min, max)
The specific geoTIFF file I'm working on can be found here:
http://due.esrin.esa.int/page_globcover.php
(just click on "Globcover2009_V2.3_Global_.zip")
Can someone please help me get the value from a single pixel location from this file please?
The rasterToPoints() function will convert your raster data to a matrix containing x, y, and value for each point. This will be very large, but may be what you're looking for if you want to do a broad analysis of the data.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
data <- rasterToPoints(map, progress="text")
head(data)
Another option is to use the extract() function to return the value at a single point by passing a SpatialPoints object with longitude/latitude coordinates (in that order). If you only want a few individual data points, this will be a lot faster than loading the entire thing into a matrix.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
extract(map, SpatialPoints(cbind(-123.3680884, 48.4252848)))
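The same lookup can be tried on a small synthetic raster, which makes it easier to see how a coordinate maps to a cell. The 4 x 4 global grid and its values are made up for illustration; the point is the one used above. Here the coordinates are passed as a plain two-column matrix (x, y), which extract() also accepts.

```r
library(raster)

# Made-up 4 x 4 global raster whose values equal the cell numbers
r <- raster(nrow = 4, ncol = 4)
values(r) <- 1:16

xy <- cbind(-123.3680884, 48.4252848)  # longitude first, then latitude

# Which cell does the point fall in, and what value does it hold?
cell <- cellFromXY(r, xy)
val  <- raster::extract(r, xy)

print(cell)
print(val)
```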
It seems that you are asking the wrong question.
To get a value for a single pixel (grid cell), you can use indexing. For example, for cell number 10,000 and 10,001 you can do r[10000:10001].
You could get all values by doing values(r). But that will fail for a very large raster like this (unless you have lots of RAM).
However, the question you need answered, it seems, is how to make a map by matching integer cell values with RGB colors.
Let's set up an example raster
library(raster)
r <- raster(nrow=4, ncol=4)
values(r) <- rep(c(11, 14, 20, 30), each=4)
And some matching RGB values
legend <- read.csv(text="Value,Label,Red,Green,Blue
11,Post-flooding or irrigated croplands (or aquatic),170,240,240
14,Rainfed croplands,255,255,100
20,Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%),220,240,100
30,Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%) ,205,205,102")
Compute the color code
legend$col <- rgb(legend$Red, legend$Green, legend$Blue, maxColorValue=255)
set up a "color table"
# start with white for all values (0 to 255)
ct <- rep(rgb(1,1,1), 256)
# fill in where necessary
ct[legend$Value+1] <- legend$col
colortable(r) <- ct
plot
plot(r)
You can also try:
tb <- legend[, c('Value', 'Label')]
colnames(tb)[1] = "ID"
tb$Label <- substr(tb$Label, 1,10)
levels(r) <- tb
library(rasterVis)
levelplot(r, col.regions=legend$col, at=0:length(legend$col))

Mapping temperature data from an .nc file in R

I downloaded temperature data from NARR (https://www.esrl.noaa.gov/psd/data/gridded/data.narr.monolevel.html), specifically "Air temperature at 2m", monthly mean.
I opened the file using the package "ncdf4". The data has 4 dimensions- time, x, y, nbnds. y corresponds to lat and x corresponds to lon. There is a variable (not dimension) called air which I do not know how to use, although this is the temperature information.
My end goal is to map the temperature data on a map of North America, using averaged temperature data for each month for each year (12 maps, one for each month).
I am having trouble identifying how to use the data as all of the dimensions are just really long lists of numbers that don't seem to have meaning (eg. the x coordinates look like this: 6232896 6265359 6297822 6330285 6362748 6395211 6427674 6460137 6492600 6525063 6557526 6589989, and so do the y values and time).
Here is the code I am using to view the dimensions:
library(ncdf4)
temp2m <- nc_open("air.2m.mon.mean.nc")
time <- temp2m$dim$time$vals
# y corresponds to lat and x corresponds to lon
lat <- temp2m$dim$y$vals
lon <- temp2m$dim$x$vals
nbnds <- temp2m$dim$nbnds$vals
If someone could help me view the data as well as map temperature data onto North America that would be great.
Thank you!
You can use the raster package to read these into a stack:
> library(raster)
> air = stack("./air.2m.mon.mean.nc")
(Note, you may need a raster package compiled with netcdf drivers...)
You can then plot them by slice or by time-name:
> plot(air[[23]])
> plot(air[["X1979.10.01.01.01.15"]])
The stack prints like this:
> air
class : RasterStack
dimensions : 277, 349, 96673, 450 (nrow, ncol, ncell, nlayers)
resolution : 32462.99, 32463 (x, y)
extent : -16231.49, 11313351, -16231.5, 8976020 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +x_0=5632642.22547 +y_0=4612545.65137 +lat_0=50 +lon_0=-107 +lat_1=50 +lat_2=50 +ellps=WGS84
names : X1979.01.01.00.01.15, X1979.02.01.00.01.15, X1979.03.01.00.01.15, X1979.04.01.01.01.15, X1979.05.01.01.01.15, X1979.06.01.01.01.15, X1979.07.01.01.01.15, X1979.08.01.01.01.15, X1979.09.01.01.01.15, X1979.10.01.01.01.15, X1979.11.01.00.01.15, X1979.12.01.00.01.15, X1980.01.01.00.01.15, X1980.02.01.00.01.15, X1980.03.01.00.01.15, ...
and those coordinates are not really lat-long, but are in a transformed coordinate system described by that "coord. ref." string. If you want to put it on a lat-long map you need to warp it:
> air_ll = projectRaster(air[[1]],crs="+init=epsg:4326")
> plot(air_ll)
It might be better to transform your other data to this coordinate system and leave the grid in its native projection. Just look up how to deal with spatial data in R for more info on projections and transformations.
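The long numbers in the time dimension can be decoded in base R once the units are known. This sketch assumes the units are "hours since 1800-01-01", which is what the layer names above suggest; check temp2m$dim$time$units in your own file before relying on it. The two hour values are computed by hand for illustration and are not read from the file.

```r
# Hours elapsed since 1800-01-01 00:00 UTC (illustrative values)
hours <- c(1569072, 1569816)

# Convert to seconds and add to the origin
dates <- as.POSIXct(hours * 3600, origin = "1800-01-01", tz = "UTC")
labels <- format(dates, "%Y-%m-%d")
print(labels)
```

The x and y values are a different matter: they are metres in the Lambert conformal conic projection shown in the "coord. ref." line, not degrees.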

Getting Data out of raster file in R

I'm new to raster files, but they seem to be the best way to open up the large gov't files that have all the weather data, so I'm trying to figure out how to use them. For reference, I'm downloading the files located here (just some run of the mill weather stuff). When I use the raster package of R to import the file like this
> r <- raster("/path/to/file.grb")
Everything works fine. I can even get a little metadata when I type in
> r
class : RasterLayer
band : 1 (of 37 bands)
dimensions : 224, 464, 103936 (nrow, ncol, ncell)
resolution : 0.125, 0.125 (x, y)
extent : -125.0005, -67.0005, 25.0005, 53.0005 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +a=6371200 +b=6371200 +no_defs
data source : /path/to/file.grb
names : NLDAS_MOS0125_H.A20140629.0100.002
All I've managed to do at this point is index the raster in a very obvious way.
> r[100,100]
267.1
So, I guess I can "index" it, but I have no idea what the number 267.1 means. It's certainly not all there is in the cell. There should be a bunch of variables including, but not limited to, soil moisture, surface runoff, and evaporation.
How can I access this information in the same way using R?
# create two rasters
r1 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
r2 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
# create a raster stack -- the stack (or brick) function lets you
# work with multi-band rasters
# http://www.inside-r.org/packages/cran/raster/docs/stack
st_r <- stack(r1, r2)
# extract values -- will create a matrix with 100 rows and two columns
vl <- getValues(st_r)
r <- raster("/path/to/file.grb")
values <- getValues(r)
You can read about the function here:
http://www.inside-r.org/packages/cran/raster/docs/values
I believe that the problem is that you are using raster and not stack. The raster function results in a single layer (matrix) whereas stack or brick read an array with all of the raster layers. Here is an example that demonstrates extracting values using an [i,j,z] index.
library(raster)
setwd("D:/TMP")
download.file("ftp://hydro1.sci.gsfc.nasa.gov/data/s4pa/NLDAS/NLDAS_MOS0125_H.002/2014/180/NLDAS_MOS0125_H.A20140629.0000.002.grb",
destfile="NLDAS_MOS0125_H.A20140629.0000.002.grb", mode="wb")
r <- stack("NLDAS_MOS0125_H.A20140629.0000.002.grb")
names(r) <- paste0("L", seq_len(nlayers(r)))
class(r)
# Values for [i,j]
i=100
j=100
r[i,j]
# Values for i,j and z at layer(s) 1, 5 and 10
z=c(1,5,10)
r[i,j][z]
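The [i,j] indexing can be checked without downloading the GRIB file. The sketch below uses a made-up three-layer stack in which each layer is offset by 100, so it is easy to see which layer a value came from; the layer names follow the same "L" pattern as above.

```r
library(raster)

# Three made-up layers: cell numbers, cell numbers + 100, cell numbers + 200
r1 <- raster(nrows = 10, ncols = 10)
values(r1) <- 1:100
st <- stack(r1, r1 + 100, r1 + 200)
names(st) <- paste0("L", seq_len(nlayers(st)))

# One row, one column per layer
v <- st[2, 3]
print(v)

# Pick out layers 1 and 3 only
z <- c(1, 3)
print(v[z])
```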

Resources