Getting Data out of raster file in R - r

I'm new to raster files, but they seem to be the best way to open up the large gov't files that have all the weather data, so I'm trying to figure out how to use them. For reference, I'm downloading the files located here (just some run of the mill weather stuff). When I use the raster package of R to import the file like this
> r <- raster("/path/to/file.grb")
Everything works fine. I can even get a little metadata when I type in
> r
class : RasterLayer
band : 1 (of 37 bands)
dimensions : 224, 464, 103936 (nrow, ncol, ncell)
resolution : 0.125, 0.125 (x, y)
extent : -125.0005, -67.0005, 25.0005, 53.0005 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +a=6371200 +b=6371200 +no_defs
data source : /path/to/file.grb
names : NLDAS_MOS0125_H.A20140629.0100.002
All I've managed to do at this point is index the raster in a very obvious way.
> r[100,100]
267.1
So, I guess I can "index" it, but I have no idea what the number 267.1 means. It's certainly not all there is in the cell. There should be a bunch of variables including, but not limited to, soil moisture, surface runoff, and evaporation.
How can I access this information in the same way using R?

# create two rasters
r1 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
r2 <- raster(matrix(ncol = 10, nrow = 10, runif(100)))
# create a raster stack -- stack() (or brick()) lets you
# work with multi-layer (multi-band) rasters
# http://www.inside-r.org/packages/cran/raster/docs/stack
st_r <- stack(r1, r2)
# extract values -- will create a matrix with 100 rows and two columns
vl <- getValues(st_r)

r <- raster("/path/to/file.grb")
values <- getValues(r)
You can read about the function here:
http://www.inside-r.org/packages/cran/raster/docs/values

I believe that the problem is that you are using raster and not stack. The raster function results in a single layer (matrix) whereas stack or brick read an array with all of the raster layers. Here is an example that demonstrates extracting values using an [i,j,z] index.
library(raster)
setwd("D:/TMP")
download.file("ftp://hydro1.sci.gsfc.nasa.gov/data/s4pa/NLDAS/NLDAS_MOS0125_H.002/2014/180/NLDAS_MOS0125_H.A20140629.0000.002.grb",
destfile="NLDAS_MOS0125_H.A20140629.0000.002.grb", mode="wb")
r <- stack("NLDAS_MOS0125_H.A20140629.0000.002.grb")
names(r) <- paste0("L", seq_len(nlayers(r)))
class(r)
# Values for [i,j]
i=100
j=100
r[i,j]
# Values for i,j and z at layer(s) 1, 5 and 10
z=c(1,5,10)
r[i,j][z]
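If you don't want to download the GRIB file just to try this out, here is a minimal sketch of the same idea on a small in-memory stack. The layer names are made up for illustration (the real file does not label its bands this way). Note that raster() also takes a band argument, e.g. raster("/path/to/file.grb", band = 2), if you only need one layer.

```r
library(raster)

set.seed(1)
# Three small in-memory layers standing in for the bands of a GRIB file
layers <- lapply(1:3, function(k) {
  raster(matrix(k * 10 + runif(20), nrow = 4, ncol = 5))
})
s <- stack(layers)

# Hypothetical names, chosen only for this example
names(s) <- c("soil_moisture", "runoff", "evaporation")

# All layers at row 2, column 3: one value per layer
v <- s[2, 3]
print(v)

# A single layer pulled out by name, then indexed like a plain RasterLayer
runoff <- s[["runoff"]]
print(runoff[2, 3])
```

The second print should show the same number as the "runoff" entry of the first one.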

Related

How can I subset a raster by conditional statement in R using `terra`?

I am trying to plot only certain values from a categorical land cover raster I am working with. I have loaded it in to R using the terra package and it plots fine. However, since the original data did not come with a legend, I am trying to find out which raster value corresponds to what on the map.
Similar to the answer provided here: How to subset a raster based on grid cell values
I have tried using the following line:
> landcover
class : SpatRaster
dimensions : 20057, 63988, 1 (nrow, ncol, nlyr)
resolution : 0.0005253954, 0.0005253954 (x, y)
extent : -135.619, -102, 59.99989, 70.53775 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : spat_n5WpgzBuVAV3Ijm.tif
name : CAN_LC_2015_CAL_wgs
min value : 1
max value : 18
> plot(landcover[landcover == 18])
Error: cannot allocate vector of size 9.6 Gb
However, this line takes a very long time to run and then fails with a memory allocation error. The object is only 1.3 KB in the global environment, and the original tif is about 300 MB.
You can use cats to find out which values correspond to which categories.
library(terra)
set.seed(0)
r <- rast(nrows=10, ncols=10)
values(r) <- sample(3, ncell(r), replace=TRUE) - 1
cls <- c("forest", "water", "urban")
levels(r) <- cls
names(r) <- "land cover"
cats(r)[[1]]
# ID category
#1 0 forest
#2 1 water
#3 2 urban
To plot a logical (Boolean) layer for one category, you can do
plot(r == "water")
And from the above you can see that in this case that is equivalent to
plot(r == 1)
I think I found the solution: write the conditional inside the plot call, as below:
plot(landcover == 18)
For those looking for a reproducible example, just load the R logo that ships with terra:
s <- rast(system.file("ex/logo.tif", package="terra"))
s <- s$red
plot(s == 255)
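To spell out the difference: landcover[landcover == 18] pulls cell values into memory, whereas landcover == 18 stays a raster. Here is a rough sketch on a toy raster (the values 1 to 3 are arbitrary, not the real land cover classes) that also uses freq() to list which values occur and ifel() to keep one class and blank out the rest.

```r
library(terra)

set.seed(0)
r <- rast(nrows = 10, ncols = 10)
values(r) <- sample(3, ncell(r), replace = TRUE)

# Which values occur, and how often
print(freq(r))

# Logical raster: TRUE where the cell equals 3
is_three <- r == 3

# Same raster, but every cell that is not 3 becomes NA
only_three <- ifel(r == 3, r, NA)

plot(is_three)
plot(only_three)
```

Both results are SpatRaster objects, so plotting them should not need the whole value vector to be extracted first.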

How to get values for a pixel from a geoTIFF in R?

I'm trying to get RGB components from a geoTIFF file in R. The colours on the image correspond to different land classification types and I have a legend for each classification type in RGB components.
I'm using the raster library. My code so far is
library(raster)
my.map = raster("mygeoTIFFfile.tif")
Here is the information on the file once it has been read in:
> my.map[[1]]
class : RasterLayer
dimensions : 55800, 129600, 7231680000 (nrow, ncol, ncell)
resolution : 0.002777778, 0.002777778 (x, y)
extent : -180.0014, 179.9986, -64.99861, 90.00139 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : filepah/filename.tif
names : filename.tif
values : 11, 230 (min, max)
The specific geoTIFF file I'm working on can be found here:
http://due.esrin.esa.int/page_globcover.php
(just click on "Globcover2009_V2.3_Global_.zip")
Can someone please help me get the value at a single pixel location from this file?
The rasterToPoints() function will convert your raster data to a matrix containing x, y, and value for each point. This will be very large, but may be what you're looking for if you want to do a broad analysis of the data.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
data <- rasterToPoints(map, progress="text")
head(data)
Another option is to use the extract() function to return a single point by passing a SpatialPoints object with latitude/longitude. If you only want a few individual data points, this will be a lot faster than loading the entire thing into a matrix.
library(raster)
map <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
extract(map, SpatialPoints(cbind(-123.3680884, 48.4252848)))
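A small sketch of how coordinates, cell numbers and row/column indices relate, using a toy 4 x 4 raster instead of the GlobCover file (the extent and values are made up for the example). A plain two-column matrix of x/y coordinates can be passed to extract() as well.

```r
library(raster)

# Toy raster: 4 x 4 cells covering x = 0..4, y = 0..4
r <- raster(nrow = 4, ncol = 4, xmn = 0, xmx = 4, ymn = 0, ymx = 4)
values(r) <- 1:16            # cells are numbered row by row from the top left

xy <- cbind(x = 0.5, y = 3.5)   # a point inside the top-left cell

cell <- cellFromXY(r, xy)       # cell number for that coordinate
print(cell)                     # 1
print(r[cell])                  # 1
print(raster::extract(r, xy))   # 1

# Row/column indexing gives the same kind of answer
print(r[2, 3])                  # 7 (row 2, column 3)
```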
It seems that you are asking the wrong question.
To get the value of a single pixel (grid cell), you can use indexing. For example, for cell numbers 10,000 and 10,001 you can use r[10000:10001].
You could get all values by doing values(r). But that will fail for a very large raster like this (unless you have lots of RAM).
However, the question you need answered, it seems, is how to make a map by matching integer cell values with RGB colors.
Let's set up an example raster
library(raster)
r <- raster(nrow=4, ncol=4)
values(r) <- rep(c(11, 14, 20, 30), each=4)
And some matching RGB values
legend <- read.csv(text="Value,Label,Red,Green,Blue
11,Post-flooding or irrigated croplands (or aquatic),170,240,240
14,Rainfed croplands,255,255,100
20,Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%),220,240,100
30,Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%) ,205,205,102")
Compute the color code
legend$col <- rgb(legend$Red, legend$Green, legend$Blue, maxColorValue=255)
set up a "color table"
# start with white for all values (0 to 255)
ct <- rep(rgb(1,1,1), 256)
# fill in where necessary
ct[legend$Value+1] <- legend$col
colortable(r) <- ct
plot
plot(r)
You can also try:
tb <- legend[, c('Value', 'Label')]
colnames(tb)[1] = "ID"
tb$Label <- substr(tb$Label, 1,10)
levels(r) <- tb
library(rasterVis)
levelplot(r, col.regions=legend$col, at=0:length(legend$col))
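If what you want is the RGB components for one pixel rather than a map, a lookup with match() against the legend table may be enough. This is only a sketch: the labels are shortened by me, and the RGB numbers are copied from the example legend above.

```r
library(raster)

r <- raster(nrow = 4, ncol = 4)
values(r) <- rep(c(11, 14, 20, 30), each = 4)

# Shortened labels; RGB values taken from the example legend
legend <- data.frame(
  Value = c(11, 14, 20, 30),
  Label = c("Irrigated croplands", "Rainfed croplands",
            "Mosaic cropland", "Mosaic vegetation"),
  Red   = c(170, 255, 220, 205),
  Green = c(240, 255, 240, 205),
  Blue  = c(240, 100, 100, 102)
)

px <- r[2, 1]                              # value at row 2, column 1 (14 here)
hit <- legend[match(px, legend$Value), ]   # legend row for that value
print(hit)
```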

R- Raster math while preserving integer format

I have some large rasters (~110 MB each) I want to perform some raster calculations on. For the purposes of this example, I want to average the files SNDPPT_M_sl1_1km_ll.tif and SNDPPT_M_sl2_1km_ll.tif, available at this website. In reality, the math is a bit more complex (some multiplication and division of several rasters).
Both input rasters are integer (INT1U) data, and I would like the output to also be INT1U. However, whenever I try to do a raster calculation, it creates intermediate temporary files in floating point format which are very large in size. I am working on a laptop with about 7 GB of free hard drive space, which gets filled before the calculation is complete.
# load packages
require(raster)
## script control
# which property?
prop <- "SNDPPT"
# load layers
r.1 <- raster(paste0("1raw/", prop, "_M_sl1_1km_ll.tif"))
r.2 <- raster(paste0("1raw/", prop, "_M_sl2_1km_ll.tif"))
# allocate space for output raster - this is about 100 MB (same size as input files)
r.out <- writeRaster(r.1,
filename=paste0("2derived/", prop, "_M_meanTop200cm_1km_ll.tif"),
datatype="INT1U")
# perform raster math calculation
r.out <- integer(round((r.out+r.2)/2))
# at this point, my hard drive fills due to temporary files > 7 GB in size
Is anyone aware of a workaround to perform raster math in R with integer input and output files while minimizing or avoiding the very large intermediate files?
The trick here could be to use raster::overlay to make the computation and save the results as a compressed tiff at the same time. Something like this should work:
library(raster)
#> Loading required package: sp
# load layers
r.1 <- raster("C:/Users/LB_laptop/Downloads/SNDPPT_M_sl1_1km_ll.tif")
r.2 <- raster("C:/Users/LB_laptop/Downloads/SNDPPT_M_sl2_1km_ll.tif")
out <- raster::overlay(r.1, r.2,
fun = function(x, y) (round((x + y) / 2)),
filename = "C:/Users/LB_laptop/Downloads/SNDPPT_out.tif",
datatype = "INT1U",
options = "COMPRESS=DEFLATE")
> out
class : RasterLayer
dimensions : 16800, 43200, 725760000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -56.00083, 83.99917 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : C:\Users\LB_laptop\Downloads\SNDPPT_out.tif
names : SNDPPT_out
values : 0, 242 (min, max)
HTH.
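For anyone who wants to try the pattern without downloading the SoilGrids files, here is a small self-contained sketch of the same overlay() call on two made-up rasters, writing to a temporary file. The input values are arbitrary, and I am assuming GeoTIFF writing works in your installation.

```r
library(raster)

# Two small made-up integer rasters
a <- raster(nrow = 10, ncol = 10)
values(a) <- rep(c(10, 20), 50)
b <- raster(nrow = 10, ncol = 10)
values(b) <- rep(c(30, 41), 50)

out_file <- tempfile(fileext = ".tif")

# Compute the rounded mean and write it straight to disk as 1-byte integers
avg <- overlay(a, b,
               fun = function(x, y) round((x + y) / 2),
               filename = out_file,
               datatype = "INT1U",
               options = "COMPRESS=DEFLATE")

print(dataType(avg))      # data type of the file that was written
print(file.size(out_file))
```

Because the result is written with the requested data type as it is computed, there should be no separate floating point copy of the output to clean up afterwards.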

Random sampling from large rasterlayer

I have a large Rasterlayer with integers ranging from 0 to 44.
class : RasterLayer
dimensions : 29800, 34470, 1027206000 (nrow, ncol, ncell)
resolution : 10, 10 (x, y)
extent : 331300, 676000, 5681995, 5979995 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=32 +ellps=GRS80 +units=m +no_defs
data source : /home/mkoehler/stk_rast_whz
names : stk_rast_whz
values : 0, 44 (min, max)
I want to do a stratified sampling of 5000 points per stratum.
I get the following error:
POINTS<-sampleStratified(b, size=5000, na.rm=T, xy=F)
(Error in ys[[i]] <- y : attempt to select less than one element)
Here is a code that reproduces the problems (even when only selecting 1
item per stratum):
set.seed(10)
r <- raster(ncol=5000, nrow=5000)
names(r) <- 'stratum'
r[] <- round((runif(ncell(r)))*44)
sampleStratified(r, size=1,xy=T)
Error in ys[[i]] <- y : attempt to select less than one element
Trying that with fewer strata and changing the settings of "size" or
"exp" have no effect.
R version: [64-bit] C:\Program Files\R\R-3.1.1
Any ideas?
thanks in advance!
This appears to be a bug (as of raster 2.3-12). It occurs when (1) your raster contains cells with value 0, and (2) the raster can't be processed in memory (i.e. canProcessInMemory(r) is FALSE).
The function loops over the unique cell values produced by freq(r), and then indexes a list by each of these values in turn. If one of those values is zero, the error will be triggered since the 0th element does not exist. For example:
list()[[0]]
# Error in list()[[0]] : attempt to select less than one element
You'll notice that the error doesn't occur if you fill r with, e.g., r[] <- sample(44, ncell(r), replace=TRUE), since it won't have any zeroes.
When the raster can be processed in memory, the function loops over the row numbers of freq(r), and so the subsequent list indexing is sensible.
I've contacted the maintainer to report this bug.
Meanwhile, as a temporary fix, you could use something like the following to make a corrected copy of the function (which will remain available in the current R session).
sampleStratified2 <-
eval(parse(text=sub('sr\\[, 2\\] == i', 'sr[, 2] == f[i, 1]',
sub('i in f\\[, 1\\]', 'i in seq_len(nrow(f))',
deparse(getMethod(sampleStratified,
signature='RasterLayer')@.Data))
)))
sampleStratified2(r, size=1, xy=TRUE)
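Another possible workaround, which I have not tested on the affected version, is to avoid the zero stratum altogether: shift all values up by one, sample, then shift the stratum column back. The sketch below uses a small in-memory raster (so the bug itself would not trigger here) and assumes the stratum value is the last column of the result.

```r
library(raster)

set.seed(10)
r <- raster(ncol = 50, nrow = 50)
r[] <- sample(0:4, ncell(r), replace = TRUE)
names(r) <- "stratum"

# Shift strata from 0..4 to 1..5 so that no stratum is zero
shifted <- r + 1

s <- sampleStratified(shifted, size = 5, xy = TRUE)

# Shift the stratum column (assumed to be the last one) back to 0..4
s[, ncol(s)] <- s[, ncol(s)] - 1

print(table(s[, ncol(s)]))
```

On a raster that is too large for memory, r + 1 will itself write a temporary file, so this trades the bug for some extra disk use.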

Raster overlay from a matrix

I have a matrix of 100 raster layers and I'd like to create one new layer that is the average. I understand if there were two layers I could simply use the overlay function or perhaps just use c <- mean (a, b). However, I'm not sure how to proceed with the matrix.
Here is sample of the matrix:
[[1]]
class : RasterLayer
dimensions : 175, 179, 31325 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : 0, 179, 0, 175 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : in memory
names : layer
values : 0, 100 (min, max)
I have tried
a.avg <- mean (a.total[,])
and I receive the error argument is not numeric or logical: returning NA
I assume you have a list of RasterLayers (or perhaps a stack) rather than a matrix; I have called it mylistofrasters. If you already have a stack, skip step one.
#1 - Get all rasters in the list into a stack
mystack <- do.call( stack , mylistofrasters )
#2 - Take mean of each pixel in the stack returning a single raster that is the average
mean.stack <- calc( mystack , mean , na.rm = TRUE )
This answer is similar to @SimonO101's answer, using simpler code.
First, let's build a list of RasterLayer (you can skip this step if you already have the list):
library(raster)
r <- raster(nrow=10, ncol=10)
r <- init(r, runif)
lr <- lapply(1:8, function(i)r)
The raster package defines a stack method for lists, so you can use it directly without do.call:
s <- stack(lr)
Besides, there is a mean method for Raster* objects. Therefore, you don't really need calc:
mean(s, na.rm=TRUE)
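A quick sketch to check that the two routes agree, and to show what na.rm does when one layer has a missing cell. The layers here are constant rasters made up for the example.

```r
library(raster)

# Four constant layers with values 1, 2, 3 and 4
lr <- lapply(1:4, function(k) {
  r <- raster(nrow = 3, ncol = 3)
  values(r) <- k
  r
})
lr[[1]][1] <- NA          # knock out the first cell of the first layer

s <- stack(lr)

m1 <- mean(s, na.rm = TRUE)        # mean method for Raster* objects
m2 <- calc(s, mean, na.rm = TRUE)  # same thing through calc

print(m1[1])   # 3   (mean of 2, 3, 4; the NA is dropped)
print(m1[2])   # 2.5 (mean of 1, 2, 3, 4)
print(all.equal(as.numeric(values(m1)), as.numeric(values(m2))))
```

Without na.rm = TRUE the first cell of the result would be NA.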
