Subsetting a rasterbrick by months - r

I have a RasterBrick called y with 14 975 time layers: it holds the daily mean geopotential height for every day from 1.1.1979 to 31.12.2019 (14 975 days). The brick has the following description:
class : RasterBrick
dimensions : 221, 121, 26741, 14975 (nrow, ncol, ncell, nlayers)
resolution : 0.25, 0.25 (x, y)
extent : 14.875, 45.125, 24.875, 80.125 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84
source : C:/Users/Adam/AppData/Local/Temp/RtmpaKZVdb/raster/r_tmp_2020-10-26_165849_53084_29346.grd
names : index_1979.01.01, index_1979.01.02, index_1979.01.03, index_1979.01.04, index_1979.01.05, index_1979.01.06, index_1979.01.07, index_1979.01.08, index_1979.01.09, index_1979.01.10, index_1979.01.11, index_1979.01.12, index_1979.01.13, index_1979.01.14, index_1979.01.15, ...
min values : 46604.85, 47328.07, 48944.12, 49320.65, 49244.67, 49516.16, 49504.01, 48959.65, 48608.90, 47603.10, 47572.72, 48564.15, 49816.92, 49078.65, 48321.72, ...
max values : 57006.81, 56968.60, 56958.67, 56976.26, 57288.55, 57535.62, 57659.48, 57581.33, 57381.65, 57052.99, 56803.95, 56854.89, 56783.50, 56739.44, 56600.52, ...
and I would like to split this RasterBrick into 12 RasterBricks, one for every calendar month. I tried several approaches but none worked. For example, I tried to extract the month from names(y), which I think is the way to go, but I could not get it to work. Any help is appreciated, thank you!

Try this:
# Parse the layer names into dates and extract the month of each layer
layer_names_original <- names(y)
layer_names_2 <- gsub('index_', '', layer_names_original)
layer_names_3 <- gsub('\\.', '-', layer_names_2)
layer_names_3_as_date <- as.Date(layer_names_3)
library(lubridate)
layer_names_3_in_months <- month(layer_names_3_as_date)
# Select the layers of month 'i' by position (y[[i]] alone would return only layer i)
i <- 1 # January
id_match <- which(layer_names_3_in_months == i)
y_i <- y[[id_match]]
# The selected layers keep their original names
all(names(y_i) == layer_names_original[id_match])
# PS: try a stack instead of a brick in case of problems (y <- stack(y))

Alternatively, try this:
layer_names <- names(y)
layer_names_2 <- gsub('index_', '', layer_names)
layer_names_3 <- gsub('\\.', '-', layer_names_2)
layer_names_3_as_date <- as.Date(layer_names_3)
library(lubridate)
layer_names_3_in_months <- month(layer_names_3_as_date)
# We filtered the layers of the month 'i'
i <- 1 # january
layer_names_3_in_months_i <- which(layer_names_3_in_months==i)
layer_month_i <- names(y)[layer_names_3_in_months_i]
# Filter done
y[[layer_month_i]]
# PS: try a stack instead of a brick in case of problems (y <- stack(y))
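To get all 12 monthly subsets at once, the same idea can be wrapped in lapply. The sketch below is not from the answers above: it builds a small stand-in stack with the same naming scheme (two years of tiny 2 x 2 layers, purely hypothetical data) so that it runs on its own, and it assumes the layer names follow the index_YYYY.MM.DD pattern shown in the question.

```r
library(raster)

# Hypothetical stand-in for the real brick: two years of daily 2 x 2 layers
dates <- seq(as.Date("1979-01-01"), as.Date("1980-12-31"), by = "day")
r <- raster(nrow = 2, ncol = 2)
y <- stack(lapply(seq_along(dates), function(k) setValues(r, k)))
names(y) <- format(dates, "index_%Y.%m.%d")

# Month number of every layer, parsed from the layer names
layer_month <- as.integer(format(as.Date(names(y), format = "index_%Y.%m.%d"), "%m"))

# One Raster* object per calendar month, collected in a named list
by_month <- lapply(1:12, function(m) y[[which(layer_month == m)]])
names(by_month) <- month.name

nlayers(by_month[["January"]])
```

Each list element can then be used on its own, e.g. by_month[["July"]], or written to disk with writeRaster.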

Related

stack geotiff with stars 'along' when 'band' dimension contains band + time information

I have a timeseries of geotiff files I'd like to stack in R using stars. Here's the first two:
urls <- paste0("/vsicurl/",
"https://sdsc.osn.xsede.org/bio230014-bucket01/neon4cast-drivers/",
"noaa/gefs-v12/cogs/gefs.20221201/",
c("gep01.t00z.pgrb2a.0p50.f003.tif", "gep01.t00z.pgrb2a.0p50.f006.tif"))
library(stars)
stars::read_stars(urls, along="time")
Errors with:
Error in c.stars_proxy(`3` = list(gep01.t00z.pgrb2a.0p50.f003.tif = "/vsicurl/https://sdsc.osn.xsede.org/bio230014-bucket01/neon4cast-drivers/noaa/gefs-v12/cogs/gefs.20221201/gep01.t00z.pgrb2a.0p50.f003.tif"), :
don't know how to merge arrays: please specify parameter along
Context: bands contain both time+band info
This fails because the dimensions do not match, which happens because the files have concatenated temporal information into the band names:
x<- lapply(urls, read_stars)
x
produces:
[[1]]
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max.
gep01.t00z.pgrb2a.0p50.f003.ti... 50026.01 98094.81 101138 98347.42 101845.2 104605.2
dimension(s):
from to offset delta refsys point
x 1 720 -180.25 0.5 Coordinate System importe... FALSE
y 1 361 90.25 -0.5 Coordinate System importe... FALSE
band 1 8 NA NA NA NA
values x/y
x NULL [x]
y NULL [y]
band PRES:surface:3 hour fcst,...,DLWRF:surface:0-3 hour ave fcst
[[2]]
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max.
gep01.t00z.pgrb2a.0p50.f006.ti... 50029.83 98101.83 101170.6 98337.52 101825 104588.2
dimension(s):
from to offset delta refsys point
x 1 720 -180.25 0.5 Coordinate System importe... FALSE
y 1 361 90.25 -0.5 Coordinate System importe... FALSE
band 1 8 NA NA NA NA
values x/y
x NULL [x]
y NULL [y]
band PRES:surface:6 hour fcst,...,DLWRF:surface:0-6 hour ave fcst
Note the band names would align except for the existence of the timestamp being tacked on, e.g. PRES:surface:3 hour fcst vs PRES:surface:6 hour fcst.
How can I best read in these files so that I have dimensions of x,y,band, and time in my stars object?
alternatives: terra?
How about terra? Note that terra is happy to read these files in directly, but treats this as 16 unique bands. Can I re-align that so that I have the original 8 bands along a new "time" dimension? (I recognize stars emphasizes 'spatio-temporal', so maybe such a cube is out of scope for terra?) Also note that terra for some reason mangles the timestamp in these band names:
x <- terra::rast(urls)
x
class : SpatRaster
dimensions : 361, 720, 16 (nrow, ncol, nlyr)
resolution : 0.5, 0.5 (x, y)
extent : -180.25, 179.75, -90.25, 90.25 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat Coordinate System imported from GRIB file
sources : gep01.t00z.pgrb2a.0p50.f003.tif (8 layers)
gep01.t00z.pgrb2a.0p50.f006.tif (8 layers)
names : PRES:~ fcst, TMP:2~ fcst, RH:2 ~ fcst, UGRD:~ fcst, VGRD:~ fcst, APCP:~ fcst, .
With terra it is pretty easy to make a time-series for each variable as I show below.
urls <- paste0("/vsicurl/",
"https://sdsc.osn.xsede.org/bio230014-bucket01/neon4cast-drivers/",
"noaa/gefs-v12/cogs/gefs.20221201/",
c("gep01.t00z.pgrb2a.0p50.f003.tif", "gep01.t00z.pgrb2a.0p50.f006.tif"))
library(terra)
r <- rast(urls)
Extract two variables of interest
nms <- names(r)
tmp <- r[[grep("TMP", nms)]]
rh <- r[[grep("RH", nms)]]
# set time
tm <- as.POSIXct("2022-12-01", tz="GMT") + c(3,6) * 3600
time(rh) <- tm
time(tmp) <- tm
And you could combine them into a SpatRasterDataset like this:
s <- sds(list(tmp=tmp, rh=rh))
An alternative path to get to the same point would be to start with a SpatRasterDataset and subset it.
sd <- sds(urls)
nl <- 1:length(sd)
nms <- names(sd[1])
tmp2 <- rast(sd[nl, grep("TMP", nms)])
time(tmp2) <- tm
rh2 <- rast(sd[nl, grep("RH", nms)])
time(rh2) <- tm
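The time vector tm above was typed by hand as c(3,6) hours. With more files it may be easier to derive it from the file names. A minimal sketch, assuming the forecast hour is always encoded as the three digits after ".f" (as in the two file names used here) and that the forecast start is 2022-12-01 00:00 GMT:

```r
files <- c("gep01.t00z.pgrb2a.0p50.f003.tif",
           "gep01.t00z.pgrb2a.0p50.f006.tif")

# Pull the three-digit forecast hour out of each file name
fcst_hours <- as.integer(sub(".*\\.f([0-9]{3})\\.tif$", "\\1", files))

# Forecast start plus the lead time in seconds
tm <- as.POSIXct("2022-12-01", tz = "GMT") + fcst_hours * 3600
tm
```

The resulting tm can be assigned with time(tmp) <- tm exactly as before.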
I made the subsetting work a little nicer in terra version 1.7-5
urls <- paste0("/vsicurl/",
"https://sdsc.osn.xsede.org/bio230014-bucket01/neon4cast-drivers/",
"noaa/gefs-v12/cogs/gefs.20221201/", c("gep01.t00z.pgrb2a.0p50.f003.tif", "gep01.t00z.pgrb2a.0p50.f006.tif"))
library(terra)
#terra 1.7.5
sd <- sds(urls)
tmp <- sd[,2]
tmp
#class : SpatRaster
#dimensions : 361, 720, 2 (nrow, ncol, nlyr)
#resolution : 0.5, 0.5 (x, y)
#extent : -180.25, 179.75, -90.25, 90.25 (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat Coordinate System imported from GRIB file
#sources : gep01.t00z.pgrb2a.0p50.f003.tif
# gep01.t00z.pgrb2a.0p50.f006.tif
#names : TMP:2 m above g~Temperature [C], TMP:2 m above g~Temperature [C]
#unit : C, C
#time : 2022-12-01 03:00:00 to 2022-12-01 06:00:00 UTC
As for the layer names containing the forecast time, that is just because that is what is in the tif metadata. It looks like that was a decision made when they were created from the original GRIB files.
The latitude extent going beyond the north and south poles is an interesting feature of this dataset.
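Because the forecast time is part of each layer name, grouping layers by variable means stripping that last field first. A minimal base R sketch; the TMP names below are illustrative, only the PRES names are taken from the output in the question:

```r
# Illustrative layer names in the "variable:level:forecast time" pattern
nms <- c("PRES:surface:3 hour fcst", "TMP:2 m above ground:3 hour fcst",
         "PRES:surface:6 hour fcst", "TMP:2 m above ground:6 hour fcst")

# Drop everything after the last colon, leaving "variable:level"
variable <- sub(":[^:]*$", "", nms)

# Layer positions belonging to each variable, in time order
idx <- split(seq_along(nms), variable)
idx
```

With a SpatRaster r whose names follow this pattern, r[[idx[["PRES:surface"]]]] would then give the pressure time series.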
Just wanted to share some additional possible solutions for comparison. With larger numbers of files some of these differences become more relevant. This expands a bit beyond my original question.
terra
Prof Hijmans gives a very nice solution in terra. He also asked about the original upstream sources, which I didn't explain properly -- these are originally GRIB files for NOAA GEFS forecast.
Notably, we can work directly from the GRIB files. GEFS is a 35-day forecast, so let's try going more than 6 hrs into the future:
library(terra)
# original GRIB sources, AWS mirror
gribs <- paste0("/vsicurl/https://noaa-gefs-pds.s3.amazonaws.com/gefs.20220314/00/atmos/pgrb2ap5/geavg.t00z.pgrb2a.0p50.f",
stringr::str_pad(seq(3,240,by=3), 3, pad="0"))
bench::bench_time({
cube <- terra::sds(gribs)
})
cube[1,63] |> plot()
very nice!
gdalcubes
gdalcubes is another package that can also leverage the gdal virtual filesystem when working with these large-ish remote files. It also lets us define an abstract cube at potentially a different resolution in space & time than the original sources (averaging or interpolating). Lazy operations mean this may run a bit faster(?)
library(gdalcubes)
date <- as.Date("2023-01-26")
date_time = date + lubridate::hours(seq(3,240,by=3))
# USA box
v <- cube_view(srs = "EPSG:4326",
extent = list(left = -125, right = -66,top = 49, bottom = 25,
t0= as.character(min(date_time)), t1=as.character(max(date_time))),
dx = 0.5, dy = 0.5, dt = "PT3H")
gribs <- paste0("/vsicurl/https://noaa-gefs-pds.s3.amazonaws.com/gefs.20220314/00/atmos/pgrb2ap5/geavg.t00z.pgrb2a.0p50.f",
stringr::str_pad(seq(3,240,by=3), 3, pad="0"))
bench::bench_time({
cube <- gdalcubes::create_image_collection(gribs, date_time = date_time)
})
bench::bench_time({
raster_cube(cube, v) |>
select_bands("band63") |> # temperature
animate(col = viridisLite::viridis, nbreaks=50, fps=10, save_as = "temp.gif")
})
stars
I didn't translate a full stars example, but here at least is the band name correction; it is a bit more cumbersome than the examples above.
urls <- paste0("/vsicurl/",
"https://sdsc.osn.xsede.org/bio230014-bucket01/neon4cast-drivers/",
"noaa/gefs-v12/cogs/gefs.20221201/",
c("gep01.t00z.pgrb2a.0p50.f003.tif", "gep01.t00z.pgrb2a.0p50.f006.tif"))
library(stars)
#stars::read_stars(urls, along="time") # no luck!
## grab unstacked proxy object for each geotiff
x <- lapply(urls, read_stars)
# extract band-names-part
band_names <- st_get_dimension_values(x[[1]], "band") |>
stringr::str_extract("([A-Z]+):") |>
stringr::str_remove(":") # stringr is not attached, so qualify the call
# apply corrected band-names
x1 <- lapply(x, st_set_dimensions, "band", band_names)
# at last, we can stack into a cube:
x1 <- do.call(c, c(x1, along="time"))
# and add correct date timestamps to the new time dimension
dates <- as.Date("2022-12-01") + lubridate::hours(c(3,6))
x1 <- st_set_dimensions(x1, "time", dates)
x1

Why intersect function from terra R package not giving all the combinations?

I want to calculate the area under every possible combination of two classified rasters. I am using the following code
library(terra)
#First create two rasters
r1 <- r2 <- rast(nrow=100, ncol=100)
#Assign random cell values
set.seed(123)
values(r1) <- runif(ncell(r1), min=0, max=1)
values(r2) <- runif(ncell(r2), min=0, max=1)
# classify the values into two groups
m_r1 <- c(min(global(r1, "min", na.rm=TRUE)), 0.2, 1,
0.2, max(global(r1, "max", na.rm=TRUE)), 2)
m_r2 <- c(min(global(r2, "min", na.rm=TRUE)), 0.2, 1,
0.2, max(global(r2, "max", na.rm=TRUE)), 2)
#Reclassify the rasters
rclmat_r1 <- matrix(m_r1, ncol=3, byrow=TRUE)
rc_r1 <- classify(r1, rclmat_r1, include.lowest=TRUE)
rclmat_r2 <- matrix(m_r2, ncol=3, byrow=TRUE)
rc_r2 <- classify(r2, rclmat_r2, include.lowest=TRUE)
plot(rc_r1)
plot(rc_r2)
#Convert to polygons
r1_poly <- as.polygons(rc_r1, dissolve=TRUE)
r2_poly <- as.polygons(rc_r2, dissolve=TRUE)
plot(r1_poly)
plot(r2_poly)
#Perform intersections
x <- intersect(r1_poly, r2_poly)
x
#> class : SpatVector
#> geometry : polygons
#> dimensions : 2747, 2 (geometries, attributes)
#> extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84
#> names : lyr.1 lyr.1
#> type : <int> <int>
#> values : 1 1
#> 1 2
#> 2 1
As you can see from the output, one combination i.e. 2-2 is missing. Why is this happening?
When I am trying to calculate the area for each combination using expanse(x), it returns a long result. How can I get the area in km2 for the following combinations?
Combination Area (km2)
1-1
1-2
2-1
2-2
With this example it would be better to stay with raster data.
x = 10 * rc_r1 + rc_r2
a = cellSize(x, unit="km")
zonal(a, x, sum)
# lyr.1 area
#1 11 19886611
#2 12 81946082
#3 21 84763905
#4 22 323469024
By multiplying with 10, the values in the first layer become 10 (if they were 1) or 20 (if they were 2). If you then add the second layer, you get 10 + 1 or 2 and 20 + 1 or 2, so you end up with four classes: 11, 12, 21, and 22. These show the value in the first raster (first digit) and in the second raster (second digit).
When you show a SpatVector only the first three records are printed, and there is a 2-2 record. Nevertheless, intersect did not work properly and I have now fixed this.
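The 10 * first + second encoding can be checked on plain vectors. This is only a sketch with made-up class values and cell areas, not output from the rasters above:

```r
# Hypothetical class values of five cells in the two rasters
r1 <- c(1, 1, 2, 2, 2)
r2 <- c(1, 2, 1, 2, 2)
# Hypothetical cell areas in km2
area <- c(10, 20, 30, 40, 50)

# Combine both classes into one two-digit code
combo <- 10 * r1 + r2

# Total area per combination (what zonal does on the rasters)
totals <- tapply(area, combo, sum)
totals

# The code can be split back into its two classes
first  <- combo %/% 10
second <- combo %% 10
```

This only works while the second raster has fewer than 10 classes; with more classes a larger multiplier (100, 1000, ...) would be needed.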

Sum pixel values in a Raster stack based on another raster stack

I have a raster stack representing Evapotranspiration (ET) with 396 layers (3 raster layers for a month for 11 years in total - 2009 to 2019). For each month the raster layer always represents 1st, 11th and 21st day of the month called dekads. Here is the sample dataset
library(raster)
#create a raster with random numbers
r <- raster(ncol=5, nrow=5, xmx=-80, xmn=-150, ymn=20, ymx=60)
values(r) <- runif(ncell(r))
#create a random raster stack for 3 raster a month for 11 years
n <- 396 #number of raster
s <- stack(replicate(n, r)) # convert to raster stack
#rename raster layers to reflect date
d =rep(c(1,11,21),132)
m =rep(1:12, 11, each =3)
y = rep (2009:2019, each =36)
df.date <- as.Date(paste(y, m, d,sep="-"), "%Y-%m-%d")
names(s) = df.date
I also have two other raster stacks with pixel values representing Season start (ss) 11 layers and season end (se)11 layers for years 2009 to 2019.
#create a raster stack representing season start (ss) and season end (se)
# The pixel value represents dekad number. Each raster layer covers exactly three calendar years with the target year in the middle.
# (1-36 for the first year, 37-72 for the target year, 73-108 for the next year).
ss.1 = r # season start raster
values(ss.1)= as.integer(runif(ncell(ss.1), min=1, max=72))
se.1 = ss.1+10 # season end raster
yr = 11
ss <- stack(replicate(yr, ss.1)) # season start raster stack
se <- stack(replicate(yr, se.1)) #season end rasterstack
Now I need to estimate seasonal sum for each year from the "s" raster stack such that the time period for each pixels to sum should correspond to pixel values from "ss" and "se" by considering a 3 year moving window.
Here is an example of the output I need for one time step (3-year window) with one season start (ss) raster and one season end (se) raster. But I am really stuck at looping through the three raster stacks (s representing the dataset, ss the season start date and se the season end date).
Grateful for any help.
# Example to calculate pixel based sum for 1 time step
#subset first 3 years - equal to 108 dekads
s.sub = subset(s, 1:108)
# sum each grid cells of "s" raster stack using "ss.1" and "se.1" as an indicator for the three year subset.
for (i in 1:ncell(s.sub)) {
x[i] <- sum(s[[ss.1[i]:se.1[i]]][i], na.rm = T)
}
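For the looping part, the layer indices of each 3-year window can be computed up front. The sketch below rests on an assumption that the question leaves open: each window is centred on the target year, so the first and last year (2009 and 2019) have no complete window. If the windows are meant to start at the target year instead, the index arithmetic would shift accordingly.

```r
n_years <- 11   # 2009 to 2019
dekads  <- 36   # dekads per year

# Target years 2..10 have a full previous and next year available
target <- 2:(n_years - 1)

# Layer indices of the 108-dekad window around each target year
windows <- lapply(target, function(t) ((t - 2) * dekads + 1):((t + 1) * dekads))
names(windows) <- 2009 + target - 1

range(windows[["2010"]])
range(windows[["2018"]])
```

Each element could then be used as subset(s, windows[[k]]) together with the matching ss and se layers.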
The last part of your example did not work and I changed it to this (for one year)
x <- rep(NA, ncell(s))
for (i in 1:ncell(s)) {
x[i] <- sum(s[i][ss.1[i]:se.1[i]], na.rm = T)
}
x <- setValues(ss.1, x)
x
#class : RasterLayer
#dimensions : 5, 5, 25 (nrow, ncol, ncell)
#resolution : 14, 8 (x, y)
#extent : -150, -80, 20, 60 (xmin, xmax, ymin, ymax)
#crs : +proj=longlat +datum=WGS84 +no_defs
#source : memory
#names : layer
#values : 0.6505058, 10.69957 (min, max)
You can get that result like this
idx <- stack(ss.1, se.1)
thefun <- function(x, y){
apply(cbind(y, x), 1, function(i) sum(i[(i[1]:i[2])+2], na.rm = T))
}
z <- overlay(s, idx, fun=thefun)
There are more examples here for a similar question.
Given that this is a general problem, I have added a function rapp (range-apply) for it in terra (the replacement for raster) --- available here; this should be on CRAN in early July.
library(terra)
r <- rast(ncols=5, nrows=5, xmin=-150, xmax=-80, ymin=20, ymax=60)
values(r) <- 1:ncell(r)
s <- rast(replicate(36, r))
ss.1 <- r
values(ss.1) <- as.integer(runif(ncell(ss.1), min=1, max=72))
se.1 <- ss.1+10
x <- rapp(s, ss.1, se.1, sum)
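What a range-apply computes can be illustrated without any raster package: for every cell, sum the layers between that cell's own start and end index. A minimal sketch on a hypothetical cells-by-layers matrix:

```r
set.seed(1)
# 5 cells (rows) by 12 layers (columns) of made-up values
v <- matrix(runif(5 * 12), nrow = 5)

# Per-cell first and last layer to include
first_layer <- c(1, 3, 5, 2, 8)
last_layer  <- first_layer + 3

# Sum each row over its own layer range
range_sum <- vapply(seq_len(nrow(v)), function(i) {
  sum(v[i, first_layer[i]:last_layer[i]], na.rm = TRUE)
}, numeric(1))
range_sum
```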

How to overlay two Rasters using ifelse (Conditional Statements) in R?

I have two rasters (images), and want to overlay them using this code:
# Getting the images
library(raster)
URL1 <- "https://www.dropbox.com/s/6jjz7ou1skz88wr/raster_1.tif?dl=1"
URL2 <- "https://www.dropbox.com/s/d5xuixohjqfnfze/raster_2.tif?dl=1"
download.file(URL1, destfile=paste0(getwd(),"/", "raster_1.tif"), method="auto", mode="wb", timeout="6000")
download.file(URL2, destfile=paste0(getwd(),"/", "raster_2.tif"), method="auto", mode="wb", timeout="6000")
# Reading the images
raster_1 <- raster(list.files(pattern="raster_1.tif$"))
raster_2 <- raster(list.files(pattern="raster_2.tif$"))
# Overlaying
myFun <- function(x,y){ifelse(x==0 && y==0, 0, ifelse(x==1 && y==0, 2, ifelse(x==1 && y>0, y)))}
( res <- overlay(stack(raster_1 ,raster_2), fun = Vectorize(myFun) ) )
### R gives this error
Error in .overlayList(x, fun = fun, filename = filename, forcefun = forcefun, :
cannot use this formula, probably because it is not vectorized
I would be very grateful if anyone could help me.
Thanks.
You need a function that only uses vectorized operators. This is a case where Boolean arithmetic should both succeed and be more efficient
myFun <- function(x,y){ 0*(x==0 && y==0)+
2*(x==1 && y==0)+
y*(x==1 && y>0) }
There are some edge cases that do not appear covered. Can x ever be a value other than exactly 0 or 1? Can y ever be negative?
After running my version I get:
> ( res <- overlay(stack(raster_1 ,raster_2), fun = Vectorize(myFun) ) )
class : RasterLayer
dimensions : 2958, 1642, 4857036 (nrow, ncol, ncell)
resolution : 500, 500 (x, y)
extent : -171063.8, 649936.2, 5317253, 6796253 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=12 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
data source : in memory
names : layer
values : 0, 14751 (min, max)
I didn't think I would need to use Vectorize around myFun, but the results seem more likely to be correct when I leave it in the call to overlay:
> Hmisc::describe(values(res))
values(res)
n missing distinct Info Mean Gmd .05 .10 .25
3222508 1634528 1502 0.727 4918 6403 0 0 0
.50 .75 .90 .95
0 13898 14082 14168
Value 0 13000 13200 13400 13600 13800 14000 14200 14400
Frequency 2089448 67 578 10515 69031 249817 523241 226628 46191
Proportion 0.648 0.000 0.000 0.003 0.021 0.078 0.162 0.070 0.014
Value 14600 14800
Frequency 6876 116
Proportion 0.002 0.000
When I took out the Vectorize step I did not get an error but I got all zeros, instead.
It is not clear what you are really trying to achieve, and there might be better solutions. In your example data, Y (raster_2) has no values of zero. That suggests that you want the values of raster_2 where raster_1 is not 0? That can be achieved like this:
m <- mask(raster_2, raster_1, maskvalue=0)
I think that 42-'s myFun has a problem in that it returns 0 when none of the conditions are true, specifically when (x == 0 & y > 0)
To make it work with overlay, replace the && with &
myFunV <- function(x,y){
0*(x==0 & y==0)+
2*(x==1 & y==0)+
y*(x==1 & y>0) }
res <- overlay(raster_1, raster_2, fun = myFunV)
(but, again, I doubt that this is good approach for your needs)
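The behaviour of the vectorized version is easy to check on plain vectors, including the uncovered case x == 0 and y > 0. The second function, myFunNA, is a hypothetical variant (not from the answers) that returns NA instead of 0 when no condition matches:

```r
myFunV <- function(x, y) {
  0 * (x == 0 & y == 0) +
  2 * (x == 1 & y == 0) +
  y * (x == 1 & y > 0)
}

# Hypothetical variant: NA when none of the three conditions holds
myFunNA <- function(x, y) {
  ifelse(x == 0 & y == 0, 0,
         ifelse(x == 1 & y == 0, 2,
                ifelse(x == 1 & y > 0, y, NA)))
}

x <- c(0, 1, 1, 0)
y <- c(0, 0, 5, 7)

myFunV(x, y)   # last element is 0 although no condition matched
myFunNA(x, y)  # last element is NA
```

Both use the elementwise & rather than &&, so they work on whole vectors of cell values.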

Importing Sea Surface Temperature text files in ASCII format into R

I have downloaded multiple .txt.gz files for Hadley Sea Surface Temperature observations. The data have been unzipped, resulting in multiple .txt files in ASCII format.
I have the following files (the R script is the one I'm working on):
list.files()
[1] "Get_SST_Data.R" "HadISST1_SST_1931-1960.txt" "HadISST1_SST_1931-1960.txt.gz"
[4] "HadISST1_SST_1961-1990.txt" "HadISST1_SST_1961-1990.txt.gz" "HadISST1_SST_1991-2003.txt"
[7] "HadISST1_SST_2004.txt" "HadISST1_SST_2005.txt" "HadISST1_SST_2006.txt"
[10] "HadISST1_SST_2007.txt" "HadISST1_SST_2008.txt" "HadISST1_SST_2009.txt"
[13] "HadISST1_SST_2010.txt" "HadISST1_SST_2011.txt" "HadISST1_SST_2012.txt"
[16] "HadISST1_SST_2013.txt"
I would like to be able to use the temperature data to make a numeric vector of the Sea Surface Temperature for every day since 1950, to eventually make a time series plot.
Which will look something like this
[p.s. this is just for reference...]
Thanks in advance!
NetCDF is definitely a better way to go since the format of the ascii data is pretty horrible. That said, here's a function that reads in the data you have downloaded.
read.things <- function(f) {
# f is the file path of your ascii data
require(raster)
d <- readLines(f)
d <- split(d, rep(1:12, each=181))
d <- lapply(d, function(x) read.fwf(textConnection(x), rep(6, 360),
skip=1, stringsAsFactors=FALSE,
na.strings=c(-1000, -32768)))
d <- lapply(d, function(x) sapply(x, as.numeric))
out <- stack(lapply(d, raster))
names(out) <- month.abb
extent(out) <- c(-180, 180, -90, 90)
out/100
}
Note that I've set 100% ice cells (-1000) and land cells (-32768) as NA.
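The core of the parsing, fixed-width fields of 6 characters, sentinel codes turned into NA, and division by 100, can be tried on a single made-up line. This is only a sketch of the idea in base R, not a replacement for read.things, and the sample values are hypothetical:

```r
# One hypothetical data line: four fields of width 6
line <- paste0(sprintf("%6d", c(1234L, -1000L, -32768L, 567L)), collapse = "")

# Cut the line into 6-character fields and convert to numbers
starts <- seq(1, nchar(line), by = 6)
vals <- as.numeric(substring(line, starts, starts + 5))

# Sentinel codes (100% ice, land) become NA
vals[vals %in% c(-1000, -32768)] <- NA

# Scale by 100, as read.things does
temps <- vals / 100
temps
```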
Below, we download one of the files (1Mb) as an example:
download.file(
'http://www.metoffice.gov.uk/hadobs/hadisst/data/HadISST1_SST_2004.txt.gz',
destfile= {f <- tempfile()})
s <- read.things(f)
s
# class : RasterBrick
# dimensions : 180, 360, 64800, 12 (nrow, ncol, ncell, nlayers)
# resolution : 1, 1 (x, y)
# extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
# coord. ref. : NA
# data source : in memory
# names : Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec
# min values : -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10
# max values : 30.34, 30.58, 30.43, 30.50, 30.83, 31.39, 32.71, 33.40, 32.61, 31.52, 30.60, 30.51
library(rasterVis)
levelplot(s, at=seq(min(s[], na.rm=T), max(s[], na.rm=T), len=100),
col.regions=colorRampPalette(c('#2c7bb6', '#abd9e9', '#ffffbf',
'#fdae61', '#d7191c')))
[Edit: Tested only on Linux]
R is able to read NetCDF format (http://www.metoffice.gov.uk/hadobs/hadisst/data/HadISST_sst.nc.gz). You can use the "raster" package to read these data, after decompression, such as:
library(raster)
library(xts)
library(caTools)
Some time definitions:
startYear <- 1950 # start of the period
endYear <- 2011 # end of the period
subp <- '1951-01-01/1980-12-01' # period for the climatology calculation
Open the file:
sst <- brick('HadISST_sst.nc')
Date <- substr(names(sst),2,11)
Date <- gsub('\\.', '\\-', Date)
Date <- as.Date(Date)
dstart <- paste(startYear,'01','01',sep='-'); dstart <- grep(dstart, Date)
dend <- paste(endYear,'12','01',sep='-'); dend <- grep(dend, Date)
sst <- subset(sst, dstart:dend)
Date <- Date[dstart:dend]
Extract the time series for a specific point (lon=116, lat=-35):
tserie <- as.vector(extract(sst, cbind(116, -35)))
tserie <- xts(tserie, order.by=Date)
Calculate the climatology for the subp period:
clim <- as.numeric()
for(ii in 1:12){
clim[ii] <- mean(tserie[subp][(.indexmon(tserie[subp])+1) == ii])
}
clim <- xts(rep(clim, length(tserie)/12), order.by=Date)
Calculate anomalies:
tserie <- tserie - clim
Plot the result:
par(las=1)
plot(tserie, t='n', main='HadISST')
lines(tserie, col='grey')
lines(xts(runmean(tserie, 12), order.by=Date), col='red', lwd=2)
legend('bottomleft', c('Monthly anomaly','12-month moving avg'), lty=c(1,1), lwd=c(1,2), col=c('grey','red'))
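The climatology and anomaly steps can also be written without xts. A minimal base R sketch on a synthetic monthly series (three made-up years with a seasonal cycle and a yearly offset), so the expected anomalies are known in advance:

```r
# Synthetic monthly series, dated mid-month
dates <- seq(as.Date("2000-01-16"), by = "month", length.out = 36)
mon <- as.integer(format(dates, "%m"))
sst <- 15 + 5 * sin(2 * pi * (mon - 1) / 12) + rep(c(-0.5, 0, 0.5), each = 12)

# Mean of each calendar month over all years
clim <- tapply(sst, mon, mean)

# Subtract the matching monthly mean from every observation
anom <- unname(sst - clim[as.character(mon)])
round(anom, 2)
```

Here the first year comes out 0.5 below the climatology and the third year 0.5 above it, as constructed.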
You get the error because the dates in the file always fall on the 16th of each month, so a search for the 1st finds nothing. If you replace:
dstart <- paste(startYear,'01','01',sep='-'); dstart <- grep(dstart, Date)
dend <- paste(endYear,'12','01',sep='-'); dend <- grep(dend, Date)
with
dstart <- paste(startYear,'01','16',sep='-'); dstart <- grep(dstart, Date)
dend <- paste(endYear,'12','16',sep='-'); dend <- grep(dend, Date)
It will work.
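A way to avoid depending on the day of the month altogether is to match on year and month only. A minimal sketch with a made-up Date vector of mid-month dates:

```r
# Hypothetical mid-month dates, as in the NetCDF file
Date <- seq(as.Date("1949-11-16"), by = "month", length.out = 6)
startYear <- 1950

# Searching for the 1st of the month finds nothing
grep(paste(startYear, "01", "01", sep = "-"), as.character(Date))

# Matching on "YYYY-MM" works whatever the day is
dstart <- which(format(Date, "%Y-%m") == paste(startYear, "01", sep = "-"))
dstart
```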

Resources