Trying to convert NOAA Snow data (NetCDF) into raster format in R. The data has been pre-processed by me in CDO (interpolated from weekly to daily).
library(raster)
library(ncdf4)
nc<-nc_open('NOAA_Snow_JanJune2016.nc')
# extract variable name, size and dimension
v <- nc$var[[1]]
size <- v$varsize
dims <- v$ndims
nt <- size[dims] # length of time dimension
lat <- nc$dim$latitude$vals # latitude position
lon <- nc$dim$longitude$vals # longitude position
# read sea ice variable
r<-list()
for (i in 1:nt) {
  start <- rep(1, dims)   # begin with start=(1,1,...,1)
  start[dims] <- i        # change to start=(1,1,...,i) to read timestep i
  count <- size           # begin with count=(nx,ny,...,nt), reads entire var
  count[dims] <- 1        # change to count=(nx,ny,...,1) to read 1 tstep
  dt <- ncvar_get(nc, varid = 'snow_cover_extent', start = start, count = count)
  # convert to raster
  r[i] <- raster(dt)
}
Returns the following error:
Error in ncvar_get_inner(ncid2use, varid2use, nc$var[[li]]$missval, addOffset, :
Error: variable has 3 dims, but start has 2 entries. They must match!
Has anyone else had and solved this problem? I wonder if prepping the file in CDO is causing the problem. The (.nc) data can be accessed here:
https://drive.google.com/file/d/0Bz0W7Ut_SNfjeE9ObXpySzJ5UWs/view?usp=sharing
Many thanks!
Use ncdf4 to explore your NetCDF attributes
library(ncdf4)
nc<-nc_open('D:/NOAA_Snow_JanJune2016.nc')
print(nc)
varname<-names(nc$var)
Use raster to convert your NetCDF to a raster brick
library (raster)
r<-brick('D:/NOAA_Snow_JanJune2016.nc',varname='snow_cover_extent')
Here is the plot of the first layer:
spplot(r[[1]])
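As for the error in the original loop: a likely cause (an assumption based on the message) is that nc$var[[1]] is not snow_cover_extent but some other, 2-dimensional variable in the file, so start and count are built with the wrong length. Indexing the variable by name keeps them in sync:
v <- nc$var[["snow_cover_extent"]]  # build start/count from the target variable itself
size <- v$varsize                   # now length 3, matching the variable's dimensions
dims <- v$ndims                     # 3, so start and count get 3 entries each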
Related
I'm trying to visualize monthly averages of RegCM output that I joined using CDO, but I haven't been able to do it.
To do that, I was trying to find a way to plot the monthly averages of my variable "pr", as you can do in GrADS.
I found that one way to do this is the brick function from the raster library, so I tried the code suggested in another question to convert my netcdf file into a raster brick:
NetCDF to Raster Brick "Unable to find inherited method for function 'brick' for 'ncdf4'"
# load package
library(sp)
library(raster)
library(ncdf4)
# read ncdf file
nc<-nc_open('dat.nc')
# extract variable name, size and dimension
v <- nc$var[[1]]
size <- v$varsize
dims <- v$ndims
nt <- size[dims] # length of time dimension
lat <- nc$dim$xlat$vals # latitude position
lon <- nc$dim$xlong$vals # longitude position
# read pr variable
r<-list()
for (i in 1:nt) {
  start <- rep(1, dims)   # begin with start=(1,1,...,1)
  start[dims] <- i        # change to start=(1,1,...,i) to read timestep i
  count <- size           # begin with count=(nx,ny,...,nt), reads entire var
  count[dims] <- 1        # change to count=(nx,ny,...,1) to read 1 tstep
  dt <- ncvar_get(nc, varid = 'pr', start = start, count = count)
  # convert to raster
  r[i] <- raster(dt)
}
# create layer stack with time dimension
r<-stack(r)
# transpose the raster to have correct orientation
rt<-t(r)
extent(rt)<-extent(c(range(lon), range(lat)))
# plot the result
spplot(rt)
But when I try to run the for loop in the code, I get the following error:
Error in ncvar_get_inner(ncid2use, varid2use, nc$var[[li]]$missval, addOffset, :
Error: variable has 3 dims, but start has 2 entries. They must match!
The file I'm trying to visualize can be found in the following link:
https://drive.google.com/file/d/13KsOpnt-Wk2v93WwGcOU6AHw8KGOFlai/view?usp=sharing
I would really appreciate any insights into this problem!
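This looks like the same mismatch as in the question above: nc$var[[1]] is presumably a 2-dimensional variable while pr is 3-dimensional, so start ends up with too few entries. A minimal sketch of the fix, under that assumption:
v <- nc$var[["pr"]]   # index the variable by name, not by position
size <- v$varsize     # length 3 for a lon x lat x time variable
dims <- v$ndims
Alternatively, brick('dat.nc', varname = "pr") skips the manual loop entirely.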
I want to do some tests with my raster data in R. I need numeric values, but R shows me only integers. How can I change this? Any ideas? Thanks in advance :D
#libraries
library(raster)
library(rgdal)
setwd("C:/Users/cathe/Documents/Cropped_raster")
## polygon with crop-extend ##
shp <- readOGR("C:/Users/cathe/OneDrive/Documents/ArcGIS/Projects/CLC2000_shapefiles/CLC2000_Nichtdurchgängig städtische Prägung.shp")
## load tif files ##
infiles = list.files(path = getwd(),
                     pattern = "\\.tif$|\\.TIF$")
## Filenames with desired suffix and output place ##
outfiles = file.path("C:/Users/cathe/Documents/Cropped_raster_nicht_durchgängig",
                     paste0(basename(tools::file_path_sans_ext(infiles)),
                            ".tif"))
outfiles[outfiles == -9999] <- NA # set all -9999 to NA, if needed
## crop and output settings (compression and datatype)
for (i in seq_along(infiles)) {
  r = crop(stack(infiles[i]), shp)
  writeRaster(r, filename = outfiles[i],
              bylayer = FALSE,
              format = "GTiff",
              datatype = "numeric",
              options = "COMPRESS=ZIP",
              NAflag = -9999,
              overwrite = TRUE)
}
Your issue is in the datatype argument of writeRaster().
From the documentation:
datatype: Character. Output data type (e.g. 'INT2S' or 'FLT4S'). See
dataType. If no datatype is specified, 'FLT4S' is used, unless this
default value was changed with rasterOptions
Numeric only exists in the R world. Outside it you have integers and floats; which one you need depends on your data.
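So the fix is to request a float type explicitly. A minimal sketch of the corrected call, keeping the question's other arguments (note that GDAL's GTiff driver spells the zip-style compression DEFLATE):
writeRaster(r, filename = outfiles[i],
            bylayer = FALSE,
            format = "GTiff",
            datatype = "FLT4S",           # 32-bit float, i.e. "numeric" on disk
            options = "COMPRESS=DEFLATE",
            NAflag = -9999,
            overwrite = TRUE)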
I'm having a memory problem with R giving the cannot allocate vector of size XX Gb error message. I have a bunch of daily files (12784 days) in NetCDF format giving sea surface temperature on a 1305x378 (longitude-latitude) grid. That gives 493290 points each day, decreasing to about 245000 when removing NAs (over land points).
My final objective is to build a time series for any of the 245000 points from the daily files and find the temporal trend for each point. My idea was to build a big data frame with a point per row and a day per column (245000x12784) so I could apply the trend calculation to any point. But then, building such a data frame, the memory problem appeared, as expected.
First I tried a script I had previously used to read the data and extract a three-column (lon-lat-sst) data frame by reading the nc file and then melting the data. This led to excessive computing time when tried on a small set of days, and to the memory problem. Then I tried to subset the daily files into longitudinal slices; this avoided the memory problem, but the csv output files were too big and the process was very time consuming.
Another strategy I've tried so far without success is to sequentially read all the nc files, extract all the daily values for each point, and find the trend. Then I would only need to save a single 245000-point data frame. But I think this would be time consuming and not the proper R way.
I have been reading about the bigmemory and ff packages, trying to declare a big.matrix or a 3D array (1305 x 378 x 12784), but have had no success so far.
What would be the appropriate strategy to face the problem?
Extract single point time series to calculate individual trends and populate a smaller dataframe
Subset daily files in slices to avoid the memory problem but end with a lot of dataframes/files
Try to solve the memory problem with bigmemory or ff packages
Thanks in advance for your help
EDIT 1
Add code to fill the matrix
library(stringr)
library(ncdf4)
library(reshape2)
library(dplyr)
# paths
ruta_datos<-"/home/meteo/PROJECTES/VERSUS/CMEMS/DATA/SST/"
ruta_treball<-"/home/meteo/PROJECTES/VERSUS/CMEMS/TREBALL/"
setwd(ruta_treball)
sst_data_full <- function(inputfile) {
  sstFile <- nc_open(inputfile)
  sst_read <- list()
  sst_read$lon <- ncvar_get(sstFile, "lon")
  sst_read$lats <- ncvar_get(sstFile, "lat")
  sst_read$sst <- ncvar_get(sstFile, "analysed_sst")
  nc_close(sstFile)
  sst_read
}
melt_sst <- function(L) {
  dimnames(L$sst) <- list(lon = L$lon, lat = L$lats)
  sst_read <- melt(L$sst, value.name = "sst")
}
# One month list file: This ends with a df of 245855 rows x 33 columns
files <- list.files(path = ruta_datos, pattern = "SST-CMEMS-198201")
sst.out <- data.frame()
for (i in 1:length(files)) {
  sst <- sst_data_full(paste0(ruta_datos, files[i]))
  msst <- melt_sst(sst)
  msst <- subset(msst, !is.na(msst$sst))
  if (i == 1) {
    sst.out <- msst
  } else {
    sst.out <- cbind(sst.out, msst$sst)
  }
}
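For what it's worth, the cbind-in-a-loop pattern above copies the whole data frame on every iteration, which alone costs a lot of time and memory. A sketch of a preallocated alternative, assuming every daily file shares the same grid and the same NA (land) pattern:
first <- melt_sst(sst_data_full(paste0(ruta_datos, files[1])))
keep <- !is.na(first$sst)   # land mask, assumed constant across days
vals <- matrix(NA_real_, nrow = sum(keep), ncol = length(files))
for (i in seq_along(files)) {
  m <- melt_sst(sst_data_full(paste0(ruta_datos, files[i])))
  vals[, i] <- m$sst[keep]
}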
EDIT 2
Code used on a previous (smaller) data frame to calculate the temporal trend. The original data was a matrix of time series, each column being one series.
library(forecast)
data <- read.csv(....)
output <- numeric(length(data) - 1)           # one trend coefficient per series
for (i in 2:length(data)) {
  valor <- data[, i]                          # the i-th time series
  datos.ts <- ts(valor, frequency = 365)
  datos.stl <- stl(datos.ts, s.window = 365)  # seasonal decomposition
  datos.tslm <- tslm(datos.ts ~ trend)        # linear model against time
  output[i - 1] <- datos.tslm$coefficients[2] # the trend (slope) coefficient
}
(fecha is the name of the date variable)
EDIT 3
Working code from F. Privé's answer:
library(bigstatsr)
tmp <- sst_data_full(paste0(ruta_datos, files[1]))
mat <- FBM(length(tmp$sst), length(files),
           backingfile = "/home/meteo/PROJECTES/VERSUS/CMEMS/TREBALL")
for (i in seq_along(files)) {
  mat[, i] <- sst_data_full(paste0(ruta_datos, files[i]))$sst
}
With this code a big matrix was created
dim(mat)
[1] 493290 12783
mat[1,1]
[1] 293.05
mat[1,1:10]
[1] 293.05 293.06 292.98 292.96 292.96 293.00 292.97 292.99 292.89 292.97
ncol(mat)
[1] 12783
nrow(mat)
[1] 493290
So, to read your data into a Filebacked Big Matrix (FBM), you can do
files <- list.files(path = "SST-CMEMS", pattern = "SST-CMEMS-198201*",
full.names = TRUE)
tmp <- sst_data_full(files[1])
library(bigstatsr)
mat <- FBM(length(tmp$sst), length(files))
for (i in seq_along(files)) {
  mat[, i] <- sst_data_full(files[i])$sst
}
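From there, the per-point trends can be computed block-wise on the FBM. A minimal sketch, assuming a plain least-squares slope against the day index is an acceptable stand-in for the tslm fit above:
library(bigstatsr)
d_c <- seq_len(ncol(mat)) - (ncol(mat) + 1) / 2   # centred day index
slopes <- big_apply(mat, a.FUN = function(X, ind) {
  # least-squares slope per row: sum(d_c * y) / sum(d_c^2), valid since d_c is centred
  drop(X[ind, , drop = FALSE] %*% d_c) / sum(d_c^2)
}, ind = rows_along(mat), a.combine = "c", block.size = 10000)
# all-NA rows (land points) come out as NA automatically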
I also posted this question on Stack GIS. From NetCDF-4 data that have subcategories (groups), I want to be able to read the "Retrieval/fs" variable. I also want to read the variables in and convert them to raster grids, but it seems that raster doesn't support NetCDF-4. I appreciate any suggestions.
library(ncdf4)
library(raster)
file <- "http://140906_B7101Ar_150909171225s.nc4"
names(file$var)
"latitude" ... "longitude"... "Retrieval/fs"
lat <- raster(file, varname="latitude")
lon <- raster(file, varname="longitude")
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘raster’ for signature ‘"ncdf4"’
raster does work with ncdf4. You are not showing actual code: file is a character vector, so you cannot do names(file$var) with it (at least you won't get "latitude" ... "longitude" ... "Retrieval/fs"). So file is probably an ncdf4 object (see the error message), while the raster function expects a filename (but not a URL).
If you download the file and then do
library(raster)
x <- brick(filename, varname="Retrieval/fs")
Things should work, provided the NetCDF file contains regular raster data.
However, it does not, so you cannot directly import this as a raster. Instead you can get the lon, lat, and values from the file, treat these as points, and then rasterize (interpolate) them to get a regular raster.
Here is the answer to the question I asked. Since the data is not gridded, I retrieve the lon and lat information along with the variables to create a dataframe.
library(ncdf4)
library(sp)
ncfile <- nc_open("140906_B7101Ar_150909171225s.nc4")   # the downloaded file
fs <- ncvar_get(ncfile, "Retrieval/fs")
xlon <- ncvar_get(ncfile, "longitude")
xlat <- ncvar_get(ncfile, "latitude")
# create a data frame with named columns so coordinates() can find them
d <- data.frame(xlon = as.vector(xlon), xlat = as.vector(xlat), fs = as.vector(fs))
coordinates(d) <- c("xlon", "xlat")      # promotes d to a SpatialPointsDataFrame
proj4string(d) <- CRS("+proj=longlat")
spoint <- SpatialPoints(coords = d)      # optional: d is already a spatial points object
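To complete the rasterize step suggested in the answer, something like the following could work (a sketch; the 0.25-degree resolution is an illustrative assumption, pick one that matches the sounding density):
library(raster)
r_template <- raster(extent(d), res = 0.25)   # empty grid covering the points
r_fs <- rasterize(d, r_template, field = "fs", fun = mean)
plot(r_fs)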
I'm having real difficulty exporting data from GrADS to a .csv file, although it should be really easy. The file in question is from the APHRODITE project and relates to rainfall over Asia. Basically I can read this file into GrADS using:
open d:/aphro/aphro.ctl
and it tells me that:
Data file d:/aphro/APHRO_MA_025deg_V1101R2.%y4 is open as file 1
Lon set to 60.125 149.875
Lat set to -14.875 54.875
Lev set to 1 1
Time values set: 1961:1:1:0 1961:1:1:0
E set to 1 1
If I execute:
q ctlinfo
it also tells me that I have three variables:
precip 1 0 daily precipitation analysis
rstn 1 0 ratio of 0.05 degree grids with station
flag 1 0 ratio of 0.05 degree grids with snow
Okay, now all I want to do is produce a list in a .csv file (or .txt) file with the following information:
Precipitation Lon Lat Time(date)
It sounds really easy, but I just can't do it. One method is to use:
fprintf precip d:/output.csv %g 1
This gives me a .csv file with the entire data for that day in one long column (which is what I want). I can also do the same for lon and lat in different files and combine them. The problem is that writing the output file takes ages; it is much faster if you don't mind lots of columns, but those become a pain to manage. Basically, this method is too slow.
Another method is to export the data as a NetCDF file by:
set sdfwrite -4d d:/output.nc
define var = precip
sdfwrite precip
This then very quickly writes a file called output.nc which contains all the data I need. Using R I can then read all the variables individually e.g.
f <- open.ncdf("D:/aphro/test.nc")
A <- get.var.ncdf(nc=f,varid="time")
B <- get.var.ncdf(nc=f,varid="rain")
D <- get.var.ncdf(nc=f,varid="lon")
E <- get.var.ncdf(nc=f,varid="lat")
But what I want is an output file where each row gives the time, rain amount, lon and lat. I tried rbind, but it doesn't associate the correct time (date) with the right rain amount, and similarly mixes up the lon and lat: there are hundreds of thousands of rain values but only a few dates, 360 lon points and 280 lat points (i.e. the rain data is a grid of values for each day over several days). I'm sure this should be easy, but how do I do it?
Please help
Tony
To my knowledge, you can convert the GrADS file to a NetCDF file by using Climate Data Operators (CDO) and R together. Details can be found here. Further, a NetCDF file can be converted into a .csv file. For this I am providing dummy code.
library(ncdf) # the old ncdf package; ncdf4 is its successor
nc <- open.ncdf("foo.nc") #open ncdf file and read variables
lon <- get.var.ncdf(nc, "lon") # Lon lat and Time
lat <- get.var.ncdf(nc, "lat")
time <- get.var.ncdf(nc, "time")
dname <- "t" # name of variable which can be found by using print(nc)
nlon <- dim(lon)
nlat<- dim(lat)
nt<- dim(time)
lonlat <- expand.grid(lon, lat) # make grid of given longitude and latitude
mintemp.array <- get.var.ncdf(nc, dname)
dlname <- att.get.ncdf(nc, dname, "long_name")
dunits <- att.get.ncdf(nc, dname, "units")
fillvalue <- att.get.ncdf(nc, dname, "_FillValue")
mintemp.vec.long <- as.vector(mintemp.array)
mintemp.mat <- matrix(mintemp.vec.long, nrow = nlon * nlat, ncol = nt)
mintemp.df <- data.frame(cbind(lonlat, mintemp.mat))
options(width = 110)
write.csv(mintemp.df, "mintemp_my.csv")
I hope this answers your question.
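For the exact row layout the question asks for (one row per lon/lat/time with its rain value), expand.grid can be extended with the time axis. A minimal sketch using the newer ncdf4 package, assuming the variable names the question reports (lon, lat, time, rain):
library(ncdf4)
f <- nc_open("D:/aphro/test.nc")
lon <- ncvar_get(f, "lon")
lat <- ncvar_get(f, "lat")
time <- ncvar_get(f, "time")
rain <- ncvar_get(f, "rain")   # array with dimensions lon x lat x time
nc_close(f)
# expand.grid varies its first argument fastest, matching the lon-fastest
# storage order of the array, so as.vector(rain) lines up row by row
out <- data.frame(expand.grid(lon = lon, lat = lat, time = time),
                  rain = as.vector(rain))
write.csv(out, "d:/output.csv", row.names = FALSE)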