I need to read a NetCDF file with R and export each time step as a smoothed polygon shapefile.
I have two problems: smoothing the raster and exporting to shapefile with proper projection from the NC file.
The output is a regular grid and is not projected.
Here is some sample code:
NCFileName = 'MyncFile.nc'
NCFile = open.ncdf(NCFileName)
NCFile
[1] "file CF_OUTPUT.nc has 6 dimensions:"
[1] "time Size: 61"
[1] "height Size: 8"
[1] "lat Size: 185"
[1] "lon Size: 64"
[1] "Time Size: 61"
[1] "DateStrLen Size: 19"
[1] "------------------------"
[1] "file CF_OUTPUT.nc has 20 variables:"
[1] "float temp[lon,lat,height,time] Longname:Temperature Missval:1e+30"
[1] "float relh[lon,lat,height,time] Longname:Relative Humidity Missval:1e+30"
[1] "float airm[lon,lat,height,time] Longname:Air density Missval:1e+30"
[1] "float z[lon,lat,height,time] Longname:Layer top altitude Missval:1e+30"
[1] "float ZH[lon,lat,height,time] Longname:Layer top altitude Missval:1e+30"
[1] "float hlay[lon,lat,height,time] Longname:Layer top altitude Missval:1e+30"
[1] "float PM10ant[lon,lat,height,time] Longname:PM10ant Concentration Missval:1e+30"
[1] "float PM10bio[lon,lat,height,time] Longname:PM10bio Concentration Missval:1e+30"
[1] "float PM10[lon,lat,height,time] Longname:PM10 Concentration Missval:1e+30"
[1] "float PM25ant[lon,lat,height,time] Longname:PM25ant Concentration Missval:1e+30"
[1] "float PM25bio[lon,lat,height,time] Longname:PM25bio Concentration Missval:1e+30"
[1] "float PM25[lon,lat,height,time] Longname:PM25 Concentration Missval:1e+30"
[1] "float C2H4[lon,lat,height,time] Longname:C2H4 Concentration Missval:1e+30"
[1] "float CO[lon,lat,height,time] Longname:CO Concentration Missval:1e+30"
[1] "float SO2[lon,lat,height,time] Longname:SO2 Concentration Missval:1e+30"
[1] "float NO[lon,lat,height,time] Longname:NO Concentration Missval:1e+30"
[1] "float NO2[lon,lat,height,time] Longname:NO2 Concentration Missval:1e+30"
[1] "float O3[lon,lat,height,time] Longname:O3 Concentration Missval:1e+30"
[1] "char Times[DateStrLen,Time] Longname:Times Missval:NA"
[1] "float HGT[lon,lat,time] Longname:Topography Missval:1e+30"
nc.a=get.var.ncdf(NCFile , varid = 'NO2', start=c(1,1,1,1), count=c(-1,-1,1,1))
Pol <- rasterToPolygons(raster(nc.a),dissolve = TRUE)
Pol
class : SpatialPolygonsDataFrame
features : 11829
extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
coord. ref. : NA
variables : 1
names : layer
min values : 0.219758316874504
max values : 0.84041428565979
writeOGR(Pol, dsn = getwd(), layer = 'testPol', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
What I get, however, are gridded polygons that are not projected.
UPDATE:
Following the answers from @kakk11 and @RobertH, I was able to solve part of the problem. I still get grid-like polygons that are not smoothed. Here is what I did so far:
I couldn't extract the variable directly to a raster as @RobertH suggested, so I used 'get.var.ncdf' and then 'raster':
NCFileName = 'MyncFile.nc'
NCFile = open.ncdf(NCFileName)
nc.a = get.var.ncdf(NCFile, varid = 'NO2', start=c(1,1,1,13), count=c(-1,-1,1,1))
nc.a = raster(nc.a)
# put in correct extent:
lat = NCFile$dim$lat$vals
lon = NCFile$dim$lon$vals
ExtentLat = range(lat)
ExtentLon = range(lon)
rm(lat,lon)
nc.a = flip(t(nc.a), direction='y')
# Give it lat/lon coords
extent(nc.a) = c(ExtentLon,ExtentLat)
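One detail worth checking in the step above: range(lat) and range(lon) give the coordinates of the outermost cell centres, whereas a raster extent refers to the outer cell edges. A minimal base-R sketch, assuming a regularly spaced grid (the coordinate vectors are made up for illustration and `centres_to_extent` is a hypothetical helper), pads the range by half a cell:

```r
# Hypothetical helper: turn regularly spaced cell-centre coordinates
# into the (min, max) of the outer cell edges.
centres_to_extent <- function(centres) {
  res <- diff(range(centres)) / (length(centres) - 1)  # cell size
  range(centres) + c(-0.5, 0.5) * res
}

# Made-up example coordinates (not taken from the NetCDF file)
lon <- seq(34.0, 36.0, by = 0.5)   # 5 cell centres
lat <- seq(29.0, 33.0, by = 1.0)   # 5 cell centres

ExtentLon <- centres_to_extent(lon)
ExtentLat <- centres_to_extent(lat)
print(c(ExtentLon, ExtentLat))
```

If the grid is not regular, this padding is only an approximation.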
The 'cut' command returns a vector, so I used 'raster::reclassify' instead:
cuts = c(0,5,15,30,50)
classes <- cbind(cuts[1:length(cuts)-1],cuts[2:length(cuts)],cuts[2:length(cuts)])
nc.class <- reclassify(nc.a, classes)
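The three-column matrix passed to reclassify holds one row per class: lower bound, upper bound, new value. A small base-R sketch of building it from the cuts vector follows. Note that an expression such as `1:length(cuts)-1` is parsed as `(1:length(cuts)) - 1`, and only yields the intended elements because R silently drops a zero index; negative indexing is a less fragile way to write it.

```r
cuts <- c(0, 5, 15, 30, 50)

lower <- cuts[-length(cuts)]  # all but the last break
upper <- cuts[-1]             # all but the first break

# from, to, becomes: each class is labelled with its upper bound
classes <- cbind(from = lower, to = upper, becomes = upper)
print(classes)

# The zero-index form selects the same lower bounds
same_lower <- identical(cuts[1:length(cuts) - 1], lower)
print(same_lower)
```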
I then used the 'rasterToPolygons' with 'dissolve=TRUE' to create the polygons:
pol <- rasterToPolygons(nc.class, dissolve = TRUE)
# set geographic (WGS84 lon/lat) CRS:
WGS84_Projection = "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
proj4string(pol) <- CRS(WGS84_Projection)
writeOGR(pol, dsn = getwd(), layer = 'file' , driver = 'ESRI Shapefile', overwrite_layer = TRUE)
Still, all this creates a polygon shapefile whose polygons are not smooth, which is the main challenge.
Could use some help with this.
Ilik
You first need to correctly create a RasterLayer, like this:
r <- raster('MyncFile.nc', varname='NO2')
# or, to get all time steps at once
# brick('MyncFile.nc', varname='NO2')
You could then generalize (classify) the values using reclassify or cut. For example:
cuts <- seq(0.2, 0.9, 0.1)
rc <- cut(r, cuts)
Make polygons and save them to a shapefile:
pol <- rasterToPolygons(rc, dissolve = TRUE)
shapefile(pol, 'file.shp')
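Polygons made this way follow the cell edges, so they look blocky. One possible way to get smoother outlines is to interpolate the raster to a finer grid before classifying it. The sketch below uses the terra package rather than raster; that choice, the synthetic input raster, and the disaggregation factor of 5 are all assumptions for illustration.

```r
library(terra)

# Synthetic stand-in for one time step of the NetCDF variable
r <- rast(nrows = 20, ncols = 20, xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          crs = "+proj=longlat +datum=WGS84")
xy <- xyFromCell(r, 1:ncell(r))
values(r) <- 0.2 + 0.6 * exp(-((xy[, 1] - 5)^2 + (xy[, 2] - 5)^2) / 10)

# Interpolate to a 5x finer grid so class boundaries become less blocky
rs <- disagg(r, fact = 5, method = "bilinear")

# Classify into bands; cells of the same class are merged into polygons
cuts <- seq(0.2, 0.9, 0.1)
rc <- classify(rs, cuts, include.lowest = TRUE)
pol <- as.polygons(rc)

# Write a shapefile; the CRS set on the raster is carried over
out_file <- file.path(tempdir(), "smooth_pol.shp")
writeVector(pol, out_file, overwrite = TRUE)
```

The outlines are still made of cell edges, only smaller ones, so the factor trades smoothness against file size and processing time.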
I'm trying to read a HDF5 file with terra, but the extent of the grid could not be read.
> rst <- terra::rast("temp.h5")
Warning message:
[rast] unknown extent
> rst
class : SpatRaster
dimensions : 765, 700, 2 (nrow, ncol, nlyr)
resolution : 0.001428571, 0.00130719 (x, y)
extent : 0, 1, 0, 1 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
sources : temp.h5://image_data
temp.h5://image_data
varnames : image_data
image_data
names : image_data, image_data
With the function terra::describe("temp.h5") I obtained the information below:
[4] "Metadata:"
[5] " geographic_geo_column_offset=0 "
[6] " geographic_geo_dim_pixel=KM,KM"
[7] " geographic_geo_number_columns=700 "
[8] " geographic_geo_number_rows=765 "
[9] " geographic_geo_par_pixel=X,Y"
[10] " geographic_geo_pixel_def=LU"
[11] " geographic_geo_pixel_size_x=1.0000013 "
[12] " geographic_geo_pixel_size_y=-1.0000055 "
[13] " geographic_geo_product_corners=0 49.362053 0 55.973602 10.856429 55.388977 9.0092793 48.895298 "
[14] " geographic_geo_row_offset=3649.9792 "
[15] " geographic_map_projection_projection_indication=Y"
[16] " geographic_map_projection_projection_name=STEREOGRAPHIC"
[17] " geographic_map_projection_projection_proj4_params=+proj=stere +lat_0=90 +lon_0=0 +lat_ts=60 +a=6378.14 +b=6356.75 +x_0=0 y_0=0"
From the corners of the map (line 13), I read that xmin=0, xmax=10.856429, ymin=49.362053 and ymax=55.973602. However, when setting the extent of the raster and plotting, I see that the corners of the raster do not correspond with the corners listed in line 13 of the output above.
ext(rst) <- c(0,10.856429,49.362053,55.973602)
plot(rst)
Clearly something is going wrong here. How do I fix this?
Thanks in advance!
The metadata show that the CRS is stereographic, thus not lon/lat. We can set it like this (note that I added "+units=km" based on the values of a and b):
p <- "+proj=stere +lat_0=90 +lon_0=0 +lat_ts=60 +a=6378.14 +b=6356.75 +x_0=0 y_0=0 +units=km"
crs(rst) <- p
Now you need to find the extent in that projection. Perhaps that is available in the file, but we can also try to guess it from the lon/lat bounding box.
m <- matrix(c(0, 49.362053, 0, 55.973602, 10.856429, 55.388977, 9.0092793, 48.895298), ncol=2, byrow=T)
v <- vect(m, crs="+proj=longlat +datum=WGS84")
# I had to change the a and b parameters to m to make this work
p <- "+proj=stere +lat_0=90 +lon_0=0 +lat_ts=60 +a=6378137 +b=6356750 +x_0=0 y_0=0"
s <- project(v, p)
e <- ext(s)
e
#SpatExtent : 0, 700000.402146425, -4415000.95326333, -3649996.04313886 (xmin, xmax, ymin, ymax)
And now you could try
ext(rst) <- as.vector(e)/1000
And you may also need to consider this:
geographic_geo_row_offset=3649.9792
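One possible reading of the metadata, offered as an assumption rather than something the file documents: the offsets give the position of the upper-left pixel corner (geographic_geo_pixel_def=LU) in pixel units, with rows counted southwards from the pole. Under that reading the extent in km can be computed directly in base R, and it comes out close to the projected corner coordinates:

```r
# Values copied from the describe() output
ncols <- 700
nrows <- 765
size_x <- 1.0000013      # km
size_y <- 1.0000055      # km (sign dropped; rows run north to south)
col_offset <- 0
row_offset <- 3649.9792

# Assumed interpretation: offsets locate the upper-left corner, in pixels
xmin <- col_offset * size_x
xmax <- xmin + ncols * size_x
ymax <- -row_offset * size_y
ymin <- ymax - nrows * size_y

e_km <- c(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax)
print(round(e_km, 3))
```

If that interpretation holds, the extent could be set with ext(rst) <- unname(e_km) after assigning the CRS with "+units=km".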
Overall, it would seem best to go to the source and ask there.
I'd like to read in a GRIB2 file to R, but have been unable to install wgrib2 (after several hours of struggling) meaning that rNOMADS is not an option. That's okay, as GRIB2 files can be read by both the raster and rgdal packages. The problem I run into is that the names of the layers are stripped when reading in the file.
Here's an example.
# Load libraries
library(raster)
library(rgdal)
# Name of file
file_name <- "https://dd.weather.gc.ca/model_gem_regional/coupled/gulf_st-lawrence/grib2/00/001/CMC_coupled-rdps-stlawrence-ocean_latlon0.02x0.03_2020120100_P001.grib2"
# Load as raster brick
b <- brick(file_name)
# Get layer names
names(b)
# [1] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.1"
# [2] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.2"
# [3] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.3"
# [4] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.4"
# [5] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.5"
# [6] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.6"
# [7] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.7"
# [8] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.8"
# [9] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.9"
#[10] "CMC_coupled.rdps.stlawrence.ocean_latlon0.02x0.03_2020120100_P001.10"
As you can see, the names are just generic defaults. Next, I tried rgdal.
# Load using rgdal
r <- readGDAL(file_name)
# Get names
names(r)
# [1] "band1" "band2" "band3" "band4" "band5" "band6" "band7" "band8"
# [9] "band9" "band10"
Once again, default names. But, if I use the command line utility ncl_convert2nc to convert the GRIB2 file to NetCDF and then read in the NetCDF file with ncdf4 – an additional conversion step that I don't want to include in my workflow if it can be avoided – there are definitely variable names present.
# [1] "UOGRD_P0_L160_GLL0" "VOGRD_P0_L160_GLL0" "ICEC_P0_L1_GLL0"
# [4] "ICETK_P0_L1_GLL0" "UICE_P0_L1_GLL0" "VICE_P0_L1_GLL0"
# [7] "ICETMP_P0_L1_GLL0" "ICEPRS_P0_L1_GLL0" "CICES_P0_L1_GLL0"
#[10] "WTMP_P0_L1_GLL0"
QUESTION: Is there a way to extract or retain the variable/layer names when using rgdal or raster to read a GRIB2 file?
PS The reason I need to get the variable names from the file is because the layers don't match up with the order of layers as specified on the website when loaded with (e.g.) raster. This is apparent from the variable values. While I could use the variable names gleaned from the NetCDF file shown above, if the order of the layers changed this would break my package.
You can use the terra package instead of raster.
file_name <- "https://dd.weather.gc.ca/model_gem_regional/coupled/gulf_st-lawrence/grib2/00/001/CMC_coupled-rdps-stlawrence-ocean_latlon0.02x0.03_2020120100_P001.grib2"
b <- basename(file_name)
if (!file.exists(b)) download.file(file_name, b, mode="wb")
library(terra)
r <- rast(b)
r
#class : SpatRaster
#dimensions : 325, 500, 10 (nrow, ncol, nlyr)
#resolution : 0.03, 0.02 (x, y)
#extent : -71.015, -56.015, 45.49, 51.99 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +R=6371229 +no_defs
#source : CMC_coupled-rdps-stlawrence-ocean_latlon0.02x0.03_2020120100_P001.grib2
#names : 0[-] SFC="Ground or water surface", 0[-] SFC="Ground or water surface", 0[-] SFC="Ground or water surface", 0[-] SFC="Ground or water surface", 0[m] DBSL="Depth below sea level", 0[m] DBSL="Depth below sea level", ...
But the variable names do not match yours
names(r)
# [1] "0[-] SFC=\"Ground or water surface\"" "0[-] SFC=\"Ground or water surface\"" "0[-] SFC=\"Ground or water surface\""
# [4] "0[-] SFC=\"Ground or water surface\"" "0[m] DBSL=\"Depth below sea level\"" "0[m] DBSL=\"Depth below sea level\""
# [7] "0[-] SFC=\"Ground or water surface\"" "0[-] SFC=\"Ground or water surface\"" "0[-] SFC=\"Ground or water surface\""
#[10] "0[-] SFC=\"Ground or water surface\""
You can set the names to other pieces of the metadata
nms <- trimws(grep("GRIB_ELEMENT=", describe(b), value=TRUE))
names(r) <- gsub("GRIB_ELEMENT=", "", nms)
r
#class : SpatRaster
#dimensions : 325, 500, 10 (nrow, ncol, nlyr)
#resolution : 0.03, 0.02 (x, y)
#extent : -71.015, -56.015, 45.49, 51.99 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +R=6371229 +no_defs
#source : CMC_coupled-rdps-stlawrence-ocean_latlon0.02x0.03_2020120100_P001.grib2
#names : ICEC, ICETK, UICE, VICE, UOGRD, VOGRD, ..
names(r)
#[1] "ICEC" "ICETK" "UICE" "VICE" "UOGRD" "VOGRD" "WTMP" "ICET" "ICEPRS" "CICES"
I can change the behavior of terra such that it uses "GRIB_ELEMENT" (please let me know if that makes sense). But it is not clear to me how to get to the names you show. For example, below is the GDAL metadata for the first layer. You show ICEC_P0_L1_GLL0. All names have P0 and GLL0, so at least for this file these seem redundant. But what does L1 refer to?
d <- describe(b)
d[35:46]
# [1] " GRIB_COMMENT=Ice cover [Proportion]"
# [2] " GRIB_DISCIPLINE=10(Oceanographic_Products)"
# [3] " GRIB_ELEMENT=ICEC"
# [4] " GRIB_FORECAST_SECONDS=3600 sec"
# [5] " GRIB_IDS=CENTER=54(Montreal) SUBCENTER=0 MASTER_TABLE=4 LOCAL_TABLE=0 SIGNF_REF_TIME=1(Start_of_Forecast) REF_TIME=2020-12-01T00:00:00Z PROD_STATUS=0(Operational) TYPE=1(Forecast)"
# [6] " GRIB_PDS_PDTN=0"
# [7] " GRIB_PDS_TEMPLATE_ASSEMBLED_VALUES=2 0 2 56 56 0 0 1 1 1 0 0 255 -127 -2147483647"
# [8] " GRIB_PDS_TEMPLATE_NUMBERS=2 0 2 56 56 0 0 0 1 0 0 0 1 1 0 0 0 0 0 255 255 255 255 255 255"
# [9] " GRIB_REF_TIME= 1606780800 sec UTC"
#[10] " GRIB_SHORT_NAME=0-SFC"
#[11] " GRIB_UNIT=[Proportion]"
#[12] " GRIB_VALID_TIME= 1606784400 sec UTC"
I see that "UOGRD" and "VOGRD" have L160 and have the number 160 in "GRIB_PDS_TEMPLATE_ASSEMBLED_VALUES" and "GRIB_PDS_TEMPLATE_NUMBERS" where the others have 1.
The metadata structure is described here but I could use some guidance about where to look to understand what to extract from the metadata.
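The metadata fields shown above can be pulled out and combined into layer names with a few lines of base R. This is only a sketch: the `meta` vector is a hand-made sample in the same "KEY=value" format (the second layer's short name and unit are invented), and `grib_field` is a hypothetical helper.

```r
# Hand-made sample of metadata lines for two layers (assumed format)
meta <- c(
  "    GRIB_ELEMENT=ICEC",
  "    GRIB_SHORT_NAME=0-SFC",
  "    GRIB_UNIT=[Proportion]",
  "    GRIB_ELEMENT=UOGRD",
  "    GRIB_SHORT_NAME=0-DBSL",
  "    GRIB_UNIT=[m/s]"
)

# Hypothetical helper: values of one "KEY=value" field, in layer order
grib_field <- function(lines, key) {
  prefix <- paste0(key, "=")
  trimmed <- trimws(lines)
  hits <- trimmed[startsWith(trimmed, prefix)]
  substring(hits, nchar(prefix) + 1)
}

element <- grib_field(meta, "GRIB_ELEMENT")
level   <- grib_field(meta, "GRIB_SHORT_NAME")

# Combine element and level, and force uniqueness just in case
layer_names <- make.unique(paste(element, level, sep = "_"))
print(layer_names)
```

With a real file, `meta` would be the character vector returned by terra's metadata description of the file, and the result could be assigned with names(r) <- layer_names, provided each layer has exactly one line per field.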
I have a NetCDF file with rotated coordinates. I need to convert it to normal lat/lon coordinates (-180 to 180 for lon and -90 to 90 for lat).
library(ncdf4)
nc_open('dat.nf')
For the dimensions, it shows:
[1] " 5 variables (excluding dimension variables):"
[1] " double time_bnds[bnds,time] "
[1] " double lon[rlon,rlat] "
[1] " long_name: longitude"
[1] " units: degrees_east"
[1] " double lat[rlon,rlat] "
[1] " long_name: latitude"
[1] " units: degrees_north"
[1] " char rotated_pole[] "
[1] " grid_mapping_name: rotated_latitude_longitude"
[1] " grid_north_pole_longitude: 83"
[1] " grid_north_pole_latitude: 42.5"
[1] " float tasmax[rlon,rlat,time] "
[1] " long_name: Daily Maximum Near-Surface Air Temperature"
[1] " standard_name: air_temperature"
[1] " units: K"
[1] " cell_methods: time:maximum within days time:mean over days"
[1] " coordinates: lon lat"
[1] " grid_mapping: rotated_pole"
[1] " _FillValue: 1.00000002004088e+20"
[1] " 4 dimensions:"
[1] " rlon Size:310"
[1] " long_name: longitude in rotated pole grid"
[1] " units: degrees"
[1] " axis: X"
[1] " standard_name: grid_longitude"
[1] " rlat Size:260"
[1] " long_name: latitude in rotated pole grid"
[1] " units: degrees"
[1] " axis: Y"
[1] " standard_name: grid_latitude"
[1] " bnds Size:2"
Could anyone show me how to convert the rotated coordinates back to normal lat/lon? Thanks.
NCO's ncks can probably do this in two commands using MSA
ncks -O -H --msa -d Lon,0.,180. -d Lon,-180.,-1.0 in.nc out.nc
ncap2 -O -s 'where(Lon < 0) Lon=Lon+360' out.nc out.nc
I would use cdo for this purpose https://code.zmaw.de/boards/2/topics/102
Another option is to just create a mapping between rotated and geographic coordinates and use the original data without interpolation. I can find the equations if necessary.
I went through the CDO link as suggested by @kakk11, but somehow that did not work for me. After much research, I found a way.
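For reference, such a mapping can be written in a few lines of base R. This is a sketch of the usual rotated-pole formulas, assuming no additional rotation about the new pole (that is, north_pole_grid_longitude = 0); `rotated_to_geographic` is a hypothetical helper name. Note also that the file in the question already stores 2-D lon and lat variables (lon[rlon,rlat] and lat[rlon,rlat]), which hold the geographic coordinates of every cell and can simply be read.

```r
# Hypothetical helper: rotated (rlon, rlat) in degrees to geographic
# lon/lat in degrees, given the geographic position of the rotated pole.
rotated_to_geographic <- function(rlon, rlat, pole_lon, pole_lat) {
  d2r <- pi / 180
  rl <- rlon * d2r
  rp <- rlat * d2r
  lp <- pole_lon * d2r
  sp <- sin(pole_lat * d2r)
  cp <- cos(pole_lat * d2r)

  lat <- asin(sin(rp) * sp + cos(rp) * cos(rl) * cp)

  a <- -sp * cos(rl) * cos(rp) + cp * sin(rp)
  b <- sin(rl) * cos(rp)
  lon <- atan2(sin(lp) * a - cos(lp) * b,
               cos(lp) * a + sin(lp) * b)

  cbind(lon = lon / d2r, lat = lat / d2r)
}

# Pole taken from the question: longitude 83, latitude 42.5
origin <- rotated_to_geographic(0, 0, pole_lon = 83, pole_lat = 42.5)
print(origin)
```

For this pole the rotated origin maps to 97°W, 47.5°N. The function is vectorised, so the rlon and rlat axes can be expanded with expand.grid and converted in one call.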
First, convert the rotated grid to curvilinear grid
cdo setgridtype,curvilinear Sin.nc out.nc
Next transform to your desired grid e.g. for global 1X1 degree
cdo remapbil,global_1 out.nc out2.nc
or for a grid like below
gridtype = lonlat
xsize = 320 # replace by your value
ysize = 180 # replace by your value
xfirst = 1 # replace by your value
xinc = 0.0625 # replace by your value
yfirst = 43 # replace by your value
yinc = 0.0625 # replace by your value
save this info as target_grid.txt and then run
cdo remapbil,target_grid.txt out.nc out2.nc
In my case there was an additional issue: my variables did not have the grid information, so CDO assumed a regular lat/lon grid. Before all the steps above, I therefore had to add the grid information attributes to all the variables (in my case all the variables ended with _ave) using NCO:
ncatted -a coordinates,'_ave$',c,c,'lon lat' in.nc
ncatted -a grid_mapping,'_ave$',c,c,'rotated_pole' in.nc
Please note that you should have a variable called rotated_pole in your nc file with the lat/lon information of the rotated pole.
There is also the possibility to do that in R (as the User is referring to it in the question). Of course, NCO and CDO are more efficient (way faster).
Please, look also at this answer.
library(ncdf4)
library(raster)
nsat <- stack("air_temperature.nc")
##check the extent
extent(nsat)
## this will be in the form 0-360 degrees
#change the coordinates
nsat1<-rotate(nsat)
#check result:
extent(nsat1)
##this should be in the format you are looking for: -180/180
Hope this helps.
I have been trying to plot the following gridded netcdf file: "air.1999.nc" found at the following website:
http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.html
I have tried the code below based on answers I have found here and elsewhere, but no luck.
library(ncdf);
temp.nc <- open.ncdf("air.1999.nc");
temp <- get.var.ncdf(temp.nc,"air");
temp.nc$dim$lon$vals -> lon
temp.nc$dim$lat$vals -> lat
lat <- rev(lat)
temp <- temp[nrow(temp):1,]
temp[temp==-32767] <- NA
temp <- t(temp)
image(lon,lat,temp)
library(maptools)
data(wrld_simpl)
plot(wrld_simpl, add = TRUE)
This code was modified from the one found here: The variable from a netcdf file comes out flipped
Does anyone have any ideas or experience with using this type of netcdf file? Thanks
In the question you linked, the whole part from lat <- rev(lat) to temp <- t(temp) was very specific to that OP's dataset and has no universal value.
temp.nc <- open.ncdf("~/Downloads/air.1999.nc")
temp.nc
[1] "file ~/Downloads/air.1999.nc has 4 dimensions:"
[1] "lon Size: 144"
[1] "lat Size: 73"
[1] "level Size: 12"
[1] "time Size: 365"
[1] "------------------------"
[1] "file ~/Downloads/air.1999.nc has 2 variables:"
[1] "short air[lon,lat,level,time] Longname:Air temperature Missval:32767"
[1] "short head[level,time] Longname:Missing Missval:NA"
As you can see from this information, in your case missing values are represented by the value 32767, so the following should be your first step:
temp <- get.var.ncdf(temp.nc,"air")
temp[temp == 32767] <- NA
Additionally, in your case you have 4 dimensions to your data, not just 2: longitude, latitude, level (which I'm assuming represents the height) and time.
temp.nc$dim$lon$vals -> lon
temp.nc$dim$lat$vals -> lat
temp.nc$dim$time$vals -> time
temp.nc$dim$level$vals -> lev
If you have a look at lat you see that the values are in reverse (which image will frown upon) so let's reverse them:
lat <- rev(lat)
temp <- temp[, ncol(temp):1, , ] #lat being our dimension number 2
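Then the longitude is expressed from 0 to 360, whereas the world map used below runs from -180 to 180, so let's wrap it and reorder the data to match:
lon <- ifelse(lon >= 180, lon - 360, lon)
o <- order(lon); lon <- lon[o]; temp <- temp[o, , , ] # keep the data aligned with the reordered axis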
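Relabelling a 0 to 360 axis as -180 to 180 only gives a correct map if the data are reordered together with the axis: the value measured at 270°E belongs at 90°W. A toy base-R sketch with made-up values shows the idea:

```r
# Toy grid: 8 longitudes from 0 to 315, one data value per longitude
lon  <- seq(0, 315, by = 45)
vals <- paste0("v", lon)          # labels make the reordering visible

# Wrap to -180..180, then sort the axis and move the data with it
lon180 <- ifelse(lon >= 180, lon - 360, lon)
ord    <- order(lon180)
lon180 <- lon180[ord]
vals   <- vals[ord]

print(lon180)
print(vals)
```

For a 4-dimensional array with longitude as the first dimension, the same index vector would be applied as temp[ord, , , ].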
So now let's plot the data for a level of 1000 (i. e. the first one) and the first date:
temp11 <- temp[ , , 1, 1] #Level is the third dimension and time the fourth.
image(lon,lat,temp11)
And then let's superimpose a world map:
library(maptools)
data(wrld_simpl)
plot(wrld_simpl,add=TRUE)
I am trying to read some variables from a NetCDF file but I am getting an error:
f=open.ncdf("mrgHYDRO_Az_arfswp_Stom_PRUNI_20000101_20001231_1M_sechiba_history.nc")
A = get.var.ncdf(nc=f,varid="Evaporation",verbose=TRUE)
Error in vobjtovarid(nc, varid, verbose = verbose) : Variable not found
Any help please. Best regards
"file has 9 dimensions:"
[1] "lon Size: 34"
[1] "lat Size: 30"
[1] "veget Size: 13"
"file has 113 variables:"
[1] "double time_counter_bnds[tbnds,time_counter] Longname:time_counter_bnds Missval:1e+30"
[1] "float evap[lon,lat,time_counter] Longname:Evaporation Missval:1e+30"
I believe the correct name to refer to in the call to get.var.ncdf is evap not Evaporation. The longname is just a more descriptive name, the real name is evap.
A = get.var.ncdf(nc=f,varid="evap",verbose=TRUE)
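To avoid guessing, the short names can be listed, or looked up from the long name. The sketch below runs on a hand-made mock of the file object, assuming that f$var is a list named by short variable name and that each entry carries a longname element; `find_varid` is a hypothetical helper.

```r
# Mock of the relevant part of the file object (not read from a file)
f_mock <- list(var = list(
  time_counter_bnds = list(name = "time_counter_bnds",
                           longname = "time_counter_bnds"),
  evap              = list(name = "evap", longname = "Evaporation")
))

# Short names are what get.var.ncdf expects as varid
short_names <- names(f_mock$var)
print(short_names)

# Hypothetical helper: look up the short name(s) for a long name
find_varid <- function(nc, longname) {
  ln <- vapply(nc$var, function(v) v$longname, character(1))
  names(ln)[ln == longname]
}

varid <- find_varid(f_mock, "Evaporation")
print(varid)
```

With the real file object, names(f$var) should list the names that can be passed as varid.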