I'm trying to extract the variable "swh_ku" from multiple NetCDF files, together with the corresponding latitude and longitude values, into CSV files using a polygon shapefile or its extent. I'm working with Jason-1 global altimetry swath data, but I only need the data for the domain covered by the shapefile. I just need help with the few lines of code that would complete the working code below so I can extract only the data for the region I'm interested in.
I've tried several software applications such as QGIS, ESA SNAP and the Broadview Radar Altimetry Toolbox (BRAT), unfortunately with no success, because I couldn't find a way to automate the extraction process for the hundreds of NetCDF files. So I resorted to code, which I'm fairly new to, but I managed to get it working after reading other posts. I've also tried opening the files as a raster or brick so I could use the extract() or mask() functions, because they seem more straightforward, but I couldn't get them to work.
Link to data: https://drive.google.com/drive/folders/1d_XVYFe__-ynxbJNUwlyl74SPJi8GybR?usp=sharing
library(ncdf4)
library(rgdal)
library(raster)
my_read_function <- function(ncname) {
setwd("D:/Jason-1/cycle_030")
bs_shp=readOGR("D:/Black_Sea.shp")
e<-extent(bs_shp)
ncfname = ncname
dname = "swh_ku"
ncin = nc_open(ncfname)
print(ncin)
vars<-(names(ncin[['var']]))
vars
lon <- ncvar_get(ncin, "lon")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin, "lat", verbose = F)
nlat <- dim(lat)
head(lat)
print(c(nlon, nlat))
sm_array <- ncvar_get(ncin,dname)
dlname <- ncatt_get(ncin,dname,"long_name")
dunits <- ncatt_get(ncin,dname,"units")
fillvalue <- ncatt_get(ncin,dname,"_FillValue")
dim(sm_array)
ls()
sm.slice <- sm_array[]
sm.vec <- as.vector(sm.slice)
length(sm.vec)
lonlat <- expand.grid(lon, lat)
sm.df01 <- data.frame(cbind(lonlat, sm.vec))
names(sm.df01) <- c("lon", "lat", paste(dname, sep = "_"))
head(na.omit(sm.df01), 20)
csvfile <- paste0(ncname,".csv")
write.table(na.omit(sm.df01), csvfile, row.names = FALSE, sep = ",")
}
my_files <- list.files("D:/Jason-1/cycle_030/")
lapply(my_files, my_read_function)
Looks like your data is not gridded.
library(ncdf4)
library(raster)
bs <- shapefile("Black_Sea.shp")
# simplify so that the data will look better later
bs <- as(bs, "SpatialPolygons")
f <- list.files("cycle_022", pattern="nc$", full=TRUE)
# Loop would start here
ncfname = f[1]
dname = "swh_ku"
ncin = nc_open(ncfname)
lon <- ncvar_get(ncin, "lon")
lat <- ncvar_get(ncin, "lat", verbose = F)
sm_array <- ncvar_get(ncin, dname)
xyz <- na.omit(cbind(lon, lat, sm_array))
p <- SpatialPoints(xyz[,1:2], proj4string=crs(bs))
p <- SpatialPointsDataFrame(p, data.frame(xyz))
x <- intersect(p, bs)
# x has the points that intersect with the Black Sea
plot(bs)
points(x)
head(x@data)
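To handle the hundreds of files and write one CSV per file, as asked in the question, the steps above can be wrapped in a function and applied with lapply. A minimal sketch, assuming every file carries the lon, lat and swh_ku variables; the CSV naming is just an illustration:
extract_swh <- function(ncfname, dname = "swh_ku") {
  ncin <- nc_open(ncfname)
  lon <- ncvar_get(ncin, "lon")
  lat <- ncvar_get(ncin, "lat")
  swh <- ncvar_get(ncin, dname)
  nc_close(ncin)
  xyz <- na.omit(cbind(lon, lat, swh))
  p <- SpatialPointsDataFrame(xyz[, 1:2], data.frame(xyz), proj4string = crs(bs))
  x <- intersect(p, bs)
  # keep only the points inside the Black Sea polygon and write them out
  write.csv(x@data, paste0(basename(ncfname), ".csv"), row.names = FALSE)
}
lapply(f, extract_swh)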
I have this code:
install.packages("ncdf4")
library(ncdf4)
install.packages("tidync")
library(tidync)
pp <- tidync("~/climate_data/pp_ens_mean_0.1deg_reg_v25.0e.nc")
print(pp)
## Daily averaged sea level pressure - PP
# set path and filename
ncpath <- "~/climate_data/"
ncname <- "pp_ens_mean_0.1deg_reg_v25.0e"
ncfname <- paste(ncpath, ncname, ".nc", sep="")
dname <- "pp"
# open a netCDF file
ncin <- nc_open(ncfname)
print(ncin)
# get longitude and latitude
lon <- ncvar_get(ncin,"longitude")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin,"latitude")
nlat <- dim(lat)
head(lat)
print(c(nlon,nlat))
# get time
time <- ncvar_get(ncin,"time")
time
tunits <- ncatt_get(ncin,"time","units")
nt <- dim(time)
nt
# get pressure
pp_array <- ncvar_get(ncin,dname)
dlname <- ncatt_get(ncin,dname,"long_name")
dunits <- ncatt_get(ncin,dname,"units")
fillvalue <- ncatt_get(ncin,dname,"_FillValue")
dim(pp_array)
I run into a RAM issue when running pp_array <- ncvar_get(ncin, dname). I would like to submit the job to Spark, perhaps to run it on a cluster.
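One way to sidestep the memory problem entirely is to read the variable in slices via the start and count arguments of ncvar_get(), so the full array never sits in RAM at once. A minimal sketch, assuming pp is stored with dimensions (longitude, latitude, time) and using nt from the code above:
for (i in seq_len(nt)) {
  # read all longitudes and latitudes for a single time step; count = -1 means "all"
  pp_slice <- ncvar_get(ncin, dname, start = c(1, 1, i), count = c(-1, -1, 1))
  # process or summarise the slice here, then move on to the next one
}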
I have followed the installation procedure from here: https://therinspark.com/starting.html
#installing Spark
library(sparklyr)
#spark_install()
sc <- spark_connect(master = "local")
spark_web(sc)
# Retrieve the Spark installation directory
spark_home <- spark_home_dir()
# Build paths and classes
spark_path <- file.path(spark_home, "bin", "spark-class")
# Start cluster manager master node
system2(spark_path, "org.apache.spark.deploy.master.Master", wait = FALSE)
# Start worker node, find master URL at http://localhost:8080/
system2(spark_path, c("org.apache.spark.deploy.worker.Worker",
"spark://192.168.1.32:7077"), wait = FALSE)
but I am not at all familiar with Spark and I am confused about what to do next, as their main example uses a .csv file whereas my source files are .nc.
How can I run the previous piece of code with Spark?
Thank you very much for the precious help.
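sparklyr has no reader for NetCDF files, so a common workaround is to export the data in manageable chunks to a format Spark does understand (for example CSV, or Parquet via arrow) and then load it with spark_read_csv(). A minimal sketch building on the slice-by-slice read above; the pp_csv folder and per-slice file names are illustrative assumptions, not part of any existing workflow:
library(sparklyr)
dir.create("pp_csv", showWarnings = FALSE)   # hypothetical folder for the exported slices
for (i in seq_len(nt)) {
  pp_slice <- ncvar_get(ncin, dname, start = c(1, 1, i), count = c(-1, -1, 1))
  df <- data.frame(expand.grid(lon = lon, lat = lat),
                   pp = as.vector(pp_slice),
                   time = time[i])
  write.csv(df, file.path("pp_csv", paste0("pp_", i, ".csv")), row.names = FALSE)
}
# sc is the connection created earlier with spark_connect(); Spark reads the whole folder of CSVs
pp_tbl <- spark_read_csv(sc, name = "pp", path = "pp_csv")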
I have some code in R which does the following:
Uses lapply to bring in files in a set folder e.g. 1997 data
Makes file list into a brick - they are NetCDF files, so I've used brick function
Stacks the bricks into one raster stack of months for each year.
Calculate the mean from the stack
Crops the new mean raster to the Area of Interest (AOI).
I've got working code (see below), but it is clunky and I feel it could be streamlined into one loop that runs through each year's folder (I have data from 1997 to 2018). Could anyone help turn this into a simple loop I could run by just changing the file path? I've used loops a bit before, but never written one from scratch.
# Packages:
library(raster)
library(parallel) # Check cores in PC
library(lubridate) # needed for lapply
library(dplyr) # ""
library(sf) # For clipping data
library(rgdal)
# ChlA
# Set file paths for input and outputs:
usingfp <- "/filepath/GIS/ChlA/1997/"
the_dir_ex <- "Data/CHL/1997"
# List all NETCDF files in folder:
CHL_1997 <- list.files(path = usingfp, pattern = "\\.nc$", full.names = TRUE,
recursive = FALSE)
# Make file list into brick
CHL_1997_brick <- lapply(CHL_1997,
FUN = brick,
the_dir = the_dir_ex)
# Stack bricks
s <- stack(CHL_1997_brick)
# Calculate mean from stack
mean <- calc(s, fun = mean, na.rm = T)
plot(mean)
# Load vector boundary to "crop" to
AOI <- readOGR("/filepath/AOI/AOI.shp")
plot(AOI,
main = "Shapefile imported into R - crop extent",
axes = TRUE,
border = "blue",
add = T)
# crop the raster using the vector extent
CHL_1997_mean <- crop(mean, AOI)
plot(CHL_1997_mean, main = "Cropped mean CHL - 1997")
# add shapefile on top of the existing raster
plot(AOI, add = TRUE)
Thanks very much.
Something like this should work
library(raster)
AOI <- shapefile("/filepath/AOI/AOI.shp")
path <- "/filepath/GIS/ChlA/"
years <- 1997:2018
for (yr in years) {
fp <- file.path(path, yr)
fout <- file.path(fp, paste0(yr, ".tif"))
print(fout); flush.console()
# if (file.exists(fout)) next
files <- list.files(path=fp, pattern="\\.nc$", full.names=TRUE)
b <- lapply(files, brick)
s <- stack(b)
s <- mean(s)
s <- crop(s, AOI, filename=fout) #, overwrite=TRUE)
}
Notes:
mean(s) is more efficient than calc(s, mean)
If the AOI is relatively small, it can be more efficient to crop first and then take the mean (and then use writeRaster); see the sketch below.
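For the crop-first variant, the body of the loop could look like this (same assumptions as the loop above):
b <- lapply(files, function(f) crop(brick(f), AOI))
s <- mean(stack(b))
writeRaster(s, filename = fout) # , overwrite=TRUE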
You can also use terra like this:
library(terra)
AOI <- vect("/filepath/AOI/AOI.shp")
path <- "/filepath/GIS/ChlA/"
years <- 1997:2018
for (yr in years) {
fp <- file.path(path, yr)
fout <- file.path(fp, paste0(year, ".tif"))
print(fout); flush.console()
# if (file.exists(fout)) next
files <- list.files(path=fp, pattern="\\.nc$", full.names=TRUE)
r <- rast(files)
s <- mean(r)
s <- crop(s, AOI, filename=fout) #, overwrite=TRUE)
}
I want to save route_dhdb_sf to a SHP file; it looks like this:
My problem is the geometry column.
The R code:
library(sf)
library(ggplot2)
library(stplanr)
# Read shapefile
nl_rails_sf <- sf::st_read("~/netherlands-railways-shape/railways.shp")
# Data frame with station locations
stations_df <- data.frame(station = c("Den Haag", "Den Bosch"),
lat = c(52.080276, 51.690556),
lon = c(4.325, 5.293611))
# Create sf object
stations_sf <- sf::st_as_sf(stations_df, coords = c("lon", "lat"), crs = 4326)
# Find shortest route
slnetwork <- SpatialLinesNetwork(nl_rails_sf)
find_nodes <- find_network_nodes(sln = slnetwork,
x = stations_df$lon,
y = stations_df$lat,
maxdist = 2e5)
route_dhdb_df <- data.frame(start = find_nodes[1], end = find_nodes[2])
route_dhdb_sf <- sum_network_links(sln = slnetwork, routedata = route_dhdb_df)
How do I save this route_dhdb_sf to a shape file?
Normally you can use:
sf::st_write(route_dhdb_sf,"~/netherlands-railways-shape/route_dhdb_sf.shp")
If you tried it and it did not work (you got an error message), you can try saving to another format, e.g. GeoJSON, by changing the file extension like this:
sf::st_write(route_dhdb_sf,"~/netherlands-railways-shape/route_dhdb_sf.geojson")
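If st_write() to a shapefile fails because of the geometry column (for example mixed geometry types coming out of sum_network_links), casting everything to a single type before writing can help. This is a hedged sketch, not something the answer above depends on:
# MULTILINESTRING is an assumption about what the route geometry should be
route_lines <- sf::st_cast(route_dhdb_sf, "MULTILINESTRING")
sf::st_write(route_lines, "~/netherlands-railways-shape/route_dhdb_sf.shp")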
R Programming Language (New to this)
I am attempting to loop through a number of tiled rasters that have been output by splitRaster. During the loop I want to carry out some processes on each raster.
But the following code throws an error.
library(ForestTools)
library(raster)
library(sp)
library(rgdal)
library(SpaDES)
rm(list = ls())
tmpdir <- file.path(tempdir(), "splitRaster")
lin <- function(x){x * 0.1 + 0.6}
inCHM <- raster("input raster path and name.tif")
split <- splitRaster(inCHM, 5, 5, c(0.05, 0.05), tmpdir)
files <- list.files(path=tmpdir, pattern="*.grd", full.names=FALSE, recursive=FALSE)
file.names <- dir(tmpdir, pattern ="*.grd")
for(file.names in files ){
name <- file.names
ttops <- vwf(name, winFun = lin, minHeight = 5)
writeOGR(ttops, "output folder", name, driver = "ESRI Shapefile")
}
and this is the error
[1] "Xrastername_tile1.grd"
Error in CRS(x) :
PROJ4 argument-value pairs must begin with +: Xrastername_tile1.grd
More on the problem (24/7/2020):
For troubleshooting I have removed the loop and instead just chose one of the splitRaster outputs that would be used in the loop, i.e. files[[3]].
When I run the following code, the error is the same:
library(ForestTools)
library(raster)
library(sp)
library(rgdal)
library(SpaDES)
rm(list = ls())
# set temp directory
tmpdir <- "C:\\R-Test\\Temp_Output"
# get raster
r <- raster("C:\\Lidar\\grid_treeheight_max_1m_nofill.tif")
# define projection
projection(r) <- "+proj=utm +zone=50 +south +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs"
# split raster brick
y <- splitRaster(r, 8, 8, c(0.05, 0.05), tmpdir)
# Get the complete file locations with full.names = T
files <- list.files(path=tmpdir, pattern="*.grd", full.names=FALSE, recursive=FALSE)
tmpfile <- paste(tmpdir, "\\", files[[3]], sep="")
lin <- function(x){x * 0.06 + 0.6}
ttops <- vwf(tmpfile, winFun = lin, minHeight = 5)
This is the error
Error in CRS(x) :
PROJ4 argument-value pairs must begin with +: D:\R-Test\Temp_Output\Xgrid_treeheight_max_1m_nofill_tile11.grd
When I run the following code using one of the splitRaster outputs (files[[3]]) from the code above, it runs error-free and I am able to plot ttops.
rm(list = ls())
# set temp directory
tmpdir <- "D:\\R-Test\\Temp_Output"
# get raster
r <- raster("D:\\R-Test\\Temp_Output\\Xgrid_treeheight_max_1m_nofill_tile11.grd")
lin <- function(x){x * 0.06 + 0.6}
ttops <- vwf(r, winFun = lin, minHeight = 5)
Why is the PROJ4 error occurring? It seems to be what is causing the loop to fail.
I think the problem is that you are feeding the vwf function the file name instead of a raster object. I would also recommend using lapply instead of a for loop. Here is code that should work:
library(raster)
library(ForestTools)
library(rgdal)
# Get the complete file locations with full.names = T
files <- list.files(path=tmpdir, pattern="*.grd", full.names=T, recursive=FALSE)
# Loop over each item of the list, i.e., each raster
lapply(files, function(x){
# Load the image as raster
image <- raster(x)
# Calculate vwf (I added a dummy function for winFun)
ttops <- vwf(image, winFun = function(x){x * 0.06 + 0.5}, minHeight = 5)
# Write the file with the name of each raster
writeOGR(ttops, "output_dir", tools::file_path_sans_ext(basename(x)), driver = "ESRI Shapefile")
})
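If you prefer to keep the original for loop, the minimal change is to read each file into a raster object before calling vwf; a sketch, assuming files holds full paths as in the listing above and lin is the window function from the question:
for (name in files) {
  r <- raster(name)  # load the tile instead of passing the file name string
  ttops <- vwf(r, winFun = lin, minHeight = 5)
  writeOGR(ttops, "output_dir", tools::file_path_sans_ext(basename(name)),
           driver = "ESRI Shapefile")
}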
I'm looking for a way to re-order a set of Formal Class SpatialPolygons.
I'm using US Census data (limited to Texas) and want to create 33 polygons out of different county combinations.
library(tmap)
library(maptools)
library(ggplot2)
library(rgeos)
library(sp)
library(mapdata)
library(rgdal)
library(raster)
# Download the map of texas and get the LMAs boundaries
# Download shape
f <- tempfile()
download.file("http://www2.census.gov/geo/tiger/GENZ2010/gz_2010_us_050_00_20m.zip", destfile = f)
unzip(f, exdir = ".")
US <- read_shape("gz_2010_us_050_00_20m.shp")
# Select only Texas
Texas <- US[(US$STATE %in% c("48")),]
# Load the LMA append data
LMAs = read.table('LMA append data.csv',header=T, sep=',')
# Append LMA data to Texas shape
Texas$FIPS <- paste0(Texas$STATE, Texas$COUNTY)
Texas <- append_data(Texas, LMAs, key.shp = "FIPS", key.data = "FIPS")
Texas <- Texas[order(Texas$LMA),]
# Create shape object with LMAs polygons
Texas_LMA <- unionSpatialPolygons(Texas, IDs=Texas$LMA)
I've tried converting Texas_LMA into a SpatialPolygonsDataFrame with
# Create shape object with LMAs polygons
Texas_LMA <- unionSpatialPolygons(Texas, IDs=Texas$LMA)
spp <- SpatialPolygonsDataFrame(Texas_LMA,data=matrix(1:33,nrow=33,ncol=1))
But that hasn't worked for me.
Your question is not very clear. But I think this is what you are after:
library(raster)
f <- tempfile()
download.file("http://www2.census.gov/geo/tiger/GENZ2010/gz_2010_us_050_00_20m.zip", destfile = f)
unzip(f, exdir = ".")
US <- shapefile("gz_2010_us_050_00_20m.shp")
Texas <- US[(US$STATE %in% c("48")),]
LMAs = read.csv('LMA append data.csv')
Texas$FIPS <- paste0(Texas$STATE, Texas$COUNTY)
You did not provide the LMA data, so guessing a bit from here:
Texas <- merge(Texas, LMAs, by="FIPS")
Texas_LMA <- aggregate(Texas, by='LMA')
To re-order the result by an attribute, the generic pattern is:
shapefilename[order(shapefilename$column_name), ]
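If you specifically need a SpatialPolygonsDataFrame from unionSpatialPolygons() (rather than from aggregate()), the data argument has to be a data.frame whose row names match the polygon IDs; a hedged sketch of that construction:
library(sp)
ids <- sapply(slot(Texas_LMA, "polygons"), slot, "ID")
spp <- SpatialPolygonsDataFrame(Texas_LMA,
                                data = data.frame(LMA = ids, row.names = ids))
# the polygons can then be re-ordered by an attribute as shown above
spp <- spp[order(spp$LMA), ]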