netCDF to raster and projection - projection

I've recently started using R for spatial data. I'd be really grateful if you could help me with this. Thanks!
I've extracted data from a multidimensional netCDF file. This file had longitude, latitude and temperature data (for 12 months of a specific year).
From this netCDF I've got a data frame for January with these variables: longitude, latitude, temperature.
With this data frame I've created a raster.
# Packages
library("sp")
library("raster")
library("rgdal")
library("ncdf")
library("maptools")
library("rgeos")
library("sm")
library("chron")
# Dataframe to raster
# Create spatial points data frame
coordinates(tmp.df01) <- ~ lon + lat
# Coerce to SpatialPixelsDataFrame
gridded(tmp.df01) <- T
# Coerce to raster
rasterDF1 <- raster(tmp.df01)
> print(tmp.df01)
class : SpatialPixelsDataFrame
dimensions : 103, 241, 24823, 1 (nrow, ncol, npixels, nlayers)
resolution : 0.02083333, 0.02083333 (x, y)
extent : 5.739583, 10.76042, 45.73958, 47.88542 (xmin, xmax, ymin, ymax)
coord. ref. : NA
names : TabsM_1
min values : -18.1389980316162
max values : 2.26920962333679
There is no value for 'coord. ref.'
The projection of the original netCDF was WGS84. So I gave this projection to the raster.
proj4string(rasterDF1) <- "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
Then, I wanted to reproject my raster to another projection:
# Reprojecting into CH1903_LV03
# First, change the coordinate reference system (crs)
proj4string(rasterDF1) <- "+init=epsg:21781"
# Second, reproject the raster
rasterDF1.CH <- spTransform(rasterDF1, crs("+init=epsg:21781"))
At this point I get the following error:
Error in spTransform(rasterDF1, crs("+init=epsg:21781")) :
load package rgdal for spTransform methods
But the rgdal package is already loaded! There must be something wrong in the code!

Here is the code that solves the problem described.
Solution provided by Frede Aakmann Tøgersen.
tmp.df01 # tmp.df01 is a data.frame
coordinates(tmp.df01) <- ~ lon + lat # tmp.df01 is now a SpatialPointsDataFrame
# Assign original data projection
proj4string(tmp.df01) <- CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
gridded(tmp.df01) <- TRUE # tmp.df01 is now a SpatialPixelsDataFrame
# Coerce to raster
rasterDF1 <- raster(tmp.df01) # rasterDF1 is a RasterLayer
# To reproject the raster layer
rasterDF1.proj <- projectRaster(rasterDF1, crs=CRS("+init=epsg:21781"))

Another option is GDAL. Using gdal_translate, you can convert a netCDF file with bands to GeoTIFFs.
gdal_translate -a_srs EPSG:4326 NETCDF:File_Name.nc:Band_Name -of GTiff Output_FileName.geotiff
More options are described in the gdal_translate documentation.

Related

Fast Extraction from Raster Datasets using Points - How to speed up the raster::extract() function

I need to extract values from large RasterLayer (a Digital Terrain Model or DTM), using XY coordinates. Coordinates are in another large data.table.
> DTM
class : RasterLayer
dimensions : 93690, 74840, 7011759600 (nrow, ncol, ncell)
resolution : 16, 16 (x, y)
extent : -80000, 1117440, 6448080, 7947120 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=33 +ellps=GRS80 +units=m +no_defs
source : dtm_16x16_utm33.tif
names : dtm_16x16_utm33
values : -6.648066, 2273.72 (min, max)
> XY
x y
1: 986488 7930296
2: 986504 7930296
3: 986536 7930296
4: 986552 7930296
5: 986488 7930280
---
454986003: 61832 6451208
454986004: 61848 6451208
454986005: 61864 6451208
454986006: 61912 6451208
454986007: 61928 6451208
The extract() function from the raster R package could do the task. According to the package manual, the code would be:
Altitude <- extract(DTM, XY)
This, however, takes a long time!!
I have tried using the following line of code to run the function in parallel.
beginCluster(30)
Altitude <- extract(DTM, XY)
endCluster()
However, I can see that of the 64 available cores, only 1 is being used by R, the code is not running in parallel, and the function continues to take a long time.
Any ideas how I can speed this up?
Note1: I have successfully used lines of code similar to this one...
beginCluster(20)
raster3 <- projectRaster(raster1, crs=crs(raster2))
endCluster()
...and the server has worked with multiple cores at the same time.
One of the fastest solutions is to use the velox package.
https://www.rdocumentation.org/packages/velox/versions/0.2.0
library("velox")
library("raster")
library("sp")
library("rgeos")
## convert XY data to SpatialPoints
XY <- SpatialPoints(coords = XY)
## create a velox object from the original raster
DTMV <- velox(DTM)
## fast extraction of altitude data
Altitude <- DTMV$extract_points(sp = XY)
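For intuition about why point extraction can be fast: on a regular grid, finding the cell under a point is plain arithmetic on the extent and resolution, followed by one vectorised matrix lookup. The base-R sketch below illustrates that idea on a toy grid. It is not how velox or raster implement extraction internally, and every name in it (grid, extract_by_index, pts_x, pts_y) is hypothetical.

```r
# Toy grid standing in for a raster: 3 rows x 4 columns of 16 m cells.
# xmin and ymax are the left and top edges of the grid (assumed values).
xmin <- 0
ymax <- 48
res  <- 16
grid <- matrix(1:12, nrow = 3, ncol = 4, byrow = TRUE)

extract_by_index <- function(grid, x, y, xmin, ymax, res) {
  col <- floor((x - xmin) / res) + 1   # columns count from the left edge
  row <- floor((ymax - y) / res) + 1   # rows count down from the top edge
  inside <- col >= 1 & col <= ncol(grid) & row >= 1 & row <= nrow(grid)
  out <- rep(NA_real_, length(x))      # points outside the grid stay NA
  out[inside] <- grid[cbind(row[inside], col[inside])]
  out
}

pts_x <- c(8, 40, 100)   # the third point lies outside the grid
pts_y <- c(40, 8, 8)
altitude <- extract_by_index(grid, pts_x, pts_y, xmin, ymax, res)
```

The sketch treats points exactly on the right or bottom edge as outside the grid and assumes the whole grid fits in memory, which would not hold for a 7-billion-cell DTM without reading it in blocks.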

point pattern analysis in Spatstat

I am having some trouble setting up my data for some point pattern analysis.
What I want to do: conduct a point pattern analysis on NYC arrest data and see if there exists a spatial dependence between arrests and Covid-19 cases.
What I've done so far: downloaded data in the form of shapefiles
https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm (the ZIP code boundaries)
https://www1.nyc.gov/site/nypd/stats/crime-statistics/citywide-crime-stats.page (year to date data for arrests in NYC by zip code)
Code:
library(readxl)
library(rgdal) #Brings Spatial Data in R
library(spatstat) # Spatial Statistics
library(lattice) #Graphing
library(maptools)
library(raster)
library(ggplot2)
library(RColorBrewer)
library(broom)
# Load nyc zip code boundary polygon shapefile
s <- readOGR("/Users/my_name/Documents/project/zip","zip")
nyc <- as(s,"owin")
### OGR data source with driver: ESRI Shapefile
Source: "/Users/my_name/Documents/project/zip", layer: "zip"
with 263 features
# Load nyc arrests point feature shapefile
> s <- readOGR("/Users/my_name/Documents/project/nycarrests/","geo1")
### OGR data source with driver: ESRI Shapefile
Source: "/Users/my_name/Documents/project/nycarrests", layer: "geo1"
with 103376 features
It has 19 fields
#Converting the dataset into a point pattern
arrests <- as(s,"ppp")
### Error in as.ppp.SpatialPointsDataFrame(from) :
Only projected coordinates may be converted to spatstat class objects
This gave me the error above.
I know the error has to do with the coordinates not being Cartesian (projected) coordinates. So my question is:
How can I convert my sp object to (projected) Cartesian coordinates so that I can convert it to a point pattern (a spatstat ppp object, i.e. a planar point pattern)?
You are looking for spTransform.
Here is some example data
library(raster)
filename <- system.file("external/lux.shp", package="raster")
p <- shapefile(filename)
Solution
utm <- "+proj=utm +zone=32 +datum=WGS84"
x <- spTransform(p, utm)
x
#class : SpatialPolygonsDataFrame
#features : 12
#extent : 266045.9, 322163.8, 5481445, 5563062 (xmin, xmax, ymin, ymax)
#crs : +proj=utm +zone=32 +datum=WGS84 +units=m +no_defs
#variables : 5
#names : ID_1, NAME_1, ID_2, NAME_2, AREA
#min values : 1, Diekirch, 1, Capellen, 76
#max values : 3, Luxembourg, 12, Wiltz, 312

How do I get this raster and this shapefile on the same projection?

I have a shapefile of 10 CA counties projected in NAD83(HARN) / California Albers. I have a raster (a netCDF file of temperature) for the entire US in WGS84 longitude/latitude. I want to use the shapefile to clip the raster, and I know that I need to get them on the same datum/projection first. I tried re-projecting the raster using raster::projectRaster(); that failed (as in the data disappeared). Then I tried re-projecting the shapefile instead using sp::spTransform(); this also failed (as in the data don't overlap). I've searched through Stack Overflow but didn't see anything that seemed to help. I'm not getting an error in either case. I feel like there is something specific going on here, like the transformation from WGS84 to NAD83 or loading the raster in using raster() is the problem ... but then again, it could easily be something stupid that I'm missing! =)
my shapefile and raster are here: https://www.dropbox.com/sh/l3b2syzcioeqmyy/AAA5CstBZty4ofOcVFkAumNYa?dl=0
here is my code:
library(raster) #for creating rasters from .bil files
library(rgdal) #for reading .bil files and .gdb files
library(ncdf4) #for working with ncdf files
library(sp) #for working with spatial data files
load("my_counties.RData")
myraster <- raster("myraster.nc")
my.crs <- CRS("+init=EPSG:3311") #NAD83(HARN) / California Albers (HARN is the High Accuracy Reference Network)
newraster <- projectRaster(myraster, res = 6000, crs = my.crs) #raster resolution is 1/16th of a degree
#There is data in the raster.
plot(myraster)
#but none in newraster
plot(newraster)
#Now try re-projecting the shapefile
my.crs2 <- crs(myraster)
newshapefile <- spTransform(my_counties, my.crs2)
#but the data don't overlap
plot(newshapefile); plot(myraster, add = T)
You can do
library(raster)
library(rgdal)
load("my_counties.RData")
b <- brick("myraster.nc")
Now look at b
b
#class : RasterBrick
#dimensions : 490, 960, 470400, 365 (nrow, ncol, ncell, nlayers)
#resolution : 0.0625, 0.0625 (x, y)
#extent : 234, 294, 23.375, 54 (xmin, xmax, ymin, ymax)
#crs : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
#source : myraster.nc
#names : X2005.01.01, X2005.01.02, X2005.01.03, X2005.01.04, X2005.01.05, X2005.01.06, X2005.01.07, X2005.01.08, X2005.01.09, X2005.01.10, X2005.01.11, X2005.01.12, X2005.01.13, X2005.01.14, X2005.01.15, ...
#Date : 2005-01-01, 2005-12-31 (min, max)
#varname : tasmax
The horizontal extent is between 234 and 294 degrees. That points to a system with longitudes that start in Greenwich at 0 and continue to 360 (again in Greenwich). Climatologists do that. To go to the more conventional -180 to 180 degrees system:
r <- shift(b, -360)
(if your data had a global extent, you would use raster::rotate instead)
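The arithmetic behind this conversion can be checked without any spatial packages. The base-R sketch below uses a hypothetical helper, wrap_lon, to map longitudes from the 0 to 360 convention onto -180 to 180. It only illustrates the coordinate change and does not replace raster::shift or raster::rotate, which also move the cell values.

```r
# Hypothetical helper: wrap longitudes from the 0..360 convention to -180..180.
wrap_lon <- function(lon) ((lon + 180) %% 360) - 180

# Extent edges of the brick printed above (234 and 294 degrees east).
wrapped_edges <- wrap_lon(c(234, 294))

# Longitudes already below 180 are left alone; 180 itself maps to -180.
wrapped_misc <- wrap_lon(c(0, 90, 180, 270))
```

For this brick every longitude lies above 180, so wrapping gives the same result as subtracting 360, which is what shift(b, -360) does to the extent.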
Now, transform the counties to lonlat and show that they overlap.
counties <- spTransform(my_counties, crs(r))
plot(r, 1)
lines(counties)
It is generally best to transform vector data, not raster data if you can avoid it.

Converting HDF to georeferenced file (geotiff, shapefile)

I am dealing with 28 HDF4 files on ocean primary productivity (annual .tar files can be found here: http://orca.science.oregonstate.edu/1080.by.2160.monthly.hdf.cbpm2.v.php)
My goal is to do some calculations (I need to calculate concentrations per area and obtain the mean over several years, i.e. combine all files spatially) and then convert them to a georeferenced file I can work with in ArcGIS (preferably shapefile, or geotiff).
I have tried several ways to convert to ASCII or raster files and then add a projection using gdalUtils tools such as gdal_translate and get_subdatasets. However, as the HDF4 files are not named after the standards (unlike the MODIS files), the latter doesn't work and I cannot access the subsets.
Here's the code I used to convert to raster:
library(raster)
library(gdalUtils)
setwd("...path_to_files...")
gdalinfo("cbpm.2015060.hdf")
hdf_file <- "cbpm.2015060.hdf"
outfile="testout"
gdal_translate(hdf_file,outfile,sds=TRUE,verbose=TRUE)
file.rename(outfile,paste("CBPM_test",".tif",sep=""))
rast <- raster("CBPM_test.tif")
wgs1984 <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
projection(rast) <- wgs1984
#crs(rast) <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
plot(rast)
writeRaster(rast, file="CBPM_geo.tif", format='GTiff', overwrite=TRUE)
The resulting projection is completely off. I'd appreciate help on how to do this (converting through any format that works), preferably as a batch process.
You've not set the extent of your raster, so it's assumed to be 1:ncols, 1:nrows, and that's not right for a lat-long data set...
The gdalinfo output implies it's meant to be a full sphere, so if I do:
extent(rast)=c(xmn=-180, xmx=180, ymn=-90, ymx=90)
plot(rast)
writeRaster(rast, "output.tif")
I see a raster with full global lat-long extent and when I load the raster into QGIS it overlays nicely with an OpenStreetMap.
There doesn't seem to be enough metadata in the file to do a precise projection (what's the Earth radius and eccentricity?) so don't try and do anything small-scale with this data...
Also you've jumped through a few unnecessary hoops to read this. You can read the HDF directly and set its extent and projection:
> r = raster("./cbpm.2017001.hdf")
What do we get:
> r
class : RasterLayer
dimensions : 1080, 2160, 2332800 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : 0, 2160, 0, 1080 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : /home/rowlings/Downloads/HDF/cbpm.2017001.hdf
names : cbpm.2017001
Set extent:
> extent(r)=c(xmn=-180, xmx=180, ymn=-90, ymx=90)
And projection:
> projection(r)="+init=epsg:4326"
And land values to NA:
> r[r==-9999]=NA
Write it, plot it:
> writeRaster(r,"r.tif")
> plot(r)
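To see what setting the extent does to the grid geometry, the cell size and cell-centre coordinates can be worked out by hand. The base-R sketch below assumes the 1080 by 2160 grid printed above and the global extent of -180 to 180 and -90 to 90. The variable names are illustrative only.

```r
# Grid size taken from the printed RasterLayer; the global extent is an assumption.
nrow_r <- 1080
ncol_r <- 2160
ext <- c(xmin = -180, xmax = 180, ymin = -90, ymax = 90)

res_x <- (ext[["xmax"]] - ext[["xmin"]]) / ncol_r   # degrees per column
res_y <- (ext[["ymax"]] - ext[["ymin"]]) / nrow_r   # degrees per row

# Cell centres sit half a cell inside the edges; row 1 is the northernmost row.
lon_centres <- ext[["xmin"]] + (seq_len(ncol_r) - 0.5) * res_x
lat_centres <- ext[["ymax"]] - (seq_len(nrow_r) - 0.5) * res_y
```

Under these assumptions each cell spans 1/6 of a degree in both directions, compared with the resolution of 1 that raster reports before the extent is set.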

R : error in moving a Raster using Shift function: It does NOT move. Why?

I have an nc file (which I can read as a raster in R and ArcGIS) and a shapefile (polygon) for Canada.
When I open both files in ArcMap, it re-projects them on the fly and the polygon appears in the right location.
Then, I use R to open and plot them using the code:
## set the working directory:
setwd("C:/Users/Desktop/Data")
## required libraries
library(raster)
library(rgdal)
library(ncdf4)
## reading the polygon:
My_study_area <- readOGR(dsn = ".", layer = "Canada_AB")
## create a connection to the downloaded NC file:
fn <- "tasmin_day_CanESM5_ssp126_r1i1p1f1_gn_21010101-23001231_cliped_AB.nc"
ncin <- nc_open(fn)
ncin
## create a raster from the nc file:
fn_brick <- brick(fn)
fn_brick
extent(fn_brick)
Here is the output:
class : Extent
xmin : 229.2188
xmax : 271.4062
ymin : 39.06842
ymax : 64.18258
## since it has 73,000 layers, let us only work on one layer: fn_brick[[1]]
My Goal:
## I want to shift this raster in the x direction for -177.1876 units:
## why -177.1876 units ?
## because I know if I shift it for -177.1876 units, the raster and the polygon will be in the right place
fn_brick_shifted <- shift(fn_brick[[1]], dx= -177.1876, dy= 0, filename="new_nc_file", format="GTiff", datatype='FLT4S', overwrite=T)
extent(fn_brick_shifted)
Despite shifting the raster, its extent remains unchanged:
class : Extent
xmin : 229.2188
xmax : 271.4062
ymin : 39.06842
ymax : 64.18258
However, let us plot them:
plot(fn_brick_shifted)
lines(My_study_area)
In the resulting plot in R, the raster is NOT shifted, so the polygon is not plotted in the right place.
However, we can see the coordinate info for the files:
For the raster:
crs(fn_brick_shifted)
CRS arguments:
+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
For the polygon:
crs(My_study_area)
CRS arguments:
+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
Did I make any mistake in using the shift function for the raster?
Do I need to employ another function for obtaining my goal?
Any comment or help would be highly appreciated.
If you think having the raster and polygon files may help, you could download them here in a zip file:
nc and polygon.zip
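Judging only from the extent printed above (xmin 229.2, xmax 271.4), the longitudes look as if they follow a 0 to 360 convention rather than -180 to 180. That is an assumption, not something the post confirms. If it holds, the offset that lines the raster up with a polygon in -180 to 180 longitudes would be -360 rather than -177.1876. The base-R sketch below compares the two offsets on the extent alone; shift_x is a hypothetical helper and does not explain why the shift() call left the extent unchanged.

```r
# Extent copied from the output above.
ext <- c(xmin = 229.2188, xmax = 271.4062, ymin = 39.06842, ymax = 64.18258)

# Hypothetical helper: move only the x limits of an extent vector by dx.
shift_x <- function(ext, dx) {
  ext[c("xmin", "xmax")] <- ext[c("xmin", "xmax")] + dx
  ext
}

ext_user <- shift_x(ext, -177.1876)  # offset proposed in the question
ext_wrap <- shift_x(ext, -360)       # offset for 0..360 longitudes (assumed)
```

With -360 the x limits become roughly -130.8 to -88.6, which are western-hemisphere longitudes, whereas -177.1876 gives roughly 52.0 to 94.2.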

Resources