I'm fairly new to using R for GIS purposes. I have a NetCDF file containing several variables with multiple dimensions (x, y, z, value and time), and I am trying to turn it into a raster brick. The data is quite large, so I need to pull data from a specified time window and depth (z). This has not been a problem: I can extract an array with the appropriate dimensions using the code below.
library(ncdf4)
library(raster)
t <- ncvar_get(nc, "model_time")
t1<-ncvar_get(nc,"model_time_step")
tIdx<-t[t> 20120512 & t < 20120728]
tIdx2<-which(t> 20120512 & t < 20120728)
# Depths profiles < 6 meters
dIdx<-which(nc$dim$depthu$vals <6)
# ncdf dimension lengths
T3 <- nc$var[[7]]
varsize <- T3$varsize
# Define the data (depths,time,etc.) you wish to extract from the ncdf
start <- c(x = 1, y= 1,depthu=1, time_counter = min(tIdx2))
count <- c(x = max(varsize[1]), y = max(varsize[2]),depthu=1, time_counter =
max(tIdx2)-min(tIdx2)+1)
# order of the dimensions
dim.order <- sapply(nc$var$votemper$dim, function(x) x$name)
temp<-ncvar_get(nc,"votemper",start=start[dim.order],count=count[dim.order])
nc$var$votemper
An example of my data (dropping the depth/z and the time dimensions)
temp<-structure(c(0,0,0,0,0,0,0,15.7088003158569,15.3642873764038,14.9720048904419,NA,15.9209365844727,14.9940872192383,15.0184164047241,15.0260219573975, 0,15.7754755020142, 15.424690246582, 15.6697931289673,15.6437339782715, 0,15.6151847839355, 15.5979156494141, 15.6487197875977,15.432520866394), .Dim = c(x = 5L, y = 5L))
The latitudes and longitudes extracted from the NetCDF file are irregularly spaced and each has two dimensions (i.e. there is a separate, irregularly spaced lat and lon for every cell).
lon<-structure(c(-71.2870483398438,-71.2038040161133,-71.1205596923828,-71.0373153686523, -70.9540710449219, -71.2887954711914, -71.2055587768555,-71.122314453125, -71.0390701293945,-70.9558258056641,-71.2905654907227,-71.2073211669922,-71.1240844726562,-71.0408401489258,-70.9576034545898,-71.292350769043,-71.209114074707, -71.1258773803711, -71.0426330566406,-70.9593963623047, -71.2941513061523, -71.2109222412109, -71.127685546875,-71.0444488525391, -70.9612045288086), .Dim = c(5L, 5L))
lat<-structure(c(38.5276718139648, 38.529125213623, 38.5305824279785,38.532039642334, 38.5334968566895, 38.5886116027832, 38.5900802612305,38.591552734375, 38.5930252075195, 38.5944976806641, 38.6494789123535,38.6509628295898, 38.6524467468262, 38.6539344787598, 38.6554222106934,38.7102699279785, 38.7117652893066, 38.713264465332, 38.7147674560547,38.7162704467773, 38.7709808349609, 38.7724952697754, 38.7740097045898,38.7755241394043, 38.777042388916), .Dim = c(5L, 5L))
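One quick way to see that the grid is not regular is to compare the columns of the longitude matrix: on a regular grid they would all be identical. A minimal sketch in base R, using only the first two values of the first two columns of `lon` above (the names `lon_sub` and `is_regular` are made up for this illustration):

```r
# First two values of the first two columns of the lon matrix from the post
lon_sub <- matrix(c(-71.2870483398438, -71.2038040161133,
                    -71.2887954711914, -71.2055587768555), nrow = 2)

# On a regular grid every column of the longitude matrix equals the first one
is_regular <- all(apply(lon_sub, 2, function(col) isTRUE(all.equal(col, lon_sub[, 1]))))
is_regular
```

Here the second column is shifted slightly relative to the first, so the check returns FALSE, which is why a single extent (xmn, xmx, ymn, ymx) cannot place the cells correctly.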
Typically I would generate a raster brick from this data using
Temp_brick <- brick(temp, xmn=min(lat), xmx=max(lat), ymn=min(lon), ymx=max(lon),transpose=T)
Temp_brick<-t(flip(Temp_brick,1))
This, however, does not account for the irregular spacing, so raster cell values end up in the wrong position (lon, lat). I have searched Stack Overflow and other GIS help sources and can't find a similar problem with a solution, or perhaps I'm not asking the right question. I'm not sure whether this should be dealt with when extracting the data from the NetCDF file, or after the raster brick has been created without a defined extent. I have tried to find a way to define the lon/lat values for the raster, without any luck. I also tried converting lon, lat and value to a 3-column data frame and then using the raster::rasterFromXYZ function. This is not fast enough for the size of the data I'm dealing with, which in reality is 197 (x) * 234 (y) * 2 (z) * 900 (time) * 5 (variables) * 12 (years, in separate NetCDF files).
Any help is greatly appreciated
One option is to use akima to first interpolate the data to a regular grid and then turn it into a raster:
library(akima)
library(raster)
# define the regular lon/lat grid, or just pass the nx, ny parameters to interp()
lonlat_reg <- expand.grid(lon = seq(min(lon), max(lon), length.out = 5),
lat = seq(min(lat), max(lat), length.out = 5))
# interpolate the irregular data to a regular grid
# both calls below return the same result because
# I've defined the regular grid to match the akima default
test <- interp(x = as.vector(lon), y = as.vector(lat), z = as.vector(temp),
xo = unique(lonlat_reg[,"lon"]), yo = unique(lonlat_reg[,"lat"]),
duplicate = "error", linear = FALSE, extrap = FALSE)
test <- interp(x = as.vector(lon), y = as.vector(lat), z = as.vector(temp),
nx = 5, ny = 5, linear = FALSE, extrap = FALSE)
# turn into a raster
test_ras <- raster(test)
Check the arguments of the function to choose the type of interpolation performed, and be careful if you use extrapolation!
I've also seen that method
Cheers
I am working with air pollution data and I am unable to figure out how to adjust the coordinates of the raster file I am working with. The raw data is in NetCDF4 format at a resolution of .01*.01, available here
This is what it looks like:
a <- raster("V4NA03_PM25_NA_201701_201701-RH35.nc")
a_df <- as.data.frame(a, xy = TRUE)
Raster info
Dataframe
I use this file to crop out USA, but I require the latitudes and longitudes at .5*.5 resolution, in the form of, for instance, x = -170.5 y = -65.5, x = -170, y = -65, x = -169.5 y = -64.5 etc.
I use the following command:
a <- aggregate(a, fact = 50)
a <- as.data.frame(a, xy = T)
But then I get the following lat longs: -167.75, -167.25, 67.95, with number of rows down to 16296.
Am I doing something wrong? Or is my approach entirely wrong? Would rotation or reprojection help my cause?
Any help is appreciated. Thanks.
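The .25/.75 coordinates follow from how cell centres are computed: aggregating by a factor of 50 turns 0.01 cells into 0.5 cells, and each centre sits half a cell (0.25) in from the cell edge. A minimal base R sketch of that arithmetic, assuming a western edge of -168 (a hypothetical value chosen to match the coordinates reported above, not read from the file):

```r
# Assumed (hypothetical) western edge of the grid and the known resolutions
xmin       <- -168
res_fine   <- 0.01
fact       <- 50
res_coarse <- res_fine * fact            # 0.5 degrees after aggregation

# Centres of the first three aggregated cells: edge + (k + 0.5) * resolution
centres <- xmin + res_coarse * (0:2 + 0.5)
centres

# Centres land on multiples of 0.5 only if the cell edges are shifted by half a cell
centres_shifted <- (xmin - res_coarse / 2) + res_coarse * (0:2 + 0.5)
centres_shifted
```

Under that assumption the first result is -167.75, -167.25, -166.75 and the second is -168, -167.5, -167. So the aggregation itself is behaving as expected; getting centres on .0/.5 would mean moving the data onto a grid whose edges are offset by 0.25, for example by resampling onto such a template.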
I have a few very small country-level polygon and point shapefiles that I would like to rasterize in R. The final product should be one global binary raster (indicating whether grid cell center is covered by a polygon / point lies within cell or not). My approach is to loop over the shapefiles and do the following for each shapefile:
# load shapefile
shp = sf::read_sf(shapefile_path)
# create a global raster template with resolution 0.0083
ext = extent(-180.0042, 180.0042, -65.00417, 75.00417)
gridsize = 0.008333333
r = raster(ext, res = gridsize)
# rasterize polygon or point shapefile to raster
rr = rasterize(shp, r, background = 0) #all grid cells that are not covered get 0
# convert to binary raster
values(rr)[values(rr)>0] = 1
Here, rr is the raster file where the polygons / points in shp are coded as 1 and all other grid cells are coded as 0. Afterwards, I take the sum over all rr to arrive at one global binary raster file including all polygons / points.
The final two steps are incredibly slow. In addition, I run into RAM problems when I try to replace all positive values in rr with 1, because the cell count is very large due to the fine resolution. I was wondering whether there is a smarter way to achieve this.
I have already found the fasterize package that has a speedy implementation of rasterize which works fine. I think it would be of great help if someone has a solution where rasterize directly returns a binary raster.
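The sum-then-binarise step described above can be illustrated on plain matrices, which also shows why the conversion back to 0/1 is needed when layers overlap. This is only a sketch with made-up 2x2 layers standing in for the rasterized shapefiles; `layer_a`, `layer_b` and `combined` are hypothetical names:

```r
# Two hypothetical binary layers (1 = covered, 0 = not covered)
layer_a <- matrix(c(1, 0, 0, 1), nrow = 2)
layer_b <- matrix(c(1, 1, 0, 0), nrow = 2)

# Summing the layers gives 2 where both cover the same cell
layer_sum <- Reduce(`+`, list(layer_a, layer_b))

# A logical comparison turns any positive count back into 1
combined <- (layer_sum > 0) * 1
combined
```

The comparison `layer_sum > 0` does the binarising in a single vectorised step, without the indexed assignment used in the question.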
This is how you can do this better with raster. Note the value=1 argument, and also that I changed your specification of the extent, as what you do is probably not correct.
library(raster)
v <- shapefile(shapefile_path)
ext <- extent(-180, 180, -65, 75)
r <- raster(ext, res = 1/120)
rr <- rasterize(v, r, value=1, background = 0)
There is no need for your last step, but you could have done
rr <- clamp(rr, 0, 1)
# or
rr <- rr > 0
# or
rr <- reclassify(rr, cbind(1, Inf, 1))
raster::calc is not very efficient for simple arithmetic like this.
It should be much faster to rasterize all vector data in one step, rather than in a loop, especially with large rasters like this (for which the program may need to write a temp file for each iteration).
To illustrate this solution with example data
library(raster)
cds1 <- rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60))
cds2 <- rbind(c(-10,0), c(140,60), c(160,0), c(140,-55))
cds3 <- rbind(c(-125,0), c(0,60), c(40,5), c(15,-45))
v <- spLines(cds1, cds2, cds3)
r <- raster(ncols=90, nrows=45)
r <- rasterize(v, r, field=1)
To speed things up, you can use terra (the replacement for raster)
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- as.lines(vect(f))
r <- rast(v, ncol=75, nrow=100)
x <- rasterize(v, r, field=1)
Something that seems to work and significantly improves computation time is to:
1. Create one large shapefile shp instead of working with individual rasterized shapefiles.
2. Use the fasterize package to rasterize the merged shapefile.
3. Use raster::calc to avoid memory problems.
library(raster)
library(fasterize)
ext = extent(-180.0042, 180.0042, -65.00417, 75.00417)
gridsize = 0.008333333
r = raster(ext, res=gridsize)
rr = fasterize(shp, r, background = 0) #all not covered cells get 0, others get sum
# convert to binary raster
fun = function(x) {x[x>0] <- 1; return(x) }
r2 = raster::calc(rr, fun)
I need to calculate the magnitude-per-unit area of polylines that fall within a radius around each cell. Essentially I need to calculate a km/km2 road density within a 500m pixel search radius. ArcMap has a quick and easy tool that handles this, but I need a pure R solution.
Here is a link on how line density works: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-line-density-works.htm
And this is how to use it in a python (arcpy) script: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/line-density.htm
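As a point of reference, the quantity being asked for is the total length of line falling inside the search circle divided by the area of that circle. A minimal base R sketch for a single cell, with made-up segment lengths (`segment_km` and `radius_km` are hypothetical inputs, not taken from any dataset):

```r
# Hypothetical lengths (km) of the road pieces that fall inside one search circle
segment_km <- c(0.4, 0.25, 0.1)
radius_km  <- 0.5

# Line density = total length inside the circle / area of the circle
circle_area_km2 <- pi * radius_km^2
line_density    <- sum(segment_km) / circle_area_km2
line_density
```

With these numbers the result is 3/pi, roughly 0.955 km/km2. Any raster-based approach is an approximation of this calculation repeated for every cell.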
I currently execute a backwards approach using raster::focal function, calculating a density of burned in road features. I then convert the km2/km2 output to km/km2.
#Import libraries
library(raster)
library(rgdal)
library(gdalUtils)
#Read-in an already created raster mask (cells are all set to 0)
mask_path <- "x://path to raster mask..."
mask <- raster(mask_path)
#Make a copy of the mask file to burn features in, keeping the original untouched
file.copy(mask_path, "x://output path ...//roads.tif")
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Rasterize the road features ie. burn them into the copied mask
#gdal_rasterize reads from the data source on disk, not from the R object
#Where road features get a value of 1, mask extent gets a value of 0
roads_raster <- gdalUtils::gdal_rasterize(src_datasource = "x://path to shapefile",
dst_filename = "x://output path ...//roads.tif", b = 1,
burn = 1, l = "roads", output_Raster = TRUE)
#Run a 1km circular radius density function (be mindful of edge effects)
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
#Object names cannot start with a digit, hence rdDensity_1km
rdDensity_1km <- raster::focal(roads_raster, weight, fun=sum, filename = '',
na.rm=TRUE, pad=TRUE, NAonly=FALSE, overwrite=TRUE)
#Convert km2/km2 road density to km/km2
#Set up the moving window
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
#Count how many records in each column of the moving window are > 0
columnCount <- apply(weight,2,function(x) sum(x > 0))
#Get the sum of the column count
number_of_cells <- sum(columnCount)
#multiply km2/km2 density by number of cells in the moving window
step1 <- rdDensity_1km * number_of_cells
#Rescale step1 output with respect to cell size(30m) and radius of a circle
final_rdDensity <- (step1*0.03)/pi
#Write out final km/km2 road density raster
writeRaster(final_rdDensity,"X://path to output...", datatype = 'FLT4S', overwrite = TRUE)
After some more research I think I may be able to use a kernel function; however, I don't want to apply the smoothing algorithm. Also, the output is an 'im' object, which I would need to write out as a 'tif'.
#Import libraries
library(spatstat)
library(rgdal)
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Convert roads spatial lines data frame to psp object
psp_roads <- as.psp(roads_sldf)
#Apply kernel density, however this is where I am unsure of the arguments
road_density <- spatstat::density.psp(psp_roads, sigma = 0.01, eps = 500)
Cheers.
See this question https://gis.stackexchange.com/questions/138861/calculating-road-density-in-r-using-kernel-density
I tried to mark this as a duplicate, but that doesn't work because the other question is on GIS Stack Exchange.
The short answer is to use spatstat.geom::pixellate()
I also needed spatstat.geom::as.psp(sf::st_geometry(x)) to convert an sf lines object to the correct format and maptools::as.im.RasterLayer(r) to convert a raster. I was able to convert the result to RasterLayer with raster::raster(pix_res)
Perhaps you can use terra::rasterizeGeom which is available in the development version that you can install with install.packages('terra', repos='https://rspatial.r-universe.dev')
Example data
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f) |> as.lines()
r <- rast(v, res=.1)
Solution
x <- rasterizeGeom(v, r, fun="length", unit="km")
And then use focal sum, but you would not have a perfect circle.
What you could do instead, if your dataset is not too large, is create a circle for each grid cell and use intersect. Something like this:
p <- xyFromCell(r, 1:ncell(r)) |> vect(crs="+proj=longlat")
p$id <- 1:ncell(r)
b <- buffer(p, 10000)
values(v) <- NULL
i <- intersect(v, b)
x <- aggregate(perim(i), list(id=i$id), sum)
r[x$id] <- x[,2]
I have some gridded data of sea surface temperature values in the Mediterranean to which I've applied clustering. I have 420 files with three columns structure (long,lat,value). The data for a particular file looks like this map
Now I want to extract the cluster areas as shapefile for postprocessing. I have found this post (https://gis.stackexchange.com/a/187800/9227) and tried to use its code like this
# Packages
library(sp)
library(rgdal)
library(raster)
# Paths
ruta_datos<-"/home/meteo/PROJECTES/VERSUS/OUTPUT/DATA/CLUSTER_MED/"
setwd("~/PROJECTES/VERSUS/temp")
# File list
files <- list.files(path = ruta_datos, pattern = "SST-cluster-mitja-mensual")
for (i in seq_along(files)){
  # paste0() has no sep argument, so it is dropped here
  datos<-read.csv(paste0(ruta_datos,files[i]),header=TRUE)
  nclusters<-max(datos$cluster)
  for (j in 1:nclusters){
    clust.dat<-subset(datos, cluster == j)
    coordinates(clust.dat)=~longitud+latitud
    proj4string(clust.dat)=CRS("+init=epsg:4326")
    pts = spTransform(clust.dat,CRS("+init=epsg:4326"))
    gridded(pts) = TRUE
    r = raster(pts)
    projection(r) = CRS("+init=epsg:4326")
    # make all values the same
    s <- r > -Inf
    # convert to polygons
    pp <- rasterToPolygons(s, dissolve=TRUE)
    # save shapefile
    shname<-paste("SST-shape-",substr(files[i],27,32),"-",j,sep="")
    writeOGR(pp, dsn = '.', layer = shname, driver = "ESRI Shapefile")
  }
}
But the code stops with this error message
gridded(pts) = TRUE
suggested tolerance minimum: 1
Error in points2grid(points, tolerance, round) : dimension 2 : coordinate intervals are not constant
Warning message: In points2grid(points, tolerance, round) : grid has empty column/rows in dimension 1
I don't understand why, for a certain file, it says that coordinate intervals are not constant when they indeed are: the original SST data from which the clustering was derived are on a regular grid over the whole globe. All cluster data files have the same size, 4248 points. A sample data file is available here
What does the tolerance suggestion mean? I've been looking for a solution and found some suggestions to use SpatialPixelsDataFrame, but couldn't work out how to apply it.
Any help would be appreciated. Thanks.
I am not an expert in geospatial data, but it seems to me that once you filter on cluster, the data are indeed no longer on a grid. As far as I understand, you start from a grid (a convex set of regularly spaced points).
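A small base R illustration of that point, using a made-up regular grid and made-up cluster labels (the column names mirror the question, everything else is hypothetical): the spacing between distinct longitudes is constant for the full grid but not for a single cluster.

```r
# Hypothetical regular grid with 1-degree spacing and invented cluster labels
grid_pts <- expand.grid(longitud = 0:5, latitud = 0:3)
grid_pts$cluster <- ifelse(grid_pts$longitud %in% c(0, 1, 4), 1, 2)

# Spacing between distinct longitudes: whole grid vs. cluster 1 only
spacing_all <- diff(sort(unique(grid_pts$longitud)))
spacing_c1  <- diff(sort(unique(grid_pts$longitud[grid_pts$cluster == 1])))

spacing_all  # 1 1 1 1 1
spacing_c1   # 1 3
```

Cluster 1 keeps only longitudes 0, 1 and 4, so the intervals are 1 and 3. That is the kind of situation in which the "coordinate intervals are not constant" error would be expected.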
I tried the following modifications to your code. Some files are generated, but I can't test whether they are correct.
The principle is to build the grid on all the data and only filter on cluster just before calling raster.
This gives:
files <- list.files(path = ruta_datos, pattern = "SST-cluster-mitja-mensual")
for (i in seq_along(files)){
  # paste0() has no sep argument, so it is dropped here
  datos<-read.csv(paste0(ruta_datos,files[i]),header=TRUE)
  nclusters<-max(datos$cluster)
  for (j in 1:nclusters){
    ## clust.dat<-subset(datos, cluster == j)
    clust.dat <- datos
    coordinates(clust.dat)=~longitud+latitud
    proj4string(clust.dat)=CRS("+init=epsg:4326")
    pts = spTransform(clust.dat,CRS("+init=epsg:4326"))
    gridded(pts) = TRUE
    ## r = raster(pts)
    r= raster(pts[pts$cluster==j,])
    projection(r) = CRS("+init=epsg:4326")
    # make all values the same
    s <- r > -Inf
    # convert to polygons
    pp <- rasterToPolygons(s, dissolve=TRUE)
    # save shapefile
    shname<-paste("SST-shape-",substr(files[i],27,32),"-",j,sep="")
    writeOGR(pp, dsn = '.', layer = shname, driver = "ESRI Shapefile")
  }
}
So two lines are commented out (marked with ##), and in each case the line just below is the replacement.
I have two problems.
Problem 1:
I would like to delete all cells of the grid whose centroid is not located over the raster. I'm not even sure if I'm working with the right "types of objects" (RasterLayer, SpatialPixels, etc.).
See example with dummy data below:
# Load package
library(raster)
# Create raster and define coordinate reference system
ras <- raster(nrows = 100, ncols = 100, xmn = 0, xmx = 100, ymn = 0, ymx = 100)
proj4string(ras) <- CRS("+init=epsg:32198")
# Generate random values
val <- sample(x = 1:100, size = ncell(ras), replace = T)
values(ras) <- val
# Create effort grid
xym <- matrix(c(-30,130,130,-30,-30,-30,130,130), nrow = 4, ncol = 2)
p <- Polygon(xym)
ps <- Polygons(list(p), 1)
sps <- SpatialPolygons(list(ps))
proj4string(sps) <- CRS("+init=epsg:32198")
data <- data.frame(f = 99.9)
spdf <- SpatialPolygonsDataFrame(sps, data)
ptsreg <- spsample(spdf, 50, type = "regular")
grid <- SpatialPixels(ptsreg)
# Plot raster over grid
plot(grid)
plot(ras, add = T)
Problem 2: Is there another way to create an effort grid? My code works but I'm pretty sure there's a simpler way?
Also... In this example, I plotted the grid first and added the raster. If I do it the other way (raster first and grid second), the resulting plot doesn't show the whole extent of the grid.
How can I plot the grid over the raster but still show the entire grid? Such as:
For your first question, here is an option. Try:
cont <- as(extent(ras), "SpatialPolygons")
proj4string(cont ) <- CRS("+init=epsg:32198")
grid.small <- grid[!is.na(grid%over%cont)]
You first make a polygon from the extent of your raster, then you subset your grid with the %over% function. I think it gives you the result you are looking for.
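Because the raster extent is a plain rectangle, the same selection can be pictured as a simple coordinate comparison. The following base R sketch is not the %over% approach itself, just the underlying test, applied to made-up centroids (`centroids` and `inside` are hypothetical names; the 0 to 100 limits come from the example raster):

```r
# Hypothetical grid-cell centroids spanning beyond the raster extent
centroids <- expand.grid(x = seq(-20, 120, by = 20), y = seq(-20, 120, by = 20))

# Keep only centroids that fall inside the raster extent (0..100 in x and y)
inside <- centroids$x >= 0 & centroids$x <= 100 &
          centroids$y >= 0 & centroids$y <= 100
centroids_small <- centroids[inside, ]
nrow(centroids_small)
```

Of the 64 centroids, 36 remain. For a non-rectangular footprint this shortcut no longer applies and a polygon overlay is the better fit.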
For your second question, it isn't the best, but try:
plot(grid)
plot(ras,add=T)
plot(grid,add=T)
It is simple and it works. Normally,
plot(ras,ext=extent(grid))
plot(grid,add=T)
should work, but it doesn't, and I couldn't figure out why...
As with everything in R, there are always multiple ways to do one thing. If you try what I propose and it is not efficient with your real data, it could be worth looking for another solution.