Error when attempting distance() with raster - r

I have been trying to produce a plot using distance() from the raster package. The raster dimensions are inherited from a SpatialPointsDataFrame. The raster works fine until I call distance(raster), which gives the following warning:
Warning message:
In matrix(v, ncol = tr$nrows[1] + 3) :
data length [8837790] is not a sub-multiple or multiple of the number of rows [4384]
The bizarre thing is that the raster works at a smaller resolution but not at a larger one. The error can be replicated as follows:
Fails:
library(raster)
r <- raster(ncol=4386,nrow=6039)
r[] <- NA
r[500] <- 1
dist <- distance(r)
plot(dist / 1000)
Works:
r <- raster(ncol=438.6,nrow=603.9)
r[] <- NA
r[500] <- 1
dist <- distance(r)
plot(dist / 1000)
Why? Have I missed something really obvious?

An update to raster_2.4-20 solved the problem. Thanks Pascal and RobertH for pointing me in the right direction.
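For reference, a quick way to confirm whether the fix applies is to check the installed version (a minimal sketch; per the note above, 2.4-20 or later avoids the warning):
packageVersion("raster")
install.packages("raster")  # updates to the current CRAN release if the installed version is older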

Related

Aggregate a high-resolution (300 m × 300 m) raster (raster::aggregate and velox cannot handle this resolution well)

I'm trying to aggregate a raster r of global extent from ~300 m × 300 m (10 arc-seconds, 7.4 GB) resolution to ~10 km resolution (0.083333 decimal degrees), i.e. by a factor of 30.
Neither the aggregate function from the raster package nor the one from velox seems to handle such a large dataset. I very much welcome recommendations!
# sample raster at the target input resolution
library(raster)
r <- raster(extent(-180, 180, -90, 90))
res(r) <- c(0.5/6/30, 0.5/6/30)
r <- setValues(r, runif(ncell(r))) # Error: cannot allocate vector of size 62.6 Gb
# velox example
devtools::install_github('hunzikp/velox')
library(velox)
vx <- velox(r) # the process aborts in linux
vx$aggregate(factor=30, aggtype='mean')
# raster example
r_agg <- aggregate(r, fact=30)
You say that raster cannot handle a large raster like that, but that is not true. The problem is that you are trying to create a very large data set in memory, requiring more memory than your computer has available. You can use the init function instead. I show that below, though not with a global 300 m raster, so that the example runs a bit faster.
library(raster)
r <- raster(ymn=80, res=0.5/6/30)
r <- init(r, "col")
r_agg <- aggregate(r, fact=30)
You get better mileage with terra
library(terra)
rr <- rast(ymin=80, res= 0.5/6/30)
rr <- init(rr, "col")
rr_agg <- aggregate(rr, fact=30)
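If memory is still an issue, both raster::aggregate and terra::aggregate also accept a filename argument, so the result is streamed to disk rather than built in memory (a small sketch; the output path is just a placeholder):
rr_agg <- aggregate(rr, fact = 30, fun = "mean",
                    filename = "agg_10km.tif", overwrite = TRUE)  # "agg_10km.tif" is a placeholder path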
In addition to Robert's suggestion, I'd resample the raster to a template so the extent and CRS come out exactly as intended.
library(terra)
library(magrittr)  # for the %>% pipe
r <- terra::rast("your_rast.tif") %>%
  aggregate(., fact = 30) %>%
  resample(., template_rast, filename = "sth.tif",
           wopt = list(gdal = c("COMPRESS=LZW", "TFW=YES", "BIGTIFF=YES"),
                       tempdir = "somewhere_you_have_a_lot_of_space", todisk = TRUE))
Those wopt options might help you a lot with large rasters.
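For completeness, a minimal sketch of how template_rast could be created for a global 0.083333-degree grid (the object name, extent, and CRS are assumptions, not taken from the original answer):
library(terra)
# hypothetical template: global extent, 0.083333-degree cells, lon/lat WGS84
template_rast <- rast(extent = ext(-180, 180, -90, 90),
                      resolution = 0.083333, crs = "EPSG:4326")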

How to select one point per raster grid cell?

I have a point shapefile ("search_effort.shp") that is highly clustered and an NDVI raster (resolution in m: 30.94948, 30.77829). I would like to subset my search_effort.shp by selecting 1 point per raster grid cell and create a new search_effort shapefile. I am using R version 4.0.3
I think I could have used Package ‘gridsample’ (in 'raster' v1.3-1), but it was removed from the CRAN repository and I would prefer not to use the archived version. Is there another way to do this in R?
I have also tried sample.grid but I do not know how to specify my raster as the grid, and have tried the following:
# Required packages
library(rgdal)
library(sp)
library(raster)
# NDVI raster to be used as the reference extent
NDVI_extent <- readGDAL('C:/Model_layers/NDVI.tif')
# Load the file names
layername <- "SearchEffort"
# Read in the shapefile
search_effort <- readOGR(dsn= ".", layer = layername)
plot(search_effort)
# Set the reference extent
r <- raster(NDVI_extent)
# Extract coordinates from the shapefile
search_effort@coords <- search_effort@coords[, 1:2]
#Subset points
sample.grid(search_effort, cell.size = c(30.94948, 30.77829), n = 1)
I get the following error:
"Error in validObject(.Object) : invalid class “GridTopology” object: cellsize has incorrect dimension."
I get the same error regardless of the cell.size I specify.
Example data
library(raster)
r <- raster(res=30)
values(r) <- 1:ncell(r)
x <- runif(1000,-180,180)
y <- runif(1000,-90,90)
xy <- cbind(x, y)
Solution
library(dismo)
s <- gridSample(xy, r, n=1)
Illustration
plot(as(r, "SpatialPolygons"))
points(s, col="red")
points(xy, cex=.1, col="blue")
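Applied to the data in the question, the same idea would look roughly like this (a sketch, assuming the NDVI raster and search_effort points from the question are loaded; gridSample() also accepts SpatialPoints objects):
library(raster)
library(dismo)
ndvi <- raster('C:/Model_layers/NDVI.tif')   # the raster whose grid defines the cells
s <- gridSample(search_effort, ndvi, n = 1)  # one point per NDVI cell (see ?gridSample for the return type)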

Streamlining binary rasterization in R

I have a few very small country-level polygon and point shapefiles that I would like to rasterize in R. The final product should be one global binary raster (indicating whether grid cell center is covered by a polygon / point lies within cell or not). My approach is to loop over the shapefiles and do the following for each shapefile:
# load shapefile
shp = sf::read_sf(shapefile_path)
# create a global raster template with resolution 0.0083
ext = extent(-180.0042, 180.0042, -65.00417, 75.00417)
gridsize = 0.008333333
r = raster(ext, res = gridsize)
# rasterize polygon or point shapefile to raster
rr = rasterize(shp, r, background = 0) #all grid cells that are not covered get 0
# convert to binary raster
values(rr)[values(rr)>0] = 1
Here, rr is the raster file where the polygons / points in shp are coded as 1 and all other grid cells are coded as 0. Afterwards, I take the sum over all rr to arrive at one global binary raster file including all polygons / points.
The final two steps are incredibly slow. In addition, I run into RAM problems when I try to replace all positive values in rr with 1, as the cell count is very large due to the fine resolution. I was wondering whether it is possible to come up with a smarter solution for what I'd like to achieve.
I have already found the fasterize package that has a speedy implementation of rasterize which works fine. I think it would be of great help if someone has a solution where rasterize directly returns a binary raster.
This is how you can do this better with raster. Note the value=1 argument, and also that that I changed your specification of the extent -- as what you do is probably not correct.
library(raster)
v <- shapefile(shapefile_path)
ext <- extent(-180, 180, -65, 75)
r <- raster(ext, res = 1/120)
rr <- rasterize(v, r, field=1, background = 0)
There is no need for your last step, but you could have done
rr <- clamp(rr, 0, 1)
# or
rr <- rr > 0
# or
rr <- reclassify(rr, cbind(1, Inf, 1))
raster::calc is not very efficient for simple arithmetic like this. It should also be much faster to rasterize all vector data in one step, rather than in a loop, especially with large rasters like this (for which the program may need to write a temp file for each iteration).
To illustrate this solution with example data
library(raster)
cds1 <- rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60))
cds2 <- rbind(c(-10,0), c(140,60), c(160,0), c(140,-55))
cds3 <- rbind(c(-125,0), c(0,60), c(40,5), c(15,-45))
v <- spLines(cds1, cds2, cds3)
r <- raster(ncols=90, nrows=45)
r <- rasterize(v, r, field=1)
To speed things up, you can use terra (the replacement for raster)
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- as.lines(vect(f))
r <- rast(v, ncol=75, nrow=100)
x <- rasterize(v, r, field=1)
Something that seems to work computationally and significantly improves computation time is to:
1. Create one large shapefile shp instead of working with individual rasterized shapefiles (see the sketch below).
2. Use the fasterize package to rasterize the merged shapefile.
3. Use raster::calc to avoid memory problems.
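A rough sketch of one way to do the first step, merging the individual shapefiles into a single sf object before rasterizing (shapefile_paths is a hypothetical character vector of the input paths, not from the original post; only the geometry is kept so layers with different attribute tables can be combined):
library(sf)
# shapefile_paths: placeholder vector of the individual shapefile paths
geoms <- lapply(shapefile_paths, function(p) sf::st_geometry(sf::read_sf(p)))
shp <- sf::st_sf(geometry = do.call(c, geoms))  # one sf object holding all features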
library(raster)
library(fasterize)
ext = extent(-180.0042, 180.0042, -65.00417, 75.00417)
gridsize = 0.008333333
r = raster(ext, res = gridsize)
rr = fasterize(shp, r, background = 0) # all cells that are not covered get 0, the others get the sum
# convert to binary raster
fun = function(x) { x[x > 0] <- 1; return(x) }
r2 = raster::calc(rr, fun)

How to run a Savitzky-Golay filter on a time series NDVI image

Is there any way to run a Savitzky-Golay filter on a time series NDVI image in R? I have already tried the following code from the package 'signal':
sg <- sgolayfilt(timeseries, 3, 5)
But it returns the following error:
Error in if (all(is.na(x))) return(x) :
argument is not interpretable as logical
The file "timeseries" here is a stacked raster NDVI image. Can anybody help me with this?
Thank you for your kind help.
I have 12 raster layers, which I stack into a RasterStack:
###### Load needed packages
library(raster)
library(sp)
library(rgdal)
library(tiff)
library(ggplot2)
library(maptools)
library(zoo)
library(signal)
library(timeSeries)
NDVI_STACK <- stack(jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec)
fun <- function(x) {
  v <- as.vector(x)
  z <- substituteNA(v, type = "mean")  # from package timeSeries
  NDVI.ts2 <- ts(z, start = c(2005, 1), end = c(2005, 12), frequency = 12)
  x <- sgolayfilt(NDVI.ts2, p = 2, n = 5, ts = 30)
  return(x)
}
NDVI.filtered <- calc(NDVI_STACK, fun, progress = 'text')  # apply the filter cell by cell across the stack
You may need to adjust p and n depending on your data. Note that n must be odd and, in my experience, p should be less than n.
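To get a feel for how p and n interact, here is a small toy example on a plain numeric vector (the data are made up; only sgolayfilt from the signal package is used):
library(signal)
v <- sin(seq(0, 4 * pi, length.out = 48)) + rnorm(48, sd = 0.2)  # toy noisy series
s1 <- sgolayfilt(v, p = 2, n = 5)    # mild smoothing
s2 <- sgolayfilt(v, p = 2, n = 11)   # stronger smoothing; n must stay odd
plot(v, type = "l"); lines(s1, col = "red"); lines(s2, col = "blue")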
I hope this helps :)

Set single raster to NA where values of raster stack are NA

I have two 30 m x 30 m raster files from which I would like to sample points. Prior to sampling, I would like to remove the clouded areas from the images. I turned to R and Hijmans' raster package for the task.
Using the drawPoly(sp=TRUE) command, I drew in 18 different polygons. The function did not seem to allow 18 polygons as one sp object, so I drew them all separately. I then gave the polygons a proj4string matching the rasters', and set them into a list. I ran the list through an lapply function to convert them to rasters (the rasterize function in Hijmans' package), with the polygon areas set to NA and the rest of the image set to 1.
My end goal is one raster layer with the 18 areas set to NA. I have tried stacking the list of rasterized polygons and subsetting it to set a new raster to NA in the same areas. My reproducible code is below.
library(raster)
r1 <- raster(nrow=50, ncol = 50)
r1[] <- 1
r1[4:10,] <- NA
r2 <- raster(nrow=50, ncol = 50)
r2[] <- 1
r2[9:15,] <- NA
r3 <- raster(nrow=50, ncol = 50)
r3[] <- 1
r3[24:39,] <- NA
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
s <- stack(r1, r2, r3)
test.a.cool <- calc(s, function(x){r4[is.na(x)==1] <- NA})
For whatever reason, the darn test.a.cool is a blank plot, where I'm aiming for a raster in which all values are 1 except for the NAs in the stack, s.
Any tips?
Thanks.
Doing sum(s) will work, as sum() returns NA for any grid cell with even one NA value in the stack.
To see that it works, compare the figures produced by the following:
plot(s)
plot(sum(s))
I posted this question on the R-Sig-Geo forum, as well, and received a response from the package author. The two simplest solutions:
Use the sp package to rbind my polygons into one, then rasterize the polygon.
p <- rbind(p1, p2, p3...etc., makeUniqueIDs = TRUE)
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
mask <- rasterize(p, r4)
mask[mask %in% 1:18] <- 1
#The above code produces a single raster file with
#my polygons as unique values, ready for masking.
And the second simple solution, as just pointed out by Josh O'Brien:
m <- sum(s)
test <- mask(r4, m)
The R community rocks. Problem solved (twice) within an hour. Thanks.
I'm not familiar with the package you are using; however, looking at the final line of your code, I think the issue might be here:
function(x){r4[is.na(x)==1] <- NA})
It doesn't look like calc will do much with that. The function only assigns NA to the values of r4 indexed by the NAs of x and returns nothing.
What then? If anything, maybe:
function(x){r4[is.na(x)==1] <- NA; return(r4) })
Although, it's not clear if that is even what you are after.
You were on the right track. The [ operator is defined for rasters and raster stacks, so you could just use the single line:
r4[ any(is.na(s) ) ] <- NA
plot(r4)
If you wanted to use calc, you could have used it like this:
r4 <- calc(s, function(x){ !any(is.na(x)) })
r4[r4 == 0] <- NA
plot(r4)

Resources