Impossible NAs in wrld_simpl - r

When I create a raster of the world's land based on wrld_simpl (or on any environmental layer coming from WorldClim), some "impossible" NAs always appear on land. Why would that happen? I need a clean mask of the world's land so I can drop records that fall in the ocean. However, many records that are clearly on land are still returned as NA.
My script goes like this:
require(raster)
require(maptools)
data(wrld_simpl)
x=read.csv("https://www.dropbox.com/s/ncvu64r2fxgfd4e/NAlocations.csv?dl=0")
r=raster(ncols=360,nrows=(180))
extent(r)=extent(wrld_simpl)
r=rasterize(wrld_simpl,r,wrld_simpl$AREA)
plot(r)
x=x[-which(is.na(extract(r,x$lon,x$lat))),] # This should remove all locations that fall in the ocean (NA cells).
points(x$lon,x$lat, col="red", cex=.3)
How is that possible? And is there a way to create a clean raster of the world's land?

The direct read.csv from dropbox does not work for me.
If I do
z <- extract(r, x)
# NOT z <- extract(r, x[,1], x[,2]) !!!
i <- which(is.na(z))
points(x[i,])
I see a bunch of points in the water off the coast of Mozambique.
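For completeness, a minimal sketch of the corrected extraction (assuming x has columns named lon and lat; the finer grid is my choice, not part of the question). The remaining "impossible" NAs come from rasterize() transferring a polygon value only to cells whose centre the polygon covers, so on a coarse 1-degree grid many near-shore land points sit over cells that stay NA; a finer grid shrinks that fringe.
library(raster)
library(maptools)
data(wrld_simpl)
r <- raster(ncols = 3600, nrows = 1800)      # ~0.1-degree grid, ten times finer than before
extent(r) <- extent(wrld_simpl)
r <- rasterize(wrld_simpl, r, field = "AREA")
z <- extract(r, x[, c("lon", "lat")])        # pass both coordinates together, as one object
x.land  <- x[!is.na(z), ]                    # records that fall on a land cell
x.ocean <- x[is.na(z), ]                     # records over NA (ocean or coastal-fringe) cells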

Related

I want to rasterise a shapefile, but it is taking ages and I have not succeeded yet. Can anyone tell me why?

I have a shapefile that I made in QGIS of a national park with landcover type. I clipped a large shapefile of Thailand with a smaller shapefile of just the park. This file (DPKY.lc5) is now 10.5 MB. When I run the code below it takes forever and never completes. Why is that?
library(raster)
DPKY.lc5 <- shapefile("dpky.lc5.shp")
DPKY.lc5 <- spTransform(DPKY.lc5, CRS('+init=epsg:4326'))
DPKY.lc5$VALUE<-as.numeric(DPKY.lc5$VALUE)
rr <- raster(DPKY.lc5, res=0.01)
rr1 <- rasterize(DPKY.lc5, rr, field="VALUE")
rr1
I would expect this to work as the file is only 10.5 MB, but it takes forever. I need this raster so I can use data points in another data frame to show the frequency of elephant habitat use in the park. I succeeded with other raster files for elevation and slope/aspect, but this time it does not work.
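raster::rasterize can be very slow on detailed polygon layers like a clipped land-cover map. A hedged alternative sketch using the fasterize package on an sf copy of the same layer (the file and field names are taken from the question; everything else is an assumption, offered as an alternative rather than a fix to the original call):
library(raster)
library(sf)
library(fasterize)
DPKY.lc5 <- st_read("dpky.lc5.shp")
DPKY.lc5 <- st_transform(DPKY.lc5, 4326)
DPKY.lc5$VALUE <- as.numeric(DPKY.lc5$VALUE)
rr  <- raster(as(DPKY.lc5, "Spatial"), res = 0.01)   # same raster template as before
rr1 <- fasterize(DPKY.lc5, rr, field = "VALUE")      # rasterizes sf polygons much faster
rr1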

How to calculate area of multipart polygon in R

I'm having trouble calculating in R the area of an imported shapefile that has a multipart polygon (one feature containing two separate polygons). I noticed that ArcMap gave me a different value for the area of a shapefile than raster::area. To figure out which program was giving me the correct area, I broke the shapefile into single parts and recalculated the area of the two separate polygons:
library(raster)
> single_part <- shapefile("../Desktop/test/test_sp.shp")
> area(single_part)
[1] 575924.0 433409.8
> sum(area(single_part))
[1] 1009334
>
> multi_part <- shapefile("../Desktop/test/test_mp.shp")
> area(multi_part)
[1] 1018390
I realize that, now that I know about this problem, I should always break up polygon feature classes into single parts, but does anyone know how raster::area calculates the area of multipart polygons? I also tried rgeos::gArea but got the same result. Is there a way to correctly calculate the area of multipart polygons in R?
I'd love to know, because they're pretty common and I'm trying to switch from doing all my analyses in ArcMap to R.
In case it's helpful, here's an image of the shapefile:
[image: multipart polygon shapefile]
EDIT ADDED 9/21/2018 -------------------------------------------------------
Here's a link to the shapefile test_mp.shp
From what I can tell, it seems like the problem stems from how R (vs. ArcMap) interprets the holes. See the difference between the ArcMap display and the R display. For some reason R is filling in those holes as part of the shapefile, which must be the reason that I'm getting different calculations for the area. Is there something wrong with the shapefile, or how I'm importing it?
Clearly your object named 'multi_part' has only one (multi?) polygon, as area returns a single value. I illustrate here how to investigate what you are after:
library(raster)
d <- getData('GADM', country='Isle of Man', level=0)
area(d)
[1] 579672897
Split into 4 polygons (islands)
dd <- disaggregate(d)
a <- area(dd)
a
[1] 19424.12 2705442.41 25629.79 576922400.90
sum(a)
[1] 579672897
The same area, and there is no reason why they would be different. Except perhaps if there is confusion with polygon holes. It is difficult to comment without your data.
You can write these objects to disk (see below) and see what ArcGIS gives you as area (but note that this example uses lon/lat coordinates, I am not sure if ArcGIS can compute areas on those).
shapefile(d, "man.shp")
Here is a case with and without a hole:
p1 <- rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60), c(-180,-20))
p2 <- rbind(c(-150,-20), c(-100,-10), c(-110,20), c(-150,-20))
# two (overlapping) polygons (no hole)
pol1 <- spPolygons(p1, p2, crs="+proj=utm +zone=1 +datum=WGS84")
# single polygon with hole
pol2 <- spPolygons(list(p1, p2), crs="+proj=utm +zone=1 +datum=WGS84")
a <- area(pol1) / 10e+9
b <- area(pol2) / 10e+9
a
#[1] 10925 800
sum(a)
#[1] 11725
a[1]-a[2]
#[1] 10125
b matches a[1] - a[2], as expected
b
#[1] 10125
I get exactly the same results with ArcGIS, using "calculate geometry" for a field in the attribute tables.
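If the suspicion about holes is right, one quick check on the imported object is whether each inner ring actually carries the hole flag. A small diagnostic sketch (the object name multi_part comes from the question):
# TRUE marks rings that sp treats as holes; if the inner rings come back FALSE,
# their area is being added to the total instead of subtracted from it.
sapply(slot(multi_part, "polygons"), function(p)
  sapply(slot(p, "Polygons"), slot, "hole"))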

Cropping NetCDF rasters in R

I am still working on the same project where I asked this question.
I am getting by all right, but my processing times are too long: for each NetCDF file I read it, build a stack from several of the time slices, and then crop each of those slices within R, using code like the one below:
library(raster)
library(ncdf4)
library(ncdf4.helpers)
library(rworldxtra)
data("countriesHigh")
The NetCDF files I have cover the whole world, but I only need South America; for that I use countriesHigh from the rworldxtra package to subset it:
NONA <- countriesHigh[!is.na(countriesHigh@data$GEO3), ]
## get shapefile of South America
SA <- NONA[NONA@data$GEO3 == "South America", ]
Then using the following code I crop every layer that I need.
## Open connection to the file
nc <- nc_open("C:/Users/mean_temperature-15000BP-10000BP.nc")
Now I start a loop:
for(i in 1:10){
  message(paste("reading layer", i))
  # Read the stack for year i
  r <- stack("C:/Users/mean_temperature-15000BP-10000BP.nc", varname = age[i])
  # Change the extent to the correct one
  extent(r) <- c(-180, 180, -90, 90)
  # Crop it to South America
  r <- crop(r, SA)
  gc()
}
Problem 1: Can I crop the NetCDF once, before reading the stacks, to make this process faster?
I have looked here, here and here, and found no answer.
Problem 2: If I can crop it, how do I define the extent, given that the NetCDF file contains only positive latitudes and longitudes?
As stated in this question, the defined extent of the map is not the usual one, but I am basing my cropping on the extent of a shapefile that has the more typical c(-180,180,-90,90) extent.
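For Problem 2, a heavily hedged sketch (it assumes the file's native longitudes run 0-360 and its latitudes 0-180, which is one way a file can end up with only positive coordinates; the path, SA and age come from the question): express the South America window in the file's own coordinates, crop first, and only then move the small cropped raster back to the conventional extent.
library(raster)
r <- stack("C:/Users/mean_temperature-15000BP-10000BP.nc", varname = age[1])
# South America in ordinary lon/lat is roughly -85..-30, -60..15;
# in the assumed 0-360 / 0-180 native coordinates that becomes:
e.native <- extent(275, 330, 30, 105)
r.sa <- crop(r, e.native)
# move the cropped piece back to -180..180 / -90..90 by shifting its extent
r.sa <- setExtent(r.sa, extent(xmin(r.sa) - 360, xmax(r.sa) - 360,
                               ymin(r.sa) - 90,  ymax(r.sa) - 90))
r.sa <- crop(r.sa, SA)   # final crop against the South America polygons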

Query raster brick layer based on another raster in R

I have a NetCDF file of global oceanographic (OmegaA) data at relatively coarse spatial resolution with 33 depth levels. I also have a global bathymetry raster at much finer resolution. My goal is to get the seabed OmegaA data from the NetCDF file, using the bathymetry data to determine the desired depth. My code so far:
library(raster)
library(rgdal)
library(ncdf4)
# Aragonite data. Defaults to CRS WGS84
ncin <- nc_open("C:/..../GLODAPv2.2016b.OmegaA.nc")
ncin.depth <- ncvar_get(ncin, "Depth")# 33 depth levels
omegaA.brk <- brick("C:/.../GLODAPv2.2016b.OmegaA.nc")
omegaA.brk <- rotate(omegaA.brk) # because the netCDF is in lon 0-360
# depth raster. CRS WGS84
r<-raster("C:/....GEBCO.tif")
# resample the raster brick to the resolution that matches the bathymetry raster
omegaA.brk <-resample(omegaA.brk, r, method="bilinear")
# create blank final raster
omegaA.rast <- raster(ncol = r@ncols, nrow = r@nrows)
extent(omegaA.rast) <- extent(r)
omegaA.rast[] <- NA_real_
# create vector of indices of desired depth values
depth.values<-getValues(r)
depth.values.index<-which(!is.na(depth.values))
# loop to find appropriate raster brick layer, and extract the value at the desired index, and insert into blank raster
for (p in depth.values.index) {
  dep.index <- which(abs(ncin.depth + depth.values[p]) == min(abs(ncin.depth + depth.values[p]))) ## this sometimes results in multiple levels being selected
  brk.level <- omegaA.brk[[dep.index]] # can be more than one level if multiple layers were selected above
  omegaA.rast[p] <- brk.level[[1]][p]  ## take the first of the selected levels if more than one matched
  print(paste(p, "of", length(depth.values.index))) # counter to track progress
}
The problem: the result is a raster with massive gaps (NAs) where there should be data. The gaps often take a distinctive shape, e.g. following a contour or running along a long straight line. I've pasted a cropped example.
[image: cropped example of the output raster showing the NA gaps]
I think this could be because either 1) for some reason the which() statement in the loop is not finding a match, or 2) the rotation has introduced a misalignment of the projections, which I've read can happen when using rotate().
I've tried to make sure all the extents, resolutions, number of cells, and CRS's are all the same, which they seem to be.
To speed up the process I've cropped the global brick and bathy raster to my area of interest, again checking that all the spatial resolutions, etc etc match - I've not included those steps here for simplicity.
At a loss. Any help welcome!
Without a reproducible example, this kind of problem is hard to solve. I can't tell where your problem is, but I'll present the approach I would try. Maybe it's good, maybe it's bad, I don't know, but it may inspire you to find a way around your problem.
To my understanding, you have a brick of OmegaA (33 layers/depths) and a bathymetry raster. You want to get the OmegaA value at the bottom of the sea. Here is how I would do it:
Resample the OmegaA raster to the same resolution and extent as the bathymetry one.
Transform the bathymetry raster into a raster brick of 33 layers of 0-1. E.g. if the sea bottom is at 200 m for one particular pixel, then this pixel is 0 on every depth layer other than 200 and 1 on the 200 m layer. To program this, I would go the long way, something like:
r_1 <- r
values(r_1) <- values(r)==10 # where 10 is the depth (it could be a range with < or >)
r_2 <- r
values(r_2) <- values(r)==20
...
r_33 <- r
values(r_33) <- values(r)==250
r_brick <- brick(r_1, r_2, ..., r_33)
Then you multiply the two raster bricks. They have the same dimensions, so it should be easy. The output should be a raster brick of 33 layers with 0 everywhere that isn't the bottom of the sea and the value of OmegaA where it is.
Combine all the layers of the brick obtained previously into a single raster with a sum.
This should work. If you have trouble dealing with raster bricks, you could turn the data into base R arrays, which might be simpler.
Good luck.
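A different route that avoids the cell-by-cell raster writes, sketched with the question's object names (omegaA.brk already resampled to match the bathymetry raster r, ncin.depth holding the 33 depth levels); the assumption that the bathymetry values are negative below sea level is mine, so flip the sign if yours are positive:
library(raster)
depth.vals <- getValues(r)                    # assumed negative below sea level
layer.idx  <- rep(1L, length(depth.vals))     # placeholder index for cells with no depth
ok <- !is.na(depth.vals)
# index of the depth level closest to the seabed, cell by cell
layer.idx[ok] <- sapply(-depth.vals[ok], function(z) which.min(abs(ncin.depth - z)))
idx.rast <- setValues(raster(r), layer.idx)
omegaA.bottom <- stackSelect(omegaA.brk, idx.rast)  # pick that layer's value in each cell
omegaA.bottom <- mask(omegaA.bottom, r)             # drop the placeholder cells again
The sapply is still a per-cell loop, but it runs on plain vectors rather than writing into a raster one cell at a time, and stackSelect then does the layer lookup in a single pass.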

How to calculate geo-distance from a polygon?

I have a shapefile with 50+ different polygonal shapes (representing 50+ different regions) and 10,000+ data points that are supposed to be present in one of the regions. The thing is, the 10,000+ points are already coded with a region they are supposed to be in, and I want to figure out how far they are from this coded region in geo-spatial distance.
My current approach (code below), which involves converting the shapefile polygons to owin objects from the spatstat package and using distfun, gets me distances in lat/long Euclidean space. But I would like geospatial distances (eventually converted to km). Where should I go next?
#basically cribbed from http://cran.r-project.org/web/packages/spatstat/vignettes/shapefiles.pdf (page 9)
library(maptools)
library(spatstat)
shp <- readShapeSpatial("myShapeFile.shp", proj4string=CRS("+proj=longlat +datum=WGS84"))
regions <- lapply(slot(shp, "polygons"), function(x) SpatialPolygons(list(x)))
windows <- lapply(regions, as.owin)
# need to convert this to geo distance
distance_from_region <- function(regionData, regionName) {
  w <- windows[[regionName]]
  regionData$dists <- distfun(w)(regionData$long, regionData$lat)  # x = longitude, y = latitude
  regionData
}
I'd project the data to a Euclidean (or near-Euclidean) coordinate system; unless you are spanning a large chunk of the globe, this is feasible. Use spTransform (from sp/rgdal) and convert to a UTM zone near your data.
You also might do better with package rgeos and the gDistance function:
gDistance by default returns the cartesian minimum distance between the two geometries in the units of the current projection.
If your data is over a large chunk of globe then... tricky... 42...
Barry
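A minimal sketch of that suggestion (the UTM zone, the use of regionName as a row index into the shapefile, and the long/lat column names are my assumptions, not part of the answer): project both layers to a metre-based CRS and let rgeos::gDistance return distances in metres, which are 0 for points that fall inside their region.
library(sp)
library(rgdal)
library(rgeos)
utm <- CRS("+proj=utm +zone=33 +datum=WGS84 +units=m")  # hypothetical zone; pick one near your data
shp_utm <- spTransform(shp, utm)
distance_from_region <- function(regionData, regionName) {
  pts <- SpatialPoints(cbind(regionData$long, regionData$lat),
                       proj4string = CRS("+proj=longlat +datum=WGS84"))
  pts_utm <- spTransform(pts, utm)
  poly <- shp_utm[regionName, ]                 # assumes regionName picks out one polygon
  d <- gDistance(pts_utm, poly, byid = TRUE)    # one distance per point, in metres
  regionData$dist_m <- as.numeric(d)
  regionData
}
gDistance needs planar coordinates, which is why the projection step comes first; for points spread over many UTM zones a geodesic approach (e.g. the geosphere package) would be more appropriate.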
