MCP home range overlap - r

I am trying to estimate home range overlap among a batch of animals whose home ranges I have estimated with both the MCP and kernelUD methods in the adehabitatHR package. I realize there is a function for calculating overlap with the kernel method, but can anyone tell me how best to calculate overlap for MCPs?
I gather the rgeos package has a gIntersection function, which I have begun to play around with.
Has anyone come up with a relatively straightforward method/code, whether in adehabitatHR, rgeos, or elsewhere?

Maybe something like this helps:
library(adehabitatHR)
library(rgeos)
## Use some simulated data
ani1 <- SpatialPoints(matrix(rnorm(200, mean=2), ncol=2))
ani2 <- SpatialPoints(matrix(rnorm(200, mean=1), ncol=2))
## Function to calculate the proportion of hr1 covered by hr2
gOverlap <- function(hr1, hr2, ...) {
  a <- gIntersection(hr1, hr2, ...)
  if (is.null(a)) {
    return(0)
  }
  gArea(a, byid=TRUE) / gArea(hr1, byid=TRUE)
}
## Calculate home ranges (95% MCPs by default)
hr1 <- mcp(ani1)
hr2 <- mcp(ani2)
## Calculate HR overlap
gOverlap(hr1, hr2)
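Note that the proportion is directional: the function divides the intersection area by the area of hr1, so swapping the arguments will generally return a different value.
gOverlap(hr2, hr1)  ## proportion of ani2's MCP covered by ani1's MCP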

Running overlap one pair at a time can be very inefficient when you have many animals or many collaring sessions. I'm sure there is a cleaner way to do this (and I welcome suggestions for refinements!), but this worked for me:
#Relevant libraries (I think they're all here!)
library(adehabitatHR)
library(geosphere)
library(rgeos)
library(rgdal)  #for writeOGR()
library(nlme)   #for gapply()
#Define an overlap function for a single dyad
dyad.overlap <- function(cp, dyads.df){
  p1 <- subset(cp, cp@data$id == dyads.df[1])
  p2 <- subset(cp, cp@data$id == dyads.df[2])
  Overlap <- ifelse(is.null(gIntersection(p1, p2)), 0,
                    gArea(gIntersection(p1, p2)) / gArea(p1)) #returns 0 if the dyad does not overlap
  return(Overlap)
}
#Define a function for overlap of all animals from a given time period
mcp.overlaps <- function(period.df){
  period.df$Animal <- factor(period.df$Animal) #drop any ids not relevant to that period
  anim <- unique(period.df$Animal) #names of individuals for the period
  xy <- SpatialPointsDataFrame(period.df[c("Easting", "Southing")],
                               data=data.frame(id=period.df$Animal)) #create SPDF
  proj4string(xy) <- CRS("+init=epsg:32750") #define projection (adjust to your study area)
  cp <- adehabitatHR::mcp(xy, percent=95) #SpatialPolygonsDataFrame of the 95% MCPs
  writeOGR(obj=cp, dsn="tempdir",
           layer=paste(period.df$file_folder[1], "95 Percent MCPs"),
           driver="ESRI Shapefile") #export shp of the period's 95% MCPs, if desired
  dyads <- combn(anim, 2, simplify=TRUE) #all possible dyad combinations
  dyads.df <- data.frame(A1=dyads[1, ], A2=dyads[2, ]) #one row per pair
  dyads.df$Overlap <- apply(dyads.df, 1, dyad.overlap, cp=cp)
  dyads.df$Period <- period.df$file_folder[1]
  return(dyads.df)
}
#Now run the overlap for each time period included in the data frame
All.Overlaps <- do.call(rbind.data.frame, gapply(df, groups=df$file_folder, FUN=mcp.overlaps))
Of course, if you only have a single time period (e.g., a year of data and looking at annual home ranges), all you have to do is:
All.Overlaps <- mcp.overlaps(df)
This assumes that all data are in a single data frame (df), with the variables defined as follows (adjust the names to suit your data):
file_folder: collaring period/session of interest
Animal: unique identifier for an individual
Easting/Southing: your coordinate columns (be sure to adjust the projection/CRS as needed)
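For reference, here is a minimal sketch of what such a data frame might look like; the session names, animal ids, and coordinate values are made-up placeholders only:
#Hypothetical example layout of df (placeholder values)
df <- data.frame(
  file_folder = rep(c("session1", "session2"), each=200),
  Animal      = rep(c("A01", "A02", "A03", "A04"), times=100),
  Easting     = runif(400, 500000, 510000),
  Southing    = runif(400, 9200000, 9210000)
)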

I have modified Johannes' answer a little; maybe it helps a bit more:
library(adehabitatHR)
library(rgeos)
data(puechabonsp)
rel <- puechabonsp$relocs
cp <- mcp(rel[,1])
## Set up a matrix which will store the results
mat <- matrix(NA, 4, 4,
              dimnames=list(c("1","2","3","4"), c("1","2","3","4")))
## Set up the loop
gOverlap <- function(hr, number, matrix){
  for (i in 1:number){
    for (j in 1:number){
      a <- gIntersection(hr[i,], hr[j,])
      if (is.null(a)){
        matrix[i, j] <- 0
      } else {
        matrix[i, j] <- gArea(a)
      }
    }
  }
  return(matrix)
}
##Test the function
gOverlap(cp, 4, mat)
##Plotting to corroborate
plot(cp)
The output is a matrix of every pairwise intersection area; to obtain the total overlap for an animal you only have to sum its row (or column), leaving out the self-intersection on the diagonal. Finally, you can add or remove individuals just by modifying the dimensions of the initial matrix.
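For example, with the matrix returned above, the summing step looks like this (a small illustration; the diagonal holds each animal's own MCP area):
ov <- gOverlap(cp, 4, mat)
sum(ov["1", -1])                #total area animal 1 shares with the other three
sum(ov["1", -1]) / ov["1", "1"] #expressed as a proportion of animal 1's own MCP area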
All the best,

Related

Extract Raster Pixels Values Using Vector Polygons in R

I have been struggling with this for hours.
I have a shapefile (called "shp") containing 177 polygons i.e. 177 counties. This shapefile is overlaid on a raster. My raster (called "ras") is made of pixels having different pollution values.
Now I would like to extract all pixel values and their number of occurrences for each polygon.
This is exactly what the QGIS function "zonal histogram" is doing. But I would like to do the exact same thing in R.
I tried the extract() function and managed to get a mean value per county, which is already a first step, but I would like to produce a pixel distribution (histogram).
Could someone give me a hand ?
Many thanks,
Marie-Laure
Thanks a lot for your help. Next time I promise I will be careful and explain my issue in more detail.
With your help I managed to find a solution.
I also used this website : http://zevross.com/blog/2015/03/30/map-and-analyze-raster-data-in-r/
For information, I first had to uninstall the "tidyr" package because there was a conflict with the extract() function.
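A lighter-weight alternative, if the only problem is that tidyr::extract() masks raster::extract(), might be to call the function with an explicit namespace instead of removing the package, e.g.:
# call the raster version explicitly to avoid the masking conflict
ext <- raster::extract(ras, shp, method='simple')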
In case it can help someone, here is the final code :
# Libraries loading
library(raster)
library(rgdal)
library(sp)
# raster layer import
ras <- raster("C:/*.tif")
# shapefile layer import
shp<-shapefile("C:/*.shp")
# Extract the values of the pixels raster per county
ext <- extract(ras, shp, method='simple')
# Function to tabulate pixel values by region & return a data frame
tabFunc <- function(indx, extracted, region, regname) {
  dat <- as.data.frame(table(extracted[[indx]]))
  dat$name <- region[[regname]][[indx]]
  return(dat)
}
# run through each county & compute a table of the number
# of raster cells by pixel value. ("CODE" is the county code)
tabs <- lapply(seq(ext), tabFunc, ext, shp, "CODE")
# assemble into one data frame
df <- do.call(rbind, tabs)
# to see the data frame in R
print(df)
# table export
write.csv(df,"C:/*.csv", row.names = FALSE)
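If you would rather have a wide layout with one row per county and one column per pixel value (closer to what the QGIS zonal histogram produces), one possible follow-up step is:
# Optional reshape: one row per county, one column per pixel value
wide <- as.data.frame.matrix(xtabs(Freq ~ name + Var1, data=df))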
Here is a minimal, self-contained, reproducible example (almost literally from ?raster::extract, so not difficult to make)
library(raster)
r <- raster(ncol=36, nrow=18, vals=rep(1:9, 72))
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- spPolygons(cds1, cds2)
Now you can do
v <- extract(r, polys)
par(mfrow=c(1,2))
z <- lapply(v, hist)
Or more fancy
mains <- c("first", "second")
par(mfrow=c(1,2))
z <- lapply(1:length(v), function(i) hist(v[[i]], main=mains[i]))
Or, if you want a barplot:
z <- lapply(1:length(v), function(i) barplot(table(v[[i]]), main=mains[i]))

Create neighborhood list of large dataset / speed it up

I want to create a weight matrix based on distance. My code for the moment looks as follows and works for a smaller sample of the data. However, with the large dataset (569,424 individuals in 24,077 locations) it doesn't go through. The problem arises at the nb2blocknb function. So my question is: how can I optimize my code for large datasets?
# required package for the neighbour functions below
library(spdep)   # dnearneigh(), nb2blocknb(), nb2listw()
# load all survey data
DHS <- read.csv("Daten/final.csv")
attach(DHS)
# define coordinates matrix
coormat <- cbind(DHS$location, DHS$lon_s, DHS$lat_s)
coorm <- cbind(DHS$lon_s, DHS$lat_s)
colnames(coormat) <- c("location", "lon_s", "lat_s")
coo <- cbind(unique(coormat))
c <- as.data.frame(coo)
coor <- cbind(c$lon_s, c$lat_s)
# get a list of neighbouring locations that lie within 50 km of each other
neighbor <- dnearneigh(coor, d1=0, d2=50, row.names=c$location, longlat=TRUE, bound=c("GE", "LE"))
# get neighborhood list on the individual level
nb <- nb2blocknb(neighbor, as.character(DHS$location))
# weight matrix in list format
nbweights.lw <- nb2listw(nb, style="B", zero.policy=TRUE)
Thanks a lot for your help!
You're trying to make about 1.3e10 distance calculations, so the result would run into gigabytes.
I think you'd want to limit either the maximum distance or the number of nearest neighbours you're looking for. Try nn2 from the RANN package:
library('RANN')
nearest_neighbours_w_distance <- nn2(coordinatesA, coordinatesB, 10)
Note that this operation is not symmetric (switching coordinatesA and coordinatesB gives different results).
Also, you would first have to convert your GPS coordinates to a coordinate reference system in which you can calculate Euclidean distances, for example UTM (code not tested):
library("sp")
gps2utm<-function(gps_coordinates_matrix,utmzone){
spdf<-SpatialPointsDataFrame(gps_coordinates_matrix[,1],gps_coordinates_matrix[,2])
proj4string(spdf) <- CRS("+proj=longlat +datum=WGS84")
return(spTransform(spdf, CRS(paste0("+proj=utm +zone=",utmzone," ellps=WGS84"))))
}
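Putting the two together, a rough usage sketch (the UTM zone, k=50, and the 50 km radius below are illustrative assumptions; RANN's radius search type keeps only neighbours within that distance):
# project the location coordinates once, then do a radius-limited search in metres
utm_pts <- gps2utm(coor, utmzone=50)                    # 'coor' from the question above
nn <- nn2(coordinates(utm_pts), k=50, searchtype="radius", radius=50000)
# nn$nn.idx holds the neighbour indices (0 = fewer than k neighbours within 50 km)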

Identifying which points in a regular lattice are within a polygon's boundaries

I would like to work out which points that define a regular lattice are within a polygon. The code below does this but VERY VERY slowly:
#packages used below
library(maptools)  #readShapePoly()
library(sp)        #SpatialPoints()
library(rgeos)     #gContains()
library(plyr)      #adply()
#the polygon that I want to check each point against
glasgow_single <- readShapePoly(fn="data/clipped/glasgow_single")
#interpolated contains the coordinates of the regular grid
points_to_check <- expand.grid(x=interpolated$x, y=interpolated$y)
#function to be called by plyr
fn <- function(X){
  this_coord <- data.frame(lon=X["x"], lat=X["y"])
  this_point <- SpatialPoints(this_coord)
  out <- gContains(glasgow_single, this_point)
  out <- data.frame(x=X["x"], y=X["y"], val=out)
  return(out)
}
#plyr call
vals <- adply(points_to_check, 1, fn, .progress="text")
vals$val <- as.numeric(vals$val)
Taking into account both thinking time and computing time, is there a much faster way of doing this?
Yes, there's a much better approach. For this and many other topological operations, the rgeos package has you well covered. Here, you're wanting rgeos::gWithin():
## Required packages
library(rgdal)
library(raster) ## For example polygon & functions used to make example points
library(rgeos)
## Reproducible example
poly <- readOGR(system.file("external", package="raster"), "lux")[1,]
points <- as(raster(extent(poly)), "SpatialPoints")
proj4string(points) <- proj4string(poly)
## Test which points fall within polygon
win <- gWithin(points, poly, byid=TRUE)
## Check that it works
plot(poly)
points(points, col=1+win)
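If you want the result back in the same data-frame form as the plyr version, a minimal follow-up sketch (assuming a single polygon as above, so the logical matrix flattens to one value per point):
## one value per point, alongside the point coordinates
vals <- data.frame(coordinates(points), val=as.numeric(win))
head(vals)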

Why is the subscript out of bounds in adehabitatHR getverticeshr (puechabonsp data)?

Hope someone can help. I have a large dataset from which I have generated 10 estUDs with the same grid and h value (200); here is a subset of just two of them. I can visualise them using image(liud), but when I try to use the function getverticeshr I get an error that the subscript is out of bounds. I have tried changing the grid and the value of h to no avail. I wonder if it is something to do with the way I am combining them into and out of a list?
library(adehabitatHR)
#combine all Ud's into one dataset
liud <- list(Y2889a, Y2889b)
class(liud) <- "estUDm"
image(liud)#plot all est ud's
v<-getverticeshr(liud)
I have reproduced the error with the puechabonsp dataset below
library(adehabitatHR)
## Load the data
data(puechabonsp)
loc <- puechabonsp$relocs
## have a look at the data
head(as.data.frame(loc))
## the first column of this data frame is the ID
## Estimation of UD for each of the animals (two here as an example)
udBrock <- kernelUD(loc[as.data.frame(loc)[,1]=="Brock",], grid=200)
udCalou <- kernelUD(loc[as.data.frame(loc)[,1]=="Calou",], grid=200)
liud <- list(udBrock, udCalou)
class(liud) <- "estUDm"
image(liud)#plot all est ud's
v<-getverticeshr(liud)
Thanks for your comment Chris, I should have explained my dataset. I have 10 animals and have generated random points based on recorded polygons for each animal. I have run this 100 times per animal. My aim is to generate a mean utilized distribution for each animal based on all 100 runs. So far I have used this code:
xybat <- subset(bat.master, bat.master$id == "Y2889a", select = x:loopno)
#change to spatial points
xy <- xybat[1:2]    #first two columns saved as coords
df <- xybat[-1:-3]  #remove unneeded columns for the UD
SPDF <- SpatialPointsDataFrame(coords=xy, data=df) #combine df and xy
udHR <- kernelUD(SPDF, h = 200, grid=habitat, kern="epa")
## I would proceed using the raster package
library(raster)
ud <- stack(lapply(udHR, raster))
## You can now check the first one
plot(ud[[1]])
## or all of them
#plot(ud)
## take the mean
plot(udm <- mean(ud))
## now you can either proceed in raster and calculate your isopleths, or convert it back
## to an estUD; this is a bit of a hack and not the nicest way to do it
Y2889a <- udHR[[1]]
Y2889a@grid <- as(udm, "GridTopology")
So if I follow your suggestion and run the kernelUD function on the whole dataset, I still need to stack each animal's UDs separately and then combine them into an estUDm, and I am back to the same problem. I hope you can help me come up with a solution.
Best wishes,
Simone
This is basically just a generalisation of my previous answer for multiple animals; maybe it is useful:
library(adehabitatHR)
library(raster)
## generate some dummy data for 15 animals, each with 10 replications
pts <- replicate(15, SpatialPointsDataFrame(coords=cbind(rnorm(1000), rnorm(1000)),
                                            data=data.frame(id=rep(1:10, each=100))))
## generate UDs
uds <- lapply(pts, function(x) kernelUD(x, h = "href", same4all = TRUE, kern = "bivnorm"))
udsr <- lapply(uds, function(x) stack(lapply(x, raster)))
## You can now check the first one
plot(udsr[[1]][[1]])
## or all 10 UDs of the first animal
plot(udsr[[1]])
## take the mean
udsm <- lapply(udsr, mean)
## go back to adehabitat
for (i in seq_along(udsm)) {
  uds[[i]] <- uds[[i]][[1]]
  uds[[i]]@grid <- as(udsm[[i]], "GridTopology")
}
## now you can work with uds as if it were a list of HR estimates
iso95 <- lapply(uds, getverticeshr, percent=95)
## plot first animal
plot(iso95[[1]])
## plot second animal
plot(iso95[[2]])

In R, how to average spatial points data over spatial grid squares

Managed to solve problem now
I have a set of around 50 thousand points that have coordinates and one value associated with them. I would like to be able to place points into a grid averaging the associated value of all points that fall into a grid square. So I want to end up with an object that identifies each grid square and gives the average inside the grid square.
I have the data in a spatial points data frame and a spatial grid object if that helps.
Improving the question: I have definitely done some searching; sorry about the initial state of the question. I had only managed to frame the question inside my own head and hadn't had to communicate it to anyone else before...
Here is example data that hopefully illustrates the problem more clearly
##make some data
library(sp)
longi <- runif(100, 0, 10)
lati <- runif(100, 0, 10)
value <- runif(100, 20, 30)
##put in a data frame, then convert to a spatial points data frame
df <- data.frame("lon"=longi, "lat"=lati, "val"=value)
coordinates(df) <- c("lon", "lat")
proj4string(df) <- CRS("+proj=longlat")
##create a grid that bounds the data
grd <- GridTopology(cellcentre.offset=bbox(df)[,1],
                    cellsize=c(1,1), cells.dim=c(11,11))
sg <- SpatialGrid(grd)
Then I hope to get an object (a vector, data frame, or list) that gives me the average value in each grid cell/square and some way of identifying which cell it is.
Solution
##convert the grid into polygons##
polys <- as.SpatialPolygons.GridTopology(grd)
proj4string(polys) <- CRS("+proj=longlat")
##can now use over() to select the correct points and average them
results <- rep(0, length(polys))
for (i in 1:length(polys)) {
  results[i] <- mean(df$val[which(!is.na(over(x=df, y=polys[i])))])
}
My question now is whether this is the best way to do it, or is there a more efficient way?
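One possibly more efficient route (a sketch using the same objects as above; I have not benchmarked it) is to let over() aggregate the point attributes over the polygons in a single call:
##aggregate the point values over the grid polygons in one call
results2 <- over(polys, df, fn=mean)$val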
Your description is vague at best. Please try to ask a more specific question, preferably with code illustrating what you have already tried. Averaging a single value in your point data or a single raster cell makes no sense.
The best guess at an answer I can provide is to use raster's extract() to assign the raster values to an sp point object and then use tapply() to aggregate the values by your grouping variable in the points. You can use the coordinates of the points to identify cell location or, alternatively, the cell numbers returned by extract() (per the example below).
require(raster)
require(sp)
# Create example data
r <- raster(ncol=500, nrow=500)
r[] <- runif(ncell(r))
pts <- sampleRandom(r, 100, sp=TRUE)
# Add a grouping value to the points
pts@data <- data.frame(ID=rownames(pts@data),
                       group=c(rep(1,25), rep(2,25), rep(3,25), rep(4,25)))
# Extract raster values and add them to the @data slot data frame. Note, the "cells"
# attribute indicates the cell index in the raster.
pts@data <- data.frame(pts@data, extract(r, pts, cellnumbers=TRUE))
head(pts@data)
# Use tapply to calculate group means
tapply(pts@data$layer, pts@data$group, FUN=mean)
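If, for example, you wanted to place those group means back into a raster, the cell index returned by extract() is what lets you write each value into the right cell (a hedged sketch, not part of the original answer):
# write each point's group mean into the raster cell it falls in
out <- raster(r)  # empty raster with the same geometry as r
out[pts@data$cells] <- ave(pts@data$layer, pts@data$group, FUN=mean)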
