I have a raster layer, masked by a river network, in binary form. Counting pixels in R (freq(raster)) and in QGIS (r.report) gives the same numbers in both. The areas in sq. km, however, differ slightly between R (tapply(area(raster), raster[], sum)) and QGIS. My main problem, though, is this: why are the area calculations not on par with the pixel counts? The resolution of the raster is 30 sec (approx. 1 km x 1 km), so the number of pixels should be approximately equal to the area in sq. km. The raster has a geographic coordinate system (OGC:CRS84 - WGS 84 (CRS84) - Geographic) and is in .grd format. I also projected it to UTM for QGIS, which slightly increased the area, but not by a considerable amount.
I am also posting the reports from R and QGIS below; please follow the link below if you want to have a look at the raster too. I need the values as areas, so I don't know whether I should simply convert the pixel counts to sq. km (which in this case should be equal) or use one of the answers from R or QGIS.
In r.report in QGIS area in sq. km Vs pixel:
0 value: 222,520 Vs 290,767
1 value: 81,653 Vs 106,934
In Rstudio area in sq. km Vs pixel:
0 value: 222,068.53 Vs 290,767
1 value: 81,484.18 Vs 106,934
https://drive.google.com/drive/folders/1pBba0ejIc4t9ayKl36nyIo3vbgD4BKCT?usp=sharing
With the R raster package, area is computed in km2 for longitude/latitude raster cells. Also note that at the equator a 30 sec cell is about 0.86 km2, and this area decreases going towards the poles.
Given the number of cells n and the average latitude of your raster lat, the area (in km2) should be about
n * 0.86 * cos(lat*pi/180)
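As a rough check of this rule of thumb, the base R sketch below derives the equatorial area of a 30 arc-second cell (about 0.86 km2) from the WGS84 equatorial radius and then asks which mean latitude would reconcile the pixel count and the area reported for class 1 in the question. The spherical approximation and the names cell_km2, approx_area_km2 and implied_lat are assumptions made for illustration, not part of the raster package.

```r
# Side of a 30 arc-second cell at the equator, in km
# (WGS84 equatorial radius 6378.137 km; spherical approximation)
km_per_degree <- 6378.137 * pi / 180
cell_side_km  <- km_per_degree / 120   # 30 arc-seconds = 1/120 degree
cell_km2      <- cell_side_km^2        # about 0.86 km2

# Rule-of-thumb area for n cells at mean latitude lat (degrees)
approx_area_km2 <- function(n, lat) n * cell_km2 * cos(lat * pi / 180)

# Pixel count and area reported for class 1 in the question
n_cells  <- 106934
area_km2 <- 81484.18

# Mean latitude implied by those two numbers under this approximation
implied_lat <- acos(area_km2 / (n_cells * cell_km2)) * 180 / pi
implied_lat
```

If the implied latitude (somewhere between 27 and 28 degrees here) is close to where the raster actually lies, the pixel counts and the areas agree: each pixel simply covers less than 1 km2 away from the equator.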
An example to show the proper (memory-safe) way to do this in raster
#example data
library(raster)
r <- raster()
set.seed(67)
values(r) <- 0
r[sample(ncell(r), 1000)] <- 1
#select the cells that are not 0
rr <- reclassify(r, cbind(0,NA))
a <- area(rr)
aa <- mask(a, rr)
cellStats(aa, "sum")
With your data, that is:
r <- raster("raster.grd")
rr <- reclassify(r, cbind(0, NA))
a <- area(r)
aa <- mask(a, rr)
cellStats(aa, "sum")
#81484.18
That is, about 81 thousand km2
Related
I have data points of a species observed using camera traps and would like to measure the distance of each camera trap site (CameraStation) to the edge of a national park using R. I have a shapefile of the park (shp) and want to apply a criterion to CameraStation(s) which are <5km from the edge. My data frame (df) consists of multiple events/observations (EventID) per CameraStation. The aim is to analyse when events near the park edge are most frequent given other environmental factors such as Season, Moon Phase and DayNight (also columns in DF).
I found a package called distance in R but this is for distance sampling and not what I want to do. Which package is relevant in this situation?
I expect the following outcome:
EventID CameraStation Distance(km) Within 5km
0001 Station 1 4.3 Yes
0002 Station 1 4.3 Yes
0003 Station 2 16.2 No
0004 Station 3 0.5 Yes
...
Here's a general solution, adapted from Spacedman's answer to this question at gis.stackexchange. Note: this solution requires working in a projected coordinate system. You can transform to a projected CRS if needed using spTransform.
The gDistance function of the rgeos package calculates the distance between geometries, but for the case of points inside a polygon the distance is zero. The trick is to create a new "mask" polygon where the original polygon is a hole cut out from the mask. Then we can measure the distance between points in the hole and the mask, which is the distance to the edge of the original polygon that we really care about.
We'll use the shape file of the Yellowstone National Park Boundary found on this page.
library(sp) # for SpatialPoints and proj4string
library(rgdal) # to read shapefile with readOGR
library(rgeos) # for gDistance, gDifference, and gBuffer
# ab67 was the name of the shape file I downloaded.
yellowstone.shp <- readOGR("ab67")
# gBuffer enlarges the boundary of the polygon by the amount specified by `width`.
# The units of `width` (meters in this case) can be found in the proj4string
# for the polygon.
yellowstone_buffer <- gBuffer(yellowstone.shp, width = 5000)
# gDifference calculates the difference between the polygons, i.e. what's
# in one and not in the other. That's our mask.
mask <- gDifference(yellowstone_buffer, yellowstone.shp)
# Some points inside the park
pts <- list(x = c(536587.281264245, 507432.037861251, 542517.161278414,
477782.637790409, 517315.171218198),
y = c(85158.0056377799, 77251.498952222, 15976.0721391485,
40683.9055315169, -3790.19457474617))
# Sanity checking the mask and our points.
plot(mask)
points(pts)
# Put the points in a SpatialPointsDataFrame with camera id in a data field.
spts.df <- SpatialPointsDataFrame(pts, data = data.frame(Camera = ordered(1:length(pts$x))))
# Give our SpatialPointsDataFrame the same spatial reference as the polygon.
proj4string(spts.df) <- proj4string(yellowstone.shp)
# Calculate distances (km) from points to edge and put in a new column.
spts.df$km_to_edge <- apply(gDistance(spts.df, mask, byid=TRUE),2,min)/1000
# Determine which records are within 5 km of an edge and note in new column.
spts.df$edge <- ifelse(spts.df$km_to_edge < 5, TRUE, FALSE)
# Results
spts.df
# coordinates Camera km_to_edge edge
# 1 (536587.3, 85158.01) 1 1.855010 TRUE
# 2 (507432, 77251.5) 2 9.762755 FALSE
# 3 (542517.2, 15976.07) 3 11.668700 FALSE
# 4 (477782.6, 40683.91) 4 4.579638 TRUE
# 5 (517315.2, -3790.195) 5 8.211961 FALSE
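The same distances can also be obtained without building a mask, by measuring to the boundary line of the polygon instead of to the polygon itself. The sketch below uses the sf package rather than the sp/rgeos functions shown above, so it is a swapped-in technique and not the method of this answer. The square "park", the three camera positions and the EPSG code are hypothetical values chosen only to make the example self-contained.

```r
library(sf)

# Hypothetical 10 km x 10 km square "park" in a projected CRS (metres)
park <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(10000, 0), c(10000, 10000),
                        c(0, 10000), c(0, 0)))),
  crs = 32612
)

# Three hypothetical camera stations inside the park
cams <- st_sf(
  Camera = 1:3,
  geometry = st_sfc(st_point(c(5000, 5000)),
                    st_point(c(1000, 5000)),
                    st_point(c(9500, 200)),
                    crs = 32612)
)

# Distance to the polygon would be 0 for interior points,
# so measure to its boundary (a line) instead
d <- st_distance(cams, st_boundary(park))

cams$km_to_edge <- as.numeric(d[, 1]) / 1000
cams$edge <- cams$km_to_edge < 5
cams
```

As with the mask approach, the distances are only meaningful in metres because both layers share a projected CRS.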
Here's a quick solution.
Simplify the outline of your shapefile into N points. Then calculate the minimum distance for each camera trap to every point in the outline of the national park.
library(sp)        # for spsample and coordinates
library(geosphere) # for distm and distHaversine
n <- 500 ## The number of points summarizing the outline
NPs <- ## Your shapefile goes here
## Sample along the park boundary (converted to lines), not inside the polygon
NP.pts <- spsample(as(NPs, "SpatialLines"), n = n, type = "regular")
CP.pts <- ## Coordinates for a single trap
distances <- distm(coordinates(CP.pts), coordinates(NP.pts), fun = distHaversine)/1000
## Distance in km from the trap to each point on the perimeter of the shapefile:
distances
Use distances to find the minimum distance between the shapefile and that given trap, e.g. min(distances). This approach can easily be generalized to many traps using for loops or apply functions.
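One way the generalization to several traps might look is sketched below in base R. To keep it self-contained it uses a hand-written haversine function instead of geosphere::distm, a one-degree square as a stand-in for the park outline, and two made-up trap locations; haversine_km, outline and traps are all hypothetical names and values.

```r
# Great-circle distance in km (haversine formula, sphere of radius r km)
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6378.137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical outline: points along the edges of a 1 x 1 degree square
s <- seq(0, 1, by = 0.01)
outline <- rbind(cbind(s, 0), cbind(1, s), cbind(s, 1), cbind(0, s))

# Hypothetical trap locations (lon/lat in degrees)
traps <- data.frame(CameraStation = c("Station 1", "Station 2"),
                    lon = c(0.5, 0.02),
                    lat = c(0.5, 0.5))

# For each trap, the minimum distance to any outline point
traps$km_to_edge <- sapply(seq_len(nrow(traps)), function(i) {
  min(haversine_km(traps$lon[i], traps$lat[i], outline[, 1], outline[, 2]))
})
traps$within_5km <- traps$km_to_edge < 5
traps
```

The accuracy of the result depends on how densely the outline is sampled, so n should be large enough that the spacing between outline points is small compared with the 5 km threshold.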
I had a problem with the points data frame and shape file being projected so instead I used the example in this link to answer my question
https://gis.stackexchange.com/questions/225102/calculate-distance-between-points-and-nearest-polygon-in-r
Basically, I used this code:
library(sp) # for coordinates<-
# df  : my data frame with points
# shp : my shapefile (non-projected)
df2 <- df # work on a copy of the data frame
coordinates(df2) <- ~Longitude+Latitude # Longitude and Latitude are columns in my df
dist.mat <- geosphere::dist2Line(p = df2, line = shp)
dmat <- data.frame(dist.mat) # turned it into a data frame
dmat$km5 <- dmat$distance < 5000 # in meters (5000)
coordinates(dmat) <- ~lon+lat
df2$distance <- dmat$distance # added new Distance column to my df
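Because the question has several events per camera station, it may be cheaper to compute the distance once per station and then join it back to the events. The base R sketch below shows that join; the events and stations data frames and their distances are hypothetical, loosely following the expected output in the question.

```r
# Hypothetical events table: several events per camera station
events <- data.frame(
  EventID = c("0001", "0002", "0003", "0004"),
  CameraStation = c("Station 1", "Station 1", "Station 2", "Station 3"),
  stringsAsFactors = FALSE
)

# Hypothetical station-level distances to the park edge, in metres
stations <- data.frame(
  CameraStation = c("Station 1", "Station 2", "Station 3"),
  distance_m = c(4300, 16200, 500),
  stringsAsFactors = FALSE
)
stations$Distance_km <- stations$distance_m / 1000
stations$Within5km <- stations$distance_m < 5000

# Attach the station-level results to every event
out <- merge(events,
             stations[, c("CameraStation", "Distance_km", "Within5km")],
             by = "CameraStation")
out <- out[order(out$EventID),
           c("EventID", "CameraStation", "Distance_km", "Within5km")]
out
```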
My question builds on the solution proposed by @jbaums on this post: Global Raster of geographic distances
For the purpose of reproducing the example, I have a raster dataset of distances to the nearest coastline:
library(rasterVis); library(raster); library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys.
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate distance to nearest non-NA pixel
d <- distance(r3) # if calculating distances on land instead of ocean: d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
The data looks like this:
From here, I need to subset the distance raster (d), or create a new raster, that only contains cells for which the distance to the coastline is less than 200 km. I have tried using getValues() to identify the cells for which the value is <= 200 (as shown below), but so far without success. Can anyone help? Am I on the right track?
#vector of desired cell numbers
my.pts <- which(getValues(d) <= 200)
# create raster the same size as d filled with NAs
bar <- raster(ncols=ncol(d), nrows=nrow(d), res=res(d))
bar[] <- NA
# replace the values with those in d
bar[my.pts] <- d[my.pts]
I think this is what you are looking for. You can treat a raster like a matrix here, right after your d <- d*r2 line:
d[d>=200000]<-NA
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
(in case you forgot: the unit is in meters so the threshold should be 200000, not 200)
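If you would rather keep d intact and write the result to a new raster, reclassify can do the same thresholding. This is a minimal sketch on a toy 2 x 2 raster with made-up distances in metres; d_toy and near are hypothetical names. Note that with the default interval settings a cell of exactly 200000 m is kept, unlike the >= comparison above.

```r
library(raster)

# Toy 2 x 2 raster of distances in metres (hypothetical values)
d_toy <- raster(nrows = 2, ncols = 2)
values(d_toy) <- c(50000, 150000, 250000, 350000)

# Cells farther than 200 km (200000 m) become NA; d_toy itself is unchanged
near <- reclassify(d_toy, cbind(200000, Inf, NA))

values(near)
```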
I have spatial data with lat/long (x/y) and want to lay a raster over it. For every raster cell, I want to get all the values of the points that fall inside that cell. The points are not equally distributed, so one raster cell does not contain the same number of points as the neighbouring cell. I know that the function rasterize can average all values inside a cell into one new value using the mean, but I don't want the mean of each cell; I want to extract all values (here, the values of the points inside that cell).
How can I do this in an effective way?
consider I have:
library(raster)
library(sp)
my data:
n <- 1000
x <- runif(n) * 360 - 180
y <- runif(n) * 180 - 90
values <- runif(n)
xy <- cbind(x,y)
my raster
r <- raster(ncols=10, nrows=10)
Now I don't want to average all values as rasterize does, but extract all values (e.g. into a list) that fall into each cell.
Many thanks for ideas and help! Is there any function for this?
Firstly, you have to have values in the raster to be sampled. In your example you are just trying to sample an empty raster. (I mistook this for your sample size in the original edit; the issue is with your example, not the question.)
To answer your question...
extract() is the function you are looking for:
library(raster)
library(sp)
r <- raster(ncols=10, nrows=10)
n <- 1000
x <- runif(n) * 360 - 180
y <- runif(n) * 180 - 90
values=runif(n)
r[]<-values
xy <- SpatialPointsDataFrame(data=data.frame(cbind(x,y)),coords=cbind(x,y))
r0 <- extract(r, xy)
plot(r0)
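If the goal is instead to collect the values of the points that fall into each cell, one possible approach is to look up the cell number of every point with cellFromXY and then split the point values by cell. This is a sketch under that reading of the question; vals, cell and by_cell are hypothetical names.

```r
library(raster)

set.seed(1)
n <- 1000
x <- runif(n) * 360 - 180
y <- runif(n) * 180 - 90
vals <- runif(n)          # one value per point

r <- raster(ncols = 10, nrows = 10)

# Cell number of the raster cell that each point falls into
cell <- cellFromXY(r, cbind(x, y))

# List with one element per occupied cell, holding all point values in it
by_cell <- split(vals, cell)

# Number of points in the first few occupied cells
head(lengths(by_cell))
```

The names of by_cell are the cell numbers, so any summary (not only the mean) can be computed per cell with sapply(by_cell, ...).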
New to spatial analysis on R here. I have a shapefile for the USA that I downloaded from HERE. I also have a set of lat/long points (half a million) that lie within the contiguous USA.
I'd like to find the "most remote spot" -- the spot within the contiguous USA that's farthest from the set of points.
I'm using the rgdal, raster and sp packages. Here's a reproducible example with a random sample of 10 points:
# Set wd to the folder tl_2010_us_state_10
usa <- readOGR(dsn = ".", layer = "tl_2010_us_state10")
# Sample 10 points in USA
sample <- spsample(usa, 10, type = "random")
# Set extent for contiguous united states
ext <- extent(-124.848974, -66.885444, 24.396308, 49.384358)
# Rasterize USA
r <- raster(ext, nrow = 500, ncol = 500)
rr <- rasterize(usa, r)
# Find distance from sample points to cells of USA raster
D <- distanceFromPoints(object = rr, xy = sample)
# Plot distances and points
plot(D)
points(sample)
After the last two lines of code, I get this plot.
However, I'd like it to be over the rasterized map of the USA. And, I'd like it to only consider distances from cells that are in the contiguous USA, not all cells in the bounding box. How do I go about doing this?
I'd also appreciate any other tips regarding the shape file I'm using -- is it the best one? Should I be worried about using the right projection, since my actual dataset is lat/long? Will distanceFromPoints be able to efficiently process such a large dataset, or is there a better function?
To limit raster D to the contiguous USA you could find the elements of rr assigned values of NA (i.e. raster cells within the bounding box but outside of the usa polygons), and assign these same elements of D a value of NA.
D[which(is.na(rr[]))] <- NA
plot(D)
lines(usa)
You can use 'proj4string(usa)' to find the projection info for the usa shapefile. If your coordinates of interest are based on a different projection, you can transform them to match the usa shapefile projection as follows:
my_coords_xform <- spTransform(my_coords, CRS(proj4string(usa)))
Not sure about the relative efficiency of distanceFromPoints, but it only took ~ 1 sec to run on my computer using your example with 10 points.
I think you were looking for the mask function.
library(raster)
usa <- getData('GADM', country='USA', level=1)
# exclude Alaska and Hawaii
usa <- usa[!usa$NAME_1 %in% c( "Alaska" , "Hawaii"), ]
# get the extent and create raster with preferred resolution
r <- raster(floor(extent(usa)), res=1)
# rasterize polygons
rr <- rasterize(usa, r)
set.seed(89)
sample <- spsample(usa, 10, type = "random")
# Find distance from sample points to cells of USA raster
D <- distanceFromPoints(object = rr, xy = sample)
# remove areas outside of polygons
Dm <- mask(D, rr)
# an alternative would be mask(D, usa)
# cell with highest value
mxd <- which.max(Dm)
# coordinates of that cell
pt <- xyFromCell(r, mxd)
plot(Dm)
points(pt)
The distances should be fine, also when using long/lat data. But distanceFromPoints could indeed be a bit slow with a large data set, as it uses a brute force algorithm.
I'm wondering if someone has built a raster of the continents of the world where each cell equals the distance from that cell to the nearest shore. This map would highlight the land areas that are most isolated inland.
I would imagine this would simply rasterize a shapefile of the global boundaries and then calculate the distances.
You can do this with raster::distance, which calculates the distance from each NA cell to the closest non-NA cell. You just need to create a raster that has NA for land pixels, and some other value for non-land pixels.
Here's how:
library(raster)
library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys.
# (set the desired grid resolution with res)
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate distance to nearest non-NA pixel
d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
To create the plot above (I like rasterVis for plotting, but you could use plot(d)):
library(rasterVis)
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),
colorkey=list(height=0.6), main='Distance to coast')