I'm using the spatstat package to compute the distance from each point to its nearest neighbouring point, based on xyz data. The code runs, but I'm getting incorrect answers. See below.
ex<- data.frame(long= c(-103.5664,-103.5664,-103.5586),lat= c(32.09539,32.10129,32.10799),elevation= c(5000,5500,5700))
####bounding box 3D
bb <- box3(range(ex$long), range(ex$lat), range(ex$elevation))
# Create a 3D point pattern:
comp_dist.pp3<- spatstat::pp3(ex$long,ex$lat,ex$elevation,bb)
nndist.pp3(comp_dist.pp3,k=1)
[1] 500 200 200
The points are more than a mile apart, so the distances should be closer to 6800.
Unfortunately spatstat doesn’t automatically recognize latitude and longitude
coordinates. Your points are interpreted as (x,y,z) coordinates in Euclidean
space, and the three pairwise distances measured by
sqrt((x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2) are (very suspiciously) the nice
round numbers 200, 500, and 700. Here is the small change to the original
code to calculate all pairwise distances:
library(spatstat)
ex<- data.frame(long= c(-103.5664,-103.5664,-103.5586),
lat= c(32.09539,32.10129,32.10799),
elevation= c(5000,5500,5700))
bb <- box3(range(ex$long), range(ex$lat), range(ex$elevation))
comp_dist.pp3<- spatstat::pp3(ex$long,ex$lat,ex$elevation,bb)
pairdist(comp_dist.pp3)
#> [,1] [,2] [,3]
#> [1,] 0 500 700
#> [2,] 500 0 200
#> [3,] 700 200 0
You can use sp::spTransform or sf::st_transform to convert from spherical
(lon, lat) to planar (x, y) coordinates; then you can attach your elevation as
the z-coordinate when you define the pp3 object, and things should work.
Created on 2019-02-12 by the reprex package (v0.2.1)
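For completeness, here is a minimal sketch of that projection step (not from the original answer; the EPSG code 32613, UTM zone 13N, is my assumption based on these coordinates, and all three axes must share one unit, so convert elevation to meters first if it is in feet):
library(sf)
library(spatstat)
pts <- st_as_sf(ex, coords = c("long", "lat"), crs = 4326) # lon/lat, WGS84
pts <- st_transform(pts, 32613)                            # assumed UTM zone 13N, units in meters
xy <- st_coordinates(pts)
bb <- box3(range(xy[, 1]), range(xy[, 2]), range(ex$elevation))
comp_dist.pp3 <- pp3(xy[, 1], xy[, 2], ex$elevation, bb)
nndist.pp3(comp_dist.pp3, k = 1)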
Check your units. Look at your values: the longitudes are all around -103, the latitudes all around 32, and the elevations are 5000, 5500, 5700. The dimension that contributes most of the distance is elevation, and since the elevations differ by only 500 and 200, I would not expect distances "closer to 6800."
Edit: That is to say, I believe the package is treating your latitudes and longitudes as plain numeric dimensions on an xyz plane, and not as actual latitudes and longitudes!
Related
I have a raster layer masked by a river network, in binary form. When counting the number of pixels in R (freq(raster)) and in QGIS (using r.report) I found the same number in both. But when calculating the area in sq. km, I found a difference between the area R computes (tapply(area(raster), raster[], sum)) and QGIS. The bigger problem for me is why the area calculations are not on par with the pixel counts: the resolution of the raster is 30 arc-seconds (approx. 1 km x 1 km), so the number of pixels should be approximately equal to the area in sq. km. The raster has a geographic coordinate system, OGC:CRS84 - WGS 84, and is in .grd format. I also projected it to UTM for QGIS, which slightly increased the area, but with no considerable difference.
I am also posting the reports from R and QGIS below, and please follow the link below if you want to have a look at the raster too. I want the values as areas, so I really don't know whether I should convert the pixel count into area in sq. km (which in this case should be equal) or use one of the answers from R or QGIS.
In r.report in QGIS, area in sq. km vs pixel count:
0 value: 222,520 vs 290,767
1 value: 81,653 vs 106,934
In RStudio, area in sq. km vs pixel count:
0 value: 222,068.53 vs 290,767
1 value: 81,484.18 vs 106,934
https://drive.google.com/drive/folders/1pBba0ejIc4t9ayKl36nyIo3vbgD4BKCT?usp=sharing
With the R raster package, area is computed in km2 for longitude/latitude raster cells. Also note that at the equator a 30 arc-second cell is about 0.86 km2, and this area decreases going towards the poles.
Given the number of cells n and the average latitude of your raster lat, the area should be about
n * 0.86 * cos(lat*pi/180)
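As a rough sanity check with your pixel counts (the average latitude of your raster is my guess here; plug in the real value):
n <- 106934                  # your count of cells with value 1
n * 0.86 * cos(28 * pi/180)  # about 81,200 km2 if the raster sits near 28 degrees latitude
which is in the same ballpark as the 81,484.18 km2 computed below.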
An example showing the proper (memory-safe) way to do this with the raster package:
#example data
library(raster)
r <- raster()
set.seed(67)
values(r) <- 0
r[sample(ncell(r), 1000)] <- 1
#select the cells that are not 0
rr <- reclassify(r, cbind(0,NA))
a <- area(rr)
aa <- mask(a, rr)
cellStats(aa, "sum")
With your data, that is:
r <- raster("raster.grd")
rr <- reclassify(r, cbind(0, NA))
a <- area(r)
aa <- mask(a, rr)
cellStats(aa, "sum")
#81484.18
That is, about 81 thousand km2
I have data points of a species observed using camera traps and would like to measure the distance of each camera trap site (CameraStation) to the edge of a national park using R. I have a shapefile of the park (shp) and want to apply a criterion to CameraStation(s) which are <5km from the edge. My data frame (df) consists of multiple events/observations (EventID) per CameraStation. The aim is to analyse when events near the park edge are most frequent given other environmental factors such as Season, Moon Phase and DayNight (also columns in DF).
I found the Distance package in R, but that is for distance sampling, not what I want to do. Which package is relevant in this situation?
I expect the following outcome:
EventID CameraStation Distance(km) Within 5km
0001 Station 1 4.3 Yes
0002 Station 1 4.3 Yes
0003 Station 2 16.2 No
0004 Station 3 0.5 Yes
...
Here's a general solution, adapted from Spacedman's answer to this question at gis.stackexchange. Note: this solution requires working in a projected coordinate system. You can transform to a projected CRS if needed using spTransform.
The gDistance function of the rgeos package calculates the distance between geometries, but for the case of points inside a polygon the distance is zero. The trick is to create a new "mask" polygon where the original polygon is a hole cut out from the mask. Then we can measure the distance between points in the hole and the mask, which is the distance to the edge of the original polygon that we really care about.
We'll use the shape file of the Yellowstone National Park Boundary found on this page.
library(sp) # for SpatialPoints and proj4string
library(rgdal) # to read shapefile with readOGR
library(rgeos) # for gDistance, gDifference, and gBuffer
# ab67 was the name of the shape file I downloaded.
yellowstone.shp <- readOGR("ab67")
# gBuffer enlarges the boundary of the polygon by the amount specified by `width`.
# The units of `width` (meters in this case) can be found in the proj4string
# for the polygon.
yellowstone_buffer <- gBuffer(yellowstone.shp, width = 5000)
# gDifference calculates the difference between the polygons, i.e. what's
# in one and not in the other. That's our mask.
mask <- gDifference(yellowstone_buffer, yellowstone.shp)
# Some points inside the park
pts <- list(x = c(536587.281264245, 507432.037861251, 542517.161278414,
477782.637790409, 517315.171218198),
y = c(85158.0056377799, 77251.498952222, 15976.0721391485,
40683.9055315169, -3790.19457474617))
# Sanity checking the mask and our points.
plot(mask)
points(pts)
# Put the points in a SpatialPointsDataFrame with camera id in a data field.
spts.df <- SpatialPointsDataFrame(pts, data = data.frame(Camera = ordered(1:length(pts$x))))
# Give our SpatialPointsDataFrame the same spatial reference as the polygon.
proj4string(spts.df) <- proj4string(yellowstone.shp)
# Calculate distances (km) from points to edge and put in a new column.
spts.df$km_to_edge <- apply(gDistance(spts.df, mask, byid=TRUE), 2, min)/1000
# Determine which records are within 5 km of an edge and note in new column.
spts.df$edge <- ifelse(spts.df$km_to_edge < 5, TRUE, FALSE)
# Results
spts.df
# coordinates Camera km_to_edge edge
# 1 (536587.3, 85158.01) 1 1.855010 TRUE
# 2 (507432, 77251.5) 2 9.762755 FALSE
# 3 (542517.2, 15976.07) 3 11.668700 FALSE
# 4 (477782.6, 40683.91) 4 4.579638 TRUE
# 5 (517315.2, -3790.195) 5 8.211961 FALSE
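If your camera points arrive as lon/lat rather than projected coordinates, here is a minimal sketch of the spTransform step mentioned at the top (spts.ll is a hypothetical SpatialPointsDataFrame in lon/lat):
proj4string(spts.ll) <- CRS("+proj=longlat +datum=WGS84")
# Reproject to the park polygon's projected CRS so gDistance works in meters:
spts.df <- spTransform(spts.ll, CRS(proj4string(yellowstone.shp)))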
Here's a quick solution.
Simplify the outline of your shapefile into N points. Then calculate the distance from each camera trap to every point on the outline of the national park and take the minimum.
library(sp)        # for spsample
library(geosphere) # for distm and distHaversine
n <- 500 ##The number of points summarizing the shapefile
NPs <- ##Your shapefile goes here
NP.pts <- spsample(NPs, n = n, type = "regular")
CP.pts <- ## Coordinates for a single trap
distances<-distm(coordinates(CP.pts), coordinates(NP.pts), fun = distHaversine)/1000
##Distance in Km between the trap to each point in the perimeter of the shapefile:
distances
Use distances to find the minimum distance between the shapefile and that given trap. This approach can easily be generalizable using for loops or apply functions.
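For example, a sketch of that generalization (traps is a hypothetical SpatialPoints object holding all camera stations):
d <- distm(coordinates(traps), coordinates(NP.pts), fun = distHaversine) / 1000
min_km <- apply(d, 1, min) # distance (km) from each trap to its nearest perimeter point
within5 <- min_km < 5      # the <5 km criterion from the question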
I had a problem with the points data frame and shapefile being projected, so instead I used the example in this link to answer my question:
https://gis.stackexchange.com/questions/225102/calculate-distance-between-points-and-nearest-polygon-in-r
Basically, I used this code:
df2 # my data frame with points
shp # my shapefile (non-projected)
coordinates(df2) <- ~Longitude+Latitude # Longitude and Latitude are columns in my df
dist.mat <- geosphere::dist2Line(p = df2, line = shp)
dmat <- data.frame(dist.mat) # turn the result into a data frame
dmat$km5 <- ifelse(dmat$distance < 5000, TRUE, FALSE) # dist2Line returns meters, so 5000 m = 5 km
coordinates(dmat) <- ~lon+lat
df2$distance <- dmat$distance # add the new distance column to my data frame
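To get the per-event table from the original question, one hypothetical final step (assuming df has one row per EventID with a CameraStation column, and stations has one row per station with its computed distance in meters):
df$Distance_km <- stations$distance[match(df$CameraStation, stations$CameraStation)] / 1000
df$Within5km <- df$Distance_km < 5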
I want to calculate road network distances between a reference line (or a reference point, if a single point facilitates a solution) and a data frame of long/lat points. I have the following data frame:
Latitude Longitude
1 40.66858 22.88713
2 40.66858 22.88713
3 40.66858 22.88713
4 40.66858 22.88713
5 40.66858 22.88714
6 40.66857 22.88715
7 40.66858 22.88716
8 40.66858 22.88717
9 40.66859 22.88718
10 40.66861 22.88719
and the following reference line with start/end coordinates:
22.88600 40.66885
22.88609 40.66880
(If we want a single reference point in the middle of the line (instead of the whole line) its coordinates are: 22.88602844465866,40.66883357487465)
Here is a screenshot from google earth after plotting the points and the line:
I have tried to compute the distances of each point with the reference line with the following way:
dist2Line(points, line, distfun=distHaversine) #from geosphere package
The distance which is computed (e.g. for the first point) is the one shown with the yellow line in the following screenshot; the desired one is the red line (the road network distance). How can I solve this? I want to compute the road network distances for all points!
Thank you in advance!
library(sp)
library(rgeos)
library(geosphere)
Let's compute the midpoint of your reference line; we'll prepend it to your points:
pt1 <- matrix(c(22.88600, 40.66885), ncol=2)
pt2 <- matrix(c(22.88609, 40.66880), ncol=2)
midpt <- as.data.frame(midPoint(pt1, pt2))
NOTE: the first 4 points are the same in your supplied data
read.csv(text="lat,lon
40.66858,22.88713
40.66858,22.88713
40.66858,22.88713
40.66858,22.88713
40.66858,22.88714
40.66857,22.88715
40.66858,22.88716
40.66858,22.88717
40.66859,22.88718
40.66861,22.88719", stringsAsFactors = FALSE) -> l
l <- rbind.data.frame(midpt, l)
Using the midpoint on the line isn't perfect so you could use the spatial intersection operations as well to find the correct intersecting point.
Now, make it a spatial object and give it the boring longlat "projection".
l <- SpatialLines(list(Lines(Line(l[,2:1]), "1")), proj4string = CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))
Convert said "projection" to something meaningful (I picked EPSG:3265, but choose whatever you want so you can get real distance):
l <- spTransform(l, CRS("+init=epsg:3265"))
Get the points from the line:
pts <- as(l, "SpatialPoints")
Follow How to calculate geographic distance between two points along a line in R? to get the distances between the points; you can do the rest from there:
diff(sort(gProject(l, pts, normalized = FALSE)))
## [1] 372.553928 0.000000 0.000000 0.000000 3.360954 4.581859
## [7] 4.581860 3.360956 4.581862 7.077129
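If what you actually want is the along-road distance from the reference midpoint to each point (rather than the gaps between successive points), the projected distances give that directly, in the units of whatever CRS you chose:
sort(gProject(l, pts, normalized = FALSE)) # distance along the line from its start (the midpoint)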
It'd be 👍🏼 if someone who knows how to do this with sf could do that as well since I couldn't find a gProject equivalent.
I need to calculate the shortest distance between two point matrices. I am new to R and have no clue how to do this. This is the code that I used to read in the data and convert it to points:
library(dismo)
laurus <- gbif("Laurus", "nobilis")
locs <- subset(laurus, select = c("country", "lat", "lon"))
#uk observations
locs.uk <-subset(locs, locs$country=="United Kingdom")
#ireland observations
locs.ire <- subset(locs, locs$country=="Ireland")
uk_coord <-SpatialPoints(locs.uk[,c("lon","lat")])
ire_coord <-SpatialPoints(locs.ire[,c("lon","lat")])
crs.geo<-CRS("+proj=longlat +ellps=WGS84 +datum=WGS84") # geographical, datum WGS84
proj4string(uk_coord) <-crs.geo #define projection
proj4string(ire_coord) <-crs.geo #define projection
I need to calculate the shortest (Euclidean) distance from points in Ireland to points in the UK. In other words, I need to calculate the distance from each point in Ireland to its closest point in the UK points layer.
Can someone tell me what function or package I need to use in order to do this? I looked at gdistance and could not find a function that calculates the shortest distance.
You can use the FNN package, which uses spatial trees to make the search efficient. It works with Euclidean geometry, so you should transform your points to a planar coordinate system. I'll use the rgdal package to convert to the UK grid reference (stretching it a bit to use it over Ireland here, but your original data was New York, and you should use a New York planar coordinate system for that):
> require(rgdal)
> uk_coord = spTransform(uk_coord, CRS("+init=epsg:27700"))
> ire_coord = spTransform(ire_coord, CRS("+init=epsg:27700"))
Now we can use FNN:
> require(FNN)
> g = get.knnx(coordinates(uk_coord), coordinates(ire_coord),k=1)
> str(g)
List of 2
$ nn.index: int [1:69, 1] 202 488 202 488 253 253 488 253 253 253 ...
$ nn.dist : num [1:69, 1] 232352 325375 87325 251770 203863 ...
g is a list of the indices and distances of the UK points that are nearest to each of the 69 Irish points. The distances are in metres because the coordinate system is in metres.
You can illustrate this by plotting the points and then joining Irish point 1 to UK point 202, Irish 2 to UK 488, Irish 3 to UK 202, etc. In code:
> plot(uk_coord, col=2, xlim=c(-1e5,6e5))
> plot(ire_coord, add=TRUE)
> segments(coordinates(ire_coord)[,1], coordinates(ire_coord)[,2], coordinates(uk_coord[g$nn.index[,1]])[,1], coordinates(uk_coord[g$nn.index[,1]])[,2])
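A small follow-on (plain base R on the structure shown above) to tabulate the nearest UK point and its distance per Irish point:
res <- data.frame(ire_id = seq_len(nrow(g$nn.index)),
                  nearest_uk = g$nn.index[, 1],
                  km = g$nn.dist[, 1] / 1000)
head(res)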
gDistance() from the rgeos package will give you the distance matrix
library(rgeos)
gDistance(uk_coord, ire_coord, byid = TRUE)
Another option is nncross() from the spatstat package. Pro: it gives the distance to the nearest neighbour. Contra: you'll need to convert the SpatialPoints to a point pattern (ppp) object (see ?as.ppp in spatstat)
library(spatstat)
nncross(ire.ppp, uk.ppp)
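Here is a minimal sketch of that conversion (my assumption: the points were first projected to a planar CRS, e.g. with the spTransform calls to EPSG:27700 shown above):
uk.xy  <- coordinates(uk_coord)
ire.xy <- coordinates(ire_coord)
# One observation window covering both point sets, then the two point patterns:
w <- owin(range(c(uk.xy[, 1], ire.xy[, 1])), range(c(uk.xy[, 2], ire.xy[, 2])))
uk.ppp  <- ppp(uk.xy[, 1],  uk.xy[, 2],  window = w)
ire.ppp <- ppp(ire.xy[, 1], ire.xy[, 2], window = w)
nncross(ire.ppp, uk.ppp)$dist # distance (m) from each Irish point to its nearest UK point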
The geosphere package offers a lot of dist* functions to evaluate distances between two lat/lon points. In your example, you could try:
require(geosphere)
#get the coordinates of UK and Ireland
pointuk <- uk_coord@coords
pointire <- ire_coord@coords
#prepare a vector which will contain the minimum distance for each Irish point
res <- numeric(nrow(pointire))
#get the min distance
for (i in 1:length(res)) res[i] <- min(distHaversine(pointire[i, , drop = FALSE], pointuk))
The distances you'll obtain are in meters (you can change this by setting the radius of the earth in the call to distHaversine).
The problem with gDistance and the other rgeos functions is that they evaluate the distance as if the coordinates were planar. Basically, the number you obtain is not of much use.
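A quick illustration of the pitfall, using a hypothetical pair of points one degree of longitude apart at latitude 50:
sqrt((1 - 0)^2 + (50 - 50)^2)     # planar "distance": 1 (degree)
distHaversine(c(0, 50), c(1, 50)) # roughly 71,600 (meters)
A degree of longitude shrinks with latitude, so the planar number has no fixed ground meaning.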
I have a file in ASCII grid format. The first rows look like this:
ncols 1440
nrows 720
xllcorner -180.0
yllcorner -90
cellsize 0.25
NODATA_value -9999
Basically I have the world with 1440 'tiles' in the x direction (longitude) and 720 'tiles' in the y direction (latitude). Each 'tile' is a square with a side length of 0.25 degrees. I think I have xllcorner and yllcorner correct. I can draw this map like this in R:
library("adehabitat")
bio1 <- import.asc("D:/ENFA/data.asc")
maps <- as.kasc(list(data = bio1))
image(maps, col = cm.colors(256), clfac = list(Aspect = cl))
The map looks fine.
I would like to perform some ecological niche factor analysis (ENFA) using the adehabitat package and am not too sure about the location data. Basically I have them as longitudes and latitudes at the moment, but I could also generate them as 'tile indices' (e.g. the lower left corner has latitude -90 and longitude -180, so its 'tile index' would be 0, 0 - right?). Which is the correct location data format? I would use ENFA code like this:
locs <- read.table("D:/ENFA/Locs.txt", header = TRUE, sep="\t")
dataenfa1 <- data2enfa(maps, locs)
pc <- dudi.pca(dataenfa1$tab, scannf = FALSE)
enfa1 <- enfa(pc, dataenfa1$pr,scannf = FALSE)
hist(enfa1)
I would appreciate any comments please. Thanks in advance.
The problem with leaving your coordinates in lat-long form is that, at most places on earth, a degree of longitude has a different length than a degree of latitude. This might distort your ENFA by exaggerating distances in some directions relative to those in others.
Especially if your data are from a relatively small area, I'd suggest re-expressing the coordinates in meters along a W/E x-axis and a S/N y-axis. If all of your points fall inside a single UTM zone, then you could do the conversion within R, using project() in the rgdal package:
Here's one example, taken from here:
library(rgdal)
# Make a two-column matrix, col1 = long, col2 = lat
xy <- cbind(c(118, 119), c(10, 50))
# Convert it to UTM coordinates (in units of meters)
project(xy, "+proj=utm +zone=51 +ellps=WGS84")
[,1] [,2]
[1,] -48636.65 1109577
[2,] 213372.05 5546301
Much more info about how to manipulate spatial data is available in "Applied Spatial Data Analysis with R" by Bivand, Pebesma, and Gomez-Rubio. If you need more specific assistance, try the R-sig-Geo mailing list.
Hope this helps.
Maybe you want to convert the coordinates into GHAM (Global, Hierarchical, Alphanumeric, and Morton-encoded), which represents the globe by cells of arbitrary precision (as fine or coarse as you wish), so that any lat/lon has a single alphanumeric address that remains sortable.
Here's the abstract from "GHAM: A compact global geocode suitable for sorting", by Duncan Agnew:
The GHAM code is a technique for labeling geographic locations based
on their positions. It defines addresses for equal-area cells bounded
by constant latitude and longitude, with arbitrarily fine precision.
The cell codes are defined by applying Morton ordering to a recursive
division into a 16 by 16 grid, with the resulting numbers encoded into
letter–number pairs. A lexical sort of lists of points so labeled will
bring near neighbors (usually) close together; tests on a variety of
global datasets show that in most cases the actual closest point is
adjacent in the list 50% of the time, and within 5 entries 80% of the
time.
Source code is in the IAMG repository, but if you can't access it I'm sure he would provide it.
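To make the idea concrete, here is a rough, hypothetical sketch of Morton-style cell labeling. It is NOT Agnew's implementation: it uses equal-angle cells (real GHAM cells are equal-area) and a made-up letter-number encoding, but it shows how recursive 16 x 16 subdivision plus bit interleaving yields a fixed-width, lexically sortable address:
gham_like <- function(lon, lat, levels = 3) {
  xr <- c(-180, 180); yr <- c(-90, 90)
  code <- ""
  for (lev in seq_len(levels)) {
    # Column/row of the point in the current 16 x 16 subdivision
    cx <- min(floor(16 * (lon - xr[1]) / diff(xr)), 15)
    cy <- min(floor(16 * (lat - yr[1]) / diff(yr)), 15)
    # Morton-interleave the two 4-bit indices into one number in 0..255
    m <- 0
    for (b in 3:0)
      m <- m * 4 + 2 * bitwAnd(bitwShiftR(cy, b), 1) + bitwAnd(bitwShiftR(cx, b), 1)
    # Encode as a letter-number pair (26 letters x 10 digits >= 256 codes)
    code <- paste0(code, LETTERS[m %/% 10 + 1], m %% 10)
    # Zoom into the chosen cell for the next level
    xr <- xr[1] + diff(xr) * c(cx, cx + 1) / 16
    yr <- yr[1] + diff(yr) * c(cy, cy + 1) / 16
  }
  code
}
gham_like(-103.5664, 32.09539) # one sortable alphanumeric address per location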