I have a set of GPS coordinates in R that I want to treat as an "exposure" for another set of GPS coordinates corresponding to patients of interest. How do I do proximity analysis in R to separate patients into two groups: those within x meters of an exposure coordinate and those further away?
I think something like this should work:
library(sp)
exp <- data.frame(lat= 40.741895,long = -73.989308)
patients <- data.frame(lat = rnorm(10,exp$lat,0.1),long = rnorm(10,exp$long,0.1))
coordinates(patients) <- ~ long + lat
coordinates(exp) <- ~ long + lat
d <- spDistsN1(coordinates(patients),coordinates(exp),longlat = TRUE)
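To finish the split, note that spDistsN1() with longlat = TRUE returns distances in kilometres, so convert the metre cutoff accordingly; a minimal sketch, with x as a hypothetical cutoff in metres:
x <- 500 # hypothetical cutoff in metres
exposed <- d <= x / 1000 # spDistsN1(longlat = TRUE) returns kilometres
table(exposed) # TRUE = within x metres of an exposure coordinate
which(exposed) # row indices of the exposed patients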
Strongly inspired by: Calculate distance from GPS data
I want to calculate the shortest-path distance (using the shortestPath function from the gdistance package) between a set of geographic coordinates, using a transition layer of the ocean to prevent 'movement' across land.
Here is how I created the transition layer:
library(raster); library(gdistance); library(maptools); library(rgdal); library(sp)
mapcrs <- "+proj=longlat +datum=WGS84 +no_defs"
data(wrld_simpl)
world <- wrld_simpl
worldshp <- spTransform(world, mapcrs)
ras <- raster(nrow=300,ncol=300)
crs(ras) <- crs(worldshp)
extent(ras) <- extent(worldshp)
landmask <- rasterize(worldshp, ras)
landras <- is.na(landmask)
tr <- transition(landras, transitionFunction = mean, directions = 8, symm = FALSE)
tr = geoCorrection(tr, scl=FALSE)
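As a quick sanity check (optional, not part of the original workflow), the transition layer can be converted back to a raster and plotted to confirm that land carries no conductance:
plot(raster(tr)) # land cells should show zero conductance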
I then want to calculate the shortestPath distance between every coordinate in my dataset i.e. location 1 to location n, location 2 to location n etc.
Let's produce some hypothetical geographic coordinates and convert to spatial points
x <- rnorm(10, mean = -40, sd=5)
y <- rnorm(10, mean = 20, sd=5)
xy <- cbind(x,y); colnames(xy) <- c("lon","lat")
xy <- SpatialPoints(xy); projection(xy) <- mapcrs
Using the shortestPath function in gdistance, I can calculate the distance from the first coordinate (i.e. xy[1]) to all other xy coordinates, like so.
dist <- shortestPath(tr, origin = xy, goal = xy, output="SpatialLines")
I then tried to apply a for loop to sequentially calculate distance from location 1 to all other locations, then from location 2 to all other locations, etc., which I wrote as follows:
for(i in seq_along(xy)){
  AtoB <- shortestPath(tr, origin = xy[i,], goal=xy, output="SpatialLines")
  i <- i+1
}
This, however, still only calculates the distances relative to the first xy spatial point and does not 'loop' for all subsequent rows. I don't know what I'm doing wrong. It's probably super-easy, but I'm struggling. Any help would be appreciated.
Thanks in advance,
Tony
---- UPDATE ----
We have come up with a bit of a workaround (thanks Charley Clubley), but it still won't produce outputs for every spatial line. It generates a matrix of distances, as follows:
Using xy as a matrix, not spatial points
distances <- matrix(ncol=nrow(xy), nrow=nrow(xy))
xy_b <- xy ## Coords needs to be as a matrix (not spatial points)
## This generates an error indicating there are no more rows to delete once complete, but the computation works
for (i in 1:nrow(xy_b)) {
  AtoB <- shortestPath(tr, xy_b, xy, output = "SpatialLines")
  length <- SpatialLinesLengths(AtoB)
  distances[i, ] <- length
  xy_b <- xy_b[-1, ]
}
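For what it's worth, indexing the origin directly inside the loop avoids both the shrinking matrix and the end-of-loop error; a minimal sketch, assuming xy is the SpatialPoints object from above:
distances <- matrix(NA, nrow = length(xy), ncol = length(xy))
for (i in seq_along(xy)) {
  AtoB <- shortestPath(tr, origin = xy[i, ], goal = xy, output = "SpatialLines")
  distances[i, ] <- SpatialLinesLengths(AtoB)
}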
Using the tmaptools package in R, how can I extract the 'Bearing' information from a .GPX track file? This appears in Garmin Basecamp but does not appear using tmaptools::read_GPX. Currently I use the code below, but surely there is a simpler way? Link to GPS track: https://www.dropbox.com/s/02p3yyjkv9fmrni/Barron_Thomatis_2019_EOD.gpx?dl=0
library(tmaptools)
library(tmap)
library(sf)
library(tidyverse)
library(geosphere)
GPSTrack <- read_GPX("Barron_Thomatis_2019_EOD.gpx", layers = "track_points", as.sf = TRUE)
#
#Adjust GPS Track Data
#
#Extract Lat & Lon from Track geometry (X = Lon, Y = Lat)
GPSTrack_Pts <- st_coordinates(GPSTrack)
#Add X, Y Columns to Track
GPSTrack2 <- cbind(GPSTrack, GPSTrack_Pts)
#Create a coordinate vector by combining X & Y
coords <- cbind(GPSTrack2$X,GPSTrack2$Y)
#Convert GPS Track into SpatialPoints format for calculating Bearing
GPSTrack_SpPts <- SpatialPoints(coords)
#Create GPS Point Bearing, GPP point distance & GPS Time interval columns
empty <- st_as_sfc("POINT EMPTY") # placeholder point (not used below)
GPSTrack2 <- GPSTrack2 %>%
  st_set_crs(4326) %>% # will use great circle distance
  mutate(Bearing = bearing(coords))
#Convert Bearing to Course and Add as column
GPSTrack2 <- GPSTrack2 %>%
  mutate(course = (Bearing + 360) %% 360) # add a full circle, i.e. +360, and take the modulo for 360
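For example, a bearing of -90 degrees becomes a course of 270, since (-90 + 360) %% 360 is 270.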
I suggest you use lwgeom::st_geod_azimuth() for this task - it makes for somewhat more concise code.
Note that there is a challenge when adding the vector of bearings back to the spatial dataframe of points: it has by definition one element fewer than the number of rows (you need two points to define a bearing).
One way to achieve that, if required, is to concatenate the vector with a single NA value representing the very last point, which by definition has no azimuth, as there is no following point.
The azimuth values are objects of class units, originally in radians. Should the class create a problem (as it does with concatenating with the NA) you can easily convert it to a plain number via units::drop_units().
library(sf)
library(dplyr)
library(lwgeom)
points <- st_read("Barron_Thomatis_2019_EOD.gpx",
                  layer = "track_points",
                  quiet = TRUE,
                  stringsAsFactors = FALSE)
points <- points %>%
  mutate(bearing = c(lwgeom::st_geod_azimuth(.) %>% units::drop_units(), NA))
I have built a gridded area in the Gulf of Alaska with a resolution of 0.02 decimal degrees (~1 nm):
library(sp)
library(rgdal)
library(dplyr) # provides the %>% pipe used below
# Set interval for grid cells.
my.interval=0.02 # If 1 is 1 degree, which is 60nm, then 0.1 is every 6nm, 0.05 is every 3nm, and 0.0167 is every 1nm
# Select range of coordinates for grid boundaries (UTM to maintain constant grid cell area regardless of geographic location).
lonmin = -140.5083
lonmax = -131.2889
latmin = 53.83333
latmax = 59.91667
LON = seq(lonmin, lonmax, by=my.interval)
LAT = seq(latmin, latmax, by=my.interval)
# Compile series of points for grid:
mygrd = expand.grid(
Longitude = seq(lonmin, lonmax, by=my.interval),
Latitude = seq(latmin, latmax, by=my.interval)) %>%
#mutate(z=1:n()) %>%
data.frame
I exported that grid as a .csv file and brought it into ArcGIS where I used a few bathymetry rasters to extract the bottom depth at the midpoint of each cell. I then exported that from GIS back into R as a .csv file data frame. So now it has another column called "Depth" on it.
For now, I'll just add a column with random "depth" numbers in it:
mygrd$Depth<-NA
mygrd$Depth<-runif(nrow(mygrd), min=100, max=1000)
I would like to calculate the slope at the midpoint of each cell (between points).
I've been trying to do this with the slope() function in the SDMTools package, which requires a SpatialGridDataFrame from the sp package.
I can't get this to work, and I am also not sure it is the easiest approach.
I have a data frame with 3 columns: Longitude, Latitude, and Depth. I'd like to calculate slope. If anyone knows any better way to do this, let me know! Any help is much appreciated!
Here is some of the code I've been trying to use:
library(SDMTools)
proj <- CRS('+proj=longlat +datum=WGS84')
coords <- mygrd[,1:2]
t2 <- SpatialPointsDataFrame(coords=coords, data=mygrd, proj4string=proj)
gridded(t2) <- TRUE # promote the regularly spaced points to SpatialPixelsDataFrame
t3 <- as(t2, "SpatialGridDataFrame")
class(t3)
slope.test<-slope(t3, latlon=TRUE)
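Since you asked about alternatives: if a raster-based route is acceptable, raster::terrain() computes slope directly from a gridded raster. A minimal sketch, assuming the regular lon/lat spacing of mygrd above (rasterFromXYZ requires that regularity):
library(raster)
r <- rasterFromXYZ(mygrd[, c("Longitude", "Latitude", "Depth")],
                   crs = "+proj=longlat +datum=WGS84")
slope.r <- terrain(r, opt = "slope", unit = "degrees", neighbors = 8)
plot(slope.r)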
I am working on a project where I pull crime data from an API and essentially calculate the density of crime per predefined grid unit. I do this now by putting lat and lon into a data.frame and then counting the points within a radius of each point center. This is computationally expensive, as there are thousands of points in the predefined grid and thousands of crime points.
I'm wondering if there is a better way to calculate crime density; I've heard that the raster package may be valuable?
Some sample data:
# Create a predefined grid of coordinates
predef.grid <- data.frame(lat = seq(from = 2.0, to = 4.0, by = 0.1),lon = seq(from = 19.0, to = 21.0, by = 0.1))
predef.grid <- expand.grid(predef.grid)
# Create random sample of crime incidents
crime.incidents <- data.frame(lat = rnorm(10, 4), lon = rnorm(10, 20))
# Need to count number of crimes within radius of every point in predef.grid
Thanks!
library(raster)
library(sp)
# predefined raster covering the grid from the question
predef.grid <- raster(xmn=19,  # xmin (lon)
                      xmx=21,  # xmax
                      ymn=2,   # ymin (lat)
                      ymx=4,   # ymax
                      res=0.1, # spatial resolution
                      vals = 1) # cell value
plot(predef.grid)
# Create random sample of crime incidents
# points should be a Spatial object of some form, point, etc.
crime.incidents <- spsample(x = as(extent(predef.grid), 'SpatialPolygons'),
n = 100,
type = 'random')
# plot points over grid
points(crime.incidents, pch = 20)
# count points per cell
density <- rasterize(crime.incidents, predef.grid, fun='count')
# plot the density
plot(density)
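One caveat: rasterize() with fun='count' tallies incidents per cell, not within a radius of each grid point. If the radius version is really needed, a distance matrix works for moderate sizes; a minimal sketch with a hypothetical 10 km radius, where predef.grid.df stands for the data.frame grid built in the question (the answer reuses the name predef.grid for a raster):
library(sp)
radius.km <- 10 # hypothetical radius
dmat <- spDists(as.matrix(predef.grid.df[, c("lon", "lat")]),
                coordinates(crime.incidents), longlat = TRUE)
crime.count <- rowSums(dmat <= radius.km) # crimes within the radius of each grid point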
I have only been working with R for the past three weeks, so I am very much a novice. I am currently working with NetCDF climate data from across New England (path). I also have a .csv file with coordinates of specific cities we want to look at (cities.path). I have been able to extract the annual time series and trend from model grid cells corresponding to the specific cities in the .csv file. The problem has been plotting the important city stations from my .csv file onto the map of annual averages.
When I run the script line ave_annual_cities <- extract(annual_ave_stack, cities.points, df = T), I get a latitude/longitude graph with my important city points. When I run plot(coordinates(cities.points)) in the console, I again get a latitude/longitude graph with my important city points; however, it stands alone, separate from my ave_annual_cities graph. When I run levelplot(subset(annual_ave_stack, 1), margin=F) + layer(sp.points(cities.points, plot.grid = TRUE)), I get a graph of New England with annual averages.
Here is my script so far.
#Coordinate CS lat/lon pull from Important City Stations
# imports the csv of lat/lon for specific cities
# reads the lat/lon of the .nc modeled climate files
# extracts the annual time series and trend from model grid cells corresponding to your specific cities' locations
#Graphs City points according to annual time series graph
# Libraries (probably do not need all)
library(survival)
library(lattice)
library(ncdf4)
library(RNetCDF)
library(date)
library(raster)
library(rasterVis)
library(latticeExtra)
library(RColorBrewer)
library(rgdal)
library(rgeos)
library(wux)
path <- "/net/nfs/merrimack/raid/Northeast_US_Downscaling_cmip5/"
vars = c("tasmin","tasmax")
mods = c("ACCESS1-0", "ACCESS1-3",
"bcc-csm1-1-m", "bcc-csm1-1")
scns = c("rcp45", "rcp85")
cities.path <- "/net/home/cv/marina/Summer2017_Projects/Lat_Lon/NE_Important_Cities.csv"
necity.vars <- c("City", "State", "Population", "Latitude", "Longitude", "Elevation(meters)")
#These both read the .csv file (first uses 'utils', second uses 'wux')
#1
cities.read <- read.delim(cities.path, header = T, sep = ",")
#2
read.table <- read.wux.table(cities.path)
cities.read <- subset(cities.read, subreg = "City", sep = ",")
# To test one coordinate point
point_1 <- data.frame(cities = "test.city", latitude = 44.31, longitude = -69.79)
# To read only "Cities", "Latitude", and "Longitude"
cities.points <- subset(cities.read, select = c(1, 4, 5))
cities.points <- as.data.frame(cities.points)
colnames(cities.points)<- c("City", "Latitude", "Longitude" )
#Set plot coordinates for .csv graph
coordinates(cities.points) <- ~ Longitude + Latitude
proj4string(cities.points) <- CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
# Start loop to envelope all .nc files
for (iv in 1:2){
for (im in 1:4){
for (is in 1:2){
for(i in 2006:2007){
full<-paste(path, vars[iv], "_day_", mods[im], "_", scns[is],
"_r1i1p1_", i, "0101-", i, "1231.16th.nc", sep="")
# this will print out
#/net/nfs/merrimack/raid/Northeast_US_Downscaling_cmip5/NameOfFiles.nc
# this line will print the full file name
print(full)
# use the brick function to read the full netCDF file.
# note: the varname argument is not necessary, but if a file has multiple variables, brick will read the first one by default.
air_t<-brick(full, varname=vars[iv])
# use the calc function to get an average for each grid cell over the entire year
annual_ave_t<-calc(air_t, fun = mean, na.rm=T)
if(i == 2006){
annual_ave_stack = annual_ave_t
}else{
annual_ave_stack<-stack(annual_ave_stack, annual_ave_t)
} # end of if/else
} # end of year loop
#extract annual means for grid cells for each year corresponding to important cities
ave_annual_cities <- extract(annual_ave_stack, cities.points, df = T)
} # end of scenario loop
} # end of model loop
} # end of variable loop
levelplot(subset(annual_ave_stack, 1), margin=F) +
layer(sp.points(cities.points, plot.grid = TRUE))
# Read lat/lon from .nc climate files
# http://geog.uoregon.edu/bartlein/courses/geog607/Rmd/netCDF_01.htm
climate.data <- nc_open(full)
lat <- ncvar_get(climate.data, varid = "lat")
nlat <- dim("lat")
lat
lon <- ncvar_get(climate.data, varid = "lon")
nlon <- dim("lon")
lon
# This gives all lat data.
#print long and lat variables: confirms the dimensions of the data
print(c(nlon, nlat))
# If I need time series...
my.time <- ncvar_get(climate.data, "time")
n.dates <- trunc(length(my.time))
n.dates
# open NetCDF choosing pop and lat/lon points
cities.pop.points <- subset(cities.read, select = c(1, 3, 4, 5))
# print NetCDF coordinates with pop
print(cities.pop.points)
Hope this makes sense.
If your input data is correctly read as a multiband raster, you could read the NetCDF as a raster using:
r <- raster::stack(path)
cells <- raster::cellFromXY(r, xy)
data <- raster::extract(r, cells)
, where xy is a matrix of x/y coordinates you can get quite easily from your csv using something along the lines of:
as.matrix(read.csv(cities.path)[, c("Longitude", "Latitude")])
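Put together, a minimal sketch (assuming the csv's coordinate columns are named Longitude and Latitude, as in your script above):
r <- raster::stack(full) # 'full' is the .nc file path built in your loop
xy <- as.matrix(read.csv(cities.path)[, c("Longitude", "Latitude")]) # lon first, then lat
cells <- raster::cellFromXY(r, xy)
data <- raster::extract(r, cells)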
HTH