Calculating distance to nearest shore from multiple GPS coordinates - r

I have tried using the response to this question to solve this problem but I cannot apply it in my case since I have many coordinates distributed at a global scale.
Does anyone have a way to calculate the minimum distance in km from a series of points to the nearest shore using a loop? This is a subset of the points I am using (DATA HERE)
#setwd and load directories----
setwd("your_wd")
require (ggplot2)
require (ggmap)
#build a map to plot distribution of sample sites ----
sites<-read.csv("sites.csv", header=T)
#Using GGPLOT, plot the Base World Map
mp <- NULL
mapWorld <- borders("world", colour="gray50", fill="gray50") # create a layer of borders
mp <- ggplot() + mapWorld
#Now Layer the sites on top
Lon<-sites$x
Lat<-sites$y
mp <- mp+ geom_point(aes(x=Lon, y=Lat),color="blue", size=3)
mp

Have a look at the rgeos package
library(rgeos)
gDistance(spPoints, spPolygon, byid = TRUE)
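spPoints will be a SpatialPoints object holding the coordinates, and spPolygon a SpatialPolygons object holding the landmasses (see the sp package). Make sure that both objects have the same projection and that it is a sensible one for measuring distance: gDistance works in the planar units of the coordinate system, so unprojected longitude/latitude gives degrees rather than km.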
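If a suitable projection is hard to pick for points spread over the whole globe, a different technique is to stay in longitude/latitude and take great-circle distances to the coastline vertices. The sketch below is not gDistance: it uses the haversine formula in base R, and it approximates the shore by its vertices, so the result is only as fine as the coastline data. The function names, the 6371 km mean Earth radius, and the toy `coast` and `sites` data frames are all assumptions made for illustration.

```r
# Great-circle (haversine) distance in km from one point to many others.
# Inputs are decimal degrees; radius = 6371 km is an assumed mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# For every site, the smallest distance to any coastline vertex.
# `sites` and `coast` are data frames with columns x (lon) and y (lat).
nearest_shore_km <- function(sites, coast) {
  vapply(seq_len(nrow(sites)), function(i) {
    min(haversine_km(sites$x[i], sites$y[i], coast$x, coast$y))
  }, numeric(1))
}

# Toy data (hypothetical): a "coastline" along the equator and two sites.
coast <- data.frame(x = seq(-10, 10, by = 0.5), y = 0)
sites <- data.frame(x = c(0, 5), y = c(1, -2))
dist_km <- nearest_shore_km(sites, coast)
dist_km
```

With real data, the coastline vertices could come from any source that gives longitude/latitude pairs, with missing values dropped first. Densifying the coastline (adding points along long segments) would reduce the error from measuring to vertices only.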

Related

Maps doesn't register weird shapes

I'm working with one of my professors on some research aimed toward bettering the current methods of carbon accounting. We noticed that many of the locations for point sources were defaulted to the centroid of the county it was in (this is specific to the US at the moment, though it will be applied globally) if there was no data on the location.
So I'm using R to address the uncertainty associated with these locations. My code takes the range of longitude and latitude for a county and plots 10,000 points. It then weeds out the points that are not in the county and takes the average of the leftover points to locate the centroid. My goal is ultimately to take the difference between these points and the centroid to find the spatial uncertainty for point sources that were placed at the centroid.
However, I'm running into problems with coastal regions. My first problem is that the maps package ignores islands (the barrier islands for example) as well as other disjointed county shapes, so the centroid is not accurately reproduced when the points are averaged. My second problem is found specifically with Currituck county (North Carolina). Maps seems to recognize parts of the barrier islands contained in this county, but since it is not continuous, the entire function goes all wonky and produces a bunch of "NAs" and "Falses" that don't correspond with the actual border of the county at all.
(The data for the centroid is going to be used in other areas of the research which is why it's important we can accurately access all counties.)
Is there any way around the errors I'm running into? A different data set that could be read in, or anything of the sort? Your help would be greatly appreciated. Let me know if there are any questions about what I'm asking, and I'll be happy to clarify.
CODE:
# ggplot2 helps
# SOME TROUBLE COUNTIES: north carolina, currituck & massachusetts,dukes
library(ggplot2)
library(maps) # package has maps
library(mapproj) # projections
library(sp)
WC <- map_data('county','north carolina,currituck') #calling on county
p <- ggplot(data = WC, aes(x = long, y = lat)) #calling on latitude and longitude
p1 <- p + geom_polygon(fill = "lightgreen") + theme_bw() +
coord_map("polyconic") + coord_fixed() #+ labs(title = "Watauga County")
p1
### range for the long and lat
RLong <- range(WC$long)
RLong
RLat <- range(WC$lat)
RLat
### Add some random points
n <- 10000
RpointsLong <- sample(seq(RLong[1], RLong[2], length = 100), n, replace = TRUE)
RpointsLat <- sample(seq(RLat[1], RLat[2], length = 100), n, replace = TRUE)
DF <- data.frame(RpointsLong, RpointsLat)
head(DF)
p2<-p1 + geom_point(data = DF, aes(x = RpointsLong, y = RpointsLat))
p2
# Source:
# http://www.nceas.ucsb.edu/scicomp/usecases/GenerateConvexHullAndROIForPoints
inside <- map.where('county',RpointsLong,RpointsLat)=="north carolina,currituck"
inside[which(nchar(inside)==2)] <- FALSE
inside
g<-inside*DF
g1<-subset(g,g$RpointsLong!=0)
g1
CentLong<-mean(g1$RpointsLong)
CentLat<-mean(g1$RpointsLat)
Centroid<-data.frame(CentLong,CentLat)
Centroid
p1+geom_point(data=g1, aes(x=RpointsLong,y=RpointsLat)) #this maps all the points inside county
p1+geom_point(data=Centroid, aes(x=CentLong,y=CentLat))
First, given your description of the problem, I would probably invest a lot of effort to avoid this business of locations defaulting to the county centroids - that's the right way to solve this problem.
Second, if this is a research project, I would not use the built in maps in R. The USGS National Atlas website has excellent county maps of the US. Below is an example using Currituck County in NC.
library(ggplot2)
library(rgdal) # for readOGR(...)
library(rgeos) # for gIntersection(...)
setwd("< directory containing shapefiles >")
map <- readOGR(dsn=".",layer="countyp010")
NC <- map[map$COUNTY=="Currituck County" & !is.na(map$COUNTY),]
NC.df <- fortify(NC)
bbox <- bbox(NC)
x <- seq(bbox[1,1],bbox[1,2],length=50) # longitude
y <- seq(bbox[2,1],bbox[2,2],length=50) # latitude
all <- SpatialPoints(expand.grid(x,y),proj4string=CRS(proj4string(NC)))
pts <- gIntersection(NC,all) # points inside the polygons
pts <- data.frame(pts@coords) # ggplot wants a data.frame
centroid <- data.frame(x=mean(pts$x),y=mean(pts$y))
ggplot(NC.df)+
geom_path(aes(x=long,y=lat, group=group), colour="grey50")+
geom_polygon(aes(x=long,y=lat, group=group), fill="lightgreen")+
geom_point(data=pts, aes(x,y), colour="blue")+
geom_point(data=centroid, aes(x,y), colour="red", size=5)+
coord_fixed()
Finally, another way to do this (which I'd recommend, actually), is to just calculate the area weighted centroid. This is equivalent to what you are approximating, is more accurate, and much faster.
polys <- do.call(rbind,lapply(NC@polygons[[1]]@Polygons,
function(x)c(x@labpt,x@area)))
polys <- data.frame(polys)
colnames(polys) <- c("long","lat","area")
polys$area <- with(polys,area/sum(area))
centr <- with(polys,c(x=sum(long*area),y=sum(lat*area)))
centr # area weighted centroid
# x y
# -76.01378 36.40105
centroid # point weighted centroid (start= 50 X 50 points)
# x y
# 1 -76.01056 36.39671
You'll find that as you increase the number of points in the points-weighted centroid the result gets closer to the area-weighted centroid.
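To see the two estimates side by side without a shapefile, here is a minimal base R sketch. It is not the sp/rgeos code above: the per-ring centroid comes from the shoelace formula, the inside test is a plain ray-casting check, and the "county" is a made-up pair of disjoint squares standing in for a mainland plus a barrier island. All function names and the toy rings are assumptions for illustration, and holes are not handled.

```r
# Centroid and area of one ring via the shoelace formula.
# `xy` is a two-column matrix of vertices; the ring is closed implicitly.
ring_centroid <- function(xy) {
  x <- xy[, 1]; y <- xy[, 2]
  x2 <- c(x[-1], x[1]); y2 <- c(y[-1], y[1])
  cross <- x * y2 - x2 * y
  area <- sum(cross) / 2                      # signed area
  c(x = sum((x + x2) * cross) / (6 * area),
    y = sum((y + y2) * cross) / (6 * area),
    area = abs(area))
}

# Area-weighted centroid over several disjoint rings (no holes assumed).
area_weighted_centroid <- function(rings) {
  parts <- t(vapply(rings, ring_centroid, c(x = 0, y = 0, area = 0)))
  w <- parts[, "area"] / sum(parts[, "area"])
  c(x = sum(parts[, "x"] * w), y = sum(parts[, "y"] * w))
}

# Ray-casting point-in-ring test, vectorised over the points.
in_ring <- function(px, py, xy) {
  x <- xy[, 1]; y <- xy[, 2]
  x2 <- c(x[-1], x[1]); y2 <- c(y[-1], y[1])
  inside <- rep(FALSE, length(px))
  for (k in seq_along(x)) {
    crosses <- ((y[k] > py) != (y2[k] > py)) &
      (px < (x2[k] - x[k]) * (py - y[k]) / (y2[k] - y[k]) + x[k])
    inside <- xor(inside, crosses)
  }
  inside
}

# Point-weighted centroid from an n x n grid over the bounding box.
grid_centroid <- function(rings, n = 200) {
  all_xy <- do.call(rbind, rings)
  g <- expand.grid(x = seq(min(all_xy[, 1]), max(all_xy[, 1]), length.out = n),
                   y = seq(min(all_xy[, 2]), max(all_xy[, 2]), length.out = n))
  keep <- Reduce(`|`, lapply(rings, function(r) in_ring(g$x, g$y, r)))
  c(x = mean(g$x[keep]), y = mean(g$y[keep]))
}

# Toy "county" (hypothetical): a 2 x 2 mainland and a 1 x 1 island.
rings <- list(
  mainland = cbind(c(0, 2, 2, 0), c(0, 0, 2, 2)),
  island   = cbind(c(3, 4, 4, 3), c(0, 0, 1, 1))
)
centr_area <- area_weighted_centroid(rings)   # exact for these rings: (1.5, 0.9)
centr_grid <- grid_centroid(rings, n = 200)   # approaches centr_area as n grows
rbind(centr_area, centr_grid)
```

Because every ring contributes in proportion to its area, disjoint parts such as islands pull the centroid toward them instead of being ignored, which is the behaviour the original approach was missing.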

Plot spatial area defined by multiple polygons

I have a SpatialPolygonsDataFrame with 11589 spatial objects of class "polygons". 10699 of those objects consists of exactly 1 polygon. However, the rest of those spatial objects consist of multiple polygons (2 to 22).
If an object consists of multiple polygons, three scenarios are possible:
1) The additional polygons could describe a "hole" in the spatial area described by the first polygon .
2) The additional polygons could also describe additional geographic areas, i.e. the shape of the region is quite complex and described by putting together multiple parts.
3) Often it is a mix of both, 1) and 2).
My question is: How to plot such a spatial object which is based on multiple polygons?
I have been able to extract and plot the information of the first polygon, but I have not figured out how plot all polygons of such a complex spatial object at once.
Below you find my code. The problem is the 15th last line.
# Load packages
# ---------------------------------------------------------------------------
library(maptools)
library(rgdal)
library(ggmap)
library(rgeos)
# Get data
# ---------------------------------------------------------------------------
# Download shape information from the internet
URL <- "http://www.geodatenzentrum.de/auftrag1/archiv/vektor/vg250_ebenen/2012/vg250_2012-01-01.utm32s.shape.ebenen.zip"
td <- tempdir()
setwd(td)
temp <- tempfile(fileext = ".zip")
download.file(URL, temp)
unzip(temp)
# Get shape file
shp <- file.path(tempdir(),"vg250_0101.utm32s.shape.ebenen/vg250_ebenen/vg250_gem.shp")
# Read in shape file
x <- readShapeSpatial(shp, proj4string = CRS("+init=epsg:25832"))
# Transform the geocoding from UTM to Longitude/Latitude
x <- spTransform(x, CRS("+proj=longlat +datum=WGS84"))
# Extract relevant information
att <- attributes(x)
Poly <- att$polygons
# Pick a geographic area which consists of multiple polygons
# ---------------------------------------------------------------------------
# Output a frequency table of areas with N polygons
order.of.polygons.in.shp <- sapply(x@polygons, function(x) x@plotOrder)
length.vector <- unlist(lapply(order.of.polygons.in.shp, length))
table(length.vector)
# Get geographic area with the most polygons
polygon.with.max.polygons <- which(length.vector==max(length.vector))
# Check polygon
# x@polygons[polygon.with.max.polygons]
# Get shape for the geographic area with the most polygons
### HERE IS THE PROBLEM ###
### ONLY information for the first polygon is extracted ###
Poly.coords <- data.frame(slot(Poly[[polygon.with.max.polygons ]]@Polygons[[1]], "coords"))
# Plot
# ---------------------------------------------------------------------------
## Calculate centroid for the first polygon of the specified area
coordinates(Poly.coords) <- ~X1+X2
proj4string(Poly.coords) <- CRS("+proj=longlat +datum=WGS84")
center <- gCentroid(Poly.coords)
# Download a map which is centered around this centroid
al1 = get_map(location = c(lon=center@coords[1], lat=center@coords[2]), zoom = 10, maptype = 'roadmap')
# Plot map
ggmap(al1) +
geom_path(data=as.data.frame(Poly.coords), aes(x=X1, y=X2))
I may be misinterpreting your question, but it's possible that you are making this much harder than necessary.
(Note: I had trouble dealing with the .zip file in R, so I just downloaded and unzipped it in the OS).
library(rgdal)
library(ggplot2)
setwd("< directory with shapefiles >")
map <- readOGR(dsn=".", layer="vg250_gem", p4s="+init=epsg:25832")
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))
nPolys <- sapply(map@polygons, function(x)length(x@Polygons))
region <- map[which(nPolys==max(nPolys)),]
plot(region, col="lightgreen")
# using ggplot...
region.df <- fortify(region)
ggplot(region.df, aes(x=long,y=lat,group=group))+
geom_polygon(fill="lightgreen")+
geom_path(colour="grey50")+
coord_fixed()
Note that ggplot does not deal with the holes properly: geom_path(...) works fine, but geom_polygon(...) fills the holes. I've had this problem before (see this question), and based on the lack of response it may be a bug in ggplot. Since you are not using geom_polygon(...), this does not affect you...
In your code above, you would replace the line:
ggmap(al1) + geom_path(data=as.data.frame(Poly.coords), aes(x=X1, y=X2))
with:
ggmap(al1) + geom_path(data=region.df, aes(x=long,y=lat,group=group))

Overlay decimal coordinates (New Jersey) on NAD83 Stateplane polygon in R

I am trying to make a plot with points (decimal coordinates in New Jersey) on polyline shapefile with projection NAD 83 Stateplane (feet) (New Jersey). How can I do it? So far, I could plot the points and the shapefile separately but cannot overlay.
Plotted the shapefile using the following code:
ogrListLayers("Counties.shp") # Shows the available layers for the shapefile "Counties"
shape=readOGR("Counties.shp", layer="Counties") # Load the layer of the shapefile
plot(shape) # Plots the shapefile
Plotted points (vectors are lat1,long1) using the following code after transforming the points into Stateplane in ArcGIS:
dpts <- as.data.frame(cbind(long1,lat1))
plot(dpts)
How can I overlay these points on the polyline shapefile?
Ultimately, I will have multiple sets of points which I want to plot on the shapefile as circles whose size would be dependent on values associated with the points. e.g. if each point represents a town, I want a bigger circle for a town having higher population.
You didn't provide any data, so this may be a partial answer.
Using the ggplot package it is easy to create layered maps. This map, of universities in NJ, was created with the code snippet that follows. It demonstrates plotting points and boundaries on the same map, and sizing the points based on a datum of the university (here, enrollment).
library(ggplot2)
library(rgdal)
setwd("<directory containing your data and maps>")
states <- readOGR(dsn=".",layer="tl_2013_us_state")
nj.map <- states[states$NAME=="New Jersey",]
univ.map <- readOGR(dsn=".",layer="NJ_College_Univ_NAD83njsp")
nj.df <- fortify(nj.map)
univ.df <- univ.map@data
univ.df$ENROLL <- as.numeric(as.character(univ.df$ENROLL))
# create the layers
ggMap <- ggplot(nj.df)
ggMap <- ggMap + geom_path(aes(x=long,y=lat, group=group)) # NJ boundary
ggMap <- ggMap + geom_point(data=univ.df, aes(x=X, y=Y, size=ENROLL),color="red", alpha=0.7)
ggMap <- ggMap + coord_fixed()
ggMap <- ggMap + scale_size_continuous(breaks=c(5000,10000,15000,20000,25000,30000), range=c(0,10))
# render the map
ggMap
The TIGER/Line shapefile of US States was obtained here. The NJ Universities were obtained here.
Explanation:
The call to ggplot(...) defines the NJ map as the default dataset.
The call to geom_path(...) adds a layer to draw the NJ boundary.
The call to geom_point(...) adds a point layer locating the universities, with point size proportional to enrollment.
The call to coord_fixed(...) ensures that the map will not be distorted.
The call to scale_size_continuous(...) establishes breaks for the legend labels.

Plot points and polygons in single window in spatstat

I am attempting to plot coordinate data that includes both polygons and points (in separate files) in a single window so that I can later run tests to see what patterns exist. I am fairly new to R (and very new to spatstat) so I really appreciate any advice on how best to create a single plot with multiple types of spatial data.
library(sp)
library(maptools)
library(mgcv)
library(spatstat)
##read in the shapefiles (from Pathfinder)
data<-readShapeSpatial("SouthC1")
regions<-slot(data, "polygons")
regions<-lapply(regions, function(data){SpatialPolygons(list(data))})
windows<-lapply(regions, as.owin)
spatstat.options(checkpolygons=FALSE)
y<-as(data, "owin")
spatstat.options(checkpolygons=TRUE)
points<-readShapeSpatial("Plants1")
##Define points and polygons as objects that can be read into owin?
I suspect that I'm suffering from novice-itis and that reading different types of spatial data into a single window is not difficult. Sorry.
Side note: some of the polygons do overlap, which is why I don't want spatstat to check the polygons. I am aware that this creates complication, but that is not a pressing concern.
As an alternative you might plot your polygons first using
plot(regions)
points("Plants1")
If you use spplot from the sp package, you can use the sp.layout argument. Note the examples below combine a spatial grid with spatial points, but the exact same techniques can be used for points and polygons.
library(sp)
library(lattice)
trellis.par.set(sp.theme()) # sets bpy.colors() ramp
data(meuse)
coordinates(meuse) <- ~x+y
data(meuse.grid)
gridded(meuse.grid) <- ~x+y
spplot(meuse.grid, c("ffreq"), sp.layout = list("sp.points", meuse))
or use ggplot2 (my preference):
library(ggplot2)
# Note that you can use `fortify` to transform a SpatialPolygons* object to a data.frame
# for `ggplot2`.
pt_data = as.data.frame(meuse)
grid_data = as.data.frame(meuse.grid)
ggplot(grid_data, aes(x = x, y = y)) + geom_tile(aes(fill = ffreq)) +
geom_point(data = pt_data)
In spatstat you can plot objects on top of each other using the layered class.
In your example, regions is a list of windows (class owin). Just type
plot(as.layered(as.solist(regions)))
Here as.solist converts a vanilla list to a list of spatial objects; as.layered converts this to a layered object.

How to plot contours on a map with ggplot2 when data is on an irregular grid?

Sorry for the wall of text, but I explain the question, include the data, and provide some code :)
QUESTION:
I have some climate data that I want to plot using R. I am working with data that is on an irregular, 277x349 grid, where (x=longitude, y=latitude, z=observation). Say z is a measure of pressure (500 hPa height (m)). I tried to plot contours (or isobars) on top of a map using the package ggplot2, but I am having some trouble due to the structure of the data.
The data comes from a regular, evenly spaced out 277x349 grid on a Lambert conformal projection, and for each grid point we have the actual longitude, latitude, and pressure measurement. It is a regular grid on the projection, but if I plot the data as points on a map using the actual longitude and latitude where the observations were recorded, I get the following:
I can make it look a little nicer by translating the rightmost piece to the left (maybe this can be done with some function, but I did this manually) or by ignoring the rightmost piece. Here is the plot with the right piece translated to the left:
(An aside) Just for fun, I tried my best to re-apply the original projection. I have some of the parameters for applying the projection from the data source, but I do not know what these parameters mean. Also, I do not know how R handles projections (I did read the help files...), so this plot was produced through some trial and error:
I tried to add the contour lines using the geom_contour function in ggplot2, but it froze my R. After trying it on a very small subset of the data and doing some googling, I found out that ggplot was complaining because the data was on an irregular grid. I also found out that this is the reason geom_tile was not working. I am guessing that I have to make my grid of points evenly spaced out - probably by projecting it back into the original projection (?), or by evenly spacing out my data by either sampling a regular grid (?) or by extrapolating between points (?).
My questions are:
How can I draw contours on top of the map (preferably using ggplot2) for my data?
Bonus questions:
How do I transform my data back to a regular grid on the Lambert conformal projection? The parameters of the projection according to the data file include (mpLambertParallel1F=50, mpLambertParallel2F=50, mpLambertMeridianF=253, corners, La1=1, Lo1=214.5, Lov=253). I have no idea what these are.
How do I center my maps so that one side is not clipped (like in the first map)?
How do I make the projected plot of the map look nice (without the unnecessary parts of the map hanging around)? I tried adjusting the xlim and ylim, but it seems to apply the axes limits before projecting.
DATA:
I uploaded the data as rds files on Google drive. You can read in the files using the readRDS function in R.
lat2d: The actual latitude for the observations on the 2d grid
lon2d: The actual longitude for the observations on the 2d grid
z500: The observed height (m) where pressure is 500 millibars
dat: The data arranged in a nice data frame (for ggplot2)
I am told that the data is from the North American Regional Reanalysis data base.
MY CODE (THUS FAR):
library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
library(maptools)
gpclibPermit()
library(mapproj)
lat2d <- readRDS('lat2d.rds')
lon2d <- readRDS('lon2d.rds')
z500 <- readRDS('z500.rds')
dat <- readRDS('dat.rds')
# Get the map outlines
outlines <- as.data.frame(map("world", plot = FALSE,
xlim = c(min(lon2d), max(lon2d)),
ylim = c(min(lat2d), max(lat2d)))[c("x","y")])
worldmap <-geom_path(aes(x, y), inherit.aes = FALSE,
data = outlines, alpha = 0.8, show_guide = FALSE)
# The layer for the observed variable
z500map <- geom_point(aes(x=lon, y=lat, colour=z500), data=dat)
# Plot the first map
ggplot() + z500map + worldmap
# Fix the wrapping issue
dat2 <- dat
dat2$lon <- ifelse(dat2$lon>0, dat2$lon-max(dat2$lon)+min(dat2$lon), dat2$lon)
# Remake the outlines
outlines2 <- as.data.frame(map("world", plot = FALSE,
xlim = c(max(min(dat2$lon)), max(dat2$lon)),
ylim = c(min(dat2$lat), max(dat2$lat)))[c("x","y")])
worldmap2 <- geom_path(aes(x, y), inherit.aes = FALSE,
data = outlines2, alpha = 0.8, show_guide = FALSE)
# Remake the variable layer
ggp <- ggplot(aes(x=lon, y=lat), data=dat2)
z500map2 <- geom_point(aes(colour=z500), shape=15)
# Try a projection
projection <- coord_map(projection="lambert", lat0=30, lat1=60,
orientation=c(87.5,0,255))
# Plot
# Without projection
ggp + z500map2 + worldmap2
# With projection
ggp + z500map + worldmap + projection
Thanks!
UPDATE 1
Thanks to Spacedman's suggestions, I think I have made some progress. Using the raster package, I can read directly from a netCDF file and plot the contours:
library(raster)
# Note: ncdf4 may be a pain to install on windows.
# Try installing package 'ncdf' if this doesn't work
library(ncdf4)
# band=13 corresponds to the layer of interest, the 500 millibar height (m)
r <- raster(filename, band=13)
plot(r)
contour(r, add=TRUE)
Now all I need to do is get the map outlines to show under the contours! It sounds easy, but I'm guessing that the parameters for the projection need to be inputted correctly to do things properly.
The file in netcdf format, for those that are interested.
UPDATE 2
After much sleuthing, I made some more progress. I think I have the proper PROJ4 parameters now. I also found the proper values for the bounding box (I think). At the very least, I am able to roughly plot the same area as I did in ggplot.
# From running proj +proj=lcc +lat_1=50.0 +lat_2=50.0 +units=km +lon_0=-107
# in the command line and inputting the lat/lon corners of the grid
x2 <- c(-5628.21, -5648.71, 5680.72, 5660.14)
y2 <- c( 1481.40, 10430.58,10430.62, 1481.52)
plot(x2,y2)
# Read in the data as a raster
p4 <- "+proj=lcc +lat_1=50.0 +lat_2=50.0 +units=km +lon_0=-107 +lat_0=1.0"
r <- raster(nc.file.list[1], band=13, crs=CRS(p4))
r
# For some reason the coordinate system is not set properly
projection(r) <- CRS(p4)
extent(r) <- c(range(x2), range(y2))
r
# The contour map on the original Lambert grid
plot(r)
# Project to the lon/lat
p <- projectRaster(r, crs=CRS("+proj=longlat"))
p
extent(p)
str(p)
plot(p)
contour(p, add=TRUE)
Thanks to Spacedman for his help. I will probably start a new question about overlaying shapefiles if I can't figure things out!
Ditch the maps and ggplot packages for now.
Use package:raster and package:sp. Work in the projected coordinate system where everything is nicely on a grid. Use the standard contouring functions.
For map background, get a shapefile and read into a SpatialPolygonsDataFrame.
The names of the parameters for the projection don't match up with any standard names, and I can only find them in NCL code such as this
whereas the standard projection library, PROJ.4, wants these
So I think:
p4 = "+proj=lcc +lat_1=50 +lat_2=50 +lat_0=0 +lon_0=253 +x_0=0 +y_0=0"
is a good stab at a PROJ4 string for your data.
Now if I use that string to reproject your coordinates back (using rgdal:spTransform) I get a pretty regular grid, but not quite regular enough to transform to a SpatialPixelsDataFrame. Without knowing the original regular grid or the exact parameters that NCL uses we're a bit stuck for absolute precision here. But we can blunder on a bit with a good guess - basically just take the transformed bounding box and assume a regular grid in that:
coordinates(dat)=~lon+lat
proj4string(dat)=CRS("+init=epsg:4326")
dat2=spTransform(dat,CRS(p4))
bb=bbox(dat2)
lonx=seq(bb[1,1], bb[1,2],len=277)
laty=seq(bb[2,1], bb[2,2],len=349)
r=raster(list(x=laty,y=lonx,z=md))
plot(r)
contour(r,add=TRUE)
Now if you get a shapefile of your area you can transform it to this CRS to do a country overlay... But I would definitely try and get the original coordinates first.
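To make the guessed PROJ4 string above a bit less of a black box, here is a rough base R sketch of the forward Lambert conformal conic formulas for the tangent case (a single standard parallel, as in lat_1 = lat_2 = 50). It is an illustration only: it assumes a sphere of radius 6371 km, whereas PROJ uses an ellipsoid unless told otherwise, so the numbers will differ somewhat from spTransform. The function name `lcc_forward` and its defaults are my own assumptions, not part of any package.

```r
# Forward Lambert conformal conic projection on a sphere, tangent case
# (one standard parallel lat_1). Returns x/y in km for radius in km.
# Defaults mirror the guessed string: lat_1 = 50, lat_0 = 0, lon_0 = 253.
lcc_forward <- function(lon, lat, lat_1 = 50, lat_0 = 0, lon_0 = 253,
                        radius = 6371) {
  rad <- pi / 180
  n <- sin(lat_1 * rad)                         # cone constant
  big_f <- cos(lat_1 * rad) * tan(pi / 4 + lat_1 * rad / 2)^n / n
  rho  <- radius * big_f / tan(pi / 4 + lat * rad / 2)^n
  rho0 <- radius * big_f / tan(pi / 4 + lat_0 * rad / 2)^n
  # Wrap the longitude difference into [-180, 180) so 253 and -107 agree
  dlon <- ((lon - lon_0 + 180) %% 360) - 180
  theta <- n * dlon * rad
  data.frame(x = rho * sin(theta), y = rho0 - rho * cos(theta))
}

# The projection origin maps to (0, 0); points on the central meridian have x = 0.
origin  <- lcc_forward(253, 0)
centre  <- lcc_forward(-107, 50)
# Points 10 degrees either side of the central meridian mirror each other in x.
east    <- lcc_forward(263, 40)
west    <- lcc_forward(243, 40)
rbind(origin, centre, east, west)
```

The longitude wrap is the part most relevant to the question: it lets longitudes given as 0 to 360 and as -180 to 180 be mixed freely, which is the same issue behind the clipped piece in the first map.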
