Misalignment between POSGAR94 polygons and WGS84 leaflet map - r

I need to draw a bunch of polygons from this dataset on a leaflet map in R:
The coordinates are in POSGAR94, but I need them in WGS84 to plot on leaflet (over an OpenStreetMap layer) and to compare them with other data (already in WGS84):
library(rgdal)
library(magrittr)
library(leaflet)
complete_data <- readOGR("data_folder",
GDAL1_integer64_policy = TRUE)
complete_data <- spTransform(complete_data,
CRS("+proj=longlat +datum=WGS84 +no_defs"))
I filter the data to keep only a section of it:
int_data <- complete_data[grep("^0604219|^0604201|060421102|060421103", complete_data@data$link), ]
And I plot:
leaflet(int_data, options = leafletOptions(minZoom = 12, maxZoom = 18)) %>%
setMaxBounds(lat1 = -37.1815, lng1 = -58.5581, lat2 = -37.1197, lng2 = -58.4297) %>%
addTiles()%>%
addPolygons(color = "#3498db", weight = 2, smoothFactor = 0.5,
opacity = 0.5, fillOpacity = 0.1,
highlightOptions = highlightOptions(color = "black", weight = 3,
bringToFront = TRUE))
The current result looks like:
All the polygons are offset by about a block; it's mostly visible on the city perimeter. Here's what that polygon should look like:
My questions are:
Am I making a mistake with the projection? Or does spTransform introduce an error in the coordinates?
or
Is my code ok, but the data is wrong?
EDIT: This is the output of st_crs before and after the conversion:
BEFORE
Coordinate Reference System:
User input: +proj=tmerc +lat_0=-90 +lon_0=-66 +k=1 +x_0=3500000 +y_0=0 +ellps=WGS84 +units=m +no_defs
wkt:
PROJCS["unnamed",
GEOGCS["WGS 84",
DATUM["unknown",
SPHEROID["WGS84",6378137,298.257223563]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",-90],
PARAMETER["central_meridian",-66],
PARAMETER["scale_factor",1],
PARAMETER["false_easting",3500000],
PARAMETER["false_northing",0],
UNIT["Meter",1]]
AFTER
Coordinate Reference System:
User input: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
wkt:
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]]

This looks like an issue with the dataset. I'm looking at it in QGIS, along with some OSM basemap, and around Buenos Aires everything seems to fit the road network nicely:
But going a bit south (e.g. around Mar del Plata coastline) shows an obvious misalignment:
Is my code ok, but the data is wrong?
Since the same problem can be seen using a completely different method, it's safe to say that your code is OK, and working as expected.
We can say that the dataset does not line up with an OSM basemap once reprojected. However, we cannot say that the data is wrong. If we load your dataset along with some reference data from ide.ign.gob.ar (department limits), the data also doesn't match:
In fact, superimposing OSM, IDE-IGN and INDEC data means three different data sources which don't match:
This is normal. There is no easy definition of "right" when it comes to alignment of GIS datasets, and there are a lot of factors in play: collection criteria, orthorectification parameters, datum shifts, projection warping and even continental drift, among others.
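One way to check that the reprojection step itself is not the source of the offset is a round trip test. The sketch below assumes the sf package (the question used rgdal) and an arbitrary test point near Mar del Plata that I picked for illustration. The projected CRS string is copied from the st_crs output in the question.

```r
library(sf)

# Arbitrary test point near Mar del Plata, in WGS84 longitude/latitude
# (chosen for illustration only, not taken from the dataset).
p_ll <- st_sfc(st_point(c(-57.55, -38.00)), crs = 4326)

# Source CRS of the dataset, copied from the st_crs() output in the question.
src_crs <- "+proj=tmerc +lat_0=-90 +lon_0=-66 +k=1 +x_0=3500000 +y_0=0 +ellps=WGS84 +units=m +no_defs"

# Project to the dataset's CRS and back again.
p_proj <- st_transform(p_ll, src_crs)
p_back <- st_transform(p_proj, 4326)

# Largest coordinate difference after the round trip, in degrees.
round_trip_error <- max(abs(st_coordinates(p_back) - st_coordinates(p_ll)))
round_trip_error
```

If the round trip error is many orders of magnitude smaller than a city block, the projection arithmetic can be ruled out, which points back to differences between the data sources.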

Related

ggplot can find lng and lat in dataset but leaflet cannot, "Need '+proj=longlat +datum=WGS84' "

I have a dataset from the public repo https://github.com/highsource/verbundkarte
Reading the dataset with st_read and plotting it with ggplot yields a beautiful map with correct lng and lat data.
df <- st_read("~/Verkehrsverbunde.shp")
map <- ggplot(df) + geom_sf(aes(fill=SHORTNAME))
I therefore assume, that the lng/lat values are included in the variable df$geometry. However, if I use leaflet, no matter what I try I end up with an error. For instance
df%>% leaflet() %>%
addProviderTiles("CartoDB") %>%
addPolygons(label = htmlEscape(df$SHORTNAME)) %>%
setView(lng = 10.3, lat = 51.9, zoom = 5.1)
ends up with
Warning messages:
1: sf layer is not long-lat data
2: sf layer has inconsistent datum (+proj=tmerc +lat_0=0 +lon_0=9 +k=1 +x_0=3500000 +y_0=0 +datum=potsdam +units=m +no_defs ).
Need '+proj=longlat +datum=WGS84'
I found this beautiful conversation of which I understood basically nothing. Reading the data with readOGR as suggested here doesn't solve my problem.
How do I force leaflet to assume the same longlat and EPSG as ggplot?
The data in your Verkehrsverbunde.shp shapefile is in a custom transverse Mercator projection, where coordinates are expressed in meters. By contrast, r-leaflet expects data to be expressed in degrees of latitude-longitude, which would look like an equirectangular projection if/when plotted.
I'll guess that, contrary to your belief, ggplot is not using latitude and longitude in degrees, but rather northing and easting values in meters. The representation of the data might look similar for small areas.
In GIS terms, your data is using the EPSG:31467 CRS (Coordinate Reference System), as can be inferred from proj=tmerc and datum=potsdam; r-leaflet expects data in the EPSG:4326 CRS.
The approach here would be to reproject your data, so that coordinates are in latitude-longitude as expected. There are plenty of ways to do this: running ogr2ogr -t_srs epsg:4326 Verkehrsverbunde-latlng.shp Verkehrsverbunde.shp on a command line (output file first, then input file), using R, or using QGIS, amongst other methods.
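For the R route, a minimal sketch with sf could look like the following. The square polygon is a hypothetical stand-in for the shapefile so that the example runs on its own; with the real data you would apply the same st_transform call to the object returned by st_read.

```r
library(sf)

# Hypothetical stand-in for the shapefile: a 10 km square in EPSG:31467,
# with coordinates in metres (easting, northing).
sq <- st_polygon(list(rbind(
  c(3500000, 5700000),
  c(3510000, 5700000),
  c(3510000, 5710000),
  c(3500000, 5710000),
  c(3500000, 5700000)
)))
df_gk <- st_sf(SHORTNAME = "demo", geometry = st_sfc(sq, crs = 31467))

# Reproject to WGS84 longitude/latitude, which is what leaflet expects.
df_ll <- st_transform(df_gk, 4326)

st_is_longlat(df_gk)  # projected input
st_is_longlat(df_ll)  # reprojected output
st_bbox(df_ll)        # bounding box now in degrees
```

The reprojected object can then be passed to leaflet() in place of the original one, and the "sf layer is not long-lat data" warning should no longer appear.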

Visual bug when changing Robinson projection's central meridian with ggplot2

I am attempting to project a world map in a Robinson projection where the central meridian is different from 0. According to this Stack Overflow thread, it should be an easy thing (albeit the example uses sp).
Here is my reproducible code:
library(sf)
library(ggplot2)
library(rnaturalearth)
world <- ne_countries(scale = 'small', returnclass = 'sf')
# Notice +lon_0=180 instead of 0
world_robinson <- st_transform(world, crs = '+proj=robin +lon_0=180 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs')
ggplot() +
geom_sf(data = world_robinson)
This is the result. Polygons are closing themselves from one side to the other of the projection.
Trying with sp gives the same effect. I also tried with a shapefile including only polygons from coastlines (no political borders) from http://www.naturalearthdata.com/ and the effect is similar.
I tried to run my snippet on two independent R installations on Mac OS X and Ubuntu 18.04.
Polygons that straddle the meridian line get stretched all the way across the map, after the transformation. One way to get around this is to split these polygons down the middle, so that all polygons are either completely to the west or east of the line.
library(magrittr) # provides the %>% pipe used below
# define a long & slim polygon that overlaps the meridian line & set its CRS to match
# that of world
polygon <- st_polygon(x = list(rbind(c(-0.0001, 90),
c(0, 90),
c(0, -90),
c(-0.0001, -90),
c(-0.0001, 90)))) %>%
st_sfc() %>%
st_set_crs(4326)
# modify world dataset to remove overlapping portions with world's polygons
world2 <- world %>% st_difference(polygon)
# perform transformation on modified version of world dataset
world_robinson <- st_transform(world2,
crs = '+proj=robin +lon_0=180 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs')
# plot
ggplot() +
geom_sf(data = world_robinson)
This is an extension to Z.lin's answer (i.e. use that answer first to calculate world_robinson). However, there is another useful step that can be added. After projecting, regions that were comprised of more than one polygon because they cross from one side of the map to the other in the original projection (see Antarctica, Fiji and Russia) still have this split after reprojection. For example, here is a close up of Antarctica where we can see that it has a boundary on the prime meridian where none should be:
To stitch these regions back together, we can first find out which polygons are the problems by finding those that cross the prime meridian:
bbox = st_bbox(world_robinson)
bbox[c(1,3)] = c(-1e-5,1e-5)
polygon2 <- st_as_sfc(bbox)
crosses = world_robinson %>%
st_intersects(polygon2) %>%
sapply(length) %>%
as.logical %>%
which
Now we can select those polygons and set their buffer size to zero:
library(magrittr)
world_robinson[crosses,] %<>%
st_buffer(0)
ggplot(world_robinson) + geom_sf()
As we can see, the map no longer has splits down the prime meridian:

R - transition function for modelling surface water flow with gdistance

I am trying to model overland (surface) water flow from specified origin points to a single downslope goal point using the gdistance shortestPath function. I need help with defining the appropriate transitionFunction for this, as I need to make sure the least cost path only allows water to flow along the path to elevation cells of equal or lesser value than the previous cell. The transitionFunction in the example below selects the minimum elevation cell but, based on the transitionFunction I have defined, this value may still be greater than the previous cell value.
I realize that, when the above is defined as I want it, the path may terminate before reaching the goal point. This is fine, although I would ideally like to be able to preserve the path from the origin to wherever it terminates if possible.
Also, if anyone knows of a different R package capable of modelling this kind of thing, please let me know.
library(gdistance)
library(raster)
library(elevatr)
library(sp)
#load example DEM raster
data(lake)
elevation <- get_elev_raster(lake, z = 9)
#remove negative elevation values from raster
elevation[elevation < 0] <- NA
#create origin and goal points with same projection as elevation raster
origin <- SpatialPoints(cbind(1790000, 640000), proj4string = CRS("+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))
goal <- SpatialPoints(cbind(1820000, 540000), proj4string = CRS("+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))
#create df data and convert to SpatialPointsDataFrame
odf <- data.frame("flowreg" = 1)
gdf <- data.frame("flowreg" = 2)
origindf <- SpatialPointsDataFrame(origin, odf)
goaldf <- SpatialPointsDataFrame(goal, gdf)
trCost1 <- transition(elevation, transitionFunction=function(x) 1/min(x), directions=8)
trCost1gc <- geoCorrection(trCost1, type="c")
plot(raster(trCost1))
sPath1 <- shortestPath(trCost1gc, origin, goal,
output="SpatialLines")
plot(elevation)
plot(origindf, add = TRUE, col='red', cex = 5)
plot(goaldf, add = TRUE, col='green', cex = 5)
lines(sPath1)
I have found that the GRASS GIS r.drain function (accessed in R using rgrass7) or raster::flowPath achieves what I am trying to do in the above question.
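To illustrate the constraint itself (flow may only move to a lower cell, and the path may stop before reaching the goal), here is a minimal base R sketch of a steepest descent trace on an elevation matrix. It is not how r.drain or flowPath are implemented; trace_downhill and the toy elevation matrix are hypothetical names and data made up for this example. Steps to equal elevation cells are excluded so that the trace always terminates.

```r
# Trace a downhill path on an elevation matrix by repeatedly stepping to the
# lowest of the 8 neighbours, stopping when no neighbour is strictly lower.
# 'start' is c(row, col) and is assumed to be a non-NA cell.
trace_downhill <- function(elev, start) {
  path <- matrix(start, ncol = 2)
  cur <- start
  repeat {
    best <- NULL
    best_val <- elev[cur[1], cur[2]]
    for (dr in -1:1) {
      for (dc in -1:1) {
        if (dr == 0 && dc == 0) next
        rr <- cur[1] + dr
        cc <- cur[2] + dc
        # skip neighbours that fall outside the grid
        if (rr < 1 || rr > nrow(elev) || cc < 1 || cc > ncol(elev)) next
        v <- elev[rr, cc]
        if (!is.na(v) && v < best_val) {
          best <- c(rr, cc)
          best_val <- v
        }
      }
    }
    if (is.null(best)) break  # local minimum: the path ends here
    cur <- best
    path <- rbind(path, cur)
  }
  unname(path)
}

# Toy surface that slopes down towards the top-left corner.
elev <- outer(1:5, 1:5, "+")
path <- trace_downhill(elev, c(5, 5))
path        # visited cells as (row, col) pairs
elev[path]  # elevations along the path
```

Because the path is kept up to the cell where it stops, this also covers the case where the trace ends in a local minimum before reaching the goal.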

Create Distance to Shore (km) Variable from Lat Lon data?

I have a data frame with 3,000+ data points spread throughout the northern Gulf of Mexico (only a few are shown here). I am trying to create a new variable for distance to shore (km). I have a shapefile (gulf.shape) which I would like to use, but I'm not clear on how.
Here is some data
require(maptools)
require(sp)
library(rgdal)
library(lubridate)
df <- data.frame(Lat = c(26.84853, 28.38329, 28.00364,
29.53840, 29.32030, 26.81622, 25.28146),
Lon = c(-96.55716, -94.29307, -91.21581,
-88.42556, -84.20031, -83.89737, -82.95665))
and I load the shapefile (provided here).
gulf.shape <- "Shape\\stateshigh.shp"
gulf.shape <- maptools::readShapePoly(gulf.shape)
and a quick plot to visualize what I have.
plot(df$Lon, df$Lat,
xlim = c(-97.5, -80.7), ylim = c(25, 30.5),
xlab = "Longitude", ylab = "Latitude",
pch = 20, col="red", cex=1.5)
par(new=T)
sp::plot(gulf.shape, add= T,
xlim = c(-97.5, -80.7), ylim = c(25, 30.5),
xlab = "Longitude", ylab = "Latitude",
col = "gray")
I found a stack overflow post (here), which allowed me to get an answer using the code below. The shape file they use is available here.
require(rgdal) # for readOGR(...); loads package sp as well
require(rgeos) # for gDistance(...)
require(parallel) # for detect cores
require(foreach) # for foreach(...)
require(snow) # for makeCluster(...)
require(doSNOW) # for registerDoSNOW(...)
wgs.84 <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
mollweide <- "+proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"
sp.points <- SpatialPoints(df[,c("Lon","Lat")], proj4string=CRS(wgs.84))
coast <- rgdal::readOGR(dsn=".",layer="ne_10m_coastline",p4s=wgs.84);str(coast)
coast.moll <- spTransform(coast,CRS(mollweide))
point.moll <- spTransform(sp.points,CRS(mollweide))
no_cores <- detectCores()
cl <- makeCluster(no_cores,type="SOCK") # create a cluster with one worker per detected core
registerDoSNOW(cl) # register the cluster
get.dist.parallel <- function(n) {
foreach(i=1:n, .combine=c, .packages="rgeos", .inorder=TRUE,
.export=c("point.moll","coast.moll")) %dopar% gDistance(point.moll[i],coast.moll)
}
df$Dis.to.SHORE <- get.dist.parallel(length(sp.points))
df$Dis.to.SHORE <- df$Dis.to.SHORE/1000
df
plot(coast)
points(sp.points,pch=20,col="red")
However, I do not understand the CRS code used in wgs.84 and mollweide and this makes me uneasy about using the data generated with this code. I would also like to use just the gulf.shape file and not the whole world, since there was some suggestion in the previously mentioned stack overflow post that this would be better.
So my questions are:
Are the values I'm getting for distance to shore reasonably accurate (i.e. within 5 - 10m)?
How can I modify the code to utilize the gulf.shape file rather than the whole world?
Can anyone explain the CRS code or point me toward a good reference?
Note that I use parallel computing to speed things up since I have more than 6 data points in reality.
I'll try to answer your 3rd question.
CRS (Coordinate Reference System) is quite important in mapping as it defines the coordinate system of your points. Here's a helpful overview.
https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf
For your particular situation, when you change to a different shapefile, you'll need to (1) find out what the CRS is for your shapefile (gulf.shape); usually it's in the .prj file or the metadata that comes with the shapefile, (2) pick a CRS that's suitable for your goal; you are calculating distance, so an equidistant projection is likely the most helpful, and (3) transform from the original CRS to the target CRS before calculating the distance.
The code you cited was also doing this. The world shapefile came with the WGS84 CRS; the chosen target CRS was Mollweide; and it converted WGS84 to Mollweide using the spTransform() function.
On another note, related to your 1st question, the accuracy of your calculation depends on the CRS you use, but also on the scale of your shapefile and the precision of your points (lat/long).
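The three steps can be sketched with sf as follows. This is only an illustration under stated assumptions: the coastline is an invented line along 30° N (not gulf.shape), and UTM zone 15N (EPSG:32615) is my own choice of a metric CRS for the northern Gulf, used as a stand-in for whichever projection suits your data. Two of the points from the question are reused.

```r
library(sf)

# Two of the points from the question, as WGS84 longitude/latitude.
pts <- st_as_sf(
  data.frame(Lon = c(-94.29307, -88.42556), Lat = c(28.38329, 29.53840)),
  coords = c("Lon", "Lat"), crs = 4326
)

# Invented coastline for illustration: vertices every 0.5 degrees along 30 N.
coast <- st_sfc(
  st_linestring(cbind(seq(-97, -82, by = 0.5), 30)),
  crs = 4326
)

# (1) inspect the CRS of the input
st_crs(coast)$proj4string

# (2) pick a projected CRS in metres (assumption: UTM zone 15N)
target_crs <- 32615

# (3) transform both layers, then measure
pts_m   <- st_transform(pts, target_crs)
coast_m <- st_transform(coast, target_crs)
dist_km <- as.numeric(st_distance(pts_m, coast_m)) / 1000
dist_km
```

With a real coastline, the resolution of the shapefile will usually limit the accuracy more than the choice between reasonable projections does.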

Map raw data and mean data based on the shapefile

I have the dataset (pts) like this:
x <- seq(-124.25,length=115,by=0.5)
y <- seq(26.25,length=46,by=0.5)
z = 1:5290
longlat <- expand.grid(x = x, y = y) # Create an X,Y grid
pts=data.frame(longlat,z)
names(pts) <- c( "x","y","data")
I know that I can plot the data frame (pts) on a map by doing:
library(sp)
library(rgdal)
library(raster)
library(maps)
coordinates(pts)=~x+y
proj4string(pts)=CRS("+init=epsg:4326") # set it to long, lat
pts = spTransform(pts,CRS(" +init=epsg:4326 +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0"))
pts <- as(pts, "SpatialPixelsDataFrame")
r = raster(pts)
projection(r) = CRS(" +init=epsg:4326 +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0")
plot(r)
map("usa",add=T)
Now I would like to create a separate map which shows the means of pts across different regions. The shapefile I want to use is from ftp://ftp.epa.gov/wed/ecoregions/cec_na/NA_CEC_Eco_Level2.zip , however, this is a North America map. How can I create a map showing only the US based on this North America map? Or is there a better way to do this? Thanks so much.
I think that cutting out the non-US data based on the data in the shapefile alone would be hard, since the regions do not correspond to political boundaries - that could be done with rgeos though.
Assuming that "eco" is a SpatialPolygonsDataFrame read in by rgdal::readOGR or maptools::readShapeSpatial, see the available key data for indexing:
sapply(as.data.frame(eco), function(x) if(!is.numeric(x)) unique(x) else NULL)
If you just want to plot it, set up a map with only the US region to start with and then overplot.
library(maps)
map("usa", col = "transparent")
We see that the data is in Lambert Azimuthal Equal Area:
proj4string(eco)
[1] " +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs"
So
require(rgdal)
eco.laea <- spTransform(eco, CRS("+proj=longlat +ellps=WGS84"))
plot(eco.laea, add = TRUE)
If you want to plot in the original Lambert Azimuthal Equal Area you'll need to get the bounding box in that projection and start the plot based on that, I just used existing data to make an easy example. I'm pretty sure the data could also be cropped with rgeos against another boundary too, but depends what you actually want.

Resources