I'm trying to better understand how various R packages calculate areas when in lat-long geographic coordinate reference systems, and which functions provide the most accurate estimates.
The area estimate with the s2 geometry library turned on in {sf} differs from the estimate when s2 is turned off and GEOS is used. The {sf} GEOS area calculation is the same as {terra}'s estimate.
Which is the most accurate way to calculate area?
library(terra)
library(sf)
# create a polygon using terra
p <- vect('POLYGON ((2 2, 7 6, 4 9, 2 2))', crs = 'EPSG:4326')
# calculate area with terra
expanse(p, unit = 'km')
# 165515 km2
# calculate area with sf
st_area(st_as_sf(p))/1e6
# 166235 km2
# turn off s2, then terra and sf area estimates are the same
sf_use_s2(FALSE)
round((expanse(p, unit = 'km')) - (as.numeric(st_area(st_as_sf(p))/1e6)))
It appears that the S2 library represents the earth as a sphere. The webpage states that
the S2 library represents all data on a three-dimensional sphere
The high precision GeographicLib uses a spheroid to represent the earth's shape. GeographicLib was first used by "geosphere" and it is now also used by "terra", and by "sf" when S2 is turned off.
The earth bulges at the equator and that is why you can better approximate the earth's shape with a spheroid than with a sphere. This suggests that the computation is less precise when S2 is used.
GEOS is not used in this context, as it can only be used to compute areas for planar polygons, not for lon/lat polygons.
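To put a rough number on the bulge, here is a minimal base R sketch. It assumes the standard WGS84 ellipsoid parameters (not quoted in the question) and reuses the two area estimates reported above; the variable names are only illustrative.

```r
# WGS84 ellipsoid parameters (standard published values, assumed here)
a <- 6378137.0            # semi-major (equatorial) axis, metres
f <- 1 / 298.257223563    # flattening
b <- a * (1 - f)          # semi-minor (polar) axis, metres

# How much longer the equatorial radius is than the polar radius
bulge_km <- (a - b) / 1000

# Relative gap between the two area estimates reported in the question
area_sphere   <- 166235   # km2, sf with s2 turned on
area_spheroid <- 165515   # km2, terra, and sf with s2 turned off
rel_diff_pct  <- 100 * (area_sphere - area_spheroid) / area_spheroid

cat(sprintf("equatorial bulge: %.1f km\n", bulge_km))
cat(sprintf("sphere vs spheroid area difference: %.2f%%\n", rel_diff_pct))
```

Both figures are well under one percent, so the disagreement between the two area estimates is of the same order as the flattening of the spheroid.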
I'm working with Swiss meteo data stored in netCDF files. Example data are available here.
After downloading and reading the data
library(ncdf4)
# download.file() needs a destination file name
download.file(url = "https://www.meteoswiss.admin.ch/content/dam/meteoswiss/de/Ungebundene-Seiten/Produkte/doc/tnorm9120.zip", destfile = "tnorm9120.zip")
unzip("tnorm9120.zip")
filename <- "TnormM9120_ch01r.swiss.lv95_000001010000_000012010000.nc"
tnorm9120 <- nc_open(filename)
tnorm9120
I get a file of this form:
File data-raw/tnorm9120/TnormM9120_ch01r.swiss.lv95_000001010000_000012010000.nc (NC_FORMAT_CLASSIC):
5 variables (excluding dimension variables):
float swiss_lv95_coordinates[]
_FillValue: -1
grid_mapping_name: Oblique Mercator (LV95 - CH1903+)
longitude_of_projection_center: 7.43958333
latitude_of_projection_center: 46.9524056
false_easting: 2600000
false_northing: 1200000
inverse_flattening: 299.1528128
semi_major_axis: 6377397.155
double climatology_bounds[ncb,time]
units: months since 1991-01-01 00:00:00
_FillValue: -1
float TnormM9120[E,N,time]
units: degree
_FillValue: -999.989990234375
grid_mapping: swiss_lv95_coordinates
coordinates: lon lat
long_name: mean monthly temperature 1991-2020
grid_name: ch01r.swiss.lv95
version: v1.4
prod_date: 2021-09-30 17:40:40
cell_methods: time: mean within years time: mean over years
float lon[E,N]
units: degrees_east
_FillValue: NaN
long_name: longitude coordinate
standard_name: longitude
float lat[E,N]
units: degrees_north
_FillValue: NaN
long_name: latitude coordinate
standard_name: latitude
4 dimensions:
ncb Size:2
long_name: ncb
time Size:12 *** is unlimited ***
units: months since 1991-01-01 00:00:00
long_name: time
axis: T
calendar: standard
climatology: climatology_bounds
E Size:370
units: meters_east
long_name: swiss easting (lv95)
standard_name: projection x coordinate
N Size:240
units: meters_north
long_name: swiss northing (lv95)
standard_name: projection y coordinate
3 global attributes:
Conventions: CF-1.6
institution: Federal Office of Meteorology and Climatology MeteoSwiss
References: Frei C., 2014: Interpolation of temperatures in a mountainous region using nonlinear profiles and non-Euclidean distances. Int. J. Climatol., 34, 1585-1605. DOI: 10.1002/joc.3786.
I was trying to set up the coordinate system (which should be EPSG:2056) on the raster brick using:
b <- brick(filename, crs = st_crs(2056)$proj4string)
However that gives me the following warnings:
Warning messages:
1: In .getCRSfromGridMap4(atts) : cannot process these parts of the crs:
_FillValue=-1
longitude_of_projection_center=7.43958333
latitude_of_projection_center=46.9524056
2: In .getCRSfromGridMap4(atts) : cannot create a valid crs
grid_mapping_name; false_easting; false_northing; scale_factor_at_projection_origin; scale_factor_at_central_meridian; standard_parallel; standard_parallel1; standard_parallel2; longitude_of_central_meridian; longitude_of_projection_origin; latitude_of_projection_origin; straight_vertical_longitude_from_pole; longitude_of_prime_meridian; semi_major_axis; semi_minor_axis; inverse_flattening; earth_radius; +proj; +x_0; +y_0; +k_0; +k_0; +lat_1; +lat_1; +lat_2; +lon_0; +lon_0; +lat_0; +lon_0; +pm; +a; +b; +rf; +a
I tried other ways of setting up CRS but that either resulted in error or the same warnings as above. Is there any way around it?
Update: alternative attempts are documented in this gist.
Oddly enough, when I tried plotting this data in QGIS I ended up with the country upside down o_O
We are the authors of these NetCDF files at MeteoSwiss. If you can tell us how to format the CRS info stored in the NetCDF grid mapping variable, please get back to us. We tried to follow the CF convention, and it works for the old Swiss (LV03/CH1903) projection and regular lon/lat. But it seems to fail for the new Swiss (LV95/CH1903+) projection in your application.
We're happy to post test files in case you find out what goes wrong in the CRS.
Can you confirm that you can read the CRS stored in our previous (LV03/ch1903) files, since we never got any complaint on these:
ftp://ftp.cscs.ch/out/stockli/swisscors/TminM_ch01r.swisscors_201903010000_201903010000.nc
Cheers
Reto
I want to calculate the distance between two points. I know there are several ways to do it in R (see here for one example). I thought it would be best to use the st_distance function from the sf package, but when I use a projection other than WGS84 (crs = 4326), I get the distances in decimal degrees and not in meters.
When I set the projection to crs = 32718, I get the distance in decimal degrees. Is there a way to convert this to meters (or to get meters in the first place)? What I don't understand is why I do get the distance in meters when I set the projection to crs = 4326.
I included a reproducible example:
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
library(tidyverse)
library(maptools)
#> Loading required package: sp
#> Checking rgeos availability: TRUE
crs <- CRS("+init=epsg:32718")
df <- tibble::tribble(
~documento, ~cod_mod, ~nlat_ie, ~nlong_ie,
"00004612", 238840, -8.37661, -74.53749,
"00027439", 238758, -8.47195, -74.80497,
"00074909", 502518, -8.83271, -75.21418,
"00074909", 612663, -8.82781, -75.05055,
"00074909", 612812, -8.64173, -74.96442,
"00102408", 237255, -13.4924, -72.9337,
"00102408", 283341, -13.5317, -73.6769,
"00109023", 238717, -9.03639, -75.50947,
"00109023", 238840, -8.37661, -74.53749,
"00109023", 1122464, -8.37855, -74.57039,
"00124708", 238717, -9.03639, -75.50947,
"00124708", 238840, -8.37661, -74.53749,
"00124708", 1122464, -8.37855, -74.57039,
"00186987", 612663, -8.82781, -75.05055,
"00186987", 1121383, -8.36195, -74.57805,
"00237970", 327379, -3.55858, -80.45579,
"00238125", 1137678, -3.6532, -80.4266,
"00238125", 1143577, -3.50163, -80.27616,
"00239334", 1143577, -3.50163, -80.27616,
"00239334", 1372333, -3.6914, -80.2521
)
df_spatial <- df
coordinates(df_spatial) <- c("nlong_ie", "nlat_ie")
proj4string(df_spatial) <- crs
# Now we create a spatial dataframe with coordinates in the average location of each documento
df_mean_location <- df %>%
group_by(documento) %>%
summarize(
mean_long = mean(nlong_ie),
mean_lat = mean(nlat_ie)
)
df_mean_location_spatial <- df_mean_location
coordinates(df_mean_location_spatial) <- c("mean_long", "mean_lat")
proj4string(df_mean_location_spatial) <- crs
df_spatial_st <- st_as_sf(df_spatial)
df_mean_location_spatial_st <- st_as_sf(df_mean_location_spatial)
distancias1 <- st_distance(df_spatial_st, df_mean_location_spatial_st, by_element = TRUE)
distancias1
#> Units: [m]
#> [1] 0.00000000 0.00000000 0.15248325 4.99880005 0.10219044 5.26515886
#> [7] 5.06614947 7.38054767 7.53880558 7.43549151 1.17475732 0.28396349
#> [13] 0.63815871 4.99880005 0.37683694 7.52071866 7.47784143 0.18844161
#> [19] 0.10677741 0.09564457
When I change it to crs <- CRS("+init=epsg:4326"), I do get the correct results (in meters):
[1] 0.00 0.00 16792.18 552085.93 11258.44 581428.01 560043.61 816269.42 834131.40 822686.13 129481.67 31286.98 70373.13 552085.93
[15] 41565.46 832000.85 827230.50 20928.56 11835.41 10577.04
EPSG 32718 is a cartesian coordinate reference system in metres. By assigning that CRS to a data set, you are saying "these numbers are metres, and the origin is not at (0,0) degrees (where equator meets Greenwich meridian) but at the origin of zone 18 of the UTM system". So you get a distance in metres. Your numbers are actually degrees, so a difference of a fraction of a degree is reported as a fraction of a metre, which is why the values look like decimal degrees.
EPSG 4326 is a lat-long reference system with a particular shape of ellipsoid earth. The coordinates are lat-long degrees. st_distance spots this and works out the great circle distance between points based on the ellipsoid. If you want the distance in decimal degrees then assign an NA CRS and you'll get unitless distances, which are the Pythagorean distances in lat-long (and so very wrong in real terms near the poles, for example).
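If you do want the points in EPSG 32718, the usual pattern is to declare the coordinates as EPSG 4326 first and then reproject. Here is a minimal sketch of that pattern using two of the points from the question; the object names (pts, pts_ll, pts_utm) are only illustrative.

```r
library(sf)

# Two of the points from the question, as longitude/latitude in degrees
pts <- data.frame(
  nlong_ie = c(-74.53749, -74.80497),
  nlat_ie  = c(-8.37661, -8.47195)
)

# Declare what the numbers are: WGS84 degrees (EPSG:4326)
pts_ll <- st_as_sf(pts, coords = c("nlong_ie", "nlat_ie"), crs = 4326)

# Reproject, rather than relabel, to UTM zone 18S (EPSG:32718)
pts_utm <- st_transform(pts_ll, 32718)

# Both calls report metres: the first is geodetic, the second planar
d_ll  <- st_distance(pts_ll[1, ], pts_ll[2, ])
d_utm <- st_distance(pts_utm[1, ], pts_utm[2, ])
d_ll
d_utm
```

The two results should agree closely because the points lie well inside UTM zone 18. The key difference from the question's code is that st_transform converts the coordinates, whereas assigning a proj4string only changes the label.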
I'm using the spatstat package to compute the distance from each point to its nearest neighbouring point, based on xyz data. The code works, but I'm getting incorrect answers. See below.
ex<- data.frame(long= c(-103.5664,-103.5664,-103.5586),lat= c(32.09539,32.10129,32.10799),elevation= c(5000,5500,5700))
####bounding box 3D
bb <- box3(range(ex$long), range(ex$lat), range(ex$elevation))
# Create a spatial points data frame:
comp_dist.pp3<- spatstat::pp3(ex$long,ex$lat,ex$elevation,bb)
nndist.pp3(comp_dist.pp3,k=1)
[1] 500 200 200
The points are more than a mile away so it should be closer to 6800.
Unfortunately spatstat doesn’t automatically recognize latitude and longitude
coordinates. Your points are interpreted as (x,y,z) coordinates in Euclidean
space, and the three pairwise distances measured by
sqrt((x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2) are (very suspiciously) the nice
round numbers 200, 500, and 700. Here is the small change to the original
code to calculate all pairwise distances:
library(spatstat)
ex<- data.frame(long= c(-103.5664,-103.5664,-103.5586),
lat= c(32.09539,32.10129,32.10799),
elevation= c(5000,5500,5700))
bb <- box3(range(ex$long), range(ex$lat), range(ex$elevation))
comp_dist.pp3<- spatstat::pp3(ex$long,ex$lat,ex$elevation,bb)
pairdist(comp_dist.pp3)
#> [,1] [,2] [,3]
#> [1,] 0 500 700
#> [2,] 500 0 200
#> [3,] 700 200 0
You can use sp::spTransform or sf::st_transform to convert from spherical
(lon,lat) to planar (x,y) and then you can attach your elevation as z-coordinate
when you define the pp3 object and things should work.
Created on 2019-02-12 by the reprex package (v0.2.1)
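A minimal sketch of that projection step, using sf and the data from the question. Two things here are assumptions not stated in the question: that UTM zone 13N (EPSG:32613) is a suitable planar CRS for these points, and that elevation is already in metres.

```r
library(sf)
library(spatstat)

ex <- data.frame(long = c(-103.5664, -103.5664, -103.5586),
                 lat = c(32.09539, 32.10129, 32.10799),
                 elevation = c(5000, 5500, 5700))

# Assumption: UTM zone 13N (EPSG:32613) is a suitable planar CRS here
ex_sf <- st_as_sf(ex, coords = c("long", "lat"), crs = 4326)
xy <- st_coordinates(st_transform(ex_sf, 32613))

# Assumption: elevation is already in metres (if it is in feet, multiply by 0.3048)
z <- ex$elevation

# Build the 3D pattern from projected x, y (metres) and z
bb <- box3(range(xy[, "X"]), range(xy[, "Y"]), range(z), unitname = "m")
X <- pp3(xy[, "X"], xy[, "Y"], z, bb)

# Nearest-neighbour distances, now in metres in all three dimensions
nn <- nndist(X, k = 1)
nn
```

With all three axes in the same unit, the horizontal separation contributes to the distance instead of being swamped by the elevation differences.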
Check your units. If you look at your longitude values: all around -103, latitude values: all around 32, and elevation values: 5000, 5500, 5700. The dimension that causes the most distance is the elevation. Since these only differ by 500 and 200, I would not expect distances to be "closer to 6800."
Edit: That is to say, I believe your package is treating your latitudes and longitudes as numeric dimensions on xyz plane, and not as actual latitudes and longitudes!
I have what may be a very simplistic question on the Kest function in spatstat. I'm using the Kest function in spatstat to assess spatial randomness in a dataset. I have uploaded lat and long values spread over London and converted them to a ppp object, using the ripras function to specify the spatial domain. When I run my Kest analysis on my ppp and plot the graph, I end up with an r value on the x axis, but although I know this is a distance measurement, I don't know what units it's using. I get this summary output:
Planar point pattern: 113 points
Average intensity 407.9378 points per square unit
Coordinates are given to 9 decimal places
Window: polygonal boundary
single connected closed polygon with 14 vertices
enclosing rectangle: [-0.5532963, 0.3519148] x [51.2901, 51.7022] units
Window area = 0.277003 square units
with the max r on the x axis being 0.1 units, and the K(r) on the y axis being 0.04. How do I figure out what unit of distance these equate to?
Your lat,lon coordinates correspond to points on a sphere (or ellipsoid or whatever) used as a model for planet Earth. Essentially, spatstat assumes you are using coordinates projected on a flat map. This conversion could be done with e.g. the sp package (using Buckingham Palace as an example):
library(sp)
lat = c(51.501476)
lon = c(-0.140634)
xy = data.frame(lon, lat)
coordinates(xy) <- c("lon", "lat")
proj4string(xy) <- CRS("+proj=longlat +datum=WGS84")
NE <- spTransform(xy, CRS("+proj=utm +zone=30 +ellps=WGS84"))
NE <- as.data.frame(NE)
The result is a data.frame with projected coordinates in Easting, Northing in metres. Then you can continue your analysis from there. To assign a unit label like "m" for prettier labels in figures use the function unitname on your ppp object (assuming the object is called X): unitname(X) <- "m"
If the function is able to accept geographic coordinates, then it is using a great circle equation to calculate distance. This normally results in units that are in Kilometers.
It is not very good practice to perform point pattern analysis (PPA) on non-projected data. If possible, you should project your data into a coordinate system that is in distance units. I believe that most of the functions in spatstat use Euclidean distance, which is quite inappropriate for coordinates in decimal degrees. Since there is no latlong argument in the Kest function, I do not believe that your results are valid.
The K function itself (i.e. the theoretical K-function, not just the computer code) assumes that the space is flat rather than curved.
This would probably be a reasonable approximation in your case (points scattered over a few dozen kilometres) but not for a point pattern scattered over a continent. That is, in general the planar K-function should not be used for point patterns on a sphere.
The other posts are correct. The Kest function expects the coordinates to be given in an isometric coordinate system. You just need to express the spatial locations in a coordinate system in which the x and y coordinates are measured in the same distance units. Longitude and latitude are not measured in the same distance units because one degree (say) of longitude does not represent the same distance as one degree of latitude. Ege Rubak's example using spTransform is probably the best way to go.
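To see how unequal the two units are at London's latitude, here is a rough base R sketch. It assumes a spherical Earth with a mean radius of 6371008.8 m, which is a simplification, and uses 51.5 degrees as an approximate latitude for London.

```r
R <- 6371008.8     # assumed mean Earth radius in metres (spherical approximation)
lat_deg <- 51.5    # approximate latitude of London

# Length of one degree of latitude: the same everywhere on a sphere
m_per_deg_lat <- R * pi / 180

# Length of one degree of longitude: shrinks with the cosine of the latitude
m_per_deg_lon <- m_per_deg_lat * cos(lat_deg * pi / 180)

ratio <- m_per_deg_lon / m_per_deg_lat

cat(sprintf("1 degree of latitude:  %.0f m\n", m_per_deg_lat))
cat(sprintf("1 degree of longitude: %.0f m\n", m_per_deg_lon))
cat(sprintf("ratio: %.3f\n", ratio))
```

A distance of "0.1 units" in the plot therefore means a different number of metres depending on direction, which is why the r axis has no meaningful unit until the points are projected.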