Compare variogram and variog function - r

I assumed (probably wrongly) that in the simplest cases the output of variog in the geoR package and variogram in the gstat package would be the same.
I have this dataset:
head(final)
lat lon elev seadist tradist samples rssi
1 60.1577 24.9111 2.392 125 15.21606 200 -58
2 60.1557 24.9214 3.195 116 15.81549 200 -55
3 60.1653 24.9221 4.604 387 15.72119 200 -70
4 60.1667 24.9165 7.355 205 15.39796 200 -62
5 60.1637 24.9166 3.648 252 15.43457 200 -73
6 60.1530 24.9258 2.733 65 16.10631 200 -57
that is made of (I guess) unprojected data, so I project them
#data projection
#convert to sp object:
coordinates(final) <- ~ lon + lat #longitude first
library(rgdal)
proj4string(final) = "+proj=longlat +datum=WGS84"
UTM <- spTransform(final, CRS=CRS("+proj=utm +zone=35 +ellps=WGS84 +datum=WGS84"))
and produce the variogram without trend according to the gstat library
var.notrend.sp<-variogram(rssi~1, UTM)
plot(var.notrend.sp)
trying to get the same output in geoR I go with
UTM1<-as.data.frame(UTM)
UTM1<-cbind(UTM1[,6:7], UTM1[,1:5])
UTM1
coords<-UTM1[,1:2]
coords
var.notrend.geoR <- variog(coords=coords, data=UTM1$rssi, estimator.type='classical')
plot(var.notrend.geoR)

A couple of points:
gstat can work with unprojected data, and will compute great-circle distances.
Setting the "projection" to "+proj=longlat +datum=WGS84" does not transform the data to a Cartesian grid-based system (such as UTM).
What you are seeing in the output of variogram is the fact that it is (sensibly) using great-circle distances. If you look at the scale of the distance axis, you will see that the ranges are quite different, because geoR doesn't know (and can't account for) the fact that you are not using a grid-based projection.
If you want to compare apples with apples, use rgdal and spTransform to transform the coordinate system to an appropriate projection and then create variograms with similar specifications. (Note that gstat defines a default cutoff: the length of the diagonal of the box spanning the data, divided by three.)
The empirical variogram is highly dependent on the definition of distance and the choice of binning (see the brilliant Model-based Geostatistics by Diggle and Ribeiro, especially chapter 5, which deals with this issue in detail).
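To make the point about the definition of distance concrete, here is a minimal base-R sketch using the six rows shown in the question. It compares great-circle distances (haversine formula on a sphere; the 6371 km radius and the helper name haversine_km are my assumptions, not anything gstat or geoR exposes) with naive Euclidean distances computed directly on the degree values.

```r
# First six rows of the question's data (degrees).
pts <- data.frame(
  lat = c(60.1577, 60.1557, 60.1653, 60.1667, 60.1637, 60.1530),
  lon = c(24.9111, 24.9214, 24.9221, 24.9165, 24.9166, 24.9258)
)

# Haversine great-circle distance in km on a sphere of radius r (assumed 6371 km).
haversine_km <- function(lat1, lon1, lat2, lon2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# All pairwise great-circle distances (km).
n <- nrow(pts)
idx <- expand.grid(i = seq_len(n), j = seq_len(n))
gc_km <- matrix(
  haversine_km(pts$lat[idx$i], pts$lon[idx$i], pts$lat[idx$j], pts$lon[idx$j]),
  nrow = n
)

# Naive Euclidean distances that treat degrees as planar units.
deg <- as.matrix(dist(pts[, c("lon", "lat")]))

# The two distance scales differ by orders of magnitude, so the
# variogram bins (and the default cutoff) will differ as well.
range(gc_km[upper.tri(gc_km)])
range(deg[upper.tri(deg)])
```

Whichever package you use, the lag axis of the empirical variogram is built from one of these two kinds of numbers, which is why the plots only become comparable once both packages see the same projected coordinates.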

Related

Compute distance to nearest point xyz coordinates

I'm using the spatstat package to compute the distance from each point to its nearest neighbouring point, based on xyz data. The code works, but I'm getting incorrect answers. See below.
ex<- data.frame(long= c(-103.5664,-103.5664,-103.5586),lat= c(32.09539,32.10129,32.10799),elevation= c(5000,5500,5700))
####bounding box 3D
bb <- box3(range(ex$long), range(ex$lat), range(ex$elevation))
# Create a spatial points data frame:
comp_dist.pp3<- spatstat::pp3(ex$long,ex$lat,ex$elevation,bb)
nndist.pp3(comp_dist.pp3,k=1)
[1] 500 200 200
The points are more than a mile away so it should be closer to 6800.
Unfortunately spatstat doesn’t automatically recognize latitude and longitude
coordinates. Your points are interpreted as (x,y,z) coordinates in Euclidean
space, and the three pairwise distances measured by
sqrt((x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2) are (very suspiciously) the nice
round numbers 200, 500, and 700. Here is the small change to the original
code to calculate all pairwise distances:
library(spatstat)
ex<- data.frame(long= c(-103.5664,-103.5664,-103.5586),
lat= c(32.09539,32.10129,32.10799),
elevation= c(5000,5500,5700))
bb <- box3(range(ex$long), range(ex$lat), range(ex$elevation))
comp_dist.pp3<- spatstat::pp3(ex$long,ex$lat,ex$elevation,bb)
pairdist(comp_dist.pp3)
#> [,1] [,2] [,3]
#> [1,] 0 500 700
#> [2,] 500 0 200
#> [3,] 700 200 0
You can use sp::spTransform or sf::st_transform to convert from spherical
(lon,lat) to planar (x,y) coordinates. You can then attach your elevation as the
z-coordinate when you define the pp3 object, and things should work.
Created on 2019-02-12 by the reprex package (v0.2.1)
Check your units. Your longitude values are all around -103, your latitude values are all around 32, and your elevation values are 5000, 5500 and 5700. The dimension that contributes the most distance is the elevation. Since the elevations only differ by 500 and 200, I would not expect distances to be "closer to 6800."
Edit: That is to say, I believe your package is treating your latitudes and longitudes as plain numeric coordinates in xyz space, and not as actual latitudes and longitudes!
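The advice above (get the horizontal coordinates into planar units first, then add elevation as z) can be sketched in base R. This is only an approximation standing in for a proper spTransform or st_transform call: it uses a local equirectangular projection on a sphere of assumed radius 6371 km, and it assumes the elevations are in feet, which the question does not state.

```r
ex <- data.frame(long = c(-103.5664, -103.5664, -103.5586),
                 lat = c(32.09539, 32.10129, 32.10799),
                 elevation = c(5000, 5500, 5700))

r_earth <- 6371000   # sphere radius in metres (assumption)
ft_to_m <- 0.3048    # assumes elevation is recorded in feet
to_rad <- pi / 180

# Local equirectangular approximation around the mean position:
# x and y are metres east and north of the centre of the points.
lat0 <- mean(ex$lat)
xyz <- cbind(
  x = (ex$long - mean(ex$long)) * to_rad * r_earth * cos(lat0 * to_rad),
  y = (ex$lat - lat0) * to_rad * r_earth,
  z = ex$elevation * ft_to_m
)

# 3D Euclidean distances in metres, then the nearest neighbour of each point.
d <- as.matrix(dist(xyz))
diag(d) <- Inf
nn_m <- apply(d, 1, min)
nn_m
```

With all three axes in the same unit, the horizontal separation now dominates instead of the elevation difference. For real work, a projected CRS suited to the study area is the safer choice, and the resulting x, y and z can be passed to pp3 as in the original code.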

Create Grid in R for kriging in gstat

lat long
7.16 124.21
8.6 123.35
8.43 124.28
8.15 125.08
Consider these coordinates, which correspond to weather stations that measure rainfall data.
The intro to the gstat package in R uses the meuse dataset. At some point in this tutorial: https://rpubs.com/nabilabd/118172, the author makes use of a "meuse.grid" in this line of code:
data("meuse.grid")
I do not have such a file and I do not know how to create it, can I create one using these coordinates? Or at least point me to material that discusses how to create a custom grid for a custom area (i.e not using administrative boundaries from GADM).
Probably wording this wrong, don't even know if this question makes sense to R savvy people. Still, would love to hear some direction, or at least tips. Thanks a lot!
Total noob at R and statistics.
EDIT: See what the sample grid looks like in the tutorial I posted; that's the thing I want to make.
EDIT 2: Would this method be viable? https://rstudio-pubs-static.s3.amazonaws.com/46259_d328295794034414944deea60552a942.html
I am going to share my approach to create a grid for kriging. There are probably more efficient or elegant ways to achieve the same task, but I hope this will be a start to facilitate some discussions.
The original poster was thinking about 1 km for every 10 pixels, but that is probably too fine. I am going to create a grid with a cell size equal to 1 km * 1 km. In addition, the original poster did not specify an origin for the grid, so I will spend some time determining a good starting point. I also assume that the Spherical Mercator projection is an appropriate choice of coordinate system. This is the projection commonly used by Google Maps and OpenStreetMap.
1. Load Packages
I am going to use the following packages. sp, rgdal, and raster are packages that provide many useful functions for spatial analysis. leaflet and mapview are packages for quick exploratory visualization of spatial data.
# Load packages
library(sp)
library(rgdal)
library(raster)
library(leaflet)
library(mapview)
2. Exploratory Visualization of the station locations
I created an interactive map to inspect the location of the four stations. Because the original poster provided the latitude and longitude of these four stations, I can create a SpatialPointsDataFrame with Latitude/Longitude projection. Notice the EPSG code for Latitude/Longitude projection is 4326. To learn more about EPSG code, please see this tutorial (https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf).
# Create a data frame showing the **Latitude/Longitude**
station <- data.frame(lat = c(7.16, 8.6, 8.43, 8.15),
long = c(124.21, 123.35, 124.28, 125.08),
station = 1:4)
# Convert to SpatialPointsDataFrame
coordinates(station) <- ~long + lat
# Set the projection. They were latitude and longitude, so use WGS84 long-lat projection
proj4string(station) <- CRS("+init=epsg:4326")
# View the station location using the mapview function
mapview(station)
The mapview function will create an interactive map. We can use this map to determine a suitable origin for the grid.
3. Determine the origin
After inspecting the map, I decided that the origin could be around longitude 123 and latitude 7. This origin will be on the lower left of the grid. Now I need to find the coordinate representing the same point under Spherical Mercator projection.
# Set the origin
ori <- SpatialPoints(cbind(123, 7), proj4string = CRS("+init=epsg:4326"))
# Convert the projection of ori
# Use EPSG: 3857 (Spherical Mercator)
ori_t <- spTransform(ori, CRSobj = CRS("+init=epsg:3857"))
I first created a SpatialPoints object based on the latitude and longitude of the origin. After that I used spTransform to perform the projection transformation. The object ori_t is now the origin in the Spherical Mercator projection. Notice that the EPSG code for Spherical Mercator is 3857.
To see the value of coordinates, we can use the coordinates function as follows.
coordinates(ori_t)
coords.x1 coords.x2
[1,] 13692297 781182.2
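As a sanity check on the numbers above, the Spherical Mercator forward formulas can be written out in base R. This is a sketch under the usual assumption of a sphere with radius 6378137 m; to_web_mercator is a hypothetical helper name, not a function from sp or rgdal.

```r
# Spherical (Web) Mercator forward projection, degrees in, metres out.
# r is the sphere radius in metres (assumed 6378137).
to_web_mercator <- function(lon, lat, r = 6378137) {
  to_rad <- pi / 180
  cbind(x = r * lon * to_rad,
        y = r * log(tan(pi / 4 + lat * to_rad / 2)))
}

# Same origin as above: longitude 123, latitude 7.
ori_xy <- to_web_mercator(123, 7)
ori_xy
```

The result should agree with coordinates(ori_t) to well under a metre, which is a quick way to confirm that the transformation did what was intended before building the grid on top of it.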
4. Determine the extent of the grid
Now I need to decide the extent of the grid that can cover all the four points and the desired area for kriging, which depends on the cell size and the number of cells. The following code sets up the extent based on the information. I have decided that the cell size is 1 km * 1 km, but I need to experiment on what would be a good cell number for both x- and y-direction.
# The origin has been rounded to the nearest 100
x_ori <- round(coordinates(ori_t)[1, 1]/100) * 100
y_ori <- round(coordinates(ori_t)[1, 2]/100) * 100
# Define how many cells for x and y axis
x_cell <- 250
y_cell <- 200
# Define the resolution to be 1000 meters
cell_size <- 1000
# Create the extent
ext <- extent(x_ori, x_ori + (x_cell * cell_size), y_ori, y_ori + (y_cell * cell_size))
Based on the extent I created, I can create a raster layer with all values equal to 0. Then I can use the mapview function again to see whether the raster and the four stations match well.
# Initialize a raster layer
ras <- raster(ext)
# Set the resolution to cell_size in both directions
res(ras) <- c(cell_size, cell_size)
ras[] <- 0
# Project the raster
projection(ras) <- CRS("+init=epsg:3857")  # assigns the CRS; the cell values are not reprojected
# Create interactive map
mapview(station) + mapview(ras)
I repeated this process several times. Finally I decided that the number of cells is 250 and 200 for x- and y-direction, respectively.
5. Create spatial grid
Now I have created a raster layer with the proper extent. I can first save this raster as a GeoTIFF for future use.
# Save the raster layer
writeRaster(ras, filename = "ras.tif", format="GTiff")
Finally, to use the kriging functions from the package gstat, I need to convert the raster to SpatialPixels.
# Convert to spatial pixel
st_grid <- rasterToPoints(ras, spatial = TRUE)
gridded(st_grid) <- TRUE
st_grid <- as(st_grid, "SpatialPixels")
The st_grid is a SpatialPixels that can be used in kriging.
This is an iterative process to determine a suitable grid. Throughout the process, users can change the projection, origin, cell size, or cell number depending on the needs of their analysis.
@yzw and @Edzer bring up good points for creating a regular rectangular grid, but sometimes there is a need to create an irregular grid over a defined polygon, usually for kriging.
This is a sparsely documented topic. One good answer can be found here. I expand on it with code below:
Consider the built-in meuse dataset. meuse.grid is an irregularly shaped grid. How do we make a grid like meuse.grid for our own study area?
library(sp)
library(ggplot2) # needed for ggplot() and geom_point()
data(meuse.grid)
ggplot(data = meuse.grid) + geom_point(aes(x, y))
Imagine an irregularly shaped SpatialPolygon or SpatialPolygonsDataFrame, called spdf. You first build a regular rectangular grid over it, then subset the points in that regular grid by the irregularly-shaped polygon.
# First, make a rectangular grid over your `SpatialPolygonsDataFrame`
grd <- makegrid(spdf, n = 100)
colnames(grd) <- c("x", "y")
# Next, convert the grid to `SpatialPoints` and subset these points by the polygon.
grd_pts <- SpatialPoints(
coords = grd,
proj4string = CRS(proj4string(spdf))
)
# subset all points in `grd_pts` that fall within `spdf`
grd_pts_in <- grd_pts[spdf, ]
# Then, visualize your clipped grid which can be used for kriging
ggplot(as.data.frame(coordinates(grd_pts_in))) +
geom_point(aes(x, y))
If you have your study area as a polygon, imported as a SpatialPolygons, you could either use package raster to rasterize it, or use sp::spsample to sample it using sampling type regular.
If you don't have such a polygon, you can create points regularly spread over a rectangular long/lat area using expand.grid, using seq to generate a sequence of long and lat values.
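The expand.grid approach from the last paragraph can be sketched as follows for the four stations in the question. The spacing (step) and the margin (pad) are hypothetical values chosen only for illustration.

```r
# Station coordinates from the question (degrees).
station <- data.frame(lat = c(7.16, 8.6, 8.43, 8.15),
                      long = c(124.21, 123.35, 124.28, 125.08))

step <- 0.05  # grid spacing in degrees (hypothetical)
pad <- 0.25   # margin around the stations in degrees (hypothetical)

# Regular sequences of longitude and latitude covering the stations.
long_seq <- seq(min(station$long) - pad, max(station$long) + pad, by = step)
lat_seq <- seq(min(station$lat) - pad, max(station$lat) + pad, by = step)

# Every combination of long and lat becomes one grid node.
grd <- expand.grid(long = long_seq, lat = lat_seq)
dim(grd)
```

With sp loaded, coordinates(grd) <- ~long + lat followed by gridded(grd) <- TRUE should turn this data frame into a SpatialPixels object, the same class the raster-based answer above ends with. Note that the cells here are regular in degrees, not in metres.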

Using gaussian smoothing for lat/long and value (xyz) data

I am trying to perform Gaussian smoothing of the z values that are associated with lat/long values. Each lat/long that I have is the centroid of a 1-mile by 1-mile grid cell. Here's an example of the data:
latitude <- c(40.607674, 40.607048, 40.606419, 40.605785, 40.605152, 40.604515, 40.603874)
longitude <- c(-91.865349, -91.846184, -91.827026, -91.807861, -91.788704, -91.769547, -91.750381)
value <- 1:7
These lat/longs are in the WGS 1984 coordinate system. One important thing to note is that I do not have access to the rgdal library. However, I do have access to ArcGIS (without a Spatial Analyst license) to project these values to another coordinate system.
So far I have been able to create a SpatialPointsDataFrame in R using these values. My intent was to convert the SpatialPointsDataFrame into a matrix representation for the Gaussian smoothing, but I haven't been able to do that successfully.

R: Calculating the shortest distance between two point layers

I need to calculate the shortest distance between two point matrices. I am new to R and have no clue how to do this. This is the code that I used to call in the data and convert them to points
library(dismo)
laurus <- gbif("Laurus", "nobilis")
locs <- subset(laurus, select = c("country", "lat", "lon"))
#uk observations
locs.uk <-subset(locs, locs$country=="United Kingdom")
#ireland observations
locs.ire <- subset(locs, locs$country=="Ireland")
uk_coord <-SpatialPoints(locs.uk[,c("lon","lat")])
ire_coord <-SpatialPoints(locs.ire[,c("lon","lat")])
crs.geo<-CRS("+proj=longlat +ellps=WGS84 +datum=WGS84") # geographical, datum WGS84
proj4string(uk_coord) <-crs.geo #define projection
proj4string(ire_coord) <-crs.geo #define projection
I need to calculate the shortest distance (Euclidean) from points in Ireland to points in the UK. In other words, I need to calculate the distance from each point in Ireland to its closest point in the UK points layer.
Can someone tell me what function or package I need to use in order to do this? I looked at gdistance and could not find a function that calculates the shortest distance.
You can use the FNN package, which uses spatial trees to make the search efficient. It works with Euclidean geometry, so you should transform your points to a planar coordinate system. I'll use the rgdal package to convert to the UK grid reference system (stretching it a bit to use it over Ireland here, but your original data was New York and you should use a New York planar coordinate system for that):
> require(rgdal)
> uk_coord = spTransform(uk_coord, CRS("+init=epsg:27700"))
> ire_coord = spTransform(ire_coord, CRS("+init=epsg:27700"))
Now we can use FNN:
> require(FNN)
> g = get.knnx(coordinates(uk_coord), coordinates(ire_coord),k=1)
> str(g)
List of 2
$ nn.index: int [1:69, 1] 202 488 202 488 253 253 488 253 253 253 ...
$ nn.dist : num [1:69, 1] 232352 325375 87325 251770 203863 ...
g is a list of indexes and distances of the UK points that are nearest to the 69 Irish points. The distances are in metres because the coordinate system is in metres.
You can illustrate this by plotting the points, then joining Irish point 1 to UK point 202, Irish point 2 to UK point 488, Irish point 3 to UK point 202, and so on. In code:
> plot(uk_coord, col=2, xlim=c(-1e5,6e5))
> plot(ire_coord, add=TRUE)
> segments(coordinates(ire_coord)[,1], coordinates(ire_coord)[,2], coordinates(uk_coord[g$nn.index[,1]])[,1], coordinates(uk_coord[g$nn.index[,1]])[,2])
gDistance() from the rgeos package will give you the distance matrix
library(rgeos)
gDistance(uk_coord, ire_coord, byid = TRUE)
Another option is nncross() from the spatstat package. Pro: it gives the distance to the nearest neighbour directly. Con: you'll need to convert the SpatialPoints objects to planar point patterns of class ppp (see ?as.ppp in spatstat).
library(spatstat)
nncross(ire.ppp, uk.ppp) # for each Irish point, the nearest UK point
The package geosphere offers a lot of dist* functions to evaluate distances from two lat/lon points. In your example, you could try:
require(geosphere)
#get the coordinates of UK and Ireland
pointuk<-uk_coord@coords
pointire<-ire_coord@coords
#prepare a vector which will contain the minimum distance for each Ireland point
res<-numeric(nrow(pointire))
#get the min distance
for (i in 1:length(res)) res[i]<-min(distHaversine(pointire[i,,drop=FALSE],pointuk))
The distances you'll obtain are in meters (you can change by setting the radius of the earth in the call to distHaversine).
The problem with gDistance and other rgeos functions is that they evaluate the distance as if the coordinates were planar. On unprojected lon/lat data, the number you obtain is therefore not very useful.
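For readers without geosphere installed, the same nearest-neighbour idea can be sketched in base R: build the full matrix of great-circle distances between the two sets and take the row-wise minimum. The coordinates below are hypothetical stand-ins (not taken from the GBIF query), and haversine_m with its 6378137 m radius is my own helper and assumption.

```r
# Hypothetical (lon, lat) points standing in for the two layers.
uk <- cbind(lon = c(-0.1276, -3.1883, -4.2518),
            lat = c(51.5072, 55.9533, 55.8642))
ire <- cbind(lon = c(-6.2603, -8.4756),
             lat = c(53.3498, 51.8985))

# Haversine great-circle distance in metres (sphere radius r is an assumption).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Rows are Irish points, columns are UK points.
d <- outer(seq_len(nrow(ire)), seq_len(nrow(uk)), function(i, j) {
  haversine_m(ire[i, "lon"], ire[i, "lat"], uk[j, "lon"], uk[j, "lat"])
})

# Index of, and distance to, the nearest UK point for each Irish point.
nn_index <- apply(d, 1, which.min)
nn_dist <- d[cbind(seq_len(nrow(d)), nn_index)]
```

This brute-force version computes every pair, so it is fine for a few hundred points but scales poorly; the tree-based search in FNN on projected coordinates is the better fit for large layers.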

Visualize correlations on a map in R

I calculated correlations between temperatures and the date of grape harvest. I stored the results as a matrix:
32.5 37.5 42.5 47.5 52.5 57.5 62.5
-12.5 -0.05783118 -0.001655467 -0.07857098 -0.1526494 -0.0007327898 -0.02078552 0.06121682
-7.5 -0.23219824 -0.059952117 -0.06895444 -0.1674386 -0.1311612338 -0.08476390 0.09831010
-2.5 -0.11040995 -0.147325160 -0.15016740 -0.1796807 -0.1819844495 -0.14472899 -0.03550576
2.5 -0.20577359 -0.180857373 -0.15077067 -0.2293366 -0.2577666092 -0.21645676 -0.13044584
7.5 -0.44526971 -0.176224708 -0.15114994 -0.2459971 -0.2741514139 -0.19281484 -0.15683870
12.5 -0.12481683 -0.121675085 -0.16011098 -0.2288839 -0.2503969467 -0.26616721 -0.23089796
17.5 -0.15352693 -0.220012419 -0.11456690 -0.2314059 -0.2194705426 -0.20557053 -0.22529422
Now I want to visualize the results on a map. It should look like this:
[Image: example for visualizing correlations on a map]
Only my longitude and latitude are different. My latitude ranges from 32.5°N to 62.5°N and my longitude goes from -12.5°E to 17.5°E.
I have absolutely no idea how it's been done! It would be nice, if someone can help me.
Regards.
This is one way. Your grid is rather coarse (increments of 5°, or ~350 miles at the equator), and of course you did not provide an actual map, but this will plot a "heat map" of correlation at the coordinates you provided.
df <- cbind(lon=rownames(df),df)
library(reshape2)
library(ggplot2)
library(RColorBrewer)
gg <- melt(df,id="lon",variable.name="lat",value.name="corr")
gg$lat <- as.numeric(substring(gg$lat,2)) # remove pre-pended "X"
gg$lon <- as.numeric(as.character(gg$lon)) # convert factor to numeric
ggplot(gg)+
geom_tile(aes(x=lon,y=lat, fill=corr))+
scale_fill_gradientn(colours=rev(brewer.pal(9,"Spectral")))+
coord_fixed()
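The reshaping step is the part that tends to cause trouble, so here is a base-R sketch of it on a small corner of the matrix from the question. It assumes, as the code above does, that row names are longitudes and column names are latitudes. Going through as.table keeps the column names as plain strings, so there is no prepended "X" to strip.

```r
# A 2 x 3 corner of the correlation matrix from the question.
# Assumption: rows are longitudes, columns are latitudes.
m <- matrix(c(-0.05783118, -0.001655467, -0.07857098,
              -0.23219824, -0.059952117, -0.06895444),
            nrow = 2, byrow = TRUE,
            dimnames = list(lon = c("-12.5", "-7.5"),
                            lat = c("32.5", "37.5", "42.5")))

# Matrix -> long format: one row per (lon, lat) cell.
long_df <- as.data.frame(as.table(m), responseName = "corr",
                         stringsAsFactors = FALSE)
long_df$lon <- as.numeric(long_df$lon)
long_df$lat <- as.numeric(long_df$lat)
long_df
```

The resulting data frame has the same lon, lat and corr columns as gg above, so it can be passed to the same geom_tile call.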
The corrplot() function from corrplot R package can be used to plot a correlogram.
library(corrplot)
M<-cor(mtcars) # compute correlation matrix
corrplot(M, method="circle")
this is described here:
http://www.sthda.com/english/wiki/visualize-correlation-matrix-using-correlogram
Online R software is also available to compute and visualize a correlation matrix, by a simple click without any installation :
http://www.sthda.com/english/rsthda/correlation-matrix.php
