How to apply universal kriging with a custom spatial prediction grid using autoKrige in R

I want to apply universal kriging to a dataset using the autoKrige function in R. I would like to create my own custom spatial grid for the prediction points (the new_data argument of autoKrige). I am using R version 3.2.2 (64-bit) and RStudio version 0.99.486. This is what I've done so far:
library(automap)
library(sp)
library(gstat)
library(raster)
library(rgdal)
data(meuse)
coordinates(meuse) <- ~x + y
proj4string(meuse) <- CRS("+init=epsg:28992")
The following code comes from Stack Exchange (credit goes to Jeffrey Evans) and creates a custom spatial grid for the prediction locations:
ext_meuse <- as(extent(meuse), "SpatialPolygons")
r_meuse <- rasterToPoints(raster(ext_meuse, resolution = 59), spatial = TRUE)
proj4string(r_meuse) <- proj4string(meuse)
I then try to apply universal kriging (regression on the 'dist' column) using autoKrige:
kriging_result = autoKrige(zinc~dist, meuse, r_meuse)
The following error is then received:
Error in model.frame.default(terms.f, newdata, na.action = na.action, :
object is not a matrix
In addition: Warning message:
'newdata' had 3102 rows but variable found had 1 row
Did I make a mistake when creating the grid (r_meuse)? Is there a better way to create a grid for the predictions? All the examples I have found so far use the meuse.grid data, but I would like to apply universal kriging to other data that do not have their own grid yet.

I believe the problem is that you are performing universal kriging (UK) without the predictor, dist, being present in r_meuse. That information is needed for the linear trend model to make a prediction at each new location. So r_meuse needs to be a SpatialPointsDataFrame with a dist column defined.
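As a rough sketch of what that could look like: the example below builds a custom point grid over the meuse bounding box and looks up dist for each point from meuse.grid with sp::over(). This only works here because meuse.grid happens to carry dist; for your own data the covariate has to come from some other source (a raster, a distance calculation, etc.), and that part is an assumption on my side. The names pts, dist_at_pts and new_grid are made up for illustration, and the CRS is left unset on both objects to keep the sketch short.

```r
library(sp)
library(automap)

data(meuse)
data(meuse.grid)
coordinates(meuse) <- ~x + y
coordinates(meuse.grid) <- ~x + y
gridded(meuse.grid) <- TRUE

# Custom prediction points over the bounding box of the observations
# (59 m spacing, as in the question).
bb <- bbox(meuse)
xy <- expand.grid(x = seq(bb[1, 1], bb[1, 2], by = 59),
                  y = seq(bb[2, 1], bb[2, 2], by = 59))
pts <- SpatialPoints(xy)

# Look up the covariate at every prediction point. Points that fall
# outside meuse.grid get NA and are dropped.
dist_at_pts <- over(pts, meuse.grid)$dist
keep <- !is.na(dist_at_pts)

# new_data must be a SpatialPointsDataFrame that contains `dist`.
new_grid <- SpatialPointsDataFrame(pts[keep, ],
                                   data.frame(dist = dist_at_pts[keep]),
                                   match.ID = FALSE)

kriging_result <- autoKrige(zinc ~ dist, meuse, new_grid)
```

The key point is only that every variable on the right-hand side of the formula exists as a column in new_data; how you fill that column depends on your data.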

Related

R Studio gplots library 'coplot' function: "Error in eval(y, data, parent.frame()) : object 'y' not found"

I am trying to do the "Fixed/Random Effects Models using R" exercise found at (https://rstudio-pubs-static.s3.amazonaws.com/372492_3e05f38dd3f248e89cdedd317d603b9a.html)
However, for homework I have been asked to use a different dataset containing cosmetic surgery information for patients at various clinics.
I have tried to plot my dataset using coplot like so:
coplot(y ~ Surgery|Clinic, type="b", data=cosmsur)
But I get the following error:
Error in eval(y, data, parent.frame()) : object 'y' not found
Having checked the coplot information using ?coplot I do not see how object 'y' cannot be found since it is merely defining the y axis of the plot - what is going wrong please?
This is the code I have been working with to get to this error:
# First, set wd to: [setwd("~/OneDrive - University College London/2. UCL DEGREE/COURSES/1. PSYC0223- Introduction to Statistics for Psychology/R Studio/Metacognition and Optimal Reminders/Session 3/My Practice")]
# Then, load the specialist libraries you need for this exercise:
library(tidyverse) # Modern data science library
library(plm) # Panel data analysis library
library(car) # Companion to applied regression
library(gplots) # Various programming tools for plotting data
library(tseries) # For timeseries analysis
library(lmtest) # For heteroskedasticity analysis
# Finally, define your dataset for this R script: cosmsur for "Cosmetic Surgery" data from the "Mixed and Random Effects Modelling" topic in your PSYC0223 Course:
cosmsur <- read_csv("Cosmetic_Surgery.csv")
### Data Import and Tidying
head(cosmsur)
# We now need to set cosmsur as panel data:
# Remember, you can achieve this using the plm function [cosmsur <- plm.data(cosmsur, index=c("surgery","clinic"))]
# But use of 'plm.data' is discouraged, better use 'pdata.frame' instead:
cosmsur <- pdata.frame(cosmsur, index=c("Post_QoL","Surgery","Clinic"))
# Warning message:In pdata.frame(cosmsur, index = c("Post_QoL", "Surgery", "Clinic")) :duplicate couples (id-time) in resulting pdata.frame; to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany"):
table(index(cosmsur), useNA = "ifany")
head(cosmsur)
## 2.1 View tabular data
cosmsur
### 3 Exploratory Data Analysis
# Let's first visualise our dataset using a special plot function "coplot":
coplot(y ~ Surgery|Clinic, type="b", data=cosmsur)
Error in eval(y, data, parent.frame()) : object 'y' not found
?coplot
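One likely reading of the error, sketched with a toy data frame: in the tutorial the outcome column is literally called y, whereas in this dataset it presumably has another name. The left-hand side of the coplot formula has to be a column of data (or an object in scope), so y is simply not found. The data frame toy below is invented, and treating Post_QoL as the outcome is an assumption based on the column names in the question.

```r
# Toy stand-in for cosmsur; column names are assumed from the question.
set.seed(42)
toy <- data.frame(
  Clinic   = factor(rep(paste0("Clinic_", 1:3), each = 10)),
  Surgery  = rep(c(0, 1), times = 15),
  Post_QoL = rnorm(30, mean = 60, sd = 10)
)

# There is no column (or object) called `y`, so this fails in the same way
# as in the question. try() keeps the script running.
res <- try(coplot(y ~ Surgery | Clinic, type = "b", data = toy), silent = TRUE)
failed <- inherits(res, "try-error")

# Naming the actual outcome column on the left-hand side works.
coplot(Post_QoL ~ Surgery | Clinic, type = "b", data = toy)
```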

Cross-validation for kriging in R: how to include the trend while reestimating the variogram using xvalid?

I have a question very specific for the function xvalid (package geoR) in R which is used in spatial statistics only, so I hope it's not too specific for someone to be able to answer. In any case, suggestions for alternative functions/packages are welcome too.
I would like to compute a variogram, fit it, and then perform cross-validation. The function xvalid seems to work well for the cross-validation. It works when I set reestimate=TRUE (so it reestimates the variogram for every point removed from the dataset during cross-validation), and it also works when using a trend. However, it does not seem to work when the two are combined.
Here is an example using the classical Meuse dataset:
library(geoR)
library(sp)
data(meuse) # import data
coordinates(meuse) = ~x+y # make spatialpointsdataframe
meuse@proj4string <- CRS("+init=epsg:28992") # add projection
meuse_geo <- as.geodata(meuse) # create object of class geodata for geoR compatibility
meuse_geo$data <- meuse@data # attach all data (incl. covariates) to meuse_geo
meuse_vario <- variog(geodata=meuse_geo, data=meuse_geo$data$lead, trend= ~meuse_geo$data$elev) # variogram
meuse_vfit <- variofit(meuse_vario, nugget=0.1, fix.nugget=T) # fit
# cross-validation works fine:
xvalid(geodata=meuse_geo, data=meuse_geo$data$lead, model=meuse_vfit, variog.obj = meuse_vario, reestimate=F)
# cross-validation does not work when reestimate = T:
xvalid(geodata=meuse_geo, data=meuse_geo$data$lead, model=meuse_vfit, variog.obj = meuse_vario, reestimate=T)
The error I get is:
Error in variog(coords = cv.coords, data = cv.data, uvec = variog.obj$uvec, : coords and trend have incompatible sizes
It seems to remove the point from the dataset during cross-validation, but it doesn't seem to remove the point from the covariates/trend data. Any ideas on solving this or using a different package?
Thanks a lot in advance!
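Since alternatives were asked for: one possible workaround is to write the leave-one-out loop by hand with gstat, so that the trend covariate is dropped together with the held-out point. This is only a sketch and not a fix for xvalid itself. The starting values passed to vgm() are rough guesses, the spherical model is an arbitrary choice, and the names loo_pred and loo_res are made up for illustration.

```r
library(sp)
library(gstat)

data(meuse)
coordinates(meuse) <- ~x + y

n <- nrow(meuse@data)
loo_pred <- numeric(n)

for (i in seq_len(n)) {
  train <- meuse[-i, ]   # all points except the held-out one
  test  <- meuse[i, ]    # held-out point, still carries `elev`

  # Re-estimate the residual variogram (trend on elev) without point i.
  v <- variogram(lead ~ elev, train)
  v_fit <- fit.variogram(
    v,
    vgm(psill = var(train$lead) / 2, model = "Sph",
        range = 800, nugget = var(train$lead) / 2)  # rough starting values
  )

  # Universal kriging prediction at the held-out location.
  loo_pred[i] <- krige(lead ~ elev, train, test,
                       model = v_fit, debug.level = 0)$var1.pred
}

loo_res <- meuse$lead - loo_pred
rmse <- sqrt(mean(loo_res^2))
```

Because the formula is evaluated on train in every iteration, coordinates and covariate always have the same length, which is the mismatch the xvalid error complains about.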

R: Autokrige.cv function in automap package generates NaNs

I'm fairly new to R and I am trying to interpolate temperature measurements that were gathered from different stations across the Netherlands. I have data for about 35 stations that take measurements every 10 minutes over a timespan of about two weeks. Accordingly, I figured it would be best to write a loop that takes care of this. To see how well the interpolation technique works, I want to do a cross-validation for every timestamp.
To do this I used the autoKrige.cv function from the automap package, and then the compare.cv function from the same package to get an overview of the most important statistics for all timestamps. In addition, I made sure the cross-validation is only done if more than 25 stations registered measurements.
The problem, however, is that my code below works most of the time but gives the following warnings in 4 cases:
1. In sqrt(ret[[var.name]]) : NaNs produced
2. In sqrt(ret[[var.name]]) : NaNs produced
3. In sqrt(ret[[var.name]]) : NaNs produced
4. In sqrt(ret[[var.name]]) : NaNs produced
When I try to use the compare.cv command for the total list including all the cross validations it gives me the following error:
"Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm, :
missing values and NaN's not allowed if 'na.rm' is FALSE"
I'm wondering what causes the autoKrige.cv function to generate NaNs in the cross-validation and, more importantly, how I can remove them from results.cv so that I can use the compare.cv function.
rm(list=ls())
# load packages
require(sp)
require(gstat)
require(ggmap)
require(automap)
require(ggplot2)
#load data (download link provided below)
# download link: https://www.dropbox.com/s/qmi3loub29e55io/meassurements_aug.RDS?dl=0
load("download path")
# make data spatial and assign spatial coordinate system
coordinates(meassurements) = ~x+y
proj4string(meassurements) <- CRS("+init=epsg:4326")
meassurements_df <- as.data.frame(meassurements)
# loop for cross validation
timestamp <- meassurements$import_log_id
results.cv=list()
for (i in unique(timestamp)) {
  x = meassurements_df[which(meassurements$import_log_id == i), ]
  if(sum(!is.na(x$temperature)) > 25){
    results.cv[[paste0(i)]] = autoKrige.cv(temperature ~ 1, meassurements[which(meassurements$import_log_id == i & !is.na(meassurements$temperature)), ])
  }
}
# calculate key statistics (RMSE MAE etc)
compare.cv(results.cv)
Thanks!
I came across the same problem and solved it by calling remove.duplicates() from the sp package on the SpatialPointsDataFrame used for kriging. Before that, I replaced the relevant variables in the data frame by their group means:
SPDF@data <- SPDF@data %>%
  group_by(varx,vary,varz) %>%
  mutate_at(vars(one_of(relevant_var)),mean,na.rm=TRUE) %>%
  ungroup()
SPDF <- SPDF %>% remove.duplicates()
At the time I was encountering the same problem the Dropbox link above was not working anymore, so I could not check this specific example.

Duplicate data when using gstat or automap package in R

I am trying to use ordinary kriging to spatially predict where an animal will occur based on predictor variables, using the gstat or automap package in R. I have many (over 100) duplicate coordinate points, which I cannot throw out since those stations were sampled multiple times over many years. Every time I run the ordinary kriging code below, I get an LDL error, which is due to the duplicate points. Does anyone know how to fix this problem without throwing out data? I have tried the code from the automap package that is supposed to correct for duplicates, but I can't get it to work. Thank you for the help!
coordinates(fish) <- ~ LONGITUDE+LATITUDE
x.range <- range(fish@coords[,1])
y.range <- range(fish@coords[,2])
grd <- expand.grid(x=seq(from=x.range[1], to=x.range[2], by=3), y=seq(from=y.range[1], to=y.range[2], by=3))
coordinates(grd) <- ~ x+y
plot(grd, pch=16, cex=.5)
gridded(grd) <- TRUE
library(gstat)
zerodist(fish) ###146 duplicate points
v <- variogram(log(WATER_TEMP) ~1, fish, na.rm=TRUE)
plot(v)
vgm()
f <- vgm(1, "Sph", 300, 0.5)
print(f)
v.fit <- fit.variogram(v,f)
plot(v, model=v.fit) ####In fit.variogram(v, d) : Warning: singular model in variogram fit
krg <- krige(log(WATER_TEMP) ~ 1, fish, grd, v.fit)
## [using ordinary kriging]
##"chfactor.c", line 131: singular matrix in function LDLfactor()Error in predict.gstat(g, newdata = newdata, block = block, nsim = nsim,: LDLfactor
##automap code for correcting for duplicates
fish.dup = rbind(fish, fish[1,]) # Create duplicate
coordinates(fish.dup) = ~LONGITUDE + LATITUDE
kr = autoKrige(WATER_TEMP, fish.dup, grd)
###Error in inherits(formula, "SpatialPointsDataFrame"):object 'WATER_TEMP' not found
###somehow my predictor variables are no longer available when in a Spatial Points Data Frame??
automap::autoKrige expects a formula as its first argument. Try:
kr = autoKrige(WATER_TEMP~1, fish.dup, grd)
automap has a very simple fix for duplicate observations: it discards them. So automap does not really solve the issue you have. I see some options:
Discard the duplicates.
Slightly perturb the coordinates of the duplicates so that they are not on exactly the same location anymore.
Perform space-time kriging using gstat.
In regard to your specific issue, please make your example reproducible. What I can guess is that rbind of your fish object is not doing what you expect...
Alternatively, you can use the function jitterDupCoords from the geoR package.
https://cran.r-project.org/web/packages/geoR/geoR.pdf
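To make the first two options concrete, here is a small sketch using only sp. The fish data frame below is an invented stand-in with the column names from the question, and the jitter size of 0.001 is an arbitrary choice that you would need to scale to your coordinate units.

```r
library(sp)

# Invented stand-in for the real `fish` data: rows 1-2 and 4-5 share a location.
fish <- data.frame(LONGITUDE  = c(1, 1, 2, 3, 3),
                   LATITUDE   = c(1, 1, 2, 3, 3),
                   WATER_TEMP = c(10, 12, 11, 9, 13))
coordinates(fish) <- ~LONGITUDE + LATITUDE

# Each row of this matrix is a pair of row indices with identical coordinates.
dup_pairs <- zerodist(fish)

# Option 1: discard duplicates (this loses the repeated measurements).
fish_unique <- remove.duplicates(fish)

# Option 2: keep all rows, but shift one point of every duplicate pair
# by a small random offset so that no two locations coincide exactly.
set.seed(1)
idx <- unique(dup_pairs[, 1])
xy <- coordinates(fish)
xy[idx, ] <- xy[idx, ] + runif(length(idx) * 2, min = -0.001, max = 0.001)
fish_jit <- SpatialPointsDataFrame(xy, fish@data, match.ID = FALSE)
```

After either step, zerodist() on the result should come back empty, which is the condition krige() needs to avoid the singular matrix.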

Local kriging reports error in `krigeST` with full data (`STFDF`)

I have been playing around with the krigeST function in the gstat package, and I tried to do local kriging with a full set of SpatialPolygons based data (STFDF).
When I tried the following
spatio_time_krige_sumProd = krigeST(conso~1,
some_STFDF_with_SpatialPolygons_in_sp,
newdata = some_STF_with_SpatialPolygons_in_sp,
fitted_prodSum_variogram,
nmax = 50,
stAni = 4)
Error in from@sp[from@index[, 1], ] :
SpatialPolygons selection: can't find plot order if polygons are replicated
The problem seems to be that in krigeST.local, some_STFDF_with_SpatialPolygons_in_sp is coerced into an STIDF. That coercion includes a subsetting step which the SpatialPolygons class does not allow, since there are repeated entries in from@index[, 1].
I'd like to try non-separable spatio-temporal variogram, so not using local kriging isn't really an option. Is there a workaround for this problem?
Thanks.
