siber.ellipses -> Error in rmultireg(Y, X, Bbar, A, nu, V) : not a matrix - r

I am very new to R and am currently trying to create siber ellipses.
I watched the potcast Using ellipses to compare community members:(http://www.tcd.ie/Zoology/research/research/theoretical/Rpodcasts.php#siber) and got along just fine in the beginning. Whenever i get to the function of the siber.ellipses i get an Error:
(Error in rmultireg(Y, X, Bbar, A, nu, V) : not a matrix)
I can not figure out why. I get it with my own data as well as with the example data in the zip file provided along with the script.
I have researched the Error message online but could not come up with an answer.
It almost can be an error in the script or the data since I used those exactly as provided. My R version is 3.3.2
Does it have something to do with some kind of settings?
What could be the reason?
Can someone please helpe me :)
thanks
# this demo generates some random data for M consumers based on N samples and
# constructs a standard ellipse for each based on SEAc and SEA_B
rm(list = ls())
library(siar)
-------------------------------------------------------------------------
# ANDREW - REMOVE THESE LINES WHICH SHOULD BE REDUNDANT
# change this line
setwd("C:/Users/elisabeth/Desktop/R/demo")
# -----------------------------------------------------------------------------
# now close all currently open windows
graphics.off()
# read in some data
# NB the column names have to be exactly, "group", "x", "y"
mydata <- read.table("example_ellipse_data.txt",sep="\t",header=T)
# make the column names availble for direct calling
attach(mydata)
# now loop through the data and calculate the ellipses
ngroups <- length(unique(group))
# split the isotope data based on group
spx <- split(x,group)
spy <- split(y,group)
# create some empty vectors for recording our metrics
SEA <- numeric(ngroups)
SEAc <- numeric(ngroups)
TA <- numeric(ngroups)
dev.new()
plot(x,y,col=group,type="p")
legend("topright",legend=as.character(paste("Group ",unique(group))),
pch=19,col=1:length(unique(group)))
for (j in unique(group)){
# Fit a standard ellipse to the data
SE <- standard.ellipse(spx[[j]],spy[[j]],steps=1)
# Extract the estimated SEA and SEAc from this object
SEA[j] <- SE$SEA
SEAc[j] <- SE$SEAc
# plot the standard ellipse with d.f. = 2 (i.e. SEAc)
# These are plotted here as thick solid lines
lines(SE$xSEAc,SE$ySEAc,col=j,lty=1,lwd=3)
# Also, for comparison we can fit and plot the convex hull
# the convex hull is plotted as dotted thin lines
#
# Calculate the convex hull for the jth group's isotope values
# held in the objects created using split() called spx and spy
CH <- convexhull(spx[[j]],spy[[j]])
# Extract the area of the convex hull from this object
TA[j] <- CH$TA
# Plot the convex hull
lines(CH$xcoords,CH$ycoords,lwd=1,lty=3)
}
# print the area metrics to screen for comparison
# NB if you are working with real data rather than simulated then you wont be
# able to calculate the population SEA (pop.SEA)
# If you do this enough times or for enough groups you will easily see the
# bias in SEA as an estimate of pop.SEA as compared to SEAc which is unbiased.
# Both measures are equally variable.
print(cbind(SEA,SEAc,TA))
# So far we have fitted the standard ellipses based on frequentist methods
# and calculated the relevant metrics (SEA and SEAc). Now we turn our attention
# to producing a Bayesian estimate of the standard ellipse and its area SEA_B
reps <- 10^4 # the number of posterior draws to make
# Generate the Bayesian estimates for the SEA for each group using the
# utility function siber.ellipses
SEA.B <- siber.ellipses(x,y,group,R=reps)
Error in rmultireg(Y, X, Bbar, A, nu, V) : not a matrix

Reinstall the SIAR package using Jackson's Github site:
library(devtools)
install_github("andrewljackson/siar#v4.2.2", build_vingettes == TRUE)
library(siar)

Related

Maximum pseudo-likelihood estimator for soft-core point process

I am trying to fit a soft-core point process model on a set of point pattern using maximum pseudo-likelihood. I followed the instructions given in this paper by Baddeley and Turner
And here is the R-code I came up with
`library(deldir)
library(tidyverse)
library(fields)
#MPLE
# irregular parameter k
k <- 0.4
## Generate dummy points 50X50. "RA" and "DE" are x and y coordinates
dum.x <- seq(ramin, ramax, length = 50)
dum.y <- seq(demin, demax, length = 50)
dum <- expand.grid(dum.x, dum.y)
colnames(dum) <- c("RA", "DE")
## Combine with data and specify which is data point and which is dummy, X is the point pattern to be fitted
bind.x <- bind_rows(X, dum) %>%
mutate(Ind = c(rep(1, nrow(X)), rep(0, nrow(dum))))
## Calculate Quadrature weights using Voronoi cell area
w <- deldir(bind.x$RA, bind.x$DE)$summary$dir.area
## Response
y <- bind.x$Ind/w
# the sum of distances between all pairs of points (the sufficient statistics)
tmp <- cbind(bind.x$RA, bind.x$DE)
t1 <- rdist(tmp)^(-2/k)
t1[t1 == Inf] <- 0
t1 <- rowSums(t1)
t <- -t1
# fit the model using quasipoisson regression
fit <- glm(y ~ t, family = quasipoisson, weights = w)
`
However, the fitted parameter for t is negative which is obviously not a correct value for a softcore point process. Also, my point pattern is actually simulated from a softcore process so it does not make sense that the fitted parameter is negative. I tried my best to find any bugs in the code but I can't seem to find it. The only potential issue I see is that my sufficient statistics is extremely large (on the order of 10^14) which I fear may cause numerical issues. But the statistics are large because my observation window spans a very small unit and the average distance between a pair of points is around 0.006. So sufficient statistics based on this will certainly be very large and my intuition tells me that it should not cause a numerical problem and make the fitted parameter to be negative.
Can anybody help and check if my code is correct? Thanks very much!

Why does gstat.predict() function often return NaN values (GSTAT Package)? (R version 3.3.2, Windows 10)

I am trying to simulate a combination of two different random fields (yy1 and yy2 with different mean and correlation length) with an irregular boundary using Gstat package in R. I have attached the picture of my expected outcome. The code is not giving such output consistently and I am frequently getting atleast one of the yy1 and yy2 as NaNs, which results in the Undesired output as shown in image.
The key steps I used are:
1) Created two gstat objects with different means and psill (rf1 and rf2)
2) Created two computational grids (one for each random field) in the form of data frame with two variables “x” and “y” coordinates.
3) Predicted two random fields using unconditional simulation.
Any help in this regard would be highly appreciated.
Attachments: 2 images (link provided) and 1 R code
1) Expected Outcome
2) Undesired Outcome
library(gstat)
xy <- expand.grid(1:150, 1:200) # 150 x 200 grid is created in the form of a dataframe with x and y vectors
names(xy)<-c('x','y') # giving names to the variables
# creating gsat objects
rf1<-gstat(formula=z~1,locations=~x+y,dummy = T,beta=c(1,0,0), model=vgm(psill=0.025, range=5, model='Exp'), nmax=20) # dummy=T treats this as a unditional simulation
rf2<-gstat(formula=z~1,locations=~x+y,dummy = T,beta=c(4,0,0), model=vgm(psill=0.025, range=10, model='Exp'), nmax=20) # dummy=T treats this as a unditional simulation
# creating two computational grid
rows<-nrow(xy)
xy_shift <- expand.grid(60:90, 75:100)
names(xy_shift)<-c('x','y')
library(dplyr) # for antijoin
xy1<-xy[1:(rows/2),]
xy1<-anti_join(xy1, xy_shift, by = c("x","y")) # creating the irregular boundary
xy2<-rbind(xy[(rows/2+1):rows,],xy_shift)
library(sp)
yy1<- predict(rf1, newdata=xy1, nsim=1) # random field 1
yy2<- predict(rf2, newdata=xy2, nsim=1) # random field 2
rf1_label<-gl(1,length(yy1[,1]),labels="field1")
rf2_label<-gl(1,length(yy2[,1]),labels="field2")
yy1<-cbind(yy1,field=rf1_label)
yy2<-cbind(yy2,field=rf2_label)
yy<-rbind(yy1,yy2)
yyplot<-yy[,c(1,2,3)]
# plotting the field
gridded(yyplot) = ~x+y
spplot(obj=yyplot[1],scales=list(draw = TRUE))

spBayes spLM function with duplicate coordinates

I am using the spRecover function in package spBayes to produce a spatial univariate model.
Here is a reproducible example where there I made a duplicate coordinate point. The modeling procedure itself executes just fine, but it won't let me recover the spatial effects for each site:
require(spBayes)
set.seed(444)
N = 200
y = rnorm(N,0,100)
x = rnorm(N,2,7)
df <- as.data.frame(cbind((rnorm(N,5,2.5)),rep('location1',N)))
coord <- cbind(runif(N,-30,30),runif(N,-180,180))
coord[2,] <- coord [1,]
n.samples <- 1000
bef.sp <- spLM(y ~ x, ## the equation
data = df, coords=coord, ## data and coordinates
starting=list("phi"=3/200,"sigma.sq"=0.08,"tau.sq"=0.02),## start values
tuning=list("phi"=0.1, "sigma.sq"=0.05, "tau.sq"=0.05), ## tuning values
priors=list("phi.Unif"=c(3/1500, 3/50), "sigma.sq.IG"=c(2, 0.08),"tau.sq.IG"=c(2, 0.02)), ## priors
cov.model="exponential",n.samples=n.samples)
burn.in <- floor(0.75*n.samples)
bef.sp <- spRecover(bef.sp, start=burn.in, thin=2)
The error received is:
Error in spRecover(bef.sp, start = burn.in, thin = 2) :
c++ error: dpotrf failed
I found a post by the package author indicating this error might come up if one has replicated coordinates. I definitely have duplicated coordinates, since many sites were sampled many times (on the same day; this is not a time-series issue). How do I get the model to accept that there is lots of replication within each coordinate pair, and to recover individual spatial effects values for each site?
Thanks!

R: Calculate sill, range and nugget from a raster object

I need to calculate the sill, range and nugget from a raster layer. I have explored gstat, usdm packages where one can create variogram however I couln't find a function which given a raster layer will estimate these parameters.In most of the functions these parameters have to be defined eg. krigging.
I have raster data layers for different heights which looks similar to
I would like get the sill, nugget and range from the parameters of semivariogram fitted to these data layers to create a plot similar to this:
The original data layers are available here as a multiband tiff. Here is a figure from this paper which further illustrates the concept.
Using gstat, here is an example:
library(raster)
library(gstat)
demo(meuse, ask = FALSE, echo = FALSE)
set.seed(131) # make random numbers reproducible
# add some noise with .1 variance
meuse.grid$dist = meuse.grid$dist + rnorm(nrow(meuse.grid), sd=sqrt(.1))
r = raster(meuse.grid["dist"])
v = variogram(dist~1, as(r, "SpatialPixelsDataFrame"))
(f = fit.variogram(v, vgm("Sph")))
# model psill range
# 1 Nug 0.09035948 0.000
# 2 Sph 0.06709838 1216.737
f$psill[2] # sill
# [1] 0.06709838
f$range[2] # range
# [1] 1216.737
f$psill[1] # nugget
# [1] 0.09035948
Plug in your own raster for r, and it should work. Change the Sph to fit another variogram model, try plot(v,f) to verify the plot.
This is just a guess. This is how I estimate semi variance
where n is the number of layers which their mean is less than the total mean. m is the total mean across all the layers. r is the mean of each layer that fell below the total mean.
s <- stack("old_gap_.tif")
m <- cellStats(mean(s), stat="mean", na.rm=T) # 0.5620522
r <- m[m < 0.5620522]
sem <- 1/53 * (0.5620522 - r)^2
plot(sem, r)

applying the pvclust R function to a precomputed dist object

I'm using R to perform an hierarchical clustering. As a first approach I used hclust and performed the following steps:
I imported the distance matrix
I used the as.dist function to transform it in a dist object
I run hclust on the dist object
Here's the R code:
distm <- read.csv("distMatrix.csv")
d <- as.dist(distm)
hclust(d, "ward")
At this point I would like to do something similar with the function pvclust; however, I cannot because it's not possible to pass a precomputed dist object. How can I proceed considering that I'm using a distance not available among those provided by the dist function of R?
I've tested the suggestion of Vincent, you can do the following (my data set is a dissimilarity matrix):
# Import you data
distm <- read.csv("distMatrix.csv")
d <- as.dist(distm)
# Compute the eigenvalues
x <- cmdscale(d,1,eig=T)
# Plot the eigenvalues and choose the correct number of dimensions (eigenvalues close to 0)
plot(x$eig,
type="h", lwd=5, las=1,
xlab="Number of dimensions",
ylab="Eigenvalues")
# Recover the coordinates that give the same distance matrix with the correct number of dimensions
x <- cmdscale(d,nb_dimensions)
# As mentioned by Stéphane, pvclust() clusters columns
pvclust(t(x))
If the dataset is not too large, you can embed your n points in a space of dimension n-1, with the same distance matrix.
# Sample distance matrix
n <- 100
k <- 1000
d <- dist( matrix( rnorm(k*n), nc=k ), method="manhattan" )
# Recover some coordinates that give the same distance matrix
x <- cmdscale(d, n-1)
stopifnot( sum(abs(dist(x) - d)) < 1e-6 )
# You can then indifferently use x or d
r1 <- hclust(d)
r2 <- hclust(dist(x)) # identical to r1
library(pvclust)
r3 <- pvclust(x)
If the dataset is large, you may have to check how pvclust is implemented.
It's not clear to me whether you only have a distance matrix, or you computed it beforehand. In the former case, as already suggested by #Vincent, it would not be too difficult to tweak the R code of pvclust itself (using fix() or whatever; I provided some hints on another question on CrossValidated). In the latter case, the authors of pvclust provide an example on how to use a custom distance function, although that means you will have to install their "unofficial version".

Resources