How to solve "impacts()" neighbors length error after running spdep::lagsarlm (Spatial Autoregressive Regression model)? - r

I have 9,150 polygons in my dataset. I was trying to run a spatial autoregressive model (SAR) in spdep to test spatial dependence of my outcome variable. After running the model, I wanted to examine the direct/indirect impacts, but encountered an error that seems to have something to do with the length of neighbors in the weights matrix not being equal to n.
I tried running the very same equation as an SLX model (spatially lagged X), and impacts() worked fine, even though some polygons in my set had no neighbors. I Googled and searched the spdep documentation, but couldn't find a clue on how to solve this error.
# Note: in current releases, lagsarlm() and impacts() live in the spatialreg package
library(spdep)
library(spatialreg)
# Define queen contiguity neighbors for polyset and store them as a listw object
q.nbrs <- poly2nb(polyset)
listweights <- nb2listw(q.nbrs, zero.policy = TRUE)
# Defining the model
model.equation <- TIME ~ A + B + C
# Run SAR model
reg <- lagsarlm(model.equation, data = polyset, listw = listweights, zero.policy = TRUE)
# Run impacts() to show direct/indirect impacts
impacts(reg, listw = listweights, zero.policy = TRUE)
Error in intImpacts(rho = rho, beta = beta, P = P, n = n, mu = mu, Sigma = Sigma, :
length(listweights$neighbours) == n is not TRUE

I know this is a question from 2019, but maybe it can help people dealing with the same problem. I found out that in my case the problem was the class of the dataset: data = polyset should be a "SpatialPolygonsDataFrame". This can be achieved by converting your data:
polyset_spatial_sf <- sf::as_Spatial(polyset, IDs = polyset$ID)
Then rerun your code.
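A minimal end-to-end sketch of that fix, assuming polyset is an sf object with an ID column (everything else follows the question's code):
polyset_sp <- sf::as_Spatial(polyset, IDs = polyset$ID)  # sf -> SpatialPolygonsDataFrame
q.nbrs <- poly2nb(polyset_sp)
listweights <- nb2listw(q.nbrs, zero.policy = TRUE)
reg <- lagsarlm(TIME ~ A + B + C, data = polyset_sp, listw = listweights, zero.policy = TRUE)
impacts(reg, listw = listweights, zero.policy = TRUE)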

Related

Kriging with gstat: "Covariance matrix singular at location" with predict

I am trying to do an estimation by kriging with gstat, but it never succeeds because of an issue with the covariance matrix: every location I want an estimate for is skipped. I get the following warning message for each location:
1: In predict.gstat(g, newdata = newdata, block = block, nsim = nsim, :
Covariance matrix singular at location [-8.07794,48.0158,0]: skipping...
And all estimates are NA.
So far I've browsed many related StackOverflow threads, but none resolved my problems (https://gis.stackexchange.com/questions/222192/r-gstat-krige-covariance-matrix-singular-at-location-5-88-47-4-0-skipping ; https://gis.stackexchange.com/questions/200722/gstat-krige-error-covariance-matrix-singular-at-location-917300-3-6109e06-0 ; https://gis.stackexchange.com/questions/262993/r-gstat-predict-error?rq=1)
I checked that:
there is actually a spatial structure in my dataset (see bubble plot with code below)
there are no duplicate locations
the variogram model is not singular and has a good fit to the experimental variogram (see plot with code below)
I also tried several values of range, sill, nugget and all the models in the gstat library
The covariance matrix is positive definite and has positive eigenvalues; it is singular according to gstat, but not according to the is.singular.matrix function
There were enough pairs of points to build the experimental variogram
How can I overcome this problem? Any tips for avoiding a singular covariance matrix? I also welcome any "best practices" for kriging.
Code (requires forSO.Rdata: https://www.dropbox.com/s/5vfj2gw9rkt365r/forSO.Rdata?dl=0):
library(ggplot2)
library(gstat)
library(sp)          # spDists(), zerodist()
library(dplyr)       # %>% pipe
library(matrixcalc)  # is.positive.definite(), is.singular.matrix()
#Attached Rdata
load("forSO.Rdata")
#The observations
str(abun)
#Spatial structure
abun %>% as.data.frame %>%
ggplot(aes(lon, lat)) +
geom_point(aes(colour=prop_species_cells), alpha=3/4) +
coord_equal() + theme_bw()
# Number of pairs of points
cvgm <- variogram(prop_species_cells ~1, data=abun, width=3, cutoff=300)
plot(cvgm$dist,cvgm$np)
#Fit a model covariogram
efitted = fit.variogram(cvgm, vgm(model="Mat", range=100, nugget=1), fit.method=7, fit.sills=TRUE, fit.ranges=TRUE)
plot(cvgm,efitted)
#No warning, and the model is non singular
attr(efitted, "singular")
# Covariance matrix (only on a small set of points, I have more than 25,000 points): positive definite, positive eigenvalues and not singular
hex_pointsDegTiny <- hex_pointsDeg
hex_pointsDegTiny@coords <- hex_pointsDegTiny@coords[1:10, ]
dists <- spDists(hex_pointsDegTiny)
covarianceMatrix=variogramLine(efitted, maxdist = max(cvgm$dist), n = 10*max(cvgm$dist), dir = c(1,0,0), dist_vector = dists, covariance = TRUE)
eigen(covarianceMatrix)$values
is.positive.definite(covarianceMatrix)
is.singular.matrix(covarianceMatrix)
# No duplicate locations
zerodist(hex_pointsDegTiny)
# Impossible to krige
OK_fit <- gstat(id = "OK_fit", formula = prop_species_cells ~ 1, data = abun, model = efitted)
dist <- predict(OK_fit, newdata = hex_pointsDegTiny)
dist@data
Actually, there were duplicate locations in the abun dataset (revealed by zerodist(abun)); they were to be sought in the observation data, not in the grid on which I wanted to krige estimates. After getting rid of the duplicates, kriging worked fine.
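A minimal sketch of that fix, assuming abun is the SpatialPointsDataFrame used above (zerodist() returns pairs of rows with identical coordinates):
dupes <- zerodist(abun)
if (nrow(dupes) > 0) abun <- abun[-dupes[, 2], ]  # drop the second member of each duplicate pair
OK_fit <- gstat(id = "OK_fit", formula = prop_species_cells ~ 1, data = abun, model = efitted)
dist <- predict(OK_fit, newdata = hex_pointsDegTiny)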

Parameter estimates using FME ODE model fitting in R

I have a system of ODEs that I am trying to fit to generated data, synthetic or lab. The final product I am interested in is each parameter and its estimated error. We use the R package FME with modCost and modFit. As an example, a system of ODEs may be defined as such:
eqs <- function (time, y, parms, ...) {
  with(as.list(c(parms, y)), {
    dP <- k2*PA - k1*A*P # concentration of nucleic acid
    dA <- dP             # concentration of free protein
    dPA <- -dP
    list(c(dA, dP, dPA))
  })
}
with parameters k1 and k2 and variables A, P and PA. I import the data (not shown) and define the cost function used in modFit:
cost <- function(p, data, ...) {
  yy <- p[c("A", "P", "PA")]     # initial states, taken from the parameter vector
  pp <- p[c("k1", "k2")]         # rate constants
  out <- ode(yy, time, eqs, pp)  # 'time' is the vector of observation times, defined elsewhere
  modCost(out, data, ...)
}
I set some initial conditions with a parms vector and then do the fitting with:
fit <- modFit(f = cost, p = parms, data = dat, weight = "std",
              lower = rep(0, 5),  # one bound per element of p: A, P, PA, k1, k2
              upper = c(600, 100, 600, 0.01, 0.01), method = "Marq")
I then do a final ode() run to get the fitted curves with the best parameters, Bob's your uncle, and boom: estimated parameters. The input numbers don't matter; I hope my process outline is legible for those who use this package.
My issue and question centers around two things: I'm a scientist, a physicist, and the error of the estimated parameters is important to report. Can I generate the estimated error from FME somehow, or is there a separate package for that kind of return?
I don't get your point. You can just use:
summary(fit)
to see the Std. Error.
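For programmatic access, a small sketch, assuming the coefficient table is stored in the par component of FME's summary.modFit object:
sfit <- summary(fit)
sfit                      # prints Estimate, Std. Error, t value, Pr(>|t|)
sfit$par[, "Std. Error"]  # just the standard errors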

How to extract the baseline hazard function h0(t) from glmnet object in R?

I want to know the hazard function at time t: h(t, X) = h0(t) exp(Σ βi Xi). How can I extract the baseline hazard function h0(t) from a glmnet object in R?
What I know is that the basehaz() function in the survival package can extract the baseline hazard from a coxph object only.
I also found a function, glmnet.basesurv(time, event, lp, times.eval = NULL, centered = FALSE), but when I try to use it, there is an error:
Error: could not find function "glmnet.basesurv"
Below is my code, using glmnet to fit the Cox model and obtain the coefficients of the selected variables. Is it possible to get the baseline hazard function h0(t) from this glmnet object?
Code
# Load modelling packages
library(glmnet)
library(survival)  # Surv()
# Split data into training data and testing data
set.seed(101)
train_ratio = 2/3
sample <- sample.int(nrow(x), floor(train_ratio*nrow(x)), replace = F)
x.train <- x[sample, ]
x.test <- x[-sample, ]
y.train <- y[sample, ]
y.test <- y[-sample, ]
surv_obj <- Surv(y.train[,1],y.train[,2])
#
my_alpha = 0.5
fit = glmnet(x = x.train, y = surv_obj, family = "cox",alpha = my_alpha) # fit the model with elastic net method
plot(fit,xvar="lambda", main="cox model coefficient paths(glmnet.fit)\n\n") # Plot the paths for the fit
fit
# cross validation to find out best lambda
cv_fit = cv.glmnet(x = x.train,y = surv_obj , family = "cox",nfolds = 10,alpha = my_alpha)
tencrossfit <- cv_fit$glmnet.fit
plot(cv_fit, main="Cross-validated Deviance(10 folds cv.glmnet.fit)\n\n")
plot(tencrossfit, main="cox model coefficient paths(10 folds cv.glmnet.fit)\n\n")
max(cv_fit$cvm)
summary(cv_fit$cvm)
cv_fit$lambda.min
cv_fit$lambda.1se
coef.min = coef(cv_fit, s = "lambda.1se")
pred_min_value2 <- predict(cv_fit, s=cv_fit$lambda.min, newx=x.test,type="link")
I really appreciate any help you can provide.
The glmnet.basesurv function is part of the hdnom package (which is available on CRAN), not glmnet itself. So install that, and then call it.
I had a similar question. After installing hdnom (install.packages("hdnom")), if you check the function list with library(help = "hdnom"), you can see that the function is now called glmnet_survcurve(). It is internal, so I got it working as hdnom:::glmnet_survcurve(); an example is here:
S <- Surv(data$survtimed, data$outcome)
X_glm <- model.matrix(S ~ ., data[, c("factor1", "factor2")])
cox_model <- glmnet(X_glm, S, family = "cox", alpha = 1, lambda = 0.2)
times <- c(1, 2)  # predict survival and linear predictors at times 1 and 2
predictions <- hdnom:::glmnet_survcurve(cox_model, S[, 1], S[, 2], X_glm, survtime = times)
predictions$p[, 1]  # survival probability at time 1

Fit state space model using dlm

I am about to fit a state space model to a univariate time series (y_t). The model I am trying to fit is:
y_t = F x_t + \epsilon_t,   \epsilon_t \sim N(0, V)
x_{t+1} = G x_t + w_t,   w_t \sim N(0, W)
x_0 \sim N(m_0, C_0)
I use the following R-code:
# Load the package, then create a function of the unknown parameters that returns a dlm object
library(dlm)
Build <- function(theta) {
  dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta[4],
      m0 = theta[5], C0 = theta[6])
}
# Fit the model to the data by maximum likelihood
f1 <- dlmMLE(y, parm = c(1, 1, 0.1, 0.1, 0, 0.1), Build)
But I get the following error message (after running f1):
Error in dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta [4], :
V is not a valid variance matrix
My problem is that I don't understand why V is not a valid variance matrix.
Does anyone know what is wrong?
Thank you in advance
Regards fuente
EDIT:
I tried doing the same, but instead of my real data I used:
y <- rnorm(72,6.44,1.97)
This produced, however, an error involving W (and not V...):
Error in dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta[4], :
W is not a valid variance matrix
I'm confused. Does it have something to do with the starting values passed to parm=...?
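A plausible explanation, offered as an assumption rather than something from this thread: the optimizer inside dlmMLE is free to try negative values for theta[3], theta[4] and theta[6], and dlm() rejects any non-positive variance. A standard remedy is to estimate the variances on the log scale:
Build <- function(theta) {
  dlm(FF = theta[1], GG = theta[2],
      V = exp(theta[3]), W = exp(theta[4]),  # exp() keeps V and W strictly positive
      m0 = theta[5], C0 = exp(theta[6]))
}
f1 <- dlmMLE(y, parm = c(1, 1, log(0.1), log(0.1), 0, log(0.1)), Build)
exp(f1$par[3:4])  # estimated V and W back on the original scale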

spdep "Not yet able to subset general weights lists" listw

I have a problem with spdep. Starting with a matrix of non-missing distances produced by a function:
dist_m <- geoDistMatrix(data1, group = 'fips_dist')
dist_m[upper.tri(dist_m)] <- t(dist_m)[upper.tri(dist_m)]
we then turn it into weights with a linear inverse:
max_dist <- max(dist_m)
w1 <- (max_dist + 1 - dist_m)/(max_dist + 1)
and now
lw <- mat2listw(w1, row.names = rownames(w1), style = 'M')
I check to make sure there are no missing weights:
any(is.na(lw$weights))
and since there aren't any, go ahead with:
errorsarlm(cvote ~ inc, data = data1, lw, method = 'eigen', quiet = F, zero.policy = TRUE)
which leads to the following error:
Error in subset.listw(listw, subset, zero.policy = zero.policy) :
Not yet able to subset general weights lists
This is because at least one observation in data1 is not complete, i.e. has missing values. Hence, errorsarlm wants to subset the data, i.e. restrict it to complete cases, but it cannot do that for general weights lists; that is what the error message says.
Best is to subset the data manually or correct the incomplete cases.
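A sketch of the manual subset, assuming cvote and inc are the only model variables (the data and the weights matrix must be cut down together so their dimensions stay aligned):
keep <- complete.cases(data1[, c("cvote", "inc")])
data1c <- data1[keep, ]
w1c <- w1[keep, keep]  # subset the weights matrix the same way
lwc <- mat2listw(w1c, row.names = rownames(w1c), style = 'M')
errorsarlm(cvote ~ inc, data = data1c, lwc, method = 'eigen', quiet = F, zero.policy = TRUE)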
This is because the spdep function created a listw object only for non-general weights by default. Set zero.policy = TRUE before you call mat2listw or nb2listw so that it considers non-neighbors that have zero value.
