GWR fitting using the mgcv and R2BayesX packages in R

I want to compare GWR fits produced by spgwr and mgcv, but I get an error from the gam function of mgcv. Here is an example:
require(spgwr)
require(mgcv)
require(R2BayesX)
data(columbus)
col.bw <- gwr.sel(crime ~ income + housing, data=columbus,verbose=F,
coords=cbind(columbus$x, columbus$y))
col.gauss <- gwr(crime ~ income + housing, data=columbus,
coords=cbind(columbus$x, columbus$y),
bandwidth=col.bw, hatmatrix=TRUE)
#gwr fitting with Intercept
col.gam<-gam(crime ~s(x,y)+s(x,y)*income+s(x,y)*housing, data=columbus)#mgcv ERROR
b1<-bayesx(crime ~sx(x,y)+sx(x,y)*income+sx(x,y)*housing, data=columbus)#R2Bayesx ERROR
Questions:
1. How can I fit the same GWR using the gam and bayesx functions (with smooth functions of location)?
2. How can I control the parameters so that the fits are as similar as possible, including the optimal bandwidth?

The mgcv error comes from the fact that you are specifying "interactions" between the spatial smooth and the variables income and housing. Read ?gam.models for details on using by terms. I think for this you need
col.gam <- gam(crime ~s(x,y, k = 5) + s(x,y, by = income, k = 5) +
s(x,y, by = housing, k = 5), data=columbus)
In this example, as there are only 49 observations, you need to restrict the dimensions of the basis functions, which I do here with k = 5, but you should investigate whether you need to vary these a little, within the constraints of the data.
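To make the by-variable idea concrete, here is a minimal sketch on simulated data rather than columbus. The variable names (lon, lat, z), the basis size k = 10 and the "true" coefficient surface are all assumptions chosen for illustration. A term s(lon, lat, by = z) multiplies a smooth function of location by z, so evaluating that term at z = 1 should recover the estimated spatially varying coefficient.

```r
library(mgcv)

set.seed(1)
n <- 200
d <- data.frame(lon = runif(n), lat = runif(n), z = rnorm(n))
# Assumed truth: the coefficient of z increases from west to east
d$y <- 1 + (1 + d$lon) * d$z + rnorm(n, sd = 0.5)

# Spatially varying intercept plus a spatially varying coefficient for z
fit <- gam(y ~ s(lon, lat, k = 10) + s(lon, lat, by = z, k = 10),
           data = d, method = "REML")

# Evaluate the by-term at z = 1 to read off the local coefficient of z
nd <- transform(d, z = 1)
tm <- predict(fit, newdata = nd, type = "terms")
beta_z <- tm[, grep(":z", colnames(tm), fixed = TRUE)]
summary(beta_z)
```

The range of beta_z gives a rough picture of how much the coefficient varies over space, which is the quantity one would compare against the local coefficients reported by gwr.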
By the looks of the error from bayesx, you have the same issue of specifying the model incorrectly. I'm not familiar with bayesx(), but it looks like it uses the same s() function as supplied with mgcv, so the model specification should be the same as I show above.
As for 2., can you expand on what you mean here? Comparable between gam() and bayesx(), or getting one or both of these comparable with the spgwr model?

Related

ggcoef_model error when two random intercepts

When trying to graph the conditional fixed effects of a glmmTMB model with two random intercepts in GGally I get the error:
There was an error calling "tidy_fun()". Most likely, this is because the
function supplied in "tidy_fun=" was misspelled, does not exist, is not
compatible with your object, or was missing necessary arguments (e.g. "conf.level=" or "conf.int="). See error message below.
Error: Error in "stop_vctrs()":
! Can't recycle "..1" (size 3) to match "..2" (size 2).
I have tinkered with figuring out the issue and it seems to be related to the two random intercepts included in the model. I have also tried extracting the coefficient and standard error information separately through broom.mixed::tidy and then feeding the data frame into GGally::ggcoef(), to no avail. Any suggestions?
# Example with built-in randu data set
library(glmmTMB)
library(GGally)
data(randu)
randu$A <- factor(rep(c(1,2), 200))
randu$B <- factor(rep(c(1,2,3,4), 100))
# Model
test <- glmmTMB(y ~ x + z + (0 +x|A) + (1|B), family="gaussian", data=randu)
# A few of my attempts at graphing--works fine when only one random effects term is in model
ggcoef_model(test)
ggcoef_model(test, tidy_fun = broom.mixed::tidy)
ggcoef_model(test, tidy_fun = broom.mixed::tidy, conf.int = T, intercept=F)
ggcoef_model(test, tidy_fun = broom.mixed::tidy(test, effects="fixed", component = "cond", conf.int = TRUE))
There are some (old!) bugs that have recently been fixed which would make confidence interval reporting on RE parameters break for any model with multiple random terms (I think). I believe that if you are able to install updated versions of both glmmTMB and broom.mixed:
remotes::install_github("glmmTMB/glmmTMB/glmmTMB@ci_tweaks")
remotes::install_github("bbolker/broom.mixed")
then ggcoef_model(test) will work.

Using caret::train to calculate prediction error (MdAE) of GLMMs with beta-binomial errors

The question is more or less as the title indicates. I would like to use the caret::train function with beta-binomial models made with glmmTMB package (although I am not opposed to other functions capable of fitting beta-binomial models) to calculate median absolute error (MdAE) estimates through jack-knife (leave-one-out) cross-validation. The glmmTMBControl function is already capable of estimating the optimal dispersion parameter but I was hoping to retain this information somehow as well... or having caret do the calculation possibly?
The dataset I am working with looks like this:
df <- data.frame(Effect = rep(seq(from = 0.05, to = 1, by = 0.05), each = 5), Time = rep(seq(1:20), each = 5))
Ideally I would be able to pass the glmmTMB function to train like so:
BB.glmm1 <- train(Time ~ Effect,
data = df, method = "glmmTMB",
method = "", metric = "MAD")
The output would be as per the examples contained in train, although possibly with estimates for the dispersion parameter.
Although I am in no way opposed to work arounds - Thank you in advance!
I am unsure how to perform the required operation with caret without creating a custom method, but I trust it is fairly easy to implement with a for loop (or lapply).
In the example I will use the sleepstudy data set since your example data throws a bunch of warnings.
library(glmmTMB)
To perform LOOCV, for every row, fit a model without that row and predict on that row:
data(sleepstudy,package="lme4")
LOOCV <- lapply(1:nrow(sleepstudy), function(x){
m1 <- glmmTMB(Reaction ~ Days + (Days|Subject),
data = sleepstudy[-x,])
return(predict(m1, sleepstudy[x,], type = "response"))
})
Get the median of the absolute residuals (I think this is MdAE? If not, post a comment on how it's calculated):
median(abs(unlist(LOOCV) - sleepstudy$Reaction))
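The same leave-one-out pattern can be wrapped in two small helpers. This is only a sketch: the names mdae and loo_predict are hypothetical, and lm on the built-in cars data is swapped in for glmmTMB so that the example runs with base R alone. Any fitting function that accepts a formula and a data argument and has a predict method should slot into fit_fun in the same way.

```r
# Median absolute error between observed and predicted values
mdae <- function(observed, predicted) median(abs(observed - predicted))

# Leave-one-out predictions: refit without row i, then predict row i
loo_predict <- function(formula, data, fit_fun = lm) {
  vapply(seq_len(nrow(data)), function(i) {
    fit <- fit_fun(formula, data = data[-i, , drop = FALSE])
    unname(predict(fit, newdata = data[i, , drop = FALSE]))
  }, numeric(1))
}

# lm stands in for glmmTMB here (assumption made to keep the sketch self-contained)
pred <- loo_predict(dist ~ speed, cars)
mdae(cars$dist, pred)
```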

Extracting path coefficients of piecewise SEM (structural equation model)

I'm constructing a piecewise structural equation model using the piecewiseSEM package in R (Lefcheck - https://cran.r-project.org/web/packages/piecewiseSEM/vignettes/piecewiseSEM.html)
I already created the model set and I could evaluate the model fit, so the model itself works. Also, the data fits the model (p = 0.528).
But I do not succeed in extracting the path coefficients.
This is the error I get: Error in cbind(Xlarge, Xsmall) : number of rows of matrices must match (see arg 2)
I already tried the following, but neither worked:
standardising my data, because of the warning: Some predictor variables are on very different scales: consider rescaling
adapting my data (removing some NA values)
This is my model list:
predatielijst = list(
lmer(plantgrootte ~ gapfraction + olsen_P + (1|plot_ID), data = d),
glmer(piek1 ~ gapfraction + olsen_P + plantgrootte + (1|plot_ID),
family = poisson, data = d),
glmer(predatie ~ piek1 + (1|plot_ID), family = binomial, data = d)
)
with "predatie" being a binary variable (yes or no) and all the rest continuous variables (gapfraction, plantgrootte, olsen_P & piek1)
Thanks in advance!
Try installing the development version:
library(devtools)
install_github("jslefche/piecewiseSEM@2.0")
Replace list with psem and run the coefs or summary function. It will likely get rid of your error. If not, open a bug on GitHub!
WARNING: this will overwrite your current version from CRAN. You will need to reinstall from CRAN to get version 1.4 back.
Try using lme (from the nlme package) instead of glmer. As far as I understand, the fact that lmer does not provide p-values (while lme does) seems to be the problem here.
Hope this works.

IV Estimation with Cluster Robust Standard Errors using the plm package in R

I'm using the plm package for panel data to do instrumental variable estimation. However, it seems that calculating cluster robust standard errors by using the vcovHC() function is not supported.
More specifically, when I use the vcovHC() function, the following error message is displayed:
Error in vcovG.plm(x, type = type, cluster = cluster, l = 0, inner = inner, :
Method not available for IV
Example:
data("Wages", package = "plm")
IV <- plm(lwage ~ south + exp | wks + south,
data = Wages, model = "pooling", index = 595)
vcvIV <- vcovHC(IV)
According to this thread, someone worked on a fix two years ago. Is there any progress on the issue? I know that the packages "lfe" and "ivpack" can compute cluster robust standard errors for IV estimation, but neither of them allows for random effects/intercepts.
Indeed, it's not implemented. However, you can use Schrimpf's clustered errors function, which is applied directly to an object of the plm class.
Using your example:
library(plm)
data("Wages", package = "plm")
IV <- plm(lwage ~ south + exp | wks + south, data = Wages, model = "pooling", index = 595)
Wages$id <- rep(1:595, each = 7)
cl.plm(Wages, IV, Wages$id)
Here I'm using Wages$id as the first panel dimension, around which the clusters are formed. You may want to compare these results with those obtained in other software. In any case, the code is simple, which leaves room for some tricks. The cl.plm function is based on Arai's clustering notes, which can help you further.
You can obtain the same result as cl.plm by doing this in Stata:
ivregress 2sls lwage south (exp = wks), vce(cluster id) small
Or for the within model:
xtset id time, generic
xtivreg2 lwage south (exp = wks), fe small cluster(id)
Note, however, that I used the small-sample formulation in Stata, which is not a big deal. More about this here. Anyway, cl.plm properly deals with the plm class object.
For the sake of completeness: as suggested by @Helix123, you can use the development version (1.6-1) of the plm package and proceed as you did in your question.

Error with ZIP and ZINB models upon subsetting and factoring data

I am trying to run ZIP and ZINB models to look at some of the factors that might help explain disease (orf) distribution within 8 geographical regions. The models work fine for some regions but not for others: after factoring the variables and running the model in R, I get an error message.
How can I solve this problem, or is there a model that might work better with the subset? The analysis will only make sense when it is uniform across all regions.
zinb3 = zeroinfl(Cases2012 ~ Precip+ Altitude +factor(Breed)+ factor(Farming.Practise)+factor(Lambing.Management)+ factor(Thistles) ,data=orf3, dist="negbin",link="logit")
Error in solve.default(as.matrix(fit$hessian)) :
system is computationally singular: reciprocal condition number = 2.99934e-24
Results after fitting zerotrunc and glm as suggested by @Achim Zeileis. How do I interpret the zerotrunc output, given that there are no p-values? Also, how can I correct the error with glm?
zerotrunc(Cases2012 ~ Flock2012+Stocking.Density2012+ Precip+ Altitude +factor(Breed)+ factor(Farming.Practise)+factor(Lambing.Management)+ factor(Thistles),data=orf1, subset = Cases2012> 0)
Call:
zerotrunc(formula = Cases2012 ~ Flock2012 + Stocking.Density2012 + Precip + Altitude +
factor(Breed) + factor(Farming.Practise) + factor(Lambing.Management) + factor(Thistles),
data = orf1, subset = Cases2012 > 0)
Coefficients (truncated poisson with log link):
                 (Intercept)                   Flock2012        Stocking.Density2012
                  14.1427130                  -0.0001318                  -0.0871504
                      Precip                    Altitude              factor(Breed)2
                  -0.1467075                  -0.0115919                  -3.2138767
   factor(Farming.Practise)2 factor(Lambing.Management)2           factor(Thistles)3
                   1.3699477                  -2.9790725                   2.0403543
           factor(Thistles)4
                   0.8685876
glm(factor(Cases2012 ~ 0) ~ Precip+ Altitude +factor(Breed)+ factor(Farming.Practise)+factor(Lambing.Management)+ factor(Thistles) +Flock2012+Stocking.Density2012 ,data=orf1, family = binomial)
Error in unique.default(x, nmax = nmax) :
unique() applies only to vectors
It's hard to say exactly what is going on based on the information provided. However, I suspect that the data in some regions do not allow the specified model to be fitted. For example, there might be some regions where certain factor levels (of Breed, Farming.Practise, Lambing.Management or Thistles) only have zero values (or only non-zero values, but that is less frequent in practice). Then the coefficient estimates often degenerate so that the associated zero-inflation probability goes to 1 and the count coefficient cannot be estimated.
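One way to check for this is to cross-tabulate each factor against whether any cases were recorded. The sketch below uses a small made-up data frame (the name region and its values are hypothetical, not the orf data) in which one breed never has a case.

```r
# Hypothetical regional subset: breed "C" never has a case
region <- data.frame(
  Breed = factor(rep(c("A", "B", "C"), each = 10)),
  Cases = c(0, 2, 1, 0, 3, 1, 0, 4, 2, 1,
            1, 0, 0, 2, 5, 0, 1, 3, 0, 2,
            rep(0, 10))
)

# Rows: factor level; columns: whether any cases were recorded
tab <- table(region$Breed, region$Cases > 0)
tab

# Levels with an empty cell (all zero or all non-zero) are suspects
suspect <- rownames(tab)[apply(tab == 0, 1, any)]
suspect
```

Repeating this for each factor within each region should show which subsets cannot support the full model.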
It's typically easier to separate these effects by using the hurdle rather than the zero-inflation model. Then the two parts of the model can also be fitted separately by glm(factor(y > 0) ~ ..., ..., family = binomial) and zerotrunc(y ~ ..., ..., subset = y > 0). The latter function is essentially the same code as pscl uses but has been factored into a standalone function in the package countreg on R-Forge (not yet on CRAN).
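As a minimal sketch of the zero-hurdle part only, the following uses simulated data (the variables x and y and all parameter values are assumptions). The response is built with the comparison y > 0 inside factor(); writing a formula such as Cases2012 ~ 0 in that position hands factor() a formula object instead of a vector, which is a likely source of the unique() error shown in the question. The count part would be fitted with zerotrunc from countreg and is not shown here because that package is not on CRAN.

```r
set.seed(7)
n <- 300
x <- rnorm(n)
# Simulated hurdle-type counts: a zero/non-zero step, then positive counts
# (1 + Poisson is only a stand-in for a zero-truncated count distribution)
nonzero <- rbinom(n, 1, plogis(-0.5 + x))
y <- nonzero * (1 + rpois(n, exp(0.3 + 0.2 * x)))
dat <- data.frame(y = y, x = x)

# Zero hurdle: binomial GLM for "any cases at all"
hurdle_zero <- glm(factor(y > 0) ~ x, data = dat, family = binomial)
coef(summary(hurdle_zero))
```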
