I'm trying to build some models using zero-inflated Poisson regression with the pscl package. After manipulating the output object, which turns out to be of class zeroinfl, I find that residuals(fm_zip) is not equal to fm_zip$residuals.
The following is an example of what I'm talking about:
library("pscl")
data("bioChemists", package = "pscl")
fm_zip <- zeroinfl(art ~ . | 1, data = bioChemists)
names(fm_zip)
fm_zip$residuals
residuals(fm_zip)
all.equal(fm_zip$residuals,residuals(fm_zip))
ggplot2::qplot(fm_zip$residuals, residuals(fm_zip))  # qplot() comes from ggplot2
As you can see, the results are not equal. I would have expected both ways to be equivalent, but it seems they are not. Could you explain what is wrong here? According to the R help for residuals, both alternatives are supposed to return the difference (observed - fitted). By contrast, when I do the same with a plain vanilla linear regression, they are equal.
My R version is:
sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)...
and the package version is pscl_1.04.4
Any help is appreciated.
To get equal results you should set type to "response" (the default is "pearson"):
all.equal(fm_zip$residuals, residuals(fm_zip, type = "response"))
[1] TRUE
From the ?residuals.zeroinfl:
The residuals method can compute raw residuals (observed - fitted) and
Pearson residuals (raw residuals scaled by square root of variance
function).
The Pearson variance is defined as:
mu <- predict(fm_zip, type = "count")
phi <- predict(fm_zip, type = "zero")
theta1 <- switch(fm_zip$dist, poisson = 0,
geometric = 1,
negbin = 1/fm_zip$theta)
variance <- fm_zip$fitted.values * (1 + (phi + theta1) * mu)
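To make the distinction between raw and Pearson residuals concrete, here is a minimal base R sketch. It uses a plain Poisson glm() on simulated data rather than zeroinfl (an assumption made so that it runs without pscl), so the variance function is simply V(mu) = mu instead of the zero-inflated variance shown above. The variable names are illustrative only.

```r
# Simulated data: a plain Poisson GLM stands in for the zero-inflated model.
set.seed(10)
x <- runif(100)
y <- rpois(100, lambda = exp(0.5 + x))
fit <- glm(y ~ x, family = poisson)

# Raw residuals: observed minus fitted.
raw <- y - fitted(fit)

# Pearson residuals: raw residuals scaled by the square root of the variance
# function; for the Poisson family the variance equals the mean.
pearson <- raw / sqrt(fitted(fit))

# Both agree with the accessor when the matching type is requested.
all.equal(unname(raw), unname(residuals(fit, type = "response")))
all.equal(unname(pearson), unname(residuals(fit, type = "pearson")))
```

The same caution applies to glm objects: fit$residuals stores working residuals, while residuals(fit) returns deviance residuals by default, so the stored component and the accessor need not agree there either.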
EDIT
Don't hesitate to read the underlying code; it is generally a good way to learn and can help you avoid a lot of confusion. To get the code behind the S3 method residuals.zeroinfl, you can use something like this:
getS3method('residuals','zeroinfl')
I estimated a mixed effect model with a nested random effect structure (participants were in different groups) with the lmer command of the lme4 package.
mixed.model <- lmer(ln.v ~ treatment*level+age+income+(1 | group/participant),data=data)
Then I bootstrapped the model with the bootstrap command from the lmeresampler package because of the nested structure. I used the semi-parametric bootstrap.
boot.mixed.model <- bootstrap(model = mixed.model, type = "cgr", fn = extractor, B = 10000, resample=c(data$group,data$participant))
I can obtain bootstrapped confidence intervals via boot.ci (package boot), but in addition I want to report the coefficients' p-values. The output of the bootstrapped model boot.mixed.model provides only the bias and the standard error:
Bootstrap Statistics :
        original          bias  std. error
t1*  0.658442415 -7.060056e-02  2.34685668
t2* -0.452128438 -2.755208e-03  0.17041300
…
What is the best way to calculate the p-values based on these values?
I am not familiar with the lmeresampler package, and it seems to have been removed from CRAN due to compatibility issues (failed CRAN checks).
Also, the question does not include data and extractor is not defined, so the example is not reproducible. However, the output is the same as you would get from the bootMer function in lme4, so I will produce an example using that built-in function.
Basically this follows the example from the help(bootMer) page, but expanded for the specific problem. If the object returned by the lmeresampler package is similar, it will contain the same components used here.
Reproducible example
library(lme4)
data(Dyestuff, package = "lme4")
fm01ML <- lmer(Yield ~ 1|Batch, Dyestuff, REML = FALSE)
Now the bootMer function simply requires a function, which outputs a vector of interesting parameters.
StatFun <- function(merMod){
pars <- getME(merMod, c("fixef", "theta", "sigma"))
c(beta = pars$fixef, theta = unname(pars$theta * pars$sigma), sigma = pars$sigma) ### <<== Error corrected
}
We can perform our bootstrapping using bootMer, which also offers parametric options via type (I suggest reading the details on the help(bootMer) page for more information):
boo01 <- bootMer(fm01ML, StatFun, nsim = 100, seed = 101)
Now, for more precise p-values, I'd advise a number of simulations closer to 1000, but for time reasons that might not be feasible in every circumstance.
Regardless, the output is stored in a matrix t, which we can use to perform a simple Kolmogorov-supremum test:
H0 <- c(0, 0, 0)
Test <- sweep(abs(boo01$t), 2, H0, "-") <= H0 ###<<=== Error corrected
pVals <- colSums(Test)/nrow(Test)
print(pVals)
#output#
beta.(Intercept)  theta  sigma
            0.00   0.12   0.00
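As a complementary sketch in base R (not the lmeresampler or bootMer API), the snippet below shows two common ways to turn bootstrap output into two-sided p-values. Part (a) is a Wald-type normal approximation that uses only the "original" and "std. error" columns quoted in the question. Part (b) is a shift-based bootstrap p-value that centres the replicates at the null value; its simulated data, boot_est and B are assumptions made purely for illustration.

```r
# (a) Wald-type approximation from the t1* and t2* rows in the question.
est <- c(t1 = 0.658442415, t2 = -0.452128438)
se  <- c(t1 = 2.34685668,  t2 = 0.17041300)
z   <- est / se
p_wald <- 2 * pnorm(-abs(z))   # two-sided; assumes approximate normality

# (b) Shift-based bootstrap p-value on simulated data (hypothetical example).
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
slope <- coef(lm(y ~ x))[["x"]]

B <- 999
boot_est <- replicate(B, {
  idx <- sample.int(n, replace = TRUE)   # case resampling
  coef(lm(y[idx] ~ x[idx]))[[2]]         # slope of the refitted model
})

# Centre the replicates so they mimic the sampling distribution under
# H0: slope = 0, then count how often they are at least as extreme.
null_dist <- boot_est - slope
p_boot <- (1 + sum(abs(null_dist) >= abs(slope))) / (B + 1)

print(p_wald)
print(p_boot)
```

The Wald version ignores the bias column and relies on normality of the estimator; the shift-based version relaxes normality but needs the full matrix of replicates (the t component of a boot-style object) rather than just the summary table.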
I'm constructing a piecewise structural equation model using the piecewiseSEM package in R (Lefcheck - https://cran.r-project.org/web/packages/piecewiseSEM/vignettes/piecewiseSEM.html)
I already created the model set and I could evaluate the model fit, so the model itself works. Also, the data fits the model (p = 0.528).
But I do not succeed in extracting the path coefficients.
This is the error I get: Error in cbind(Xlarge, Xsmall) : number of rows of matrices must match (see arg 2)
I already tried (but this did not work):
standardising my data because of the warning: Some predictor variables are on very different scales: consider rescaling
adapted my data (threw some NA values away)
This is my modellist:
predatielijst = list(
lmer(plantgrootte ~ gapfraction + olsen_P + (1|plot_ID), data = d),
glmer(piek1 ~ gapfraction + olsen_P + plantgrootte + (1|plot_ID),
family = poisson, data = d),
glmer(predatie ~ piek1 + (1|plot_ID), family = binomial, data = d)
)
with "predatie" being a binary variable (yes or no) and all the rest continuous variables (gapfraction, plantgrootte, olsen_P & piek1)
Thanks in advance!
Try installing the development version:
library(devtools)
install_github("jslefche/piecewiseSEM@2.0")
Replace list with psem and run the coefs or summary function. It will likely get rid of your error. If not, open a bug on Github!
WARNING: this will overwrite your current version from CRAN. You will need to reinstall from CRAN to get version 1.4 back.
Try using lme (from the nlme package) instead of glmer. As far as I understand, the problem seems to be that lmer does not provide p-values (while lme does).
Hope this works.
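Since the error complains about mismatched row counts, one thing worth ruling out (an assumption, not something confirmed in this thread) is that each sub-model silently drops a different set of incomplete rows. A minimal base R sketch of restricting the data to rows that are complete for every variable used in the model list follows; the data frame here is a simulated stand-in for d, with column names taken from the question.

```r
# Hypothetical stand-in for the data frame `d` from the question.
set.seed(3)
d <- data.frame(
  plot_ID      = factor(rep(1:10, each = 5)),
  gapfraction  = runif(50),
  olsen_P      = runif(50, 5, 40),
  plantgrootte = rnorm(50, 30, 5),
  piek1        = rpois(50, 4),
  predatie     = rbinom(50, 1, 0.4)
)
d$olsen_P[c(3, 17)]   <- NA   # missing values scattered over different columns
d$piek1[c(8, 17, 42)] <- NA

# All variables that appear anywhere in the model list.
vars <- c("plantgrootte", "gapfraction", "olsen_P", "piek1", "predatie", "plot_ID")

# Count missing values per variable, then keep only fully observed rows.
print(colSums(is.na(d[, vars])))
d_complete <- d[complete.cases(d[, vars]), ]
print(c(before = nrow(d), after = nrow(d_complete)))
```

Fitting every model in the list on d_complete guarantees that all of them see the same rows.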
I'm using the plm package for panel data to do instrumental variable estimation. However, it seems that calculating cluster robust standard errors by using the vcovHC() function is not supported.
More specifically, when I use the vcovHC() function, the following error message is displayed:
Error in vcovG.plm(x, type = type, cluster = cluster, l = 0, inner = inner, :
Method not available for IV
Example:
data("Wages", package = "plm")
IV <- plm(lwage ~ south + exp | wks + south,
data = Wages, model = "pooling", index = 595)
vcvIV <- vcovHC(IV)
According to this thread, someone worked on a fix two years ago. Is there any progress on the issue? I know that the packages "lfe" and "ivpack" can compute cluster-robust standard errors for IV estimation, but neither of them allows for random effects/intercepts.
Indeed, it's not implemented. However, you can use Schrimpf's clustered errors function, which can be applied directly to an object of the plm class.
Using your example:
library(plm)
data("Wages", package = "plm")
IV <- plm(lwage ~ south + exp | wks + south, data = Wages, model = "pooling", index = 595)
Wages$id <- rep(1:595, each = 7)
cl.plm(Wages, IV, Wages$id)
Here I'm using Wages$id as the first panel dimension, around which the clusters will be formed. You may want to compare these results with those obtained in other software. In any case, the code is simple and easy to adapt. The cl.plm function is based on Arai's clustering notes, which can help you further.
You can obtain the same result from cl.plm doing this in Stata:
ivregress 2sls lwage south (exp = wks), vce(cluster id) small
Or for the within model:
xtset id time, generic
xtivreg2 lwage south (exp = wks), fe small cluster(id)
Note, however, that I used the small-sample formulation in Stata, which is not a big deal. More about this here. Anyway, cl.plm properly deals with the plm class object.
For the sake of completeness: as suggested by @Helix123, you can use the development version (1.6-1) of the plm package and proceed as you did in your question.
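Because the source of cl.plm is not shown here, the following is only a base R sketch of the general idea behind cluster-robust standard errors for a 2SLS fit, not that function's actual code. It computes 2SLS by hand on simulated data and then builds the usual sandwich estimator with scores summed within clusters and the small-sample factor (G/(G-1)) * ((n-1)/(n-k)). The function name cluster_vcov and all data are hypothetical.

```r
# Simulated panel: G clusters with Tn observations each (illustrative only).
set.seed(42)
G <- 50; Tn <- 6; n <- G * Tn
id <- rep(seq_len(G), each = Tn)
z <- rnorm(n)                          # instrument
a <- rep(rnorm(G), each = Tn)          # cluster-level shock
v <- rnorm(n)
x <- 0.8 * z + v                       # endogenous regressor
y <- 1 + 0.5 * x + a + 0.5 * v + rnorm(n)

# 2SLS by hand: project X on the instruments, then regress y on the projection.
X <- cbind(1, x)
Z <- cbind(1, z)
X_hat <- Z %*% solve(crossprod(Z), crossprod(Z, X))
beta  <- solve(crossprod(X_hat), crossprod(X_hat, y))
u     <- as.vector(y - X %*% beta)     # residuals use the original X

# Cluster-robust sandwich: bread %*% meat %*% bread with a small-sample factor.
cluster_vcov <- function(X_hat, u, cluster) {
  n <- nrow(X_hat); k <- ncol(X_hat)
  G <- length(unique(cluster))
  bread  <- solve(crossprod(X_hat))
  scores <- rowsum(X_hat * u, group = cluster)  # G x k matrix of summed scores
  meat   <- crossprod(scores)
  dfc    <- (G / (G - 1)) * ((n - 1) / (n - k))
  dfc * bread %*% meat %*% bread
}

V  <- cluster_vcov(X_hat, u, id)
se <- sqrt(diag(V))
print(cbind(estimate = as.vector(beta), cluster_se = se))
```

Treating every observation as its own cluster reduces this to a heteroskedasticity-robust estimator, which is a convenient sanity check when comparing against other software.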
I have fitted a GARCH process to a time series and analyzed the ACF of the squared and absolute residuals to check the model's goodness of fit. But I also want to do a formal test, and after searching the internet, the Weighted Portmanteau Test (originally by Li and Mak) seems to be the one.
It's from the WeightedPortTest package and is one of the few (perhaps the only one?) that properly tests the GARCH residuals.
While going through the instructions in various documents I can't wrap my head around what the "h.t" argument wants. It says in the info in R that I need to assign "a numeric vector of the conditional variances". This may be simple to an experienced user, though I'm struggling to understand. What is it that I need to do and preferably how would I code it in R?
Thankful for any kind of help
Taken directly from the documentation:
h.t: a numeric vector of the conditional variances
A little toy example using the fGarch package follows:
library(fGarch)
library(WeightedPortTest)
spec <- garchSpec(model = list(alpha = 0.6, beta = 0))
simGarch11 <- garchSim(spec, n = 300)
fit <- garchFit(formula = ~ garch(1, 0), data = simGarch11)
Weighted.LM.test(fit@residuals, fit@h.t, lag = 10)
And using garch() from the tseries package:
library(tseries)
fit2 <- garch(as.numeric(simGarch11), order = c(0, 1))
summary(fit2)
# comparison of fitted values:
tail(fit2$fitted.values[,1]^2)
tail(fit@h.t)
# comparison of residuals after unstandardizing:
unstd <- fit2$residuals*fit2$fitted.values[,1]
tail(unstd)
tail(fit@residuals)
Weighted.LM.test(unstd, fit2$fitted.values[,1]^2, lag = 10)
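To show what "conditional variances" means without any fitting package, here is a small base R sketch that simulates an ARCH(1) process and builds the conditional variance series h_t = omega + alpha * eps_{t-1}^2 step by step. The parameter values and variable names are assumptions for illustration; with a fitted model you would take h.t from the fit, as in the examples above, rather than compute it from known parameters.

```r
# Simulate an ARCH(1) process and keep its conditional variances.
set.seed(123)
n     <- 500
omega <- 0.1
alpha <- 0.6

eps <- numeric(n)   # raw (unstandardized) residuals
h   <- numeric(n)   # conditional variances, one per time point

h[1]   <- omega / (1 - alpha)          # start at the unconditional variance
eps[1] <- sqrt(h[1]) * rnorm(1)
for (t in 2:n) {
  h[t]   <- omega + alpha * eps[t - 1]^2   # variance given the past
  eps[t] <- sqrt(h[t]) * rnorm(1)
}

# Standardized residuals: raw residuals divided by the conditional sd.
std_resid <- eps / sqrt(h)
print(tail(cbind(eps = eps, h.t = h, std_resid = std_resid)))
```

In this notation eps plays the role of the residual vector and h the role of h.t: a numeric vector of the same length holding the model-implied variance at each time point.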
I want to compare GWR fits produced by spgwr and mgcv, but I get an error with the gam function of mgcv. Here is an example:
require(spgwr)
require(mgcv)
require(R2BayesX)
data(columbus)
col.bw <- gwr.sel(crime ~ income + housing, data=columbus, verbose=FALSE,
coords=cbind(columbus$x, columbus$y))
col.gauss <- gwr(crime ~ income + housing, data=columbus,
coords=cbind(columbus$x, columbus$y),
bandwidth=col.bw, hatmatrix=TRUE)
#gwr fitting with Intercept
col.gam<-gam(crime ~s(x,y)+s(x,y)*income+s(x,y)*housing, data=columbus)#mgcv ERROR
b1<-bayesx(crime ~sx(x,y)+sx(x,y)*income+sx(x,y)*housing, data=columbus)#R2Bayesx ERROR
Questions:
1. How can I fit the same GWR using the gam and bayesx functions (with smooth functions of location)?
2. How can I control the parameters so that the fits are as similar as possible, including the optimal bandwidth?
The mgcv error comes from the fact that you are specifying "interactions" between the spatial smooth and the variables income and housing. Read ?gam.models for details on using by terms. I think for this you need
col.gam <- gam(crime ~s(x,y, k = 5) + s(x,y, by = income, k = 5) +
s(x,y, by = housing, k = 5), data=columbus)
In this example, as there are only 49 observations, you need to restrict the dimensions of the basis functions, which I do here with k = 5, but you should investigate whether you need to vary these a little, within the constraints of the data.
By the looks of the error from bayesx, you have the same issue of specifying the model incorrectly. I'm not familiar with bayesx(), but it looks like it uses the same s() function as supplied with mgcv, so the model specification should be the same as I show above.
As for 2., can you expand on what you mean here? Comparable between gam() and bayesx(), or getting both or one of these comparable with the spgwr model?
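For comparison with GWR's local coefficients, it may help to see how a spatially varying coefficient can be recovered from a by-term smooth. The sketch below uses simulated data (the columbus data are not needed) and assumes mgcv is installed; the names dat, beta_true and beta_hat are hypothetical. With a numeric by variable, the term returned by predict(type = "terms") is the smooth multiplied by that variable, so dividing by the variable gives the local coefficient surface.

```r
library(mgcv)

# Simulated data in which the effect of income varies smoothly over space.
set.seed(7)
n   <- 300
dat <- data.frame(x = runif(n), y = runif(n), income = rnorm(n, 10, 2))
dat$beta_true <- 1 + 2 * dat$x                    # local coefficient of income
dat$crime     <- 5 + dat$beta_true * dat$income + rnorm(n)

# Spatial intercept plus a spatially varying coefficient for income.
fit <- gam(crime ~ s(x, y, k = 10) + s(x, y, by = income, k = 10), data = dat)

# The by-term contribution is f(x, y) * income; divide to recover f(x, y).
tm       <- predict(fit, type = "terms")
by_col   <- grep("income", colnames(tm))
beta_hat <- tm[, by_col] / dat$income

print(summary(beta_hat))
print(cor(beta_hat, dat$beta_true))
```

The basis dimension k plays a role loosely analogous to the GWR bandwidth in that both limit how quickly the coefficient can change over space, but they are not on the same scale, so matching them exactly is not straightforward.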