Error in cv.glmnet for poisson with an offset - r

I am facing an error when trying to run cross-validation with glmnet for family = "poisson" using an offset.
I managed to replicate the error with the very simple example below:
library(glmnet)

# Simulate Poisson data: 500 observations, 20 predictors, 5 non-zero coefficients
N = 500; p = 20
nzc = 5
x = matrix(rnorm(N * p), N, p)
beta = rnorm(nzc)
f = x[, seq(nzc)] %*% beta
mu = exp(f)
y = rpois(N, mu)
exposure = rep(0.5, length(y))

# Cross-validation with log(exposure) as the offset
cv = cv.glmnet(x, y, family = "poisson", offset = log(exposure), nlambda = 50, nfolds = 3)
which returns the following error:
Error: No newoffset provided for prediction, yet offset used in fit of
glmnet
I can't figure out what I'm doing wrong here, and I wasn't able to find any help on the internet. Does anyone have ideas?
Thanks a lot!
EDIT: this issue is obsolete; it was linked to version 2.0-12 of the glmnet package and was fixed by updating to version 2.0-13.

This works:
predict(cv, x, newoffset = log(exposure))
From the documentation for glmnet for the offset parameter:
If supplied, then values must also be supplied to the predict
function.
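Since an offset was used in the fit, every predict call needs a matching newoffset. A minimal sketch of predicting at the CV-selected lambda, reusing cv, x, and exposure from above (s and type are standard predict.cv.glmnet arguments):
# newoffset is required because an offset was used in the fit;
# predict at the deviance-minimizing lambda, on the response scale
pred <- predict(cv, newx = x, newoffset = log(exposure),
                s = "lambda.min", type = "response")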

Related

GEE log binomial to estimate relative risk in clustered data

I need to perform a log binomial regression using GEE. I tried to solve this using the gee package with the following code:
gee(y ~ x, id, data = data, family = "binomial", link = "log", corstr = "exchangeable")
where y is a binary 0/1 variable.
When I run that, I get this error:
Error in gee(y~x,id,data = data, family="binomial",link="log",corstr="exchangeable") :
unused argument (link = "log")
I know the error is straightforward; however, I don't know how to specify the link function inside the gee function. The problem is that the prevalence of the binary outcome is more than 50%, so the odds ratio is not a good proxy for the relative risk.
Please help.
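The question is left open above, but as a hedged note: R model-fitting functions usually take the link inside the family object rather than as a separate argument. A sketch under the assumption that gee() accepts family objects the way glm() does (y, x, id, and data as in the question):
library(gee)
# gee() has no separate `link` argument; specify the link inside the family object
fit <- gee(y ~ x, id = id, data = data,
           family = binomial(link = "log"),
           corstr = "exchangeable")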

Robust standard error (HC3) using vcovHC(), coeftest for plm object

I want to estimate a fixed-effects model and use a robust variance-covariance matrix with the HC3 small-sample adjustment.
For the model itself I use the following lines of code:
require(plm)
require(sandwich)
require(lmtest)
require(car)
QSFE <- plm(log(SPREAD) ~ PERIOD, data = na.omit(QSREG), index = c("STOCKS", "TIME"), model = "within")
This works fine. Now, to calculate the HC3 robust standard errors, I used the function coeftest with vcovHC in it.
coeftest(x = QSFE, vcov = vcovHC(QSFE, type = "HC3", method = "arellano"))
This does not work; the returned error is as follows:
Error in 1 - diaghat : non-numeric argument to binary operator
The issue is in vcovHC when one sets the type to "HC3": it uses the function hatvalues() to calculate "diaghat", which does not support plm objects and returns the error:
Error in UseMethod("hatvalues") :
no applicable method for 'hatvalues' applied to an object of class "c('plm', 'panelmodel')"
Does anyone know how to use the HC3 (or HC2) estimator with plm? I think it must depend on the hatvalues function used in vcovHC, since HC0/HC1 work fine because they do not need it.
plm developer here. While the efficiency issue is computationally interesting, from a statistical viewpoint these small-sample corrections are not needed when you have a 300 x 300 panel. You can happily go with HC0 (or, if you definitely want a panel small-sample correction, "sss" (panel DF) would be best anyway, and the latter is computationally much lighter).
The fact that small-sample corrections become useless as the data size increases is the main reason we did not allocate scarce developer time to making them more efficient.
Also, from a statistical viewpoint, please be aware that the properties of "clustering" vcovs like White-Arellano are less than ideal for T ~ N; they are meant for N >> T.
Lastly, one clarification re: your original post: while originally vcovHC is a generic function in the 'sandwich' package, in a panel context the specialized method vcovHC.plm from the 'plm' package is applied.
Better explanation here: https://www.jstatsoft.org/article/view/v082i03
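Following the "sss" suggestion above, a minimal sketch reusing the QSFE model from the question (type = "sss" is one of the types accepted by plm's vcovHC method):
# Panel small-sample correction; computationally much lighter than HC2/HC3
coeftest(QSFE, vcov. = function(x) plm::vcovHC(x, method = "arellano", type = "sss"))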
The method supplied by plm for plm objects does not use hatvalues; the word "hatvalues" does not even appear in plm's source code. Be sure to have package plm loaded when you execute coeftest, and be sure to have the latest version of plm installed from CRAN (currently version 2.2-3).
If you have package plm loaded, the code should work. It does with a toy example on my machine. To be sure, you may want to force the use of vcovHC as supplied by plm:
First, try vcovHC(QSFE, type = "HC3", method = "arellano"). If that gives the same error, try plm::vcovHC(QSFE, type = "HC3", method = "arellano").
Next, please try:
coeftest(QSFE, vcov. = function(x) vcovHC(x, method = "arellano", type = "HC3"))
Edit:
Using the data set supplied, it is clear that dispatching to vcovHC.plm works correctly. Package sandwich is not involved here. The root cause is the memory demand of the function vcovHC.plm with the argument type set to "HC3" (and others). This also explains your comment about the function working for a subset of the data.
Edit2:
The memory demand of vcovHC.plm's small-sample adjustments is significantly lower from plm version 2.4-0 onwards (the internal function dhat was optimized), and the error no longer occurs.
vcovHC(QSFE, type = "HC3", method = "arellano")
Error in 1 - diaghat : non-numeric argument to binary operator
Called from: omega(uhat, diaghat, df, G)
Browse[1]> diaghat
[1] "Error : cannot allocate vector of size 59.7 Gb\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError: cannot allocate vector of size 59.7 Gb>

Error when estimating CI for GLMM using confint()

I have a set of GLMMs fitted with a binary response variable and a set of continuous predictor variables, and I would like to get confidence intervals for each model. I've been using the confint() function, at 95% and with the profile method, and it works without any problems when applied to a model with no interactions.
However, when I apply confint() to a model with interactions (continuous*continuous), I've been getting this error:
m1CI <- confint(m1, level = 0.95, method = "profile")
Error in zeta(shiftpar, start = opt[seqpar1][-w]) :
profiling detected new, lower deviance
The model runs without any problem (although I applied an optimizer because some of the models were having problems with convergence), and here is the final form of one of them:
m1 <- glmer(Use ~ RSr2*W + RSr3*W + RShw*W + RScon*W +
              RSmix*W + (1 | Pack/Year),
            control = glmerControl(optimizer = "bobyqa",
                                   optCtrl = list(maxfun = 100000)),
            data = data0516RS, family = binomial(link = "logit"))
Does anyone know why this is happening, and how I can solve it?
I am using R version 3.4.3 and lme4 1.1-17
The problem was solved by following these instructions:
The error message indicates that during profiling, the optimizer found
a fitted value that was significantly better (as characterized by the
'devtol' parameter) than the supposed minimum-deviance solution returned
in the first place. You can boost the 'devtol' parameter (which is
currently set at a conservative 1e-9 ...) if you want to ignore this --
however, the non-monotonic profiles are also warning you that something
may be wonky with the profile.
From https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q3/022394.html
I used confint.merMod from the lme4 package and boosted the 'devtol' parameter, first to 1e-8, which didn't work for my models, and then to 1e-7. With this value, it worked.
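For reference, a sketch of how the boosted tolerance can be passed: confint.merMod forwards extra arguments to profile(), so devtol can be supplied directly (m1 as above):
# devtol is forwarded to profile.merMod; relaxed from the default 1e-9
m1CI <- confint(m1, level = 0.95, method = "profile", devtol = 1e-7)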

model averaged coefficients of linear mixed models in glmulti? Fix no longer works

I'm using the glmulti package to do variable selection on the fixed effects of a mixed model in lme4. I had the same problem retrieving coefficients and confidence intervals that was solved by the author of the package in this thread. Namely, using coef or coef.multi gives a check.names error, and the coefficients are listed as NULL when calling the predict method. So I tried the solution listed in the thread linked above, using:
setMethod('getfit', 'merMod', function(object, ...) {
  summ <- summary(object)$coef
  summ1 <- summ[, 1:2]
  if (length(dimnames(summ)[[1]]) == 1) {
    summ1 <- matrix(summ1, nrow = 1,
                    dimnames = list(c("(Intercept)"), c("Estimate", "Std. Error")))
  }
  cbind(summ1, df = rep(10000, length(fixef(object))))
})
I fixed the missing " in the original post and the code ran. But now, instead of getting
Error in data.frame(..., check.names = FALSE) :arguments imply
differing number of rows: 1, 0
I get this error for every single model...
Error in calculation of the Satterthwaite's approximation. The output
of lme4 package is returned summary from lme4 is returned some
computational error has occurred in lmerTest
I'm using lmerTest, and it doesn't surprise me that it would fail if glmulti can't pull the correct info from the model. So really, it's the first two lines of the error that should probably be the focus.
A description of the original fix is on the developer's website here. Clearly the package hasn't been updated in a while, and yes, I should probably learn a new package... but until then I'm hoping for a fix. I'll contact the developer directly through his website. But in the meantime, has anyone tried this and found a fix?
lme4, glmulti, rJava, and other related packages have all been updated to their latest versions.

Error when using AUC package in R

We use the AUC package in R to evaluate model predictions.
Sometimes we face an error like the one below:
> plot(roc(pred, yTEST))
> auc(roc(pred, yTEST))
Error in rank(prob) : argument "prob" is missing, with no default
Could anyone let us know where the error comes from? Note that the problem does not occur frequently; for example, we ran 10 models and it happened with 3-4 of them.
It sounds like you might have loaded glmnet after AUC (glmnet also has an auc function). Just use AUC:: before the function.
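For example, calling both functions through the AUC namespace so a later library(glmnet) cannot mask them (pred and yTEST as in the question):
# Explicit namespace prefix avoids the masking problem
r <- AUC::roc(pred, yTEST)
plot(r)
AUC::auc(r)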
