Optim inside function: promise already under evaluation - r

There are a few questions about the error message "promise already under evaluation" and scoping, but none of them seems to match this case. I am trying to use optim inside another function. To reproduce the error I put a minimal example below. Is there a way to avoid this?
set.seed(123)
df_ss <- data.frame(var1 = rnorm(100),
                    var2 = rnorm(100),
                    var3 = rnorm(100),
                    var4 = rnorm(100))
test <- function(df_ss = df_ss, degree = 3, raw = TRUE, ...){
  # objective function
  objective <- function(beta, df_ss = df_ss) {
    op2 <- lm(formula = I(var1 - beta*var2) ~ poly(I(var3 - beta*var2), degree = degree), data = df_ss)
    return(sum(residuals(op2)^2))
  }
  ss_reg <- optim(1, fn = objective, method = "Brent", lower = -1, upper = 1)
}
test()
Error in model.frame.default(formula = I(var1 - beta * var2) ~ poly(I(var3 - : promise already under evaluation: recursive default argument reference or earlier problems?

The following works. Avoid writing df_ss = df_ss: a default argument whose default is its own name creates a promise that can only be resolved by evaluating itself, which is exactly the recursive reference the error complains about. Renaming the argument (here to dat) breaks the cycle.
test <- function(dat = df_ss, degree = 3, raw = TRUE, ...){
  # objective function
  objective <- function(beta) {
    op2 <- lm(formula = I(var1 - beta*var2) ~ poly(I(var3 - beta*var2), degree = degree, raw = raw), data = dat)
    return(sum(residuals(op2)^2))
  }
  ss_reg <- optim(1, fn = objective, method = "Brent", lower = -1, upper = 1)
}
result <- test()
result
# $par
# [1] -0.03866607
#
# $value
# [1] 80.22191
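The failure mode is easy to isolate: any default argument that refers to its own name forces R to evaluate a promise that is already being evaluated. A one-line sketch of just that problem:

f <- function(x = x) x
f()
# Error in f() : promise already under evaluation:
#   recursive default argument reference or earlier problems?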


R: passing a formula into a function as the first input

library(RSSL)
library(magrittr)  # provides the %>% pipe used below
set.seed(1)
df <- generateSlicedCookie(1000, expected = FALSE) %>%
  add_missinglabels_mar(Class ~ ., 0.98)
class_erlr <- EntropyRegularizedLogisticRegression(Class ~ ., df, lambda = 0.01, lambda_entropy = 100)
In the EntropyRegularizedLogisticRegression function from the RSSL package, the example in the documentation passes the formula Class ~ . as the input. I was looking at the source code, and these are the parameters for the function:
function(X, y, X_u = NULL, lambda = 0, lambda_entropy = 1, intercept = TRUE,
         init = NA, scale = FALSE, x_center = FALSE)
I tried manually defining X, y, and X_u based on the df I generated, but running the following gives me an error from the optimization:
y <- df$Class
X <- df[, -1]
ids <- which(is.na(y))
X_u <- X[ids, ]
class_erlr_manual <- EntropyRegularizedLogisticRegression(X = X, y = y, X_u = X_u, lambda = 0.01, lambda_entropy = 100)
The error reads:
Error in optim(w, fn = loss_erlr, gr = grad_erlr, X, y, X_u, lambda = lambda, :
initial value in 'vmmin' is not finite
Why does changing the formula input Class ~ . into X = X, y = y, X_u = X_u result in an error? Can anyone point me to where in the source code the formula input is being used?
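One plausible explanation (an assumption on my part, not verified line by line against the RSSL source): the formula interface splits the data into labeled and unlabeled parts before handing them to the matrix interface, while the manual call above passes all 1000 rows, NA labels included, as X and y. A sketch of the manual call under that assumption:

# Hedged sketch: keep only labeled rows in X/y, put NA-labeled rows in X_u,
# and coerce the data frames to numeric matrices.
labeled <- !is.na(df$Class)
feat <- setdiff(names(df), "Class")
X   <- as.matrix(df[labeled, feat])
y   <- df$Class[labeled]
X_u <- as.matrix(df[!labeled, feat])
class_erlr_manual <- EntropyRegularizedLogisticRegression(
  X = X, y = y, X_u = X_u, lambda = 0.01, lambda_entropy = 100)

If the NA rows are the problem, the non-finite initial value in optim would come from the loss being evaluated on missing labels.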

Error with bootMer and confint for glmer

I'm running into an error that I can't find any documentation on when I try to bootstrap a glmer object:
glm2 <- glmer(RT ~ valence + location + first_location + Trial_num +
                (1 + Trial_num|id) + (1|Trial_num),
              family = inverse.gaussian(log),
              control = glmerControl(optimizer = "nloptwrap",
                                     calc.derivs = FALSE), data = df_long)
The error is:
Error in lme4::.simulateFun(object = , :
  could not find function "sfun"
This is regardless of whether I try bootMer or confint:
bootMer_out <- bootMer(glm2, FUN = fixef, nsim = 300)
confint_out <- confint(glm2, method = "boot")
When I fit the model as an lmer object instead, bootstrapping works without issue, i.e.
lm2 <- lmer(RT ~ valence + location + first_location + Trial_num +
              (1 + Trial_num|id) + (1|Trial_num),
            control = lmerControl(optimizer = "nloptwrap",
                                  calc.derivs = FALSE), data = df_long)
Does it have to do with the link function? Is there a workaround? I couldn't find the function 'sfun' in the .simulateFun documentation either. I could always transform the data separately and use lmer instead of glmer, but if anyone has some insight, that would be great (since I'm curious now).
As pointed out by @user20650, you'll need to add a simulation method for the inverse Gaussian family. For example, I added these to a branch on my lme4 fork, in predict.R:
rinvgauss <- function(n, mu, lambda) {
  # transcribed from https://en.wikipedia.org/wiki/Inverse_Gaussian_distribution
  nu <- rnorm(n)
  y <- nu^2
  x <- mu + (mu^2 * y)/(2*lambda) - (mu/(2*lambda)) * sqrt(4*mu*lambda*y + mu^2*y^2)
  z <- runif(n)
  ifelse(z <= mu/(mu + x), x, mu^2/x)
}
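A quick sanity check on the sampler (a sketch of my own, not part of the fork): the inverse Gaussian distribution has mean mu and variance mu^3/lambda, so the sample moments should land near those values.

set.seed(42)
x <- rinvgauss(1e5, mu = 3, lambda = 4)
c(mean = mean(x), var = var(x))  # expect roughly 3 and 3^3/4 = 6.75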
inverse.gaussian_simfun <- function(object, nsim, ftd = fitted(object),
                                    wts = weights(object)) {
  if (any(wts != 1)) message("using weights as inverse variances")
  dispersion <- sum((weights(object, 'working') *
                     resid(object, 'working')^2)[weights(object, 'working') > 0])/df.residual(object)
  rinvgauss(nsim * length(ftd), mu = ftd,
            lambda = wts/dispersion)
}
# ... skip a few
simfunList <- list(gaussian = gaussian_simfun,
                   binomial = binomial_simfun,
                   poisson = poisson_simfun,
                   Gamma = Gamma_simfun,
                   negative.binomial = negative.binomial_simfun,
                   inverse.gaussian = inverse.gaussian_simfun)
Here's an example:
# devtools::install_github('aforren1/lme4', ref = 'add_invgauss_simulate')
library(lme4)
set.seed(1)
dat <- data.frame(y = lme4:::rinvgauss(1000, 3, 4),
                  x = runif(1000),
                  subj = factor(rep(1:10, 100)))
mod <- glmer(y ~ x + (1|subj),
             data = dat,
             family = inverse.gaussian(link = 'log'))
# ~60 secs on my laptop
(boots <- confint(mod, method = 'boot', nsim = 100, parm = 'beta_'))
                  2.5 %   97.5 %
(Intercept)  1.0044813 1.248774
x           -0.2158155 0.161213
(walds <- confint(mod, method = 'Wald', parm = 'beta_'))
                 2.5 %    97.5 %
(Intercept)  1.000688 1.2289971
x           -0.205546 0.1644621
You can see that the bootstrap method gives (roughly) the same results as the Wald method.

Running an existing function with a non-default option

The code pasted below, from ResourceSelection::hoslem.test, performs a Hosmer and Lemeshow goodness-of-fit test. While investigating why the output does not agree exactly with that produced by another package (Stata), I found that the difference relates to the default argument of R's quantile function (type = 7). I would like to use this function with a different quantile method (type = 6).
FWIW, a reference for the 9 quantile methods available in R can be found at:
https://www.amherst.edu/media/view/129116/original/Sample+Quantiles.pdf
The Stata manual for pctile refers to a default method and an 'altdef' method. I found it difficult to map these two methods to the corresponding R types.
However,
hoslem.test(y, yhat, type = 6)
produces:
> hl <- hoslem.test(y, yhat, type = 6)
Error in hoslem.test(y, yhat, type = 6) : unused argument (type = 6)
Is there a way to run the function below with a non-default argument for the quantile function? I.e., something that allows the following line (with ', type = 6' added):
qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g), type = 6))
The function in question is:
> ResourceSelection::hoslem.test
function (x, y, g = 10)
{
    DNAME <- paste(deparse(substitute(x)), deparse(substitute(y)),
        sep = ", ")
    METHOD <- "Hosmer and Lemeshow goodness of fit (GOF) test"
    yhat <- y
    y <- x
    qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g)))
    cutyhat <- cut(yhat, breaks = qq, include.lowest = TRUE)
    observed <- xtabs(cbind(y0 = 1 - y, y1 = y) ~ cutyhat)
    expected <- xtabs(cbind(yhat0 = 1 - yhat, yhat1 = yhat) ~
        cutyhat)
    chisq <- sum((observed - expected)^2/expected)
    PVAL = 1 - pchisq(chisq, g - 2)
    PARAMETER <- g - 2
    names(chisq) <- "X-squared"
    names(PARAMETER) <- "df"
    structure(list(statistic = chisq, parameter = PARAMETER,
        p.value = PVAL, method = METHOD, data.name = DNAME, observed = observed,
        expected = expected), class = "htest")
}
We can modify pieces of functions. Look at the body of the function:
as.list(body(hoslem.test))
See that the element we want to modify is the 6th element of the body:
[[1]]
`{`
[[2]]
DNAME <- paste(deparse(substitute(x)), deparse(substitute(y)),
sep = ", ")
[[3]]
METHOD <- "Hosmer and Lemeshow goodness of fit (GOF) test"
[[4]]
yhat <- y
[[5]]
y <- x
[[6]]
qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g)))
Modify the 6th element to what you want:
body(hoslem.test)[[6]] <- substitute(qq <- unique(quantile(yhat,
    probs = seq(0, 1, 1/g), type = 6)))
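You can confirm the change took effect by printing the modified element (this assumes ResourceSelection was attached before running the body<- line above; note the assignment creates a modified copy of hoslem.test in your workspace rather than changing the package namespace):

body(hoslem.test)[[6]]
# qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g), type = 6))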
The easiest way would be to reenter the function as your own:
myhoslem.test <- function(x, y, g = 10, mytype = 6){
  DNAME <- paste(deparse(substitute(x)), deparse(substitute(y)),
                 sep = ", ")
  METHOD <- "Hosmer and Lemeshow goodness of fit (GOF) test"
  yhat <- y
  y <- x
  qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g), type = mytype))
  cutyhat <- cut(yhat, breaks = qq, include.lowest = TRUE)
  observed <- xtabs(cbind(y0 = 1 - y, y1 = y) ~ cutyhat)
  expected <- xtabs(cbind(yhat0 = 1 - yhat, yhat1 = yhat) ~ cutyhat)
  chisq <- sum((observed - expected)^2/expected)
  PVAL <- 1 - pchisq(chisq, g - 2)
  PARAMETER <- g - 2
  names(chisq) <- "X-squared"
  names(PARAMETER) <- "df"
  structure(list(statistic = chisq, parameter = PARAMETER,
                 p.value = PVAL, method = METHOD, data.name = DNAME, observed = observed,
                 expected = expected), class = "htest")
}
The key change here is:
qq <- unique(quantile(yhat, probs = seq(0, 1, 1/g), type = mytype))
together with allowing mytype as an argument to the function, with a default of 6.
Combining the two answers, a wrapper function can modify hoslem.test on the fly:
myhoslem.test <- function(x, y, g = 10, mytype = 6){
  body(hoslem.test)[[6]] <- substitute(qq <- unique(quantile(yhat,
      probs = seq(0, 1, 1/g), type = mytype)))
  hoslem.test(x, y, g = g)
}
Note that the wrapper forwards its own g to the modified copy (the original version hardcoded g = 10), and that the body<- assignment affects only a copy of hoslem.test local to the wrapper.
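A usage sketch with made-up data (the binary outcome y and fitted probabilities yhat below are invented purely for illustration):

library(ResourceSelection)
set.seed(1)
y <- rbinom(100, 1, 0.5)   # observed 0/1 outcomes
yhat <- runif(100)         # pretend fitted probabilities
myhoslem.test(y, yhat, g = 10, mytype = 6)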

Passing the weights argument to a regression function inside an R function

I am trying to write an R function to run an optionally weighted regression, and I am having difficulty getting the weights variable to work.
Here is a simplified version of the function.
HC <- function(data, FUN, formula, tau = 0.5, weights = NULL){
  if(is.null(weights)){
    est <- FUN(data = data, formula = formula, tau = tau)
    intercept <- est$coef[["(Intercept)"]]
    zeroWorker <- exp(intercept)
  } else {
    est <- FUN(data = data, formula = formula, tau = tau, weights = weights)
    intercept <- est$coef[["(Intercept)"]]
    zeroWorker <- exp(intercept)
  }
  return(zeroWorker)
}
The function works perfectly if I do not use the weights argument.
mod1 <- HC(data = mydata, formula = lin.model, tau = 0.2,
           FUN = rq)
But it throws an error message when I use the weights argument.
mod2 <- HC(data = mydata, formula = lin.model, tau = 0.2,
           FUN = rq, weights = weig)
I googled the problem, and this post seems to be the closest to mine, but I could still not get it to work: R: Pass argument to glm inside an R function.
Any help will be appreciated.
My problem can be replicated with:
library("quantreg")
data(engel)
mydata <- engel
mydata$weig <- with(mydata, log(sqrt(income))) # create a fictitious weight variable
lin.model <- foodexp ~ income
mod1 <- HC(data = mydata, formula = lin.model, tau = 0.2,
           FUN = rq) # this works perfectly
mod2 <- HC(data = mydata, formula = lin.model, tau = 0.2,
           FUN = rq, weights = weig) # throws an error
Error in HC(data = mydata, formula = lin.model, tau = 0.2, FUN = rq, weights = weig) :
  object 'weig' not found
You have two problems. The first error occurs because you refer to the weig variable without indicating that it comes from the mydata dataset; try mydata$weig. That fixes the lookup, but you then get the actual error related to the weights argument:
Error in model.frame.default(formula = formula, data = data, weights = substitute(weights), :
  invalid type (symbol) for variable '(weights)'
The solution is to add the variable specified in HC's weights argument to the data frame before passing it to FUN:
HC <- function(data, FUN, formula, tau = 0.5, weights = NULL){
  data$.weights <- weights
  if(is.null(weights)){
    est <- FUN(data = data, formula = formula, tau = tau)
  } else {
    est <- FUN(data = data, formula = formula, tau = tau, weights = .weights)
  }
  intercept <- est$coef[["(Intercept)"]]
  zeroWorker <- exp(intercept)
  return(zeroWorker)
}
Then everything works:
mod2 <- HC(data = mydata, formula = lin.model, tau = 0.2, FUN = rq, weights = mydata$weig)
mod2
# [1] 4.697659e+47
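An alternative sketch (my own variant, not part of the answer above): build the argument list explicitly and call FUN through do.call, so the weights vector is passed by value rather than looked up by name inside data. This assumes FUN follows the usual model-frame conventions, as quantreg::rq does.

HC2 <- function(data, FUN, formula, tau = 0.5, weights = NULL){
  args <- list(data = data, formula = formula, tau = tau)
  if (!is.null(weights)) args$weights <- weights  # only add when supplied
  est <- do.call(FUN, args)
  exp(est$coef[["(Intercept)"]])
}
mod2b <- HC2(data = mydata, formula = lin.model, tau = 0.2,
             FUN = rq, weights = mydata$weig)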

Reproducing results from a previous answer is not working due to a new version of lme4

I have tried to reproduce the results from the answers to the question "Estimating random effects and applying user defined correlation/covariance structure with R lme4 or nlme package" (https://stats.stackexchange.com/questions/18563/estimating-random-effects-and-applying-user-defined-correlation-covariance-struc).
Aaron Rendahl's code:
library(pedigreemm)
relmatmm <- function (formula, data, family = NULL, REML = TRUE, relmat = list(),
                      control = list(), start = NULL, verbose = FALSE, subset,
                      weights, na.action, offset, contrasts = NULL, model = TRUE,
                      x = TRUE, ...)
{
    mc <- match.call()
    lmerc <- mc
    lmerc[[1]] <- as.name("lmer")
    lmerc$relmat <- NULL
    if (!length(relmat))
        return(eval.parent(lmerc))
    stopifnot(is.list(relmat), length(names(relmat)) == length(relmat))
    lmerc$doFit <- FALSE
    lmf <- eval(lmerc, parent.frame())
    relfac <- relmat
    relnms <- names(relmat)
    stopifnot(all(relnms %in% names(lmf$FL$fl)))
    asgn <- attr(lmf$FL$fl, "assign")
    for (i in seq_along(relmat)) {
        tn <- which(match(relnms[i], names(lmf$FL$fl)) == asgn)
        if (length(tn) > 1)
            stop("a relationship matrix must be associated with only one random effects term")
        Zt <- lmf$FL$trms[[tn]]$Zt
        relmat[[i]] <- Matrix(relmat[[i]][rownames(Zt), rownames(Zt)],
                              sparse = TRUE)
        relfac[[i]] <- chol(relmat[[i]])
        lmf$FL$trms[[tn]]$Zt <- lmf$FL$trms[[tn]]$A <- relfac[[i]] %*% Zt
    }
    ans <- do.call(if (!is.null(lmf$glmFit))
        lme4:::glmer_finalize
    else lme4:::lmer_finalize, lmf)
    ans <- new("pedigreemm", relfac = relfac, ans)
    ans@call <- match.call()
    ans
}
The original example:
set.seed(1234)
mydata <- data.frame(gen = factor(rep(1:10, each = 10)),
                     repl = factor(rep(1:10, 10)),
                     yld = rnorm(10, 5, 0.5))
library(lme4)
covmat <- round(nearPD(matrix(runif(100, 0, 0.2), nrow = 10))$mat, 2)
diag(covmat) <- diag(covmat)/10 + 1
rownames(covmat) <- colnames(covmat) <- levels(mydata$gen)
m <- relmatmm(yld ~ (1|gen) + (1|repl), relmat = list(gen = covmat), data = mydata)
Here is the error message:
Error in lmf$FL : $ operator not defined for this S4 class
In addition: Warning message:
In checkArgs("lmer", doFit = FALSE) : extra argument(s) ‘doFit’ disregarded
I will appreciate any help. Thanks.
This is a re-implementation of the previous code -- I have done some slight modifications, and I have not tested it in any way -- test yourself and/or use at your own risk.
First create a slightly more modularized function that constructs the deviance function and fits the model:
doFit <- function(lmod, lmm = TRUE) {
  ## see ?modular
  if (lmm) {
    devfun <- do.call(mkLmerDevfun, lmod)
    opt <- optimizeLmer(devfun)
    mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
  } else {
    devfun <- do.call(mkGlmerDevfun, lmod)
    opt <- optimizeGlmer(devfun)
    devfun <- updateGlmerDevfun(devfun, lmod$reTrms)
    opt <- optimizeGlmer(devfun, stage = 2)
    mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
  }
}
Now create a function to construct the object that doFit needs and modify it:
relmatmm <- function (formula, ..., lmm = TRUE, relmat = list()) {
  ff <- if (lmm) lFormula(formula, ...) else glFormula(formula, ...)
  stopifnot(is.list(relmat), length(names(relmat)) == length(relmat))
  relnms <- names(relmat)
  relfac <- relmat
  flist <- ff$reTrms[["flist"]]   ## list of factors
  ## random-effects design matrix components
  Ztlist <- ff$reTrms[["Ztlist"]]
  stopifnot(all(relnms %in% names(flist)))
  asgn <- attr(flist, "assign")
  for (i in seq_along(relmat)) {
    tn <- which(match(relnms[i], names(flist)) == asgn)
    if (length(tn) > 1)
      stop("a relationship matrix must be",
           " associated with only one random effects term")
    zn <- rownames(Ztlist[[i]])
    relmat[[i]] <- Matrix(relmat[[i]][zn, zn], sparse = TRUE)
    relfac[[i]] <- chol(relmat[[i]])
    Ztlist[[i]] <- relfac[[i]] %*% Ztlist[[i]]
  }
  ff$reTrms[["Ztlist"]] <- Ztlist
  ff$reTrms[["Zt"]] <- do.call(rBind, Ztlist)
  fit <- doFit(ff, lmm)
}
Example
set.seed(1234)
mydata <- data.frame(gen = factor(rep(1:10, each = 10)),
                     repl = factor(rep(1:10, 10)),
                     yld = rnorm(10, 5, 0.5))
library(lme4)
covmat <- round(nearPD(matrix(runif(100, 0, 0.2), nrow = 10))$mat, 2)
diag(covmat) <- diag(covmat)/10 + 1
rownames(covmat) <- colnames(covmat) <- levels(mydata$gen)
m <- relmatmm(yld ~ (1|gen) + (1|repl), relmat = list(gen = covmat),
              data = mydata)
This runs -- I don't know if the output is correct. It also doesn't make the resulting object into a pedigreemm object ...
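One cheap sanity check (again a sketch, not a validation of correctness): fit the same model without the relationship matrix and compare variance components; folding covmat into Zt should change the gen component.

m0 <- lmer(yld ~ (1|gen) + (1|repl), data = mydata)
VarCorr(m0)  # plain fit
VarCorr(m)   # fit with the relationship matrix folded into Zt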
