How to fix 'unused arguments' error on nls model - r

A <- c(100, 200, 300, 400, 500, 600)
B <- c(60, 50, 40, 30, 20, 10)
data <- data.frame(A,B)
sample_plot <- function(A,B)
{
(w * ((x * A)^2) / (y/z)
}
control <- nls.control(maxiter = 1000)
sample_model <- nls(B, w, x, y, z), control = control, start = list(w = 62.2060, x = 0.0438, y = 0.9692, z = 0.8693))
plot(A, predict(sample_model), type = "1", col = "blue")
points(A, B, col = "red")
summary(sample_model)
Every time I run the model code, I get an error about unused arguments: "Error in sample_model(B, w, x, y, z) : unused arguments (x, y, z)", even though I provided values for x, y, and z. I also tried changing sample_plot <- function(A,B) to sample_plot <- function(A,B,...), but it doesn't work. Any help? Thank you.

As @user20650 has pointed out, the model is overparameterized and cannot give any predictions different from the model B ~ w * A^2, because all the other parameters can be absorbed into w. Further, as @Dave2e has indicated, such a model can't fit the data in any reasonable way.
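For reference, here is a minimal sketch (not part of the original answer) of what that reduced model looks like when fitted directly with nls; the starting value for w is an arbitrary guess, and as noted it cannot fit this data well:
## Reduced model: w, x, y and z collapse into one coefficient
fm_red <- nls(B ~ w * A^2, data = data, start = list(w = 1e-4))
fm_red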
Plotting the data, B seems linear in A so using lm should do and, in fact, gives a perfect fit as the residual sum of squares is effectively zero.
fm <- lm(B ~ A, data)
fm
## Call:
## lm(formula = B ~ A, data = data)
##
## Coefficients:
## (Intercept) A
## 70.0 -0.1
deviance(fm) # residual sum of squares
## [1] 1.390367e-28
plot(B ~ A, data)
abline(fm)

Related

Automatically solve an equation of `pt` for `ncp`

I wonder if it is possible to efficiently change ncp in the below code such that x becomes .025 and .975 (within rounding error).
x <- pt(q = 5, df = 19, ncp = ?)
----------
Clarification
q = 5 and df = 19 (above) are just two hypothetical numbers; q and df could be any other two numbers. What I expect is a function/routine that takes q and df as input.
What is wrong with uniroot?
f <- function (ncp, alpha) pt(q = 5, df = 19, ncp = ncp) - alpha
par(mfrow = c(1,2))
curve(f(ncp, 0.025), from = 5, to = 10, xname = "ncp", main = "0.025")
abline(h = 0)
curve(f(ncp, 0.975), from = 0, to = 5, xname = "ncp", main = "0.975")
abline(h = 0)
So for 0.025 case, the root lies in (7, 8); for 0.975 case, the root lies in (2, 3).
uniroot(f, c(7, 8), alpha = 0.025)$root
#[1] 7.476482
uniroot(f, c(2, 3), alpha = 0.975)$root
#[1] 2.443316
---------
(After some discussion...)
OK, now I see your ultimate goal. You want to implement this equation solver as a function, with q and df as inputs. So they are unknown but fixed; they might come from an experiment.
Ideally there would be an analytical solution, i.e., ncp written as a formula in terms of q, df and alpha, but that is not possible for the t-distribution.
A numerical solution is the way to go, but uniroot is not a great option for this purpose, as it relies on a "plot, view, guess, specify" workflow. The answer by loki is crude but an improvement: a grid search with fixed step size, starting from a value near 0 (say 0.001), increasing it, and checking the approximation error, stopping when that error fails to decrease.
This naturally leads to numerical optimization with Newton or quasi-Newton methods. In the 1D case we can use the function optimize, which uses a variable step size and therefore converges faster than a fixed-step search.
Let's define our function as:
ncp_solver <- function (alpha, q, df) {
  ## objective function: we minimize squared approximation error
  obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
    (pt(q = q, df = df, ncp = ncp) - alpha) ^ 2
  }
  ## now we call `optimize`
  oo <- optimize(obj_fun, interval = c(-37.62, 37.62), alpha = alpha, q = q, df = df)
  ## post processing
  oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
  oo[2] <- sqrt(oo[2]) ## squared error to absolute error
  ## return
  setNames(oo, c("ncp", "abs.error"))
}
Note that -37.62 / 37.62 are chosen as the lower / upper bounds for ncp, as that is the largest magnitude supported by R's noncentral t-distribution functions (see ?dt).
For example, let's try this function with q = 5 and df = 19, as given in your question:
ncp_solver(alpha = 0.025, q = 5, df = 19)
# ncp abs.error
#7.476472e+00 1.251142e-07
The result is a named vector, with ncp and absolute approximation error.
Similarly we can do:
ncp_solver(alpha = 0.975, q = 5, df = 19)
# ncp abs.error
#2.443347e+00 7.221928e-07
----------
Follow up
Is it possible for alpha in ncp_solver() to take c(.025, .975) at once?
Why not wrap it up for "vectorization":
sapply(c(0.025, 0.975), ncp_solver, q = 5, df = 19)
# [,1] [,2]
#ncp 7.476472e+00 2.443347e+00
#abs.error 1.251142e-07 7.221928e-07
How come 0.025 gives the upper bound of the confidence interval, while 0.975 gives the lower bound? Shouldn't this relationship be reversed?
No surprise. By default pt computes the lower-tail probability. If you want the "right" relationship, set lower.tail = FALSE in pt:
ncp_solver <- function (alpha, q, df) {
  ## objective function: we minimize squared approximation error
  obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
    (pt(q = q, df = df, ncp = ncp, lower.tail = FALSE) - alpha) ^ 2
  }
  ## now we call `optimize`
  oo <- optimize(obj_fun, interval = c(-37.62, 37.62), alpha = alpha, q = q, df = df)
  ## post processing
  oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
  oo[2] <- sqrt(oo[2]) ## squared error to absolute error
  ## return
  setNames(oo, c("ncp", "abs.error"))
}
Now you see:
ncp_solver(0.025, 5, 19)[[1]] ## use "[[" not "[" to drop name
#[1] 2.443316
ncp_solver(0.975, 5, 19)[[1]]
#[1] 7.476492
--------
Bug report and fix
It was reported to me that the above ncp_solver is unstable. For example:
ncp_solver(alpha = 0.025, q = 0, df = 98)
# ncp abs.error
#-8.880922 0.025000
But on the other hand, if we double check with uniroot here:
f <- function (ncp, alpha) pt(q = 0, df = 98, ncp = ncp, lower.tail = FALSE) - alpha
curve(f(ncp, 0.025), from = -3, to = 0, xname = "ncp"); abline(h = 0)
uniroot(f, c(-2, -1.5), 0.025)$root
#[1] -1.959961
So there is clearly something wrong with ncp_solver.
Well, it turns out that we cannot use too wide a bound such as c(-37.62, 37.62); if we narrow it to c(-35, 35), it works fine.
Also, to avoid tolerance problems, we can change the objective function from squared error to absolute error:
ncp_solver <- function (alpha, q, df) {
  ## objective function: we minimize absolute approximation error
  obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
    abs(pt(q = q, df = df, ncp = ncp, lower.tail = FALSE) - alpha)
  }
  ## now we call `optimize`
  oo <- optimize(obj_fun, interval = c(-35, 35), alpha = alpha, q = q, df = df)
  ## post processing and return
  oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
  setNames(oo, c("ncp", "abs.error"))
}
ncp_solver(alpha = 0.025, q = 0, df = 98)
# ncp abs.error
#-1.959980e+00 9.190327e-07
Damn, this is a pretty annoying bug. But relax now.
Report on getting warning messages from pt
I have also received reports of annoying warning messages from pt:
ncp_solver(0.025, -5, 19)
# ncp abs.error
#-7.476488e+00 5.760562e-07
#Warning message:
#In pt(q = q, df = df, ncp = ncp, lower.tail = FALSE) :
# full precision may not have been achieved in 'pnt{final}'
I am not entirely sure what is going on here, but so far I have not observed misleading results. Therefore, I decided to suppress those warnings from pt using suppressWarnings:
ncp_solver <- function (alpha, q, df) {
  ## objective function: we minimize absolute approximation error
  obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
    abs(suppressWarnings(pt(q = q, df = df, ncp = ncp, lower.tail = FALSE)) - alpha)
  }
  ## now we call `optimize`
  oo <- optimize(obj_fun, interval = c(-35, 35), alpha = alpha, q = q, df = df)
  ## post processing and return
  oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
  setNames(oo, c("ncp", "abs.error"))
}
ncp_solver(0.025, -5, 19)
# ncp abs.error
#-7.476488e+00 5.760562e-07
OK, quiet now.
You could use two while loops like this:
i <- 0.001
lowerFound <- FALSE
while(!lowerFound){
  x <- pt(q = 5, df = 19, ncp = i)
  if (round(x, 3) == 0.025){
    lowerFound <- TRUE
    print(paste("Lower is", i))
    lower <- i
  } else {
    i <- i + 0.0005
  }
}
i <- 0.001
upperFound <- FALSE
while(!upperFound){
  x <- pt(q = 5, df = 19, ncp = i)
  if (round(x, 3) == 0.975){
    upperFound <- TRUE
    print(paste("Upper is ", i))
    upper <- i
  } else {
    i <- i + 0.0005
  }
}
c(Lower = lower, Upper = upper)
# Lower Upper
# 7.4655 2.4330
Of course, you can adapt the increment in i <- i + .... or the check if (round(x,...) == ....) to match the accuracy you need.
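For completeness, here is a vectorized sketch of the same grid-search idea (the grid range and step are arbitrary choices for illustration; pt is vectorized over ncp and may emit the precision warnings discussed elsewhere in this thread):
ncp_grid <- seq(0, 10, by = 0.0005)
p <- pt(q = 5, df = 19, ncp = ncp_grid)
ncp_grid[which.min(abs(p - 0.025))]  # about 7.476
ncp_grid[which.min(abs(p - 0.975))]  # about 2.443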
I know this is an old question, but there is now a one-line solution to this problem using the conf.limits.nct() function in the MBESS package.
install.packages("MBESS")
library(MBESS)
result <- conf.limits.nct(t.value = 5, df = 19)
result
$Lower.Limit
[1] 2.443332
$Prob.Less.Lower
[1] 0.025
$Upper.Limit
[1] 7.476475
$Prob.Greater.Upper
[1] 0.025
$Lower.Limit is the result where pt = 0.975
$Upper.Limit is the result where pt = 0.025
pt(q=5,df=19,ncp=result$Lower.Limit)
[1] 0.975
pt(q=5,df=19,ncp=result$Upper.Limit)
[1] 0.025
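As a quick cross-check (a sketch reusing the optimize-based ncp_solver defined in the answer above), the two approaches agree to several decimal places:
ncp_solver(0.975, 5, 19)[[1]]  # close to result$Lower.Limit (about 2.443)
ncp_solver(0.025, 5, 19)[[1]]  # close to result$Upper.Limit (about 7.476)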

How to export all coefficients of a penalized regression model from package `penalized`? Need them for reporting rolling regression estimates

I have been able to run a regression with some coefficients constrained to be positive, but I'm doing a lot of rolling regressions where I run into a problem. Here is my sample code:
library(penalized)
set.seed(1)
x1=rnorm(100)*10
x2=rnorm(100)*10
x3=rnorm(100)*10
y=sin(x1)+cos(x2)-x3+rnorm(100)
data <- data.frame(y, x1, x2, x3)
win <- 10
coefs <- matrix(NA, ncol=4, nrow=length(y))
for(i in 1:(length(y)-win)) {
  d <- data[(1+i):(win+i),]
  p <- win+i
  # Linear Regression
  coefs[p,] <- as.vector(coef(penalized(y, ~ x1 + x2 + x3, ~1,
                                        lambda1=0, lambda2=0, positive = c(F, F, T), data=data)))
}
This is how I usually populate a matrix with coefficients from a rolling regression, and now I receive this error:
Error in coefs[p, ] <- as.vector(coef(penalized(y, ~x1 + x2 + x3, ~1, :
number of items to replace is not a multiple of replacement length
I assume this error is produced because the penalized regression function does not always return the intercept plus 3 coefficients. Is there a way to get the penalized function to report zero coefficients as well, or another way to populate the matrix / data.frame?
Perhaps you are unaware of the which argument of coef for "penfit" objects. Have a look at:
getMethod(coef, "penfit")
#function (object, ...)
#{
# .local <- function (object, which = c("nonzero", "all", "penalized",
# "unpenalized"), standardize = FALSE)
# {
# coefficients(object, which, standardize)
# }
# .local(object, ...)
#}
#<environment: namespace:penalized>
We can set which = "all" to report all coefficients. The default is which = "nonzero", which is causing the "replacement length differs" issue.
The following works:
library(penalized)
set.seed(1)
x1 = rnorm(100)*10
x2 = rnorm(100)*10
x3 = rnorm(100)*10
y = sin(x1) + cos(x2) - x3 + rnorm(100)
data <- data.frame(y, x1, x2, x3)
win <- 10
coefs <- matrix(NA, ncol=4, nrow=length(y))
for(i in 1:(length(y)-win)) {
  d <- data[(1+i):(win+i),]
  p <- win + i
  pen <- penalized(y, ~ x1 + x2 + x3, ~1, lambda1 = 0, lambda2 = 0,
                   positive = c(F, F, T), data = d)  # fit on the rolling window d, not the full data set
  beta <- coef(pen, which = "all")
  coefs[p,] <- unname(beta)
}
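To see why the default which = "nonzero" breaks the fixed-width assignment, here is a quick sketch (pen is the last fit produced by the loop above):
length(coef(pen))                 # default "nonzero": the length can vary from window to window
length(coef(pen, which = "all"))  # always 4: intercept + x1 + x2 + x3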

R curve fitting (multiple exponential) with NLS2 and NLS

I have some difficulties getting a specific curve to fit in R, while it works perfectly fine in a commercial curve-fitting program.
The formula that the data should fit to is:
y(t) = A * exp(-a*(t)) + B * exp(-b*(t)) - (A+B) * exp(-c*(t))
So for this I want to use the nonlinear regression built into R. I've been at this for a day on-and-off now and just can't get it to function. The issue lies entirely with the initial values, so I'm using NLS2 to brute-force find the initial values.
y <- c(0,0.01377,0.01400875,0.0119175,0.00759375,0.00512125,0.004175,0.00355375,
0.00308875,0.0028925,0.00266375)
t <- c(0,3,6,12,24,48,72,96,120,144,168)
df <- data.frame(t,y)
plot(t,y);
#Our model:
fo <- y ~ f1*exp(-k1*t)+f2*exp(-k2*t)-(f1+f2)*exp(-k3*t);
#Define the outer boundaries to search for initial values
grd <- data.frame(f1=c(0,1),
f2=c(0,1),
k1=c(0,2),
k2=c(0,2),
k3=c(0,0.7));
#Do the brute-force
fit <- nls2(fo,
data=df,
start = grd,
algorithm = "brute-force",
control=list(maxiter=20000))
fit
coef(fit)
final <- nls(fo, data=df, start=as.list(coef(fit)))
The values it should give are:
f1 0.013866
f2 0.005364
k1 0.063641
k2 0.004297
k3 0.615125
Even with quite high iteration counts, I'm just getting nonsense returns. I'm clearly doing something wrong, but I cannot see it.
Edit based on @Roland's comment:
The method you propose, approximating k1-k3 with a linear approach, seems to work on some datasets but not on all of them. Below is the code I'm using now, based on your input.
#Oral example:
y <- c(0,0.0045375,0.0066325,0.00511375,0.00395875,0.003265,0.00276,
0.002495,0.00231875);
t <- c(0,12,24,48,72,96,120,144,168);
#IV example:
#y <- c(0,0.01377,0.01400875,0.0119175,0.00759375,0.00512125,0.004175,
#0.00355375,0.00308875,0.0028925,0.00266375)
#t <- c(0,3,6,12,24,48,72,96,120,144,168)
DF <- data.frame(y, t)
fit1 <- nls(y ~ cbind(exp(-k1*t), exp(-k2*t), exp(-k3*t)), algorithm = "plinear", data = DF,
start = list(k1 = 0.002, k2 = 0.02, k3= 0.2))
k1_predict <-summary(fit1)$coefficients[1,1]
k2_predict <-summary(fit1)$coefficients[2,1]
k3_predict <-summary(fit1)$coefficients[3,1]
fo <- y ~ f1*exp(-k1*t)+f2*exp(-k2*t)-(f1+f2)*exp(-k3*t);
fit2 <- nls(fo, data = DF,
start = list(k1 = k1_predict, k2 = k2_predict, k3 = k3_predict, f1 = 0.01, f2 = 0.01))
summary(fit2);
plot(t,y);
curve(predict(fit2, newdata = data.frame(t = x)), 0, 200, add = TRUE, col = "red")
[plot: oral_example fit]
@G. Grothendieck:
Similar to Roland's suggestion, your suggestion is also excellent in that it is capable of fitting some datasets but struggles with others. The code below is based on your input, and it exits with a singular gradient matrix error.
#Oral example:
y <- c(0,0.0045375,0.0066325,0.00511375,0.00395875,0.003265,0.00276,
0.002495,0.00231875);
t <- c(0,12,24,48,72,96,120,144,168);
#IV example:
#y <- c(0,0.01377,0.01400875,0.0119175,0.00759375,0.00512125,0.004175,
#0.00355375,0.00308875,0.0028925,0.00266375)
#t <- c(0,3,6,12,24,48,72,96,120,144,168)
df <- data.frame(y, t)
grd <- data.frame(f1=c(0,1),
f2=c(0,1),
k1=c(0,2),
k2=c(0,2),
k3=c(0,0.7));
set.seed(123)
fit <- nls2(fo,
data=df,
start = grd,
algorithm = "random",
control = nls.control(maxiter = 100000))
fit_port <- nls(fo, df, start = coef(fit), alg = "port", lower = 0)  # assign the fit so predict() can use it below
plot(t,y);
curve(predict(fit_port, newdata = data.frame(t = x)), 0, 200, add = TRUE, col = "red")
I would first do a partially linear fit with no constraints on the linear parameters to get good starting values for the exponential parameters and some idea regarding the magnitude of the linear parameters:
DF <- data.frame(y, t)
fit1 <- nls(y ~ cbind(exp(-k1*t), exp(-k2*t), exp(-k3*t)), algorithm = "plinear", data = DF,
start = list(k1 = 0.002, k2 = 0.02, k3= 0.2))
summary(fit1)
#Formula: y ~ cbind(exp(-k1 * t), exp(-k2 * t), exp(-k3 * t))
#
#Parameters:
# Estimate Std. Error t value Pr(>|t|)
#k1 0.0043458 0.0010397 4.180 0.008657 **
#k2 0.0639379 0.0087141 7.337 0.000738 ***
#k3 0.6077646 0.0632586 9.608 0.000207 ***
#.lin1 0.0053968 0.0006637 8.132 0.000457 ***
#.lin2 0.0139231 0.0008694 16.014 1.73e-05 ***
#.lin3 -0.0193145 0.0010631 -18.168 9.29e-06 ***
Then you can fit your actual model:
fit2 <- nls(fo, data = DF,
start = list(k1 = 0.06, k2 = 0.004, k3 = 0.6, f1 = 0.01, f2 = 0.01))
summary(fit2)
#Formula: y ~ f1 * exp(-k1 * t) + f2 * exp(-k2 * t) - (f1 + f2) * exp(-k3 * t)
#
#Parameters:
# Estimate Std. Error t value Pr(>|t|)
#k1 0.0639344 0.0079538 8.038 0.000198 ***
#k2 0.0043456 0.0009492 4.578 0.003778 **
#k3 0.6078929 0.0575616 10.561 4.24e-05 ***
#f1 0.0139226 0.0007934 17.548 2.20e-06 ***
#f2 0.0053967 0.0006059 8.907 0.000112 ***
curve(predict(fit2, newdata = data.frame(t = x)), 0, 200, add = TRUE, col = "red")
Note that this model can easily be re-parameterized by switching the exponential terms (i.e., the order of the kn starting values), which could result in different estimates for f1 and f2, but basically the same fit.
With this many parameters I would use algorithm = "random" rather than "brute". If we do that then the following gives a result close to the one in the question (up to permutation of the arguments due to the symmetry of the model parameters):
set.seed(123)
fit <- nls2(fo,
data=df,
start = grd,
algorithm = "random",
control = nls.control(maxiter = 20000))
nls(fo, df, start = coef(fit), alg = "port", lower = 0)
giving:
Nonlinear regression model
model: y ~ f1 * exp(-k1 * t) + f2 * exp(-k2 * t) - (f1 + f2) * exp(-k3 * t)
data: df
f1 f2 k1 k2 k3
0.005397 0.013923 0.004346 0.063934 0.607893
residual sum-of-squares: 2.862e-07
Algorithm "port", convergence message: relative convergence (4)
ADDED
A variation of the above is to use nlsLM in the minpack.lm package instead of nls and to use splines to get more points in the data set. In place of the nls line try the following. It still gives convergence:
library(minpack.lm)
t_s <- with(df, min(t):max(t))
df_s <- setNames(data.frame(spline(df$t, df$y, xout = t_s)), c("t", "y"))
nlsLM(fo, df_s, start = coef(fit), lower = rep(0,5), control = nls.control(maxiter = 1024))
and it also converges for the oral example:
set.seed(123)
y <- c(0,0.0045375,0.0066325,0.00511375,0.00395875,0.003265,0.00276,
0.002495,0.00231875);
t <- c(0,12,24,48,72,96,120,144,168)
DF <- data.frame(y, t)
grd <- data.frame(f1=c(0,1), f2=c(0,1), k1=c(0,2), k2=c(0,2), k3=c(0,0.7))
fit <- nls2(fo,
data=DF,
start = grd,
algorithm = "random",
control = nls.control(maxiter = 20000))
library(minpack.lm)
t_s <- with(DF, min(t):max(t))
df_s <- setNames(data.frame(spline(DF$t, DF$y, xout = t_s)), c("t", "y"))
nlsLM(fo, df_s, start = coef(fit), lower = rep(0,5), control = nls.control(maxiter = 1024))
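As a quick visual check (a sketch; fit_oral is just an illustrative name for the fit returned by the nlsLM line above):
fit_oral <- nlsLM(fo, df_s, start = coef(fit), lower = rep(0, 5),
                  control = nls.control(maxiter = 1024))
plot(y ~ t, DF)
curve(predict(fit_oral, newdata = data.frame(t = x)), 0, 200, add = TRUE, col = "red")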

Why is gradient of first iteration step singular in nls with biv.norm

I am trying to fit a non-linear regression model where the mean-function is the bivariate normal distribution. The parameter to specify is the correlation rho.
The problem: "gradient of first iteration step is singular". Why?
I have here a little example with simulated data.
# given values for independent variables
x1 <- c(rep(0.1,5), rep(0.2,5), rep(0.3,5), rep(0.4,5), rep(0.5,5))
x2 <- c(rep(c(0.1,0.2,0.3,0.4,0.5),5))
## 1 generate values for dependent variable (incl. error term)
# from bivariate normal distribution with assumed correlation rho=0.5
fun <- function(b) pmnorm(x = c(qnorm(x1[b]), qnorm(x2[b])),
mean = c(0, 0),
varcov = matrix(c(1, 0.5, 0.5, 1), nrow = 2))
set.seed(123)
y <- sapply(1:25, function(b) fun(b)) + runif(25)/1000
# put it in data frame
dat <- data.frame(y=y, x1=x1, x2=x2 )
# 2 : calculate non-linear regression from the generated data
# use rho=0.51 as starting value
fun <- function(x1, x2,rho) pmnorm(x = c(qnorm(x1), qnorm(x2)),
mean = c(0, 0),
varcov = matrix(c(1, rho, rho, 1), nrow = 2))
nls(formula= y ~ fun(x1, x2, rho), data= dat, start=list(rho=0.51),
lower=0, upper=1, trace=TRUE)
This yields an error message:
Error in nls(formula = y ~ fun(x1, x2, rho), data = dat, start = list(rho = 0.51), :
singular gradient
In addition: Warning message:
In nls(formula = y ~ fun(x1, x2, rho), data = dat, start = list(rho = 0.51), :
upper and lower bounds ignored unless algorithm = "port"
What I don't understand is:
I have only one parameter (rho), so there is only one gradient, which would have to be 0 for the matrix of gradients to be singular. Why should the gradient be 0?
The start value cannot be the problem, as I know the true rho = 0.5, so a start value of 0.51 should be fine, shouldn't it?
The data cannot be completely linearly dependent, as I added an error term to y.
I would appreciate any help very much. Thanks in advance.
Perhaps "optim" does a better job than "nls":
library(mnormt)
# given values for independent variables
x1 <- c(rep(0.1,5), rep(0.2,5), rep(0.3,5), rep(0.4,5), rep(0.5,5))
x2 <- c(rep(c(0.1,0.2,0.3,0.4,0.5),5))
## 1 generate values for dependent variable (incl. error term)
# from bivariate normal distribution with assumed correlation rho=0.5
fun <- function(b) pmnorm(x = c(qnorm(x1[b]), qnorm(x2[b])),
mean = c(0, 0),
varcov = matrix(c(1, 0.5, 0.5, 1), nrow = 2))
set.seed(123)
y <- sapply(1:25, function(b) fun(b)) + runif(25)/1000
# put it in data frame
dat <- data.frame(y=y, x1=x1, x2=x2 )
# 2 : calculate non-linear regression from the generated data
# use rho=0.51 as starting value
fun <- function(x1, x2,rho) pmnorm(x = c(qnorm(x1), qnorm(x2)),
mean = c(0, 0),
varcov = matrix(c(1, rho, rho, 1), nrow = 2))
f <- function(rho) {
  sum(sapply(1:nrow(dat), function(i) {
    (fun(dat[i, 2], dat[i, 3], rho) - dat[i, 1])^2
  }))
}
optim(0.51, f, method="BFGS")
The result is not that bad:
> optim(0.51, f, method="BFGS")
$par
[1] 0.5043406
$value
[1] 3.479377e-06
$counts
function gradient
14 4
$convergence
[1] 0
$message
NULL
Maybe even a little bit better than 0.5:
> f(0.5043406)
[1] 3.479377e-06
> f(0.5)
[1] 1.103484e-05
Let's check another start value:
> optim(0.8, f, method="BFGS")
$par
[1] 0.5043407
$value
[1] 3.479377e-06
$counts
function gradient
28 6
$convergence
[1] 0
$message
NULL
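Since there is only one parameter, optimize is an even simpler sketch of the same idea (reusing the least-squares function f defined above; the interval c(0, 1) assumes the correlation is known to be positive):
optimize(f, interval = c(0, 1))
# $minimum should again be close to 0.504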

Creating function arguments from a named list (with an application to stats4::mle)

I should start by saying what I'm trying to do: I want to use the mle function without having to re-write my log likelihood function each time I want to try a different model specification. Because mle is expecting a named list of starting values, you apparently cannot just write the log-likelihood function as taking a vector of parameters. A simple example:
Suppose I want to fit a linear regression model via maximum likelihood and at first, I'm ignoring one of my predictors:
n <- 100
df <- data.frame(x1 = runif(n), x2 = runif(n), y = runif(n))
Y <- df$y
X <- model.matrix(lm(y ~ x1, data = df))
# define log-likelihood function
ll <- function(beta0, beta1, sigma){
  beta = matrix(NA, nrow=2, ncol=1)
  beta[,1] = c(beta0, beta1)
  -sum(log(dnorm(Y - X %*% beta, 0, sigma)))
}
library(stats4)
mle(ll, start = list(beta0 = .1, beta1 = .2, sigma = 1))
Now, if I want to fit a different model, say:
m <- lm(y ~ x1 + x2, data = df)
I cannot re-use my log-likelihood function--I'd have to re-write it to have the beta3 parameter. What I'd like to do is something like:
ll.flex <- function(theta){
# theta is a vector that I can use directly
...
}
if I could then somehow adjust the start argument in mle to account for my now vector-input log-likelihood function; or, barring that, have a function that constructs the log-likelihood function at run time, say by constructing the named list of arguments and then using it to define the function, e.g., something like this:
X <- model.matrix(lm(y ~ x1 + x2, data = df))
arguments <- rep(NA, dim(X)[2])
names(arguments) <- colnames(X)
ll.magic <- function(bring.this.to.life.as.function.arguments(arguments)){...}
Update:
I ended up writing a helper function that can add an arbitrary number of named arguments x1, x2, x3... to a passed function f.
add.arguments <- function(f, n){
  # adds n arguments to a function f; returns that new function
  t = paste("arg <- alist(",
            paste(sapply(1:n, function(i) paste("x", i, "=", sep="")), collapse=","),
            ")", sep="")
  formals(f) <- eval(parse(text=t))
  f
}
It's ugly, but it got the job done, letting me re-factor my log-likelihood function on the fly.
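Here is a tiny usage sketch of the helper above (the names g and g3 are purely illustrative): it gives a bare function three formal arguments x1, x2, x3.
g <- function() NULL
g3 <- add.arguments(g, 3)
formals(g3)  # now lists x1, x2 and x3 (all without defaults)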
You can use the mle2 function from the package bbmle which allows you to pass vectors as parameters. Here is some sample code.
# REDEFINE LOG LIKELIHOOD
ll2 = function(params){
  beta = matrix(NA, nrow = length(params) - 1, ncol = 1)
  beta[,1] = params[-length(params)]
  sigma = params[[length(params)]]
  minusll = -sum(log(dnorm(Y - X %*% beta, 0, sigma)))
  return(minusll)
}
# REGRESS Y ON X1
X <- model.matrix(lm(y ~ x1, data = df))
mle2(ll2, start = c(beta0 = 0.1, beta1 = 0.2, sigma = 1),
vecpar = TRUE, parnames = c('beta0', 'beta1', 'sigma'))
# REGRESS Y ON X1 + X2
X <- model.matrix(lm(y ~ x1 + x2, data = df))
mle2(ll2, start = c(beta0 = 0.1, beta1 = 0.2, beta2 = 0.1, sigma = 1),
vecpar = TRUE, parnames = c('beta0', 'beta1', 'beta2', 'sigma'))
This gives you
Call:
mle2(minuslogl = ll2, start = c(beta0 = 0.1, beta1 = 0.2, beta2 = 0.1,
sigma = 1), vecpar = TRUE, parnames = c("beta0", "beta1",
"beta2", "sigma"))
Coefficients:
beta0 beta1 beta2 sigma
0.5526946 -0.2374106 0.1277266 0.2861055
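As a quick sanity check (a sketch, not part of the original answer), the beta estimates should essentially match ordinary least squares, since the Gaussian MLE of the regression coefficients equals OLS:
coef(lm(y ~ x1 + x2, data = df))  # should agree with the beta estimates above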
It might be easier to use optim directly; that's what mle is using anyway.
ll2 <- function(par, X, Y){
  beta <- matrix(c(par[-1]), ncol=1)
  -sum(log(dnorm(Y - X %*% beta, 0, par[1])))
}
getp <- function(X, sigma=1, beta=0.1) {
  p <- c(sigma, rep(beta, ncol(X)))
  names(p) <- c("sigma", paste("beta", 0:(ncol(X)-1), sep=""))
  p
}
set.seed(5)
n <- 100
df <- data.frame(x1 = runif(n), x2 = runif(n), y = runif(n))
Y <- df$y
X1 <- model.matrix(y ~ x1, data = df)
X2 <- model.matrix(y ~ x1 + x2, data = df)
optim(getp(X1), ll2, X=X1, Y=Y)$par
optim(getp(X2), ll2, X=X2, Y=Y)$par
With the output of
> optim(getp(X1), ll2, X=X1, Y=Y)$par
sigma beta0 beta1
0.30506139 0.47607747 -0.04478441
> optim(getp(X2), ll2, X=X2, Y=Y)$par
sigma beta0 beta1 beta2
0.30114079 0.39452726 -0.06418481 0.17950760
It might not be what you're looking for, but I would do this as follows:
mle2(y ~ dnorm(mu, sigma),parameters=list(mu~x1 + x2), data = df,
start = list(mu = 1,sigma = 1))
mle2(y ~ dnorm(mu,sigma), parameters = list(mu ~ x1), data = df,
start = list(mu=1,sigma=1))
You might be able to adapt this formulation for a multinomial, although dmultinom might not work -- you might need to write a Dmultinom() that took a matrix of multinomial samples and returned a (log)probability.
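A hedged sketch of that Dmultinom() idea (the name and interface are hypothetical): take a matrix with one multinomial sample per row and return the joint (log-)probability under a shared probability vector.
Dmultinom <- function(x, prob, log = FALSE) {
  ## sum per-row log-probabilities from the built-in dmultinom()
  ll <- sum(apply(x, 1, dmultinom, prob = prob, log = TRUE))
  if (log) ll else exp(ll)
}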
The R code that Ramnath provided can also be applied to the optim function, because optim takes vectors as parameters as well.
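For example (a sketch, assuming ll2 is Ramnath's vector-parameter version and X, Y are set up as in the "REGRESS Y ON X1" step):
optim(c(beta0 = 0.1, beta1 = 0.2, sigma = 1), ll2)$par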
