I am trying to solve a minimization problem in R with nlminb as part of a statistical problem. However, the solution returned by nlminb does not agree with the plot of the function I am trying to minimize. This is the R code of the objective function:
library(cubature)
Objective_Function <- function(p0){
F2 <- function(x){
u.s2 <- x[1]
u.c0 <- x[2]
u.k0 <- x[3]
s2 <- u.s2^(-1) - 1
c0 <- u.c0^(-1) - 1
k0 <- u.k0/p0
L <- 1/2 * c0 * s2 - 1/c0 * log(1 - k0 * p0)
A <- 1 - pnorm(L, mean = 1, sd = 1)
A <- A * dgamma(k0, shape = 1, rate = 1)
A <- A * dgamma(c0, shape = 1, rate = 1)
A <- A * dgamma(s2, shape = 1, rate = 1)
A * u.s2^(-2) * u.c0^(-2) * 1/p0
}
Pr <- cubature::adaptIntegrate(f = F2,
lowerLimit = rep(0, 3),
upperLimit = rep(1, 3))$integral
A <- 30 * Pr * (p0 - 0.1)
B <- 30 * Pr * (1 - Pr) * (p0 - 0.1)^2
0.4 * B + (1 - 0.4) * (-A)
}
Following the R-command
curve(Objective_Function, 0.1, 4)
one observes a critical point close to 2. However, when one executes
nlminb(start = runif(1, min = 0.1, max = 4),
objective = Objective_Function,
lower = 0.1, upper = 4)$par
the minimum of the function takes place at the point 0.6755844.
I was wondering if you could tell me where my mistake is, please.
Is there any reliable R-command to solve optimization problems?
If this is a very basic question, I apologize.
Thank you for your help.
The problem is not nlminb() but the fact that you have not provided a vectorized function to curve(): Objective_Function() accepts a single scalar p0, whereas curve() expects a function it can evaluate on a whole vector of x values. You can get the correct figure using the following code, from which you can see that nlminb() does indeed find the minimum:
min_par <- nlminb(start = runif(1, min = 0.1, max = 4),
objective = Objective_Function,
lower = 0.1, upper = 4)$par
vec_Objective_Function <- function (x) sapply(x, Objective_Function)
curve(vec_Objective_Function, 0.1, 4)
abline(v = min_par, lty = 2, col = 2)
In addition, for univariate optimization you can also use the function optimize(), i.e.,
optimize(Objective_Function, c(0.1, 4))
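Another option, equivalent to the sapply() wrapper above, is to let base R's Vectorize() build the elementwise wrapper for you; a minimal sketch:
vec_Objective_Function2 <- Vectorize(Objective_Function)  # applies the function elementwise, which is what curve() needs
curve(vec_Objective_Function2, 0.1, 4)
abline(v = min_par, lty = 2, col = 2)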
I am new to R. I want to do some parameter estimation using maximum likelihood estimation.
Here is my attempt:
The data are
my_data = c(0.1,0.2,1,1,1,1,1,2,3,6,7,11,12,18,18,18,18,18,21,32,36,40,
45,45,47,50,55,60,63,63,67,67,67,67,72,75,79,82,82,83,
84,84,84,85,85,85,85,85,86,86)
and
lx <- function(p,x){
l <- p[1]
b <- p[2]
a <- p[3]
n <- length(x)
lnL <- n*log(l)+n*log(b)+n*log(a)+(b-1)*sum(log(x))+(a-1)*sum(log(1+l*x^b))+n-sum(1+l*x^b)
return(-lnL)
}
Note: l is λ, b is β, and a is α.
And here is the optim() call:
optim(p=c(1,1,1),fn = lx, method = "L-BFGS-B",
lower = c(0.0001, 0.0001, 0.0001),
control = list(), hessian = FALSE, x = my_data)
After I run this code, I get an error message:
Error in optim(p = c(1, 1, 1), fn = lx, method = "L-BFGS-B", lower = c(1e-04, :
objective function in optim evaluates to length 50 not 1
What's wrong with my code? Can you help me to fix it? Thanks in advance!
Instead of coding the log-likelihood yourself, use MASS::fitdistr, which does the maximum-likelihood fitting directly from the density function.
#
# Power Generalized Weibull distribution
#
# x > 0, alpha, beta, lambda > 0
#
dpowergweibull <- function(x, alpha, beta, lambda){
f1 <- lambda * beta * alpha
f2 <- x^(beta - 1)
f3 <- (1 + lambda * x^beta)^(alpha - 1)
f4 <- exp(1 - (1 + lambda * x^beta)^alpha)
f1 * f2 * f3 * f4
}
ppowergweibull <- function(q, alpha, beta, lambda){
1 - exp(1 - (1 + lambda * q^beta)^alpha)
}
my_data <- c(0.1,0.2,1,1,1,1,1,2,3,6,7,11,12,18,18,18,18,18,21,32,36,40,
45,45,47,50,55,60,63,63,67,67,67,67,72,75,79,82,82,83,
84,84,84,85,85,85,85,85,86,86)
start_par <- list(alpha = 0.1, beta = 0.1, lambda = 0.1)
y1 <- MASS::fitdistr(my_data, dpowergweibull, start = start_par)
start_par2 <- list(shape = 1, rate = 1)
y2 <- MASS::fitdistr(my_data, "gamma", start = start_par2)
hist(my_data, freq = FALSE)
curve(dpowergweibull(x, y1$estimate[1], y1$estimate[2], y1$estimate[3]),
from = 0.1, to = 90, col = "red", add = TRUE)
curve(dgamma(x, y2$estimate[1], y2$estimate[2]),
from = 0.1, to = 90, col = "blue", add = TRUE)
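If you also want a numerical comparison of the two fits, fitdistr objects carry a log-likelihood, so (assuming both calls above converge without warnings) you can compare them by AIC:
# Lower AIC indicates the better-fitting model
AIC(y1)
AIC(y2)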
I have a problem with data generation and I have no idea how to solve it. All the information is provided in the photo: Problem.
I think that X_i(t) in both cases should be 200 x 100 if we say that t goes from 0 to 1 (length = 100). Furthermore, the coefficient matrix for the polynomial part should be 200 x 4 and the coefficient matrix for the Fourier part should be 200 x 5. But I have no idea how to start solving this problem.
Here is some code. I have already defined my betas, but I cannot figure out how to generate X_i(t).
t <- seq(0, 1, length = 100)
beta_1t <- rep(0, 100)
plot(t, beta_1t, type = "l")
beta_2t <- (t >= 0 & t < 0.342) * ((t - 0.5)^2 - 0.025) +
(t >= 0.342 & t <= 0.658) * 0 +
(t > 0.658 & t <= 1) * (-(t - 0.5)^2 + 0.025)
plot(t, beta_2t, type = "l")
beta_3t <- t^3 - 1.6 * t^2 + 0.76 * t + 1
plot(t, beta_3t, type = "l")
poly_c <- matrix(rnorm(n = 800, mean = 0, sd = 1), ncol = 4)
four_c <- matrix(rnorm(n = 1000, mean = 0, sd = 1), ncol = 5)
As I mentioned before, there should be (X_i(t), Y_i(t)) samples, with i = 1, 2, ..., 200 and t on [0, 1] (length = 100).
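For illustration, a minimal sketch of one way such curves could be built from these coefficient matrices, assuming a monomial basis 1, t, t^2, t^3 for the polynomial part and a Fourier basis 1, sin(2*pi*t), cos(2*pi*t), sin(4*pi*t), cos(4*pi*t); the actual bases are specified in the problem photo, so treat these as placeholders:
# Sketch (assumed bases): each row of a coefficient matrix is combined with
# basis functions evaluated on t, giving a 200 x 100 matrix whose i-th row is X_i(t)
t <- seq(0, 1, length = 100)
poly_basis <- cbind(1, t, t^2, t^3)                         # 100 x 4
X_poly <- poly_c %*% t(poly_basis)                          # 200 x 100
four_basis <- cbind(1, sin(2 * pi * t), cos(2 * pi * t),
                    sin(4 * pi * t), cos(4 * pi * t))       # 100 x 5
X_four <- four_c %*% t(four_basis)                          # 200 x 100
matplot(t, t(X_poly[1:5, ]), type = "l")                    # first five polynomial curves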
I'm trying to use nls(), but I get the error in the title.
The following sample data set resembles the original one:
rh1 = rnorm(301, 0.75, 0.1)
rh1[rh1 > 1] = 1
ta1 = rnorm(301, 302, 3)
y1 = rnorm(301, 0.2, 0.05)
df_test = data.frame(rh1 = rh1,
rh2 = c(NA, rh1[-c(1)]),
ta1 = ta1,
ta2 = c(NA, ta1[-c(1)]),
y1 = y1,
y2 = c(NA, y1[-c(1)]))
df_test = df_test[-c(1), ] # this function cannot estimate for the first value
where rh is relative humidity of the air,
ta is air temperature in K,
and y is moisture content of an object. 1 means today's value; 2 means yesterday's value.
I'm trying to estimate y1 from y2, rh1, rh2, ta1, and ta2 with the model below:
nls(y1 ~
coef1 ^ 2 * y2 +
coef1 * (1 - coef1) *
(coef2 + coef3 * log(-8.3 * ta2 * log(rh2) / 18)) +
(1 - coef1) *
(coef2 + coef3 * log(-8.3 * ta1 * log(rh1) / 18)),
data = df_test,
algorithm = "port",
start = list(coef1 = 0.7,
coef2 = 0.15,
coef3 = 0),
upper = c(exp(-0.00005), Inf, Inf),
lower = c(exp(-0.5), Inf, Inf))
Coef1, 2, and 3 are the parameters to be estimated.
The initial values were determined by manual calculation for the first row of the data.
But this script produced the error in the title:
Missing value or an infinity produced when evaluating the model
I also tried the minpack.lm::nlsLM() function, following the link below:
nls troubles: Missing value or an infinity produced when evaluating the model
library(minpack.lm)
nlsLM(y1 ~
coef1 ^ 2 * y2 +
coef1 * (1 - coef1) *
(coef2 + coef3 * log(-8.3 * ta2 * log(rh2) / 18)) +
(1 - coef1) *
(coef2 + coef3 * log(-8.3 * ta1 * log(rh1) / 18)),
data = df_test,
start = list(coef1 = 0.7,
coef2 = 0.15,
coef3 = 0),
upper = c(exp(-0.00005), Inf, Inf),
lower = c(exp(-0.5), Inf, Inf))
but still got the same error.
There are several issues here.
First off: your lagged values aren't really lagged. Take a look at df_test and you will see that the 1 and 2 columns are identical.
This will give you lagged values:
set.seed(1)
rh1 <- rnorm(301, 0.75, 0.1)
rh1[rh1 > 1] <- 1
ta1 <- rnorm(301, 302, 3)
y1 <- rnorm(301, 0.2, 0.05)
df_test <- data.frame(
rh1 = rh1,
rh2 = c(NA, head(rh1, -1)),
ta1 = ta1,
ta2 = c(NA, head(ta1, -1)),
y1 = y1,
y2 = c(NA, head(y1, -1))
)
df_test <- df_test[complete.cases(df_test), ]
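A quick sanity check (purely illustrative) that the lag is now real:
# rh2 should now be rh1 shifted down by one row, not a copy of rh1
head(df_test[, c("rh1", "rh2")])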
Next:
Missing value or an infinity produced when evaluating the model
Means just that, and my eyes immediately fix on the logs in your expression. We all know that the log of a negative number is undefined (NaN in R), and log(0) is returned as -Inf.
Let's take a look at those expressions
ex1 <- with(df_test, log(-8.3 * ta2 * log(rh2) / 18))
ex2 <- with(df_test, log(-8.3 * ta1 * log(rh1) / 18))
If you look at ex1 and ex2 you will see that both contain a -Inf. Now there's your culprit. But how can we fix this? Let's see which rows in your data give rise to this.
df_test[which(is.infinite(ex1 + ex2)),]
# rh1 rh2 ta1 ta2 y1 y2
# 274 1.0000 0.66481 304.5453 300.5972 0.20930 0.17474
# 275 0.7656 1.00000 304.9603 304.5453 0.20882 0.20930
Interesting, they are right next to each other, and they both contain a 1. What's log(1)? What happens if you multiply it by something and take the log of the product?
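Concretely, when rh is exactly 1 the inner log is 0, so the outer log blows up:
log(1)                          # 0
log(-8.3 * 300 * log(1) / 18)   # log(0) = -Inf (300 is just a representative ta value)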
Let's make sure rh1 and rh2 are always strictly less than 1:
set.seed(1)
rh1 <- rnorm(301, 0.75, 0.1)
rh1[rh1 > 0.99] <- 0.99
ta1 <- rnorm(301, 302, 3)
y1 <- rnorm(301, 0.2, 0.05)
df_test <- data.frame(
rh1 = rh1,
rh2 = c(NA, head(rh1, -1)),
ta1 = ta1,
ta2 = c(NA, head(ta1, -1)),
y1 = y1,
y2 = c(NA, head(y1, -1))
)
df_test <- df_test[complete.cases(df_test), ]
But we're still not done. If you run your nls() call now you'll get the error
Convergence failure: initial par violates constraints
And the cause is obvious if you look at the values you specified for the coefficient constraints: coef2 and coef3 have their lower bounds set to Inf! That doesn't make sense. "initial par violates constraints" usually means that the start values aren't within the constraints, which is definitely the case here. If we change those lower bounds to negative infinity, everything works fine.
nls(y1 ~
coef1 ^ 2 * y2 +
coef1 * (1 - coef1) *
(coef2 + coef3 * log(-8.3 * ta2 * log(rh2) / 18)) +
(1 - coef1) *
(coef2 + coef3 * log(-8.3 * ta1 * log(rh1) / 18)),
data = df_test,
algorithm = "port",
start = list(coef1 = 0.7,
coef2 = 0.15,
coef3 = 0),
upper = c(exp(-0.00005), Inf, Inf),
lower = c(exp(-0.5), -Inf, -Inf)
)
# Nonlinear regression model
# model: y1 ~ coef1^2 * y2 + coef1 * (1 - coef1) * (coef2 + coef3 * log(…
# data: df_test
# coef1 coef2 coef3
# 0.6065 0.2569 -0.0170
# residual sum-of-squares: 1.058
# Algorithm "port", convergence message:
# both X-convergence and relative convergence (5)
I am working with a function that depends on a quadratic B-spline fit estimated up front by the cobs() function from the cobs package. The estimated knots and the corresponding coefficients are given in the code below.
Further on, I require the integral of this function from 0 to some value, for example 0.6 or 0.7. Since my function is strictly positive, the integral should increase as the upper bound increases. However, this is not the case for some values, as shown below for the upper bounds 0.6 and 0.7.
library(cobs)
b <- 0.6724027
xi1 <- 0.002541667
xi2 <- 2.509625
knots <- c(5.000010e-06, 8.700000e-05, 3.420000e-04, 1.344000e-03, 5.292000e-03, 2.082900e-02, 8.198800e-02, 3.227180e-01, 1.270272e+00, 5.000005e+00)
coef <- c(2.509493, 2.508141, 2.466733, 2.378368, 2.239769, 2.063977, 1.874705, 1.601780, 1.288163, 1.262683, 1.432729)
fn <- function(x) {
z <- (2 - b) * (cobs:::.splValue(2, knots, coef, x, 0) - 2 * x * xi1) / xi2 - b
return (z)
}
x <- seq(0, 0.7, 0.0001)
plot(x, fn(x), type = 'l')
integrate(f = fn, 0, 0.6)
# 0.1049019 with absolute error < 1.2e-15
integrate(f = fn, 0, 0.7)
# 0.09714124 with absolute error < 1.1e-15
I know I could integrate directly on the cobs:::.splValue function, and transform the results correspondingly. However, I am interested to know why this strange behaviour occurs.
I think that the algorithm used by integrate() is not behaving well for this integrand. For example, if you modify the lower limit, it works as expected:
> integrate(f = fn, 0.1, 0.6)
0.06794357 with absolute error < 7.5e-16
> integrate(f = fn, 0.1, 0.7)
0.07432096 with absolute error < 8.3e-16
This is common with numerical integration methods; you have to choose a method on a case-by-case basis.
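Before switching methods, one thing that is sometimes worth trying (not verified for this particular integrand) is to make integrate() work harder via its subdivisions and rel.tol arguments:
# Request a finer adaptive subdivision and a tighter relative tolerance;
# whether this fixes the non-monotone results here would need to be checked
integrate(f = fn, 0, 0.7, subdivisions = 1000L, rel.tol = 1e-10)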
I'm using the trapezoidal rule to integrate over the same region, and it works well (original code):
composite.trapezoid <- function(f, a, b, n) {
  if (is.function(f) == FALSE) {
    stop('f must be a function with one parameter (variable)')
  }
  h <- (b - a) / n
  j <- 1:(n - 1)
  xj <- a + j * h
  approx <- (h / 2) * (f(a) + 2 * sum(f(xj)) + f(b))
  return(approx)
}
> composite.trapezoid(f = fn, 0, 0.6, 10000)
[1] 0.1079356
> composite.trapezoid(f = fn, 0, 0.7, 10000)
[1] 0.1143195
If we analyze the behavior of the integral close to the 0.65 region, we can see that there is a problem with the first approach (it is not smooth):
tst = sapply(seq(0.5, 0.8, length.out = 100), function(upper) {
integrate(f = fn, 0, upper)[[1]]
})
plot(seq(0.5, 0.8, length.out = 100), tst)
and that the trapezoid rule behaves better:
tst2 = sapply(seq(0.5, 0.8, length.out = 100), function(upper) {
composite.trapezoid(f = fn, 0, upper, 10000)[[1]]
})
plot(seq(0.5, 0.8, length.out = 100), tst2)
I have faced a problem with passing arguments to optim.
Suppose I want to do box-constrained minimization of a multivariate function, for example
fr <- function(x) { ## Rosenbrock function
x1 <- x[1]
x2 <- x[2]
x3 <- x[3]
x4 <- x[4]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2 +
100 * (x3 - x2 * x2)^2 + (1 - x2)^2 +
100 * (x4 - x3 * x3)^2 + (1 - x3)^2
}
As usual, optim can be called as follows:
optim(par = c(0, 1, 1, 2), fr, method = "L-BFGS-B", lower = c(0, 0, 0, 0), upper = c(3, 3, 3, 3))
Now, suppose this procedure is repeated in an algorithm that changes lower and upper (the box constraints), as well as par, such that in some iterations the values of one, two, or three parameters become known, for example x1 = 1. In this case I expect optim to handle this when the initial value and the lower and upper bounds of x1 are all set to 1:
optim(par = c(1, 1, 1, 2), fr, method = "L-BFGS-B", lower = c(1, 0, 0, 0), upper = c(1, 3, 3, 3))
But running this line gives an error:
Error in optim(par = c(1, 1, 1, 2), fr, method = "L-BFGS-B", lower = c(1, : non-finite finite-difference value [1]
Now, the question is: how can I deal with this behaviour of optim without defining many new functions whenever one or more of the parameters become known?
Thank you in advance
It sounds like optim is not able to handle a parameter whose lower and upper bounds are equal. I suppose you could parameterize your function with the known values and use a simple ifelse() to check whether you should be using the value passed in by optim or the known value:
# Slightly redefined function to optimize
fr2 <- function(opt.x, known.x) {
x <- ifelse(is.na(known.x), opt.x, known.x)
100 * (x[2] - x[1] * x[1])^2 + (1 - x[1])^2 +
100 * (x[3] - x[2] * x[2])^2 + (1 - x[2])^2 +
100 * (x[4] - x[3] * x[3])^2 + (1 - x[3])^2
}
# Optimize, and then replace the appropriate indices of the result with known vals
known.x <- c(NA, 1, NA, 1)
opt.result <- optim(par = c(0, 1, 1, 2), fr2, method = "L-BFGS-B",
lower = c(0, 0, 0, 0), upper = c(3, 3, 3, 3), known.x=known.x)
opt.result$par <- ifelse(is.na(known.x), opt.result$par, known.x)
opt.result
# $par
# [1] 0.9999995 1.0000000 0.9999996 1.0000000
#
# $value
# [1] 1.795791e-10
#
# $counts
# function gradient
# 13 13
#
# $convergence
# [1] 0
#
# $message
# [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
This code basically ignores the indices passed from optim if they are already known, and just uses the known values in those cases.
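An alternative along the same lines (a sketch using the same fr and known.x as above, not a drop-in replacement for the code in this answer) is to expose only the free parameters to optim, so the fixed ones never enter the search space:
# Hypothetical wrapper: build a function of only the free (NA) parameters
make_partial_fn <- function(f, known.x) {
  function(free.x) {
    x <- known.x
    x[is.na(known.x)] <- free.x  # fill the free slots with optim's candidate
    f(x)
  }
}

known.x <- c(NA, 1, NA, 1)
free <- is.na(known.x)
par0 <- c(0, 1, 1, 2)

fit <- optim(par = par0[free], fn = make_partial_fn(fr, known.x),
             method = "L-BFGS-B",
             lower = c(0, 0, 0, 0)[free], upper = c(3, 3, 3, 3)[free])

# Reassemble the full parameter vector
full.par <- known.x
full.par[free] <- fit$par
full.par
This keeps the search dimension equal to the number of unknown parameters and avoids the degenerate equal-bounds case entirely.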