I defined a function called 'fun5' as follows:
fun5 <- function(y, mu = mu0, lsig = lsig0) {
  res = exp(y)/(1+exp(y)) * 1/sqrt(2*pi)/exp(lsig) * exp(-(y-mu)^2/2/exp(lsig)^2)
  return(res)
}
Then I integrated the function from negative infinity to positive infinity, passing the two parameters:
integrate(fun5,-Inf,Inf,mu=2.198216,lsig=-3)$value
This integral gives the expectation of a random variable that has a logit-normal distribution with mu = 2.198216 and sigma = exp(-3). This error occurred:
Error in integrate(fun5, -Inf, Inf, mu = 2.198216, lsig = -3) :
non-finite function value
Since the function 'fun5' is a random variable between 0 and 1 multiplied by a probability density, it should be positive everywhere, though it might be very close to zero. I don't understand why it has a non-finite value somewhere. Could anybody give me some advice?
The problem is that the function
exp(y)/(1+exp(y))
evaluates to NaN when y is too big, because both the numerator and the denominator overflow to Inf. You can avoid this by replacing it with 1 when y is big. This function will do the trick:
fun5 <- function(y, mu = mu0, lsig = lsig0) {
  res = ifelse(y < 100,
               exp(y)/(1+exp(y)) * 1/sqrt(2*pi)/exp(lsig) * exp(-(y-mu)^2/2/exp(lsig)^2),
               1/sqrt(2*pi)/exp(lsig) * exp(-(y-mu)^2/2/exp(lsig)^2))
  return(res)
}
and now this will work
integrate(fun5,-Inf,Inf,mu=2.198216,lsig=-3)$value
[1] 0.9
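The failure is easy to reproduce on its own: both exp(y) and 1 + exp(y) overflow to Inf once y exceeds about 710, and Inf/Inf is NaN.
exp(1000)/(1 + exp(1000))
# [1] NaN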
We can use the fact that exp(y)/(1+exp(y)) is the same as 1 - 1/(1+exp(y)), or equivalently 1/(1+exp(-y)). The last form is numerically safe for large y, because exp(-y) simply underflows to 0 instead of overflowing:
fun5 <- function(y,mu=mu0,lsig=lsig0) 1/(1+exp(-y)) / sqrt(2*pi)/exp(lsig) * exp(-(y-mu)^2/2/exp(lsig)^2)
> integrate(fun5,-Inf,Inf,mu=2.198216,lsig=-3)$value
[1] 0.9
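As a side note (not part of either answer above), base R's plogis() evaluates the logistic function 1/(1+exp(-y)) in a numerically stable way, and dnorm() supplies the normal density, so the same integrand can be written more compactly:
fun5 <- function(y, mu = mu0, lsig = lsig0) plogis(y) * dnorm(y, mean = mu, sd = exp(lsig))
integrate(fun5, -Inf, Inf, mu = 2.198216, lsig = -3)$value
# ~0.9, matching the results above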
I'm trying to estimate parameters that will maximize the likelihood of a certain event. My objective function looks like that:
event_prob = function(p1, p2) {
x = ((1-p1-p2)^4)^67 *
((1-p1-p2)^3*p2)^5 *
((1-p1-p2)^3*p1)^2 *
((1-p1-p2)^2*p1*p2)^3 *
((1-p1-p2)^2*p1^2) *
((1-p1-p2)*p1^2*p2)^2 *
(p1^3*p2) *
(p1^4)
return(x)
}
In this case, I'm looking for p1 and p2 in [0,1] that will maximize this function. I tried using optim() in the following manner:
aaa = optim(c(0,0),event_prob)
but I'm getting an error "Error in fn(par, ...) : argument "p2" is missing, with no default".
Am I using optim() wrong? Or is there a different function (package?) I should be using for multi-parameter optimization?
This problem can in fact be solved analytically.
The objective function simplifies to
F(p1,p2) = (1-p1-p2)^299 * p1^18 * p2^11
which is to be maximised over the region
C = { (p1,p2) | 0<=p1, 0<=p2, p1+p2<=1 }
Note that F is 0 if p1 = 0 or p2 = 0 or p1+p2 = 1, while if none of those hold then F is positive. Thus the maximum of F occurs in the interior of C.
Taking the log gives
f(p1,p2) = 299*log(1-p1-p2) + 18*log(p1) + 11*log(p2)
In fact it is just as easy to solve the more general problem: maximise f over C, where
f(p1,...,pN) = b*log(1 - p1 - ... - pN) + Sum{ a[j]*log(p[j]) }
where b and each a[j] are positive, and
C = { (p1,...,pN) | 0 < p[j] for j = 1..N, and p1 + ... + pN < 1 }
The critical point occurs where all the partial derivatives of f are zero, that is, where
-b/(1 - p1 - ... - pN) + a[j]/p[j] = 0,   j = 1..N
which can be written as
b*p[j] + a[j]*(p1 + ... + pN) = a[j],   j = 1..N
or
M*p = a
where M = b*I + a*Ones', and Ones is a vector with each component 1. By the Sherman-Morrison formula, the inverse of M is
inv(M) = (1/b)*(I - a*Ones'/(b + Ones'*a))
Thus the unique critical point is
p^ = inv(M)*a = a/(b + Sum{ a[i] })
Since there is a maximum, and only one critical point, the critical point must be the maximum.
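As a quick numerical check of the closed form, reading off a = (18, 11) and b = 299 from the simplification above (a sketch, not part of the original derivation):
a <- c(18, 11)
b <- 299
p_hat <- a / (b + sum(a))  # the unique critical point, hence the maximum
p_hat
# [1] 0.05487805 0.03353659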
Based on Erwin Kalvelagen's comment: Redefine your function event_prob:
event_prob = function(p) {
p1 = p[1]
p2 = p[2]
x = ((1-p1-p2)^4)^67 *
((1-p1-p2)^3*p2)^5 *
((1-p1-p2)^3*p1)^2 *
((1-p1-p2)^2*p1*p2)^3 *
((1-p1-p2)^2*p1^2) *
((1-p1-p2)*p1^2*p2)^2 *
(p1^3*p2) *
(p1^4)
return(x)
}
You may want to set limits to ensure that p1 and p2 fulfill your constraints, and note that optim() minimizes by default, so you need control = list(fnscale = -1) to maximize. Also avoid starting at c(0.5, 0.5): there 1-p1-p2 = 0, so the function and its gradient are exactly zero and L-BFGS-B stalls immediately.
optim(c(0.1, 0.1), event_prob, method = "L-BFGS-B",
      lower = 0, upper = 1, control = list(fnscale = -1))
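Numerically it is often more robust to maximise the log of the objective instead. Here is a sketch using the simplified exponents from the other answer; the guard against infeasible points is my addition:
log_event_prob <- function(p) {
  # outside the feasible region the likelihood is 0, so its log is -Inf
  if (any(p <= 0) || sum(p) >= 1) return(-Inf)
  299 * log(1 - p[1] - p[2]) + 18 * log(p[1]) + 11 * log(p[2])
}
# the default Nelder-Mead method tolerates the -Inf guard; fnscale = -1 maximizes
optim(c(0.1, 0.1), log_event_prob, control = list(fnscale = -1))$par
# should be close to the analytic optimum c(18, 11)/328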
I would like to minimize a distance function ||dz - z|| under the constraint that g(z) = 0.
I wanted to use Lagrange Multipliers to solve this problem. Then I used NLsolve.jl to solve the non-linear equation that I end up with.
using NLsolve
using ForwardDiff
using LinearAlgebra: norm, dot
function ProjLagrange(dz, g::Function)
    λ_init = ones(size(g(dz...), 1))
    initial_x = vcat(dz, λ_init)

    function gradL!(F, x)
        len_dz = length(dz)
        z = x[1:len_dz]
        λ = x[len_dz+1:end]

        F = Array{Float64}(undef, length(x))
        my_distance(z) = norm(dz - z)
        ∇f = z -> ForwardDiff.gradient(my_distance, z)
        F[1:len_dz] = ∇f(z) .- dot(λ, g(z...))
        if length(λ) == 1
            F[end] = g(z...)
        else
            F[len_dz+1:end] = g(z)
        end
    end
    nlsolve(gradL!, initial_x)
end
g_test(x1, x2, x3) = x1^2 + x2 - x2 + 5
z = [1000,1,1]
ProjLagrange(z, g_test)
But I always end up with Zero: [NaN, NaN, NaN, NaN] and Convergence: false.
Just so you know I have already solved the equation by using Optim.jl and minimizing the following function: Proj(z) = b * sum(abs.(g(z))) + a * norm(dz - z).
But I would really like to know if this is possible with NLsolve. Any help is greatly appreciated!
Starting almost from scratch, using Wikipedia's Lagrange multiplier page as a reference (it was a good refresher for me), the code below seemed to work. I added a λ₀s argument to the ProjLagrange function so that it can accept a vector of initial multiplier values λ (I saw you initialized them at 1.0, but I thought this was more generic). (Note this has not been optimized for performance!)
using NLsolve, ForwardDiff, LinearAlgebra
function ProjLagrange(x₀, λ₀s, gs, n_it)
    # distance function from x₀ and its gradients
    f(x) = norm(x - x₀)
    ∇f(x) = ForwardDiff.gradient(f, x)
    # gradients of the constraints
    ∇gs = [x -> ForwardDiff.gradient(g, x) for g in gs]
    # Form the auxiliary function and its gradients
    ℒ(x, λs) = f(x) - sum(λ * g(x) for (λ, g) in zip(λs, gs))
    ∂ℒ∂x(x, λs) = ∇f(x) - sum(λ * ∇g(x) for (λ, ∇g) in zip(λs, ∇gs))
    ∂ℒ∂λ(x, λs) = [g(x) for g in gs]
    # as a function of a single argument
    nx = length(x₀)
    ℒ(v) = ℒ(v[1:nx], v[nx+1:end])
    ∇ℒ(v) = vcat(∂ℒ∂x(v[1:nx], v[nx+1:end]), ∂ℒ∂λ(v[1:nx], v[nx+1:end]))
    # and solve
    v₀ = vcat(x₀, λ₀s)
    nlsolve(∇ℒ, v₀, iterations=n_it)
end
# test
gs_test = [x -> x[1]^2 + x[2] - x[3] + 5]
λ₀s_test = [1.0]
x₀_test = [1000.0, 1.0, 1.0]
n_it = 100
res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
gives me
julia> res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.800027199717013, -49.52026655749088, 51.520266557490885, -0.050887973682118504]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
I altered your code as below (see my comments in there) and got the following output. It doesn't throw NaNs anymore, reduces the objective and converges. Does this differ from your Optim.jl results?
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.80003, -49.5203, 51.5203, -0.050888]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
using NLsolve
using ForwardDiff
using LinearAlgebra: norm, dot
using Plots
function ProjLagrange(dz, g::Function, n_it)
    λ_init = ones(size(g(dz), 1))
    initial_x = vcat(dz, λ_init)

    # These definitions can go outside as well
    len_dz = length(dz)
    my_distance = z -> norm(dz - z)
    ∇f = z -> ForwardDiff.gradient(my_distance, z)
    # In fact, this is probably the most vital difference w.r.t. your proposal:
    # we need the gradient of the constraint.
    ∇g = z -> ForwardDiff.gradient(g, z)

    function gradL!(F, x)
        z = x[1:len_dz]
        λ = x[len_dz+1:end]
        # `F` is memory allocated by NLsolve to store the residual of the
        # respective call of `gradL!` and hence doesn't need to be allocated
        # anew every time (or at all).
        F[1:len_dz] = ∇f(z) .- λ .* ∇g(z)
        F[len_dz+1:end] .= g(z)
    end

    return nlsolve(gradL!, initial_x, iterations=n_it, store_trace=true)
end
# Presumably something is wrong here: x2 - x2 is not very likely; also made it
# callable directly with an array argument
g_test = x -> x[1]^2 + x[2] - x[3] + 5
z = [1000,1,1]
n_it = 10000
res = ProjLagrange(z, g_test, n_it)
# Ugly reformatting here
trace = hcat([[state.iteration; state.fnorm; state.stepnorm] for state in res.trace.states]...)
plot(trace[1,:], trace[2,:], label="f(x) inf-norm", xlabel="steps")
[Plot: evolution of the inf-norm of f(x) over the iteration steps]
[Edit: Adapted solution to incorporate correct gradient computation for g()]
I have two equations. They are as follows:
(1 - 0.25^(1/alpha)) * lambda = 85
(1 - 0.75^(1/alpha)) * lambda = 11
I would like to compute the values of alpha and lambda by solving the above two equations. How do I do this using R?
One approach is to translate it into an optimization problem by introducing a loss function:
loss <- function(X) {
L = X[1]
a = X[2]
return(sum(c(
(1 - 0.25^(1/a))*L - 85,
(1 - 0.75^(1/a))*L - 11
)^2))
}
nlm(loss, c(-1,-1))
If the result returned from nlm() has a minimum near zero, then estimate will be a vector containing lambda and alpha. When I tried this, I got an answer that passed the sniff test:
> a = -1.28799
> L = -43.95321
> (1 - 0.25^(1/a))*L
[1] 84.99999
> (1 - 0.75^(1/a))*L
[1] 11.00005
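For completeness, a minimal sketch of extracting the fitted values from nlm() (in the loss function above, X[1] is lambda and X[2] is alpha):
fit <- nlm(loss, c(-1, -1))
fit$minimum           # should be near zero if both equations are satisfied
lambda <- fit$estimate[1]
alpha <- fit$estimate[2]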
#olooney's answer is best.
Another way to solve these equations is to use the uniroot function. Dividing one equation by the other cancels lambda, so uniroot can find the value of alpha; then substitute back to find lambda.
f <- function(x) {
(11/85) - ((1 - (0.75) ^ (1/x)) / (1 - (0.25) ^ (1/x)) )
}
f_alpha <- uniroot(f, lower = -10, upper = -1, extendInt = "yes")
f_lambda <- function(x) {
11 - ((1 - (0.75) ^ (1/f_alpha$root)) * x)
}
lambda = uniroot(f_lambda, lower = -10, upper = -2, extendInt = "yes")$root
sprintf("Alpha equals %f", f_alpha$root)
sprintf("Lambda equals %f", lambda)
results in
[1] "Alpha equals -1.287978"
[1] "Lambda equals -43.952544"
The integrate() function returns the integrated value, but what if the user wants the integral as a symbolic expression of the interval endpoints?
For example, the normal case of integrate() is like below:
integrate(f = function(x){2 * x}, lower = 1, upper = 2)
3 with absolute error < 3.3e-14
But I want to write something like this:
integrate(f = function(x){2 * x}, lower = t, upper = t + 1)
to get
2 * t + 1
Thanks
The Ryacas package does symbolic computation:
install.packages("Ryacas")
library(Ryacas)
help(pac=Ryacas)
yacas("Integrate(x,t,t+1)2*x")
# expression((t + 1)^2 - t^2)
Simplify("%") # apply simplification to last result
# expression(2 * t + 1)
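A numeric spot check of the symbolic result, picking an arbitrary value t = 3 for illustration:
t0 <- 3
integrate(f = function(x){2 * x}, lower = t0, upper = t0 + 1)$value
# [1] 7
2 * t0 + 1
# [1] 7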
Since it's a short question I'll leave out the usual background information (if you need it, I'll add it).
At the end there is a data frame called Coefficients:
Serial_number Fixed_effects_beta_0 Fixed_effects_beta_1 Fixed_effects_beta_2 Fixed_effects_beta_3 Random_effects_beta_0 Random_effects_beta_1 Random_effects_beta_2 Random_effects_beta_3 p0_fixed p1_fixed p2_fixed p3_fixed p0_random p1_random p2_random p3_random Fitted_Voltage
1 912009913 1.238401 13.19572 -0.08379988 1.366747 -0.039642999 -0.40767221 -0.25476169 -0.11315457 -11.92334 0.1177605 -0.0003777831 4.328852e-07 0.56414753 -0.006946270 2.736287e-05 -3.583906e-08 352.9476
(...)
and for each row I want to apply the function
inverse = function (f, lower = lower_limit, upper = 450) {
  function (y) uniroot((function (x) f(x) - y), lower = lower_limit, upper = upper)[1]
  # function (y) polyroot((function (x) f(x) - y), lower = lower_limit, upper = upper)[1]
}
function_to_observe = inverse((function(x=150)
  exp(
    exp(
      sum(
        Coefficients[running_row,"p0_fixed"] * x^0,
        Coefficients[running_row,"p1_fixed"] * x^1,
        Coefficients[running_row,"p2_fixed"] * x^2,
        Coefficients[running_row,"p3_fixed"] * x^3
      ))
  )
)
, 50, 450)
by making use of values stored in each row and in certain columns of the data frame as follows:
for(i in 1:nrow(Coefficients)){
Coefficients[i,"Fitted_Voltage"]<- function_to_observe(150)
}
Unfortunately this does not work, since Coefficients[i,"Fitted_Voltage"] <- function_to_observe(150) does not take the different rows of Coefficients into account.
What's the remedy? And why can I not do the following:
for(i in 1:nrow(Coefficients)){
  Coefficients[i,"Fitted_Voltage"] <- inverse((function(x=150)
    exp(
      exp(
        sum(
          Coefficients[i,"p0_fixed"] * x^0,
          Coefficients[i,"p1_fixed"] * x^1,
          Coefficients[i,"p2_fixed"] * x^2,
          Coefficients[i,"p3_fixed"] * x^3
        ))
    )
  )
  , 50, 450)
}
This yields:
Error in x[[jj]][iseq] <- vjj :
incompatible types (from closure to double) in subassignment type fix
Thanks a lot in advance for any help!
Update: With the help of mathdotrandom I tried a bit and got the following:
lower_limit <- 0

inverse = function (f, lower = lower_limit, upper = 450) {
  function (y) uniroot((function (x) f(x) - y), lower = lower_limit, upper = upper)[1]
}

function_to_observe <- inverse((function(x=150)
  exp(
    exp(
      sum(
        Coefficients[i,"p0_fixed"] * x^0,
        Coefficients[i,"p1_fixed"] * x^1,
        Coefficients[i,"p2_fixed"] * x^2,
        Coefficients[i,"p3_fixed"] * x^3
      ))))
  , 50, 550)

for(i in 1:nrow(Coefficients)){
  Coefficients[i, "Fitted_Voltage"] <- function_to_observe(150)
}

Coefficients["Fitted_Voltage"]
which yields reasonable values:
Fitted_Voltage
1 352.9476
2 352.9476
3 352.9476
4 352.9476
5 352.9476
6 352.9476
7 352.9476
8 352.9476
9 352.9476
10 352.9476
11 352.9476
12 352.9476
13 352.9476
14 352.9476
15 352.9476
Though I do not understand the syntax, I guess this is correct, since it does what it should.
function(x=150) does not run the function but makes 150 the default value of x. So you are trying to put a function definition into your data.frame; that's why R complains about the type closure (function). The easiest fix is to give the function a name, define it outside of the for loop, and then call it.
If you really want to use it as a lambda function, check out this question and lebatsnok's answer: lambda-like functions in R?
The inverse function should not return a function but a number. uniroot expects a function, so f should be a function, as you did. R will actually look up the value of i or running_row from the enclosing environment if you don't pass it as a parameter.
Coefficients <- data.frame("Fitted_Voltage"=c(0,0), "p0_fixed"=c(10^-1, 10^-2),
                           "p1_fixed"=c(10^-2, 10^-3), "p2_fixed"=c(10^-3, 10^-4),
                           "p3_fixed"=c(10^-4, 10^-5))

f <- function(x=150) exp(exp(sum(Coefficients[running_row,"p0_fixed"] * x^0,
                                 Coefficients[running_row,"p1_fixed"] * x^1,
                                 Coefficients[running_row,"p2_fixed"] * x^2,
                                 Coefficients[running_row,"p3_fixed"] * x^3)))

inverse = function (f, lower_limit, upper = 450) {
  y = (f(lower_limit) + f(upper))/2
  uniroot(function(x) (f(x) - y), lower = lower_limit, upper = upper)[1]
}

for(running_row in 1:nrow(Coefficients)){
  Coefficients[running_row, "Fitted_Voltage"] <- inverse(f, -1, 1)
}
But your function is always positive, because you used exp and exp(x) > 0 for all x, so uniroot cannot find a zero of that function. Also, polyroot can only find zeros of polynomials, but you are using an exponential function. Are you sure that your function should look like e^(e^(c_0 + c_1*x + c_2*x^2 + c_3*x^3))?
I subtract a value in inverse to make it have a root, but I don't know if this makes any sense in your context. Also, because of the double exponential the function gets big really fast, so even for small coefficients it returns infinity for lower limit 50 and upper limit 450; I needed to use -1 and 1 as limits to get some results. But this should be somehow similar to what you want.
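To see the overflow concretely, here is the inner polynomial at x = 450 with the example coefficients from the first row above (an illustration of the point, not part of the original answer):
x <- 450
inner <- sum(10^-1 * x^0, 10^-2 * x^1, 10^-3 * x^2, 10^-4 * x^3)
inner            # roughly 9320
exp(exp(inner))  # exp(9320) already overflows to Inf, and exp(Inf) is Inf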
Following mathdotrandom's suggestion, you can define the function outside the loop. Try this:
inner.f <- function(x=150, i){
  exp(
    exp(
      sum(
        Coefficients[i,"p0_fixed"] * x^0,
        Coefficients[i,"p1_fixed"] * x^1,
        Coefficients[i,"p2_fixed"] * x^2,
        Coefficients[i,"p3_fixed"] * x^3
      ))
  )
}
then, inside the loop (note that inverse() as defined in the question returns a function of y, so it still has to be called with the target value 150):

for(i in 1:nrow(Coefficients)){
  Coefficients[i,"Fitted_Voltage"] <- inverse(function(x) inner.f(x, i), 50, 450)(150)
}