Solve linear equation system - Julia

I have two (mathematical) functions:
y = x
y = -2x + 3
This system is solved by x = 1 and y = 1 (the point where the two lines intersect).
How can I make Julia do this for me?

This is a set of linear equations, so first rearrange them in the following way:
-x + y = 0
2x + y = 3
and you see that they are in the form of a linear equation system A*v = b, where A is a matrix:
julia> A = [-1 1; 2 1]
2×2 Array{Int64,2}:
-1 1
2 1
and b is a vector:
julia> b = [0, 3]
2-element Array{Int64,1}:
0
3
Now v contains your unknown variables x and y. You can now solve the system using the left division operator \:
julia> A\b
2-element Array{Float64,1}:
1.0
1.0
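As a quick sanity check (my addition, not strictly necessary), you can verify that the computed vector really solves the system by multiplying it back with A and comparing against b up to floating-point round-off:
julia> v = A\b;

julia> A*v ≈ b   # isapprox, tolerant of floating-point error
true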
If you had a more general system of non-linear equations, you should use the NLsolve.jl package:
julia> using NLsolve
julia> function f!(F, v)
           x = v[1]
           y = v[2]
           F[1] = -x + y
           F[2] = 2*x + y - 3
       end
f! (generic function with 1 method)
julia> res = nlsolve(f!, [0.0; 0.0])
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.0, 0.0]
* Zero: [1.0000000000003109, 0.9999999999999647]
* Inf-norm of residuals: 0.000000
* Iterations: 2
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 3
julia> res.zero
2-element Array{Float64,1}:
1.0000000000003109
0.9999999999999647
(Note that in f! we define two residuals, F[1] and F[2], that must equal zero at the solution; you have to rearrange your equations into this form.)
For more details on how to use NLsolve.jl, see https://github.com/JuliaNLSolvers/NLsolve.jl.
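As a small optional tweak (my addition; the defaults above already work fine), NLsolve can compute the Jacobian with forward-mode automatic differentiation instead of finite differences by passing the autodiff keyword:
julia> nlsolve(f!, [0.0; 0.0], autodiff = :forward)
This converges to the same zero, [1.0, 1.0], but uses exact derivatives rather than finite-difference approximations.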

Bogumił Kamiński gave an excellent answer. However, just a friendly reminder: the solution may not exist for some other systems of linear equations. In that case, you'll get a SingularException. Consider checking whether the solution exists first. For example,
using LinearAlgebra;
"""
y = x         =>   x - y = 0   =>   |1 -1| X = |0|   =>   AX = B   =>   X = A⁻¹B
y = -2x + 3        2x + y = 3       |2  1|     |3|
"""
function solution()
    A::Matrix{Int64} = Matrix{Int64}([1 -1; 2 1]);
    br::Matrix{Int64} = Matrix{Int64}([0 3]);
    bc = transpose(br);
    # bc::Matrix{Int64} = Matrix{Int64}([0 ;3;;]); # I missed a semicolon, that's why I got an error
    # println(bc);
    if (det(A) == 0) # existence check
        println("solution does not exist. returning...");
        return
    end
    A⁻¹ = inv(A);
    println("solution exists:")
    X = A⁻¹ * bc;
    println(X);
end
solution();
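A related sketch (my addition, not part of the answer above): for floating-point data an exact det(A) == 0 test can be unreliable, so an alternative is to simply attempt the solve and handle the SingularException mentioned earlier:
function solution_safe(A, b)
    try
        return A \ b            # left division, as in the accepted answer
    catch err
        err isa SingularException || rethrow()
        println("solution does not exist. returning...")
        return nothing
    end
end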

Related

User-defined (nonlinear) objective with vectorized variables in JuMP

Is it possible to use vectorized variables with user-defined objective functions in JuMP for Julia? Like so,
using JuMP, GLPK
model = Model(GLPK.Optimizer)
A = [
    1 1 9 5
    3 5 0 8
    2 0 6 13
]
b = [7; 3; 5]
c = [1; 3; 5; 2]
@variable(model, x[1:4] >= 0)
@constraint(model, A * x .== b)
# dummy functions, could be nonlinear hypothetically
identity(x) = x
C(x, c) = c' * x
register(model, :identity, 1, identity; autodiff = true)
register(model, :C, 2, C; autodiff = true)
@NLobjective(model, Min, C(identity(x), c))
This throws the error,
ERROR: Unexpected array VariableRef[x[1], x[2], x[3], x[4]] in nonlinear expression. Nonlinear expressions may contain only scalar expression.
Which sounds like no. Is there a workaround to this? I believe scipy.optimize.minimize is capable of optimizing user-defined objectives with vectorized variables?
No, you cannot pass vector arguments to user-defined functions.
Documentation: https://jump.dev/JuMP.jl/stable/manual/nlp/#User-defined-functions-with-vector-inputs
Issue you opened: https://github.com/jump-dev/JuMP.jl/issues/2854
The following is preferable to Przemysław's answer. His suggestion to wrap things in an @expression won't work if the functions are more complicated.
using JuMP, Ipopt
model = Model(Ipopt.Optimizer)
A = [
    1 1 9 5
    3 5 0 8
    2 0 6 13
]
b = [7; 3; 5]
c = [1; 3; 5; 2]
@variable(model, x[1:4] >= 0)
@constraint(model, A * x .== b)
# dummy functions, could be nonlinear hypothetically
identity(x) = x
C(x, c) = c' * x
my_objective(x...) = C(identity(collect(x)), c)
register(model, :my_objective, length(x), my_objective; autodiff = true)
@NLobjective(model, Min, my_objective(x...))
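To actually solve this variant, you would then call optimize! and query the variable values, just as in the answer below (my addition for completeness):
optimize!(model)
value.(x)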
Firstly, use an optimizer that supports nonlinear models. GLPK does not. Try Ipopt:
using Ipopt
model = Model(Ipopt.Optimizer)
Secondly, JuMP documentation reads (see https://jump.dev/JuMP.jl/stable/manual/nlp/#Syntax-notes):
The syntax accepted in nonlinear macros is more restricted than the syntax for linear and quadratic macros. (...) all expressions must be simple scalar operations. You cannot use dot, matrix-vector products, vector slices, etc.
So you need to wrap the objective function in an expression:
@expression(model, expr, C(identity(x), c))
Now you can do:
@NLobjective(model, Min, expr)
To show that it works I solve the model:
julia> optimize!(model)
This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.
...
Total seconds in IPOPT = 0.165
EXIT: Optimal Solution Found.
julia> value.(x)
4-element Vector{Float64}:
0.42307697548737005
0.3461538282496562
0.6923076931757742
-8.46379887234798e-9
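As a quick sanity check (my addition, not part of the original answer), the optimum should also satisfy the linear constraints up to Ipopt's tolerance:
julia> maximum(abs.(A * value.(x) .- b))   # should be tiny, on the order of the solver tolerance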

Lagrange Multiplier Method using NLsolve.jl

I would like to minimize a distance function ||dz - z|| under the constraint that g(z) = 0.
I wanted to use Lagrange Multipliers to solve this problem. Then I used NLsolve.jl to solve the non-linear equation that I end up with.
using NLsolve
using ForwardDiff
function ProjLagrange(dz, g::Function)
    λ_init = ones(size(g(dz...), 1))
    initial_x = vcat(dz, λ_init)

    function gradL!(F, x)
        len_dz = length(dz)
        z = x[1:len_dz]
        λ = x[len_dz+1:end]

        F = Array{Float64}(undef, length(x))
        my_distance(z) = norm(dz - z)
        ∇f = z -> ForwardDiff.gradient(my_distance, z)
        F[1:len_dz] = ∇f(z) .- dot(λ, g(z...))

        if length(λ) == 1
            F[end] = g(z...)
        else
            F[len_dz+1:end] = g(z)
        end
    end
    nlsolve(gradL!, initial_x)
end
g_test(x1, x2, x3) = x1^2 + x2 - x2 + 5
z = [1000,1,1]
ProjLagrange(z, g_test)
But I always end up with Zero: [NaN, NaN, NaN, NaN] and Convergence: false.
Just so you know I have already solved the equation by using Optim.jl and minimizing the following function: Proj(z) = b * sum(abs.(g(z))) + a * norm(dz - z).
But I would really like to know if this is possible with NLsolve. Any help is greatly appreciated!
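For reference, a minimal sketch of that penalized Optim.jl formulation might look as follows (the weights a and b, the NelderMead solver, and the array-argument form of g are my assumptions, not taken verbatim from the question):
using Optim, LinearAlgebra

dz = [1000.0, 1.0, 1.0]
g(z) = z[1]^2 + z[2] - z[3] + 5           # hypothetical constraint, array-argument form
a, b = 1.0, 1000.0                        # hypothetical penalty weights
Proj(z) = b * sum(abs.(g(z))) + a * norm(dz - z)
res = optimize(Proj, dz, NelderMead())    # derivative-free minimization of the penalized objective
Optim.minimizer(res)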
Starting almost from scratch, with Wikipedia's Lagrange multiplier page as a refresher, the code below seemed to work. I added a λ₀s argument to the ProjLagrange function so that it can accept a vector of initial multiplier λ values (I saw you initialized them at 1.0, but I thought this was more generic). (Note this has not been optimized for performance!)
using NLsolve, ForwardDiff, LinearAlgebra
function ProjLagrange(x₀, λ₀s, gs, n_it)
    # distance function from x₀ and its gradients
    f(x) = norm(x - x₀)
    ∇f(x) = ForwardDiff.gradient(f, x)

    # gradients of the constraints
    ∇gs = [x -> ForwardDiff.gradient(g, x) for g in gs]

    # Form the auxiliary function and its gradients
    ℒ(x, λs) = f(x) - sum(λ * g(x) for (λ, g) in zip(λs, gs))
    ∂ℒ∂x(x, λs) = ∇f(x) - sum(λ * ∇g(x) for (λ, ∇g) in zip(λs, ∇gs))
    ∂ℒ∂λ(x, λs) = [g(x) for g in gs]

    # as a function of a single argument
    nx = length(x₀)
    ℒ(v) = ℒ(v[1:nx], v[nx+1:end])
    ∇ℒ(v) = vcat(∂ℒ∂x(v[1:nx], v[nx+1:end]), ∂ℒ∂λ(v[1:nx], v[nx+1:end]))

    # and solve
    v₀ = vcat(x₀, λ₀s)
    nlsolve(∇ℒ, v₀, iterations=n_it)
end
# test
gs_test = [x -> x[1]^2 + x[2] - x[3] + 5]
λ₀s_test = [1.0]
x₀_test = [1000.0, 1.0, 1.0]
n_it = 100
res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
gives me
julia> res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.800027199717013, -49.52026655749088, 51.520266557490885, -0.050887973682118504]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
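To double-check the result (my addition), you can verify that the constraint is indeed satisfied at the returned point:
julia> x_sol = res.zero[1:3];

julia> gs_test[1](x_sol)   # should be approximately 0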
I altered your code as below (see my comments in there) and got the following output. It doesn't throw NaNs anymore, reduces the objective and converges. Does this differ from your Optim.jl results?
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.80003, -49.5203, 51.5203, -0.050888]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
using NLsolve
using ForwardDiff
using LinearAlgebra: norm, dot
using Plots
function ProjLagrange(dz, g::Function, n_it)
    λ_init = ones(size(g(dz), 1))
    initial_x = vcat(dz, λ_init)

    # These definitions can go outside as well
    len_dz = length(dz)
    my_distance = z -> norm(dz - z)
    ∇f = z -> ForwardDiff.gradient(my_distance, z)
    # In fact, this is probably the most vital difference w.r.t. your proposal.
    # We need the gradient of the constraints.
    ∇g = z -> ForwardDiff.gradient(g, z)

    function gradL!(F, x)
        z = x[1:len_dz]
        λ = x[len_dz+1:end]
        # `F` is memory allocated by NLsolve to store the residual of the
        # respective call of `gradL!` and hence doesn't need to be allocated
        # anew every time (or at all).
        F[1:len_dz] = ∇f(z) .- λ .* ∇g(z)
        F[len_dz+1:end] .= g(z)
    end

    return nlsolve(gradL!, initial_x, iterations=n_it, store_trace=true)
end
# Presumably something is wrong here: x2 - x2 is not very likely; also made it
# callable directly with an array argument
g_test = x -> x[1]^2 + x[2] - x[3] + 5
z = [1000,1,1]
n_it = 10000
res = ProjLagrange(z, g_test, n_it)
# Ugly reformatting here
trace = hcat([[state.iteration; state.fnorm; state.stepnorm] for state in res.trace.states]...)
plot(trace[1,:], trace[2,:], label="f(x) inf-norm", xlabel="steps")
[Plot: evolution of the inf-norm of f(x) over iteration steps]
[Edit: Adapted solution to incorporate correct gradient computation for g()]

Using Lsq-Fit in Julia

I am trying to practice fitting with the LsqFit package in Julia. The model is the derivative of a Cauchy distribution with parameters γ and x_0.
Following this manual I tried
f(x, x_0, γ) = -2*(x - x_0)*(π * γ^3 * (1 + ((x - x_0)/γ)^2)^2)^(-1)
x_0 = 3350
γ = 50
xarr = range(3000, length = 5000, stop = 4000)
yarr = [f(x, x_0, γ) for x in xarr]
using LsqFit
# p ≡ [x_0, γ]
model(x, p) = -2*(x - p[1])*(π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)^(-1)
p0 = [3349, 49]
curve_fit(model, xarr, yarr, p0)
param = fit.param
... and it does not work, giving a MethodError: no method matching -(::StepRangeLen[...]), leaving me confused.
Can please somebody tell me what I am doing wrong?
There are a few issues with what you've written:
the model function is meant to be called with its first argument (x) being the full vector of independent variables, not just one value. This is where the error you mention comes from:
julia> model(x, p) = -2*(x - p[1])*(π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)^(-1);
julia> p0 = [3349, 49];
julia> model(xarr, p0);
ERROR: MethodError: no method matching -(::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, ::Float64)
One way to fix this is to use the dot notation to broadcast all operators so that they work elementwise:
julia> model(x, p) = -2*(x .- p[1]) ./ (π * (p[2])^3 * (1 .+ ((x .- p[1])/p[2]).^2).^2);
julia> model(xarr, p0); # => No error
but if this is too tedious, you can let the @. macro do the work for you:
# just put #. in front of the expression to transform every
# occurrence of a-b into a.-b (and likewise for all operators)
# which means to compute the operation elementwise
julia> model(x, p) = @. -2*(x - p[1])*(π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)^(-1);
julia> model(xarr, p0); # => No error
Another issue is that the parameters you're looking for are meant to be floating-point values. But your initial guess p0 is initialized with integers, which confuses curve_fit. There are two ways of fixing this. Either put floating-point values in p0:
julia> p0 = [3349.0, 49.0]
2-element Array{Float64,1}:
3349.0
49.0
or use a typed array initializer to specify explicitly the element type:
julia> p0 = Float64[3349, 49]
2-element Array{Float64,1}:
3349.0
49.0
This is not really an error, but I would find it more intuitive to compute a/b instead of a*b^(-1). Also, yarr can be computed with a simple broadcast using dot notation instead of a comprehension.
Wrapping all this together:
f(x, x_0, γ) = -2*(x - x_0)*(π * γ^3 * (1 + ((x - x_0)/γ)^2)^2)^(-1)
(x_0, γ) = (3350, 50)
xarr = range(3000, length = 5000, stop = 4000);
# use dot-notation to "broadcast" f and map it
# elementwise to elements of xarr
yarr = f.(xarr, x_0, γ);
using LsqFit
model(x, p) = @. -2*(x - p[1]) / (π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)
p0 = Float64[3300, 10]
fit = curve_fit(model, xarr, yarr, p0)
yields:
julia> fit.param
2-element Array{Float64,1}:
3349.999986535933
49.99999203625603
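Once the fit has run, it is also worth checking that it actually converged and how small the residuals are (my addition; converged and resid are fields of the result returned by curve_fit):
julia> fit.converged        # should be true for this example

julia> sum(abs2, fit.resid) # sum of squared residuals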

non-linear solver always produces zero residual

I am learning how to solve non-linear equations, and Julia (1.3.1) in general, and want to ask how NLsolve should be used.
As a first step, I try the following:
using NLsolve
uni = 1
z = 3
x = [3,2,4,5,6]
y = z .+ x
print(y)
function g!(F, x)
    F[1:uni+4] = y .- z - x
end
nlsolve(g!, [0.5,0.9,1,2,3])
And I confirm that it works, as shown below:
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.5, 0.9, 1.0, 2.0, 3.0]
* Zero: [2.999999999996542, 2.000000000003876, 4.000000000008193, 4.999999999990685, 5.999999999990221]
* Inf-norm of residuals: 0.000000
* Iterations: 2
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 3
Then, I try a more complicated model as follows:
using SpecialFunctions, NLsolve, Random
Random.seed!(1234)
S = 2
# Setting parameters
ε = 1.3
β = 0.4
γ = gamma((ε-1)/ε)
T = rand(S)
E = rand(S)
B = rand(S)
w = rand(S)
Q = rand(S)
d = [1 2 ;2 1 ]
# Construct a model
rvector = T.*Q.^(1-β).*B.^ε
svector = E.* w.^ε
Φ_all = (sum(sum(rvector * svector' .* d )))
π = rvector * svector' .* d ./ Φ_all
# These two are outcome the model
πR = (sum(π,dims=1))'
πM = sum(π,dims=2)
# E is now set as unknown and we want to estimate it given the outcome of the model
function f!(Res, Unknown)
    rvector = T.*Q.^(1-β).*B.^ε
    svector = Unknown[1:S].* w.^ε
    Φ_all = (sum(sum(rvector * svector' .* d)))
    π = rvector * svector' .* d ./ Φ_all
    Res = ones(S+S)
    Res[1:S] = πR - (sum(π,dims=1))'
    Res[S+1:S+S] = πM - sum(π,dims=2)
end
nlsolve(f!, [0.5,0.6])
and this code produces a weird result as follows.
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.5, 0.6]
* Zero: [0.5, 0.6]
* Inf-norm of residuals: 0.000000
* Iterations: 0
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 1
* Jacobian Calls (df/dx): 1
So, essentially, the residual function always evaluates to 0, and thus the initial input always becomes the solution. I cannot understand why this does not work, or why it behaves differently from the first example. Could I have your suggestion on how to fix it?
You need to update Res in place. Currently you do
Res = ones(S+S)
which rebinds the local name Res to a brand-new array and shadows the input Res, so NLsolve never sees the values you write afterwards. Just update Res directly.
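Concretely, a fixed version of f! just drops that line and writes into the buffer NLsolve passes in (a sketch based on the code above):
function f!(Res, Unknown)
    rvector = T .* Q.^(1-β) .* B.^ε
    svector = Unknown[1:S] .* w.^ε
    Φ_all = sum(rvector * svector' .* d)
    π = rvector * svector' .* d ./ Φ_all
    # fill the residual buffer in place; no `Res = ones(S+S)` rebinding
    Res[1:S] = πR - (sum(π, dims=1))'
    Res[S+1:S+S] = πM - sum(π, dims=2)
end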

Can't get performant Julia Turing model

I've tried to reproduce the model from a PyMC3 and Stan comparison. But it seems to run slowly, and when I look at @code_warntype there are some things -- K and N, I think -- which the compiler seemingly infers as Any.
I've tried adding types -- though I can't add types to turing_model's arguments, and things are complicated within turing_model because it uses autodiff variables and not the usual types. I put all the code into the function do_it to avoid globals, because they say that globals can slow things down. (It actually seems slower, though.)
Any suggestions as to what's causing the problem? The turing_model code is what's iterating, so that should make the most difference.
using Turing, StatsPlots, Random
using Statistics # for median

sigmoid(x) = 1.0 / (1.0 + exp(-x))

function scale(w0::Float64, w1::Array{Float64,1})
    scale = √(w0^2 + sum(w1 .^ 2))
    return w0 / scale, w1 ./ scale
end

function do_it(iterations::Int64)::Chains
    K = 10                      # predictor dimension
    N = 1000                    # number of data samples
    X = rand(N, K)              # predictors (1000, 10)
    w1 = rand(K)                # weights (10,)
    w0 = -median(X * w1)        # 50% of elements for each class (number)
    w0, w1 = scale(w0, w1)      # unit length (euclidean)
    w_true = [w0, w1...]
    y = (w0 .+ (X * w1)) .> 0.0 # labels
    y = [Float64(x) for x in y]
    σ = 5.0
    σm = [x == y ? σ : 0.0 for x in 1:K, y in 1:K]

    @model turing_model(X, y, σ, σm) = begin
        w0_pred ~ Normal(0.0, σ)
        w1_pred ~ MvNormal(σm)
        p = sigmoid.(w0_pred .+ (X * w1_pred))
        @inbounds for n in 1:length(y)
            y[n] ~ Bernoulli(p[n])
        end
    end

    @time chain = sample(turing_model(X, y, σ, σm), NUTS(iterations, 200, 0.65));
    # ϵ = 0.5
    # τ = 10
    # @time chain = sample(turing_model(X, y, σ), HMC(iterations, ϵ, τ));
    return (w_true=w_true, chains=chain::Chains)
end
chain = do_it(1000)
