I am trying to learn how to use the Mamba package in Julia for doing Bayesian inference. Though the package is great, as a beginner I find the documentation a bit sparse. Hence I am trying to figure out how to implement some very simple examples.
What I have tried
I implemented an example for doing Bayesian inference for the mean of a univariate normal distribution. The code follows:
using Mamba
## Model Specification
model = Model(
    x = Stochastic(1,
        mu -> Normal(mu, 2.0),
        false
    ),
    mu = Stochastic(
        () -> Normal(0.0, 1000.0),
        true
    )
)
## Data
data = Dict{Symbol, Any}(
    :x => randn(30)*2+13
)
## Initial Values
inits = [
    Dict{Symbol, Any}(
        :x => data[:x],
        :mu => randn()*1
    )
]
## Sampling Scheme Assignment
scheme1 = NUTS([:mu])
setsamplers!(model, [scheme1])
sim1 = mcmc(model, data, inits, 10000, burnin=250, thin=2, chains=1);
describe(sim1)
This seems to work absolutely fine (though perhaps there are better ways of coding it?).
What I am trying to do that doesn't work
In this example, I am trying to do Bayesian inference for the mean of a bivariate normal distribution. The code follows:
using Mamba
## Model Specification
model = Model(
    x = Stochastic(1,
        mu -> MvNormal(mu, eye(2)),
        false
    ),
    mu = Stochastic(1,
        () -> MvNormal(zeros(2), 1000.0),
        true
    )
)
## Data
data = Dict{Symbol, Any}(
    :x => randn(2,30)+13
)
## Initial Values
inits = [
    Dict{Symbol, Any}(
        :x => data[:x],
        :mu => randn(2)*1
    )
]
## Sampling Scheme Assignment
scheme1 = NUTS([:mu])
setsamplers!(model, [scheme1])
sim1 = mcmc(model, data, inits, 10000, burnin=250, thin=2, chains=1);
describe(sim1)
As you may notice, the changes that I suppose are necessary are minimal. However, I am doing something wrong somewhere, and when I attempt to run this I get an error (a conversion-between-types error) that does not help me further.
Any help appreciated. If this works out, I will consider contributing this simple example to the Mamba documentation for other new users. Thanks.
Addendum: the error message
ERROR: MethodError: Cannot `convert` an object of type Array{Float64,2} to an object of type Array{Float64,1}
This may have arisen from a call to the constructor Array{Float64,1}(...),
since type constructors fall back to convert methods.
in setinits!(::Mamba.ArrayStochastic{1}, ::Mamba.Model, ::Array{Float64,2}) at /lhome/lgiannins/.julia/v0.5/Mamba/src/model/dependent.jl:164
in setinits!(::Mamba.Model, ::Dict{Symbol,Any}) at /lhome/lgiannins/.julia/v0.5/Mamba/src/model/initialization.jl:11
in setinits!(::Mamba.Model, ::Array{Dict{Symbol,Any},1}) at /lhome/lgiannins/.julia/v0.5/Mamba/src/model/initialization.jl:24
in #mcmc#29(::Int64, ::Int64, ::Int64, ::Bool, ::Function, ::Mamba.Model, ::Dict{Symbol,Any}, ::Array{Dict{Symbol,Any},1}, ::Int64) at /lhome/lgiannins/.julia/v0.5/Mamba/src/model/mcmc.jl:29
in (::Mamba.#kw##mcmc)(::Array{Any,1}, ::Mamba.#mcmc, ::Mamba.Model, ::Dict{Symbol,Any}, ::Array{Dict{Symbol,Any},1}, ::Int64) at ./<missing>:0
As I posted on the Mamba issue you opened:
The issue is that
data[:x]
2x30 Array{Float64,2}:
is a matrix of dimension 2 x 30. The way you coded up the stochastic node for x is
x = Stochastic(1,
mu -> MvNormal(mu, eye(2)),
false
),
which specifies that x is a vector (a multidimensional array of dimension 1); that's what the 1 right after Stochastic denotes. It helps to write out the model in math notation, because MvNormal defines a distribution on a vector, not a matrix. Perhaps your model is something like X_1, ..., X_n iid MvNormal(mu, I), in which case you can try something like
using Mamba
## Model Specification
model = Model(
    x = Stochastic(2,
        (mu, N, P) ->
            UnivariateDistribution[
                begin
                    Normal(mu[i], 1)
                end
                for i in 1:P, j in 1:N
            ],
        false
    ),
    mu = Stochastic(1,
        () -> MvNormal(zeros(2), 1000.0),
        true
    )
)
## Data
data = Dict{Symbol, Any}(
    :x => randn(2,30)+13,
    :P => 2,
    :N => 30
)
## Initial Values
inits = [
    Dict{Symbol, Any}(
        :x => data[:x],
        :mu => randn(2)*1
    )
]
## Sampling Scheme Assignment
scheme1 = NUTS([:mu])
setsamplers!(model, [scheme1])
sim1 = mcmc(model, data, inits, 10000, burnin=250, thin=2, chains=1);
describe(sim1)
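Not part of the original answer, but as a quick sanity check you could compare the posterior summaries that describe(sim1) prints for mu[1] and mu[2] against the per-row sample means of the simulated data (both should land close to 13 here):
## Optional sanity check (my own sketch, not from the original answer)
println(mean(data[:x], 2))   # 2x1 array of row means, old mean(A, dim) syntax matching the Julia 0.5 setup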
I wrote a minimal example to show what I'm trying to do. Basically I want to solve an optimization problem with multiple variables. When I try to do this in JuMP, I run into issues with my function obj not being able to take a ForwardDiff object.
I looked at "Restricting function signatures while using ForwardDiff in Julia", and it seemed to be related to the function signature. I did this in my obj function, and for good measure in my sub-function as well, but I still get the error
LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#110#112"{typeof(my_fun)},Float64},Float64,2})
Closest candidates are:
Float64(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
Float64(::T) where T<:Number at boot.jl:715
Float64(::Int8) at float.jl:60
This still does not work. I feel like I have the bulk of the code correct; there is just some weird type issue going on that I have to clear up so that autodifferentiation works...
Any suggestions?
using JuMP
using Ipopt
using LinearAlgebra
function obj(x::Array{<:Real,1})
    println(x)
    x1 = x[1]
    x2 = x[2]
    eye = Matrix{Float64}(I, 4, 4)
    obj_val = tr(eye - kron(mat_fun(x1), mat_fun(x2)))
    println(obj_val)
    return obj_val
end

function mat_fun(var::T) where {T<:Real}
    eye = Matrix{Float64}(I, 2, 2)
    eye[2,2] = var
    return eye
end

m = Model(Ipopt.Optimizer)

my_fun(x...) = obj(collect(x))

@variable(m, 0 <= x[1:2] <= 2.0*pi)
register(m, :my_fun, 2, my_fun; autodiff = true)
@NLobjective(m, Min, my_fun(x...))

optimize!(m)
# retrieve the objective value, corresponding x values and the status
println(JuMP.value.(x))
println(JuMP.objective_value(m))
println(JuMP.termination_status(m))
Use instead
function obj(x::Vector{T}) where {T}
    println(x)
    x1 = x[1]
    x2 = x[2]
    eye = Matrix{T}(I, 4, 4)
    obj_val = tr(eye - kron(mat_fun(x1), mat_fun(x2)))
    println(obj_val)
    return obj_val
end

function mat_fun(var::T) where {T}
    eye = Matrix{T}(I, 2, 2)
    eye[2,2] = var
    return eye
end
Essentially, anywhere you see Float64, replace it by the type in the incoming argument.
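As a quick way to convince yourself that the generic version now plays well with dual numbers, here is a small check of my own (an assumption on my side: it requires the ForwardDiff package, which is not loaded in the original code):
using ForwardDiff

# mat_fun(v)[2,2] is just v, so its derivative with respect to v should be 1.0.
# With the Float64-typed version from the question, this call would throw a
# MethodError similar to the one reported above.
ForwardDiff.derivative(v -> mat_fun(v)[2, 2], 1.5)   # returns 1.0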
I found the problem:
In my mat_fun the type of the returned matrix had to be Real in order for it to propagate through. Before it was Float64, which was not consistent with the fact that, I guess, everything has to be Real for the autodifferentiation to work. Even though a Float64 is clearly a Real, it looks like the inheritance isn't preserved, i.e. you have to make sure that everything that is returned and passed in is of type Real.
using JuMP
using Ipopt
using LinearAlgebra
function obj(x::AbstractVector{T}) where {T<:Real}
    println(x)
    x1 = x[1]
    x2 = x[2]
    eye = Matrix{Float64}(I, 4, 4)
    obj_val = tr(eye - kron(mat_fun(x1), mat_fun(x2)))
    # println(obj_val)
    return obj_val
end

function mat_fun(var::T) where {T<:Real}
    eye = zeros(Real, (2, 2))
    eye[2,2] = var
    return eye
end

m = Model(Ipopt.Optimizer)

my_fun(x...) = obj(collect(x))

@variable(m, 0 <= x[1:2] <= 2.0*pi)
register(m, :my_fun, 2, my_fun; autodiff = true)
@NLobjective(m, Min, my_fun(x...))

optimize!(m)
# retrieve the objective value, corresponding x values and the status
println(JuMP.value.(x))
println(JuMP.objective_value(m))
println(JuMP.termination_status(m))
I would like to use ModelingToolkit.jl to solve large nonlinear systems of equations. Unfortunately, using symbolic arrays with NonlinearSystem gives a method error:
ERROR: MethodError: no method matching hasmetadata(::Vector{Num}, ::Type{Symbolics.VariableDefaultValue})
Is there a way of solving nonlinear equations with indexed variables with ModelingToolkit?
Code example:
using ModelingToolkit, NonlinearSolve
vars = @variables x
@named works = NonlinearSystem([], vars, [])
vars = @variables x[1:3]
@named fails = NonlinearSystem([], vars, [])
eqs = [x[j] ~ j for j ∈ 1:3]
@named also_fails = NonlinearSystem(eqs, vars, [])
As Chris comments, it is a bug and now an open issue here:
https://github.com/SciML/ModelingToolkit.jl/issues/1406
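Until that is resolved, one workaround that may be worth trying (my own guess, not something confirmed in the issue thread) is to scalarize the array variable before handing it to NonlinearSystem, so that the system receives plain scalar unknowns instead of a single symbolic array:
# Hedged sketch: collect(x) expands the symbolic array x[1:3] into the scalar
# unknowns x[1], x[2], x[3]. Whether this sidesteps the hasmetadata error may
# depend on the ModelingToolkit version.
@variables x[1:3]
eqs = [x[j] ~ j for j ∈ 1:3]
@named maybe_works = NonlinearSystem(eqs, collect(x), [])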
This question has already been asked on another platform, but I haven't got an answer yet.
https://discourse.julialang.org/t/generalizing-the-inputs-of-the-nlsolve-function-in-julia/
After an extensive process using SymPy in Julia, I generated a system of nonlinear equations. My system is stored in an NN x S matrix, something like this (NN = 2, S = 2).
I would like to adapt the system to use the NLsolve package. I put together a bit of a hack for the case NN = 1 and S = 1. The system_equations2 function gives me the nonlinear system, as in the figure.
using SymPy
using Plots
using NLsolve
res = system_equations2()
In order to simulate the output, I do this:
NN = 1
S = 1
p= [Sym("p$i$j") for i in 1:NN,j in 1:S]
res = [ Eq( -331.330122303069*p[i,j]^(1.0) + p[i,j]^(2.81818181818182) - 1895.10478893046/(p[i,j]^(-1.0))^(2.0),0 ) for i in 1:NN,j in 1:S]
resf = convert( Function, lhs( res[1,1] ) )
plot(resf, 0 ,10731)
Now
resf = convert( Function, lhs( res[1,1] ) )
# This is for the argument of the nlsolve function
function resf2(p)
    p = Tuple(p)[1]
    r = resf(p)
    return r
end
Now, I find the zeros
function K(F, p)
    F[1] = resf2(p[1])
end
nlsolve(K , [7500.8])
I would like to generalize this procedure to any NN and any S. I believe there is a simpler way to do this.
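One direction I can imagine (a rough, untested sketch of my own, reusing the NN, S, p and res definitions above) is to lambdify each equation once and then evaluate all of them over a flattened parameter vector inside the residual function passed to nlsolve:
funs = [lambdify(lhs(res[i, j])) for i in 1:NN, j in 1:S]

function K!(F, pvec)
    pm = reshape(pvec, NN, S)              # recover the NN x S layout
    for i in 1:NN, j in 1:S
        F[(j - 1)*NN + i] = funs[i, j](pm[i, j])
    end
end

sol = nlsolve(K!, fill(7500.8, NN*S))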
Why does the code below not work?
xa = [0 0.200000000000000 0.400000000000000 1.00000000000000 1.60000000000000 1.80000000000000 2.00000000000000 2.60000000000000 2.80000000000000 3.00000000000000 3.80000000000000 4.80000000000000 5.00000000000000 5.20000000000000 6.00000000000000 6.20000000000000 7.40000000000000 7.60000000000000 7.80000000000000 8.60000000000000 8.80000000000000 9.00000000000000 9.20000000000000 9.40000000000000 10.0000000000000 10.6000000000000 10.8000000000000 11.2000000000000 11.6000000000000 11.8000000000000 12.2000000000000 12.4000000000000];
ya = [-0.183440428023042 -0.131101157495126 0.0268875670852843 0.300000000120000 0.579048247883555 0.852605831272159 0.935180993484717 1.13328608090532 1.26893326843583 1.10202945535186 1.09201137189664 1.14279083803453 0.811302535321072 0.909735376251797 0.417067545528244 0.460107770989798 -0.516307074859654 -0.333994077331822 -0.504124744955962 -0.945794320817293 -0.915934553082780 -0.975458595671737 -1.09943707404275 -1.11254211607374 -1.29739980589100 -1.23440439602665 -0.953807504156356 -1.12240274852172 -0.609284630192522 -0.592560286759450 -0.402521296049042 -0.510090363150962];
x0 = vec(xa)
y0 = vec(ya)
fun(x,a) = a[1].*sin(a[2].*x - a[3])
a0 = [1,2,3]
eps = 0.000001
maxiter=200
coefs, converged, iter = CurveFit.nonlinear_fit(x0 , fun , a0 , eps, maxiter )
y0b = fit(x0)
Winston.plot(x0, y0, "ob", x0, y0b, "r-", linewidth=3)
Error: LoadError: MethodError: convert has no method matching convert(::Type{Float64}, ::Array{Float64,1})
This may have arisen from a call to the constructor Float64(...), since type constructors fall back to convert methods.
Closest candidates are:
  call{T}(::Type{T}, ::Any)
  convert(::Type{Float64}, !Matched::Int8)
  convert(::Type{Float64}, !Matched::Int16)
while loading In[269], in expression starting on line 8
 in nonlinear_fit at /home/jmarcellopereira/.julia/v0.4/CurveFit/src/nonlinfit.jl:75
The fun function has to return a residual value r of type Float64, calculated at each iteration of the data, as follows:
r = y - fun(x, coefs)
so your function y=a1*sin(x*a2-a3) will be defined as:
fun(x,a) = x[2]-a[1]*sin(a[2]*x[1] - a[3])
Where:
x[2] is a value from the 'y' vector
x[1] is a value from the 'x' vector
a[...] is the set of parameters
The fun function has to return a single Float64, so the operators can't be the 'dot' (elementwise) versions (.*).
When calling the nonlinear_fit function, the first parameter must be an N x 2 array, with the first column containing the N values of x and the second containing the N values of y, so you must concatenate the two vectors x and y into a two-column array:
xy = [x y]
and finally, call the function:
coefs, converged, iter = CurveFit.nonlinear_fit(xy , fun , a0 , eps, maxiter )
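Putting the pieces together, a minimal end-to-end version of the fix described above looks roughly like this (it reuses x0, y0, a0, eps and maxiter from the question):
using CurveFit

# residual per data row: the fit searches for coefficients a with fun(xy, a) ≈ 0
fun(xy, a) = xy[2] - a[1]*sin(a[2]*xy[1] - a[3])

xy = [x0 y0]                     # N x 2 array: first column x, second column y
coefs, converged, iter = CurveFit.nonlinear_fit(xy, fun, a0, eps, maxiter)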
Answering your comment that the returned coefficients are not correct:
The function y = a1*sin(x*a2 - a3) is a harmonic function, so the coefficients returned from the function call depend heavily on the parameter a0 ("initial guess for each fitting parameter") that you pass as the third parameter (with maxiter = 200_000):
a0 = [1.5, 1.5, 1.0]
coefficients: [0.2616335317043578, 1.1471991302529982, 0.7048665905560775]
a0 = [100., 100., 100.]
coefficients: [-0.4077952060368059, 90.52328921205392, 96.75331155303707]
a0 = [1.2, 0.5, 0.5]
coefficients: [1.192007321713507, 0.49426296880933257, 0.19863645732313934]
I think the results you're getting are harmonics, as the graph shows:
Where:
blue line:
f1(xx)=0.2616335317043578*sin(xx*1.1471991302529982-0.7048665905560775)
yellow line:
f2(xx)=1.192007321713507*sin(xx*0.49426296880933257-0.19863645732313934)
pink line:
f3(xx)=-0.4077952060368059*sin(xx*90.52328921205392-96.75331155303707)
blue dots are your initial data.
The graph was generated with Gadfly:
plot(layer(x=x,y=y,Geom.point),layer([f1,f2,f3],0.0, 15.0,Geom.line))
tested with Julia Version 0.4.3
From the docs:
we are trying to fit the relationship fun(x, a) = 0
So, if you want to find the elements of a such that, for each (xi, yi) in [x0 y0], a[1]*sin(a[2]*xi - a[3]) == yi, then the right way is:
fun(xy,a) = a[1].*sin(a[2].*xy[1] - a[3])-xy[2];
xy=hcat(x0,y0);
coefs,converged,iter = CurveFit.nonlinear_fit(xy,fun,a0,eps,maxiter);
I found the LsqFit package a bit simpler to use: just define the model first and then "fit" it to your data:
using DataFrames, Plots, LsqFit
xa = [0 0.200000000000000 0.400000000000000 1.00000000000000 1.60000000000000 1.80000000000000 2.00000000000000 2.60000000000000 2.80000000000000 3.00000000000000 3.80000000000000 4.80000000000000 5.00000000000000 5.20000000000000 6.00000000000000 6.20000000000000 7.40000000000000 7.60000000000000 7.80000000000000 8.60000000000000 8.80000000000000 9.00000000000000 9.20000000000000 9.40000000000000 10.0000000000000 10.6000000000000 10.8000000000000 11.2000000000000 11.6000000000000 11.8000000000000 12.2000000000000 12.4000000000000];
ya = [-0.183440428023042 -0.131101157495126 0.0268875670852843 0.300000000120000 0.579048247883555 0.852605831272159 0.935180993484717 1.13328608090532 1.26893326843583 1.10202945535186 1.09201137189664 1.14279083803453 0.811302535321072 0.909735376251797 0.417067545528244 0.460107770989798 -0.516307074859654 -0.333994077331822 -0.504124744955962 -0.945794320817293 -0.915934553082780 -0.975458595671737 -1.09943707404275 -1.11254211607374 -1.29739980589100 -1.23440439602665 -0.953807504156356 -1.12240274852172 -0.609284630192522 -0.592560286759450 -0.402521296049042 -0.510090363150962];
x0 = vec(xa)
y0 = vec(ya)
xbase = collect(linspace(minimum(x0),maximum(x0),100))
p0 = [1.2, 0.5, 0.5] # initial value of parameters
fun(x0, p) = p[1] .* sin.(p[2] .* x0 .- p[3]) # definition of the model
fit = curve_fit(fun,x0,y0,p0) # actual fitting job
yFit = [fit.param[1] * sin(fit.param[2] * x - fit.param[3]) for x in xbase] # building the fitted values
# Plotting..
scatter(x0, y0, label="obs")
plot!(xbase, yFit, label="fitted")
Note that using LsqFit does not solve the problem of dependence on the initial conditions highlighted by Gomiero.
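As a small illustration of that sensitivity (my own sketch, not part of the original answer), refitting from a distant starting point will typically land on a different harmonic:
fit2 = curve_fit(fun, x0, y0, [100.0, 100.0, 100.0])   # same model and data, far-away initial guess
println(fit.param)    # parameters found from p0 = [1.2, 0.5, 0.5]
println(fit2.param)   # likely a different (spurious) harmonic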
I am trying to perform an ML estimation of a normally distributed variable in a linear regression setting in Julia, using JuMP and the NLopt solver.
There exists a good working example here; however, if I try to estimate the regression parameters (slopes), the code becomes quite tedious to write, in particular as the parameter space grows.
Maybe someone has an idea of how to write it more concisely. Here is my code:
using JuMP, NLopt, Distributions

#type definition to store data
type data
    n::Int
    A::Matrix
    β::Vector
    y::Vector
    ls::Vector
    err::Vector
end

#generate regression data
function Data( n = 1000 )
    A = [ones(n) rand(n, 2)]
    β = [2.1, 12.9, 3.7]
    y = A*β + rand(Normal(), n)
    ls = inv(A'A)A'y
    err = y - A * ls
    data(n, A, β, y, ls, err)
end
#initialize data
d = Data()
println( var(d.y) )
function ml( )
    m = Model( solver = NLoptSolver( algorithm = :LD_LBFGS ) )

    @defVar( m, b[1:3] )
    @defVar( m, σ >= 0, start = 1.0 )

    # This is the working example.
    # As you can see it's quite tedious to write,
    # and becomes rather infeasible if there are more than,
    # let's say, 10 slope parameters to estimate.
    @setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2) -
        sum{ (d.y[i] - d.A[i,1]*b[1] -
                       d.A[i,2]*b[2] -
                       d.A[i,3]*b[3])^2,
             i=1:d.n } / (2σ^2) )

    # Julia returns:
    #   slope: [2.14,12.85,3.65], variance: 1.04
    # which is what is to be expected.

    # However, this is what I would like the code to look like:
    # @setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2) -
    #     sum{ (d.y[i] - (d.A[i,j]*b[j]))^2, i=1:d.n, j=1:3 } / (2σ^2) )

    # I also tried:
    # @setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2) -
    #     sum{ sum{ (d.y[i] - (d.A[i,j]*b[j]))^2, i=1:d.n }, j=1:3 } / (2σ^2) )

    # but unfortunately it returns:
    #   slope: [10.21,18.89,15.88], variance: 54.78

    solve(m)
    println( getValue(b), " ", getValue(σ^2) )
end
ml()
Any ideas?
EDIT
As noted by Reza a working example is:
@setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2) -
    sum{ (d.y[i] - sum{ d.A[i,j]*b[j], j=1:3 })^2,
         i=1:d.n } / (2σ^2) )
The sum{} syntax is a special syntax that only works inside JuMP macros, and is the preferred syntax for sums.
So your example would be written as:
function ml( )
    m = Model( solver = NLoptSolver( algorithm = :LD_LBFGS ) )

    @variable( m, b[1:3] )
    @variable( m, σ >= 0, start = 1.0 )

    @NLobjective(m, Max,
        -(d.n/2)*log(2π*σ^2) -
        sum{ (d.y[i] - sum{ d.A[i,j]*b[j], j=1:3 })^2,
             i=1:d.n } / (2σ^2) )
where I've expanded it across multiple lines to be as clear as possible.
Reza's answer isn't technically wrong, but isn't idiomatic JuMP and won't be as efficient for larger models.
I didn't trace your code, but anyway, I hope the following works for you:
sum([(d.y[i]-sum([d.A[i,j]*b[j] for j=1:3]))^2 for i=1:d.n])
As @IainDunning mentioned, the JuMP package has a special syntax for summation inside its macros, so the more efficient and abstract way to do this is:
sum{ (d.y[i] - sum{ d.A[i,j]*b[j], j=1:3 })^2, i=1:d.n }