I've tried to reproduce the model from a PYMC3 and Stan comparison. But it seems to run slowly and when I look at #code_warntype there are some things -- K and N I think -- which the compiler seemingly calls Any.
I've tried adding types -- though I can't add types to turing_model's arguments and things are complicated within turing_model because it's using autodiff variables and not the usuals. I put all the code into the function do_it to avoid globals, because they say that globals can slow things down. (It actually seems slower, though.)
Any suggestions as to what's causing the problem? The turing_model code is what's iterating, so that should make the most difference.
using Turing, StatsPlots, Random
sigmoid(x) = 1.0 / (1.0 + exp(-x))
function scale(w0::Float64, w1::Array{Float64,1})
scale = √(w0^2 + sum(w1 .^ 2))
return w0 / scale, w1 ./ scale
end
function do_it(iterations::Int64)::Chains
K = 10 # predictor dimension
N = 1000 # number of data samples
X = rand(N, K) # predictors (1000, 10)
w1 = rand(K) # weights (10,)
w0 = -median(X * w1) # 50% of elements for each class (number)
w0, w1 = scale(w0, w1) # unit length (euclidean)
w_true = [w0, w1...]
y = (w0 .+ (X * w1)) .> 0.0 # labels
y = [Float64(x) for x in y]
σ = 5.0
σm = [x == y ? σ : 0.0 for x in 1:K, y in 1:K]
#model turing_model(X, y, σ, σm) = begin
w0_pred ~ Normal(0.0, σ)
w1_pred ~ MvNormal(σm)
p = sigmoid.(w0_pred .+ (X * w1_pred))
#inbounds for n in 1:length(y)
y[n] ~ Bernoulli(p[n])
end
end
#time chain = sample(turing_model(X, y, σ, σm), NUTS(iterations, 200, 0.65));
# ϵ = 0.5
# τ = 10
# #time chain = sample(turing_model(X, y, σ), HMC(iterations, ϵ, τ));
return (w_true=w_true, chains=chain::Chains)
end
chain = do_it(1000)
Related
I have :
a set of N locations which can be workplace or residence
a vector of observed workers L_i, with i in N
a vector of observed residents R_n, with n in N
a matrix of distance observed between all pair residence n and workplace i
a shape parameter epsilon
Setting N=3, epsilon=5, and
d = [1 1.5 3 ; 1.5 1 1.5 ; 3 1.5 1] #distance matrix
L_i = [13 69 18] #vector of workers in each workplace
R_n = [27; 63; 10]
I want to find the vector of wages (size N) that solve this system of N equations,
with l all the workplaces.
Do I need to implement an iterative algorithm on the vectors of workers and wages? Or is it possible to directly solve this system ?
I tried this,
w_i = [1 ; 1 ; 1]
er=1
n =1
while er>1e-3
L_i = ( (w_i ./ d).^ϵ ) ./ sum( ( (w_i ./ d).^ϵ), dims=1) * R
er = maximum(abs.(L .- L_i))
w_i = 0.7.*w_i + 0.3.*w_i.*((L .- L_i) ./ L_i)
n = n+1
end
If L and R are given (i.e., do not depend on w_i), you should set up a non-linear search to get (a vector of) wages from that gravity equation (subject to normalising one w_i, of course).
Here's a minimal example. I hope it helps.
# Call Packages
using Random, NLsolve, LinearAlgebra
# Set seeds
Random.seed!(1704)
# Variables and parameters
N = 10
R = rand(N)
L = rand(N) * 0.5
d = ones(N, N) .+ Symmetric(rand(N, N)) / 10.0
d[diagind(d)] .= 1.0
ε = -3.0
# Define objective function
function obj_fun(x, d, R, L, ε)
# Find shares
S_mat = (x ./ d).^ε
den = sum(S_mat, dims = 1)
s = S_mat ./ den
# Normalize last wage
x[end] = 1.0
# Define loss function
loss = L .- s * R
# Return
return loss
end
# Run optimization
x₀ = ones(N)
res = nlsolve(x -> obj_fun(x, d, R, L, ε), x₀, show_trace = true)
# Equilibrium vector of wages
w = res.zero
I want to make simulation of wave packet in free space in Julia and for this purpose I need to store every plot, plotted using arrays, in a frame and then to show it in the form of a simulation of 15 seconds. I have written code for these plots but dont know how to make simulation in which previous line of plot doesn't show and only shows a wave like simulation.
enter code here
function gauss(x, xo)
exp(-k * (x - xo) ^ 2)
end
y = zeros(100,15)
k =1000
xo = 0.3
for j = 1:2
x = 0.0
for i = 2:99
y[i, j] = gauss(x, xo)
x = x + 0.01
end
xo = xo + 0.1
end
y
function z(i, n)
2 * (1 - (r ^ 2)) * y[i, n] - y[i, n - 1] + (r ^ 2) * (y[i + 1, n] + y[i - 1, n])
end
r = 1
mat= zeros(100,15)
for n = 2:14
for i = 2:99
y[i, n+1] = z(i, n)
end
end
y
using Plots
plot(i=1:100,y[:, 1:15])
I understand that you want to combine plotting with an animation (hopefully this is what you mean). In that case you can use an #animate macro:
using Plots
x = 0:π/100:8π
anim = #animate for v ∈ 0:π/20:2π
plot(x, sin.(x .+ v), legend=false)
end
gif(anim, "myanim.gif", fps = 15)
Ok so I figured out how to plot the credible intervals for a univariate linear model in Turing.jl using the following code (I'm replicating Statistical rethinking by McElreath) This particular exercise is in chapter 4. If anyone has already plotted these types of models with Turing and can give me a guide, it would be great!!!
Univariate model code:
using Turing
using StatsPlots
using Plots
height = df2.height
weight = df2.weight
#model heightmodel(y, x) = begin
#priors
α ~ Normal(178, 100)
σ ~ Uniform(0, 50)
β ~ LogNormal(0, 10)
x_bar = mean(x)
#model
μ = α .+ (x.-x_bar).*β
y ~ MvNormal(μ, σ)
end
chns = sample(heightmodel(height, weight), NUTS(), 100000)
## code 4.43
describe(chns) |> display
# covariance and correlation
alph = get(chns, :α)[1].data
bet = get(chns, :β)[1].data
sigm = get(chns, :σ)[1].data
vecs = (alph[1:352], bet[1:352])
arr = vcat(transpose.(vecs)...)'
ss = [vec(alph + bet.*(x)) for x in 25:1:70]
arrr = vcat(transpose.(ss)...)'
plot([mean(arrr[:,x]) for x in 1:46],25:1:70, ribbon = ([-1*(quantile(arrr[:,x],[0.1,0.9])[1] - mean(arrr[:,x])) for x in 1:46], [quantile(arrr[:,x],[0.1,0.9])[2] - mean(arrr[:,x]) for x in 1:46]))
Credible interval Univariate:
However, when I try to replicate it with a multivatiate function, very strange things are drawn:
Multivariate model code:
weight_s = (df.weight .-mean(df.weight))./std(df.weight)
weight_s² = weight_s.^2
#model heightmodel(height, weight, weight²) = begin
#priors
α ~ Normal(178, 20)
σ ~ Uniform(0, 50)
β1 ~ LogNormal(0, 1)
β2 ~ Normal(0, 1)
#model
μ = α .+ weight.*β1 + weight².*β2
height ~ MvNormal(μ, σ)
end
chns = sample(heightmodel(height, weight_s, weight_s²), NUTS(), 100000)
describe(chns) |> display
### painting the fit
alph = get(chns, :α)[1].data
bet1 = get(chns, :β1)[1].data
bet2 = get(chns, :β2)[1].data
vecs = (alph[1:99000], bet1[1:99000], bet2[1:99000])
arr = vcat(transpose.(vecs)...)'
polinomial = [vec(alph + bet1.*(x) + bet2.*(x.^2)) for x in -2:0.01:2]
arrr = vcat(transpose.(polinomial)...)'
plot([mean(arrr[:,x]) for x in 1:401],-2:0.01:2, ribbon = ([-1*(quantile(arrr[:,x],[0.1,0.9])[1] - mean(arrr[:,x])) for x in 1:46], [quantile(arrr[:,x],[0.1,0.9])[2] - mean(arrr[:,x]) for x in 1:46]))
Credible interval Univariate:
In the Julia Slack channel (https://slackinvite.julialang.org/) Jens was kind enough to give me the answer. The credit goes to him --He doesn't have a SO account.
The main problem was that I was lacking simplicity and was trying to plot the mean the wrong way --in a very inefficient and strange way might I add. Each parameter has a vector which corresponds to 99000 draws from the posterior distribution, I was trying to draw the mean through the matrix but it's much easier to define the median first and then plot it, then you don't make mistakes as I did calculating the mean.
old code
vecs = (alph[1:99000], bet1[1:99000], bet2[1:99000])
arr = vcat(transpose.(vecs)...)'
[mean(arrr[:,x]) for x in 1:401]
can be written as:
testweights = -2:0.01:2
arr = [fheight.(w, res.α, res.β1, res.β2) for w in testweights]
m = [mean(v) for v in arr]
Moreover, the way Jens defined the Credible intervals is much more elegant and Julianic:
Jens' code:
quantiles = [quantile(v, [0.1, 0.9]) for v in arr]
lower = [q[1] - m for (q, m) in zip(quantiles, m)]
upper = [q[2] - m for (q, m) in zip(quantiles, m)]
My code:
ribbon = ([-1*(quantile(arrr[:,x],[0.1,0.9])[1] - mean(arrr[:,x])) for x in 1:46], [quantile(arrr[:,x],[0.1,0.9])[2] - mean(arrr[:,x]) for x in 1:46]))
Complete Solution:
weight_s = (d.weight .-mean(d.weight))./std(d.weight)
height = d.height
#model heightmodel(height, weight) = begin
#priors
α ~ Normal(178, 20)´
σ ~ Uniform(0, 50)
β1 ~ LogNormal(0, 1)
β2 ~ Normal(0, 1)
#model
μ = α .+ weight .* β1 + weight.^2 .* β2
# or μ = fheight.(weight, α, β1, β2) if we are defining fheigth anyways
height ~ MvNormal(μ, σ)
end
chns = sample(heightmodel(height, weight_s), NUTS(), 10000)
describe(chns) |> display
res = DataFrame(chns)
fheight(weight, α, β1, β2) = α + weight * β1 + weight^2 * β2
testweights = -2:0.01:2
arr = [fheight.(w, res.α, res.β1, res.β2) for w in testweights]
m = [mean(v) for v in arr]
quantiles = [quantile(v, [0.1, 0.9]) for v in arr]
lower = [q[1] - m for (q, m) in zip(quantiles, m)]
upper = [q[2] - m for (q, m) in zip(quantiles, m)]
plot(testweights, m, ribbon = [lower, upper])
I have an array Z in Julia which represents an image of a 2D Gaussian function. I.e. Z[i,j] is the height of the Gaussian at pixel i,j. I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (randomly selected points). Then I tried the LsqFit package, but it seems to be tailored for 1D fitting, as it is throwing errors when I try to fit 2D data, and there is no documentation I can find to lead me to a solution.
How can I fit a Gaussian to a 2D array in Julia?
The simplest approach is to use Optim.jl. Here is an example code (it was not optimized for speed, but it should show you how you can handle the problem):
using Distributions, Optim
# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])
function objective(x)
mu, sig, s = decode(x)
try # sig might be infeasible so we have to handle this case
est_d = MvNormal(mu, sig)
ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_m, m))
catch
sum(m)
end
end
# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively you could use constrained optimization e.g. like this:
using Distributions, JuMP, NLopt
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
#variable(m, m1)
#variable(m, m2)
#variable(m, sig11 >= 0.001)
#variable(m, sig12)
#variable(m, sig22 >= 0.001)
#variable(m, sc >= 0.001)
function obj(m1, m2, sig11, sig12, sig22, sc)
est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_Z, Z))
end
JuMP.register(m, :obj, 6, obj, autodiff=true)
#NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
#NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])
In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid distance and offset positions), and Ri and Rj the ranges of the indices. dist is the distance measure you use, eg. squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = #views v[1:N], reshape(v[N+1:end], N, N)
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidifinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently, and optimize over some lower triangular matrix L, such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
loss(μ, L * L')
end
(This is of course prone to further optimization, such as expanding the multiplication directly in to loss). A different way is to specify the condition as constraints into the optimizer.
For the optimzer to work you probably have to get the derivative of loss_packed. Either have to find the manually calculate it (by a good choice of dist), or maybe more easily by using a log transformation (if you're lucky, you find a way to reduce it to a linear problem...). Alternatively you could try to find an optimizer that does automatic differentiation.
I am trying to Implement gradient Descent algorithm from scratch to find the slope and intercept value for my linear fit line.
Using the package and calculating slope and intercept, I get slope = 0.04 and intercept = 7.2 but when I use my gradient descent algorithm for the same problem, I get slope and intercept both values = (-infinity,-infinity)
Here is my code
x= [1,2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20]
y=[2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20,21]
function GradientDescent()
m=0
c=0
for i=1:10000
for k=1:length(x)
Yp = m*x[k] + c
E = y[k]-Yp #error in predicted value
dm = 2*E*(-x[k]) # partial derivation of cost function w.r.t slope(m)
dc = 2*E*(-1) # partial derivate of cost function w.r.t. Intercept(c)
m = m + (dm * 0.001)
c = c + (dc * 0.001)
end
end
return m,c
end
Values = GradientDescent() # after running values = (-inf,-inf)
I have not done the math, but instead wrote the tests. It seems you got a sign error when assigning m and c.
Also, writing the tests really helps, and Julia makes it simple :)
function GradientDescent(x, y)
m=0.0
c=0.0
for i=1:10000
for k=1:length(x)
Yp = m*x[k] + c
E = y[k]-Yp
dm = 2*E*(-x[k])
dc = 2*E*(-1)
m = m - (dm * 0.001)
c = c - (dc * 0.001)
end
end
return m,c
end
using Base.Test
#testset "gradient descent" begin
#testset "slope $slope" for slope in [0, 1, 2]
#testset "intercept for $intercept" for intercept in [0, 1, 2]
x = 1:20
y = broadcast(x -> slope * x + intercept, x)
computed_slope, computed_intercept = GradientDescent(x, y)
#test slope ≈ computed_slope atol=1e-8
#test intercept ≈ computed_intercept atol=1e-8
end
end
end
I can't get your exact numbers, but this is close. Perhaps it helps?
# 141 ?
datax = [1,2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20]
datay = [2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20,21]
function gradientdescent()
m = 0
b = 0
learning_rate = 0.00001
for n in 1:10000
for i in 1:length(datay)
x = datax[i]
y = datay[i]
guess = m * x + b
error = y - guess
dm = 2error * x
dc = 2error
m += dm * learning_rate
b += dc * learning_rate
end
end
return m, b
end
gradientdescent()
(-0.04, 17.35)
It seems that adjusting the learning rate is critical...