Is there an R spline that matches the Schoenberg algorithm?

Is there an R algorithm that fits a smoothing spline S by minimizing
L = ρ ∑_{i=0}^{n-1} w_i (y_i − S(x_i))² + (1 − ρ) ∫_{x_0}^{x_{n-1}} (S''(x))² dx ?
It may be possible with smooth.spline, but I haven't managed to find the right parameters.
(The equation is shown more clearly here: https://www.iro.umontreal.ca/~simardr/ssj/doc/html/umontreal/iro/lecuyer/functionfit/SmoothingCubicSpline.html)

pspline::smooth.Pspline(x = input_x,
                        y = input_y,
                        norder = 2,
                        method = 1,
                        spar = rho)
will do the job.

Related

Renewal Function for Weibull Distribution

The renewal function m(t) for the Weibull distribution, with t = 10, is given below. I want to find the value of m(t). I wrote the following R code to compute m(t):
last_term = NULL
gamma_k = NULL
n = 50
for (k in 1:n) {
  gamma_k[k] = gamma(2*k + 1)/factorial(k)
}
for (j in 1:(n-1)) {
  prev = gamma_k[n-j]
  last_term[j] = gamma(2*j + 1)/factorial(j)*prev
}
final_term = NULL
find_value = function(n) {
  for (i in 2:n) {
    final_term[i] = gamma_k[i] - sum(last_term[1:(i-1)])
  }
  return(final_term)
}
all_k = find_value(n)
af_sum = NULL
m_t = function(t) {
  for (k in 1:n) {
    af_sum[k] = (-1)^(k-1) * all_k[k] * t^(2*k)/gamma(2*k + 1)
  }
  return(sum(na.omit(af_sum)))
}
m_t(20)
The output is m(t) = 2.670408e+93. Is my iterative procedure correct? Thanks.
I don't think this will work. First, let's move Γ(2k+1) from the denominator of m(t) into A_k. Then A_k behaves roughly as 1/k!.
The numerator of each term of m(t) contains t^(2k), so for t = 10 you are, roughly speaking, summing terms of the form
100^k / k!
By Stirling's formula, k! grows roughly like k^k, making the terms roughly
(100/k)^k
So yes, they eventually start to decrease and the series converges to something, but only after about the 100th term.
Anyway, here is the code; you could try to improve it, but it breaks down around k ≈ 70.
N <- 20
A <- rep(0, N)
# compute A_k/gamma(2k+1) terms
ps <- 0.0 # previous sum
A[1] = 1.0
for (k in 2:N) {
  ps <- ps + A[k-1]*gamma(2*(k-1) + 1)/factorial(k-1)
  A[k] <- 1.0/factorial(k) - ps/gamma(2*k+1)
}
print(A)
t <- 10.0
t2 <- t*t
r <- 0.0
for (k in 1:N) {
  r <- r + (-t2)^k*A[k]
}
print(-r)
UPDATE
OK, I calculated A_k as in your question and got the same answer. I want to estimate the terms A_k/Γ(2k+1) of m(t); I believe they are pretty much dominated by the 1/k! factor. To check that, I made another array, k!·A_k/Γ(2k+1), which should be close to one.
Code
N <- 20
A <- rep(0.0, N)
psum <- function(pA, k) {
  ps <- 0.0
  if (k >= 2) {
    jmax <- k - 1
    for (j in 1:jmax) {
      ps <- ps + (gamma(2*j+1)/factorial(j))*pA[k-j]
    }
  }
  ps
}
# compute A_k/gamma(2k+1) terms
A[1] = gamma(3)
for (k in 2:N) {
  A[k] <- gamma(2*k+1)/factorial(k) - psum(A, k)
}
print(A)
B <- rep(0.0, N)
for (k in 1:N) {
  B[k] <- (A[k]/gamma(2*k+1))*factorial(k)
}
print(B)
This shows that I got the same A_k values as you did, and that B_k is indeed very close to 1.
It means the term A_k/Γ(2k+1) can be replaced by 1/k! to get a quick estimate of what we would get with that replacement:
m(t) ≈ -∑_{k=1}^{∞} (-1)^k (t²)^k / k! = 1 - ∑_{k=0}^{∞} (-t²)^k / k!
This is actually a well-known sum: it equals the exponential with a negative argument (you just have to add the k = 0 term), so
m(t) ≈ 1 - exp(-t²)
Conclusions
The approximate value is positive, and it will probably stay positive in the end, since A_k/Γ(2k+1) differs only a bit from 1/k!.
We're talking about 1 - exp(-100), which is 1 - 3.72·10^-44! And we're trying to compute it precisely by summing and subtracting values on the order of 10^100 or even higher. Even with MPFR I don't think this is possible.
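To see the cancellation problem concretely, here is a small Julia sketch (my own illustration, not part of the original answer) that sums the series with the 1/k! replacement in plain Float64 and compares it with the closed form 1 - exp(-t²):
function naive_sum(t; kmax = 400)
    term = 1.0      # holds t^(2k)/k!, built incrementally to avoid overflowing factorial()
    s = 0.0
    for k in 1:kmax
        term *= t^2 / k
        s += (-1)^(k - 1) * term
    end
    return s
end
t = 10.0
println("naive series: ", naive_sum(t))    # dominated by rounding error of terms near 1e42
println("closed form : ", 1 - exp(-t^2))   # ≈ 1 - 3.72e-44, i.e. essentially 1
The naive sum comes out nowhere near the true value, which is exactly the point made above.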
Another approach is needed
OK, so I ended up going down a pretty different road on this. I have implemented a simple discretization of the integral equation which defines the renewal function:
m(t) = F(t) + integrate (m(t - s)*f(s), s, 0, t)
The integral is approximated with the rectangle rule. Approximating the integral for different values of t gives a system of linear equations. I wrote a function to generate the equations and extract a matrix of coefficients from it. After looking at some examples, I guessed a rule to define the coefficients directly and used that to generate solutions for some examples. In particular I tried shape = 2, t = 10, as in OP's example, with step = 0.1 (so 101 equations).
I found that the result agrees pretty well with an approximate result which I found in a paper (Baxter et al., cited in the code). Since the renewal function is the expected number of events, for large t it is approximately equal to t/mu where mu is the mean time between events; this is a handy way to know if we're anywhere in the neighborhood.
I was working with Maxima (http://maxima.sourceforge.net), which is not efficient for numerical work, but which makes it very easy to experiment with different aspects of the problem. At this point it would be straightforward to port the final, numerical part to another language such as Python; a Julia sketch of the same discretization is given after the Maxima listing below.
Thanks to OP for suggesting the problem, and S. Pappadeux for insightful discussions. Here is the plot I got comparing the discretized approximation (red) with the approximation for large t (blue). Trying some examples with different step sizes, I saw that the values tend to increase a little as step size gets smaller, so I think the red line is probably a little low, and the blue line might be more nearly correct.
Here is my Maxima code:
/* discretize weibull renewal function and formulate system of linear equations
* copyright 2020 by Robert Dodier
* I release this work under terms of the GNU General Public License
*
* This is a program for Maxima, a computer algebra system.
* http://maxima.sourceforge.net/
*/
"Definition of the renewal function m(t):" $
renewal_eq: m(t) = F(t) + 'integrate (m(t - s)*f(s), s, 0, t);
"Approximate integral equation with rectangle rule:" $
discretize_renewal (delta_t, k) :=
if equal(k, 0)
then m(0) = F(0)
else m(k*delta_t) = F(k*delta_t)
+ m(k*delta_t)*f(0)*(delta_t / 2)
+ sum (m((k - j)*delta_t)*f(j*delta_t)*delta_t, j, 1, k - 1)
+ m(0)*f(k*delta_t)*(delta_t / 2);
make_eqs (n, delta_t) :=
makelist (discretize_renewal (delta_t, k), k, 0, n);
make_vars (n, delta_t) :=
makelist (m(k*delta_t), k, 0, n);
"Discretized integral equation and variables for n = 4, delta_t = 1/2:" $
make_eqs (4, 1/2);
make_vars (4, 1/2);
make_eqs_vars (n, delta_t) :=
[make_eqs (n, delta_t), make_vars (n, delta_t)];
load (distrib);
subst_pdf_cdf (shape, scale, e) :=
subst ([f = lambda ([x], pdf_weibull (x, shape, scale)), F = lambda ([x], cdf_weibull (x, shape, scale))], e);
matrix_from (eqs, vars) :=
(augcoefmatrix (eqs, vars),
[submatrix (%%, length(%%) + 1), - col (%%, length(%%) + 1)]);
"Subsitute Weibull pdf and cdf for shape = 2 into discretized equation:" $
apply (matrix_from, make_eqs_vars (4, 1/2));
subst_pdf_cdf (2, 1, %);
"Just the right-hand side matrix:" $
rhs_matrix_from (eqs, vars) :=
(map (rhs, eqs),
augcoefmatrix (%%, vars),
[submatrix (%%, length(%%) + 1), col (%%, length(%%) + 1)]);
"Generate the right-hand side matrix, instead of extracting it from equations:" $
generate_rhs_matrix (n, delta_t) :=
[delta_t * genmatrix (lambda ([i, j], if i = 1 and j = 1 then 0
elseif j > i then 0
elseif j = i then f(0)/2
elseif j = 1 then f(delta_t*(i - 1))/2
else f(delta_t*(i - j))), n + 1, n + 1),
transpose (makelist (F(k*delta_t), k, 0, n))];
"Generate numerical right-hand side matrix, skipping over formulas:" $
generate_rhs_matrix_numerical (shape, scale, n, delta_t) :=
block ([f, F, numer: true], local (f, F),
f: lambda ([x], pdf_weibull (x, shape, scale)),
F: lambda ([x], cdf_weibull (x, shape, scale)),
[genmatrix (lambda ([i, j], delta_t * if i = 1 and j = 1 then 0
elseif j > i then 0
elseif j = i then f(0)/2
elseif j = 1 then f(delta_t*(i - 1))/2
else f(delta_t*(i - j))), n + 1, n + 1),
transpose (makelist (F(k*delta_t), k, 0, n))]);
"Solve approximate integral equation (shape = 3, t = 1) via LU decomposition:" $
fpprintprec: 4 $
n: 20 $
t: 1;
[AA, bb]: generate_rhs_matrix_numerical (3, 1, n, t/n);
xx_by_lu: linsolve_by_lu (ident(n + 1) - AA, bb, floatfield);
"Iterative solution of approximate integral equation (shape = 3, t = 1):" $
xx: bb;
for i thru 10 do xx: AA . xx + bb;
xx - (AA.xx + bb);
xx_iterative: xx;
"Should find iterative and LU give same result:" $
xx_diff: xx_iterative - xx_by_lu[1];
sqrt (transpose(xx_diff) . xx_diff);
"Try shape = 2, t = 10:" $
n: 100 $
t: 10 $
[AA, bb]: generate_rhs_matrix_numerical (2, 1, n, t/n);
xx_by_lu: linsolve_by_lu (ident(n + 1) - AA, bb, floatfield);
"Baxter, et al., Eq. 3 (for large values of t) compared to discretization:" $
/* L.A. Baxter, E.M. Scheuer, D.J. McConalogue, W.R. Blischke.
* "On the Tabulation of the Renewal Function,"
 * Technometrics, vol. 24, no. 2 (May 1982).
* H(t) is their notation for the renewal function.
*/
H(t) := t/mu + sigma^2/(2*mu^2) - 1/2;
tx_points: makelist ([float (k/n*t), xx_by_lu[1][k, 1]], k, 1, n);
plot2d ([H(u), [discrete, tx_points]], [u, 0, t]), mu = mean_weibull(2, 1), sigma = std_weibull(2, 1);
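As suggested above, the numerical part ports easily to another language. Here is a rough Julia sketch (my own translation of the Maxima code, not the answer author's code) that builds the same (I - A) x = b system from the rectangle-rule discretization and solves it directly:
using Distributions, LinearAlgebra

function renewal_weibull(shape, scale, t, n)
    d = Weibull(shape, scale)
    Δ = t / n
    f(x) = pdf(d, x)
    F(x) = cdf(d, x)
    A = zeros(n + 1, n + 1)
    for i in 1:n+1, j in 1:n+1
        if i == 1 && j == 1
            A[i, j] = 0.0
        elseif j > i
            A[i, j] = 0.0
        elseif j == i
            A[i, j] = Δ * f(0.0) / 2
        elseif j == 1
            A[i, j] = Δ * f(Δ * (i - 1)) / 2
        else
            A[i, j] = Δ * f(Δ * (i - j))
        end
    end
    b = [F(k * Δ) for k in 0:n]
    return (I - A) \ b              # m(0), m(Δ), ..., m(t)
end

m = renewal_weibull(2.0, 1.0, 10.0, 100)
println(m[end])   # for shape = 2, t = 10 this should land near the Baxter et al. large-t value (roughly 10.9)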

Plotting credible intervals in Julia from Turing model

OK, so I figured out how to plot the credible intervals for a univariate linear model in Turing.jl using the following code (I'm replicating Statistical Rethinking by McElreath; this particular exercise is in chapter 4). If anyone has already plotted these types of models with Turing and can give me a guide, that would be great!
Univariate model code:
using Turing
using StatsPlots
using Plots

height = df2.height
weight = df2.weight

@model heightmodel(y, x) = begin
    # priors
    α ~ Normal(178, 100)
    σ ~ Uniform(0, 50)
    β ~ LogNormal(0, 10)
    x_bar = mean(x)
    # model
    μ = α .+ (x .- x_bar) .* β
    y ~ MvNormal(μ, σ)
end

chns = sample(heightmodel(height, weight), NUTS(), 100000)

## code 4.43
describe(chns) |> display

# covariance and correlation
alph = get(chns, :α)[1].data
bet = get(chns, :β)[1].data
sigm = get(chns, :σ)[1].data
vecs = (alph[1:352], bet[1:352])
arr = vcat(transpose.(vecs)...)'
ss = [vec(alph + bet .* (x)) for x in 25:1:70]
arrr = vcat(transpose.(ss)...)'
plot([mean(arrr[:, x]) for x in 1:46], 25:1:70,
     ribbon = ([-1 * (quantile(arrr[:, x], [0.1, 0.9])[1] - mean(arrr[:, x])) for x in 1:46],
               [quantile(arrr[:, x], [0.1, 0.9])[2] - mean(arrr[:, x]) for x in 1:46]))
Credible interval Univariate:
However, when I try to replicate it with a multivariate function, very strange things are drawn:
Multivariate model code:
weight_s = (df.weight .- mean(df.weight)) ./ std(df.weight)
weight_s² = weight_s .^ 2

@model heightmodel(height, weight, weight²) = begin
    # priors
    α ~ Normal(178, 20)
    σ ~ Uniform(0, 50)
    β1 ~ LogNormal(0, 1)
    β2 ~ Normal(0, 1)
    # model
    μ = α .+ weight .* β1 + weight² .* β2
    height ~ MvNormal(μ, σ)
end

chns = sample(heightmodel(height, weight_s, weight_s²), NUTS(), 100000)
describe(chns) |> display

### painting the fit
alph = get(chns, :α)[1].data
bet1 = get(chns, :β1)[1].data
bet2 = get(chns, :β2)[1].data
vecs = (alph[1:99000], bet1[1:99000], bet2[1:99000])
arr = vcat(transpose.(vecs)...)'
polinomial = [vec(alph + bet1 .* (x) + bet2 .* (x .^ 2)) for x in -2:0.01:2]
arrr = vcat(transpose.(polinomial)...)'
plot([mean(arrr[:, x]) for x in 1:401], -2:0.01:2,
     ribbon = ([-1 * (quantile(arrr[:, x], [0.1, 0.9])[1] - mean(arrr[:, x])) for x in 1:46],
               [quantile(arrr[:, x], [0.1, 0.9])[2] - mean(arrr[:, x]) for x in 1:46]))
Credible interval, multivariate:
In the Julia Slack channel (https://slackinvite.julialang.org/), Jens was kind enough to give me the answer. The credit goes to him; he doesn't have an SO account.
The main problem was that I was overcomplicating things and plotting the mean the wrong way, in a very inefficient and strange way, I might add. Each parameter comes as a vector of 99000 draws from the posterior distribution; I was trying to push the mean through a hand-built matrix, but it is much easier to define the fitted function first and evaluate it at each test point, then you don't make the mistakes I made when calculating the mean.
old code
vecs = (alph[1:99000], bet1[1:99000], bet2[1:99000])
arr = vcat(transpose.(vecs)...)'
[mean(arrr[:,x]) for x in 1:401]
can be written as:
testweights = -2:0.01:2
arr = [fheight.(w, res.α, res.β1, res.β2) for w in testweights]
m = [mean(v) for v in arr]
Moreover, the way Jens defined the credible intervals is much more elegant and more idiomatic Julia:
Jens' code:
quantiles = [quantile(v, [0.1, 0.9]) for v in arr]
lower = [q[1] - m for (q, m) in zip(quantiles, m)]
upper = [q[2] - m for (q, m) in zip(quantiles, m)]
My code:
ribbon = ([-1*(quantile(arrr[:,x],[0.1,0.9])[1] - mean(arrr[:,x])) for x in 1:46], [quantile(arrr[:,x],[0.1,0.9])[2] - mean(arrr[:,x]) for x in 1:46]))
Complete Solution:
weight_s = (d.weight .- mean(d.weight)) ./ std(d.weight)
height = d.height

@model heightmodel(height, weight) = begin
    # priors
    α ~ Normal(178, 20)
    σ ~ Uniform(0, 50)
    β1 ~ LogNormal(0, 1)
    β2 ~ Normal(0, 1)
    # model
    μ = α .+ weight .* β1 + weight .^ 2 .* β2
    # or μ = fheight.(weight, α, β1, β2) if we are defining fheight anyway
    height ~ MvNormal(μ, σ)
end

chns = sample(heightmodel(height, weight_s), NUTS(), 10000)
describe(chns) |> display
res = DataFrame(chns)

fheight(weight, α, β1, β2) = α + weight * β1 + weight^2 * β2
testweights = -2:0.01:2
arr = [fheight.(w, res.α, res.β1, res.β2) for w in testweights]
m = [mean(v) for v in arr]
quantiles = [quantile(v, [0.1, 0.9]) for v in arr]
lower = [q[1] - m for (q, m) in zip(quantiles, m)]
upper = [q[2] - m for (q, m) in zip(quantiles, m)]
plot(testweights, m, ribbon = [lower, upper])

Can't get performant Julia Turing model

I've tried to reproduce the model from a PyMC3 and Stan comparison, but it seems to run slowly, and when I look at @code_warntype there are some things -- K and N, I think -- which the compiler seemingly infers as Any.
I've tried adding types -- though I can't add types to turing_model's arguments, and things are complicated within turing_model because it's using autodiff variables rather than the usual types. I put all the code into the function do_it to avoid globals, because they say globals can slow things down. (It actually seems slower, though.)
Any suggestions as to what's causing the problem? The turing_model code is what's iterating, so that should make the most difference.
using Turing, StatsPlots, Random

sigmoid(x) = 1.0 / (1.0 + exp(-x))

function scale(w0::Float64, w1::Array{Float64,1})
    scale = √(w0^2 + sum(w1 .^ 2))
    return w0 / scale, w1 ./ scale
end

function do_it(iterations::Int64)::Chains
    K = 10                       # predictor dimension
    N = 1000                     # number of data samples
    X = rand(N, K)               # predictors (1000, 10)
    w1 = rand(K)                 # weights (10,)
    w0 = -median(X * w1)         # 50% of elements for each class (number)
    w0, w1 = scale(w0, w1)       # unit length (euclidean)
    w_true = [w0, w1...]
    y = (w0 .+ (X * w1)) .> 0.0  # labels
    y = [Float64(x) for x in y]
    σ = 5.0
    σm = [x == y ? σ : 0.0 for x in 1:K, y in 1:K]

    @model turing_model(X, y, σ, σm) = begin
        w0_pred ~ Normal(0.0, σ)
        w1_pred ~ MvNormal(σm)
        p = sigmoid.(w0_pred .+ (X * w1_pred))
        @inbounds for n in 1:length(y)
            y[n] ~ Bernoulli(p[n])
        end
    end

    @time chain = sample(turing_model(X, y, σ, σm), NUTS(iterations, 200, 0.65));
    # ϵ = 0.5
    # τ = 10
    # @time chain = sample(turing_model(X, y, σ), HMC(iterations, ϵ, τ));
    return (w_true=w_true, chains=chain::Chains)
end

chain = do_it(1000)
chain = do_it(1000)

2D curve fitting in Julia

I have an array Z in Julia which represents an image of a 2D Gaussian function. I.e. Z[i,j] is the height of the Gaussian at pixel i,j. I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (randomly selected points). Then I tried the LsqFit package, but it seems to be tailored for 1D fitting, as it is throwing errors when I try to fit 2D data, and there is no documentation I can find to lead me to a solution.
How can I fit a Gaussian to a 2D array in Julia?
The simplest approach is to use Optim.jl. Here is example code (it is not optimized for speed, but it should show you how to handle the problem):
using Distributions, Optim

# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]

decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])

function objective(x)
    mu, sig, s = decode(x)
    try # sig might be infeasible so we have to handle this case
        est_d = MvNormal(mu, sig)
        ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
        sum((a - b)^2 for (a, b) in zip(ref_m, m))
    catch
        sum(m)
    end
end

# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively you could use constrained optimization e.g. like this:
using Distributions, JuMP, NLopt

true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]

m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
@variable(m, m1)
@variable(m, m2)
@variable(m, sig11 >= 0.001)
@variable(m, sig12)
@variable(m, sig22 >= 0.001)
@variable(m, sc >= 0.001)

function obj(m1, m2, sig11, sig12, sig22, sc)
    est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
    ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
    sum((a - b)^2 for (a, b) in zip(ref_Z, Z))
end

JuMP.register(m, :obj, 6, obj, autodiff=true)
@NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
@NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])
In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid spacing and offset), and Ri and Rj are the ranges of the indices. dist is the distance measure you use, e.g. the squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = @views v[1:N], reshape(v[N+1:end], N, N)
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidefinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently and optimize over a lower triangular matrix L such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
    loss(μ, L * L')
end
(This is of course open to further optimization, such as expanding the multiplication directly into loss.) A different way is to specify the condition as constraints in the optimizer.
For the optimizer to work you probably have to supply the derivative of loss_packed: either calculate it manually (which may be feasible for a good choice of dist), or perhaps more easily by using a log transformation (if you're lucky, you'll find a way to reduce it to a linear problem...). Alternatively, you could use an optimizer that does automatic differentiation.
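To make that concrete, here is a self-contained sketch (my own assembly of the fragments above, not the answer author's code) that fits the synthetic Z from the first answer with Optim.jl and the L * L' parametrization; since Nelder-Mead is derivative-free, it also sidesteps the derivative question:
using Distributions, LinearAlgebra, Optim

xr = -3:0.1:3
yr = -3:0.1:3
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
Z = [5.0 * pdf(true_d, [x, y]) for x in xr, y in yr]

# v = [μ1, μ2, L11, L21, L22, scale]; Σ = L*L' is positive semidefinite by construction
unpack(v) = (v[1:2], LowerTriangular([v[3] 0.0; v[4] v[5]]), v[6])

function loss_packed(v)
    μ, L, s = unpack(v)
    Σ = Matrix(L * L') + 1e-9 * I          # tiny jitter keeps the covariance invertible
    d = MvNormal(μ, Σ)
    sum((s * pdf(d, [x, y]) - Z[i, j])^2
        for (i, x) in enumerate(xr), (j, y) in enumerate(yr))
end

res = optimize(loss_packed, [0.0, 0.0, 1.0, 0.0, 1.0, 1.0], NelderMead())
μ̂, L̂, ŝ = unpack(Optim.minimizer(res))
println("μ ≈ ", μ̂, ", Σ ≈ ", L̂ * L̂', ", scale ≈ ", ŝ)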

Automatic Differentiation in Julia: Hessian from ReverseDiffSparse

How can I evaluate the Hessian of a function in Julia using automatic differentiation (preferably using ReverseDiffSparse)? In the following example, I can compute and evaluate the gradient at a given point, values, through JuMP:
m = Model()
@variable(m, x)
@variable(m, y)
@NLobjective(m, Min, sin(x) + sin(y))
values = zeros(2)
values[linearindex(x)] = 2.0
values[linearindex(y)] = 3.0
d = JuMP.NLPEvaluator(m)
MathProgBase.initialize(d, [:Grad])
objval = MathProgBase.eval_f(d, values) # == sin(2.0) + sin(3.0)
∇f = zeros(2)
MathProgBase.eval_grad_f(d, ∇f, values)
# ∇f[linearindex(x)] == cos(2.0)
# ∇f[linearindex(y)] == cos(3.0)
Now I want the Hessian at values. I've tried changing the code to the following:
MathProgBase.initialize(d, [:Grad, :Hess])
H = zeros(2) # based on MathProgBase.hesslag_structure(d) = ([1,2],[1,2])
MathProgBase.eval_hesslag(d, H, values, 0, [0])
but I am getting an error.
For reference, there is a cross-post on the helpful Julia Discourse forum.
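For what it's worth, here is a hedged sketch of what usually fixes this (my reading of the MathProgBase NLP interface, not an answer from the thread): request :Hess on a fresh evaluator, size H to the number of structural nonzeros, and pass the objective weight as a Float64 together with an empty constraint-multiplier vector, since this model has no constraints.
d = JuMP.NLPEvaluator(m)
MathProgBase.initialize(d, [:Grad, :Hess])
hess_struct = MathProgBase.hesslag_structure(d)   # ([1, 2], [1, 2]) here: entries (1,1) and (2,2)
H = zeros(length(hess_struct[1]))                  # one slot per structural nonzero
MathProgBase.eval_hesslag(d, H, values, 1.0, Float64[])
# expect H[1] ≈ -sin(2.0) and H[2] ≈ -sin(3.0)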
