lbfgs: how to find gradient (R)

I want to use the L-BFGS method to minimize a function. The problem is that the function is the Svensson function (see: Svensson function) and I do not know how to find the gradient of such a function, where tau (time) goes from 1 to 15.
Any help?

This is the gradient: take the partial derivatives with respect to each of the six parameters.
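For reference, writing out the Svensson forward-rate form used in the code below, with loadings $b_0,\dots,b_3$ and decay parameters $\lambda_1,\lambda_2$:

$$f(t) = b_0 + b_1 e^{-t/\lambda_1} + b_2 \tfrac{t}{\lambda_1} e^{-t/\lambda_1} + b_3 \tfrac{t}{\lambda_2} e^{-t/\lambda_2}$$

The partials with respect to $b_0,\dots,b_3$ are just the four loading terms; the two nontrivial ones are

$$\frac{\partial f}{\partial \lambda_1} = \frac{t\,e^{-t/\lambda_1}\big(b_1\lambda_1 + b_2(t-\lambda_1)\big)}{\lambda_1^3},
\qquad
\frac{\partial f}{\partial \lambda_2} = \frac{b_3\,t\,e^{-t/\lambda_2}\,(t-\lambda_2)}{\lambda_2^3}.$$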
We can check numerically that the gradient is correct: for a small perturbation x1 of x0, the first-order Taylor expansion gives f(x1) - f(x0) ≈ ∇f(x0)·(x1 - x0).
grad = function(t, b0, b1, b2, b3, l1, l2) {
  # partials of func w.r.t. b0, b1, b2, b3, l1, l2 (in that order)
  a = 1
  b = exp(-t/l1)
  c = t/l1 * exp(-t/l1)
  d = t/l2 * exp(-t/l2)
  e = t * exp(-t/l1) * (b1*l1 + b2*(t - l1)) / l1^3
  f = -b3 * t * exp(-t/l2) * (l2 - t) / l2^3
  return(c(a, b, c, d, e, f))
}

func = function(t, b0, b1, b2, b3, l1, l2) {
  return(b0 + b1*exp(-t/l1) + b2*t/l1*exp(-t/l1) + b3*t/l2*exp(-t/l2))
}

x0 = runif(6)               # random parameter vector
x1 = x0 + rnorm(6, 0, .01)  # small random perturbation
f0 = func(1, x0[1], x0[2], x0[3], x0[4], x0[5], x0[6])
f0grd = grad(1, x0[1], x0[2], x0[3], x0[4], x0[5], x0[6])
f1 = func(1, x1[1], x1[2], x1[3], x1[4], x1[5], x1[6])
f1 - f0
## 0.009506896
sum(f0grd * (x1 - x0))
## 0.009467063
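To actually run L-BFGS with this gradient, you would typically minimize a least-squares objective over the observed maturities. Below is a sketch (in Python/SciPy rather than R; the synthetic data, starting point, and all names are illustrative, not from the question). R's optim(method = "L-BFGS-B") accepts fn and gr in the same way.

import numpy as np
from scipy.optimize import minimize

tau = np.arange(1, 16, dtype=float)  # tau = 1..15 as in the question

def svensson(p, t):
    b0, b1, b2, b3, l1, l2 = p
    return (b0 + b1*np.exp(-t/l1) + b2*t/l1*np.exp(-t/l1)
            + b3*t/l2*np.exp(-t/l2))

def svensson_grad(p, t):
    # the same six partials as the R function above, one row per parameter
    b0, b1, b2, b3, l1, l2 = p
    e1, e2 = np.exp(-t/l1), np.exp(-t/l2)
    return np.array([np.ones_like(t),
                     e1,
                     t/l1*e1,
                     t/l2*e2,
                     t*e1*(b1*l1 + b2*(t - l1))/l1**3,
                     b3*t*e2*(t - l2)/l2**3])

rng = np.random.default_rng(0)
true_p = np.array([3.0, -2.0, 1.0, 1.5, 2.0, 5.0])   # made-up "true" curve
yobs = svensson(true_p, tau) + rng.normal(0, 0.01, tau.size)

def loss(p):
    r = svensson(p, tau) - yobs
    return np.sum(r**2)

def loss_grad(p):
    r = svensson(p, tau) - yobs
    return 2 * svensson_grad(p, tau) @ r   # chain rule through the residuals

# rough starting guess; l1 != l2 avoids a degenerate start
p0 = np.array([1.0, 1.0, 1.0, 1.0, 1.5, 5.0])
res = minimize(loss, p0, jac=loss_grad, method="L-BFGS-B")
print(res.x)   # should land near true_p for this synthetic data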

Related

arctangent implementation in TMS320C55X

I'm studying an arctangent implementation for the TMS320C55x. This is the source code:
;* AR0 assigned to _x
;* AR1 assigned to _r
;* T0 assigned to _nx
PSH T3
|| BSET FRCT ;fractional mode
SUB #1, T0 ;nx-1
MOV T0, BRC0 ;repeat nx times
MOV #2596 << #16, AC3 ; AC3.Hi = C5
MOV #-9464 << #16, AC1 ; AC1.Hi = C3
MOV #32617 << #16, AC2 ; AC2.Hi = C1
*
* Note: loading T3 on the instruction before a multiply that uses it will
* cause a 1-cycle delay.
*
MPYMR T3=*AR0+, AC3, AC0 ; (Prime the Pump)
|| RPTBLOCAL loop1-1
MACR AC0, T3, AC1, AC0
MPYR T3, AC0
||MOV *AR0+, T1 ; (for next iteration)
MACR AC0, T3, AC2, AC0
MPYR T3, AC0
||MOV T1, T3
MOV HI(AC0), *AR1+ ;save result
||MPYR T1, AC3, AC0 ; (for next iteration)
loop1:
POP T3
|| BCLR FRCT ;return to standard C
MOV #0, T0 ;return OK value (no possible error)
|| RET
where _x is the input vector and _r is the output; nx is the number of elements.
My question is about the constants assigned to AC3, AC1, and AC2. I guess they are the coefficients of a polynomial approximation, but I don't understand how to calculate them.
I do not follow the assembly code, but I could guess where those magic coefficients come from.
The code comments suggest C1, C3, C5 are coefficients of a polynomial approximation, and arctan is an odd function, so its Taylor expansion around 0 indeed has only odd powers of x: y = x - x^3/3 + x^5/5 - x^7/7 + .... Comparing C1 = 32617 to the leading coefficient 1, and given the fixed-point computational context, this suggests that the result of the calculation is scaled by 2^15 = 32768.
It turns out that y = (32617 x - 9464 x^3 + 2596 x^5) / 32768 is in fact a pretty good approximation of arctan(x) over the interval [-1, 1]. As shown below (and verified in Wolfram Alpha), the largest absolute error of the approximation is less than 1/1000, and it is small at the endpoints x = ±1 corresponding to y = ±π/4, which is probably desirable in graphics calculations.
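As a quick sanity check (a Python sketch, not part of the original answer), one can evaluate the error of the posted polynomial directly:

import numpy as np

x = np.linspace(-1.0, 1.0, 200001)
y = (32617*x - 9464*x**3 + 2596*x**5) / 32768   # the code's polynomial
err = np.abs(y - np.arctan(x))
print(err.max())               # per the claim above, should be below 1e-3
print(abs(y[-1] - np.pi/4))    # error at the endpoint x = 1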
As to how the coefficients were actually derived: a crude polynomial best fit using just 9 control points gives y = 32613 x - 9443 x^3 + 2573 x^5, with coefficients already close to the ones used in the posted code. More control points and/or additional conditions to minimize the error at the endpoints would give slightly different coefficients, but it is hard to match the posted ones exactly without documentation or clues about the optimization criteria actually used.
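For illustration, a minimal least-squares fit of that kind (a sketch; the 9 evenly spaced control points are an assumption mirroring the answer, and the exact numbers depend on the point choice):

import numpy as np

x = np.linspace(-1.0, 1.0, 9)             # 9 control points
A = np.column_stack([x, x**3, x**5])      # odd powers only
coef, *_ = np.linalg.lstsq(A, np.arctan(x), rcond=None)
print(np.round(coef * 32768))             # compare to 32617, -9464, 2596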

Multiplication of Fuzzy Numbers (a > 0, b < 0)

If A ⊗ B = (a, a1, a2)_LR ⊗ (b, b1, b2)_LR = (a·b, b·a1 - a·b2, b·a2 - a·b1)_LR (the book's formula for a < 0, b > 0 fuzzy number multiplication), then what is the formula for the a > 0, b < 0 case?
Check the photo below: [Image: Fuzzy Number Multiplication]
If A ⊗ B is to be the same as B ⊗ A, then there is no need to give separate definitions for the two cases a < 0, b > 0 and a > 0, b < 0: if a > 0, b < 0, just compute B ⊗ A. On the other hand, if fuzzy multiplication is not commutative, then you are correct that there is a gap in the book's definition.
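Concretely, assuming commutativity and applying the quoted a < 0, b > 0 rule to B ⊗ A (a derivation from the formula as quoted above, not taken from the book):

$$A \otimes B = B \otimes A = \big(a\,b,\; a\,b_1 - b\,a_2,\; a\,b_2 - b\,a_1\big)_{LR} \quad (a > 0,\ b < 0)$$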

Finding the new relative position of a point after it moves together with a line

I am working on a drawing application. I will summarize the main question in a tiny scenario:
The user draws the line A1-B1 and also the point M1. Now the user moves the line A1-B1 to a new position A2-B2, and the point M1 should move with the line (like a rigid body). How can I calculate the new position M2 of the point?
I know it is possible with complex calculations on line or circle intersections, finally determining if the new point is on the right or left of the line (this question), etc. But since I want to recalculate live during every step of dragging, I guess there should be a shortcut, lightweight solution like the ones modeling software uses. Is there really a shortcut formula rather than the line or circle intersections?
To summarize the problem again, I am looking for a light function given A1, B1, A2, B2, M1 that returns the position of M2 (two separate formulas for x and y):
M2 = f(A1,B1,A2,B2,M1)
My guess: I think finding the center and the angle of rotation is one of the best options, but again I don't know if there is a shortcut function to find them.
Edit: I prefer to implement it on a website, so if there is a ready JavaScript library, that would be helpful. However, right now I am just looking for the mathematical logic.
Start with A1 and B1.
D = B1 - A1
C1 = D/|D|
Rotate C1 90 degrees counter-clockwise to get C'1:
C1 = (cx, cy)
C'1 = (-cy, cx)
Now take M1 - A1 and measure it against C1 and C'1:
j = (M1 - A1) · C1
k = (M1 - A1) · C'1
Now it's easy to prove:
M1 = A1 + j C1 + k C'1
So j and k tell you where M1 is, given A1 and B1.
When you get A2 and B2, use them to construct C2, rotate it to get C'2, and then you'll have:
M2 = A2 + j C2 + k C'2
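A minimal sketch of this frame method (in Python; the helper name and tuple point representation are illustrative, not from the answer):

import math

def reposition(A1, B1, A2, B2, M1):
    # unit direction of the old segment and its 90-degree CCW normal
    dx, dy = B1[0] - A1[0], B1[1] - A1[1]
    L = math.hypot(dx, dy)
    c1 = (dx / L, dy / L)
    n1 = (-c1[1], c1[0])
    # coordinates (j, k) of M1 in the frame attached to A1
    j = (M1[0] - A1[0]) * c1[0] + (M1[1] - A1[1]) * c1[1]
    k = (M1[0] - A1[0]) * n1[0] + (M1[1] - A1[1]) * n1[1]
    # the same frame attached to the new segment
    dx2, dy2 = B2[0] - A2[0], B2[1] - A2[1]
    L2 = math.hypot(dx2, dy2)
    c2 = (dx2 / L2, dy2 / L2)
    n2 = (-c2[1], c2[0])
    return (A2[0] + j * c2[0] + k * n2[0],
            A2[1] + j * c2[1] + k * n2[1])

# e.g. translating the segment by (1, 0) moves M the same way:
# reposition((0, 0), (1, 0), (1, 0), (2, 0), (0.5, 0.5)) -> (1.5, 0.5)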
I'm assuming points are objects which have x and y values. Like this:
A1 = {x: 100, y:100}
Here's a lightweight solution:
function reposition(A1, B1, A2, B2, M1) {
  // rotation of the segment: direction of A2->B2 minus direction of A1->B1
  var angle = Math.atan2(B2.y - A2.y, B2.x - A2.x) - Math.atan2(B1.y - A1.y, B1.x - A1.x);
  // offset of M1 from the old anchor point A1
  var xdif = M1.x - A1.x;
  var ydif = M1.y - A1.y;
  // rotate the offset by the angle and attach it to the new anchor A2
  return {
    x: A2.x + Math.cos(angle) * xdif - Math.sin(angle) * ydif,
    y: A2.y + Math.sin(angle) * xdif + Math.cos(angle) * ydif
  };
}
// e.g. reposition({x:0,y:0}, {x:1,y:0}, {x:1,y:0}, {x:2,y:0}, {x:0.5,y:0.5})
//      -> {x: 1.5, y: 0.5}

2D curve fitting in Julia

I have an array Z in Julia which represents an image of a 2D Gaussian function. I.e. Z[i,j] is the height of the Gaussian at pixel i,j. I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (fitting a distribution to randomly sampled points). Then I tried the LsqFit package, but it seems tailored to 1D fitting: it throws errors when I try to fit 2D data, and I can't find documentation that leads me to a solution.
How can I fit a Gaussian to a 2D array in Julia?
The simplest approach is to use Optim.jl. Here is example code (not optimized for speed, but it should show how you can handle the problem):
using Distributions, Optim
# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])
function objective(x)
mu, sig, s = decode(x)
try # sig might be infeasible so we have to handle this case
est_d = MvNormal(mu, sig)
ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_m, m))
catch
sum(m)
end
end
# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively you could use constrained optimization e.g. like this:
using Distributions, JuMP, NLopt
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
@variable(m, m1)
@variable(m, m2)
@variable(m, sig11 >= 0.001)
@variable(m, sig12)
@variable(m, sig22 >= 0.001)
@variable(m, sc >= 0.001)
function obj(m1, m2, sig11, sig12, sig22, sc)
est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_Z, Z))
end
JuMP.register(m, :obj, 6, obj, autodiff=true)
@NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
@NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])
In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid distance and offset positions), and Ri and Rj the ranges of the indices. dist is the distance measure you use, eg. squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = @views (v[1:N], reshape(v[N+1:end], N, N))
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidefinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently and optimize over a lower triangular matrix L such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
loss(μ, L * L')
end
(This is of course amenable to further optimization, such as expanding the multiplication directly into loss.) A different way is to specify the condition as constraints for the optimizer.
For the optimizer to work, you probably have to provide the derivative of loss_packed. Either calculate it manually (a good choice of dist helps), or perhaps more easily by using a log transformation (if you're lucky, you may find a way to reduce it to a linear problem...). Alternatively, you could use an optimizer that does automatic differentiation.
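As a concrete illustration of this packing-plus-Cholesky idea (sketched here in Python/SciPy rather than Julia; the grid, data, and all names are illustrative and mirror the examples above):

import numpy as np
from scipy.optimize import minimize

# grid and synthetic data, as in the Julia examples
xr = np.arange(-3, 3.05, 0.1)
yr = np.arange(-3, 3.05, 0.1)
X, Y = np.meshgrid(xr, yr, indexing="ij")

def model(mu, sigma, s):
    # scaled 2D Gaussian density evaluated on the grid
    d = np.stack([X - mu[0], Y - mu[1]], axis=-1)
    q = np.einsum("...i,ij,...j->...", d, np.linalg.inv(sigma), d)
    return s * np.exp(-q / 2) / (2 * np.pi * np.sqrt(np.linalg.det(sigma)))

Z = model(np.array([1.0, 0.0]), np.array([[2.0, 1.0], [1.0, 3.0]]), 5.0)

def unpack(v):
    mu = v[:2]
    L = np.array([[v[2], 0.0], [v[3], v[4]]])
    return mu, L @ L.T, v[5]          # Sigma = L L' is PSD by construction

def loss_packed(v):
    mu, sigma, s = unpack(v)
    return np.sum((model(mu, sigma, s) - Z) ** 2)

res = minimize(loss_packed, np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0]))
mu, sigma, s = unpack(res.x)
print(mu, sigma, s)   # should recover roughly [1, 0], [[2, 1], [1, 3]], 5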

How to interface Prolog CLP(R) with real vectors?

I'm using Prolog to solve simple geometrical equations.
For example, I can define all points p3 on a line passing through two points p1 and p2 as:
line((X1, Y1, Z1), (X2, Y2, Z2), T, (X3, Y3, Z3)) :-
{(X2 - X1) * T = X3},
{(Y2 - Y1) * T = Y3},
{(Z2 - Z1) * T = Z3}.
And then a query like line((0, 0, 0), (1, 1, 1), _, (2, 2, 2)) succeeds.
But what I'd really want is to write down something like this:
line(P1, P2, T, P3) :- {(P2 - P1) * T = P3}.
Where P1, P2, and P3 are real vectors.
What's the best way of arriving at something similar? The best I have found so far is to write my own add, subtract, and multiply predicates, but that's not as convenient.
Here is a solution where you still have to write a bit of code for each operator you want to handle, but which still provides nice syntax at the point of use.
Let's start with a notion of evaluating an arithmetic expression on vectors to a vector. This essentially applies arithmetic operations component-wise. (But you could add a dot product or whatever you like.)
:- use_module(library(clpr)).
vectorexpr_value((X,Y,Z), (X,Y,Z)).
vectorexpr_value(V * T, (X,Y,Z)) :-
vectorexpr_value(V, (XV,YV,ZV)),
{ X = XV * T },
{ Y = YV * T },
{ Z = ZV * T }.
vectorexpr_value(L + R, (X,Y,Z)) :-
vectorexpr_value(L, (XL,YL,ZL)),
vectorexpr_value(R, (XR,YR,ZR)),
{ X = XL + XR },
{ Y = YL + YR },
{ Z = ZL + ZR }.
vectorexpr_value(L - R, (X,Y,Z)) :-
vectorexpr_value(L, (XL,YL,ZL)),
vectorexpr_value(R, (XR,YR,ZR)),
{ X = XL - XR },
{ Y = YL - YR },
{ Z = ZL - ZR }.
So for example:
?- vectorexpr_value(A + B, Result).
A = (_1784, _1790, _1792),
B = (_1808, _1814, _1816),
Result = (_1832, _1838, _1840),
{_1808=_1832-_1784},
{_1814=_1838-_1790},
{_1816=_1840-_1792} .
Given this, we can now define "equality" of vector expressions by "evaluating" both of them and asserting pointwise equality on the results. To make this look nice, we can define an operator for it:
:- op(700, xfx, ===).
This defines === as an infix operator with the same priority as the other equality operators =, =:=, etc. Prolog doesn't allow you to overload operators, so we made up a new one. You can think of the three = signs in the operator as expressing equality in three dimensions.
Here is the corresponding predicate definition:
ExprL === ExprR :-
vectorexpr_value(ExprL, (XL,YL,ZL)),
vectorexpr_value(ExprR, (XR,YR,ZR)),
{ XL = XR },
{ YL = YR },
{ ZL = ZR }.
And we can now define line/4 almost as you wanted:
line(P1, P2, T, P3) :-
(P2 - P1) * T === P3.
Tests:
?- line((0,0,0), (1,1,1), Alpha, (2,2,2)).
Alpha = 2.0 ;
false.
?- line((0,0,0), (1,1,1), Alpha, (2,3,4)).
false.
