Fitting two curves with linear/non-linear regression - julia

I need to fit two curves (both of which should be cubic functions) to a set of points with JuMP.
I've managed to fit one curve, but I'm struggling to fit two curves to the same dataset.
I thought that if I could assign points to curves - so that each point is used by only one curve - I could do it like below, but it didn't work. (I know that I could use much more complicated approaches; I want to keep it simple.)
This is a part of my current code:
# cubicFunc is a two-dimensional array which accepts cubicFunc[x, degree]
@variable(m, mult1[1:4]) # 0:3 because it's cubic
@variable(m, mult2[1:4]) # 0:3 because it's cubic
@variable(m, 0 <= includeIn1[1:numOfPoints] <= 1, Int)
@variable(m, 0 <= includeIn2[1:numOfPoints] <= 1, Int)
# some kind of hack to force one of them to 0 and the other one to 1
@constraint(m, loop[i in 1:numOfPoints], includeIn1[i] + includeIn2[i] == 1)
@objective(m, Min, sum( (yPoints - cubicFunc*mult1).*includeIn1 .^2 ) + sum( (yPoints - cubicFunc*mult2).*includeIn2 .^2 ))
But it gives various errors depending on what I try; neither *includeIn1 nor .*includeIn1 works, and when I tried it via @NLobjective it gave me a whopping ~50 lines of errors, etc.
Is my idea realistic? Can I turn it into working code?
Any help will be highly appreciated. Thank you very much.

You can write down the problem e.g. like this:
using JuMP, Ipopt
m = Model(with_optimizer(Ipopt.Optimizer))
@variable(m, mult1[1:4])
@variable(m, mult2[1:4])
@variable(m, 0 <= includeIn1[1:numOfPoints] <= 1)
@variable(m, 0 <= includeIn2[1:numOfPoints] <= 1)
@NLconstraint(m, loop[i in 1:numOfPoints], includeIn1[i] + includeIn2[i] == 1)
@NLobjective(m, Min, sum(includeIn1[i] * (yPoints[i] - sum(cubicFunc[i,j]*mult1[j] for j in 1:4))^2 for i in 1:numOfPoints) +
                     sum(includeIn2[i] * (yPoints[i] - sum(cubicFunc[i,j]*mult2[j] for j in 1:4))^2 for i in 1:numOfPoints))
optimize!(m)
Given the constraint, includeIn1 and includeIn2 will be 1 or 0 at the optimum (if they are not, this means it does not matter to which group you assign the point), so we do not have to constrain them to be binary. Also, I use a non-linear solver, as the problem does not seem to be reformulable as a linear or quadratic optimization task.
However, I give the above code only as an example of how you can write the problem down. The task you have formulated does not have a unique local minimum (which would then be the global one), but several local minima. Therefore the standard non-linear convex solvers that JuMP supports will only find one local optimum (not necessarily a global one). In order to look for global optima you need to switch to global solvers, e.g. https://github.com/robertfeldt/BlackBoxOptim.jl.
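For illustration only, here is a minimal sketch of how such a global search might look with BlackBoxOptim.jl. It avoids the assignment variables altogether by letting each point count toward whichever curve fits it better, which is equivalent at the optimum to the 0/1 assignment above; the ±100 search range for the coefficients is an arbitrary assumption, and yPoints / cubicFunc are the data arrays from the question.
using BlackBoxOptim

# p = [mult1; mult2]: 8 cubic coefficients, 4 per curve
function twocurve_loss(p)
    mult1, mult2 = p[1:4], p[5:8]
    s = 0.0
    for i in eachindex(yPoints)
        r1 = yPoints[i] - sum(cubicFunc[i, j] * mult1[j] for j in 1:4)
        r2 = yPoints[i] - sum(cubicFunc[i, j] * mult2[j] for j in 1:4)
        s += min(r1^2, r2^2)   # each point is charged to its closer curve
    end
    return s
end

res = bboptimize(twocurve_loss; SearchRange = (-100.0, 100.0), NumDimensions = 8)
best_candidate(res)            # best [mult1; mult2] found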

Related

Define a correct constraint, like outside a 2-D rectangle in Julia with JuMP

I would like to define a constraint in an optimization problem as follows:
(x,y) not in {(x,y)|1.0 < x < 2.0, 3.0 < y < 4.0}.
What I tried is @constraint(model, (1.0 < x < 2.0 + 3.0 < y < 4.0) != 2), but it failed.
It seems that boolean operations are not allowed, so I have no idea how to do it. Any advice is appreciated!
You should avoid introducing quadratic constraints (as in the other answer) and rather introduce binary variables. This increases the number of available solvers, and linear models generally take less time to solve.
Note that !(1.0 < x < 2.0) is equivalent to x <= 1 || x >= 2, which can be written in linear form as:
@variable(model, bx, Bin)
const M = 1000 # a number "big enough"
@constraint(model, x <= 1 + M*bx)
@constraint(model, x >= 2 - M*(1-bx))
bx is here a "switcher" variable that makes either first or second constraint obligatory.
I am not sure what you want about y as you have 3.0 < y < 3.0 but basically the pattern to formulate the would be the same.
Just note you cannot have a constraint such as y != 3 as solvers obviously have some numerical accuracy and you would need rather to represent this is as for an example !(3-0.01 < y < 3+0.01) (still using the same pattern as above)
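For illustration, a sketch of that y != 3 pattern written the same way as the x constraint above; by is a new binary switch, 0.01 is the example tolerance from the text, and M is the "big enough" constant defined earlier.
@variable(model, by, Bin)
@constraint(model, y <= 3 - 0.01 + M*by)       # binding branch when by == 0
@constraint(model, y >= 3 + 0.01 - M*(1-by))   # binding branch when by == 1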
UPDATE: The previous solution in this answer turned out to be wrong (it excluded parts of the admissible region), so I felt obligated to provide another, correct solution. This solution partitions the admissible region into parts, solves a separate optimization problem for each part, and keeps the best solution. It is not an elegant approach, but if one does not have a good (commercial) solver it is one way to go. Commercial solvers usually go through a similar but more efficient process known as branch-and-bound.
using JuMP, Ipopt
function solveopt()
    bestobj = Inf
    bestx, besty = 0.0, 0.0
    for (ltside, xvar, val) in (
            (true, true, 2.0), (false, true, 3.0),
            (true, false, 3.0), (false, false, 4.0))
        m = Model(Ipopt.Optimizer)
        @variable(m, x)
        @variable(m, y)
        add_constraint(m, ScalarConstraint(xvar ? x : y,
            ltside ? MOI.LessThan(val) : MOI.GreaterThan(val)))
        # the following objective has its unconstrained optimum inside the box
        @NLobjective(m, Min, (x-2.5)^2 + (y-3.5)^2)
        optimize!(m)
        if objective_value(m) < bestobj
            bestobj = objective_value(m)   # remember the best objective found so far
            bestx, besty = value(x), value(y)
        end
    end
    return bestx, besty
end
The solution for this example problem is:
julia> solveopt()
:
: lots of solver output...
:
(2.5, 3.9999999625176965)
Lastly, I benchmarked this crude method against a non-commercial solver (Pajarito) using the method from the other answer, and this one is about 2x faster (because of its simplicity, I suppose). Commercial solvers would beat both times.

Constrained Optimization in R using constrOptim.nl

I am trying to find the maximum values of this objective function:
f(x1,x2,x3) = 1300x1 + 600x2 + 500x3
subject to the following constraints
300x1 + 150x2 + 100x3 <= 4,000
90x1 + 30x2 + 40x3 <= 1,000
x1 <= 5
x1, x2, x3 >= 0
Below is the code I am using, which is not returning the values I'm looking for. The outputs for the variables are 9.453022e-12, 3.272252e-12, 5.548419e-14 and the total value is -1.428002e-08.
I'm new to R. What am I doing wrong? Thank you.
library(alabama)  # constrOptim.nl comes from the alabama package

f = function(x) -(1300*x[1] + 600*x[2] + 500*x[3]) # minimize -f(x,y,z)

inequalities = function(x){ # define the inequality constraints
  h = 0
  h[1] = -(300*x[1] + 150*x[2] + 100*x[3] - 4000)
  h[2] = -(90*x[1] + 30*x[2] + 40*x[3] - 1000)
  h[3] = -(1*x[1] + 0*x[2] + 0*x[3] - 5)
  return(h)}

g = function(x){ # x,y,z > 0
  h = 0
  h[1] = x[1]
  h[2] = x[2]
  h[3] = x[3]
  return(h)}

p0 = c(0,0,0) # the starting point
y = constrOptim.nl(p0, f, hin=inequalities, heq=g)
print(y$par)
print(y$value)
The documentation says:
heq: a vector function specifying equality constraints such that heq[j] = 0 for all j
So the lower bounds x[1], x[2], x[3] >= 0 you are trying to specify are actually interpreted as the equalities x[1] = x[2] = x[3] = 0. Hence the solution 9.453022e-12, 3.272252e-12, 5.548419e-14. Your lower bounds need to be incorporated into hin instead.
Note that there are better, dedicated linear solvers for linear problems; passing a linear problem to a non-linear solver is not optimal.
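For illustration only, here is a minimal sketch of the same LP handed to an actual linear solver, written with JuMP (the framework used earlier on this page); the choice of the HiGHS solver is my own assumption, and any LP solver would do.
using JuMP, HiGHS

m = Model(HiGHS.Optimizer)
@variable(m, x[1:3] >= 0)
@constraint(m, 300*x[1] + 150*x[2] + 100*x[3] <= 4000)
@constraint(m,  90*x[1] +  30*x[2] +  40*x[3] <= 1000)
@constraint(m, x[1] <= 5)
@objective(m, Max, 1300*x[1] + 600*x[2] + 500*x[3])
optimize!(m)
value.(x), objective_value(m)   # maximizer and maximum of the original problem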

How is Hard Sigmoid defined

I am working on deep nets using Keras. There is an activation called "hard sigmoid". What is its mathematical definition?
I know what is Sigmoid. Someone asked similar question on Quora: https://www.quora.com/What-is-hard-sigmoid-in-artificial-neural-networks-Why-is-it-faster-than-standard-sigmoid-Are-there-any-disadvantages-over-the-standard-sigmoid
But I could not find the precise mathematical definition anywhere.
Since Keras supports both TensorFlow and Theano, the exact implementation might differ for each backend; I'll cover Theano only. For the Theano backend, Keras uses T.nnet.hard_sigmoid, which is in turn a linearly approximated standard sigmoid:
slope = tensor.constant(0.2, dtype=out_dtype)
shift = tensor.constant(0.5, dtype=out_dtype)
x = (x * slope) + shift
x = tensor.clip(x, 0, 1)
i.e. it is: max(0, min(1, x*0.2 + 0.5))
For reference, the hard sigmoid function may be defined differently in different places. In Courbariaux et al. 2016 [1] it's defined as:
σ is the “hard sigmoid” function: σ(x) = clip((x + 1)/2, 0, 1) =
max(0, min(1, (x + 1)/2))
The intent is to provide a probability value (hence constraining it to be between 0 and 1) for use in stochastic binarization of neural network parameters (e.g. weights, activations, gradients). You use the probability p = σ(x) returned by the hard sigmoid to set the parameter x to +1 with probability p, or to -1 with probability 1-p.
[1] https://arxiv.org/abs/1602.02830 - "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1", Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio, (Submitted on 9 Feb 2016 (v1), last revised 17 Mar 2016 (this version, v3))
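As an illustrative sketch (in Julia, not code from the paper), the binarization step described above could look like this:
hard_sigmoid(x) = clamp((x + 1) / 2, 0, 1)           # σ(x) = max(0, min(1, (x + 1)/2))

binarize(x) = rand() < hard_sigmoid(x) ? 1.0 : -1.0  # +1 with probability σ(x), else -1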
The hard sigmoid is normally a piecewise linear approximation of the logistic sigmoid function. Depending on what properties of the original sigmoid you want to keep, you can use a different approximation.
I personally like to keep the function correct at zero, i.e. σ(0) = 0.5 (shift) and σ'(0) = 0.25 (slope). This could be coded as follows:
import numpy as np

def hard_sigmoid(x):
    return np.maximum(0, np.minimum(1, (x + 2) / 4))
It is:
clip((x + 1)/2, 0, 1)
or, in coding parlance:
max(0, min(1, (x + 1)/2))

Errors when attempting constrained optimisation using optim()

I have been using the Excel solver to handle the following problem:
solve for a, b and c in the equation:
y = a*b*c*x/((1 - c*x)(1 - c*x + b*c*x))
subject to the constraints
0 < a < 100
0 < b < 100
0 < c < 100
f(x[1]) < 10
f(x[2]) > 20
f(x[3]) < 40
where I have about 10 (x,y) value pairs. I minimize the sum of abs(y - f(x)). And I can constrain both the coefficients and the range of values for the result of my function at each x.
I tried nls (without trying to impose the constraints) and while Excel provided estimates for almost any starting values I cared to provide, nls almost never returned an answer.
I switched to using optim, but I'm having trouble applying the constraints.
This is how far I have gotten:
best = function(p,x,y){sum(abs(y - p[1]*p[2]*p[3]*x/((1 - p[3]*x)*(1 - p[3]*x + p[2]*p[3]*x))))}
p = c(1,1,1)
x = c(.1,.5,.9)
y = c(5,26,35)
optim(p,best,x=x,y=y)
I did this to add the first set of constraints:
optim(p,best,x=x,y=y,method="L-BFGS-B",lower=c(0,0,0),upper=c(100,100,100))
I get the error ""ERROR: ABNORMAL_TERMINATION_IN_LNSRCH"
and end up with a higher value of the error ($value). So it seems like I am doing something wrong. I couldn't figure out how to apply my other set of constraints at all.
Could someone provide me a basic idea how to solve this problem that a non-statistician can understand? I looked at a lot of posts and looked in a few R books. The R books stopped at the simplest use of optim.
The absolute value introduces a singularity: you may want to use a square instead, especially for gradient-based methods (such as L-BFGS). The denominator of your function can be zero. The fact that the parameters appear in products and that you allow them to be (arbitrarily close to) zero can also cause problems.
You can try other optimizers (a complete list is on the optimization task view) until you find one for which the optimization converges.
x0 <- c(.1,.5,.9)
y0 <- c(5,26,35)
p <- c(1,1,1)
lower <- 0*p
upper <- 100 + lower
f <- function(p,x=x0,y=y0) sum(
(
y - p[1]*p[2]*p[3]*x / ( (1 - p[3]*x)*(1 - p[3]*x + p[2]*p[3]*x) )
)^2
)
library(dfoptim)
nmkb(p, f, lower=lower, upper=upper) # Converges
library(Rvmmin)
Rvmmin(p, f, lower=lower, upper=upper) # Does not converge
library(DEoptim)
DEoptim(f, lower, upper) # Does not converge
library(NMOF)
PSopt(f, list(min=lower, max=upper))[c("xbest", "OFvalue")] # Does not really converge
DEopt(f, list(min=lower, max=upper))[c("xbest", "OFvalue")] # Does not really converge
library(minqa)
bobyqa(p, f, lower, upper) # Does not really converge
As a last resort, you can always use a grid search.
library(NMOF)
r <- gridSearch( f,
lapply(seq_along(p), function(i) seq(lower[i],upper[i],length=200))
)

Fast, inaccurate sin function without lookup

For an ocean shader, I need a fast function that computes a very approximate value for sin(x). The only requirements are that it is periodic, and roughly resembles a sine wave.
The Taylor series of sin is too slow, since I'd need to compute up to the 9th power of x just to cover a full period.
Any suggestions?
EDIT: Sorry, I didn't mention that I can't use a lookup table, since this is in the vertex shader. A lookup table would involve a texture sample, which in the vertex shader is slower than the built-in sin function.
It doesn't have to be in any way accurate, it just has to look nice.
Use a Chebyshev approximation with as many terms as you need. This is particularly easy if your input angles are constrained to be well behaved (-π .. +π or 0 .. 2π), so you do not have to reduce the argument to a sensible range first. You might use 2 or 3 terms instead of 9.
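For illustration (not part of the original answer), a small Julia sketch of the idea: compute the Chebyshev interpolation coefficients of sin on [-π, π] once, then evaluate the truncated series with the Chebyshev recurrence; in a shader the precomputed coefficients would simply be hard-coded.
function cheb_coeffs(f, n; a = -pi, b = pi)
    # interpolate f at the n Chebyshev nodes of the first kind, mapped to [a, b]
    θ = [(k + 0.5) * pi / n for k in 0:n-1]
    y = [f((cos(t) + 1) * (b - a) / 2 + a) for t in θ]
    return [(j == 0 ? 1 : 2) / n * sum(y[k+1] * cos(j * θ[k+1]) for k in 0:n-1) for j in 0:n-1]
end

function cheb_eval(c, t; a = -pi, b = pi)
    # evaluate sum_j c[j+1] * T_j(x) via the three-term recurrence, for t in [a, b]
    x = 2 * (t - a) / (b - a) - 1
    T0, T1 = 1.0, x
    s = c[1]
    for j in 2:length(c)
        s += c[j] * T1
        T0, T1 = T1, 2 * x * T1 - T0
    end
    return s
end

c = cheb_coeffs(sin, 6)       # a handful of terms instead of a 9th-degree Taylor series
cheb_eval(c, 1.0), sin(1.0)   # compare the approximation against the real sine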
You can make a look-up table of sin values at some sample points and use linear interpolation between those values.
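A minimal Julia sketch of that idea (my own illustration; the table size of 64 samples is an arbitrary choice):
const N = 64
const TABLE = [sin(2pi * k / N) for k in 0:N]   # N+1 samples so the last interval wraps the period

function sin_lut(x)
    t = mod(x, 2pi) * N / (2pi)                 # position in table units, in [0, N)
    i = min(floor(Int, t), N - 1)               # index of the left sample (guard the edge case)
    frac = t - i
    return (1 - frac) * TABLE[i + 1] + frac * TABLE[i + 2]   # linear interpolation (1-based indexing)
end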
A rational algebraic function approximation to sin(x), valid from zero to π/2, is:
f = (C1 * x) / (C2 * x^2 + 1.0)
with the constants:
C1 = 1.043406062
C2 = 0.2508691922
These constants were found by least-squares curve fitting. (Using subroutine DHFTI, by Lawson & Hanson).
If the input is outside [0, 2π], you'll need to take x mod 2π.
To handle negative numbers, you'll need to write something like:
t = MOD(t, twopi)
IF (t < 0.) t = t + twopi
Then, to extend the approximation's range from [0, π/2] to [0, 2π], reduce the input with something like:
IF (t < pi) THEN
   IF (t < pi/2) THEN
      x = t
   ELSE
      x = pi - t
   END IF
ELSE
   IF (t < 1.5 * pi) THEN
      x = t - pi
   ELSE
      x = twopi - t
   END IF
END IF
Then calculate:
f = (C1 * x) / (C2 * x*x + 1.0)
IF (t > pi) f = -f
The results should be within about 5% of the real sine.
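For convenience, here is the whole scheme above assembled into one Julia function (my own transcription of the steps and constants given in this answer):
const C1 = 1.043406062
const C2 = 0.2508691922

function approx_sin(x)
    t = mod(x, 2pi)                          # handles negative inputs as well
    # fold t into [0, pi/2], where the rational approximation is valid
    u = t < pi ? (t < pi/2 ? t : pi - t) :
                 (t < 1.5pi ? t - pi : 2pi - t)
    f = (C1 * u) / (C2 * u^2 + 1.0)
    return t > pi ? -f : f                   # restore the sign in the second half-period
end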
Well, you don't say how accurate you need it to be. The sine can be approximated by straight lines of slopes 2/pi and -2/pi on intervals [0, pi/2], [pi/2, 3*pi/2], [3*pi/2, 2*pi]. This approximation can be had for the cost of a multiplication and an addition after reducing the angle mod 2*pi.
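A Julia sketch of that piecewise-linear ("triangle wave") approximation, written out for clarity (my own illustration of the description above):
function tri_sin(x)
    t = mod(x, 2pi)                  # reduce the angle mod 2*pi
    if t < pi/2
        return (2/pi) * t            # rising line through (0, 0) and (pi/2, 1)
    elseif t < 3pi/2
        return 2 - (2/pi) * t        # falling line through (pi/2, 1) and (3pi/2, -1)
    else
        return (2/pi) * t - 4        # rising line back to (2pi, 0)
    end
end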
Using a lookup table is probably the best way to control the tradeoff between speed and accuracy.
