Maximizing a function using optim in R where one of the parameters is an integer

I have a function that I need to maximize that contains 3 parameters, one of which is an integer.
How do I let the optim function know to maximize (instead of minimize, which is the default)?
And how do I let it know that one of the parameters is an integer?
Will it work if one of the parameters is binary or categorical?

Max vs min is easy (set fnscale=-1 in the control parameter).
Integer parameters are not easy. I don't know of a simple out-of-the-box solution for this; hopefully someone else does.
Most of the methods implemented in optim assume continuous parameter spaces. (method="SANN" will work, since you can give it explicit rules for updating the parameters - see the examples - but it's tricky to get it to work efficiently.) Most of the optimizers listed in the Optimization Task View are for continuous optimization; the section on global/stochastic methods gives the most options for mixed discrete/continuous problems.
If the range of plausible integers is reasonably small you can use brute force (i.e., optimize over the two continuous parameters for each of a range of fixed integer values); you could also use bisection search over the integers.
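For instance, here is a minimal sketch of the brute-force approach (the objective f and the ranges are made up for illustration): the integer parameter is fixed at each candidate value, and optim maximizes over the two continuous parameters with fnscale=-1.

# Hypothetical objective: two continuous parameters in `cont`, one integer k
f <- function(cont, k) {
  -(cont[1] - k)^2 - (cont[2] - 2)^2 + log(k)
}

k_range <- 1:10  # plausible range of the integer parameter
fits <- lapply(k_range, function(k)
  optim(par = c(0, 0), fn = f, k = k,
        control = list(fnscale = -1)))  # fnscale = -1 turns optim into a maximizer

best <- which.max(sapply(fits, function(fit) fit$value))
k_range[best]        # best integer value
fits[[best]]$par     # best continuous parameters
fits[[best]]$value   # maximized objective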

Related

compute the tableau's nonbasic term in SCIP separator

In traditional Simplex Algorithm notation, we have x at the current basis selection B as so:
x_B = A_B^{-1} b - A_B^{-1} A_N x_N. How can I compute the A_B^{-1} A_N term inside a separator in SCIP, or at least iterate over its columns?
I see three helpful methods: getLPColsData, getLPRowsData, getLPBasisInd. I'm just not sure exactly what data those methods represent, particularly the last one, with its negative row indexes. How do I use those to get the value I want?
Do those methods return the same data no matter what LP algorithm is used? Or do I need to account for dual vs primal? How does the use of the "revised" algorithm play into my calculation?
Update: I discovered getLPBInvARow and getLPBInvRow. Those seem to be much closer to what I'm after, but I don't yet understand their results; they seem to include more or fewer dimensions than expected. I'm still trying to understand how to use them to get the rays away from the corner.
You are correct that getLPBInvRow and getLPBInvARow are the methods you want. getLPBInvARow directly returns a row of the simplex tableau, but it is not more efficient than calling getLPBInvRow and doing the multiplication yourself, since internally the LP solver also has to compute the actual tableau row first.
I suggest you look into either sepa_gomory.c or sepa_gmi.c for examples of how to use these methods. How do they include fewer dimensions than expected? They both return sparse vectors.

openMDAO: Does the use of ExecComp maximum() interfere with constraints not being affected by design variables?

When running the optimization driver on a large model I receive:
DerivativesWarning:Constraints or objectives [('max_current.current_constraint.current_constraint', inds=[0]), ('max_current.continuous_current_constraint.continuous_current_constraint', inds=[0])] cannot be impacted by the design variables of the problem.
I read the answer to a similar question posed here.
The values do change as the design variables change, and the two constraints are satisfied during the course of optimization.
I had assumed this was due to those components' ExecComp using maximum(), as this is the only place in my model where I use a maximum function; however, when I set up a simple problem with a maximum() function in a similar manner, I do not receive the warning.
My model uses explicit components that are looped: there are connections in the bottom left of the N2 diagram, and NLBGS converges the whole model. I am currently thinking the warning is due to the use of only explicit components and NLBGS instead of implicit components.
Thank you for any insight you can give in resolving this warning.
Below is a simple script using maximum() that does not report the warning. (I was so sure that was it.) Once I create a minimum working example that reproduces the warning the way my larger model does, I will upload it.
import openmdao.api as om

prob = om.Problem()
prob.driver = om.ScipyOptimizeDriver()
prob.driver.options['optimizer'] = 'SLSQP'
prob.driver.options['tol'] = 1e-6
prob.driver.options['maxiter'] = 80
prob.driver.options['disp'] = True

indeps = prob.model.add_subsystem('indeps', om.IndepVarComp())
indeps.add_output('x', val=2.0, units=None)
prob.model.promotes('indeps', outputs=['*'])

prob.model.add_subsystem('y_func_1',
                         om.ExecComp('y_func_1 = x'),
                         promotes_inputs=['x'],
                         promotes_outputs=['y_func_1'])
prob.model.add_subsystem('y_func_2',
                         om.ExecComp('y_func_2 = x**2'),
                         promotes_inputs=['x'],
                         promotes_outputs=['y_func_2'])
prob.model.add_subsystem('y_max',
                         om.ExecComp('y_max = maximum(y_func_1, y_func_2)'),
                         promotes_inputs=['y_func_1', 'y_func_2'],
                         promotes_outputs=['y_max'])
prob.model.add_subsystem('y_check',
                         om.ExecComp('y_check = y_max - 1.1'),
                         promotes_inputs=['*'],
                         promotes_outputs=['*'])

prob.model.add_constraint('y_check', lower=0.0)
prob.model.add_design_var('x', lower=0.0, upper=2.0)
prob.model.add_objective('x')

prob.setup()
prob.run_driver()
print(prob.get_val('x'))
There is a problem with the maximum function in this context. Technically a maximum function is not differentiable; at least not when the index of which value is max is subject to change. If the maximum value is not subject to change, then it is differentiable... but you didn't need the max function anyway.
One correct, differentiable way to handle a max when doing gradient based things is to use a KS function. OpenMDAO provides the KSComp which implements it. There are other kinds of functions (like the p-norm) that you could use as well.
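As a minimal standalone sketch (separate from your model, with made-up values), KSComp can be used like this as a smooth, differentiable stand-in for a max:

import numpy as np
import openmdao.api as om

p = om.Problem()
# KSComp aggregates `width` values in its input 'g' into a smooth approximation
# of their maximum, available on the output 'KS'.
p.model.add_subsystem('ks', om.KSComp(width=2))
p.setup()

p.set_val('ks.g', np.array([[1.0, 4.0]]))  # shape (vec_size=1, width=2)
p.run_model()
print(p.get_val('ks.KS'))                  # slightly above max(1.0, 4.0) = 4.0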
However, even though maximum is not technically differentiable... you can sort-of/kind-of get away with it. At least, numpy (which ExecComp uses) lets you apply complex-step differentiation to the maximum function, and it seems to give a non-zero derivative. So while it's not technically correct, you can maybe get away with it. At least, it's not likely to be the core of your problem.
You mention using NLBGS and that you have components which are looped. Your test case is purely feed-forward though (as the N2 diagram from your test case shows). That is an important difference.
The problem here is with your derivatives, not with the maximum function. Since you have a nonlinear solver, you need to do something to get the derivatives right. In the example Sellar optimization, the model uses this line: prob.model.approx_totals(), which tells OpenMDAO to finite-difference across the whole model (including the nonlinear solver). This is simple and keeps the example compact, and it works regardless of whether your components define derivatives or not. It is, however, slow and can suffer from numerical difficulties, so use it on "real" problems at your own risk.
If you don't include that (and your example above does not, so I assume your real problem does not either), then you're basically telling OpenMDAO that you want to use analytic derivatives (yay! they are so much more awesome). That means you need a linear solver to match your nonlinear one. For most problems that you start out with, you can simply put a DirectSolver right at the top of the model and it will all work out. For more advanced models you need a more complex linear solver structure... but that's a different question.
Give this a try:
prob.model.linear_solver = om.DirectSolver()
That should give you non-zero total derivatives regardless of whether you have coupling (loops) or not.

What do you call it when kernel of a matrix is sought with a set (nonzero) tolerance?

This will be a strange question: I know what to do, and I am actually doing it, and it works, but I don't know how to write about it. Looking for solutions to a homogeneous matrix equation, say AX=0, I use the kernel of the parameter matrix A. But, the world being imperfect as it is, the matrix does not have a "perfect" kernel; it does have an "imperfect" one if you set a nonzero "tolerance" parameter. FWIW I'm using Scilab, the function is kernel(A,tol).
Now what are the correct terms for "imperfect kernel", or "tolerance" (of what?), how should this whole process be described in correct English and maths terminology? Should I say something like a "least-squares kernel"? "Approximate kernel"? Is tol the "tolerance of kernel-determination algorithm"? Sounds lame to me...
Depending on the method used (QR or SVD; a third flag lets you choose between them in the Scilab implementation), the tolerance determines when pivots (QR case) or singular values (SVD case) are considered to be zero. The kernel is then taken to be the associated subspace.
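As an illustration of the SVD variant (written in Julia rather than Scilab, with made-up data): singular values at or below the tolerance are treated as zero, and the corresponding right singular vectors span the approximate kernel.

using LinearAlgebra

# Approximate kernel of A: right singular vectors whose singular values
# fall at or below tol.
function approx_kernel(A, tol)
    F = svd(A; full=true)          # A = U * Diagonal(S) * V'
    r = count(s -> s > tol, F.S)   # numerical rank at tolerance tol
    return F.V[:, r+1:end]
end

A = [1.0 2.0; 2.0 4.0 + 1e-12]     # nearly rank-deficient
approx_kernel(A, 1e-8)             # ~ the direction [2, -1] / sqrt(5)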

Taking advantage of Julia's integration abilities

One of the main reasons I wanted to use Julia for my project is because of its speed, especially for calculating integrals.
I would like to integrate a 1-d function f(x) over some interval [a,b]. In general Julia's quadgk function would be a fast and accurate solution. However, I do not have the function f(x), but only its values f(xi) for a discrete set of points xi in [a,b], stored in an array. The xi's are regularly spaced, and I can get the spacing to be however small I like.
Naively, I could simply define a function f which interpolates between the values f(xi), feed this to quadgk, and make the spacing as small as possible; however, then I won't know what my error is, which is a shame because QuadGK reports the error of its estimate.
Another solution is to write a function myself to integrate the array (with trapezoid rule for example), but that would defeat the purpose of using Julia...
What is the easiest way to accurately integrate a function only given discrete values using Julia?
Since you only have values, not the function itself, the trapezoid rule will probably be your best bet. The Trapz.jl package provides this (https://github.com/francescoalemanno/Trapz.jl). However, I think it is worth seeing how easy it is to write a pretty good implementation yourself:
# Composite trapezoid rule for uniformly spaced samples, assuming unit spacing;
# multiply the result by the grid spacing dx for a general uniform grid.
function trap(A)
    return sum(A) - (A[begin] + A[end]) / 2
end
This takes 2.9 ms for an array of 10 million floats, and about the same if they are Int. If they were complex numbers, it would still work (and take 8.9 ms).
A method like this is a good example of how simple it can be to write pretty fast code in Julia that is still fully generic.
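A quick usage sketch (values made up): sample f(x) = x^2 on [0, 1] and scale by the spacing; the same integral via Trapz.jl is shown for comparison.

xs = range(0, 1; length=10_001)
ys = xs .^ 2
dx = step(xs)
dx * trap(ys)      # ≈ 1/3

# With the Trapz.jl package instead:
# using Trapz
# trapz(xs, ys)    # ≈ 1/3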

A distribution that returns only a single floating point value in Julia?

I have some simulation code that draws from various distributions. To facilitate some sanity checks, is there a way to make a Distribution that returns only a single floating-point value? That way I can test without changing code that calls rand on the distribution. Right now I'm doing something like, supposing I want to always get the value 2.2
mydist = Normal(2.2, 0.000001)
But this seems kind of silly. Of course, if I change the variance to 0 I get an error.
The Distributions.jl docs have a section on extending the package, so you can see what needs to be defined. An incomplete implementation of a new Distribution starts:
using Distributions
struct OneFloatDistribution <: Distribution{Univariate,Continuous}
    v::Float64
end

Base.rand(x::OneFloatDistribution) = x.v
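A possible usage sketch; the extra rng method is an assumption based on the extension interface described in the Distributions.jl docs, and should let the generic rand(d, n) fallbacks work as well.

using Random

# Assumed extension point per the Distributions.jl docs: draws always
# return the stored value, regardless of the RNG.
Base.rand(rng::Random.AbstractRNG, d::OneFloatDistribution) = d.v

mydist = OneFloatDistribution(2.2)
rand(mydist)      # 2.2
rand(mydist, 3)   # [2.2, 2.2, 2.2]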
You can get down to two possible floating point numbers with Uniform(1.0, nextfloat(1.0)).
