How to optimize with integer parameters in OpenMDAO

I am currently trying to do some optimization for locations on a map using OpenMDAO 1.7.2. The (preexisting) modules that do the calculations only support integer coordinates (resolution of one meter).
For now I am optimizing using an IndepVarComp for each direction, each containing a float vector. These values are then rounded before use, but this is quite inefficient because the solver mainly tries variations smaller than one.
When I attempt to initialize an IndepVarComp with an integer vector, the first iteration works fine (it uses the initial values), but the second iteration fails because the data in the IndepVarComp is set to an empty ndarray.
Looking through the OpenMDAO source code I found out that this is because
indep_var_comp._init_unknowns_dict['x']['size'] == 0
which happens in Component's _add_variable() method whenever the data type is not differentiable.
Here is an example problem which illustrates how defining an integer IndepVarComp fails:
from openmdao.api import Component, Group, IndepVarComp, Problem, ScipyOptimizer

INITIAL_X = 1

class ResultCalculator(Component):
    def __init__(self):
        super(ResultCalculator, self).__init__()
        self.add_param('x', INITIAL_X)
        self.add_output('y', 0.)

    def solve_nonlinear(self, params, unknowns, resids):
        unknowns['y'] = (params['x'] - 3) ** 2 - 4

problem = Problem()
problem.root = Group()
problem.root.add('indep_var_comp', IndepVarComp('x', INITIAL_X))
problem.root.add('calculator', ResultCalculator())
problem.root.connect('indep_var_comp.x', 'calculator.x')
problem.driver = ScipyOptimizer()
problem.driver.options['optimizer'] = 'COBYLA'
problem.driver.add_desvar('indep_var_comp.x')
problem.driver.add_objective('calculator.y')
problem.setup()
problem.run()
Which fails with
ValueError: setting an array element with a sequence.
Note that everything works out fine if I set INITIAL_X = 0. (i.e. a float).
How am I supposed to optimize for integers?

If you want to use integer variables, you need to pick a different kind of optimizer. You won't be able to force COBYLA to respect the integrality. Additionally, if you do have some kind of integer rounding causing discontinuities in your analyses, then you really can't be using COBYLA (or any other continuous optimizer) at all. They all make a fundamental assumption about smoothness of the function which you would be violating.
It sounds like you should consider using a particle swarm or genetic algorithm for your problem. Alternatively, you could focus on making the analyses smooth and differentiable and scale some of your inputs to get more reasonable resolution. You can also loosen the convergence tolerance of the optimizer so it stops iterating once it gets below physical significance in your design variables.
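For the scaling and tolerance part of that advice, a minimal sketch against the 1.x API from the question might look like this (the bounds, scaler, and tol values are illustrative assumptions, not recommendations):
# Sketch only (OpenMDAO 1.x API): keep x as a float design variable, scale it so
# the optimizer works in coarser units, and loosen the tolerance so it stops
# once its steps fall below the one-meter resolution of the analysis.
problem.driver.options['tol'] = 0.5  # illustrative value
problem.driver.add_desvar('indep_var_comp.x', lower=0., upper=1000., scaler=0.01)
problem.driver.add_objective('calculator.y')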

Related

openMDAO: Does the use of ExecComp maximum() interfere with constraints not being affected by design variables?

When running the optimization driver on a large model I receive:
DerivativesWarning:Constraints or objectives [('max_current.current_constraint.current_constraint', inds=[0]), ('max_current.continuous_current_constraint.continuous_current_constraint', inds=[0])] cannot be impacted by the design variables of the problem.
I read the answer to a similar question posed here.
The values do change as the design variables change, and the two constraints are satisfied during the course of optimization.
I had assumed this was due to those components' ExecComp using a maximum(), as this is the only place in my model where I use a maximum function; however, when setting up a simple problem with a maximum() function in a similar manner, I do not receive an error.
My model uses explicit components that are looped: there are connections in the bottom left of the N2 diagram, and NLBGS converges the whole model. I currently think the warning is due to the use of only explicit components with NLBGS instead of implicit components.
Thank you for any insight you can give in resolving this warning.
Below is a simple script using maximum() that does not report errors. (I was so sure that was it.) Once I create a minimal working example that reproduces the error the way my larger model does, I will upload it.
import openmdao.api as om

prob = om.Problem()
prob.driver = om.ScipyOptimizeDriver()
prob.driver.options['optimizer'] = 'SLSQP'
prob.driver.options['tol'] = 1e-6
prob.driver.options['maxiter'] = 80
prob.driver.options['disp'] = True

indeps = prob.model.add_subsystem('indeps', om.IndepVarComp())
indeps.add_output('x', val=2.0, units=None)
prob.model.promotes('indeps', outputs=['*'])

prob.model.add_subsystem('y_func_1',
                         om.ExecComp('y_func_1 = x'),
                         promotes_inputs=['x'],
                         promotes_outputs=['y_func_1'])
prob.model.add_subsystem('y_func_2',
                         om.ExecComp('y_func_2 = x**2'),
                         promotes_inputs=['x'],
                         promotes_outputs=['y_func_2'])
prob.model.add_subsystem('y_max',
                         om.ExecComp('y_max = maximum( y_func_1 , y_func_2 )'),
                         promotes_inputs=['y_func_1',
                                          'y_func_2'],
                         promotes_outputs=['y_max'])
prob.model.add_subsystem('y_check',
                         om.ExecComp('y_check = y_max - 1.1'),
                         promotes_inputs=['*'],
                         promotes_outputs=['*'])

prob.model.add_constraint('y_check', lower=0.0)
prob.model.add_design_var('x', lower=0.0, upper=2.0)
prob.model.add_objective('x')

prob.setup()
prob.run_driver()
print(prob.get_val('x'))
There is a problem with the maximum function in this context. Technically a maximum function is not differentiable; at least not when the index of which value is max is subject to change. If the maximum value is not subject to change, then it is differentiable... but you didn't need the max function anyway.
One correct, differentiable way to handle a max when doing gradient-based things is to use a KS function. OpenMDAO provides the KSComp which implements it. There are other kinds of aggregation functions (like the p-norm) that you could use as well.
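For reference, here is a minimal KS sketch patterned on the documented KSComp usage (the 'g' input, 'KS' output, and width option come from that documentation; the small component that stacks the two values into a vector is an illustrative assumption):
import numpy as np
import openmdao.api as om


class StackFuncs(om.ExplicitComponent):
    # Illustrative helper: stack y_func_1 = x and y_func_2 = x**2 into one vector.
    def setup(self):
        self.add_input('x', 2.0)
        self.add_output('g', np.zeros(2))
        self.declare_partials('g', 'x', method='fd')

    def compute(self, inputs, outputs):
        x = inputs['x'][0]
        outputs['g'][0] = x
        outputs['g'][1] = x ** 2


prob = om.Problem()
prob.model.add_subsystem('funcs', StackFuncs(), promotes_inputs=['x'])

# KSComp produces a smooth, differentiable, conservative estimate of max(g).
prob.model.add_subsystem('ks', om.KSComp(width=2))
prob.model.connect('funcs.g', 'ks.g')

prob.setup()
prob.set_val('x', 2.0)
prob.run_model()
print(prob.get_val('ks.KS'))  # close to (and slightly above) max(2.0, 4.0)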
However, even though maximum is not technically differentiable ... you can sort-of/kind-of get away with it. At least, numpy (which ExecComp uses) lets you apply complex-step differentiation to the maximum function and it seems to give a non-zero derivative. So while it's not technically correct, you can probably get away with leaving it in. At least, it's not likely to be the core of your problem.
You mention using NLBGS, and that you have components which are looped. Your test case is purely feed-forward though (its N2 diagram shows no feedback coupling). That is an important difference.
The problem here is with your derivatives, not with the maximum function. Since you have a nonlinear solver, you need to do something to get the derivatives right. In the example Sellar optimization, the model uses this line: prob.model.approx_totals(), which tells OpenMDAO to finite-difference across the whole model (including the nonlinear solver). This is simple and keeps the example compact, and it works regardless of whether your components define derivatives or not. It is, however, slow and suffers from numerical difficulties, so use it on "real" problems at your own risk.
If you don't include that (and your above example does not, so I assume your real problem does not either) then you're basically telling OpenMDAO that you want to use analytic derivatives (yay! they are so much more awesome). That means that you need to have a linear solver to match your nonlinear one. For most problems that you start out with, you can simply put a DirectSolver right at the top of the model and it will all work out. For more advanced models, you need a more complex linear solver structure... but that's a different question.
Give this a try:
prob.model.linear_solver = om.DirectSolver()
That should give you non-zero total derivatives regardless of whether you have coupling (loops) or not.
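In terms of a script like the one above, and assuming your real model puts NonlinearBlockGS on the coupled group, the change is just the following sketch:
# Sketch: pair the nonlinear solver that converges the coupled model with a
# linear solver so OpenMDAO can compute correct total derivatives across it.
prob.model.nonlinear_solver = om.NonlinearBlockGS()   # as in your real, coupled model
prob.model.linear_solver = om.DirectSolver()
prob.setup()
prob.run_driver()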

Performing gradient-based optimization with intermediate discrete variables in OpenMDAO

We have a user who wants to solve an optimization problem that has intermediate discrete variables using a gradient-based method. They're running into the error shown below. I know we could restructure the problem to not use discrete variables, but could I also treat this error as a warning, given that the discrete variable doesn't change? Or is there a fundamental reason the derivatives wouldn't propagate correctly? To be clear, we're using approx_totals at the model level.
Here is a small test case exhibiting this:
import openmdao.api as om
import numpy as np

class DiscreteComp(om.ExplicitComponent):
    def setup(self):
        self.add_input('a', 2.)
        self.add_discrete_output('b', val=0)

    def compute(self, inputs, outputs, discrete_inputs, discrete_outputs):
        discrete_outputs['b'] = 2 * inputs['a']

class DummyComp(om.ExplicitComponent):
    def setup(self):
        self.add_input('a', 2.)
        self.add_discrete_input('b', 1)
        self.add_output('c')

    def compute(self, inputs, outputs, discrete_inputs, discrete_outputs):
        b = discrete_inputs['b']
        outputs['c'] = inputs['a']**2 * b

prob = om.Problem()
prob.model.add_subsystem('discrete_comp', DiscreteComp(), promotes=['*'])
prob.model.add_subsystem('dummy_comp', DummyComp(), promotes=['*'])

prob.driver = om.pyOptSparseDriver()
prob.driver.options['optimizer'] = 'SLSQP'

prob.model.add_design_var('a', lower=-10, upper=10)
prob.model.add_objective('c')
prob.model.approx_totals()

prob.setup()

# run the optimization
prob.run_driver()
which gives this output:
Traceback (most recent call last):
File "tmp.py", line 39, in <module>
prob.run_driver()
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/core/problem.py", line 685, in run_driver
return self.driver.run()
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/drivers/pyoptsparse_driver.py", line 480, in run
raise self._exc_info
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/drivers/pyoptsparse_driver.py", line 643, in _gradfunc
sens_dict = self._compute_totals(of=self._quantities,
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/core/driver.py", line 892, in _compute_totals
total_jac = _TotalJacInfo(problem, of, wrt, use_abs_names,
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/core/total_jac.py", line 209, in __init__
raise RuntimeError("Total derivative %s '%s' depends upon "
RuntimeError: Total derivative with respect to '_auto_ivc.v0' depends upon discrete output variables ['discrete_comp.b'].
Given your example, there is a reason that OpenMDAO raises this error. Discrete outputs are not passed through the same data system as continuous ones. Only the continuous outputs have access to the derivative system. While it is possible to have this intermediate discrete variable here and take a finite difference over the whole thing, you only get away with it because you really have a continuous calculation there.
If the output had been truly discrete, then it's not mathematically valid to take a derivative or make a finite-difference approximation across the compute. This is why OpenMDAO raises the error.
There are admittedly some corner cases where you could argue that OpenMDAO should let you do this. For example if your output c was computed as:
outputs['c'] = floor(inputs['a'])
and you knew that you would limit the value of a to between 1 and 2, then c would always be exactly 1. Hence you could say it's valid to differentiate across this, since the discrete variable never changes in value, and the function is therefore differentiable within these bounds.
If you wanted to use OpenMDAO to finite difference this you have two options:
Even though it's a discrete value, list it as a continuous one anyway. This works around OpenMDAO's validity check, but it's totally up to you to make sure the values really never change within the ranges you are taking derivatives around. If you decide to implement analytic derivatives, simply don't declare one for the discrete output with respect to anything.
Move the discrete calculations into the setup phase and pass the information around as options. This forces you to absolutely respect the "never changes" paradigm, since setup only happens once. It does require some redesign (a small sketch of this pattern follows below).
If you don't like OpenMDAO's nanny-ing you can always comment out the error :) You obviously do this at your own risk ... but that's one of the values of open source software!
My personal recommendation is option #2. It is really the best overall design, and it forces you to be sure the discrete calculations never change during an optimization.
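A minimal sketch of option #2, reusing the DummyComp from the example (the option name and the value passed in are illustrative; in a real model b would be computed once before the problem is built):
import openmdao.api as om


class DummyComp(om.ExplicitComponent):
    def initialize(self):
        # the formerly-discrete quantity is now a component option, fixed at setup time
        self.options.declare('b', types=int, default=1)

    def setup(self):
        self.add_input('a', 2.)
        self.add_output('c', 0.)
        self.declare_partials('c', 'a')

    def compute(self, inputs, outputs):
        outputs['c'] = inputs['a'] ** 2 * self.options['b']

    def compute_partials(self, inputs, J):
        J['c', 'a'] = 2.0 * inputs['a'] * self.options['b']


prob = om.Problem()
b = 4  # computed once, outside the model, instead of inside a component's compute()
prob.model.add_subsystem('dummy_comp', DummyComp(b=b), promotes=['*'])
prob.model.add_design_var('a', lower=-10, upper=10)
prob.model.add_objective('c')
prob.setup()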

Approximating the whole problem with Mixed Analytical strategy

I have a problem where I have implemented analytical derivatives for some components and I'm using complex step for the rest. There is a cyclic dependency between them, so I also use a solver to converge them. It converges when I use NonlinearBlockGS, but when I use NewtonSolver in combination with a linear solver, the optimization fails (iteration limit exceeded), even with a high iteration count. However, I found that it converges easily and works perfectly when I use prob.model.approx_totals(). I read that approx_totals uses FD or CS to find the model gradients. So I have two questions.
In general, will I lose the benefits of the mixed-analytical approach when I use approx_totals()? Is there a way to find the derivatives of the whole model (or group) with a mixed analytical strategy? (In my case the ExplicitComponents that are coupled use complex step, but I'm just curious about this.)
In general (not just in this scenario), will OpenMDAO automatically detect the mixed strategy, or should I specify it somehow?
I would also be grateful if you could point me to some examples where mixed derivatives are used; I didn't have any luck finding them myself.
Edit: Adding an example. I am not able to reproduce the issue in sample code, and I don't want to waste your time with my full code (there are more than 30 ExplicitComponents and 7 Groups), so I made a simple structure below to explain it better. There are 7 components, A to G; only F and G don't have analytical derivatives and use FD instead.
import openmdao.api as om
import numpy as np

class ComponentA_withDerivatives(om.ExplicitComponent):
    def setup(self):
        # setup inputs and outputs
    def setup_partials(self):
        # partial declaration
    def compute(self, inputs, outputs):
        ...
    def compute_partials(self, inputs, J):
        # partial definition

class ComponentB_withDerivatives(om.ExplicitComponent):
    .....

class ComponentC_withDerivatives(om.ExplicitComponent):
    ......

class ComponentD_withDerivatives(om.ExplicitComponent):
    ......

class ComponentE_withDerivatives(om.ExplicitComponent):
    ......

class ComponentF(om.ExplicitComponent):
    def setup(self):
        # setup inputs and outputs
        self.declare_partials(of='*', wrt='*', method='fd')
    def compute(self, inputs, outputs):
        # computation

class ComponentG(om.ExplicitComponent):
    def setup(self):
        # setup inputs and outputs
        self.declare_partials(of='*', wrt='*', method='fd')
    def compute(self, inputs, outputs):
        # computation

class GroupAB(om.Group):
    def setup(self):
        self.add_subsystem('A', ComponentA_withDerivatives(), promotes_inputs=['x','y'], promotes_outputs=['z'])
        self.add_subsystem('B', ComponentB_withDerivatives(), promotes_inputs=['x','y','w','u'], promotes_outputs=['k'])

class GroupCD(om.Group):
    def setup(self):
        self.add_subsystem('C', ComponentC_withDerivatives(), .....)
        self.add_subsystem('D', ComponentD_withDerivatives(), ...)

class Final(om.Group):
    def setup(self):
        cycle1 = self.add_subsystem('cycle1', om.Group(), promotes=['*'])
        cycle1.add_subsystem('GroupAB', GroupAB())
        cycle1.add_subsystem('ComponentF', ComponentF())
        cycle1.linear_solver = om.DirectSolver()
        cycle1.nonlinear_solver = om.NewtonSolver(solve_subsystems=True)

        cycle2 = self.add_subsystem('cycle2', om.Group(), promotes=['*'])
        cycle2.add_subsystem('GroupCD', GroupCD())
        cycle2.add_subsystem('ComponentE_withDerivatives', ComponentE_withDerivatives())
        cycle2.linear_solver = om.DirectSolver()
        cycle2.nonlinear_solver = om.NewtonSolver(solve_subsystems=True)

        self.add_subsystem('ComponentG', ComponentG(), promotes_inputs=['a1','a2','a3'], promotes_outputs=['b1'])

prob = om.Problem()
prob.model = Final()

prob.driver = om.pyOptSparseDriver()
prob.driver.options['optimizer'] = 'SNOPT'
prob.driver.options['print_results'] = True

## Design Variables
## Constraints
## Objectives

# Setup
prob.setup()
##prob.model.approx_totals(method='fd')
prob.run_model()
prob.run_driver()
This doesn't work: cycle1 doesn't converge. The code works when I completely remove cycle1, use NonlinearBlockGS instead of Newton, or uncomment prob.model.approx_totals(method='fd'). (There is no problem with cycle2; it works with Newton.)
So if I don't use approx_totals(), I am assuming OpenMDAO uses a mixed strategy, or should I specify it manually somehow? And when I do use approx_totals(), will I lose the benefits of the analytical derivatives that I do have?
The code example you provided isn't runnable, so I'll have to make a few guesses. You call both run_model() and run_driver(). You bothered to include an optimizer in your sample code though, and you've shown approx_totals being called at the top of the model hierarchy.
So when you say it does not work, I will assume you mean that the optimizer doesn't converge.
You have understood the behavior of approx_totals correctly. When you set that at the top of your model, then OpenMDAO will FD the relevant variables from the group level. In this case, that means you will also be FD-ing across the solver itself. You say that this seems to work, but the mixed analytic approach does not.
In general, Will I lose the benefits from the mixed-analytical approach when I use approx_totals()?
Yes. You are no longer using a mixed approach. You are just FD-ing across the model monolithically.
Is there a way to find the derivatives of whole model (or group) with mixed analytical strategy ?
OpenMDAO is computing total derivatives with a mixed strategy when you don't use approx_totals. The issue is that for your model, it seems not to be working.
In general (not in this scenario), will OpenMDAO automatically detect the mixed strategy?
It will "detect" it (it doesn't actually detect anything, but the underlying algorithms will use a mixed strategy unless you tell them not to with approx_totals). Again, the issue is not that a mixed strategy is not being used, but that it is not working.
So why isn't the mixed strategy working?
I can only guess, since I can't run the code... so YMMV.
You mention that you are using complex step for the partials of your explicit components. Complex step is a much more accurate approximation scheme than FD, but it is not without its own flaws. Not every computation is complex-safe. Some can be re-written to be complex-safe, others cannot.
By "complex-safe" I mean that the computation correctly handles the complex part so that it gives correct derivatives.
Two commonly used methods that are not complex-safe are np.linalg.norm and np.abs. Both will happily accept complex numbers and give you an answer, but it is not the correct answer when you need derivatives.
Because of this, OpenMDAO ships with a set of custom functions that are cs-safe --- custom norm and abs are provided.
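For example (assuming the helpers live in openmdao.utils.cs_safe, as in recent versions; check your installed copy):
import numpy as np
# complex-safe drop-in replacements for abs() and np.linalg.norm (assumed import path)
from openmdao.utils.cs_safe import abs as cs_abs, norm as cs_norm

x = np.array([-4.0 + 1e-30j, 3.0 + 0j])
print(np.abs(x))    # returns the real modulus: the tiny imaginary step is lost
print(cs_abs(x))    # keeps an imaginary part consistent with the derivative of |x|
print(cs_norm(x))   # complex-safe replacement for np.linalg.norm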
What typically happens with non cs-safe methods is that the complex-part somehow gets dropped off and you get 0 partial derivatives. Wrong partials, wrong totals.
To check this, make sure you call check_partials on your components that are being complex-stepped, using a finite-difference check. You'll probably find some discrepancies.
The fixes available to you are:
Switch those components to use FD partials. This is less accurate, but will probably work.
Correct whatever problems in your compute are making your code non-cs-safe. Use OpenMDAO's custom functions if that's the problem, or possibly you need to be more careful about how you allocate and use numpy arrays in your compute (if you're allocating your own arrays, then you need to make sure they are complex too!).
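To run the check_partials comparison mentioned above, a minimal invocation looks roughly like this (force_alloc_complex=True is needed so the complex-stepped partials can actually be evaluated during the check):
# compare the component partials (including the complex-stepped ones) against FD
prob.setup(force_alloc_complex=True)
prob.run_model()
data = prob.check_partials(method='fd', compact_print=True)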

How to use an external code component within the optimization framework

Hi, I am trying to use the paraboloid external code component to get the same results as in the paraboloid optimization example (OpenMDAO v2.2.0).
So, in my mind, the independent variables x, y should be updated, which changes the input file of the external component, in order to minimize the output f.
Not that I have gotten this working, but I basically added the external component's output as the objective and the independent variables as the design variables, etc. (see the code below).
More importantly, I have trouble conceptually understanding how the optimizer would know the derivatives of such external codes.
I tried 'COBYLA', thinking that could be a way to go gradient-free, but there seems to be a bug in the iprint statement, since I cannot run the example paraboloid optimization either.
I think I have a similar problem with surrogates. For example, I use a MetaModelUnStructured component to build my surrogate, which performs well if I ask for a known value, but I do not see how to couple this component's output to be the objective of the optimizer. I think I am doing the right thing by setting it as the model objective, but I'm not sure...
The answer might be that I am completely off on the optimization logic; if so, please refer me to papers on the algorithms behind it.
Thanks in advance
from openmdao.api import Problem, Group, IndepVarComp
from openmdao.api import ScipyOptimizeDriver
from openmdao.components.tests.test_external_code import ParaboloidExternalCode
top = Problem()
top.model = model = Group()
# create and connect inputs
model.add_subsystem('p1', IndepVarComp('x', 3.0))
model.add_subsystem('p2', IndepVarComp('y', -4.0))
model.add_subsystem('p', ParaboloidExternalCode())
model.connect('p1.x', 'p.x')
model.connect('p2.y', 'p.y')
top.driver = ScipyOptimizeDriver()
top.driver.options['optimizer'] = 'SLSQP'
top.model.add_design_var('p1.x', lower=-50, upper=50)
top.model.add_design_var('p2.y', lower=-50, upper=50)
top.model.add_objective('p.f_xy')
top.driver.options['tol'] = 1e-9
top.driver.options['disp'] = True
top.setup()
top.run_driver()
# minimum value
# location of the minimum
print(top['p1.x'])
print(top['p2.y'])
So, I think the main thing you are asking is how to provide derivatives for external codes. I think there are really two options for this.
Option 1: Finite-difference across the external component.
The test example doesn't show how to do this, which is unfortunate, but you do this the same way that you would declare derivatives fd for a pure python component, namely by adding this line to the external component's setup method:
self.declare_partials(of='*', wrt='*', method='fd')
Option 2: Provide another external method to calculate the derivatives, and wrap it in the compute_partials method.
We do this with CFD codes that provide an adjoint solution. You could possibly also use automatic differentiation on the external source code to produce a callable function in this way. However, I think option 1 is what you are asking for here.
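For option 1, a sketch of what that might look like without editing the test component itself (the subclass name here is hypothetical; only the declare_partials call matters):
from openmdao.components.tests.test_external_code import ParaboloidExternalCode


class ParaboloidExternalCodeFD(ParaboloidExternalCode):
    # hypothetical subclass: finite-difference across the external code for all partials
    def setup(self):
        super(ParaboloidExternalCodeFD, self).setup()
        self.declare_partials(of='*', wrt='*', method='fd')


# then use it in place of the original component:
# model.add_subsystem('p', ParaboloidExternalCodeFD())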

numerical differentiation with Scipy

I was trying to learn Scipy, using it for mixed integrations and differentiations, but at the very initial step I encountered the following problems.
For numerical differentiation, it seems that the only SciPy function that works on callable functions is scipy.derivative(), if I'm right!? However, I couldn't get it to work:
1st) When I do not want to specify the point at which the differentiation is to be taken, e.g. when the differentiation is under an integral, so that it is the integral that should assign numerical values to its integrand's variable, not me. As a simple example I tried this code in Sage's notebook:
import scipy as sp
from scipy import integrate, derivative
var('y')
f=lambda x: 10^10*sin(x)
g=lambda x,y: f(x+y^2)
I=integrate.quad( sp.derivative(f(y),y, dx=0.00001, n=1, order=7) , 0, pi)[0]; show(I)
show( integral(diff(f(y),y),y,0,1).n() )
It also gives the warning "Warning: The occurrence of roundoff error is detected, which prevents the requested tolerance from being achieved. The error may be underestimated." I don't know what this warning means, as it persists even when I increase "dx" and decrease the "order".
2nd) When I want to find the derivative of a multivariable function like g(x,y) in the above example: something like sp.derivative(g(x,y), (x,0.5), dx=0.01, n=1, order=3) gives an error, as is easily expected.
Looking forward to hearing from you about how to resolve the above cited problems with numerical differentiation.
Best Regards
There are some strange problems with your code that suggest you need to brush up on some python! I don't know how you even made these definitions in python since they are not legal syntax.
First, I think you are using an older version of scipy. In recent versions (at least from 0.12+) you need from scipy.misc import derivative. derivative is not in the scipy global namespace.
Second, var is not defined, although it is not necessary anyway (I think you meant to import sympy first and use sympy.var('y')). sin has also not been imported from math (or numpy, if you prefer). show is not a valid function in sympy or scipy.
^ is not the power operator in python. You meant **
You seem to be mixing up the idea of symbolic and numeric calculus operations here. scipy won't numerically differentiate an expression involving a symbolic object -- the second argument to derivative is supposed to be the point at which you wish to take the derivative (i.e. a number). As you say you are trying to do numeric differentiation, I'll resolve the issue for that purpose.
from scipy import integrate
from scipy.misc import derivative
from math import *
f = lambda x: 10**10*sin(x)
df = lambda x: derivative(f, x, dx=0.00001, n=1, order=7)
I = integrate.quad( df, 0, pi)[0]
Now, this last expression generates the warning you mentioned, and the value returned is not very close to zero at -0.0731642869874073 in absolute terms, although that's not bad relative to the scale of f. You have to appreciate the issues of roundoff error in finite differencing. Your function f varies on your interval between 0 and 10^10! It probably seems paradoxical, but making the dx value for differentiation too small can actually magnify roundoff error and cause numerical instability. See the second graph here ("Example showing the difficulty of choosing h due to both rounding error and formula error") for an explanation: http://en.wikipedia.org/wiki/Numerical_differentiation
In fact, in this case, you need to increase it, say to 0.001: df = lambda x: derivative(f, x, dx=0.001, n=1, order=7)
Then, you can integrate safely, with no terrible roundoff.
I=integrate.quad( df, 0, pi)[0]
I don't recommend throwing away the second return value from quad. It's an important verification of what happened, as it is "an estimate of the absolute error in the result". In this case, I == 0.0012846582250212652 and the abs error is ~ 0.00022, which is not bad (the interval that implies still does not include zero). Maybe some more fiddling with the dx and absolute tolerances for quad will get you an even better solution, but hopefully you get the idea.
For your second problem, you simply need to create a proper scalar function (call it gx) that represents g(x,y) along y=0.5 (this is called Currying in computer science).
g = lambda x, y: f(x+y**2)
gx = lambda x: g(x, 0.5)
derivative(gx, 0.2, dx=0.01, n=1, order=3)
gives you a value of the derivative at x=0.2. Naturally, the value is huge given the scale of f. You can integrate using quad like I showed you above.
If you want to be able to differentiate g itself, you need a different numerical differentiation function. I don't think scipy or numpy support this directly, although you could hack together a central-difference calculation by making a fine 2D mesh (spacing dx) and using numpy.gradient. There are probably other library solutions that I'm not aware of, but I know my PyDSTool software contains a function diff that will do that (if you rewrite g to take one array argument instead). It uses Ridder's method and is inspired by the Numerical Recipes pseudocode.
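For completeness, a rough sketch of that mesh-based approach with numpy.gradient (the grid extent and spacing are arbitrary illustrative choices):
import numpy as np

f = lambda x: 10**10 * np.sin(x)
g = lambda x, y: f(x + y**2)

dx = 0.001
xs = np.arange(0.0, 1.0, dx)
ys = np.arange(0.0, 1.0, dx)
X, Y = np.meshgrid(xs, ys, indexing='ij')   # axis 0 is x, axis 1 is y
G = g(X, Y)

dGdx, dGdy = np.gradient(G, dx, dx)         # central differences on the mesh
print(dGdx[200, 500], dGdy[200, 500])       # approximate partials at x=0.2, y=0.5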
