I already have my component with the function I want to optimise. However, OpenMDAO Alpha 1.0 does not contain (to my knowledge) a wrapper for a genetic optimiser. I have written my own, and would now like to make it a Driver. I am a bit lost here; can I ask for any guidance?
Thank you!
You're correct that OpenMDAO doesn't have a genetic optimizer yet. You could use NSGA-II from the pyOpt library, but since you have one you want to use, writing your own driver should be fairly straightforward. The easiest example to follow would be our wrapper for scipy's optimizers. Your wrapper would have to look something like this:
from openmdao.core.driver import Driver


class GeneticOptimizer(Driver):

    def __init__(self):
        super(GeneticOptimizer, self).__init__()
        # some stuff to set up your genetic optimizer here

    def run(self, problem):
        """Function called to kick off the optimization.

        Args
        ----
        problem : `Problem`
            Our parent `Problem`.
        """
        # NOTE: you'll use these functions to build your optimizer

        # to execute the model:
        problem.root.solve_nonlinear()

        # to set a value of a design variable:
        self.set_param(var_name, value)
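If it helps, here is a self-contained sketch of the kind of generational loop you would embed in run(). All names here are illustrative rather than OpenMDAO API; in the driver, evaluate() would set each design variable via self.set_param(...), call problem.root.solve_nonlinear(), and read the objective back out of the model.

import numpy as np

def evaluate(x):
    # stand-in objective; replace with set_param(...) calls,
    # problem.root.solve_nonlinear(), and a read of the objective
    return np.sum(x ** 2)

np.random.seed(0)
pop = np.random.uniform(-5.0, 5.0, size=(20, 3))  # 20 individuals, 3 design vars
for generation in range(50):
    fitness = np.array([evaluate(ind) for ind in pop])
    parents = pop[np.argsort(fitness)[:10]]        # keep the 10 fittest
    children = parents + np.random.normal(0.0, 0.1, parents.shape)  # mutate
    pop = np.vstack([parents, children])

best = pop[np.argmin([evaluate(ind) for ind in pop])]
print(best)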
I have a problem where I have implemented analytical derivatives for some components and use complex step for the rest. There is a cyclic dependency between them, so I also use a solver to converge them. It converges when I use NonlinearBlockGS, but when I use NewtonSolver in combination with a linear solver, the optimization fails ("Iteration limit exceeded"), even with a high iteration count. However, I found that it converges easily and works perfectly when I use prob.model.approx_totals(). I read that approx_totals uses fd or cs to find the model gradients. So I have two questions.

In general, will I lose the benefits of the mixed-analytical approach when I use approx_totals()? Is there a way to find the derivatives of the whole model (or a group) with a mixed analytical strategy? (In my case the ExplicitComponents that are coupled use complex step, but I'm just curious about this.)

In general (not just in this scenario), will OpenMDAO automatically detect the mixed strategy, or should I specify it somehow?

I would also be grateful if you could point me to some examples where mixed derivatives are used. I didn't have any luck finding them myself.
Edit: Adding an example. I am not able to reproduce the issue in sample code, and I don't want to waste your time with my actual code (there are more than 30 ExplicitComponents and 7 Groups), so I made the simple structure below to explain it better. There are 7 components, A to G; only F and G lack analytical derivatives and use FD.
import openmdao.api as om
import numpy as np


class ComponentA_withDerivatives(om.ExplicitComponent):

    def setup(self):
        # setup inputs and outputs
        ...

    def setup_partials(self):
        # partial declaration
        ...

    def compute(self, inputs, outputs):
        ...

    def compute_partials(self, inputs, J):
        # partial definition
        ...


class ComponentB_withDerivatives(om.ExplicitComponent):
    ...


class ComponentC_withDerivatives(om.ExplicitComponent):
    ...


class ComponentD_withDerivatives(om.ExplicitComponent):
    ...


class ComponentE_withDerivatives(om.ExplicitComponent):
    ...


class ComponentF(om.ExplicitComponent):

    def setup(self):
        # setup inputs and outputs
        self.declare_partials(of='*', wrt='*', method='fd')

    def compute(self, inputs, outputs):
        # computation
        ...


class ComponentG(om.ExplicitComponent):

    def setup(self):
        # setup inputs and outputs
        self.declare_partials(of='*', wrt='*', method='fd')

    def compute(self, inputs, outputs):
        # computation
        ...


class GroupAB(om.Group):

    def setup(self):
        self.add_subsystem('A', ComponentA_withDerivatives(),
                           promotes_inputs=['x', 'y'], promotes_outputs=['z'])
        self.add_subsystem('B', ComponentB_withDerivatives(),
                           promotes_inputs=['x', 'y', 'w', 'u'], promotes_outputs=['k'])


class GroupCD(om.Group):

    def setup(self):
        self.add_subsystem('C', ComponentC_withDerivatives(), ...)
        self.add_subsystem('D', ComponentD_withDerivatives(), ...)


class Final(om.Group):

    def setup(self):
        cycle1 = self.add_subsystem('cycle1', om.Group(), promotes=['*'])
        cycle1.add_subsystem('GroupAB', GroupAB())
        cycle1.add_subsystem('ComponentF', ComponentF())
        cycle1.linear_solver = om.DirectSolver()
        cycle1.nonlinear_solver = om.NewtonSolver(solve_subsystems=True)

        cycle2 = self.add_subsystem('cycle2', om.Group(), promotes=['*'])
        cycle2.add_subsystem('GroupCD', GroupCD())
        cycle2.add_subsystem('ComponentE_withDerivatives', ComponentE_withDerivatives())
        cycle2.linear_solver = om.DirectSolver()
        cycle2.nonlinear_solver = om.NewtonSolver(solve_subsystems=True)

        self.add_subsystem('ComponentG', ComponentG(),
                           promotes_inputs=['a1', 'a2', 'a3'], promotes_outputs=['b1'])


prob = om.Problem()
prob.model = Final()

prob.driver = om.pyOptSparseDriver()
prob.driver.options['optimizer'] = 'SNOPT'
prob.driver.options['print_results'] = True

## Design Variables
## Constraints
## Objectives

# Setup
prob.setup()
## prob.model.approx_totals(method='fd')
prob.run_model()
prob.run_driver()
This doesn't work: cycle1 doesn't converge. The code works when I completely remove cycle1, when I use NonlinearBlockGS instead of Newton, or when I uncomment prob.model.approx_totals(method='fd'). (There is no problem with cycle2; it works with Newton.)

So if I don't use approx_totals(), I assume OpenMDAO uses a mixed strategy, or should I specify it manually somehow? And when I do use approx_totals(), will I lose the benefits of the analytical derivatives that I do have?
The code example you provided isn't runnable, so I'll have to make a few guesses. You call both run_model() and run_driver(), you bothered to include an optimizer in your sample code, and you've shown approx_totals being called at the top of the model hierarchy.

So when you say it does not work, I will assume you mean that the optimizer doesn't converge.

You have understood the behavior of approx_totals correctly. When you set that at the top of your model, OpenMDAO will FD the relevant variables at the group level. In this case, that means you will also be FD-ing across the solvers themselves. You say that this seems to work, but the mixed analytic approach does not.
In general, will I lose the benefits of the mixed-analytical approach when I use approx_totals()?

Yes. You are no longer using a mixed approach; you are just FD-ing across the model monolithically.

Is there a way to find the derivatives of the whole model (or a group) with a mixed analytical strategy?

OpenMDAO is already computing total derivatives with a mixed strategy when you don't use approx_totals. The issue is that, for your model, that seems not to be working.

In general (not just in this scenario), will OpenMDAO automatically detect the mixed strategy?

It will "detect" it (it doesn't actually detect anything; the underlying algorithms simply use a mixed strategy unless you tell them not to with approx_totals). Again, the issue is not that a mixed strategy isn't being used, but that it is not working.
So why isn't the mixed strategy working?
I can only guess, since I can't run the code... so YMMV.
You mention that you are using complex step for the partials of your explicit components. Complex step is a much more accurate approximation scheme than FD, but it is not without its own pitfalls: not every computation is complex-safe. Some can be rewritten to be complex-safe; others cannot.

By "complex-safe" I mean that the computation correctly propagates the complex part so that it yields accurate derivatives.

Two commonly used methods that are not complex-safe are np.linalg.norm and np.abs. Both will happily accept complex numbers and give you an answer, but it is not the correct answer when you need derivatives.

Because of this, OpenMDAO ships with a set of custom functions that are cs-safe; custom norm and abs are provided.

What typically happens with non-cs-safe methods is that the complex part gets dropped somewhere along the way and you end up with 0 partial derivatives. Wrong partials, wrong totals.

To check this, make sure you call check_partials on the components that are being complex-stepped, using a finite-difference check. You'll probably find some discrepancies.
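Concretely, with the prob from your sample (method='fd' makes the check difference against finite differences instead of complex step):

# flags components whose complex-stepped partials disagree with an FD check
data = prob.check_partials(method='fd', compact_print=True)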
The fixes available to you are:

1. Switch those components to FD partials. Less accurate, but it will probably work.
2. Correct whatever is making your compute non-complex-safe. Use OpenMDAO's cs-safe functions if that is the problem, or be more careful about how you allocate and use numpy arrays in your compute (if you allocate your own arrays, you need to make sure they are complex too!). A sketch of both fixes is below.
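Here is a minimal sketch of both pitfalls fixed inside a hypothetical compute (the variable names are made up): it uses OpenMDAO's cs-safe abs instead of np.abs, and it inherits the dtype of the incoming array so locally allocated storage stays complex during complex step.

import numpy as np
from openmdao.utils.cs_safe import abs as cs_abs  # complex-safe replacement for np.abs

def compute(self, inputs, outputs):
    x = inputs['x']
    # inherit x's dtype so this array is complex when OpenMDAO complex-steps it
    work = np.zeros(x.shape, dtype=x.dtype)
    work[:] = cs_abs(x - 2.0)  # np.abs here would silently drop the complex part
    outputs['y'] = 3.0 * work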
I am creating a program that optimizes a set of coupled subcomponents to minimize their total mass. Currently each component is a group that has a promoted output for its mass, and another group exists at the top level that takes each of these masses as inputs, computes the sum, and feeds that sum to the optimizer as the objective.

The program is designed to be operated by a user, where the type and number of subcomponents is set at runtime. This proves problematic for my statically declared mass-summing group, which would need to change its inputs depending on which components are added at runtime.

I was therefore wondering: is there a way to declare a 'partial objective', where each of these partial pieces is summed into the final objective processed by the ScipyOptimizeDriver? The partial objectives, design variables, and constraints could simply be declared in each subsystem, so that as soon as a subsystem is added to the model they would be ready to fit into the larger optimization.

Another way could be some sort of summing behavior in a group where the inputs to be summed are created exclusively via a glob pattern, something along the lines of

self.add_subsystem('sum', Summer(inputs='mass:*'))

Is there any way to achieve either of these types of functionality in OpenMDAO 3.1.1?
In OpenMDAO V3.1, there is a configure method that will let you accomplish what you want, subject to a few caveats. The first caveat is that in V3.1 you can inspect the I/O of components from within a group's configure, but you cannot inspect the I/O of child groups. This is something we are working to remedy, but as of V3.1 the restriction is present.

Nonetheless, here is some code that accomplishes what I think you were seeking. It's not super clean, but it does achieve the kind of reactive setup you were going for.
import openmdao.api as om


class Summer(om.ExplicitComponent):

    def setup(self):
        # NOTE: inputs will be added via the configure method of the parent group
        self.add_output('total_mass')
        self.declare_partials('total_mass', wrt='*', val=1)

    def compute(self, inputs, outputs):
        outputs['total_mass'] = 0
        for inp_name in inputs:
            outputs['total_mass'] += inputs[inp_name]


class TotalMass(om.Group):

    def setup(self):
        # only add the summing comp; the others will be added by users
        self.add_subsystem('sum', Summer())

    def configure(self):
        sum_comp = self.sum

        # NOTE: we need to access some private attributes of the group here,
        # so this is a little fragile, but it works as of OM V3.1
        for subsys in self._subsystems_myproc:
            s_name = subsys.name
            if s_name == 'sum':
                continue

            i_name = f'{s_name}_mass'
            sum_comp.add_input(i_name)
            self.connect(f'{s_name}.mass', f'sum.{i_name}')


if __name__ == "__main__":
    p = om.Problem()

    tm = p.model.add_subsystem('tm', TotalMass())
    tm.add_subsystem('part_1', om.ExecComp('mass=3+x'))
    tm.add_subsystem('part_2', om.ExecComp('mass=5+x'))

    p.setup()
    p.run_model()

    p.model.list_outputs()
We're planning changes that will make more model introspection possible at setup/configure time. Until those changes are implemented, the typical way of achieving this is similar to what you've implemented. Without introspection, you need to give Summer the explicit names of the inputs it should expect (not wildcard-based).

You can give your systems that compute mass some attribute, for instance mass_output_name.

Then you can iterate through all such systems:
mass_output_systems = [sys_a, sys_b, sys_c]
mass_names = [sys.mass_output_name for sys in mass_output_systems]
And then feed these to your summing subsystem:
self.add_subsystem('sum', Summer(inputs=mass_names))
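Summer is not defined above, so here is a minimal sketch of what it might look like, assuming it takes its input names as a component option:

import openmdao.api as om

class Summer(om.ExplicitComponent):
    """Hypothetical summing component whose input names are passed as an option."""

    def initialize(self):
        self.options.declare('inputs', types=list)

    def setup(self):
        for name in self.options['inputs']:
            self.add_input(name)
        self.add_output('total_mass')
        self.declare_partials('total_mass', wrt='*', val=1.0)

    def compute(self, inputs, outputs):
        outputs['total_mass'] = sum(inputs[n] for n in inputs)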
I have a group with coupled disciplines that is nested in a model where all other components are uncoupled. I have assigned a Newton nonlinear solver and a direct linear solver to the coupled group.

When I run the model with the default "RunOnce" solver everything is OK, but as soon as I try to run an optimization I get the following error, raised from linear_block_gs.py:
File "...\openmdao\core\group.py", line 1790, in _apply_linear scope_out, scope_in)
File "...\openmdao\core\explicitcomponent.py", line 339, in _apply_linear
self.compute_jacvec_product(*args)
File "...\Thermal_Cycle.py", line 51, in compute_jacvec_product
d_inputs['T'] = slope * deff_dT / alp_sc
File "...\openmdao\vectors\vector.py", line 363, in setitem
raise KeyError(msg.format(name)) KeyError: 'Variable name "T" not found.'
Below is the N2 diagram of the model. The variable "T" mentioned in the error comes from the implicit "temp" component and is fed back into the "sc" component (file Thermal_Cycle.py in the error message) as an input.
[N2 diagram]
The error disappears when I assign a DirectSolver at the top of the whole model. My impression was that "RunOnce" would work as long as the groups with implicit components have appropriate solvers applied to them, as suggested here and as is done in my case. Why does it not work when computing total derivatives of the model, i.e., why can compute_jacvec_product not find the coupled variable "T"?

The reason I want to use the "RunOnce" solver is that optimization with a DirectSolver on top becomes very slow as my variable vector "T" grows. I suspect it should be much faster with the linear "RunOnce"?
I think this example of the compute_jacvec_product method might be helpful.

The problem is that, depending on the solver configuration or the structure of the model, OpenMDAO may only need some of the partials that you provide in this method. For example, your matrix-free component might have two inputs, but if only one is connected, OpenMDAO does not need the derivative with respect to the unconnected input and, in fact, does not allocate space for it in the d_inputs or d_outputs vectors.

So, to fix the problem, you just need to put an if statement before assigning the value, just like in the example; a minimal sketch follows.
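Here is that guard applied to the line from your traceback (slope, deff_dT, and alp_sc stand for whatever your component already computes; only the if checks are new):

def compute_jacvec_product(self, inputs, d_inputs, d_outputs, mode):
    if mode == 'rev':
        # 'T' is only allocated in d_inputs for solves that actually need it
        if 'T' in d_inputs:
            d_inputs['T'] = slope * deff_dT / alp_sc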
Based on the N2, I agree with your strategy of putting the direct solver around only the coupling. That should work fine; however, it looks like you're implementing a linear operator in your component, based on:

File "...\Thermal_Cycle.py", line 51, in compute_jacvec_product
    d_inputs['T'] = slope * deff_dT / alp_sc
You shouldn't use a direct solver with matrix-free partials. The direct solver computes an inverse, which requires the full assembly of the matrix. The only reason it works at all is that OpenMDAO has some fallback functionality to assemble the jacobian manually by passing columns of the identity matrix through the compute_jacvec_product method.

This fallback mechanism is there to make things work, but it's very slow (you end up calling compute_jacvec_product A LOT).
The error you're getting, and the reason it works when you put the direct solver higher up in the model, is probably due to a lack of the necessary if conditions in your compute_jacvec_product implementation.

See the docs on ExplicitComponent for some examples, but the key insight is that not every variable will be present in every jacvec product (it depends on what kind of solve is being done, i.e., one for Newton vs. one for total derivatives of the whole model).

So those if-checks are needed to test whether variables are relevant. This is done because, for expensive codes (e.g., CFD), some of these operations are quite costly and you don't want to perform them unless you need to.

Are your components so big that you can't use the compute_partials function? Have you tried specifying the sparsity of your jacobian? Usually the matrix-free partial-derivative methods are not needed until you start working with really big PDE solvers with 1e6 or more implicit output variables.
Without seeing some code, it's hard to comment in more detail, but in summary:

You shouldn't use compute_jacvec_product in combination with a direct solver. If you really need matrix-free partials, then you need to switch to an iterative linear solver like PETScKrylov.

If you can post the code for the component in Thermal_Cycle.py that has the compute_jacvec_product, I could give a more detailed recommendation on how to handle the partial derivatives in that case.
Hi, I am trying to use the paraboloid external-code component to get the same results as in the paraboloid optimization problem (OpenMDAO v2.2.0).

In my mind, the independent variables x, y should be updated, thus changing the input file of the external component, to minimize the output f.

Not that I fully got this working, but I basically add the external component's output as the objective and the independent variables as the design variables, etc. (see the code below).

More importantly, I have trouble conceptually understanding how the optimizer would know the derivatives of such external codes.

I tried 'COBYLA', thinking that could be a gradient-free way to go, but there seems to be a bug in the iprint statement, since I cannot run the example paraboloid optimization either.

I think I have a similar problem with surrogates. For example, I use a MetaModelUnStructured component to build my surrogate, which performs well if I ask for a known value. But I do not see how to make this component's output the objective of the optimizer. I think I am doing the right thing by giving the model the objective, but I'm not sure...

The answer might be that I am completely off on the optimization logic; if so, please refer me to papers on the algorithms behind it.

Thanks in advance
from openmdao.api import Problem, Group, IndepVarComp
from openmdao.api import ScipyOptimizeDriver
from openmdao.components.tests.test_external_code import ParaboloidExternalCode
top = Problem()
top.model = model = Group()
# create and connect inputs
model.add_subsystem('p1', IndepVarComp('x', 3.0))
model.add_subsystem('p2', IndepVarComp('y', -4.0))
model.add_subsystem('p', ParaboloidExternalCode())
model.connect('p1.x', 'p.x')
model.connect('p2.y', 'p.y')
top.driver = ScipyOptimizeDriver()
top.driver.options['optimizer'] = 'SLSQP'
top.model.add_design_var('p1.x', lower=-50, upper=50)
top.model.add_design_var('p2.y', lower=-50, upper=50)
top.model.add_objective('p.f_xy')
top.driver.options['tol'] = 1e-9
top.driver.options['disp'] = True
top.setup()
top.run_driver()
# minimum value
print(top['p.f_xy'])

# location of the minimum
print(top['p1.x'])
print(top['p2.y'])
So, I think the main thing you are asking is how to provide derivatives for external codes. There are really two options for this.

1. Finite difference across the external component.

The test example doesn't show how to do this, which is unfortunate, but you do it the same way you would declare FD derivatives for a pure-Python component, namely by adding this line to the external component's setup method:
self.declare_partials(of='*', wrt='*', method='fd')
2. Provide another external method to calculate the derivatives, and wrap it in the compute_partials method.

We do this with CFD codes that provide an adjoint solution. You could possibly also use automatic differentiation on the external source code to produce a callable function in this way. However, I think method 1 is what you are asking for here.
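For completeness, here is a sketch of option 2. The external program name, the file formats, and the parsing below are assumptions for illustration only, not part of OpenMDAO:

import subprocess
import numpy as np

def compute_partials(self, inputs, partials):
    # write the current point to the (hypothetical) external code's input file
    np.savetxt('paraboloid_input.dat', [inputs['x'], inputs['y']])

    # run the (hypothetical) external derivative calculation
    subprocess.run(['compute_paraboloid_derivs',
                    'paraboloid_input.dat', 'paraboloid_derivs.dat'], check=True)

    # read df/dx and df/dy back and hand them to OpenMDAO
    dfdx, dfdy = np.loadtxt('paraboloid_derivs.dat')
    partials['f_xy', 'x'] = dfdx
    partials['f_xy', 'y'] = dfdy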
I want to write a simple autoencoder in PyTorch and use BCELoss; however, I get NaN out, since it expects the targets to be between 0 and 1. Could someone post a simple use case of BCELoss?
Update
The BCELoss function did not use to be numerically stable. See this issue: https://github.com/pytorch/pytorch/issues/751. However, this issue has been resolved with pull request #1792, so BCELoss is numerically stable now!
Old answer
If you build PyTorch from source, you can use the numerically stable function BCEWithLogitsLoss (contributed in https://github.com/pytorch/pytorch/pull/1792), which takes logits as input.

Otherwise, you can use the following function (contributed by yzgao in the above issue):
import torch.nn as nn

class StableBCELoss(nn.modules.Module):

    def __init__(self):
        super(StableBCELoss, self).__init__()

    def forward(self, input, target):
        neg_abs = - input.abs()
        loss = input.clamp(min=0) - input * target + (1 + neg_abs.exp()).log()
        return loss.mean()
You might want to use a sigmoid layer at the end of your network, so that the numbers represent probabilities. Also make sure the targets are binary numbers (0 or 1). If you post your complete code, we can help more.
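For example, here is a minimal sketch of that pattern (the shapes and values are made up):

import torch
import torch.nn as nn

logits = torch.randn(4, 1)                       # raw outputs of the network
probs = torch.sigmoid(logits)                    # squash into (0, 1)
target = torch.tensor([[0.], [1.], [1.], [0.]])  # binary targets
loss = nn.BCELoss()(probs, target)
print(loss.item())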