I implemented a system composed of a few groups and multiple components. It is relatively intricate: the components have many inputs and outputs, and only some outputs depend on some inputs, so some partials exist and others do not.
Gradient-based optimizers seem to get stuck at the initial values and never go further than iteration 0 (they are not stuck at a local optimum). I have encountered this before when I was missing declare_partials for some variables. Is there a way to automatically check which component inputs/outputs are missing partials, similar to how the N^2 diagram shows missing connections?
There are two tools that you need to use to check for bad derivatives. The first is check_partials. That will go component by component and use either finite-difference or complex-step to verify the partial derivatives for every component (regardless of whether or not you declared them in the setup of that component). That will catch the problem if you are missing any partials, because the finite-difference check will see them as non-zero and will show you that there is an error.
Check_partials should be your first stop, always. If you can, use complex-step to verify your derivatives, since it is accurate to near machine precision and does not depend on picking a good step size. Also, check_partials does the check around whatever point is currently initialized. So sometimes you hit a degenerate case (e.g. some input is 0) where the check passes but your derivatives are still wrong. For example, if your component represented y=x**2 and you forgot to define derivatives, then running check_partials at x=0 would pass, because the true derivative 2*x is zero there and matches the missing (zero) partial. But if you ran it at x=1, then the check would show an error.
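The degenerate-point pitfall can be reproduced without any framework. The sketch below is only an illustration, not OpenMDAO code: it uses y = x**2 (whose derivative vanishes at x = 0), and the helper names central_difference, forgotten_partial and check_passes are made up for this example. It assumes that a partial you never declared behaves like zero.

```python
def f(x):
    # Component output: y = x**2, so the true derivative is 2*x.
    return x ** 2


def forgotten_partial(x):
    # Assumption for this sketch: an undeclared partial acts like zero.
    return 0.0


def central_difference(func, x, step=1e-6):
    # Second-order finite-difference approximation of d(func)/dx at x.
    return (func(x + step) - func(x - step)) / (2.0 * step)


def check_passes(x, tol=1e-6):
    # Mimics a derivative check: compare the analytic value to the FD value.
    return abs(central_difference(f, x) - forgotten_partial(x)) < tol


print("check at x=0 passes:", check_passes(0.0))  # degenerate point
print("check at x=1 passes:", check_passes(1.0))  # exposes the missing partial
```

Running the check at more than one point, with no input sitting exactly at zero, is a cheap way to avoid being fooled by a case like this.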
If all of your partial derivatives are correct but you're still getting bad results, then you can try check_totals. Depending on the structure of your model, and if you have any coupling in it (i.e. you need to use some kind of nonlinear solver), it's possible that you don't have a linear solver configured to solve for total derivatives correctly. In a lot of cases, if you have coupling, you can just put a DirectSolver at the same level as the nonlinear solver you put in the model.
Related
As the title states, the get_val() function allows the user to retrieve the value of an input, output or residual. Is there anything like a get_partial(of=..., wrt=...) that allows a user to retrieve a derivative? Or what would be the best way to go about retrieving that from the problem or model?
For getting a general derivative in a system, the recommended practice is to use the compute_totals method.
Even if you just want to look at a partial derivative, you can use the of and wrt arguments to point to just the specific partial. You'll get a total, but it should be equal to the partial.
The general debugging practice for looking at partials is to use check_partials. This will give you full values of all the partials to look at. But if you need an algorithmic approach as part of a run script, then use compute_totals.
OpenMDAO stores outputs, so obtaining those is a matter of getting a value that's already there (hence get_val).
For derivatives, depending on the way in which OpenMDAO is used, there's no guarantee that the totals are present in memory, so they must be computed when needed.
Turbulent boundary layer calculations break down at the point of flow separation when solved with a prescribed boundary layer edge velocity, ue, in what is called the direct method.
This can be alleviated by solving the system in a fully-simultaneous or quasi-simultaneous manner. Details about both methods are available here (https://www.rug.nl/research/portal/files/14407586/root.pdf), pages 38 onwards. Essentially, the fully-simultaneous method combines the inviscid and viscous equations into a single large system of equations, and solves them with Newton iteration.
I have currently implemented an inviscid panel solver entirely in ExplicitComponents. I intend to implement the boundary layer solver also entirely with ExplicitComponents. I am unsure whether coupling these two groups would then result in an execution procedure like the direct method, or whether it would work like the fully-simultaneous method. I note that in the OpenMDAO paper, it is stated that the components are solved "as a single nonlinear system of equations", and that the reformulation from explicit components to the implicit system is handled automatically by OpenMDAO.
Does this mean that if I couple my two analyses (again, consisting purely of ExplicitComponents) and set the group to solve with the Newton solver, I'll get a fully-simultaneous solution 'for free'? This seems too good to be true, as ultimately the component that integrates the boundary layer equations will have to take some prescribed ue as an input, and then will run into the singularity in the execution of its compute() method.
If doing the above would instead make it execute like the direct method and lead to the singularity, (briefly) what changes would I need to make to avoid it? Would it require defining the boundary layer components implicitly?
Despite seeming too good to be true, you can in fact change the structure of your system by changing out the top level solver.
If you used a NonlinearBlockGS solver at the top, it would solve in the weak form. If you used a NewtonSolver at the top, it would solve as one large monolithic system. This property does indeed derive from the unique structure of how OpenMDAO stores things.
There are some caveats. I would guess that your panel code is implemented as a set of intermediate calculations broken up across several components. If that's the case, then the NewtonSolver will be treating each intermediate variable as if it were its own state variable. In other words, you would have more than just delta and u_e as states, but also all the intermediate calculations too.
This might be somewhat unstable (though it might work just fine, so try it!). You might need a hybrid between the weak and strong forms, which can be achieved via the solve_subsystems option on the NewtonSolver. This approach is called the Hierarchical Newton Method in section 5.1.2 of the OpenMDAO paper. It will do a sub-iteration of NLBGS for every top level Newton iteration. This acts as a form of nonlinear preconditioner which can help stabilize the strong form. You can limit how many sub-iterations are done, and in your case you may want to use just 2 or 3 because of the risk of singularity.
As per the title - say you have a fixed parameter like air density. Is it worth defining the partial w.r.t this fixed parameter?
If you know the value will be fixed forever (i.e. you'll never want to connect it to something else), then you don't need to declare derivatives for that combination of variables.
However I consider this to be a bad practice. In my experience, at some point in the future you will end up connecting something to that input, and then the total derivatives will be wrong. You could, of course, fix the derivatives at that point, but you might not remember and it will take you some time to debug the optimization and figure out the source of the bad derivatives. So as a best practice, I always differentiate all outputs with respect to all inputs.
Alternatively, you could declare density as an option instead of an input (see the docs on options). If you really want it to be a constant, this is the route I suggest.
The case I am solving is a two-discipline aerospace problem. The architecture is IDF. I am using recorders to record the data at each iteration. I am using finite difference. I am using the SLSQP optimizer from SciPy.
If, after a few major iterations, the optimization crashes during the line search, how can I restart the line search from the same point?
Apart from that, I want to check, from inside the component, whether a call to the component's solve_nonlinear() is made for derivative calculation or for the line search. Is there a way to do it?
SLSQP doesn't offer any built-in restart capability, so there isn't a whole lot you can do there. pyOptSparse does have some restart capability that OpenMDAO can use. It's called "hot-start" in their code.
As for knowing if a solve_nonlinear is for derivative calculations or not, I assume you mean that you want to know if the call is for an FD step or not. We don't currently have that feature.
Say you're designing a math library (in JS) from scratch: the usual Vector2/3/4, Matrix2/3/4, Quaternion and so on (standard stuff for WebGL apps). What would be the best way to handle bad input? (division by zero, inverting a singular matrix, computing the intersection point between 2 parallel lines and so on).
The two ways to deal with this would be to:
throw exceptions
I know there are plenty who like to know precisely when their code fails - sort of the same people who hate dynamic typing, but I can't help but think of the dreaded "Error 200: Division by zero" exceptions that I got so much of in my early days of programming years ago. The only solution was to sprinkle the code with checks to prevent any of these errors. That only made code UGLY. I also can't help but wonder why programming languages nowadays have adopted +/-Infinity and NaN.
or to fail silently
In this case, the possible scenarios when trying to execute the line:
singularMatrix.invert().add(otherMatrix)
would be:
singularMatrix.invert() would return BAD_MATRIX, and BAD_MATRIX.add() would do nothing (and "stop the computation", jQuery-like)
singularMatrix.invert() would fail but return itself unchanged and .add() would work
singularMatrix.invert() would fill the matrix with +/-Infinity and the computation would continue
I would personally prefer one of the latter options, but I'm totally open to arguments and alternatives (that's why I'm asking here on SO).
I don't know if the "best way" for this sort of thing has been invented yet.
Whatever you do, don't fail silently. There's no point in continuing the computation if the result is going to be wrong, and you don't want to show an incorrect result to the user and claim that it's correct. Nothing good can come of that, especially in a reusable library where you don't necessarily know what the caller will do with the result.
Throw an exception or return a special value that the caller can check for, such as undefined.
NaN and status codes (option 3)
The IEEE 754 standard was created to resolve a lot of these issues in a completely consistent way. For example, 1/0 gives +Infinity, and an undefined operation such as 0/0 gives NaN; both are special values that propagate through later arithmetic. This standard is baked into processors themselves. It is neither a thrown exception (which would make some simple code very complex) nor a silent failure. You can trace the NaNs all the way back to where they appeared, giving you the debug information you need to fix the bug.
As far as large routines like matrix inversion go, numerical libraries generally follow the Unix convention of returning a status code. In JavaScript you can do this by returning an object with a status property.
Taking your example:
singularMatrix.invert().add(otherMatrix)
If invert were to return a matrix object full of NaNs with the status property 'invalid matrix', then add can be called and return another matrix full of NaNs.
This permits you to call invert and later check whether it was valid; if you use exceptions you have to handle them immediately, and when you want to defer a decision until later, you'll have to set up the same set of properties.
There is still useful information in a matrix that is partially or entirely filled with NaNs - the shape information can be used to create a new matrix to replace a bad initial vector, or known good values can still be used in calculation.
TLDR: Do the NaN thing, and recreate it in matrices.