OpenMDAO: interrupt a design of experiments computation

I am creating this new topic because I am using the OpenMDAO platform, and more specifically its design of experiment option. I would like to know if there is a proper way to interrupt and stop the computations if a condition is met in my program.
I have already used OpenMDAO optimizers to study and solve some problems, and to stop the computations I used to raise an Exception. This strategy seems to work for optimizers but not so much when I am using the LatinHypercubeGenerator driver: OpenMDAO still seems to try to compute the remaining points even if an Exception or RuntimeError is raised within the "compute" function of the OpenMDAO explicit component.
In that respect I am wondering if there is a way to kill OpenMDAO during calculations. I tried to check if an OpenMDAO built-in attribute or method could do the job, but I have not found anything.
Does anyone know how to stop OpenMDAO DOE computations?
Many thanks in advance for any advice/help

As of OpenMDAO V3.18, there is no way to add some kind of stopping condition to the DOE driver. You mention using AnalysisError to achieve this for other optimizers. This won't work in general either, since some drivers will intentionally catch those errors, react, and attempt to keep running the optimization.
You can see this in the run code of the driver, where a for loop iterates over the cases and try/except blocks record the success or failure of each one.
My suggestion for achieving what you want would be to copy the driver code into your model directory and make your own custom drivers. You can add whatever kind of termination condition you like, either based on results of a single case or some statistical analysis of the currently run cases.
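A minimal, framework-independent sketch of that idea, not OpenMDAO code: the loop below mimics a DOE driver that records a failure when a case raises an ordinary exception and then keeps going, and adds a dedicated stop signal that is allowed to end the run. The names `StopDOE`, `run_doe`, `evaluate`, and `stop_when` are hypothetical; in a copied driver, the equivalent checks would go inside its own case loop.

```python
import random


class StopDOE(Exception):
    """Hypothetical signal raised by the model to request early termination."""


def run_doe(cases, evaluate, stop_when=None):
    """Run cases one by one; return a list of (case, result, success) records."""
    records = []
    for case in cases:
        try:
            result = evaluate(case)
        except StopDOE:
            # Deliberate stop request: leave the loop instead of recording a failure.
            break
        except Exception:
            # Ordinary errors are recorded and the loop continues, which is why
            # raising a plain exception does not stop this kind of loop.
            records.append((case, None, False))
            continue
        records.append((case, result, True))
        # Optional termination test based on everything computed so far.
        if stop_when is not None and stop_when(records):
            break
    return records


def evaluate(x):
    """Toy model: fails for negative inputs, asks to stop for inputs above 0.9."""
    if x < 0:
        raise RuntimeError("model failed")
    if x > 0.9:
        raise StopDOE
    return x ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [rng.uniform(-0.2, 1.0) for _ in range(20)]
    done = run_doe(samples, evaluate)
    print(f"ran {len(done)} of {len(samples)} cases before stopping")
```

The `stop_when` callback is one way to express a statistical criterion over the cases run so far, for example `stop_when=lambda records: len(records) >= 100`.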
If you come up with a clean way of doing it, you can always submit a POEM and/or a pull request to propose adding your new functionality to the mainline of OpenMDAO.

Related

Why do we need to compile Progress 4GL programs?

I would like to know why we need to compile Progress 4GL programs. What is really happening behind the scenes? Why do we get a .r file after compiling the program? When we check the syntax and it is correct, we get a message box saying 'Syntax is correct'. How does it find the errors and show the messages? Any explanations are welcome and appreciated.
Benefits of compiled r-code include:
Syntax checking
Faster execution (r-code executes faster)
Security (r-code is not "human readable" and tampering with it will likely be noticed)
Licensing (r-code runtime licenses are much less expensive)
For "how its finding the errors and showing the messages" -- at a high level it is like any compiler. It evaluates the provided source against a syntax tree and lets you know when you violate the rules. Compiler design and construction is a fairly advanced topic that probably isn't going to fit into a simple SO question -- but if you had something more specific that could stand on its own as a question someone might be able to help.
The short answer is that when you compile, you're translating your program to a language the machine understands. You're asking two different questions here, so let me give you a simple answer to the first: you don't NEED to compile if you're the only one using the program, for example. But in order to have your program optimized (since it's already at the machine language level) and guarantee no one is messing with your logic, we compile the code and usually don't allow regular users to access the source code.
The second question, how does the syntax checker work, I believe it would be better for you to Google and choose some articles to read about compilers. They're complex, but in a nutshell what they do is take what Progress expects as full, operational commands, and compare to what you do. For example, if you do a
Find first customer where customer.active = yes no-error.
Progress will check if customer is a table, if customer.active is a field in that table, if it's the logical type, since you are filtering if it is yes, and if your whole conditions can be translated to one single true or false Boolean value. It goes on to check if you specified a lock (and default to shared if you haven't, like in my example, which is a no-no, by the way), what happens if there are multiple records (since I said first, then get just the first one) and finally what happens if it fails. If you check the find statement, there are more options to customize it, and the compiler will simply compare your use of the statement to what Progress can have for it. And collect all errors if it can't. That's why sometimes compilers will give you generic messages. Since they don't know what you're trying to do, all they can do is tell you what's basically wrong with what you wrote.
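To make those checks concrete, here is a toy sketch in Python. It is not how the Progress compiler is implemented; the schema, the function names, and the error messages are all invented for illustration. It only mirrors the steps described above: look up the table, look up the field, compare the field type with the literal, and collect every error found.

```python
# Hypothetical schema: table name -> {field name: type name}.
SCHEMA = {"customer": {"name": "character", "active": "logical"}}

LOGICAL_LITERALS = {"yes", "no", "true", "false"}


def literal_type(literal):
    """Very rough type inference: logical keywords, otherwise character."""
    return "logical" if literal.lower() in LOGICAL_LITERALS else "character"


def check_find(table, field, literal, schema=SCHEMA):
    """Check `find first <table> where <table>.<field> = <literal>`.

    Returns a list of error messages; an empty list means the statement is valid
    as far as this toy checker can tell.
    """
    fields = schema.get(table)
    if fields is None:
        # Nothing else can be checked without a known table.
        return [f"unknown table '{table}'"]
    errors = []
    field_type = fields.get(field)
    if field_type is None:
        errors.append(f"unknown field '{table}.{field}'")
    elif field_type != literal_type(literal):
        errors.append(
            f"type mismatch: '{table}.{field}' is {field_type}, "
            f"'{literal}' is {literal_type(literal)}"
        )
    return errors


if __name__ == "__main__":
    print(check_find("customer", "active", "yes"))
    print(check_find("customer", "name", "yes"))
    print(check_find("custmer", "active", "yes"))
```

The checker has no idea what the programmer meant to write, so all it can report is which rule was violated, which is why the messages stay generic.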
Hope this helps you understand.

No Global Contract available for procedure / function

I've got a procedure within a SPARK module that calls the standard Ada.Text_IO.Put_Line.
During proving I get the following warning: warning: no Global contract available for "Put_Line".
I already know how to add the respective data dependency contract to procedures and functions written by myself, but how do I add them to procedures / functions written by others, where I can't edit the source files?
I looked through sections 5.2 and 7.4 of the AdaCore SPARK 2014 user's guide but didn't find an example with a solution to my problem.
This means that the analyzer cannot "see" whether global variables might be affected when this function is called. It therefore assumes this call is not modifying anything (otherwise all other proofs could be refuted immediately). This is likely a valid assumption for your specific example, but it might not be valid on an embedded system, where a custom implementation of Put_Line might do anything.
There are two ways to convey the missing information:
The verifier can examine the source code of the function. Then it can try to generate global contracts itself.
Global contracts are specified explicitly, see RM 6.1.4 (http://docs.adacore.com/spark2014-docs/html/lrm/subprograms.html#global-aspects)
In this case, the procedure you are calling is part of the run-time system (RTS), and therefore the source is not visible, and you probably cannot/should not change it.
What to do in practice?
Suppressing warnings is almost never a good idea, especially not when you are working on something safety-critical. Usually the code has to be changed until the warning goes away, or some justification process has to start.
If you are serious about the analysis results, I recommend to not use such subprograms. If you really need output there, either write your own procedure that replaces the RTS subprogram, or ensure that the subprogram really has no side effects. This is further backed up by what Frédéric has linked: Even if the callee has no side effects, you don't know whether it raises an exception for specific inputs (e.g., very long strings).
If you are not so serious about the results, then you can consider this specific one as a warning that you could live with.
Wrapper packages for use in development of SPARK applications may be found here:
https://github.com/joakim-strandberg/aida_2012
I think you just can't add Spark contracts on code you don't own, especially code from the Ada standard.
About Text_Io, I found something that may be valuable to you in the reference manual.
EDIT
Another solution compared to what Martin said, according to "Building high integrity applications with Spark" book, is to create a wrapper package.
As Spark requires you to deal with Spark packages but allows you to depend on a Spark spec with an Ada body, the solution is to build a Spark package wrapping your Ada.Text_io calls.
It might be tedious as you will have to wrap possible exceptions, possibly define specific types and so on but this way, you'll be able to discharge VCs on your full Spark package.

Restarting SLSQP from sub iteration

The case I am solving is two discipline aerospace problem. The architecture is IDF. I am using recorders to record the data at each iteration. I am using finite difference. I am using SLSQP optimizer from SciPy.
If the optimization crashes during a line search after a few major iterations, how can I restart the line search from the same point?
Apart from that, from inside a Component, I want to check whether a call to solve_nonlinear() is made for derivative calculation or for the line search. Is there a way to do it?
SLSQP doesn't offer any built-in restart capability, so there isn't a whole lot you can do there. pyOptSparse does have some restart capability that OpenMDAO can use. It's called "hot-start" in their code.
As for knowing if a solve_nonlinear is for derivative calculations or not, I assume you mean that you want to know if the call is for an FD step or not. We don't currently have that feature.

fault tolerance in MPICH/OpenMPI

I have two questions-
Q1. Is there a more efficient way to handle the error situation in MPI, other than check-point/rollback? I see that if a node "dies", the program halts abruptly.. Is there any way to go ahead with the execution after a node dies ?? (no issues if it is at the cost of accuracy)
Q2. I read in "http://stackoverflow.com/questions/144309/what-is-the-best-mpi-implementation", that OpenMPI has better fault tolerance and recently MPICH-2 has also come up with similar features.. does anybody know what they are and how to use them? is it a "mode"? can they help in the situation stated in Q1 ?
kindly reply. Thank you.
MPI implementations, all of them, have had the ability to continue after an error for a while. The default is to die (that is, the default error handler is MPI_ERRORS_ARE_FATAL), but that can be changed (e.g., see the discussion here). But the standard doesn't currently offer much beyond that; that is, it's hard to recover and continue after such an error. If your program is sufficiently simple, such as some sort of master-worker setup, it may be possible to continue this way.
The MPI forum is currently working on what will become MPI-3, and error handling and fault tolerance will be an important component of the new standard (there's a working group dedicated to the topic). Until that work is complete, however, the only way to get stronger fault tolerance out of MPI is to use earlier, nonstandard extensions. FT-MPI was a project that developed a very robust MPI, but unfortunately it's based on MPI 1.2, a very early version of the standard. The claim here is that they're now working with OpenMPI, but I don't know what's become of that. There's MPICH-V, based on MPI 2, but that's more checkpoint-restart based than what I think you're looking for.
Updated to add: The fault tolerance didn't make it into MPI-3, but the working group continues its work and the expectation is that something will result from that before too long.

How do you handle unit/regression tests which are expected to fail during development?

During software development, there may be bugs in the codebase which are known issues. These bugs will cause the regression/unit tests to fail, if the tests have been written well.
There is constant debate in our teams about how failing tests should be managed:
Comment out failing test cases with a REVISIT or TODO comment.
Advantage: We will always know when a new defect has been introduced, and not one we are already aware of.
Disadvantage: May forget to REVISIT the commented-out test case, meaning that the defect could slip through the cracks.
Leave the test cases failing.
Advantage: Will not forget to fix the defects, as the script failures will constantly remind you that a defect is present.
Disadvantage: Difficult to detect when a new defect is introduced, due to failure noise.
I'd like to explore what the best practices are in this regard. Personally, I think a tri-state solution is the best for determining whether a script is passing. For example when you run a script, you could see the following:
Percentage passed: 75%
Percentage failed (expected): 20%
Percentage failed (unexpected): 5%
You would basically mark any test cases which you expect to fail (due to some defect) with some metadata. This ensures you still see the failure result at the end of the test, but immediately know if there is a new failure which you weren't expecting. This appears to take the best parts of the 2 proposals above.
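As a rough sketch of such a tri-state report (the function name `summarize` and the `known_failures` set are hypothetical, not part of any existing framework), the metadata could be as simple as a set of test names that are expected to fail:

```python
def summarize(results, known_failures):
    """Split test outcomes into passed / expected failures / unexpected failures.

    results: dict mapping test name -> True (passed) or False (failed).
    known_failures: set of test names that are expected to fail.
    Returns the three shares as percentages of all tests.
    """
    total = len(results)
    passed = sum(1 for ok in results.values() if ok)
    expected = sum(
        1 for name, ok in results.items() if not ok and name in known_failures
    )
    unexpected = total - passed - expected

    def pct(count):
        return 100.0 * count / total if total else 0.0

    return {
        "passed": pct(passed),
        "failed_expected": pct(expected),
        "failed_unexpected": pct(unexpected),
    }


if __name__ == "__main__":
    # 20 tests: 15 pass, 4 fail as expected, 1 fails unexpectedly.
    results = {f"test_{i}": i < 15 for i in range(20)}
    known = {"test_15", "test_16", "test_17", "test_18"}
    for label, value in summarize(results, known).items():
        print(f"{label}: {value:.0f}%")
```

In this sketch a test listed in `known_failures` that actually passes is simply counted as passed; a stricter variant could flag it so the stale marker gets removed.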
Does anyone have any best practices for managing this?
I would leave your test cases in. In my experience, commenting out code with something like
// TODO: fix test case
is akin to doing:
// HAHA: you'll never revisit me
In all seriousness, as you get closer to shipping, the desire to revisit TODO's in code tends to fade, especially with things like unit tests because you are concentrating on fixing other parts of the code.
Leave the tests in, perhaps with your "tri-state" solution. However, I would strongly encourage fixing those cases ASAP. My problem with constant reminders is that after people see them, they tend to gloss over them and say "oh yeah, we get those errors all the time..."
Case in point -- in some of our code, we have introduced the idea of "skippable asserts" -- asserts which are there to let you know there is a problem, but allow our testers to move past them on into the rest of the code. We've come to find out that QA started saying things like "oh yeah, we get that assert all the time and we were told it was skippable" and bugs didn't get reported.
I guess what I'm suggesting is that there is another alternative, which is to fix the bugs that your test cases find immediately. There may be practical reasons not to do so, but getting in that habit now could be more beneficial in the long run.
Fix the bug right away.
If it's too complex to do right away, it's probably too large a unit for unit testing.
Lose the unit test, and put the defect in your bug database. That way it has visibility, can be prioritized, etc.
I generally work in Perl and Perl's Test::* modules allow you to insert TODO blocks:
TODO: {
    local $TODO = "This has not been implemented yet.";
    # Tests expected to fail go here
}
In the detailed output of the test run, the message in $TODO is appended to the pass/fail report for each test in the TODO block, so as to explain why it was expected to fail. For the summary of test results, all TODO tests are treated as having succeeded, but, if any actually return a successful result, the summary will also count those up and report the number of tests which unexpectedly succeeded.
My recommendation, then, would be to find a testing tool which has similar capabilities. (Or just use Perl for your testing, even if the code being tested is in another language...)
We did the following: Put a hierarchy on the tests.
Example: You have to test 3 things.
Test the login (login, retrieve the user name, get the "last login date" or something familiar etc.)
Test the database retrieval (search for a given "schnitzelmitkartoffelsalat" - tag, search the latest tags)
Test web services (connect, get the version number, retrieve simple data, retrieve detailed data, change data)
Every testing point has subpoints, as stated in brackets. We split these hierarchically. Take the last example:
3. Connect to a web service
...
3.1. Get the version number
...
3.2. Data:
3.2.1. Get the version number
3.2.2. Retrieve simple data
3.2.3. Retrieve detailed data
3.2.4. Change data
If a point fails during development, the test unit gives one exact error message, e.g. "3.2.2 failed", and does not execute the tests for 3.2.3 and 3.2.4. The programmer can then solve that problem first instead of dealing with 3.2.3 and 3.2.4, which could not work anyway.
That helped a lot to clarify the problem and to make clear what has to be done at first.
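A minimal sketch of this stop-at-first-failure scheme, with a hypothetical helper `run_in_order` and placeholder checks instead of real web service calls:

```python
def run_in_order(steps):
    """Run (label, check) pairs in order and stop at the first failing check.

    Each check is a callable returning True on success.
    Returns (passed_labels, failed_label, skipped_labels); failed_label is None
    when every step passed.
    """
    passed = []
    for index, (label, check) in enumerate(steps):
        if not check():
            # Later steps depend on this one, so they are skipped, not run.
            skipped = [later for later, _ in steps[index + 1:]]
            return passed, label, skipped
        passed.append(label)
    return passed, None, []


if __name__ == "__main__":
    steps = [
        ("3.2.1", lambda: True),
        ("3.2.2", lambda: False),  # simulated failure
        ("3.2.3", lambda: True),
        ("3.2.4", lambda: True),
    ]
    passed, failed, skipped = run_in_order(steps)
    if failed is not None:
        print(f"{failed} failed")
        print("not executed:", ", ".join(skipped))
```

Only one failure is reported per run, so the output points at the first thing to fix rather than at every consequence of it.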
I tend to leave these in, with an Ignore attribute (this is using NUnit) - the test is mentioned in the test run output, so it's visible, hopefully meaning we won't forget it. Consider adding the issue/ticket ID in the "ignore" message. That way it will be resolved when the underlying problem is considered to be ripe - it'd be nice to fix failing tests right away, but sometimes small bugs have to wait until the time is right.
I've considered the Explicit attribute, which has the advantage of being able to be run without a recompile, but it doesn't take a "reason" argument, and in the version of NUnit we run, the test doesn't show up in the output as unrun.
I think you need a TODO watcher that produces the "TODO" comments from the code base. The TODO is your test metadata. It's one line in front of the known failure message and very easy to correlate.
TODO's are good. Use them. Actively manage them by actually putting them into the backlog on a regular basis.
#5 on Joel's "12 Steps to Better Code" is fixing bugs before you write new code:
When you have a bug in your code that you see the first time you try to run it, you will be able to fix it in no time at all, because all the code is still fresh in your mind.
If you find a bug in some code that you wrote a few days ago, it will take you a while to hunt it down, but when you reread the code you wrote, you'll remember everything and you'll be able to fix the bug in a reasonable amount of time.
But if you find a bug in code that you wrote a few months ago, you'll probably have forgotten a lot of things about that code, and it's much harder to fix. By that time you may be fixing somebody else's code, and they may be in Aruba on vacation, in which case, fixing the bug is like science: you have to be slow, methodical, and meticulous, and you can't be sure how long it will take to discover the cure.
And if you find a bug in code that has already shipped, you're going to incur incredible expense getting it fixed.
But if you really want to ignore failing tests, use the [Ignore] attribute or its equivalent in whatever test framework you use. In MbUnit's HTML output, ignored tests are displayed in yellow, compared to the red of failing tests. This lets you easily notice a newly-failing test, but you won't lose track of the known-failing tests.

Resources