pyOptSparse Error: Received an unknown option (AMIEGO) - openmdao

I recently came across AMIEGO. When I try to run the example problems (provided in the example directory) I get the following error.
-------------------------------------------------------------------------------
Exit Flag: True
Elapsed Time: 0.04829263687133789
======================ContinuousOptimization-End=======================================
+------------------------------------------------------------------------------+
| pyOptSparse Error: Received an unknown option: 'Major optimality tolerance' |
+------------------------------------------------------------------------------+
Traceback (most recent call last):
File "/home/sky/anaconda3/lib/python3.8/site-packages/openmdao/utils/concurrent.py", line 65, in concurrent_eval_lb
retval = func(*args)
File "/home/sky/anaconda3/lib/python3.8/site-packages/amiego/kriging.py", line 239, in _calculate_thetas
opt_x, opt_f, success, msg = snopt_opt(_calcll, x0, low, high, title='kriging',
File "/home/sky/anaconda3/lib/python3.8/site-packages/amiego/optimize_function.py", line 76, in snopt_opt
opt.setOption(name, value)
File "/home/sky/anaconda3/lib/python3.8/site-packages/pyoptsparse/pyOpt_optimizer.py", line 829, in setOption
raise Error("Received an unknown option: %s" % repr(name))
pyoptsparse.pyOpt_error.Error
I tested the pyoptsparse optimization driver with the Sellar problem and it worked as expected, so I think I'm missing something in AMIEGO. FYI, I didn't modify anything in the example, so I am running it with SLSQP (from the pyoptsparse driver) for the continuous part (I don't have SNOPT). Any pointers on how to fix this or where to start looking would be helpful.

I've pushed up a couple of fixes to the repository so that you can run it without SNOPT. The basic Branin problem in the examples works and gets to the expected answer now. I can't promise that SLSQP is the best choice for more complicated problems as we usually favor SNOPT over SLSQP in our work. This is still very experimental code, so the documentation is weak and there are still a lot of control knobs and flags that are buried as subcomponent attributes (including ideas that we tried that didn't pan out). But we appreciate users who are willing to try AMIEGO and help us improve it.
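For reference, pointing the continuous sub-optimization at SLSQP through OpenMDAO's pyOptSparseDriver typically looks like the sketch below. This is standard OpenMDAO/pyOptSparse usage rather than anything AMIEGO-specific, and the SLSQP settings shown are assumptions you would tune for your own problem; the key point is that SNOPT-only options such as 'Major optimality tolerance' must not be passed when the optimizer is SLSQP.
# Minimal sketch (not AMIEGO-specific): selecting SLSQP via pyOptSparse.
import openmdao.api as om

prob = om.Problem()
prob.driver = om.pyOptSparseDriver()
prob.driver.options['optimizer'] = 'SLSQP'
prob.driver.opt_settings['MAXIT'] = 100   # SLSQP iteration limit (assumed value)
prob.driver.opt_settings['ACC'] = 1e-6    # SLSQP convergence accuracy (assumed value)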

Related

How to silence linux kernel checkpatch.pl ERROR: Missing Signed-off-by: line(s)

I use the Linux kernel style in my C code, even though it's not related to the kernel. Running checkpatch.pl produces ERROR: Missing Signed-off-by: line(s). How can I ignore it?
To intentionally silence the ERROR: Missing Signed-off-by: line(s), one can pass the --no-signoff parameter, e.g.:
git show | checkpatch.pl --no-tree --no-signoff
This can also be added on a new line to the .checkpatch.conf file to avoid having to type it.
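For example, a minimal .checkpatch.conf (assuming you keep it in the directory you invoke checkpatch.pl from) would hold one option per line:
--no-tree
--no-signoff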

SysCTypes errors when using NetCDF.chpl?

I have a simple Chapel program to test the NetCDF module:
use NetCDF;
use NetCDF.C_NetCDF;
var f: int = ncopen("ppt2020_08_20.nc", NC_WRITE);
var status: int = nc_close(f);
and when I compile with:
chpl -I/usr/include -L/usr/lib/x86_64-linux-gnu -lnetcdf hello.chpl
it produces a list of errors about SysCTypes:
$CHPL_HOME/modules/packages/NetCDF.chpl:57: error: 'c_int' undeclared (first use this function)
$CHPL_HOME/modules/packages/NetCDF.chpl:77: error: 'c_char' undeclared (first use this function)
...
Does anyone see what my error is? I tried adding use SysCTypes; to my program, but that didn't seem to have an effect.
Sorry for the delayed response and for this bad behavior. This is a bug that's crept into the NetCDF module which seems not to have been caught by Chapel's nightly testing. To work around it, edit $CHPL_HOME/modules/packages/NetCDF.chpl, adding the line:
public use SysCTypes, SysBasic;
within the declaration of the C_NetCDF module (around line 50 in my copy of the sources). If you would consider filing this bug as an issue on the Chapel GitHub issue tracker, that would be great as well, though we'll try to get this fixed in the next release in any case.
With that change, your program almost compiles for me, except that nc_close() takes a c_int argument rather than a Chapel int. You could either lean on Chapel's type inference to cause this to happen:
var f = ncopen("ppt2020_08_20.nc", NC_WRITE);
or explicitly declare f to be of type c_int:
var f: c_int = ncopen("ppt2020_08_20.nc", NC_WRITE);
As one final note, I believe you should be able to drop -lnetcdf from your chpl command line, since using the NetCDF module should cause that requirement to be added automatically.
Thanks for bringing this bug to our attention!

Can't get AllegroServe / Ironclad to work

(ql:quickload "aserve") fails
I'm trying to install AllegroServe. According to http://quickdocs.org/portableaserve/ and to this SO thread, the simplest way to get aserve is to install it with quicklisp: (ql:quickload "aserve")
But (ql:quickload "aserve") fails yielding the following error in the debugger buffer:
COMPILE-FILE-ERROR while compiling
#<IRONCLAD-SOURCE-FILE "ironclad" "src" "digests" "digest">
[Condition of type UIOP/LISP-BUILD:COMPILE-FILE-ERROR]
Whereas in the REPL it says:
; Loading "aserve"
; caught ERROR: READ error during COMPILE-FILE: Symbol "BIGNUM-TYPE"
; not found in the SB-BIGNUM package. Line: 53, Column: 52,
; File-Position: 2151 Stream: #<SB-INT:FORM-TRACKING-STREAM for
; "file
; C:\\Users\\user\\AppData\\Roaming\\quicklisp\\dists\\quicklisp\\software\\ironclad_0.33.0\\src\\digests\\digest.lisp"
; {25AFCD91}>
What I've tried so far
Apparently ironclad is another package, a "cryptographic toolkit written in pure Common Lisp". I downloaded ironclad-v0.34 from http://quickdocs.org/ironclad/ and even found digest.lisp and digests.lisp in the ironclad folder, which made me think I am on the right track.
My problem is that I don't know where to go from here. How and where do I "install" ironclad?
Quickdocs says
[ironclad] comes with an ASDF system definition, so (asdf:oos 'asdf:load-op :ironclad) should be all that you need to get started. The testsuite can be run by substituting asdf:test-op for asdf:load-op in the form above.
but since I'm not familiar with asdf I don't know what to make of it.
Am I even on the right track? Is installing the ironclad package the right way to make the error COMPILE-FILE-ERROR while compiling #<IRONCLAD-SOURCE-FILE "ironclad" "src" "digests" "digest"> go away? If so, what do I do with the ironclad-v0.34 folder?
(I'm using sbcl on a windows 10 machine.)
Thanks to @jkiiski for leading me down the right path, I was able to install aserve. The problem was indeed an old version of ironclad which, as @jkiiski pointed out, was using SB-BIGNUM:BIGNUM-TYPE, which had been removed from SBCL.
However, the way I updated ironclad is probably NOT(!) a good way, because I did it all manually (error-prone).
Not knowing how quicklisp works exactly, I searched for every occurrence of ironclad-0.33.0 and replaced it with ironclad-v0.34, which meant replacing
the .../dists/quicklisp/software/ironclad-0.33.0 folder with .../dists/quicklisp/software/ironclad-v0.34
the ironclad-0.33.0 tgz in .../dists/quicklisp/archives/ with ironclad-v0.34.tgz
the entry dists/quicklisp/software/ironclad-0.33.0/ in .../dists/quicklisp/installed/releases/ironclad.txt with dists/quicklisp/software/ironclad-v0.34/.
I also updated ironclad.txt and ironclad-text.txt in .../dists/quicklisp/installed/systems/
Well, it worked, but I only did it this way because I don't know any better (but am sure there has to be a better way).

MPI errors "MPI_ERR_IN_STATUS" and "BadPickleGet" in OpenMDAO when running external codes in parallel with many processors

The OpenMDAO problem that I'm running is quite complicated, so I don't think it would be helpful to post the entire script. However, the basic setup is that my problem root is a ParallelFDGroup (not actually finite differencing for now--just running the problem once) that contains a few normal components as well as a parallel group. The parallel group is responsible for running 56 instances of an external code (one component per instance of the code). Strangely, when I run the problem with 4-8 processors, everything seems to work fine (it sometimes even works with 10-12 processors). But when I try to use more processors (20+), I fairly consistently get the errors below, with two tracebacks:
Traceback (most recent call last):
File "opt_5mw.py", line 216, in <module>
top.setup() #call setup
File "/home/austinherrema/.local/lib/python2.7/site-packages/openmdao/core/problem.py", line 644, in setup
self.root._setup_vectors(param_owners, impl=self._impl, alloc_derivs=alloc_derivs)
File "/home/austinherrema/.local/lib/python2.7/site-packages/openmdao/core/group.py", line 476, in _setup_vectors
self._u_size_lists = self.unknowns._get_flattened_sizes()
File "/home/austinherrema/.local/lib/python2.7/site-packages/openmdao/core/petsc_impl.py", line 204, in _get_flattened_sizes
return self.comm.allgather(sizes)
File "MPI/Comm.pyx", line 1291, in mpi4py.MPI.Comm.allgather (src/mpi4py.MPI.c:109194)
File "MPI/msgpickle.pxi", line 746, in mpi4py.MPI.PyMPI_allgather (src/mpi4py.MPI.c:48575)
mpi4py.MPI.Exception: MPI_ERR_IN_STATUS: error code in status
Traceback (most recent call last):
File "opt_5mw.py", line 216, in <module>
top.setup() #call setup
File "/home/austinherrema/.local/lib/python2.7/site-packages/openmdao/core/problem.py", line 644, in setup
self.root._setup_vectors(param_owners, impl=self._impl, alloc_derivs=alloc_derivs)
File "/home/austinherrema/.local/lib/python2.7/site-packages/openmdao/core/group.py", line 476, in _setup_vectors
self._u_size_lists = self.unknowns._get_flattened_sizes()
File "/home/austinherrema/.local/lib/python2.7/site-packages/openmdao/core/petsc_impl.py", line 204, in _get_flattened_sizes
return self.comm.allgather(sizes)
File "MPI/Comm.pyx", line 1291, in mpi4py.MPI.Comm.allgather (src/mpi4py.MPI.c:109194)
File "MPI/msgpickle.pxi", line 749, in mpi4py.MPI.PyMPI_allgather (src/mpi4py.MPI.c:48609)
File "MPI/msgpickle.pxi", line 191, in mpi4py.MPI.Pickle.loadv (src/mpi4py.MPI.c:41957)
File "MPI/msgpickle.pxi", line 143, in mpi4py.MPI.Pickle.load (src/mpi4py.MPI.c:41248)
cPickle.BadPickleGet: 65
I am running under Ubuntu with OpenMDAO 1.7.3. I have tried running with both mpirun.openmpi (OpenRTE) 1.4.3 and mpirun (Open MPI) 1.4.3 and have gotten the same result in each case.
I found this post that seems to suggest that there is something wrong with the MPI installation. But if this were the case, it strikes me as strange that the problem would work for a small number of processors but not with a larger number. I also can run a relatively simple OpenMDAO problem (no external codes) with 32 processors without incident.
Because the traceback references OpenMDAO unknowns, I wondered if there are limitations on the size of OpenMDAO unknowns. In my case, each external code component has a few dozen array outputs that can be up to 50,000-60,000 elements each. Might that be problematic? Each external code component also reads the same set of input files. Could that be an issue as well? I have tried to ensure that read and write access is defined properly but perhaps that's not enough.
Any suggestions about what might be culprit in this situation are appreciated.
EDIT: I should add that I have tried running the problem without actually running the external codes (i.e. the components in the parallel group are called and set up but the external subprocesses are never actually created) and the problem persists.
EDIT2: I have done some more debugging on this issue and thought I should share the little that I have discovered. If I strip the problem down to only the parallel group containing the external code instances, the problem persists. However, if I reduce the components in the parallel group to basically nothing--just a print function for setup and for solve_nonlinear--then the problem can successfully "run" with a large number of processors. I started adding setup lines back in one by one to see what would create problems. I ran into issues when trying to add many large unknowns to the components. I can actually still add just a single large unknown--for example, this works:
self.add_output('BigOutput', shape=[100000])
But when I try to add too many large outputs like below, I get errors:
for i in range(100):
    outputname = 'BigOutput{0}'.format(i)
    self.add_output(outputname, shape=[100000])
Sometimes I just get a general segmentation violation error from PETSc. Other times I get a fairly lengthy traceback that is too long to post here--I'll post just the beginning in case it provides any helpful clues:
*** glibc detected *** python2.7: free(): invalid pointer: 0x00007f21204f5010 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7da26)[0x7f2285f0ca26]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/../../libsqlite3.so.0(sqlite3_free+0x4f)[0x7f2269b7754f]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/../../libsqlite3.so.0(+0x1cbbc)[0x7f2269b87bbc]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/../../libsqlite3.so.0(+0x54d6c)[0x7f2269bbfd6c]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/../../libsqlite3.so.0(+0x9d31f)[0x7f2269c0831f]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/../../libsqlite3.so.0(sqlite3_step+0x1bf)[0x7f2269be261f]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/_sqlite3.so(pysqlite_step+0x2d)[0x7f2269e4306d]
/home/austinherrema/miniconda2/lib/python2.7/lib-dynload/_sqlite3.so(_pysqlite_query_execute+0x661)[0x7f2269e404b1]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8942)[0x7f2286c6a5a2]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x86c3)[0x7f2286c6a323]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x86c3)[0x7f2286c6a323]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x89e)[0x7f2286c6b1ce]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(+0x797e1)[0x7f2286be67e1]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x7f2286bb6dc3]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(+0x5c54f)[0x7f2286bc954f]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x7f2286bb6dc3]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x43)[0x7f2286c60d63]
/home/austinherrema/miniconda2/bin/../lib/libpython2.7.so.1.0(+0x136652)[0x7f2286ca3652]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f2286957e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f2285f8236d]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:03 9706352 /home/austinherrema/miniconda2/bin/python2.7
00600000-00601000 rw-p 00000000 08:03 9706352 /home/austinherrema/miniconda2/bin/python2.7
00aca000-113891000 rw-p 00000000 00:00 0 [heap]
7f21107d6000-7f2241957000 rw-p 00000000 00:00 0
etc...
It's hard to guess what's going on here, but if it works for a small number of processors and not for larger ones, one guess might be that the issue shows up when you use more than one node and data has to get transferred across the network. I have seen bad MPI compilations that behaved this way: things would work if I kept the job to one node, but broke on more than one.
The traceback shows that you're not even getting through setup, so it's not likely to be anything in your external code or any other component's run method.
If you're running on a cluster, are you compiling your own MPI? You usually need to compile with very specific options/libraries for any kind of HPC library, but most HPC systems provide modules you can load that have MPI pre-compiled.
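If you want to rule OpenMDAO out entirely, a small mpi4py-only reproduction of the failing call can help: allgather a pickled payload roughly the size of your component unknowns across all ranks and see whether it breaks once the job spans more than one node. A rough sketch (array sizes and counts are arbitrary assumptions):
# Stand-alone MPI check, run with e.g.: mpirun -np 32 python allgather_check.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
# roughly mimic many large unknowns per rank
payload = [np.zeros(100000) for _ in range(10)]
gathered = comm.allgather(payload)   # the same pickle-based call as in the traceback
if comm.rank == 0:
    print('allgather succeeded across %d ranks' % comm.size)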

Weird AttributeError: OpenMDAO says param has not been initialized when I run my simulation under mpirun

I am running developmental scientific code. I am stuck on a cryptic error message, and am curious what the OpenMDAO team thinks. When I run the code in serial, it works with no issues. When I run it under mpirun, OpenMDAO throws a cryptic error message:
Traceback (most recent call last):
File "test/exampleOptimizationAEP.py", line 129, in <module>
prob['ratedPower'] = ratedPower
.....
File "/scratch/jquick/test/lib/python2.7/site-packages/openmdao-1.7.3-py2.7.egg/openmdao/core/vec_wrapper.py", line 1316, in __setitem__
(self.name, name))
AttributeError: 'params' has not been initialized, setup() must be called before 'ratedPower' can be accessed
I am not sure how to approach this. There is nothing obviously different about the ratedPower variable in the code. What information does this error give me about what is going wrong?
This is a bug in OpenMDAO <= v1.7.2. Look at the output of check_setup and see the list of parameters without associated unknowns; you will find that variable in there. When running in parallel (because of the bug), you cannot set any hanging params (ones without associated unknowns) in your setup script.
The way to fix it is to add an IndepVarComp for any variable whose value you need to initialize.
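A minimal sketch of that fix in the OpenMDAO 1.x API (the component and variable names here are placeholders, not taken from the original model):
# Give the hanging param an IndepVarComp source so it can be set from the
# run script, in serial or under mpirun.
from openmdao.api import Problem, Group, IndepVarComp

prob = Problem(root=Group())
prob.root.add('power_ivc', IndepVarComp('ratedPower', 5000.0), promotes=['ratedPower'])
# ... add the rest of the model, promoting or connecting 'ratedPower' ...
prob.setup()
prob['ratedPower'] = 6000.0   # valid now that the param has an associated unknown
prob.run()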
