I am having trouble accessing variables that are implicitly linked through multiple layers of groups. According to the documentation:
In new OpenMDAO, Groups are NOT Components and do not have their own
variables. Variables can be promoted to the Group level by passing the
promotes arg to the add call, e.g.,
group = Group()
group.add('comp1', Times2(), promotes=['x'])
This will allow the variable x that belongs to comp1 to be accessed
via group.params['x'].
However, when I try to access variables of sub-sub-groups I am getting errors. Please see the example below, which shows a working and a non-working case:
from openmdao.api import Component, Group, Problem
import numpy as np

class Times2(Component):
    def __init__(self):
        super(Times2, self).__init__()
        self.add_param('x', 1.0, desc='my var x')
        self.add_output('y', 2.0, desc='my var y')

    def solve_nonlinear(self, params, unknowns, resids):
        unknowns['y'] = params['x'] * 2.0

    def linearize(self, params, unknowns, resids):
        J = {}
        J[('y', 'x')] = np.array([2.0])
        return J

class PassGroup1(Group):
    def __init__(self):
        super(PassGroup1, self).__init__()
        self.add('t1', Times2(), promotes=['*'])

class PassGroup2(Group):
    def __init__(self):
        super(PassGroup2, self).__init__()
        self.add('g1', PassGroup1(), promotes=['*'])

prob = Problem(root=Group())
prob.root.add('comp', PassGroup2(), promotes=['*'])
prob.setup()
prob.run()

# this works
print prob.root.comp.g1.t1.params['x']

# this does not
print prob.root.params['x']
Could you explain why this does not work, and how I can make variables available at the top level without knowledge of the lower-level groups?
There are a few answers to your question. First, I'll point out that you have what we call a "hanging parameter". By this I mean a parameter on a component (or linked to multiple components via promotion and/or connection) that has no ultimate src variable associated with it. So, just for a complete understanding, it needs to be stated that as far as OpenMDAO is concerned, hanging parameters are not its problem. As a convenience to the user, we provide an easy way for you to set its value on the problem instance, but we never do any data passing with it during run time.
In the common case where x is a design variable for an optimizer, you would create an IndepVarComp to provide the src for this value. But since you don't have an optimizer in your example, it is not technically wrong to leave out the IndepVarComp.
For a more direct answer to your question: you shouldn't really be reaching down into the params dictionaries at any sub-level. I can't think of a good reason to do that as a user. If you stick with problem['x'], you should never go wrong.
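For instance, a minimal sketch of your original example with an IndepVarComp supplying the src for 'x' (the 'desvars' name is just illustrative, and the values are arbitrary) lets you set and read everything through the problem by promoted name, with no knowledge of the group tree:

from openmdao.api import IndepVarComp, Group, Problem

prob = Problem(root=Group())
prob.root.add('desvars', IndepVarComp('x', 3.0), promotes=['x'])
prob.root.add('comp', PassGroup2(), promotes=['*'])
prob.setup()

prob['x'] = 7.0      # set by promoted name
prob.run()
print prob['x']      # 7.0, read back the same way
print prob['y']      # 14.0, outputs promoted up from t1 are reachable too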
But since you asked, here are the details of what's really going on, for a slightly modified case that allows there to be an actual parameter.
from openmdao.api import Component, Group, Problem
import numpy as np

class Plus1(Component):
    def __init__(self):
        super(Plus1, self).__init__()
        self.add_param('w', 4.0)
        self.add_output('x', 5.0)

    def solve_nonlinear(self, params, unknowns, resids):
        unknowns['x'] = params['w'] + 1

    def linearize(self, params, unknowns, resids):
        J = {}
        J['x', 'w'] = 1
        return J

class Times2(Component):
    def __init__(self):
        super(Times2, self).__init__()
        self.add_param('x', 1.0, desc='my var x')
        self.add_output('y', 2.0, desc='my var y')

    def solve_nonlinear(self, params, unknowns, resids):
        unknowns['y'] = params['x'] * 2.0

    def linearize(self, params, unknowns, resids):
        J = {}
        J[('y', 'x')] = np.array([2.0])
        return J

class PassGroup1(Group):
    def __init__(self):
        super(PassGroup1, self).__init__()
        self.add('t1', Times2(), promotes=['x', 'y'])

class PassGroup2(Group):
    def __init__(self):
        super(PassGroup2, self).__init__()
        self.add('g1', PassGroup1(), promotes=['x', 'y'])
        self.add('p1', Plus1(), promotes=['w', 'x'])

prob = Problem(root=Group())
prob.root.add('comp', PassGroup2(), promotes=['w', 'x', 'y'])
prob.setup()
prob.run()

# this works
print prob.root.comp.g1.t1.params['x']
# this also works, and shows how params are keyed
print prob.root.comp.params.keys()
Please note that in my example, 'x' is no longer free for the user to set. It's now computed by 'p1'. Instead, 'w' is now the user-set parameter. This was necessary in order to illustrate how params work.
Now that there is actually some data passing going on that OpenMDAO is responsible for, you can see the actual pattern more clearly. At the root, there are no parameters at all (excluding any hanging params). Everything, from the root's perspective, is an unknown, because everything has a src responsible for it at that level. Go down one level, where there are p1 and g1, and now there is a parameter on g1 that p1 is the src for, so some data passing has to happen at that level of the hierarchy. So that group (comp in the example above) has an entry in its params dictionary, g1.t1.x. Why is it a full path? All bookkeeping for parameters is done with full path names, for a variety of reasons outside the scope of this answer. But that is also another motivation for working through the shortcut on problem, because that works with relative (or promoted) names.
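To tie this back to the shortcut, here is a quick sketch (values just for illustration) of setting the hanging param 'w' and reading the promoted results through the problem, which is what I'd actually do in practice:

prob['w'] = 10.0     # hanging param, set through the problem by promoted name
prob.run()
print prob['x']      # 11.0, computed by p1
print prob['y']      # 22.0, computed by t1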
Related
I would like to define a variable which, depending on certain options, will be equal to a previous output (as if the previous output had two names) or will be the output of a new component.
A trivial solution is to just omit the definition of the value when the component which would define it is not implemented, but I would prefer it to be defined for readability/traceability reasons (to simplify if statements in the code, and to provide it as timeseries output).
The problem is that, when using the connect statement, if the variable never ends up being used as an input to another component, OpenMDAO raises an error saying that it attempted to connect but the variable does not exist.
I made a temporary fix with a sort of link statement (LinkVarComp below) which creates an explicit component whose output equals its input (plus some extras such as a scale and a shift, which could be useful for linear equations), but I am worried that this adds unnecessary computations/design variables/constraints.
Is there an easier/better workaround (maybe by allowing variables to have multiple names)? What would be the best practice to just have a variable with a different name equal to a previous output/input?
A simple example:
import numpy as np
import openmdao.api as om

model = om.Group()
model.add_subsystem('xcomp', subsys=om.IndepVarComp(name='x', val=np.zeros((3, 2))), promotes_outputs=['*'])
model.connect('x', 'y')

p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('x', np.array([[1.0, 3], [10, -5], [0, 3.1]]))
p.run_model()
Crashes with error
NameError: <model> <class Group>: Attempted to connect from 'x' to 'y', but 'y' doesn't exist.
While this works when using the following LinkVarComp component (though I suppose it adds new variables and computations):
import openmdao.api as om
import numpy as np
from math import prod

class LinkVarComp(om.ExplicitComponent):
    """
    Component whose output equals its (scaled and shifted) input.
    """

    def initialize(self):
        """
        Declare component options.
        """
        self.options.declare('shape', types=(int, tuple), default=1)
        self.options.declare('scale', types=int, default=1)
        self.options.declare('shift', types=float, default=0.)
        self.options.declare('input_default', types=float, default=0.)
        self.options.declare('input_name', types=str, default='x')
        self.options.declare('output_name', types=str, default='y')
        self.options.declare('output_default', types=float, default=0.)
        self.options.declare('input_units', types=(str, None), default=None)
        self.options.declare('output_units', types=(str, None), default=None)

    def setup(self):
        self.add_input(name=self.options['input_name'], val=self.options['input_default'],
                       shape=self.options['shape'], units=self.options['input_units'])
        self.add_output(name=self.options['output_name'], val=self.options['output_default'],
                        shape=self.options['shape'], units=self.options['output_units'])

        if type(self.options['shape']) == int:
            n = self.options['shape']
        else:
            n = prod(self.options['shape'])

        ar = np.arange(n)
        self.declare_partials(of=self.options['output_name'], wrt=self.options['input_name'],
                              rows=ar, cols=ar, val=self.options['scale'])

    def compute(self, inputs, outputs):
        outputs[self.options['output_name']] = self.options['scale'] * inputs[self.options['input_name']] + self.options['shift']

model = om.Group()
model.add_subsystem('xcomp', subsys=om.IndepVarComp(name='x', val=np.zeros((3, 2))), promotes_outputs=['*'])
model.add_subsystem('link', LinkVarComp(shape=(3, 2)),
                    promotes_inputs=['*'],
                    promotes_outputs=['*'])

p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('x', np.array([[1.0, 3], [10, -5], [0, 3.1]]))
p.run_model()

print(p['y'])
Outputting the expected:
[[ 1. 3. ]
[10. -5. ]
[ 0. 3.1]]
In OpenMDAO, you cannot have the same variable take on two separate names. That's simply not allowed.
The solution you came up with is effectively creating a separate component to hold a copy of the output. That works. You could use an ExecComp to have the same effect with a little less code:
import numpy as np
import openmdao.api as om

model = om.Group()
model.add_subsystem('xcomp', subsys=om.IndepVarComp(name='x', val=np.zeros((3, 2))), promotes_outputs=['*'])
model.add_subsystem('ycomp', om.ExecComp("y=x", shape=(3, 2)), promotes=['*'])

p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('x', np.array([[1.0, 3], [10, -5], [0, 3.1]]))
p.run_model()

print(p['x'])
print(p['y'])
In general, I probably wouldn't actually do this myself. It seems kind of wasteful. Instead, I would modify my post-processing script to look for y, and if it didn't find it, to grab x instead.
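A minimal sketch of that post-processing fallback (the helper name is made up, and it assumes get_val raises a KeyError for an unknown variable name, which is the usual behavior but may differ by version):

def get_y_or_x(p):
    # hypothetical helper: prefer 'y' if the model defines it, otherwise fall back to 'x'
    try:
        return p.get_val('y')
    except KeyError:
        return p.get_val('x')

print(get_y_or_x(p))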
I am trying to declare an input with integer defaults, but it does not seem possible. Am I making a mistake, or are floats enforced in the OpenMDAO core?
Here are the code snippets I tried:
Expected output: something like array([1, 1, 1])
Received output: [1. 1. 1.]
from openmdao.api import ExplicitComponent, Problem, IndepVarComp
import numpy as np

class CompAddWithArrayIndices(ExplicitComponent):
    """Component for tests for declaring with array val and array indices."""

    def setup(self):
        self.add_input('x_a', val=np.ones(6, dtype=int))
        self.add_input('x_b', val=[1] * 5)
        self.add_output('y')

p = Problem(model=CompAddWithArrayIndices())
p.setup()
p.run_model()
print(p['x_a'])
print(p['x_b'])
#%%
from openmdao.api import ExplicitComponent, Problem, IndepVarComp
import numpy as np

class CompAddWithArrayIndices(ExplicitComponent):
    """Component for tests for declaring with array val and array indices."""

    def setup(self):
        self.add_input('x_a', val=np.zeros(3, dtype=int))
        self.add_output('y')

prob = Problem()
ivc = IndepVarComp()
prob.model.add_subsystem('ivc', ivc, promotes=['*'])
ivc.add_output('x_a', val=np.ones(3, dtype=int))
prob.model.add_subsystem('comp1', CompAddWithArrayIndices(), promotes=['*'])
prob.setup()
prob.run_model()
print(prob['x_a'])
Variables added via add_input or add_output will be converted to floats or float arrays. If you want a variable to be an int or any other discrete type, you must use add_discrete_input and add_discrete_output. Such variables will be passed between systems based on connection information, but no attempt will be made to compute their derivatives.
Discrete variable support was added in OpenMDAO v2.5 as an experimental feature (it's still being developed). As of commit 709401e535cf6933215abd942d4b4d49dbf61b2b on the master branch, the promotion problem has been fixed. Make sure you're using a version of OpenMDAO from that commit or later.
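For reference, here is a minimal sketch of the discrete API; the component and variable names are just illustrative:

import openmdao.api as om
import numpy as np

class IntPassThrough(om.ExplicitComponent):
    """Toy component that carries an integer array through unchanged."""

    def setup(self):
        self.add_discrete_input('x_a', val=np.ones(3, dtype=int))
        self.add_discrete_output('y_a', val=np.ones(3, dtype=int))

    def compute(self, inputs, outputs, discrete_inputs, discrete_outputs):
        # discrete variables show up in their own dictionaries
        discrete_outputs['y_a'] = discrete_inputs['x_a'].copy()

prob = om.Problem()
prob.model.add_subsystem('comp', IntPassThrough(), promotes=['*'])
prob.setup()
prob.run_model()
print(prob['y_a'])  # stays an integer array, e.g. [1 1 1]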
I am writing a simplistic ORM that would require me to override the methods inside the child classes. Depending on the context, I am either expecting to access the model through the class methods or through the instance methods, so I cannot simply override them. I believe this code describes the question well enough:
class A:
    @classmethod
    def key_fn(cls, id):
        raise NotImplementedError('')

    @classmethod
    def load_all(cls):
        yield from db_fetch_prefix(cls.key_fn(''))

class B(A):
    @classmethod
    def key_fn(cls, id):
        return f'/keys/{id}'

# how do I make sure B.key_fn is called here?
B.load_all()
Your B.key_fn would indeed be called. But your load_all returns a generator, because you let it yield from db_fetch_prefix. You can check this by running print(B.load_all()) at the end. The output will be:
python .\clsss.py
<generator object A.load_all at 0x0521A330>
I don't know what you want to achieve by using yield from. But an example showing that overriding classmethods in subclasses is possible would be:
class A:
    @classmethod
    def key_fn(cls, id):
        raise NotImplementedError('')

    @classmethod
    def load_all(cls):
        return cls.key_fn('foo')

class B(A):
    @classmethod
    def key_fn(cls, id):
        return f'/keys/{id}'

print(B.load_all())  # prints "/keys/foo"
Guessing wildly that you want to apply the key_fn to each item yielded by your generator db_fetch_prefix (abbreviated to fetch below), the code below shows B.key_fn will be used in the presence of generators as well.
def fetch(callback):
    for i in range(5):  # fake DB fetch
        yield callback(i)

class A:
    @classmethod
    def key_fn(cls, id):
        raise NotImplementedError('')

    @classmethod
    def load_all(cls):
        yield from fetch(cls.key_fn)

class B(A):
    @classmethod
    def key_fn(cls, id):
        return f'/keys/{id}'

print(list(B.load_all()))  # prints ['/keys/0', '/keys/1', '/keys/2', '/keys/3', '/keys/4']
Imagine having a function, which handles a heavy computational job, that we wish to execute asynchronously in a Tornado application context. Moreover, we would like to lazily evaluate the function, by storing its results to the disk, and not rerunning the function twice for the same arguments.
Without caching the result (memoization) one would do the following:
def complex_computation(arguments):
    ...
    return result

@gen.coroutine
def complex_computation_caller(arguments):
    ...
    result = complex_computation(arguments)
    raise gen.Return(result)
Assume that, to achieve function memoization, we choose the Memory class from joblib. By simply decorating the function with @mem.cache, the function can easily be memoized:
@mem.cache
def complex_computation(arguments):
    ...
    return result
where mem can be something like mem = Memory(cachedir=get_cache_dir()).
Now consider combining the two, where we execute the computationally complex function on an executor:
from concurrent import futures

from joblib import Memory
from tornado import gen
from tornado.concurrent import run_on_executor
from tornado.ioloop import IOLoop

class TaskRunner(object):
    def __init__(self, loop=None, number_of_workers=1):
        self.executor = futures.ThreadPoolExecutor(number_of_workers)
        self.loop = loop or IOLoop.instance()

    @run_on_executor
    def run(self, func, *args, **kwargs):
        return func(*args, **kwargs)

mem = Memory(cachedir=get_cache_dir())
_runner = TaskRunner(number_of_workers=1)

@mem.cache
def complex_computation(arguments):
    ...
    return result

@gen.coroutine
def complex_computation_caller(arguments):
    result = yield _runner.run(complex_computation, arguments)
    ...
    raise gen.Return(result)
So the first question is: is the aforementioned approach technically correct?
Now let's consider the following scenario:
@gen.coroutine
def first_coroutine(arguments):
    ...
    result = yield second_coroutine(arguments)
    raise gen.Return(result)

@gen.coroutine
def second_coroutine(arguments):
    ...
    result = yield third_coroutine(arguments)
    raise gen.Return(result)
The second question is how one can memoize second_coroutine. Is it correct to do something like this:
@gen.coroutine
def first_coroutine(arguments):
    ...
    mem = Memory(cachedir=get_cache_dir())
    mem_second_coroutine = mem(second_coroutine)
    result = yield mem_second_coroutine(arguments)
    raise gen.Return(result)

@gen.coroutine
def second_coroutine(arguments):
    ...
    result = yield third_coroutine(arguments)
    raise gen.Return(result)
[UPDATE I] Caching and reusing a function result in Tornado discusses using functools.lru_cache or repoze.lru.lru_cache as a solution for the second question.
The Future objects returned by Tornado coroutines are reusable, so it generally works to use in-memory caches such as functools.lru_cache, as explained in this question. Just be sure to put the caching decorator before @gen.coroutine.
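In other words, a sketch of that decorator ordering, using functools.lru_cache as the in-memory cache (it assumes arguments is hashable):

import functools
from tornado import gen

@functools.lru_cache(maxsize=None)  # caching decorator goes on the outside...
@gen.coroutine                      # ...so it caches the Future the coroutine returns
def second_coroutine(arguments):
    result = yield third_coroutine(arguments)
    raise gen.Return(result)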
On-disk caching (which seems to be implied by the cachedir argument to Memory) is trickier, since Future objects cannot generally be written to disk. Your TaskRunner example should work, but it's doing something fundamentally different from the others because complex_computation is not a coroutine. Your last example will not work, because it's trying to put the Future object in the cache.
Instead, if you want to cache things with a decorator, you'll need a decorator that wraps the inner coroutine with a second coroutine. Something like this:
cache = {}  # simple in-memory cache, keyed by the call arguments

def cached_coroutine(f):
    @gen.coroutine
    def wrapped(*args):
        if args in cache:
            return cache[args]
        result = yield f(*args)
        cache[args] = result
        return result
    return wrapped
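Applied to your second question, usage would then look something like this (a sketch only; it requires arguments to be hashable, since it is used as the dictionary key):

@cached_coroutine
@gen.coroutine
def second_coroutine(arguments):
    ...
    result = yield third_coroutine(arguments)
    raise gen.Return(result)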
I'm using pytest.mark to give my tests kwargs. However, if I use the same mark on both the class and a test within the class, the class's mark overrides the mark on the function when the same kwargs are used for both.
import pytest

animal = pytest.mark.animal

@animal(species='croc')  # Mark the class with a kwarg
class TestClass(object):
    @animal(species='hippo')  # Mark the function with new kwarg
    def test_function(self):
        pass

@pytest.fixture(autouse=True)  # Use a fixture to inspect my function
def animal_inspector(request):
    print request.function.animal.kwargs  # Show how the function object got marked
    # prints {'species': 'croc'} but the function was marked with 'hippo'
Where'd my hippo go and how can I get him back?
There are unfortunately various pytest bugs related to this; I'm guessing you're running into one of them. The ones I found are related to subclassing, which you don't do there, though.
So I've been digging around in the pytest code and figured out why this is happening. The marks on the functions are applied to the function at import time, but the class- and module-level marks don't get applied at the function level until test collection. Function marks happen first and add their kwargs to the function. Then class marks overwrite any matching kwargs, and module marks further overwrite any matching kwargs.
My solution was to simply create my own modified MarkDecorator that filters kwargs before they are added to the marks. Basically, whatever kwarg values get set first (which seems to always be by a function decorator) will always be the values on the mark. Ideally I think this functionality should be added in the MarkInfo class, but since my code wasn't creating instances of that, I went with what I was creating instances of: MarkDecorator. Note that I only changed two lines from the source code (the bits about keys_to_add).
import inspect

from _pytest.mark import istestfunc, MarkInfo

class TestMarker(object):  # Modified MarkDecorator class
    def __init__(self, name, args=None, kwargs=None):
        self.name = name
        self.args = args or ()
        self.kwargs = kwargs or {}

    @property
    def markname(self):
        return self.name  # for backward-compat (2.4.1 had this attr)

    def __repr__(self):
        d = self.__dict__.copy()
        name = d.pop('name')
        return "<MarkDecorator %r %r>" % (name, d)

    def __call__(self, *args, **kwargs):
        """ if passed a single callable argument: decorate it with mark info.
            otherwise add *args/**kwargs in-place to mark information. """
        if args and not kwargs:
            func = args[0]
            is_class = inspect.isclass(func)
            if len(args) == 1 and (istestfunc(func) or is_class):
                if is_class:
                    if hasattr(func, 'pytestmark'):
                        mark_list = func.pytestmark
                        if not isinstance(mark_list, list):
                            mark_list = [mark_list]
                        mark_list = mark_list + [self]
                        func.pytestmark = mark_list
                    else:
                        func.pytestmark = [self]
                else:
                    holder = getattr(func, self.name, None)
                    if holder is None:
                        holder = MarkInfo(
                            self.name, self.args, self.kwargs
                        )
                        setattr(func, self.name, holder)
                    else:
                        # Don't set kwargs that already exist on the mark
                        keys_to_add = {key: value for key, value in self.kwargs.items()
                                       if key not in holder.kwargs}
                        holder.add(self.args, keys_to_add)
                return func
        kw = self.kwargs.copy()
        kw.update(kwargs)
        args = self.args + args
        return self.__class__(self.name, args=args, kwargs=kw)

# Create my Mark instance. Note my modified mark class must be imported to be used
animal = TestMarker(name='animal')

# Apply it to class and function
@animal(species='croc')  # Mark the class with a kwarg
class TestClass(object):
    @animal(species='hippo')  # Mark the function with new kwarg
    def test_function(self):
        pass

# Now prints {'species': 'hippo'} Yay!