Defining a variable in a problem which is not used as an input and equals a previous input/output - openmdao

I would like to define a variable which, depending on certain options, will be equal to a previous output (as if the previous output had two names) or will be the output of a new component.
A trivial solution is to just omit the definition of the value when the component which would define it is not implemented, but I would prefer it to be defined for readability/traceability reasons (to simplify if statements in the code, and to provide it as a timeseries output).
The problem is that, when using the connect statement, if the variable ends up not being used as an input to any other component, OpenMDAO raises an error saying that it attempted to connect but the variable does not exist.
I made a temporary fix with a sort of link statement (LinkVarComp below) which creates an explicit component whose output equals its input (with some extras such as a scale and a shift, which could be useful for linear equations), but I am worried that this adds unnecessary computations/design variables/constraints.
Is there an easier/better workaround (maybe by allowing variables to have multiple names)? What would be the best practice for having a variable with a different name that is equal to a previous output/input?
A simple example:
import numpy as np
import openmdao.api as om

model = om.Group()
model.add_subsystem('xcomp', subsys=om.IndepVarComp(name='x', val=np.zeros((3, 2))), promotes_outputs=['*'])
model.connect('x', 'y')

p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('x', np.array([[1.0, 3], [10, -5], [0, 3.1]]))
p.run_model()
This crashes with the error:
NameError: <model> <class Group>: Attempted to connect from 'x' to 'y', but 'y' doesn't exist.
while the following LinkVarComp component works (but, I suppose, at the cost of adding new variables and computations):
import openmdao.api as om
import numpy as np
from math import prod


class LinkVarComp(om.ExplicitComponent):
    """
    Component whose output equals its (optionally scaled and shifted) input.
    """

    def initialize(self):
        """
        Declare component options.
        """
        self.options.declare('shape', types=(int, tuple), default=1)
        self.options.declare('scale', types=int, default=1)
        self.options.declare('shift', types=float, default=0.)
        self.options.declare('input_default', types=float, default=0.)
        self.options.declare('input_name', types=str, default='x')
        self.options.declare('output_name', types=str, default='y')
        self.options.declare('output_default', types=float, default=0.)
        self.options.declare('input_units', types=(str, None), default=None)
        self.options.declare('output_units', types=(str, None), default=None)

    def setup(self):
        self.add_input(name=self.options['input_name'], val=self.options['input_default'],
                       shape=self.options['shape'], units=self.options['input_units'])
        self.add_output(name=self.options['output_name'], val=self.options['output_default'],
                        shape=self.options['shape'], units=self.options['output_units'])

        if isinstance(self.options['shape'], int):
            n = self.options['shape']
        else:
            n = prod(self.options['shape'])

        ar = np.arange(n)
        self.declare_partials(of=self.options['output_name'], wrt=self.options['input_name'],
                              rows=ar, cols=ar, val=self.options['scale'])

    def compute(self, inputs, outputs):
        outputs[self.options['output_name']] = self.options['scale'] * inputs[self.options['input_name']] + self.options['shift']


model = om.Group()
model.add_subsystem('xcomp', subsys=om.IndepVarComp(name='x', val=np.zeros((3, 2))), promotes_outputs=['*'])
model.add_subsystem('link', LinkVarComp(shape=(3, 2)),
                    promotes_inputs=['*'],
                    promotes_outputs=['*'])

p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('x', np.array([[1.0, 3], [10, -5], [0, 3.1]]))
p.run_model()
print(p['y'])
Outputting the expected:
[[ 1. 3. ]
[10. -5. ]
[ 0. 3.1]]

In OpenMDAO, you cannot have the same variable take on two separate names. That's simply not allowed.
The solution you came up with effectively creates a separate component to hold a copy of the output. That works. You could use an ExecComp to get the same effect with a little less code:
import numpy as np
import openmdao.api as om

model = om.Group()
model.add_subsystem('xcomp', subsys=om.IndepVarComp(name='x', val=np.zeros((3, 2))), promotes_outputs=['*'])
model.add_subsystem('ycomp', om.ExecComp("y=x", shape=(3, 2)), promotes=['*'])

p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('x', np.array([[1.0, 3], [10, -5], [0, 3.1]]))
p.run_model()
print(p['x'])
print(p['y'])
In general, I probably wouldn't actually do this myself. It seems kind of wasteful. Instead, I would modify my post-processing script to look for y and, if it isn't found, grab x instead.
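For instance, a minimal sketch of that post-processing fallback (my own illustration, not part of the original answer; it assumes that asking the problem for a missing variable raises a KeyError) could look like this:

def get_with_fallback(p, primary='y', fallback='x'):
    # hypothetical helper: try the aliased name first, then fall back to the original output
    try:
        return p.get_val(primary)
    except KeyError:
        return p.get_val(fallback)

print(get_with_fallback(p))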

Related

Enforcing integers for declared inputs is not possible

I try to declare an input with default integer values, but it does not seem possible. Am I making a mistake, or are floats enforced in the OpenMDAO core?
Here are the code snippets I tried.
Expected output: something like array([1, 1, 1])
Received output: [1. 1. 1.]
from openmdao.api import ExplicitComponent, Problem, IndepVarComp
import numpy as np


class CompAddWithArrayIndices(ExplicitComponent):
    """Component for tests for declaring with array val and array indices."""

    def setup(self):
        self.add_input('x_a', val=np.ones(6, dtype=int))
        self.add_input('x_b', val=[1]*5)
        self.add_output('y')


p = Problem(model=CompAddWithArrayIndices())
p.setup()
p.run_model()
print(p['x_a'])
print(p['x_b'])
#%%
from openmdao.api import ExplicitComponent, Problem, IndepVarComp
import numpy as np


class CompAddWithArrayIndices(ExplicitComponent):
    """Component for tests for declaring with array val and array indices."""

    def setup(self):
        self.add_input('x_a', val=np.zeros(3, dtype=int))
        self.add_output('y')


prob = Problem()
ivc = IndepVarComp()
prob.model.add_subsystem('ivc', ivc, promotes=['*'])
ivc.add_output('x_a', val=np.ones(3, dtype=int))
prob.model.add_subsystem('comp1', CompAddWithArrayIndices(), promotes=['*'])
prob.setup()
prob.run_model()
print(prob['x_a'])
Variables added via add_input or add_output will be converted to floats or float arrays. If you want a variable to be an int or any other discrete type, you must use add_discrete_input and add_discrete_output. Such variables are passed between systems based on connection information, but no attempt is made to compute their derivatives.
Discrete variable support was added in OpenMDAO v2.5 as an experimental feature (it's still being developed). As of commit 709401e535cf6933215abd942d4b4d49dbf61b2b on the master branch, the promotion problem has been fixed. Make sure you're using a version of OpenMDAO from that commit or later.
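For reference, a minimal sketch of the discrete-variable approach (my own illustration rather than code from the original answer; the component name, the 'comp' subsystem name, and the summing output are made up):

import numpy as np
from openmdao.api import ExplicitComponent, Problem


class CompWithDiscreteArray(ExplicitComponent):
    """Keeps an integer array intact by declaring it as a discrete input."""

    def setup(self):
        # Discrete variables are passed through as-is (no float conversion)
        # and no derivatives are computed for them.
        self.add_discrete_input('x_a', val=np.ones(3, dtype=int))
        self.add_output('y', val=0.0)

    def compute(self, inputs, outputs, discrete_inputs, discrete_outputs):
        outputs['y'] = float(discrete_inputs['x_a'].sum())


p = Problem()
p.model.add_subsystem('comp', CompWithDiscreteArray())
p.setup()
p.run_model()
print(p.get_val('comp.x_a'))   # expected: [1 1 1] with integer dtype
print(p.get_val('comp.y'))     # expected: [3.]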

Connecting the declared input variables (global) to ExecComp

Is there a way to connect the global input variables, i.e.
def initialize(self):
    self.options.declare('num_elements', types=int)
to an ExecComp?
prob.model.add_subsystem('paraboloid', ExecComp('f = num_elements*3 + c'))
There isn't any way to connect to a declared option. The only things you can connect to are variables that were added inside components with add_input or add_output. I think in this case, since num_elements isn't meant to change, you should use a string expression to put the value into the ExecComp -- something like:
prob.model.add_subsystem('paraboloid', ExecComp('f = %d*3 + c' % num_elements))
where num_elements is a variable in your top level script.
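For completeness, a small self-contained version of that suggestion might look like the following (the value 4 for num_elements and the promotion of c are just for illustration):

import openmdao.api as om

num_elements = 4  # known when the model is built, so it can be baked into the expression

prob = om.Problem()
prob.model.add_subsystem('paraboloid',
                         om.ExecComp('f = %d*3 + c' % num_elements),
                         promotes=['*'])
prob.setup()
prob.set_val('c', 2.0)
prob.run_model()
print(prob.get_val('f'))   # expected: [14.]  (4*3 + 2)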

Indentation Error in Python 3.6.1 def

I had been having issues with Python 3.7 for quite some time with seemingly pointless indentation errors, so I decided to go back to 3.6, specifically repl.it's Python 3.6.1. As I mentioned, the errors occur for no good reason whatsoever as far as I can tell. The code is written below:
from random import randint
import functools

printf = functools.partial(print, end=" ")

defNuc = ['C','A','T','G']

def opNuc():
def create():
    nuc = [0]
    nucop = [0]
    length = randint(11,16)
    print (length - 1)
    for i in range(1,length):
        part = randint(1,4)
        for a in range(1,4)
            if part == a:
                nuc = defNuc[a]
                nucOp = defNuc[-a]
        if i != length - 1:
            printf(nuc[i],i,"-")
        else:
            print(nuc[i],i)
    for i in range (1,length):
        if i != length - 1:
            printf(nucOp[i],"-")
        else:
            print(nucop[i])
The error is at line 9, at
def create():
and as for the reason of error, it just says
expected an indented block
Edit:
This was completely my stupidity, don't take the post seriously, will be deleted in 10 minutes.
You never finished the definition of opNuc, so the parser is expecting an indented line to continue the body of that function. Either add a pass statement to provide a trivial body:
def opNuc():
    pass
or indent the definition of create if that is supposed to be local to the body of opNuc (unlikely, but possible):
def opNuc():
    def create():
        ...
The problem is that your first function, opNuc, was never finished. I have made this simple mistake many times myself, and it is very easy to miss. It's easy to fix though: just put pass inside the opNuc function and it should be fine. Hope I helped!

accumulator in pyspark with dict as global variable

Just for learning purposes, I tried to set a dictionary as a global variable using an accumulator. The add function works well, but when I ran the code and updated the dictionary inside the map function, it always returned empty.
Similar code that sets a list as a global variable works, though.
import re
from pyspark.accumulators import AccumulatorParam


class DictParam(AccumulatorParam):
    def zero(self, value=""):
        return dict()

    def addInPlace(self, acc1, acc2):
        acc1.update(acc2)


if __name__ == "__main__":
    sc, sqlContext = init_spark("generate_score_summary", 40)
    rdd = sc.textFile('input')
    # print(rdd.take(5))
    dict1 = sc.accumulator({}, DictParam())

    def file_read(line):
        global dict1
        ls = re.split(',', line)
        dict1 += {ls[0]: ls[1]}
        return line

    rdd = rdd.map(lambda x: file_read(x)).cache()
    print(dict1)
For anyone who arrives at this thread looking for a Dict accumulator for pyspark: the accepted solution does not solve the posed problem.
The issue is actually in the DictParam you defined: it does not update the original dictionary. This works:
class DictParam(AccumulatorParam):
    def zero(self, value=""):
        return dict()

    def addInPlace(self, value1, value2):
        value1.update(value2)
        return value1
The original code was missing the return value.
I believe that print(dict1) simply gets executed before the rdd.map() does.
In Spark, there are two types of operations:
transformations, which describe the future computation;
and actions, which actually trigger the execution.
Accumulators are updated only when some action is executed:
Accumulators do not change the lazy evaluation model of Spark. If they
are being updated within an operation on an RDD, their value is only
updated once that RDD is computed as part of an action.
If you check out the end of this section of the docs, there is an example exactly like yours:
accum = sc.accumulator(0)

def g(x):
    accum.add(x)
    return f(x)

data.map(g)
# Here, accum is still 0 because no actions have caused the `map` to be computed.
So you would need to add some action, for instance:
rdd = rdd.map(lambda x: file_read(x)).cache() # transformation
foo = rdd.count() # action
print(dict1)
Please make sure to check the details of the various RDD functions and accumulator peculiarities, because this might affect the correctness of your result. (For instance, rdd.take(n) will by default only scan one partition, not the entire dataset.)
For accumulator updates performed inside actions only, their value is
only updated once that RDD is computed as part of an action
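Putting both fixes together (the corrected DictParam and an action to force evaluation), a minimal end-to-end sketch could look like the following; the SparkContext setup, app name, and input path are placeholders standing in for the original init_spark helper:

import re
from pyspark import SparkContext
from pyspark.accumulators import AccumulatorParam


class DictParam(AccumulatorParam):
    def zero(self, value=None):
        return dict()

    def addInPlace(self, acc1, acc2):
        acc1.update(acc2)
        return acc1          # returning the merged dict is essential


sc = SparkContext(appName="generate_score_summary")
rdd = sc.textFile('input')   # placeholder path from the original post
dict1 = sc.accumulator({}, DictParam())


def file_read(line):
    global dict1
    ls = re.split(',', line)
    dict1 += {ls[0]: ls[1]}
    return line


rdd = rdd.map(file_read).cache()   # transformation only
rdd.count()                        # action: forces the map (and accumulator updates) to run
print(dict1.value)                 # the accumulated dict is now populated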

pyparsing for querying a database of chemical elements

I would like to parse a query for a database of chemical elements.
The database is stored in an XML file. Parsing that file produces a nested dictionary that is stored in a singleton object that inherits from collections.OrderedDict.
Asking for an element will give me an ordered dictionary of its corresponding properties
(i.e. ELEMENTS['C'] --> {'name': 'carbon', 'neutron': 0, 'proton': 6, ...}).
Conversely, asking for a property will give me an ordered dictionary of its values for all the elements (i.e. ELEMENTS['proton'] --> {'H': 1, 'He': 2, ...}).
A typical query could be:
mass > 10 or (nucleon < 20 and atomic_radius < 5)
where each 'subquery' (i.e. mass > 10) will return the set of elements that match it.
Then, the query will be converted internally into a string that will be evaluated to produce a set of the indexes of the elements that matched it. In that context the operators and/or are not boolean operators but rather ensemble operators that act upon Python sets.
I recently posted a question about building such a query. Thanks to the useful answers I got, I think that I have more or less done the job (in a nice way, I hope!), but I still have some questions related to pyparsing.
Here is my code:
import numpy
from pyparsing import *

# This imports a singleton object storing the database dictionary as
# described earlier
from ElementsDatabase import ELEMENTS

and_operator = oneOf(['and', '&'], caseless=True)
or_operator = oneOf(['or', '|'], caseless=True)

# ELEMENTS.properties is a property getter that returns the list of
# registered properties in the database
props = oneOf(ELEMENTS.properties, caseless=True)

# A property keyword can be quoted or not.
props = Suppress('"') + props + Suppress('"') | props

# When parsed, it must be replaced by the following expression that
# will be eval'ed later.
props.setParseAction(lambda t: "numpy.array(ELEMENTS['%s'].values())" % t[0].lower())

quote = QuotedString('"')
integer = Regex(r'[+-]?\d+').setParseAction(lambda t: int(t[0]))
float_ = Regex(r'[+-]?(\d+(\.\d*)?)?([eE][+-]?\d+)?').setParseAction(lambda t: float(t[0]))

comparison_operator = oneOf(['==', '!=', '>', '>=', '<', '<='])

comparison_expr = props + comparison_operator + (quote | float_ | integer)
comparison_expr.setParseAction(lambda t: "set(numpy.where(%s)%s%s)" % tuple(t))

grammar = Combine(operatorPrecedence(comparison_expr,
                                     [(and_operator, 2, opAssoc.LEFT),
                                      (or_operator, 2, opAssoc.LEFT)]))

# A test query
res = grammar.parseString('"mass " > 30 or (nucleon == 1)', parseAll=True)

print eval(' '.join(res._asStringList()))
My questions are the following:
1. Using transformString instead of parseString never triggers any exception, even when the string to be parsed does not match the grammar. However, that is exactly the functionality I need. Is there a way to do so?
2. I would like to reintroduce white space between my tokens so that my eval does not fail. The only way I found to do so is the one implemented above. Would you see a better way using pyparsing?
Sorry for the long post, but I wanted to introduce its context in deeper detail. BTW, if you find this approach bad, do not hesitate to tell me!
Thank you very much for your help.
Eric
Do not worry about my concern; I found a workaround. I used the SimpleBool.py example shipped with pyparsing (thanks for the hint, Paul).
Basically, I used the following approach:
1. For each subquery (i.e. mass > 10), using the setParseAction method, I attached a function that returns the set of elements that matched the subquery.
2. Then, I attached the following functions to each logical operator (and, or and not):
def not_operator(token):
    _, s = token[0]
    # ELEMENTS is the singleton described in my original post
    return set(ELEMENTS.keys()).difference(s)

def and_operator(token):
    s1, _, s2 = token[0]
    return (s1 & s2)

def or_operator(token):
    s1, _, s2 = token[0]
    return (s1 | s2)

# Thanks to Paul for the hint.
grammar = operatorPrecedence(comparison_expr,
                             [(not_token, 1, opAssoc.RIGHT, not_operator),
                              (and_token, 2, opAssoc.LEFT, and_operator),
                              (or_token, 2, opAssoc.LEFT, or_operator)])
Please note that these operators act upon Python sets rather than on booleans.
And that does the job.
I hope that this approach will help anyone of you.
Eric
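To make the set-based approach concrete, here is a small self-contained sketch along the same lines (the toy DATA dictionary, the simplified grammar, and all names are my own illustration rather than the original ELEMENTS database; it uses the classic operatorPrecedence name, which newer pyparsing releases call infixNotation):

import operator
from pyparsing import (CaselessKeyword, Regex, Word, alphas, alphanums,
                       opAssoc, operatorPrecedence)

# Toy "database": property name -> {element symbol: value}
DATA = {'mass':   {'H': 1.0, 'He': 4.0, 'C': 12.0, 'Fe': 55.8},
        'proton': {'H': 1,   'He': 2,   'C': 6,    'Fe': 26}}
ALL = set(DATA['mass'])

OPS = {'==': operator.eq, '!=': operator.ne, '>': operator.gt,
       '>=': operator.ge, '<': operator.lt, '<=': operator.le}

prop = Word(alphas, alphanums + '_')
number = Regex(r'[+-]?\d+(\.\d*)?').setParseAction(lambda t: float(t[0]))
comp_op = Regex(r'==|!=|>=|<=|>|<')

# Each comparison is reduced directly to the set of matching elements.
comparison = prop + comp_op + number
comparison.setParseAction(
    lambda t: {el for el, v in DATA[t[0]].items() if OPS[t[1]](v, t[2])})

def not_action(tokens):
    return ALL - tokens[0][1]

def and_action(tokens):
    result = tokens[0][0]
    for operand in tokens[0][2::2]:
        result = result & operand
    return result

def or_action(tokens):
    result = tokens[0][0]
    for operand in tokens[0][2::2]:
        result = result | operand
    return result

grammar = operatorPrecedence(comparison,
                             [(CaselessKeyword('not'), 1, opAssoc.RIGHT, not_action),
                              (CaselessKeyword('and'), 2, opAssoc.LEFT, and_action),
                              (CaselessKeyword('or'),  2, opAssoc.LEFT, or_action)])

result = grammar.parseString('mass > 10 or (proton < 2 and mass < 5)', parseAll=True)[0]
print(sorted(result))   # expected: ['C', 'Fe', 'H']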
