Meaning of factor is exactly singular / singular entry found in group - openmdao

I am struggling with an error regarding a singular entry in the group caused by an implicit component, and I can't figure out how to solve it.
We created a louvered fin heat exchanger model, based on the effectiveness-NTU method. It uses an ImplicitComponent to solve the system in such a way that the "guessed" outlet temperatures (used to calculate fluid properties) are equal to the actual outlet temperatures calculated based on the actual heat transfer. This component seems to run fine and the N2 diagram of this base heat exchanger can be found here.
Among others, two inputs are the mass flow rates of the cold and hot side (see indeps in original N2). However, the validation data uses cold side flow velocity and hot side volumetric flow rate instead of mass flow rates. Although not direct group inputs, these properties are calculated within the heat exchanger group. Manually changing the mass flow rates until the flow velocity and volumetric flow rate meet that of the validation data works fine. But I figured I could add an additional implicit component around the heat exchanger group to do that work for me. The resulting N2 diagram can be found here. However, this implicit component results in an error:
Traceback (most recent call last):
File "...\openmdao\solvers\linear\direct.py", line 275, in _linearize
self._lu = scipy.sparse.linalg.splu(matrix)
File "...\scipy\sparse\linalg\dsolve\linsolve.py", line 326, in splu
ilu=False, options=_options)
RuntimeError: Factor is exactly singular
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "louveredfin3.py", line 91, in <module>
p.run_model()
File "...\openmdao\core\problem.py", line 527, in run_model
self.model.run_solve_nonlinear()
File "...\openmdao\core\system.py", line 3734, in run_solve_nonlinear
self._solve_nonlinear()
File "...\openmdao\core\group.py", line 1886, in _solve_nonlinear
self._nonlinear_solver.solve()
File "...\openmdao\solvers\solver.py", line 597, in solve
raise err
File "...\openmdao\solvers\solver.py", line 593, in solve
self._solve()
File "...\openmdao\solvers\solver.py", line 384, in _solve
self._single_iteration()
File "...\openmdao\solvers\nonlinear\newton.py", line 230, in _single_iteration
self._linearize()
File "...\openmdao\solvers\nonlinear\newton.py", line 161, in _linearize
self.linear_solver._linearize()
File "...\openmdao\solvers\linear\direct.py", line 278, in _linearize
raise RuntimeError(format_singular_error(system, matrix))
RuntimeError: Singular entry found in Group (<model>) for column associated with state/residual 'ConvertInputs.m_dot_hot' index 0.
What does this error mean in practical terms? Is the solver not able to reduce the residual by changing the output (m_dot_hot and m_dot_cold in this case)? However, if this is the case I am failing to understand why, as manually changing the mass flow rates does result in a change in 'V_cold' and 'flowrate_hot'.
As an alternative I tried to use only one solver for all the components on the same level (N2 here), but this results in the same error. Also removing the original temperature implicit component (i.e. only having one implicit component, which changes the mass flow rate) did not solve the problem.
In case it helps, the implicit component looks like this (using single values instead of array with length n for the time being):
class ConvertInputs(om.ImplicitComponent):

    def initialize(self):
        self.options.declare('n', default=1, desc='length of the array')

    def setup(self):
        self.add_input('flowrate_hot_required', val=1.33, desc='required flowrate', units='L/s')
        self.add_input('flowrate_hot', val=1.33, desc='actual flowrate', units='L/s')
        self.add_input('V_cold_required', val=8., desc='required air velocity', units='m/s')
        self.add_input('V_cold', val=8., desc='actual air velocity', units='m/s')
        self.add_output('m_dot_hot', val=1.296, desc='hot side mass flow rate', units='kg/s')
        self.add_output('m_dot_cold', val=1.655, desc='cold side mass flow rate', units='kg/s')
        self.declare_partials('*', '*', method='fd')

    def apply_nonlinear(self, inputs, outputs, residuals):
        residuals['m_dot_hot'] = inputs['flowrate_hot'] - inputs['flowrate_hot_required']
        residuals['m_dot_cold'] = inputs['V_cold'] - inputs['V_cold_required']
and the related lines in FluidProperties like this:
outputs['flowrate_hot'] = inputs['m_dot_hot'] / (outputs['rho_hot_in']*1e-3)
outputs['V_cold'] = inputs['m_dot_cold'] / (outputs['rho_cold_in'] * inputs['A_flow_cold'])
The solvers used are the NewtonSolver (solve_subsystems=True) and DirectSolver. Additionally I double checked that the derivatives are declared everywhere (e.g. self.declare_partials('*', '*', method='fd') in all components for now), but without success so far.
EDIT – based on the answer from Justin:
Thank you for the answer and the tips! I implemented the BalanceComp replacing the top-level implicit component, but unfortunately it did not make a difference. Setting maxiter=0 for the top solver still throws the error, but the lower solver seems to solve without problem:
+
+ ========
+ original
+ ========
+ NL: Newton 0 ; 14.3235199 1
+ NL: Newton 1 ; 0.0668341831 0.00466604463
+ NL: Newton 2 ; 0.000273898972 1.91223229e-05
+ NL: Newton 3 ; 1.17390481e-06 8.1956448e-08
+ NL: Newton 4 ; 5.0212449e-09 3.50559425e-10
+ NL: Newton 5 ; 2.14662005e-11 1.49866797e-12
+ NL: Newton Converged
NL: Newton 0 ; 0.252556049 1
Traceback (most recent call last):
[…]
RuntimeError: Factor is exactly singular
To get a feeling for the magnitudes: using only the HX subsystem and manually changing the indeps “m_dot_hot” and “m_dot_cold” with stepsize 1e-6:
Diff. flowrate_hot = -1.0271477852707989e-06
Diff. V_cold = -4.6808071525461514e-06
I had not realised that you can add bounds to an implicit component; that is very useful, and I added them for both implicit components. Unfortunately, that did not change the error either. Setting the partials to complex step for the top-level implicit component and the FluidProperties component did not improve the situation either. Should this be done for all components, or only for the ones involved here?
However, I noticed the following when printing the “m_dot_* ” outputs in the implicit component, and the “m_dot_* ” inputs in the FluidProperties component in apply_nonlinear() and compute() respectively. It shows that after the heat exchanger subsystem has solved, the top-level implicit component “m_dot_* ” outputs change with the stepsize of 1e-6 as I would expect it to do for gradient calculation. However, the inputs in FluidProperties are not printed anymore at all after this and the singular error occurs soon after. Hence, to me it seems that the lower level FluidProperties compute method is not called anymore. Since no analytical derivatives are given or values set (i.e. only FD is used of all output w.r.t. all inputs), it appears to me that the FluidProperties component is never executed with the “m_dot_* ” stepsize and the gradient is not (successfully or correctly) calculated. Nevertheless, the N2 chart shows that the inputs/outputs of top-level to subsystem are correctly connected. Can this be a pointer to something specifically going wrong?

Your error (RuntimeError: Singular entry found in Group (<model>) for column associated with state/residual 'ConvertInputs.m_dot_hot' index 0.) indicates that all of the partial derivatives in that column are 0.
Practically, that means that as far as OpenMDAO is concerned changes to 'ConvertInputs.m_dot_hot' don't affect any of the residuals in your model.
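To make the "all-zero column" reading concrete, here is a minimal pure-Python sketch (it is not OpenMDAO code). The residuals function is a hypothetical stand-in for the coupled model, the density and area values are made up for illustration, and the connected flag mimics a state that never reaches the downstream calculation.

```python
def residuals(states, connected=True):
    """Toy stand-in for the coupled model: two states, two residuals."""
    m_dot_hot, m_dot_cold = states
    # Hypothetical fluid densities and flow area, for illustration only.
    rho_hot, rho_cold, area = 1000.0, 1.2, 0.17
    # If the state is not actually wired to the downstream component,
    # that component keeps using its own default input value.
    m_hot_seen = m_dot_hot if connected else 1.296
    flowrate_hot = m_hot_seen / (rho_hot * 1e-3)
    v_cold = m_dot_cold / (rho_cold * area)
    return [flowrate_hot - 1.33, v_cold - 8.0]


def fd_jacobian(func, states, step=1e-6, **kwargs):
    """Forward-difference Jacobian d(residual_i)/d(state_j)."""
    base = func(states, **kwargs)
    jac = [[0.0] * len(states) for _ in base]
    for j in range(len(states)):
        perturbed = list(states)
        perturbed[j] += step
        shifted = func(perturbed, **kwargs)
        for i in range(len(base)):
            jac[i][j] = (shifted[i] - base[i]) / step
    return jac


def zero_columns(jac):
    """Indices of states that influence none of the residuals."""
    return [j for j in range(len(jac[0])) if all(row[j] == 0.0 for row in jac)]


guess = [1.296, 1.655]
print(zero_columns(fd_jacobian(residuals, guess)))
print(zero_columns(fd_jacobian(residuals, guess, connected=False)))
```

In the second call the column for the first state is entirely zero, which is the situation a direct linear solver cannot factorize.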
One thing you can try is to use the BalanceComp from OpenMDAO's standard library. This component is designed specifically for what you were trying to accomplish, but has derivatives already defined. There is a small chance that this will fix your problem, but likely not.
What I recommend is the following:
Set the maxiter option on your top-level Newton solver to 0 (the sub-solves will still run). Then you can run your model and manually change your guess for m_dot_hot to see if you really can manually converge it in the model with the coupling built in. Perhaps there is a bug in the way you did the connections in this model that is causing the problem. You said that you could manually converge the original model; this step will make sure that your coupled model also has a solution.
I noticed that you did not define any bounds on your design variables. Perhaps the solver is driving m_dot_hot to 0 or a negative number in its iterations. I suggest setting both lower and upper bounds to something reasonable, as follows:
self.add_output('m_dot_hot', val=1.296, desc='hot side mass flow rate', units='kg/s', lower=1e-4, upper=10)
self.add_output('m_dot_cold', val=1.655, desc='cold side mass flow rate', units='kg/s', lower=1e-4, upper=10)
For OpenMDAO V3.0 and up, the default for the Newton solver is to use the bounds-enforcing line search, which will respect these limits when trying to converge.
Consider switching to the cs method for your partial derivatives. It's possible that the FD values are just not very good (at least with the default step sizes) and you're getting 0s that are numerical instead of physical.
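As an illustration of how a finite-difference zero can be numerical rather than physical, the sketch below compares a forward difference with a complex-step derivative in plain Python. The function f and the step sizes are arbitrary choices for the demonstration, not values taken from the model in the question.

```python
def f(x):
    # Arbitrary smooth test function; works for float and complex input.
    return x ** 3 + 2.0 * x


def forward_difference(func, x, step):
    return (func(x + step) - func(x)) / step


def complex_step(func, x, step=1e-30):
    # Perturb along the imaginary axis; no subtraction, so no cancellation.
    return func(complex(x, step)).imag / step


x0 = 1.5
exact = 3.0 * x0 ** 2 + 2.0

# A step below the floating-point resolution of x0 is lost entirely,
# so the forward difference reports a derivative of exactly zero.
print(forward_difference(f, x0, 1e-20))
print(complex_step(f, x0), exact)
```

The complex-step result stays accurate even with a tiny step, which is why it is often preferred over finite differences when the underlying code supports complex arithmetic.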

Related

Performing gradient-based optimization with intermediate discrete variables in OpenMDAO

We have a user who wants to solve an optimization problem that has intermediate discrete variables using a gradient-based method. They're running into this error. I know we could restructure the problem to not use discrete variables, but could I also treat this error as a warning given that the discrete variable doesn't change? Or is there a fundamental reason that the derivs wouldn't propagate correctly? To be clear, we're using approx_totals at the model level.
Here is a small test case exhibiting this:
import openmdao.api as om
import numpy as np


class DiscreteComp(om.ExplicitComponent):
    def setup(self):
        self.add_input('a', 2.)
        self.add_discrete_output('b', val=0)

    def compute(self, inputs, outputs, discrete_inputs, discrete_outputs):
        discrete_outputs['b'] = 2 * inputs['a']


class DummyComp(om.ExplicitComponent):
    def setup(self):
        self.add_input('a', 2.)
        self.add_discrete_input('b', 1)
        self.add_output('c')

    def compute(self, inputs, outputs, discrete_inputs, discrete_outputs):
        b = discrete_inputs['b']
        outputs['c'] = inputs['a']**2 * b


prob = om.Problem()
prob.model.add_subsystem('discrete_comp', DiscreteComp(), promotes=['*'])
prob.model.add_subsystem('dummy_comp', DummyComp(), promotes=['*'])
prob.driver = om.pyOptSparseDriver()
prob.driver.options['optimizer'] = 'SLSQP'
prob.model.add_design_var('a', lower=-10, upper=10)
prob.model.add_objective('c')
prob.model.approx_totals()
prob.setup()
# run the optimization
prob.run_driver()
which gives this output:
Traceback (most recent call last):
File "tmp.py", line 39, in <module>
prob.run_driver()
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/core/problem.py", line 685, in run_driver
return self.driver.run()
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/drivers/pyoptsparse_driver.py", line 480, in run
raise self._exc_info
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/drivers/pyoptsparse_driver.py", line 643, in _gradfunc
sens_dict = self._compute_totals(of=self._quantities,
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/core/driver.py", line 892, in _compute_totals
total_jac = _TotalJacInfo(problem, of, wrt, use_abs_names,
File "/home/john/anaconda3/envs/weis/lib/python3.8/site-packages/openmdao/core/total_jac.py", line 209, in __init__
raise RuntimeError("Total derivative %s '%s' depends upon "
RuntimeError: Total derivative with respect to '_auto_ivc.v0' depends upon discrete output variables ['discrete_comp.b'].
Given your example, there is a reason that OpenMDAO raises this error. Discrete outputs are not passed through the same data system as continuous ones. Only the continuous outputs have access to the derivative system. While it is possible to have this intermediate discrete variable here, and take a finite difference over the whole thing, you only get away with it because you really have a continuous calculation there.
If the output had been really discrete, then it's not mathematically valid to take a derivative or make a finite-difference approximation across the compute. This is why OpenMDAO raises the error.
There are admittedly some corner cases where you could argue that OpenMDAO should let you do this. For example if your output c was computed as:
outputs['c'] = floor(inputs['a'])
and you knew that you would limit the value of a to between 1 and 2, then you would know that c is always exactly 1. Hence you could say that it's valid to differentiate across this, since the discrete variable never changes in value and hence the function is differentiable within these bounds.
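A small pure-Python sketch of this corner case: a forward difference across a floor-based function is well behaved strictly inside an interval where the value never changes, and meaningless as soon as the step straddles a jump. The function name discrete_like and the sample points are illustrative choices only.

```python
import math


def discrete_like(a):
    # Piecewise-constant function standing in for a truly discrete output.
    return float(math.floor(a))


def forward_difference(func, x, step=1e-6):
    return (func(x + step) - func(x)) / step


# Well inside [1, 2): the value never changes, so the slope is zero.
inside = forward_difference(discrete_like, 1.5)

# The step straddles the jump at a = 2: the "derivative" is a huge,
# step-size-dependent number that carries no useful gradient information.
across = forward_difference(discrete_like, 2.0 - 5e-7)

print(inside, across)
```

This is why the check is only safe to bypass when you can guarantee the discrete value stays fixed over the whole range the optimizer can visit.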
If you wanted to use OpenMDAO to finite difference this, you have a few options:
1. Even though it's a discrete value, list it as a continuous one anyway. This works around OpenMDAO's validity check, but it's totally up to you to make sure the values are really never changing within the ranges you are taking derivatives around. If you decide to implement analytic derivatives, simply don't declare one for the discrete output with respect to anything.
2. Move the discrete calculations into the setup phase and pass the information around as options. This forces you to absolutely respect the "never changes" paradigm since setup only happens once. It does require some redesign.
3. If you don't like OpenMDAO's nanny-ing you can always comment out the error :) You obviously do this at your own risk ... but that's one of the values of open source software!
My personal recommendation is option #2. This is really the best overall design, and forces you to be sure the discrete calculations never change during an opt.

Use of data with negative weights in unbinned maximum likelihood fit in zfit

I am trying to perform an unbinned 3D angular fit in zfit, where the input data is a sample with per-event sWeights assigned from a separate invariant mass peak fit. I think I'm running into issues of negatively weighted events in some regions of the angular phase space, as zfit gives the error:
Traceback (most recent call last):
File "unbinned_angular_fit.py", line 282, in <module>
main()
File "unbinned_angular_fit.py", line 217, in main
result = minimizer.minimize(nll)
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/baseminimizer.py", line 265, in minimize
return self._hook_minimize(loss=loss, params=params)
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/baseminimizer.py", line 274, in _hook_minimize
return self._call_minimize(loss=loss, params=params)
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/baseminimizer.py", line 278, in _call_minimize
return self._minimize(loss=loss, params=params)
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/minimizer_minuit.py", line 179, in _minimize
result = minimizer.migrad(**minimize_options)
File "src/iminuit/_libiminuit.pyx", line 859, in iminuit._libiminuit.Minuit.migrad
RuntimeError: exception was raised in user function
User function arguments:
Hm_amp = +nan
Hm_phi = +0.000000
Hp_phi = +0.000000
Original python exception in user function:
RuntimeError: Loss starts already with NaN, cannot minimize.
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/minimizer_minuit.py", line 121, in func
values=info_values)
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/baseminimizer.py", line 47, in minimize_nan
return self._minimize_nan(loss=loss, params=params, minimizer=minimizer, values=values)
File "/home/dhill/miniconda/envs/ana_env/lib/python3.7/site-packages/zfit/minimizers/baseminimizer.py", line 107, in _minimize_nan
raise RuntimeError("Loss starts already with NaN, cannot minimize.")
I can avoid this error by restricting one of the fit observable ranges slightly, to avoid the region with small numbers of data events where some data is weighted negatively (signal is being slightly over-subtracted by the sWeights). But I wondered if there is another way around this in zfit?
Perhaps the UnbinnedNLL method in zfit explicitly requires positive events, but the negatively weighted data points could be set to zero or a small positive value instead? I should say that the level of negative weighting appears to be small compared to the total sum of weights, and occurs at the edge of one of the angular distributions where there are only a small number of data events. The low rate of data in this region is due to experimental acceptance effects.
Code that runs on a test file to reproduce the error is here:
https://github.com/donalrinho/zfit_3D_unbinned_angular_fit_test
The error is not encountered when the range for the costheta_X_VV_reco variable is restricted to (-0.9, 1.0) instead of the full range (-1.0, 1.0). I believe this is because it removes a region of phase space where the weighted data is negative.
As far as can be seen in the definition of the NLL in zfit, the weights simply multiply the log probabilities, so negative weights should not be an issue.
However, it seems that the PDF returns negative probabilities for some of the values, which can be seen by simply getting the returned array with
custom_pdf.pdf(data)
These negative probabilities will turn into NaNs once the log is taken.
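The distinction between negative weights and negative PDF values can be sketched in plain Python. This is not zfit's implementation; weighted_nll is a hypothetical helper that only mirrors the "weights multiply the log probabilities" form described above, and it mimics the NaN that array libraries return for the log of a non-positive number.

```python
import math


def weighted_nll(pdf_values, weights):
    """Weighted negative log-likelihood: -sum(w_i * log(p_i))."""
    total = 0.0
    for p, w in zip(pdf_values, weights):
        # math.log raises on non-positive input; array libraries return
        # NaN instead, which is the behaviour imitated here.
        log_p = math.log(p) if p > 0.0 else float("nan")
        total -= w * log_p
    return total


# A negative weight only flips the sign of one term: the sum stays finite.
print(weighted_nll([0.2, 0.5, 0.3], [1.0, -0.1, 0.8]))

# A single negative PDF value poisons the whole sum with NaN.
print(weighted_nll([0.2, -0.01, 0.3], [1.0, 1.0, 1.0]))
```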
Maybe there is a typo in the definition of the PDF as the variable h_pst seems to be unused.
Just to close this thread: it is the case that the PDF is negative in some places, which I think is due to the acceptance PDF. It's also true that h_pst isn't used, but removing it didn't change anything. In the end I have just fitted the data in the region where I don't have any negative PDF values, which doesn't seem to impact the results (it just ignores a small region in costheta_X where the density is close to zero).

Error when computing jacobian vector product

I have a group with coupled disciplines which is nested in a model where all other components are uncoupled. I have assigned a nonlinear Newton solver and a linear direct solver to the coupled group.
When I try to run the model with the default "RunOnce" solver everything is OK, but as soon as I try to run an optimization I get the following error raised from linear_block_gs.py:
File "...\openmdao\core\group.py", line 1790, in _apply_linear
scope_out, scope_in)
File "...\openmdao\core\explicitcomponent.py", line 339, in _apply_linear
self.compute_jacvec_product(*args)
File "...\Thermal_Cycle.py", line 51, in compute_jacvec_product
d_inputs['T'] = slope * deff_dT / alp_sc
File "...\openmdao\vectors\vector.py", line 363, in __setitem__
raise KeyError(msg.format(name))
KeyError: 'Variable name "T" not found.'
Below is the N2 diagram of the model. Variable "T" which is mentioned in the error comes from implicit "temp" component and is fed back to "sc" component (file Thermal_Cycle.py in the error msg) as input.
N2 diagram
The error disappears when I assign DirectSolver on top of the whole model. My impression was that "RunOnce" would work as long as groups with implicit components have appropriate solvers applied to them as suggested here and is done in my case. Why does it not work when trying to compute total derivatives of the model, i.e. why compute_jacvec_product cannot find coupled variable "T"?
The reason I want to use the "RunOnce" solver is that optimization with DirectSolver on top becomes very slow as my variable vector "T" increases. I suspect it should be much faster with linear "RunOnce"?
I think this example of the compute_jacvec_product method might be helpful.
The problem is that, depending on the solver configuration or the structure of the model, OpenMDAO may only need some of the partials that you provide in this method. For example, your matrix-free component might have two inputs, but only one is connected, so OpenMDAO does not need the derivative with respect to the unconnected input, and in fact, does not allocate space for it in the d_inputs or d_outputs vectors.
So, to fix the problem, you just need to put an if statement before assigning the value, just like in the example.
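The guard pattern can be sketched as follows. This is not the asker's Thermal_Cycle.py: the area = length * width relation and all variable names are hypothetical, and plain dicts stand in for OpenMDAO's vectors, which support the same membership test. In a real component the same body would sit in the compute_jacvec_product method.

```python
def compute_jacvec_product(inputs, d_inputs, d_outputs, mode):
    """Jacobian-vector product for the toy relation area = length * width."""
    # Nothing to do if the output is not part of the current linear solve.
    if 'area' not in d_outputs:
        return
    if mode == 'fwd':
        # Only touch inputs that were actually allocated for this solve.
        if 'length' in d_inputs:
            d_outputs['area'] += inputs['width'] * d_inputs['length']
        if 'width' in d_inputs:
            d_outputs['area'] += inputs['length'] * d_inputs['width']
    elif mode == 'rev':
        if 'length' in d_inputs:
            d_inputs['length'] += inputs['width'] * d_outputs['area']
        if 'width' in d_inputs:
            d_inputs['width'] += inputs['length'] * d_outputs['area']


inputs = {'length': 3.0, 'width': 2.0}

# Forward mode where only 'length' is relevant: no KeyError for 'width'.
d_in_fwd, d_out_fwd = {'length': 1.0}, {'area': 0.0}
compute_jacvec_product(inputs, d_in_fwd, d_out_fwd, 'fwd')

# Reverse mode where only 'width' is relevant: no KeyError for 'length'.
d_in_rev, d_out_rev = {'width': 0.0}, {'area': 1.0}
compute_jacvec_product(inputs, d_in_rev, d_out_rev, 'rev')

print(d_out_fwd, d_in_rev)
```

Without the membership checks, the unguarded assignment would fail in exactly the way shown in the traceback whenever a variable is left out of the vectors.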
Based on the N2, I think that I agree with your strategy of putting the direct solver down around the coupling only. That should work fine, however it looks like you're implementing a linear operator in your component, based on:
File "...\Thermal_Cycle.py", line 51, in compute_jacvec_product
d_inputs['T'] = slope * deff_dT / alp_sc
You shouldn't use a direct solver with matrix-free partials. The direct solver computes an inverse, which requires the full assembly of the matrix. The only reason it works at all is that OM has some fall-back functionality to manually assemble the jacobian by passing columns of the identity matrix through the compute_jacvec_product method.
This fallback mechanism is there to make things work, but it's very slow (you end up calling compute_jacvec_product A LOT).
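The cost of that fallback can be illustrated with a generic sketch in plain Python (this is not OpenMDAO's internal code). The names assemble_from_matvec and example_matvec are hypothetical, and the 3x3 operator is arbitrary; the point is that recovering an n-column matrix from a matrix-free operator takes one product per column.

```python
def assemble_from_matvec(matvec, n):
    """Build a dense n x n matrix by applying matvec to each unit vector."""
    columns = []
    for j in range(n):
        unit = [0.0] * n
        unit[j] = 1.0
        columns.append(matvec(unit))  # one operator call per column
    # Transpose the list of columns into row-major form.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


calls = {'count': 0}


def example_matvec(v):
    # Matrix-free product with [[2, 1, 0], [0, 3, 0], [1, 0, 4]].
    calls['count'] += 1
    return [2.0 * v[0] + v[1], 3.0 * v[1], v[0] + 4.0 * v[2]]


jac = assemble_from_matvec(example_matvec, 3)
print(jac, calls['count'])
```

For a state vector with thousands of entries this means thousands of operator calls per linearization, which is why an iterative linear solver is the usual partner for matrix-free partials.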
The error you're getting, and why it works when you put the direct solver higher up in the model, is probably due to a lack of necessary if conditions in your compute_jacvec_product implementation.
See the docs on explicit component for some examples, but the key insight is to realize that not every single variable will be present when doing a jacvec product (it depends on what kind of solve is being done --- i.e. one for Newton vs one for total derivatives of the whole model).
So those if-checks are needed to check if variables are relevant. This is done, because for expensive codes (i.e. CFD) some of these operations are quite expensive and you don't want to do them unless you need to.
Are your components so big that you can't use the compute_partials function? Have you tried specifying the sparsity in your jacobian? Usually the matrix-free partial derivative methods are not needed until you start working with really big PDE solvers with 1e6 or more implicit output variables.
Without seeing some code, it's hard to comment in more detail, but in summary:
You shouldn't use compute_jacvec_product in combination with a direct solver. If you really need matrix-free partials, then you need to switch to iterative linear solvers like PETScKrylov.
If you can post the code for the component in Thermal_Cycle.py that has the compute_jacvec_product, I could give a more detailed recommendation on how to handle the partial derivatives in that case.

Why does lsoda (in R) fail to complete running duration, with warning messages?

I am writing a numerical model in R, for an ecological system, and solving it using "lsoda" from package deSolve.
My model has 14 state variables.
I define the model, set it up fine, and give time duration according to this:
nyears<-60
ndays<-nyears*365+1
times<-seq(0,nyears*365,by=1)
Rates of change of state variables (e.g. the rate of change of variable "A1" is "dA1") are calculated according to existing values for state variables (at time=t) and a set of parameters.
Simplified example:
dA1<-Tf*A1*(ImaxA*p_sub)
Where Tf, ImaxA and p_sub are parameters, and A1 is my state variable at time=t.
When I solve the model, I use the lsoda solver like this:
out<-as.data.frame(lsoda(start,times,model,parms))
Sometimes (depending on my parameter combinations), the model run completes over the entire duration I have specified, however sometimes it stops short of the mark (still giving me output up until the solver "crashes"). When it "crashes", this message is displayed:
DLSODA- At current T (=R1), MXSTEP (=I1) steps
taken on this call before reaching TOUT
In above message, I1 = 5000
In above message, R1 = 11535.5
Warning messages:
1: In lsoda(start, times, model, parms) :
an excessive amount of work (> maxsteps ) was done, but integration was not successful - increase maxsteps
2: In lsoda(start, times, model, parms) :
Returning early. Results are accurate, as far as they go
It commonly appears when one of the state variables is getting exponentially bigger, or is tending very near to zero, however sometimes it crashes when seemingly not much change is happening. I may be wrong, but is it due to the rate of change of state-variables becoming too large? If so, why might it also "crash" when there is not a fast rate of change?
Is there a way that I can make the solver complete its task with the specified parameter values, maybe with a more relaxed tolerance for error?
Thank you all for your contributions. I looked at some of the rates, and at the point of crashing, the model was switching between two metabolic states. The fast rate of this binary switch caused the solver to stop, rejecting the solution because the rate of change was too large. I have fixed my model by introducing a gradual switch between states (with a logistic curve) instead of this binary switch. I acknowledge that I didn't give enough info in the original question, so thanks for the help you offered!
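The idea of replacing a binary switch with a logistic one can be sketched as below. The sketch is in Python rather than R, and the function names, the threshold and the steepness value are all hypothetical; the original model's switching variable is not shown in the question.

```python
import math


def hard_switch(x, threshold):
    # Discontinuous on/off switch: the right-hand side jumps at the threshold.
    return 1.0 if x >= threshold else 0.0


def logistic_switch(x, threshold, steepness=10.0):
    # Smooth transition from 0 to 1 centred on the threshold.
    z = steepness * (x - threshold)
    # Two algebraically equal forms, chosen so exp never overflows.
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


for x in (0.3, 0.49, 0.5, 0.51, 0.7):
    print(x, hard_switch(x, 0.5), round(logistic_switch(x, 0.5), 4))
```

A larger steepness approaches the binary behaviour, so it trades fidelity to the original switch against how hard the integrator has to work near the transition.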

Trouble implementing a very simple mass flow source

I am currently learning Modelica by trying some very simple examples. I have defined a connector Incompressible for an incompressible fluid like this:
connector Incompressible
  flow Modelica.SIunits.VolumeFlowRate V_dot;
  Modelica.SIunits.SpecificEnthalpy h;
  Modelica.SIunits.Pressure p;
end Incompressible;
I now wish to define a mass or volume flow source:
model Source_incompressible
  parameter Modelica.SIunits.VolumeFlowRate V_dot;
  parameter Modelica.SIunits.Temperature T;
  parameter Modelica.SIunits.Pressure p;
  Incompressible outlet;
equation
  outlet.V_dot = V_dot;
  outlet.h = enthalpyWaterIncompressible(T); // quick'n'dirty enthalpy function
  outlet.p = p;
end Source_incompressible;
However, when checking Source_incompressible, I get this:
The problem is structurally singular for the element type Real.
The number of scalar Real unknown elements are 3.
The number of scalar Real equation elements are 4.
I am at a loss here. Clearly, there are three equations in the model - where does the fourth equation come from?
Thanks a lot for any insight.
Dominic,
There are a couple of issues going on here. As Martin points out, the connector is unbalanced (you don't have matching "through" and "across" pairs in that connector). For fluid systems, this is acceptable. However, intensive fluid properties (e.g., enthalpy) have to be marked as so-called "stream" variables.
This topic is, admittedly, pretty complicated. I'm planning on adding an advanced chapter to my online Modelica book on this topic but I haven't had the time yet. In the meantime, I would suggest you have a look at the Modelica.Fluid library and/or this presentation by one of its authors, Francesco Casella.
That connector is not a physical connector. You need one flow variable for each potential variable. This is the OpenModelica error message if it helps a little:
Warning: Connector .Incompressible is not balanced: The number of potential variables (2) is not equal to the number of flow variables (1).
Error: Too many equations, over-determined system. The model has 4 equation(s) and 3 variable(s).
Error: Internal error Found Equation without time dependent variables outlet.V_dot = V_dot
This is because the unconnected connector will generate one equation for the flow:
outlet.V_dot = 0.0;
This means outlet.V_dot is replaced in:
outlet.V_dot = V_dot;
And you get:
0.0 = V_dot;
But V_dot is a parameter and can not be assigned to in an equation section (needs an initial equation if the parameter has fixed=false, or a binding equation in the default case).

Resources