Possible issue in presentation of N² diagram with indexed inputs/outputs? - openmdao

When visualizing the structure of the circuit tutorial via an N² diagram, I noticed that implicit components with indexed inputs/outputs labelled using the pattern x:y (e.g. I_out:0 of n1) do not display connections into the output of the block (in this case V of n1).
I understand that it is computing the residuals with the inputs and some initial "guess" to provide the output, so is this by design for ImplicitComponent, because the connections are implicit? I tend to use the diagrams for debugging, and seeing no connections to the output makes it unclear whether it is connected, even though the inputs are fed into it and the code processes them via the residual equation correctly.

This is a known bug in OpenMDAO 2.9.1, but has been fixed already on OpenMDAO master. So the next release, due out before the end of Feb 2020 (2.10) should have the issue fixed.

Related


Make input variables inactive when computing the jacobian
We are setting up an aeroelastic optimization framework for wind turbine optimization, and we are facing issues with defining inputs and outputs for the components.
The issue is that we might have many inputs and outputs for a solver (example further down), but they are likely not active for all optimization cases. This leads to the problem that we need to compute partials for all combinations of inputs and outputs even though we might only have a single active input and output. Is it possible to tell the component which inputs and outputs are active design variables?
Example:
An aerodynamic wind turbine rotor solver (ExplicitComponent).
Inputs
Chord (c, distributed along the blade span - 1D array)
Twist (t, distributed along the blade span - 1D array)
Outputs
Power (P, scalar)
Lift coefficient (Cl, distributed along the blade span - 1D array)
For the solver above we have both AD forward and backward gradients. Below are two optimization problems, where the first does not lead to computational overhead but the second does.
Optimization problem 1
Maximize power while constraining the lift-coefficient to 1
max P for c, t
subj Cl <= 1
All inputs and outputs are active as design variables and objectives/constraints.
Optimization problem 2
Maximize power
max P for c, t
If using the same OpenMDAO component, the Cl output is still there and OpenMDAO would therefore compute the gradient for it. That is computationally expensive: all the needed gradients are obtained from a single reverse-AD pass for P, but it will still compute the gradients for Cl. Is there a way to sidestep that behavior, e.g. by making the output inactive?
We have tried to make the inputs and outputs of the component dynamic, but the code quickly becomes difficult to read, and for nested components it is difficult to maintain. Another issue is that activity is mostly something you need to define for the problem, not the component.
You mention that you are using AD, but not which of the derivative APIs you are using. From the context of your question it sounds like you're using the compute_partials API, which means you're likely asking the AD system to compute all the partials you need and then passing them to OpenMDAO.
Assuming I have guessed right, there is one possible way to speed things up a bit and get the effect you are looking for without explicitly turning I/O on and off, and AD-based partials are particularly well suited to this approach.
The matrix-free derivative APIs in OpenMDAO are designed to give you the exact behavior you want automatically. For ExplicitComponent, the method is called compute_jacvec_product. In the example from the OpenMDAO docs this is implemented manually, but it should tie in with an AD system very easily. For example, the JAX AD library has jvp and vjp functions that can be used in the fwd and rev modes of the OpenMDAO matrix-free APIs, respectively.
When using these matrix-free APIs, OpenMDAO will only call your AD system the minimum number of times. The exact number depends on whether OpenMDAO selects fwd or rev mode (or what you hard-code in setup), and also on the number of design variables and constraints you have.
In your case, I would guess you'd end up using reverse mode. Then, when you don't have the Cl constraint, you wouldn't get the extra calls to the AD library.
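To make that concrete, here is a minimal sketch (not your actual rotor solver; the _power function and the array sizes are made-up stand-ins) of an ExplicitComponent that routes OpenMDAO's matrix-free calls to JAX's jvp/vjp:

    import jax
    import jax.numpy as jnp
    import openmdao.api as om

    jax.config.update("jax_enable_x64", True)  # match OpenMDAO's float64 data

    def _power(c, t):
        # Hypothetical stand-in for the aerodynamic rotor solver:
        # scalar power from chord and twist distributions.
        return jnp.sum(c * jnp.cos(t))

    class Rotor(om.ExplicitComponent):
        def setup(self):
            n = 10
            self.add_input('c', shape=n)
            self.add_input('t', shape=n)
            self.add_output('P')

        def compute(self, inputs, outputs):
            outputs['P'] = float(_power(jnp.asarray(inputs['c']),
                                        jnp.asarray(inputs['t'])))

        def compute_jacvec_product(self, inputs, d_inputs, d_outputs, mode):
            c = jnp.asarray(inputs['c'])
            t = jnp.asarray(inputs['t'])
            if mode == 'fwd':
                # One jvp call per forward seed; no loop over outputs.
                dc = jnp.asarray(d_inputs['c']) if 'c' in d_inputs else jnp.zeros_like(c)
                dt = jnp.asarray(d_inputs['t']) if 't' in d_inputs else jnp.zeros_like(t)
                _, dP = jax.jvp(_power, (c, t), (dc, dt))
                d_outputs['P'] += dP
            else:
                # One vjp call per reverse seed; an unconstrained output
                # like Cl would simply never be seeded, so it costs nothing.
                _, vjp_fn = jax.vjp(_power, c, t)
                dc, dt = vjp_fn(jnp.asarray(d_outputs['P'][0]))
                if 'c' in d_inputs:
                    d_inputs['c'] += dc
                if 't' in d_inputs:
                    d_inputs['t'] += dt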
There are a few additional caveats for the matrix-free APIs when using implicit components that I didn't cover here. Your question specifically noted ExplicitComponent, so I'm not sure they are relevant. But if you graduate to implicit components, you also have to worry about the solve_linear method along with apply_linear (the implicit analogue of the explicit compute_jacvec_product method).

Understanding the complex-step in a physical sense

I think I understand what complex step is doing numerically/algorithmically.
But some questions still linger; the first two might have the same answer.
1- I replaced the partial derivative calculations of the 'Betz_limit' example with complex step and removed the analytical gradients. Looking at the recorded design_var evolution, none of the values are complex. Aren't they supposed to show up as a+bi somehow? Or does it always step in the real space?
2- Trying to picture 'cs' in a physical context: for example, a design variable of beam length (m), an objective of mass (kg), and a constraint of loads (Nm). I could be using an explicit component to calculate these (pure Python) or an external code component (pure Fortran). Numerically they can all handle complex numbers, but obviously the mass is a real value. So when we say "capable of handling complex numbers", is it just a matter of handling a+bi, where the actual mass is always 'a' and b is always equal to 0?
3- How about the step size? I understand there won't be any subtractive cancellation errors, but what if I have a design variable normalized/scaled to 1 and a range of 0.8 to 1.2? Decreasing the step to 1e-10 does not seem to make sense. I am a bit confused there.
The ability to use complex arithmetic to compute derivative approximations is based on the mathematics of complex arithmetic.
You should read about the theory to get a better understanding of why it works and how the step size issue is resolved with complex-step vs finite-difference.
There is no physical interpretation that you can make for the complex-step method. You are simply taking advantage of the mathematical properties of complex arithmetic to approximate a derivative more accurately than FD can. So the key is that your code is set up to do complex arithmetic correctly.
Sometimes, engineering analyses do actually leverage complex numbers. One aerospace example of this is the Joukowski transformation. In electrical engineering, complex numbers come up all the time in load-flow analysis of AC circuits. If you have such an analysis, then you cannot easily use complex-step to approximate derivatives, since the analysis itself is already complex. In these cases it is technically possible to use a more general class of numbers called hyper-dual numbers, but this is not supported in OpenMDAO, so for an analysis like this you could not use complex-step.
Also, occasionally there are implementations of methods that are not complex-step safe, which will prevent you from using it unless you define a new complex-step-safe version. The simplest example is the np.absolute() method in the numpy library for Python. The implementation of this, when passed a complex number, will return the absolute magnitude of the number:
abs(a+bj) = sqrt(a^2 + b^2), e.g. abs(1+1j) = sqrt(1^2 + 1^2) = 1.4142
While not mathematically incorrect, this implementation would mess up the complex-step derivative approximation.
Instead you need an alternate version that gives:
abs(a+bj) = abs(a) + abs(b)*j
So, in summary, you need to watch out for these kinds of functions that are not implemented correctly for use with complex-step. If you have such functions, you need to use alternate complex-step-safe versions of them. And if your analysis itself uses complex numbers, you cannot use complex-step derivative approximations either.
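For concreteness, here is a small sketch (not OpenMDAO's own implementation) of a complex-step-safe absolute value and the derivative approximation it enables. The sign-flip form below agrees with the abs(a) + abs(b)*j version above for the positive step sizes complex-step actually uses:

    import numpy as np

    def cs_safe_abs(x):
        # Flip the whole complex number when the real part is negative, so the
        # imaginary (derivative-carrying) part survives; np.abs would collapse
        # it into a real magnitude and break the approximation.
        return np.where(np.real(x) < 0, -x, x)

    # d/dx of f(x) = |x| * x**2 at x = -2 is -3*x**2 = -12.
    h = 1e-20                    # tiny step, safe: nothing is subtracted
    x = -2.0 + 1j * h
    f = cs_safe_abs(x) * x**2
    print(f.imag / h)            # -> -12.0, effectively exact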
With regard to your step-size question, again I refer you to the complex-step theory papers for greater detail. The basic idea is that without subtractive cancellation you are free to use a very small step size with complex-step, without fear of losing accuracy to numerical issues. So typically you will use a step of 1e-20 or smaller. Since complex-step accuracy scales with the order of step^2, using such a small step gives effectively exact results. You need not worry about scaling issues in most cases if you just take a small enough step.

Graph partitioning optimization

The problem
I have a set of locations on a plane (actually they are pins in a KML file) and I want to partition this graph into subgraphs. Connectivity is pretty good - as with all real world road networks - so I assume that if two locations are close they have some kind of connection. The resulting set of subgraphs should adhere to these constraints:
Every node has to be covered by a subgraph
Every node should be in exactly 1 subgraph
Every node within a subgraph should be close to each other (L2 norm distances)
Every subgraph should contain at least 5 locations
The amount of subgraphs should be minimal
Right now the number of locations is no more than 100, so I thought about brute-forcing through every possibility, but this obviously won't scale well.
I thought about using some k-Nearest-Neighbors algorithm (e.g. using QuickGraph) but I can't get my head around where to start and how to extend/shrink the subgraphs on the way. Maybe it's possible to map this problem to another problem that can easily be solved with some numerical procedure (e.g. Simplex) ...
Maybe someone has experience in this kind of optimization problems and is willing to help me find a solution? I don't have access to Mathematica/Matlab or the like ... but sufficient .NET programming skills and hmm Excel :-)
Thanks a lot!
As soon as there are multiple criteria that need to be satisfied in the best possible way simultaneously, things usually start to get difficult.
A numerical solution could work as follows: you define a utility function that maps partitionings of your locations to positive real values, describing how "good" a partition is by assigning it a "rating" (good could map to high values, bad to values near zero).
Once you have such a function assigning partitions their "values", you simply need to optimize it, and you will hopefully obtain a good solution, provided you defined your utility function reasonably. Evolutionary algorithms are good at that task, since your utility function is probably too complex to optimize analytically due to its discrete nature.
The problem is then only how you assign "values" to partitions via this utility function; that is your task. It can be done, for example, by weighting each criterion with a factor and summing the results up, or with more complex functions (least squares, etc.). The factors you use in the definition of the utility function are tuning parameters and can be varied until the result looks good; a sketch of such a weighted utility function is given below.
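For illustration, a minimal Python sketch of such a utility function (assumptions: locations are 2D (x, y) tuples, a partition is a list of lists of them, and the weights are the tuning parameters just mentioned):

    import math
    from itertools import combinations

    def utility(partition, w_compact=1.0, w_count=5.0, w_small=50.0):
        # Higher is better; feed this to an evolutionary optimizer.
        score = 0.0
        for group in partition:
            # Criterion: nodes within a subgraph should be close (L2 distance).
            for (x1, y1), (x2, y2) in combinations(group, 2):
                score -= w_compact * math.hypot(x1 - x2, y1 - y2)
            # Criterion: at least 5 locations per subgraph (soft penalty).
            if len(group) < 5:
                score -= w_small * (5 - len(group))
        # Criterion: the number of subgraphs should be minimal.
        score -= w_count * len(partition)
        return score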
Some CAS software would help a lot for testing, if you can get your hands on one, but I guess to obtain a black-box solver for your partitioning problem you will need to implement the complete procedure yourself in a language of your choice.

Rapidly-exploring random trees

http://msl.cs.uiuc.edu/rrt/
Can anyone explain how RRT works with simple wording that is easy to understand?
I read the description on the site and on Wikipedia.
What I would like to see is a short implementation of an RRT, or a thorough explanation of the following:
Why does the rrt grow outwards instead of just growing very dense around the center?
How is it different from a naive random tree?
How is the next new vertex that we attempt to reach picked?
I know there is a Motion Strategy Library I could download, but I would much rather understand the idea before I delve into the code than the other way around.
The simplest possible RRT algorithm has been so successful because it is pretty easy to implement. Things tend to get complicated when you:
need to visualise planning concepts in more than two dimensions,
are unfamiliar with the terminology associated with planning, and
get lost in the huge number of variants of RRT that have been described in the literature.
Pseudo code
The basic algorithm looks something like this:
Start with an empty search tree
Add your initial location (configuration) to the search tree
while your search tree has not reached the goal (and you haven't run out of time)
3.1. Pick a location (configuration), q_r, (with some sampling strategy)
3.2. Find the vertex in the search tree closest to that random point, q_n
3.3. Try to add an edge (path) in the tree between q_n and q_r, if you can link them without a collision occurring.
Although that description is adequate, after a while working in this space I really do prefer the pseudocode of figure 5.16 on RRT/RDT in Steven LaValle's book "Planning Algorithms".
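A minimal 2D sketch of that pseudocode (assumptions: a unit-square world with no obstacles, so the collision check trivially passes; the step size and iteration budget are arbitrary):

    import math
    import random

    def rrt(start, goal, step=0.05, goal_tol=0.05, max_iters=5000):
        tree = {start: None}                               # vertex -> parent
        for _ in range(max_iters):
            q_r = (random.random(), random.random())       # 3.1 sample
            q_n = min(tree, key=lambda q: math.dist(q, q_r))   # 3.2 nearest
            d = math.dist(q_n, q_r)
            # 3.3 steer from q_n toward q_r by at most `step`; a real planner
            # would collision-check this segment before adding the edge.
            q_new = q_r if d <= step else tuple(
                n + step * (r - n) / d for n, r in zip(q_n, q_r))
            tree[q_new] = q_n
            if math.dist(q_new, goal) <= goal_tol:
                path = [q_new]                             # walk back to start
                while tree[path[-1]] is not None:
                    path.append(tree[path[-1]])
                return path[::-1]
        return None

    path = rrt((0.1, 0.1), (0.9, 0.9))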
Tree Structure
The reason that the tree ends up covering the entire search space (in most cases) is the combination of the sampling strategy and always connecting from the nearest point in the tree. This effect is known as the Voronoi bias: vertices on the frontier of the tree have the largest Voronoi regions, so uniform sampling is most likely to select them for extension, pushing the tree outwards.
Sampling Strategy
The choice of where to place the next vertex that you will attempt to connect to is the sampling problem. In simple cases, where the search is low-dimensional, uniform random placement (or uniform random placement biased toward the goal) works adequately; a goal-biased sampler is sketched below. In high-dimensional problems, or when motions are very complex (when joints have positions, velocities and accelerations), or when the configuration is difficult to control, sampling strategies for RRTs are still an open research area.
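The goal-biased variant is a one-line change to the sampler in the sketch above (the bias probability is an arbitrary assumption):

    import random

    def sample(goal, bias=0.05):
        # With probability `bias`, pull the tree toward the goal directly.
        if random.random() < bias:
            return goal
        return (random.random(), random.random())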
Libraries
The MSL library is a good starting point if you're really stuck on implementation, but it hasn't been actively maintained since 2003. A more up-to-date library is the Open Motion Planning Library (OMPL). You'll also need a good collision detection library.
Planning Terminology & Advice
From a terminology point of view, the hard bit is to realise that although lots of the diagrams you see in (the early years of) publications on RRT are in two dimensions (trees that link 2D points), this is the absolute simplest case.
Typically, a mathematically rigorous way to describe complex physical situations is required. A good example of this is planning for a robot arm with n linkages. Describing the end of such an arm requires a minimum of n joint angles. This set of minimum parameters to describe a position is a configuration (or state, as some publications call it). A single configuration is often denoted q.
The combination of all possible configurations (or a subset thereof) that can be achieved makes up a configuration space (or state space). This can be as simple as an unbounded 2D plane for a point in the plane, or an incredibly complex combination of ranges of other parameters.

How do programs like mathematica draw graphs and how can I make such a program?

I've been wondering how programs like Mathematica, MATLAB, and so on plot graphs of functions so gracefully and fast. Can anyone explain to me how they do this and, furthermore, how I can do it? Is it related to some aspect or course in computer programming or math? Which, then?
Well, with some encouragement from belisarius, here's my comment as an answer: try looking at matplotlib. From the home page:
matplotlib is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell (ala MATLAB or Mathematica), web application servers, and six graphical user interface toolkits.
It was originally inspired by MATLAB's plotting capabilities, though it's grown a lot since then. It's solid software - and it's open source, under a BSD license, so not only can you read the source, you can hack on it and use it in whatever you like.
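To give a feel for it, a minimal matplotlib script (the function and output name are arbitrary) that samples a curve and writes one of those hardcopy formats:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 2 * np.pi, 200)   # dense, evenly spaced samples
    plt.plot(x, np.sin(x))               # line segments between the samples
    plt.savefig('sine.png')              # one of the "hardcopy formats"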
Another place you could look is gnuplot. It's not one of the common open source licenses, but it's certainly open source, with some permissions to modify and such.
Gnuplot is a portable command-line driven graphing utility for linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don't have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986.
It does 3D plotting as well, which matplotlib doesn't do, and it's been around a lot longer. The reason I thought of matplotlib first is that it's intended as a library for a higher-level language, not a stand-alone application, so I'm guessing it might be a bit easier for you to read.
One other suggestion, just to get an idea of the sorts of things Mathematica is doing under the hood, is to look at the documentation for Plot. In particular, if you look at the available options, you can deduce things.
MaxRecursion (default Automatic): the maximum number of recursive subdivisions allowed
Method (default Automatic): the method to use for refining curves
PerformanceGoal (default $PerformanceGoal): aspects of performance to try to optimize
PlotPoints (default Automatic): initial number of sample points
From the MaxRecursion and PlotPoints, you can see that it's doing an initial sampling then somehow deciding which regions need to be subdivided (resampled) to get an accurate view of the plot. And from there on, it's magic: there is some Method for this, and a PerformanceGoal to guide it...
For MATLAB, because of its cross-platform requirement, there is no alternative to using OpenGL. The MATLAB runtime is written in C++ and the non-axis GUI uses Java Swing, so a MATLAB plot is probably a C++/OpenGL/Swing mixture.
In reality, MATLAB graphics are much less complex than video game graphics. I think it is easier to find tutorials on video game graphics and then "downsize" them to MATLAB functionality, like drawing a single line with a single color.
The most important concept is probably Transformation Matrix.
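As a small sketch of that idea (the data ranges and screen size are arbitrary assumptions), a single 3x3 homogeneous matrix maps data coordinates to pixel coordinates:

    import numpy as np

    def data_to_screen(xmin, xmax, ymin, ymax, width, height):
        sx = width / (xmax - xmin)
        sy = -height / (ymax - ymin)       # flip y: screen y grows downward
        return np.array([[sx, 0.0, -xmin * sx],
                         [0.0, sy, height - ymin * sy],
                         [0.0, 0.0, 1.0]])

    M = data_to_screen(0.0, 2 * np.pi, -1.0, 1.0, 800, 600)
    x = np.linspace(0.0, 2 * np.pi, 5)
    pts = np.vstack([x, np.sin(x), np.ones_like(x)])  # homogeneous points
    print((M @ pts)[:2].T)                            # pixel coordinates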
Basically, most programs that plot any type of graph (particularly graphs of reasonable complexity) will use some type of third-party library.
The specific library used would depend on the programming language that is being used.
For example:
For a .Net application you might use Crystal reports. http://en.wikipedia.org/wiki/Crystal_Reports
For Java you might use JFreeChart. http://www.jfree.org/jfreechart/
And so on...
You will likely find numerous libraries for whatever language you decide to code in.
If you want this functionality in your specific project, I suggest using a library, especially if you are a beginner. The internal complexities of how these graph libraries are implemented are significant, because of issues such as cross-platform compatibility, graphics rendering optimizations (i.e., making sure the graphics render quickly and 'prettily'), the maths associated with positioning elements on the graph, and so forth.
Lastly, I doubt you will find specific courses in this subject (or require them) since, again, excluding VERY specific cases, programmers will always use libraries that already exist.
Why code it yourself when someone has already solved the problem for you?
A good place to start is to understand that there is a grammar to graphics and what you want to construct upon receiving a plot command is a symbolic representation of the graph. For Mathematica, you can do something like
FullForm[Plot[Sin[x], {x, 0, 2 Pi}]]
to see the internal representation Mathematica uses. Basically you need to describe the line segments (2D) or meshes (3D) you want to draw in terms of their color and coordinates. Also, there needs to be information about the scale of the graph and how to draw tick marks, label axes, etc.
This leads us to the heart of the question: how do you determine the line segments you want to draw from a function and a range? If you dig around in the help file for Plot, you see a few things. First, there are a PlotPoints option and a MaxRecursion option. This leads me to believe (and this is just an educated guess, but it is how I would do it) that Mathematica plots the initial number of points on an even interval over the range to get a starting value. The next part is to identify regions where the change exceeds some threshold and then to sample more points until the "change" between any two points in your line segment is below a threshold. Mathematica does this recursively, hence the MaxRecursion option.
So far I have been pretty vague about defining rate of change. A more useful way to describe change is to take three points on your line segment. Assume a linear relationship between the 1st and 3rd points and, assuming this linear relationship, make a prediction about what the 2nd point would be. If the error of this prediction is sufficiently low, then consider the next group of three points. If the error is above a threshold, then you should sample some more points in this region until the threshold is met. In this way you will require relatively few points where the curve is relatively straight, and more at the "interesting" parts where it bends in new directions. The smoothness of the curve you draw will be proportional to the error you are willing to tolerate in the linear prediction of points.
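Here is a small sketch of that scheme (the tolerance, initial point count, and depth limit are arbitrary stand-ins for PlotPoints/MaxRecursion-style settings):

    import math

    def adaptive_sample(f, a, b, n_init=25, tol=0.01, max_depth=8):
        def refine(x1, y1, x3, y3, depth):
            x2 = 0.5 * (x1 + x3)
            y2 = f(x2)
            # Predict the midpoint from a straight line between the endpoints;
            # a bad prediction means the curve bends here, so subdivide.
            if abs(y2 - 0.5 * (y1 + y3)) > tol and depth < max_depth:
                return (refine(x1, y1, x2, y2, depth + 1) + [(x2, y2)]
                        + refine(x2, y2, x3, y3, depth + 1))
            return []

        xs = [a + (b - a) * i / (n_init - 1) for i in range(n_init)]
        pts = [(x, f(x)) for x in xs]
        out = []
        for (x1, y1), (x3, y3) in zip(pts, pts[1:]):
            out.append((x1, y1))
            out.extend(refine(x1, y1, x3, y3, 0))
        out.append(pts[-1])
        return out  # consecutive points define the line segments to draw

    points = adaptive_sample(math.sin, 0.0, 2 * math.pi)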
