Will there be a meaningful increase in solution speed if inactive constraints are removed (say wing stiffness was dominating wing max stress, so the stress constraint is inactive)? Is it more of a problem for the optimisation drivers themselves, or does OpenMDAO have any tricks to help with this?
OpenMDAO does not change the size of the optimization problem (the number of design variables and constraints) during execution, and most optimizers don't allow this.
Many optimizers already employ an active set approach. While the framework is required to compute partials of potentially inactive constraints, this often isn't a significant performance hit.
Obviously this depends on the size of the problem involved and the cost of computing constraints. There are some tricks for aggregating large vector constraints into a single constraint (http://openmdao.org/twodocs/versions/latest/features/building_blocks/components/ks_comp.html), but I wouldn't worry about this unless you're confident that your constraint evaluation is a performance bottleneck.
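To illustrate the aggregation idea, here is a minimal plain-Python sketch of the Kreisselmeier-Steinhauser (KS) function, which folds a vector of constraint values into one smooth value that never underestimates the maximum. It is not OpenMDAO's own implementation; the function name, the `rho` default and the sample stress margins are assumptions made for the example.

```python
import math

def ks_aggregate(values, rho=50.0):
    """KS aggregate: a smooth, conservative stand-in for max(values)."""
    g_max = max(values)
    # Shifting by g_max keeps the exponentials from overflowing.
    total = sum(math.exp(rho * (g - g_max)) for g in values)
    return g_max + math.log(total) / rho

# Hypothetical normalised stress constraints, feasible when g <= 0.
stress_margins = [-0.40, -0.15, -0.02, -0.30]
ks_value = ks_aggregate(stress_margins)
print(ks_value)
```

The result lies between max(values) and max(values) + ln(n)/rho, so constraining the single KS value to be non-positive is slightly conservative compared with constraining every entry.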
Related
I am getting a warning in this form:
DerivativesWarning:Constraints or objectives ['traj.linkages.stage_1_grav_turn:alpha_final|coast_1:alpha_initial', 'traj.phases.stage_1_maneuver.path_constraints.path:q_alpha'] cannot be impacted by the design variables of the problem.
I'm not sure what to make of the first one, a linkage constraint. Alpha is a parameter in the grav_turn and coast phases, and it's set to 0. The second one makes no sense, as in the stage_1_maneuver phase alpha is a control, so you can definitely control dynamic pressure * alpha. Perhaps it is because alpha at the end of that phase is constrained to 0?
Anyway, the optimizer converges fine and produces results that look correct and make sense when cross-checked. I was just curious about this.
In OpenMDAO V3.9.0 a feature was added that detects rows and columns of all 0 in the total derivative Jacobian. A row of all 0's means that an objective or constraint is not impacted by any of the design variables. A column of all 0's means that a design variable does not impact any constraint or objective values.
Both of these situations are potentially problematic. A 0 column means that there are fewer degrees of freedom than you might think, since that DV doesn't affect anything. This isn't fatal, but it is still something that is worth warning a user about.
A 0 row is much more problematic. If the row is associated with a constraint, it means the optimizer has no ability to satisfy that constraint. You may get "lucky" and find that the constraint happens to be satisfied at the initial condition anyway, and so you can technically solve the optimization problem (your specific case is likely one of these). However, mathematically the problem is singular, and unless the optimizer you are using has specific code to handle this corner case it can make things difficult.
One of the primary reasons this feature was added was that the OpenMDAO dev team noticed that Dymos users were particularly prone to accidentally creating 0 rows when adding linkage and path constraints. Often these 0 rows seem to not cause harm, but we have definitely also seen cases where they give the optimizer fits.
The warning helps you identify the problem so you can correct it.
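The kind of check described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not OpenMDAO's code; the function name, the tolerance argument and the sample Jacobian are assumptions.

```python
def find_zero_rows_and_columns(jacobian, tol=0.0):
    """Return (zero_rows, zero_cols) index lists for a dense Jacobian given as a list of rows."""
    zero_rows = [i for i, row in enumerate(jacobian)
                 if all(abs(v) <= tol for v in row)]
    n_cols = len(jacobian[0]) if jacobian else 0
    zero_cols = [j for j in range(n_cols)
                 if all(abs(row[j]) <= tol for row in jacobian)]
    return zero_rows, zero_cols

# Hypothetical total Jacobian: rows are constraints/objectives, columns are design variables.
jac = [
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],   # this constraint is not affected by any design variable
    [0.5, 0.0, 0.0],
]
rows, cols = find_zero_rows_and_columns(jac)
print(rows, cols)  # → [1] [1]
```

Here row 1 is a constraint the optimizer cannot influence, and column 1 is a design variable that influences nothing.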
In this case, it looks like you have two separate 0 rows.
traj.linkages.stage_1_grav_turn:alpha_final|coast_1:alpha_initial means that none of the design variables you've given to the optimizer affect that constraint. Likely this means that you have specified both alpha_final and alpha_initial as fixed_final and fixed_initial respectively. You're getting away with it because the initial conditions you provided must have both alphas equal by construction.
You would still be better off either removing the constraint, or adding at least one end of the linkage or the other as a design variable.
traj.phases.stage_1_maneuver.path_constraints.path:q_alpha means that at least one of the entries in your path constraint is not affected by any DV. It is likely not the entire path constraint, but just one end of it that is fixed because it's computed from fixed boundary conditions. In this case, you can simply add indices to the add_path_constraint call to exclude the first or last point from the constraint.
Since your optimization is running, the 0 rows aren't killing you. However, it's good practice to clean this up. It's possible that not having them will improve performance now, or it may save you from a future situation where the optimization "mysteriously" stops working because you somehow trigger a situation where the optimizer can no longer handle the 0 rows.
First of all, what is the maximum theoretical speed/speed up?
Can anyone explain why pipelining cannot operate at its maximum theoretical speed?
The maximum theoretical speedup is equal to the increase in pipeline depth. In a scalar (one instruction wide execution) design, the ideal instructions per cycle is one. Ideally, the clock frequency could increase by a factor equal to the increase in pipeline depth.
The actual frequency increase will be less than this ideal due to latching overheads, clock skew, and imbalanced division of work/latency. (While one can theoretically place latches at any point, the amount of state latched, its position, and other factors make certain points more friendly for stage divisions.)
Manufacturing variation also means that work designed to take an equal amount of time will not do so for all stages in the pipeline. A designer can provide more slack so that more chips will meet minimal timing in all stages. Another technique to handle such variation is to accept that not all chips will meet the target frequency (whether one exclusively uses the "golden samples" or use lower frequency chips as well is a marketing decision).
As one might expect, with shallow pipelines variation in a stage is spread out over more logic and so is less likely to affect frequency.
Wave pipelining, where multiple signal waves (corresponding to pipeline stages) can be passing through a block of logic at the same time, provides a limited method to avoid latch overhead. However, besides other design issues, it is more sensitive to variation both from manufacturing and from run-time conditions such as temperature and voltage (which one might wish to intentionally vary to target different power/performance behaviors).
Even if one did have incredible hardware that provided a perfect frequency increase, hazards (as mentioned in Peter Cordes' comment) would prevent the perfect utilization of available execution resources.
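As a rough worked example of the effects above, the following sketch computes the speedup of a pipeline over a single-stage design. The model is an assumption made for illustration: stages are perfectly balanced, every stage pays the same latch overhead, and hazards are folded into an average number of stall cycles per instruction. The function name and the numbers are hypothetical.

```python
def pipeline_speedup(logic_delay, depth, latch_overhead=0.0, stalls_per_instruction=0.0):
    """Speedup of a `depth`-stage pipeline over a single-stage design (balanced stages assumed)."""
    baseline_time = logic_delay + latch_overhead            # one instruction per long cycle
    cycle_time = logic_delay / depth + latch_overhead       # every stage pays the latch cost
    cycles_per_instruction = 1.0 + stalls_per_instruction   # hazards add stall cycles
    return baseline_time / (cycle_time * cycles_per_instruction)

# Hypothetical numbers: 10 ns of logic split into 5 stages.
ideal = pipeline_speedup(10.0, 5)
with_latches = pipeline_speedup(10.0, 5, latch_overhead=0.5)
with_hazards = pipeline_speedup(10.0, 5, latch_overhead=0.5, stalls_per_instruction=0.25)
print(ideal, with_latches, with_hazards)
```

With no overhead and no stalls the speedup equals the depth (5x); adding 0.5 ns of latch overhead brings it to about 4.2x, and an average of 0.25 stall cycles per instruction brings it to about 3.36x.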
In Constraint Satisfaction Problems (CSPs), you basically have a set of constraints over variables, and a domain of values for each variable. Then given some configuration of the variables (assignment of variables to values in their domains), you check to see if the constraints are "satisfied". That is, you check to see that evaluating all of the constraints returns a Boolean "true".
What I would like to do is sort of the reverse. Instead of this Boolean "testing" if the constraints are true, I would like to instead take the constraints and enforce them on the variables. That is, set the variables to whatever values they need to be in order to satisfy the constraints. An example of this would be like in a game, you say "this box's right side is always to the left of its containing box's right side," or, box.right < container.right. Then the constraint solving engine (like Cassowary for the game example) would take the box and set its "right" property to whatever number value it resolved to. So instead of the constraint solver giving you a Boolean value "yes the variable configuration satisfies the constraints", it instead updates the variables' configuration with appropriate values, "you have updated the variables". I think Cassowary uses the Simplex Algorithm for solving its constraints.
I am a bit confused because Wikipedia says:
constraint satisfaction is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy. A solution is therefore a set of values for the variables that satisfies all constraints—that is, a point in the feasible region.
That seems different than the constraint satisfaction problem, of which it says:
An evaluation is consistent if it does not violate any of the constraints.
That's why it seems CSPs are meant to return Boolean values, while in constraint satisfaction you can set the values. The distinction is not quite clear to me.
Anyways, I am looking for general techniques on Constraint Solving, in the sense of setting variables like in the simplex algorithm. However, I would like to apply it to any situation, not just linear programming. Some standard and simple example constraints are:
All variables are different.
box.right < container.right
The sum of all variables < 10
Variable a goes before variable b in evaluation.
etc.
For the first case, seeing if the constraints are satisfied (Boolean true) is pretty easy: iterate through the pairs of variables, and if any pair is equal, return false, otherwise return true after processing all pairs.
However, doing the equivalent of setting the variables doesn't seem possible at first glance: iterate through the pairs of variables, and if two of them are equal, perhaps you set the first one to some other value. You might have to do some fixed point thing, processing some of them more than once. And then figuring out what value to set them to seems arbitrary how I just did it. Maybe instead you need some further (nested) constraints defining how to set the values (e.g. "set a to b if a > b, otherwise set b to a"). The possibilities are customizable.
In addition, even simpler cases like box.right < container.right are complicated. You could say at first that if box.right >= container.right then set box.right = container.right. But maybe actually you don't want that, but instead you want some iPhone-like physics "bounce" where it overextends and then bounces back with momentum. So again, the possibilities are large, and you should probably have additional constraints.
So my question is, similar to how for testing the constraints (for Boolean value) is standardized to CSP, I am wondering if there are any references or standardizations in terms of setting the values used by the constraints.
The only thing I have seen so far is that Cassowary simplex algorithm example which works well for an array of linear inequalities on real-numbered variables. I would like to see something that can handle the "All variables are different" case, and the other cases listed, as well as the standard CSP example problems like for scheduling, box packing, etc. I am not sure why I haven't encountered more on setting/updating constraint variables instead of the Boolean "yes constraints are satisfied" problem.
The only limits I have are that the constraints work on finite domains.
If it turns out there is no standardization at all and that every different constraint listed requires its own entire field of research, that would be good to know. Then I at least know what the situation is and why I haven't really seen much about it.
CSP is a research field with many publications each year. I suggest you read one of the books on the subject, like Rina Dechter's.
For standardized CSP languages, check MiniZinc on one hand, and XCSP3 on the other.
There are two main approaches to CSP solving: systematic and stochastic (also known as local search). I have worked on three different CSP solvers, one of them stochastic, but I understand systematic solvers better.
There are many different approaches to systematic solvers. It is possible to fill a whole book covering all the possible approaches, so I will explain only the two approaches I believe the most in:
(G)AC3 which propagates constraints, until all global constraints (hyper-arcs) are consistent.
Reducing the problem to SAT, and letting the SAT solver do the hard work. There is a great algorithm that creates the CNF lazily, on demand when the solver is already working. In a sense, this is a hybrid SAT/CSP algorithm.
To get the AC3 approach going you need to maintain a domain for each variable. A domain is basically a set of possible assignments.
For example, consider the domains of a and b: D(a)={1,2}, D(b)={0,1} and the constraint a <= b. The algorithm checks one constraint at a time, and when it reaches a <= b, it sees that a=2 is impossible, and also b=0 is impossible, so it removes them from the domains. The new domains are D'(a)={1}, D'(b)={1}.
This process is called domain propagation. Using a queue of "dirty" constraints, or "dirty" variables, the solver knows which constraint to propagate next. When the queue is empty, then all constraints (hyper arcs) are consistent (this is where the name AC3 comes from).
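The propagation loop described above can be sketched as follows. This is a simplified illustration restricted to binary constraints, not the code of any particular solver; the function name and the data layout (a dict of sets plus a list of `(x, y, predicate)` tuples) are assumptions.

```python
from collections import deque

def propagate(domains, constraints):
    """Prune unsupported values until every constraint is arc consistent.

    domains: dict name -> set of values (updated in place).
    constraints: list of (x, y, predicate); predicate(x_value, y_value) must hold.
    Returns False if some domain becomes empty (a conflict).
    """
    queue = deque(constraints)  # the "dirty" constraints
    while queue:
        current = queue.popleft()
        x, y, pred = current
        # Keep only values that still have a supporting value on the other side.
        new_x = {a for a in domains[x] if any(pred(a, b) for b in domains[y])}
        new_y = {b for b in domains[y] if any(pred(a, b) for a in new_x)}
        changed = set()
        if new_x != domains[x]:
            domains[x] = new_x
            changed.add(x)
        if new_y != domains[y]:
            domains[y] = new_y
            changed.add(y)
        if not new_x or not new_y:
            return False
        # A shrunken domain makes the other constraints on that variable dirty again.
        for c in constraints:
            if c is not current and c not in queue and (c[0] in changed or c[1] in changed):
                queue.append(c)
    return True

domains = {"a": {1, 2}, "b": {0, 1}}
ok = propagate(domains, [("a", "b", lambda a, b: a <= b)])
print(ok, domains)  # → True {'a': {1}, 'b': {1}}
```

Running it on the example reproduces the reduced domains D'(a)={1} and D'(b)={1}.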
When all arcs are consistent, the solver picks a free variable (one with more than one value in its domain) and restricts it to a single value. In SAT, this is called a decision. It adds it to the queue and propagates the constraints. If it reaches a conflict (a constraint can't be satisfied), it goes back and undoes an earlier decision.
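The decide-and-backtrack loop can be sketched as below. To stay short, this version is an assumption-laden simplification: it uses plain chronological backtracking and only checks constraints whose variables are all assigned, with no propagation and no conflict analysis. The function names and the sample constraints (all different, a sum limit, and an ordering between two variables) are hypothetical.

```python
def consistent(assignment, constraints):
    """Check every constraint whose variables are all assigned."""
    for variables, predicate in constraints:
        if all(v in assignment for v in variables):
            if not predicate(*(assignment[v] for v in variables)):
                return False
    return True

def solve(domains, constraints, assignment=None):
    """Depth-first search: decide a value, check constraints, backtrack on conflict."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(domains):
        return dict(assignment)
    # Decision heuristic: pick the unassigned variable with the smallest domain.
    var = min((v for v in domains if v not in assignment),
              key=lambda v: len(domains[v]))
    for value in sorted(domains[var]):
        assignment[var] = value
        if consistent(assignment, constraints):
            result = solve(domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo the decision and try the next value
    return None

domains = {"a": {1, 2, 3}, "b": {1, 2, 3}, "c": {1, 2, 3}}
constraints = [
    (("a", "b"), lambda a, b: a != b),            # all variables are different
    (("a", "c"), lambda a, c: a != c),
    (("b", "c"), lambda b, c: b != c),
    (("a", "b", "c"), lambda a, b, c: a + b + c < 10),
    (("a", "b"), lambda a, b: b < a),             # b comes before a
]
solution = solve(domains, constraints)
print(solution)  # → {'a': 2, 'b': 1, 'c': 3}
```

Note that the solver returns an assignment rather than a Boolean, which is the "setting the variables" behaviour asked about in the question.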
There are a lot of things going on here:
First, how the domains are represented. Some solvers only hold a pair of bounds for each domain. Others have a set of integers. My solver holds an interval set, or a bit vector.
Then, how does the solver know which constraint to propagate? Some solvers, such as SAT solvers, Minion, and HaifaCSP, use watches to avoid propagating irrelevant constraints. This has a significant performance impact on clauses.
Then there is the issue of making decisions. Usually, it is good to choose a variable that has a small domain and high connectivity. There are many papers comparing many different strategies. I prefer a dynamic strategy that resembles the VSIDS of SAT solvers. This strategy is auto-tuned according to conflicts.
Deciding on the value is also important. Many simply take the smallest value in the domain. Sometimes this can be suboptimal if there is a constraint that limits a sum from below. Another option is to randomly choose between max and min values. I tune it further, and use the last assigned value.
After everything, there is the matter of backtracking. This is a whole can of worms. The problem with simple backtracking is that sometimes the cause of a conflict lies in the first decision, but it is detected only at the 100th. The best thing is to analyze the conflict and work out where its cause is. SAT solvers have been doing this for decades. But CSP representation is not as trivial as CNF, so not many solvers can do it efficiently enough.
This is a nontrivial subject that can fill at least two university courses. Just the subject of conflict analysis can take half of a course.
I recently implemented an algorithm in Java that used a hash table. I compared it to a few other algorithms with rather large data input sizes such as 100000.
The thing that has struck me is that once my data input size exceeds 10000 the performance of the hash table drops dramatically. To emphasise this drop, what took 4000 ms with input size 1000 suddenly goes up to 172000 ms for input size 5000.
Can anyone please explain to me what the reason for this is? I'd really like to know.
Thanks!
This question is way too ambiguous for anyone to give a definitive answer, but if I had to guess I would say that you are encountering collisions. The stock implementation of Java's HashMap uses linked lists to hold the entries whose keys' hashes collide, which will certainly happen if the hashCode method has been incorrectly defined, perhaps returning a constant value.
Having said that, if you're just measuring elapsed time, that doesn't tell you too much. Perhaps you crossed a threshold that caused a major garbage collection to occur. You should try to measure performance after your JVM and hash table are sufficiently warmed up, and take lots of measurements and consider their average, before coming to any conclusions.
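The effect of a badly defined hash function can be made visible without timing anything, by counting how many equality checks the table performs. The sketch below uses Python's `dict` rather than Java's HashMap purely for brevity, so it is an analogy and not a reproduction of the Java behaviour; the `Key` class and the choice of 500 entries are assumptions.

```python
class Key:
    """Hypothetical key type that counts equality checks made by the hash table."""
    eq_calls = 0

    def __init__(self, value, constant_hash):
        self.value = value
        self.constant_hash = constant_hash

    def __hash__(self):
        # A constant hash makes every key collide with every other key.
        return 42 if self.constant_hash else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.value == other.value

def count_eq_calls(n, constant_hash):
    """Insert n distinct keys and report how many equality checks were needed."""
    Key.eq_calls = 0
    table = {}
    for i in range(n):
        table[Key(i, constant_hash)] = i
    return Key.eq_calls

good = count_eq_calls(500, constant_hash=False)
bad = count_eq_calls(500, constant_hash=True)
print(good, bad)
```

With a well-spread hash the table needs few or no equality checks per insertion, while a constant hash forces each new key to be compared against the keys already stored, so the total work grows quadratically with the input size.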
I am curious to know what reasoning could tip the balance towards using a self-balancing tree to store items rather than a hash table.
I see that hash tables cannot maintain the insertion-order, but I could always use a linked list on top to store the insertion-order sequence.
I see that for a small number of values there is the added cost of the hash function, but I could always save the hash value together with the key for faster lookups.
I understand that hash tables are more difficult to implement than a straightforward red-black tree, but in a practical implementation wouldn't one be willing to go the extra mile?
I see that with hash tables it is normal for collisions to occur, but with open-addressing techniques like double hashing, which allow the keys to be stored in the hash table itself, hasn't the problem been reduced enough that it no longer tips the balance towards red-black trees?
I am curious if I am strictly missing a disadvantage of hash table that still makes red black trees quite viable data structure in practical applications (like filesystems, etc.).
Here is what I can think of:
There are kinds of data which cannot be hashed (or are too expensive to hash), and therefore cannot be stored in hash tables.
Trees keep data in the order you need (sorted), not insertion order. You can't (effectively) do that with hash table, even if you run a linked list through it.
Trees have better worst-case performance.
Storage allocation is another consideration. Every time you fill all of the buckets in a hash-table, you need to allocate new storage and re-hash everything. This can be avoided if you know the size of the data ahead of time. On the other hand, balanced trees don't suffer from this issue at all.
Just wanted to add :
Balanced binary trees have a predictable fetch time [log n] that is independent of the type of data. That is often important when you need to estimate your application's response times [hash tables may have unpredictable response times]. Remember that for smaller n, as in most common use cases, the difference in performance of an in-memory lookup is hardly going to matter; the bottleneck of the system is going to be elsewhere, and sometimes you just want to make the system much simpler to debug and analyze.
Trees are generally more memory efficient compared to hash tables and much simpler to implement without any analysis on the distribution of input keys and possible collisions etc.
In my humble opinion, self-balancing trees work pretty well as Academic topics. And I do not know anything that can be qualified as a "straight-forward implementation of a red-black tree".
In the real world, the memory wall makes them far less efficient than they are on paper. With this in mind, hash tables are decent alternatives, especially if you don't practice them the Academic style (forget about the table size constraint and you magically resolve the table resize issue and almost all collision issues).
In a word: keep it simple. If that's simple for you then that's simple for your computer.
I think if you want to query for a range of keys instead of one key, self balanced tree structure will perform better than a hash table structure.
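The range-query difference can be sketched as follows. Python's standard library has no balanced tree, so a sorted list searched with `bisect` stands in for one here; that substitution, the function names and the sample keys are assumptions made for the example.

```python
import bisect

def range_query(sorted_keys, low, high):
    """Return all keys k with low <= k <= high from a sorted list, in O(log n + m)."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]

def range_query_hashed(key_set, low, high):
    """A hash-based set has no order, so every key must be inspected: O(n)."""
    return sorted(k for k in key_set if low <= k <= high)

keys = [3, 8, 15, 21, 34, 42, 57]
print(range_query(keys, 10, 40))  # → [15, 21, 34]
```

An ordered structure finds both ends of the range by binary search and returns only the m matching keys, whereas the hashed version has to scan all n keys regardless of how small the range is.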
A few reasons I can think of:
Trees are dynamic (the space complexity is N), whereas hash tables are often implemented as arrays which are fixed size, which means they will often be initialized with K size, where K > N, so even if you only have 1 element in a hashmap, you might still have 100 empty slots that take up memory. Another effect of this is:
Increasing the size of an array-based hash table is costly (O(N) average time, O(N log N) worst case), whereas trees can grow in constant time (O(1)) plus the time to locate the insertion point (O(log N)).
Elements in a tree can be gathered in sorted order (using ex: in-order-traversal). Thereby you often get a sorted list as a free perk with trees.
Trees can have a better worst-case performance vs a hashmap depending on how the hashmap is implemented (ex: hashmap with chaining will have O(N) worst case, whereas self-balanced trees can guarantee O(log N) worst case for all operations).
Both self-balanced trees and hashmaps can achieve a worst-case efficiency of O(log N) (assuming the hashmap handles collisions well), but hashmaps can have a better average-case performance (often close to O(1)), whereas trees will consistently be O(log N). This is because even though a hashmap can locate the insertion index in O(1), it has to account for hash collisions (more than one element hashing to the same array index), and thus at best degrades to a self-balanced tree (as in the Java implementation of HashMap); that is, each cell in the hashmap can be implemented as a self-balanced tree storing all elements that hashed to that array cell.