Min Max component, objective function - openmdao

I would like to perform some optimizations by minimizing the maximum of a specific path variable within Dymos. or the maximum of the absolute of such a variable.
In linear programming methods, this can be done by introducing slack variables.
Do you know if this has been attempted before with Dymos, or if there was a reason not to include it?
I understand gradient based methods are not entirely suitable for these problems, though I think some "functions" can be introduced to mitigate this.
For example,
The space shuttle reentry problem from [Betts][1] used as a [test example][2] in dymos, the original source contains an example where the maximum heat flux is minimized. Such functionality could be implemented with the "loc" argument as:
phase.add_objective('q_c', loc='max')
[1]: J. Betts. Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. Society for Industrial and Applied Mathematics, second edition, 2010. URL: https://epubs.siam.org/doi/abs/10.1137/1.9780898718577, arXiv:https://epubs.siam.org/doi/pdf/10.1137/1.9780898718577, doi:10.1137/1.9780898718577.
[2]: https://openmdao.github.io/dymos/examples/reentry/reentry.html

This has been done with pseudospectral methods before. Dymos currently doesn't have any direct way of implementing this, for a few reasons:
As you said, doing this naively can introduce discontinuous gradients that confuse the optimizer. When the node at which the maximum occurs switches, this tends to cause a sharp edge discontinuity in the gradient.
Since the pseudospectral methods are discrete, you cannot guarantee that the maximum will occur at a node. It's often fine to assume it does, but sometimes your requirements might demand more precision.
There are two possible ways to get around this.
The KSComp in OpenMDAO can be used as a "differentiable maximum". Add one after the trajectory, feed it the timeseries data for the output of interest, and set it up such that it returns a smooth approximation to the maximum. The KS function is a bit conservative, so it won't pick out the precise maximum, but depending on the value of the rho option it can be tuned to get pretty close.
When a more precise value of a maximum is needed, it's pretty common to set up a trajectory such that a phase ends when the maximum or minimum is reached.
If the variable whose maximum is being sought is a state, this can be done by adding a boundary constraint on the rate source for that state.
This ensures that the maximum occurs at the first or last node in the phase (depending on if its an initial or final boundary constraint). That lets you more accurately capture its value.
If the variable being sought is not a state, its possible to use the polynomials that are used for fitting states and controls in a phase to interpolate the variable of interest. By then taking the time derivative of that polynomial we can get a reasonably good approximation for its rate. The master branch of dymos has a method add_timeseries_rate_output that does this. And soon, within a few weeks hopefully, we'll add add_boundary_rate_constraint so that these interpolated rates can be easily used as boundary constraints.
In the meantime, you should be able to achieve this by adding the timeseries rate output and then manually applying the OpenMDAO method 'add_constraint' to the resulting timeseries output, using either indices=[0] or indices=[-1] to treat it as an initial or final constraint.
This is a common enough request that we'll add some documentation on how to achieve this behavior using both the KSComp approach and the boundary constraint approach.

Personally I'm not as much of a fan of KSComp because I've had trouble getting problems getting those types of objectives to converge in the past. I've used the slack variable and that has worked well. In the following example, we take a guess at the Rotor power in static analysis, and then we run a trajectory and get the actual rotor power during the mission. The objective was to minimize aircraft weight, so if you have a large amount of power in statics, that costs more weight. The constraint shown below prevents us from decreasing our updated guess of rotor power in statics below the maximum power required during the trajectory.
p.model.add_subsystem(
'static_power_check',
om.ExecComp('Power_check = Power_ODE - Power_statics',
Power_check = {'value':np.ones(nn_timeseries_main_tx), 'units':'kW'},
Power_ODE = {'value':np.ones(nn_timeseries_main_tx), 'units':'kW'},
Power_statics = {'value':0.0, 'units':'kW'}),
promotes_inputs=[
('Power_ODE','hop0.main_phase.timeseries.Power_R'), ('Power_statics','Power_{rotor,slack}')],
promotes_outputs=['Power_check'])
p.model.add_constraint('Power_check', upper=0, ref=1)
The constraint on the slack variable effectively helped us ensure that our slack rotor power matched the maximum rotor power during the mission. This allowed us to get the right sizes for the rotor parts (i.e. motors).

Related

OpenMDAO Dymos defect_refs -- how should I set these?

I was hoping to get some information on how to set my defect refs in Dymos a smart way. I found the following notes on scaling here https://github.com/hweyandtnasa/scaling-tutorial but it lists defect scaling in Dymos as a TODO still. Should I just set them equal to the ref value for the state they pertain to?
Scaling pseudospectral optimal control problems is tricky. If you can get a copy of John Betts' Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, I highly recommend it. Betts suggest using the same scaling for both the state design variable values and the defects. This is often a good rule of thumb, but as with most approaches to scaling, isn't universal. The collocation "defects" which dictate whether the dynamics are physically correct are just the difference between the slope of the approximating polynomial and the computed equations of motion.
In situations where state values are large but tiny rates of change are significant, then different scaling is warranted in my experience. Examples of states where these can be true are aircraft range or spacecraft orbital elements. Just recently we had a situation where a low-thrust orbit transfer of spacecraft wasn't matching physics. The semi-latus rectum, for instance, is typically measured in km, so on the scale of thousands when in Earth orbit). In the units being used, a "significant" difference in the defect was less than 1E-6 (the threshold for feasibility being used). In this case, the problem was solved by bumping the defect_scaler up a few orders of magnitude (equivalent to bumping the defect_ref down a few orders of magnitude).
I'd also recommend this paper from Ross, Gong, Karpenko, and Proulx. It lays out some good rules of thumb and has an approachable example in the brachistochrone. It references costates a lot. Dymos doesn't provide automatic costate estimation yet, but they are closely related to the lagrange multipliers of the problem, which are printed in the pyoptsparse output if you use SNOPT.
The github repo you pointed out was the work of an intern and was based around this scaling method developed by Sagliano. We found it to work well in a many situations, but it's also not a panacea.
Ultimately we want some automatic scaling options in Dymos and/or OpenMDAO, but we're not sure when they might find their way into the framework. Our past work has typically tied scaling approaches more tightly to the equations of motion, and Dymos is designed to be more general in that the user can supply whatever EOM they choose.
In Dymos, if you leave the defect_ref value unset when you call set_state_options then the default behavior is to make make the defect_ref equal to the ref value. Here is why that is done:
Defects are the differences between the computed state rate from the polynomial interpolation function and the actual state rate computed by the ODE.
As you can see here:
defect = (f_approx-f_computed) * dt_dstau
the dt_dstau just adjusts things into a normalized time space called tau but it also multiplies by the time unit as well (tau is dimensionless). That means the defects are computed in the same units as the states themselves. Thus a reasonable guess for scaling is to match the scaling between the states and the defects. As Rob Falck's answer points out that is not always the right solution, but it's a good starting point.

PCL RANSAC model fitting: How can I initialise the model parameters?

I'm reading the PCL tutorial on plane segmentation, because I want to find 3D circles in a very large and dense point cloud I have.
I know already the approximate values for center, radius and orientation of the circle, but I have found no way so far to inform the SACSegmentation object of this fact. I could also name 3 inliers to compute initial values on, but I also don't find a way to do this.
My pointcloud is extremely large (10-20M points), so just random samples will likely be prohibitive, especially since I know already more or less what the parameter values should be and only want to optimize them.
Question: How can I set the starting point of the Sample Consensus optimization procedure?
To segment and optimize model
Set SACSegmentation::setOptimizeCoefficients(true)
Use SACSegmentation::segment which takes in an initial guess (or the final model to segment using iff optimize coefficients is set as false)
You can provide you guess here. Depending on optimization method used, you can reduce the computational load.

DSP method to detect specific frequency which may not be the most dominant in the current sound

I'm trying to determine the best DSP method for what I'm trying to accomplish, which is the following:
In real-time, detect the presence of a frequency from a set of different predefined frequencies (no more than 40 different frequencies all within a 1000Hz range). I need to be able to do this even when there are other frequencies (outside of this set or range) that are more dominant.
It is my understanding that FFT might not be the best method for this, because it tells you the most dominant frequency (magnitude) at any given time. This seems like it wouldn't work because if I'm trying to detect say a frequency at 1650Hz (which is present), but there's also a frequency at 500Hz which is stronger, then it's not going to tell me the current frequency is 1650Hz.
I've heard that maybe the Goertzel algorithm might be better for what I'm trying to do, which is to detect single frequencies or a set of frequencies in real-time, even within sounds that have more dominant frequencies than the ones trying to be detected .
Any guidance is greatly appreciated and please correct me if I'm wrong on these assumptions. Thanks!
In vague and somewhat inaccurate terms, the output of the FFT is the magnitude and phase of all[1] frequencies. That is, your statement, "[The FFT] tells you the most dominant frequency (magnitude) at any given time" is incorrect. The FFT is often used as a first step to determine the most dominant frequency, but that's not what it does. In fact, if you are interested in the most dominant frequency, you need to take extra steps over and beyond the FFT: you take the magnitude of all frequencies output by the FFT, and then find the maximum. The corresponding frequency is the dominant frequency.
For your application as I understand it, the FFT is the correct algorithm.
The Goertzel algorithm is closely related to the FFT. It allows for some optimization over the FFT if you are only interested in the magnitude and/or phase of a small subset of frequencies. It might be the right choice for your application depending on the number of frequencies in question, but only as an optimization -- other than performance, it won't solve any problems the FFT won't solve. Because there is more written about the FFT, I suggest you start there and use the Goertzel algorithm only if the FFT proves to not be fast enough and you can establish the Goertzel will be faster in your case.
[1] For practical purposes, what's most inaccurate about this statement is that the frequencies are grouped together in "bins". There's a limited resolution to the analysis which depends on a variety of factors.
I am leaving my other answer as-is because I think it stands on it's own.
Based on your comments and private email, the problem you are facing is most likely this: sounds, like speech, that are principally in one frequency range, have harmonics that stretch into higher frequency ranges. This problem is exacerbated by low quality microphones and electronics, but it is not caused by them and wouldn't go away even with perfect equipment. Once your signal is cluttered with noise in the same band, you can't really distinguish on from off in a simple and reliable way, because on could be caused by the noise. You could try to do some adaptive thresholding based on noise in other bands, and you'll probably get somewhere, but that's no way to build a robust system.
There are a number of ways to solve this problem, but they all involve modulating your signal and using error detection and correction. Basically, you are building a modem and/or radio. Ultimately, what I'm saying is this: you can't solve your problem on the detector alone. You need to build some redundancy into your signal, and you may need to think about other methods of detection. I know of three methods of sending complex signals:
Amplitude modulation, which is what it sounds like you are doing now.
Frequency modulation, which tends to be more robust in the face of ambient noise. (compare FM and AM radio)
Phase modulation, which is more subtle and tricky.
These methods can be combined and multiplexed in various ways. Read about them on wikipedia. Moreover, once your base signal is transmitted, you can add error correction and detection on top.
I am not an expert in this area, but off the top of my head, I am not sure you'll be able to use PM silently, and AM is simply too sensitive to noise, as you've discovered, although it might work with the right kind of redundancy. FM is probably your best bet.

Numerical integration, Runge-Kutta, RK4 in game design

Nearly every game tends to use some of a game loop. Gafferongames has a great article on how to make a well designed game loop: http://gafferongames.com/game-physics/fix-your-timestep/
In his code, he uses integrate( state, t, deltaTime );, where I believe state contains position, velocity, and acceleration of the object. He uses RK4 to integrate it from t to t+deltaTime.
My question is, why use a numerical integration technique like RK4, when you can use kinematics equations (here) to find the exact value?
These equations work when acceleration is constant. It seems rare that you would have a changing acceleration within a timestep. It seems like RK4 is a lower performance, lower accuracy, more complex solution.
Edit: I think you could add a "jerk" value to objects and still find exact expressions for acceleration, velocity, and displacement, if you really wanted to.
Edit 2: Well, I did not read his "Integration Basics" article too carefully. I think he's modelling a damper and spring, which do cause non-constant acceleration within a timestep.
As soon as you add things that many game designers want, like (velocity dependent) drag, position dependent forces, etc. the equations are no longer solvable exactly.
So, if you're happy to limit your forces to those the kinematic equation can handle, then go with it. If you want something flexible, then numerical integration is the only way to go.
Note: If you treat the forces as constant over a time interval when they are not really constant - then you are actually using a form of numerical integration. And it is an inaccurate form of integration too. So why not use a tried and proven numerical method instead? RK4 is one of many such methods.
Approximating acceleration (derivatives, really) as constant within a time step is how numerical integration methods work. When the derivatives are not constant, you need to consider what sort of error you introduce by treating them as constant.
Imagine breaking a time range T up into N equal steps of width h=T/N. Now integrate the dynamical equations stepwise. With RK4, the local error per-step is O(h^5) giving a global error of O(h^4).
Using the kinematical equations as you propose, we can assess the error by considering the Taylor expansion of the position, keeping terms to second order. The position will have error of O(h^3) introduced at each step, corresponding to where you truncate the expansion. This gives local error O(h^3) and global error O(h^2).
Based on the asymptotic error, the error from RK4 goes to zero much more rapidly than does the kinematical equations. It's more accurate. RK4 obtains a very nice accuracy obtained for the number of function evaluations that need to be done.

fitness function and Selection for a Genetic Algorithm

I'm trying to design a nonlinear fitness function where I maximize variable A and minimize the variable B. The issue is that maximizing A is much more important at single digit values, almost logarithmic. B needs to be minimized and in contrast to A, it becomes less important when small (less than one) and more important when it's larger (>1), so exponential decay.
The main goal is to optimize A, so I guess an analog is A=profits, B=costs
Should I aim to keep everything positive so that the I can use a roulette wheel selection, or would it be better to use a rank/torunament kind of system? The purpose of my algorithm is shape optimization.
Thanks
When considering a multi-objective problem the goal is usually to identify all solutions that lie on the Pareto curve - the Pareto optimal set. Have a look here for a 2-dimensional visual example. When the algorithm completes you want a set of solutions that are not dominated by any other solution. You therefore need to define a pareto ranking mechanism to take into account both objectives - for a more in depth explanation, as well as links to even more reading, go here
With this in mind, in order to effectively explore all solutions along the pareto front you do not want an implementation that encourages premature convergence, otherwise your algorithm will only explore the search space in one specific area of the Pareto curve. I would implement a selection operator that keeps all members of each iteration's optimal set of solutions, that is all solutions which are not dominated by another + plus a parameter controlled percentage of other solutions. This way you encourage exploration all along the Pareto curve.
You also need to ensure your mutation and crossover operators are tuned correctly too. With any novel application of Evolutionary Algorithms, part of the problem is trying to identify an optimal parameter set for the problem domain... this is where it gets really interesting!!
The description is very vague, but assuming that you actually have an idea of what the function should look like and you're just wondering whether you need to modify it so that proportional selection can be used easily, then no. Regardless of fitness function, you should probably default to using something like tournament selection. Controlling selection pressure is one of the most important things you have to do in order to get consistently good results, and roulette wheel selection doesn't allow you that control. You typically get enormous pressure very early, which drives premature convergence. That might be preferable in a few cases, but it's not where I'd start my investigations.

Resources