DSP method to detect a specific frequency that may not be the most dominant in the current sound

I'm trying to determine the best DSP method for what I'm trying to accomplish, which is the following:
In real time, detect the presence of a frequency from a set of predefined frequencies (no more than 40 frequencies, all within a 1000 Hz range). I need to be able to do this even when there are other frequencies (outside of this set or range) that are more dominant.
It is my understanding that the FFT might not be the best method for this, because it tells you the most dominant frequency (magnitude) at any given time. This seems like it wouldn't work because if I'm trying to detect, say, a frequency at 1650 Hz (which is present), but there's also a frequency at 500 Hz which is stronger, then it's not going to tell me the current frequency is 1650 Hz.
I've heard that maybe the Goertzel algorithm might be better for what I'm trying to do, which is to detect single frequencies or a set of frequencies in real time, even within sounds that have more dominant frequencies than the ones I'm trying to detect.
Any guidance is greatly appreciated and please correct me if I'm wrong on these assumptions. Thanks!

In vague and somewhat inaccurate terms, the output of the FFT is the magnitude and phase of all[1] frequencies. That is, your statement, "[The FFT] tells you the most dominant frequency (magnitude) at any given time," is incorrect. The FFT is often used as a first step to determine the most dominant frequency, but that's not what it does. In fact, if you are interested in the most dominant frequency, you need to take extra steps above and beyond the FFT: you take the magnitude of all frequencies output by the FFT, and then find the maximum. The corresponding frequency is the dominant frequency.
For your application as I understand it, the FFT is the correct algorithm.
The Goertzel algorithm is closely related to the FFT. It allows for some optimization over the FFT if you are only interested in the magnitude and/or phase of a small subset of frequencies. It might be the right choice for your application depending on the number of frequencies in question, but only as an optimization -- other than performance, it won't solve any problems the FFT won't solve. Because there is more written about the FFT, I suggest you start there and use the Goertzel algorithm only if the FFT proves to not be fast enough and you can establish the Goertzel will be faster in your case.
[1] For practical purposes, what's most inaccurate about this statement is that the frequencies are grouped together in "bins". There's a limited resolution to the analysis which depends on a variety of factors.
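To make this concrete, here is a minimal Python sketch (the sample rate, frame length, and the 1650 Hz / 500 Hz test tones are assumptions based on the question). It reads the FFT magnitude at the bin nearest the target frequency, and includes a standard Goertzel implementation that computes the same quantity for a single frequency:
import numpy as np

def goertzel_power(samples, fs, target_freq):
    # Squared magnitude of target_freq in samples (standard Goertzel recurrence).
    n = len(samples)
    k = int(round(n * target_freq / fs))  # nearest DFT bin
    coeff = 2.0 * np.cos(2.0 * np.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s_prev2, s_prev = s_prev, x + coeff * s_prev - s_prev2
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

fs = 48000                              # assumed sample rate
t = np.arange(4096) / fs
sig = 0.2 * np.sin(2 * np.pi * 1650 * t) + 1.0 * np.sin(2 * np.pi * 500 * t)

spectrum = np.abs(np.fft.rfft(sig))
k = int(round(1650 * len(sig) / fs))
print(spectrum[k])                      # magnitude near 1650 Hz: clearly nonzero
print(goertzel_power(sig, fs, 1650))    # the same information for one frequency
Even though the 500 Hz tone is five times stronger here, the bin near 1650 Hz still reports that tone's presence; detection is then a matter of thresholding that magnitude.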

I am leaving my other answer as-is because I think it stands on its own.
Based on your comments and private email, the problem you are facing is most likely this: sounds, like speech, that are principally in one frequency range have harmonics that stretch into higher frequency ranges. This problem is exacerbated by low-quality microphones and electronics, but it is not caused by them and wouldn't go away even with perfect equipment. Once your signal is cluttered with noise in the same band, you can't really distinguish "on" from "off" in a simple and reliable way, because "on" could be caused by the noise. You could try some adaptive thresholding based on noise in other bands, and you'll probably get somewhere, but that's no way to build a robust system.
There are a number of ways to solve this problem, but they all involve modulating your signal and using error detection and correction. Basically, you are building a modem and/or radio. Ultimately, what I'm saying is this: you can't solve your problem on the detector alone. You need to build some redundancy into your signal, and you may need to think about other methods of detection. I know of three methods of sending complex signals:
Amplitude modulation, which is what it sounds like you are doing now.
Frequency modulation, which tends to be more robust in the face of ambient noise. (compare FM and AM radio)
Phase modulation, which is more subtle and tricky.
These methods can be combined and multiplexed in various ways. Read about them on Wikipedia. Moreover, once your base signal is transmitted, you can add error correction and detection on top.
I am not an expert in this area, but off the top of my head, I am not sure you'll be able to use PM silently, and AM is simply too sensitive to noise, as you've discovered, although it might work with the right kind of redundancy. FM is probably your best bet.
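If you do go the FM route, the detector side can be built from the pieces in my other answer. A hedged sketch (the two tone frequencies and the one-tone-per-bit scheme are purely illustrative): send each bit as one of two tones, and decide per frame by comparing Goertzel power at the two candidates, reusing goertzel_power from the earlier sketch.
def detect_bit(frame, fs, f_mark=1650.0, f_space=1850.0):
    # Binary FSK decision: which of the two candidate tones carries more energy?
    return 1 if goertzel_power(frame, fs, f_mark) > goertzel_power(frame, fs, f_space) else 0
Real modems add framing, synchronization, and error-correcting codes on top of this per-frame decision.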

Related

Min Max component, objective function

I would like to perform some optimizations by minimizing the maximum of a specific path variable within Dymos, or the maximum of the absolute value of such a variable.
In linear programming methods, this can be done by introducing slack variables.
Do you know if this has been attempted before with Dymos, or if there was a reason not to include it?
I understand gradient-based methods are not entirely suitable for these problems, though I think some "functions" can be introduced to mitigate this.
For example,
The space shuttle reentry problem from [Betts][1] is used as a [test example][2] in Dymos; the original source contains a variant where the maximum heat flux is minimized. Such functionality could be implemented with a "loc" argument as:
phase.add_objective('q_c', loc='max')
[1]: J. Betts. Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. Society for Industrial and Applied Mathematics, second edition, 2010. URL: https://epubs.siam.org/doi/abs/10.1137/1.9780898718577. doi:10.1137/1.9780898718577.
[2]: https://openmdao.github.io/dymos/examples/reentry/reentry.html
This has been done with pseudospectral methods before. Dymos currently doesn't have any direct way of implementing this, for a few reasons:
As you said, doing this naively can introduce discontinuous gradients that confuse the optimizer. When the node at which the maximum occurs switches, this tends to cause a sharp edge discontinuity in the gradient.
Since the pseudospectral methods are discrete, you cannot guarantee that the maximum will occur at a node. It's often fine to assume it does, but sometimes your requirements might demand more precision.
There are two possible ways to get around this.
The KSComp in OpenMDAO can be used as a "differentiable maximum". Add one after the trajectory, feed it the timeseries data for the output of interest, and set it up such that it returns a smooth approximation to the maximum. The KS function is a bit conservative, so it won't pick out the precise maximum, but depending on the value of the rho option it can be tuned to get pretty close.
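For illustration, a minimal standalone sketch of the KSComp approach (the array here is stand-in data and the names are hypothetical; in a real problem you would connect the phase's timeseries output instead):
import numpy as np
import openmdao.api as om

n = 50  # assumed number of timeseries nodes
p = om.Problem()
ivc = p.model.add_subsystem('ivc', om.IndepVarComp())
ivc.add_output('q_c', val=np.sin(np.linspace(0, np.pi, n)).reshape(1, n))  # stand-in data
p.model.add_subsystem('ks', om.KSComp(width=n, rho=50.0))  # rho tunes the sharpness
p.model.connect('ivc.q_c', 'ks.g')
p.setup()
p.run_model()
print(p.get_val('ks.KS'))  # smooth, slightly conservative approximation of max(q_c)
Minimizing 'ks.KS' as the objective then stands in for minimizing the true maximum.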
When a more precise value of a maximum is needed, it's pretty common to set up a trajectory such that a phase ends when the maximum or minimum is reached.
If the variable whose maximum is being sought is a state, this can be done by adding a boundary constraint on the rate source for that state.
This ensures that the maximum occurs at the first or last node in the phase (depending on whether it's an initial or final boundary constraint), which lets you capture its value more accurately.
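A hedged sketch of that approach, with hypothetical names ('q_c_dot' stands in for the ODE output that serves as the rate source of the state of interest):
# Require dq_c/dt = 0 at the end of the phase, so the maximum of the
# state q_c lands exactly on the final node.
phase.add_boundary_constraint('q_c_dot', loc='final', equals=0.0)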
If the variable being sought is not a state, it's possible to use the polynomials used for fitting states and controls in a phase to interpolate the variable of interest. By taking the time derivative of that polynomial, we can get a reasonably good approximation of its rate. The master branch of dymos has a method add_timeseries_rate_output that does this. And soon, within a few weeks hopefully, we'll add add_boundary_rate_constraint so that these interpolated rates can easily be used as boundary constraints.
In the meantime, you should be able to achieve this by adding the timeseries rate output and then manually applying the OpenMDAO method 'add_constraint' to the resulting timeseries output, using either indices=[0] or indices=[-1] to treat it as an initial or final constraint.
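In code, the interim workaround might look like this (the trajectory path and the 'q_c_rate' output name are assumptions):
phase.add_timeseries_rate_output('q_c')
# Constrain the interpolated rate at the final node only, which makes it
# act as a final boundary constraint.
p.model.add_constraint('traj.phase0.timeseries.q_c_rate', indices=[-1], equals=0.0)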
This is a common enough request that we'll add some documentation on how to achieve this behavior using both the KSComp approach and the boundary constraint approach.
Personally I'm not as much of a fan of KSComp, because I've had trouble getting those types of objectives to converge in the past. I've used the slack-variable approach and that has worked well. In the following example, we take a guess at the rotor power in static analysis, and then we run a trajectory and get the actual rotor power during the mission. The objective was to minimize aircraft weight, so if you have a large amount of power in statics, that costs more weight. The constraint shown below prevents us from decreasing our updated guess of rotor power in statics below the maximum power required during the trajectory.
p.model.add_subsystem(
    'static_power_check',
    om.ExecComp('Power_check = Power_ODE - Power_statics',
                Power_check={'value': np.ones(nn_timeseries_main_tx), 'units': 'kW'},
                Power_ODE={'value': np.ones(nn_timeseries_main_tx), 'units': 'kW'},
                Power_statics={'value': 0.0, 'units': 'kW'}),
    promotes_inputs=[('Power_ODE', 'hop0.main_phase.timeseries.Power_R'),
                     ('Power_statics', 'Power_{rotor,slack}')],
    promotes_outputs=['Power_check'])
p.model.add_constraint('Power_check', upper=0, ref=1)
The constraint on the slack variable effectively helped us ensure that our slack rotor power matched the maximum rotor power during the mission. This allowed us to get the right sizes for the rotor parts (i.e. motors).

Signal to filter in R

I need to filter a signal without losing its properties, so that the signal can later be fed into an artificial neural network. I'm using R and the signal library, and I thought about using a low-pass filter or an FFT.
The signal to be filtered represents pixel shifts in a video. In this case I calculated the resultant of the X and Y vectors to obtain a single value and thus generate this graph/signal:
Using the signal library and the fftfilt function, I obtained the following signal, which seems easier for a neural network to train on, but I did not understand what the function is doing or whether the signal's properties are preserved.
resulting <- fftfilt(rep(1,50)/50,resulting)
Could someone explain how this function works, or suggest a better method to filter this signal?
As for the fftfilt(...) function, I can tell you roughly what it does: it is an approximate finite impulse response (FIR) filter implementation that works through the FFT. It takes the FFT of the filter's impulse response (with some padding) as a window, takes the signal's FFT within that window, generates the filtered signal in the frequency domain by simply multiplying the two, and then uses the inverse FFT to get the actual result in the time domain. If your FIR filter has a huge number of coefficients (even though in many cases that is just a sign of bad system design and should not be needed), the function works much faster than filter(Ma(...),...). With a more reasonable number of coefficients (definitely below 100 or so), the direct and precise approach is actually faster.
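To illustrate the mechanism (in Python/NumPy here, since the idea is language-independent; the signal and tap count are made up): zero-pad, multiply the spectra, and invert, which matches direct convolution up to floating-point error.
import numpy as np

def fft_filter(x, h):
    # Linear convolution of signal x with FIR taps h via the frequency domain.
    n = len(x) + len(h) - 1          # zero-pad so circular convolution becomes linear
    y = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)
    return y[:len(x)]

x = np.random.randn(1000)
h = np.ones(50) / 50                 # the same 50-tap moving average as rep(1,50)/50
assert np.allclose(fft_filter(x, h), np.convolve(x, h)[:len(x)])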
As for proper filtering methods, there are so many of them that entire thick books cover just that topic. My personal experience in the field is skewed towards rather specific very-low-computing-power, microcontroller-based sensor-signal DSP tricks, with fixed-point arithmetic, precise unity gains at a selected passband point, power-of-2 scale coefficients, and staged implementations, so I doubt it would help you in this case. Basically, first you need to know what result you want from your filter (and whether you even need to filter at all, or whether something like peak detection and decimation is enough), and then know what you are doing. From your message it is hard to guess what your neural network's requirements are, what you think you need for it, and what you actually need.

Finding and counting audio drop-outs in an ecological recording

I am trying to assess how many audio drop outs are in a given sound file of an ecological soundscape.
Format: WAV
Sampling rate (Hz): 192000
Channels (mono/stereo): stereo
PCM (integer format): TRUE
Bit depth (8/16/24/32/64): 16
My project had a two-element hydrophone. The elements were different brands/models, and we are trying to determine which element performed better in our specific experiment. One analysis we would like to conduct is measuring how often each element had drop-outs, or loss of signal. These drop-outs are not signal-amplitude related; in other words, they are not caused by maxing out the amplitude. The element or the associated electronics just failed.
I've been trying to do this in R, as that is the program I am most familiar with. I have very limited experience with Matlab and regex, but am opening to trying those programs/languages. I'm a biologist, so please excuse any ignorance.
In R I've been playing around with the package 'seewave', and while I've been able to produce some very pretty spectrograms (which, to be fair, is the only context in which I've previously used that package), that's not what I need here. I attempted to use the envelope and automatic temporal measurements function within seewave (timer), and got some interesting, but opposite, results.
foo=readWave("Documents/DASBR/DASBR2_20131119$032011.wav", from=53, to=60, units="seconds")
timer(foo, f=96000, threshold=6.5, msmooth=c(30,5), colval="blue")
I've altered the values of msmooth and threshold countless times, but that's just fine tinkering. What this function performs is measuring the duration between amplitude peaks at the given threshold. What I need it to do is either a) find samples in the signal without amplitude, or b) measure the duration between areas without amplitude. I can work with either of those outputs. Basically, I want to reverse the direction the threshold is measuring, does that make sense? Any sample that is below the threshold should trigger a measurement, rather than any sample that is above it.
I'm still playing with seewave to see how to produce the data I need, but I'm looking for a bit of guidance. Perhaps there is a function in seewave that will accomplish what I'm trying to do more efficiently. Or, if there is anyway to output the numerical data generated from timer, I could use the 'quantmod' package function 'findValleys' to get a list of all the data gaps.
So yeah, guidance is what I'm requesting, oh data crunching gods.
Cheers.
This problem sounds reminiscent of power transfer problems often seen in electrical engineering. One way to solve the problem is to take the RMS (the root of the mean of the square) of the samples in the signal over time, averaged over short durations (perhaps a few seconds or even shorter). The durations where you see low RMS are where the dropouts are. It's analogous to the VU meters that you sometimes see on audio amplifiers - which indicate the power being transferred to the speakers from the amplifier.
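A hedged sketch of that idea (in Python/NumPy for illustration; the window length and RMS floor are assumptions you would tune to the recording):
import numpy as np

def find_dropouts(samples, fs, win_s=0.1, rms_floor=1e-3):
    # Split the signal into fixed windows and flag those whose RMS falls
    # below the floor as candidate drop-outs; returns window start times (s).
    win = int(win_s * fs)
    n = len(samples) // win
    frames = np.asarray(samples[:n * win], dtype=float).reshape(n, win)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return np.flatnonzero(rms < rms_floor) * win_s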
I just wanted to summarize what I ended up doing so other people will be aware. Unfortunately, the RMS measurement is not what I was looking for. Though RMS could technically give me a basic idea of where drop-outs occur, because I'm working with ecological recordings there are too many other factors at play.
Background: The sound streams I am working with are from a two-element hydrophone, separated vertically by 2 meters and recording at 100 m below sea level. We are finding that the element sitting at ~100 meters is experiencing heavy drop-outs, while the element at ~102 meters is mostly fine. We are currently attributing this to a to-be-identified electrical issue. If both elements were positioned to receive audio in exactly the same way, RMS would work for detecting drop-outs, but because sound is received independently, the RMS calculation is too heavily impacted by other factors. Two meters can make a larger difference than you'd think when it comes to source levels and signal reception; it's enough for us to localize vocalizing animals (with left/right ambiguity) based on the delay between signal arrivals.
All the same, here's what I did:
library(seewave)
library(tuneR)
foo=readWave("Sound_file_Path")
L=foo@left   # left channel (the '@' slot accessor for tuneR Wave objects)
R=foo@right  # right channel
rms(L)
rms(R)
I then looped this process through a directory, which I detail here: for.loop with WAV files
So far, this issue is still unresolved, but thank you for the discussion!
~etg

Fitness function and selection for a genetic algorithm

I'm trying to design a nonlinear fitness function where I maximize variable A and minimize variable B. The issue is that maximizing A is much more important at single-digit values, almost logarithmic. B needs to be minimized, and in contrast to A, it becomes less important when small (less than one) and more important when larger (>1), so something like exponential decay.
The main goal is to optimize A, so I guess an analog is A=profits, B=costs
Should I aim to keep everything positive so that I can use roulette wheel selection, or would it be better to use a rank/tournament kind of system? The purpose of my algorithm is shape optimization.
Thanks
When considering a multi-objective problem, the goal is usually to identify all solutions that lie on the Pareto curve - the Pareto-optimal set. Have a look here for a 2-dimensional visual example. When the algorithm completes, you want a set of solutions that are not dominated by any other solution. You therefore need to define a Pareto ranking mechanism that takes both objectives into account - for a more in-depth explanation, as well as links to even more reading, go here.
With this in mind, in order to effectively explore all solutions along the Pareto front, you do not want an implementation that encourages premature convergence; otherwise your algorithm will only explore the search space around one specific area of the Pareto curve. I would implement a selection operator that keeps all members of each iteration's optimal set of solutions - that is, all solutions which are not dominated by another - plus a parameter-controlled percentage of other solutions. This way you encourage exploration all along the Pareto curve.
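For concreteness, a minimal Python sketch of the non-dominated filter for the question's two objectives (maximize A, minimize B); representing solutions as (A, B) tuples is an assumption:
def dominates(p, q):
    # p dominates q if it is at least as good in both objectives
    # (maximize A = p[0], minimize B = p[1]) and strictly better in one.
    return p[0] >= q[0] and p[1] <= q[1] and p != q

def pareto_front(points):
    return [p for p in points if not any(dominates(q, p) for q in points)]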
You also need to ensure your mutation and crossover operators are tuned correctly too. With any novel application of Evolutionary Algorithms, part of the problem is trying to identify an optimal parameter set for the problem domain... this is where it gets really interesting!!
The description is very vague, but assuming that you actually have an idea of what the function should look like and you're just wondering whether you need to modify it so that proportional selection can be used easily, then no. Regardless of fitness function, you should probably default to using something like tournament selection. Controlling selection pressure is one of the most important things you have to do in order to get consistently good results, and roulette wheel selection doesn't allow you that control. You typically get enormous pressure very early, which drives premature convergence. That might be preferable in a few cases, but it's not where I'd start my investigations.
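A minimal sketch of tournament selection in Python; the tournament size k is the selection-pressure knob referred to above:
import random

def tournament_select(population, fitness, k=3):
    # Larger k means stronger selection pressure; k=1 is uniform random selection.
    contestants = random.sample(range(len(population)), k)
    return population[max(contestants, key=lambda i: fitness[i])]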

Examples of mathematics algorithms that apply to game development

I am designing an RPG like Final Fantasy.
I have the programming part done, but what I lack is the maths. I am OK at maths, but I am having trouble incorporating the players' stats into my sums.
How can I make an action timer that is based on the players speed?
How can I use attack and defence so that it is not always exactly the same damage?
How can I add randomness into the equations?
Can anyone point me to some resources that I can read to learn this sort of stuff?
EDIT: Clarification of what I am looking for.
For the damage I have (player attack x move strength) / enemy defence.
This works and scales well, but I got a look at the algorithms from Final Fantasy 4 a while ago, and that sum alone was over 15 steps; mine has only two.
I am looking for real game examples if possible, but would settle for papers or books with sections explaining how they arrive at these complex sums and why they don't use simple ones.
I eventually intend to implement this, but am looking for more academic knowledge at the moment.
Not knowing Final Fantasy at all, here are some thoughts.
Attack/defence could either be a 'chance to hit/block' or 'damage done/mitigated' (or, possibly, a blend of both). If you decide to go for 'damage done/mitigated', you'll probably want to do one of the following (a sketch of the third option appears after the list):
Generate a random number in a suitable range, added/subtracted from the base attack/defence value.
Generate a number in the range 0-1, multiplied by the attack/defence
Generate a number (with a Gaussian or Poisson distribution and a suitable standard deviation) in the range 0-2 (or so, to account for the occasional crit), multiplied by the attack/defence
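A hedged sketch of the third option in Python (the 0-2 clamp and the sigma value are illustrative choices, not tuned numbers):
import random

def damage(attack, defence, sigma=0.35):
    # Gaussian multiplier centred on 1.0, clamped to [0, 2] so a rare
    # high roll acts like a crit; damage never goes negative.
    mult = min(max(random.gauss(1.0, sigma), 0.0), 2.0)
    return max(0, round(attack * mult - defence))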
For attack timers, decide what "double speed" and "triple speed" should mean for the number of attacks in a given time. That should give you a decent lead on how to implement it. Off-hand, I can think of three methods (a sketch of the first appears after the list).
Use N/speed as a base for the timer (that means double/triple speed gives 2/3 times the number of attacks in a given interval).
Use Basetime - Speed as the timer (this requires a cap on speed, which may not be an issue, and most probably has an unintuitive relation between the speed stat and the timer: not much difference at low levels, a lot of difference at high levels).
Use Basetime - Sqrt(Speed) as the timer.
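A sketch of the first method (N is an arbitrary tuning constant in whatever time units your game loop uses):
def attack_period(speed, N=1000.0):
    # Time between attacks: doubling speed halves the period, so a
    # character attacks exactly twice as often.
    return N / max(speed, 1)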
I doubt you'll find academic work on this. Determining formulae for damage, say, is heuristic. People just make stuff up based on their experience with various functions and then tweak the result based on gameplay.
It's important to have a good feel for what the function looks like when plotted on a graph. The best advice I can give for this is to study a course on sketching graphs of functions. A Google search on "sketching functions" will get you started.
Take a look at printed role-playing games like Dungeons & Dragons and how they handle these issues. They are the inspiration for computer RPGs. I don't know of academic work.
Some thoughts: you don't have to have an actual "formula". It can be rules like "roll a 20 sided die, weapon does 2 points of damage if the roll is <12 and 3 points of damage if the roll is >=12".
You might want to simplify continuous variables down to small ranges of integers for testing. That way you can calculate out tables with all the possible permutations and see if the results look reasonable. Once you have something good, you can interpolate the formulas for continuous inputs.
Another key issue is play balance. There aren't necessarily formulas for telling you whether your game mechanics are balanced, you have to test.
