Parselmouth batch full voice report

I was wondering if there is a way to batch process audio files and generate full voice reports using Parselmouth or another Pythonic implementation of Praat. So far I have only been able to get the median pitch, but I also need to work out the total number of pulses and periods, the degree of voice breaks, and the shimmer. If this isn't possible using Python, would it be possible using a Praat script?
[screenshot: Praat-generated voice report]

[Disclaimer: I am the author of the mentioned Parselmouth library]
This question was asked and solved in the Parselmouth Gitter chat, but for future reference, this was the solution I suggested there:
A similar question was asked on StackOverflow before: How to automate voice reports for Praat, explaining how to get the voice report without the Praat 'View & Edit' window (i.e., using a Sound, Pitch, and PointProcess object).
So first you get these three objects, the Sound sound, the Pitch pitch, and the PointProcess pulses, possibly changing any parameters you want to set differently:
import parselmouth
sound = parselmouth.Sound("the_north_wind_and_the_sun.wav")
pitch = sound.to_pitch()
pulses = parselmouth.praat.call([sound, pitch], "To PointProcess (cc)")
After that, you can query the different quantities you want to extract in different ways. For example, the number of pulses in the PointProcess can be extracted with:
n_pulses = parselmouth.praat.call(pulses, "Get number of points")
And the others:
n_periods = parselmouth.praat.call(pulses, "Get number of periods", 0.0, 0.0, 0.0001, 0.02, 1.3)
shimmer_local = parselmouth.praat.call([sound, pulses], "Get shimmer (local)...", 0.0, 0.0, 0.0001, 0.02, 1.3, 1.6)
Getting the degree of voice breaks is somewhat harder; I have no idea why Praat has no command to query this directly.
A quick way of getting this in Python would be:
max_voiced_period = 0.02  # This is the "longest period" parameter in some of the other queries
periods = [parselmouth.praat.call(pulses, "Get time from index", i + 1) -
           parselmouth.praat.call(pulses, "Get time from index", i)
           for i in range(1, n_pulses)]
# Sum the inter-pulse intervals that are too long to count as voiced, relative to the total duration
degree_of_voice_breaks = sum(period for period in periods if period > max_voiced_period) / sound.duration
You could also find the line that reports this percentage in the output string of "Voice report"; see https://stackoverflow.com/a/51657044/2043407
If you have a look at the Praat user interface, there is indeed no "Get median" button, which is why a call to "Get median" does not work. However, there is a "Get quantile" command in Praat, so I would suggest:
parselmouth.praat.call(pitch, "Get quantile", 0.0, 0.0, 0.5, "Hertz")
(That 0.5 is then the 50% quantile, i.e., the median)
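To batch over a whole folder of recordings, you can simply wrap these calls in a loop; here is a minimal sketch (the "recordings/*.wav" pattern and the results dictionary are placeholders of my own choosing):
import glob
import parselmouth

results = {}
for wav_file in glob.glob("recordings/*.wav"):  # hypothetical folder and pattern
    sound = parselmouth.Sound(wav_file)
    pitch = sound.to_pitch()
    pulses = parselmouth.praat.call([sound, pitch], "To PointProcess (cc)")
    results[wav_file] = {
        "median_pitch": parselmouth.praat.call(pitch, "Get quantile", 0.0, 0.0, 0.5, "Hertz"),
        "n_pulses": parselmouth.praat.call(pulses, "Get number of points"),
        "shimmer_local": parselmouth.praat.call([sound, pulses], "Get shimmer (local)...", 0.0, 0.0, 0.0001, 0.02, 1.3, 1.6),
    }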

I am using parselmouth to extract a large number of features from some audio files. In general, I end up using something along these lines:
import parselmouth
from parselmouth.praat import call
sound = parselmouth.Sound(filename)
pitch = call(sound, "To Pitch", 0.0, F0min, F0max)  # F0min/F0max: pitch floor and ceiling in Hz (e.g. 75 and 600)
pulse = call([sound, pitch], "To PointProcess (cc)")
# "Voice report" returns one long string; split it into individual lines
voice_report = call([sound, pitch, pulse], "Voice report", 0.0, 0.0, 75, 600, 1.3, 1.6, 0.03, 0.45).split("\n")
From the report, you can retrieve measures such as the harmonics-to-noise ratio (HNR), jitter, shimmer, and the classic pitch statistics.
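For example, to pull a single number out of the report text, a small helper along these lines can work (it assumes the usual "label: value unit" layout of each line in the Praat voice report; the function name is mine):
def report_value(report_lines, label):
    # Return the first numeric field of the report line that starts with `label`
    for line in report_lines:
        line = line.strip()
        if line.startswith(label):
            value = line.split(":", 1)[1].split()[0]
            return float(value.rstrip("%"))
    return None

jitter_local = report_value(voice_report, "Jitter (local)")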

Related

RandomLinkSplit not working with HeteroData

I am having some serious trouble with torch-geometric when dealing with my own data.
I am trying to build a graph that has 4 different node types (of which only 1 bears node features; the others are plain nodes) and 5 different edge types (of which only one bears a weight).
I have managed to do so by building a HeteroData() object and loading the different matrices with labels, attributes and so on.
The problem arises when I try to call RandomLinkSplit. Here's what my call looks like:
import torch_geometric.transforms as T
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[('Patient', 'suffers_from', 'Diagnosis'),
                ('bla', 'bla', 'bla'),  # I copy all the edge types here
                ],
)
but I get a bare AssertionError on the condition:
assert isinstance(rev_edge_types, list)
So I thought that I needed to transform the graph to undirected (for some weird reason), like the tutorial does, and then also sample reverse edges (even though I don't need them):
import torch_geometric.transforms as T
data = T.ToUndirected()(data)
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[('Patient', 'suffers_from', 'Diagnosis'),
                ('bla', 'bla', 'bla'),  # I copy all the edge types here
                ],
    rev_edge_types=[('Diagnosis', 'rev_suffers_from', 'Patient'),
                    ...
                    ],
)
but this time I get the error unsupported operand type(s) for *: 'Tensor' and 'NoneType'.
Does anyone have any idea why this is happening? I am simply trying to do a train/test split, and from the docs I read that heterogeneous graphs should be well supported, but I don't understand why this is not working, and I have been trying different things for quite a long time.
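For completeness, here is a minimal, self-contained version of what I am trying to do (node counts, features, and edge indices are made up just to have something runnable):
import torch
import torch_geometric.transforms as T
from torch_geometric.data import HeteroData

# Tiny made-up graph: one node type with features, one plain node type, one edge type
data = HeteroData()
data['Patient'].x = torch.randn(6, 8)    # only this node type has features
data['Diagnosis'].num_nodes = 4          # plain node type, no features
data['Patient', 'suffers_from', 'Diagnosis'].edge_index = torch.tensor(
    [[0, 1, 2, 3, 4, 5, 0, 2],
     [0, 1, 1, 2, 3, 0, 2, 3]])          # shape [2, num_edges]

# Adds ('Diagnosis', 'rev_suffers_from', 'Patient') with the transposed edge_index
data = T.ToUndirected()(data)

transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    edge_types=[('Patient', 'suffers_from', 'Diagnosis')],
    rev_edge_types=[('Diagnosis', 'rev_suffers_from', 'Patient')],
)
train_data, val_data, test_data = transform(data)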
Any help would be appreciated!
Thanks

Error: Required number of iterations = 1087633109 exceeds iterMax = 1e+06 ; either increase iterMax, dx, dt or reduce sigma

I am getting this error, and this post tells me that I should decrease sigma, but here is the thing: this code was working fine a couple of months ago. Nothing has changed in the data or the code, so I was wondering why this error appears out of the blue.
And the second point: when I lower sigma to something like 13.1, it appears to run (but I have been waiting for an hour).
sigma = 203.9057
dimyx1 = 1024
A22den = density(Lnetwork, sigma, distance = "path", continuous = TRUE, dimyx = dimyx1)
About Lnetwork
Point pattern on linear network
69436 points
Linear network with 8417 vertices and 8563 lines
Enclosing window: rectangle = [143516.42, 213981.05] x [3353367, 3399153] units
Error: Required number of iterations = 1087633109 exceeds iterMax = 1e+06 ; either increase iterMax, dx, dt or reduce sigma
This is a question about the spatstat package.
The code for handling data on a linear network is still under active development. It has changed in recent public releases of spatstat, and has changed again in the development version. You need to specify exactly which version you are using.
The error report says that the required number of iterations of the algorithm is too large. This occurs because either the smoothing bandwidth sigma is too large, or the spacing dx between sample points along the network is too small. The number of iterations is proportional to (sigma/dx)^2 in most cases.
First, check that the value of sigma is physically reasonable.
Normally you shouldn't have to worry about the algorithm parameter dx because it is determined automatically by default. However, it's possible that your data are causing the code to choose a very small value of dx.
The internal code which automatically determines the spacing dx of sample points along the network has been changed recently, in order to fix several bugs.
I suggest that you specify the algorithm parameters manually. See the help file for densityHeat for information on how to control the spacings. Setting the parameters manually will also ensure greater consistency of the results between different versions of the software.
The quickest solution is to set finespacing=FALSE. This is not the best solution because it still uses some of the automatic rules which may be giving problems. Please read the help file to understand what that does.
Did you update spatstat since this last worked? Probably the internal code for determining spacing on the network etc. changed a bit. The actual computations are done by the function densityHeat(), and you can see how to manually set spacing etc. in its help file.

Transform accelerometer data to object space

For context: I'm developing an embedded system that has an accelerometer built in. This device is connected to a smartphone and streams data (including the accelerometer values). The device can be attached in any orientation to a vehicle / bike / ...
The problem: when I receive the accelerometer data from the device, I would like to transform it into "vehicle space". What I have found out so far is that I need:
A downwards pointing vector, in "device-space" (basically gravitation)
A forward vector, in "device-space" (pointing in the forward direction of the vehicle)
I have both of these vectors calculated in my application; however, I'm now a little bit stuck on the maths / implementation part.
What I found that could possibly be a solution is a change of basis; however, I was not able to
find confirmation that this is the way to do it, or
figure out how to do this in code / pseudo-code.
I don't want to include a fat math library for such a "small" task and would rather understand the maths behind it myself.
The current solution in my head, based on long-ago memories of university maths and for which I have no proof (pseudo-code):
val nfv = normalize(forwardVector)
val ndv = normalize(downwardVector)
val fxd = cross(nfv, ndv)
val rotationMatrix = (
m11: fxd.x, m12: fxd.y, m13: fxd.z,
m21: ndv.x, m22: ndv.y, m23: ndv.z,
m31: nfv.x, m32: nfv.y, m33: nfv.z
)
// Then for each "incoming" vector
val transformedVector = rawVector * rotationMatrix
Question: Is this the correct way to do it?
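In plain Python-style code (no math library), this is roughly how I picture the change of basis; note that the axis convention (x = forward, y = right, z = down) and the extra step that re-orthogonalises the forward vector against gravity are my own guesses:
def normalize(v):
    n = (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    return (v[0] / n, v[1] / n, v[2] / n)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def device_to_vehicle(raw, forward, down):
    # Vehicle frame convention used here: x = forward, y = right, z = down
    down = normalize(down)
    # Make `forward` exactly perpendicular to gravity before normalising
    d = dot(forward, down)
    forward = normalize((forward[0] - d * down[0],
                         forward[1] - d * down[1],
                         forward[2] - d * down[2]))
    right = cross(down, forward)
    # Rows of the rotation matrix are the vehicle axes expressed in device
    # coordinates, so each vehicle component is a dot product with the raw vector
    return (dot(forward, raw), dot(right, raw), dot(down, raw))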

MATH? SHA256 Roulette / Lottery computable?

There is a website where you can play roulette; you can bet on the colors only (red or black = double, green = 14x).
The rolls are computed in the following way:
There is a serverSeed; every 24 hours it is different.
This is a precomputed value generated some time in the past.
Seeds are generated in a chain such that today's seed is the hash of tomorrow's seed. Since there is no way to reverse SHA-256, this proves each seed was generated in advance: you can verify it by working backwards along the precomputed chain.
There is a lotto and a round_id too, but those are given; only the serverSeed is hidden until the next day.
Example:
$server_seed = "39b7d32fcb743c244c569a56d6de4dc27577d6277d6cf155bdcba6d05befcb34";
$lotto = "0422262831";
$round_id = "1";
$hash = hash("sha256",$server_seed."-".$lotto."-".$round_id);
$roll = hexdec(substr($hash,0,8)) % 15;
echo "Round $round_id = $roll";
This is how rolls are generated: a new hash is made every round as the round ID increments by 1 with each roll, while the serverSeed and the lotto remain the same for the whole day.
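The same computation in Python, together with the seed-chain check described above (the seeds passed to the check are hypothetical), would look roughly like this:
import hashlib

def roll(server_seed, lotto, round_id):
    # Same as the PHP above: SHA-256 of "seed-lotto-round", first 8 hex digits, mod 15
    digest = hashlib.sha256(f"{server_seed}-{lotto}-{round_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 15

print(roll("39b7d32fcb743c244c569a56d6de4dc27577d6277d6cf155bdcba6d05befcb34", "0422262831", 1))

# Chain check: today's published seed should equal the SHA-256 hash of the
# seed that gets revealed tomorrow (both arguments are hex strings).
def seed_is_precommitted(todays_seed, tomorrows_seed):
    return hashlib.sha256(tomorrows_seed.encode()).hexdigest() == todays_seed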
There is also a history page on the website where you can check every rolled color and number in the past.
My question: is there any way to compute the next roll from the already-rolled numbers? (I am not talking about reversing the SHA-256 serverSeed or anything like that!)
But really, isn't there any math in this?
I know it MIGHT all be random, but I can't imagine that it is.
Here are yesterday's rolls, where you can see the round IDs too.
I have seen the rolls repeat a lot of the time, but sometimes they don't... I can't believe there is no math in it.
>>>> LUCK? OR MATH? <<<<
Waiting for answers...
The method is an example of "KDF in Counter Mode" as defined in NIST SP 800-108, with SHA-256 as the "PRF". As far as I know, this construction is considered a secure random number generator, so the answer is: there is no math in it, and the repetitions are just luck.

How to compute the velocity & distance from Tri-Axial Accelerometer Data?

I have sample data from a tri-axial accelerometer, as listed below:
Timestamp, AcceX, AcceY, AcceZ
0.0, -0.96, -0.69, -1.24
0.1, ............
I want to determine the velocity and the distance traveled by the object carrying the accelerometer.
Determining the velocity and position from an accelerometer is much harder than it might at first seem.
Depending on your specific needs, there might be a way to form an estimate that would work, but it is generally a very inaccurate approach, and almost worthless given what's usually available on a phone. With specialized equipment one can do much better.
First problem: gravity will be recorded as an acceleration and it's hard to remove (so you at least need a gyro too).
Second problem: to get velocity and distance you need to integrate your acceleration (once for velocity, twice for distance), so small errors accumulate quickly.
To find out more, search for "dead reckoning". See, for example, the first Google hit I got, which seems to explain the issues well enough.
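To make the second problem concrete, the naive double integration would look something like the sketch below (gravity removal and drift correction are deliberately left out, which is exactly where this approach breaks down in practice):
def integrate_naively(samples):
    # samples: list of (t, ax, ay, az) rows like the data in the question.
    # Trapezoidal integration: acceleration -> velocity -> position.
    # Any constant bias (e.g. un-removed gravity) grows linearly in the
    # velocity and quadratically in the position, hence the rapid drift.
    vel = [0.0, 0.0, 0.0]
    pos = [0.0, 0.0, 0.0]
    for (t0, *a0), (t1, *a1) in zip(samples, samples[1:]):
        dt = t1 - t0
        for i in range(3):
            v_prev = vel[i]
            vel[i] += 0.5 * (a0[i] + a1[i]) * dt
            pos[i] += 0.5 * (v_prev + vel[i]) * dt
    return vel, pos

# Made-up rows, just to show the expected input format
samples = [(0.0, -0.96, -0.69, -1.24), (0.1, -0.95, -0.70, -1.22)]
velocity, position = integrate_naively(samples)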
