AnalyserNode's getFloatFrequencyData vs getFloatTimeDomainData

So I think I understand getFloatFrequencyData pretty well. If getFloatFrequencyData returns an array of 1024 values, each value represents the volume of a frequency bin/range. In the case of 1024 values at a sample rate of 44.1 kHz, each value would represent the volume of a frequency range of about 20 hertz (more precisely, 44100 / 2048 ≈ 21.5 Hz per bin).
Now what about getFloatTimeDomainData? Let's say I have 2048 values, what does each value represent?
This is not the same as the question Understanding getByteTimeDomainData and getByteFrequencyData in web audio. Or at least, that question's answer doesn't answer mine.

The Float32Array obtained using getFloatTimeDomainData will contain an array of sample values, each value defining the amplitude at the sampled instant, usually in the range [-1, 1]. Sample instants are uniformly spaced, so the obtained data is essentially the equivalent of raw PCM.
For a sine wave, it would yield gradually changing continuous values along the following approximate curve:
0 ... 0.7 ... 1.0 ... 0.7 ... 0 ... -0.7 ... -1.0 ... -0.7 ... 0 ...
Think of it as a series of subsequent values that together define the shape of the audio wave; if you were to visualize the obtained values on, say, a canvas, using the sample values as y coordinates (amplitude) and a steadily increasing value for x coordinates (time), you would get an oscilloscope-style trace of the waveform.
Note how a sine waveform drawn this way correlates with the example values above. Here are some example operations you can do on this data to get a better understanding (a small sketch follows the list):
If you were to multiply each value by 2, you would amplify the volume by 100% (double volume)
If you were to replace every value with 0, you would get silence
If you were to skip every second value, you would get a 100% pitched up audio (double playback speed)
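These operations are easy to try on a synthetic waveform. Here is a minimal sketch in R (the same array arithmetic applies to the Float32Array you get from getFloatTimeDomainData in JavaScript; the 440 Hz tone is just an assumed example):
fs <- 44100                             # sample rate in Hz
t <- (0:2047) / fs                      # 2048 sample instants
x <- sin(2 * pi * 440 * t)              # a 440 Hz sine, amplitude in [-1, 1]
louder <- 2 * x                         # double the amplitude (double volume)
silence <- 0 * x                        # all zeros: silence
faster <- x[seq(1, length(x), by = 2)]  # keep every second sample: double speed/pitch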

getFloatTimeDomainData returns a snapshot of the PCM data from the audio stream - i.e. the raw audio samples.

Related

Relrisk function and bandwidth selection in spatstat

I'm having trouble interpreting the results I got from relrisk. My data is a multitype point pattern containing two marks (two rodent species, AA and RE), and I want to know whether they are spatially segregated or not.
> summary(REkm)
Marked planar point pattern: 46 points
Average intensity 0.08101444 points per square unit
*Pattern contains duplicated points*
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
Multitype:
   frequency proportion  intensity
AA        15   0.326087 0.02641775
RE        31   0.673913 0.05459669
Window: rectangle = [4, 38] x [0.3, 17] units
                    (34 x 16.7 units)
Window area = 567.8 square units
relkm <- relrisk(REkm)
plot(relkm, main="Relrisk default")
The bandwidth of this relrisk estimation is selected automatically by default (bw.relrisk), but when I tried other numeric values, e.g. sigma=0.5 or 1, the results look somewhat weird.
How did this happen? Is it because of the large proportion of blank space in my ppp?
According to Chapter 14 of the Spatial Point Patterns book and the previous discussion, I assume the default of relrisk will show the ratio of intensities (case divided by control; in my case, RE divided by AA), but if I set casecontrol=FALSE, I can get the spatially varying probability of each type.
Then why does the image of type RE with casecontrol=FALSE look exactly the same as the default relrisk estimation? Or do they both estimate p(RE) = λRE / (λRE + λAA) at each location?
Any help will be appreciated! Thanks a lot!
That's two questions.
Why does the image for RE when casecontrol=FALSE look the same as the default output from relrisk?
The definitive source of information about spatstat functions is the online documentation in the help files. The help file for relrisk.ppp gives full details of the behaviour of this function. It says that the calculation of probabilities and risks is controlled by the argument relative. If relative=FALSE (the default), the code calculates the spatially varying probability of each type. If relative=TRUE it calculates the relative risk of each type i, defined as the ratio of the probability of type i to the probability of type c where c is the type designated as the control. If you wanted the relative risk then you should set relative=TRUE.
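For example, here is a minimal sketch following the help file, assuming AA is taken as the control type as in the question:
p.each <- relrisk(REkm, casecontrol=FALSE)  # spatially varying probability of each type
rr <- relrisk(REkm, casecontrol=FALSE, relative=TRUE, control="AA")  # risk of each type relative to AA
plot(rr, main="Relative risk (control = AA)")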
Very different results obtained when setting sigma=0.5 compared to the automatically selected bandwidth.
Your example output says that the window is 34 by 17 units. A smoothing bandwidth of sigma=0.5 is very small for this region. Imagine each data point being replaced by a blurry circle of radius 0.5; there would be a lot of empty space. The smoothing procedure is encountering numerical problems which are causing the funky artefacts.
You could try a range of different values of sigma, say from 1 to 15, and decide which value produces the most satisfactory result.
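For instance, a quick visual comparison (the sigma values are just the range suggested above):
for (s in c(1, 5, 10, 15)) {
  plot(relrisk(REkm, sigma=s), main=paste("sigma =", s))
}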
The plot of relrisk(REkm, casecontrol=FALSE) suggests that the automatic bandwidth selector bw.relrisk chose a much larger value of sigma, perhaps about 10. You can investigate this by
b <- bw.relrisk(REkm)
print(b)
plot(b)
The print command will print the chosen value of sigma that was used in the default calculation. The plot command will show the cross-validation criterion which was maximised to select the bandwidth. This gives you an idea of the range of values of sigma that are acceptable according to the automatic selector.
Read the help file for bw.relrisk about the different options available for the bandwidth selection method. Maybe a different choice of method would give a result that is more acceptable from your viewpoint.

How to use the output of the inverse Fourier transform?

I am trying to enhance an ultrasonic signal by spectral subtraction. The signal is in the time domain and contains noise. I have divided the signal into Hamming windows of 2 µs and calculated the Fourier transforms of those frames. Then I selected 3 consecutive frames which I interpreted as noise. I averaged the magnitude spectra of those 3 frames and subtracted that average from every single frame's magnitude spectrum. Then I set all negative magnitude values to zero and reconstructed the enhanced Fourier transform by combining the new magnitude spectra with the phase spectra. This gives me a series of complex numbers per frame. Now I would like to transform this series back to the time domain using the inverse Fourier transform. However, this operation gives me complex numbers which I do not know how to use.
I have read in a couple of posts that it is normal to obtain complex output from an inverse Fourier transform. However, opinions on what to do with those complex numbers are divided. Some say to neglect the imaginary part because it is supposed to be very small (~1e-15), but in my case it is not negligible (0.01-0.5). To be honest, I just do not know what to do with the numbers now, because I expected the inverse Fourier transform to give me real numbers only, or at least very small imaginary parts, but unfortunately that is not the case.
library(signal) # provides specgram() and hamming()
#
# General parameters
#
total_samples = length(time_or) # Total number of samples in the current series
max_time = max(time_or) # Length of the measurement in microseconds
sampling_freq = 1/(max_time/1000000)*total_samples # Sampling frequency
frame_length_t = 2 # In microseconds (time)
frame_length_s = round(frame_length_t/1000000*sampling_freq) # In samples per frame
overlap = frame_length_s/2 # Overlap in number of samples, set to 50% overlap
#
# Transform the frame to frequency domain
#
fft_frames = specgram(amp, n=frame_length_s, Fs=125, window=hamming(frame_length_s), overlap=overlap)
mag_spec=abs(fft_frames[["S"]])
phase_spec=atan(Im(fft_frames[["S"]])/Re(fft_frames[["S"]]))
#
# Determine the arrival time of noise
#
cutoff= 10 #determine the percentage of the signal that has to be cut off
dnr=us_data[(length(us_data[,1])*(cutoff/100)):length(us_data[,1]), ]
noise_arr=(length(us_data[,1])-length(dnr[,1])+min(which(dnr[,2]>0.01)))*0.008
#
# Select the frames for noise spectrum estimation
#
noise_spec=0
noise_spec=mag_spec[,noise_arr]
noise_spec=noise_spec+mag_spec[, (noise_arr+1)]
noise_spec=noise_spec+mag_spec[, (noise_arr+2)]
noise_spec_check=noise_spec/3
#
# Subtract the estimated noise spectrum from every frame
#
est_mag_spec=mag_spec-noise_spec_check
est_mag_spec[est_mag_spec < 0] = 0
#
# Transform back to frequency spectrum
#
j=complex(real=0, imaginary=1)
enh_spec = est_mag_spec*exp(j*phase_spec)
#
# Transform back to time domain
#
install.packages("pracma")
library("pracma")
enh_time=fft(enh_spec[,2], inverse=TRUE) # note: this is base R's fft; its inverse is unnormalized, so divide by length(enh_spec[,2]) for the true inverse
I hope someone has an idea of how to process these complex numbers. Maybe I have made a mistake earlier in the processing method, but I have checked it multiple times and it seems quite solid to me. It is the next-to-last step of the process, and I am really hoping to obtain a nice time-domain signal after the inverse Fourier transform.
An essential troubleshooting aid when transforming data using the Fourier transform is the fact that you can do an FFT, then take that data, do an inverse FFT, and get back your original data. I suggest you get comfortable doing this with toy time-domain input data: say, a 1 kHz audio wave. Send it into an FFT call, which returns an array of its frequency-domain representation; without doing anything with that data, send it into an inverse FFT (ifft), and the data returned will be your original 1 kHz audio wave. Do that now to gain an appreciation of its power, and use this trick on your project to confirm you are in the ballpark. Alternatively, if you begin with frequency-domain data you can do the same in reverse:
freq domain data -> ifft -> time domain data -> fft -> same freq domain data
or
time domain -> fft -> freq domain -> ifft -> same time domain data
see more details here Get frequency with highest amplitude from FFT
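Here is a minimal version of that round trip in R, the language of the question's code (a sketch assuming base R's fft, whose inverse is unnormalized and must be divided by the length):
fs <- 8000                                  # sample rate in Hz (toy value)
t <- (0:799) / fs                           # 100 ms of samples
x <- sin(2 * pi * 1000 * t)                 # 1 kHz sine: the time-domain data
X <- fft(x)                                 # forward FFT: frequency-domain representation
x2 <- Re(fft(X, inverse=TRUE)) / length(X)  # inverse FFT, scaled by 1/N
max(abs(x - x2))                            # ~1e-15: the original signal is recovered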
This is your problem:
phase_spec=atan(Im(fft_frames[["S"]])/Re(fft_frames[["S"]]))
Here you compute an angle in a half circle, mapping the other half onto the first. That is, you are losing information.
Many languages have a function to obtain the phase of a complex value; for example, in MATLAB it is angle, in Python it is numpy.angle, and in R it is Arg.
Alternatively, use the atan2 function, which exists in every single language I’ve ever used, except that in NumPy they decided to call it arctan2. It computes the four-quadrant arctangent by taking the two components as separate values. That is, atan(y/x) is the same as atan2(y,x) only when x is positive (the right half-plane).
I presume you can do
phase_spec=atan2(Im(fft_frames[["S"]]), Re(fft_frames[["S"]]))
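Since the question's code is R, base R's Arg function is an equivalent fix; it returns the phase of a complex value directly:
phase_spec = Arg(fft_frames[["S"]])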

Periodogram (TSA In R) can't find correct frequency

I'm trying to process a sinusoidal time series data set.
I am using this code in R:
library(readxl)
library(TSA) # provides periodogram(); stats is attached by default
Data_frame<-read_excel("C:/Users/James/Documents/labssin2.xlsx")
# compute the Fourier Transform
p = periodogram(Data_frame$NormalisedVal)
dd = data.frame(freq=p$freq, spec=p$spec)
order = dd[order(-dd$spec),]
top2 = head(order, 5)
# display the highest "power" frequencies
top2
time = 1/top2$f
time
However when examining the frequency spectrum the frequency (which is in Hz) is ridiculously low ~ 0.02Hz, whereas it should have one much larger frequency of around 1Hz and another smaller one of 0.02Hz (just visually assuming this is a sinusoid enveloped in another sinusoid).
Might be a rather trivial problem, but has anyone got any ideas as to what could be going wrong?
Thanks in advance.
Edit 1: Using
result <- abs(fft(df$Data_frame.NormalisedVal))
Produces what I am expecting to see.
Edit 2: As requested, a text file with the output of dput(Data_frame):
http://m.uploadedit.com/bbtc/1553266283956.txt
The periodogram function returns normalized frequencies in the [0,0.5] range, where 0.5 corresponds to the Nyquist frequency, i.e. half your sampling rate. Since you appear to have data sampled at 60Hz, the spike at 0.02 would correspond to a frequency of 0.02*60 = 1.2Hz, which is consistent with your expectation and in the neighborhood of what can be seen in the data you provided (the bulk of the spike being in the range of 0.7-1.1Hz).
On the other hand, the x-axis on the last graph you show based on the fft is an index and not a frequency. The corresponding frequency should be computed according to the following formula:
f <- (index-1)*fs/N
where fs is the sampling rate, and N is the number of samples used by the fft. So in your graph the same 1.2Hz would appear at an index of ~31 assuming N is approximately 1500.
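Putting this together, a small sketch that rescales the periodogram's normalized frequencies into Hz (assuming fs = 60 Hz, as above):
fs <- 60                                   # sampling rate in Hz (assumed)
p <- periodogram(Data_frame$NormalisedVal)
freq_hz <- p$freq * fs                     # normalized frequency times sampling rate
plot(freq_hz, p$spec, type="h", xlab="Frequency (Hz)", ylab="Spectrum")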
Note: the sampling interval in the data you provided is not quite constant and may affect the results as both periodogram and fft assume a regular sampling interval.

FFT frequency domain values depend on sequence length?

I am extracting Heart Rate Variability (HRV) frequency-domain features, e.g. LF and HF, using the FFT. Currently, I have found that the LF and HF values for a longer sequence, e.g. 3 minutes, are larger than for a shorter sequence, e.g. 30 seconds. I wonder if this is a common observation, or whether there are bugs in my code? Thanks in advance.
Yes, the frequency in each bin depends on N, the sequence length.
See this related answer: https://stackoverflow.com/a/4371627/119527
An FFT by itself is a dimensionless basis transform. But if you know the sample rate (Fs) of the input data and the length (N) of the FFT, then the center frequency represented by each FFT result element or result bin is bin_index * (Fs/N).
Normally (with baseband sampling) the resulting range is from 0 (DC) up to Fs/2 (for strictly real input the rest of the FFT results are just a complex conjugate mirroring of the first half).
Added: Also, many forward FFT implementations (but not all) are energy preserving. Since a longer signal of the same amplitude input into a longer FFT contains more total energy, the FFT result energy will also be greater by the same proportion, either in bin magnitude for sufficiently narrow-band components, and/or by distribution across more bins.
What you are observing is to be expected, at least with most common FFT implementations. Typically there is a scale factor of N in the forward direction and 1 in the reverse direction, so you need to scale the output of the FFT bins by a factor of 1/N if you are interested in calculating spectral energy (or power).
Note however that these scale factors are just a convention; e.g. some implementations have a sqrt(N) scale factor in both forward and reverse directions, so you need to check the documentation for your FFT library to be absolutely certain.
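A quick way to see this is to feed a same-amplitude sine into FFTs of two lengths; a sketch in R, assuming base R's unscaled forward fft:
fs <- 128
for (N in c(256, 1024)) {
  t <- (0:(N-1)) / fs
  x <- sin(2 * pi * 16 * t)   # 16 Hz sine, amplitude 1, exactly on a bin
  X <- fft(x)                 # no 1/N scaling in the forward direction
  cat("N =", N, " peak |X| =", max(Mod(X)), " peak/N =", max(Mod(X))/N, "\n")
}
The raw peak magnitude grows in proportion to N (128 versus 512 here), while peak/N stays at 0.5 in both cases, which is the 1/N scaling described above.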

Offsetting the difference between measured and expected data

I am testing a temperature sensor for a project. I found that there exists a variance between the expected and measured values. As the difference is non-linear over the temperature range, I can't simply add a constant offset. Is there a way I can apply some kind of offset to the acquired data?
UPDATE
I have a commercial heater element which heats up to a set temperature (I call this temperature the expected value). On the other side I have a temperature sensor (my project) which measures the temperature of the heater (I call this the measured value).
I noticed a difference between the measured and expected values which I would like to compensate for, so that the measured value will be close to the expected value.
Example
If my sensor measures 73.3, it should be processed by some means (mathematically or otherwise) so that it shows a value close to 70.28.
Hope this clears thing a little.
Measured   Expected
30.5       30.15
41.4       40.29
52.2       50.31
62.8       60.79
73.3       70.28
83         79.7
94         90.39
104.3      99.97
114.8      109.81
Thank you for your time.
You are interested in describing the deviation of one variable from the other. What you are looking for is the function
g(x) = f(x) - x
which returns an approximation, a prediction, of what number to add to x to get y, based on the real x input. You first need the prediction of y based on observed x values, i.e. f(x). This is what you can get from doing a regression:
x = MeasuredExpected (what you have estimated; I assume you will know this value)
y = MeasuredReal (what has actually been observed instead of x)
f(x) = MeasuredReal(estimated) = alfa*x + beta + e
In the simplest case of just one variable you don't even need special tools for this. The coefficients of the equation are:
alfa = covariance(MeasuredExpected, MeasuredReal) / variance(MeasuredExpected)
beta = average(MeasuredReal) - alfa * average(MeasuredExpected)
so for each expected measurement x you can now state that the most probable real measured value is:
f(x) = MeasuredReal(expected) = alfa*x + beta
(under the assumption that the error is normally distributed and iid)
So you have to add
g(x) = f(x) - x = (alfa - 1)*x + beta
to account for the difference that you have observed between your usual Expected and Measured values.
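Here is a minimal sketch of this in R, regressing the expected values on the measured readings from the question's table, since the goal is to correct a raw reading (the variable names are mine):
measured <- c(30.5, 41.4, 52.2, 62.8, 73.3, 83, 94, 104.3, 114.8)
expected <- c(30.15, 40.29, 50.31, 60.79, 70.28, 79.7, 90.39, 99.97, 109.81)
fit <- lm(expected ~ measured)          # expected = alfa*measured + beta + e
beta <- coef(fit)[1]
alfa <- coef(fit)[2]
g <- function(x) (alfa - 1)*x + beta    # correction g(x) = f(x) - x
73.3 + g(73.3)                          # about 70.5, close to the expected 70.28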
Maybe you could use a data sample to do a regression analysis on the variation, and then use the regression function as an offset function.
http://en.wikipedia.org/wiki/Regression_analysis
You can create a calibration lookup table (LUT).
The error in the sensor reading is not linear over the entire range of the sensor, but you can divide the range up into a number of sub-ranges for which the error within the sub-range is nearly linear. Then you calibrate the sensor by taking a reading in each sub-range and calculating the offset error for each sub-range. Store the offset for each sub-range in an array to create a calibration lookup table.
Once the calibration table is known, you can correct a measurement by performing a table lookup for the proper offset. Use the actual measured value to determine the index into the array from which to get the proper offset.
The sub-ranges don't need to be same-sized although that should make it easy to calculate the proper table index for any measurement. (If the sub-ranges are not same-sized then you could use a multidimensional array (matrix) and store not only the offset but also the beginning or end point of each sub-range. Then you would scan through the begin-points to determine the proper table index for any measurement.)
You can make the correction more accurate by dividing into smaller sub-ranges and creating a larger calibration lookup table. Or you may be able to interpolate between two table entries to get a more accurate offset.
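A minimal sketch of such a lookup table in R; the breakpoints and offsets are hypothetical, derived loosely from the question's table:
breaks <- c(30, 50, 70, 90, 115)       # sub-range boundaries (assumed)
offsets <- c(-0.7, -1.9, -3.2, -4.4)   # offset error per sub-range (assumed)
calibrate <- function(x) {
  i <- findInterval(x, breaks, all.inside=TRUE)  # index of the sub-range containing x
  x + offsets[i]
}
calibrate(73.3)                         # applies the 70-90 sub-range offset: about 70.1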
