How to use the output of the inverse Fourier transform? - r

I am trying to enhance an ultrasonic signal by spectral subtraction. The signal is in the time domain and contains noise. I have divided the signal into Hamming windows of 2 µs and calculated the Fourier transforms of those frames. Then I selected 3 consecutive frames which I interpreted as noise. I averaged the magnitude spectra of those 3 frames and subtracted that average from every single frame's magnitude spectrum. Then I set all negative magnitude values to zero and reconstructed the enhanced Fourier transform by combining the new magnitude spectra with the phase spectra. This gives me a series of complex numbers per frame. Now I would like to transform this series back to the time domain by using the inverse Fourier transform. However, this operation provides me with complex numbers which I do not know how to use.
I have read in a couple of posts that it is normal to obtain a complex output from the inverse Fourier transform. However, opinions on how to use those complex numbers are divided. Some say to neglect the imaginary part because it is supposed to be very small (on the order of 1e-15), but in my case it is not negligible (0.01-0.5). To be honest, I just do not know what to do with the numbers now, because I expected the inverse Fourier transform to give me real numbers only, or at least very small imaginary parts, but unfortunately that is not the case.
# General parameters
#
total_samples = length(time_or) # Total number of samples in the current series
max_time = max(time_or) # Length of the measurement in microseconds
sampling_freq = 1/(max_time/1000000)*total_samples # Sampling frequency
frame_length_t = 2 # In microseconds (time)
frame_length_s = round(frame_length_t/1000000*sampling_freq) # In samples per frame
overlap = frame_length_s/2 # Overlap in number of samples, set to 50% overlap
#
# Transform the frames to the frequency domain
#
library(signal) # provides specgram() and hamming()
fft_frames = specgram(amp, n=frame_length_s, Fs=125, window=hamming(frame_length_s), overlap=overlap)
mag_spec=abs(fft_frames[["S"]])
phase_spec=atan(Im(fft_frames[["S"]])/Re(fft_frames[["S"]]))
#
# Determine the arrival time of noise
#
cutoff= 10 #determine the percentage of the signal that has to be cut off
dnr=us_data[(length(us_data[,1])*(cutoff/100)):length(us_data[,1]), ]
noise_arr=(length(us_data[,1])-length(dnr[,1])+min(which(dnr[,2]>0.01)))*0.008
#
# Select the frames for noise spectrum estimation
#
noise_spec=0
noise_spec=mag_spec[,noise_arr]
noise_spec=noise_spec+mag_spec[, (noise_arr+1)]
noise_spec=noise_spec+mag_spec[, (noise_arr+2)]
noise_spec_check=noise_spec/3
#
# Subtract the estimated noise spectrum from every frame
#
est_mag_spec=mag_spec-noise_spec_check
est_mag_spec[est_mag_spec < 0] = 0
#
# Recombine the new magnitude spectra with the phase spectra into a complex spectrum
#
j=complex(real=0, imaginary=1)
enh_spec = est_mag_spec*exp(j*phase_spec)
#
# Transform back to time domain
#
install.packages("pracma")
library("pracma")
enh_time=fft(enh_spec[,2], inverse=TRUE) # note: base R's fft(..., inverse=TRUE) is unnormalized; divide by the length to recover amplitudes
I hope that someone has an idea of how to process these complex numbers. Maybe I have made a mistake earlier in the processing method, but I have checked it multiple times and it seems quite solid to me. It is the second-to-last step of the process, and I am really hoping to obtain a nice time-domain signal after the inverse Fourier transform.

An essential troubleshooting aid when transforming data with the Fourier transform is the fact that you can do an FFT, take that result, run it through an inverse FFT, and get back your original data. I suggest you get comfortable doing this with toy time-domain input, say a 1 kHz audio wave. Send it into an FFT call, which returns an array holding its frequency-domain representation; without doing anything to that data, send it into an inverse FFT (ifft), and the data returned will be your original 1 kHz wave. Do that now to gain an appreciation of its power, and use this trick on your project to confirm you are in the ballpark (a minimal sketch follows below). Alternatively, if you begin with frequency-domain data you can also do this:
freq domain data -> ifft -> time domain data -> fft -> same freq domain data
or
time domain -> fft -> freq domain -> ifft -> same time domain data
See more details here: Get frequency with highest amplitude from FFT
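For example, a minimal round-trip sketch in base R, assuming a 1 kHz tone sampled at 8 kHz (note that base R's fft(..., inverse=TRUE) is unnormalized, so divide by the length):
fs <- 8000
t  <- seq(0, 0.01, by = 1/fs)
x  <- sin(2*pi*1000*t)                        # 1 kHz time-domain test signal
X  <- fft(x)                                  # frequency-domain representation
x2 <- Re(fft(X, inverse = TRUE))/length(X)    # normalized inverse FFT
max(abs(x - x2))                              # should be on the order of 1e-15
If this difference is not tiny, something in the processing chain between the forward and inverse transform is altering the spectrum.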

This is your problem:
phase_spec=atan(Im(fft_frames[["S"]])/Re(fft_frames[["S"]]))
Here you compute an angle within only half the circle, mapping the other half onto the first. That is, you are losing information.
Many languages have a function to obtain the phase of a complex value, for example in MATLAB it is angle, and in Python numpy.angle.
Alternatively, use the atan2 function, which exists in every single language I’ve ever used (except that in NumPy they decided to call it arctan2). It computes the four-quadrant arctangent by taking the two components as separate values. That is, atan(y/x) is the same as atan2(y,x) only when the point lies in the first or fourth quadrant (i.e. when the real part is positive).
In R, atan2 is available in base, so you can do
phase_spec=atan2(Im(fft_frames[["S"]]), Re(fft_frames[["S"]]))
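A quick sketch of the difference in base R (Arg() is R's built-in equivalent of MATLAB's angle):
z <- complex(real = -1, imaginary = -1)   # true angle is -3*pi/4
atan(Im(z)/Re(z))                         # pi/4: folded into the wrong half of the circle
atan2(Im(z), Re(z))                       # -3*pi/4: the correct four-quadrant angle
Arg(z)                                    # same result as atan2(Im(z), Re(z))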

Related

Periodogram (TSA In R) can't find correct frequency

I'm trying to process a sinusoidal time series data set:
I am using this code in R:
library(readxl)
library(stats)
library(matplot.lib)
library(TSA)
Data_frame<-read_excel("C:/Users/James/Documents/labssin2.xlsx")
# compute the Fourier Transform
p = periodogram(Data_frame$NormalisedVal)
dd = data.frame(freq=p$freq, spec=p$spec)
order = dd[order(-dd$spec),]
top2 = head(order, 5)
# display the 2 highest "power" frequencies
top2
time = 1/top2$f
time
However when examining the frequency spectrum the frequency (which is in Hz) is ridiculously low ~ 0.02Hz, whereas it should have one much larger frequency of around 1Hz and another smaller one of 0.02Hz (just visually assuming this is a sinusoid enveloped in another sinusoid).
Might be a rather trivial problem, but has anyone got any ideas as to what could be going wrong?
Thanks in advance.
Edit 1: Using
result <- abs(fft(df$Data_frame.NormalisedVal))
Produces what I am expecting to see.
Edit 2: As requested, a text file with the output of dput(Data_frame):
http://m.uploadedit.com/bbtc/1553266283956.txt
The periodogram function returns normalized frequencies in the [0,0.5] range, where 0.5 corresponds to the Nyquist frequency, i.e. half your sampling rate. Since you appear to have data sampled at 60Hz, the spike at 0.02 would correspond to a frequency of 0.02*60 = 1.2Hz, which is consistent with your expectation and in the neighborhood of what can be seen in the data you provided (the bulk of the spike being in the range of 0.7-1.1Hz).
On the other hand, the x-axis on the last graph you show based on the fft is an index and not a frequency. The corresponding frequency should be computed according to the following formula:
f <- (index-1)*fs/N
where fs is the sampling rate, and N is the number of samples used by the fft. So in your graph the same 1.2Hz would appear at an index of ~31 assuming N is approximately 1500.
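Putting both conversions together, a sketch assuming the 60 Hz sampling rate mentioned above:
fs <- 60                                      # assumed sampling rate in Hz
p <- periodogram(Data_frame$NormalisedVal)    # TSA::periodogram, normalized freq in [0, 0.5]
freq_hz <- p$freq*fs                          # convert normalized frequency to Hz

N <- length(Data_frame$NormalisedVal)         # number of samples used by the fft
index <- 31                                   # example bin index (1-based)
f <- (index - 1)*fs/N                         # ~1.2 Hz when N is roughly 1500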
Note: the sampling interval in the data you provided is not quite constant and may affect the results as both periodogram and fft assume a regular sampling interval.

FFT frequency domain values depend on sequence length?

I am extracting Heart Rate Variability (HRV) frequency domain features, e.g. LF, HF, using FFT. I have found that the LF and HF values for a longer sequence, e.g. 3 minutes, are larger than for a shorter sequence, e.g. 30 seconds. I wonder whether this is a common observation or whether there are bugs in my code? Thanks in advance
Yes, the frequency in each bin depends on N, the sequence length.
See this related answer: https://stackoverflow.com/a/4371627/119527
An FFT by itself is a dimensionless basis transform. But if you know the sample rate (Fs) of the input data and the length (N) of the FFT, then the center frequency represented by each FFT result element or result bin is bin_index * (Fs/N).
Normally (with baseband sampling) the resulting range is from 0 (DC) up to Fs/2 (for strictly real input the rest of the FFT results are just a complex conjugate mirroring of the first half).
Added: Also many forward FFT implementations (but not all) are energy preserving. Since a longer signal of the same amplitude input into a longer FFT contains more total energy, the FFT result energy will also be greater by the same proportion, either by bin magnitude for sufficiently narrow-band components, and/or by distribution into more bins.
What you are observing is to be expected, at least with most common FFT implementations. Typically there is a scale factor of N in the forward direction and 1 in the reverse direction, so you need to scale the output of the FFT bins by a factor of 1/N if you are interested in calculating spectral energy (or power).
Note however that these scale factors are just a convention, e.g. some implementations have a sqrt(N) scale factor in both forward and reverse directions, so you need to check the documentation for your FFT library to be absolutely certain.
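To see this effect, a small sketch with a pure tone (base R fft, no windowing; the 4 Hz sampling rate and the 0.1 Hz tone in the LF band are assumptions for illustration):
fs <- 4                                          # assumed sampling rate in Hz
f0 <- 0.1                                        # test tone in the LF band
for (dur in c(30, 180)) {                        # 30 s vs 3 min segments
  n <- dur*fs
  x <- sin(2*pi*f0*(0:(n - 1))/fs)
  peak <- max(abs(fft(x)))
  cat("N =", n, " raw peak =", round(peak, 1), " peak/N =", round(peak/n, 3), "\n")
}
The raw peak grows in proportion to N, while peak/N stays at about 0.5 in both cases.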

Phase/Amplitude Formula in R for Fourier Transformation

So I am trying to find 3 things given a certain function in the x domain when transformed into the spectral domain.
the Amplitude
The Frequency
The Phase
In R (statistical software) I have coded the following function:
y=7*cos(2*pi*(seq(-50,50,by=.01)*(1/9))+32)
fty=fft(y,inverse=F)
angle=atan2(Im(fty), Re(fty))
x=which(abs(fty)[1:(length(fty)/2)]==max(abs(fty)[1:(length(fty)/2)]))
par(mfcol=c(2,1))
plot(seq(-50,50,by=.01),y,type="l",ylab = "Cosine Function")
plot(abs(fty),xlim=c(x-30,x+30),type="l",ylab="Spectral Density in hz")
I know I can compute the frequency manually by taking the bin value and dividing it by the size of the interval (the total time of the domain). Since the bins start at 1, when they should start at zero, it is frequency=(BinValue-1)/MaxTime, which does give me the 1/9 I have in the function above.
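For reference, that manual calculation would look something like this (a sketch assuming the 100-unit domain of seq(-50,50,by=.01) and the peak index x found above):
max_time <- 100                  # total length of the x domain
freq_est <- (x - 1)/max_time     # (BinValue-1)/MaxTime, roughly 1/9
freq_est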
I have two quick questions:
First) I am having trouble computing the phase. Is there a prebuilt R function that can give me the phase? From a manual calculation, the density function peaks at bin 12 (see bottom graph), so shouldn't the phase then be 2*pi+angle[12]? But I am getting a value of
angle[12] [1] -2.558724
which puts the phase at 2*pi+angle[12]=3.724462. But that's wrong; the phase should be 32 radians. What am I doing wrong?
Second) Is there a function that can automatically convert abs(fty)[12]=34351.41 to the amplitude number I have in front of the cosine, which is 7?

Trying to do a simulation in R

I'm pretty new to R, so I hope you can help me!
I'm trying to do a simulation for my Bachelor's thesis, where I want to simulate how a stock evolves.
I've done the simulation in Excel, but the problem is that I can't make that large of a simulation, as the program crashes! Therefore I'm trying in R.
The stock evolves as follows (everything except $\epsilon$ consists of constants which are known):
$$W_{t+\Delta t} = W_t\, e^{r \Delta t}\left(1+\pi\left(\exp\left((\sigma \lambda -0.5\sigma^2) \Delta t+\sigma \epsilon_{t+\Delta t} \sqrt{\Delta t}-1\right)\right)\right)$$
The only thing here which is stochastic is $\epsilon$, which is represented by a Brownian motion with N(0,1).
What I've done in Excel:
Made 100 samples with a size of 40. All these samples are standard normal distributed: N(0,1).
Then these outcomes are used to calculate how the stock is affected from these (the normal distribution represent the shocks from the economy).
My problem in R:
I've used the sample function:
x <- sample(norm(0,1), 1000, T)
So I have 1000 samples, which are normally distributed. Now I don't know how to put these results into the formula I have for the evolution of my stock. Can anyone help?
Using R for (discrete) simulation
There are two aspects to your question: conceptual and coding.
Let's deal with the conceptual first, starting with the meaning of your equation:
1. Conceptual issues
The first thing to note is that your evolution equation is continuous in time, so running your simulation as described above means accepting a discretisation of the problem. Whether or not that is appropriate depends on your model and how you have obtained the evolution equation.
If you do run a discrete simulation, then the key decision you have to make is what stepsize $\Delta t$ you will use. You can explore different step-sizes to observe the effect of step-size, or you can proceed analytically and attempt to derive an appropriate step-size.
Once you have your step-size, your simulation consists of pulling new shocks (samples of your standard normal distribution), and evolving the equation iteratively until the desired time has elapsed. The final state $W_t$ is then available for you to analyse however you wish. (If you retain all of the $W_t$, you have a distribution of the trajectory of the system as well, which you can analyse.)
So:
your $x$ are a sampled distribution of your shocks, i.e. they are $\epsilon_{t=0}$.
To simulate the evolution of the $W_t$, you will need some initial condition $W_0$. What this is depends on what you're modelling. If you're modelling the likely values of a single stock starting at an initial price $W_0$, then your initial state is a 1000 element vector with constant value.
Now evaluate your equation, plugging in all your constants, $W_0$, and your initial shocks $\epsilon_0 = x$ to get the distribution of prices $W_1$.
Repeat: sample $x$ again -- this is now $\epsilon_1$. Plugging this in, gives you $W_2$ etc.
2. Coding the simulation (simple example)
One of the useful features of R is that most operators work element-wise over vectors.
So you can pretty much type in your equation more or less as it is.
I've made a few assumptions about the parameters in your equation, and I've ignored the $\pi$ function -- you can add that in later.
So you end up with code that looks something like this:
dt <- 0.5 # step-size
r <- 1 # parameters
lambda <- 1
sigma <- 1 # std deviation
w0 <- rep(1,1000) # presumed initial condition -- prices start at 1
# Show an example iteration -- incorporate into one line for production code...
x <- rnorm(1000,mean=0,sd=1) # random shock
w1 <- w0*exp(r*dt)*(1+exp((sigma*lambda-0.5*sigma^2)*dt +
sigma*x*sqrt(dt) -1)) # evolution
When you're ready to let the simulation run, then merge the last two lines, i.e. include the sampling statement in the evolution statement. You then get one line of code which you can run manually or embed into a loop, along with any other analysis you want to run.
# General simulation step
w <- w*exp(r*dt)*(1+exp((sigma*lambda-0.5*sigma^2)*dt +
sigma*rnorm(1000,mean=0,sd=1)*sqrt(dt) -1))
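A sketch of embedding that step in a loop (assuming, for illustration, 40 time steps as in the Excel setup described in the question, and keeping the whole trajectory):
n_steps <- 40
W <- matrix(NA_real_, nrow = 1000, ncol = n_steps + 1)
W[,1] <- w0
for (s in 1:n_steps) {
  W[,s+1] <- W[,s]*exp(r*dt)*(1+exp((sigma*lambda-0.5*sigma^2)*dt +
               sigma*rnorm(1000,mean=0,sd=1)*sqrt(dt) -1))
}
w <- W[,n_steps+1]   # final distribution of prices; earlier columns hold the full trajectory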
You can also easily visualise the changes and obtain summary statistics (5-number summary):
hist(w)
summary(w)
Of course, you'll still need to work through the details of what you actually want to model and how you want to go about analysing it --- and you've got the $\pi$ function to deal with --- but this should get you started toward using R for discrete simulation.

Wavelet reconstruction of time series

I'm trying to reconstruct the original time series from a Morlet's wavelet transform. I'm working in R, package Rwave, function cwt. The result of this function is a matrix of n*m (n=period, m=time) containing complex values.
To reconstruct the signal I used formula (11) in Torrence & Compo's classic text, but the result has nothing to do with the original signal. I'm especially concerned with the division of the real part of the wavelet transform by the scale; this step completely distorts the result. On the other hand, if I just sum the real parts over all the scales, the result is quite similar to the original time series, but with slightly wider values (the original series ranges ~[-0.2, 0.5], the reconstructed series ranges ~[-0.4, 0.7]).
I'm wondering if someone could point me to a practical procedure, formula or algorithm to reconstruct the original time series. I've already read the papers of Torrence and Compo (1998), Farge (1992) and other books, all with different formulas, but none of them really helped me.
I have been working on this topic recently, using the same paper. Below I show code using an example dataset, detailing how I implemented the wavelet decomposition and reconstruction procedure.
# Let's first write a function for the wavelet decomposition as in formula (1):
mo <- function(t, trans=0, omega=6, j=0){
  dial <- 2*2^(j*.125)
  sqrt((1/dial))*pi^(-1/4)*exp(1i*omega*((t-trans)/dial))*exp(-((t-trans)/dial)^2/2)
}
# An example time series data:
y<-as.numeric(LakeHuron)
From my experience, for correct reconstruction you should do two things: first subtract the mean to get a zero-mean dataset, then increase the maximal scale. I mostly use 110 (although the formula in Torrence and Compo suggests 71).
# subtract mean from data:
y.m<-mean(y)
y.madj<-y-y.m
# increase the scale:
J<-110
wt<-matrix(rep(NA,(length(y.madj))*(J+1)),ncol=(J+1))
# Wavelet decomposition:
for(j in 0:J){
  for(k in 1:length(y.madj)){
    wt[k,j+1]<-mo(t=1:(length(y.madj)),j=j,trans=k)%*%y.madj
  }
}
#Extract the real part for the reconstruction:
wt.r<-Re(wt)
# Reconstruct as in formula (11):
dial<-2*2^(0:J*.125)
rec<-rep(NA,(length(y.madj)))
for(l in 1:(length(y.madj))){
  rec[l]<-0.2144548*sum(wt.r[l,]/sqrt(dial))
}
rec<-rec+y.m
plot(y,type="l")
lines(rec,col=2)
As you can see in the resulting plot, it looks like a perfect reconstruction.
