Graph a timer based on counts - graphite

I am sending timer metrics to Graphite and would like to create either a graph of the counts over time or a line graph based on the values, but I cannot get either to work.
Tried playing with Graphite web app but no luck.
Example metrics
timer1 200
timer1 300
timer1 200

I'm trying to understand what you want. Given the following data logged in [metric] [data] [timestamp] format
timer1 200 1s
timer1 300 50s
timer1 200 100s
(I know my timestamps are fake non-posix times, just trying to keep it simple)
Would you like to see something like
(A)
timer1's values summed by minute, so the (x, y) coordinates (0, 500) and (1, 200)
(B)
timer1's values summing up continuously over time like a counter, so the (x, y) coordinates (0, 200), (50, 500), (100, 700)
For (A):
summarize(timer1, "1min")
For (B):
integral(timer1)
You can apply functions to your data in the graphite webapp by putting the raw series data in a graph, then click Graph Data -> select your series -> Apply Function
Also read over the documentation for different functions
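To make the difference between (A) and (B) concrete, here is a small Python sketch that reproduces the arithmetic on the three example points. It only illustrates the idea; it is not how Graphite implements summarize or integral, and the helper names sum_by_bucket and running_total are made up for this example.

```python
from collections import defaultdict
from itertools import accumulate

# (timestamp in seconds, value) pairs from the example above
points = [(1, 200), (50, 300), (100, 200)]


def sum_by_bucket(points, bucket_seconds=60):
    """Option (A): sum the values that fall into each time bucket."""
    totals = defaultdict(int)
    for timestamp, value in points:
        totals[timestamp // bucket_seconds] += value
    return dict(totals)


def running_total(points):
    """Option (B): cumulative sum of the values in time order."""
    ordered = sorted(points)
    return list(accumulate(value for _, value in ordered))


print(sum_by_bucket(points))   # {0: 500, 1: 200}
print(running_total(points))   # [200, 500, 700]
```

The first result matches the per-minute sums in (A), and the second matches the y values of the counter-style series in (B).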

Related

Interpolating blinks in eyetracking data - start/end intervals as time points

So, I apologise in advance for my poor attempt at explaining myself. I am rather lost.
Summary:
I am working with the eyelinker package in R to analyse pupil size data in a time-series fashion.
I have managed to create a set of intervals where blinks start and end (extendedBlinks); each interval is extended by 150 milliseconds in both directions (the data is sampled at 1000 Hz).
# Define set of intervals for blinks
Blk <- cbind(df$blinks$stime, df$blinks$etime)
# Extend blinks (150 milliseconds each way)
extendedBlinks <- Intervals(Blk) %>% expand(150, "absolute")
head(extendedBlinks)
output:
Object of class Intervals
6 intervals over R:
[4485724, 4486141]
[4485984, 4486657]
[4486549, 4486853]
[4486595, 4487040]
[4486800, 4489142]
[4498990, 4499339]
In my dataframe, I have PSL (Pupil Size Left), PSR (Pupil Size Right), and time (relative to the eyetracker, with the same values as the intervals shown above).
So, I want to get the PSL/PSR (for the sake of the example, let's just stick to getting the PSL).
I've tried many things, but nothing seems to work for me. I want to replace the given values in y1 with extendedBlinks[1,1] and extendedBlinks[1,2] respectively (and then iterate over the intervals to interpolate the blinks).
# Interpolation
x1 <- c(extendedBlinks[1,1],extendedBlinks[1,2])
y1 <- c(500, 550)
interp <- approx(x1,y1, n = extendedBlinks[1,2]-extendedBlinks[1,1])
plot(interp)
Again, sorry for the poorly worded question. I'll edit as I receive feedback to try and make it clearer.
Any ideas?
Cheers!

How to convert a spectrogram matrix into wav file

Is there a way to convert a matrix representing a grayscale spectrogram (values non-complex and between 0 and 1) like the one shown in the image below back into a sound file, e.g. wav file? This post explains how to do it with a seewave spectrogram using the istft function. However, in my case I see two problems which need to be solved:
The original spectrogram (obtained by signal::specgram) is lost and matrix dimensions are different from the original spectrogram (i.e. both frequency and time are up-/ or downsampled) while exact frequency and time values for each row and each column are known
The matrix values range between 0 and 1 and are not complex as required by istft
Furthermore, the dimensions of the original spectrogram, the sample frequency of the original wave object and the window length and overlap used to obtain the original spectrogram are known.
Thank you!
audio is just a curve which wobbles over time where this wobble mirrors your eardrum or microphone pickup membrane ... this signal is in the time domain where axis are time on X and curve height on Y ... typical CD quality audio has 44,100 samples per second meaning you capture that number of points on this audio curve per second ... what gets captured is the audio curve height whereas time is implied knowing each sample is captured in a known sample rate ... so sample rate is one of the two critical audio attributes on digital audio ... bit depth is the other attribute ... if you devote two bytes ( 16 bits ) to record CD quality curve height you get 2 raised to the 16th power ( 2^16 == 65536 ) distinct possible values to store the curve height
it's critical to emphasize a raw audio signal is in the time domain (X is time Y is curve height) ... when you send a set of these samples into a fft call the data gets transformed into the frequency domain (X is frequency Y is magnitude [energy]) so the direct dimension of time is gone yet is baked into the notion of that entire body of frequency domain data ... there are trade offs when deciding the number of samples you feed into the fft call ( sample window size ) ... to increase the frequency resolution of the freq domain signal ( to lower incr_freq ) you need more audio samples fed into the fft call ... however to gain temporal specificity in the freq domain you need as few samples as possible which you pay for by getting a coarser frequency resolution ( the nyquist limit itself is set by the sample rate, not by the window size )
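the trade off can be put into numbers with a few lines of python ... this is only an illustrative sketch, the function name fft_window_tradeoff is made up for this example, and it uses nothing beyond the relation incr_freq = sample_rate / number_of_samples

```python
def fft_window_tradeoff(sample_rate, window_sizes):
    """Return (window size, Hz per bin, seconds per window) for each window size."""
    rows = []
    for number_of_samples in window_sizes:
        incr_freq = sample_rate / number_of_samples       # spacing between adjacent freq bins
        window_seconds = number_of_samples / sample_rate  # stretch of time one fft call covers
        rows.append((number_of_samples, incr_freq, window_seconds))
    return rows


for size, incr_freq, window_seconds in fft_window_tradeoff(44100, [1024, 4096, 44100]):
    print(f"{size:6d} samples -> {incr_freq:8.3f} Hz per bin, "
          f"{window_seconds * 1000:7.1f} ms per window")
```

a bigger window gives finer frequency bins but each fft result then describes a longer stretch of time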
to generate a spectrogram you feed a memory buffer of say 4096 samples of this curve height array ( time domain ) into a Fourier Transform ( fft ) which will return back an array ( freq domain ) of same number of array elements yet this time each element stores a complex number from which you can calculate the magnitude ( energy level ) and phase ... array element zero is the DC bias which can be ignored ... each array element represents a distinct frequency where the freq increment can be calculated
with sample_rate of 44100 samples per second, and one second worth of samples ( 44100 )
this gives you a frequency increment resolution of 1 hertz ... IE each freq bin is 1 Hertz apart
incr_freq := sample_rate / number_of_samples
nyquist_limit_index := int(number_of_samples / 2)
here is how you can iterate across the array complex_fft (in go not r)
for index_fft, curr_complex := range complex_fft { // we really only use half this range + 1
    curr_freq := float64(index_fft) * incr_freq // frequency of this bin (assumes incr_freq is a float64)
    if index_fft <= nyquist_limit_index && curr_freq >= min_freq && curr_freq < max_freq {
        curr_real = real(curr_complex) // pluck out real portion of complex number
        curr_imag = imag(curr_complex) // ditto for imaginary portion
        curr_mag = 2.0 * math.Sqrt(curr_real*curr_real+curr_imag*curr_imag) / number_of_samples
        curr_theta = math.Atan2(curr_imag, curr_real)
        curr_dftt := discrete_fft{
            real:      2.0 * curr_real,
            imaginary: 2.0 * curr_imag,
            magnitude: curr_mag,
            theta:     curr_theta,
        }
        // ... store or use curr_dftt here
    }
}
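for readers who do not use go, roughly the same walk over the freq bins can be sketched in plain python ... this is an illustration only, it uses a slow textbook DFT so it needs no libraries, and the helper names dft and bin_table plus the 5 Hz test tone are assumptions made for this example

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform (fine for small demos)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def bin_table(samples, sample_rate):
    """Return (frequency, magnitude, phase) for every bin from 1 up to the nyquist index."""
    n = len(samples)
    incr_freq = sample_rate / n
    nyquist_limit_index = n // 2
    spectrum = dft(samples)
    table = []
    for index_fft in range(1, nyquist_limit_index + 1):  # index 0 is the DC bias, skipped
        curr_complex = spectrum[index_fft]
        curr_mag = 2.0 * abs(curr_complex) / n
        curr_theta = math.atan2(curr_complex.imag, curr_complex.real)
        table.append((index_fft * incr_freq, curr_mag, curr_theta))
    return table


# one second of a 5 Hz tone with amplitude 0.5, sampled 64 times per second
sample_rate = 64
samples = [0.5 * math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]
peak_freq, peak_mag, _ = max(bin_table(samples, sample_rate), key=lambda row: row[1])
print(peak_freq, round(peak_mag, 6))
```

the strongest bin sits at the tone's frequency and its scaled magnitude matches the tone's amplitude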
as time marches along you repeat above process of feeding the next set of 4096 samples into the fft api call so you collect a set of pairs of time domain arrays and their corresponding freq domain representation
the process which created your plot has done this repeat process which is why time is shown as X axis ... on your plot each vertical bar of data represents output from single fft call where its resultant magnitude is shown as the dark portions of that vertical bar and the lighter dots on the plot show the lower energy frequencies ... only after the process which generated that plot progressed over time was the data available to plot the next vertical bar as the plot progressed from left to right hence the time axis across the X axis on bottom
another critical insight is to be aware you can start with audio (time domain) ... populate a window of samples ( 4096 for example ) and send this array into a fft call to obtain a new array (freq domain) of frequencies each with its magnitude and phase ... here is the pure magic, you can then perform an inverse Fourier Transform ( ifft ) on this freq domain array to get an array in the time domain which will match (to a 1st approx ) your original input audio signal
so in your case walk across your data from left to right on the plot and for each set of vertical magnitude values ( indicated by grayscale ) which is a single frequency domain array perform this inverse Fourier Transform which will give you the raw audio signal ( time domain ) only for a very quick segment of time ( as defined by the 4096 audio samples or similar ) ... this raw audio is the payload portion of a wav file ... repeat this process for the next vertical column of data until you have walked across the entire plot from left to right ... stitch together this sequence of payload buffers into a wav file
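a caveat worth checking before building on this ... the round trip fft -> ifft recovers the audio only when the complex values ( magnitude and phase ) are kept, whereas a grayscale matrix holds magnitudes only ... the python sketch below ( naive DFT, no libraries, helper names dft and idft made up for this example ) compares the two cases on a short test tone

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse transform, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


n = 32
signal = [math.sin(2 * math.pi * 3 * t / n + 0.7) for t in range(n)]
spectrum = dft(signal)

round_trip = idft(spectrum)                         # magnitude and phase kept
magnitude_only = idft([abs(c) for c in spectrum])   # phase thrown away

round_trip_error = max(abs(a - b) for a, b in zip(signal, round_trip))
magnitude_only_error = max(abs(a - b) for a, b in zip(signal, magnitude_only))
print(round_trip_error, magnitude_only_error)
```

the first error is at floating point noise level while the second is large, so for a magnitude-only spectrogram some phase estimate has to be supplied before the stitched result can resemble the original audio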

Why the need for a mask when performing Fast Fourier Transform?

I'm trying to find out the peak frequencies hidden in my data using the fft() method in R. While preparing the data, a more experienced user recommends to create a "mask" (more after explaining the details), that does give me the exact diagram I'm looking for. The problem is, I don't understand what it does or why it's needed.
To give some context, I'm working with .txt files with around 12000 entries each. It's voltage vs. time information, and the expected result is just a sinusoidal wave with a clear peak frequency that should be close to 1-2 Hz. This is an example of what one of those files looks like:
I've been trying to use the Fast Fourier Transform method fft() implemented in R to find the peak frequencies and get a diagram that reflected them clearly. At first, I calculate some things that I understand are going to be useful, like the Nyquist frequency and the range of frequencies I'll show in the final graph:
n = length(variable)
dt = time[5]-time[4]
df = 1/(max(time)) #Find out the "unit" frequency
fnyquist = 1/(2*dt) #The Nyquist frequency
f = seq(-fnyquist, fnyquist-df, by=df) #These are the frequencies I'll plot
But when I plot the absolute value of what fft(data) calculates vs. the range of frequencies, I get this:
The peak frequency seems to be close to 50 Hz, but I know that's not the case. It should be close to 1 Hz. I'm a complete newbie in R and in Fourier analysis, so after researching a little, I found on a Swiss page that this can be solved by creating a "mask", which is actually just a vector with a repeating pattern (1, -1, 1, -1...) of the same length as my data vector:
mask=rep(c(1, -1),length.out=n)
Then if I multiply my data vector by this mask and plot the results:
results = mask*data
plot(f,abs(fft(results)),type="h")
I get what I was looking for. (This is the graph after limiting the x-axis to a reasonable scale).
So, what's the mask actually doing? I understand it's flipping the sign of every other data point, but I don't get why it would move the inferred peak frequencies from ~50 Hz to the correct result of ~1 Hz.
Thanks in advance!
Your "mask" is one of two methods of performing an fftshift, which is commonly done to center the 0 Hz output of an FFT in the middle of a graph or plot (instead of at the left edge, with the negative frequencies wrapping around to the right edge).
To perform an fftshift, you can heterodyne or modulate your data (by Fs/2) before the FFT, or simply do a circular shift by 50% after the FFT. Both produce the same result. They are the same due to the shift property of the DFT.
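The equivalence can be checked numerically. The Python sketch below is only an illustration: it uses a naive DFT so that it needs no libraries, the test data is arbitrary, and it assumes an even number of samples. It compares the DFT of the masked data with the DFT of the raw data rotated by half its length.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


n = 16  # must be even for the two methods to agree exactly
data = [math.sin(2 * math.pi * 2 * t / n) + 0.25 * t for t in range(n)]

# Method 1: multiply by the (1, -1, 1, -1, ...) mask before the transform
mask = [1 if t % 2 == 0 else -1 for t in range(n)]
masked_spectrum = dft([m * x for m, x in zip(mask, data)])

# Method 2: transform first, then circularly shift the result by n/2
spectrum = dft(data)
half = n // 2
shifted_spectrum = spectrum[half:] + spectrum[:half]

max_difference = max(abs(a - b) for a, b in zip(masked_spectrum, shifted_spectrum))
print(max_difference)
```

The difference is at the level of floating-point rounding, because multiplying sample t by (-1)^t is the same as modulating by a complex exponential at Fs/2, which moves every bin by n/2 positions.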

Find start point (time) of each cycle in a sine wave

I am trying to achieve a sine wave gradually changing from 8Hz to 2Hz over 5 seconds:
This waveform was produced in Cool Edit. I gave it a start frequency of 8Hz, an end frequency of 2Hz and a duration of 5 seconds. The sine wave gradually changes from one frequency to the other over the given time.
My question is, how can I accurately find the start time of each cycle (highlighted with a red dot), using a FOR loop?
Pseudo code:
time = 5 //Duration
freq1 = 8 //Start frequency
freq2 = 2 //End frequency
cycles = ( (freq1 + freq2) / 2 ) * time //Total number of cycles
for(i = 0; i < cycles; i++) {
/* Formula to find start time of each cycle */
}
That approach is backwards for this problem and will make the program needlessly complicated. On top of that, the individual cycles will not be pure sine waves because the frequency is changing (they will be slightly distorted), which your generator will not reproduce, and there is only a slight chance that the signal will end on zero after 5 s. Instead, generate a continuous sine wave with variable frequency:
First compute actual frequency
linear interpolation will suffice (unless you need different change)
f=f0+(f1-f0)*t/T
where:
f0=8 [Hz] start frequency
f1=2 [Hz] stop frequency
T =5 [s] change time
t =<0,T> is actual time in [s]
compute the sin wave data
for (t=0.0,angle=0.0;t<=T;t+=dt)
    {
    f=f0+((f1-f0)*t/T); // actual frequency
    signal=Amplitude*sin(angle); // your signal put it in a array or output somewhere ...
    angle+=6.283185307179586476925286766559*dt*f; // update phase
    while (angle>6.283185307179586476925286766559) // cut just to avoid floating rounding problems
        angle-=6.283185307179586476925286766559;
    }
Where dt [s] is the time step you want to sample your signal with. If you are generating this in real time and outputting to real HW, you can use a timer or measure the time directly (with performance counters on Windows, with RDTSC, or with whatever you have at your disposal).
If you have a predefined number of samples n for this, then
dt=T/double(n-1);
Here sample output (n=image width):
If you also need the number of periods, add a counter increment inside the while loop that cuts the angle. That wrap is also your zero point (the start of a cycle), but if the sample rate is too low or you need high precision, you need to interpolate the real zero position.
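Here is a Python sketch of that counting idea, offered as an illustration rather than a definitive implementation. It accumulates the phase in cycles instead of radians, records the time of every wrap, and interpolates linearly inside the step. The closed-form function is an addition that is not part of the answer above: it follows from integrating the linear frequency ramp, and it is included only as a cross-check. The names cycle_starts and cycle_start_closed_form are made up for this example.

```python
import math


def cycle_starts(f0, f1, T, dt):
    """Step through time, accumulate phase (in cycles) and record each wrap."""
    starts = [0.0]          # the first cycle starts at t = 0
    phase = 0.0             # fraction of the current cycle completed
    steps = int(round(T / dt))
    for i in range(steps):
        t = i * dt
        f = f0 + (f1 - f0) * t / T          # actual frequency at time t
        new_phase = phase + f * dt
        if new_phase >= 1.0:
            # linear interpolation of the crossing inside this step
            frac = (1.0 - phase) / (new_phase - phase)
            starts.append(t + frac * dt)
            new_phase -= 1.0
        phase = new_phase
    return starts


def cycle_start_closed_form(k, f0, f1, T):
    """Solve f0*t + (f1 - f0)*t^2 / (2*T) = k for t (start of cycle k)."""
    a = (f1 - f0) / (2.0 * T)
    if a == 0.0:
        return k / f0
    return (-f0 + math.sqrt(f0 * f0 + 4.0 * a * k)) / (2.0 * a)


starts = cycle_starts(8.0, 2.0, 5.0, 1e-4)
for k in range(5):
    print(k, round(starts[k], 3), round(cycle_start_closed_form(k, 8.0, 2.0, 5.0), 3))
```

With a small dt the stepped values stay close to the closed-form ones; a larger dt makes the accumulated phase drift, which is where the interpolation mentioned above matters.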

Reconstructing a signal from its discrete fourier transform in R

I am trying to replicate the following figure in R: (adapted from http://link.springer.com/article/10.1007/PL00011669)
The basic concept of the figure is to show the first few components of a DFT, plotted in the time domain, and then show a reconstructed wave in the time domain using only these components (X') relative to the original data (X). I would like to slightly modify the above figure such that all of the lines shown are overlaid on a single plot.
I have been trying to adapt the figure with some real data sampled at 60 Hz. For example:
## 3 second sample where: time is in seconds and var is the variable of interest
temp = data.frame(time=seq(from=0,to=3,by=1/60),
var = c(0.054,0.054,0.054,0.072,0.072,0.072,0.072,0.09,0.09,0.108,0.126,0.126,
0.126,0.126,0.126,0.144,0.144,0.144,0.144,0.144,0.162,0.162,0.144,0.126,
0.126,0.108,0.144,0.162,0.18,0.162,0.126,0.126,0.108,0.108,0.126,0.144,
0.162,0.144,0.144,0.144,0.144,0.162,0.162,0.126,0.108,0.09,0.09,0.072,
0.054,0.054,0.054,0.036,0.036,0.018,0.018,0.018,0.018,0,0.018,0,
0,0,-0.018,0,0,0,-0.018,0,-0.018,-0.018,0,-0.018,
-0.018,-0.018,-0.018,-0.036,-0.036,-0.054,-0.054,-0.072,-0.072,-0.072,-0.072,-0.072,
-0.09,-0.09,-0.108,-0.126,-0.126,-0.126,-0.144,-0.144,-0.144,-0.162,-0.162,-0.18,
-0.162,-0.162,-0.162,-0.162,-0.144,-0.144,-0.144,-0.126,-0.126,-0.108,-0.108,-0.09,
-0.072,-0.054,-0.036,-0.018,0,0,0,0,0.018,0.018,0.036,0.054,
0.054,0.054,0.054,0.054,0.054,0.054,0.054,0.054,0.054,0.072,0.054,0.072,
0.072,0.072,0.072,0.072,0.072,0.054,0.054,0.054,0.036,0.036,0.036,0.036,
0.036,0.054,0.054,0.072,0.09,0.072,0.036,0.036,0.018,0.018,0.018,0.018,
0.036,0.036,0.036,0.036,0.018,0,-0.018,-0.018,-0.018,-0.018,-0.018,0,
-0.018,-0.036,-0.036,-0.018,-0.018,-0.018,-0.036,0,0,-0.018,-0.018,-0.018,-0.018))
##plot the original data
ggplot(temp, aes(x=time, y=var))+geom_line()
I believe that I can use fft() to eventually accomplish this goal however the leap from the output of fft() to my goal is a bit unclear.
I realize that this question is somewhat similar to: How do I calculate amplitude and phase angle of fft() output from real-valued input? but I am more specifically interested in the actual code for the specific data above.
Please note that I am relatively new to time series analysis so any clarity you could provide w.r.t. putting the output of fft() in context, or any package you could recommend that would accomplish this task efficiently would be appreciated.
Thank you
Matlab is your best tool, and the specific function is just fft(). To use it, first determine several basic parameters of your time domain data:
1, Time duration T, which equals 3 s.
2, Sampling interval T_s, which equals 1/60 s.
3, Frequency-domain resolution f_s, which equals the frequency difference between two adjacent Fourier basis functions. You may define f_s according to your needs. However, the smallest possible f_s equals 1/T=0.333 Hz. As a result, if you want better frequency-domain resolution (smaller f_s), you need longer time-domain data.
4, Maximum frequency f_M, which equals 1/(2T_s)=30 Hz according to the Nyquist-Shannon sampling theorem.
5, DFT length N, which equals 2*f_M/f_s.
Then choose the specific frequencies of the four Fourier basis functions that you want to use to approximate the data, for example 3, 6, 9 and 12 Hz. So f_s = 3 Hz. Then N=2*f_M/f_s=20.
Your Matlab code looks like this:
var=[0.054,0.054,0.054 ...]; % input all your data points here
f_full=fft(var,20); % 20-point fft; note that fft(var,20) uses only the first 20 samples of var
f_useful=f_full(2:5); % the four lowest nonzero frequencies (3, 6, 9, 12 Hz); index 1 is DC
Here f_useful contains the complex coefficients of the four Fourier basis functions. To reconstruct var, do the following:
% Generate basis functions
dt=0:1/60:3;
df=[3:3:12];
basis1=exp(1j*2*pi*df(1)*dt);
basis2=exp(1j*2*pi*df(2)*dt);
basis3=exp(1j*2*pi*df(3)*dt);
basis4=exp(1j*2*pi*df(4)*dt);
% Reconstruct var
var_recon=basis1*f_useful(1)+...
basis2*f_useful(2)+...
basis3*f_useful(3)+...
basis4*f_useful(4);
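var_recon=2*real(var_recon)/20; % fft is unnormalized, so divide by N=20; the factor 2 adds the conjugate (negative-frequency) terms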
% Plot both curves
figure;
plot(var);
hold on;
plot(var_recon);
Adapt this code to your paper :)
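For comparison, here is a hedged Python sketch of the same reconstruction idea that transforms the whole record instead of a fixed 20 points. It is an illustration under stated assumptions, not the method from the paper: it uses a naive DFT so that no libraries are needed, a short synthetic signal instead of the data in the question, and, unlike the Matlab snippet, it keeps the DC term. The names dft and reconstruct are made up for this example.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def reconstruct(samples, keep):
    """Rebuild a real signal from DC plus the first `keep` positive-frequency bins."""
    n = len(samples)
    spectrum = dft(samples)
    rebuilt = []
    for t in range(n):
        value = spectrum[0].real / n  # DC term, i.e. the mean of the signal
        for k in range(1, keep + 1):
            # the factor 2 folds in the conjugate bin at index n - k
            value += 2.0 * (spectrum[k] * cmath.exp(2j * math.pi * k * t / n)).real / n
        rebuilt.append(value)
    return rebuilt


# synthetic test signal: an offset plus the 1st and 3rd harmonics of the record length
n = 60
signal = [0.1 + 0.15 * math.sin(2 * math.pi * t / n) + 0.05 * math.cos(2 * math.pi * 3 * t / n)
          for t in range(n)]

error_1 = max(abs(a - b) for a, b in zip(signal, reconstruct(signal, keep=1)))
error_4 = max(abs(a - b) for a, b in zip(signal, reconstruct(signal, keep=4)))
print(error_1, error_4)
```

Keeping one component leaves the third harmonic out, so the error is about its amplitude; keeping four components covers everything in this test signal and the error drops to rounding level. Each term inside the inner loop is one of the individual component curves that the figure overlays.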
Adapting my own post from Signal Processing. I think it's still relevant for those in Python.
I am no expert in this topic, but have some useful examples to share.
The more Fourier components you keep, the closer you'll mimic the original signal.
This example shows what happens when you keep 10, 20, ...up to n components. Assuming x and y are your data vectors.
import numpy
from matplotlib import pyplot as plt

n = len(y)
COMPONENTS = [10, 20, n]

for c in COMPONENTS:
    colors = numpy.linspace(start=100, stop=255, num=c)
    for i in range(c):
        Y = numpy.fft.fft(y)
        numpy.put(Y, range(i+1, n), 0.0)
        ifft = numpy.fft.ifft(Y)
        plt.plot(x, ifft, color=plt.cm.Reds(int(colors[i])), alpha=.70)
    plt.title("First {c} fourier components".format(c=c))
    plt.plot(x,y, label="Original dataset", linewidth=2.0)
    plt.grid(linestyle='dashed')
    plt.legend()
    plt.show()
For the book's dataset, keeping up to 4, 10, and n components:
For your dataset, keeping up to 4, 10, and n components:
