draw sines at higher freq - plot

I'm trying to plot some sine waves (code example in plain js here).
When freq is "low" (freq = 10 hz in that case), the plot is quite nice:
The problem is when I increase the freq (try setting var freq = 50, for example):
the plot gets lots of ripples and becomes distorted, so it no longer looks good. If I increase it further, it gets even worse (var freq = 8030, for example, is terrible).
When I see this kind of graph on pro systems, it is displayed just fine.
How would you improve it? FFT, splines, whatever? Which is the right approach?
I don't really need accuracy (i.e. for waveform analysis or whatever), I just want to plot it nicely (as in Desmos https://www.desmos.com/calculator/eodkjlywjh, for example).

Like Matt said, the problem here is the same one described in "Plotting a discrete-time signal shows amplitude modulation": "At higher frequencies, when the peak falls between two samples, the sampled points can be a lot lower than the peak." This makes the sampled peak heights vary from cycle to cycle, which produces a kind of ripple effect at the top and bottom of the plot.
Increasing the sampling resolution helps. If you can't increase sampling resolution by much, adjust the sampling resolution to the closest integer multiple of the sine's frequency and set the sampling phase so that it samples the waveform peaks. For instance for a 100 Hz wave cos(2 π (100*x + 0.3)), you could sample at 400 Hz at x[n] = n/400 - 0.003. That way there are four samples per period, with x[0], x[4], x[8], ... sampling exactly on the peaks.
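To make the peak-aligned sampling idea concrete, here is a minimal sketch in plain JS. The function name peakAlignedSamples and its parameters are made up for illustration; it assumes the wave has the form cos(2π(freq*x + phase)) as in the example above.

```javascript
// Sketch: sample cos(2*pi*(freq*x + phase)) so that some samples land exactly on the peaks.
// The function name and parameters are illustrative assumptions, not part of the original code.
function peakAlignedSamples(freq, phase, samplesPerPeriod, count) {
  const sampleRate = samplesPerPeriod * freq; // integer multiple of the sine frequency
  const offset = -phase / freq;               // shift x so that sample 0 sits on a peak
  const points = [];
  for (let n = 0; n < count; n++) {
    const x = n / sampleRate + offset;
    points.push({ x: x, y: Math.cos(2 * Math.PI * (freq * x + phase)) });
  }
  return points;
}

// The 100 Hz example from above: 4 samples per period, so x[n] = n/400 - 0.003.
const pts = peakAlignedSamples(100, 0.3, 4, 9);
```

With 4 samples per period, every fourth point (pts[0], pts[4], pts[8]) should sit on a positive peak and the points halfway between them on a negative peak, so the envelope of the plotted line stays flat.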
Another thought: plotting with a thicker line can help to smooth over some defects. It looks like the desmos example that you linked is using something like a 3-pixel line width.


Why the need for a mask when performing Fast Fourier Transform?

I'm trying to find the peak frequencies hidden in my data using the fft() method in R. While preparing the data, a more experienced user recommended creating a "mask" (more on this after I explain the details), which does give me the exact diagram I'm looking for. The problem is, I don't understand what it does or why it's needed.
To give some context, I'm working with .txt files with around 12000 entries each. It's voltage vs. time information, and the expected result is just a sinusoidal wave with a clear peak frequency that should be close to 1-2 Hz. This is an example of what one of those files looks like:
I've been trying to use the Fast Fourier Transform method fft() implemented in R to find the peak frequencies and get a diagram that reflected them clearly. At first, I calculate some things that I understand are going to be useful, like the Nyquist frequency and the range of frequencies I'll show in the final graph:
n = length(variable)
dt = time[5]-time[4]
df = 1/(max(time)) #Find out the "unit" frequency
fnyquist = 1/(2*dt) #The Nyquist frequency
f = seq(-fnyquist, fnyquist-df, by=df) #These are the frequencies I'll plot
But when I plot the absolute value of what fft(data) calculates vs. the range of frequencies, I get this:
The peak frequency seems to be close to 50 Hz, but I know that's not the case. It should be close to 1 Hz. I'm a complete newbie in R and in Fourier analysis, so after researching a little, I found on a Swiss page that this can be solved by creating a "mask", which is actually just a vector with a repeating pattern (1, -1, 1, -1...) with the same length as my data vector itself:
mask=rep(c(1, -1),length.out=n)
Then if I multiply my data vector by this mask and plot the results:
results = mask*data
plot(f,abs(fft(results)),type="h")
I get what I was looking for. (This is the graph after limiting the x-axis to a reasonable scale).
So, what's the mask actually doing? I understand it's flipping the sign of every other data point, but I don't get why it would move the inferred peak frequencies from ~50 Hz to the correct result of ~1 Hz.
Thanks in advance!
Your "mask" is one of two methods of performing an fftshift, which is commonly done to center the 0 Hz output of an FFT in the middle of a graph or plot (instead of at the left edge, with the negative frequencies wrapping around to the right edge).
To perform an fftshift, you can heterodyne or modulate your data (by Fs/2) before the FFT, or simply do a circular shift by 50% after the FFT. For an even number of samples, both produce the same result. They are the same due to the shift property of the DFT: multiplying the data by (1, -1, 1, -1...) shifts every output bin by half the FFT length.
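The equivalence of the two methods can be checked numerically. The following is a minimal sketch in plain JavaScript rather than R, using a naive DFT written out by hand instead of a library call; the 16-sample test signal and all names are assumptions for illustration.

```javascript
// Naive O(N^2) DFT, returning the real and imaginary parts of each output bin.
function dft(signal) {
  const len = signal.length;
  const re = new Array(len).fill(0);
  const im = new Array(len).fill(0);
  for (let k = 0; k < len; k++) {
    for (let n = 0; n < len; n++) {
      const angle = (-2 * Math.PI * k * n) / len;
      re[k] += signal[n] * Math.cos(angle);
      im[k] += signal[n] * Math.sin(angle);
    }
  }
  return { re: re, im: im };
}

const magnitude = (s) => s.re.map((r, k) => Math.hypot(r, s.im[k]));

// Assumed test signal: a sine with 2 cycles over N = 16 samples (N must be even).
const N = 16;
const data = Array.from({ length: N }, (_, n) => Math.sin((2 * Math.PI * 2 * n) / N));

// Method 1: multiply by the mask (1, -1, 1, -1, ...) before the DFT.
const masked = data.map((v, n) => (n % 2 === 0 ? v : -v));
const viaMask = magnitude(dft(masked));

// Method 2: DFT first, then circularly shift the output by N/2 bins.
const plain = magnitude(dft(data));
const viaShift = plain.map((_, k) => plain[(k + N / 2) % N]);
```

Both viaMask and viaShift put the 0 Hz bin at index N/2, with the two peaks of the sine sitting symmetrically at indices N/2 - 2 and N/2 + 2, which is the layout that a frequency axis running from -fnyquist to fnyquist-df expects.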

Curve smoothing preserving peaks and valleys

I want to generate a smoothed version of a set of datapoints, but preserve any local peaks or valleys (i.e. where the previous and next value are either both greater than or both less than the current value). The standard smooth function in R does a running median, which works great for intermediate values but chops off the peaks and valleys of the data. Here's an example which hopefully helps to explain the issue:
peakValleys <- runif(20);
landform <- approx(x=seq_along(peakValleys), y=peakValleys, n=200)$y;
landform <- landform + runif(200, max=0.1);
plot(landform);
points(smooth(landform), type="l", col="red");
I'd like the red line to include any local peaks/valleys, but still look smooth (i.e. not just jumping at the peak/valley point). This is most obvious around 28, where a valley datapoint has been completely excluded.
The problem is worse if I have a larger smoothing window with the running median. Ideally, this "smoothing with peak inclusion" should work for any window size:
plot(landform);
points(runmed(landform, 7), type="l", col="red");
I care the most about the peaks/valleys, and the least about the points in between. One application of this is in finding and quantifying peaks in flow cytometry, where a small change in a peak height can mean a big difference in the estimated cell fraction. Another application is in nanopore sequencing of DNA, where peaks can be very brief, but are important to know about.

Plot gigantic correlation matrix as colours

I have a correlation matrix $P_{i,j}$ which is $1000 \times 1000$. Given the data the matrix will have rectangular patches of very high correlations. That is, if you draw a $20 \times 20$ square anywhere in this matrix you will either be looking at a patch of highly correlated variables ($\rho_{i,j}> 0.8$) or medium to uncorrelated ($\in [-0.1, 0.5]$). The reason for this is the structure of the data.
How do I represent this graphically? I know of one way to visualize a matrix like this but it only works for small dimensions:
install.packages("plotrix")
library(plotrix)
rhoMat = array(rnorm(1000*1000),dim=c(1000,1000))
color2D.matplot(rhoMat[1:10,1:10],cs1=c(0,0.01),cs2=c(0,0),cs3=c(0,0)) #nice!
color2D.matplot(rhoMat,cs1=c(0,0.01),cs2=c(0,0),cs3=c(0,0)) #broken!
What is a function or algorithm that would plot a red area if correlations in that vicinity of the matrix $P_{i,j}$ "tend to" be high, versus "tending" to be low (even better if it switches from one colour to another as we move from positive to negative correlation patches)? I want something that lets me see how many patches of high correlation there are and whether one patch is correlated with another patch at a different place in the dataset.
I only want to do it in R.
I think you can use image with the argument breaks to get exactly what you want:
dat <- matrix(runif(10000), ncol = 100)
image(dat, breaks = c(0.0, 0.8, 1.0), col = c("yellow", "red"))
I always fail to think of image for this kind of problem - the name is sort of non-obvious. I started with heatmap and then it led me to image.
Look at the corrplot package. It has various tools for visualizing correlations, one option that it has is to use hierarchical clustering to draw rectangles around groups of high or low correlation.
I've done this in Excel fairly easily. You can change the colour of cells based on the range of values within them. You can even create a gradient from, let's say, 0 to 1. 1000 x 1000 would be big for Excel, but I think it would work. You would just have to zoom out.

heat transfer for spherical coordinates boundary conditions implementation

I want to model heat transfer (conduction and convection) for a hemisphere. It is transient, homogeneous heat transfer in spherical coordinates, with no heat generation. The initial condition is that the hemisphere starts at room temperature, Tinitial = 20 degrees. The external (environmental) temperature is -22 degrees. You can imagine the hemisphere as a solid material. It is also a non-linear model, because the thermal conductivity changes once the material is frozen, and this changes the temperature profile.
I want to find the temperature profile of this solid over time, until the center temperature reaches -22 degrees.
In this case, the temperature depends on 3 variables, T(r,theta,t): radius, angle, and time.
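$$\frac{1}{\alpha}\frac{\partial T}{\partial t} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial T}{\partial\theta}\right), \qquad T = T(r,\theta,t)$$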
I applied the finite difference method using MATLAB. However, the boundary conditions have issues. There is convection on the surface of the hemisphere and conduction at the inner nodes, and the bottom of the hemisphere has a constant temperature equal to the air temperature (-22). Below is the part of the MATLAB script that I am using for the BCs.
% Temperature at surface of hemisphere solid boundary node
for i=nodes
for j=1:1:(nodes-1)
Qcd_ot(i,j)= ((k(i,j)+ k(i-1,j))/2)*A(i-1,j)*(( Told(i,j)-Told(i-1,j))/dr); % heat conduction out of node
Qcv(i,j) = h*(Tair-Told(i,j))*A(i,j); % heat transfer through convection on surface
Tnew(i,j) = ((Qcv(i,j)-Qcd_ot(i,j))/(mass(i,j)*cp(i,j))/2)*dt + Told(i,j);
end % end of for loop
end
% Temperature at inner nodes
for i=2:1:(nodes-1)
for j=2:1:(nodes-1)
Qcd_in(i,j)= ((k(i,j)+ k(i+1,j))/2)*A(i,j) *((2/R)*(( Told(i+1,j)-Told(i,j))/(2*dr)) + ((Told(i+1,j)-2*Told(i,j)+Told(i-1,j))/(dr^2)) + ((cot(y)/(R^2))*((Told(i,j+1)-Told(i,j-1))/(2*dy))) + (1/(R^2))*(Told(i,j+1)-2*Told(i,j)+ Told(i,j-1))/(dy^2));
Qcd_out(i,j)= ((k(i,j)+ k(i-1,j))/2)*A(i-1,j)*((2/R)*(( Told(i,j)-Told(i-1,j))/(2*dr)) +((Told(i+1,j)-2*Told(i,j)+Told(i-1,j))/(dr^2)) + ((cot(y)/(R^2))*((Told(i,j+1)-Told(i,j-1))/(2*dy))) + (1/(R^2))*(Told(i,j+1)-2*Told(i,j)+ Told(i,j-1))/(dy^2));
Tnew(i,j) = ((Qcd_in(i,j)-Qcd_out(i,j))/(mass(i,j)*cp(i,j)))*dt + Told(i,j);
end %end for loop
end % end for loop
% Temperature at center line nodes
for i=2:1:(nodes-1)
for j=1
Qcd_line(i,j)=((k(i,j)+ k(i+1,j))/2)*A(i,j)*(Told(i+1,j)-Told(i,j))/dr;
Qcd_lineout(i,j)=((k(i,j)+ k(i-1,j))/2)*A(i-1,j)*(Told(i,j)-Told(i-1,j))/dr;
Tnew(i,j)= ((Qcd_line(i,j)-Qcd_lineout(i,j))/(mass(i,j)*cp(i,j)))*dt + Told(i,j);
end
end
% Temperature at bottom point (center) of the hemisphere solid
for i=1
for j=1:1:(nodes-1)
Qcd_center(i,j)=(((k(i,j)+k(i+1,j))/2)*A(i,j)*(Told(i+1,j)-Tair)/dr);
Tnew(i,j)= ((Qcd_center(i,j))/(mass(i,j)*cp(i,j)))*dt + Told(i,j);
end
end
% Temperature at all bottom points of the hemisphere
Tnew(:,nodes)=-22;
Told=Tnew;
t=t+dt;
When the program runs, the Tnew temperature values grow exponentially and then become NaN. It is supposed to show the cooling and freezing temperature profile of the solid until it reaches the Tair temperature. I could not figure out why it behaves like that.
I would like to hear your suggestions for implementing the BCs in this program, or how I should change them for these conditions. Thanks in advance!
Your code is too long to read and understand completely, but it looks like you are using a simple forward Euler scheme, is that correct? If so, try to reduce the time-step dt, maybe by a lot, since this method can become numerically unstable if dt is too big. This might slow down the computation (again by a lot), but that is the price you pay for such a simple algorithm. There are alternative methods that do not suffer from instability, but they are much harder to implement, since you need to solve a system of equations.
I did some thermal simulations using this simple scheme a long time ago. I found that the stability criterion was dt < (dx)^2 * c_p * rho / (6 * k), which should be valid for a simulation on a 3D cartesian grid, where dx is the spatial step, c_p is the specific heat, rho the density and k the thermal conductivity of the material. I don't know how to convert this to your case with spherical coordinates. The thing I learned then was to choose small time-steps, but more importantly as large a dx as possible: when you reduce dx by a factor 2, you also need to reduce dt by a factor 4 to keep things stable. At the same time, for a 3D problem, the number of elements will increase by a factor 8. So the total simulation time scales with 1 / (dx)^5!
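As a rough illustration of the criterion and the scaling argument above, here is a small sketch in JavaScript (not MATLAB). The material values are placeholder assumptions, roughly water-like, and the formula is the 3D cartesian one quoted above, not a criterion derived for a spherical grid.

```javascript
// Largest stable forward-Euler time step on a 3D cartesian grid, using the
// criterion quoted above: dt < dx^2 * c_p * rho / (6 * k).
function maxStableDt(dx, cp, rho, k) {
  return (dx * dx * cp * rho) / (6 * k);
}

// Relative simulation cost when dx is divided by `refinement`:
// refinement^3 more elements times refinement^2 more time steps.
function relativeCost(refinement) {
  return Math.pow(refinement, 5);
}

// Placeholder material values in SI units (assumptions, not taken from the question).
const cp = 4000;  // specific heat, J/(kg K)
const rho = 1000; // density, kg/m^3
const k = 0.5;    // thermal conductivity, W/(m K)

const dtCoarse = maxStableDt(0.002, cp, rho, k); // dx = 2 mm
const dtFine = maxStableDt(0.001, cp, rho, k);   // dx = 1 mm

console.log(dtCoarse, dtFine, dtCoarse / dtFine, relativeCost(2));
```

Halving dx cuts the allowed dt by a factor of 4, and relativeCost(2) gives the factor of 32 in total run time. If the conductivity changes when the material freezes, the safe choice is presumably to evaluate the limit with the largest k / (c_p * rho) that occurs anywhere in the domain.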

Why do I get two frequency spikes from a simple sin function via FFT in R?

I learned about Fourier transforms in mathematics classes and thought I had understood them. Now, I am trying to play around with R (statistical language) and interpret the results of a discrete FFT in practice. This is what I have done:
x = seq(0,1,by=0.1)
y = sin(2*pi*(x))
calcenergy <- function(x) Im(x) * Im(x) + Re(x) * Re(x)
fy <- fft(y)
plot(x, calcenergy(fy))
and get this plot:
If I understand this right, this represents the 'half' of the energy density spectrum. As the transformation is symmetric, I could just mirror all values to the negative values of x to get the full spectrum.
However, what I don't understand is why I am getting two spikes. There is only a single sine frequency in here. Is this an aliasing effect?
Also, I have no clue how to get the frequencies out of this plot. Let's assume the units of the sine function were seconds; is the peak at 1.0 in the density spectrum 1 Hz then?
Again: I understand the theory behind FFT; the practical application is the problem :).
Thanks for any help!
For a purely real input signal of N points you get a complex output of N points with complex conjugate symmetry about N/2. You can ignore the output points above N/2, since they provide no useful additional information for a real input signal, but if you do plot them you will see the aforementioned symmetry, and for a single sine wave you will see peaks at bins n and N - n. (Note: you can think of the upper N/2 bins as representing negative frequencies.) In summary, for a real input signal of N points (N even), you get N/2 + 1 useful complex output bins from the FFT, which represent frequencies from DC (0 Hz) to Nyquist (Fs / 2).
To get frequencies from the result of an FFT you need to know the sample rate of the data that was input to the FFT and the length of the FFT. The center frequency of each bin is the bin index times the sample rate divided by the length of the FFT. Thus you will get frequencies from DC (0 Hz) to Fs/2 at the halfway bin.
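The bin-to-frequency rule is short enough to write down directly. This is a sketch in JavaScript with made-up function names; the 100 Hz sample rate and 1000-point length are assumed values for illustration only.

```javascript
// Center frequency of an FFT bin: bin index times sample rate divided by FFT length.
function binFrequency(binIndex, sampleRate, fftLength) {
  return (binIndex * sampleRate) / fftLength;
}

// Same rule, but reporting bins above N/2 as negative frequencies.
function signedBinFrequency(binIndex, sampleRate, fftLength) {
  const k = binIndex <= fftLength / 2 ? binIndex : binIndex - fftLength;
  return (k * sampleRate) / fftLength;
}

// Assumed example: data sampled at 100 Hz, 1000-point FFT.
const fOfBin10 = binFrequency(10, 100, 1000);          // 1 Hz
const fNyquist = binFrequency(500, 100, 1000);         // 50 Hz, the halfway bin
const fOfBin990 = signedBinFrequency(990, 100, 1000);  // -1 Hz, mirror of bin 10
```

So a peak found at bin 10 of such an FFT corresponds to 1 Hz, and its mirror image at bin 990 is the same component seen as a negative frequency.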
The second half of the FFT results are just complex conjugates of the first half for real data inputs. The reason is that the imaginary parts of complex conjugate pairs cancel, which is required to represent a summed result with zero imaginary content, i.e. a strictly real signal.
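The two spikes and the conjugate symmetry can be reproduced with a hand-written DFT. This is a sketch in JavaScript rather than R; it assumes exactly one sine period sampled at 8 points without repeating the endpoint, which is slightly different from the seq(0,1,by=0.1) grid in the question.

```javascript
// Naive O(N^2) DFT, returning the real and imaginary parts of each output bin.
function dft(signal) {
  const len = signal.length;
  const re = new Array(len).fill(0);
  const im = new Array(len).fill(0);
  for (let k = 0; k < len; k++) {
    for (let n = 0; n < len; n++) {
      const angle = (-2 * Math.PI * k * n) / len;
      re[k] += signal[n] * Math.cos(angle);
      im[k] += signal[n] * Math.sin(angle);
    }
  }
  return { re: re, im: im };
}

// Assumed input: one full sine period over N = 8 samples.
const N = 8;
const y = Array.from({ length: N }, (_, n) => Math.sin((2 * Math.PI * n) / N));

const spectrum = dft(y);
// Same quantity as calcenergy() in the question: Re^2 + Im^2 per bin.
const energy = spectrum.re.map((r, k) => r * r + spectrum.im[k] * spectrum.im[k]);

console.log(energy.map((e) => e.toFixed(3)));
```

The energy shows up in bins 1 and N - 1 = 7 only, and each bin k in the upper half has the same real part and the opposite imaginary part as bin N - k, so the second spike is the mirror image of the first rather than a second frequency.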
