How to plot data from different blocks with lines in Gnuplot?

I have a data file with blocks of x/y values. Each block contains 16 lines with x/y pairs and each block represents those positions in a different time. http://pastebin.com/0teRrfRU
I want to plot the trajectory of a specific particle. To do that, I've written plot 'pos.dat' u 2:3 every ::n:0:n:i, where n is the n-th particle and i is the time up to which I want the trajectory plotted (I can then loop over the i to generate an animation).
This runs fine, but when I add w lines nothing gets plotted, and I don't understand why. Is there a way to plot this with lines? The only alternative I see is writing a script to parse the data file and generate a new one with only the values I want (effectively acting as every), but I don't want to do that if I can do it in Gnuplot.

After a closer look at your data, your case has a peculiarity.
As in "Plotting same line number of several blocks data with gnuplot", you can plot the file into a table via with table, which removes the empty lines so that the points get connected by lines.
However, some of your particles disappear on one side and re-appear on the opposite side. If you plot this with lines, you get a line across the whole graph, which is certainly undesired. You can work around this by introducing a function Break() which returns NaN if the difference between two successive x- or y-values is larger than 90% (to be on the safe side) of the x- or y-range, respectively. The effect of NaN is that the line will be interrupted.
Code: (works with gnuplot>=5.0.0 version at the time of OP's question)
### plotting trajectories
reset session
set term gif animate delay 3 size 400,400
set output "SO30744875.gif"
set size square
FILE = 'SO30744875.dat'
set key noautotitle
stats FILE u (N=column(-1),M=column(1),$2):3 nooutput
xrange = STATS_max_x-STATS_min_x
yrange = STATS_max_y-STATS_min_y
set table $Data
plot FILE u 1:2:3 w table
unset table
Break(col1,col2) = (x0=x1,x1=column(col1), y0=y1,y1=column(col2), \
    abs(x1-x0)<0.9*xrange && abs(y1-y0)<0.9*yrange ? column(col2) : NaN)
do for [i=0:N] {
    plot for [j=1:16] x1=y1=NaN $Data u 2:(Break(2,3)) every M::j-1::(i+1)*M w l, \
         FILE u 2:3 every :::i::i w p pt 7, \
         FILE u 2:3:1 every :::i::i w labels offset 0.7,0.7
}
set output
### end of code
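To make the logic of Break() easier to follow outside of gnuplot, here is a minimal Python sketch of the same idea. The function name break_on_wrap and the sample coordinates are made up for illustration; only the 90% threshold and the "replace y with NaN after a large jump" rule are taken from the script above.

```python
import math

def break_on_wrap(points, x_range, y_range, frac=0.9):
    """Return the (x, y) pairs with y replaced by NaN wherever the step from the
    previous point is at least `frac` of the x- or y-range (hypothetical helper)."""
    result = []
    prev_x = prev_y = math.nan  # like x1=y1=NaN at the start of each plot
    for x, y in points:
        # Comparisons with NaN are False, so the very first point also becomes NaN,
        # which mirrors the behaviour of the gnuplot expression.
        small_step = (abs(x - prev_x) < frac * x_range
                      and abs(y - prev_y) < frac * y_range)
        result.append((x, y if small_step else math.nan))
        prev_x, prev_y = x, y
    return result

# A particle leaving on the right and re-entering on the left (made-up values)
track = [(0.1, 0.5), (0.5, 0.5), (0.95, 0.5), (0.02, 0.5), (0.3, 0.5)]
for x, y in break_on_wrap(track, x_range=1.0, y_range=1.0):
    print(x, y)
```

The point right after the wrap-around gets y = NaN, so a line drawn through the result is interrupted there instead of crossing the whole graph.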
Result:

Related

Reducing number of datapoints when plotting in loglog scale in Gnuplot

I have a large dataset which I need to plot in loglog scale in Gnuplot, like this:
set log xy
plot 'A_1D_l0.25_L1024_r0.dat' u 1:($2-512)
LogLogPlot of my datapoints
Text file with the datapoints
Datapoints on the x axis are equally spaced, but because of the logscale they get very dense on the right part of the graph, and as a result the output file (I finally export it in .tex) gets very large.
In linear scale, I would simply use the option every to reduce the number of points which get plotted. Is there a similar option for log-log scale, such that the plotted points appear equally spaced?
I am aware of a similar question which was raised a few years ago, but in my opinion the solution is unsatisfactory: plotted points are not equally spaced along the x-axis. I think this is a really unsophisticated problem which deserves a clearer solution.
As I understand it, you don't want to plot the actual data points; you just want to plot a line through them. But you want to keep the appearance of points rather than a line. Is that right?
set log xy
plot 'A_1D_l0.25_L1024_r0.dat' u 1:($2-512) with lines dashtype '.' lw 2
Amended answer
If it is important to present outliers/errors in the data set then you must not use every or any other technique that simply discards or skips most of the data points. In that case I would prefer the plot with points that you show in the original question, perhaps modified to represent each point as a dot rather than a cross. I will simulate this by modifying a single point in your 500000 point data set (first figure below). But I would also suggest that the presence of outliers is even more apparent if you plot with lines (second figure below).
Showing error bounds is another alternative for noisy data, but the options depend on what you have to work with in your data set. If you want to pursue that, please ask a separate question.
If you really want to reduce the number of data to be plotted, you might consider the following script.
s = 0.1 ### sampling interval in log scale
### (try 0.05 for more detail)
c = log10(0.01) ### a parameter used in sampler(x)
### which should be initialized by
### smaller value than any x in log scale
sampler(x) = (x>0 && log10(x)>=c) ? (c=ceil(log10(x)/s+0.5)*s, x) : NaN
set log xy
set grid xtics
plot 'A_1D_l0.25_L1024_r0.dat' using (sampler($1)):($2-512) with points pt 7 lt 1 notitle , \
'A_1D_l0.25_L1024_r0.dat' using 1:($2-512) with lines lt 1 notitle
This script samples the data in increments of roughly 0.1 on x-axis in log scale. It makes use of the property that points whose x value is evaluated as NaN in using are not drawn.
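For readers who want to check which points sampler(x) keeps, the same thresholding can be sketched in Python. This is only an illustration under the assumption that the x values arrive in increasing order; the name log_sample and the default arguments are hypothetical, chosen to match s = 0.1 and c = log10(0.01) from the script.

```python
import math

def log_sample(xs, step=0.1, start=0.01):
    """Keep an x only when log10(x) has reached the current threshold,
    then move the threshold past that x (same rule as sampler(x))."""
    threshold = math.log10(start)
    kept = []
    for x in xs:
        if x > 0 and math.log10(x) >= threshold:
            # Next threshold: a multiple of `step` above log10(x)
            threshold = math.ceil(math.log10(x) / step + 0.5) * step
            kept.append(x)
    return kept

print(log_sample([1, 2, 3, 4, 5, 6]))
print(len(log_sample(range(1, 100001))))
```

Each kept point pushes the threshold at least half a step further in log10(x), so the number of plotted points grows with the number of decades covered rather than with the number of rows in the file.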

Force 1st point of pointinterval to be plotted

I tried to plot a graph using the pointinterval option and I would like the 1st point of my data to be plotted, which is not the case for the hot side of my first plot. Indeed, we see the purple dashed line but no point at the bottom left corner (around y+=0.35).
My code involves for loop and is displayed below:
plot for [i=1:words(FILES)] myDataFile(i) u (column(1)):(column(6)/word(UTAUS_ch,i)) w lp pointinterval 2 pt myPointtype(i) ps myPointsize(i) dt myDashtype(i) lt myLinetype(i) lw myLinewidth(i) lc rgb myLinecolor(i) title myTitle(i)
If I plot with pointinterval 1 we see that those points exist (see picture below).
How can I force the first point to be plotted with pointinterval?
Is it possible to plot half of my datasets with a point every 2 points and the other half also every 2 points but with an offset of 1 point?
I do not think you will be able to do what you want using the pointinterval property. It is designed so that the offset of the initial point increases by one for each plot drawn, with the intention of reducing the chance that point symbols from successive plots will overlap. This is exactly opposite to what you are trying to do.
Therefore I suggest not plotting each dataset with linespoints pi N. Instead plot each dataset twice, once with lines and once with points using a filter in the using specifier like this:
plot FOO using 1:2 with lines, '' using ((int($0)%N) ? NaN : $1) : 2 with points
The filter (int($0)%N ? NaN : $1) suppresses all points whose line number is not evenly divisible by N. This is essentially what the pointinterval property does, except that pointinterval skips out-of-range points and otherwise unplottable points rather than strictly using the line number as an index.
Edit If individual offset values are required because x-coordinates are not consistent:
array offset[N] = [1,1,2,-1, and so on]
plot for [i=1:N] MyDataFile(i) using 1:2 with lines, \
     for [i=1:N] MyDataFile(i) using (((int($0)+offset[i]) % N) ? NaN : $1) : 2 with points
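As a quick check of which rows such a filter keeps, here is a small Python sketch of the modulo test. The helper name rows_with_points is hypothetical; row plays the role of int($0), and a row gets a point symbol only when (row + offset) is divisible by the interval.

```python
def rows_with_points(n_rows, interval, offset=0):
    """Row indices (0-based, like $0) that keep their point symbol."""
    return [row for row in range(n_rows) if (row + offset) % interval == 0]

print(rows_with_points(7, 2))            # → [0, 2, 4, 6]
print(rows_with_points(7, 2, offset=1))  # → [1, 3, 5]
```

With an interval of 2, offset 0 always marks the first row, and offset 1 marks the rows in between, which corresponds to the two staggered groups asked about in the question.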

How do I plot a large dataset (550.000 points) so spikes in the values can be identified?

I have a dataset of 550.000 points. Around 460.000 of them have values in the range of -2 to -30, while the rest have values from 0 to 1550.
I tried plotting them with gnuplot, but the large amount of jumps from minus values and to some of the positive values makes it one big block of lines where I cannot see anything.
So how can I plot these values in a way such that I can identify all the spikes where it goes from negative values and into positive values?
data is a single column of values in a file so it should be fairly simple to parse into any tool (preferably for Linux though, but I can do with Windows as well).
Depending on what exactly you want to do with your 90'000 spikes, plotting with impulses is probably better than with lines. The example below uses only 20'000 data points with approx. 17% spikes.
Code:
### "find" spikes
reset session
# create some test data
range1(n) = int(rand(0)*28)-30
range2(n) = int(rand(0)*1550)
data(n) = int(rand(0)+0.17) == 0 ? range1(0) : range2(0)
set print $Data
do for [i=1:20000] { print data(0) }
set print
plot $Data u 1 w impulses notitle
### end of code
Result: (screen capture of wxt terminal with zoom-in)

GNU plot - count the number of peaks

I have a very large text file with 11 columns. As I can't post the whole data, I have uploaded the text file to a public repo; it can be found at this link: http://s000.tinyupload.com/?file_id=59483318155908771897
Is there any way to COUNT the number of peaks using GNU plot in Linux? From the above text file, I am plotting the 1st and 7th column as x and y columns where the peaks are variations of the 7th column and that's what I am interested in. For example, to count the number of peaks of frequency as in the following image as 10.
Here is the simple plotting script I am using.
set key right top
set xrange [:10]
#show timestamp
set xlabel "time in sec"
set ylabel "Freq"
set title "Testing"
plot "data/freq.csv" using 1:7 title "Freq"
Thanks for any help.
Gnuplot is for plotting and minor arithmetic. Finding peaks in a signal is a signal processing task, and you need something like GNU Octave to do a reasonable job. If you load the freq.csv file and run findpeaks() on it with a plausible value for MinPeakDistance you get:
The code I used to generate the above plot:
y = dlmread('freq.csv', ' ');
[peak_y, peak_x] = findpeaks(y(:,7), "MinPeakDistance", 40);
plot(y(:,1), y(:,7), y(peak_x,1), peak_y, '.r');
Depending on what you want findpeaks() might be enough, see help findpeaks and demo findpeaks for other options you can tweak.
It's a bit of tweaking but this example should help:
y2=y1=y0=NaN
stat "data/freq.csv" using (y2=y1,y1=y0,y0=$7,(y1>y2&&y1>y0?y1:NaN)) prefix "data"
Now the variable data_records should contain the COUNT of local maxima in column 7.
You can print via
print data_records
To illustrate this further, here is an example using the sine function:
set table 'test.dat'
plot sin(x)
unset table
x2=x1=x0=NaN
y2=y1=y0=NaN
plot 'test.dat' using (x2=x1,x1=x0,x0=$1,x1):(y2=y1,y1=y0,y0=$2,(y1>y2&&y1>y0?y1:NaN)) w p, 'test.dat' u 1:2 w l
This should plot a sine curve together with its maximum points.
In case several points have the same value:
x2=x1=x0=NaN
y2=y1=y0=NaN
plot 'freq.csv' u 0:7 w l, '' using (x2=x1,x1=x0,x0=$0,x1):(y2=y1,y1=y0,y0=$7,(y1>=y2&&y1>y0?y1:NaN)) w p
or
plot 'freq.csv' u 0:7 w l, '' using (x2=x1,x1=x0,x0=$0,x1):(y2=y1,y1=y0,y0=$7,(y1>y2&&y1>=y0?y1:NaN)) w p
depending on which side of the plateau you want to count the peak
The stat command becomes:
stat 'freq.csv' using (y2=y1,y1=y0,y0=$7,(y1>=y2&&y1>y0?y1:NaN)) prefix "data"
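The three-point test used in these commands can be sketched in Python to see what is being counted. This is only an illustration, not a replacement for a proper peak finder; the name count_peaks and the flag count_plateau_end are hypothetical. The strict version corresponds to y1>y2&&y1>y0, the relaxed one to y1>=y2&&y1>y0.

```python
def count_peaks(values, count_plateau_end=False):
    """Count samples that are larger than both neighbours.
    With count_plateau_end=True the last sample of a flat top also counts."""
    peaks = 0
    for older, middle, newer in zip(values, values[1:], values[2:]):
        left_ok = middle >= older if count_plateau_end else middle > older
        if left_ok and middle > newer:
            peaks += 1
    return peaks

print(count_peaks([0, 1, 0, 2, 0]))                       # → 2
print(count_peaks([0, 1, 1, 0]))                          # → 0
print(count_peaks([0, 1, 1, 0], count_plateau_end=True))  # → 1
```

Note that every small wiggle in noisy data counts as a peak with this test, which is why a minimum peak distance (as in the Octave answer) may be needed for real measurements.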

Autoscale axis in Gnuplot pm3d map

I have a 4 column file with x,y,z data (the 4th column is just a row counter) and I am trying to make an animation with pm3d map in gnuplot. Each frame is given by 10000 points in the file (the file I shared contains only 3 frames). I am able to plot the first frame with the following command:
splot 'data.txt' u 1:2:3 every:::0::10198
However, if I try to plot the second frame, for example, with the following command:
splot 'data.txt' u 1:2:3 every:::10100::20198
I am given the message:
Warning: No usable data in this plot to auto-scale axis range.
It tells me it can't autoscale any of the axes, and if I try to scale them manually, it just doesn't work. At first I thought I was just plotting the wrong rows, so I added the row counter, but it still doesn't work.
What is funny is that if I make this plot with the traditional splot, everything works fine. I could just go with that, but this is a terrible visualization of the data in my opinion, so I'd really like to use pm3d map.
Here is the GDrive folder with 3 frames data, first frame in pm3d and the traditional splot animation
Thanks in advance.
Because with these every values you are not selecting the first, second, third, ... 10000th data-points but the first 10000 data-sets (blocks). The fields of every are:
every A:B:C:D:E:F
A - every Ath datapoint
B - every Bth data-set
C,D - first data-point/set
E,F - last data-point/set
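The six fields listed above can be mimicked on a list of blocks in Python. This is a rough sketch under the assumption that increments are counted from the start index and that all indices are 0-based; the function name select_every and its parameter names are hypothetical.

```python
def select_every(blocks, point_incr=1, block_incr=1,
                 start_point=0, start_block=0,
                 end_point=None, end_block=None):
    """Pick points from `blocks` (a list of lists) the way
    every A:B:C:D:E:F is described above (A, B = increments,
    C, D = first point/block, E, F = last point/block)."""
    last_block = len(blocks) - 1 if end_block is None else min(end_block, len(blocks) - 1)
    selected = []
    for b in range(start_block, last_block + 1, block_incr):
        block = blocks[b]
        last_point = len(block) - 1 if end_point is None else min(end_point, len(block) - 1)
        selected.append([block[p] for p in range(start_point, last_point + 1, point_incr)])
    return selected

blocks = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
print(select_every(blocks, start_block=1, end_block=1))  # like every :::1::1
print(select_every(blocks, start_point=1, end_point=1))  # like every ::1::1
print(select_every(blocks, start_block=5, end_block=9))  # no such blocks
```

The last call returns an empty list: asking for block numbers beyond the last block in the file selects nothing, which is the situation in which gnuplot reports that there is no usable data to autoscale.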
