Gnuplot : skip missing data points and xticlabels - plot

I would like to skip some points when drawing a graph in gnuplot, without connecting lines through the missing points.
It is the same problem as: https://superuser.com/questions/440947/in-gnuplot-how-to-plot-with-lines-but-skip-missing-data-points
The gnuplot help says:
set datafile missing "?"
set style data lines
plot '-'
1 10
2 20
3 ?
4 40
5 50
e
plot '-' using 1:2
1 10
2 20
3 ?
4 40
5 50
e
plot '-' using 1:($2)
1 10
2 20
3 ?
4 40
5 50
e
The first plot will recognize only the first datum in the "3 ?" line. It
will use the single-datum-on-a-line convention that the line number is "x"
and the datum is "y", so the point will be plotted (in this case erroneously)
at (2,3).
The second plot will correctly ignore the middle line. The plotted line
will connect the points at (2,20) and (4,40).
The third plot will also correctly ignore the middle line, but the plotted
line will not connect the points at (2,20) and (4,40).
To avoid connecting the points (2,20) and (4,40), the column has to be written as an expression with a $ symbol: plot '-' using 1:($2)
I'd like to do the same thing with the following line:
plot using i:xticlabels(1) title columnheader(i)
But it doesn't work (I tried ($i):xticlabels(1) and other variants, without success).
Thank you

You must use column(i) to select the i-th column. $1 is a shortcut for column(1), but $i cannot be used as a shortcut for column(i). Wrapping column(i) in parentheses makes it an expression, just like ($2) in the help text:
set style data lines
i=2
plot '-' using (column(i)):xticlabels(1) title columnheader(i)
A B
1 10
2 20
3 ?
4 40
5 50
e
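The difference between using 1:2 and using 1:($2) described in the quoted help text can also be sketched outside gnuplot. The following Python snippet is only an illustration of that described behaviour, not gnuplot's implementation; the function name line_segments and its break_at_missing flag are hypothetical.

```python
def line_segments(rows, missing="?", break_at_missing=True):
    """Group (x, y) rows into the runs that would be joined by a line.

    break_at_missing=True mimics `using 1:($2)`: a missing value ends the
    current run. False mimics `using 1:2`: the row is dropped and its
    neighbours are joined.
    """
    segments, current = [], []
    for x, y in rows:
        if y == missing:
            if break_at_missing and current:
                segments.append(current)
                current = []
            continue
        current.append((float(x), float(y)))
    if current:
        segments.append(current)
    return segments


# The inline data from the question.
rows = [("1", "10"), ("2", "20"), ("3", "?"), ("4", "40"), ("5", "50")]
print(line_segments(rows, break_at_missing=False))  # one run, (2,20) joined to (4,40)
print(line_segments(rows, break_at_missing=True))   # two runs, gap at x = 3
```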


Gnuplot - a way to convert and plot text information?

I am trying to use gnuplot to display the information contained in a file as in the example below:
1 2 3 … 10 11
1 1.0000000e-06 1.0000000e-06 … 0
2 2.5000000e-06 1.5000000e-06 … 0 #dt_grow
3 4.7500000e-06 2.2500000e-06 … 0 #dt_grow
4 8.1250000e-06 3.3750000e-06 … 0 #dt_cfl
5 1.2450703e-05 4.3257029e-06 … 1 #dt_mach, max_iteration_turbulence
6 1.6811013e-05 0.3603104e-06 … 0 #dt_grow
My goal is to be able to represent, somehow, the information listed in column 11 which, as you can see, contains non-numeric characters.
It might be pointless but, before moving ahead, it might be helpful to stress that:
1. row 1 has no value in column 11
2. each column 11 value starts with # and is not quoted
3. column 11 contains many other possible entries (e.g. "#dt_piso", "#dt_piso, 2*max_piso reached", "#dt_mach, temperature extrapolation error")
4. when a column 11 value carries additional information (e.g. ", max_iteration_turbulence"), the value of column 10 is non-zero
5. the number of rows is typically of the order 10^6
My idea was to associate a numeric value with each element of column 11 using functions (e.g. if #dt_grow then 1, if #dt_cfl then 2, etc.) so that I can represent this information.
What I have tried so far produces nothing but errors (for brevity, each error is listed below the plot command that caused it):
p "file" u 1:11 w l
--> x range is invalid
p "file" u 1:(''.$11 eq "#dt_cfl" ? 1 : 0) w l
--> warning: Skipping data file with no valid points. x range is invalid
p "file" u 1:(column(11) eq "#dt_cfl" ? 1 : 0) w l
--> internal error : STRING operator applied to non-STRING type
p "file" u 1:(strcol(11) eq "#dt_cfl" ? 1 : 0) w l
--> internal error : STRING operator applied to non-STRING type
splot "time.out" u 1:(11 eq "#dt_cfl" ? 1 : 0) w l
--> Need 1 or 3 columns for cartesian data
#Usage of functions does not resolve the issue:
e.g. f(x)= ''.x eq "#dt_cfl" ? 1 : 0
As you can probably tell from the diversity of my trials, I am somewhat confused about how best to proceed in such cases. I have never had to plot string data and I am not quite sure what is causing the issue. I have been looking for hints in the documentation, but nothing really helped me. I would very much appreciate any input on how to handle string data and associate it with numeric values.
To wrap it up: I want to display the evolution of the information in column 11.
Ideally, I would like to be able to use the possible additional information (as explained in point 4 above) based on the value of column 10.
Given my request, I believe a Python script could better fit my needs, but I am wondering if gnuplot offers such possibilities and I am eager to learn more.
Thanks in advance :)!
P.S.: I am adding a sketch of the result I am trying to obtain, hoping that it helps clarify my goals.
I am open to other solutions anyway, as this is just my plan for overcoming the problem of plotting text data.
With respect to the few rows of data provided above, and assuming the following associations:
#dt_grow is 1
#dt_cfl is 2
#dt_mach is 3
and so on for other possible values (this could be hardcoded, as I would have no more than 10 possible values in column 11)
[Image: plot sketch]
Maybe something like this?
You can use the 11th column (here: the 5th column) as x2ticlabels (check help xticlabels). Before that, link the x2 axis to the x1 axis (check help link).
You could rotate the x2tic labels if there are too many and they overlap: set x2tics rotate by 90.
In principle, you could get rid of the leading # of each label, but I guess it will get a bit tricky because of your missing value in row 1.
Look at the example below as a starting point.
Script:
### adding text info from columns to some labels
reset session
$Data <<EOD
1 2 3 4 5
1 1.0000000e-06 1.0000000e-06 0
2 2.5000000e-06 1.5000000e-06 0 #dt_grow
3 4.7500000e-06 2.2500000e-06 0 #dt_grow
4 8.1250000e-06 3.3750000e-06 0 #dt_cfl
5 1.2450703e-05 4.3257029e-06 1 #dt_mach, max_iteration_turbulence
6 1.6811013e-05 0.3603104e-06 0 #dt_grow
EOD
set termoption noenhanced
set key top left
set link x2 via x inverse x
set x2tics
plot $Data u 1:2:x2tic(5) skip 1 axes x2y1 w lp pt 7 lc "red" title "column 2", \
'' u 1:3 skip 1 w lp pt 7 lc "web-green" title "column 3"
### end of script
Result:
Addition:
I guess I understand what you want to do but the background is still a bit unclear.
What you are asking for is a conversion or mapping of strings to numbers.
I assume you have a fixed and known set of keywords.
Apparently, for your desired plot the other columns besides 1 and 11 do not play a role.
Your missing value in column 11 in row 1 (excl. header) will create problems, hence add the option skip 2.
In the minimized example below, your column 11 is actually column 2.
The example below will create some random test data for better illustration.
create a string list of your keywords
you can address them via word(), check help word
you can (mis)use sum for a lookup to get the index, check help sum
furthermore, check help strcol, help xticlabels, help skip, help ternary.
Script:
### map strings to numbers
reset session
myKeys = '#dt_grow #dt_cfl #dt_piso #dt_foo #dt_bar #dt_xyz #dt_abc'
myKey(i) = word(myKeys,i)
# create some random test data
set table $Data
set samples 50
plot '+' u ("1 2") every ::0::0 w table
plot '+' u ("1") every ::0::0 w table
plot '+' u ($0+1):(word(myKeys,int(rand(0)*words(myKeys)+1))) w table
unset table
getIdx(s) = (n=0, sum[i=1:words(myKeys)] (s eq myKey(i) ? n=i : 0), n)
set ytics 1
set grid x,y
plot $Data u 1:(y0=getIdx(strcol(2))):ytic(myKey(y0)) skip 2 w lp pt 7 lc "red" notitle
### end of script
Result:
I will not attempt a full answer right now, but here are a few pieces that may be useful by themselves or in conjunction with the answer from @theozh.
Column 11 not always present: The presence or absence of column 11 on any given line can be tested using the "pseudo-column" $#, which evaluates to the total number of columns found on that line. See "help pseudo". This feature was introduced in gnuplot version 5.4.2 (June 2021). For example, to plot the values of column 10 but only if column 11 is also present:
plot FOO using 0:(($# > 10) ? column(10) : NaN)
Separate lines on the graph for each column 11 category: This could be done more cleanly using arrays in the development version of gnuplot, but sticking with features present in version 5.4, I suggest placing all the categories you want to track in one big string and then looping over the string.
Category = "#dt_grow #dt_cfl #dt_mach"
xcoord(x) = ... some function of the value in column 1? ...
ycoord(y) = ... some function of the value in column 10? ...
set datafile missing NaN # ignore any lines that evaluate to NaN
plot for [cat in Category] FOO using (xcoord($1)) : (strcol($#) eq cat ? ycoord($10) : NaN) with steps
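Since the question mentions Python as an alternative, the string-to-number mapping discussed in the answers can be sketched there as well. This is a minimal sketch, assuming whitespace-separated columns and a fixed keyword list; the names KEYS and category_index are hypothetical, and the sample rows use made-up filler values for the columns that are elided in the question.

```python
# Assumed fixed keyword list; the position in the list sets the number.
KEYS = ["#dt_grow", "#dt_cfl", "#dt_mach"]


def category_index(line, keys=KEYS, column=11):
    """Return the 1-based index of the keyword in `column`, or None.

    None is returned when the column is absent (as in the first data row)
    or when the keyword is not in `keys`.
    """
    fields = line.split()
    if len(fields) < column:
        return None
    # Drop the comma that separates the keyword from any extra information.
    keyword = fields[column - 1].rstrip(",")
    return keys.index(keyword) + 1 if keyword in keys else None


# Made-up rows with 10 numeric columns and an optional 11th text column.
rows = [
    "1 1e-06 0 0 0 0 0 0 0 0",
    "2 2.5e-06 0 0 0 0 0 0 0 0 #dt_grow",
    "4 8.1e-06 0 0 0 0 0 0 0 0 #dt_cfl",
    "5 1.2e-05 0 0 0 0 0 0 0 1 #dt_mach, max_iteration_turbulence",
]
print([category_index(r) for r in rows])
```

Returning None for rows without column 11 plays the same role as NaN in the gnuplot expressions: such rows can simply be skipped when plotting.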

gnuplot every command with different lines color and legend

Let's say I have this sample data file
1 2
2 3
3 4

1 5
2 6
3 7

1 8
2 9
3 10
Now in gnuplot if I run this command
pl 'test.dat' u 1:2 every :::0::2 w l
It plots three lines for each of the block in the data file, but there's no way to distinguish which line comes from which data block. I want those three lines to have three different colors and different legend labels. Can I do that in addition to the every command?
Sure, there are multiple ways to achieve that. If you insist on having a single empty line between the blocks and on using every, you can plot iteratively:
plot for [i=0:*] 'test.dat' u 1:2 every :::i::i w l lc i
Alternatively, if you separate your data block with two empty lines, you can use the index:
1 2
2 3
3 4


1 5
2 6
3 7


1 8
2 9
3 10
plot for [i=0:*] 'test.dat' index i u 1:2 w l lc i
(shortcut i i instead of index i is also allowed, but difficult to read)
Or without iteration, but using the pseudo-column -2 (which gives you the index number). Note that gnuplot doesn't draw continuous lines between points that are separated with empty lines, therefore the every command is not necessary.
plot 'test.dat' u 1:2:-2 w l lc variable
Automatically generated, different labels can be produced in the following way:
plot for [i=0:*] 'test.dat' i i u 1:2 w l lc i t sprintf('This is block %d', i)
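To make the notion of blocks concrete, here is a small Python sketch that splits the sample data at empty lines and labels each block the way the sprintf title does. It is only an illustration: the function name split_blocks is hypothetical, and unlike gnuplot it does not distinguish between one and two empty lines.

```python
def split_blocks(text):
    """Split data-file text into blocks of numeric rows at empty lines."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(tuple(float(v) for v in line.split()))
        elif current:
            # An empty line closes the block collected so far.
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# The sample data from the question, with one empty line between blocks.
sample = "1 2\n2 3\n3 4\n\n1 5\n2 6\n3 7\n\n1 8\n2 9\n3 10\n"
for i, block in enumerate(split_blocks(sample)):
    print(f"This is block {i}: {block}")
```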

Printing custom label every n elements using Gnuplot

I want to create scatter plot of a file that looks like:
counter N x y
1 200 50 50
2 200 46 46
3 200 56 56
4 200 36 36
5 200 56 56
There are 240 lines in this file. The N is incremented by 200 every 30 lines.
So, when I plot the numbers I want to create a scatter plot of x, y values vs. counter. Here is my code:
plot "file" using 1:3 title "hb" with points pt 2 ps 1 lc rgb "red", \
"file" using 1:4 title "ls" with points pt 3 ps 1 lc rgb "blue"
As a result my x-axis has the range [1,240].
The question is that I want the label of my x-axis to contain the values from the second column, and I want them to be printed after every 30 points.
So, I want my x-axis label to be customized as: [200,400,600,800,1000,1200,1400,1600] where they each have 30 points in between.
I actually searched for this question before, found the solution and solved it. So, I know there is an answer somewhere. But apparently I lost my code. I have been searching for the old post for an hour now but could not find it.
Can anyone help me with using customized labels here?
I'm not sure how to generate xtics from the data in gnuplot, so I'd use bash to generate them for me:
#! /bin/bash
xtics='('$(cut -d' ' -f1,2 file | sort -nuk2 | sed 's/\(.*\) \(.*\)/\2 \1/;s/^/"/;s/ /" /;s/$/,\\/')$'\n)'
gnuplot <<EOF
set term png
set output '1.png'
set xtics $xtics
plot "file" using 1:3 title "hb" with points pt 2 ps 1 lc rgb "red", \
"" using 1:4 title "ls" with points pt 3 ps 1 lc rgb "blue"
EOF
On a randomly generated input, it gives this output:
You can evaluate any expression in xticlabel to give a string or an invalid value. In order to set labels only at certain values of column 1, you can use
plot "file" using 1:3:xtic(int($1)%30 == 0 ? strcol(2) : 1/0) title "hb" pt 2 lc rgb "red", \
"" using 1:4 title "ls" pt 3 lc rgb "blue"
The expression xtic(int($1)%30 == 0 ? strcol(2) : 1/0) places the string value of column 2 when the value in column 1 is a multiple of 30. All other values are skipped, because 1/0 is an invalid value.
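The same selection rule (a label only where the counter is a multiple of 30) can also be used to build an explicit set xtics line, as the bash answer does. The following Python sketch assumes the rows are already parsed into (counter, N) pairs; the function name xtics_command is hypothetical and the rows are generated, not taken from a real file.

```python
def xtics_command(rows, step=30):
    """Build a gnuplot `set xtics (...)` line that labels every
    `step`-th counter value with the N of that row."""
    ticks = [f'"{n}" {counter}' for counter, n in rows if counter % step == 0]
    return "set xtics (" + ", ".join(ticks) + ")"


# Generated rows (counter, N): N grows by 200 every 30 lines, as in the question.
rows = [(c, 200 * ((c - 1) // 30 + 1)) for c in range(1, 241)]
print(xtics_command(rows))
```

The printed line can be pasted into a gnuplot script before the plot command.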

How to plot a vector field with colormap in gnuplot?

I have data in a text file in the following way
x y dx dy z
1 0 1 2 5
2 3 3 3 6
2 4 5 4 8
. . . . .
I'm using gnuplot and I can already plot the vector field using columns x,y,dx,dy but I also want to plot color map using x,y and z on the same graph. I want something like this vector field with color map
I have no idea how to do this. Please help!
You can do a plot 'data' with image. You can look up the plotting styles in the gnuplot documentation.
Then you can combine your vector plot with this one. Examples with source code can be found in the image category on Gnuplotting. This way you should be able to create graphs like the desired one:
# your plot script without plot...
# the image is not scaled yet
plot 'data' u 1:2:5 with image, \
'' u 1:2:3:4 with vector ... # your vector field plot
Note: This only works if you have your data in the form of
1 1 z1
1 2 z2
. . ..
1 n zn
2 1 z21
2 2 z22
. . ...
2 n z2n
3 1 ...
. . ...
So you need ALL datapoints without a 'hole' in it...
I'll continue tomorrow...
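Whether a data set has such holes can be checked before plotting. The following Python sketch lists the (x, y) combinations that are missing from the full grid; the function name missing_grid_points is hypothetical, and the points are the three rows shown in the question reduced to (x, y, z).

```python
from itertools import product


def missing_grid_points(points):
    """Return the (x, y) pairs absent from the full x-by-y grid, in sorted order."""
    present = {(x, y) for x, y, *_ in points}
    xs = sorted({x for x, _ in present})
    ys = sorted({y for _, y in present})
    return [p for p in product(xs, ys) if p not in present]


# The three rows from the question, as (x, y, z).
points = [(1, 0, 5), (2, 3, 6), (2, 4, 8)]
print(missing_grid_points(points))
```

An empty list means every x value occurs with every y value, which is the layout described above.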

Gnuplot: How do I skip columns in matrix input to plot?

I have data file of the form:
unimportant1 unimportant2 unimportant3 matrixdata[i]
1e4 2e5 3e2 1 2 3 4 5
2e3 1e1 7e3 5 4 3 2 1
... ... ... ...
2e3 1e4 4e2 4 4 4 4 4
So it has column headers (here "unimportant1" to "unimportant3") as the first row. I want gnuplot to ignore the first three unimportant columns, i.e. the data entries in exponential notation, and plot the matrixdata as a matrix. So as if I did it like this:
#!/usr/bin/gnuplot -p
plot '-' matrix with image
1 2 3 4 5
5 4 3 2 1
...
4 4 4 4 4
e
How do I get gnuplot to ignore the first three columns and the header row and plot the rest as a matrix image? For compatibility, I would prefer a gnuplot built-in to do that, but I could write a shell script and use the plot '< ...' syntax to preprocess the data file.
Edit: So neuhaus' answer almost solved it. The only thing I'm missing is how to ignore the first row (line) with the text header data. Every seems to expect numeric data, so the whole plot fails as it's not a matrix. I don't want to comment out the first line, as I'm using the unimportant data sets for other 2D plots that, in turn, use the header data.
So how do I skip a row in a matrix plot that already uses every to skip columns?
When using matrix, gnuplot must first parse the data file before it can skip rows and columns. Now, your first row evaluates to four invalid numbers, the second row has 8 numbers, and I get an error that Matrix does not represent a grid.
If you don't want to comment out the first line or skip it with an external tool like < tail -n +2 matrix.dat, then you could change it to contain some dummy strings like
unimportant1 unimportant2 unimportant3 matrixdata[i] B C D E
1e4 2e5 3e2 1 2 3 4 5
2e3 1e1 7e3 5 4 3 2 1
... ... ... ...
2e3 1e4 4e2 4 4 4 4 4
Now your first row has as many entries as the other rows, and you can plot this file with
plot 'test.txt' matrix every ::3:1 with image
This still gives you a warning: matrix contains missing or undefined values, but you don't need to care.
I'm not familiar with matrix plots, but I got some sample data and
plot 'matrix.dat' matrix every ::3 with image
seems to do the trick.
You could probably use shell commands, for instance, the following skips the first six lines of a file:
plot '<tail -n +7 terrain0.dem' matrix with image
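If preprocessing outside gnuplot is acceptable, dropping the header row and the leading columns takes only a few lines. This is a minimal Python sketch, assuming whitespace-separated columns; the function name extract_matrix and its parameters are hypothetical, and the sample omits the elided "..." row from the question.

```python
def extract_matrix(lines, skip_rows=1, skip_cols=3):
    """Drop the header row(s) and the leading columns; return the matrix as text lines."""
    body = [line.split() for line in lines[skip_rows:] if line.strip()]
    return [" ".join(fields[skip_cols:]) for fields in body]


# The data file from the question, without the elided row.
sample = [
    "unimportant1 unimportant2 unimportant3 matrixdata[i]",
    "1e4 2e5 3e2 1 2 3 4 5",
    "2e3 1e1 7e3 5 4 3 2 1",
    "2e3 1e4 4e2 4 4 4 4 4",
]
print("\n".join(extract_matrix(sample)))
```

The printed lines contain only the matrix part and can be written to a file or piped to gnuplot for plotting with matrix with image.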
