Printing custom label every n elements using Gnuplot - plot

I want to create a scatter plot of a file that looks like:
counter N x y
1 200 50 50
2 200 46 46
3 200 56 56
4 200 36 36
5 200 56 56
There are 240 lines in this file. The N is incremented by 200 every 30 lines.
So, when I plot the numbers I want to create a scatter plot of x, y values vs. counter. Here is my code:
plot "file" using 1:3 title "hb" with points pt 2 ps 1 lc rgb "red", \
"file" using 1:4 title "ls" with points pt 3 ps 1 lc rgb "blue"
As a result my x-axis has the range [1,240].
The question is that I want the label of my x-axis to contain the values from the second column, and I want them to be printed after every 30 points.
So, I want my x-axis label to be customized as: [200,400,600,800,1000,1200,1400,1600] where they each have 30 points in between.
I actually searched for this question before, found the solution and solved it. So, I know there is an answer somewhere. But apparently I lost my code. I have been searching for the old post for an hour now but could not find it.
Can anyone help me with using customized labels here?

I'm not sure how to generate xtics from the data in gnuplot, so I'd use bash to generate them for me:
#! /bin/bash
xtics='('$(cut -d' ' -f1,2 file | sort -nuk2 | sed 's/\(.*\) \(.*\)/\2 \1/;s/^/"/;s/ /" /;s/$/,\\/')$'\n)'
gnuplot <<EOF
set term png
set output '1.png'
set xtics $xtics
plot "file" using 1:3 title "hb" with points pt 2 ps 1 lc rgb "red", \
"" using 1:4 title "ls" with points pt 3 ps 1 lc rgb "blue"
EOF
On a randomly generated input, it gives this output:

You can evaluate any expression in xticlabel to give a string or an invalid value. In order to set labels only at certain values of column 1, you can use
plot "file" using 1:3:xtic(int($1)%30 == 0 ? strcol(2) : 1/0) title "hb" pt 2 lc rgb "red", \
"" using 1:4 title "ls" pt 3 lc rgb "blue"
The expression xtic(int($1)%30 == 0 ? strcol(2) : 1/0) places the string value of column 2 as a tic label wherever the value in column 1 is a multiple of 30. All other rows are skipped, because 1/0 is an invalid value.
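As a self-contained illustration of that expression, here is a minimal sketch. The six-row data block and the step of 3 (instead of 30) are assumptions made for the demo and are not taken from the question's file:

```gnuplot
### sketch: tic labels from column 2 on every 3rd row (made-up demo data)
reset session
$Data <<EOD
1 200 50 50
2 200 46 46
3 200 56 56
4 400 36 36
5 400 56 56
6 400 41 41
EOD
step = 3     # hypothetical step; the question would use 30
# the ternary yields the column-2 string on every step-th row and an undefined value otherwise
plot $Data using 1:3:xtic(int($1)%step == 0 ? strcol(2) : 1/0) title "hb" with points pt 2 lc rgb "red", \
     ''    using 1:4 title "ls" with points pt 3 lc rgb "blue"
### end of sketch
```

With this data, only x = 3 and x = 6 should receive a label ("200" and "400").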

Related

Gnuplot - a way to convert and plot text information?

I am trying to use gnuplot to display the information contained in a file as in the example below:
1 2 3 … 10 11
1 1.0000000e-06 1.0000000e-06 … 0
2 2.5000000e-06 1.5000000e-06 … 0 #dt_grow
3 4.7500000e-06 2.2500000e-06 … 0 #dt_grow
4 8.1250000e-06 3.3750000e-06 … 0 #dt_cfl
5 1.2450703e-05 4.3257029e-06 … 1 #dt_mach, max_iteration_turbulence
6 1.6811013e-05 0.3603104e-06 … 0 #dt_grow
My goal is to be able to represent, somehow, the information listed in column 11 which, as you can see, contains non-numeric characters.
It might be pointless but, before moving ahead, it might be helpful to stress that:
row1 has no value at column 11
each column 11 value starts with # and is not quoted
column 11 contains many other different possible entries (e.g. "#dt_piso","#dt_piso, 2*max_piso reached", "#dt_mach, temperature extrapolation error")
when values of column 11 present an additional information (e.g ", max_iteration_turbulence") values of column 10 are non-zero
the number of rows is typically of the order 10^6
My idea was to associate a numeric value with each element of column 11 using functions (e.g. if #dt_grow then 1, if #dt_cfl then 2, etc.) so that I can somehow represent this information.
What I have tried so far produces nothing but errors (for brevity, each error is listed below the plot command that caused it):
p "file" u 1:11 w l
--> x range is invalid
p "file" u 1:(''.$11 eq "#dt_cfl" ? 1 : 0) w l
--> warning: Skipping data file with no valid points. x range is invalid
p "file" u 1:(column(11) eq "#dt_cfl" ? 1 : 0) w l
--> internal error : STRING operator applied to non-STRING type
p "file" u 1:(strcol(11) eq "#dt_cfl" ? 1 : 0) w l
--> internal error : STRING operator applied to non-STRING type
splot "time.out" u 1:(11 eq "#dt_cfl" ? 1 : 0) w l
--> Need 1 or 3 columns for cartesian data
#Usage of functions does not resolve the issue:
e.g. f(x)= ''.x eq "#dt_cfl" ? 1 : 0
As you can probably tell by the diversity of my trials I am somehow confused on how it is recommendable to proceed in such cases. I have never had to plot string data and I am not quite sure of what is causing the issue. I've been looking for some inputs on the documentation but nothing really helped me on this. I would very much appreciate any inputs on how to handle string data and associate them to numeric values.
To wrap it up: I want to display the evolution of the information on column 11.
Ideally, I would like to be able to use the eventual additional information (as explained in point 4 above) based on the value of column 10.
Based on my request I believe a python script could better fit my necessities, but I am wondering if gnuplot offers such possibilities and I am eager to learn more.
Thanks in advance :)!
P.S.: I am adding a sketch of the result I am trying to obtain, hoping that it helps clarify my goals.
I am open to other solutions anyway, as this is just my plan for overcoming the problem of plotting text data.
With respect to the few rows of data provided above, and assuming the following associations:
#dt_grow is 1
#dt_cfl is 2
#dt_mach is 3
and so on for other possible values (this could be hardcoded, as I would have no more than 10 possible values in column 11)
Plot_ sketch
Maybe something like this?
You can use the 11th column (here: 5th column) as x2ticlabels (check help xticlabels). First, link the x2 axis to the x1 axis (check help link).
You could rotate the x2tic labels if there are too many and they overlap: set x2tics rotate by 90.
In principle, you could get rid of the leading # of each label, but I guess it will get a bit tricky because of your missing value in row 1.
Look at the example below as a starting point.
Script:
### adding text info from columns to some labels
reset session
$Data <<EOD
1 2 3 4 5
1 1.0000000e-06 1.0000000e-06 0
2 2.5000000e-06 1.5000000e-06 0 #dt_grow
3 4.7500000e-06 2.2500000e-06 0 #dt_grow
4 8.1250000e-06 3.3750000e-06 0 #dt_cfl
5 1.2450703e-05 4.3257029e-06 1 #dt_mach, max_iteration_turbulence
6 1.6811013e-05 0.3603104e-06 0 #dt_grow
EOD
set termoption noenhanced
set key top left
set link x2 via x inverse x
set x2tics
plot $Data u 1:2:x2tic(5) skip 1 axes x2y1 w lp pt 7 lc "red" title "column 2", \
'' u 1:3 skip 1 w lp pt 7 lc "web-green" title "column 3"
### end of script
Result:
Addition:
I guess I understand what you want to do but the background is still a bit unclear.
What you are asking for is a conversion or mapping of strings to numbers.
I assume you have a fixed and known set of keywords.
Apparently, for your desired plot the other columns besides 1 and 11 do not play a role.
Your missing value in column 11 in row 1 (excl. header) will create problems, hence add the option skip 2.
In the minimized example below, your column 11 is actually column 2.
The example below will create some random test data for better illustration.
create a string list of your keywords
you can address them via word(), check help word
you can (mis)use sum for a lookup to get the index, check help sum
furthermore, check help strcol, help xticlabels, help skip, help ternary.
Script:
### map strings to numbers
reset session
myKeys = '#dt_grow #dt_cfl #dt_piso #dt_foo #dt_bar #dt_xyz #dt_abc'
myKey(i) = word(myKeys,i)
# create some random test data
set table $Data
set samples 50
plot '+' u ("1 2") every ::0::0 w table
plot '+' u ("1") every ::0::0 w table
plot '+' u ($0+1):(word(myKeys,int(rand(0)*words(myKeys)+1))) w table
unset table
getIdx(s) = (n=0, sum[i=1:words(myKeys)] (s eq myKey(i) ? n=i : 0), n)
set ytics 1
set grid x,y
plot $Data u 1:(y0=getIdx(strcol(2))):ytic(myKey(y0)) skip 2 w lp pt 7 lc "red" notitle
### end of script
Result:
I will not attempt a full answer right now, but here are a few pieces that may be useful by themselves or in conjunction with the answer from @theozh.
Column 11 not always present: The presence or absence of column 11 on any given line can be tested using the "pseudo-column" $#, which evaluates to the total number of columns found on that line. See "help pseudo". This feature was introduced in gnuplot version 5.4.2 (June 2021). For example, to plot the values of column 10 but only if column 11 is also present:
plot FOO using 0:(($# > 10) ? column(10) : NaN)
Separate lines on the graph for each column 11 category: This could be done more cleanly using arrays in the development version of gnuplot, but sticking with features present in version 5.4 I suggest placing all the categories you want to track in one big string and then looping over the string.
Category = "#dt_grow #dt_cfl #dt_mach"
xcoord(x) = ... some function of the value in column 1? ...
ycoord(y) = ... some function of the value in column 10? ...
set datafile missing NaN # ignore any lines that evaluate to NaN
plot for [cat in Category] FOO using (xcoord($1)) : (strcol($#) eq cat ? ycoord($10) : NaN) with steps
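To make the category loop concrete, here is a minimal runnable sketch. It assumes a made-up three-column data block in which column 3 holds the category and column 2 stands in for column 10. The coordinate functions are left out and points are used instead of steps:

```gnuplot
### sketch: one plot element per category (hypothetical 3-column test data)
reset session
$Data <<EOD
1 0 #dt_grow
2 0 #dt_grow
3 0 #dt_cfl
4 1 #dt_mach
5 0 #dt_grow
EOD
set termoption noenhanced     # keep "_" in the key from being read as a subscript
Category = "#dt_grow #dt_cfl #dt_mach"
set yrange [-0.5:1.5]
# rows whose category does not match the current one evaluate to NaN and are not drawn
plot for [cat in Category] $Data using 1:(strcol(3) eq cat ? $2 : NaN) with points pt 7 title cat
### end of sketch
```

Each category becomes its own plot element, so it gets its own colour and key entry.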

GNUPLOT with point-size variables stored in a different file

I have a data file with the following format :
y1 y2 y3 y4 ...
1.3 1.1 0.5 0.5 ...
0.2 0.4 0.6 0.1 ...
I know how to use Gnuplot to plot the data in this file. Suppose I have 50 columns, then I use:
plot for [col=0:150] filename using 0:col with lines ...
Now, I want to make a scatter instead of a line plot with points having variable size. I have a different file storing the pointsize variables. I know I need to also use a for loop and:
w p ps variable
However, since the point-size variables are stored in a different file, I do not know how to write the using specification. Normally one uses
using 0:1:2
where the point size variables are stored in the second column etc. But what if these variables are stored in a different file ?
I think I can solve this problem by combining both the data and the pointsize variables file into a single file, but I wonder if one can do this using gnuplot.
Thanks
If there is a one-to-one matchup of lines in the two files, then yes. Assuming file.dat is formatted like the one you show above, and ps.dat contains one header record and then in column 1 the point size for all points in that same line of the data file:
# read point sizes into a data block in gnuplot
set datafile columnheaders
set table $pointsize
plot "ps.dat" using 1 with table
unset table
# Now plot the data, using the value of $pointsize[j+1] for row j of points
# There are two tricky bits here
# 1) the line numbers are counted starting with 0
# but array and datablock entries are counted starting from 1.
# 2) $pointsize is an array of strings. We need to convert this to a
# real number in order to use it as a point size
plot for [i=1:*] "file.dat" using 0:i:(real($pointsize[$0+1])) with points ps variable
file.dat
y1 y2 y3 y4
1 2 4 3
2 3 5 4
3 4 6 5
4 5 8 6
ps.dat
ps
1
5
2
3
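If an external tool is acceptable, the two files can also be joined on the fly instead of going through a data block. This is only a sketch: it assumes a Unix-like system where the paste utility is available, the 4-column file.dat and 1-column ps.dat shown above, and a one-to-one matchup of lines. The name ncols is introduced here for illustration:

```gnuplot
# paste joins the two files line by line, so the point size ends up in column ncols+1
ncols = 4     # number of data columns in file.dat (assumption for this example)
# skip 1 drops the joined header line; column(ncols+1) reads the point size,
# whereas a bare (ncols+1) would be taken as the constant value 5
plot for [i=1:ncols] "< paste file.dat ps.dat" using 0:i:(column(ncols+1)) skip 1 \
     with points pt 7 ps variable title sprintf("y%d", i)
```

This does the combining that the question mentions, but at plot time, without writing a merged file to disk.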

gnuplot every command with different lines color and legend

Lets say I have this sample data file
1 2
2 3
3 4

1 5
2 6
3 7

1 8
2 9
3 10
Now in gnuplot if I run this command
pl 'test.dat' u 1:2 every :::0::2 w l
It plots three lines for each of the block in the data file, but there's no way to distinguish which line comes from which data block. I want those three lines to have three different colors and different legend labels. Can I do that in addition to the every command?
Sure, there are multiple ways to achieve that. If you insist on having a single empty line between the blocks and on using every, you can plot iteratively:
plot for [i=0:*] 'test.dat' u 1:2 every :::i::i w l lc i
Alternatively, if you separate your data block with two empty lines, you can use the index:
1 2
2 3
3 4


1 5
2 6
3 7


1 8
2 9
3 10
plot for [i=0:*] 'test.dat' index i u 1:2 w l lc i
(shortcut i i instead of index i is also allowed, but difficult to read)
Or without iteration, but using the pseudo-column -2 (which gives you the index number). Note that gnuplot doesn't draw continuous lines between points that are separated with empty lines, therefore the every command is not necessary.
plot 'test.dat' u 1:2:-2 w l lc variable
Automatically generated, different labels can be produced in the following way:
plot for [i=0:*] 'test.dat' i i u 1:2 w l lc i t sprintf('This is block %d', i)

Is it possible to suppress plotting zero values from datafile column?

I wrote some data collecting procedure, and over time I added more columns to the data output.
To build a consistent format, the procedure outputs 0 where no measurements were available.
I wonder when plotting the data file whether it is possible not to plot zero values (like if no data were present).
Some of the new columns are plotted by themselves (using 2:7) and others are used in an expression (using 2:($7+$8)).
Here is another option: set datafile missing "0". Note that a value written as 0.0 will still be plotted.
With this option the lines stay connected across the skipped points if you use with lines or with linespoints.
Code:
### do not plot values "0"
reset session
$Data <<EOD
1 1.1
2 0
3 5.1
4 2.1
5 0
6 0.0
7 5.1
EOD
set datafile missing "0"
plot $Data u 1:2 w lp pt 7, \
'' u 1:($1+$2) w lp pt 7
### end of code
Result:
Also check help set datafile, help set datafile missing or help missing.
gnuplot will not plot values if they are not-a-number, i.e. NaN. You can either use this string in the data instead of 0, or write a function to convert 0 to NaN and use that, eg:
chk(x) = (x==0?NaN:x)
plot "file" using 2:(chk($7)+chk($8)) with lines
Adding a value to NaN results in NaN.
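A self-contained sketch of the chk() approach, using a small made-up data block so that it runs without the original file:

```gnuplot
### sketch: map zeros to NaN before plotting (made-up demo data)
reset session
$Data <<EOD
1 1.1
2 0
3 5.1
4 2.1
5 0
6 0.0
7 5.1
EOD
# the comparison is numeric, so both 0 and 0.0 are turned into NaN
chk(x) = (x == 0 ? NaN : x)
plot $Data using 1:(chk($2)) with linespoints pt 7 title "zeros mapped to NaN"
### end of sketch
```

Because chk() compares numbers rather than text, the row written as 0.0 is dropped as well.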

Gnuplot : skip missing data points and xticlabels

I would like to skip some points when drawing a graph in gnuplot, without connecting lines through the missing points.
It is the same problem as: https://superuser.com/questions/440947/in-gnuplot-how-to-plot-with-lines-but-skip-missing-data-points
The gnuplot help says :
set datafile missing "?"
set style data lines
plot '-'
1 10
2 20
3 ?
4 40
5 50
e
plot '-' using 1:2
1 10
2 20
3 ?
4 40
5 50
e
plot '-' using 1:($2)
1 10
2 20
3 ?
4 40
5 50
e
The first plot will recognize only the first datum in the "3 ?" line. It
will use the single-datum-on-a-line convention that the line number is "x"
and the datum is "y", so the point will be plotted (in this case erroneously)
at (2,3).
The second plot will correctly ignore the middle line. The plotted line
will connect the points at (2,20) and (4,40).
The third plot will also correctly ignore the middle line, but the plotted
line will not connect the points at (2,20) and (4,40).
In order not to connect the points (2,20) and (4,40), we have to put a $ symbol : plot '-' using 1:($2)
I'd like to do the same thing with the following line :
plot using i:xticlabels(1) title columnheader(i)
But it doesn't work (I tried ($i):xticlabels(1) and other things, without success).
Thank you
You must use column(i) to select the i-th column. $1 is a shortcut for column(1) but you cannot use $i as shortcut for column(i):
set style data lines
i=2
plot '-' using (column(i)):xticlabels(1) title columnheader(i)
A B
1 10
2 20
3 ?
4 40
5 50
e
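The same column(i) form also works inside an iteration, which is the usual reason for keeping the column number in a variable. A minimal sketch, assuming a made-up data block with a header line and two data columns:

```gnuplot
### sketch: loop over columns with column(i) (made-up demo data)
reset session
$Data <<EOD
x A B
1 10 15
2 20 ?
3 ? 35
4 40 45
EOD
set datafile missing "?"
set style data linespoints
# column(i) selects the i-th column; $i would not be accepted here
plot for [i=2:3] $Data using (column(i)):xticlabels(1) title columnheader(i)
### end of sketch
```

Using columnheader(i) in the title makes gnuplot treat the first line as a header, so it is not plotted as data.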
