I have data in a file that I would like to plot using gnuplot. The file contains 3 data sets separated by two blank lines so that gnuplot can differentiate between them by 'index'. I can plot the three data sets separately via the 'index' option of the 'plot' command.
However, I am not sure how to plot the sum of the 2nd column over all three data sets.
Note: all three data sets have same x data, i.e. 1st column
To do this the simplest thing would be to change your file format. Gnuplot manipulates columns pretty well. Since you are sharing the x data, you can change the file format to have four columns (assuming you are just plotting (x,y) data):
<x data> <y1 data> <y2 data> <y3 data>
and use a command like
plot 'data.dat' using 1:2 title 'data 1', \
'' u 1:3 t 'data 2', \
'' u 1:4 t 'data 3', \
'' u 1:($2+$3+$4) t 'sum of datas'
The dollar signs inside the parens in the using column specification allow you to add/subtract/perform other functions on columnar data.
This way your data file will also be smaller since you won't repeat the x data.
@Youjun Hu, never say that there is "no way" to do something with gnuplot. In most cases there is a gnuplot-only way, although it is sometimes not obvious or a bit cumbersome.
Data: SO16861334.dat
1 11
2 12
3 13
4 14


1 21
2 22
3 23
4 24


1 31
2 32
3 33
4 34
Code 1: (works with gnuplot 4.6.0, needs some adaptations for >=4.6.5)
In gnuplot 4.6.0 (the version at the time of OP's question) there were no datablocks and no plot ... with table. The example below only works for 3 sub-datasets, but could be adapted for other numbers. However, an arbitrarily large number of sub-datasets will be difficult with this approach.
### calculate sum from 3 different (sub)datasets, gnuplot 4.6.0
reset
FILE = "SO16861334.dat"
stats FILE u 0 nooutput
N = int(STATS_records/STATS_blocks) # get number of lines per subblock
set table FILE."2"
plot FILE u 1:2
set table FILE."3"
x1=x2=y1=y2=NaN
myValueX(col) = (x0=x1,x1=x2,x2=column(col), r=int($0-2)/N, r<1 ? x0 : r<2 ? x1 : x2)
myValueY(col) = (y0=y1,y1=y2,y2=column(col), r<1 ? y0 : r<2 ? y1 : y2)
plot FILE."2" u (myValueX(1)):(myValueY(2))
unset table
set key top left
set offset graph 0.1, graph 0.1, graph 0.2, graph 0.1
plot for [i=0:2] FILE u 1:2 index i w lp pt 7 lc i+1 ti sprintf("index %d",i), \
FILE."3" u 1:2 every ::2 smooth freq w lp pt 7 lc rgb "magenta" ti "sum"
### end of code
Code 2: (works with gnuplot>=5.0.0)
This code works with an arbitrary number of sub-datasets.
### calculate sum from 3 different (sub)datasets, gnuplot>=5.0.0
reset
FILE = "SO16861334.dat"
set table $Data2
plot FILE u 1:2 w table
unset table
set key top left
set offset graph 0.1, graph 0.1, graph 0.2, graph 0.1
set colorsequence classic
plot for [i=0:2] FILE u 1:2 index i w lp pt 7 lc i+1 ti sprintf("index %d",i), \
$Data2 u 1:2 smooth freq w lp pt 7 lc rgb "magenta" ti "sum"
### end of code
Result: (same result for Code1 with gnuplot 4.6.0 and Code2 for gnuplot 5.0.0)
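Why smooth freq produces the sum: it makes the data monotonic in x and replaces all points that share the same x value with a single point carrying the summed y values. The following is only a minimal sketch of that behaviour, assuming a recent gnuplot 5; the datablock names $Stacked and $Sum and the shortened data are made up for illustration.

```gnuplot
### sketch: smooth frequency sums y values that share the same x
reset session
# rows of three sub-datasets stacked without blank lines (hypothetical data)
$Stacked <<EOD
1 11
2 12
1 21
2 22
1 31
2 32
EOD
set table $Sum
    plot $Stacked using 1:2 smooth frequency
unset table
# the table holds one row per distinct x:
# x=1 -> 11+21+31 = 63, x=2 -> 12+22+32 = 66
print $Sum
### end of sketch
```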
Related
How to turn the following tabular dataset into a simple 2D density plot to show a loc-number distribution?
I am new to gnuplot and attempted a tutorial. A simple x,y plot with multiple columns of data works fine, of course. Then I tried this answer. However, I encountered the following errors, even though the x values are defined. I am guessing my data set is fundamentally lacking something. What am I not doing right here? How can I get a simple 2D contour plot from the data below?
Updating based on recommended suggestions while OP aim remains intact.
Following is the input sample data used. File is single-space delimited. x = x, y=y, z1 = locid (1 to n) or z2=loctype (scuba, shower, swimming, restrooms, sushi, cafe, restaurant, etc)
input data :
ametype amename X(1000) Y(1000) km-to-carpark
Scuba SCUB1 10.72 49.01
Scuba SCUB2 13.88 47.32
Scuba SCUB3 14.58 46.46
Scuba SCUB4 14.52 48.23
Scuba SCUB5 13.05 47.23
Scuba SCUB6 12.21 47.95
Scuba SCUB7 12.66 46.19
Cafe CAFE1 13.97 47.45
Cafe CAFE4 31.63 30.3
Playground PARK2 31.57 30.2
Playground PARK1 27.51 31.87
Cafe CAFE5 67.71 109.09
Scuba SCUB8 68.58 109.54
Scuba SCUB9 67.14 109.99
Cafe CAFE2 13.83 46.24
SUSHI SUSH1 79.59 41.22
SUSHI SUSHI2 73.81 54.14
SUSHI SUSHI3 72.87 55.47
SUSHI SUSHI4 75.05 56.51
RESTROOM RESTR1 74.1 56.05
RESTROOM RESTR2 74.96 57.9
RESTROOM RESTR3 75.06 55.59
RESTAURANT RESTAU1 76.57 56.33
RESTAURANT RESTAU1 76.95 55.1
RESTAURANT RESTAU2 77.75 54.69
RESTAURANT RESTAU2 76.15 54.34
code tried for a different dataset where x,y weren't coordinates;
set view map
set contour
set isosample 250, 250
set cntrparam level incremental 1, 0.1
set palette rgbformulae 33,13,10
splot 'data.dat' with lines nosurface
#splot for [col=1:10] 'data.dat' u ($1):(column(col) > 2 ? 1/0 : column(col)):3
errors:
1) All points x value undefined
2) Tabular output of this 3D plot style not implemented
updated:
a) increased data points
c) a possible chicken scratch to give simple impression.
Expecting a distribution density map like this.
This is an interesting plotting challenge.
The input data format is straightforward, but it needs some processing before the desired contour lines can be plotted with gnuplot.
Comments:
The data is all in one file. Data entries for the types can be random, no order necessary.
The example below will create some random test data with the types "Cafe, Scuba, Sushi" and 20 entries of each. Skip this part if you want to use your own file.
The remaining lines of the script have no knowledge of the content of the test data file (i.e. how many types, type names, coordinates, etc.); everything is determined automatically.
create a unique list of types. The list will be in the order of first occurrence.
define a grid (here dx=0.2, dy=0.2, i.e. reasonable values within the data range) and count for each grid point the occurrences for each type within a certain radius (here: 0.5). Calculate the density by dividing the count by the unit area (area of the circle).
for each type create the contour lines via plotting to a file indexed by a two digit number. So far, I don't know how one would easily write this into indexed datablocks to avoid files on disk.
finally, plot the contour line files and the original data points by using a filter to get the right color.
One thing which I haven't figured out yet is set cntrparam level 2: I would like to have exactly 2 contour lines per type, but it seems gnuplot still uses the option set cntrparam level auto 2 and adjusts the number of levels itself.
As you can imagine this graph will probably look pretty confusing with 10 or more types.
For sure, there is room for improvement and no guarantee that there are no bugs in this script. Look at it as a starting point for further optimization. Suggestions for improvements are welcome!
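The density step described above can be looked at in isolation. This is just a small sketch for a single grid point, assuming gnuplot 5; the datablock $Pts, the grid point (gx, gy) and the coordinates are invented for illustration and are not part of the script below.

```gnuplot
### sketch: count points within a radius of one grid point
reset session
$Pts <<EOD
0.0 0.0
0.3 0.0
0.0 0.4
2.0 2.0
EOD
Radius = 0.5
Dist(x0,y0,x1,y1) = sqrt((x1-x0)**2 + (y1-y0)**2)
gx = 0.0    # hypothetical grid point
gy = 0.0
c  = 0
# stats walks through all rows; the counter is incremented as a side effect
stats $Pts u (Dist(gx,gy,$1,$2)<=Radius ? c=c+1 : 0) nooutput
d = c / (pi * Radius**2)    # density per unit area
# 3 of the 4 points lie within the radius
print sprintf("count = %d, density = %g", c, d)
### end of sketch
```

The full script repeats exactly this for every grid point and every type, which is why the run time grows quickly with a finer grid.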
Script:
### plot density contours from simple x,y location file
reset session
FILE = "SO73244095.dat"
# create some random test data
myTypes = "Cafe Scuba Sushi"
set print FILE
do for [p=1:words(myTypes)] {
a = word(myTypes,p)
x0 = rand(0)*5
y0 = rand(0)*5
do for [i=1:20] {
print sprintf("%s %s%d %.3g %.3g",a,a,i,invnorm(rand(0))+x0,invnorm(rand(0))+y0)
}
}
set print
# create a unique list of types
# and extract min, max data
addToList(list,col) = list.(_s='"'.strcol(col).'"', strstrt(list,_s)>0 ? '' : _s)
myTypes = ''
myType(i) = word(myTypes,i)
stats FILE u (myTypes=addToList(myTypes,1),$3):4 name "DATA" nooutput
Nt = words(myTypes)
print sprintf("%d types found: %s",Nt,myTypes)
# get densities for each type
dx = 0.2 # adjust the grid as you like...
dy = 0.2 # ... time for graph creation will increase with finer grid
Radius = 0.5 # adjust radius to a reasonable value
Nx = ceil((DATA_max_x-DATA_min_x)/dx)
Ny = ceil((DATA_max_y-DATA_min_y)/dy)
Dist(x0,y0,x1,y1) = sqrt((x1-x0)**2 + (y1-y0)**2)
print "Please wait..."
set print $Densities
do for [nt=1:Nt] {
do for [ny=0:Ny] {
do for [nx=0:Nx] {
c = 0
x = DATA_min_x+nx*dx
y = DATA_min_y+ny*dy
stats FILE u (Dist(x,y,$3,$4)<=Radius && (strcol(1) eq word(myTypes,nt)) ? c=c+1 : 0) nooutput
d = c / (pi * Radius**2) # density per unit area
print sprintf("%g %g %g",x,y,d)
}
print "" # empty line
}
print ""; print "" # two empty lines
}
set print
# get contour lines via splot into files
myContFile(n) = sprintf("%s.cont%02d",FILE,n)
unset surface
set contour
set cntrparam cubicspline levels 2 # cubicspline for "nice" round curves
do for [nt=1:Nt] {
set table myContFile(nt)
splot $Densities u 1:2:3 index nt-1
unset table
}
# set size ratio -1 # uncomment if equal x,y scale is important
set grid x,y
set key out noautotitle
set xrange[:] noextend
set yrange[:] noextend
set colorsequence classic
myFilter(colD,colF,valF) = strcol(colF) eq valF ? column(colD) : NaN
plot for [i=1:Nt] myContFile(i) u 1:2 w l lc i, \
for [i=1:Nt] FILE u 3:(myFilter(4,1,myType(i))) w p pt 7 lc i ti myType(i)
### end of script
Result: (a few random examples)
Using gnuplot, I am trying to make a 2D plot with points where the point color is represented by the third column of a data file(file has 3 columns)
Here is the link to the file
I am using the following command to generate the graph:
pl "outPhaseDiff_b1_dScan.dat" u 1:2:3 w p pt 7 ps 2 lc variable
The desired output should contain 5 colors, but only 2 colors are plotted. This is really strange because I have been using this command for a long time and have not encountered such an issue before. I guess it has something to do with the plotting algorithm, but I have no clue.
Check your data, it contains many line pairs with the following pattern:
0.0000 0.0060 3
0.0000 0.0060 5
One line with x, y, color1, another line with identical x and y, but different color2. So the points from the second line hide the points from the first one.
If you plot it 3d with several layers, it looks like this:
z = 0
y = 0
splot "outPhaseDiff_b1_dScan.dat" \
u 1:2:($2 == y ? (z = z+1) : (z = 0, y=$2), z):3 \
w p pt 7 ps 2 lc variable
A 2d plot is a view from the top, so only two colors are visible.
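If the hidden points should still be visible in a 2d plot, one possible workaround (not from the original answer, just a sketch) is to draw one color per pass with decreasing point size, so that later points never fully cover earlier ones. The datablock $Dup and its values are made up; with a real file you would replace $Dup by the filename.

```gnuplot
### sketch: make overlapping points of different colors visible
reset session
# hypothetical data with pairs of identical x,y but different colors
$Dup <<EOD
0.0000 0.0060 3
0.0000 0.0060 5
0.0100 0.0060 1
0.0100 0.0060 2
EOD
# one pass per color: color 1 is drawn first and largest, color 5 last and smallest
plot for [c=1:5] $Dup u 1:($3==c ? $2 : NaN) w p pt 7 ps (3.5-0.5*c) lc c ti sprintf("color %d",c)
### end of sketch
```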
I want to plot a file with linespoints in gnuplot, where the line uses all the data samples but the points use fewer data samples. For example, the following script plots the data, but the line is not visible at all.
set terminal png
set out "plot_sample.png"
plot [t=-1000:1000] t w linespoints pt 64 lt 10 ps 1.5
How to do it if I want to define a custom sampling interval for the points but use all the data samples for the line? I could do two separate plots in the same figure but then the key will show both of them separately.
Use pointinterval to reduce the number of plotted points, but keep all points for drawing the line:
set samples 100
plot x**2 w linespoints pointinterval 10
Use every to reduce the samples taken from the file!
Plot the line and the points in two parts, and use notitle for one of them!
Don't forget to 'synchronize' the color of the two plots!
Something like:
plot [t=-1000:1000] 'data.dat' w l lt 10 lc 10 t 'something', '' every 10 w p pt 64 ps 1.5 lc 10 notitle
NOTES
Usage of every: plot 'alma.dat' every A:B:C:D:E:F
where
A is the data increment (every Ath)
B is the datablock increment (datablocks are separated by empty lines)
C/D is the first data/datablock (start from C/D)
E/F is the last data/datablock (end at E/F)
You can use all the features described above, but any field you don't need can simply be left empty, e.g. ...every 2 or every 2::1 or every 2::1:0 etc.
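A self-contained sketch of the two-part plot, assuming gnuplot 5 (datablocks); the datablock $Data and the sine test data are invented so that the example runs without an external file.

```gnuplot
### sketch: line through all samples, points only at every 10th sample
reset session
# create 100 hypothetical samples
set print $Data
do for [i=0:99] { print sprintf("%d %g", i, sin(i/10.)) }
set print
# same color for both parts; only the line gets a key entry
plot $Data u 1:2 w l lc 1 ti 'all samples', \
     $Data u 1:2 every 10 w p pt 7 lc 1 notitle
### end of sketch
```

Here every 10 only sets the point increment; all other fields of every are left empty.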
I'm trying to plot a very simple data plot from an experiment we're running at my work. Essentially, I only need to plot y vs. x from a tab-separated data file which looks like this:
468.822 5.76025 2.3631 3 271.91676 60.13701
896.187 5.52183 1.11077 2 519.78846 57.6479052
731.708 6.38751 0.697295 1 424.39064 66.6856044
[and about 2000 more lines like this]
The first two columns are my x and y values.
Now, this data is taken from a video, so it is in pixels, and we need to convert it to the right units (μm and μm/sec instead of pixels and pixels/frame). For this reason, I plot the data with the following line:
plot 'datafile.data' u 1*xScale:2*yScale pt 7 ps 1 lc rgb "red" title "[some title]"
I get an error saying:
plot 'datafile.data' u 1*xScale:2*yScale pt 7 ps 1 lc rgb "red" title "[some title]"
^
"datafile.data", line 9: x range is invalid
(with the ^ sign pointing at the end of the above line)
I tried to scale the data itself (these are columns 5 and 6), but it gives the same error.
Does anyone have any idea what might be wrong?
The command you have uses the result of the arithmetic expression 1*xScale as the column number, and likewise for the second expression. What you want is
plot 'datafile.data' u ($1*xScale):($2*yScale) pt 7 ps 1 lc rgb "red" title "[some title]"
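To try this without the original file, here is a small sketch with the first two columns of the sample rows in a datablock. The values of xScale and yScale are assumptions for illustration only, and $Pixels is a made-up name.

```gnuplot
### sketch: scale column values inside the using specification
reset session
xScale = 0.58     # hypothetical scale factors
yScale = 10.44
$Pixels <<EOD
468.822 5.76025
896.187 5.52183
731.708 6.38751
EOD
# ($1*xScale) means: take the value of column 1 and multiply it;
# a bare 1*xScale would instead be evaluated as a column number
plot $Pixels u ($1*xScale):($2*yScale) pt 7 ps 1 lc rgb "red" title "scaled data"
### end of sketch
```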
I like following linespoints plotting style:
http://www.gnuplotting.org/join-data-points-with-non-continuous-lines/
However, I have encountered an issue when I plot several lines with this style:
As you can see, the second series of points also blanks out the first series (lines and points), which I don't want to happen.
The gnuplot features that make this style possible are pointinterval and pointintervalbox.
Documentation of gnuplot:
A negative value of pointinterval, e.g. -N, means that point symbols
are drawn only for every Nth point, and that a box (actually circle)
behind each point symbol is blanked out by filling with the background
color. The command set pointintervalbox controls the radius of this
blanked-out region. It is a multiplier for the default radius, which
is equal to the point size.
http://www.bersch.net/gnuplot-doc/set-show.html#set-pointintervalbox
Since the documentation says the box is filled with the background color, I was hoping that a transparent background would resolve the issue, but it seems that white is used.
Gnuplot version
gnuplot> show version long
G N U P L O T
Version 5.0 patchlevel 0 last modified 2015-01-01
Copyright (C) 1986-1993, 1998, 2004, 2007-2015
Thomas Williams, Colin Kelley and many others
gnuplot home: http://www.gnuplot.info
faq, bugs, etc: type "help FAQ"
immediate help: type "help" (plot window: hit 'h')
Compile options:
-READLINE +LIBREADLINE +HISTORY
-BACKWARDS_COMPATIBILITY +BINARY_DATA
+GD_PNG +GD_JPEG +GD_TTF +GD_GIF +ANIMATION
-USE_CWDRC +HIDDEN3D_QUADTREE
+DATASTRINGS +HISTOGRAMS +OBJECTS +STRINGVARS +MACROS +THIN_SPLINES +IMAGE +USER_LINETYPES +STATS +EXTERNAL_FUNCTIONS
Minimal Working Example (MWE):
gnuplot-space-line-mark-style.gp
reset
set terminal pngcairo transparent size 350,262 enhanced font 'Verdana,10'
show version
set output 'non-continuous_lines.png'
set border linewidth 1.5
set style line 1 lc rgb '#0060ad' lt 1 lw 2 pt 7 pi -1 ps 1.5
set style line 2 lc rgb '#0020ad' lt 1 lw 2 pt 7 pi -1 ps 1.5
set pointintervalbox 3
unset key
set ytics 1
set tics scale 0.75
set xrange [0:5]
set yrange [0:4]
plot 'plotting_data1.dat' with linespoints ls 1,\
'plotting_data2.dat' with linespoints ls 2
plotting_data1.dat
# X Y
1 2
2 3
3 2
4 1
plotting_data2.dat
# X Y
1.2 2.4
2 3.5
3 2.5
4 1.2
UPDATE
A working pgfplots solution is given on tex.stackoverflow.com
You can do a lot with gnuplot. It's just a matter of how complicated you allow it to get.
You can realize the gap with a two-step plot. First, plot only the points; second, plot vectors, i.e. lines between the points that are shortened by a bit of geometry.
The parameter L1 determines the gap and needs to be adjusted to the data and graph scale. Tested with gnuplot 5.0 and 5.2.
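The geometry behind the shortened vectors can be sketched as follows. This is not the original first version of the script, only a minimal illustration under the assumption of equal x and y scaling (set size ratio -1). The parameter Gap plays the role of the gap length in axis units, and the helper names xs, ys, dx, dy are made up.

```gnuplot
### sketch: shorten each segment by a fixed gap at both ends (axis units)
reset session
$Data <<EOD
0 3
1 2
1.5 1
3 2
4 1
EOD
Gap = 0.2    # hypothetical gap, must be smaller than half the shortest segment
L(n)  = sqrt((x3-x0)**2 + (y3-y0)**2)       # segment length
xs(n) = x0 + (x3-x0)*Gap/L(n)               # shifted start point
ys(n) = y0 + (y3-y0)*Gap/L(n)
dx(n) = (x3-x0)*(1 - 2*Gap/L(n))            # shortened vector components
dy(n) = (y3-y0)*(1 - 2*Gap/L(n))
set size ratio -1
plot $Data u (x3=NaN, y3=NaN, $1):2 w p pt 7 ps 1.5 lc rgb "black" notitle, \
     $Data u (x0=x3, y0=y3, x3=$1, y3=$2, xs(0)):(ys(0)):(dx(0)):(dy(0)) \
           w vectors nohead lw 2 lc rgb "black" notitle
### end of sketch
```

With unequal axis scaling the gaps would look different for different slopes, which is the limitation that a scaling to pixel dimensions addresses.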
Revised version:
Here is the version which creates gaps independent of the terminal size and the graph scale. It just requires a bit more scaling. However, it needs the sizes of the terminal and the graph, which are stored in GPVAL_... variables that are only available after plotting, so the procedure unfortunately requires replotting.
I'm not sure whether this works for all terminals. I just tested on a wxt terminal.
Empirical findings (for wxt-terminal on Win7):
pointsize 100 (ps) corresponds to 600 pixels (px), hence: Rpxps=6 (ratio pixel to pointsize )
term size 400,400 (px) corresponds to 8000,8000 terminal units (tu), hence: Rtupx=20 (ratio terminal units to pixels)
Edit: the factor Rtupx apparently is different for different terminals: wxt: 20, qt: 10, pngcairo: 1, you could use the variable GPVAL_TERM for checking the terminal.
Rtupx = 1. # for pngcairo terminal 1 tu/px
if (GPVAL_TERM eq "wxt") { Rtupx = 20. } # 20 tu/px, 20 terminal units per pixel
if (GPVAL_TERM eq "qt") { Rtupx = 10. } # 10 tu/px, 10 terminal units per pixel
The ratios of axis units (au) to terminal units (tu) are different for x and y and are:
Rxautu = (GPVAL_X_MAX-GPVAL_X_MIN)/(GPVAL_TERM_XMAX-GPVAL_TERM_XMIN)
Ryautu = (GPVAL_Y_MAX-GPVAL_Y_MIN)/(GPVAL_TERM_YMAX-GPVAL_TERM_YMIN)
The variable GapSize is given in pointsize units. Actually, the real gap size depends on the pointsize (and also linewidth of the line). For simplicity, here gap size means the distance from the center of the point to where the line starts. So, GapSize=1.5 when having pointsize 1.5 will result in a gap of 0.75 on each side. L3(n) from the earlier version is now replaced by L3px(n) in pixel dimensions and L1 from the earlier version is not needed anymore.
Code:
### "linespoints" with gaps between lines and points
reset session
$Data1 <<EOD
# X Y
0 3
1 2
1.5 1
3 2
4 1
EOD
$Data2 <<EOD
0 0
1 1
2 1
2 2
3 1
3.98 0.98
EOD
GapSize = 1.5
Rtupx = 20. # 20 tu/px, 20 terminal units per pixel
Rpxps = 6. # 6 px/ps, 6 pixels per pointsize
# Ratio: axis units per terminal units
Rxautu(n) = (GPVAL_X_MAX-GPVAL_X_MIN)/(GPVAL_TERM_XMAX-GPVAL_TERM_XMIN)
Ryautu(n) = (GPVAL_Y_MAX-GPVAL_Y_MIN)/(GPVAL_TERM_YMAX-GPVAL_TERM_YMIN)
dXpx(n) = (x3-x0)/Rxautu(n)/Rtupx
dYpx(n) = (y3-y0)/Ryautu(n)/Rtupx
L3px(n) = sqrt(dXpx(n)**2 + dYpx(n)**2)
x1px(n) = dXpx(n)*GapSize*Rpxps/L3px(n)
y1px(n) = dYpx(n)*GapSize*Rpxps/L3px(n)
x2px(n) = dXpx(n)*(L3px(n)-GapSize*Rpxps)/L3px(n)
y2px(n) = dYpx(n)*(L3px(n)-GapSize*Rpxps)/L3px(n)
x1(n) = x1px(n)*Rtupx*Rxautu(n) + x0
y1(n) = y1px(n)*Rtupx*Ryautu(n) + y0
x2(n) = x2px(n)*Rtupx*Rxautu(n) + x0
y2(n) = y2px(n)*Rtupx*Ryautu(n) + y0
set style line 1 pt 7 ps 1.5 lc rgb "black"
set style line 2 lw 2 lc rgb "black"
set style line 3 pt 7 ps 1.5 lc rgb "red"
set style line 4 lw 2 lc rgb "red"
plot \
$Data1 u (x3=NaN, y3=NaN,$1):2 w p ls 1 notitle, \
$Data1 u (y0=y3,y3=$2,x0=x3,x3=$1,x1(0)):(y1(0)): \
(x2(0)-x1(0)):(y2(0)-y1(0)) w vectors ls 2 nohead notitle, \
$Data2 u (x3=NaN, y3=NaN,$1):2 w p ls 3 notitle, \
$Data2 u (y0=y3,y3=$2,x0=x3,x3=$1,x1(0)):(y1(0)): \
(x2(0)-x1(0)):(y2(0)-y1(0)) w vectors ls 4 nohead notitle
replot
### end of code
Result: (two different terminal sizes)
Explanations:
Question: Why is there the argument (n) for L3(n), x1(n), y1(n), x2(n), y2(n)?
n is always 0 when L3(n),... are computed and is not used on the right hand side.
Answer:
To make them non constant-expressions. Alternatively, one could
add x0,x3,y0,y3 as variables, e.g. L3(x0, y0, x3, y3); however, the
compactness would be lost.
Question: What does the using part in plot $Data1 using (x3=NaN,y3=NaN,$1):2 mean?
Answer:
(,) is called a serial evaluation which is documented under the
section Expressions > Operator > Binary in the gnuplot documentation
(only v4.4 or newer).
Serial evaluation occurs only in parentheses and is guaranteed to
proceed in left to right order. The value of the rightmost subexpression
is returned.
This is done here for the initialization of (x3,y3) for the
subsequent plot of the line segments as vectors. It is irrelevant for
the plotting of points.
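Serial evaluation can be tried on the command line, independent of any plot. A tiny sketch with made-up variable names a and b:

```gnuplot
### sketch: serial evaluation inside parentheses
a = 0
b = (a = 5, a*2)   # first assigns a, then returns the rightmost value
print a, b         # a is now 5, b is 10
### end of sketch
```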
Question: How does this draw N-1 segments/vectors for N points?
Answer:
Setting x3=NaN, y3=NaN when plotting points ensures that for the
first data point the initial data point (x0,y0) is set to (NaN,NaN)
which has the consequence that the evaluation of x1(0) and y1(0) also returns NaN.
Gnuplot in general skips points with NaN, i.e. for the first
data point no vector is drawn. The code draws the line between the
first and second point when the iteration reaches the second point.
Question: How does the second plot '' u ... iterate over all points?
Answer:
gnuplot> h special-filenames explains this:
There are a few filenames that have a special meaning: '', '-', '+' and '++'.
The empty filename '' tells gnuplot to re-use the previous input file in the
same plot command. So to plot two columns from the same input file:
plot 'filename' using 1:2, '' using 1:3
Question: Do we need the parentheses around (y1(0))?
Answer: gnuplot> h using explains this:
Each may be a simple column number that selects the value from one
field of the input file, a string that matches a column label in the first
line of a data set, an expression enclosed in parentheses, or a special
function not enclosed in parentheses such as xticlabels(2).