Hi, I'm using gnuplot to plot simulation data that is structured in data blocks, like this:
CurrentTime CurrentState
0 2
1.234 2
1.990 1
2.462 0
CurrentTime CurrentState
0 2
0.895 1
1.456 2
2.052 1
3.017 0
The number of data blocks is not known in advance, but there are at least 30.
Notice that the number of time points differs from block to block.
I'm using the following code to plot the data as is:
# GNUPlot code
set multiplot layout 2,1 title "Insert title" font ",14"
set tmargin 3
set bmargin 3
set lmargin 5
set rmargin 2
plot "data.txt" every :1 using 1:2:(column(-2)) with linespoints lc variable
Because of the multiplot command, the next thing I plot will go in the lower panel. I want that plot to be the average of my data, taken over all blocks, at time intervals that I set. In pseudocode I want:
# pseudo code
float start, step, stop;
assign start, step, stop;
define Interval=start, by step, to stop; typed another way Interval=start:step:stop
array sum(size(number of data blocks,length(Interval), length(Interval)))
assign sum=0;
for every data block
for k=0 to length(CurrentTime)
for j=0 to length(Interval)-1
(CurrentTime(k) < Interval(j+1) && CurrentTime(k) > Interval(j-1)) ? sum += CurrentState(k) : sum += 0
average=sum/(Number of data blocks)
I am stuck trying to implement that in gnuplot. Any assistance would be awesome!
First there is the data file, some of my real data is
CurrentTime CurrentState
0 2
4.36393 1
5.76339 2
13.752 1
13.7645 2
18.2609 1
19.9713 2
33.7285 1
33.789 0
CurrentTime CurrentState
0 2
3.27887 1
3.74072 2
3.86885 1
4.97116 0
CurrentTime CurrentState
0 2
1.19854 1
3.23982 2
7.30501 1
7.83872 0
Then I used Python to find the average of the data at the times where I want to check it. I chose to check at integer time steps, but they could be any time steps. The following is my Python code:
# Load the data file. Goal: the average state, over all runs, at each time in TimeIntervals.
import numpy as np

# Header lines start with 'C' ("CurrentTime ..."), so treat them as comments.
data = np.genfromtxt('data.txt', comments='C')
CurrentTime = data[:, 0]
CurrentState = data[:, 1]

numberTimeIntervals = 101
# Integer sample times 0, 1, ..., 100
TimeIntervals = np.linspace(0, numberTimeIntervals - 1, numberTimeIntervals)
stateOfTimeIntervals = np.zeros(numberTimeIntervals, dtype=np.float64)

# Main loop: for each sample time, add the state of every run whose
# current step covers that time (CurrentTime[k] <= t < CurrentTime[k+1]).
# The initial state at t = 0 is picked up by the same test, so it is not
# added separately (that would count it twice).
numberSimTimes = len(CurrentTime)
for j in range(len(TimeIntervals)):
    for k in range(numberSimTimes - 1):
        if CurrentTime[k] <= TimeIntervals[j] < CurrentTime[k + 1]:
            stateOfTimeIntervals[j] += CurrentState[k]

# Every run starts at time 0, so the number of runs is the number of zeros in CurrentTime.
numberRuns = len(CurrentTime) - np.count_nonzero(CurrentTime)
print("Number of runs = %d" % numberRuns)

# Compute the average
averageState = stateOfTimeIntervals / numberRuns

# Write time and average as two columns so gnuplot can plot them with 'using 1:2'
np.savetxt('plot2gnu.txt', np.column_stack((TimeIntervals, averageState)))
Then using gnuplot I plotted 'plot2gnu.txt' using the following code
# to plot everything on the same plot use "multiplot"
set multiplot layout 2,1 title "Insert title" font ",14"
set tmargin 3
set bmargin 3
set lmargin 5
set rmargin 2
plot "data.txt" every :1 using 1:2:(column(-2)) with linespoints lc variable
plot 'plot2gnu.txt' using 1:2 with linespoints
I would like to point out the use of a pseudocolumn 'column(-2)' in the third column specifying line color. 'column(-2)' represents "The index number of the current data set within a file that contains multiple data sets." - From the 'old' gnuplot 4.6 documentation.
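As a cross-check on the Python loop above, the same averaging could be sketched in vectorized form. This is only a sketch under stated assumptions: average_state is a hypothetical helper name, every run is assumed to start at time 0 with increasing times, the sample times are assumed to be non-negative, and each run is assumed to keep its last state after its final event (in the data above that final state is 0, so the result should agree with the loop).

```python
import numpy as np


def average_state(times, states, grid):
    """Average a piecewise-constant state over several runs, sampled on `grid`.

    `times` and `states` are the two data columns with all runs concatenated.
    Assumption: each run begins with a row whose time is exactly 0.
    """
    times = np.asarray(times, dtype=float)
    states = np.asarray(states, dtype=float)
    grid = np.asarray(grid, dtype=float)

    starts = np.flatnonzero(times == 0.0)      # first row of each run
    ends = np.append(starts[1:], len(times))   # one past the last row of each run

    total = np.zeros(len(grid))
    for a, b in zip(starts, ends):
        # Index of the last event at or before each sample time
        idx = np.searchsorted(times[a:b], grid, side="right") - 1
        total += states[a:b][idx]
    return total / len(starts)
```

Stacking the sample times and the returned averages into two columns would give a file that can be plotted with using 1:2, as in the script above.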
Related
I'm new to gnuplot. Today I learned a lot on Stack Overflow about a problem I had with a plot, but now for my research I need to make another one like the following image:
The data are of the same type. I collected them and isolated only the columns I need in file.dat; a sample follows:
5.38e-51 1
2.75e-81 1
5.3e-67 1
8.71e-170 4
3.62e-59 3
2.98e-52 2
3.2e-31 1
2.98e-54 2
3.85e-29 2
5.38e-57 1
3.2e-33 2
2.75e-88 1
3.2e-34 1
9.89e-37 1
5.38e-59 2
3.2e-35 2
1.68e-168 1
1.81e-101 1
9.89e-39 1
2.98e-59 2
3.2e-39 1
1.07e-110 3
1.07e-111 2
1.81e-107 2
2.82e-40 4
2.6e-108 1
1.07e-115 1
My problem is that I would like the x-axis to show the exponent, as in 1E-x.
I tried using set format x '%.0f'.sprintf('e%d'), but it doesn't work.
How can I do that?
Thank you.
If you want to plot values that span many orders of magnitude, you can either plot them on a logarithmic scale
or take log10() of the values and plot them on a linear scale.
From your description, which lacks code and details, I'm guessing that you have
the data you've shown and want to create a histogram with the negative log10 value on the x-axis.
NB: In your earlier data you had also 0.0, which will mess-up the smooth freq option together with the logarithmic bins. Note that set datafile missing "0.0" will only exclude 0.0 as text, i.e. 0, 0.00, or 0e-10 will not be excluded. So, make sure you don't have zeros in your data, this will not work with logarithmic scale.
Code:
### plot histogram with logarithmic bins
reset session
$Data <<EOD
5.38e-51 1
2.75e-81 1
5.3e-67 1
8.71e-170 4
3.62e-59 3
2.98e-52 2
3.2e-31 1
2.98e-54 2
3.85e-29 2
5.38e-57 1
3.2e-33 2
2.75e-88 1
3.2e-34 1
9.89e-37 1
5.38e-59 2
3.2e-35 2
1.68e-168 1
1.81e-101 1
9.89e-39 1
2.98e-59 2
3.2e-39 1
1.07e-110 3
1.07e-111 2
1.81e-107 2
2.82e-40 4
2.6e-108 1
1.07e-115 1
EOD
set xlabel "-log_{10}(x)"
set xrange[-15:200]
set xtics 10 out
set ylabel "Sequences"
set yrange [0:]
set key noautotitle
# Histogram with logarithmic bins
BinWidth = 10
Bin(x) = floor(-log10(x)/BinWidth)*BinWidth + BinWidth*0.5
set boxwidth BinWidth
set style fill solid 0.3
set offset 1,1,1,0
set grid x,y
set datafile missing "0.0" # exclude zero, otherwise it will mess up "smooth freq"
plot $Data u (Bin($1)):2 smooth freq with boxes lc "red"
### end of code
Result:
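As a cross-check on the binning in the script above, the Bin(x) function and the per-bin summing done by smooth freq could be sketched in Python as follows. This is only an illustration: log_bin_center and binned_sums are hypothetical names, and skipping non-positive values is an assumption added here because log10 is undefined for them.

```python
import math
from collections import defaultdict


def log_bin_center(x, bin_width=10):
    """Center of the bin that -log10(x) falls into (mirrors Bin(x) in the script)."""
    return math.floor(-math.log10(x) / bin_width) * bin_width + bin_width * 0.5


def binned_sums(rows, bin_width=10):
    """Sum the second column per bin, similar in spirit to 'smooth freq'."""
    totals = defaultdict(float)
    for value, count in rows:
        if value <= 0:
            continue  # assumption: zeros and negatives are dropped
        totals[log_bin_center(value, bin_width)] += count
    return dict(sorted(totals.items()))
```

For example, 5.38e-51 has -log10 of about 50.3, so with a bin width of 10 it lands in the bin centered at 55.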
Can you tell me how to specify the default cb (or z) value?
I build a 3d chart {x,y,z} or {x,y,cb}, but for different x there are different ranges of y, and as a result white bars are visible on the chart (for heatmap/colorbox). I would like to see no white stripes, and where there is no data, gnuplot would substitute the default value (for example, 0) and, accordingly, paint the field with the appropriate color for heatmap
You have several options, depending on exactly what plot mode you are using and what type of data you have. In general you can use two properties of the color assignment to get what you want:
1) out-of-bound values are mapped to the color of the extreme min or max of the colorbar. So one option is to assign a palette that has your desired "default" color at the min and max, independent of whatever palette function you use for the rest of the range
2) data values that are "missing" or "not-a-number" generally leave a hole in the grid of a pixel image or heat map that lets the background color show through.
There is a demo imageNaN.dem in the standard demo set that shows use of these features for several 2D and 3D heat map commands. The output from a heatmap generated by splot $matrixdata matrix with image is shown below. You can see extreme values pinned to the min/max of the colorbar range.
Note that if you want some color other than the background to show through, you could position a colored rectangle behind the heat map surface.
# Define the test data as a named data block
$matrixdata << EOD
0 5 4 3 0
? 2 2 0 1
Junk 1 2 3 5
NaN 0 0 3 0
Inf 3 2 0 3
-Inf 0 1 2 3
EOD
set view map
set datafile missing '?'
unset xtics
set ytics ("0" 0.0, "?" 1.0, "Junk" 2.0, "NaN" 3.0, "Inf" 4.0, "-Inf" 5.0)
set cblabel "Score"
set cbrange [ -2.0 : 7.0 ]
splot $matrixdata matrix using 1:2:(0):3 with image
@Ethan, some of my data really is missing, which results in the white gaps.
I could fill in the missing data with 0 when generating the data file, but then some files become very large and gnuplot uses up all the memory.
So I'm looking for another way to solve the problem.
My example:
For @Ethan, my code:
set arrow from 0,86400 rto graph 1, graph 0 nohead ls 5 front
#===> solution to the problem
set object rectangle from graph 0, graph 0 to graph 1, graph 1 behind fc rgbcolor 'blue' fs noborder
set pm3d map
# set pm3d interpolate 32,32
set size square
set palette rgbformulae 22,13,-31
splot inputFullPath u 2:1:(percentage($4)) notitle
and my data (for example):
0 1 0.1
0 2 0.2
0 4 0.5
# -------- {0,5..7} - white gap
# -------- {1,1..3} - white gap
1 3 0.6
1 4 0.5
1 7 0.9
I like the following linespoints plotting style:
http://www.gnuplotting.org/join-data-points-with-non-continuous-lines/
However, I have encountered an issue when I plot several lines with this style:
As you can see, the second series of points also blanks out the first series (lines and points), which I don't want to happen.
The gnuplot features that make this style possible are pointinterval and pointintervalbox.
Documentation of gnuplot:
A negative value of pointinterval, e.g. -N, means that point symbols
are drawn only for every Nth point, and that a box (actually circle)
behind each point symbol is blanked out by filling with the background
color. The command set pointintervalbox controls the radius of this
blanked-out region. It is a multiplier for the default radius, which
is equal to the point size.
http://www.bersch.net/gnuplot-doc/set-show.html#set-pointintervalbox
Since the documentation says the box is filled with the background color, I was hoping that using a transparent background would resolve the issue, but it seems that white is used instead.
Gnuplot version
gnuplot> show version long
G N U P L O T
Version 5.0 patchlevel 0 last modified 2015-01-01
Copyright (C) 1986-1993, 1998, 2004, 2007-2015
Thomas Williams, Colin Kelley and many others
gnuplot home: http://www.gnuplot.info
faq, bugs, etc: type "help FAQ"
immediate help: type "help" (plot window: hit 'h')
Compile options:
-READLINE +LIBREADLINE +HISTORY
-BACKWARDS_COMPATIBILITY +BINARY_DATA
+GD_PNG +GD_JPEG +GD_TTF +GD_GIF +ANIMATION
-USE_CWDRC +HIDDEN3D_QUADTREE
+DATASTRINGS +HISTOGRAMS +OBJECTS +STRINGVARS +MACROS +THIN_SPLINES +IMAGE +USER_LINETYPES +STATS +EXTERNAL_FUNCTIONS
Minimal Working Example (MWE):
gnuplot-space-line-mark-style.gp
reset
set terminal pngcairo transparent size 350,262 enhanced font 'Verdana,10'
show version
set output 'non-continuous_lines.png'
set border linewidth 1.5
set style line 1 lc rgb '#0060ad' lt 1 lw 2 pt 7 pi -1 ps 1.5
set style line 2 lc rgb '#0020ad' lt 1 lw 2 pt 7 pi -1 ps 1.5
set pointintervalbox 3
unset key
set ytics 1
set tics scale 0.75
set xrange [0:5]
set yrange [0:4]
plot 'plotting_data1.dat' with linespoints ls 1,\
'plotting_data2.dat' with linespoints ls 2
plotting_data1.dat
# X Y
1 2
2 3
3 2
4 1
plotting_data2.dat
# X Y
1.2 2.4
2 3.5
3 2.5
4 1.2
UPDATE
A working pgfplots solution is given on tex.stackoverflow.com
You can do a lot with gnuplot. It's just a matter of how complicated you allow it to get.
You can create the gap by plotting in two steps: first with points only, and second with vectors, which are the lines between the points shortened with a bit of geometry.
The parameter L1 determines the gap and needs to be adjusted to the data and graph scale. Tested with gnuplot 5.0 and 5.2.
Revised version:
Here is the version that creates gaps independent of the terminal size and the graph scale. It just requires a bit more scaling. However, it needs the sizes of the terminal and the graph, which are stored in GPVAL_... variables that are only available after plotting, so the procedure unfortunately requires replotting.
I'm not sure whether this works for all terminals. I just tested on a wxt terminal.
Empirical findings (for wxt-terminal on Win7):
pointsize 100 (ps) corresponds to 600 pixels (px), hence: Rpxps=6 (ratio pixel to pointsize )
term size 400,400 (px) corresponds to 8000,8000 terminal units (tu), hence: Rtupx=20 (ratio terminal units to pixels)
Edit: the factor Rtupx apparently is different for different terminals: wxt: 20, qt: 10, pngcairo: 1, you could use the variable GPVAL_TERM for checking the terminal.
Rtupx = 1. # for pngcairo terminal 1 tu/px
if (GPVAL_TERM eq "wxt") { Rtupx = 20. } # 20 tu/px, 20 terminal units per pixel
if (GPVAL_TERM eq "qt") { Rtupx = 10. } # 10 tu/px, 10 terminal units per pixel
The ratios of axis units (au) to terminal units (tu) are different for x and y and are:
Rxautu = (GPVAL_X_MAX-GPVAL_X_MIN)/(GPVAL_TERM_XMAX-GPVAL_TERM_XMIN)
Ryautu = (GPVAL_Y_MAX-GPVAL_Y_MIN)/(GPVAL_TERM_YMAX-GPVAL_TERM_YMIN)
The variable GapSize is given in pointsize units. Actually, the real gap size depends on the pointsize (and also linewidth of the line). For simplicity, here gap size means the distance from the center of the point to where the line starts. So, GapSize=1.5 when having pointsize 1.5 will result in a gap of 0.75 on each side. L3(n) from the earlier version is now replaced by L3px(n) in pixel dimensions and L1 from the earlier version is not needed anymore.
Code:
### "linespoints" with gaps between lines and points
reset session
$Data1 <<EOD
# X Y
0 3
1 2
1.5 1
3 2
4 1
EOD
$Data2 <<EOD
0 0
1 1
2 1
2 2
3 1
3.98 0.98
EOD
GapSize = 1.5
Rtupx = 20. # 20 tu/px, 20 terminal units per pixel
Rpxps = 6. # 6 px/ps, 6 pixels per pointsize
# Ratio: axis units per terminal units
Rxautu(n) = (GPVAL_X_MAX-GPVAL_X_MIN)/(GPVAL_TERM_XMAX-GPVAL_TERM_XMIN)
Ryautu(n) = (GPVAL_Y_MAX-GPVAL_Y_MIN)/(GPVAL_TERM_YMAX-GPVAL_TERM_YMIN)
dXpx(n) = (x3-x0)/Rxautu(n)/Rtupx
dYpx(n) = (y3-y0)/Ryautu(n)/Rtupx
L3px(n) = sqrt(dXpx(n)**2 + dYpx(n)**2)
x1px(n) = dXpx(n)*GapSize*Rpxps/L3px(n)
y1px(n) = dYpx(n)*GapSize*Rpxps/L3px(n)
x2px(n) = dXpx(n)*(L3px(n)-GapSize*Rpxps)/L3px(n)
y2px(n) = dYpx(n)*(L3px(n)-GapSize*Rpxps)/L3px(n)
x1(n) = x1px(n)*Rtupx*Rxautu(n) + x0
y1(n) = y1px(n)*Rtupx*Ryautu(n) + y0
x2(n) = x2px(n)*Rtupx*Rxautu(n) + x0
y2(n) = y2px(n)*Rtupx*Ryautu(n) + y0
set style line 1 pt 7 ps 1.5 lc rgb "black"
set style line 2 lw 2 lc rgb "black"
set style line 3 pt 7 ps 1.5 lc rgb "red"
set style line 4 lw 2 lc rgb "red"
plot \
$Data1 u (x3=NaN, y3=NaN,$1):2 w p ls 1 notitle, \
$Data1 u (y0=y3,y3=$2,x0=x3,x3=$1,x1(0)):(y1(0)): \
(x2(0)-x1(0)):(y2(0)-y1(0)) w vectors ls 2 nohead notitle, \
$Data2 u (x3=NaN, y3=NaN,$1):2 w p ls 3 notitle, \
$Data2 u (y0=y3,y3=$2,x0=x3,x3=$1,x1(0)):(y1(0)): \
(x2(0)-x1(0)):(y2(0)-y1(0)) w vectors ls 4 nohead notitle
replot
### end of code
Result: (two different terminal sizes)
Explanations:
Question: Why is there the argument (n) for L3(n), x1(n), y1(n), x2(n), y2(n)?
n is always 0 when L3(n),... are computed and is not used on the right hand side.
Answer:
To make them non constant-expressions. Alternatively, one could
add x0,x3,y0,y3 as variables, e.g. L3(x0, y0, x3, y3); however, the
compactness would be lost.
Question: What does the using part in plot $Data1 using (x3=NaN,y3=NaN,$1):2 mean?
Answer:
(,) is called a serial evaluation which is documented under the
section Expressions > Operator > Binary in the gnuplot documentation
(only v4.4 or newer).
Serial evaluation occurs only in parentheses and is guaranteed to
proceed in left to right order. The value of the rightmost subexpression
is returned.
This is done here for the initialization of (x3,y3) for the
subsequent plot of the line segments as vectors. It is irrelevant for
the plotting of points.
Question: How does this draw N-1 segments/vectors for N points?
Answer:
Setting x3=NaN, y3=NaN when plotting points ensures that for the
first data point the initial data point (x0,y0) is set to (NaN,NaN)
which has the consequence that the evaluation of x1(0) and y1(0) also returns NaN.
Gnuplot in general skips points with NaN, i.e. for the first
data point no vector is drawn. The code draws the line between the
first and second point when the iteration reaches the second point.
Question: How does the second plot '' u ... iterate over all points?
Answer:
gnuplot> h special-filenames explains this:
There are a few filenames that have a special meaning: '', '-', '+' and '++'.
The empty filename '' tells gnuplot to re-use the previous input file in the
same plot command. So to plot two columns from the same input file:
plot 'filename' using 1:2, '' using 1:3
Question: Do we need the parentheses around (y1(0))?
Answer: gnuplot> h using explains this:
Each may be a simple column number that selects the value from one
field of the input file, a string that matches a column label in the first
line of a data set, an expression enclosed in parentheses, or a special
function not enclosed in parentheses such as xticlabels(2).
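The geometry behind x1(n), y1(n), x2(n) and y2(n) in the script above, trimming each line segment by a fixed gap at both ends, could be sketched in Python as follows. This is a sketch under stated assumptions: shorten_segment is a hypothetical name, both points and the gap are taken to be in the same (pixel) units, and returning None for segments that are too short is an addition that the gnuplot script itself does not make.

```python
import math


def shorten_segment(p0, p1, gap):
    """Return the segment p0 -> p1 trimmed by `gap` at each end.

    Returns None when the segment is not longer than 2 * gap, because
    nothing would remain to draw (assumption added for this sketch).
    """
    dx = p1[0] - p0[0]
    dy = p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if length <= 2 * gap:
        return None
    ux, uy = dx / length, dy / length          # unit vector along the segment
    start = (p0[0] + ux * gap, p0[1] + uy * gap)
    end = (p1[0] - ux * gap, p1[1] - uy * gap)
    return start, end
```

In the gnuplot script the same trimming is done in pixel space so that the gap looks identical regardless of the axis ranges, and the result is converted back to axis units before plotting.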
I am using gnuplot's linespoints style to create a cumulative and a normal distribution graph. I have created one file that provides the data for both graphs.
I ran into a problem when trying to plot the last column of the data.
Here is my script for the second graph.
plot.plt
set term pos eps
set style data linespoints
set style line 1 lc 8 lt -1
set size 1,1
set yr [0:20]
set key below
set grid
set output 'output.eps'
plot "<awk '{i=i+$3; print $1,i}' data.dat" smooth cumulative t 'twitter' ls 1
data.dat
5.0 1 0.10
9.0 5 0.20
13.0 7 0.30
14.0 1 0.20
15.0 9 0.20
I want the x-axis to use the first column and the y-axis to use the last column, so the y-axis range must be between 0 and 1. Which part should I change? Thanks.
Using smooth cumulative is enough, no need for awk. You are doing the same operation twice, once with gnuplot and once with awk. Simply do
plot 'data.dat' using 1:3 smooth cumulative
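To see why accumulating twice is a problem, the third column of data.dat can be summed by hand. The short Python sketch below only illustrates the arithmetic (the variable names are made up for this example): one running sum ends at 1, while a running sum of the running sum, which is what awk followed by smooth cumulative amounts to, overshoots.

```python
import numpy as np

# Third column of data.dat
y = np.array([0.10, 0.20, 0.30, 0.20, 0.20])

once = np.cumsum(y)      # a single accumulation: 0.1, 0.3, 0.6, 0.8, 1.0
twice = np.cumsum(once)  # accumulating the running sum again: ends at 2.8

print(once)
print(twice)
```

With a single accumulation the curve stays between 0 and 1, so a y-range of [0:1] fits the plot.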
I am creating a program that solves a 3D partial differential equation using finite difference methods. This is surprisingly not the hard part, and it is technically finished.
At the end of the program, I am writing the numerical solutions to the PDE in the following format to some file (for later processing)
X Y Z C
0 0 0 0.1
0 0 1 etc etc
Where X Y and Z are spatial coordinates and C is the intensity at each location.
I found a lot of information on plotting 3D data with 2 spatial dimensions and 1 intensity. I, however, "technically" have 4D data: 3 spatial dimensions and 1 intensity.
The one piece of information I found was using this command:
splot 'datafile' u 1:2:3:4 w pm3d
Which does do the job, but since it is a rectangular prism, you can't easily see the concentration at the center of the prism.
I was imagining that the best way to do this would be to take a "chunk" out of the rectangular prism, so that you can see the intensity layers. The best analogy I could think of is how textbooks represent the layers of the earth, where they take a chunk out of the earth to show all the way down to the core.
Another way I saw in research papers was to plot an XY cross section, a YZ cross section and an XZ cross section all on the same graph.
I have tried to search for both of these, but it is very hard (for me) to articulate concisely.
Any advice would be great on the best way to represent this data!
I always find color maps to be the most helpful when interpreting data. You basically plot slices through planes with sensible information. If you have gridded data, this is very easy to do in gnuplot even without preprocessing the data file. For instance, if your data looks like this:
# x y z c
0 0 0 0.15
1 0 0 0.14
2 0 0 0.16
0 1 0 0.11
1 1 0 0.19
2 1 0 0.12
0 2 0 0.15
1 2 0 0.19
2 2 0 0.13
0 0 1 0.10
1 0 1 0.09
2 0 1 0.17
# etc
then you can make a conditional plot with gnuplot for a fixed value of x, y or z. For the z = 0 plane, this could be achieved with splot "data" u 1:2:($3 == 0 ? $4 : 1/0), that is, if 3rd column's value is 0 then plot the 4th column's value, else ignore that point. For the simple example above:
set pm3d map
splot "data" u 1:2:($3 == 0 ? $4 : 1/0)
Note that pm3d does some interpolation between data points.
If you preprocess your data or have it nicely structured like in my example, you can also use the with image style, that might be preferred over pm3d for several reasons, including smaller file sizes:
plot "data" u 1:2:4 every :::0::2 with image
Here there is no interpolation, only the actual point values. every :::0::2 above selects data blocks 0 to 2, which are the ones that belong to z = 0 in my example.
Finally, if your data is not gridded, you cannot use with image and should use pm3d instead. In this case the command should take into consideration the points that are within an acceptable distance of the plane you want to plot. This could be achieved as follows:
set pm3d map
plane_z = 0
splot "data" u 1:2:( abs($3 - plane_z) < 0.1 ? $4 : 1/0)
Above I include in the plot all the points whose z values are less than a distance 0.1 away from the plane (z = 0) I'm interested in.
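If you would rather extract the slice before handing it to gnuplot, the same tolerance test could be sketched in Python. This is a minimal sketch, assuming the four columns x, y, z, c are already loaded into an array; slice_near_plane is a hypothetical helper name, and the 0.1 tolerance simply mirrors the value used above.

```python
import numpy as np


def slice_near_plane(data, plane_z=0.0, tol=0.1):
    """Keep rows whose z value lies within `tol` of `plane_z`.

    `data` has columns x, y, z, c; the result has columns x, y, c.
    """
    data = np.asarray(data, dtype=float)
    mask = np.abs(data[:, 2] - plane_z) < tol
    return data[mask][:, [0, 1, 3]]
```

The returned three-column array could then be written out with np.savetxt and plotted as an ordinary x, y, intensity data set.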