I’m trying to plot multiple Gaussian functions on the same graph with Gnuplot, which should be quite a simple thing. The problem is that the peaks do not line up, and the result looks as though the curves have different peak heights, which they don’t. How can I fix this?
First, it helps to understand how gnuplot generates plots of functions (or really how any computer program must do it). It must convert a continuous function into some kind of discrete representation. The mathematical function to be plotted is evaluated at various points along the independent (x) axis. This creates a set of (x,y) points. A line is then drawn between these points (think "connect the dots"). As you might imagine, the number of discrete samples used affects how accurately the curve is represented, and how smooth it looks.
The problem you have noticed is that gnuplot's default sample count is too low for curves this narrow. The default is 100 samples across the visible x-axis. You can adjust the number of samples (to 1000, for example) with
set samples 1000
I have made some example plots of gaussians to illustrate this point. (I made a rough estimate of your gaussian parameters.) Each plot has a different number of samples:
Notice how the lines get too jagged if the sample size is too low. Even the default value of 100 is too low. Setting to 1000 makes it plenty smooth. This is probably more than it needs to be, but it works. If you're using a terminal that generates a bitmap image (e.g. PNG), then you shouldn't need more samples than you have width in pixels used for the x-axis plot area. If you're generating vector based output, then just pick something that "looks right" for whatever you are using it in.
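To see why too few samples flatten a narrow peak, here is a minimal Python sketch (not gnuplot code). It assumes the samples are spread evenly across the x-range with both endpoints included, which is my understanding of how gnuplot places them. The helper name sampled_peak is made up for illustration.

```python
import math

def gauss(x, a):
    # Same curve as gauss(x,a) in the gnuplot script: exp(-(x/a)**2)
    return math.exp(-(x / a) ** 2)

def sampled_peak(n, a, xmin=-20.0, xmax=20.0):
    # Assumption: n evenly spaced samples, endpoints included
    step = (xmax - xmin) / (n - 1)
    return max(gauss(xmin + i * step, a) for i in range(n))

# Highest value that actually gets drawn for the narrowest curve (a=1)
for n in (20, 100, 1000):
    print(n, round(sampled_peak(n, a=1), 3))
```

With an even sample count on a symmetric range, no sample lands exactly on x=0, so the drawn maximum falls short of 1. Under these assumptions it is about 0.33 with 20 samples and about 0.96 with 100, and the shortfall is largest for the narrowest curve.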
See the question Gnuplot x-axis resolution for more.
By the way, the code to generate the above examples is:
set terminal pngcairo size 640,480 enhanced
# Line styles
set style line 1 lw 2 lc rgb "blue"
set style line 2 lw 2 lc rgb "red"
set style line 3 lw 2 lc rgb "yellow"
# Gaussian function stuff
set yrange [0:1.1]
set xrange [-20:20]
gauss(x,a) = exp(-(x/a)**2)
eqn(a) = sprintf("y = e^{-(x/%d)^2}", a)
# First example (default)
set output "example1.png"
set title "100 samples (default)"
plot gauss(x,8) ls 1 title eqn(8), \
gauss(x,2) ls 2 title eqn(2), \
gauss(x,1) ls 3 title eqn(1)
# Second example (too low)
set output "example2.png"
set title "20 samples (too low)"
set samples 20
replot
# Third example (plenty high)
set output "example3.png"
set title "1000 samples (plenty high)"
set samples 1000
replot
I have a large dataset which I need to plot in loglog scale in Gnuplot, like this:
set log xy
plot 'A_1D_l0.25_L1024_r0.dat' u 1:($2-512)
LogLogPlot of my datapoints
Text file with the datapoints
Datapoints on the x axis are equally spaced, but because of the logscale they get very dense on the right part of the graph, and as a result the output file (I finally export it in .tex) gets very large.
In linear scale, I would simply use the option every to reduce the number of points which get plotted. Is there a similar option for loglogscale, such that the plotted points appear equally spaced?
I am aware of a similar question which was raised a few years ago, but in my opinion the solution is unsatisfactory: plotted points are not equally spaced along the x-axis. I think this is a really unsophisticated problem which deserves a clearer solution.
As I understand it, you don't want to plot the actual data points; you just want to plot a line through them. But you want to keep the appearance of points rather than a line. Is that right?
set log xy
plot 'A_1D_l0.25_L1024_r0.dat' u 1:($2-512) with lines dashtype '.' lw 2
Amended answer
If it is important to present outliers/errors in the data set then you must not use every or any other technique that simply discards or skips most of the data points. In that case I would prefer the plot with points that you show in the original question, perhaps modified to represent each point as a dot rather than a cross. I will simulate this by modifying a single point in your 500000 point data set (first figure below). But I would also suggest that the presence of outliers is even more apparent if you plot with lines (second figure below).
Showing error bounds is another alternative for noisy data, but the options depend on what you have to work with in your data set. If you want to pursue that, please ask a separate question.
If you really want to reduce the number of data to be plotted, you might consider the following script.
s = 0.1 ### sampling interval in log scale
### (try 0.05 for more detail)
c = log10(0.01) ### a parameter used in sampler(x)
### which should be initialized by
### smaller value than any x in log scale
sampler(x) = (x>0 && log10(x)>=c) ? (c=ceil(log10(x)/s+0.5)*s, x) : NaN
set log xy
set grid xtics
plot 'A_1D_l0.25_L1024_r0.dat' using (sampler($1)):($2-512) with points pt 7 lt 1 notitle , \
'A_1D_l0.25_L1024_r0.dat' using 1:($2-512) with lines lt 1 notitle
This script samples the data in increments of roughly 0.1 on x-axis in log scale. It makes use of the property that points whose x value is evaluated as NaN in using are not drawn.
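The thinning rule in sampler(x) can be tried outside gnuplot with a short Python sketch. The function name log_thin and the equally spaced test data are made up for illustration; the threshold update itself is copied from the script above.

```python
import math

def log_thin(xs, s=0.1, start=0.01):
    # c is the log10 threshold that the next kept x must reach
    c = math.log10(start)
    kept = []
    for x in xs:
        if x > 0 and math.log10(x) >= c:
            kept.append(x)
            # Same update as in sampler(x): move the threshold past this point
            c = math.ceil(math.log10(x) / s + 0.5) * s
    return kept

xs = list(range(1, 100001))  # made-up, equally spaced x values
kept = log_thin(xs)
print(len(xs), "->", len(kept))
```

Because the threshold always moves at least half a sampling interval beyond the point just kept, consecutive kept points are at least 0.05 apart in log10(x) for s = 0.1, which bounds the number of plotted points per decade.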
I'm looking for a way to plot histograms in 3d to produce something like this figure http://www.gnuplot.info/demo/surface1.17.png but where each series is a histogram.
I'm using the procedure given here https://stackoverflow.com/a/19596160 and http://www.gnuplotting.org/calculating-histograms/ to produce histograms, and it works perfectly in 2d.
Basically, the commands I use are
hist = 'u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart):(1) smooth freq w boxes'
plot 'data.txt' @hist
Now I would just like to add multiple histograms in the same plot, but because they overlap in 2d, I would like to space them out in a 3d plot.
I have tried to do the following command (using above procedure)
hist = 'u (1):(binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart):(1) smooth freq w boxes'
splot 'data.txt' @hist
But gnuplot complains that the z values are undefined.
I don't understand why this would not put a histogram along the value 1 on the x-axis with the bins along the y-axis, and plot the height on the z-axis.
My data is formatted simply in two columns:
Index angle
0 92.046
1 91.331
2 86.604
3 88.446
4 85.384
5 85.975
6 88.566
7 90.575
I have 10 files like this, and since the values in the files are close to each other, they will completely overlap if I plot them all in one 2d histogram. Therefore, I would like to see 10 histograms behind each other in a sort of 3d perspective.
This second answer is distinct from my first. Whereas the first addresses what the OP was trying to accomplish, this second provides an alternative approach which addresses the underlying problem the OP was trying to overcome.
I have posted an answer that addresses the ability to do this in 3d. However, that isn't usually the best way to present multiple histograms like this. Histograms in a 3d graph like that are difficult to compare.
We can address the overlap in 2D by staggering the position of the boxes. With default settings, the boxes will spread out to touch. We can turn that off and adjust the position of the boxes to allow more than one histogram on a graph. Remember that the coordinates you supply are the centers of the boxes.
Suppose that I have the data you have provided and this additional data set
Index Angle
0 85.0804
1 92.2482
2 90.0384
3 99.2974
4 87.729
5 94.6049
6 86.703
7 97.9413
We can set the boxwidth to 2 units with set boxwidth 2 (your bins are 4 units wide). Additionally, we will turn on box filling with set style fill solid border lc black.
Then I can issue
plot datafile1 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart):(1) smooth freq w boxes, \
datafile2 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart+1):(1) smooth freq w boxes
The second plot command is identical to the first, except for the +1 after binstart. This will shift this box 1 unit to the right. This produces
Here, the two series are clear. Keeping track of which box is associated with each is easy because of the overlap, but it is not enough to mask the other series.
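As a cross-check of where the boxes end up, here is a small Python sketch that emulates the binning expression and the counting done by smooth freq for the two data sets. It assumes binstart = -100 and binwidth = 4; the function names are made up for illustration.

```python
import math
from collections import Counter

def bin_center(value, binstart, binwidth):
    # Same expression as in the using clause of the plot command
    return binwidth * (math.floor((value - binstart) / binwidth) + 0.5) + binstart

def histogram(values, binstart, binwidth, shift=0.0):
    # Emulates "smooth freq" with y=1: count the values per (shifted) box center
    return Counter(bin_center(v, binstart, binwidth) + shift for v in values)

data1 = [92.046, 91.331, 86.604, 88.446, 85.384, 85.975, 88.566, 90.575]
data2 = [85.0804, 92.2482, 90.0384, 99.2974, 87.729, 94.6049, 86.703, 97.9413]

h1 = histogram(data1, binstart=-100, binwidth=4)             # unshifted centers
h2 = histogram(data2, binstart=-100, binwidth=4, shift=1.0)  # centers moved right by 1
print(sorted(h1.items()))
print(sorted(h2.items()))
```

With a box width of 2, a box of the first series centered at 86 covers 85 to 87 and the shifted box of the second series centered at 87 covers 86 to 88, so the two overlap by one unit.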
We can even move them next to each other, with no overlap, by subtracting 1 from the first plot command:
plot datafile1 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart-1):(1) smooth freq w boxes, \
datafile2 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart+1):(1) smooth freq w boxes
producing
This first answer is distinct from my second. This answer addresses what the OP was trying to accomplish, whereas the second addresses the underlying problem the OP was trying to overcome.
Gnuplot isn't going to be able to do this on its own, as the relevant styles (boxes and histograms) only work in 2D. You would have to do it using an external program.
For example, using your data and your 2d command (your first command), we get (using your data and the linked values of -100 and 4 for binstart and binwidth)
To draw these boxes on the 3d grid, we will need to use the lines style and have four points for each box: lower left, upper left, upper right, and lower right. We can use the previous command and capture its output to a table, but this only gives the upper center point of each box. We can use an external program to pre-process the data, however. The following Python program, makehist.py, does just that.
from sys import argv
import re
from math import floor

pat = re.compile(r"\s+")
fname = argv[1]
binstart = float(argv[2])
binwidth = float(argv[3])

# Read the whitespace-separated columns, skipping the header line
with open(fname, "r") as f:
    data = [tuple(map(float, pat.split(x.strip()))) for x in f.readlines()[1:]]

# Count the values (last column) per bin, keyed by the bin center
counts = {}
for x in data:
    bn = binwidth*(floor((x[-1]-binstart)/binwidth)+0.5)+binstart
    if bn not in counts: counts[bn] = 0
    counts[bn] += 1

# Emit the four corners of each box
for x in sorted(counts.keys()):
    count = counts[x]
    print(x-binwidth/2, 0)
    print(x-binwidth/2, count)
    print(x+binwidth/2, count)
    print(x+binwidth/2, 0)

# Draw a line back along the bottom of all the boxes
print(max(counts.keys())+binwidth/2, 0)
print(min(counts.keys())-binwidth/2, 0)
Essentially, this program does the same thing as the smooth frequency option does, but instead of getting the upper center of each box, we get the four previously mentioned points along with two points to draw a line along the bottom of all the boxes.
Running the following command,
plot "< makehist.py data.txt -100 4" u 1:2 with lines
produces
which looks very similar to the original graph. We can use this in a 3d plot
splot "< makehist.py data.txt -100 4" u (1):1:2 with lines
which produces
This isn't all that pretty, but does lay the histogram out on a 3d plot. The same technique can be used to add multiple data files onto it spread out. For example, with the additional data
Index Angle
0 85.0804
1 92.2482
2 90.0384
3 99.2974
4 87.729
5 94.6049
6 86.703
7 97.9413
We can use
splot "< makehist.py data.txt -100 4" u (1):1:2 with lines, \
"< makehist.py data2.txt -100 4" u (2):1:2 with lines
to produce
I have a data file with blocks of x/y values. Each block contains 16 lines with x/y pairs and each block represents those positions in a different time. http://pastebin.com/0teRrfRU
I want to plot the trajectory of a specific particle. To do that, I've written plot 'pos.dat' u 2:3 every ::n:0:n:i, where n is the n-th particle and i is the time up to which I want the trajectory plotted (I can then loop over the i to generate an animation).
This runs fine, but when I add w lines nothing gets plotted, and I don't understand why. Is there a way to plot this with lines? The only alternative I see is writing a script to parse the data file and generate a new one with only the values I want (effectively acting as every), but I don't want to do that if I can do it in Gnuplot.
After a closer look at your data, your case turns out to be a bit special.
As in Plotting same line number of several blocks data with gnuplot, you can plot the file into a table via with table, which removes the empty lines so that the points get connected by lines.
However, some of your particles disappear on one side and reappear on the opposite side. If you plot this with lines, you will get a line across the whole graph, which is certainly undesired. You can work around this by introducing a function Break() which returns NaN if the difference between two successive x- or y-values is larger than 90% (to be on the safe side) of the x- or y-range, respectively. The effect of NaN is that the line is interrupted.
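The idea behind Break() can be sketched in a few lines of Python. This is an illustration of the technique rather than a line-by-line port of the gnuplot function: the name break_jumps and the sample trajectory are made up, and the first point is simply kept.

```python
import math

def break_jumps(points, x_range, y_range, frac=0.9):
    # Replace y with NaN wherever the step from the previous point is
    # at least frac of the plot range, i.e. the particle wrapped around
    out = []
    prev = None
    for x, y in points:
        wrapped = prev is not None and (
            abs(x - prev[0]) >= frac * x_range
            or abs(y - prev[1]) >= frac * y_range
        )
        out.append((x, math.nan) if wrapped else (x, y))
        prev = (x, y)
    return out

# Made-up trajectory: the particle leaves on the left and reappears on the right
track = [(0.10, 0.5), (0.05, 0.5), (0.98, 0.5), (0.95, 0.5)]
for x, y in break_jumps(track, x_range=1.0, y_range=1.0):
    print(x, y)
```

Only the point right after the wrap-around gets a NaN, which is what interrupts the line at that position while the rest of the trajectory stays connected.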
Code: (works with gnuplot>=5.0.0 version at the time of OP's question)
### plotting trajectories
reset session
set term gif animate delay 3 size 400,400
set output "SO30744875.gif"
set size square
FILE = 'SO30744875.dat'
set key noautotitle
stats FILE u (N=column(-1),M=column(1),$2):3 nooutput
xrange = STATS_max_x-STATS_min_x
yrange = STATS_max_y-STATS_min_y
set table $Data
plot FILE u 1:2:3 w table
unset table
Break(col1,col2) = (x0=x1,x1=column(col1), y0=y1,y1=column(col2), \
abs(x1-x0)<0.9*xrange && abs(y1-y0)<0.9*yrange ? column(col2) : NaN)
do for [i=0:N] {
    plot for [j=1:16] x1=y1=NaN $Data u 2:(Break(2,3)) every M::j-1::(i+1)*M w l, \
         FILE u 2:3 every :::i::i w p pt 7, \
         FILE u 2:3:1 every :::i::i w labels offset 0.7,0.7
}
set output
### end of code
Result:
I am trying to reproduce a simple histogram in Gnuplot with this simple macro:
reset
n=9 #number of intervals
width=1 #interval width
hist(x,width)=width*floor(x/width)
set terminal pngcairo size 800,500 enhanced font 'Verdana,14'
set output "test.png"
set boxwidth width
set style fill transparent solid 0.5 border #fillstyle
set xrange [*:*]
set yrange [0:2.]
set xlabel "x"
set ylabel "Freq."
plot "const.dat" u (hist($1,width)) smooth freq w boxes lc rgb "orange" notitle
with the following data:
1.1
1.1
1.1
1.1
1.1
1.1
1.1
1.1
Now I would like to understand how hist(x,width) works. As I understand it,
hist(x,width)=width*floor(x/width)
works for any number; taking width=1, for example:
hist(1.1,1)=1*floor(1.1/1)=1
and so on, right?
Now (hist($1,width)) takes all the elements in the column and applies the hist function to each one.
With the macro above I am able to make the following plot:
Question:
If I use (hist($1,width)):(1.0), I don't understand why the plot changes so that all the elements end up in one single box (from 0.5 to 1.5).
In the first case you specify only a single column in the using statement. Since you need at least two (x- and y-value), the specified value (your hist(...)) is used as the y-value and the row number as the x-value. The smooth frequency option takes all points with the same x-value and sums up the corresponding y-values. In your first example there are no equal x-values, since the row number is used.
In the second example, you use the hist(...) value as x-value, which is 1 for all rows. The y-value is 1.0. So you get a single box at x=1 and y=8 (the number of rows).
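The difference between the two cases can be reproduced with a small Python emulation of what smooth frequency does as described above (sum the y-values of points that share an x-value). The function smooth_freq is a made-up stand-in, not gnuplot's implementation, and the row number is assumed to start at 0.

```python
import math
from collections import defaultdict

def hist(x, width):
    # Same as the gnuplot function hist(x,width)
    return width * math.floor(x / width)

def smooth_freq(points):
    # Sum the y-values of all points that share the same x-value
    sums = defaultdict(float)
    for x, y in points:
        sums[x] += y
    return dict(sums)

data = [1.1] * 8

# using (hist($1,width)): x is the row number, y is hist(...)
one_column = smooth_freq((row, hist(v, 1)) for row, v in enumerate(data))
# using (hist($1,width)):(1.0): x is hist(...), y is 1.0
two_columns = smooth_freq((hist(v, 1), 1.0) for v in data)

print(one_column)   # eight separate x-values, each with y = 1
print(two_columns)  # a single x-value (1) with y = 8
```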
I am encountering problems while trying to create a 3D (2D mapped) graph.
The data I am generating should create a 3 dimensional normal distribution bump, or, when "mapped", it should look like a flattened 3D graph, with color used as the third dimension
The script I am using to generate the mapped graph is the following:
#!/usr/bin/gnuplot
reset
#set terminal png
set term postscript eps enhanced
set size square
set xlabel "X position"
set ylabel "Y position"
#set zlabel "Synaptic Strength"
#Have a gradient of colors from blue (low) to red (high)
set pm3d map
set palette rgbformulae 22,13,-31
#set xrange [0:110]
#set yrange [0:80]
#set zrange [0:1]
set style line 1 lw 1
#set title "Title"
#Don't want a key
unset key
#set the number of samples
set dgrid3d 51,51
set hidden3d
splot DataFile u 1:2:3
when I run it on the following DataFile (http://www.sendspace.com/file/ppibyw)
I get the following output
The legend indicates a z-range of 0-0.03, however, the datafile has far larger z-values, such as 0.1. Obviously I can't publish a graph that is so inaccurate. Furthermore, I need a better graph in order to gain a better insight as to what is wrong with my simulation.
Does anyone know why gnuplot handles 3d mapped graphs like this? I suspect it has to do with the number, and nature, of the samples.
Your problem is in the set dgrid3d 51,51
Have a look at what happens if you write set dgrid3d 51,102 (much better) or set dgrid3d 51,500 (much worse)
The point is that (from the help)
The grid is equally spaced in x
(rows) and in y (columns); the z
values are computed as weighted
averages or spline interpolations of
the scattered points' z values. In
other words, a regularly spaced grid
is created and the a smooth
approximation to the raw data is
evaluated for all grid points. Only
this approximation is plotted, but
not the raw data.
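Why a weighted average lowers a sharp peak can be seen in a tiny one-dimensional Python sketch. It uses plain inverse-distance weighting as a stand-in and is not gnuplot's actual dgrid3d code; the data values and the function name idw are made up.

```python
def idw(grid_x, xs, zs, power=1):
    # Inverse-distance weighted average of the scattered z-values at grid_x
    total_w = 0.0
    total = 0.0
    for x, z in zip(xs, zs):
        d = abs(grid_x - x)
        if d == 0:
            return z  # grid point sits exactly on a data point
        w = 1.0 / d ** power
        total_w += w
        total += w * z
    return total / total_w

# Made-up data with one sharp peak of height 0.1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
zs = [0.0, 0.01, 0.1, 0.01, 0.0]

# Grid points that fall between the data points
grid = [0.5, 1.5, 2.5, 3.5]
smoothed = [idw(g, xs, zs) for g in grid]
print(max(zs), max(smoothed))
```

In this example the largest gridded value is roughly 0.04 although the raw data reach 0.1, because every grid value mixes in the small neighbouring values. The colour scale of a gridded plot can therefore stop well below the true maximum.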
You could try to improve the approximation if you want (see the help, ?dgrid3d), but I would rather just plot the data directly. You can do this by dropping the dgrid3d command altogether. You will have to modify your data file so that there is a blank line whenever the x coordinate changes. For example
3.10000000000000142109 4.15692193816530508599 0.00004084299890679580
3.10000000000000142109 4.33012701892219364908 0.00001123746243460237

3.15000000000000124345 0.08660254037844386521 0.00000816290100763514
3.15000000000000124345 0.25980762113533162339 0.00001935936190868058
Then with this simplified script
set terminal png
#set size square
set xlabel "X position"
set ylabel "Y position"
#uncomment the next command to eliminate the mysterious glitch around x=3.4
set yrange [0.1:4.5]
set pm3d map
set output "grid_merged.png"
splot "grid_merged2.dat" u 1:2:3
set output
set term pop
I get
which is better than what you get with the interpolated plot. I'm not sure what causes the glitch around x=3.4; it's not there in other (non-mapped) views. Altering the yrange eliminates it, although I'm not sure whether changing the y-range counts as cheating in terms of your simulation results.