I'm looking for a way to plot histograms in 3d to produce something like this figure http://www.gnuplot.info/demo/surface1.17.png but where each series is a histogram.
I'm using the procedure given here https://stackoverflow.com/a/19596160 and http://www.gnuplotting.org/calculating-histograms/ to produce histograms, and it works perfectly in 2d.
Basically, the commands I use are
hist = 'u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart):(1) smooth freq w boxes'
plot 'data.txt' @hist
Now I would just like to add multiple histograms in the same plot, but because they overlap in 2d, I would like to space them out in a 3d plot.
I have tried to do the following command (using above procedure)
hist = 'u (1):(binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart):(1) smooth freq w boxes'
splot 'data.txt' @hist
But gnuplot complains that the z values are undefined.
I don't understand why this would not put a histogram along the value 1 on the x-axis with the bins along the y-axis, and plot the height on the z-axis.
My data is formatted simply in two columns:
Index angle
0 92.046
1 91.331
2 86.604
3 88.446
4 85.384
5 85.975
6 88.566
7 90.575
I have 10 files like this, and since the values in the files are close to each other, they will completely overlap if I plot them all in one 2d histogram. Therefore, I would like to see 10 histograms behind each other in a sort of 3d perspective.
This second answer is distinct from my first. Whereas the first addresses what the OP was trying to accomplish, this second provides an alternative approach which addresses the underlying problem the OP was trying to overcome.
I have posted an answer that addresses the ability to do this in 3d. However, this isn't usually the best way to do this with multiple histograms like this. A 3d graph like that will be difficult to compare.
We can address the overlap in 2D by staggering the position of the boxes. With default settings, the boxes will spread out to touch. We can turn that off and adjust the position of the boxes to allow more than one histogram on a graph. Remember that the coordinates you supply are the centers of the boxes.
Suppose that I have the data you have provided and this additional data set
Index Angle
0 85.0804
1 92.2482
2 90.0384
3 99.2974
4 87.729
5 94.6049
6 86.703
7 97.9413
We can set the boxwidth to 2 units with set boxwidth 2 (your bins are 4 units wide). Additionally, we will turn on box filling with set style fill solid border lc black.
Then I can issue
plot datafile1 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart):(1) smooth freq w boxes, \
datafile2 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart+1):(1) smooth freq w boxes
The second plot command is identical to the first, except for the +1 after binstart. This will shift this box 1 unit to the right. This produces
Here, the two series are clear. Keeping track of which box is associated with each is easy because of the overlap, but it is not enough to mask the other series.
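To see which box centers the two plot commands above produce, the binning and the shift can be reproduced outside gnuplot. This is only a sketch in Python: the function names are made up for illustration, and it assumes binstart = -100 and binwidth = 4 together with the two data sets listed above.

```python
from collections import Counter
from math import floor

def bin_center(value, binstart, binwidth):
    """Center of the bin containing value (same formula as the gnuplot using expression)."""
    return binwidth * (floor((value - binstart) / binwidth) + 0.5) + binstart

def shifted_histogram(values, binstart, binwidth, shift=0.0):
    """Map each box center (plus an optional shift) to the number of values in that bin."""
    return dict(Counter(bin_center(v, binstart, binwidth) + shift for v in values))

# Angle columns of the two data sets shown above.
angles1 = [92.046, 91.331, 86.604, 88.446, 85.384, 85.975, 88.566, 90.575]
angles2 = [85.0804, 92.2482, 90.0384, 99.2974, 87.729, 94.6049, 86.703, 97.9413]

# binstart = -100 and binwidth = 4 are assumed values; the second series is shifted by +1.
for label, values, shift in (("datafile1", angles1, 0.0), ("datafile2", angles2, 1.0)):
    for center, count in sorted(shifted_histogram(values, -100, 4, shift).items()):
        print(label, center, count)
```

With a box width of 2, centers that differ by 1 give boxes that overlap by half their width, which is the staggered look described above.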
We can even move them next to each other, with no overlap, by subtracting 1 from the first plot command:
plot datafile1 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart-1):(1) smooth freq w boxes, \
datafile2 u (binwidth*(floor(($2-binstart)/binwidth)+0.5)+binstart+1):(1) smooth freq w boxes
producing
This first answer is distinct from my second. This answer addresses what the OP was trying to accomplish, whereas the second addresses the underlying problem the OP was trying to overcome.
Gnuplot isn't going to be able to do this on its own, as the relevant styles (boxes and histograms) only work in 2D. You would have to do it using an external program.
For example, using your data and your 2d command (your first command), we get (using your data and the linked values of -100 and 4 for binstart and binwidth)
To draw these boxes on the 3d grid, we will need to use the lines style and have four points for each box: lower left, upper left, upper right, and lower right. We could use the previous command and capture its output to a table, but this only gives the upper center point of each box. We can use an external program to pre-process the data instead. The following python program, makehist.py, does just that.
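from sys import argv
import re
from math import floor

pat = re.compile(r"\s+")  # columns are separated by whitespace
fname = argv[1]
binstart = float(argv[2])
binwidth = float(argv[3])

# Read every row after the header line as a tuple of floats.
data = [tuple(map(float, pat.split(x.strip()))) for x in open(fname, "r").readlines()[1:]]

# Count the values (last column) in each bin, keyed by the bin center.
counts = {}
for x in data:
    bn = binwidth*(floor((x[-1]-binstart)/binwidth)+0.5)+binstart
    if bn not in counts:
        counts[bn] = 0
    counts[bn] += 1

# Print the four corners of each box, from left to right.
for x in sorted(counts.keys()):
    count = counts[x]
    print(x-binwidth/2, 0)
    print(x-binwidth/2, count)
    print(x+binwidth/2, count)
    print(x+binwidth/2, 0)

# Two extra points draw a line along the bottom of all the boxes.
print(max(counts.keys())+binwidth/2, 0)
print(min(counts.keys())-binwidth/2, 0)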
Essentially, this program does the same thing as the smooth frequency option does, but instead of getting the upper center of each box, we get the four previously mentioned points along with two points to draw a line along the bottom of all the boxes.
Running the following command,
plot "< makehist.py data.txt -100 4" u 1:2 with lines
produces
which looks very similar to the original graph. We can use this in a 3d plot
splot "< makehist.py data.txt -100 4" u (1):1:2 with lines
which produces
This isn't all that pretty, but does lay the histogram out on a 3d plot. The same technique can be used to add multiple data files onto it spread out. For example, with the additional data
Index Angle
0 85.0804
1 92.2482
2 90.0384
3 99.2974
4 87.729
5 94.6049
6 86.703
7 97.9413
We can use
splot "< makehist.py data.txt -100 4" u (1):1:2 with lines, \
"< makehist.py data2.txt -100 4" u (2):1:2 with lines
to produce
I have a large dataset which I need to plot in loglog scale in Gnuplot, like this:
set log xy
plot 'A_1D_l0.25_L1024_r0.dat' u 1:($2-512)
LogLogPlot of my datapoints
Text file with the datapoints
Datapoints on the x axis are equally spaced, but because of the logscale they get very dense on the right part of the graph, and as a result the output file (I finally export it in .tex) gets very large.
In linear scale, I would simply use the option every to reduce the number of points which get plotted. Is there a similar option for loglogscale, such that the plotted points appear equally spaced?
I am aware of a similar question which was raised a few years ago, but in my opinion the solution is unsatisfactory: plotted points are not equally spaced along the x-axis. I think this is a really unsophisticated problem which deserves a clearer solution.
As I understand it, you don't want to plot the actual data points; you just want to plot a line through them. But you want to keep the appearance of points rather than a line. Is that right?
set log xy
plot 'A_1D_l0.25_L1024_r0.dat' u 1:($2-512) with lines dashtype '.' lw 2
Amended answer
If it is important to present outliers/errors in the data set then you must not use every or any other technique that simply discards or skips most of the data points. In that case I would prefer the plot with points that you show in the original question, perhaps modified to represent each point as a dot rather than a cross. I will simulate this by modifying a single point in your 500000 point data set (first figure below). But I would also suggest that the presence of outliers is even more apparent if you plot with lines (second figure below).
Showing error bounds is another alternative for noisy data, but the options depend on what you have to work with in your data set. If you want to pursue that, please ask a separate question.
If you really want to reduce the number of data to be plotted, you might consider the following script.
s = 0.1 ### sampling interval in log scale
### (try 0.05 for more detail)
c = log10(0.01) ### a parameter used in sampler(x)
### which should be initialized by
### smaller value than any x in log scale
sampler(x) = (x>0 && log10(x)>=c) ? (c=ceil(log10(x)/s+0.5)*s, x) : NaN
set log xy
set grid xtics
plot 'A_1D_l0.25_L1024_r0.dat' using (sampler($1)):($2-512) with points pt 7 lt 1 notitle , \
'A_1D_l0.25_L1024_r0.dat' using 1:($2-512) with lines lt 1 notitle
This script samples the data in increments of roughly 0.1 on x-axis in log scale. It makes use of the property that points whose x value is evaluated as NaN in using are not drawn.
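The same thinning rule can be tried outside gnuplot, for example to pre-filter the file before plotting. The following is only a sketch in Python: the function name log_thin and its defaults are assumptions for illustration, and it mirrors the sampler(x) logic above rather than anything built into gnuplot.

```python
import math

def log_thin(xs, step=0.1, start=0.01):
    """Return the indices of xs to keep so that kept points are roughly
    `step` apart in log10(x). Assumes xs is sorted in increasing order."""
    threshold = math.log10(start)  # plays the role of c in the gnuplot script
    kept = []
    for i, x in enumerate(xs):
        if x > 0 and math.log10(x) >= threshold:
            kept.append(i)
            # Move the threshold to the next grid line, as sampler(x) does.
            threshold = math.ceil(math.log10(x) / step + 0.5) * step
    return kept

# Example: 1000 equally spaced x values spanning three decades.
xs = list(range(1, 1001))
kept = log_thin(xs)
print(len(kept), "of", len(xs), "points kept")
```

Because the threshold always moves at least half a step beyond the last kept point, the number of kept points depends on the number of decades covered, not on the size of the file.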
I am plotting a graph with the pointinterval option and I would like the first point of my data to be plotted, which is not the case for the hot side of my first plot: we see the purple dashed line, but no point at the bottom left corner (around y+=0.35).
My code involves for loop and is displayed below:
plot for [i=1:words(FILES)] myDataFile(i) u (column(1)):(column(6)/word(UTAUS_ch,i)) w lp pointinterval 2 pt myPointtype(i) ps myPointsize(i) dt myDashtype(i) lt myLinetype(i) lw myLinewidth(i) lc rgb myLinecolor(i) title myTitle(i)
If I plot with pointinterval 1 we see that those points exist (see picture below).
How can I force the first point to be plotted with pointinterval?
Is it possible to plot half of my points every 2 points and the other half every 2 points but with an offset of 1 point?
I do not think you will be able to do what you want using the pointinterval property. It is designed so that the offset of the initial point increases by one for each plot drawn, with the intention of reducing the chance that point symbols from successive plots will overlap. This is exactly opposite to what you are trying to do.
Therefore I suggest not plotting each dataset with linespoints pi N. Instead plot each dataset twice, once with lines and once with points using a filter in the using specifier like this:
plot FOO using 1:2 with lines, '' using ((int($0)%N) ? NaN : $1) : 2 with points
The filter (int($0)%N ? NaN : $1) suppresses all points whose line number is not evenly divisible by N. This is essentially what the pointinterval property does, except that pointinterval skips out-of-range points and otherwise unplottable points rather than strictly using the line number as an index.
Edit If individual offset values are required because x-coordinates are not consistent:
array offset[N] = [1,1,2,-1, and so on]
plot for [i=1:N] \
MyDataFile(i) using 1:2 with lines, \
'' using (((int($0)+offset[i]) % N) ? NaN : $1) : 2 with points
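To check which rows survive the filter for a given N and offset before plotting, the index arithmetic can be reproduced in a few lines. This is a sketch in Python with a made-up helper name, not part of gnuplot; it only mirrors the (int($0)+offset) % N test used above.

```python
def rows_with_points(n_rows, n, offset=0):
    """Zero-based row numbers (gnuplot's $0) that keep their point symbol:
    those for which (row + offset) is evenly divisible by n."""
    return [row for row in range(n_rows) if (row + offset) % n == 0]

# Two data sets of seven rows each, both with N = 2 but staggered by one row.
print(rows_with_points(7, 2))            # first data set, no offset
print(rows_with_points(7, 2, offset=1))  # second data set, shifted by one row
```

With offset 0 the first row always gets a symbol, which is the behaviour asked for in the question.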
I have a data file with blocks of x/y values. Each block contains 16 lines with x/y pairs and each block represents those positions in a different time. http://pastebin.com/0teRrfRU
I want to plot the trajectory of a specific particle. To do that, I've written plot 'pos.dat' u 2:3 every ::n:0:n:i, where n is the n-th particle and i is the time up to which I want the trajectory plotted (I can then loop over the i to generate an animation).
This runs fine, but when I add w lines nothing gets plotted, and I don't understand why. Is there a way to plot this with lines? The only alternative I see is writing a script to parse the data file and generate a new one with only the values I want (effectively acting as every), but I don't want to do that if I can do it in Gnuplot.
After a closer look at your data, your case has some peculiarities.
As in "Plotting same line number of several blocks data with gnuplot", you can plot the file into a table via with table, which removes the empty lines so that the points are connected by lines.
However, some of your particles disappear on one side and re-appear on the opposite side. If you plot this with lines you will get a line through the whole graph, which is certainly undesired. You can work around this by introducing a function Break() which returns NaN if the difference between two successive x- or y-values is larger than 90% (to be on the safe side) of the x- or y-range, respectively. The effect of NaN is that the line will be interrupted.
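The jump test can be illustrated independently of gnuplot. The sketch below is in Python, with a hypothetical function name and made-up sample coordinates; it mirrors the idea of replacing the y-value with NaN whenever a point is too far from its predecessor. As in the gnuplot version, the very first point also becomes NaN because it has no predecessor to compare with.

```python
import math

def break_jumps(points, x_range, y_range, frac=0.9):
    """Return (x, y) pairs where y is NaN if the step from the previous
    point is not smaller than frac of the x- or y-range."""
    out = []
    x0 = y0 = math.nan  # no previous point yet; comparisons with NaN are False
    for x, y in points:
        ok = abs(x - x0) < frac * x_range and abs(y - y0) < frac * y_range
        out.append((x, y if ok else math.nan))
        x0, y0 = x, y
    return out

# A particle leaving a box of size 10 on the left and re-entering on the right.
track = [(0.5, 5.0), (0.2, 5.0), (9.9, 5.0), (9.7, 5.0)]
for x, y in break_jumps(track, x_range=10.0, y_range=10.0):
    print(x, y)
```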
Code: (works with gnuplot>=5.0.0 version at the time of OP's question)
### plotting trajectories
reset session
set term gif animate delay 3 size 400,400
set output "SO30744875.gif"
set size square
FILE = 'SO30744875.dat'
set key noautotitle
stats FILE u (N=column(-1),M=column(1),$2):3 nooutput
xrange = STATS_max_x-STATS_min_x
yrange = STATS_max_y-STATS_min_y
set table $Data
plot FILE u 1:2:3 w table
unset table
Break(col1,col2) = (x0=x1,x1=column(col1), y0=y1,y1=column(col2), \
abs(x1-x0)<0.9*xrange && abs(y1-y0)<0.9*yrange ? column(col2) : NaN)
do for [i=0:N] {
plot for [j=1:16] x1=y1=NaN $Data u 2:(Break(2,3))every M::j-1::(i+1)*M w l, \
FILE u 2:3 every :::i::i w p pt 7, \
FILE u 2:3:1 every :::i::i w labels offset 0.7,0.7
}
set output
### end of code
Result:
I have a set of data, looks like:
x y z
1 1 2 1
2 3 5 7
3 -3 2 4
4 -2 1 1
so each row records the coordinates of a dot in 3-D space. I want to plot all the dots as points except for one, say no. 15, which I want to plot as a translucent sphere with a radius I can set. Then I can see from the plot which of the points in the data are inside the sphere. I'm using the RGL package right now and did the following:
> open3d()
> plot3d(readin,col=3,type="p")
> plot3d(readin[15,],col=2,add=T,type="s",radius=0.1)
So the first plot command plots the whole set as a scatter plot, and the second plot command picks the 15th row of the data, plots it as a sphere and adds it to the previous canvas. I am just wondering if I can make the sphere translucent so that I can see which dots are included in the sphere, which means those dots are very near to the one I selected.
Is there a way to do this with RGL? Or can you suggest another way to complete this task?
Thanks!
I think what you are looking for is the argument alpha.
Example
df <- data.frame(x=c(1,3,-3,-2), y=c(2,5,2,1),z=c(1,7,4,1))
library(rgl)
open3d()
plot3d(df,col=3,type="p", radius=0.5)
plot3d(df,col=rgb(1,0,0.3),alpha=0.5, add=T,type="s",radius=1)
You can plot transparent spheres using the alpha argument to spheres3d. You can rotate the plot to move the box line behind the sphere to prove it's transparent.
spheres3d(dat[4,],col=rgb(1,0,0), alpha=0.9) # transparent red.
(I tried to do it with the alpha argument to rgb but it failed.)
If you just want to find out which points are within a certain radius of point 15, then you can calculate the Euclidean distance from each point to point 15 and see which of those distances are less than the radius. No plotting is needed (though you could plot those points in a different color to highlight them). The dist function is one way to compute the distances, or it is simple to program yourself.
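As an illustration of the distance check, here is a minimal sketch. It is written in Python rather than R, the function name is made up, and it uses the four example rows from the question with the last row as the selected point, since the full data set is not shown.

```python
import math

def within_radius(points, center_index, radius):
    """Indices of all points (other than the center itself) whose Euclidean
    distance to points[center_index] is less than radius."""
    center = points[center_index]
    return [i for i, p in enumerate(points)
            if i != center_index and math.dist(p, center) < radius]

# The four example rows from the question (x, y, z).
points = [(1, 2, 1), (3, 5, 7), (-3, 2, 4), (-2, 1, 1)]

# Neighbours of the last point for a chosen radius.
print(within_radius(points, center_index=3, radius=3.5))
```

The indices returned here are zero-based, so they are one less than the row numbers R would report.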
I know on gnuplot you can plot some data with circles as the plot points:
plot 'data.txt' using 1:2 ls 1 with circles
How do I then set the size of the circles? I want to plot several sets of data but with different size circles for each data set.
If you have a third column in your data, the third column specifies the size of the circles. In your case, you could have the third column have the same value for all the points in each data set. For example:
plot '-' with circles
1 1 0.2
e
will plot a circle at (1,1) with radius 0.2. Note that the radius is in the same units as the data. (The special file name '-' lets you input data directly; typing 'e' ends the input. Type help special at the gnuplot console for more info.)
You can look here for more ideas of how to use circles.
I used:
plot "file" using 1:2:($2*0+10) with circles
This fakes a third column specifying the sizes. It can probably be written more simply, but this worked for me.