fit a function to a histogram created with frequency in gnuplot - plot

Intro
In gnuplot you can create a histogram from a file named hist.dat that looks like
1
2
2
2
3
by using the commands
binwidth=1
set boxwidth binwidth
bin(x,width)=width*floor(x/width) + binwidth/2.0
plot [0:5][0:*] "hist.dat" u (bin($1,binwidth)):(1.0) smooth freq with boxes
which generates a histogram like the one shown on another SO page.
Question
How can I fit my function to this histogram? I defined a Gaussian function and initialized its values by
f(x) = a*exp(-((x-m)/s)**2)
a=3; m=2.5; s=1
and in the output the function follows the histogram well.
Unfortunately, I cannot fit to this histogram using the command
fit f(x) "hist.dat" u (bin($1,binwidth)):(1.0) smooth freq via a,m,s
^
Need via and either parameter list or file
So how can I fit my function without creating a new file containing the binned values?

I'm facing a similar problem and found a not very elegant solution.
binwidth=1
set boxwidth binwidth
bin(x,width)=width*floor(x/width) + binwidth/2.0
set table 'hist.temp'
plot [0:5][0:*] "hist.dat" u (bin($1,binwidth)):(1.0) smooth freq with boxes
unset table
You can then fit the file hist.temp as you prefer. There is probably a better way of doing this, but for me it is a fast, working solution. I hope this helps.
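A minimal sketch of the remaining step, assuming the binned counts are written to hist.temp. The fit command does not accept the smooth option, which is why the direct attempt in the question fails, so the fit reads the already-binned columns instead. Note that the sample hist.dat yields only three bins; a real data set with more bins than parameters is needed for a meaningful result.

```gnuplot
# Sketch: write the binned counts to a table file, then fit that file.
binwidth = 1
bin(x,width) = width*floor(x/width) + width/2.0

# 'set table' redirects the plotted points to a text file instead of the screen.
set table 'hist.temp'
plot "hist.dat" u (bin($1,binwidth)):(1.0) smooth freq with boxes
unset table

# Gaussian and starting values taken from the question.
f(x) = a*exp(-((x-m)/s)**2)
a = 3; m = 2.5; s = 1

# The first two columns of the table hold the bin centre and the count.
fit f(x) 'hist.temp' using 1:2 via a,m,s

# Overlay the fitted curve on the binned data to check the result.
set boxwidth binwidth
plot [0:5][0:*] 'hist.temp' using 1:2 with boxes title "histogram", f(x) title "fit"
```

On gnuplot 5 a named datablock (set table $hist, then fit f(x) $hist using 1:2 via a,m,s) should avoid the temporary file altogether, though I have not checked which minor version introduced it.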
Cheers!

I used this and it worked:
gauss(x)=a/(sqrt(2*pi)*sigma)*exp(-(x-mean)**2/(2*sigma**2))
fit gauss(x) 'data.txt' via a,sigma,mean
After 83 iterations gnuplot reported values for a, sigma, and mean.

Related

How do I plot a function with different values of various parameters on gnuplot?

I want to plot this parametric function
x(t)=a+(n/w)*(cos(w*t)-1)-((g/w**2)*sin(l)-m/w)*(sin(w*t)+(g*t/w)*sin(l))
y(t)=b+((g/w**2)*sin(l)-m/w)*(cos(w*t)-1)+(n/w)*sin(w*t)
I have the parameters a, b, m, n, w, k, g, and I want to vary a, b, m, n, w, k while keeping g=7.
I don't know how to do it.
This is what I tried:
plot for [a=1:0],[b=1:0],[n=0:1],[m=1:0],[g=7:7],[w=2*pi*3:2*pi*2] x(t),y(t)
I appreciate your help. thanks!
As for using multiple for clauses in a single plot command, the syntax is documented in gnuplot's inline help:
Nested iteration is supported:
set for [i=1:9] for [j=1:9] label i*10+j sprintf("%d",i*10+j) at i,j
E.g.,
gnuplot> plot for[a=0:1] for[b=0:1] 1+a*x+b*x*x
You could now do what you want, except for one further problem: in gnuplot, numeric loop variables take only integer values, e.g.,
gnuplot> plot for [s=3.2:9.3:2.9] x title sprintf("%f", s)
so your loop on w is impossible as written, and you have to devise a different strategy.
I don't know what value you want to assign to l (I've used l=1), nor the step on w, but here is a possible implementation. The tricks are ❶ define a function that gives the values of w in terms of an integer variable and ❷ define x and y in terms of this auxiliary variable as well.
gnuplot> set parametric
gnuplot> w(k) = 2*pi*k
gnuplot> x(t, k)=a+(n/w(k))*(cos(w(k)*t)-1)-((g/w(k)**2)*sin(l)-m/w(k))*(sin(w(k)*t)+(g*t/w(k))*sin(l))
gnuplot> y(t, k) = b + ((g/w(k)**2)*sin(l)-m/w(k))*(cos(w(k)*t)-1)+(n/w(k))*sin(w(k)*t)
gnuplot> g = 7 ; l = 1
gnuplot> plot for[a=0:1] for[b=0:1] for[m=0:1] for[n=0:1] for[k=2:3] x(t,k), y(t,k) title sprintf("%d,%d,%d,%d,%d", a,b,n,m,k)
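If w needs a non-integer step, the same trick extends naturally. A minimal sketch, where w_start, w_step and the test curve sin(w(k)*x) are placeholders of mine, not values from the question:

```gnuplot
# Leave parametric mode so a plain y = f(x) curve can be plotted.
unset parametric

# Hypothetical range: start at 2*pi*2 and advance in steps of pi/2.
w_start = 2*pi*2
w_step  = 0.5*pi
w(k) = w_start + k*w_step      # k is the integer loop index

# k = 0..4 covers w from 2*pi*2 up to 2*pi*3.
plot for [k=0:4] sin(w(k)*x) title sprintf("w = %.3f", w(k))
```

The loop variable stays an integer; only the function w(k) carries the fractional values.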

Graph with Gnuplot of statistics with three different input data

I am plotting the frequency of the data sampling in an interval. The code is:
n=50 #number of intervals
plot "xxx.csv" u ($0):1 #To get the max and min value
max=GPVAL_Y_MAX
min=GPVAL_Y_MIN
width=(max-min)/n #interval width
#function used to map a value to the intervals
hist(x,width)=width*floor(x/width)
set ytic auto
set xtic auto
plot "xxx.csv" u (hist($1,width)):(1.0) smooth freq w histeps ls 1 title "xxx"
This works, but I would like to overlay two similar graphs built from different data. The problem is that the data differ, so max, min and width are not the same. The data are in separate files, such as yyy.csv and zzz.csv. How can I do this?
Do you have gnuplot >= 4.6? If so you can use the stats command to get statistics for those files easily, otherwise it would probably be a matter of doing what you did in your script (plot, then use GPVAL_Y_MIN, etc.) and create a set of variables for each data set.
(Posting my earlier comment as an answer.)
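A minimal sketch of the stats approach, assuming gnuplot 4.6 or newer and single-column files named as in the question. The prefixes Y and Z and the variables wy and wz are names I made up for illustration:

```gnuplot
n = 50                                   # number of intervals, as in the question
hist(x,width) = width*floor(x/width)     # same binning function as in the question

# Collect min/max for each file; 'name' sets the prefix of the result variables.
stats "yyy.csv" using 1 name "Y" nooutput
stats "zzz.csv" using 1 name "Z" nooutput

# One interval width per data set.
wy = (Y_max - Y_min)/n
wz = (Z_max - Z_min)/n

# Overlay both histograms in a single plot command.
plot "yyy.csv" u (hist($1,wy)):(1.0) smooth freq w histeps title "yyy", \
     "zzz.csv" u (hist($1,wz)):(1.0) smooth freq w histeps title "zzz"
```

Because each data set keeps its own width, the counts per bin are not directly comparable; using one common width for both files may be preferable if they are to be compared.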

Plotting different contour plots with similar scales in R or gnuplot

I am new to R for plotting, and I wish to make contour plots for several files. Here is what I have so far. My file has 3 columns, X, Y, Z, with some NaN values. Since lattice does not allow Inf/NaN values, I had to remove them first and do some interpolation.
data <- read.table("file", sep=",", header=T)
mydata <- na.omit(data)
library(akima)
library(lattice)
s = interp(mydata$X, mydata$Y, mydata$Z)
filled.contour(s, xlim= c(5,25), ylim=c(40,180))
This does give some results, but there are things I am not able to do:
Get contour lines on the graph.
Use a common color scale across files. There are 3 files with different z ranges, say (0-18), (0-20) and (0-25), and I wish to rescale them so that, for instance, the value 15 has the same color in all three.
I am more familiar with gnuplot, but there the problem is also with the ranges: the color range always autoscales, and it seems difficult to control. Any help with that is also deeply appreciated.
I may be doing something wrong, so if anybody could point me in the right direction, or to the right software, I will be grateful.
There are demos here for how to make contours in gnuplot. Are you having trouble in the sense that you have code to make a contour plot but it does not work?
To answer your second question, in gnuplot the command you probably want is
set cbrange [CB_MIN:CB_MAX]
This sets the range of values which will be colored according to the current palette. You would just have to issue the same set cbrange command for all three plots you are making. If you want to automatically set the cbrange to the min/max on all files, you can use the stats command (in version 4.6 or newer, otherwise it is more tricky):
stats 'datafile1' using 3 name 'd1'
stats 'datafile2' using 3 name 'd2'
stats 'datafile3' using 3 name 'd3'
datamin_z = (d1_min<d2_min&&d1_min<d3_min?d1_min:d2_min<d3_min?d2_min:d3_min)
datamax_z = (d1_max>d2_max&&d1_max>d3_max?d1_max:d2_max>d3_max?d2_max:d3_max)
set cbrange [datamin_z:datamax_z]

Fixing bug in geom_smooth method in R

I want to create a smooth curve plot for the data I have. The data are in a text file, say file.txt, which is a tab-separated file with headers A and B, like
A B
0.1 0.2
.....
.....
There are about 30000 such data points under both A and B
I am using the following code for that:
dstr_data <- read.table("file.txt", header=T, sep="\t")
ggplot(dstr_data,aes(xaxis))+geom_smooth(method="auto",aes(y=dstr_data$A)
,colour="red",size=0.75)+geom_smooth(method="auto",aes(y=dstr_data$B),
colour="darkgreen",alpha=0.5,size=0.75)+opts(title=expression("Test Plot"),
panel.background = theme_rect(fill='blanchedalmond', colour='black'))+
xlab("Data")+ylab("Values")
geom_smooth: method="auto" and size of largest group is >=1000,
so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.
xaxis in my code holds numbers from 1 to 30000. So my X-axis would be numbers from 1 to 30000. Y-axis would be values from the file.txt. So, I am trying to plot two curves on one graph now.
I want to know why this Error is displayed and how I can fix it. I want to use a method that gives me a smooth curve of the data and not a straight line and hence I do not want to use the lm,glm methods.
Also I get the graph only for a subset of data and not the entire data. Why does this happen?
Can someone help me in this? Thank you in advance.

Problem with axis limits when plotting curve over histogram [duplicate]

newbie here. I have a script to create graphs that has a bit that goes something like this:
png(Test.png)
ht=hist(step[i],20)
curve(insert_function_here,add=TRUE)
I essentially want to plot a curve of a distribution over an histogram. My problem is that the axes limits are apparently set by the histogram instead of the curve, so that the curve sometimes gets out of the Y axis limits. I have played with par("usr"), to no avail. Is there any way to set the axis limits based on the maximum values of either the histogram or the curve (or, in the alternative, of the curve only)?? In case this changes anything, this needs to be done within a for loop where multiple such graphs are plotted and within a series of subplots (par("mfrow")).
Inspired by other answers, this is what i ended up doing:
curve(insert_function_here)
boundsc=par("usr")
ht=hist(A[,1],20,plot=FALSE)
par(usr=c(boundsc[1:2],0,max(boundsc[4],max(ht$counts))))
plot(ht,add=TRUE)
It fixes the bounds based on the highest of either the curve or the histogram.
You could determine mx <- max(curve_vector, ht$counts) and set ylim=c(0, mx), but I rather doubt the code looks like that, since [] is not a proper parameter-passing idiom and step is not an R plotting function but a model selection function. So I am guessing this is code in Matlab or some other language. In R, try this:
set.seed(123)
png("Test.png")
ht=hist(rpois(20,1), plot=FALSE, breaks=0:10-0.1)
# better to offset to include discrete counts that would otherwise be at boundaries
plot(round(ht$breaks), dpois( round(ht$breaks), # plot a Poisson density
mean(ht$counts*round(ht$breaks[-length(ht$breaks)]))),
ylim=c(0, max(ht$density)+.1) , type="l")
plot(ht, freq=FALSE, add=TRUE) # plot the histogram
dev.off()
You could plot the curve first, then compute the histogram with plot=FALSE, and use the plot function on the histogram object with add=TRUE to add it to the plot.
Even better would be to calculate the highest y-value of the curve (there may be shortcuts for this, depending on the nature of the curve) and the highest bar in the histogram, and pass the larger of the two to the ylim argument when plotting the histogram.
