Area under the curve using idl - idl-programming-language

I have a curve of dc (y axis) against dff (x axis), and I calculated the area under the curve using the INT_TABULATED function.
X=[-0.00205553,-0.00186668,-0.00167783,-0.00148899,-0.00130014,-0.00111129,-0.000922443,-0.000733597,-0.000450326,-0.000261480,0.000116216,0.000399487, 0.000588333,0.000777179,0.000966027,0.00115488,0.00134372,0.00153257,0.00172141,0.00181584,0.00200468]
F=[0.00000,21.0000,26.0000,57.0000,94.0000,148.000,248.000,270.000,388.000,418.000,379.000,404.000,358.000,257.000,183.000,132.000,81.0000,47.0000,23.0000,17.0000,431.000]
A=INT_TABULATED(X,F)
print, A
Now I need a loop that starts at the right end (from index n down to 0), accumulates area until it reaches A1, which is 0.01 of A, and stops there, then prints the dff value at which that area A1 is reached. How can I do this? Any suggestion will be helpful.

I'm not sure I fully understand the question, so let me begin by stating my interpretation. You have a curve which integrates to A. Starting from the right, you want the X-value (let's call it X1) which encloses 0.01 of A (the total area under the curve). In other words, 0.99 of the total area under the curve F is to the left of X1, and 0.01 of the area is to the right.
Assuming this interpretation is correct, here's a solution:
First, loop through the data and calculate the integral from the first point, x[0], to each point.
npoints = n_elements(x)
; Initialize a vector to hold integration results
area_cumulative = []
; Loop through each data point, calculating integrals from 0 to that point
for index = 0, npoints-1 do begin
; Assume area under first point is zero, otherwise, calculate integral
if index eq 0 then area_up_to_point = 0d0 $
else area_up_to_point = int_tabulated(x[0:index], f[0:index])
; Store integral value in the cumulative vector
area_cumulative = [area_cumulative, area_up_to_point]
endfor
Then, you can interpolate to find X1:
;;; Find where cumulative distribution reaches 0.99 of A
a1 = 0.99 * a
x1 = interpol(x, area_cumulative, a1)
Here's an illustration. The upper plot is your data, and the lower plot is the cumulative area (integral from x[0] to x). The red dashed lines show X1 = 0.001952. The gray shaded region contains 0.01 of the total area.
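For readers without IDL, the same two steps (running integral, then interpolation) can be sketched in Python. This is only a rough equivalent under stated assumptions: it uses the plain trapezoid rule rather than the Newton-Cotes scheme of INT_TABULATED, it assumes F is non-negative so the running area never decreases, and the function names are made up for illustration.

```python
def cumulative_trapezoid(x, f):
    """Running trapezoid-rule integral of f from x[0] up to each x[i]."""
    areas = [0.0]
    for i in range(1, len(x)):
        areas.append(areas[-1] + 0.5 * (f[i] + f[i - 1]) * (x[i] - x[i - 1]))
    return areas


def x_at_area_fraction(x, f, fraction):
    """Return the x where the running area first reaches fraction * total.

    Assumes f >= 0, so the running area is non-decreasing. The position
    inside a segment is found by linear interpolation of the running area.
    """
    areas = cumulative_trapezoid(x, f)
    target = fraction * areas[-1]
    for i in range(1, len(x)):
        if areas[i] >= target:
            span = areas[i] - areas[i - 1]
            if span == 0.0:
                return x[i]
            w = (target - areas[i - 1]) / span
            return x[i - 1] + w * (x[i] - x[i - 1])
    return x[-1]


# Toy data (not the data from the question): a flat curve of height 1.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_f = [1.0, 1.0, 1.0, 1.0, 1.0]
toy_x1 = x_at_area_fraction(toy_x, toy_f, 0.99)
```

Because the integration rule differs, the result on real data may not match the IDL value exactly.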
Hope this helps!

Related

How to define transformation of curve using single parameter?

Let's assume I have a curve defined by 4 points and I have 2 'states' of curve, like on this picture:
I want to control the deformation of the curve by a single parameter in the range [0, 1], where 0 corresponds to the upper curve and 1 corresponds to the lower curve. Intermediate values like 0.5 should represent some intermediate transformation from the upper curve to the lower curve. How can this be done?
Do you know how to parametrize the motion of one point?
Suppose you have a point that can move on a vertical line, its position varying between two extremes, y0 and y1.
Now assign a parameter t, which varies from 0 to 1, so that y(t=0) = y0 and y(t=1) = y1.
Now make y a linear function of t: y(t) = y0 + t(y1-y0)
Now look at your curves. The only motion of the points to get from one state to the other appears to be vertical. So each of the four points moves like an example of the y(t) above, but with different values of x, y0 and y1. (From your drawing, it looks as if the two end points are stationary and the two middle points move the same way, but that's just a special case.)
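A minimal Python sketch of this idea, applying y(t) = y0 + t(y1-y0) to every point of the curve. The coordinates are invented for illustration (the original picture is not available), and `blend_curve` is a hypothetical helper name.

```python
def blend_curve(upper, lower, t):
    """Linearly blend two curves given as equal-length lists of (x, y) points.

    t = 0 returns the upper curve, t = 1 the lower curve.
    """
    return [
        (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        for (x0, y0), (x1, y1) in zip(upper, lower)
    ]


# Hypothetical control points: the end points stay fixed, the middle ones move.
upper_curve = [(0, 0), (1, 2), (2, 2), (3, 0)]
lower_curve = [(0, 0), (1, -2), (2, -2), (3, 0)]
halfway = blend_curve(upper_curve, lower_curve, 0.5)
```

The x coordinates are blended too, so the same function would also cover points that move sideways.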

How to find the intercept x and y coordinates from 4 data points in Excel?

I have two points which form one line: (1,4) and (3,6), and another two which form another line: (2,1) and (4,2). These lines are continuous and I can find their intersection points by finding the equation for each line, and then equating them to find the x value at the intersection point, and then the y value.
i.e. for the first line, the equation is y = x + 3, and the second is y = 0.5x. At the intersection the y values are the same so x + 3 = 0.5x. So x = -6. Subbing this back into either of the equations gives a y value of -3.
From those steps, I now know that the intersection point is (-6,-3). The problem is I need to do the same steps in Excel, preferably as one formula. Can anyone give me some advice on how I would start this?
It's long, but here it is:
Define x1,y1 and x2,y2 for the 1st line and x3,y3 and x4,y4 for the second.
x = [ (x2*y1-x1*y2)*(x4-x3) - (x4*y3-x3*y4)*(x2-x1) ] / [ (x2-x1)*(y4-y3) - (x4-x3)*(y2-y1) ]
y = [ (x2*y1-x1*y2)*(y4-y3) - (x4*y3-x3*y4)*(y2-y1) ] / [ (x2-x1)*(y4-y3) - (x4-x3)*(y2-y1) ]
Note that the denominators are the same. They will be zero when the two lines are parallel, in which case the system has no unique solution. So you may want to check that in another cell and conditionally compute the answer.
Essentially, this formula is derived by solving a system of equations for x and y by hand using generic points (x1,y1), (x2,y2), (x3,y3), and (x4,y4). Easier still is solving the system by hand using well-developed linear algebra concepts.
Wikipedia outlines this procedure well: Line-line intersection.
Also, this website describes all the different formulas and lets you put in whatever data you have in any mixed format and provides many details of the solutions: Everything about 2 lines.
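To sanity-check the closed-form expressions outside Excel, here is a short Python sketch. It assumes the whole numerator is divided by the shared denominator, treats a near-zero denominator as "parallel", and uses a hypothetical function name and tolerance.

```python
def line_intersection(p1, p2, p3, p4, eps=1e-12):
    """Intersection of the line through p1, p2 with the line through p3, p4.

    Returns (x, y), or None when the lines are parallel (zero denominator).
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x2 - x1) * (y4 - y3) - (x4 - x3) * (y2 - y1)
    if abs(denom) < eps:
        return None
    a = x2 * y1 - x1 * y2
    b = x4 * y3 - x3 * y4
    x = (a * (x4 - x3) - b * (x2 - x1)) / denom
    y = (a * (y4 - y3) - b * (y2 - y1)) / denom
    return x, y


# The four points from the question.
point = line_intersection((1, 4), (3, 6), (2, 1), (4, 2))
```

With the points from the question this reproduces the hand-computed intersection (-6, -3).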
Here's a matrix based solution:
x - y = -3
0.5*x - y = 0
Written as a matrix equation (I apologize for the poor typesetting):
| 1.0  -1.0 | { x }   { -3 }
| 0.5  -1.0 | { y } = {  0 }
You can invert this matrix or use LU decomposition to solve it to get the answer. That method will work for any number of cases where you have one equation for each unknown.
This is easy to do by hand:
Subtract the second equation from the first: 0.5*x = -3
Divide both sides by 0.5: x = -6
Substitute this result into the other equation: y = 0.5*x = -3
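As a rough illustration of the matrix route, the 2x2 system can be solved by writing out the inverse explicitly. This Python sketch only covers the two-equation case; `solve_2x2` is a hypothetical helper, and larger systems would need LU decomposition or a linear algebra library.

```python
def solve_2x2(a, b, c, d, e, f):
    """Solve  a*x + b*y = e,  c*x + d*y = f  using the explicit 2x2 inverse."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: no unique solution")
    x = (d * e - b * f) / det
    y = (a * f - c * e) / det
    return x, y


# x - y = -3 and 0.5*x - y = 0, as in the matrix equation above.
solution = solve_2x2(1.0, -1.0, 0.5, -1.0, -3.0, 0.0)
```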

R density plot y axis larger than 1

I want a density plot, and here is the code:
d = as.matrix(read.csv(file = '1.csv'))
plot(density(d))
My data is a list of numbers. What I don't understand is why the values on the y axis are larger than 1.
I think there is something wrong. I searched the internet but couldn't find any resource. Can you help me?
here is the data:
link:http://pan.baidu.com/s/1hsE8Ony password:7a4z
There is nothing wrong with the density being greater than 1 at some points. The area under the curve must be 1, but at specific points the density can be greater than 1. For example,
dnorm(0,0, 0.1)
[1] 3.989423
See this Cross Validated post
Edit:
I think that the dnorm part above could be amplified a little.
For a Gaussian distribution with mean μ and standard deviation σ, approximately 99.73% of the area under the density curve lies between μ-3σ and μ+3σ. The example above used μ=0 and σ=0.1, so the area under the curve between -0.3 and 0.3 is about 0.9973. The width of that interval is 0.6. Compare this with a rectangle of equal area (0.9973) and the same base (0.6).
If the area of the rectangle is 0.9973 and the base is 0.6, the height must be 0.9973/0.6 ≈ 1.662, i.e. the average height of the curve over that interval is about 1.662. Clearly there must be some points with height greater than 1.
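The same numbers can be checked without R. The Python sketch below uses only the standard library; `normal_pdf` and `normal_area` are hypothetical helper names, and the area is computed from the error function rather than by numerical integration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_area(a, b, mu, sigma):
    """Area under the normal density between a and b, via the error function."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return cdf(b) - cdf(a)


peak = normal_pdf(0.0, 0.0, 0.1)           # same quantity as dnorm(0, 0, 0.1)
area = normal_area(-0.3, 0.3, 0.0, 0.1)    # area within three standard deviations
mean_height = area / 0.6                   # average density over that interval
```

The peak exceeds 1 while the area stays below 1, which is the point of the rectangle argument.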

1D Hermite Cubic Splines with tangents of zero - how to make it look smoother

I am given 3 values y0, y1, y2. They are supposed to be evenly spaced, say x0 = -0.5, x1 = 0.5, x2 = 1.5. And to be able to draw a spline through all of them, the derivatives at all points are said to be dy/dx = 0.
Now the result of rendering two Catmull-Rom splines (which is done via a GLSL fragment shader, including a nonlinear transformation) looks pretty rigid. That is, where the curve bends, it does so smoothly, but the bending area is very small. Zooming out makes the bends look too sharp.
I wanted to switch to TCB-Splines (aka. Kochanek-Bartels Splines), as those provide a tension parameter - thus I hoped I could smooth the look. But I realized that all TCB-Parameters applied to a zero tangent won't do any good.
Any ideas how I could get a smoother looking curve?
The tangent vector for a 2d parametric curve f(t)=(x(t), y(t)) is defined as f'(t)=(dx(t)/dt, dy(t)/dt). When you require your curve to have dy/dx = 0 at some points, it simply means the tangent vector at those points will go horizontally (i.e., dy/dt = 0). It does not necessarily mean the tangent vector itself is a zero vector. So, you should still be able to use TCB spline to do whatever you want to do.
Obviously nobody had a good answer, but as it's my job, I found a solution. The points are evenly spaced, and the idea is to make the transitions smoother. Since the tangents are zero at all given points, the strongest curvature y''(x) most likely occurs close to the points. This means we'd like to stretch these "areas around the points".
We currently use Catmull-Rom splines, evaluated section by section between the points. That makes y(x) => y(t), with t(x) = x-x0.
This t(x) needs to be stretched around the 0 and 1 ends. So the cosine function came to mind:
Replacing t(x) = x-x0 with t(x) = 0.5 * (1.0 - cos( PI * ( x-x0 ) ) ) did the job for me.
Short explanation:
cosine in the range [0,PI] runs smoothly from 1 to -1.
we want to run from 0 to 1, though
so flip it: 1-cos() -> now it runs from 0 to 2
halve that: 0.5*xxx -> now it runs from 0 to 1
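The remapping can be tried outside the shader. The Python sketch below assumes each segment is a cubic Hermite segment with zero tangents at both ends (as the question states) and unit spacing between points; `ease`, `hermite_zero_tangent` and `eased_segment` are hypothetical names.

```python
import math


def ease(u):
    """Cosine remap of u in [0, 1]: starts and ends with zero slope."""
    return 0.5 * (1.0 - math.cos(math.pi * u))


def hermite_zero_tangent(y0, y1, t):
    """Cubic Hermite segment from y0 to y1 with zero tangents at both ends."""
    h01 = t * t * (3.0 - 2.0 * t)  # Hermite basis weight of the end value
    return y0 + (y1 - y0) * h01


def eased_segment(y0, y1, x, x0):
    """Evaluate the segment starting at x0 with the remapped parameter."""
    return hermite_zero_tangent(y0, y1, ease(x - x0))


# Near the start of a segment the eased curve stays closer to y0.
plain_value = hermite_zero_tangent(0.0, 1.0, 0.1)
eased_value = eased_segment(0.0, 1.0, 0.1, 0.0)
```

The end values are unchanged by the remap; only the pace along the segment changes, which widens the flat region around each point.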
Another problem was to find the correct tangents. Normally, when calculating such a spline using matrix-vector math, you simply differentiate your t-vector to get the tangents, so differentiating [t³ t² t 1] yields [3t² 2t 1 0]. But here, t is not that simple. Applying the chain rule to the cosine substitution, I found the right derivative vector:
| 0.375*PI*sin(PI*t)(1-cos(PI*t))² |
| 0.500*PI*sin(PI*t)(1-cos(PI*t)) |
| 0.500*PI*sin(PI*t) |
| 0 |
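One way to gain confidence in this vector is to compare it with a finite-difference derivative of [s³ s² s 1], where s = 0.5 * (1 - cos(PI*t)). The Python sketch below does that; the function names and the step size are assumptions made for illustration.

```python
import math


def eased_power_vector(t):
    """[s^3, s^2, s, 1] with s = 0.5 * (1 - cos(pi * t))."""
    s = 0.5 * (1.0 - math.cos(math.pi * t))
    return [s ** 3, s ** 2, s, 1.0]


def eased_derivative_vector(t):
    """The closed-form derivative vector quoted in the answer."""
    sn = math.sin(math.pi * t)
    c = 1.0 - math.cos(math.pi * t)
    return [
        0.375 * math.pi * sn * c * c,
        0.500 * math.pi * sn * c,
        0.500 * math.pi * sn,
        0.0,
    ]


def central_difference(t, h=1e-6):
    """Numerical derivative of eased_power_vector with respect to t."""
    hi = eased_power_vector(t + h)
    lo = eased_power_vector(t - h)
    return [(a - b) / (2.0 * h) for a, b in zip(hi, lo)]


max_error = max(
    abs(exact - approx)
    for t in (0.1, 0.5, 0.9)
    for exact, approx in zip(eased_derivative_vector(t), central_difference(t))
)
```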

What does "t" represent in De Casteljau's algorithm

Hi everybody I need your help. My question is: what does "t" represent in De Casteljau's algorithm?
We have the following formula to calculate the point Q:
Q=(1−t)P1+tP2, t∈[0,1]
But what does t mean here, and why is it between 0 and 1?
The t is an interpolation value.
For many computations, it is beneficial to parameterize the curve based on unit length. This basically means that the t describes a position on the curve, with t=0 being the start of the curve, and t=1 being the end of the curve.
Consider the simplest case of interpolating between two points: Changing the value of t between 0 and 1 can be imagined as "walking along the line between the two points". For such a simple interpolation, you can say that the "curve" (i.e. the line between the points) is described by the equation
P(t) = (1-t)*P0 + t*P1
For example, for t=0.25, you compute 0.75 * P0 + 0.25 * P1, which yields a point in the middle of the left half of the line between P0 and P1.
For the case of De Casteljau's algorithm, the situation is a bit more involved: Depending on the degree of the curve, you don't interpolate between fixed points P0 and P1, but between multiple points whose positions in turn depend on the variable t. This is usually computed recursively. But still, the variable t is a value between 0 and 1 that describes the position on the curve.
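A compact Python sketch of the repeated interpolation follows. It is written as a loop rather than recursively, works on points given as tuples of any dimension, and the names `lerp` and `de_casteljau` are chosen for illustration.

```python
def lerp(p, q, t):
    """Point at parameter t on the segment from p to q: (1-t)*p + t*q."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))


def de_casteljau(points, t):
    """Evaluate the Bezier curve with the given control points at parameter t.

    Each pass replaces the current points by interpolations between
    neighbours, until a single point remains.
    """
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]


# Quadratic example with made-up control points.
control = [(0, 0), (1, 2), (2, 0)]
midpoint = de_casteljau(control, 0.5)
```

At t = 0 the result is the first control point and at t = 1 the last one, which matches the reading of t as a position along the curve.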
Q = (1 − t) P1 + t P2, t ∈ [0, 1] is a parametrization of the line segment from P1 to P2, and t is its parameter. As t goes from 0 to 1, Q traverses the line segment from P1 to P2.
Note that that is valid for any P1 and P2 such that you can compute a linear combination of them. It's not necessary that P1 and P2 be points in R^n -- they could be matrices, for instance, or functions.

Resources