Formula to calculate the distance between points within a group - formula

In my problem there are n groups, and each group contains k points.
The number of points can vary between groups. I have to select the group whose points are most closely located (in terms of distance).
For this purpose, I first have to calculate a closeness metric for the points within each group.
I need guidance on which formula to use to calculate the distance between the points within a group.
Please tell me which formula suits this problem.

If your points have x, y coordinates, then depending on your structure
you can use
1) Manhattan distance: |x1 - x2| + |y1 - y2|
2) Euclidean (straight-line) distance: square root of ((x2 - x1)^2 + (y2 - y1)^2)
Edit: you could also model each group as a graph and use a shortest-path algorithm (something like Dijkstra's) to find the nearest point from a starting location, and then calculate the distance from that.
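As a rough illustration of how the Euclidean formula could be turned into a per-group closeness metric, here is a minimal Python sketch. The choice of the mean pairwise distance as the metric, and the names `mean_pairwise_distance` and `tightest_group`, are assumptions made for this example; other metrics (maximum pairwise distance, distance to the centroid) would be just as defensible.

```python
from itertools import combinations
from math import hypot

def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of (x, y) points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0  # a group with fewer than two points has no pairs
    total = sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in pairs)
    return total / len(pairs)

def tightest_group(groups):
    """Return the key of the group with the smallest mean pairwise distance."""
    return min(groups, key=lambda name: mean_pairwise_distance(groups[name]))

# Hypothetical data: group sizes are allowed to differ.
groups = {
    "a": [(0, 0), (1, 0), (0, 1)],
    "b": [(0, 0), (10, 0), (0, 10), (10, 10)],
}
print(tightest_group(groups))
```

Averaging over pairs keeps groups of different sizes comparable, which a plain sum of distances would not.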

Related

Reflection matrix for an arbitrary plane

I have seen the Householder equation, which creates a matrix that reflects a point about a plane, but the equation assumes the plane is described only by a normal vector v.
My plane has 3 components:
The normal unit vector V
A point that lies on the plane P
Distance of the plane from origin D
All are stored in separate variables.
How would I extend the equation to take the point and distance into account, or do I need a different approach?
(I found out the solution anyway so here it is)
The aforementioned Householder equation also assumes that your plane contains the origin, so it cannot be applied directly to your problem.
However, consider P as the new origin: the coordinates of x in this system are x - P, and the plane now passes through the origin with the same normal v, so you can compute S', the reflection of x in this system, with the Householder equation:
S' = (x - P) - 2v(v^h(x - P))
and then get its coordinates in the original system:
S(x) = (x - P) - 2v(v^h(x - P)) + P
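For real vectors the formula above reduces to subtracting twice the signed distance along the normal. A minimal Python sketch, assuming v is already normalized and all vectors are plain tuples (the function name `reflect_about_plane` is made up for this example; the stored distance D is not needed once a point P on the plane is known):

```python
def reflect_about_plane(x, p, v):
    """Reflect point x about the plane through p with unit normal v."""
    # Signed distance from x to the plane: v . (x - p)
    d = sum(vi * (xi - pi) for vi, xi, pi in zip(v, x, p))
    # S(x) = (x - p) - 2 v (v . (x - p)) + p  simplifies to  x - 2 d v
    return tuple(xi - 2.0 * d * vi for xi, vi in zip(x, v))

# Plane z = 3: the point (1, 2, 5) lies 2 units above it.
print(reflect_about_plane((1.0, 2.0, 5.0), (0.0, 0.0, 3.0), (0.0, 0.0, 1.0)))
```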

r - Calculate shortest distance between 2 points in a delaunay triangulation

Currently I'm working with spatial data and applied a Delaunay triangulation on my data points. I additionally calculated the geodesic distances on the WGS84 ellipsoid for every edge (connection between 2 points) in the triangulation. Now I'm going to search the shortest path between every 2 points in the generated graph and calculate the path distance. The shortest distance should thus be calculated as the sum over all edge distances.
Below is a minimal working example:
library(deldir)
set.seed(31)
x <- runif(100)
y <- runif(100)
d <- deldir(x, y) # performs tessellation & Delaunay triangulation
#Calculate edge distances (for reasons of simplicity I calculate here Euclidean distances)
geodists <- NULL
for (i in 1:nrow(d$delsgs)) {
geodists[i] <- sqrt((x[d$delsgs[i,5]] - x[d$delsgs[i,6]])^2 + (y[d$delsgs[i,5]] - y[d$delsgs[i,6]])^2)
}
#Plot data
plot(d, wlines="triang")
However, I have no idea how I can perform the shortest path search on the deldir object I created. Thus, I'd be very happy if you could provide some solutions for my problem:
How can I identify which edges are involved in the shortest path between point A and B?
How can I then efficiently calculate the path distance matrix?
Thanks a lot in advance for your help!
There are several path-finding algorithms. One of them is A* (see Wikipedia).
Maybe this helps you.
Instead of the regularly spaced grid points usually used with a Euclidean metric, you can use the Delaunay points of your data set as the graph nodes.
Then prefer the neighbor that is closest to the finish point (A* combines this estimate with the distance already travelled).
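To make the graph search concrete, here is a minimal sketch in Python (not R) that uses Dijkstra's algorithm rather than A*, since Dijkstra needs no heuristic and returns both the path and its length. It assumes the triangulation has been exported as a list of `(node_a, node_b, length)` tuples, for example the two point-index columns of `d$delsgs` together with `geodists`; the function name `shortest_path` and the sample edges are made up for this example.

```python
import heapq
from collections import defaultdict

def shortest_path(edges, source, target):
    """Dijkstra on an undirected graph given as (node_a, node_b, length) tuples.

    Returns (total_length, [nodes on the path]); (inf, []) if unreachable.
    """
    graph = defaultdict(list)
    for a, b, w in edges:
        graph[a].append((b, w))
        graph[b].append((a, w))

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in graph[node]:
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))

    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]

# Hypothetical edge list: the direct edge 1-3 is longer than going via 2.
edges = [(1, 2, 1.0), (2, 3, 1.0), (1, 3, 2.5), (3, 4, 1.0)]
print(shortest_path(edges, 1, 4))
```

Consecutive nodes in the returned path identify the edges involved, and calling the function for every pair of points fills the path distance matrix, although an all-pairs routine would be more efficient for large graphs.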

Position(t) on cubic bezier curve

The only equation to calculate this that I can find involves t in the range [0, 1], but I have no idea how long it will take to travel the entire path, so I can't calculate (1 - t).
I know the speed at which I'm traveling, but it seems to be a heavy idea to calculate the total time beforehand (nor do I actually know how to do that calculation). What is an equation to figure out the position without knowing the total time?
Edit: To clarify on the cubic bezier curve: I have four control points (P0 to P3), and to get a value on the curve with t, I need to use the four points as such:
B(t) = (1-t)^3P0 + 3t(1-t)^2P1 + 3t^2(1-t)P2 + t^3P3
I am not using a parametric equation to define the curve. The control points are what define the curve. What I need is an equation that does not require the use of knowing the range of t.
I think there is a misunderstanding here. The 't' in the cubic Bezier curve's definition does not refer to 'time'. It is a parameter on which the x, y, or even z functions are based. Unlike the traditional way of representing y as a function of x, such as y=f(x), an alternative way of representing a curve is the parametric form, which represents x, y and z as functions of an additional parameter t: C(t)=(x(t), y(t), z(t)). Typically the t value will range from 0 to 1, but this is not a must. The common representation of a circle as x=cos(t) and y=sin(t) is an example of a parametric representation. So, if you have the parametric representation of a curve, you can evaluate the position on the curve for any given t value. It has nothing to do with the time it takes to travel the entire path.
You have the curve and you have your speed. The distance you have traveled is your speed multiplied by the elapsed time; dividing that distance by the total length of the curve gives (approximately) the parameter t you need. So if the total curve has a length of 72.2 units and you have traveled 1 unit, then your t is about 1/72.2. This is only an approximation, because t is generally not proportional to the distance along a Bezier curve.
Your only missing bit is calculating the length of the curve. This is typically done by subdividing it into line segments small enough that you don't care about the error, and then adding up the lengths of those line segments. You could combine the two steps if you were so inclined: step through the curve in increments of, say, 1/1000 of the parameter range, subtract the length of each small line segment from the distance you still need to travel (given that you have speed and time, you have the distance you need to travel), and keep that up until you've gone as far as you need to go.
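The combined stepping approach could be sketched as follows. This is a minimal Python sketch for 2-D control points; the function names, the default of 1000 steps, and the linear interpolation inside the last small segment are choices made for this example rather than part of the answer above.

```python
from math import hypot

def bezier(t, p0, p1, p2, p3):
    """Evaluate a cubic Bezier curve at parameter t."""
    u = 1.0 - t
    return tuple(
        u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def position_at_distance(distance, p0, p1, p2, p3, steps=1000):
    """Walk along the curve in small parameter steps until `distance` is covered."""
    prev = bezier(0.0, p0, p1, p2, p3)
    remaining = distance
    for i in range(1, steps + 1):
        cur = bezier(i / steps, p0, p1, p2, p3)
        seg = hypot(cur[0] - prev[0], cur[1] - prev[1])
        if seg >= remaining and seg > 0.0:
            # Interpolate linearly inside the final small segment.
            f = remaining / seg
            return tuple(a + f * (b - a) for a, b in zip(prev, cur))
        remaining -= seg
        prev = cur
    return prev  # distance exceeds the curve length: clamp to the end point
```

With a known speed and elapsed time, the call would be `position_at_distance(speed * elapsed, P0, P1, P2, P3)`, so the total travel time never has to be computed up front.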
The range for t is between 0 and 1.
x = (1-t)*(1-t)*(1-t)*p0x + 3*(1-t)*(1-t)*t*p1x + 3*(1-t)*t*t*p2x + t*t*t*p3x;
y = (1-t)*(1-t)*(1-t)*p0y + 3*(1-t)*(1-t)*t*p1y + 3*(1-t)*t*t*p2y + t*t*t*p3y;

Finding the coordinates of points from distance matrix

I have a set of points (with unknown coordinates) and their distance matrix. I need to find the coordinates of these points in order to plot them and show the solution of my algorithm.
To simplify, I can place one of these points at the coordinate (0,0) and then find the others. Can anyone tell me if it's possible to find the coordinates of the other points, and if yes, how?
Thanks in advance!
EDIT
Forgot to say that I need the coordinates on x-y only
The answers based on angles are cumbersome to implement and can't be easily generalized to data in higher dimensions. A better approach is that mentioned in my and WimC's answers here: given the distance matrix D(i, j), define
M(i, j) = 0.5*(D(1, j)^2 + D(i, 1)^2 - D(i, j)^2)
which should be a positive semi-definite matrix with rank equal to the minimal Euclidean dimension k in which the points can be embedded. The coordinates of the points can then be obtained from the k eigenvectors v(i) of M corresponding to non-zero eigenvalues q(i): place the vectors sqrt(q(i))*v(i) as columns in an n x k matrix X; then each row of X is a point. In other words, sqrt(q(i))*v(i) gives the ith component of all of the points.
The eigenvalues and eigenvectors of a matrix can be obtained easily in most programming languages (e.g., using GSL in C/C++, using the built-in function eig in Matlab, using Numpy in Python, etc.)
Note that this particular method always places the first point at the origin, but any rotation, reflection, or translation of the points will also satisfy the original distance matrix.
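A minimal NumPy sketch of the eigenvector method described above, assuming the distance matrix is exactly Euclidean and using index 0 for "point 1"; the function name `coords_from_distances` and the `dim` and `tol` parameters are assumptions introduced for this example.

```python
import numpy as np

def coords_from_distances(D, dim=2, tol=1e-9):
    """Recover point coordinates (up to rotation/reflection) from a distance matrix."""
    D = np.asarray(D, dtype=float)
    D2 = D ** 2
    # M(i, j) = 0.5 * (D(1, j)^2 + D(i, 1)^2 - D(i, j)^2), with point 1 at index 0
    M = 0.5 * (D2[0][None, :] + D2[:, 0][:, None] - D2)
    q, v = np.linalg.eigh(M)            # eigenvalues in ascending order
    order = np.argsort(q)[::-1][:dim]   # keep the `dim` largest
    q, v = q[order], v[:, order]
    q = np.where(q > tol, q, 0.0)       # clip tiny/negative values from rounding
    return v * np.sqrt(q)               # row i holds the coordinates of point i

# Hypothetical check: rebuild four known points from their distances.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [1.0, 1.0]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
X = coords_from_distances(D)
print(np.round(X, 6))
```

The recovered rows generally differ from the original points by a rotation or reflection, but their pairwise distances should reproduce D.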
Step 1: arbitrarily assign one point P1 as (0,0).
Step 2: arbitrarily assign one point P2 along the positive x axis: (Dp1p2, 0).
Step 3: find a point P3 such that (with ~= meaning "not equal to")
Dp1p2 ~= Dp1p3+Dp2p3
Dp1p3 ~= Dp1p2+Dp2p3
Dp2p3 ~= Dp1p3+Dp1p2
and set that point in the "positive" y domain (if any of these holds with equality instead, the three points are collinear and the point lies on the P1P2 axis).
Use the cosine law to determine the angle A at P1:
cos (A) = (Dp1p2^2 + Dp1p3^2 - Dp2p3^2)/(2*Dp1p2* Dp1p3)
P3 = (Dp1p3 * cos (A), Dp1p3 * sin(A))
You have now successfully built an orthonormal frame and placed three points in it.
Step 4: to determine all the other points, repeat the cosine-law computation of step 3 for each point n, which gives its x coordinate and a tentative y coordinate:
(Xn, Yn).
Compare the distance {(Xn, Yn), (X3, Y3)} to Dp3pn in your matrix. If it is identical (up to rounding), you have successfully identified the coordinates of point n. Otherwise, point n is at (Xn, -Yn).
Note there is an alternative to step 4, but it is too much math for a Saturday afternoon.
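The step-by-step construction can be sketched in plain Python as below, with P2 placed on the positive x axis at (Dp1p2, 0). The sketch assumes the first three points are distinct and not collinear and that the matrix is consistent with a planar layout; the name `place_points` and the tolerance `tol` are assumptions for this example.

```python
from math import sqrt, hypot

def place_points(D, tol=1e-6):
    """Place points in the plane from distance matrix D (list of lists).

    Assumes points 0, 1 and 2 are distinct and not collinear.
    """
    n = len(D)
    d01 = D[0][1]
    coords = [(0.0, 0.0), (d01, 0.0)]
    for k in range(2, n):
        # Cosine law at point 0: x = D0k * cos(A), y = +/- D0k * sin(A)
        x = (d01**2 + D[0][k]**2 - D[1][k]**2) / (2.0 * d01)
        y = sqrt(max(D[0][k]**2 - x**2, 0.0))
        if k > 2:
            x2, y2 = coords[2]
            # Point 2 fixes the orientation: flip y if the distance to it disagrees.
            if abs(hypot(x - x2, y - y2) - D[2][k]) > tol:
                y = -y
        coords.append((x, y))
    return coords
```

Writing x as `D0k * cos(A)` with the cosine substituted avoids dividing by `D0k`, so a point that coincides with the first one does not cause a division by zero.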
If for points p, q, and r you have pq, qr, and rp in your matrix, you have a triangle.
Wherever you have a triangle in your matrix you can compute one of two solutions for that triangle (independent of a Euclidean transform of the triangle on the plane). That is, for each triangle you compute, its mirror image is also a triangle that satisfies the distance constraints on p, q, and r. The fact that there are two solutions even for a triangle leads to the chirality problem: you have to choose the chirality (orientation) of each triangle, and not all choices may lead to a feasible solution to the problem.
Nevertheless, I have some suggestions. If the number of entries is small, consider using simulated annealing. You could incorporate chirality into the annealing step. This will be slow for large systems, and it may not converge to a perfect solution, but for some problems it's the best you can do.
The second suggestion will not give you a perfect solution, but it will distribute the error: the method of least squares. In your case the objective function will be the error between the distances in your matrix, and actual distances between your points.
This is a math problem: deriving the coordinate matrix X given only its distance matrix.
There is an efficient solution to this, multidimensional scaling (MDS), which relies on some linear algebra. Simply put, it takes a pairwise Euclidean distance matrix D and outputs estimated coordinates Y (possibly rotated), which approximate X. In practice, just use sklearn.manifold.MDS in Python.
The "eigenvector" method given by the favourite replies above is very general and automatically outputs a set of coordinates as the OP requested, however I noticed that that algorithm does not even ask for a desired orientation (rotation angle) for the frame of the output points, the algorithm chooses that orientation all by itself!
People who use it might want to know beforehand at what angle the frame will be tipped, so I found an equation which gives the answer for the case of up to three input points. I have not had time to generalize it to n points and hope someone will do that and add it to this discussion. Here are the three angles the output sides will form with the x-axis as a function of the input side lengths:
angle side a = arcsin(sqrt(((c+b+a)*(c+b-a)*(c-b+a)*(-c+b+a)*(c^2-b^2)^2)/(a^4*((c^2+b^2-a^2)^2+(c^2-b^2)^2))))*180/Pi/2
angle side b = arcsin(sqrt(((c+b+a)*(c+b-a)*(c-b+a)*(-c+b+a)*(c^2+b^2-a^2)^2)/(4*b^4*((c^2+b^2-a^2)^2+(c^2-b^2)^2))))*180/Pi/2
angle side c = arcsin(sqrt(((c+b+a)*(c+b-a)*(c-b+a)*(-c+b+a)*(c^2+b^2-a^2)^2)/(4*c^4*((c^2+b^2-a^2)^2+(c^2-b^2)^2))))*180/Pi/2
Those equations also lead directly to a solution to the OP's problem of finding the coordinates for each point because: the side lengths are already given from the OP as the input, and my equations give the slope of each side versus the x-axis of the solution, thus revealing the vector for each side of the polygon answer, and summing those sides through vector addition up to a desired vertex will produce the coordinate of that vertex. So if anyone can extend my angle equations to handling beyond three input lengths (but I note: that might be impossible?), it might be a very fast way to the general solution of the OP's question, since slow parts of the algorithms that people gave above like "least square fitting" or "matrix equation solving" might be avoidable.

How to calculate the nearest point of a line and curve? .. or curve and curve?

Given the points of a line and a quadratic bezier curve, how do you calculate their nearest point?
There exists a scientific paper on this question from INRIA: Computing the minimum distance between two Bézier curves (PDF here)
I once wrote a tool to do a similar task. Bezier splines are typically parametric cubic polynomials. To compute the square of the distance between a cubic segment and a line, this is just the square of the distance between two polynomial functions, itself just another polynomial function! Note that I said the square of the distance, not the square root.
Essentially, for any point on a cubic segment, one could compute the square of the distance from that point to the line. This will be a 6th order polynomial. Can we minimize that square of the distance? Yes. The minimum must occur where the derivative of that polynomial is zero. So differentiate, getting a 5th order polynomial. Use your favorite root finding tool that generates all of the roots numerically. Jenkins & Traub, whatever. Choose the correct solution from that set of roots, excluding any solutions that are complex, and only picking a solution if it lies inside the cubic segment in question. Make sure you exclude the points that correspond to local maxima of the distance.
All of this can be efficiently done, and no iterative optimizer besides a polynomial root finder need be used, thus one does not require the use of optimization tools that require starting values, finding only a solution near that starting value.
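A minimal NumPy sketch of this root-finding scheme for a planar cubic Bezier segment and an infinite line through two points. It uses `numpy.roots` as the polynomial root finder in place of Jenkins & Traub, and instead of filtering out local maxima explicitly it evaluates the squared distance at every candidate and keeps the smallest. The function name and the 1e-9 threshold for "real" roots are assumptions for this example.

```python
import numpy as np

def cubic_bezier_to_line(p0, p1, p2, p3, a, b):
    """Closest point on a cubic Bezier segment to the infinite line through a and b.

    Returns (t, distance).
    """
    p0, p1, p2, p3, a, b = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3, a, b))
    # Power-basis coefficients of B(t), highest degree first (one row per power).
    coeffs = np.array([-p0 + 3*p1 - 3*p2 + p3, 3*p0 - 6*p1 + 3*p2, -3*p0 + 3*p1, p0])
    d = b - a
    normal = np.array([-d[1], d[0]]) / np.hypot(d[0], d[1])
    # Signed distance g(t) = normal . (B(t) - a) is a cubic polynomial in t.
    g = coeffs @ normal
    g[-1] -= normal @ a
    f = np.polymul(g, g)                 # squared distance, degree up to 6
    roots = np.roots(np.polyder(f))      # stationary points, degree up to 5
    # Keep real roots inside the segment, plus both end points.
    ts = [0.0, 1.0] + [r.real for r in roots
                       if abs(r.imag) < 1e-9 and 0.0 <= r.real <= 1.0]
    t_best = min(ts, key=lambda t: np.polyval(f, t))
    return t_best, float(np.sqrt(max(np.polyval(f, t_best), 0.0)))
```

Because the roots of the signed distance itself are also roots of the derivative of its square, a curve that crosses the line should come out with a distance of (numerically) zero.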
For example, in the 3-d figure I show a curve generated by a set of points in 3-d (in red), then I took another set of points that lay in a circle outside, I computed the closest point on the inner curve from each, drawing a line down to that curve. These points of minimum distance were generated by the scheme outlined above.
I just want to give you a few hints for the case quadratic Bézier curve (QBC) vs. segment:
To get a fast enough computation, I think you should first consider using a kind of 'bounding box' in your algorithm.
Say P0 is the first endpoint of the QBC, P2 the second endpoint, P1 the control point, and P3P4 the segment. Then:
Compute the distance from P0, P1, and P2 to P3P4.
If P0 or P2 is the nearest point --> this is the nearest point of the curve to P3P4. End :=).
If P1 is the nearest point, and Pi (i=0 or 2) the second nearest point, the distance between PiP1 and P3P4 is an estimate of the distance you seek that might be precise enough, depending on your needs.
If you need to be more accurate: compute P1', which is the point on the QBC nearest to P1; you find it by applying the QBC formula with t=0.5. --> The distance from PiP1' to P3P4 is an even more accurate, but more costly, estimate.
Note that if the line defined by P1P1' intersects P3P4, P1' is the closest point of the QBC to P3P4.
If P1P1' does not intersect P3P4, then you're out of luck and must go the hard way...
Now if (and when) you need precision:
Think about using a divide-and-conquer algorithm on the parameter of the curve:
Which is nearest to P3P4: P0P1' or P1'P2? If it is P0P1' --> t is between 0 and 0.5, so compute Pm for t=0.25.
Now which is nearest to P3P4: P0Pm or PmP1'? If it is PmP1' --> compute Pm2 for t=0.25+0.125=0.375. Then which is nearest: PmPm2 or Pm2P1'? Etc.
You will reach an accurate solution in no time: after about 6 iterations your precision on t is 0.004! You might stop the search when the distance between two points drops below a given value (and not the difference between two parameters, since for a small change in parameter the points might be far apart).
In fact, the principle of this algorithm is to approximate the curve with segments more and more precisely each time.
For the curve/curve case I would also 'box' them first to avoid useless computation: first use the segment/segment computation, then (maybe) the segment/curve computation, and only if needed the curve/curve computation.
For curve/curve, divide and conquer works as well; it is more difficult to explain, but you might figure it out. :=)
I hope you can find a good balance between speed and accuracy with this :=)
Edit: I think I found a nice solution for the general case :-)
You should iterate on the (inner) bounding triangles of each QBC.
So we have triangle T1, with points A, B, C having 't' parameters tA, tB, tC,
and triangle T2, with points D, E, F having 't' parameters tD, tE, tF.
Initially we have tA=0, tB=0.5, tC=1.0, and the same for T2: tD=0, tE=0.5, tF=1.0.
The idea is to call a procedure recursively that will split T1 and/or T2 into smaller triangles until we are OK with the precision reached.
The first step is to compute the distance from T1 to T2, keeping track of which segments were the nearest on each triangle. First 'trick': if on T1 the nearest segment is AC, then stop the recursion on T1; the nearest point on curve 1 is either A or C. If on T2 the nearest segment is DF, then stop the recursion on T2; the nearest point on curve 2 is either D or F. If we stopped the recursion for both -> return distance = min(AD, AF, CD, CF). Then, if we recurse on T1 and segment AB is nearest, the new T1 becomes: A'=A, B'=point of curve 1 with tB=(tA+tC)/2=0.25, C'=old B. The same goes for T2: apply the recursion if needed and call the same algorithm on the new T1 and new T2. Stop the algorithm when the distance found between T1 and T2 minus the distance found between the previous T1 and T2 is below a threshold.
The function might look like ComputeDistance(curveParam1, A, C, shouldSplitCurve1, curveParam2, D, F, shouldSplitCurve2, previousDistance), where the points also store their t parameters.
Note that distance(curve, segment) is just a particular case of this algorithm, and that you should implement distance(triangle, triangle) and distance(segment, triangle) to make it work. Have fun.
1. Simple but bad method: iterate over the points of the first curve and the points of the second curve, and take the minimum distance.
2. Determine the mathematical function for the distance between the curves and minimize it, like:
|Fcur1(t)-Fcur2(s)| -> min
where the Fs are vector-valued.
I think we can calculate the derivative of this to determine the extrema and get the nearest and farthest points.
I will think about this some more later and post a full response.
Formulate your problem in terms of standard analysis: You have got a quantity to minimize (distance), so you formulate an equation for this quantity and find the points where the first derivatives are zero. Parameterize with a single parameter by using the curve's parameter p, which is between 0 for the first point and 1 for the last point.
In the line case, the equation is fairly simple: Get the x/y coordinates from the spline's equation and compute the distance to the given line via vector equations (scalar product with the line's normal).
In the curve's case, the analytical solution could get pretty complicated. You might want to use a numerical minimization technique such as Nelder-Mead or, since you have a 1D continuous problem, simple bisection.
In the case of a Bézier curve and a line
There are three candidates for the closest point to the line:
The place on the Bézier curve segment that is parallel to the line (if such a place exists),
One end of the curve segment,
The other end of the curve segment.
Test all three; the shortest distance wins.
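For a quadratic Bézier curve the three-candidate test is short enough to write out. This is a minimal Python sketch that treats the line as infinite and assumes the curve does not cross it (if it does, the true minimum distance is zero and none of the three candidates finds it); the function name `quad_bezier_to_line` is made up for this example.

```python
def quad_bezier_to_line(p0, p1, p2, a, b):
    """Closest point on a quadratic Bezier segment to the line through a and b.

    Assumes the curve does not cross the line. Returns (t, distance).
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = (dx * dx + dy * dy) ** 0.5
    nx, ny = -dy / length, dx / length          # unit normal of the line

    def signed_distance(t):
        u = 1.0 - t
        x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
        y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
        return nx * (x - a[0]) + ny * (y - a[1])

    candidates = [0.0, 1.0]                     # the two end points
    # B'(t) = 2[(p1 - p0) + t (p0 - 2 p1 + p2)]; the tangent is parallel to the
    # line where normal . B'(t) = 0, which is linear in t.
    c0 = nx * (p1[0] - p0[0]) + ny * (p1[1] - p0[1])
    c1 = nx * (p0[0] - 2 * p1[0] + p2[0]) + ny * (p0[1] - 2 * p1[1] + p2[1])
    if c1 != 0.0:
        t = -c0 / c1
        if 0.0 < t < 1.0:
            candidates.append(t)
    t_best = min(candidates, key=lambda t: abs(signed_distance(t)))
    return t_best, abs(signed_distance(t_best))
```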
In the case of two Bézier curves
Depends if you want the exact analytical result, or if an optimised numerical result is good enough.
Analytical result
Given two Bézier curves A(t) and B(s), you can derive equations for their local orientation A'(t) and B'(s). The point pairs for which A'(t) = B'(s) are candidates, i.e. the (t, s) for which the curves are locally parallel. I haven't checked, but I assume that A'(t) - B'(s) = 0 can be solved analytically. If your curves are anything like those you show in your example, there should be either only one solution or no solution to that equation, but there could be two (or infinitely many in the case where the curves are identical but translated, in which case you can ignore this because the winner will always be one of the curve segment endpoints).
In an approach similar to the curve-line case outlined above, test each of these point pairs, plus the curve segment endpoints. The shortest distance wins.
Numerical result
Let's say the points on the two Bézier curves are defined as A(t) and B(s). You want to minimize the distance d( t, s) = |A(t) - B(s)|. It's a simple two-parameter optimization problem: find the s and t that minimize d( t, s) with the constraints 0 ≤ t ≤ 1 and 0 ≤ s ≤ 1.
Since d = SQRT( ( xA - xB)² + (yA - yB)²), you can also just minimize the function f( t, s) = [d( t, s)]² to save a square root calculation.
There are numerous ready-made methods for such optimization problems. Pick and choose.
Note that in both cases above, anything higher-order than quadratic Bézier curves can give you more than one local minimum, so this is something to watch out for. From the examples you give, it looks like your curves have no inflexion points, so this concern may not apply in your case.
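As one possible stand-in for a ready-made optimizer, the two-parameter minimization of f(t, s) can be sketched with a repeated grid search in plain Python. This is only an illustration for quadratic curves: the grid size, the number of refinement rounds, and the function names are assumptions, and a grid refinement can settle on the wrong local minimum when several exist.

```python
def quad_point(t, p0, p1, p2):
    """Evaluate a quadratic Bezier curve at parameter t."""
    u = 1.0 - t
    return (u*u*p0[0] + 2*u*t*p1[0] + t*t*p2[0],
            u*u*p0[1] + 2*u*t*p1[1] + t*t*p2[1])

def closest_between_curves(curve_a, curve_b, grid=20, rounds=6):
    """Minimize f(t, s) = |A(t) - B(s)|^2 on [0, 1] x [0, 1] by repeated grid search.

    Returns (t, s, distance).
    """
    def f(t, s):
        ax, ay = quad_point(t, *curve_a)
        bx, by = quad_point(s, *curve_b)
        return (ax - bx) ** 2 + (ay - by) ** 2

    t_lo, t_hi, s_lo, s_hi = 0.0, 1.0, 0.0, 1.0
    best = None
    for _ in range(rounds):
        ts = [t_lo + (t_hi - t_lo) * i / grid for i in range(grid + 1)]
        ss = [s_lo + (s_hi - s_lo) * j / grid for j in range(grid + 1)]
        best = min((f(t, s), t, s) for t in ts for s in ss)
        _, t, s = best
        # Shrink the search window around the best sample, clamped to [0, 1].
        dt, ds = (t_hi - t_lo) / grid, (s_hi - s_lo) / grid
        t_lo, t_hi = max(0.0, t - dt), min(1.0, t + dt)
        s_lo, s_hi = max(0.0, s - ds), min(1.0, s + ds)
    value, t, s = best
    return t, s, value ** 0.5

# Hypothetical curves: an arch below and a bowl above, closest at their middles.
curve_a = ((0.0, 0.0), (1.0, 1.0), (2.0, 0.0))
curve_b = ((0.0, 3.0), (1.0, 2.0), (2.0, 3.0))
print(closest_between_curves(curve_a, curve_b))
```

The search works on the squared distance and takes a single square root at the end, as suggested above.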
The point where their normals match is the nearest point. I mean, you draw a line orthogonal to the given line; if that line is orthogonal to the curve as well, then its point of intersection with the curve is the nearest point.
