inverse interpolation of multidimensional grids - math

I am working on a project that interpolates sample data {(x_i, y_i)} where the inputs x_i lie in a 4D space and the outputs y_i lie in a 3D space. I need to generate two look-up tables, one for each direction. I managed to generate the 4D -> 3D table, but the 3D -> 4D one is tricky. The sample data are not on regular grid points, and the mapping is not one-to-one. Is there any known method for treating this situation? I did some searching online, but what I found only covers 3D -> 3D mappings, which does not suit this case. Thank you!
To answer the questions of Spektre:
X(3D) -> Y(4D) is the case 1X -> nY
I want to generate a table so that for any given X we can find a value for Y. The sample data do not occupy the whole domain of X, but that is fine; we only need accuracy for points inside the domain of the sample data. For example, we have sample data like {(x1,x2,x3) -> (y1,y2,y3,y4)}. It is possible that we also have a sample point {(x1,x2,x3) -> (y1_1,y2_1,y3_1,y4_1)} for the same X, but that is OK. We need a table so that any (a,b,c) in space X corresponds to ONE (e,f,g,h) in space Y. There might be more than one choice, but we only need one. (Sorry for any confusing symbols.)
One possible way to deal with this: since I have already established a smooth mapping from Y -> X, I can use Newton's method (or any other root-finding method) to search backwards for the point Y corresponding to any given X. But this is not accurate enough and is time consuming, because I need to do a search for each point in the table, and the error is the sum of the model error and the search error.
So I want to know whether it is possible to find a mapping that interpolates the sample data directly, instead of doing the kind of search described in point 3.

You are looking for projections/mappings
As you mentioned, you have a projection X(3D) -> Y(4D) which is not one-to-one in your case. So which case is it: (1 X -> n Y), (n X -> 1 Y), or (n X -> m Y)?
you want to use look-up table
I assume you just want to generate all X for a given Y. The problem with non (1 to 1) mappings is that you can use a lookup table only if it contains
all valid points,
or the mapping has some geometric or mathematical symmetry (for example, the distance between points in X space and Y space is similar, and the mapping is continuous).
You cannot interpolate between generic mapped points, so the question is: what kind of mapping/projection do you have in mind?
First, the 1 -> 1 projections/mappings and interpolation:
if your X -> Y projection/mapping is suitable for interpolation,
then for 3D -> 4D use tri-linear interpolation: find the closest 8 points (forming a grid cell around the query) and interpolate between them, producing all 4 output components;
if your X <- Y projection/mapping is suitable for interpolation,
then for 4D -> 3D use quatro-linear interpolation: find the closest 16 points (forming a grid hypercube around the query) and interpolate between them, producing all 3 output components.
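As an illustration of the 1 -> 1 case, here is a minimal sketch (in Python with numpy, purely illustrative and not from the original answer) of trilinear interpolation of a 4-component sample over the 8 corners of the enclosing grid cell; it assumes the corner values and the fractional position inside the cell have already been found:

import numpy as np

def trilinear_4d(corners, t):
    # corners: array of shape (2, 2, 2, 4) holding the Y values at the 8 cell
    #          corners, indexed by (ix, iy, iz) in {0, 1}
    # t:       fractional position (tx, ty, tz) inside the cell, each in [0, 1]
    tx, ty, tz = t
    cx = corners[0] + (corners[1] - corners[0]) * tx   # collapse x: 4 lerps
    cy = cx[0] + (cx[1] - cx[0]) * ty                  # collapse y: 2 lerps
    return cy[0] + (cy[1] - cy[0]) * tz                # collapse z: 1 lerp (4+2+1 = 7)

# usage: corner values would come from the 8 closest grid points around the query
corners = np.random.rand(2, 2, 2, 4)
print(trilinear_4d(corners, (0.25, 0.5, 0.75)))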
Now what about 1 -> n or n -> m projections/mappings?
That depends solely on the properties of the projection/mapping, which I know nothing about. Try to provide an example of your datasets; adding an image would be best.
[edit1] 1 X <- n Y
I would still use quatro-linear interpolation. You will still need to search your Y table, but if you organize it like a 4D grid it should be easy enough.
find the 16 points in the Y table closest to your input Y point
These should be the closest points to your Y in the + and - directions of each axis. In 3D it looks like this:
the red point is your input Y point
the blue points are the closest points found (the grid cell); they do not need to be as symmetric as in the image
Please do not ask me to draw a 4D example that makes sense :) (at least not to a sober mind)
interpolation
Find the corresponding X points. If there is more than one X per Y point, choose the one closest to the others. Now you should have 16 X points and 16+1 Y points. From the Y points you just need to compute the distance of your input Y point along the grid lines. These distances are used as the parameters for the linear interpolations. Normalize them to <0,1>, where
0 means the 'left' point and 1 means the 'right' point
0.5 means the exact middle
You will need this scalar distance in each of the Y-domain dimensions. Now just compute all the X points along the linear interpolations until you get the corresponding red point in the X domain.
With tri-linear interpolation (3D) there are 4+2+1 = 7 linear interpolations (as in the image). With quatro-linear interpolation (4D) there are 8+4+2+1 = 15 linear interpolations.
linear interpolation
X = X0 + (X1-X0)*t
X is interpolated point
X0,X1 are the 'left','right' points
t is the distance parameter <0,1>
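The cascade just described can be sketched like this (Python/numpy, illustrative only; it assumes the 2^d corner values, e.g. the X points attached to the 16 closest Y grid points for d = 4, and the normalized per-axis parameters t have already been found):

import numpy as np

def lerp(x0, x1, t):
    # X = X0 + (X1 - X0)*t, exactly the formula above
    return x0 + (x1 - x0) * t

def multilinear(corners, t):
    # corners: array of shape (2,)*d + (m,), the m-component values at the 2^d
    #          cell corners (e.g. 16 corner X points, 3 components each, for d = 4)
    # t:       sequence of d normalized distances in <0,1>, one per axis
    values = np.asarray(corners, dtype=float)
    for ti in t:                                 # each pass halves the corner count:
        values = lerp(values[0], values[1], ti)  # 8, then 4, then 2, then 1 lerps in 4D
    return values                                # the interpolated point

# usage for the 4D case: 16 corner X points with 3 components each
corners = np.random.rand(2, 2, 2, 2, 3)
print(multilinear(corners, (0.2, 0.5, 0.7, 0.9)))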

Related

Given a set of points with x, y and z coordinates whose bounds are 0 to 1 (inclusive), determine if they're all uniformly distributed (or close to)

I'm trying to determine whether a set of points is uniformly distributed in a 1 x 1 x 1 cube. Each point comes with x, y, and z coordinates that give its location in the cube.
A trivial way I can think of is to flatten the set of points into two 2D projections and check how uniformly distributed each one is, but I do not know whether that is a correct way of doing it.
Does anyone else have any ideas?
I would compute a point density map and then check it for anomalies:
definitions
Let's assume we have N points to test. If the points are uniformly distributed then they should form a "uniform grid" of m*m*m points:
m * m * m = N
m = N^(1/3)
To account for disturbances from the uniform grid and to assess statistics, you need to divide your cube into a grid of cubes where each cube holds several points (so statistical properties can be computed). Let's assume k >= 5 points per grid cube, so:
cubes = m/k
create a 3D array of counters
we simply need an integer counter for each grid cube, so:
int map[cubes][cubes][cubes];
fill it with zeroes.
process all points p(x,y,z) and update map[][][]
Simply loop through all of your points, compute the grid-cube position each one belongs to, and increment that cube's counter (assuming x, y, z are in [0, 1]):
map[(int)(x*(cubes-1))][(int)(y*(cubes-1))][(int)(z*(cubes-1))]++;
compute average count of the map[][][]
a simple average like this will do:
double avg=0.0;
int xx,yy,zz;
for (xx=0;xx<cubes;xx++)
 for (yy=0;yy<cubes;yy++)
  for (zz=0;zz<cubes;zz++)
   avg+=map[xx][yy][zz];   // sum all counters
avg/=cubes*cubes*cubes;    // divide by the number of grid cubes
now just compute the mean absolute deviation from this average
double d=0.0;              // fabs() needs <math.h>
for (xx=0;xx<cubes;xx++)
 for (yy=0;yy<cubes;yy++)
  for (zz=0;zz<cubes;zz++)
   d+=fabs(map[xx][yy][zz]-avg);
d/=cubes*cubes*cubes;
The value d is a metric of how far the points are from uniform density, where 0 means a uniform distribution, so just threshold it. Note that d also depends on the number of points; my intuition tells me that d >= k means totally non-uniform, so if you want to make the test more robust you can do something like this (the threshold might need tweaking):
d/=k;
if (d<0.25) uniform;
else nonuniform;
As you can see, all of this is O(N) time, so it should be fast enough for you. If it isn't, you can evaluate only every 10th point, but that works only if the order of the points is random; if not, you would need to pick N/10 random points instead. The 10 can be any constant, but keep in mind that you still need enough points for the statistics to represent your set, so I would not go below 250 points (though that depends on exactly what you need).
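For reference, here is the whole recipe condensed into a short Python/numpy sketch (illustrative only; k = 5 and the 0.25 threshold follow the discussion above, and the binning uses x*cubes clipped to the last cube, a minor variation of the map[x*(cubes-1)] indexing shown earlier):

import numpy as np

def uniformity_metric(points, k=5):
    # points: (N, 3) array with coordinates in [0, 1]
    # returns d/k; values near 0 suggest a uniform (grid-like) distribution
    points = np.asarray(points, dtype=float)
    n = len(points)
    m = n ** (1.0 / 3.0)                       # side of the ideal m*m*m grid
    cubes = max(1, int(round(m / k)))          # grid cubes per axis (cubes = m/k)
    idx = np.clip((points * cubes).astype(int), 0, cubes - 1)
    counts = np.zeros((cubes, cubes, cubes), dtype=int)
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    avg = counts.mean()                        # average count per grid cube
    d = np.abs(counts - avg).mean()            # mean absolute deviation
    return d / k

# usage: a slightly jittered 20 x 20 x 20 grid should read as uniform
g = (np.arange(20) + 0.5) / 20.0
pts = np.stack(np.meshgrid(g, g, g), axis=-1).reshape(-1, 3)
pts = pts + np.random.uniform(-0.01, 0.01, pts.shape)
print("uniform" if uniformity_metric(pts) < 0.25 else "nonuniform")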
Here are a few of my answers using the density-map technique:
Finding holes in 2d point sets?
Location of highest density on a sphere

How to find the whole distance between two points along a curved line in R?

I have a line graph of this kind, plotted using the R plot function (plot(df)).
I want to get the length of the whole line between two points on the graph (e.g., between x(1) and x(3)). How can I do this?
1. If your function is defined over a fine grid of points, you can compute the length of the line segment between each pair of adjacent points and add them up. Pythagoras is your friend here: each segment contributes √(δx² + δy²).
To the extent that the points are not close enough together for the function to be essentially linear between them, this will tend to (generally only slightly) underestimate the arc length.
Note that if your x values are stored in increasing order, the δx and δy values can be obtained directly by differencing (in R that's diff).
2. If you have a functional form for y as a function of x, you can apply the arc-length integral -- i.e. integrate
∫ √[1+(dy/dx)²] dx
between a and b. This is essentially just the calculation in 1 taken to the limit.
3. If both x and y are parametric functions of another variable (t, say), you can write the above integral in parametric form (not forgetting the Jacobian), which means integrating
∫ √[(dx/dt)²+(dy/dt)²] dt
between a and b
(Note the direct parallel to 1.)
4. If you don't have a convenient-to-integrate functional form in 2 or 3, you can use numerical quadrature; this can be quite efficient (which is handy when the derivative function is expensive to evaluate).
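A short illustration of approaches 1 and 2 (the question is about R, where the discrete sum is built directly from diff; this sketch uses Python/numpy/scipy with a made-up example curve y = sin(x)):

import numpy as np
from scipy.integrate import quad

f, fprime = np.sin, np.cos                  # stand-in for the plotted data

# 1. discrete version: sum of segment lengths via Pythagoras
x = np.linspace(1, 3, 1001)                 # fine grid of x values
y = f(x)
dx, dy = np.diff(x), np.diff(y)             # the delta-x and delta-y values
arc_discrete = np.sum(np.sqrt(dx**2 + dy**2))

# 2. arc-length integral of sqrt(1 + (dy/dx)^2) from a = 1 to b = 3
arc_integral, _ = quad(lambda s: np.sqrt(1.0 + fprime(s)**2), 1, 3)

print(arc_discrete, arc_integral)           # the two agree to several decimals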

Uniform sampling of 2D path draped on a set of 3D data points

Imagine you have a grid of sample points of a function z = f(x, y) where 1 < x < N and 1 < y < N. No formula is given, just the raw data, which could be, for example, the grey levels of an image.
Given a point A whose x and y coordinates are known (z is known from the data, so A is a vertex of the surface), I would like to find M points lying on the circumference of the circle with center A and radius R that are a good approximation of a circular "cloth" draped over the imaginary surface described by the data points. Imagine also that the surface is a triangle mesh over the data points.
The biggest constraint on the approximation is that the sum of the lengths of the edges of the resulting polygon is always R * 2 * PI, so that moving the point A across the surface only changes the M points, never the sum of their mutual distances. The draping doesn't need to be perfect; it would be nice, though, for it to be as close as possible to the surface, or always on one side of the surface, above or below.
Could anybody give me a pointer to something to read about this? Is this a known problem?
I feel that the problem is not completely formulated; I would already appreciate some help in giving a complete description of it.

Finding the coordinates of points from distance matrix

I have a set of points (with unknown coordinates) and their distance matrix. I need to find the coordinates of these points in order to plot them and show the solution of my algorithm.
I can place one of these points at the coordinate (0,0) to simplify things, and find the others. Can anyone tell me whether it is possible to find the coordinates of the other points, and if so, how?
Thanks in advance!
EDIT
Forgot to say that I need the coordinates in x-y only.
The answers based on angles are cumbersome to implement and can't be easily generalized to data in higher dimensions. A better approach is that mentioned in my and WimC's answers here: given the distance matrix D(i, j), define
M(i, j) = 0.5*(D(1, j)^2 + D(i, 1)^2 - D(i, j)^2)
which should be a positive semi-definite matrix with rank equal to the minimal Euclidean dimension k in which the points can be embedded. The coordinates of the points can then be obtained from the k eigenvectors v(i) of M corresponding to non-zero eigenvalues q(i): place the vectors sqrt(q(i))*v(i) as columns in an n x k matrix X; then each row of X is a point. In other words, sqrt(q(i))*v(i) gives the ith component of all of the points.
The eigenvalues and eigenvectors of a matrix can be obtained easily in most programming languages (e.g., using GSL in C/C++, using the built-in function eig in Matlab, using Numpy in Python, etc.)
Note that this particular method always places the first point at the origin, but any rotation, reflection, or translation of the points will also satisfy the original distance matrix.
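A minimal numpy sketch of this construction (illustrative only; it assumes D is an exact Euclidean distance matrix, and it places the first point at the origin as noted above):

import numpy as np

def points_from_distances(D, tol=1e-9):
    # M(i, j) = 0.5*(D(1, j)^2 + D(i, 1)^2 - D(i, j)^2), then use the
    # eigenvectors for the non-zero eigenvalues, scaled by sqrt(eigenvalue)
    D = np.asarray(D, dtype=float)
    M = 0.5 * (D[0, :]**2 + D[:, 0][:, None]**2 - D**2)
    q, V = np.linalg.eigh(M)                 # eigenvalues ascending, M symmetric
    keep = q > tol                           # non-zero eigenvalues span k dimensions
    return V[:, keep] * np.sqrt(q[keep])     # n x k matrix, one point per row

# usage: distances between (0,0), (1,0), (0,2)
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
X = points_from_distances(D)
print(np.allclose(np.linalg.norm(X[:, None] - X[None, :], axis=-1), D))  # True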
Step 1, arbitrarily assign one point P1 as (0,0).
Step 2, arbitrarily assign one point P2 along the positive x axis: (Dp1p2, 0).
Step 3, find a point P3 such that none of the following (near-)equalities hold:
Dp1p2 ~= Dp1p3 + Dp2p3
Dp1p3 ~= Dp1p2 + Dp2p3
Dp2p3 ~= Dp1p3 + Dp1p2
and place that point in the "positive" y half-plane (if any of these equalities does hold, P3 is collinear with P1 and P2 and should simply be placed on the P1P2 axis).
Use the law of cosines to determine the angle at P1:
cos (A) = (Dp1p2^2 + Dp1p3^2 - Dp2p3^2)/(2*Dp1p2* Dp1p3)
P3 = (Dp1p3 * cos (A), Dp1p3 * sin(A))
You have now successfully built an orthonormal space and placed three points in that space.
Step 4: to determine all the other points, repeat step 3 to get a tentative y coordinate, giving (Xn, Yn).
Compare the distance between (Xn, Yn) and (X3, Y3) to Dp3pn in your matrix. If they match, you have successfully identified the coordinates of point n. Otherwise, point n is at (Xn, -Yn).
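The same steps in a small Python/numpy sketch (illustrative only; it assumes an exact, noise-free 2D distance matrix D, 0-based indexing, and that the first three points are not collinear):

import numpy as np

def coords_from_distance_matrix(D, tol=1e-6):
    n = D.shape[0]
    P = np.zeros((n, 2))
    P[1] = (D[0, 1], 0.0)                    # step 2: P2 on the positive x axis
    # step 3: place P3 in the upper half-plane via the law of cosines
    cosA = (D[0, 1]**2 + D[0, 2]**2 - D[1, 2]**2) / (2 * D[0, 1] * D[0, 2])
    A = np.arccos(np.clip(cosA, -1.0, 1.0))
    P[2] = (D[0, 2] * np.cos(A), D[0, 2] * np.sin(A))
    # step 4: tentative (x, y) for every other point; flip the sign of y
    # if the distance to P3 does not match the matrix
    for i in range(3, n):
        cosA = (D[0, 1]**2 + D[0, i]**2 - D[1, i]**2) / (2 * D[0, 1] * D[0, i])
        A = np.arccos(np.clip(cosA, -1.0, 1.0))
        x, y = D[0, i] * np.cos(A), D[0, i] * np.sin(A)
        if abs(np.hypot(x - P[2, 0], y - P[2, 1]) - D[2, i]) > tol:
            y = -y
        P[i] = (x, y)
    return P

# usage: rebuild the square (0,0), (2,0), (2,2), (0,2) from its distance matrix
Q = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
D = np.linalg.norm(Q[:, None] - Q[None, :], axis=-1)
print(coords_from_distance_matrix(D))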
Note there is an alternative to step 4, but it is too much math for a Saturday afternoon
If for points p, q, and r you have pq, qr, and rp in your matrix, you have a triangle.
Wherever you have a triangle in your matrix you can compute one of two solutions for that triangle (independent of a Euclidean transform of the triangle in the plane). That is, for each triangle you compute, its mirror image is also a triangle that satisfies the distance constraints on p, q, and r. The fact that there are two solutions even for a single triangle leads to the chirality problem: you have to choose the chirality (orientation) of each triangle, and not all choices may lead to a feasible solution to the problem.
Nevertheless, I have some suggestions. If the number of entries is small, consider using simulated annealing. You could incorporate chirality into the annealing step. This will be slow for large systems, and it may not converge to a perfect solution, but for some problems it's the best you can do.
The second suggestion will not give you a perfect solution, but it will distribute the error: the method of least squares. In your case the objective function will be the error between the distances in your matrix and the actual distances between your points.
This is a math problem: derive the coordinate matrix X given only its distance matrix.
There is an efficient solution to this -- multidimensional scaling (MDS), which uses some linear algebra. Simply put, it requires a pairwise Euclidean distance matrix D, and the output is an estimated coordinate matrix Y (perhaps rotated) that approximates X. For a ready-made implementation, use sklearn.manifold.MDS (scikit-learn) in Python.
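For completeness, a short sketch of that route (assuming scikit-learn is installed; metric MDS on a precomputed dissimilarity matrix):

import numpy as np
from sklearn.manifold import MDS

# D is the pairwise Euclidean distance matrix (here built from three example points)
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)

# the embedding Y matches the original configuration up to rotation,
# reflection, and translation
mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
Y = mds.fit_transform(D)
print(Y)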
The "eigenvector" method given in the top-voted answers above is very general and automatically outputs a set of coordinates as the OP requested. However, I noticed that the algorithm does not even ask for a desired orientation (rotation angle) for the frame of the output points; it chooses that orientation all by itself.
People who use it might want to know beforehand at what angle the frame will be tipped, so I found an equation which gives the answer for the case of up to three input points. I have not had time to generalize it to n points and hope someone will do that and add it to this discussion. Here are the three angles the output sides will form with the x axis, as a function of the input side lengths:
angle side a = arcsin(sqrt(((c+b+a)*(c+b-a)*(c-b+a)*(-c+b+a)*(c^2-b^2)^2)/(a^4*((c^2+b^2-a^2)^2+(c^2-b^2)^2))))*180/Pi/2
angle side b = arcsin(sqrt(((c+b+a)*(c+b-a)*(c-b+a)*(-c+b+a)*(c^2+b^2-a^2)^2)/(4*b^4*((c^2+b^2-a^2)^2+(c^2-b^2)^2))))*180/Pi/2
angle side c = arcsin(sqrt(((c+b+a)*(c+b-a)*(c-b+a)*(-c+b+a)*(c^2+b^2-a^2)^2)/(4*c^4*((c^2+b^2-a^2)^2+(c^2-b^2)^2))))*180/Pi/2
Those equations also lead directly to a solution to the OP's problem of finding the coordinates of each point: the side lengths are given as the input, and my equations give the slope of each side with respect to the x axis of the solution, thus revealing the vector for each side of the polygon; summing those side vectors up to a desired vertex then produces the coordinates of that vertex. So if anyone can extend my angle equations to handle more than three input lengths (though I note that might be impossible), it could be a very fast route to the general solution of the OP's question, since the slow parts of the algorithms given above, such as least-squares fitting or solving a matrix equation, might be avoided.

Determine if a set of points lie on a regular grid

Problem: Suppose you have a collection of points in the 2D plane. I want to know if this set of points sits on a regular grid (if they are a subset of a 2D lattice). I would like some ideas on how to do this.
For now, let's say I'm only interested in whether these points form an axis-aligned rectangular grid (the underlying lattice is rectangular and aligned with the x and y axes) and whether the grid is complete (the subset of the lattice has a rectangular boundary with no holes). Any solution must be quite efficient (better than O(N^2)), since N can be hundreds of thousands or millions.
Context: I wrote a 2D vector-field plot generator which works for an arbitrarily sampled vector field. When the sampling is on a regular grid, there are simpler and more efficient interpolation schemes for generating the plot, and I would like to know when I can use this special case. The improvement is large enough to be worth handling. The program is written in C.
This might be dumb, but if your points were to lie on a regular grid, wouldn't the peaks in the Fourier transform of the coordinates all be exact multiples of the grid resolution? You could do a separate Fourier transform of the X and the Y coordinates. If there are no holes in the grid, then the FT would be a delta function, I think. The FFT is O(n log n).
p.s. I would have left this as a comment but my rep is too low..
Not quite sure if this is what you are after, but for a collection of 2D points in a plane you can always fit them onto a rectangular grid (down to the precision of your points, anyway); the problem is that the grid they fit may be too sparsely populated by the points to provide any benefit to your algorithm.
To find a rectangular grid that fits a set of points, you essentially need to find the GCD of all the x coordinates and the GCD of all the y coordinates, with the origin at (xmin, ymin); this should be O(n (log n)^2), I think.
How you decide whether this grid is then too sparse is not clear, however.
If the points all come only from intersections on the grid, then the Hough transform of your set of points might help you. If you find that two mutually perpendicular sets of lines occur most often (meaning you find peaks at four values of theta, all 90 degrees apart) and you find repeating peaks in gamma space, then you have a grid; otherwise not.
Here's a solution that works in O(ND log N), where N is the number of points and D is the number of dimensions (2 in your case).
Allocate D arrays with space for N numbers: X, Y, Z, etc. (Time: O(ND))
Iterate through your point list and add the x-coordinate to list X, the y-coordinate to list Y, etc. (Time: O(ND))
Sort each of the new lists. (Time: O(ND log N))
Count the number of unique values in each list and make sure the difference between successive unique values is the same across the whole list. (Time: O(ND))
If
the unique values in each dimension are equally spaced, and
the product of the numbers of unique values per coordinate equals the number of original points (length(uniq(X)) * length(uniq(Y)) * ... == N),
then the points are in a regular rectangular grid.
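The recipe above, condensed into a Python/numpy sketch (illustrative only; it assumes coordinates that repeat exactly, i.e. no floating-point jitter in the shared x and y values, and uses a small tolerance only for the spacing check):

import numpy as np

def is_complete_regular_grid(points, tol=1e-9):
    # per axis: the sorted unique values must be evenly spaced; then the
    # product of the unique counts must equal N (complete grid, no duplicates)
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape
    expected = 1
    for axis in range(d):
        u = np.unique(pts[:, axis])           # sort + collect unique values
        if u.size > 1:
            steps = np.diff(u)
            if np.any(np.abs(steps - steps[0]) > tol):
                return False                  # unequal spacing along this axis
        expected *= u.size
    return expected == n

# usage: a 4 x 3 grid passes, the same grid with one point removed fails
gx, gy = np.meshgrid(np.arange(4) * 0.5, np.arange(3) * 0.25)
grid = np.column_stack([gx.ravel(), gy.ravel()])
print(is_complete_regular_grid(grid), is_complete_regular_grid(grid[:-1]))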
Let's say a grid is defined by an orientation Or (between 0 and 90 degrees) and a resolution Res. You could compute a cost function that evaluates how well a grid (Or, Res) fits your points; for example, the average distance of each point to the closest grid point.
Your problem is then to find the (Or, Res) pair that minimizes the cost function. To narrow the search space and speed things up, a heuristic for testing "good" candidate grids could be used.
This approach is the same as the one used in the Hough transform proposed by jilles; the (Or, Res) space is comparable to Hough's gamma space.
