I have two paths in 3D and I want to "average" them, if there's such a thing.
I have the xyz points, each timestamped with the time it was sampled:
ms x y z
3 0.1 0.2 0.6
12 0.1 0.2 1.3
23 2.1 4.2 0.3
55 0.1 6.2 0.3
Facts about the paths:
They all start and end on/near the same xyz point.
I have the total duration it took to complete each path, as well as the individual vertices.
They have different lengths (i.e. a different number of xyz points).
Any help would be appreciated.
A simple method is the following...
First build a function interp(t, T, waypoints) that given the current time t, the total path duration T and the path waypoints returns the current position. This can be done using linear interpolation or more sophisticated approaches to avoid speed or acceleration discontinuities.
Once you have interp, the average path can be defined as follows (example in Python):
def avg(t, T1, waypoints1, T2, waypoints2):
    T = (T1 + T2) / 2
    return middlePoint(interp(t*T1/T, T1, waypoints1),
                       interp(t*T2/T, T2, waypoints2))
The duration of the average path will be the average, T = (T1 + T2) / 2, of the two durations.
It's also easy to change this approach to make a weighted average path.
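To make the idea concrete, here is a minimal sketch of interp using plain linear interpolation, together with a middlePoint helper so that avg runs as written. It assumes each waypoint is a (time, x, y, z) tuple, that the list is sorted by time, and that times are measured from the start of the path (so they run from 0 to T). The tuple layout is an assumption for illustration, not something the question specifies.

```python
from bisect import bisect_right

def interp(t, T, waypoints):
    """Linearly interpolate the position at time t.

    Assumes waypoints is a list of (time, x, y, z) tuples sorted by time,
    with times measured from the start of the path (0 <= time <= T).
    """
    t = max(0.0, min(t, T))                 # clamp t into the path's time span
    times = [w[0] for w in waypoints]
    i = bisect_right(times, t)              # first waypoint strictly after t
    if i == 0:
        return tuple(waypoints[0][1:])
    if i == len(waypoints):
        return tuple(waypoints[-1][1:])
    (t0, *p0), (t1, *p1) = waypoints[i - 1], waypoints[i]
    f = (t - t0) / (t1 - t0)                # fraction of the way from p0 to p1
    return tuple(a + f * (b - a) for a, b in zip(p0, p1))

def middlePoint(p, q):
    """Component-wise midpoint of two points."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def avg(t, T1, waypoints1, T2, waypoints2):
    T = (T1 + T2) / 2
    return middlePoint(interp(t * T1 / T, T1, waypoints1),
                       interp(t * T2 / T, T2, waypoints2))

# Two straight paths with different durations, sampled halfway through
path1 = [(0, 0, 0, 0), (10, 10, 0, 0)]
path2 = [(0, 0, 0, 0), (20, 0, 20, 0)]
print(avg(7.5, 10, path1, 20, path2))
```

For a weighted average, one option is to replace the two divisions by 2 with weights w and 1 - w.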
In R, the distances between consecutive points in that series, assuming it is in a data frame named "dat", would be:
with(dat, sqrt(diff(x)^2 +diff(y)^2 +diff(z)^2) )
#[1] 0.700000 4.582576 2.828427
There are a couple of averages I could think of: average distance per interval, or average distance traveled per unit time. It depends on what you want. This gives the average speed in each of the three intervals:
with(dat, sqrt(diff(x)^2 +diff(y)^2 +diff(z)^2) /diff(ms) )
#[1] 0.07777778 0.41659779 0.08838835
There is definitely such a thing. For each point on path A, find the point that corresponds to your current point on path B, and then find the mid-point between those corresponding vertices. You will then get a path in-between the two that is the "average" of the two paths. If you have a mismatch because the two paths were not sampled at the same times, then for an interior point on path A (i.e., not the end-point), find the two closest sampled points with a similar time-sampling on path B, and locate the mid-point of the triangle those three points make.
Now since you've discretized your path by sampling it, this "average" is only going to be an approximation, not a "true" average like you could get by solving for the average function between two differentiable parametric functions defined by r(t) = <x(t), y(t), z(t)>.
Expanding on #6502's answer.
If you wish to retrieve a list of points that would make up the average path, you could sample the avg function at the time instants of the individual input points (stretched toward the average duration).
def avg2(T1, waypoints1, T2, waypoints2):
    # This is a generator; wrap the call in list() to get a list of points.
    # Waypoints are assumed to be (t, x, y, z) tuples.
    # Collect the times we want to sample at
    T = (T1 + T2) / 2
    times = []
    times.extend(t*T/T1 for (t, x, y, z) in waypoints1)  # Shift the time towards
    times.extend(t*T/T2 for (t, x, y, z) in waypoints2)  # the average
    times.sort()
    last_t = None
    for t in times:
        # Skip a sample that (nearly) coincides with the previous one
        if last_t is not None and last_t + 1.0e-6 >= t:
            continue
        last_t = t
        # Sample the average path at this instant
        x, y, z = avg(t, T1, waypoints1, T2, waypoints2)
        yield t, x, y, z
I have two points which form one line: (1,4) and (3,6), and another two which form another line: (2,1) and (4,2). These lines are continuous and I can find their intersection points by finding the equation for each line, and then equating them to find the x value at the intersection point, and then the y value.
i.e. for the first line, the equation is y = x + 3, and the second is y = 0.5x. At the intersection the y values are the same so x + 3 = 0.5x. So x = -6. Subbing this back into either of the equations gives a y value of -3.
From those steps, I now know that the intersection point is (-6,-3). The problem is I need to do the same steps in Excel, preferably as one formula. Can anyone give me some advice on how I would start this?
It's long, but here it is:
Define x1,y1 and x2,y2 for the 1st line and x3,y3 and x4,y4 for the second.
x = [ (x2*y1 - x1*y2)*(x4 - x3) - (x4*y3 - x3*y4)*(x2 - x1) ] / [ (x2 - x1)*(y4 - y3) - (x4 - x3)*(y2 - y1) ]
y = [ (x2*y1 - x1*y2)*(y4 - y3) - (x4*y3 - x3*y4)*(y2 - y1) ] / [ (x2 - x1)*(y4 - y3) - (x4 - x3)*(y2 - y1) ]
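Note that the denominators are the same. They will be zero when the lines are parallel, in which case the system has no unique solution (either no intersection at all, or the two lines coincide). So you may want to check that in another cell and conditionally compute the answer.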
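As a cross-check before building the spreadsheet formula, the same expressions can be sketched in Python. The function name line_intersection and the eps tolerance are illustrative choices, not part of the original answer; the arithmetic is the formula above with the numerator bracketed.

```python
def line_intersection(p1, p2, p3, p4, eps=1e-12):
    """Intersection of the line through p1, p2 with the line through p3, p4.

    Returns (x, y), or None when the shared denominator is (near) zero,
    i.e. when the lines are parallel or coincident.
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x2 - x1) * (y4 - y3) - (x4 - x3) * (y2 - y1)
    if abs(denom) < eps:
        return None
    a = x2 * y1 - x1 * y2          # term built from the first line's points
    b = x4 * y3 - x3 * y4          # term built from the second line's points
    x = (a * (x4 - x3) - b * (x2 - x1)) / denom
    y = (a * (y4 - y3) - b * (y2 - y1)) / denom
    return x, y

# The two lines from the question: y = x + 3 and y = 0.5x
print(line_intersection((1, 4), (3, 6), (2, 1), (4, 2)))  # (-6.0, -3.0)
```

In a spreadsheet the None branch corresponds to wrapping the formula in an IF that tests the denominator cell first.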
Essentially, this formula is derived by solving a system of equations for x and y by hand using generic points (x1,y1), (x2,y2), (x3,y3), and (x4,y4). Easier yet, is solving the system by hand using well developed linear algebra concepts.
Wikipedia outlines this procedure well: Line-line intersection.
Also, this website describes all the different formulas and lets you put in whatever data you have in any mixed format and provides many details of the solutions: Everything about 2 lines.
Here's a matrix based solution:
x - y = -3
0.5*x - y = 0
Written as a matrix equation (I apologize for the poor typesetting):
| 1.0 -1.0 |{ x } { -3 }
| 0.5 -1.0 |{ y } = { 0 }
You can invert this matrix or use LU decomposition to solve it to get the answer. That method will work for any number of cases where you have one equation for each unknown.
This is easy to do by hand:
Subtract the second equation from the first: 0.5*x = -3
Divide both sides by 0.5: x = -6
Substitute this result into the other equation: y = 0.5*x = -3
I have a problem I wish to solve in R with example data below. I know this must have been solved many times but I have not been able to find a solution that works for me in R.
The core of what I want to do is to find how to translate a set of 2D coordinates to best fit into another, larger, set of 2D coordinates. Imagine for example having a Polaroid photo of a small piece of the starry sky with you out at night, and you want to hold it up in a position so they match the stars' current positions.
Here is how to generate data similar to my real problem:
# create reference points (the "starry sky")
set.seed(99)
ref_coords = data.frame(x = runif(50,0,100), y = runif(50,0,100))
# take a subset of the reference coordinates to serve as the points we
# are looking for ("the Polaroid")
my_coords_final = ref_coords[c(5,12,15,24,31,34,48,49),]
# add a little bit of variation as compared to reference points
# (data should be very similar, but have a little bit of noise)
set.seed(100)
my_coords_final$x = my_coords_final$x+rnorm(8,0,.1)
set.seed(101)
my_coords_final$y = my_coords_final$y+rnorm(8,0,.1)
# create "start values" by, e.g., translating the points we are
# looking for to start at (0,0)
my_coords_start =apply(my_coords_final,2,function(x) x-min(x))
# Plot of example data, goal is to find the dotted vector that
# corresponds to the translation needed
plot(ref_coords, cex = 1.2) # "Starry sky"
points(my_coords_start,pch=20, col = "red") # start position of "Polaroid"
points(my_coords_final,pch=20, col = "blue") # corrected position of "Polaroid"
segments(my_coords_start[1,1],my_coords_start[1,2],
my_coords_final[1,1],my_coords_final[1,2],lty="dotted")
Plotting the data as above should yield:
The result I want is basically what the dotted line in the plot above represents, i.e. a delta in x and y that I could apply to the start coordinates to move them to their correct position in the reference grid.
Details about the real data
There should be close to no rotational or scaling difference between my points and the reference points.
My real data is around 1000 reference points and up to a few hundred points to search (could use less if more efficient)
I expect to have to search about 10 to 20 sets of reference points to find my match, as many of the reference sets will not contain my points.
Thank you for your time, I'd really appreciate any input!
EDIT: To clarify, the right plot represents the reference data. The left plot represents the points that I want to translate across the reference data in order to find a position where they best match the reference. That position, in this case, is represented by the blue dots in the previous figure.
Finally, any working strategy must not use the data in my_coords_final, but rather reproduce that set of coordinates starting from my_coords_start using ref_coords.
So, the previous approach I posted (see edit history) using optim() to minimize the sum of distances between points will only work in the limited circumstance where the point distribution used as reference data is in the middle of the point field. The solution that satisfies the question and seems to still be workable for a few thousand points, would be a brute-force delta and comparison algorithm that calculates the differences between each point in the field against a single point of the reference data and then determines how many of the rest of the reference data are within a minimum threshold (which is needed to account for the noise in the data):
## A brute-force approach where min_dist can be used to
## ameliorate some random noise:
min_dist <- 5
win_thresh <- 0
win_thresh_old <- 0
for (i in 1:nrow(ref_coords)) {
  x2 <- my_coords_start[,1]
  y2 <- my_coords_start[,2]
  x1 <- ref_coords[,1] + (x2[1] - ref_coords[i,1])
  y1 <- ref_coords[,2] + (y2[1] - ref_coords[i,2])
  ## Calculate all pairwise distances between reference and field data:
  dists <- dist( cbind( c(x1, x2), c(y1, y2) ), "euclidean")
  ## Only take distances for the sampled data:
  dists <- as.matrix(dists)[-1*1:length(x1),]
  ## Calculate the number of distances within the minimum
  ## distance threshold minus the diagonal portion:
  win_thresh <- sum(rowSums(dists < min_dist) > 1)
  ## If we have more "matches" than our best then calculate a new
  ## dx and dy:
  if (win_thresh > win_thresh_old) {
    win_thresh_old <- win_thresh
    dx <- (x2[1] - ref_coords[i,1])
    dy <- (y2[1] - ref_coords[i,2])
  }
}
## Plot estimated correction (your delta x and delta y) calculated
## from the brute force calculation of shifts:
points(
  x = ref_coords[,1] + dx,
  y = ref_coords[,2] + dy,
  cex = 1.5, col = "red"
)
I'm very interested to know if there's anyone that solves this in a more efficient manner for the number of points in the test data, possibly using a statistical or optimization algorithm.
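For readers who want to experiment outside R, the same brute-force idea can be sketched in Python. This is a simplified variant under stated assumptions, not a port of the code above: it returns the shift to apply to the sample points (the opposite direction to dx and dy above), it counts a sample point as matched when it lands within tol of any reference point, and the names best_translation and tol are made up for the sketch.

```python
def best_translation(ref, pts, tol):
    """Brute-force search for the shift that best maps pts onto ref.

    Tries aligning pts[0] with every reference point in turn and keeps the
    shift under which the most points of pts land within tol of some
    reference point. Returns ((dx, dy), number_of_matched_points).
    """
    best_shift, best_hits = None, -1
    x0, y0 = pts[0]
    for rx, ry in ref:
        dx, dy = rx - x0, ry - y0           # candidate shift: pts[0] -> (rx, ry)
        hits = 0
        for px, py in pts:
            qx, qy = px + dx, py + dy
            if any((qx - ax) ** 2 + (qy - ay) ** 2 <= tol * tol
                   for ax, ay in ref):
                hits += 1
        if hits > best_hits:
            best_shift, best_hits = (dx, dy), hits
    return best_shift, best_hits

# Tiny example: pts is a subset of ref moved by (-4, -3)
ref = [(0, 0), (10, 3), (4, 8), (7, 7), (2, 5), (9, 1)]
pts = [(6, 0), (0, 5), (3, 4)]
print(best_translation(ref, pts, tol=0.5))  # ((4, 3), 3)
```

The cost grows as (reference points)^2 times (sample points), so for the sizes mentioned in the question a spatial index for the nearest-neighbour test would probably be worth adding.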
I have 4 points in space A(x,y,z), B(x,y,z), C(x,y,z) and D(x,y,z). How can I check if these points are the corner points of a rectangle?
You must first determine whether or not the points are all coplanar, since a rectangle is a 2D geometric object, but your points are in 3-space. You can determine they are coplanar by comparing cross products as in:
V1 = (B-A)×(B-C)
V2 = (C-A)×(C-D)
This will give you two vectors which, if A, B, C, and D are coplanar, are linearly dependent. By considering what Wolfram has to say on vector dependence, we can test the vectors for linear dependence by using
G = (V1∙V1)(V2∙V2) - (V1∙V2)(V2∙V1)
(written G here to avoid confusion with the point C). If G is 0 then the vectors V1 and V2 are linearly dependent and all the points are coplanar.
Next compute the distances between each pair of points. There should be a total of 6 such distances.
D1 = |A-B|
D2 = |A-C|
D3 = |A-D|
D4 = |B-C|
D5 = |B-D|
D6 = |C-D|
Assuming none of these distances are 0, these points form a rectangle if and only if the vertices are coplanar (already verified) and these lengths can be grouped into three pairs where elements of each pair have the same length. If the figure is a square, two of the pairs will have the same length and will be shorter than the remaining pair.
Update: Reading this again, I realize that the above could define a parallelogram, so an additional check is required to verify that the square of the longest distance is equal to the sum of the squares of the two shorter distances. Only then will the parallelogram also be a rectangle.
Keep in mind all of this is assuming infinite precision and within a strictly mathematical construct. If you're planning to code this up, you will need to account for rounding and accept a degree of imprecision that's not really a player when speaking in purely mathematical terms.
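The distance-based test can be sketched in Python as follows. Two points are assumptions of the sketch rather than part of the answer: coplanarity is checked with a scalar triple product instead of the cross-product comparison above (a different but standard test), and squared distances are compared with an absolute tolerance tol to absorb rounding. The function name is_rectangle is illustrative.

```python
from itertools import combinations

def is_rectangle(A, B, C, D, tol=1e-9):
    """Check whether four 3D points are the corners of a rectangle.

    The points may be given in any order.
    """
    def sub(p, q):
        return tuple(a - b for a, b in zip(p, q))

    # Coplanarity: the scalar triple product of three edge vectors is zero
    u, v, w = sub(B, A), sub(C, A), sub(D, A)
    triple = (u[0] * (v[1] * w[2] - v[2] * w[1])
              - u[1] * (v[0] * w[2] - v[2] * w[0])
              + u[2] * (v[0] * w[1] - v[1] * w[0]))
    if abs(triple) > tol:
        return False

    # The six squared pairwise distances, smallest first
    d2 = sorted(sum(c * c for c in sub(p, q))
                for p, q in combinations([A, B, C, D], 2))
    if d2[0] <= tol:                      # two points coincide
        return False

    # Three equal pairs (two side lengths and the diagonal), plus
    # Pythagoras to rule out a non-rectangular parallelogram
    return (abs(d2[0] - d2[1]) <= tol
            and abs(d2[2] - d2[3]) <= tol
            and abs(d2[4] - d2[5]) <= tol
            and abs(d2[0] + d2[2] - d2[4]) <= tol)

print(is_rectangle((0, 0, 0), (2, 0, 0), (2, 1, 0), (0, 1, 0)))  # True
```

Working with squared distances avoids square roots, but it also means tol is in squared length units and should be scaled to the size of the coordinates.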
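Assuming the points are given in order around the perimeter (so that B and D are the neighbours of A), check if V1=B-A and V2=D-A are orthogonal using the dot product. Then check if
C-A == V1+V2
within numerical tolerances. If both are true, the points are coplanar and form a rectangle.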
Here is a function that checks whether the 4 points form a rectangle.
from math import sqrt

def Verify(A, B, C, D, epsilon=0.0001):
    # The points are expected in order around the perimeter: A, B, C, D
    # Verify A-B = D-C
    zero = sqrt( (A[0]-B[0]+C[0]-D[0])**2 + (A[1]-B[1]+C[1]-D[1])**2 + (A[2]-B[2]+C[2]-D[2])**2 )
    if zero > epsilon:
        raise ValueError("Points do not form a parallelogram; C is at %g distance from where it should be" % zero)
    # Verify (D-A).(B-A) = 0
    zero = (D[0]-A[0])*(B[0]-A[0]) + (D[1]-A[1])*(B[1]-A[1]) + (D[2]-A[2])*(B[2]-A[2])
    if abs(zero) > epsilon:
        raise ValueError("Corner A is not a right angle; edge vector dot product is %g" % zero)
    else:
        print('rectangle')

# Example coordinates; replace with your own [x, y, z] values
A = [0, 0, 0]
B = [2, 0, 0]
C = [2, 1, 0]
D = [0, 1, 0]
Verify(A, B, C, D, epsilon=0.0001)
If I have a set of points that have different y positions (A,B,C) each with the same x coordinate. Is it possible to cluster this set of 3 points together and not individually?
I'd like to see the occurrence of this set of 3 points together in a given sample and see what set (A,B,C) is most frequent.
I've seen most of the clustering algorithm can cluster points for a given position (x,y) but not a set of several points for a given x coordinate.
For instance, if i have the following
X A B C
1 0.7 0.1 0.2
2 0.3 0.4 0.1
3 0.4 0.5 0.1
4 0.7 0.1 0.2
5 0.7 0.1 0.2
6 0.2 0.1 0.5
The positions x :1, 4 and 5 should be clustered together because they have the same set (A,B,C) = (0.7,0.1,0.2).
Is there any algorithm or tool (R) that is already doing that, clustering by pair of several points, finding the most occurrent pair with a graphical visualization?
Any help would be much appreciated.
If you're looking to tabulate the instances, then something along the lines of:
tab <- table(sprintf("%s:%s:%s", df1$A, df1$B, df1$C))
which.max(tab)
sort(tab, decreasing=TRUE)
will give you the most frequent combination (you can use strsplit to separate out the individual components if you need to go on and use them programmatically).
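For comparison, the same tabulation can be sketched in Python with collections.Counter, using the six rows from the question. The rows list and variable names are illustrative. Note the assumption that identical triplets are exactly equal as floats; with measured data you would probably round the values first.

```python
from collections import Counter

# (X, A, B, C) rows from the question
rows = [
    (1, 0.7, 0.1, 0.2),
    (2, 0.3, 0.4, 0.1),
    (3, 0.4, 0.5, 0.1),
    (4, 0.7, 0.1, 0.2),
    (5, 0.7, 0.1, 0.2),
    (6, 0.2, 0.1, 0.5),
]

# Count how often each (A, B, C) triplet occurs
counts = Counter((a, b, c) for _, a, b, c in rows)

# Remember which X positions share each triplet
groups = {}
for x, a, b, c in rows:
    groups.setdefault((a, b, c), []).append(x)

triplet, n = counts.most_common(1)[0]
print(triplet, n, groups[triplet])  # (0.7, 0.1, 0.2) 3 [1, 4, 5]
```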
If you're looking to cluster, in the sense of find similar distances, then you can just use
dis <- dist(as.matrix(df1[, c("A","B","C")]))
clust <- hclust(dis)
and dis will tell you all the pairwise distances (find the zeros to get the identical rows), and clust will give you a tree based on similarity across A:C.
If this isn't answering the question, you probably need to clarify. You say things like same x coordinate in the text, but none of your rows have the same X value. And it's fairly unconventional to switch interchangeably between y coordinate / position / (A,B,C) .
It's hard to suggest a visualisation without knowing what feature you want to emphasize. Possibly a multi-dimensional scaling graph, where each node represents all x with the same (A,B,C) triplet, and then neighbours are other X's with closest (A', B', C') values?
I am new to this forum and not a native english speaker, so please be nice! :)
Here is the challenge I face at the moment:
I want to calculate the (approximate) relative coordinates of yet unknown points in a 3D euclidean space based on a set of given distances between 2 points.
In my first approach I want to ignore possible multiple solutions, just taking the first one by random.
e.g.:
given set of distances: (I think its creating a pyramid with a right-angled triangle as a base)
P1-P2-Distance
1-2-30
2-3-40
1-3-50
1-4-60
2-4-60
3-4-60
Step1:
Now, how do I calculate the relative coordinates for those points?
I figured that the first point goes to 0,0,0 so the second one is 30,0,0.
After that the third points can be calculated by finding the crossing of the 2 circles from points 1 and 2 with their distances to point 3 (50 and 40 respectively). How do I do that mathematically? (though I took these simple numbers for an easy representation of the situation in my mind). Besides I do not know how to get to the answer in a correct mathematical way the third point is at 30,40,0 (or 30,0,40 but i will ignore that).
But getting the fourth point is not as easy as that. I thought I have to use 3 spheres in calculate the crossing to get the point, but how do I do that?
Step2:
After I figured out how to calculate this "simple" example I want to use more unknown points... For each point there is minimum 1 given distance to another point to "link" it to the others. If the coords can not be calculated because of its degrees of freedom I want to ignore all possibilities except one I choose randomly, but with respect to the known distances.
Step3:
Now the final stage should be this: Each measured distance is a bit incorrect due to real-life conditions. So if there is more than one distance for a given pair of points, the distances are averaged. But due to the imprecise distances there can be a difficulty when determining the exact (relative) location of a point. So I want to average the different possible locations to the "optimal" one.
Can you help me going through my challenge step by step?
You need to use trigonometry - specifically, the 'cosine rule'. This will give you the angles of the triangle, which lets you solve the 3rd and 4th points.
The rule states that
c^2 = a^2 + b^2 - 2*a*b*cos(C)
where a, b and c are the lengths of the sides, and C is the angle opposite side c.
In your case, the side of length 50 is 1-3, so the angle opposite it is the angle at point 2, between the lines 2-1 and 2-3. It's going to be 90 degrees because you have a 3-4-5 triangle, but let's prove it:
50^2 = 30^2 + 40^2 - 2*30*40*cos(C)
cos(C) = 0
C = 90 degrees
This is the angle between the lines (30,0,0)-(0,0,0) and (30,0,0)-point 3; starting from point 2, go perpendicular to the x-axis for the length of side 2-3 (which is 40) and you'll get your third point (30,40,0).
Finding your 4th point is slightly trickier. The most straightforward algorithm that I can think of is to firstly find the (x,y) component of the point, and from there the z component is straightforward using Pythagoras'.
Consider that there is a point on the (x,y,0) plane which sits directly 'below' your point 4 - call this point 5. You can now create 3 right-angled triangles 1-5-4, 2-5-4, and 3-5-4.
You know the lengths of 1-4, 2-4 and 3-4. Because these are right triangles that share the side 4-5, Pythagoras gives (1-5)^2 = (1-4)^2 - (4-5)^2, and likewise for 2-5 and 3-5. In your example 1-4, 2-4 and 3-4 are all 60, so 1-5, 2-5 and 3-5 are equal as well. Find the point 5 using trigonometric methods - the 'sine rule' will give you the angles between 1-2 & 1-5, 2-1 & 2-5 etc.
The 'sine rule' states that (in any triangle)
a / sin(A) = b / sin(B) = c / sin(C)
So for triangle 1-2-5, although you don't know lengths 1-5 and 2-5, in this example you do know the ratio 1-5 : 2-5. Similarly you know the ratios 2-5 : 3-5 and 1-5 : 3-5 in the other triangles.
I'll leave you to solve point 5. Once you have this point, you can easily solve the z component of point 4 using Pythagoras - you'll have the sides 1-4 and 1-5, and the length 4-5 will be the z component.
I'll initially assume you know the distances between all pairs of points.
As you say, you can choose one point (A) as the origin, orient a second point (B) along the x-axis, and place a third point (C) along the xy-plane. You can solve for the coordinates of C as follows:
given: distances ab, ac, bc
assume
    A = (0,0)
    B = (ab,0)
    C = (x,y) <- solve for x and y, where:
        ac^2 = (A-C)^2 = (0-x)^2 + (0-y)^2 = x^2 + y^2
        bc^2 = (B-C)^2 = (ab-x)^2 + (0-y)^2 = ab^2 - 2*ab*x + x^2 + y^2
     -> bc^2 - ac^2 = ab^2 - 2*ab*x
     -> x = (ab^2 + ac^2 - bc^2)/(2*ab)
     -> y = +/- sqrt(ac^2 - x^2)
For this to work accurately, you will want to avoid cases where the points {A,B,C} are in a straight line, or close to it.
Solving for additional points in 3-space is similar -- you can expand the Pythagorean formula for the distance, cancel the quadratic elements, and solve the resulting linear system. However, this does not directly help you with your steps 2 and 3...
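A minimal Python sketch of this construction for four points, assuming all six distances are known and consistent: A goes to the origin, B onto the x-axis, C into the xy-plane, and D is found from the linear equations that remain after the quadratic terms cancel. The function name place_four_points and the choice of the positive root for y and z are assumptions of the sketch (the mirrored solutions are equally valid).

```python
import math

def place_four_points(ab, ac, bc, ad, bd, cd):
    """Place A, B, C, D in 3D from their six pairwise distances.

    Assumes A, B, C are not collinear. Small negative values under the
    square roots (from noisy input) are clamped to zero.
    """
    A = (0.0, 0.0, 0.0)
    B = (float(ab), 0.0, 0.0)

    # C in the xy-plane, from |AC| and |BC|
    cx = (ab**2 + ac**2 - bc**2) / (2 * ab)
    cy = math.sqrt(max(ac**2 - cx**2, 0.0))
    C = (cx, cy, 0.0)

    # D: subtracting |AD|^2 from |BD|^2 and from |CD|^2 gives linear equations
    dx = (ab**2 + ad**2 - bd**2) / (2 * ab)
    dy = (ad**2 - cd**2 + cx**2 + cy**2 - 2 * cx * dx) / (2 * cy)
    dz = math.sqrt(max(ad**2 - dx**2 - dy**2, 0.0))
    D = (dx, dy, dz)
    return A, B, C, D

# Distances from the question: 1-2=30, 1-3=50, 2-3=40, and 60 to point 4
A, B, C, D = place_four_points(30, 50, 40, 60, 60, 60)
print(C)  # (30.0, 40.0, 0.0)
print(D)
```

With these inputs the fourth point comes out at x = 15, y = 20, with z equal to the square root of 2975.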
Unfortunately, I don't know a well-behaved exact solution for steps 2 and 3, either. Your overall problem will generally be both over-constrained (due to conflicting noisy distances) and under-constrained (due to missing distances).
You could try an iterative solver: start with a random placement of all your points, compare the current distances with the given ones, and use that to adjust your points in such a way as to improve the match. This is an optimization technique, so I would look up books on numerical optimization.
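One way such an iterative solver could look is plain gradient descent on the sum of squared distance errors, sketched below. Everything here is an assumption for illustration (the names random_start, stress and relax, the step size, the iteration count and the starting cube); it is not a tuned method, and like any local optimizer it can settle in a poor configuration, so restarting from several random placements is a reasonable precaution.

```python
import math
import random

def random_start(n_points, seed=0, size=50.0):
    """Random initial placement of n_points inside a cube of side `size`."""
    rng = random.Random(seed)
    return [[rng.uniform(0, size) for _ in range(3)] for _ in range(n_points)]

def stress(points, distances):
    """Sum of squared differences between current and given distances.

    distances is a list of (i, j, d): points i and j should be d apart.
    """
    return sum((math.dist(points[i], points[j]) - d) ** 2
               for i, j, d in distances)

def relax(points, distances, steps=5000, rate=0.02):
    """Gradient descent on stress(); returns a new list of points."""
    pts = [list(p) for p in points]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in pts]
        for i, j, d in distances:
            r = math.dist(pts[i], pts[j])
            if r < 1e-12:
                continue                      # coincident points: no direction
            g = 2 * (r - d) / r               # derivative of (r - d)^2
            for k in range(3):
                diff = pts[i][k] - pts[j][k]
                grads[i][k] += g * diff
                grads[j][k] -= g * diff
        for p, gr in zip(pts, grads):
            for k in range(3):
                p[k] -= rate * gr[k]
    return pts

# The six distances from the question, with 0-based point indices
dists = [(0, 1, 30), (1, 2, 40), (0, 2, 50), (0, 3, 60), (1, 3, 60), (2, 3, 60)]
start = random_start(4, seed=1)
result = relax(start, dists)
print(stress(start, dists), stress(result, dists))
```

Missing distances are handled by simply leaving them out of the list, and repeated noisy measurements of the same pair can be listed more than once, which has an effect similar to averaging them.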
If you know the distances between the nodes (the fixed part of the system) and the distances to the tag (mobile), you can use trilateration to find the x,y position.
I have done this using the Nanotron radio modules which have a ranging capability.