Minimum length of lines needed - graph

Suppose we have a set of N points on the cartesian plane (x_i and y_i). Suppose we connect those points with lines.
Is there any way like using a graph and something like a shortest path algorithm or minimum spanning tree so that we can reach any point starting from any point but minimizing the total length of the lines??
I though that maybe I could set the cost of the edges with the distance of a graph and use a shortest path algorithm but I'm not sure if this is possible.
Any ideas ?

I'm not 100% sure what you want, so I go for two algorithms.
First: you just want a robust algorithm, use dijkstras algorithm. The only challange left is to define the edge cost. Which would be 1 for neighboring nodes, I assume.
Second: you want to use heuristics to estimate the next best node and optimize time consumption. Use A*, but you need to write a heuristic which under estimates the distance. You could use the euclidean distance to do so. The edge problematic stays the same.

Related

Shortest distance to cover all the (N+1) points. All the N points lie on x- axis. Remaining one point lies anywhere in the coordinate plane

Given (N+1) points. All the N points lie on x- axis. Remaining one point (HEAD point) lies anywhere in the coordinate plane.
Given a START point on x- axis.
Find the shortest distance to cover all the points starting from START point.We can traverse a point multiple times.
Example N+1=4
points on x axis
(0,1),(0,2),(0,3)
HEAD Point
(1,1) //only head point can lie anywhere //Rest all on x axis
START Point
(0,1)
I am looking for a method as of how to approach this problem.
Whether we should visit HEAD point first or HEAD point in between.
I tried to find a way using Graph Theory to simplify this problem and reduce the paths that need to be considered. If there is an elegant way to represent this problem using graphs to identify a solution, I was not able to find it. This approach becomes very inefficient as the n increases - the time and memory is O(2^n).
Looking at this as a tree graph, the root node would be the START point, then each of its child nodes would be the points it is connected to.
Since the START point and the rest of the points aside from the HEAD all lay on the x-axis, all non-HEAD points only need to be connected to adjacent points on the x-axis. This is because the distance of the path between any two points is the sum of the distances between any adjacent points along the path between those two points (the subset of nodes representing points on the x-axis does not need to form a complete graph). This reduces the brute force approach some.
Here's a simple example:
The upper left shows the original problem: points on the x-axis along with the START and HEAD points.
In the upper right, this has been transformed into a graph with each node representing a point from the original problem. The edges represent the paths that can be taken between points. This assumes that the START point only represents the first point in the path. Unlike the other nodes, it is only included in the path once. If that is not the case and the path can return to the START point, this would approximately double the possible paths, but the same approach can be followed.
In the bottom left, the START point, a, is the root of a tree graph, and each node connected to the START point is a child node. This process is repeated for each child node until either:
A path that is obviously not optimal is identified, in which case that node can just be excluded from the graph. See the nodes in red boxes; going back and forth between the same nodes is unnecessary.
All points are included when traversing the tree from the root to that node, producing a potential solution.
Note that when creating the tree graph, each time a node is repeated, its "potential" child nodes are the same as the first time the node was included. By "potential", I mean cases above still need to be checked, because the result might include a nonsensical path, in which case that node would not be included. It is also possible a potential solution results from the path after its child nodes are included.
The last step is to add up the distances for each of the potential solutions to determine which path is shortest.
This requires a careful examination of the different cases.
Assume for now START (S) is on the far left, and HEAD (H) is somewhere in the middle the path maybe something like
H
/ \
S ---- * ----*----* * --- * ----*
Or it might be shorter to from H to and from the one of the other node
H
//
S ---- * --- * -- *----------*---*
If S is not at one end you might have something like
H
/ \
* ---- * ----*----* * --- * ----*
--------S
Or even going direct from S to H on the first step
H
/ |
* ---- * ----*----* |
S
A full analysis of cases would be quite extensive.
Actually solving the problem, might depend on the number of nodes you have. If the number is small < 10, then compete enumeration might be possible. Just work out every possible path, eliminate the ones which are illegal, and choose the smallest. The number of paths is I think in the order of n!, so its computable for small n.
For large n you can break the problem into small segments. I think its enough just to consider a small patch with nodes either side of H and a small patch with nodes either side of S.
This is not really a solution, but a possible way to think about tackling the problem.
(To be pedantic stackoverflow.com is not the right site for this question in the stack exchange network. Computational Science : algorithms might be a better place.
This is a fun problem. First, lets try to find a brute force solution, as Poosh did.
Observations about the Shortest Path
No repeated points
You are in an Euclidean geometry, thus the triangle inequality holds: For all points a,b,c, the distance d(a,b) + d(b,c) <= d(a,c). Thus, whenever you have an optimal path that contains a point that occurs more than once, you can remove one of them, which means it is not an optimal path, which leads to a contradiction and proves that your optimal path contains each point exactly once.
Permutations
Our problem is thus to find the permutation, lets call it M_i, of the numbers 1...n for points P1...Pn (where P0 is the fixed start point and Pn the head point, P1...Pn-1 are ordered by increasing x value) that minimizes the sum of |(P_M_i)-(P_M_(i-1))| for i from 1 to n, || being the vector length sqrt(v_x²+v_y²).
The number of permutations of a set of size n is n!. In this case we have n+1 points, so a brute force approach testing all permutations would have complexity (n+1)!, which is higher than even 2^n and definitely not practical, so we need further observations to improve this.
Next Steps
My next step would now be to see if there are any other sequences that can be proven to be not optimal, leading to a reduction in the number of candidates to be tested.
Paths of non-head points
Lets look at all paths (sequences of indices of points that don't contain a head point and that are parts of the optimal path. If we don't change the start and end point of a path, then any other transpositions have no effect on the outside environment and we can perform purely local optimizations. We can prove that those sequences must have monotonic (increasing or decreasing) x coordinate values and thus monotonic indices (as they are ordered by ascending x coordinate between indices 0 and n-1):
We are in a purely one dimensional subspace and the total distance of the path is thus equal to the sum of the absolute values of the differences in x coordinates between one such point and the next. It is clear that this sum is minimized by ordering by x coordinate in either ascending or descending order and thus ordering the indices in the same way. Note that this is true for maximal such paths as well as for all continuous "subpaths" of them.
Wrapping it up
The only choices we have left are:
where do we place the head node in the optimal path?
which way do we order the two paths to the left and right?
This means we have n values for the index of the head node (1...n, 0 is fixed as the start node) and 2x2 values for sort order. So we have 4n choices which we can all calculate and pick the shortest one. One of the sort orders probably determines the other but I leave that to you.
Anyways, the complexity of this algorithm is O(4n) = O(n). Because reading in the input of the problems is in O(n) and writing the output is as well, I believe that is an algorithm of optimal complexity. However, if we could reformulate the problem somewhat, so that we could read and write the input and output in some compressed form, as in only the parameters that we actually need to solve the problem, then it is possible that we could do better.
P.S.: I'm not a mathematician so I probably used wrong words for some concepts and missed the usual notation for the variables and functions. I would be glad for some expert to check this for any obvious errors.

A* with manhattan distance or euclidean distance for maze solving?

I have obtained all the possible paths of maze through image processing. Now, I want to use A* algorithm in order to find shortest path of maze. However, I am confused as to whether euclidean distance will be a better heuristic or manhattan distance. Does it depend upon maze type or is the choice of heuristic independent of maze type? Which distance (manhattan or euclidean) will be a good choice for the following possible paths and why? Please suggest.
PS. (Please add your reference too, if your have any. It will be helpful)
The objective of a Heuristic is to provide contextual information to the pathfinder. The more accurate this information is, the more efficient the pathfinder can be.
You have two contradicting requirements to get a good heuristic, which is good because it means there is a sweet spot. Here they are:
An Heuristic must be admissible, which means it shall never overestimate the distance. Otherwise, the algorithm will be broken and may return paths that are not even optimal.
An heuristic must return the largest distance possible. An heuristic that underestimates the remaining path from a cell, will favour that cell when another might have been better.
Of course the optimal Heuristic would return the exact, correct length (which generally is not achievable or defeats the purpose) because it cannot return a longer path without ceasing to be admissible.
In your case, it looks like you're dealing with 4-connected grids. In that case the manhattan distance will be a better metric than euclidian distance, because the Euclidian will under-estimate the cost of all displacements compared to Manhattan (due to the Pythagorean Theorem).
Whithout any further knowledge than 'the graph is a 4-connected Grid', there is no better metric than Manhattan. If however you manage to obtain more data (obstacle density, 'highways', etc.) then you might be able to devise a better heuristic - though keeping it admissible would be a very hard problem in itself.
EDIT Having a closer look, it looks like you have angled vertices in the bottom left. If that is so, you're not in a 4-connected graph, then you MUST use Euclidian distance, because Manhattan would not be admissible.
Its not clear what moves are available to your hero. Does you graph make up a rectangular grid like chess board and can you go diagonally in one step like king in chess? If yes then Chebyshev distance is the best https://en.wikipedia.org/wiki/Chebyshev_distance.
Otherwise use Euclidian distance.
You cant use Manhattan here if you want an optimal path because Manhattan heuristic is not admissible on diagonal routes (it overestimates them) so it can lead to suboptimal pathes

Is This Graph Embedding Possible & Does It Have A Name?

I want to project an undirected graph into the 2d plane such that:
the euclidean distance preserves the stepwise distance (i.e. if the shortest path between A and B is shorter than the shortest path between C and D, then the euclidean distance between A and B is less than the euclidean distance between A and B)
the minimum difference between the euclidean distance and the stepwise distance is minimized. Ideally the set of solutions is generated or described if there is not a unique minimum.
If this is not possible, what are the most minimal sets of constraints on the graph that make it possible? I'm interested in the question in general, although at the moment I want it for a finite lattice with its minimum removed.
It's called graph embeddng. There's even a theorem that gives an upper limit to the distortion. The embedding algorithm that I like the most is SDE. It's fairly easy to implement on any language if you have a SDP library.
Here's an algorithm that's a bit simpler.
I think the first requirement is impossible, at least for the general case. Consider a fully-connected graph consisting of four nodes, with all path-lengths equal. It's not possible to choose four points in 2D Euclidean space that exhibits the same property (other than 4 coincident points).
See Diego's answer for some useful information (my knowledge of graph theory is very limited!).

Dijkstra's algorithm to find most weighted path

I just want to make sure this would work. Could you find the greatest path using Dijkstra's algorithm? Would you have to initialize the distance to something like -1 first and then change the relax subroutine to check if it's greater?
This is for a problem that will not have any negative weights.
This is actually the problem:
Suppose you are given a diagram of a telephone network, which is a graph
G whose vertices represent switches centers, and whose edges represent communication
lines between two centers. The edges are marked by their bandwidth of its lowest
bandwidth edge. Give an algorithm that, given a diagram and two switches centers a
and b, will output the maximum bandwidth of a path between a and b.
Would this work?
EDIT:
I did find this:
Hint: The basic subroutine will be very similar to the subroutine Relax in Dijkstra.
Assume that we have an edge (u, v). If min{d[u],w(u, v)} > d[v] then we should update
d[v] to min{d[u],w(u, v)} (because the path from a to u and then to v has bandwidth
min{d[u],w(u, v)}, which is more than the one we have currently).
Not exactly sure what that's suppose to mean though since all distance are infinity on initialization. So, i don't know how this would work. any clues?
I'm not sure Djikstra's is the way to go. Negative weights do bad, bad things to Djikstra's.
I'm thinking that you could sort by edge weight, and start removing the lowest weight edge (the worst bottleneck), and seeing if the graph is still connected (or at least your start and end points). The point at which the graph is broken is when you know you took out the bottleneck, and you can look at that edge's value to get the bandwidth. (If I'm not mistaken, each iteration takes O(E) time, and you will need O(E) iterations to find the bottleneck edge, so this is an O(E2) algorithm.
Edit: you have to realize that the greatest path isn't necessarily the highest bandwidth: you're looking to maximize the value of min({edges in path}), not sum({edges in path}).
You can solve this easily by modifying Dijkstra's to calculate maximum bandwidth to all other vertices.
You do not need to initialize the start vertex to -1.
Algorithm: Maximum Bandwidth(G,a)
Input: A simple undirected weighted graph G with non -ve edge weights, and a distinguished vertex a of G
Output: A label D[u], for each vertex u of G, such that D[u] is the maximum bandwidth available from a to u.
Initialize empty queue Q;
Start = a;
for each vertex u of G do,
D[u] = 0;
for all vertices z adjacent to Start do{ ---- 1
If D[Start] => D[z] && w(start, z) > D[z] {
Q.enqueue(z);
D[z] = min(D[start], D[z]);
}
}
If Q!=null {
Start = Q.dequeue;
Jump to 1
}
else
finish();
This may not be the most efficient way to calculate the bandwidth, but its what I could think of for now.
calculating flow may be more applicable, however flow allows for multiple paths to be used.
Just invert the edge weights. That is, if the edge weight is d, consider it instead as d^-1. Then do Dijkstra's as normal. Initialize all distances to infinity as normal.
You can use Dijkstra's algorithm to find a single longest path but since you only have two switch centers I don't see why you need to visit each node as in Dijkstra's. There is most likes a more optimal way of going this, such as a branch and bound algorithm.
Can you adjust some of the logic in algorithm AllPairsShortestPaths(Floyd-Warshall)?
http://www.algorithmist.com/index.php/Floyd-Warshall%27s_Algorithm
Initialize unconnected edges to negative infinity and instead of taking the min of the distances take the max?

Optimal rotation of 3D model for 2D projection

I'm looking for a way to determine the optimal X/Y/Z rotation of a set of vertices for rendering (using the X/Y coordinates, ignoring Z) on a 2D canvas.
I've had a couple of ideas, one being pure brute-force involving performing a 3-dimensional loop ranging from 0..359 (either in steps of 1 or more, depending on results/speed requirements) on the set of vertices, measuring the difference between the min/max on both X/Y axis, storing the highest results/rotation pairs and using the most effective pair.
The second idea would be to determine the two points with the greatest distance between them in Euclidean distance, calculate the angle required to rotate the 'path' between these two points to lay along the X axis (again, we're ignoring the Z axis, so the depth within the result would not matter) and then repeating several times. The problem I can see with this is first by repeating it we may be overriding our previous rotation with a new rotation, and that the original/subsequent rotation may not neccesarily result in the greatest 2D area used. The second issue being if we use a single iteration, then the same problem occurs - the two points furthest apart may not have other poitns aligned along the same 'path', and as such we will probably not get an optimal rotation for a 2D project.
Using the second idea, perhaps using the first say 3 iterations, storing the required rotation angle, and averaging across the 3 would return a more accurate result, as it is taking into account not just a single rotation but the top 3 'pairs'.
Please, rip these ideas apart, give insight of your own. I'm intreaged to see what solutions you all may have, or algorithms unknown to me you may quote.
I would compute the principal axes of inertia, and take the axis vector v with highest corresponding moment. I would then rotate the vertices to align v with the z-axis. Let me know if you want more details about how to go about this.
Intuitively, this finds the axis about which it's hardest to rotate the points, ie, around which the vertices are the most "spread out".
Without a concrete definition of what you consider optimal, it's impossible to say how well this method performs. However, it has a few desirable properties:
If the vertices are coplanar, this method is optimal in that it will always align that plane with the x-y plane.
If the vertices are arranged into a rectangular box, the box's shortest dimension gets aligned to the z-axis.
EDIT: Here's more detailed information about how to implement this approach.
First, assign a mass to each vertex. I'll discuss options for how to do this below.
Next, compute the center of mass of your set of vertices. Then translate all of your vertices by -1 times the center of mass, so that the new center of mass is now (0,0,0).
Compute the moment of inertia tensor. This is a 3x3 matrix whose entries are given by formulas you can find on Wikipedia. The formulas depend only on the vertex positions and the masses you assigned them.
Now you need to diagonalize the inertia tensor. Since it is symmetric positive-definite, it is possible to do this by finding its eigenvectors and eigenvalues. Unfortunately, numerical algorithms for finding these tend to be complicated; the most direct approach requires finding the roots of a cubic polynomial. However finding the eigenvalues and eigenvectors of a matrix is an extremely common problem and any linear algebra package worth its salt will come with code that can do this for you (for example, the open-source linear algebra package Eigen has SelfAdjointEigenSolver.) You might also be able to find lighter-weight code specialized to the 3x3 case on the Internet.
You now have three eigenvectors and their corresponding eigenvalues. These eigenvalues will be positive. Take the eigenvector corresponding to the largest eigenvalue; this vector points in the direction of your new z-axis.
Now, about the choice of mass. The simplest thing to do is to give all vertices a mass of 1. If all you have is a cloud of points, this is probably a good solution.
You could also set each star's mass to be its real-world mass, if you have access to that data. If you do this, the z-axis you compute will also be the axis about which the star system is (most likely) rotating.
This answer is intended to be valid only for convex polyhedra.
In http://203.208.166.84/masudhasan/cgta_silhouette.pdf you can find
"In this paper, we study how to select view points of convex polyhedra such that the silhouette satisfies certain properties. Specifically, we give algorithms to find all projections of a convex polyhedron such that a given set of edges, faces and/or vertices appear on the silhouette."
The paper is an in-depth analysis of the properties and algorithms of polyhedra projections. But it is not easy to follow, I should admit.
With that algorithm at hand, your problem is combinatorics: select all sets of possible vertexes, check whether or not exist a projection for each set, and if it does exists, calculate the area of the convex hull of the silhouette.
You did not provide the approx number of vertex. But as always, a combinatorial solution is not recommended for unbounded (aka big) quantities.

Resources