How to find a path from source to destination by adding edges with minimum total weight - graph

I have about 100 atoms in a 3-D space. Each atom is a node. Edges are added between two nodes when they are closer than 0.32 nm with weight equals distance. I want to find a path from source node to destination node. Since the 100 atoms are not fully connected, sometimes I can't find a path.
What I want to do is to add one or more edges to make source and destination connected. Meanwhile, I also want to minimize the total weights of the new added edges. Again, weight is calculated from the two nodes' distance.
It is kind of a reverse problem of minimum cut. Is there any algorithm helps to do this?
Thanks a lot!

It seems like one way would be to make use of a graph search algorithm for finding the shortest path, like Dijkstra's algorithm, and perhaps work from both ends (source and destination).
The only difference is that you can't know if any edge actually exists or not, and so you are creating the graph as you go. So if you start at A, and the nodes of the graph are A, B, C, D, E. Then you need to check if A-B, A-C, A-D, and A-E exists. If only A-B exists, then you check B-C, B-D, and B-E.
This will be O(|V|^2), but will really depend on how many edges get explored.
The same idea applies if you are only interested in adding edges that are longer than 0.32 nm. It's just that the path length calculation changes. Any edges that are less than .32 nm are zero-length, or simply a lot shorter (as they become less important). If that last bit doesn't work, then it gets a bit trickier.

Related

Shortest distance to cover all the (N+1) points. All the N points lie on x- axis. Remaining one point lies anywhere in the coordinate plane

Given (N+1) points. All the N points lie on x- axis. Remaining one point (HEAD point) lies anywhere in the coordinate plane.
Given a START point on x- axis.
Find the shortest distance to cover all the points starting from START point.We can traverse a point multiple times.
Example N+1=4
points on x axis
(0,1),(0,2),(0,3)
HEAD Point
(1,1) //only head point can lie anywhere //Rest all on x axis
START Point
(0,1)
I am looking for a method as of how to approach this problem.
Whether we should visit HEAD point first or HEAD point in between.
I tried to find a way using Graph Theory to simplify this problem and reduce the paths that need to be considered. If there is an elegant way to represent this problem using graphs to identify a solution, I was not able to find it. This approach becomes very inefficient as the n increases - the time and memory is O(2^n).
Looking at this as a tree graph, the root node would be the START point, then each of its child nodes would be the points it is connected to.
Since the START point and the rest of the points aside from the HEAD all lay on the x-axis, all non-HEAD points only need to be connected to adjacent points on the x-axis. This is because the distance of the path between any two points is the sum of the distances between any adjacent points along the path between those two points (the subset of nodes representing points on the x-axis does not need to form a complete graph). This reduces the brute force approach some.
Here's a simple example:
The upper left shows the original problem: points on the x-axis along with the START and HEAD points.
In the upper right, this has been transformed into a graph with each node representing a point from the original problem. The edges represent the paths that can be taken between points. This assumes that the START point only represents the first point in the path. Unlike the other nodes, it is only included in the path once. If that is not the case and the path can return to the START point, this would approximately double the possible paths, but the same approach can be followed.
In the bottom left, the START point, a, is the root of a tree graph, and each node connected to the START point is a child node. This process is repeated for each child node until either:
A path that is obviously not optimal is identified, in which case that node can just be excluded from the graph. See the nodes in red boxes; going back and forth between the same nodes is unnecessary.
All points are included when traversing the tree from the root to that node, producing a potential solution.
Note that when creating the tree graph, each time a node is repeated, its "potential" child nodes are the same as the first time the node was included. By "potential", I mean cases above still need to be checked, because the result might include a nonsensical path, in which case that node would not be included. It is also possible a potential solution results from the path after its child nodes are included.
The last step is to add up the distances for each of the potential solutions to determine which path is shortest.
This requires a careful examination of the different cases.
Assume for now START (S) is on the far left, and HEAD (H) is somewhere in the middle the path maybe something like
H
/ \
S ---- * ----*----* * --- * ----*
Or it might be shorter to from H to and from the one of the other node
H
//
S ---- * --- * -- *----------*---*
If S is not at one end you might have something like
H
/ \
* ---- * ----*----* * --- * ----*
--------S
Or even going direct from S to H on the first step
H
/ |
* ---- * ----*----* |
S
A full analysis of cases would be quite extensive.
Actually solving the problem, might depend on the number of nodes you have. If the number is small < 10, then compete enumeration might be possible. Just work out every possible path, eliminate the ones which are illegal, and choose the smallest. The number of paths is I think in the order of n!, so its computable for small n.
For large n you can break the problem into small segments. I think its enough just to consider a small patch with nodes either side of H and a small patch with nodes either side of S.
This is not really a solution, but a possible way to think about tackling the problem.
(To be pedantic stackoverflow.com is not the right site for this question in the stack exchange network. Computational Science : algorithms might be a better place.
This is a fun problem. First, lets try to find a brute force solution, as Poosh did.
Observations about the Shortest Path
No repeated points
You are in an Euclidean geometry, thus the triangle inequality holds: For all points a,b,c, the distance d(a,b) + d(b,c) <= d(a,c). Thus, whenever you have an optimal path that contains a point that occurs more than once, you can remove one of them, which means it is not an optimal path, which leads to a contradiction and proves that your optimal path contains each point exactly once.
Permutations
Our problem is thus to find the permutation, lets call it M_i, of the numbers 1...n for points P1...Pn (where P0 is the fixed start point and Pn the head point, P1...Pn-1 are ordered by increasing x value) that minimizes the sum of |(P_M_i)-(P_M_(i-1))| for i from 1 to n, || being the vector length sqrt(v_x²+v_y²).
The number of permutations of a set of size n is n!. In this case we have n+1 points, so a brute force approach testing all permutations would have complexity (n+1)!, which is higher than even 2^n and definitely not practical, so we need further observations to improve this.
Next Steps
My next step would now be to see if there are any other sequences that can be proven to be not optimal, leading to a reduction in the number of candidates to be tested.
Paths of non-head points
Lets look at all paths (sequences of indices of points that don't contain a head point and that are parts of the optimal path. If we don't change the start and end point of a path, then any other transpositions have no effect on the outside environment and we can perform purely local optimizations. We can prove that those sequences must have monotonic (increasing or decreasing) x coordinate values and thus monotonic indices (as they are ordered by ascending x coordinate between indices 0 and n-1):
We are in a purely one dimensional subspace and the total distance of the path is thus equal to the sum of the absolute values of the differences in x coordinates between one such point and the next. It is clear that this sum is minimized by ordering by x coordinate in either ascending or descending order and thus ordering the indices in the same way. Note that this is true for maximal such paths as well as for all continuous "subpaths" of them.
Wrapping it up
The only choices we have left are:
where do we place the head node in the optimal path?
which way do we order the two paths to the left and right?
This means we have n values for the index of the head node (1...n, 0 is fixed as the start node) and 2x2 values for sort order. So we have 4n choices which we can all calculate and pick the shortest one. One of the sort orders probably determines the other but I leave that to you.
Anyways, the complexity of this algorithm is O(4n) = O(n). Because reading in the input of the problems is in O(n) and writing the output is as well, I believe that is an algorithm of optimal complexity. However, if we could reformulate the problem somewhat, so that we could read and write the input and output in some compressed form, as in only the parameters that we actually need to solve the problem, then it is possible that we could do better.
P.S.: I'm not a mathematician so I probably used wrong words for some concepts and missed the usual notation for the variables and functions. I would be glad for some expert to check this for any obvious errors.

finding maximum weight subgraph

My graph is as follows:
I need to find a maximum weight subgraph.
The problem is as follows:
There are n Vectex clusters, and in every Vextex cluster, there are some vertexes. For two vertexes in different Vertex cluster, there is a weighted edge, and in the same Vextex cluster, there is no edge among vertexes. Now I
want to find a maximum weight subgraph by finding only one vertex in each
Vertex cluster. And the total weight is computed by adding all weights of the edges between the selected vertex. I add a picture to explain the problem. Now I know how to model this problem by ILP method. However, I do not know how to solve it by an approximation algorithm and how to get its approximation ratio.
Could you give some solutions and suggestions?
Thank you very much. If any unclear points in this description,
please feel free to ask.
I do not think you can find an alpha-approx for this problem, for any alpha. That is because if such an approximation exists, then it would also prove that the unique games conjecture(UGC) is false. And disproving (or proving) the UGC is a rather big feat :-)
(and I'm actually among the UGC believers, so I'd say it's impossible :p)
The reduction is quite straightforward, since any UGC instance can be described as your problem, with weights of 0 or 1 on edges.
What I can see as polynomial approximation is a 1/k-approx (k the number of clusters), using a maximum weight perfect matching (PM) algorithm (we suppose the number of clusters is even, if it's odd just add a 'useless' one with 1 vertex, 0 weights everywhere).
First, you need to build a new graph. One vertex per cluster. The weight of the edge u, v has the weight max w(e) for e edge from cluster u to cluster v. Run a max weight PM on this graph.
You then can select one vertex per cluster, the one that corresponds to the edge selected in the PM.
The total weight of the solution extracted from the PM is at least as big as the weight of the PM (since it contains the edges of the PM + other edges).
And then you can conclude that this is a 1/k approx, because if there exists a solution to the problem that is more than k times bigger than the PM weight, then the PM was not maximal.
The explanation is quite short (lapidaire I'd say), tell me if there is one part you don't catch/disagree with.
Edit: Equivalence with UGC: unique label cover explained.
Think of a UGC instance. Then, every node in the UGC instance will be represented by a cluster, with as many nodes in the cluster as there are colors in the UGC instance. Then, create edge with weight 0 if they do not correspond to an edge in the UGC, or if it correspond to a 'bad color match'. If they correspond to a good color match, then give it the weight 1.
Then, if you find the optimal solution to an instance of your problem, it means it corresponds to an optimal solution to the corresponding UGC instance.
So, if UGC holds, it means it is NP-hard to approximate your problem.
Introduce a new graph G'=(V',E') as follows and then solve (or approximate) the maximum stable set problem on G'.
Corresponding to each edge a-b in E(G), introduce a vertex v_ab in V'(G') where its weight is equal to the weight of the edge a-b.
Connect all of vertices of V'(G') to each other except for the following ones.
The vertex v_ab is not connected to the vertex v_ac, where vertices b and c are in different clusters in G. In this manner, we can select both of these vertices in an stable set of G' (Hence, we can select both of the corresponding edges in G)
The vertex v_ab is not connected to the vertex v_cd, where vertices a, b, c and d are in different clusters in G. In this manner, we can select both of these vertices in an stable set of G' (Hence, we can select both of the corresponding edges in G)
Finally, I think you can find an alpha-approximation for this problem. In other words, in my opinion the Unique Games Conjecture is wrong due to the 1.999999-approximation algorithm which I proposed for the vertex cover problem.

Pathfinding through graph with special vertices

Heyo!
So I've got this directed and/or undirected graph with a bunch of vertices and edges. In this graph there is a start vertex and an end vertex. There's also a subset of vertices which are coloured red (this subset can include the start and end vertices). Also, no pair of vertices can have more than one edge between them.
What I have to do is to find:
A) The shortest path that passes no red vertices
B) If there is a path that passes at least one red vertex
C) The path with the greatest amount of red vertices
D) The path with the fewest amount of red vertices
For A I use a breadth first search ignoring red branches. For B I simply brute force it with a depth first search of the graph. And for C and D I use dynamic programming, memoizing the number of red vertices I find in all paths, using the same DFS as in B.
I am moderately happy with all the solutions and I would very much appreciate any suggestions! Thanks!!
For A I use a breadth first search ignoring red branches
A) is a Typical pathfinding problem happening in the sub-graph that contains no red edges. So your solution is good (could be improved with heuristics if you can come up with one, then use A*)
For B I simply brute force it with a depth first search of the graph
Well here's the thing. Every optimal path A->C can be split at an arbitrary intermediate point B. A Nice property of optimal paths, is that every sub-path is optimal. So A->B and B->C are optimal.
This means if you know you must travel from some start to some end through an intermediary red vertex, you can do the following:
Perform a BFS from the start vertex
Perform a BFS from the endvertex backwards (If your edges are directed - as I think - you'll have to take them in reverse, here)
Alternate expanding both BFS so that both their 'edge' (or open lists, as they are called) have the same distance to their respective start.
Stop when:
One BFS hits a red vertex encountered by (or in the 'closed' list of) the other one. In this case, Each BFS can construct the optimal path to that commen vertex. Stitch both semi-paths, and you have your optimal path with at least a red vertex.
One BFS is stuck ('open' list is empty). In this case, there is no solution.
C) The path with the greatest amount of red vertices
This is a combinatorial problem. the first thing I would do is make a matrix of reachability of [start node + red nodes + end nodes] where:
reachability[i, j] = 1 iff there is a path from node i to node j
To compute this matrix, simply perform one BFS search starting at the start node and at every red node. If the BFS reaches a red node, put a 1 in the corresponding line/column.
This will abstract away the underlying complexity of the graph, and make an order of magnitude speedup on the combinatorial search.
The problem is now a longest path problem through that connectivity matrix. dynamic programming would be the way to go indeed.
D) The path with the fewest amount of red vertices
Simply perform a Dijkstra search, but use the following metric when sorting the nodes in the 'open' list:
dist(start, a) < dist(start, b) if:
numRedNodesInPath(start -> a) < numRedNodesInPath(start -> b)
OR (
numRedNodesInPath(start -> a) == numRedNodesInPath(start -> b)
AND
numNodesInPath(start -> a) < numNodesInPath(start -> a)
)
For this, when discovering new vertices, you'll have to store the path leading up to them (well, just the nb of nodes in the path, as well as the nb of red nodes, separately) in a dedicated map to be fetched. I mention this because usually, the length of the path is stored implicitly as the position of the verrtex in the array. You'll have to enforce it explicitely in your case.
Note on length optimality:
Even though you stated you did not care about length optimality outside of problem A), the algorithm I provided will produce shortest-length solutions. In many cases (like in D) it helps Dijkstra converge better I believe.

Take certain edges in Dijkstra (or DFS)

i have a question about Dijkstra and/or DFS.
Let's assume I have a graph, with several nodes and edges. Now i want to find a Path from Node A to Node B. On this way i have to take certain Edges, for example the Edge (C,D).
Edit:
Sorry if it was a little bit unclear. My question is : I want a path from A to B. Is there a Path so that all edges {a,b}, {b,c}... and so on are taken? I'm interested if that's possible with dfs. And if thats also possible with Dijkstra under the same constraint if i want the shortest path from A to B and some edges {a,b}, {b,c} in the graph are needed to be taken. Also the graph is directed.
Help would be really appreciated!
Create a list of the all the edges that you want to make sure must be navigated.
Perform a DFS starting from the source.
Everytime you pass though an edge that's in the list, mark is as "traversed".
When you reach the destination, check if all the edges in the list are marked as traversed.
a.) If yes, goal state reached.
b.) Else, backtrack and unmark each edge in the list while passing through them.

Pathfinding - A* with least turns

Is it possible to modify A* to return the shortest path with the least number of turns?
One complication: Nodes can no longer be distinguished solely by their location, because their parent node is relevant in determining future turns, so they have to have a direction associated with them as well.
But the main problem I'm having, is how to work number of turns into the partial path cost (g). If I multiply g by the number of turns taken (t), weird things are happening like: A longer path with N turns near the end is favored over a shorter path with N turns near the beginning.
Another less optimal solution I'm considering is: After calculating the shortest path, I could run a second A* iteration (with a different path cost formula), this time bounded within the x/y range of the shortest path, and return the path with the least turns. Any other ideas?
The current "state" of the search is actually represented by two things: The node you're in, and the direction you're facing. What you want is to separate each of those states into different nodes.
So, for each node in the initial graph, split it into E separate nodes, where E is the number of incoming edges. Each of these new nodes represents the old node, but facing in different directions. The outgoing edges of these new nodes will all be the same as the old outgoing edges, but with a different weight. If the old weight was w, then...
If the edge doesn't represent a turn, make the new weight w as well
If the edge does represent a turn, make the new weight w + ε, where ε is some number significantly smaller than the smallest weight.
Then just do a normal A* search. Since none of the weights have decreased, your heuristic will still be admissible, so you can still use the same heuristic.
If you really want to minimize the number of turns (as opposed to finding a nice tradeoff between turns and path length), why not transform your problem space by adding an edge for every pair of nodes connected by an unobstructed straight line; these are the pairs you can travel between without a turn. There are O(n) such edges per node, so the new graph is O(n3) in terms of edges. That makes A* solutions as much as O(n3) in terms of time.
Manhattan distance might be a good heuristic for A*.
Is it possible to modify A* to return the shortest path with the least number of turns?
It is most likely not possible. The reason being that it is an example of the weight-constrained shortest path problem. It is therefore NP-Complete and cannot be solved efficiently.
You can find papers that discuss solving this problem e.g. http://web.stanford.edu/~shushman/math15_report.pdf

Resources