Searching for an atypical graph pathfinding algorithm - graph

Graph structure
graph is oriented
each edge has an assigned value representing the cost of using this edge. The value can be positive or negative
Problem description
input: graph, starting node and initial value (I don't have a goal node)
both, nodes and edges can be used repeatedly
goal: change the initial value to zero by passing through the graph. The answer should be if it is possible to reach zero (exactly zero, without reaching a negative actual value in the process)
I don't need the final path as the result, just the information if it is possible is enough. I would be most interested in a name of algorithm that is designed for this problem.

It is clearly NP-Hard (subset-sum can be reduced to it by using an appropriate complete graph). Breadth-first search seems like a natural approach, though to get a decision-procedure out of it you would need to find an upper-bound on the length of the path.

Related

Shortest distance to cover all the (N+1) points. All the N points lie on x- axis. Remaining one point lies anywhere in the coordinate plane

Given (N+1) points. All the N points lie on x- axis. Remaining one point (HEAD point) lies anywhere in the coordinate plane.
Given a START point on x- axis.
Find the shortest distance to cover all the points starting from START point.We can traverse a point multiple times.
Example N+1=4
points on x axis
(0,1),(0,2),(0,3)
HEAD Point
(1,1) //only head point can lie anywhere //Rest all on x axis
START Point
(0,1)
I am looking for a method as of how to approach this problem.
Whether we should visit HEAD point first or HEAD point in between.
I tried to find a way using Graph Theory to simplify this problem and reduce the paths that need to be considered. If there is an elegant way to represent this problem using graphs to identify a solution, I was not able to find it. This approach becomes very inefficient as the n increases - the time and memory is O(2^n).
Looking at this as a tree graph, the root node would be the START point, then each of its child nodes would be the points it is connected to.
Since the START point and the rest of the points aside from the HEAD all lay on the x-axis, all non-HEAD points only need to be connected to adjacent points on the x-axis. This is because the distance of the path between any two points is the sum of the distances between any adjacent points along the path between those two points (the subset of nodes representing points on the x-axis does not need to form a complete graph). This reduces the brute force approach some.
Here's a simple example:
The upper left shows the original problem: points on the x-axis along with the START and HEAD points.
In the upper right, this has been transformed into a graph with each node representing a point from the original problem. The edges represent the paths that can be taken between points. This assumes that the START point only represents the first point in the path. Unlike the other nodes, it is only included in the path once. If that is not the case and the path can return to the START point, this would approximately double the possible paths, but the same approach can be followed.
In the bottom left, the START point, a, is the root of a tree graph, and each node connected to the START point is a child node. This process is repeated for each child node until either:
A path that is obviously not optimal is identified, in which case that node can just be excluded from the graph. See the nodes in red boxes; going back and forth between the same nodes is unnecessary.
All points are included when traversing the tree from the root to that node, producing a potential solution.
Note that when creating the tree graph, each time a node is repeated, its "potential" child nodes are the same as the first time the node was included. By "potential", I mean cases above still need to be checked, because the result might include a nonsensical path, in which case that node would not be included. It is also possible a potential solution results from the path after its child nodes are included.
The last step is to add up the distances for each of the potential solutions to determine which path is shortest.
This requires a careful examination of the different cases.
Assume for now START (S) is on the far left, and HEAD (H) is somewhere in the middle the path maybe something like
H
/ \
S ---- * ----*----* * --- * ----*
Or it might be shorter to from H to and from the one of the other node
H
//
S ---- * --- * -- *----------*---*
If S is not at one end you might have something like
H
/ \
* ---- * ----*----* * --- * ----*
--------S
Or even going direct from S to H on the first step
H
/ |
* ---- * ----*----* |
S
A full analysis of cases would be quite extensive.
Actually solving the problem, might depend on the number of nodes you have. If the number is small < 10, then compete enumeration might be possible. Just work out every possible path, eliminate the ones which are illegal, and choose the smallest. The number of paths is I think in the order of n!, so its computable for small n.
For large n you can break the problem into small segments. I think its enough just to consider a small patch with nodes either side of H and a small patch with nodes either side of S.
This is not really a solution, but a possible way to think about tackling the problem.
(To be pedantic stackoverflow.com is not the right site for this question in the stack exchange network. Computational Science : algorithms might be a better place.
This is a fun problem. First, lets try to find a brute force solution, as Poosh did.
Observations about the Shortest Path
No repeated points
You are in an Euclidean geometry, thus the triangle inequality holds: For all points a,b,c, the distance d(a,b) + d(b,c) <= d(a,c). Thus, whenever you have an optimal path that contains a point that occurs more than once, you can remove one of them, which means it is not an optimal path, which leads to a contradiction and proves that your optimal path contains each point exactly once.
Permutations
Our problem is thus to find the permutation, lets call it M_i, of the numbers 1...n for points P1...Pn (where P0 is the fixed start point and Pn the head point, P1...Pn-1 are ordered by increasing x value) that minimizes the sum of |(P_M_i)-(P_M_(i-1))| for i from 1 to n, || being the vector length sqrt(v_x²+v_y²).
The number of permutations of a set of size n is n!. In this case we have n+1 points, so a brute force approach testing all permutations would have complexity (n+1)!, which is higher than even 2^n and definitely not practical, so we need further observations to improve this.
Next Steps
My next step would now be to see if there are any other sequences that can be proven to be not optimal, leading to a reduction in the number of candidates to be tested.
Paths of non-head points
Lets look at all paths (sequences of indices of points that don't contain a head point and that are parts of the optimal path. If we don't change the start and end point of a path, then any other transpositions have no effect on the outside environment and we can perform purely local optimizations. We can prove that those sequences must have monotonic (increasing or decreasing) x coordinate values and thus monotonic indices (as they are ordered by ascending x coordinate between indices 0 and n-1):
We are in a purely one dimensional subspace and the total distance of the path is thus equal to the sum of the absolute values of the differences in x coordinates between one such point and the next. It is clear that this sum is minimized by ordering by x coordinate in either ascending or descending order and thus ordering the indices in the same way. Note that this is true for maximal such paths as well as for all continuous "subpaths" of them.
Wrapping it up
The only choices we have left are:
where do we place the head node in the optimal path?
which way do we order the two paths to the left and right?
This means we have n values for the index of the head node (1...n, 0 is fixed as the start node) and 2x2 values for sort order. So we have 4n choices which we can all calculate and pick the shortest one. One of the sort orders probably determines the other but I leave that to you.
Anyways, the complexity of this algorithm is O(4n) = O(n). Because reading in the input of the problems is in O(n) and writing the output is as well, I believe that is an algorithm of optimal complexity. However, if we could reformulate the problem somewhat, so that we could read and write the input and output in some compressed form, as in only the parameters that we actually need to solve the problem, then it is possible that we could do better.
P.S.: I'm not a mathematician so I probably used wrong words for some concepts and missed the usual notation for the variables and functions. I would be glad for some expert to check this for any obvious errors.

Pathfinding - A* with least turns

Is it possible to modify A* to return the shortest path with the least number of turns?
One complication: Nodes can no longer be distinguished solely by their location, because their parent node is relevant in determining future turns, so they have to have a direction associated with them as well.
But the main problem I'm having, is how to work number of turns into the partial path cost (g). If I multiply g by the number of turns taken (t), weird things are happening like: A longer path with N turns near the end is favored over a shorter path with N turns near the beginning.
Another less optimal solution I'm considering is: After calculating the shortest path, I could run a second A* iteration (with a different path cost formula), this time bounded within the x/y range of the shortest path, and return the path with the least turns. Any other ideas?
The current "state" of the search is actually represented by two things: The node you're in, and the direction you're facing. What you want is to separate each of those states into different nodes.
So, for each node in the initial graph, split it into E separate nodes, where E is the number of incoming edges. Each of these new nodes represents the old node, but facing in different directions. The outgoing edges of these new nodes will all be the same as the old outgoing edges, but with a different weight. If the old weight was w, then...
If the edge doesn't represent a turn, make the new weight w as well
If the edge does represent a turn, make the new weight w + ε, where ε is some number significantly smaller than the smallest weight.
Then just do a normal A* search. Since none of the weights have decreased, your heuristic will still be admissible, so you can still use the same heuristic.
If you really want to minimize the number of turns (as opposed to finding a nice tradeoff between turns and path length), why not transform your problem space by adding an edge for every pair of nodes connected by an unobstructed straight line; these are the pairs you can travel between without a turn. There are O(n) such edges per node, so the new graph is O(n3) in terms of edges. That makes A* solutions as much as O(n3) in terms of time.
Manhattan distance might be a good heuristic for A*.
Is it possible to modify A* to return the shortest path with the least number of turns?
It is most likely not possible. The reason being that it is an example of the weight-constrained shortest path problem. It is therefore NP-Complete and cannot be solved efficiently.
You can find papers that discuss solving this problem e.g. http://web.stanford.edu/~shushman/math15_report.pdf

How to find total number of minimum spanning trees in a graph?

I don't want to find all the minimum spanning trees but I want to know how many of them are there, here is the method I considered:
Find one minimum spanning tree using prim's or kruskal's algorithm and then find the weights of all the spanning trees and increment the running counter when it is equal to the weight of minimum spanning tree.
I couldn't find any method to find the weights of all the spanning trees and also the number of spanning trees might be very large, so this method might not be suitable for the problem.
As the number of minimum spanning trees is exponential, counting them up wont be a good idea.
All the weights will be positive.
We may also assume that no weight will appear more than three times in the graph.
The number of vertices will be less than or equal to 40,000.
The number of edges will be less than or equal to 100,000.
There is only one minimum spanning tree in the graph where the weights of vertices are different. I think the best way of finding the number of minimum spanning tree must be something using this property.
EDIT:
I found a solution to this problem, but I am not sure, why it works. Can anyone please explain it.
Solution: The problem of finding the length of a minimal spanning tree is fairly well-known; two simplest algorithms for finding a minimum spanning tree are Prim's algorithm and Kruskal's algorithm. Of these two, Kruskal's algorithm processes edges in increasing order of their weights. There is an important key point of Kruskal's algorithm to consider, though: when considering a list of edges sorted by weight, edges can be greedily added into the spanning tree (as long as they do not connect two vertices that are already connected in some way).
Now consider a partially-formed spanning tree using Kruskal's algorithm. We have inserted some number of edges with lengths less than N, and now have to choose several edges of length N. The algorithm states that we must insert these edges, if possible, before any edges with length greater than N. However, we can insert these edges in any order that we want. Also note that, no matter which edges we insert, it does not change the connectivity of the graph at all. (Let us consider two possible graphs, one with an edge from vertex A to vertex B and one without. The second graph must have A and B as part of the same connected component; otherwise the edge from A to B would have been inserted at one point.)
These two facts together imply that our answer will be the product of the number of ways, using Kruskal's algorithm, to insert the edges of length K (for each possible value of K). Since there are at most three edges of any length, the different cases can be brute-forced, and the connected components can be determined after each step as they would be normally.
Looking at Prim's algorithm, it says to repeatedly add the edge with the lowest weight. What happens if there is more than one edge with the lowest weight that can be added? Possibly choosing one may yield a different tree than when choosing another.
If you use prim's algorithm, and run it for every edge as a starting edge, and also exercise all ties you encounter. Then you'll have a Forest containing all minimum spanning trees Prim's algorithm is able to find. I don't know if that equals the forest containing all possible minimum spanning trees.
This does still come down to finding all minimum spanning trees, but I can see no simple way to determine whether a different choice would yield the same tree or not.
MST and their count in a graph are well-studied. See for instance: http://www14.informatik.tu-muenchen.de/konferenzen/Jass08/courses/1/pieper/Pieper_Paper.pdf.

Dijkstra's algorithm to find most weighted path

I just want to make sure this would work. Could you find the greatest path using Dijkstra's algorithm? Would you have to initialize the distance to something like -1 first and then change the relax subroutine to check if it's greater?
This is for a problem that will not have any negative weights.
This is actually the problem:
Suppose you are given a diagram of a telephone network, which is a graph
G whose vertices represent switches centers, and whose edges represent communication
lines between two centers. The edges are marked by their bandwidth of its lowest
bandwidth edge. Give an algorithm that, given a diagram and two switches centers a
and b, will output the maximum bandwidth of a path between a and b.
Would this work?
EDIT:
I did find this:
Hint: The basic subroutine will be very similar to the subroutine Relax in Dijkstra.
Assume that we have an edge (u, v). If min{d[u],w(u, v)} > d[v] then we should update
d[v] to min{d[u],w(u, v)} (because the path from a to u and then to v has bandwidth
min{d[u],w(u, v)}, which is more than the one we have currently).
Not exactly sure what that's suppose to mean though since all distance are infinity on initialization. So, i don't know how this would work. any clues?
I'm not sure Djikstra's is the way to go. Negative weights do bad, bad things to Djikstra's.
I'm thinking that you could sort by edge weight, and start removing the lowest weight edge (the worst bottleneck), and seeing if the graph is still connected (or at least your start and end points). The point at which the graph is broken is when you know you took out the bottleneck, and you can look at that edge's value to get the bandwidth. (If I'm not mistaken, each iteration takes O(E) time, and you will need O(E) iterations to find the bottleneck edge, so this is an O(E2) algorithm.
Edit: you have to realize that the greatest path isn't necessarily the highest bandwidth: you're looking to maximize the value of min({edges in path}), not sum({edges in path}).
You can solve this easily by modifying Dijkstra's to calculate maximum bandwidth to all other vertices.
You do not need to initialize the start vertex to -1.
Algorithm: Maximum Bandwidth(G,a)
Input: A simple undirected weighted graph G with non -ve edge weights, and a distinguished vertex a of G
Output: A label D[u], for each vertex u of G, such that D[u] is the maximum bandwidth available from a to u.
Initialize empty queue Q;
Start = a;
for each vertex u of G do,
D[u] = 0;
for all vertices z adjacent to Start do{ ---- 1
If D[Start] => D[z] && w(start, z) > D[z] {
Q.enqueue(z);
D[z] = min(D[start], D[z]);
}
}
If Q!=null {
Start = Q.dequeue;
Jump to 1
}
else
finish();
This may not be the most efficient way to calculate the bandwidth, but its what I could think of for now.
calculating flow may be more applicable, however flow allows for multiple paths to be used.
Just invert the edge weights. That is, if the edge weight is d, consider it instead as d^-1. Then do Dijkstra's as normal. Initialize all distances to infinity as normal.
You can use Dijkstra's algorithm to find a single longest path but since you only have two switch centers I don't see why you need to visit each node as in Dijkstra's. There is most likes a more optimal way of going this, such as a branch and bound algorithm.
Can you adjust some of the logic in algorithm AllPairsShortestPaths(Floyd-Warshall)?
http://www.algorithmist.com/index.php/Floyd-Warshall%27s_Algorithm
Initialize unconnected edges to negative infinity and instead of taking the min of the distances take the max?

Find All Cycle Bases In a Graph, With the Vertex Coordinates Given

A similar question is posted here.
I have an undirected graph with Vertex V and Edge E. I am looking for an algorithm to identify all the cycle bases in that graph. An example of such a graph is shown below:
Now, all the vertex coordinates are known ( unlike previous question, and contrary to the explanation in the above diagram), therefore it is possible to find the smallest cycles that encompass the whole graph.
In this graph, it is possible that there are edges that don't form any cycles.
What is the best algorithm to do this?
Here's another example that you can take a look at:
Assuming that e1 is the edge that gets picked first, and the arrow shows the direction of the edge.
I haven't tried this and it is rather greedy but should work:
Pick one node
Go to one it's neighbors's
Keep on going until you get back to your starting node, but you're not allowed to visit an old node.
If you get a cycle save it if it doesn't already exist or a subset of those node make up a cycle. If the node in the cycle is a subset of the nodes in another cycle remove the larger cycle (or maybe split it in two?)
Start over at 2 with a new neighbor.
Start over at 1 with a new node.
Comments: At 3 you should of course do the same thing as for step 2, so take all possible paths.
Maybe that's a start? As I said, I haven't tried it so it is not optimized.
EDIT: An undocumented and not optimized version of one implementation of the algorithm can be found here: https://gist.github.com/750015. But, it doesn't solve the solution completely since it can only recognize "true" subsets.

Resources