Longest path in a Graph with Dijkstra - graph

I was just wondering: can you invert all the weights in a graph and then run Dijkstra? As we are minimizing the reciprocal values of the weights, the obtained path would maximize the total weight, right?
So, in that way, we can obtain the longest path in a graph using Dijkstra!
It seems too easy, am I mistaken? Please, enlighten me.

It is not possible to do so, because the longest path problem doesn't have the optimal substructure property that the shortest path problem has.
Say that any path can count as the longest path (so it can contain cycles). If there is a cycle and the weights are positive, the algorithm will never end, since it can always improve the longest path by looping through the cycle once more.
Now say that we only want simple paths (without cycles) as candidates for the longest path. Consider, without loss of generality, the following graph with unit weights on all edges:
A------B
|      |
|      |
C------D
And consider the longest path from A to D (A->B->D). For the problem to have the optimal substructure property, it must be the case that the longest path from A to B is A -> B, but clearly it isn't, because the path A->C->D->B is longer. A similar argument can be made for the path from B to D. So we can see why this problem can't be solved with Dijkstra's algorithm. As a matter of fact, this problem is NP-hard: no polynomial-time algorithm for it is known.

It's easy to understand with a simple example graph.
Suppose you want to go from point A to point D. To minimize the reciprocal values of the weights, you will go through C. But A->B->D is larger.
Edit: Perhaps I should include some math at least.
Suppose the sum of a sequence of positive numbers is s.
a1 + a2 + a3 + ... + an = s.
What's the minimum value of reciprocal sum?
1/a1 + 1/a2 + 1/a3 + ... + 1/an
Playing around with this will give you some intuition.
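A minimal Python sketch of that intuition, assuming a small hypothetical directed graph (node names and weights are illustrative, not taken from the question). One heavy edge competes with four light edges: Dijkstra on the reciprocal weights prefers the single heavy edge, although the four light edges add up to a longer path.

```python
import heapq

def dijkstra(graph, source, target):
    """Return (cost, path) of a cheapest path; graph maps node -> {neighbor: weight}."""
    heap = [(0.0, source, [source])]
    settled = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, w in graph.get(node, {}).items():
            if nxt not in settled:
                heapq.heappush(heap, (cost + w, nxt, path + [nxt]))
    return float("inf"), []

def path_length(graph, path):
    """Sum of the original edge weights along a path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

# Hypothetical graph: A->D directly (weight 3) or A->B->C->E->D (four edges of weight 1).
graph = {
    "A": {"D": 3, "B": 1},
    "B": {"C": 1},
    "C": {"E": 1},
    "E": {"D": 1},
}

# Replace every weight by its reciprocal and run ordinary Dijkstra.
reciprocal = {u: {v: 1.0 / w for v, w in nbrs.items()} for u, nbrs in graph.items()}
_, chosen = dijkstra(reciprocal, "A", "D")

print(chosen, path_length(graph, chosen))           # path picked on reciprocal weights
print(path_length(graph, ["A", "B", "C", "E", "D"]))  # length of the longer alternative
```

The reciprocal sum grows with the number of edges, so the trick favours paths with few edges rather than paths with a large total weight.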


Graph theory, all paths with given distance

So I found a problem where a traveller can travel a certain distance in a graph, and all bidirectional edges have some length (distance). When travelling along an edge (in either direction) you get some money/gift (given in the question for every edge), so you have to find the maximum money you can collect for the given distance you can travel. The basic problem is: how do I find all possible paths with the given distance (there might be loops in the graph)? After finding all possible paths, the path with the maximum money collected will simply be the answer. Note: the paths you come up with should not contain a loop (straight paths only).
You are given an undirected connected graph with two weights on each edge (distance and reward).
You are given a fixed number d corresponding to a possible distance.
For each pair of nodes (u,v), u not equal to v, you are looking for:
All the paths {P_j} connecting u and v with no repeated nodes whose total distance is d.
The paths {P_hat(j)}, a subset of {P_j}, whose reward is maximal.
To get the first, I would try a modified version of the Floyd-Warshall algorithm, where you do not look for the shortest path, but for any path.
Floyd-Warshall uses a strategy based on considering a "middle node" w between u and v, and recursively finds the path minimising the distance between u and v.
You can do the same while keeping all paths instead of only the minimal one. Take care to set to inf, in the distance matrix, the nodes you have already visited, and discard at runtime every partial path in the recursion whose distance is longer than d, as well as every complete path (one that connects u and v) whose distance is shorter than d.
This can be generalised to an interval of possible distances [d, D] instead of a single value d, since with a single value you would probably get the empty set most of the time.
For the second step, you simply compare the rewards of the paths found in the first step and take the best one.
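As a rough illustration of the two steps, here is a Python sketch that swaps in plain depth-first backtracking for the modified Floyd-Warshall idea, since it is easier to show in a few lines. It assumes positive integer distances; the function name `best_paths`, the edge list, and all numbers are hypothetical. It runs in exponential time in the worst case.

```python
def best_paths(edges, d):
    """edges: list of (u, v, distance, reward) for an undirected graph.
    Returns (best_reward, paths) over all simple paths of total distance exactly d."""
    graph = {}
    for u, v, length, gain in edges:
        graph.setdefault(u, []).append((v, length, gain))
        graph.setdefault(v, []).append((u, length, gain))

    best_reward, best = None, []

    def extend(path, visited, dist, reward):
        nonlocal best_reward, best
        if dist == d and len(path) > 1:
            # Step two: keep only the paths with maximal reward.
            if best_reward is None or reward > best_reward:
                best_reward, best = reward, [list(path)]
            elif reward == best_reward:
                best.append(list(path))
            return  # distances are positive, so any extension would overshoot d
        for nxt, length, gain in graph[path[-1]]:
            # Prune repeated nodes and partial paths longer than d.
            if nxt not in visited and dist + length <= d:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited, dist + length, reward + gain)
                path.pop()
                visited.remove(nxt)

    for start in graph:
        extend([start], {start}, 0, 0)
    return best_reward, best

# Hypothetical example: (u, v, distance, reward)
edges = [("A", "B", 2, 5), ("B", "C", 3, 1), ("A", "C", 5, 4), ("C", "D", 2, 10)]
reward, paths = best_paths(edges, 5)
print(reward, paths)
```

Each undirected path is reported once per direction, because the search starts from every node.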
This is more a suggested direction than a complete answer, but I hope it helps!

Possible Heuristic Function for Word Ladder

Hey, I'm thinking of using A* to find an optimal solution for the Word Ladder problem, but I'm having a bit of difficulty thinking of an appropriate g(x) and h(x). For this particular problem, could g(x) be the number of hops from the start vertex and h(x) be the number of characters that differ from the goal word? Any advice would be a big help.
I never really was a fan of the A* notation f(x) = g(x) + h(x) because it oversimplifies the algorithm. A* is based on two components, often labelled g(x) and h(x).
You already have most of it figured out: for the Dijkstra part, g(x), you want to return the number of hops taken. For the greedy part, h(x), you want to count how many characters are wrong; you are at the goal when h(x) = 0.
By combining these two values you get the A* priority, which essentially says to expand the most promising nodes along the shortest path. In other projects, you may wish to add further terms to A* to get behaviours like avoiding enemies (this is why I prefer not to think of A* as just g+h).
EDIT: Don't forget to check each candidate using a dictionary file; word ladder requires that intermediate words be real words.
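A minimal Python sketch of this setup, assuming lowercase words of equal length and a tiny hypothetical dictionary (the names `hamming` and `word_ladder` are illustrative). Each hop changes exactly one character, so the count of differing characters never overestimates the remaining hops, which is what lets A* return a shortest ladder here.

```python
import heapq
from string import ascii_lowercase

def hamming(a, b):
    """h(x): number of positions where the two words differ."""
    return sum(x != y for x, y in zip(a, b))

def word_ladder(start, goal, dictionary):
    """Return a shortest ladder from start to goal as a list of words, or None."""
    words = set(dictionary)
    if len(start) != len(goal) or goal not in words:
        return None
    # Heap entries: (f = g + h, g = hops so far, current word, path taken)
    frontier = [(hamming(start, goal), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, word, path = heapq.heappop(frontier)
        if word == goal:
            return path
        if g > best_g.get(word, float("inf")):
            continue  # stale entry, a cheaper route to this word was found
        for i in range(len(word)):
            for c in ascii_lowercase:
                cand = word[:i] + c + word[i + 1:]
                # Intermediate words must be real words from the dictionary.
                if cand != word and cand in words and g + 1 < best_g.get(cand, float("inf")):
                    best_g[cand] = g + 1
                    f = g + 1 + hamming(cand, goal)
                    heapq.heappush(frontier, (f, g + 1, cand, path + [cand]))
    return None

dictionary = {"hot", "dot", "dog", "lot", "log", "cog"}
ladder = word_ladder("hit", "cog", dictionary)
print(ladder)
```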

How to show that a prob is in NP and that it is NP-complete

Longest Path
We have a graph G=(V,E), lengths l(e) in Z^(+) for each e in E, a positive integer K and two nodes s,t in V.
The question is if there is a simple path in G from s to t of length at least K ?
Show that the problem Longest Path belongs to NP.
Show that the problem Longest Path is NP-complete, reducing Hamiltonian Path to it.
Show that if the graph is directed and acyclic then the problem can be solved in time O(|V|+|E|).
Could you give me a hint how we could show that the problem belongs to NP?
Also, how can we reduce a problem to an other, in order to show that the latter is NP-complete?
EDIT:
So in order to show that the problem belongs to NP, do we have to draw a simple path and compute the sum of the lengths of its edges?
Do we say for example the following?
We see that the length of the path from the node s to the node t is equal to l((s,w))+l((w,t))=3+12=15, so there is no simple path in G from s to t of length at least K.
Or does the following suffice?
"Given a simple path, we can easily check if its length is at least K (by simply computing the sum of the lengths of all edges in it). Thus, it is in NP."
EDIT 2: Also could you explain me further why we reduce the Hamiltonian path problem to this one in polynomial time by setting all edges' lengths equal to one and set K = |V| - 1 ?
EDIT 3: Suppose that we have a problem A and a problem B and it is known that B is NP-complete. If we want to show that A is also NP-complete, do we change the data of A in that way so that we have the same problem as the problem B and so we deduce that A is also NP-complete? Or have I understood it wrong?
EDIT 4: Also how can we show that if the graph is directed and acyclic then the problem can be solved in time O(|V|+|E|)?
EDIT 5: All edges' lengths of a Hamiltonian path are equal to 1, right? And if we have V vertices, the length of the longest path is at most V-1, yes? But in our problem, the lengths of the edges aren't fixed and K is also not a fixed number. So if we set all edges' lengths equal to one and set K = |V| - 1, don't we reduce our problem to the Hamiltonian path problem? Or have I understood it wrong?
To show that a problem is in NP, we need to show that a proposed solution can be verified in polynomial time. Given a certificate (a simple path from s to t in this case), we can easily check that it is a valid simple path in G and that its length is at least K (by simply computing the sum of the lengths of all edges in it). Thus, it is in NP.
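The verification step can be sketched in Python as follows, assuming an undirected graph stored as a dictionary from `frozenset({u, v})` to edge length (this representation and the name `verify_longest_path` are illustrative choices). The check runs in time linear in the length of the certificate.

```python
def verify_longest_path(lengths, s, t, K, path):
    """Check a certificate: is `path` a simple s-t path in G of total length >= K?"""
    if not path or path[0] != s or path[-1] != t:
        return False
    if len(set(path)) != len(path):
        return False  # a vertex repeats, so the path is not simple
    total = 0
    for u, v in zip(path, path[1:]):
        edge = frozenset((u, v))
        if edge not in lengths:
            return False  # the certificate uses an edge that is not in G
        total += lengths[edge]
    return total >= K

# The small example from the question: l((s,w)) = 3 and l((w,t)) = 12.
lengths = {frozenset(("s", "w")): 3, frozenset(("w", "t")): 12}
print(verify_longest_path(lengths, "s", "t", 15, ["s", "w", "t"]))
print(verify_longest_path(lengths, "s", "t", 16, ["s", "w", "t"]))
```

Note that a rejected certificate only rules out that particular path; it does not prove that no path of length at least K exists.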
Reduction from A to B means: given an instance of A, create an instance of B (to be more precise, we are interested in polynomial-time reductions here) and solve it in order to solve the original problem. So how can we reduce the Hamiltonian path problem to this one in polynomial time? It is pretty straightforward: we can set all edges' lengths equal to one and set K = |V| - 1. Then we should try all pairs of vertices (s, t) in the graph, s != t, and if the solver for this problem returns true for at least one pair, return true. Otherwise, we should return false (checking that we have a simple path of length |V| - 1 in a graph where all edges have unit length is exactly the same thing as checking that a Hamiltonian path exists, by its definition).
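Here is a small Python sketch of that reduction. The function `longest_path_oracle` is only a brute-force stand-in for a hypothetical Longest Path solver (it takes exponential time); the point is the wrapper `has_hamiltonian_path`, which builds the unit-length instance with K = |V| - 1 and tries every pair (s, t). All names are illustrative.

```python
from itertools import permutations

def longest_path_oracle(vertices, lengths, s, t, K):
    """Brute-force stand-in: is there a simple s-t path of length >= K?"""
    others = [v for v in vertices if v not in (s, t)]
    for r in range(len(others) + 1):
        for middle in permutations(others, r):
            path = (s, *middle, t)
            steps = [frozenset(e) for e in zip(path, path[1:])]
            if all(e in lengths for e in steps):
                if sum(lengths[e] for e in steps) >= K:
                    return True
    return False

def has_hamiltonian_path(vertices, edges):
    """Decide Hamiltonian Path using only calls to the Longest Path oracle."""
    if len(vertices) <= 1:
        return True  # a single vertex is trivially a Hamiltonian path
    lengths = {frozenset(e): 1 for e in edges}  # every edge gets length 1
    K = len(vertices) - 1
    return any(
        longest_path_oracle(vertices, lengths, s, t, K)
        for s in vertices for t in vertices if s != t
    )

line = has_hamiltonian_path([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
star = has_hamiltonian_path([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
print(line, star)
```

Building the instance takes O(|E|) time and the wrapper makes O(|V|^2) oracle calls, so everything outside the oracle is polynomial.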

Cormen's "Introduction to algorithms" 3rd Edition - Edmonds-karps-Algorithm - Lemma 26.7

Since I think many of us don't have the same edition of "Introduction to Algorithms" by Prof. Cormen et al., I'm going to write out the Lemma (and my question) below.
Edmonds-Karp-Algorithm
Lemma 26.7 (in 3rd edition; in 2nd it may be Lemma 26.8):
If the Edmonds-Karp algorithm is run on a flow network G=(V,E) with source s and sink t, then for all vertices v in V \ {s,t}, the shortest-path distance δ_f(s,v) in the residual network G_f increases monotonically with each flow augmentation.
Proof:
First, suppose that for some vertex v in V \ {s,t}, there is a flow augmentation that causes the shortest-path distance from s to v to decrease; then we will derive a contradiction.
Let f be the flow just before the first augmentation that decreases some shortest-path distance, and let f' be the flow just afterward. Let v be the vertex with the minimum δ_f'(s,v) whose distance was decreased by the augmentation, so that δ_f'(s,v) < δ_f(s,v). Let p = s ~~> u -> v be a shortest path from s to v in G_f', so that (u,v) in E_f' and
δ_f'(s,u) = δ_f'(s,v) - 1. (26.12)
Because of how we chose v, we know that the distance of vertex u from source s did not decrease, i.e.
δ_f'(s,u) >= δ_f(s,u). (26.13)
...
My question is: I don't really understand the phrase
"Because of how we chose v, we know that the distance of vertex u from source s did not decrease, i.e.
δ_f'(s,u) >= δ_f(s,u). (26.13)"
How does the way we chose v affect the property that "the distance of vertex u from s did not decrease"? How can I derive inequality (26.13)?
We know, u is a vertex on the path (s,v) and (u,v) is also a part in (s,v). Why can (s,u) not decrease as well?
Thank you all for your help.
My answer may be drawn out, but hopefully it helps for an all around understanding.
For some history, note that the Ford-Fulkerson algorithm came first. Ford-Fulkerson simply selects any path from the source to the sink, pushes as much flow along it as its bottleneck capacity allows, then updates the residual graph accordingly. Since the selected path could hypothetically be anything, there are scenarios where this approach takes 'forever' (figuratively, and literally if the edge capacities are allowed to be irrational) to terminate.
Edmonds-Karp does the same thing as the Ford-Fulkerson, only it chooses the 'shortest' path, which can be found via a breadth-first search (BFS).
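To make this concrete, here is a compact Python sketch of Edmonds-Karp that also records the BFS distances from the source before each augmentation, so the monotonicity claimed by the lemma can be observed on an example. The network, the vertex names, and the helper names are hypothetical; this is an illustration, not the textbook's pseudocode.

```python
from collections import deque, defaultdict

def bfs_levels(residual, s):
    """BFS over edges with positive residual capacity; returns (dist, parent)."""
    dist, parent = {s: 0}, {}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, cap in residual[u].items():
            if cap > 0 and v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent

def edmonds_karp(capacity, s, t):
    """Return (max_flow, history); history holds the distance map of each BFS."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0  # make sure the reverse edge exists
    flow, history = 0, []
    while True:
        dist, parent = bfs_levels(residual, s)
        history.append(dist)
        if t not in dist:
            return flow, history
        # Find the bottleneck along the shortest augmenting path.
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Augment: reduce forward capacity, add reverse capacity.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
flow, history = edmonds_karp(capacity, "s", "t")
print(flow)
for dist in history:
    print(dist)  # vertices missing from a map are unreachable (distance infinity)
```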
BFS guarantees a certain (partial) ordering among the traversed vertices. For example, consider the following graph:
A -> B -> C,
BFS guarantees that B will be traversed before C. (You should be able to generalize this argument with more sophisticated graphs, an exercise I leave to you.) For the remainder of this post, let "n" denote the number of levels it takes in BFS to reach the target node. So if we were searching for node C in the example above, n = 2.
Edmonds-Karp behaves similarly to Ford-Fulkerson, only it guarantees that the shortest paths are chosen first. When Edmonds-Karp updates the residual graph, we know that only nodes at a level equal to or smaller than n have actually been traversed. Similarly, only edges between nodes for the first n levels could have possibly been updated in the residual graph.
I'm pretty sure that the 'how we chose v' reflects the ordering that BFS guarantees, since the added residual edges necessarily flow in the opposite direction of any selected path. If the residual edges were to create a shorter path, then it would have been possible to find a shorter path than n in the first place, because the residual edges are only created when a path to the target node has been found and BFS guarantees that the shortest such path has already been found.
Hope this helps and at least gives some insight.
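I don't fully understand it either, but I think "how we chose v" refers to the fact that v was picked as the vertex with the smallest new distance from s among all vertices whose distance decreased. Since u comes just before v on a shortest path in the new residual network, u is strictly closer to s than v is, so u cannot be one of the vertices whose distance decreased; otherwise u would have been chosen instead of v. Hence u's distance from s did not become shorter.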

A Shortest Path Algorithm With Minimum Number Of Nodes Traversed

I am looking for a Dijkstra's algorithm implementation, that also takes into consideration the number of nodes traversed.
What I mean is, a typical Dijkstra's algorithm, takes into consideration the weight of the edges connecting the nodes, while calculating the shortest route from node A to node B. I want to insert another parameter into this. I want the algorithm to give some weightage to the number of nodes traversed, as well.
So that the shortest route computed from A to B, under certain values, may not necessarily be the Shortest Route, but the route with the least number of nodes traversed.
Any thoughts on this?
Cheers,
RD
Edit :
My apologies. I should have explained better. So, let's say the shortest route from
(A, B) is A -> C -> D -> E -> F -> B covering a total of 10 units
But I want the algorithm to come up with the route A -> M -> N -> B covering a total of 12 units.
So, what I want, is to be able to give some weightage to the number of nodes as well, not just the distance of the connected nodes.
Let me demonstrate that adding a constant value to all edges can change which route is "shortest" (least total weight of edges).
Here's the original graph (a triangle):
A-------B
 \  5  /
2 \   / 2
   \ /
    C
Shortest path from A to B is via C. Now add constant 2 to all edges. The shortest path becomes instead the single step from A directly to B (owing to "penalty" we've introduced for using additional edges).
Note that the number of edges used is (excluding the node you start from) the same as the number of nodes visited.
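The triangle example can be checked with a short Python sketch. The `penalty` parameter is an illustrative name for the constant added to every edge; the rest is ordinary Dijkstra on an undirected graph.

```python
import heapq

def shortest_path(edges, source, target, penalty=0):
    """Dijkstra where every edge costs its weight plus a constant penalty."""
    graph = {}
    for (u, v), w in edges.items():
        graph.setdefault(u, []).append((v, w + penalty))
        graph.setdefault(v, []).append((u, w + penalty))
    heap = [(0, source, [source])]
    done = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for nxt, w in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(heap, (cost + w, nxt, path + [nxt]))
    return None

triangle = {("A", "B"): 5, ("A", "C"): 2, ("C", "B"): 2}
print(shortest_path(triangle, "A", "B"))             # (4, ['A', 'C', 'B'])
print(shortest_path(triangle, "A", "B", penalty=2))  # (7, ['A', 'B'])
```

With the penalty, the route via C costs (2+2) + (2+2) = 8, which is more than the direct edge at 5+2 = 7.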
The way you can do that is adapt the weights of the edges to always be 1, so that you traverse 5 nodes, and you've gone a distance of "5". The algorithm would be the same at that point, optimizing for number of nodes traversed rather than distance traveled.
If you want some sort of hybrid, you need to determine how much importance to give to traversing a node and the distance. The weight used in calculations should look something like:
weight = node_importance * 1 + (1 - node_importance) * distance
Where node_importance is a fraction between 0 and 1 that gauges how much minimum node traversal matters relative to distance. You may have to normalize the distances to have an average of 1, though.
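A quick Python check of this formula on the two routes from the question (5 edges covering 10 units versus 3 edges covering 12 units). Summing the per-edge weight over a route gives node_importance * hops + (1 - node_importance) * total distance, which is what the hypothetical helper `blended_cost` computes; the distances are left unnormalized here.

```python
def blended_cost(total_distance, hops, node_importance):
    """Route cost when each edge weighs node_importance * 1 + (1 - node_importance) * distance."""
    return node_importance * hops + (1 - node_importance) * total_distance

routes = {
    "A->C->D->E->F->B": (10, 5),  # (total distance, number of edges)
    "A->M->N->B": (12, 3),
}

for importance in (0.25, 0.75):
    costs = {name: blended_cost(d, h, importance) for name, (d, h) in routes.items()}
    print(importance, costs, "->", min(costs, key=costs.get))
```

For these two routes the preference flips at node_importance = 0.5: below it the 10-unit route wins, above it the 3-edge route wins.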
I'm going to go out on a limb here, but have you tried the A* algorithm? I may have understood your question wrong, but it seems like A* would be a good starting point for what you want to do.
Check out: http://en.wikipedia.org/wiki/A*_search_algorithm
Some pseudo code there to help you out too :)
If I understood the question correctly, the best analogy is choosing the best network path.
In network communication, the shortest path is not always the one selected: a path with many hops (nodes) may suffer distortion, interference, and noise at each node connection.
So the best-path calculation minimizes a function of several variables, in your case distance and hop count (nodes).
You have to derive a function that relates distance and node count to quality.
For example, suppose
1 hop = 5 units of distance (meaning one extra node has the same impact as 5 extra units of distance).
Then, to minimize the loss, you can use a linear equation:
minimize(distance + hopcount);
where hopcount is expressed in distance units.
