How to find the shortest path cost with under special conditions? - graph

Given a weighted directed graph G = (V, E) , a source vertex s , a destination vertex t and a subset v’, how to finds the shortest non-loop path from s to vertex t in the graph? The vertexs in the subset must be included in the path.

This is a variant of Traveling Salesman Problem (TSP).
You can transform your problem to an exact instance of TSP, by running all-to-all shortest path in the graph (Floyd Warshall Algorithm), and then create a new graph:
G'={V' U {s,t}, E'}
V' - the "must go through" subset
E' = { (s,v), (v,t), (u,v) | u,v in V'} (In words: all edges between two nodes in the new graphs)
Now, finding the miminal path that goes through all nodes (TSP) in G', is also the minimal path that meets the criteria (after expending each path between two pairs back).
Unfortunately, TSP is NP-Complete problem (there is no known polynomial time algorithm to solve it, and most believe one doesn't even exist), and unless your set of "must go through nodes" V' is relatively small (and you can afford exponential time running time of the algorithm), you will need to settle on a "good enough" algorithm, which might not be optimal.
Sidenote regarding "no loops" - note it might be infeasible, for example:
---------
| |
v |
s -> a -> b -> c
|
|
t
In this example, the only path to meet the criteria is s->a->b->c->t

Related

Longest path in a Graph with Dijkstra

I was just wondering: can you inverse all the weights in a graph and then do a Dijkstra? As we are minimizing the reciprocal values of the weights, the obtained path would maximize it all in all, right?
So, in that way, we can obtain the longest path in a graph using Dijkstra!
It seems too easy, am I mistaken? Please, enlighten me.
It is not possible to do so because the longest path problem doesn't have the optimal substructure problem as the shortest path one.
Say that you can consider any path as longest path (so it can have cycles) but if there is a cycle and the weights are positive the algorithm will never end since it can always improve the longest path by looping through the cycle.
Now say that we want to have only simple paths (without cycle) as candidates for the longest path. Consider, without loss of generality, the following graph with unitary weights for all edges:
A------B
| |
| |
C------D
And consider the longest path from A to D (A->B->D). For the problem to have optimal substructure property it must be the case that longest path from A to Bis A -> B but clearly it isn't because path A->C->D->B is longer. Similar argument can be done for the path from B to D. So we can see why this problem can't be solved with Dijkstra algorithm. As a matter of fact this problem is NP, there isn't a reasonable time complexity solution.
It's easy to understand with a simple example graph.
Suppose you want to go from point A to point D. To minimize the reciprocal values of the weights, you will go through C. But A->B->D is larger.
Edit: Perhaps I should include some math at least.
Suppose the sum of a sequence of positive numbers is s.
a1 + a2 + a3 + ... + an = s.
What's the minimum value of reciprocal sum?
1/a1 + 1/a2 + 1/a3 + ... + 1/an
Playing around this will give you some intuition.

Cormen's "Introduction to algorithms" 3rd Edition - Edmonds-karps-Algorithm - Lemma 26.7

since I think many of us don't have the same edition of "Introduction to algorithms" of Prof. Cormen et al., I'm gonna write the Lemma (and my question) in the following.
Edmonds-Karp-Algorithm
Lemma 26.7 (in 3rd edition; in 2nd it may be Lemma 26.8):
If the Edmonds-Karp algorithm is run on a flow network G=(V,E) with source s and sink t, then for all vertices v in V{s,t}, the shortest-path distance df(s,v) in the residual network Gf increases monotonically with each flow augmentation
Proof:
First, suppose that for some vertex v in V{s,t}, there is a flow augmentation that causes the shortest-path distance from s to v to decrease, then we will derive a contradiction.
Let f be the flow just before the first augmentation that decreases some shortest-path distance, and let f' be the flow just afterward. Let v be the vertex with the minimum df'(s,v), whose distance was decreased by the augmentation, so that df'(s,v) < df(s,v). Let p = s ~~> u -> u be a shortest path from s to v in Gf', so that (u,v) in Ef' and
df'(s,u) = df'(s,v) - 1. (26.12)
Because of how we chose v, we know that the distance of vertex u from soruce s did not decrease, i.e.
df'(s,u) >= df(s,u). (26.13)
...
My question is: I don't really understand the phrase
"Because of how we chose v, we know that the distance of vertex u from soruce s did not decrease, i.e.
df'(s,u) >= df(s,u). (26.13)"
How does the way we chose v affect the property that "the distance of vertex u from s did not decrease" ? How can I derive the equation (26.13).
We know, u is a vertex on the path (s,v) and (u,v) is also a part in (s,v). Why can (s,u) not decrease as well?
Thank you all for your help.
My answer may be drawn out, but hopefully it helps for an all around understanding.
For some history, note that the Ford-Fulkerson algorithm came first. Ford-Fulkerson simply selects any path from the source to the sink, adds the amount of flow to the current capacity, then augments the Residual graph accordingly. Since the path that is selected could hypothetically be anything, there are scenarios where this approach takes 'forever' (figuratively and literally speaking, if the edge weights are allowed to be irrational) to actually terminate.
Edmonds-Karp does the same thing as the Ford-Fulkerson, only it chooses the 'shortest' path, which can be found via a breadth-first search (BFS).
BFS guarantees a certain (partial) ordering among the traversed vertices. For example, consider the following graph:
A -> B -> C,
BFS guarantees that B will be traversed before C. (You should be able to generalize this argument with more sophisticated graphs, an exercise I leave to you.) For the remainder of this post, let "n" denote the number of levels it takes in BFS to reach the target node. So if we were searching for node C in the example above, n = 2.
Edmonds-Karp behaves similarly to Ford-Fulkerson, only it guarantees that the shortest paths are chosen first. When Edmonds-Karp updates the residual graph, we know that only nodes at a level equal to or smaller than n have actually been traversed. Similarly, only edges between nodes for the first n levels could have possibly been updated in the residual graph.
I'm pretty sure that the 'how we chose v' reflects the ordering that BFS guarantees, since the added residual edges necessarily flow in the opposite direction of any selected path. If the residual edges were to create a shorter path, then it would have been possible to find a shorter path than n in the first place, because the residual edges are only created when a path to the target node has been found and BFS guarantees that the shortest such path has already been found.
Hope this helps and at least gives some insight.
I don't quite understand either. But I think that "how we choose v" here means that the flow augmentation only causes the path from s to v becomes shorter, in another way, v is the first node whose path from s becomes shorter because of the augmentation, thus the node u's distance from s does not become shorter.

finding the total number of distinct shortest paths between 2 nodes in undirected weighted graph in linear time?

I was wondering, that if there is a weighted graph G(V,E), and I need to find a single shortest path between any two vertices S and T in it then I could have used the Dijkstras algorithm. but I am not sure how this can be done when we need to find all the distinct shortest paths from S to T. Is it solvable on O(n) time? I had one more question like if we assume that the weights of the edges in the graph can assume values only in certain range lets say 1 <=w(e)<=2 will this effect the time complexity?
You can do it using Dijkstra. In Dijkstra's algorithm, you assign labels to each node based on the distance it has from your source and settle nodes according to this distance (smallest first). Once the target node is settled, you have the length of the shortest path. To retrieve the path (=the sequence of edges), you can keep track of the parent of each node you settle. To retrieve all possible paths, you have to account for multiple parents of each node.
A small example:
Suppose you have a graph which looks like this (all edges weight 1 for simplicity):
B
/ \
A C - D
\ /
E
When you do dijkstra to find the distance A->D, you would get 3. You would first settle node A with distance 0, then nodes B and E with distance 1, node C with distance 2 and finally node D with distance 3. To get the paths, you would remember the parents of the nodes when you settle them. In this case you would first set the parents of B and C =node A. You then need to set the parents of node C to B and E as they both supply the same length. Node D would get node C as a parent.
When you need the paths, you could just follow the parent pointers, starting at D and branching each time a node has multiple parents. This would give you a branch at node C.
The post mentioned in the comment above uses the same principles, although it does not allow you to actually extract the paths, it only gives you the count.
One final remark: Dijkstra is not a linear time algorithm as you need to do a lot of operations to maintain you queue and node sets.
(a) Using BFS from s to mark nodes visited as usual, in the meantime, changing following things:
(b) for each node v, it has a record L(v) representing the layer that v is on and a record f(v)
representing the number of incoming paths:
(c) everytime a node v1 is found in another node's v2 neighbors:
(d) if v1 is not in the queue, add it into queue, L(v1) = L(v2) + 1,f(v1)+ = f(v2)
(d) if v1 is already in the queue and L(v1) equals L(v2), do nothing.
(e) if v1 is already in the queue but L(v1) doe not equal to L(v2), f(v1)+ = f(v2)
(f) until in this modied BFS, t pop from the queue for the rst time, we have f(t)
(g) and this f(t) represents the number of dierent shortest paths from s to t
This might solve your problem.

A Shortest Path Algorithm With Minimum Number Of Nodes Traversed

I am looking for a Dijkstra's algorithm implementation, that also takes into consideration the number of nodes traversed.
What I mean is, a typical Dijkstra's algorithm, takes into consideration the weight of the edges connecting the nodes, while calculating the shortest route from node A to node B. I want to insert another parameter into this. I want the algorithm to give some weightage to the number of nodes traversed, as well.
So that the shortest route computed from A to B, under certain values, may not necessarily be the Shortest Route, but the route with the least number of nodes traversed.
Any thoughts on this?
Cheers,
RD
Edit :
My apologies. I should have explained better. So, lets say, the shortest route from
(A, B) is A -> C -> D -> E -> F -> B covering a total of 10 units
But I want the algorithm to come up with the route A -> M -> N -> B covering a total of 12 units.
So, what I want, is to be able to give some weightage to the number of nodes as well, not just the distance of the connected nodes.
Let me demonstrate that adding a constant value to all edges can change which route is "shortest" (least total weight of edges).
Here's the original graph (a triangle):
A-------B
\ 5 /
2 \ / 2
\ /
C
Shortest path from A to B is via C. Now add constant 2 to all edges. The shortest path becomes instead the single step from A directly to B (owing to "penalty" we've introduced for using additional edges).
Note that the number of edges used is (excluding the node you start from) the same as the number of nodes visited.
The way you can do that is adapt the weights of the edges to always be 1, so that you traverse 5 nodes, and you've gone a distance of "5". The algorithm would be the same at that point, optimizing for number of nodes traversed rather than distance traveled.
If you want some sort of hybrid, you need to determine how much importance to give to traversing a node and the distance. The weight used in calculations should look something like:
weight = node_importance * 1 + (1 - node_importance) * distance
Where node_importance would be a percentage which gauges how much distance is a factor and how much minimum node traversal is important. Though you may have to normalize the distances to be an average of 1.
I'm going to go out on a limb here, but have you tried the A* algorithm? I may have understood your question wrong, but it seems like A* would be a good starting point for what you want to do.
Check out: http://en.wikipedia.org/wiki/A*_search_algorithm
Some pseudo code there to help you out too :)
If i understood the question correctly then its best analogy would be that used to find the best network path.
In network communication a path may not only be selected because it is shortest but has many hop counts(node), thus may lead to distortion, interference and noise due to node connection.
So the best path calculation contains the minimizing the function of variables as in your case Distance and Hop Count(nodes).
You have to derive a functional equation that could relate the distance and node counts with quality.
so something as suppose
1 hop count change = 5 unit distance (which means the impact is same for 5unit distace or 1 node change)
so to minimize the loss you can use it in the linear equation.
minimize(distance + hopcount);
where hopcount can be expressed as distance.

Find all possible paths from one vertex in a directed cyclic graph in Erlang

I would like to implement a function which finds all possible paths to all possible vertices from a source vertex V in a directed cyclic graph G.
The performance doesn't matter now, I just would like to understand the algorithm. I have read the definition of the Depth-first search algorithm, but I don't have full comprehension of what to do.
I don't have any completed piece of code to provide here, because I am not sure how to:
store the results (along with A->B->C-> we should also store A->B and A->B->C);
represent the graph (digraph? list of tuples?);
how many recursions to use (work with each adjacent vertex?).
How can I find all possible paths form one given source vertex in a directed cyclic graph in Erlang?
UPD: Based on the answers so far I have to redefine the graph definition: it is a non-acyclic graph. I know that if my recursive function hits a cycle it is an indefinite loop. To avoid that, I can just check if a current vertex is in the list of the resulting path - if yes, I stop traversing and return the path.
UPD2: Thanks for thought provoking comments! Yes, I need to find all simple paths that do not have loops from one source vertex to all the others.
In a graph like this:
with the source vertex A the algorithm should find the following paths:
A,B
A,B,C
A,B,C,D
A,D
A,D,C
A,D,C,B
The following code does the job, but it is unusable with graphs that have more that 20 vertices (I guess it is something wrong with recursion - takes too much memory, never ends):
dfs(Graph,Source) ->
?DBG("Started to traverse graph~n", []),
Neighbours = digraph:out_neighbours(Graph,Source),
?DBG("Entering recursion for source vertex ~w~n", [Source]),
dfs(Neighbours,[Source],[],Graph,Source),
ok.
dfs([],Paths,Result,_Graph,Source) ->
?DBG("There are no more neighbours left for vertex ~w~n", [Source]),
Result;
dfs([Neighbour|Other_neighbours],Paths,Result,Graph,Source) ->
?DBG("///The neighbour to check is ~w, other neighbours are: ~w~n",[Neighbour,Other_neighbours]),
?DBG("***Current result: ~w~n",[Result]),
New_result = relax_neighbours(Neighbour,Paths,Result,Graph,Source),
dfs(Other_neighbours,Paths,New_result,Graph,Source).
relax_neighbours(Neighbour,Paths,Result,Graph,Source) ->
case lists:member(Neighbour,Paths) of
false ->
?DBG("Found an unvisited neighbour ~w, path is: ~w~n",[Neighbour,Paths]),
Neighbours = digraph:out_neighbours(Graph,Neighbour),
?DBG("The neighbours of the unvisited vertex ~w are ~w, path is:
~w~n",[Neighbour,Neighbours,[Neighbour|Paths]]),
dfs(Neighbours,[Neighbour|Paths],Result,Graph,Source);
true ->
[Paths|Result]
end.
UPD3:
The problem is that the regular depth-first search algorithm will go one of the to paths first: (A,B,C,D) or (A,D,C,B) and will never go the second path.
In either case it will be the only path - for example, when the regular DFS backtracks from (A,B,C,D) it goes back up to A and checks if D (the second neighbour of A) is visited. And since the regular DFS maintains a global state for each vertex, D would have 'visited' state.
So, we have to introduce a recursion-dependent state - if we backtrack from (A,B,C,D) up to A, we should have (A,B,C,D) in the list of the results and we should have D marked as unvisited as at the very beginning of the algorithm.
I have tried to optimize the solution to tail-recursive one, but still the running time of the algorithm is unfeasible - it takes about 4 seconds to traverse a tiny graph of 16 vertices with 3 edges per vertex:
dfs(Graph,Source) ->
?DBG("Started to traverse graph~n", []),
Neighbours = digraph:out_neighbours(Graph,Source),
?DBG("Entering recursion for source vertex ~w~n", [Source]),
Result = ets:new(resulting_paths, [bag]),
Root = Source,
dfs(Neighbours,[Source],Result,Graph,Source,[],Root).
dfs([],Paths,Result,_Graph,Source,_,_) ->
?DBG("There are no more neighbours left for vertex ~w, paths are ~w, result is ~w~n", [Source,Paths,Result]),
Result;
dfs([Neighbour|Other_neighbours],Paths,Result,Graph,Source,Recursion_list,Root) ->
?DBG("~w *Current source is ~w~n",[Recursion_list,Source]),
?DBG("~w Checking neighbour _~w_ of _~w_, other neighbours are: ~w~n",[Recursion_list,Neighbour,Source,Other_neighbours]),
? DBG("~w Ready to check for visited: ~w~n",[Recursion_list,Neighbour]),
case lists:member(Neighbour,Paths) of
false ->
?DBG("~w Found an unvisited neighbour ~w, path is: ~w~n",[Recursion_list,Neighbour,Paths]),
New_paths = [Neighbour|Paths],
?DBG("~w Added neighbour to paths: ~w~n",[Recursion_list,New_paths]),
ets:insert(Result,{Root,Paths}),
Neighbours = digraph:out_neighbours(Graph,Neighbour),
?DBG("~w The neighbours of the unvisited vertex ~w are ~w, path is: ~w, recursion:~n",[Recursion_list,Neighbour,Neighbours,[Neighbour|Paths]]),
dfs(Neighbours,New_paths,Result,Graph,Neighbour,[[[]]|Recursion_list],Root);
true ->
?DBG("~w The neighbour ~w is: already visited, paths: ~w, backtracking to other neighbours:~n",[Recursion_list,Neighbour,Paths]),
ets:insert(Result,{Root,Paths})
end,
dfs(Other_neighbours,Paths,Result,Graph,Source,Recursion_list,Root).
Any ideas to run this in acceptable time?
Edit:
Okay I understand now, you want to find all simple paths from a vertex in a directed graph. So a depth-first search with backtracking would be suitable, as you have realised. The general idea is to go to a neighbour, then go to another one (not one which you've visited), and keep going until you hit a dead end. Then backtrack to the last vertex you were at and pick a different neighbour, etc.
You need to get the fiddly bits right, but it shouldn't be too hard. E.g. at every step you need to label the vertices 'explored' or 'unexplored' depending on whether you've already visited them before. The performance shouldn't be an issue, a properly implemented algorithm should take maybe O(n^2) time. So I don't know what you are doing wrong, perhaps you are visiting too many neighbours? E.g. maybe you are revisiting neighbours that you've already visited, and going round in loops or something.
I haven't really read your program, but the Wiki page on Depth-first Search has a short, simple pseudocode program which you can try to copy in your language. Store the graphs as Adjacency Lists to make it easier.
Edit:
Yes, sorry, you are right, the standard DFS search won't work as it stands, you need to adjust it slightly so that does revisit vertices it has visited before. So you are allowed to visit any vertices except the ones you have already stored in your current path.
This of course means my running time was completely wrong, the complexity of your algorithm will be through the roof. If the average complexity of your graph is d+1, then there will be approximately d*d*d*...*d = d^n possible paths.
So even if every vertex has only 3 neighbours, there's still quite a few paths when you get above 20 vertices.
There's no way around that really, because if you want your program to output all possible paths then indeed you will have to output all d^n of them.
I'm interested to know whether you need this for a specific task, or are just trying to program this out of interest. If the latter, you will just have to be happy with small, sparsely connected graphs.
I don't understand question. If I have graph G = (V, E) = ({A,B}, {(A,B),(B,A)}), there is infinite paths from A to B {[A,B], [A,B,A,B], [A,B,A,B,A,B], ...}. How I can find all possible paths to any vertex in cyclic graph?
Edit:
Did you even tried compute or guess growing of possible paths for some graphs? If you have fully connected graph you will get
2 - 1
3 - 4
4 - 15
5 - 64
6 - 325
7 - 1956
8 - 13699
9 - 109600
10 - 986409
11 - 9864100
12 - 108505111
13 - 1302061344
14 - 16926797485
15 - 236975164804
16 - 3554627472075
17 - 56874039553216
18 - 966858672404689
19 - 17403456103284420
20 - 330665665962403999
Are you sure you would like find all paths for all nodes? It means if you compute one milion paths in one second it would take 10750 years to compute all paths to all nodes in fully connected graph with 20 nodes. It is upper bound for your task so I think you don't would like do it. I think you want something else.
Not an improved algorithmic solution by any means, but you can often improve performance by spawning multiple worker threads, potentially here one for each first level node and then aggregating the results. This can often improve naive brute force algorithms relatively easily.
You can see an example here: Some Erlang Matrix Functions, in the maximise_assignment function (comments starting on line 191 as of today). Again, the underlying algorithm there is fairly naive and brute force, but the parallelisation speeds it up quite well for many forms of matrices.
I have used a similar approach in the past to find the number of Hamiltonian Paths in a graph.

Resources