Find all BFS/DFS traversals - graph

Given an undirected cyclic graph, I want to find all possible traversals with Breadth-First search or Depth-First search. That is given a graph as an adjacency-list:
A-BC
B-A
C-ADE
D-C
E-C
So all BFS paths from root A would be:
{ABCDE,ABCED,ACBDE,ACBED}
and for DFS:
{ABCDE,ABCED,ACDEB,ACEDB}
How would I generate those traversals algorithmically in a meaningful way? I suppose one could generate all permutations of letters and check their validity, but that seems like last-resort to me.
Any help would be appreciated.

Apart from the obvious way where you actually perform all possible DFS and BFS traversals you could try this approach:
Step 1.
In a dfs traversal starting from the root A transform the adjacency list of the currently visited node like so: First remove the parent of the node from the list. Second generate all permutations of the remaining nodes in the adj list.
So if you are at node C having come from node A you will do:
C -> ADE transform into C -> DE transform into C -> [DE, ED]
Step 2.
After step 1 you have the following transformed adj list:
A -> [CB, BC]
B -> []
C -> [DE, ED]
D -> []
E -> []
Now you launch a processing starting from (A,0), where the first item in the pair is the traversal path and the second is an index. Lets assume we have two queues. A BFS queue and a DFS queue. We put this pair into both queues.
Now we repeat the following, first for one queue until it is empty and then for the other queue.
We pop the first pair off the queue. We get (A,0). The node A maps to [BC, CB]. So we generate two new paths (ACB,1) and (ABC,1). Put these new paths in the queue.
Take the first one of these off the queue to get (ACB,1). The index is 1 so we look at the second character in the path string. This is C. Node C maps to [DE, ED].
The BFS children of this path would be (ACBDE,2) and (ACBED,2) which we obtained by appending the child permutation.
The DFS children of this path would be (ACDEB,2) and (ACEDB,2) which we obtained by inserting the child permutation right after C into the path string.
We generate the new paths according to which queue we are working on, based on the above and put them in the queue. So if we are working on the BFS queue we put in (ACBDE,2) and (ACBED,2). The contents of our queue are now : (ABC,1) , (ACBDE,2), (ACBED,2).
We pop (ABC,1) off the queue. Generate (ABC,2) since B has no children. And get the queue :
(ACBDE,2), (ACBED,2), (ABC,2) and so on. At some point we will end up with a bunch of pairs where the index is not contained in the path. For example if we get (ACBED,5) we know this is a finished path.

BFS is should be quite simple: each node has a certain depth at which it will be found. In your example you find A at depth 0, B and C at depth 1 and E and D at depth 2. In each BFS path, you will have the element with depth 0 (A) as the first element, followed by any permutation of the elements at depth 1 (B and C), followed by any permutation of the elements at depth 2 (E and D), etc...
If you look at your example, your 4 BFS paths match that pattern. A is always the first element, followed by BC or CB, followed by DE or ED. You can generalize this for graphs with nodes at deeper depths.
To find that, all you need is 1 Dijkstra search which is quite cheap.
In DFS, you don't have the nice separation by depth which makes BFS straightforward. I don't immediately see an algorithm that is as efficient as the one above. You could set up a graph structure and build up your paths by traversing your graph and backtracking. There are some cases in which this would not be very efficient but it might be enough for your application.

Related

Write a pseudo code for a Graph algorithm

Given a DAG and a function which maps every vertex to a unique number from 1 to , I need to write a pseudo code for an algorithm that finds for every the minimal value of , among all vertices u that are reachable from v, and save it a an attribute of v. The time complexity of the algorithm needs to be (assuming that time complexity of is ).
I thought about using DFS (or a variation of it) and/or topological sort, but I don't know how to use it in order to solve this problem.
In addition, I need to think about an algorithm that gets an undirected graph and the function , and calculate the same thing for every vertex, and I don't know how to do it either.
Your idea of using DFS is right. Actually the function f(v) is only given for saying that each node can be uniquely identified from a number between 1 and |V|.
Just a hint for solving, you would have to modify DFS so that it returns the minimum value of f(v) of the vertex v reachable from this node and save it another array, let's say minReach and the index would be given by the function f(v). The visited array vis would similarly be identified using f(v).
I am also giving the pseudocode if you are not able to solve but try on your own first.
The pseudocode is similar to python and assuming the graph and function f(v) are available. And 0-based indexing is assumed.
vis=[0, 0, 0, .. |V| times] # visited array for dfs
minReach= [1, 2, 3, .. |V| times] # array for storing no. reachable nodes from a node
function dfs(node):
vis[f(node)-1]=1 ### mark unvisited node as visited
for v in graph[node]:
if vis[v]!=1: ## check whether adjacent node is visited
minReach[f(node)-1]=min(minReach[f(node)-1],dfs(v) ## if not visited apply dfs again
else:
minReach[f(node)-1]=min(minReach[f(node)-1],countReach[f(v)-1]) ## else store the minimum node that can be reached from this node.
return countReach[f(node)-1]
for vertex in graph: ### each vertex is checked in graph
if vis[f(vertex)-1]!=1: ### if vertex is not visited dfs is applied
dfs(vertex)

Depth First Search

Perform Depth-first Search on the graph shown starting with vertex a. When you traverse the neighbours, process them in alphabetical order.
The question is to find the DFI, Level and the Parent of each vertex.
Here is a picture of it:
I'm unsure of how to get going with this, it is a practice question for an upcoming exam. I know for depth first search, it uses a stack and it will start at vertex a and go in alphabetical order in the stack but i'm not sure how I would get the values for each of the columns. Can someone explain further or help me with this?
So you start at 'a' and must traverse the nodes in alphabetical order so from a you either have the option of going to b or g so you choose b because it is first alphabetically. from b your only choice is g and so on....
now for your values. the parent of a is null since you have no previous nodes the parent of b is a and the parent of g is b and so on.
the dfs level is the level that it would end up on a tree. so imagine that you do your traversal then erase all lines that weren't part of the traversal. and then you take your root and 'shake it out' what i mean is you rearrange it so that it looks like a tree. (this particular graph is very uninteresting) and then you assign levels based on that tree.
And the dfs index is simply the order in which you touched the nodes.
The folowing are for your graph but using g as a starting point....I think it makes it slightly more intersting
the numbers are the order in which the edges were taken.
Here is what i was talking about when i said 'shake it out' this is what your tree looks like and in blue i show the level of each node(0 based). I hope the images make it a little more understandable.
the one i drew( the terrible free hand one) was formed by deleting all of the edges that weren't used and then rearranging them to look like a tree.
You can think of the depth as how many steps did i have to take from the root to get to the current node. so from g to b is 1 step so depth of 1 from g to i 3 because we go from g->c->d->i 3 steps. after you have made your traversal you ignore the fact that you can in fact get from g to i in two steps(g->h->i) because it wasnt part of the traversal
The index is simply the number in order that the node is visited. a is first, write 1 there. Knowing depth first search as you do, you should know what the second node is; so write 2 under that. Depth is how high a node is; every time you deepen the depth, it increases, and whenever you go shallower, it's less. So a is on depth 1; the next node and its sister will be on depth 2, etc. The parent is the letter identifying the node that you just came from; so a has no parent, and the node with index 2 will have a as parent.
If your class uses a zero-based numbering system, replace 2 in the above paragraph with 1, and 1 with 0. If you have no idea what "zero-based numbering system" is, ignore this paragraph.

finding the total number of distinct shortest paths between 2 nodes in undirected weighted graph in linear time?

I was wondering, that if there is a weighted graph G(V,E), and I need to find a single shortest path between any two vertices S and T in it then I could have used the Dijkstras algorithm. but I am not sure how this can be done when we need to find all the distinct shortest paths from S to T. Is it solvable on O(n) time? I had one more question like if we assume that the weights of the edges in the graph can assume values only in certain range lets say 1 <=w(e)<=2 will this effect the time complexity?
You can do it using Dijkstra. In Dijkstra's algorithm, you assign labels to each node based on the distance it has from your source and settle nodes according to this distance (smallest first). Once the target node is settled, you have the length of the shortest path. To retrieve the path (=the sequence of edges), you can keep track of the parent of each node you settle. To retrieve all possible paths, you have to account for multiple parents of each node.
A small example:
Suppose you have a graph which looks like this (all edges weight 1 for simplicity):
B
/ \
A C - D
\ /
E
When you do dijkstra to find the distance A->D, you would get 3. You would first settle node A with distance 0, then nodes B and E with distance 1, node C with distance 2 and finally node D with distance 3. To get the paths, you would remember the parents of the nodes when you settle them. In this case you would first set the parents of B and C =node A. You then need to set the parents of node C to B and E as they both supply the same length. Node D would get node C as a parent.
When you need the paths, you could just follow the parent pointers, starting at D and branching each time a node has multiple parents. This would give you a branch at node C.
The post mentioned in the comment above uses the same principles, although it does not allow you to actually extract the paths, it only gives you the count.
One final remark: Dijkstra is not a linear time algorithm as you need to do a lot of operations to maintain you queue and node sets.
(a) Using BFS from s to mark nodes visited as usual, in the meantime, changing following things:
(b) for each node v, it has a record L(v) representing the layer that v is on and a record f(v)
representing the number of incoming paths:
(c) everytime a node v1 is found in another node's v2 neighbors:
(d) if v1 is not in the queue, add it into queue, L(v1) = L(v2) + 1,f(v1)+ = f(v2)
(d) if v1 is already in the queue and L(v1) equals L(v2), do nothing.
(e) if v1 is already in the queue but L(v1) doe not equal to L(v2), f(v1)+ = f(v2)
(f) until in this modied BFS, t pop from the queue for the rst time, we have f(t)
(g) and this f(t) represents the number of dierent shortest paths from s to t
This might solve your problem.

Find all possible paths from one vertex in a directed cyclic graph in Erlang

I would like to implement a function which finds all possible paths to all possible vertices from a source vertex V in a directed cyclic graph G.
The performance doesn't matter now, I just would like to understand the algorithm. I have read the definition of the Depth-first search algorithm, but I don't have full comprehension of what to do.
I don't have any completed piece of code to provide here, because I am not sure how to:
store the results (along with A->B->C-> we should also store A->B and A->B->C);
represent the graph (digraph? list of tuples?);
how many recursions to use (work with each adjacent vertex?).
How can I find all possible paths form one given source vertex in a directed cyclic graph in Erlang?
UPD: Based on the answers so far I have to redefine the graph definition: it is a non-acyclic graph. I know that if my recursive function hits a cycle it is an indefinite loop. To avoid that, I can just check if a current vertex is in the list of the resulting path - if yes, I stop traversing and return the path.
UPD2: Thanks for thought provoking comments! Yes, I need to find all simple paths that do not have loops from one source vertex to all the others.
In a graph like this:
with the source vertex A the algorithm should find the following paths:
A,B
A,B,C
A,B,C,D
A,D
A,D,C
A,D,C,B
The following code does the job, but it is unusable with graphs that have more that 20 vertices (I guess it is something wrong with recursion - takes too much memory, never ends):
dfs(Graph,Source) ->
?DBG("Started to traverse graph~n", []),
Neighbours = digraph:out_neighbours(Graph,Source),
?DBG("Entering recursion for source vertex ~w~n", [Source]),
dfs(Neighbours,[Source],[],Graph,Source),
ok.
dfs([],Paths,Result,_Graph,Source) ->
?DBG("There are no more neighbours left for vertex ~w~n", [Source]),
Result;
dfs([Neighbour|Other_neighbours],Paths,Result,Graph,Source) ->
?DBG("///The neighbour to check is ~w, other neighbours are: ~w~n",[Neighbour,Other_neighbours]),
?DBG("***Current result: ~w~n",[Result]),
New_result = relax_neighbours(Neighbour,Paths,Result,Graph,Source),
dfs(Other_neighbours,Paths,New_result,Graph,Source).
relax_neighbours(Neighbour,Paths,Result,Graph,Source) ->
case lists:member(Neighbour,Paths) of
false ->
?DBG("Found an unvisited neighbour ~w, path is: ~w~n",[Neighbour,Paths]),
Neighbours = digraph:out_neighbours(Graph,Neighbour),
?DBG("The neighbours of the unvisited vertex ~w are ~w, path is:
~w~n",[Neighbour,Neighbours,[Neighbour|Paths]]),
dfs(Neighbours,[Neighbour|Paths],Result,Graph,Source);
true ->
[Paths|Result]
end.
UPD3:
The problem is that the regular depth-first search algorithm will go one of the to paths first: (A,B,C,D) or (A,D,C,B) and will never go the second path.
In either case it will be the only path - for example, when the regular DFS backtracks from (A,B,C,D) it goes back up to A and checks if D (the second neighbour of A) is visited. And since the regular DFS maintains a global state for each vertex, D would have 'visited' state.
So, we have to introduce a recursion-dependent state - if we backtrack from (A,B,C,D) up to A, we should have (A,B,C,D) in the list of the results and we should have D marked as unvisited as at the very beginning of the algorithm.
I have tried to optimize the solution to tail-recursive one, but still the running time of the algorithm is unfeasible - it takes about 4 seconds to traverse a tiny graph of 16 vertices with 3 edges per vertex:
dfs(Graph,Source) ->
?DBG("Started to traverse graph~n", []),
Neighbours = digraph:out_neighbours(Graph,Source),
?DBG("Entering recursion for source vertex ~w~n", [Source]),
Result = ets:new(resulting_paths, [bag]),
Root = Source,
dfs(Neighbours,[Source],Result,Graph,Source,[],Root).
dfs([],Paths,Result,_Graph,Source,_,_) ->
?DBG("There are no more neighbours left for vertex ~w, paths are ~w, result is ~w~n", [Source,Paths,Result]),
Result;
dfs([Neighbour|Other_neighbours],Paths,Result,Graph,Source,Recursion_list,Root) ->
?DBG("~w *Current source is ~w~n",[Recursion_list,Source]),
?DBG("~w Checking neighbour _~w_ of _~w_, other neighbours are: ~w~n",[Recursion_list,Neighbour,Source,Other_neighbours]),
? DBG("~w Ready to check for visited: ~w~n",[Recursion_list,Neighbour]),
case lists:member(Neighbour,Paths) of
false ->
?DBG("~w Found an unvisited neighbour ~w, path is: ~w~n",[Recursion_list,Neighbour,Paths]),
New_paths = [Neighbour|Paths],
?DBG("~w Added neighbour to paths: ~w~n",[Recursion_list,New_paths]),
ets:insert(Result,{Root,Paths}),
Neighbours = digraph:out_neighbours(Graph,Neighbour),
?DBG("~w The neighbours of the unvisited vertex ~w are ~w, path is: ~w, recursion:~n",[Recursion_list,Neighbour,Neighbours,[Neighbour|Paths]]),
dfs(Neighbours,New_paths,Result,Graph,Neighbour,[[[]]|Recursion_list],Root);
true ->
?DBG("~w The neighbour ~w is: already visited, paths: ~w, backtracking to other neighbours:~n",[Recursion_list,Neighbour,Paths]),
ets:insert(Result,{Root,Paths})
end,
dfs(Other_neighbours,Paths,Result,Graph,Source,Recursion_list,Root).
Any ideas to run this in acceptable time?
Edit:
Okay I understand now, you want to find all simple paths from a vertex in a directed graph. So a depth-first search with backtracking would be suitable, as you have realised. The general idea is to go to a neighbour, then go to another one (not one which you've visited), and keep going until you hit a dead end. Then backtrack to the last vertex you were at and pick a different neighbour, etc.
You need to get the fiddly bits right, but it shouldn't be too hard. E.g. at every step you need to label the vertices 'explored' or 'unexplored' depending on whether you've already visited them before. The performance shouldn't be an issue, a properly implemented algorithm should take maybe O(n^2) time. So I don't know what you are doing wrong, perhaps you are visiting too many neighbours? E.g. maybe you are revisiting neighbours that you've already visited, and going round in loops or something.
I haven't really read your program, but the Wiki page on Depth-first Search has a short, simple pseudocode program which you can try to copy in your language. Store the graphs as Adjacency Lists to make it easier.
Edit:
Yes, sorry, you are right, the standard DFS search won't work as it stands, you need to adjust it slightly so that does revisit vertices it has visited before. So you are allowed to visit any vertices except the ones you have already stored in your current path.
This of course means my running time was completely wrong, the complexity of your algorithm will be through the roof. If the average complexity of your graph is d+1, then there will be approximately d*d*d*...*d = d^n possible paths.
So even if every vertex has only 3 neighbours, there's still quite a few paths when you get above 20 vertices.
There's no way around that really, because if you want your program to output all possible paths then indeed you will have to output all d^n of them.
I'm interested to know whether you need this for a specific task, or are just trying to program this out of interest. If the latter, you will just have to be happy with small, sparsely connected graphs.
I don't understand question. If I have graph G = (V, E) = ({A,B}, {(A,B),(B,A)}), there is infinite paths from A to B {[A,B], [A,B,A,B], [A,B,A,B,A,B], ...}. How I can find all possible paths to any vertex in cyclic graph?
Edit:
Did you even tried compute or guess growing of possible paths for some graphs? If you have fully connected graph you will get
2 - 1
3 - 4
4 - 15
5 - 64
6 - 325
7 - 1956
8 - 13699
9 - 109600
10 - 986409
11 - 9864100
12 - 108505111
13 - 1302061344
14 - 16926797485
15 - 236975164804
16 - 3554627472075
17 - 56874039553216
18 - 966858672404689
19 - 17403456103284420
20 - 330665665962403999
Are you sure you would like find all paths for all nodes? It means if you compute one milion paths in one second it would take 10750 years to compute all paths to all nodes in fully connected graph with 20 nodes. It is upper bound for your task so I think you don't would like do it. I think you want something else.
Not an improved algorithmic solution by any means, but you can often improve performance by spawning multiple worker threads, potentially here one for each first level node and then aggregating the results. This can often improve naive brute force algorithms relatively easily.
You can see an example here: Some Erlang Matrix Functions, in the maximise_assignment function (comments starting on line 191 as of today). Again, the underlying algorithm there is fairly naive and brute force, but the parallelisation speeds it up quite well for many forms of matrices.
I have used a similar approach in the past to find the number of Hamiltonian Paths in a graph.

Definition of cycle in undirected graph

I can't find a formal definition of cycle in an undirected graph. The CLRS only reports a definition of symple cycles that i can't manage to generalize for a generic cycle.
This is the CLRS definition: a symple cycle in a undirected graph is a path <v0,v1,..,vk> so that:
k >= 3
v0 = vk
v1,..,vk are distincts
So i tried to remove the 3 condition to define a generic cycle but this can't work because we can have something like this: <a, b, c, b, a> that is obviously not a cycle.
I think you can generalize 3 as follows:
Either v1,...,vk are distinct or they contain a simple cycle (a contiguous list of vertices satisfying 1,2,&3).
Path in graph theory means list of edges or/and vertices satisfying some connectivity conditions. Cycle is closed path, first and last list element are same. Wikipedia article
Path or cycle is called simple if there are no repeated vertices or edges other than the starting and ending vertices. Your example is cycle, but not simple cycle.

Resources