Finding all topological orders - graph

I have to write an algorithm that finds all the topological orders (using predecessor counting), as well as the highest-cost paths and their costs between two pairs of vertices. My algorithm currently looks like this:
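# Requires `from collections import deque` at module level.
def topologicalSort(self):
    sorted = []
    count = {}
    q = deque()
    # Count the predecessors of every vertex; those with none can start.
    for x in self.parseX():
        count[x] = self.innerDegree(x)
        if count[x] == 0:
            q.append(x)
    while len(q) > 0:
        x = q.popleft()
        sorted.append(x)
        # Removing x frees one predecessor of each of its successors.
        for y in self.parseNout(x):
            count[y] -= 1
            if count[y] == 0:
                q.append(y)
    return sorted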
It works fine, but the problem is that it finds only one topological order. My question is: how can I make it find all the topological orders?

Your loops run in a fixed order. Different topological sorts are obtained by iterating over them in different orders. So you need another level of recursion: at each step, try each vertex that currently has no remaining predecessors as the next one in the order, and recurse on the rest.
I'd elaborate, but a cursory search found several pages that appear to describe the algorithm you want (albeit in different languages):
https://www.geeksforgeeks.org/all-topological-sorts-of-a-directed-acyclic-graph/
https://www.techiedelight.com/find-all-possible-topological-orderings-of-dag/
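As a rough illustration of the "extra level of recursion" idea, here is a minimal backtracking sketch in Python. It is not taken from the linked pages. The function name `all_topological_sorts` and the input format (a dict mapping each vertex to a list of its successors) are assumptions made for this example, and vertices are assumed to be mutually comparable so they can be tried in sorted order.

```python
def all_topological_sorts(graph):
    """Yield every topological order of a DAG given as {vertex: [successors]}.

    Hypothetical helper for illustration; the input format is an assumption.
    """
    # Collect every vertex, including ones that only appear as successors.
    nodes = set(graph)
    for succs in graph.values():
        nodes.update(succs)

    # Predecessor counting, as in the question's single-order version.
    indeg = {u: 0 for u in nodes}
    for succs in graph.values():
        for v in succs:
            indeg[v] += 1

    order = []
    placed = set()

    def backtrack():
        if len(order) == len(nodes):
            yield list(order)
            return
        # Try every currently available vertex as the next one in the order.
        for u in sorted(nodes):
            if u not in placed and indeg[u] == 0:
                placed.add(u)
                order.append(u)
                for v in graph.get(u, ()):
                    indeg[v] -= 1
                yield from backtrack()
                # Undo the choice before trying the next candidate.
                for v in graph.get(u, ()):
                    indeg[v] += 1
                order.pop()
                placed.remove(u)

    yield from backtrack()


orders = list(all_topological_sorts({1: [3], 2: [3], 3: []}))
```

If the graph contains a cycle, the generator yields nothing, because the vertices on the cycle never reach a predecessor count of zero. Note that the number of orders can grow factorially with the number of vertices.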

Related

Detecting cycles in Topological sort using Kahn's algorithm (in degree / out degree)

I have been practicing graph questions lately.
https://leetcode.com/problems/course-schedule-ii/
https://leetcode.com/problems/alien-dictionary/
The current way I detect cycles is to use two hashsets. One for visiting nodes, and one for fully visited nodes. And I push the result onto a stack with DFS traversal.
If I ever visit a node that is currently in the visiting set, then it is a cycle.
The code is pretty verbose and long.
Can anyone please explain how I can use a more standard top-sort algorithm (Kahn's) to detect cycles and generate the top sort sequence?
I just want my method to exit or set some global variable which flags that a cycle has been detected.
Many thanks.
Kahn's algorithm with cycle detection (summary)
Step 1: Compute in-degrees: First we compute a lookup for the in-degree of every node. In this particular Leetcode problem, each node has a unique integer identifier, so we can simply store all the in-degree values in a list where indegree[i] tells us the in-degree of node i.
Step 2: Keep track of all nodes with in-degree of zero: If a node has an in-degree of zero, it is a course that we can take right now. There are no other courses that it depends on. We create a queue q of all the nodes that have an in-degree of zero. At any step of Kahn's algorithm, if a node is in q then it is guaranteed that it's "safe to take this course" because it does not depend on any courses that "we have not taken yet".
Step 3: Delete node and edges: We take one of these special safe courses x from the queue q and conceptually treat everything as if we have deleted the node x and all its outgoing edges from the graph g. In practice, we don't need to update the graph g; for Kahn's algorithm it is sufficient to update the in-degree values of its neighbours to reflect that this node no longer exists.
This step is basically as if a person took and passed the exam for
course x, and now we want to update the other courses' dependencies
to show that they don't need to worry about x anymore.
Step 4: Repeat: When we remove these edges from x, we decrease the in-degree of x's neighbours; this can introduce more nodes with an in-degree of zero. During this step, if any more nodes have their in-degree become zero then they are added to q. We repeat step 3 to process these nodes. Each time we remove a node from q we add it to the final topological sort list result.
Step 5: Detecting a cycle with Kahn's algorithm: If there is a cycle in the graph then result will not include all the nodes in the graph; it will contain only some of them. To check if there is a cycle, you just need to check whether the length of result is equal to the number of nodes in the graph, n.
Why does this work?:
Suppose there is a cycle in the graph: x1 -> x2 -> ... -> xk -> x1. Then none of these nodes will appear in the list, because their in-degree will not reach 0 during Kahn's algorithm. Each node xi in the cycle can't be put into the queue q because there is always some other predecessor node x_(i-1) on the cycle with an edge going from x_(i-1) to xi preventing this from happening (for x1, that predecessor is xk).
Full solution to Leetcode course-schedule-ii in Python 3:
from collections import defaultdict

def build_graph(edges, n):
    g = defaultdict(list)
    for i in range(n):
        g[i] = []
    for a, b in edges:
        g[b].append(a)
    return g

def topsort(g, n):
    # -- Step 1 --
    indeg = [0] * n
    for u in g:
        for v in g[u]:
            indeg[v] += 1
    # -- Step 2 --
    q = []
    for i in range(n):
        if indeg[i] == 0:
            q.append(i)
    # -- Step 3 and 4 --
    result = []
    while q:
        x = q.pop()
        result.append(x)
        for y in g[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                q.append(y)
    return result

def courses(n, edges):
    g = build_graph(edges, n)
    ordering = topsort(g, n)
    # -- Step 5 --
    has_cycle = len(ordering) < n
    return [] if has_cycle else ordering

Translating recursion to divide and conquer

I'm trying to return the three smallest items in a list. Below is an O(n) solution I've written:
def three_smallest(L):
    # base case
    if (len(L) == 3):
        return sorted(L)
    current = L[0]
    (first_smallest,second_smallest,third_smallest) = three_smallest(L[1:])
    if (current < first_smallest):
        return (current, first_smallest, second_smallest)
    elif (current < second_smallest):
        return (first_smallest, current, second_smallest)
    elif (current < third_smallest):
        return (first_smallest, second_smallest, current)
    else:
        return (first_smallest,second_smallest,third_smallest)
Now I'm trying to write a divide and conquer approach but I'm not sure how I should divide the list. Any help would be appreciated.
Note that this solution (to my understanding) is NOT divide and conquer. It is just a basic recursive solution as divide and conquer involves dividing a list of length n by integer b, and calling the algorithm on those parts.
For divide and conquer you generally want to divide a set (roughly) in half rather than whittle it away one by one. Perhaps the following would meet your needs?
def three_smallest(L):
    if (len(L) <= 3): # base case, technically up to 3 (can be fewer)
        return sorted(L)
    mid = len(L) // 2 # find the midpoint of L
    # find (up to) the 3 smallest in first half, ditto for the second half, pool
    # them to a list with no more than 6 items, sort, and return the 3 smallest
    return sorted(three_smallest(L[:mid]) + three_smallest(L[mid:]))[:3]
Note that due to the inequality in the base case, this implementation does not have a minimum list size requirement.
At the other end of the scale, your original implementation is limited to lists with fewer than a thousand values or it will blow out the recursion stack. The implementation given above was able to handle a list of a million values with no problem.
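One caveat on running time: the slices L[:mid] and L[mid:] copy their halves, so every level of the recursion does O(n) copying in total and the version above is O(n log n) rather than O(n). A possible variant, sketched below under the assumption that passing index bounds is acceptable, avoids the copies. The name `three_smallest_idx` and the `lo`/`hi` parameters are made up for this illustration.

```python
def three_smallest_idx(L, lo=0, hi=None):
    """Return (up to) the three smallest items of L[lo:hi] as a sorted list."""
    if hi is None:
        hi = len(L)
    if hi - lo <= 3:
        # Base case: this slice copies at most three items.
        return sorted(L[lo:hi])
    mid = (lo + hi) // 2
    # Each half contributes at most three candidates, so merging is O(1) work.
    left = three_smallest_idx(L, lo, mid)
    right = three_smallest_idx(L, mid, hi)
    return sorted(left + right)[:3]


result = three_smallest_idx([5, 1, 4, 2, 3, 0])
```

With constant work per call, the recurrence becomes T(n) = 2T(n/2) + O(1), which solves to O(n). The recursion depth stays around log2(n), as in the sliced version.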

Recurrence Analysis (Time complexity)

I'm writing a boolean function that checks whether two binary trees are identical.
Here is the program:
boolean func(Node head1, Node head2) {
    // If either tree is empty, they match only when both are empty.
    if (head1 == null || head2 == null) return head1 == null && head2 == null;
    // Otherwise compare the left subtrees, then the right subtrees.
    return func(head1.left, head2.left) && func(head1.right, head2.right);
}
I know that in the worst case the program checks n elements, so it's O(n),
and I want to describe this recursive function as a recurrence T(n).
I don't know where to start because I don't know what the value at the stopping point is.
I think that the function is T(n,m) = 2*T(n-1,m-1) + n + m.
Let n be the number of nodes in the binary trees. If the binary trees have different numbers of nodes, then n is the smaller of the two sizes (since you'll stop comparing once you reach the end of one tree).
On each call you are doing some comparisons between the nodes of the tree. This takes constant time, so you can call this time spent d, which represents some arbitrary constant.
Lastly, you're making 2 recursive calls for the current node's 2 children. Assume the tree is balanced, so that each child is the root of a subtree that holds about half the total number of nodes in the tree. In other words, if you have a tree with n nodes, and you are looking at the root node, then each child of the root has about n/2 nodes below it (and including it).
So your recurrence is as follows:
T(n) = 2*T((n-1)/2) + d
You can simplify this to:
T(n) = 2*T(n/2) + d
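To see that this recurrence is consistent with the O(n) bound from the question, it can be unrolled as follows. This is a sketch that assumes n is a power of two and that T(1) is a constant.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + d \\
     &= 4\,T(n/4) + 2d + d \\
     &= 2^{k}\,T\!\left(n/2^{k}\right) + d\,(2^{k} - 1) \\
     &= n\,T(1) + d\,(n - 1) \qquad \text{taking } k = \log_2 n \\
     &= O(n)
\end{aligned}
```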

Ideas on how to beat a greedy search algorithm in maze by collecting maximum items?

First I'll explain the problem. I have a player in a closed maze filled with items that he should collect to win the game. We also have an opponent who tries to do just the same. The player with the largest number of items collected wins. Suppose the opponent follows a BFS algorithm to collect the items, and we have access to all its decisions for every turn. Can we predict which items in the maze we should go to first (so that the opponent doesn't get a chance to take the ones close to it), or just pinpoint a location where items are more dense?
It feels like randomness could also affect this very badly (for example, if most of the items land next to the opponent). What if the opponent follows an A* algorithm?
I have already implemented an A* algorithm for our player. First, I look for the closest item heuristically using Manhattan distance, then I go collect it and look for the new closest one again, and so on. I feel like the "looking for the closest item" method might not be that efficient; maybe pinpointing (somehow, haha) a location where the items are more dense is better, as I said.
def astar(start, items, mazeMap):
    # mazeMap is a dictionary with nodes and as a key for every node
    # is associated another dictionary containing the neighbors as keys
    # and the weight of edges to them as values
    # items is a list of pairs giving the location of each item
    # Apparent goal
    # goal is a pair (closest_item, distance_to_closest_item)
    goal = closest_item(start, items)
    # Set of nodes not needed to be checked anymore
    # closedSet = {node: [gscore, fscore]}
    closedSet = {}
    # Set of potential short-path nodes
    # openSet = {node: [gscore, fscore]}
    openSet = {start: [0, goal[1]]}
    # Set to construct the optimal path
    cameFrom = {}
    while len(openSet) > 0:
        # Looking for the node with the smallest fscore
        current = list(openSet.keys())[0]
        for keys, values in openSet.items():
            if values[1] < openSet[current][1]:
                current = keys
        # If the chosen node is an item of cheese, we are done
        if current in items:
            return reconstruct_path(cameFrom, current)
        # The current node no longer needs to be checked
        closedSet[current] = openSet[current]
        del openSet[current]
        for keys, values in mazeMap[current].items():
            # We don't need to check the node if it's already been done
            if keys in list(closedSet.keys()):
                continue
            # Calculate Gscore
            tentative_gscore = closedSet[current][0] + values
            if keys not in list(openSet.keys()):
                openSet[keys] = [0, 0]
            elif tentative_gscore >= openSet[keys][0]:
                continue
            # This new path is better than the previous one, save it !
            cameFrom[keys] = current
            openSet[keys][0] = tentative_gscore
            openSet[keys][1] = tentative_gscore + manhattan_distance(keys, goal[0])
    return "Impossible"

Weakly connected graph traversal from least number of nodes

I've been given the following exercise: There's an unweighted, directed, weakly connected graph with n nodes (n < 1 000 000). We want to traverse the whole graph, starting from the least number of nodes. The question is: from which nodes do I start the traversals? I couldn't find any content on this particular topic. However, I managed to come up with an algorithm, but it's not efficient enough:
I store the graph in an adjacency list (n can be too high for a two-dimensional matrix)
I start a BFS from each node i, and store the nodes it reached in x[i][...] (x = List<List<int>>)
I check whether any x[i].Count == n
I check whether any (x[i] union x[j]).Count == n
I check whether any (x[i] union x[j] union x[k]).Count == n
... So I make all possible unions of 2, 3, 4... subsets of x, and check whether its count is n.
It works all right if n is not too high, but I would need a more efficient algorithm for bigger n.
Any help is appreciated (you would make me be able to fall asleep again)! :)
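Find the nodes that do not have any incoming edges. Loop over these nodes, and for each node v, begin traversing the graph. Remember which nodes you visited (by putting them in a hash table or marking them). Stop traversing when you reach a node you have already visited. Note that this assumes the graph is acyclic: if it contains cycles, a strongly connected component that no outside edge enters has no node with zero incoming edges, and you would need to start from one node of each such component instead.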
You would need an adjacency list representation, where each node has a list of incoming and a list of outgoing edges. Then do something like this:
Set nodesToVisit = emptySet;
for i=1 to n:
    if incoming[i].size() == 0:
        nodesToVisit.add(i)
Set visited = emptySet;
// removeAny() stands for "take any element out of the set"
while nodesToVisit is not empty:
    v = nodesToVisit.removeAny()
    if(v is not in visited):
        visit(v);
        visited.add(v);
        for u in outgoing[v]:
            nodesToVisit.add(u)
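For graphs that may contain cycles, one common way to generalise the idea is to group nodes into strongly connected components (SCCs) and start one traversal in each component that has no edge entering it from another component. The sketch below uses Kosaraju's two-pass algorithm with explicit stacks, so that large n does not hit Python's recursion limit. The function name `min_start_nodes`, the 0-based node numbering, and the edge-list input are assumptions made for this example; they are not part of the original answer.

```python
def min_start_nodes(n, edges):
    """Return one start node per SCC that has no incoming edge from outside.

    Nodes are assumed to be numbered 0..n-1; edges is a list of (u, v) pairs.
    """
    out = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
        rev[v].append(u)

    # Pass 1: iterative DFS on the original graph, recording finish order.
    seen = [False] * n
    order = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]  # (node, index of the next outgoing edge to explore)
        while stack:
            u, i = stack.pop()
            if i < len(out[u]):
                stack.append((u, i + 1))
                v = out[u][i]
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)  # all successors of u are finished

    # Pass 2: DFS on the reversed graph, in decreasing finish time,
    # labels each strongly connected component.
    comp = [-1] * n
    reps = []  # one representative node per component
    for s in reversed(order):
        if comp[s] != -1:
            continue
        c = len(reps)
        reps.append(s)
        comp[s] = c
        stack = [s]
        while stack:
            u = stack.pop()
            for v in rev[u]:
                if comp[v] == -1:
                    comp[v] = c
                    stack.append(v)

    # A component needs its own start node only if no edge enters it
    # from a different component.
    has_incoming = [False] * len(reps)
    for u, v in edges:
        if comp[u] != comp[v]:
            has_incoming[comp[v]] = True
    return [reps[c] for c in range(len(reps)) if not has_incoming[c]]


starts = min_start_nodes(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
```

In this usage example, nodes 0, 1 and 2 form a cycle that leads into node 3, so a single start node on the cycle is enough. The whole procedure runs in time linear in the number of nodes plus edges, which should be workable for n close to 1 000 000.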

Resources