In this post biziclop inserted the pseudocode for the non-recursive Depth-First Search algorithm.
If we want to use the recursive DFS algorithm is used to check the nodes for a propriety, we can exploit two variants: pre-order (when a node is checked before his children) and post-order (when the children are checked before the node), plus a third variant (in-order: left subtree, then node, then right subtree) for binary tree only.
Since I am interested into having all three variants if possible, I tried to modify biziclop 's pseudocode in order to obtain all three variants of the DFS algorithm. Problem being, I got stuck on the fact that the node is added to the stack (and thus checked) before its children. Any idea?
why " (and thus checked)" ??
in recursive approach you choose to check current node first or first see it's children,here in the same way,you just see nodes,but when to check them?it's on you.
for example if you see it's children and checked them,in simple way just use flag to seen_its_children(not means checked)to handle it for post_order,and for in_order it is the same, in pre_order you do just what you said and it is enough.
Related
I am given a list of edges (each of which have a 'From' and 'To' properties that specify which vertices they connect).
I want to either return a null if they don't form the cycle in this (undirected) graph, or return the list forming a cycle.
Does anyone know how I would go about such a problem? I am clueless.
The way I've been taught to do this involves storing a list of visited vertices.
Navigate through the graph, storing each vertex and adding it to the list. Each turn, compare the current vertex to the list - if it is present, you've visited it before, and are therefore in a cycle.
Algorithms of this type are called Graph Cycle Detection Algorithms. There are some intricacies about which algorithms to select based on the needs of the application or the context of the problem. For example, do you want to find the first cycle, the shortest cycle, longest cycle, all of the cycles, is the graph unidirectional or bidirectional, etc.?
There are numerous cycle detection algorithms to select from, depending on the need and the allowable computational complexity and the nature of the cycle (e.g. finding first, longest, etc.). Some common algorithms include the following:
Floyd's Algorithm
Brent's Algoritm (see also here)
Tarjan's Strongly Connected Components Algorithm (see also this Stack Overflow post)
The specific algorithm you select will depend on your need and how efficient the algorithm must be. If serious efficiency is needed, I would suggest looking through some scholarly articles on the topic and compare and contrast some of the trade-offs of various algorithms.
My current project features a set of nodes with inputs and outputs. Each node can take its input values and generate some output values. Those outputs can be used as inputs for other nodes. To minimize the amount of computation needed, node dependencies are checked on application start. When updating the nodes, they are updated in the reverse order they depend on each other.
That said, the nodes resemble a directed graph. I am using iterative DFS (no recursion to avoid stack overflows in huge graphs) to work out the dependencies and create an order for updating the nodes.
I further want to avoid cycles in a graph because cyclic dependencies will break the updater algorithm and causing a forever running loop.
There are recursive approaches to finding cycles with DFS by tracking nodes on the recursion stack, but is there a way to do it iteratively? I could then embed the cycle search in the main dependency resolver to speed things up.
There are plenty of cycle-detection algorithms available on line. The simplest ones are augmented versions of Dijkstra's algorithm. You maintain a list of visited nodes and costs to get there. In your design, replace the "cost" with the path to get there.
In each iteration of the algorithm, you grab the next node on the "active" list and look at each node that follows it in the graph (i.e. each of its dependencies). If that node is on the "visited" list, then you have a cycle. The path you maintained in getting here shows the loop path.
Is that enough to get you moving?
Try a timestamp. Add a meta timestamp and set it to zero on your nodes.
Previous Answer (non applicable):
When you start a search, increment or grab a time() stamp. Then, when
you visit a node, compare it to the current search timestamp. If it
is the same, then you have found a cycle. If not then set the stamp
to current.
Next search, increment again.
Ok, this is how I'm assuming you are performing your DFS search:
Add Root node to a stack (for searching) and a vector (for updating).
Pop the stack and add children of the current node to the stack and to the vector
loop until stack is empty
reverse iterate the vector and update values (by referencing child nodes)
The problem: Cycles will cause the same set of nodes to be added to the stack.
Solution 1: Use a boolean/timestamp to see if the node has been visited before adding to the DFS search stack. This will eliminate cycles, but will not resolve them. You can spit out an error and quit.
Solution 2: Use a timestamp, but increment it each time you pop the stack. If a child node has a timestamp set, and it is less than the current stamp, you have found a cycle. Here's the kicker. When iterating over the values backwards, you can check the timestamps of the child nodes to see if they are greater than the current node. If less, then you've found a cycle, but you can use a default value.
In fact, I think Solution 1 can be resolved the same way by never following more than one child when updating the value and setting all nodes to a default value on start. Solution 2 will give you a warning while evaluating the graph whereas solution 1 only gives you a warning when creating the vector.
I've read A* Pathfinding for Beginners and looked at several source code implementations in C++ and other languages. I understand most of what is happening, with the exception of one possible issue I think I have found, and none of the tutorials/implementations I've found cover this.
When you get to this part:
If an adjacent square is already on the open list [...], if the G cost
of the new path is lower, change the parent of the adjacent square to
the selected square. Finally, recalculate both the F and G scores of
that square.
Changing the G score of a square should also change the G score of every child, right? That is, every square that already has this square as the parent, should get a new G score now also. So, shouldn't you find every child (and child of child) square in the open list and recalculate the G values? That will also change the F value, so if using a sorted list/queue, that also means a bunch of resorting.
Is this just not an actual problem, not worth the extra CPU for the extra calculations, and that is why the implementations I've seen just ignore this issue (do not update children)?
It depends on your heuristic.
For correctness, the basic A* algorithm requires that you have an admissible heuristic, that is, one that never overestimates the minimum cost of moving from a node to the goal. However, a search using an admissible heuristic may not always find the shortest path to intermediate nodes along the way. If that's the case with your heuristic, you might later find a shorter path to a node you've already visited and need to expand that node's children again. In this situation, you shouldn't use a closed list, as you need to be able to revisit nodes multiple times if you keep finding shorter routes.
However, if you use a consistent heuristic (meaning that the estimated cost of a node is never more than the estimated cost to one of its neighbors, plus the cost of moving from the node to that neighbor), you will only ever visit a node by the shortest path to it. That means that you can use a closed list and never revisit a node once you've expanded its children.
All consistent heuristics are admissible, but not all admissible heuristics are consistent. Most admissible heuristics are also consistent though, so you'll often seen descriptions and example code for A* that assumes the heuristic is consistent, even when it doesn't say so explicitly (or only mentions admissibility).
On the page you link to, the algorithm uses a closed list, so it requires a consistent heuristic to be guaranteed of finding an optimal path. However, the heuristic it uses (Manhattan distance) is not consistent (or admissible for that matter) given the way it handles diagonal moves. So while it might find the shortest path, it could also find some other path and incorrectly believe it is the shortest one. A more appropriate heuristic (Euclidean distance, for example) would be both admissible and consistent, and you'd be sure of not running into trouble.
#eselk : As the square, whose parent and G-score are to be updated, is still in OL, so this means that it has Not been expanded yet, and therefore there would be no child of the square in the OL. So updating G-scores of children and then their further children does not arise. Please let me know if this is not clear.
I have a cyclic graph-like structure that is represented by Node objects.
A Node is either a scalar value (leaf) or a list of n>=1 Nodes (inner node).
Because of the possible circular references, I cannot simply use a recursive HashCode() function, that combines the HashCode() of all child nodes: It would end up in an infinite recursion.
While the HashCode() part seems at least to be doable by flagging and ignoring already visited nodes, I'm having some troubles to think of a working and efficient algorithm for Equals().
To my surprise I did not find any useful information about this, but I'm sure many smart people have thought about good ways to solve these problems...right?
Example (python):
A = [ 1, 2, None ]; A[2] = A
B = [ 1, 2, None ]; B[2] = B
A is equal to B, because it represents exactly the same graph.
BTW. This question is not targeted to any specific language, but implementing hashCode() and equals() for the described Node object in Java would be a good practical example.
I would like to know a good answer as well. So far I use a solution based on the visited set.
When computing hash, I traverse the graph structure and I keep a set of visited nodes. I do not enter the same node twice. When I hit an already visited node, the hash returns a number without recursion.
This work even for the equality comparison. I compare node data and recursively invoke on the children. When I hit an already visited node, the comparison returns true without recursion.
If you think about this as graph, a leaf node is a node that has only one connection and a complex node is one with at least 2. So one you got it that way, implement a simple BFS algorithm witch applies the hash function to each node it passes and then drops the result. This way you ensure yourself that you wont fall in cicles or go through any node more than once.
The implementation could be very straihgtforward. Read about it here.
You need to walk the graphs.
Here's a question: are these graphs equal?
A = [1,2,None]; A[2] = A
B = [1,2,[1,2,None]]; B[2][2] = B
If so, you need a set of (Node, Node) tuples. Use this set to catch loops, and return 'true' when you catch a loop.
If not, you can be a bit more efficient, and use a map from Node to Node. Then, when walking the graphs, build up a set of correspondances. (In the case above, A would correspond to B, A[2] would correspond to B[2], &c.) Then when you visit a node pair you make sure the exact pair is in the map; if it isn't the graph doesn't match.
Is there an algorithm that will, if given two nodes on a graph, find a route between them that takes the specified number of hops? Any node can be connected to any other.
The points at the moment are located in 2D space, so I'm not sure if a graph is the best approach.
Have you tried iterated-deepening DFS?
If you have nodes are seeking to find routes in terms of hops, then a graph is probably the right approach. I'm not sure I understand what you are trying to do and what the constraints are, though, especially with respect to "Any Node can be connected to any other" .. which seems a bit open ended.
Disregarding that, however; with some graph representation:
It seems like starting at the first node, and doing a depth first search from there, and terminating a search if (a) the hops taken is larger than your specified number or (b) we have arrived at the second node; this will determine the first (not only) path connecting the two nodes in (at most) that many hops.
If it has to be exactly the specified hops, terminate any branch of the search if the hops have gone over, and terminate with success if you have also arrived at the second node.
Dumb approach: (data structure is array of stacks). This is basically doing Breadth First Search (BFS) to depth N, except that if loops are allowed (you did not clarify but I assume they are), you don't exclude the visited nodes from further searching.
Push starting node on a stack stored in the array at index 0 (index=depth)
For each level/index "l" 0-N:
For each node on a stack stored at level "l", find all its neighbors, and push them onto a stack stored in level "l+1".
Important: if your task allows finding paths that contain loops, do NOT check if you already visited any node you add. If it does not allow loops, use a hash of visited nodes to not add any node twice**
Stop when you end level "N-1".
Loop over all the nodes you just added to stack at index "N" and find your destination node. If found: success, if not, no such path.
Please note that if by "every node can be connected" you are implying a FULLY CONNECTED graph, then there exists a mathematical answer not involving actually visiting nodes
(however, the formula is too long to write in the text-entry field of StackOverflow)