A-star (A*) with "correct" heuristic function and without negative edges - graph

In A* heuristic there is a step that updates value of the node if better route to this node was found. But what if we had no negative edges and correct heuristic function (goal-aware, safe and consistent). Is it true that updating will no longer be necessary because we always get to that state first by the shortest path?
Considering the euclidean distance heuristic, it seems to me that it works but I am unable to generalize it in my thoughts as why it should. If not, can anyone provide me with a counter example or in other case confirm my initial though?
Context: I am solving a task with heuristic function which I don't really understand and I don't need to (pseudo-code is provided), but I am guaranteed it is (goal-aware, safe and consistent). The state space is huge so I am unable to build the graph so I am looking for a way how to completely omit remembering the graph and just keep a hash map so I know if I visited particular state before, therefore avoid the cycles.

So I have tested my hypothesis on various instances and it seems that even if the heuristic is (goal-aware, safe and consistent) and the "graph" is without negative edges, the first entry into the state might not be on the shortest possible path. (Some instances seemed to give proper result only if revisiting and updating the states as well as pushing them back into the openSet held by A* was supported.)
I wasn't able to isolate why, but when debugging, some states were visited multiple times and over the shorter path. My guess is that maybe it can happen when we go towards goal but add a neighbor in the graph which is in the direction away from the goal. Therefore being on the longer path that it can possibly be if we would move on the most optimized path from the start towards this node in the direction of the goal.

Related

Maximum coin that can be collected in a grid

There is a grid G .
G(i,j)=-1 implies path blocked.
Otherwise G(i,j) = number of coins at (i,j).
Always start from point start point (0,0) goto end point (n,n) then come back to start point (0,0). Once coins at any location is collected it can't be collected again. However you can still have a path through that location.
What is the maximum number of coins that can be collected if you can travel in top, bottom, left, right directions?
I know how to get the maximum number of coins when you only have to go from (0,0) to (n,n) using recursion but not able to track which coins I have already collected while traveling from start to end.
Please suggest a good approach or refer to some source.
Task specification doesn't require the path to be shortest nor to avoid repeated visit of field. So if the point (n,n) is reachable from point (0,0), the result is effectively sum of all points reachable from them (which can be computed using Flood-fill algorithm); zero otherwise.
Frankly, I suppose you omitted some important constraint in the task specification. Until fixed, it's hard to guess which algorithm is the best solution (for example Dijkstra when looking for shortest path, dynamic programming when moves restricted to right and down etc.).
As your question does not have any restriction on how many steps can be performed. You only need to add up the value of all grids as you can potentially pick up all coins by walking over all grids.

How to detect cycles in directed graph in iterative DFS?

My current project features a set of nodes with inputs and outputs. Each node can take its input values and generate some output values. Those outputs can be used as inputs for other nodes. To minimize the amount of computation needed, node dependencies are checked on application start. When updating the nodes, they are updated in the reverse order they depend on each other.
That said, the nodes resemble a directed graph. I am using iterative DFS (no recursion to avoid stack overflows in huge graphs) to work out the dependencies and create an order for updating the nodes.
I further want to avoid cycles in a graph because cyclic dependencies will break the updater algorithm and causing a forever running loop.
There are recursive approaches to finding cycles with DFS by tracking nodes on the recursion stack, but is there a way to do it iteratively? I could then embed the cycle search in the main dependency resolver to speed things up.
There are plenty of cycle-detection algorithms available on line. The simplest ones are augmented versions of Dijkstra's algorithm. You maintain a list of visited nodes and costs to get there. In your design, replace the "cost" with the path to get there.
In each iteration of the algorithm, you grab the next node on the "active" list and look at each node that follows it in the graph (i.e. each of its dependencies). If that node is on the "visited" list, then you have a cycle. The path you maintained in getting here shows the loop path.
Is that enough to get you moving?
Try a timestamp. Add a meta timestamp and set it to zero on your nodes.
Previous Answer (non applicable):
When you start a search, increment or grab a time() stamp. Then, when
you visit a node, compare it to the current search timestamp. If it
is the same, then you have found a cycle. If not then set the stamp
to current.
Next search, increment again.
Ok, this is how I'm assuming you are performing your DFS search:
Add Root node to a stack (for searching) and a vector (for updating).
Pop the stack and add children of the current node to the stack and to the vector
loop until stack is empty
reverse iterate the vector and update values (by referencing child nodes)
The problem: Cycles will cause the same set of nodes to be added to the stack.
Solution 1: Use a boolean/timestamp to see if the node has been visited before adding to the DFS search stack. This will eliminate cycles, but will not resolve them. You can spit out an error and quit.
Solution 2: Use a timestamp, but increment it each time you pop the stack. If a child node has a timestamp set, and it is less than the current stamp, you have found a cycle. Here's the kicker. When iterating over the values backwards, you can check the timestamps of the child nodes to see if they are greater than the current node. If less, then you've found a cycle, but you can use a default value.
In fact, I think Solution 1 can be resolved the same way by never following more than one child when updating the value and setting all nodes to a default value on start. Solution 2 will give you a warning while evaluating the graph whereas solution 1 only gives you a warning when creating the vector.

A* Pathfinding - Updating Parent

I've read A* Pathfinding for Beginners and looked at several source code implementations in C++ and other languages. I understand most of what is happening, with the exception of one possible issue I think I have found, and none of the tutorials/implementations I've found cover this.
When you get to this part:
If an adjacent square is already on the open list [...], if the G cost
of the new path is lower, change the parent of the adjacent square to
the selected square. Finally, recalculate both the F and G scores of
that square.
Changing the G score of a square should also change the G score of every child, right? That is, every square that already has this square as the parent, should get a new G score now also. So, shouldn't you find every child (and child of child) square in the open list and recalculate the G values? That will also change the F value, so if using a sorted list/queue, that also means a bunch of resorting.
Is this just not an actual problem, not worth the extra CPU for the extra calculations, and that is why the implementations I've seen just ignore this issue (do not update children)?
It depends on your heuristic.
For correctness, the basic A* algorithm requires that you have an admissible heuristic, that is, one that never overestimates the minimum cost of moving from a node to the goal. However, a search using an admissible heuristic may not always find the shortest path to intermediate nodes along the way. If that's the case with your heuristic, you might later find a shorter path to a node you've already visited and need to expand that node's children again. In this situation, you shouldn't use a closed list, as you need to be able to revisit nodes multiple times if you keep finding shorter routes.
However, if you use a consistent heuristic (meaning that the estimated cost of a node is never more than the estimated cost to one of its neighbors, plus the cost of moving from the node to that neighbor), you will only ever visit a node by the shortest path to it. That means that you can use a closed list and never revisit a node once you've expanded its children.
All consistent heuristics are admissible, but not all admissible heuristics are consistent. Most admissible heuristics are also consistent though, so you'll often seen descriptions and example code for A* that assumes the heuristic is consistent, even when it doesn't say so explicitly (or only mentions admissibility).
On the page you link to, the algorithm uses a closed list, so it requires a consistent heuristic to be guaranteed of finding an optimal path. However, the heuristic it uses (Manhattan distance) is not consistent (or admissible for that matter) given the way it handles diagonal moves. So while it might find the shortest path, it could also find some other path and incorrectly believe it is the shortest one. A more appropriate heuristic (Euclidean distance, for example) would be both admissible and consistent, and you'd be sure of not running into trouble.
#eselk : As the square, whose parent and G-score are to be updated, is still in OL, so this means that it has Not been expanded yet, and therefore there would be no child of the square in the OL. So updating G-scores of children and then their further children does not arise. Please let me know if this is not clear.

Load balancing using the heat equation

I didn't think this was mathsy enough for mathoverflow, so I thought I would try here. Its for a programming problem though, so its sorta on topic.
I have a graph (as in the maths one), where I have a number of vertices, A,B,C.. each with a load "value", these are connected by some arbitrary topology (there is at least a spanning tree). The objective of the problem is to transfer load between each of the vertices, preferably using the minimal flow possible.
I wish to know the transfer values along each edge.
The solution to the problem I was thinking of was to treat it as a heat transfer problem, and iteratively transfer load, or solve the heat equation in some way, counting the amount of load dissipated along each edge. The amount of heat transfer until the network reaches steady state should thus yield the result.
Whilst I think this will work, it seems like the stupid solution. I am wondering if there is a reference or sample problem that someone can point me to -- I am unsure what keywords to search for.
I could not see how to couple the problem as either a simplex problem or as a network flow problem -- each edge has unlimited capacity, and so does each node. There are two simultaneous minimisation problems to solve, so simplex seems to not apply??
If you have a spanning tree, then there is an O(n) solution.
Calculate the average value. (In the end, every node will have this value.) Then iterate over the leaves: calculate the change in value necessary for that leaf (average - value), add that change to the leaf, assign it (as flow) to the edge that connects that leaf to the tree, and subtract it from the other node. Remove that leaf and edge from the tree (not from the graph of course); the other node to may become a new leaf, if it has only one remaining edge in the tree.
When you reach the last node, if you've done the arithmetic right it will end up with the average value, just like all the other nodes.

Backtracking maze algorithm doesn't seem to be going all the way back

Basically, I have this: First, a bunch of code generates a maze that is non-traversable. It randomly sets walls in certain spaces of a 2D array based on a few parameters. Then I have a backtracking algorithm go through it to knock out walls until the whole thing is traversable. The thing is, the program doesn't seem to be going all the way back in the stack.
It's pretty standard backtracking code. The algorithm starts at a random location, then proceeds thus in pseudocode:
move(x, y){
if you can go up and haven't been there already:
move (x, y - 1)
if you can go right and haven't been there already:
move (x + 1, y)
...
}
And so on for the other directions. Every time you move, two separate 2D arrays of booleans (one temporary, one permanent) are set at the coordinates to show that you've been in a certain element. Once it can't go any further, it checks the permanent 2D array to see if it has been everywhere. If not, it randomly picks a wall that borders between a visited and non visited space (according to the temporary array) and removes it. This whole thing is invoked in a while loop, so once it's traversed a chunk of the maze, the temporary 2D array is reset while the other is kept and it traverses again at another random location until the permanent 2D array shows that the whole maze has been traversed. The check in the move method is compared against the temporary 2D array, not the permanent one.
This almost works, but I kept finding a few unreachable areas in the final generated maze. Otherwise it's doing a wonderful job of generating a maze just the way I want it to. The thing is, I'm finding that the reason for this is that it's not going all the way back in the stack.
If I change it to check the temporary 2D array for completion instead of the permanent one (thus making it do one full traversal in a single run to mark it complete instead of doing a full run across multiple iterations), it will go on and on and on. I have to set a counter to break it. The result is a "maze" with far, far too many walls removed. Checking the route the algorithm takes, I find that it has not been properly backtracking and has not gone back in the stack nearly far enough in the stack and often just gets stuck on a single element for dozens of recursions before declaring itself finished for no reason at all and removing a wall that had zero need to be removed.
I've tried running the earlier one twice, but it keeps knocking out walls that don't need to be knocked out and making the maze too sparse. I have no idea why the heck this is happening.
I've had a similar problem when trying to make a method for creating a labyrinth.
The important thing when making mazes is to try to NOT create isolated "islands" of connected rooms in the maze. Here's my solution in pseudocode
Room r=randomRoom();
while(r!=null){
recursivelyDigNewDoors(r);
r=null;
for(i=0;i<rooms.count;i++){
if(rooms[i].doors.length == 0 && rooms[i].hasNeighborWithDoor() ){
//if there is a room with no doors and that has a neighbor with doors
Create a door between the doorless room and the one
connected to the rest of your maze
r=rooms[i];
}
}
}
where recursivelyDigNewDoors is a lot like your move() function
In reality, you might like to
describe a "door" as a lack of a wall
and a room without doors as a room with four walls
But the general principle is:
Start your recursive algorithm somewhere
When the algorithm stops: find a place where there's 1 unvisited square and 1 visited.
Link those two together and continue from the previously unvisited square
When no two squares fulfill (2) you're done and all squares are connected

Resources