Graph Edges Deletion - graph

We are given an undirected graph without loops. We have to check whether it is possible to delete edges so that the degree of each vertex becomes exactly one.
How should I approach this question? Should I use an adjacency matrix or an adjacency list?
Please suggest an efficient way.

If the graph needs to be fully connected, it's possible if and only if
there are exactly two vertices and they have an edge between them.
If it does not need to be fully connected you must search for a
similar constellation. That is, you need to see if it is possible to
partition the graph into pairs of vertices with one edge in each
pair. That is what we are after. What do we know?
The graph has no loops, so each connected component is a tree: the graph is a forest (not necessarily binary, though). Maybe we can solve this greedily by starting at
the bottom of the tree? We don't know what the bottom is; how do we
solve that? We can decide this ourselves, so I decide to pick all the
leaves as the "bottom".
I now propose the following algorithm (whose efficiency you may
evaluate yourself; it isn't necessarily the best algorithm):
1. For each leaf L:
1.1. Let P be the parent of L, and Q the parent of P (Q may be NULL).
1.2. Is the degree of P <= 2? That is, does it have only one edge connecting it to L, and possibly one connecting it to Q?
1.2.1. If no: pick another leaf L and go to 1.1.
1.2.2. If yes: L and P can form a pair by removing the edge between P and Q, so remove L and P from your memory (in some way; let's come back to data structures later).
2. Do we have any vertices left in the graph?
2.1. No: the answer is "Yes, we can partition the graph by removing edges".
2.2. Only one: the answer is "No, we cannot partition the graph".
2.3. More: did we remove any nodes?
2.3.1. If yes: go back to 1 and check all the current leaves.
2.3.2. If no: the answer is "No, we cannot partition the graph".
So what data structure do you use for this? I think it's easiest to
use a priority queue, perhaps based on a min-heap, with degree as
priority. You can use a linked list as a temporary storage for leaves
you have already visited but not removed yet. Don't forget to decrease
the priority of Q in step 1.2.2.
But you can do even better! Add all leaves (vertices with degree 1) to
a linked list. Loop through that list in step 1. In step 1.2.2, check
the degree of Q after removing L and P. If the degree is 1, add it to
the list of leaves!
I believe you can implement the entire algorithm recursively as well, but I'll leave that
for you to think about.
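To make the idea concrete, here is a minimal Python sketch of the leaf-pairing step, assuming the graph is stored as a dictionary mapping each vertex to a set of neighbours; the name can_pair_all and that representation are my own choices, not anything from the question.

from collections import deque

def can_pair_all(adj):
    # adj: dict mapping vertex -> set of neighbours (the graph is assumed acyclic)
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    alive = set(adj)
    leaves = deque(v for v in adj if len(adj[v]) == 1)
    while leaves:
        leaf = leaves.popleft()
        if leaf not in alive or len(adj[leaf]) != 1:
            continue                                  # already removed, or it became isolated
        parent = next(iter(adj[leaf]))
        alive.discard(leaf)                           # the leaf's only possible partner is its parent:
        alive.discard(parent)                         # pair them and remove both from the graph
        for nbr in adj[parent] - {leaf}:
            adj[nbr].discard(parent)
            if len(adj[nbr]) == 1:
                leaves.append(nbr)                    # nbr just became a leaf, visit it later
        adj[leaf].clear()
        adj[parent].clear()
    return not alive                                  # True iff every vertex got paired off

For example, can_pair_all({1: {2}, 2: {1}}) returns True, while can_pair_all({1: {2}, 2: {1, 3}, 3: {2}}) returns False, since the middle vertex cannot be in two pairs.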


Multi-goal path-finding [duplicate]

I have an undirected graph with about 100 nodes and about 200 edges. One node is labelled 'start', one is 'end', and there are about a dozen labelled 'mustpass'.
I need to find the shortest path through this graph that starts at 'start', ends at 'end', and passes through all of the 'mustpass' nodes (in any order).
( http://3e.org/local/maize-graph.png / http://3e.org/local/maize-graph.dot.txt is the graph in question - it represents a corn maze in Lancaster, PA)
Everyone else comparing this to the Travelling Salesman Problem probably hasn't read your question carefully. In TSP, the objective is to find the shortest cycle that visits all the vertices (a Hamiltonian cycle) -- it corresponds to having every node labelled 'mustpass'.
In your case, given that you have only about a dozen labelled 'mustpass', and given that 12! is rather small (479001600), you can simply try all permutations of only the 'mustpass' nodes, and look at the shortest path from 'start' to 'end' that visits the 'mustpass' nodes in that order -- it will simply be the concatenation of the shortest paths between every two consecutive nodes in that list.
In other words, first find the shortest distance between each pair of vertices (you can use Dijkstra's algorithm or others, but with those small numbers (100 nodes), even the simplest-to-code Floyd-Warshall algorithm will run in time). Then, once you have this in a table, try all permutations of your 'mustpass' nodes, and the rest.
Something like this:
//Precomputation: find all-pairs shortest paths, e.g. using Floyd-Warshall
n = number of nodes
for i=1 to n: for j=1 to n: d[i][j] = (0 if i==j, weight of edge (i,j) if it exists, else INF)
for k=1 to n:
    for i=1 to n:
        for j=1 to n:
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])
//That *really* gives the shortest distance between every pair of nodes! :-)
//Now try all permutations
shortest = INF
for each permutation a[1],a[2],...a[k] of the 'mustpass' nodes:
    shortest = min(shortest, d['start'][a[1]] + d[a[1]][a[2]] + ... + d[a[k]]['end'])
print shortest
(Of course that's not real code, and if you want the actual path you'll have to keep track of which permutation gives the shortest distance, and also what the all-pairs shortest paths are, but you get the idea.)
It will run in at most a few seconds in any reasonable language :)
[If you have n nodes and k 'mustpass' nodes, its running time is O(n³) for the Floyd-Warshall part and O(k!·n) for the all-permutations part, and 100³ + (12!)(100) is practically peanuts unless you have some really restrictive constraints.]
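For reference, a runnable Python version of the sketch above, assuming the graph comes in as a dict of undirected edge weights; the name shortest_tour and that input format are illustrative assumptions, not part of the original answer.

from itertools import permutations

INF = float('inf')

def shortest_tour(nodes, edges, start, end, must_pass):
    # edges: dict {(u, v): weight} for an undirected graph
    d = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    for (u, v), w in edges.items():
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    # Floyd-Warshall: all-pairs shortest distances
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # brute force: try every visiting order of the must-pass nodes
    best = INF
    for order in permutations(must_pass):
        route = [start, *order, end]
        best = min(best, sum(d[a][b] for a, b in zip(route, route[1:])))
    return best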
Run Dijkstra's algorithm to find the shortest paths between all of the critical nodes (start, end, and must-pass); then a depth-first traversal should tell you the shortest path through the resulting subgraph that touches all of the nodes start ... mustpasses ... end.
This is two problems... Steven Lowe pointed this out, but didn't give enough respect to the second half of the problem.
You should first discover the shortest paths between all of your critical nodes (start, end, mustpass). Once these paths are discovered, you can construct a simplified graph, where each edge in the new graph is a path from one critical node to another in the original graph. There are many pathfinding algorithms that you can use to find the shortest path here.
Once you have this new graph, though, you have exactly the Traveling Salesperson problem (well, almost... No need to return to your starting point). Any of the posts concerning this, mentioned above, will apply.
Actually, the problem you posted is similar to the traveling salesman, but I think closer to a simple pathfinding problem. Rather than needing to visit each and every node, you simply need to visit a particular set of nodes in the shortest time (distance) possible.
The reason for this is that, unlike the traveling salesman problem, a corn maze will not allow you to travel directly from any one point to any other point on the map without needing to pass through other nodes to get there.
I would actually recommend A* pathfinding as a technique to consider. You set this up by deciding which nodes have access to which other nodes directly, and what the "cost" of each hop from a particular node is. In this case, it looks like each "hop" could be of equal cost, since your nodes seem relatively closely spaced. A* can use this information to find the lowest-cost path between any two points. Since you need to get from point A to point B and visit about 12 points in between, even a brute-force approach using pathfinding wouldn't hurt at all.
Just an alternative to consider. It does look remarkably like the travelling salesman problem, and those are good papers to read up on, but look closer and you'll see that it's only overcomplicating things. ^_^ This is coming from the mind of a video game programmer who's dealt with these kinds of things before.
This is not a TSP problem and it is not NP-hard, because the original question does not require that must-pass nodes be visited only once. This makes the answer much, much simpler: just brute-force it after compiling a list of shortest paths between all must-pass nodes via Dijkstra's algorithm. There may be a better way to go, but a simple one would be to simply work through the permutations. Imagine a list of nodes [start,a,b,c,end]. Sum the simple distances [start->a->b->c->end]; this is your new target distance to beat. Now try [start->a->c->b->end], and if that's better, set that as the target (and remember that it came from that pattern of nodes). Work backwards over the permutations:
[start->a->b->c->end]
[start->a->c->b->end]
[start->b->a->c->end]
[start->b->c->a->end]
[start->c->a->b->end]
[start->c->b->a->end]
One of those will be shortest.
(where are the 'visited multiple times' nodes, if any? They're just hidden in the shortest-path initialization step. The shortest path between a and b may contain c or even the end point. You don't need to care)
Andrew Top has the right idea:
1) Dijkstra's algorithm
2) Some TSP heuristic.
I recommend the Lin-Kernighan heuristic: it's one of the best known for any NP-complete problem. The only other thing to remember is that after you expand the graph again after step 2, you may have loops in your expanded path, so you should go around short-circuiting those (look at the degree of vertices along your path).
I'm actually not sure how good this solution will be relative to the optimum. There are probably some pathological cases to do with short circuiting. After all, this problem looks a LOT like Steiner Tree: http://en.wikipedia.org/wiki/Steiner_tree and you definitely can't approximate Steiner Tree by just contracting your graph and running Kruskal's for example.
Considering that the number of nodes and edges is relatively small, you can probably calculate every possible path and take the shortest one.
Generally this is known as the travelling salesman problem; it is NP-hard, and no known algorithm solves it in polynomial time.
http://en.wikipedia.org/wiki/Traveling_salesman_problem
The question talks about must-pass nodes in ANY order. I was searching for a solution for a defined order of must-pass nodes. I found my answer, but since no question on Stack Overflow was similar, I'm posting it here so that as many people as possible can benefit from it.
If the order of the must-pass nodes is defined, you can run Dijkstra's algorithm multiple times. For instance, let's assume you have to start from s, pass through k1, k2 and k3 (in that order), and stop at e. Then what you can do is run Dijkstra's algorithm between each consecutive pair of nodes. The cost and path are given by:
dijkstras(s, k1) + dijkstras(k1, k2) + dijkstras(k2, k3) + dijkstras(k3, e)
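A small Python sketch of this leg-by-leg idea, assuming the graph is given as an adjacency list of weighted edges; dijkstra and fixed_order_cost are hypothetical helper names, not from the answer above.

import heapq

def dijkstra(adj, source, target):
    # adj: dict {u: [(v, w), ...]}; returns the shortest distance from source to target
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        du, u = heapq.heappop(pq)
        if u == target:
            return du
        if du > dist.get(u, float('inf')):
            continue                               # stale queue entry
        for v, w in adj.get(u, []):
            nd = du + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float('inf')

def fixed_order_cost(adj, stops):
    # stops: e.g. [s, k1, k2, k3, e]; sum the shortest distance of each consecutive leg
    return sum(dijkstra(adj, a, b) for a, b in zip(stops, stops[1:]))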
How about using brute force on the dozen 'must visit' nodes. You can cover all the possible combinations of 12 nodes easily enough, and this leaves you with an optimal circuit you can follow to cover them.
Now your problem is simplified to one of finding optimal routes from the start node to the circuit, which you then follow around until you've covered them, and then find the route from that to the end.
The final path is composed of:
start -> path to circuit* -> circuit of must visit nodes -> path to end* -> end
You find the paths I marked with * like this: do an A* search from the start node to every point on the circuit; then, for each of these, do an A* search from the next and previous node on the circuit to the end (because you can follow the circuit round in either direction).
What you end up with is a lot of search paths, and you can choose the one with the lowest cost.
There's lots of room for optimization by caching the searches, but I think this will generate good solutions.
It doesn't go anywhere near looking for an optimal solution though, because that could involve leaving the must visit circuit within the search.
One thing that is not mentioned anywhere is whether it is OK for the same vertex to be visited more than once in the path. Most of the answers here assume that it's OK to visit the same edge multiple times, but my take, given the question (a path should not visit the same vertex more than once!), is that it is not OK to visit the same vertex twice.
So a brute force approach would still apply, but you'd have to remove vertices already used when you attempt to calculate each subset of the path.

Reachable vertices from each other

Given a directed graph, I need to find all vertices v such that, if u is reachable from v, then v is also reachable from u. I know that such vertices can be found using BFS or DFS, but it seems inefficient. I was wondering whether there is a better solution to this problem. Any help would be appreciated.
Fundamentally, you're not going to do any better than some kind of search (as you alluded to). I wouldn't worry too much about efficiency: these algorithms are generally linear in the number of nodes + edges.
The problem is a bit underspecified, so I'll make some assumptions about your data structure:
You know vertex u (because you didn't ask to find it)
You can iterate both the inbound and outbound edges of a node (efficiently)
You can iterate all nodes in the graph
You can (efficiently) associate a couple bits of data along with each node
In this case, use a convenient search starting from vertex u (depth/breadth, doesn't matter) twice: once following the outbound edges (marking nodes as "reachable from u") and once following the inbound edges (marking nodes as "reaching u"). Finally, iterate through all nodes and compare the two bits according to your purpose.
Note: as worded, your result set includes all nodes that do not reach vertex u. If you intended the conjunction instead of the implication, then you can save a little time by incorporating the test in the second search, rather than scanning all nodes in the graph. This also relieves assumption 3.
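A short Python sketch of the double search described above, assuming you have separate out_edges and in_edges adjacency lists and want the implication exactly as stated in the question; the function names are mine.

from collections import deque

def bfs(adj, start):
    # plain BFS: returns the set of vertices reachable from start in adj
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

def satisfying_vertices(out_edges, in_edges, u, vertices):
    reachable_from_u = bfs(out_edges, u)   # forward search: marks "reachable from u"
    reaching_u = bfs(in_edges, u)          # backward search: marks "reaching u"
    # keep v when "u reachable from v" implies "v reachable from u"
    return {v for v in vertices if v not in reaching_u or v in reachable_from_u}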

Traversal of directed acyclic weighted graph with constraints

I have a directed acyclic weighted graph which I want to traverse.
The constraints for a valid solution route are:
The sum of the weights of all edges traversed in the route must be the highest possible in the graph, subject to the second constraint.
Exactly N vertices must have been visited in the chosen route (including the start and end vertex).
Typically the graph will have a large number of vertices and edges, so trying all possibilities is not an option; quite an efficient algorithm is required.
Looking for some pointers or a suitable algorithm for this problem. I know the first condition is easily fulfilled using Dijkstra's algorithm, but I am not sure how to incorporate the second condition, or even where to begin to look.
Please let me know if any additional information is needed.
I'm not sure if you are interested in any path of length N in the graph or just a path between two specific vertices; I suspect the latter, but you did not mention that constraint in your question.
If the former, the solution should be a trivial Dijkstra-like algorithm where you sort all edges by their potential path value, which starts at the edge weight and gets adjusted by already built adjacent paths. In each iteration, take the node with the best potential path value and add it to an adjacent path. Stop when you get a path of length N (or a longer one that you cut off at the sides). There are some other technical details, especially with regard to creating long paths, but I won't go into them as I suspect this is not what you are interested in. :-)
If you have a fixed source and sink, I think there is no deep magic involved - just run a basic Dijkstra where a path is associated with each vertex added to the queue, but do not insert vertices with path length >= N into the queue, and do not insert the sink into the queue unless its path length is N.
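One concrete way to track the exact-N constraint, close in spirit to keeping a path with each queued vertex, is dynamic programming over a topological order of the DAG, with one label per (vertex, number of vertices used) pair. This is a sketch under my own assumptions (topologically ordered vertex list, adjacency dict, hypothetical name best_route), not the answer's exact algorithm.

def best_route(order, edges, source, sink, n):
    # order: vertices in topological order; edges: dict {u: [(v, w), ...]}
    # best[v][k] = maximum total weight of a route from source to v visiting exactly k vertices
    NEG = float('-inf')
    best = {v: [NEG] * (n + 1) for v in order}
    best[source][1] = 0
    for u in order:
        for k in range(1, n):
            if best[u][k] == NEG:
                continue
            for v, w in edges.get(u, []):
                if best[u][k] + w > best[v][k + 1]:
                    best[v][k + 1] = best[u][k] + w
    return best[sink][n]          # -inf means no route with exactly n vertices exists

This runs in O((V + E) * N) time, which stays manageable even for large graphs as long as N is moderate.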

How do I learn Tarjan's algorithm?

I have been trying to learn Tarjan's algorithm from Wikipedia for 3 hours now, but I just can't make head or tail of it. :(
http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm#cite_note-1
Why is it a subtree of the DFS tree? (actually DFS produces a forest? o_O)
And why does v.lowlink=v.index imply that v is a root?
Can someone please explain this to me / give the intuition or motivation behind this algorithm?
The idea is: When traversing the tree, every time you've searched through a branch and are backtracking, you check whether you've encountered an edge to an 'upper' node in the tree.
If you didn't (if (v.lowlink = v.index)), then you've just completed an SCC - it consists of the current node and all nodes on the stack. That's exactly a subtree of the DFS tree, except for the nodes in SCCs that were already completed.
If you did, you propagate this information to 'upper' nodes (v.lowlink := min(v.lowlink, w.lowlink)), because combined with the path in DFS tree the edge creates an 'upward' path.
DFS produces a forest, but you always consider one tree a time. An SCC is always included in one DFS tree, otherwise (being an SCC) there would be a path in both directions between both (all) trees in question - that's a contradiction.
Just adding to pjotr's answer: v.lowlink is basically the index of the upmost node that you have found in the tree. Keep in mind that upmost in this context means minimum, as you keep increasing indices as you walk down. Now, after processing all your successors, there are basically three cases:
v.lowlink < v.index: This indicates that you have found a back edge. Note that we haven't
just found any back edge, but one that points to a node that is "above" the current one. That's what v.lowlink < v.index implies.
v.lowlink = v.index: What we know in this case is that there is no back edge referring to anything above the current node. There might be a back edge to this node (which means that one of your successor nodes w has a lowlink such that w.lowlink = v.lowlink = v.index). It could also be that there was a back edge referring to something below the current node, which means that there was a strongly-connected component below the current node that has been printed out already. The current node, however, is definitely the root of a strongly-connected component as well.
v.lowlink > v.index: That's actually not possible. I'm just listing it for the sake of completeness. ;)
Hope it helps!
Some intuition about Tarjan's algorithm:
During DFS, when we encounter a back edge from vertex v, we update its lowest reachable ancestor, i.e. we update the value of low[v].
Now, when all the outgoing edges of a vertex have been processed, i.e. we are about to exit the DFS call for vertex v, we check whether low[v] == v (explanation below). If not, this means v is not the root of its SCC, and we pass the benefit on to the parent of v: the lowest reachable ancestor of parent[v] is now changed to low[v].
This makes sense because, although there is no direct back edge from parent[v] to the ancestor of v, there is a path (the edge towards v plus the back edge of v) via which parent[v] can still reach the ancestor of v.
Thus we have also updated low[parent[v]] here.
Therefore we keep updating along this chain, and low[v] keeps being updated for every v on it until, via backtracking, we reach the ancestor. For this ancestor, low[v] equals v, and thus it acts as the root of the SCC.
Hope this helps
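If it helps to see everything in one place, here is a compact Python version of the algorithm that the answers above describe (recursive, so it assumes the graph is small enough not to exhaust Python's recursion limit); the dictionary-based representation is my own choice.

def tarjan_scc(graph):
    # graph: dict {v: [successors]}; returns a list of strongly connected components
    index, lowlink = {}, {}          # v.index and v.lowlink from the Wikipedia pseudocode
    stack, on_stack = [], set()
    sccs = []
    counter = [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)                         # tree edge: recurse
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])   # back/cross edge into the current SCC
        if lowlink[v] == index[v]:                       # v is the root of an SCC
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            sccs.append(component)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs

For example, tarjan_scc({1: [2], 2: [3], 3: [1], 4: [3]}) returns [[3, 2, 1], [4]].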
Path for mastering bridge and articulation-point algorithms:
Watch this video to get an awesome feeling that you have understood the algorithm very well.
Practice the algorithm from here and here.
Practice problems to master it:
A. Cutting Figure
EC_P - Critical Edges
SUBMERGE - Submerging Islands
POLQUERY - Police Query

What is a "graph carving"?

I've faved a question here, and the most promising answer to date brings up "graph carvings". Problem is, I have no clue what one is (neither does the OP, apparently), and it sounds very promising and interesting for several uses. My Googlefu failed me on this topic, as I found no useful/free resource talking about them.
Can someone please tell me what a 'graph carving' is, how I can make one for a graph, and how I can determine what makes a certain carving better suited for a task than another?
Please don't go too mathematical on me (or be ready to answer more questions): I understand what a graph is, what a node is and what a vertex is, and I manage with big-O notation, but I have no real maths background.
I think the answer given in the linked question is a little loose with terminology. I think it is describing a tree carving of a graph G. This is still not particularly google-friendly, I admit, but perhaps it will get you going on your way. The main application of this structure appears to be in one particular DFS algorithm, described in these two papers.
A possibly more clear description of the same algorithm may appear in this book.
I'm not sure stepping through this algorithm would be particularly helpful. It is a reasonably complex algorithm and the explanation would probably just parrot those given in the papers I linked. I can't claim to understand it very well myself. Perhaps the most fruitful approach would be to look at the common elements of those three links, and post specific questions about parts you don't understand.
Q1:
what is a 'graph carving'
There are two types of graph carving: Tree-Carving and Carving.
A tree-carving of a graph is a partition of the vertex set V into subsets V1,V2,...,Vk with the following properties. Each subset constitutes a node of a tree T. For every vertex v in Vj, all the neighbors of v in G belong either to Vj itself, or to Vi where Vi is adjacent to Vj in the tree T.
A carving of a graph is a partitioning of the vertex set V into a collection of subsets V1,V2,...Vk with the following properties. Each subset constitutes a node of a rooted tree T. Each non-leaf node Vj of T has a special vertex denoted by g(Vj) that belongs to p(Vj). For every vertex v in Vi, all the neighbors of v that are in ancestor sets of Vi belong to either
Vi or
Vj, where Vj is the parent of Vi in the tree T, or
Vl, where Vl is the grandparent in the tree T. In this case, however, the neighbor of v can only be g(p(Vi)).
These definitions are taken from chapter 6 of the book "Approximation Algorithms for NP-Hard Problems" and from paper1. (paper1 is picked from Gain's answer; thanks, Gain.)
According to my understanding, a tree-carving or carving is a kind of representation (or simplification) of an original graph G, such that the resulting new graph still preserves the 'connection properties' of G but has a much smaller size (fewer vertices and nodes). Both methods try to discard 'local', 'similar' information while keeping the 'structural', 'vital' information, by merging some 'close' vertices into one vertex and deleting some edges.
And it seems that tree-carving is a little bit simpler and easier to understand, since in a carving, edges are allowed to go to a single vertex in the grandparent node as well, which preserves more information.
Q2:
how I can make one for a graph
I only know how to get a tree-carving.
You can refer to the algorithm in paper1.
It's a depth-first-search-based algorithm.
Do a DFS; before returning from a call, check whether the current edge is a 'bridge' edge or not. If it is, you need to remove this 'bridge' and add some 'back edge'.
You then get a DFS partition which yields a tree-carving of G.
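The bridge check mentioned in that step is the standard low-link test from a DFS. Here is a minimal Python sketch of just that test (not the full tree-carving construction from paper1), assuming a simple undirected graph given as an adjacency dict:

def find_bridges(adj):
    # adj: dict {v: [neighbours]} for a simple undirected graph
    # returns the edges whose removal would disconnect their component (the bridges)
    disc, low = {}, {}
    bridges = []
    timer = [0]

    def dfs(v, parent):
        disc[v] = low[v] = timer[0]
        timer[0] += 1
        for w in adj.get(v, []):
            if w == parent:
                continue
            if w in disc:
                low[v] = min(low[v], disc[w])   # back edge: w is an ancestor of v
            else:
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if low[w] > disc[v]:            # nothing in w's subtree reaches v or above
                    bridges.append((v, w))

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return bridges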
Q3:
how I can determine what makes a certain carving better suited for a task than another?
Sorry, I don't know; I am also new to graph theory.
If you have more questions:
What's the g function in g(Vj)?
It is a special vertex called the gray node; see paper1.
What's the p function in p(Vj)?
I am not sure; maybe p represents 'parent'. See paper1.
What's the back edge of node t?
Some edge (u, v) such that u is a descendant of t and v is an ancestor of t; see paper1.
What's a bridge?
See bridge wiki.

Resources