Find the minimum number of nodes that need to be removed to make the graph disconnected (i.e. there is no longer a path from any node to all the other nodes). The number of nodes can be up to 10^5.
First of all, to find a node that is not needed to keep the graph connected, you need to find a cycle. Once you find a cycle, you know that none of the edges in that cycle is needed on its own. All the remaining edges in the graph are needed to hold the graph together, so you do a simple traversal of the graph and count the vertices they connect to find the minimum number of nodes that need to be removed to make the graph disconnected. I'm not sure if this is the most efficient way, just the most efficient one I can think of. If V is the number of vertices and E is the number of edges, the time complexity is O(V + E) + O(V), since the initial cycle finding is O(V + E) and the counting of the nodes is O(V). Since E >= V - 1 in a connected graph, this simplifies to O(E), though with large constant factors.
Consider the graph undirected.
totalDegree = sum of the degrees of all nodes.
while (totalDegree > 0) {
    Remove the node with the highest degree and its edges.
    Update the degree count of each node.
    totalDegree = sum of the degrees of all remaining nodes.
}
The remaining nodes are all disconnected.
I am still trying to think of a case where this will not give the minimum number of nodes.
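One case that seems to break it, as a minimal sketch: the loop above written out directly in Python and run on a small tree. The function name `greedy_remove` and the example graph are assumptions made for illustration, not part of the original post. The root (vertex 0) has the highest degree, so it is removed first, but the three leaf edges then still need one removal each, for 4 in total, while removing vertices 1, 2 and 3 alone would already isolate everything.

```python
def greedy_remove(edges):
    """Repeatedly remove the vertex with the highest degree until no edge is left.

    Returns the removed vertices in removal order.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    removed = []
    while any(adj.values()):                      # some vertex still has a neighbor
        v = max(adj, key=lambda x: len(adj[x]))   # highest remaining degree
        for u in adj.pop(v):                      # drop v together with its edges
            adj[u].discard(v)
        removed.append(v)
    return removed


# Hypothetical tree: root 0, children 1..3, one leaf under each child.
tree = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]
removed = greedy_remove(tree)
print(len(removed))  # 4, although removing 1, 2 and 3 would be enough
```

This version rescans all vertices in every round, so it is only meant for small inputs.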
I'm having trouble conceptualizing my problem. Essentially, I have m nodes and want to generate no more than n connections for each node. I also want to ensure that there is always a path from each node to any other node. I don't care about cycles.
I don't have the proper vocabulary to search for this problem, though I'm sure it has been studied somewhere.
Does anyone know of a place where this problem is explained, or know the answer themselves?
The simplest way is to construct a Spanning Tree to ensure that all nodes are connected, then add edges that don't violate the maximum number of edges per node until you have the target number of them. In pseudocode:
// nodes[] is a list of all m nodes in our graph
connected_nodes = [nodes[0]]
// add nodes one by one until all are in the spanning tree
for next_node = 1 to (m-1)
    repeat
        select node connected_nodes[k] // randomly, nearest, smallest degree, whatever
    until degree(connected_nodes[k]) < n // make sure the node to connect to does not violate max degree
    add edge between nodes[next_node] and connected_nodes[k]
    add nodes[next_node] to connected_nodes[]
end for
// the graph is now connected with m-1 edges; add edges up to the desired number
for e = m to desired_edge_count
    select 2 distinct, not yet adjacent nodes i,j from connected_nodes, each with degree < n
    add edge between nodes i and j
end for
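The pseudocode above can be sketched in Python roughly as follows. The function name `build_connected_graph`, the `seed` parameter and the random selection policy are assumptions chosen for illustration. The sketch also assumes n >= 2 whenever m > 2, because otherwise no connected graph exists, and it may return fewer than `desired_edge_count` edges if the degree limit leaves no legal pair.

```python
import random


def build_connected_graph(m, n, desired_edge_count, seed=0):
    """Return a set of edges (i, j), i < j, over nodes 0..m-1.

    The graph is connected and no node has more than n edges.
    """
    if m > 1 and n < 1 or m > 2 and n < 2:
        raise ValueError("no connected graph exists with this degree limit")
    rng = random.Random(seed)
    degree = [0] * m
    edges = set()

    # Phase 1: grow a spanning tree by attaching each new node to a node
    # already in the tree that still has spare degree.
    connected = [0]
    for nxt in range(1, m):
        candidates = [v for v in connected if degree[v] < n]
        k = rng.choice(candidates)   # never empty: a tree always has a node of degree <= 1
        edges.add((k, nxt))          # k < nxt because k joined the tree earlier
        degree[k] += 1
        degree[nxt] += 1
        connected.append(nxt)

    # Phase 2: add extra edges between non-adjacent nodes with spare degree.
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m) if (i, j) not in edges]
    rng.shuffle(pairs)
    for i, j in pairs:
        if len(edges) >= desired_edge_count:
            break
        if degree[i] < n and degree[j] < n:
            edges.add((i, j))
            degree[i] += 1
            degree[j] += 1
    return edges


graph_edges = build_connected_graph(m=6, n=3, desired_edge_count=8)
```

Listing all candidate pairs costs O(m^2), which is fine for small graphs. For large m, sampling pairs at random would be a cheaper alternative.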
I have an undirected connected graph and I want to isolate all of its vertices by removing vertices, not edges, while keeping the number of removed vertices to a minimum. I know that to achieve this I must remove the vertex with the highest degree each time until the graph becomes disconnected. But I need to write a Java program for it, and I do not know how to keep track of the vertex with the highest degree or which data structure to use. I am given the following inputs.
{V, E}: Number of vertices and Edges respectively.
{A - B}: Vertex pair specifying an edge
Sample input:
4 2
1-2
3-4
Sample Output: 2 (that is the minimum number of vertices that need to be removed to make the vertices isolated)
constraints:
1 <= V <= 10^5
1 <= E <= 3 * 10^5
I second the idea that the greedy algorithm is not always optimal here, even though the task is to isolate the vertices, not to disconnect the graph.
The problem here is the vertex cover problem, and it is NP-hard.
For a quick counter-example consider this graph taken from here:
A greedy algorithm would start at the root, but that would take 4 vertices instead of the optimal 3.
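The linked figure is not reproduced here, so the sketch below uses a hypothetical tree that matches the numbers quoted above: a root with three children, each child having one leaf. A brute-force search (the helper name `min_vertex_cover` is made up for this example) confirms that 3 vertices are enough, whereas starting at the root costs 4 because one endpoint of each of the three leaf edges still has to be removed afterwards. The search is exponential, which is consistent with the problem being NP-hard, and is only usable on tiny graphs.

```python
from itertools import combinations


def min_vertex_cover(vertices, edges):
    """Smallest set of vertices touching every edge, found by trying all subsets."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(a in chosen or b in chosen for a, b in edges):
                return chosen


# Hypothetical graph: root 0, children 1..3, leaves 4..6.
tree_edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]
cover = min_vertex_cover(range(7), tree_edges)
print(cover)  # {1, 2, 3}
```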
I would start off with the following data structure:
class Node
{
int ID;
int NumberOfNeighbors;
List<int> NeighborIDs;
}
You then go on to keep all the Nodes in a maximum heap (where the key is NumberOfNeighbors).
Your algorithm should go something like:
int numberOfDeletedNodes = 0;
// the heap initially holds only the nodes that have at least one neighbor
while (!heap.Empty)
{
    node = heap.PopTop();
    foreach (int ID in node.NeighborIDs)
    {
        tempNode = heap.Extract(ID);
        tempNode.NumberOfNeighbors--;
        tempNode.NeighborIDs.Remove(node.ID);
        if (tempNode.NumberOfNeighbors != 0)
            heap.Insert(tempNode);
    }
    numberOfDeletedNodes++;
}
I probably missed some edge cases or something, but the general idea is to remove the node with the most neighbors, take care of all its neighbors*, and keep going until the heap is empty.
* Important: if a neighbor has no more neighbors of its own, it doesn't go back in.
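A minimal Python sketch of the same heap idea follows. Python's `heapq` has no operation to extract an arbitrary element, so this version swaps in lazy deletion: a fresh entry is pushed whenever a degree changes, and stale entries are skipped when popped. The function name `greedy_isolate` and the 1-based vertex numbering (taken from the sample input) are assumptions. As noted in the other answers, the greedy count is not guaranteed to be the minimum.

```python
import heapq


def greedy_isolate(num_vertices, edges):
    """Count vertices removed by the highest-degree-first heuristic."""
    adj = {v: set() for v in range(1, num_vertices + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    # Max-heap via negated degrees; isolated vertices never enter the heap.
    heap = [(-len(nb), v) for v, nb in adj.items() if nb]
    heapq.heapify(heap)

    count = 0
    while heap:
        neg_deg, v = heapq.heappop(heap)
        if not adj[v] or -neg_deg != len(adj[v]):
            continue                      # stale entry: v was removed or its degree changed
        for u in adj[v]:
            adj[u].discard(v)
            if adj[u]:                    # neighbors left with no edges do not go back in
                heapq.heappush(heap, (-len(adj[u]), u))
        adj[v] = set()
        count += 1
    return count


print(greedy_isolate(4, [(1, 2), (3, 4)]))  # 2, matching the sample output
```

Each edge causes at most two pushes, so the heap holds O(V + E) entries and the whole run takes O((V + E) log(V + E)) time, which should be comfortable for V up to 10^5 and E up to 3 * 10^5.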
I'm new to Neo4j and am trying to wrap my mind around the following problem in Cypher.
I am looking for a list of nodes, sorted by ascending visitation order, after a run of n path iterations, each of which adds nodes to the list. The visitation sort depends on depth and edge cost. Because the final list represents a sequence of nodes you could also look at it as a path of paths.
Description
My graph has an initial starting node (START), is directional, of unknown size, and has weighted edges.
A node can only be added to the list once, when it is first visited (e.g. when visiting a node, we compare to the final list and add if the node isn't on the list already).
Every edge can only be traveled once.
We can only visit the next adjacent, lowest-cost node.
There are two underlying hierarchies: depth (the closer to START the better) and edge costs (the lower the cost incurred to reach the next adjacent node the better). Depth follows the alphabetical order in the example below. Cost properties are integers but are presented in the example as strings (e.g. "costs1" means edge cost = 1).
Each path starts with the starting node of least depth that is "available" (= possessing untraveled outbound edges). In the example below all edges emanating from START will have been exhausted at some point. For the next run we'll continue with A as starting node.
A path run is done when it cannot continue anymore (i.e. no available outbound edges to travel on)
We're done when the list contains y nodes, which may or may not represent a traversal.
Any ideas on how to tackle this using Cypher queries?
Example data: http://console.neo4j.org/r/o92sjh
This is what happens:
We start at START and travel along the lowest-cost edge available to arrive at A. --> A gets the #1 spot on the list and the costs1 edge in START-[:costs1]->a gets eliminated because we've just used it.
We’re on A. The lowest cost edge (costs1) circles back to START, which is a no-go, so we take this edge off the table as well and choose the next available lowest-cost edge (costs2), leading us to B. --> We output B to the list and eliminate the edge in a-[:costs2]->b.
We're now on B. The lowest cost edge (costs1) circles back to START, which is a no-go, so we eliminate that edge as well. The next lowest-cost edge (costs2) leads us to C. --> We output C to the list and eliminate the just traveled edge between B and C.
We're on C and continue from C over its lowest-cost relation on to G. --> We output G to the list and eliminate the edge in c-[:costs1]->g.
We're on G and move on to E via g-[:costs1]->e. --> E goes on the list and the just traveled edge is eliminated.
We're on E, which only has one relation with I. We incur the cost of 1 and travel on to I. --> I goes on the list and E's "costs1" edge gets eliminated.
We're on I, which has no outbound edges and thus no adjacent nodes. Our path run ends and we return to START iterating the whole process with the edges that remain.
We're on START. Its lowest available outbound edge is "costs3", leading us to C. --> C is already on the list, so we just eliminate the edge in START-[:costs3]->c and move on to the next available lowest-cost node, which is F. Note that now we've used up all edges emanating from START.
We're on F, which leads us to J (cost =1) --> J goes on the list, the edge gets eliminated.
We're on J, which leads us to L (cost = 1)--> L goes on the list, the edge gets eliminated.
We're on L, which leads us to N (cost = 1)--> N goes on the list, the edge gets eliminated.
We're on N, which is a dead end, meaning our second path run ends. Because we cannot start the next run from START (as it has no edges available anymore), we move on to next available node of least depth, i.e. A.
We're on A, which leads us to B (cost = 2) --> B is already on the list and we dump the edge.
We're on B, which leads us to D (cost = 3) --> D goes on the list, the edge gets eliminated.
Etc.
Output / final list / "path of paths" (hopefully I did this correctly):
A
B
C
G
E
I
F
J
L
N
D
M
O
H
K
P
Q
R
CREATE ( START { n:"Start" }),(a { n:"A" }),(b { n:"B" }),(c { n:"C" }),(d { n:"D" }),(e { n:"E" }),(f { n:"F" }),(g { n:"G" }),(h { n:"H" }),(i { n:"I" }),(j { n:"J" }),(k { n:"K" }),(l { n:"L" }),(m { n:"M" }),(n { n:"N" }),(o { n:"O" }),(p { n:"P" }),(q { n:"Q" }),(r { n:"R" }),
START-[:costs1]->a, START-[:costs2]->b, START-[:costs3]->c,
a-[:costs1]->START, a-[:costs2]->b, a-[:costs3]->c, a-[:costs4]->d, a-[:costs5]->e,
b-[:costs1]->START, b-[:costs2]->c, b-[:costs3]->d, b-[:costs4]->f,
c-[:costs1]->g, c-[:costs2]->f,
d-[:costs1]->g, d-[:costs2]->f, d-[:costs3]->h,
e-[:costs1]->i,
f-[:costs1]->j,
g-[:costs1]->e, g-[:costs2]->j, g-[:costs3]->k,
j-[:costs1]->l, j-[:costs2]->m, j-[:costs3]->n,
l-[:costs1]->n, l-[:costs2]->f,
m-[:costs1]->o, m-[:costs2]->p, m-[:costs3]->q,
q-[:costs1]->n, q-[:costs2]->r;
The algorithm being sought is a modification of the nearest neighbor (greedy) heuristic for TSP. The changes result in an algorithm that looks like this:
1. Stand on the start vertex (rather than an arbitrary vertex) as the current vertex.
2. Find the shortest unvisited edge, E, leading from the current vertex to some vertex V; terminate if there is no such edge.
3. Set the current vertex to V.
4. Mark E as visited.
5. If the number of visited edges has reached the limit, terminate.
6. Go to step 2.
As with the original algorithm, the output is the visited vertices.
To handle the use case, allow the algorithm to take a set of already visited edges as an additional input, rather than always starting with an empty set. You then just call the function again with that set of visited edges until the starting vertex only leads to visited edges.
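A minimal Python sketch of this answer, run on a small made-up graph rather than the question's data set. The names `greedy_walk` and `path_of_paths`, the adjacency-list format (vertex mapped to a list of `(cost, target)` pairs) and the assumption of at most one edge per ordered vertex pair are all illustrative choices. The sketch only restarts from the start vertex, so it does not cover the question's fallback to the next-shallowest node once START is exhausted.

```python
def greedy_walk(graph, start, visited_edges, max_edges=None):
    """One run: keep following the cheapest unvisited outgoing edge.

    Mutates visited_edges and returns the vertices reached, in order.
    """
    order = []
    current = start
    used = 0
    while max_edges is None or used < max_edges:
        candidates = [(cost, nxt) for cost, nxt in graph.get(current, ())
                      if (current, nxt) not in visited_edges]
        if not candidates:
            break                          # dead end: this run is over
        cost, nxt = min(candidates)
        visited_edges.add((current, nxt))
        used += 1
        order.append(nxt)
        current = nxt
    return order


def path_of_paths(graph, start):
    """Repeat runs from start until all of its edges are used; list first visits."""
    visited_edges = set()
    seen = {start}
    result = []
    while any((start, nxt) not in visited_edges for _, nxt in graph.get(start, ())):
        for v in greedy_walk(graph, start, visited_edges):
            if v not in seen:
                seen.add(v)
                result.append(v)
    return result


# Hypothetical graph: two branches leaving START.
demo = {"START": [(1, "A"), (2, "B")], "A": [(1, "C")], "B": [(1, "D")]}
print(path_of_paths(demo, "START"))  # ['A', 'C', 'B', 'D']
```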
(Sorry, I'm new on the site, can't comment)
I was hired to find a solution to this particular query. I only learned of this question afterward. I am not going to post it in full here, but I am willing to discuss the solution and get feedback from anyone interested.
It turned out not to be possible with Cypher alone (well, I could not find out how myself). So I wrote a Java function with the Neo4j bindings to implement this.
The solution is single threaded, flat (no recursion), and very close to the description by @Nuclearman. It uses two data structures (ordered maps), one to remember visited edges, another to keep a list of "start" nodes (for when the path runs out):
Follow path of smallest costs (memorize visited edges, store nodes by depth/cost)
On end of path, pick a new start node (smallest depth first, then smallest cost)
Report any new node in the order they are accessed
The use of hash sets, coupled with the fact that edges are visited only once makes it fast and memory efficient.
Let G = (V,E) be an undirected graph. Let w(e) be a weighting function with positive weights. Let T be a minimum spanning tree of G with respect to w.
Given a set of edges S, where S is a subset of E(G), define a new weighting function Q as Q(e) = w(e) if e is not in S, and Q(e) = w(e)+100 if e is in S. Design an algorithm that accepts as input, G, T, w and a set S, |S| = 10 and outputs a minimum spanning tree w.r.t. Q. Make it run in O(V+E) time.
OK: what I've learned since I originally asked this question is that removing a single edge from an MST splits it into two separate components, each of which is an MST of the vertices in its own component. So, in this problem, the edges in S might break up the MST into smaller MSTs (up to 11, right?). I have to find the lightest edges that connect one component to another.
My plan is to start at one vertex and expand with BFS until I cover an entire one of these components. For every u in the component, u.color = black. Then I go back and cover the component again with BFS, this time finding all the edges that connect to vertices that are not colored black, which are therefore not contained in the component and cross the existing cut. The vertices at the far end of these edges are placed in a priority queue R. Once I am done, I call u = RemoveMin(R), which runs in O(lg E). Since it is only called each time I finish covering a component, it is called at most 10 times, for a total of 10 * O(lg E), which is still O(lg E). Once I remove u, I perform BFS on the new component, so that u.color = black for every u in that component. I go through all the black vertices again so that I can queue all the white vertices with updated keys into R. I do u = RemoveMin(R) again, and so on.
So I really think that this works and is provable. Can anyone suggest something similar?
Any help, however small, would be appreciado.
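One similar approach, given here as a hedged sketch rather than a worked-out proof, replaces the priority queue with contraction. Drop the tree edges that are in S, label the resulting components (at most 11) with BFS, scan every edge once to keep the lightest edge under Q between each pair of components, then run Kruskal on the tiny contracted graph. The function name `update_mst`, the vertex numbering 0..V-1 and the representation of edges as tuples (spelled identically in `edges`, `w`, `tree` and `S`) are assumptions for illustration. The BFS is O(V), the scan is O(E), and the contracted graph has at most 55 edges, so the total should be O(V + E).

```python
from collections import deque


def update_mst(num_vertices, edges, w, tree, S):
    """Return the edges of a minimum spanning tree w.r.t. Q(e) = w(e) + 100 for e in S."""
    def q(e):
        return w[e] + (100 if e in S else 0)

    # Tree edges outside S keep their weight, so they stay in the new tree.
    kept = [e for e in tree if e not in S]
    adj = [[] for _ in range(num_vertices)]
    for u, v in kept:
        adj[u].append(v)
        adj[v].append(u)

    # Label the components of the remaining forest with BFS.
    comp = [-1] * num_vertices
    count = 0
    for s in range(num_vertices):
        if comp[s] != -1:
            continue
        comp[s] = count
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if comp[y] == -1:
                    comp[y] = count
                    queue.append(y)
        count += 1

    # One pass over all edges: lightest edge under Q per pair of components.
    best = {}
    for e in edges:
        a, b = comp[e[0]], comp[e[1]]
        if a != b:
            key = (min(a, b), max(a, b))
            if key not in best or q(e) < q(best[key]):
                best[key] = e

    # Kruskal on the contracted graph (constant size when |S| = 10).
    parent = list(range(count))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    result = list(kept)
    for (a, b), e in sorted(best.items(), key=lambda kv: q(kv[1])):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            result.append(e)
    return result


# Hypothetical 4-vertex example: penalizing tree edge (2, 3) makes (0, 3) cheaper.
E = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
w = {(0, 1): 1, (1, 2): 2, (2, 3): 3, (0, 3): 50, (0, 2): 60}
new_tree = update_mst(4, E, w, tree={(0, 1), (1, 2), (2, 3)}, S={(2, 3)})
```

The reasoning behind keeping the edges of T that are not in S: each of them was a lightest edge across the cut it defines in T, and under Q its weight is unchanged while every other weight stays the same or grows.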