Parallel edge detection - graph

I am working on a problem (from Algorithms by Sedgewick, section 4.1, problem 32) to help my understanding, and I have no idea how to proceed.
"Parallel edge detection. Devise a linear-time algorithm to count the parallel edges in a (multi-)graph.
Hint: maintain a boolean array of the neighbors of a vertex, and reuse this array by only reinitializing the entries as needed."
Two edges are considered parallel if they connect the same pair of vertices.
Any ideas what to do?

I think we can use BFS for this.
The main idea is to tell whether two or more edges exist between the same two nodes. For this we can use a set: while scanning a node's adjacency list, we check whether the adjacent node is already in the set.
This uses O(V) extra space and runs in O(V + E) expected time. Note that it only detects whether a parallel edge exists; it does not count them.
import java.util.*;

// Assumes an undirected graph stored as ArrayList<ArrayList<Integer>> graph.
boolean bfs(int start) {
    Queue<Integer> q = new ArrayDeque<Integer>(); // Queue is an interface, so use a concrete class
    boolean[] mark = new boolean[num_of_vertices];
    mark[start] = true;
    q.add(start); // put the first node into the queue
    while (!q.isEmpty()) {
        int current = q.remove();
        // hash set holding the neighbors already seen in current's adjacency list
        HashSet<Integer> set = new HashSet<Integer>();
        ArrayList<Integer> adjacentlist = graph.get(current); // get adjacency list
        for (int x : adjacentlist) {
            if (set.contains(x)) { // an edge current-->x was already seen
                return true;       // so we have our parallel edge
            }
            set.add(x);            // otherwise this is a new edge
            if (!mark[x]) {        // normal BFS routine
                mark[x] = true;
                q.add(x);
            }
        }
    }
    return false; // no parallel edge is reachable from start
}

Assume that the vertices in your graph are the integers 0 .. |V| - 1.
If your graph is directed, edges in the graph are denoted (i, j).
This allows you to produce a unique mapping of any edge to an integer (a hash function) which can be found in O(1).
h(i, j) = i * |V| + j
You can insert/lookup the tuple (i, j) in a hash table in amortised O(1) time. For |E| edges in the adjacency list, this means the total running time will be O(|E|) or linear in the number of edges in the adjacency list.
A Python implementation of this might look something like this:
def identify_parallel_edges(adj_list):
    # O(n) map from each edge to the number of times it occurs.
    # The Python implementation of tuple hashing is a more sophisticated
    # version of the approach described above, but is still O(1).
    edges = {}
    for edge in adj_list:
        if edge not in edges:
            edges[edge] = 0
        edges[edge] += 1
    # O(n) filter that drops the non-parallel edges
    res = []
    for edge, count in edges.items():
        if count > 1:
            res.append(edge)
    return res

edges = [(1, 0), (2, 1), (1, 0), (3, 4)]
print(identify_parallel_edges(edges))
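Neither answer above follows the hint from the book directly, so here is a minimal sketch of it. It assumes an undirected multigraph without self-loops, given as a list of adjacency lists in which every edge appears in the lists of both endpoints; the name count_parallel_edges is made up for this example. One boolean array is allocated once, and after each vertex only the entries that were set are cleared, which keeps the total work at O(V + E).

```python
def count_parallel_edges(adj):
    """Count the extra copies of edges in an undirected multigraph.

    adj[v] lists the neighbors of v; an edge v-w appears in both adj[v]
    and adj[w]. Self-loops are assumed to be absent.
    """
    seen = [False] * len(adj)   # reused for every vertex
    duplicates = 0
    for neighbors in adj:
        for w in neighbors:
            if seen[w]:
                duplicates += 1  # w was already a neighbor: parallel edge
            else:
                seen[w] = True
        # Reinitialize only the entries touched for this vertex.
        for w in neighbors:
            seen[w] = False
    # Every parallel edge was counted once from each endpoint.
    return duplicates // 2


# Edge 0-1 occurs twice, edge 0-2 once.
print(count_parallel_edges([[1, 1, 2], [0, 0], [0]]))
```

With this convention a triple edge counts as two parallel edges, since two of the three copies are redundant.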

Related

Initialization of Arcs depending on Sets/Subsets in directed graphs in CPLEX

I am dealing with a directed weighted graph and have a question about how to initialize a set as defined in the following:
Assume that the graph has the following nodes, which are subdivided into three different subsets.
//Subsets of Nodes
{int} Subset1= {44,99};
{int} Subset2={123456,123457,123458};
{int} Subset3={1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
{int} Nodes=Subset1 union Subset2 union Subset3;
Now there is a set of H_j arcs, where j is in Nodes. H_j gives all arcs outgoing from Node j.
The arcs are stored in an Excel file with the following structure: [image not shown]
For node 44 in Nodes (Subset 1), there are the arcs <44,123456>, <44,123457>, <44,123458>. For 66 in Nodes (Subset 2), there is no arc. Can somebody help me how to implement this?
It is important that the code uses the input from the Excel file, because in my real case there will be too much data to enter manually... :(
Maybe there is a really easy solution for that. I would be very thankful!
Thank you so much in advance!
This addition refers to the answer from @Alex Fleischer:
Your code also seems to work in the overall context.
I am trying to implement the following constraints within a maximization problem (the expressions (j,99) and (j,i) in the lower bounds of the sums represent arcs): [image of the constraints not shown]
I tried to implement it like this:
{int} TEST= {99};
subject to {
sum(m in M, j in a[m])x[<44,j>]==3;
sum(j in destPerOrig[99], t in TEST)x[<j,t>]==3;
forall(i in Nodes_wo_Subset1)
sum(j in destPerOrig[i],i in destPerOrig[i])x[<j,i>]==1;
}
M is a set of trains and a[M] gives a specific cost value for each individual train. CPLEX shows 33 error messages.
The most frequent ones are that it cannot extract x[<j,i>], sum(i in destPerOrig[i]) and sum(j in destPerOrig[i]), and that x and destPerOrig are outside of the valid range.
Most probably the problem is that I implemented the constraints in the wrong way. Again, it is a directed graph.
Referring to the mathematical formulation in the screenshot: could the format of destPerOrig[i] be a problem?
At the moment destPerOrig[44] gives {2 3 4}. But shouldn't it give
{<44 2> <44 3> <44 4>} to work within the mathematical formulation?
I hope that this is enough information for you to help me :(
I would be very thankful!
"all arcs outgoing from Node j."
How to do this depends on how you store the adjacencies of the graph.
Perhaps you store a vector of arcs:
LOOP over arcs
    IF arc source == node J
        ADD to output
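As a rough illustration of the loop above, here is a small Python sketch. It assumes the arcs are already available as (source, target) pairs; the function name outgoing_arcs and the sample data are made up for this example. Instead of filtering for a single node J, it groups all arcs by source in one pass, so the arcs leaving any node can be looked up afterwards.

```python
from collections import defaultdict


def outgoing_arcs(arcs):
    """Group a list of (source, target) arcs by their source node."""
    out = defaultdict(list)
    for source, target in arcs:
        out[source].append((source, target))
    return dict(out)


arcs = [(44, 123456), (44, 123457), (44, 123458)]
h = outgoing_arcs(arcs)
print(h[44])          # all arcs leaving node 44
print(h.get(66, []))  # a node without outgoing arcs gives an empty list
```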
.mod
tuple arcE
{
string o;
string d;
}
{arcE} arcsInExcel=...;
{int} orig={ intValue(a.o) | a in arcsInExcel};
{int} destPerOrig[o in orig]={intValue(a.d) | a in arcsInExcel : intValue(a.o)==o && a.d!="" };
execute
{
writeln(orig);
writeln("==>");
writeln(destPerOrig);
}
/*
which gives
{44 66}
==>
[{2 3 4} {}]
*/
https://github.com/AlexFleischerParis/oplexcel/blob/main/readarcs.mod
.dat
SheetConnection s("readarcs.xlsx");
arcsInExcel from SheetRead(s,"A2:B5");
https://github.com/AlexFleischerParis/oplexcel/blob/main/readarcs.dat

Write an algorithm to find a path that traverses all edges of directed graph G exactly once

...You may visit nodes multiple times, if necessary. Show the run time complexity of your algorithm. This graph is not necessarily strongly connected, but starting from a node there should exist such a path.
My approach so far was to repeatedly DFS at unvisited nodes to find the node with the highest post number as this will be part of the source node in a meta graph.
Then repeatedly DFS on the reverse graph to find the sink node meta graph.
Then run the Eulerian path algorithm until we exhaust all of the edges, backtracking if a path leads to a dead end.
I can't figure out how to proceed from here.
Hi, I came up with this. Can someone please verify it?
function edge_dfs(vertex u, graph, path, num_edges) {
    if num_edges == 0 { // Found a path.
        return true
    }
    for all neighbors n of u {
        if exists((u, n)) == true {
            num_edges--
            exists((u, n)) = false
            path.append(n)
            if edge_dfs(n, graph, path, num_edges) == true {
                return true
            } else {
                num_edges++ // Backtrack if this edge was unsuccessful.
                exists((u, n)) = true
                path.pop()
            }
        }
    }
    return false // No neighbors or no valid paths from this vertex.
}
Repeatedly do DFS and find a node in a source component: call it s.
path = array{s}
exists((u, v)) = true for all edges (u, v) in graph
num_edges = number of edges in graph
if edge_dfs(s, graph, path, num_edges) == true:
    Path is the elements in array 'path' in order.
else:
    Such a path does not exist.
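Because of the backtracking, this is not O(|E| + |V|) in general: an edge can be explored again after every failed attempt, so the worst case is exponential in |E|. A linear-time method such as Hierholzer's algorithm avoids the backtracking.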
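For comparison with the backtracking search, here is a sketch of Hierholzer's algorithm, which is the standard way to find an Eulerian path in O(|E| + |V|). It is not the code from the post: it assumes the graph is given as a list of directed (u, v) edges, and the function name eulerian_path is made up for this example. It picks the start vertex from the degree conditions instead of running DFS on the meta graph.

```python
from collections import defaultdict


def eulerian_path(edges):
    """Return the vertices of a path using every directed edge exactly once,
    or None if no such path exists."""
    if not edges:
        return []
    adj = defaultdict(list)
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for u, v in edges:
        adj[u].append(v)
        out_deg[u] += 1
        in_deg[v] += 1
    vertices = set(out_deg) | set(in_deg)
    # At most one vertex may have one extra outgoing edge (the start) and
    # at most one may have one extra incoming edge (the end).
    starts = [v for v in vertices if out_deg[v] - in_deg[v] == 1]
    ends = [v for v in vertices if in_deg[v] - out_deg[v] == 1]
    balanced = all(abs(out_deg[v] - in_deg[v]) <= 1 for v in vertices)
    if not balanced or len(starts) > 1 or len(starts) != len(ends):
        return None
    start = starts[0] if starts else edges[0][0]
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if adj[u]:
            stack.append(adj[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())    # dead end: vertex is final
    path.reverse()
    # If some edges were never reached, they are in another component.
    return path if len(path) == len(edges) + 1 else None


print(eulerian_path([(0, 1), (1, 2), (2, 0), (0, 3)]))
```

Each edge is pushed and popped once, which is where the linear bound comes from.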

Finding the minimum node and then returning the next minimum natural value that's not in the BST

So I have a problem and I have the algorithm for it, but I just can't seem to turn it into code in C.
Problem: Given an AVL tree, return the next minimum natural value that's not in the tree.
Example: if 2 is the minimum in the tree, I should find out whether or not 3 is one of the nodes in the tree, if it is not, I should return the value 3, if it is, I should see if 4 is in the tree, and so on...
Algorithm for the problem that works in O(log n) (where n is the number of nodes in the tree):
First, we check whether node->size == node->key - TreeMinimum.
If yes, go to the right side of the tree.
If no, go to the left.
When we reach NULL, we return the value of the last node we visited plus 1.
The size of a node is the number of nodes under this node, including the node itself.
I wrote this code in C but it doesn't seem to work:
int next_missing( AVLNodePtr tnode )
{
    int x,y;
    if(tnode==NULL)
    {
        return (tnode->key)+1;
    }
    if(tnode->size == tnode->key - FindMin(tnode))
        x = next_missing(tnode->child[1]);
    if(tnode->size != tnode->key - FindMin(tnode))
        y = next_missing(tnode->child[0]);
    if(x>y) return y;
    else return x;
}
Any help/tips on how to fix the code would be appreciated.
Thanks.
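One way to make the descent described above concrete is sketched here in Python rather than C. It is a variant under stated assumptions, not the exact condition from the post: keys are distinct integers, each node stores the size of its subtree, and the test compares the size of the left subtree with a running lower bound lo instead of comparing the node's own size with the tree minimum. The names Node, size, find_min and next_missing are made up for this example. Note also that the C code above dereferences tnode after finding it is NULL; the sketch avoids this by carrying the answer in lo.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        # size = number of nodes in this subtree, including this node
        self.size = 1 + size(left) + size(right)


def size(node):
    return node.size if node is not None else 0


def find_min(node):
    while node.left is not None:
        node = node.left
    return node.key


def next_missing(root):
    """Smallest integer above the tree minimum that is not in the tree."""
    lo = find_min(root)  # every value in [minimum, lo) is known to be present
    node = root
    while node is not None:
        # The left subtree holds keys in [lo, node.key). If it has exactly
        # node.key - lo nodes, that range is full, so the gap is to the right.
        if size(node.left) == node.key - lo:
            lo = node.key + 1
            node = node.right
        else:
            node = node.left
    return lo


# Keys 2, 3, 4 and 6: the first missing value after the minimum is 5.
tree = Node(4, Node(3, Node(2)), Node(6))
print(next_missing(tree))
```

The loop follows a single root-to-leaf path, so it takes O(log n) steps in an AVL tree.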

Recursion and a counter variable in a binary tree

Here are two pieces of code:
1. Find the kth smallest integer in a binary search tree:
void FindKthSmallest(struct TreeNode* root, int& k)
{
    if (root == NULL) return;
    if (k == 0) return; // k==0 means target node has been found
    FindKthSmallest (root->left, k);
    if (k > 0) // k==0 means target node has been found
    {
        k--;
        if (k == 0) { // target node is current node
            cout << root->data;
            return;
        } else {
            FindKthSmallest (root->right, k);
        }
    }
}
2. Find the number of nodes in a binary tree:
int Size (struct TreeNode* root)
{
    if (root == NULL) return 0;
    int l = Size (root->left);
    int r = Size (root->right);
    return (l+r+1);
}
My Question:
In both these codes, I will have to keep track of the number of nodes I visit. Why is it that code 1 requires passing a parameter by reference to keep track of the number of nodes I visit, whereas code 2 does not require any variable to be passed by reference ?
The first code (1) finds the kth smallest node in your BST by doing an in-order traversal, which visits the nodes in ascending order. It makes several checks:
root == NULL - the subtree is empty, so there is nothing to visit.
k == 0 - the target node has already been found, so the remaining calls return immediately.
It then recursively traverses the left subtree, which holds the smaller values. After that, if k > 0, it decrements k to count the current node. This is why you need to pass k by reference: every recursive call has to see the decrements made by the other calls. If k is now zero, the current node is the kth smallest; if not, the traversal continues in the right subtree.
For the second code (2) you are just counting the nodes in your tree, starting at the root and counting each subsequent node (left and right) recursively until no more nodes can be found. Each call returns the size of its own subtree: the number of nodes on the left, plus the number on the right, plus 1 for the current node. The count travels back up through the return values, so no variable passed by reference is needed, although you could implement it that way if you chose to.
Does this help?
Passing the parameter by reference lets you keep track of the count across the recursive calls; otherwise each call would decrement its own copy and the change would be lost when the call returns. It allows you to modify the caller's variable in place rather than a local copy.
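The difference between the two styles can be sketched in Python. Python has no int& parameters, so this example assumes a one-element list as a stand-in for the shared counter; the names TreeNode, kth_smallest and size mirror the question but are otherwise made up here. The first function needs state that all recursive calls can see and change, the second only combines return values.

```python
class TreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def kth_smallest(root, k):
    remaining = [k]  # shared mutable cell, plays the role of `int& k`
    result = [None]

    def visit(node):
        if node is None or remaining[0] == 0:
            return
        visit(node.left)
        if remaining[0] > 0:
            remaining[0] -= 1          # count the current node
            if remaining[0] == 0:
                result[0] = node.data  # current node is the kth smallest
                return
            visit(node.right)

    visit(root)
    return result[0]


def size(root):
    # No shared state: each call reports its subtree size via the return value.
    if root is None:
        return 0
    return size(root.left) + size(root.right) + 1


tree = TreeNode(4,
                TreeNode(2, TreeNode(1), TreeNode(3)),
                TreeNode(6, TreeNode(5), TreeNode(7)))
print(kth_smallest(tree, 3), size(tree))
```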

Sum of ranks in a binary tree - is there a better way

Maybe this question does not belong here as it is not a programming question per se, and I do apologize if this is the case.
I just had an exam in abstract data structures, and there was this question:
The rank of a tree node is defined like this: if you are the root of the tree, your rank is 0. Otherwise, your rank is the rank of your parent + 1.
Design an algorithm that calculates the sum of the ranks of all nodes in a binary tree. What is the runtime of your algorithm?
I believe my answer solves this question. My pseudo-code is as follows:
int sum_of_tree_ranks(tree node x)
{
    if x is a leaf return rank(x)
    else, return sum_of_tree_ranks(x->left_child)+sum_of_tree_ranks(x->right_child)+rank(x)
}
where the function rank is
int rank(tree node x)
{
    if x->parent=null return 0
    else return 1+rank(x->parent)
}
It's very simple: the sum of ranks of a tree is the sum for the left subtree + the sum for the right subtree + the rank of the root.
I believe the runtime of this algorithm is O(n^2). I believe this is the case because we were not told that the binary tree is balanced. It could be that there are n nodes in the tree but also n different "levels", as in, the tree looks like a linked list rather than a tree. So to calculate the rank of a leaf, we potentially go n steps up, the parent of the leaf will be n-1 steps up, etc., so that's n+(n-1)+(n-2)+...+1+0=O(n^2).
My question is: is this correct? Does my algorithm solve the problem? Is my analysis of the runtime correct? And most importantly, is there a better way to solve this that does not run in O(n^2)?
Your algorithm works. Your analysis is correct. The problem can be solved in O(n) time (take care of leaves by yourself):
int rank(tree node x, int r)
{
    if x is a leaf return r
    else
        return rank(x->left_child, r + 1) + rank(x->right_child, r + 1) + r
}
rank(tree->root, 0)
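A runnable version of this idea might look like the following Python sketch. The Node class and the name sum_of_ranks are assumptions made for this example. It handles the leaf case by treating a missing child as an empty tree that contributes 0, so nodes with only one child work as well.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def sum_of_ranks(node, rank=0):
    """Sum of the ranks (depths) of all nodes; the root has rank 0."""
    if node is None:
        return 0  # an empty subtree contributes nothing
    return (rank
            + sum_of_ranks(node.left, rank + 1)
            + sum_of_ranks(node.right, rank + 1))


# A chain of four nodes has ranks 0, 1, 2, 3.
chain = Node(Node(Node(Node())))
print(sum_of_ranks(chain))
```

Every node is visited once and the rank is passed down instead of recomputed, which gives O(n). For very deep trees Python's recursion limit would become a concern.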
You're right, but there is an O(n) solution provided you can use a more "complex" data structure.
Let each node hold its rank and update the ranks whenever you add/remove a node. That way you can use the O(1) statement:
return 1 + node->left.rank + node->right.rank;
and do this for each node in the tree to achieve O(n).
A rule of thumb for reducing time complexity: if you can make the data structure more complex and add features that adapt it to your problem, you can often reduce the time complexity to O(n).
It can be solved in O(n) time, where n is the number of nodes in the binary tree.
It's nothing but the sum of the depths of all nodes, where the depth of the root node is zero.
Algorithm:
Input: binary tree with left and right children
sum = 0
Output: sum of ranks
PrintSumOfRank(root, sum):
    if (root == NULL) return 0;
    return PrintSumOfRank(root->lchild, sum+1) + PrintSumOfRank(root->rchild, sum+1) + sum;
Edit:
This can also be solved using a queue, i.e. a level-order traversal of the tree.
Algorithm using a queue:
int sum = 0;
int currentHeight = 0;
Node *T; // root of the tree
Node *t1;
if (T != NULL)
    enque(T);
while (Q is not empty) begin
    currentHeight = currentHeight + 1;
    for each node that was in Q at the start of this level do
        t1 = deque();
        if (t1->lchild != NULL) begin
            enque(t1->lchild); sum = sum + currentHeight;
        end if
        if (t1->rchild != NULL) begin
            enque(t1->rchild); sum = sum + currentHeight;
        end if
    end for
end while
print sum;
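The queue-based version can be sketched in Python as follows, assuming a minimal Node class with left and right fields; the names here are made up for this example. The important detail is that the inner loop runs once per node that was in the queue when the level started, so that depth increases by exactly one per level. This sketch adds a node's depth when the node is dequeued, which gives the same total as adding it when the child is enqueued.

```python
from collections import deque


class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def sum_of_ranks_level_order(root):
    """Sum of node depths using a level-order traversal."""
    if root is None:
        return 0
    total = 0
    depth = 0
    queue = deque([root])
    while queue:
        # Process exactly the nodes that belong to the current level.
        for _ in range(len(queue)):
            node = queue.popleft()
            total += depth
            for child in (node.left, node.right):
                if child is not None:
                    queue.append(child)
        depth += 1
    return total


# Root with two children: ranks 0, 1, 1.
print(sum_of_ranks_level_order(Node(Node(), Node())))
```

Unlike the recursive version, this does not depend on the recursion depth, so it also copes with a tree that degenerates into a long chain.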

Resources