Counting repeated ties between vertices - r

I have an undirected graph with 343 nodes and 4000+ edges. These nodes are connected to each other by different types of ties. Imagine I have A, B, C, D, E as nodes and tie 1, tie 2, and tie 3 as tie types. I have organized my dataframe as below:
Column1 Column2 Column3
node1 node2 tie
A C 1
C A 3
D A 2
B A 2
A B 1
As the example above shows, A and C are linked twice, and so are A and B, regardless of the tie type and of which node is in the "node1" or "node2" column. The sum of these repetitions for the whole network is 4.
I want to get exactly this sum for my graph with 4000+ edges. It would also be nice to see the number of repeated links for a specific node or pair of nodes. I hope this makes sense.
Any help truly appreciated.
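One way to get these numbers is to treat each row as an unordered pair, so that A-C and C-A count as the same link, and then tally how often each pair occurs. The sketch below is in Python rather than R and is only illustrative. The names `rows`, `pair_counts`, `repeated_pairs`, and `repeated_links_per_node` are hypothetical, and it assumes that the "sum of repetitions" means the total number of rows belonging to pairs that occur more than once, which gives 4 for the example. It also assumes there are no self-loops.

```python
from collections import Counter

# Example rows from the question: (node1, node2, tie)
rows = [("A", "C", 1), ("C", "A", 3), ("D", "A", 2), ("B", "A", 2), ("A", "B", 1)]

def pair_counts(rows):
    """Count rows per unordered node pair, ignoring tie type and column order."""
    return Counter(tuple(sorted((u, v))) for u, v, _tie in rows)

def repeated_pairs(rows):
    """Keep only the pairs that are linked more than once."""
    return {pair: n for pair, n in pair_counts(rows).items() if n > 1}

def repeated_links_per_node(rows):
    """For each node, sum the counts of the repeated pairs it takes part in."""
    per_node = Counter()
    for (u, v), n in repeated_pairs(rows).items():
        per_node[u] += n
        per_node[v] += n
    return per_node

repeated = repeated_pairs(rows)             # repeated pairs with their counts
total_repetitions = sum(repeated.values())  # network-wide sum
```

Looking up a single pair is then `pair_counts(rows)[("A", "B")]`, with the two node names in sorted order.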

Related

Count unique triangles in a network

Using iGraph I can count the number of triangles a given vertex is part of, but I can't find a way to simply count the number of unique triangles within a network. For instance, we create a network that forms two distinct triangles: A-B-D, B-C-E
library(igraph)
edges <- data.frame("vertex1" = c("A","A","B","B","B","C"),"vertex2"= c("B","D","D","C","E","E"))
example_graph <- graph_from_data_frame(edges, directed = FALSE)
If I run sum(count_triangles()) I get a result of 6
> sum(count_triangles(example_graph))
[1] 6
This makes sense because this is merely summing the number of triangles each vertex belongs to: A = 1, B = 2, C = 1, D = 1, E = 1.
However, we can see that there are only two distinct triangles:
> triangles(example_graph)
+ 6/5 vertices, named, from 9c62b6b:
[1] B A D B C E
Is there a way to count only unique triangles in the graph? So that I get an answer of 2 to the above? In my actual data I have thousands of vertices and a few million edges so calculating the number of unique triangles manually isn't an option. Should I simply use length(triangles(example_graph))/3 ?
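The relationship behind the division by 3 can be checked on the example: every triangle is counted once for each of its three vertices, so the per-vertex counts add up to three times the number of unique triangles. The following is a minimal Python sketch (not igraph), with hypothetical function names, that enumerates each triangle exactly once by visiting its vertices in sorted order.

```python
def build_adjacency(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj

def unique_triangles(edges):
    """Return each triangle once, as a sorted 3-tuple of vertex names."""
    adj = build_adjacency(edges)
    found = []
    for u in adj:
        for v in adj[u]:
            if v > u:
                # w must be adjacent to both u and v, and come after v
                for w in adj[u] & adj[v]:
                    if w > v:
                        found.append((u, v, w))
    return sorted(found)

def triangles_per_vertex(edges):
    """Number of triangles each vertex belongs to."""
    counts = {}
    for tri in unique_triangles(edges):
        for vertex in tri:
            counts[vertex] = counts.get(vertex, 0) + 1
    return counts

# Edge list from the question
edges = [("A", "B"), ("A", "D"), ("B", "D"), ("B", "C"), ("B", "E"), ("C", "E")]
tris = unique_triangles(edges)
per_vertex = triangles_per_vertex(edges)
```

Here `len(tris)` equals `sum(per_vertex.values()) // 3`, which is the same identity the division by 3 relies on.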

How to calculate probability of a specific type of independent chance intersection

Let's say I have 4 events: A, B, C, and D. All of these events have an independent chance of 25% to occur.
It's possible that only 1 of these 4 events occur : A, B, C, or D.
It's possible that 2 of these 4 events occur at once : (A,B), (A,C), (A,D), (B,C), or (B,D).
It's possible that 3 of these 4 events occur at once : (A,B,C) or (B,C,D)
It's possible that all 4 events occur at once : (A,B,C,D)
I understand that the odds of (A,B) or (A,B,C) happening would be calculated by P(A)*P(B) or P(A)*P(B)*P(C) respectively. But how do you determine the odds of whether you get any one of the pairs of two or any one of the pairs of three?
Is it as simple as saying there are 12 possible outcomes and, for instance, any one pair of two would be 5 of those outcomes so 5/12 = 41.67%? Is this consistent regardless of what A, B, C, and D's individual occurrence chances are?
Your event "any two" is equivalent to:
A and B but neither C nor D
A and C but neither B nor D
A and D but neither B nor C
B and C but neither A nor D
B and D but neither A nor C
C and D but neither A nor B
Each of these events has probability (1/4)(1/4)(3/4)(3/4) = 9/256; there are six of them, so the total probability of any of them is 54/256.
For exactly three happening, we get four distinct events having probability 3/256 each. The overall probability of any of these is 12/256.
The probability of all and none are 1/256 and 81/256, respectively. Each can occur only one way.
Finally, there are four ways for exactly one of the four events to happen, and each of these outcomes has probability (1/4)(3/4)(3/4)(3/4) = 27/256. The total is thus 108/256.
81/256 + 108/256 + 54/256 + 12/256 + 1/256 = 256/256, as expected.
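These figures can be reproduced by enumerating all 16 combinations of the four independent events and adding up the probability of each combination by how many events occurred. The Python sketch below does this with exact fractions; `count_distribution` is a hypothetical helper name. It also makes clear why counting outcomes (such as 5/12) does not work: the combinations are not equally likely, and the distribution changes when the individual probabilities change.

```python
from fractions import Fraction
from itertools import product

def count_distribution(probs):
    """dist[k] = probability that exactly k of the independent events occur."""
    dist = [Fraction(0)] * (len(probs) + 1)
    for outcome in product([False, True], repeat=len(probs)):
        p = Fraction(1)
        for happened, p_event in zip(outcome, probs):
            # multiply by P(event) if it happened, else by P(not event)
            p *= p_event if happened else 1 - p_event
        dist[sum(outcome)] += p
    return dist

# The case from the question: four events at 25% each
equal = count_distribution([Fraction(1, 4)] * 4)

# Illustrative unequal probabilities (an assumption, not from the question)
unequal = count_distribution(
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(1, 10)]
)
```

For the 25% case, `equal` holds the probabilities of none, one, two, three, and four events in that order, matching the values worked out above.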

shortest path in directed graph by selecting the strongest connected nodes at each level

I am looking for an algorithm that finds a path between points A and B in a directed graph. The constraint is that it has to always try the edges with the highest cost first.
It is different from finding the shortest or longest path because this constraint has to be checked at every level.
I will give an example: the first column contains the starting points, the second column the destination points, and the third column the edge costs:
A C 8
A D 5
A E 2
B A 1
D A 5
D B 1
D E 3
E B 2
The correct path from A to B would be:
A--5--> D --1--> B
A to C was the first attempt, but since C is not connected to any other node it continued with the second option: A to D
D to A was discarded because A is part of the current path (AD)
D to B is selected to minimize the path length, although there is another, stronger connection between D and E.
So the constraints, in order of importance, are:
- shortest path
- strongest connection at each level
Thanks,
Cristi
That's a modified Dijkstra's algorithm, with a per-step preference for heavy edges.
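One possible reading of the two constraints is "fewest hops first, then the heaviest edge at each step among the moves that stay on a fewest-hops route". Under that assumption, the sketch below does not use Dijkstra's algorithm itself. Instead it runs a breadth-first search backwards from the target to get hop distances, then walks forward from the source, always taking the heaviest edge that reduces the remaining hop count by one. The function name `strongest_shortest_path` is hypothetical.

```python
from collections import deque

def strongest_shortest_path(edges, source, target):
    """Fewest-hops path from source to target, preferring heavy edges per step.

    edges: iterable of (start, destination, weight) for a directed graph.
    Returns the list of nodes on the path, or None if target is unreachable.
    """
    out_edges, predecessors = {}, {}
    for u, v, w in edges:
        out_edges.setdefault(u, []).append((w, v))
        predecessors.setdefault(v, []).append(u)

    # Reverse BFS: hops[n] = minimum number of edges from n to target
    hops = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for pred in predecessors.get(node, []):
            if pred not in hops:
                hops[pred] = hops[node] + 1
                queue.append(pred)

    if source not in hops:
        return None

    # Forward walk: only moves that shorten the remaining distance are allowed
    path = [source]
    node = source
    while node != target:
        candidates = [(w, v) for w, v in out_edges[node] if hops.get(v) == hops[node] - 1]
        _weight, node = max(candidates)  # heaviest qualifying edge
        path.append(node)
    return path

# Edge list from the question
edges = [
    ("A", "C", 8), ("A", "D", 5), ("A", "E", 2), ("B", "A", 1),
    ("D", "A", 5), ("D", "B", 1), ("D", "E", 3), ("E", "B", 2),
]
result = strongest_shortest_path(edges, "A", "B")
```

On the example, A to C is never taken because C cannot reach B, and A to D is preferred over A to E because both lead to B in one more hop and the A to D edge is heavier.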

Query ArangoDB general-graph for common neighbors with more than 2 start vertices?

Consider the following example graph:
 
Given the vertices A, B and C (creators), how can I find their common neighbors (projects all 3 participated in)? For two vertices, I could simply use GRAPH_COMMON_NEIGHBORS("myGraph", A, B), but what if I want to query for 3 or more? Expected result: 1 and 2.
Given the same vertices, how can I make it return only common neighbors with no other connections (projects the creators participated in exclusively, with no additional edges allowed)? Expected result: 1, because 2 has an edge coming from D, which isn't one of the starting vertices.
You can simply pass the same set of vertices as both parameters for common neighbors. Then repack the result in a better format for AQL to compute the intersection:
let res = (
let nodes = ["a/A","a/B","a/C"]
for n in GRAPH_COMMON_NEIGHBORS("g",nodes , nodes)
for f in VALUES(n)
return VALUES(f)
)
return CALL("intersection", res[0])
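Outside the database, the underlying logic is a set comparison. The Python sketch below is not AQL and does not use any ArangoDB API. It shows the two conditions on a hypothetical edge list reconstructed from the question's description: creators A, B, C are connected to projects 1 and 2, and D is connected to project 2. The names `common_and_exclusive` and `edges` are assumptions for illustration.

```python
def common_and_exclusive(edges, start_vertices):
    """Return (common, exclusive) neighbor sets for the start vertices.

    common:    targets that every start vertex has an edge to
    exclusive: common targets whose incoming edges come only from start vertices
    """
    starts = set(start_vertices)
    sources_by_target = {}
    for src, dst in edges:
        sources_by_target.setdefault(dst, set()).add(src)

    common = {t for t, srcs in sources_by_target.items() if starts <= srcs}
    exclusive = {t for t in common if sources_by_target[t] == starts}
    return common, exclusive

# Hypothetical edges (creator -> project) matching the description in the question
edges = [
    ("A", "1"), ("B", "1"), ("C", "1"),
    ("A", "2"), ("B", "2"), ("C", "2"), ("D", "2"),
]
common, exclusive = common_and_exclusive(edges, ["A", "B", "C"])
```

The first set answers the common-neighbor question for any number of start vertices. The second adds the requirement that no vertex outside the start set points at the project.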

Find length of longest trail in directed unweighted graph

I have a directed, unweighted, possibly cyclic graph that can contain loops and multiple duplicate edges (i.e. two edges from node 1 to node 2).
I would now like to find the length of the longest trail in this graph, i.e. the longest path that:
- uses no edge twice (but if there are multiple edges from node 1 to node 2, it can use every one of them)
- possibly visits nodes several time (i.e. it does not have to be a simple path)
In particular, is this problem NP-hard? I know that the longest simple path problem is NP-hard (by reducing Hamiltonian Path to it) and that the longest trail with edge reuse is in P (Bellman-Ford with weight -1 on every edge). For this problem, however, I am not sure, and I could not find good information on it.
Although I am not completely sure, I think that this problem is NP-hard. As I understand it, your question arises because of the multiple edges between nodes. A graph that has multiple edges between the same nodes can be expanded into a larger graph with no multiple edges. Thus, a graph with multiple edges between the same nodes is no different from a graph without multiple edges.
Let me walk through a simple example to explain:
Let there be a graph with 3 nodes (A, B, C) and 5 edges between them (A to B, A to B, B to A, B to C, C to A).
This graph can be expanded into one with 5 nodes and 7 edges.
Let's expand node A into 3 different nodes (A1, A2, A3). When we adjust the edges to match the original ones, there are 7 edges (A1 to B, A2 to B, B to A3, B to C, C to A1, C to A2, C to A3).
As a result, we now have a graph without multiple edges, which can be evaluated with the help of Hamiltonian Path and Bellman-Ford.
Hope I've at least cleared the problem a bit.
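Independently of the complexity question, the trail definition from the question (no edge used twice, parallel edges treated as distinct, nodes may repeat) can be made concrete with an exhaustive search. The Python sketch below is a brute-force backtracking search with exponential running time, so it is only suitable for small graphs and says nothing about whether a polynomial algorithm exists. The name `longest_trail_length` is hypothetical.

```python
def longest_trail_length(edges):
    """Length (in edges) of the longest trail in a directed multigraph.

    edges: list of (u, v); repeated entries are distinct parallel edges,
    and (u, u) is a loop. Each edge may be used at most once per trail.
    """
    outgoing = {}
    for idx, (u, v) in enumerate(edges):
        outgoing.setdefault(u, []).append((idx, v))
    used = [False] * len(edges)

    def extend(node):
        # Try every unused outgoing edge, then undo the choice (backtracking)
        best = 0
        for idx, nxt in outgoing.get(node, []):
            if not used[idx]:
                used[idx] = True
                best = max(best, 1 + extend(nxt))
                used[idx] = False
        return best

    nodes = {n for edge in edges for n in edge}
    return max((extend(n) for n in nodes), default=0)

# The 3-node example above: A->B twice, B->A, B->C, C->A
example = [("A", "B"), ("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")]
length = longest_trail_length(example)
```

In this example every node has equal in-degree and out-degree, so one trail uses all five edges, for instance A, B, A, B, C, A.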

Resources