Extract edges between community nodes and other nodes - r

Suppose we have a simple weighted network on which we run some form of community detection. We then extract a particular community, and the final task is to extract all edges between the nodes of this community and all other nodes.
Toy code is pasted below.
# Create toy graph
library(igraph)
set.seed(12345)
g <- make_graph("Zachary")
# Add weights to edges
E(g)$weight <- sample(x = 1:10, size = ecount(g), replace = TRUE)
# Run community detection
cl <- cluster_louvain(g)
There are 5 nodes which belong to community #1, 12 nodes which belong to community #2, etc.
> table(membership(cl))
1 2 3 4
5 12 2 15
Now we extract community #1:
g1 <- induced_subgraph(g, which(cl$membership == 1))
Question: how can I find the edges that connect nodes in community #1 with all other nodes (excluding the edges internal to community #1)?

Start by getting all edges incident to a vertex in your community:
all_edges <- E(g)[inc(V(g)[membership(cl) == 1])]
all_edges
+ 10/78 edges:
[1] 1-- 5 1-- 6 1-- 7 1--11 5-- 7 5--11 6-- 7 6--11 6--17 7--17
Then, filter out the ones that are completely internal (both vertices are in the community):
all_edges_m <- get.edges(g, all_edges) # two-column matrix of endpoint ids
# Drop the edges whose endpoints (column 1 and column 2) are both in the community
all_edges[!(
  all_edges_m[, 1] %in% V(g)[membership(cl) == 1] &
  all_edges_m[, 2] %in% V(g)[membership(cl) == 1]
)]
+ 4/78 edges:
[1] 1-- 5 1-- 6 1-- 7 1--11
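The same filter can also be written as a single test on each edge's two endpoints: keep the edge only when exactly one endpoint is inside the community. The sketch below hard-codes the community found above (vertices 5, 6, 7, 11 and 17) as an assumption, because cluster_louvain() is randomized and may number or split communities differently between runs. The names comm, el, crossing and boundary_edges are illustrative.

```r
library(igraph)

g <- make_graph("Zachary")

# Assumption: community #1 from the run above, hard-coded for reproducibility
comm <- c(5, 6, 7, 11, 17)

# Two-column matrix of endpoint ids, one row per edge
el <- ends(g, E(g), names = FALSE)

# An edge crosses the community boundary when exactly one endpoint is inside
crossing <- xor(el[, 1] %in% comm, el[, 2] %in% comm)
boundary_edges <- E(g)[crossing]
boundary_edges
```

With a real clustering, comm would be which(membership(cl) == 1) instead of the fixed vector.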

Related

how to convert a node list to an edge list in igraph?

I have an empty graph and need to plot the graph based on the convex hull with inner vertices.
My attempt is:
library(igraph)
set.seed(45)
n = 10
g <- graph.empty(n)
xy <- cbind(runif(n), runif(n))
vp <- convex_hull(xy)$resverts + 1
#[1] 8 10 7 2 1
## convert node_list to edge_list
plot(g, layout=xy)
The expected result is shown in the right-hand figure.
Question: how do I convert a node list to an edge list in igraph?
You can use add_edges along with embed
g2 <- g %>%
add_edges(c(t(embed(vp, 2)), vp[1], vp[length(vp)])) %>%
as.undirected()
and plot(g2, layout = xy) then draws the hull vertices joined into a closed polygon.
convex_hull does not output a node list in the sense that an igraph object has one. Here, vp is the sequence of vertex indices on the hull, so to create an edge list you only need each vertex to point to the next vertex in the sequence. dplyr::lead does this, with the first vertex as the default so that the last vertex links back and closes the circuit.
data.frame(
from = vp,
to = dplyr::lead(vp, 1, default = vp[1])
)
#>   from to
#> 1    8 10
#> 2   10  7
#> 3    7  2
#> 4    2  1
#> 5    1  8
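The same pairing can be done in base R without dplyr, by shifting the hull sequence one position. A minimal sketch, assuming vp holds the hull indices printed in the question (8, 10, 7, 2, 1) and the graph has 10 vertices; the names el and g2 are illustrative.

```r
library(igraph)

# Assumption: hull vertex indices as printed in the question
vp <- c(8, 10, 7, 2, 1)

# Pair each hull vertex with its successor; the last one wraps to the first
el <- cbind(from = vp, to = c(vp[-1], vp[1]))

# add_edges() expects a flat vector c(from1, to1, from2, to2, ...)
g2 <- add_edges(make_empty_graph(n = 10, directed = FALSE), as.vector(t(el)))
```

Calling plot(g2, layout = xy) with the xy matrix from the question would then place the hull edges on the original coordinates.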
Try this.
## create graph.
vids <- as.character(c(8, 10, 7, 2, 1))
g <- make_graph(c(), length(vids))
V(g)$name <- vids
## and connect the dots.
g2 <- g + path(c(vids, vids[1]))
g2

Changing edge color based on attribute

I'm working on a visual representation of a network in R, using the igraph package.
I have a data set with the links between all the nodes and, for each link/edge, the district it is assigned to.
I would like to set the color of each edge based on the district it is assigned to. The table below shows the structure of the data.
nodei  nodej  depot1  depot2
4      5      1       0
In this case the link (4-5) is assigned to depot1, so that edge should be colored green, for example.
Here is an example in which edges incident to vertex 1 are colored "red" and all other edges "green":
g <- make_ring(5)
g %>%
set_edge_attr(
name = "color",
value = c("green", "red")[1 + (rowSums(ends(., E(.)) == "1") > 0)]
) %>%
plot()
In your case, you could replace "1" by "depot1" and have a try.
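If the depot assignment is stored in columns of the edge table, as in the question, the color can also be derived from those columns directly. This is a sketch under that assumption: only the first row (4, 5, 1, 0) comes from the question, while the other rows, the name links and the color used for depot2 are made up for illustration.

```r
library(igraph)

# Edge table: first row from the question, remaining rows hypothetical
links <- data.frame(
  nodei  = c(4, 5, 6),
  nodej  = c(5, 6, 4),
  depot1 = c(1, 0, 1),
  depot2 = c(0, 1, 0)
)

# Columns after the first two become edge attributes (depot1, depot2)
g <- graph_from_data_frame(links, directed = FALSE)

# Green for edges assigned to depot1, blue (an arbitrary choice) otherwise
E(g)$color <- ifelse(E(g)$depot1 == 1, "green", "blue")
```

plot(g) picks up the color edge attribute automatically, so no extra plotting argument is needed.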
If I understand your question correctly, this should help: I created a type attribute and used it to color both the edges and the nodes.
library(igraph)
data <- read.table(text = "
N D type
1 6 A
3 7 B
7 8 A
4 5 B
7 10 A
4 6 B
1 7 A
6 8 B
7 9 B
6 10 A", header = TRUE)
nodes <- data.frame(id = unique(c(data$N, data$D)))
# Alternate the vertex types; rep_len() recycles c("A", "B") over all 9 vertices
nodes$type <- rep_len(c("A", "B"), nrow(nodes))
nodes
# No x/y coordinates are set, so plot() chooses a layout automatically
G <- graph_from_data_frame(data, vertices = nodes)
V(G)$color <- ifelse( V(G)$type == "A", "red", "green")
E(G)$color <- ifelse( E(G)$type == "A", "red","green")
edge_attr(G)
vertex_attr(G)
plot(G)

Contract vertices by attribute with igraph

I am working on a graph where each node has a "group" attribute with one of the following values: "Baby Product", "Book", "CE", "DVD", "Music", "Software", "Toy", "Video", "Video Games".
I would like to know how to plot a graph representing those communities: there should be 9 vertices, one for each group, and a link (possibly weighted) whenever two nodes of two categories are connected.
I have tried using the igraph contract function, but this is the result:
> contract(fullnet, mapping=as.factor(products$group), vertex.attr.comb = products$group)
Error in FUN(X[[i]], ...) :
Unknown/unambigous attribute combination specification
In addition: Warning message:
In igraph.i.attribute.combination(vertex.attr.comb) :
Some attributes are duplicated
I guess I have misunderstood what this function is used for.
Now I am thinking about creating a new edge list like the original one, but with the name of each vertex's group in place of its ID. Sadly, I do not know how to do this quickly on an edge list of over 1,200,000 elements.
Thank you very much in advance.
I think using contract() is the right approach. In the example code below, I pass an anonymous function as vertex.attr.comb to combine the vertex attributes within each group. Then simplify() removes the loop edges and sums the weights of the edges that were merged.
# Create example graph
set.seed(1)
g <- random.graph.game(10, 0.2)
V(g)$group <- rep(letters[1:3], times = c(3, 3, 4))
E(g)$weight <- 1:length(E(g))
E(g)
# + 9/9 edges from 7017c6a:
# [1] 2-- 3 3-- 4 4-- 7 5-- 7 5-- 8 7-- 8 3-- 9 2--10 9--10
E(g)$weight
# [1] 1 2 3 4 5 6 7 8 9
# Contract graph by `group` attribute of vertices
g1 <- contract(g, factor(V(g)$group),
vertex.attr.comb = function(x) levels(factor(x)))
# Remove loop edges and compute the sum of edge weight by group
g1 <- simplify(g1, edge.attr.comb = "sum")
E(g1)
# + 3/3 edges from a852397:
# [1] 1--2 1--3 2--3
E(g1)$weight
# [1] 2 15 12
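The alternative raised in the question, replacing each vertex ID in the edge list by its group name and then aggregating, can be sketched as follows. The graph is rebuilt from the nine edges printed above so that it does not depend on the random generator; the names el, grp_from, grp_to and group_edges are illustrative, and this has not been tuned for an edge list of over a million rows.

```r
library(igraph)

# Same edges, groups and weights as in the example above, written out explicitly
edges <- c(2, 3, 3, 4, 4, 7, 5, 7, 5, 8, 7, 8, 3, 9, 2, 10, 9, 10)
g <- make_graph(edges, n = 10, directed = FALSE)
V(g)$group <- rep(letters[1:3], times = c(3, 3, 4))
E(g)$weight <- seq_len(ecount(g))

# Replace each endpoint id by the group of that vertex
el <- ends(g, E(g), names = FALSE)
grp_from <- V(g)$group[el[, 1]]
grp_to <- V(g)$group[el[, 2]]

# Order each pair so that a-b and b-a count as the same undirected link
group_edges <- data.frame(
  from = pmin(grp_from, grp_to),
  to = pmax(grp_from, grp_to),
  weight = E(g)$weight,
  stringsAsFactors = FALSE
)

# Drop within-group edges, then sum the weights per pair of groups
group_edges <- group_edges[group_edges$from != group_edges$to, ]
group_edges <- aggregate(weight ~ from + to, data = group_edges, FUN = sum)
group_edges
```

The resulting data frame can be passed to graph_from_data_frame(group_edges, directed = FALSE) to obtain the group-level graph.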

r igraph find all cycles

I have a directed igraph graph and want to fetch all the cycles. The girth function works but only returns the smallest cycle. Is there a way in R to fetch all the cycles of length greater than 3 in a graph (no vertex pointing to itself and no loops)?
It is not directly a function in igraph, but of course you can code it up. To find a cycle, you start at some node, go to some neighboring node and then find a simple path back to the original node. Since you did not provide any sample data, I will illustrate with a simple example.
Sample data
## Sample graph
library(igraph)
set.seed(1234)
g = erdos.renyi.game(7, 0.29, directed=TRUE)
plot(g, edge.arrow.size=0.5)
Finding Cycles
Let me start with just one node and one neighbor. Node 2 connects to Node 4. So some cycles may look like 2 -> 4 -> (Nodes other than 2 or 4) -> 2. Let's get all of the paths like that.
v1 = 2
v2 = 4
lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p))
[[1]]
[1] 2 4 2
[[2]]
[1] 2 4 3 5 7 6 2
[[3]]
[1] 2 4 7 6 2
We see that there are three cycles starting at 2 with 4 as the second node. (I know that you said length greater than 3. I will come back to that.)
Now we just need to do that for every node v1 and every neighbor v2 of v1.
Cycles = NULL
for(v1 in V(g)) {
for(v2 in neighbors(g, v1, mode="out")) {
Cycles = c(Cycles,
lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p)))
}
}
This gives 17 cycles in the whole graph. There are two issues though that you may need to look at depending on how you want to use this. First, you said that you wanted cycles of length greater than 3, so I assume that you do not want the cycles that look like 2 -> 4 -> 2. These are easy to get rid of.
LongCycles = Cycles[which(sapply(Cycles, length) > 3)]
LongCycles has 13 cycles, the 4 short cycles having been eliminated:
2 -> 4 -> 2
4 -> 2 -> 4
6 -> 7 -> 6
7 -> 6 -> 7
But that list points out the other problem. There are still some cycles that you might think of as duplicates. For example:
2 -> 7 -> 6 -> 2
7 -> 6 -> 2 -> 7
6 -> 2 -> 7 -> 6
You might want to weed these out. To get just one copy of each cycle, you can always choose the vertex sequence that starts with the smallest vertex number. Thus,
LongCycles[sapply(LongCycles, min) == sapply(LongCycles, `[`, 1)]
[[1]]
[1] 2 4 3 5 7 6 2
[[2]]
[1] 2 4 7 6 2
[[3]]
[1] 2 7 6 2
This gives just the distinct cycles.
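To check the procedure on a graph small enough to verify by hand, here is a sketch on a four-edge directed graph. The graph and the variable names are made up for illustration. It contains one two-vertex cycle, 1 -> 2 -> 1, and one triangle, 1 -> 2 -> 3 -> 1.

```r
library(igraph)

# Hypothetical toy graph with edges 1->2, 2->3, 3->1, 2->1
g <- make_graph(c(1, 2, 2, 3, 3, 1, 2, 1), directed = TRUE)

cycles <- list()
for (v1 in seq_len(vcount(g))) {
  for (v2 in as.integer(neighbors(g, v1, mode = "out"))) {
    # Every simple path v2 -> ... -> v1 closes a cycle through the edge v1 -> v2
    paths <- all_simple_paths(g, from = v2, to = v1, mode = "out")
    cycles <- c(cycles, lapply(paths, function(p) c(v1, as.integer(p))))
  }
}

# Drop the back-and-forth cycles such as 1 -> 2 -> 1
long_cycles <- cycles[lengths(cycles) > 3]

# Keep one rotation per cycle: the one that starts at its smallest vertex
is_canonical <- vapply(long_cycles, function(cy) cy[1] == min(cy), logical(1))
distinct_cycles <- long_cycles[is_canonical]
```

Here cycles holds five vertex sequences, long_cycles the three rotations of the triangle, and distinct_cycles the single sequence 1 2 3 1.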
Addition regarding efficiency and scalability
Here is a much more efficient version of the code that I
originally provided. Its main purpose, however, is to argue
that, except for very simple graphs, you will not be able
to produce all cycles.
Here is some more efficient code. It eliminates checking many
cases that either cannot produce a cycle or will be eliminated
as a redundant cycle. In order to make it easy to run the tests
that I want, I made it into a function.
## More efficient version
FindCycles = function(g) {
Cycles = NULL
for(v1 in V(g)) {
if(degree(g, v1, mode="in") == 0) { next }
GoodNeighbors = neighbors(g, v1, mode="out")
GoodNeighbors = GoodNeighbors[GoodNeighbors > v1]
for(v2 in GoodNeighbors) {
TempCyc = lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p))
TempCyc = TempCyc[which(sapply(TempCyc, length) > 3)]
TempCyc = TempCyc[sapply(TempCyc, min) == sapply(TempCyc, `[`, 1)]
Cycles = c(Cycles, TempCyc)
}
}
Cycles
}
However, except for very simple graphs, there is a combinatorial
explosion of possible paths, so finding all possible cycles is
completely impractical. I will illustrate this with graphs much smaller
than the one that you mention in the comments.
First, I will start with some small graphs where the number of edges
is approximately twice the number of vertices. Code to generate my
examples is below but I want to focus on the number of cycles, so I
will just start with the results.
## ecount ~ 2 * vcount
Nodes Edges Cycles
10 21 15
20 41 18
30 65 34
40 87 424
50 108 3433
55 117 22956
But you report that your data has approximately 5 times as
many edges as vertices. Let's look at some examples like that.
## ecount ~ 5 * vcount
Nodes Edges Cycles
10 48 3511
12 61 10513
14 71 145745
With this as the growth of the number of cycles, using 10K nodes
with 50K edges seems to be out of the question. BTW, it took several
minutes to compute the example with 14 vertices and 71 edges.
For reproducibility, here is how I generated the above data.
set.seed(1234)
g10 = erdos.renyi.game(10, 0.2, directed=TRUE)
ecount(g10)
length(FindCycles(g10))
set.seed(1234)
g20 = erdos.renyi.game(20, 0.095 , directed=TRUE)
ecount(g20)
length(FindCycles(g20))
set.seed(1234)
g30 = erdos.renyi.game(30, 0.056 , directed=TRUE)
ecount(g30)
length(FindCycles(g30))
set.seed(1234)
g40 = erdos.renyi.game(40, 0.042 , directed=TRUE)
ecount(g40)
length(FindCycles(g40))
set.seed(1234)
g50 = erdos.renyi.game(50, 0.038 , directed=TRUE)
ecount(g50)
length(FindCycles(g50))
set.seed(1234)
g55 = erdos.renyi.game(55, 0.035 , directed=TRUE)
ecount(g55)
length(FindCycles(g55))
##########
set.seed(1234)
h10 = erdos.renyi.game(10, 0.55, directed=TRUE)
ecount(h10)
length(FindCycles(h10))
set.seed(1234)
h12 = erdos.renyi.game(12, 0.46, directed=TRUE)
ecount(h12)
length(FindCycles(h12))
set.seed(1234)
h14 = erdos.renyi.game(14, 0.39, directed=TRUE)
ecount(h14)
length(FindCycles(h14))

R: Calculating adjacent vertex after deletion of nodes

I'm very new to R and trying to calculate the adjacent vertices of a graph, which is obtained from deleting certain nodes from an original graph.
However, the output of the result doesn't match with the plot of the graph.
For example:
library(igraph)
g <- make_ring(8)
g <- add_edges(g, c(1,2, 2,7, 3,6, 4,5, 8,2, 6,2))
V(g)$label <- 1:8
plot(g)
h <- delete.vertices(g, c(1,2))
plot(h)
If I compute adjacent_vertices(h, 6), I get 5.
However, I want the output to be 3, 5, 7, as the plot shows. The problem is that igraph doesn't know I'm trying to find the adjacent vertices of the node labelled 6.
Could someone please help? Thanks.
The issue here is that when you delete the vertices, the indices of the remaining vertices are shifted down to the range 1 to 6:
> V(h)
+ 6/6 vertices:
[1] 1 2 3 4 5 6
To find the neighbors using the original vertex numbers, you could then simply offset the values by the number of vertices removed (here offset is 2; this only works because the deleted vertices had the lowest indices), e.g.:
> neighbors(h, 6 - offset) + offset
+ 3/6 vertices:
[1] 3 5 7
A better approach, however, would be to refer to the vertex labels instead of using the indices:
> V(g)$label
[1] 1 2 3 4 5 6 7 8
> V(h)$label
[1] 3 4 5 6 7 8
> V(h)[V(h)$label == 6]
+ 1/6 vertex:
[1] 4
To get the neighbors of your vertex of interest, you can modify your code to look like:
> vertex_of_interest <- V(h)[V(h)$label == 6]
> neighbors(h, vertex_of_interest)$label
[1] 3 5 7
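Another option along the same lines is to store the identifiers in the name vertex attribute, since igraph accepts vertex names wherever it accepts vertex indices and the names stay attached to their vertices after deletion. This is a sketch, assuming it is acceptable to use name here (the question only sets label); the variable nb is illustrative.

```r
library(igraph)

g <- make_ring(8)
g <- add_edges(g, c(1,2, 2,7, 3,6, 4,5, 8,2, 6,2))

# Assumption: identifiers stored in `name` (as strings) instead of `label`
V(g)$name <- as.character(1:8)

h <- delete_vertices(g, c("1", "2"))

# Look the vertex up by name, not by its shifted index
nb <- as_ids(neighbors(h, "6"))
nb
```

Because the lookup goes through the name, the result does not depend on which vertices were deleted or on how the indices shifted.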
