I have a network (g) with thousands of clusters, however I can't seem to figure out how to order them by size. It looks like the membership attribute sorts clusters somewhat arbitrarily. For example:
c <- clusters(g)
c$membership
gs <- induced.subgraph(g, c$membership==1)
This will indeed give me the largest cluster, but if I try
gs <- induced.subgraph(g, c$membership==2)
It doesn't give me the second largest cluster, but an arbitrary cluster that happens to be second in the list.
Is there a way to order c$membership according to cluster size, i.e., 1 – largest, 2 – second largest, etc.?
You could do it this way:
# largest subgraph
gs <- induced.subgraph(g, c$membership==order(-c$csize)[1])
# second largest subgraph
gs <- induced.subgraph(g, c$membership==order(-c$csize)[2])
# etc...
Here's a working example.
library(igraph)
g <- graph.full(5) %du% graph.full(4) %du% graph.full(3)
set.seed(1) # for reproducible plots
par(mar=c(0,0,0,0),mfrow=c(1,2))
plot(g)
c <- clusters(g)
gs <- induced.subgraph(g, c$membership==order(-c$csize)[1])
plot(gs)
Related
I have an igraph network object constructed in R and generated weight information for each edge. I want to see the nodes of the most weighted edges (descending). What codes should I use to do that? Thank you!
# create an igraph project of user interaction network and check descriptives.
library(igraph)
#edge list
EL = read.csv("(file path omitted)user_interaction_structure.csv")
head(EL)
#node list: I do not have a node list
#construct an igraph oject
g <- graph_from_data_frame(EL, directed = TRUE, vertices = NULL)
#check the edge and node number of the network
gsize(g)
vcount(g)
#check nodes based on degree (descending)
deg <- igraph::degree(g)
dSorted <-sort.int(deg,decreasing=TRUE,index.return=FALSE)
dSorted
#check edges based on weight
E(g)
#the network will contain loop edges and multiple edges
#simplify multiple edges
g_simple <- graph.adjacency(get.adjacency(g),weighted=TRUE)
#check edge weight
E(g_simple)$weight
#igraph can generate a matrix
g_simple[]
Then I wanted to see who were interacting heavily with whom (the nodes of the edges with the largest weight),so I tried
e_top_weights <- order(order(E(g_simple))$weight, decreasing=TRUE)
but it did not work.
I think what you want is the igraph function strength(), which gives the sum of the weights of the edges incident to each node. Here's an example:
library(igraph)
# A small graph we can visualize
g <- make_ring(5)
# Assign each edge an increasing weight, to make things
# easy
edgeweights<- 1:ecount(g)
E(g)$weight <- edgeweights
# The strength() function sums the weights of edges incident
# to each node
strengths <- strength(g)
# We can collect the top two strengths by sorting the
# strengths vector, then asking for which elements of the
# strengths vector are equal to or greater than the second
# largest element.
toptwo <- which(strengths >= sort(strengths, decreasing = TRUE)[2])
## [1] 4 5
# Assign nodes a color blue that is more saturated when nodes
# have greater strength.
cr <- colorRamp(c(rgb(0,0,1,.1), rgb(0,0,1,1)), alpha = TRUE)
colors <- cr(strengths/max(strengths))
V(g)$color <- apply(colors, 1, function(row) rgb(row[1], row[2], row[3], row[4], maxColorValue = 255))
# Plot to confirm
plot(g, edge.width = edgeweights)
Edit
Here are two different ways to find the two nodes (the "from" node and the "to" node) which are the ends of the edge with the maximum weight:
## 1
edge_df <- as_data_frame(g, "edges")
edge_df[which(edge_df$weight == max(edge_df$weight)), c("from", "to")]
## 2
max_weight_edge <- E(g)[which(E(g)$weight == max(E(g)$weight))]
ends(g, es = max_weight_edge)
I have a few large igraph objects that represent social networks. All nodes have various attributes, among them sector which is a factor variable. I have contracted this large network into a small where vertices represent groups and edges have the sum of individual edges in the original network. The label attribute in the second network represents the sector attribute in the first.
groupnet <- contract(g, as.integer(as.factor(V(g)$sector)), "ignore")
E(groupnet)$weight <- 1
groupnet <- simplify(groupnet, edge.attr.comb = list(weight = "sum"))
V(groupnet)$label <- levels(as.factor(V(g)$sector))
I would like to add another attribute to the second object V(groupnet)$groupsize that represents the number of original vertices that were contracted into groupnet. I have tried it with the following code but it did not work:
V(groupnet)$groupsize <- length(V(g)$sector[V(g)$sector == V(groupnet)$label])
How can I do this properly?
table() could be helpful here. Try out:
set.seed(1234)
library(igraph)
g <- make_ring(1000)
V(g)$sector <- factor(sample(LETTERS, 100, replace = T))
V(g)$sector
## contracted network
groupnet <- contract(g, as.integer(as.factor(V(g)$sector)), "ignore")
E(groupnet)$weight <- 1
V(groupnet)$label <- levels(as.factor(V(g)$sector))
## number of original vertices that were contracted into groupnet
# the tip is to see that table(V(g)$sector) provides the number of vertices per sector and
# its output is also arranged like V(groupnet)
table(V(g)$sector)
V(groupnet)
# solution
V(groupnet)$groupsize <- as.numeric(table(V(g)$sector))
I have a graph G(V,E) unweighted, undirected and connected graph with 12744 nodes and 166262 edges. I have a set of nodes (sub_set) that is a subset of V. I am interested in extracting the smallest connected subgraph where sub_set is a part of this new graph. I have managed to get a subgraph where my subset of nodes is included but I would like to know if there is a way to minimise the graph.
Here is my code (adapted from http://sidderb.wordpress.com/2013/07/16/irefr-ppi-data-access-from-r/)
library('igraph')
g <- erdos.renyi.game(10000, 0.003) #graph for illustrating my propose
sub_set <- sample(V(g), 80)
order <- 1
edges <- get.edges(g, 1:(ecount(g)))
neighbours.vid <- unique(unlist(neighborhood(g, order, which(V(g) %in% sub_set))))
rel.vid <- edges[intersect(which(edges[,1] %in% neighbours.vid), which(edges[,2] %in% neighbours.vid)),]
rel <- as.data.frame(cbind(V(g)[rel.vid[,1]], V(g)[rel.vid[,2]]), stringsAsFactors=FALSE)
names(rel) <- c("from", "to")
subgraph <- graph.data.frame(rel, directed=F)
subgraph <- simplify(subgraph)
I have read this post
minimum connected subgraph containing a given set of nodes, so I guess that my problem could be "The Steiner Tree problem", is there any way to try to find a suboptimal solution using igraph?
Not sure if that's what you meant but
subgraph<-minimum.spanning.tree(subgraph)
produces a graph with the minimum number of edges in which all nodes stay connected in one component.
I have a weighted, undirected network with weights 1 and 2. I need to choose the neighbors of all the vertices that are up to 5 step away. However the 5 steps should include 2 edges with weight=2. For example, if all the 5 edges have weight 1, these neighbors should be excluded.
Question: How do I choose neighbors that are connected with specific edge weights?
Code:
matrix= matrix(as.integer(runif(100,0,3)), 10, 10)
matrix
ntwrk=graph.adjacency(matrix,weighted=TRUE, mode="undirected")
neighborhood(ntwrk,5)
Now, I need to figure out which one of those include edges with weight=2. Then, I need to keep only those neighbors, and measure the neighborhood size with neighborhood.size
The result of neighborhood is a list of of list of edges. You can use lapply and filter each list using the attribute weight.
res.tokeep <- lapply(res, function(x) which(E(ntwrk)[x]$weight==2))
Here a complete example , where I plot the graph before and after the weight filter.
library(igraph)
set.seed(1)
mat = matrix(as.integer(runif(10*10,0,3)), 10, 10)
ntwrk=graph.adjacency(mat,weighted=TRUE, mode="undirected")
res <- neighborhood(ntwrk,5)
op <- par(mfrow=c(1,2))
E(ntwrk)$label <- E(ntwrk)$weight
plot(ntwrk)
res.tokeep <- lapply(res, function(x) which(E(ntwrk)[x]$weight==2))
res.todelete <- lapply(res, function(x) which(E(ntwrk)[x]$weight!=2))
ntwrk <- delete.vertices(ntwrk, unique(unlist(res.todelete)))
plot(ntwrk)
par(mfrow=op)
I think have this working correctly, but I am looking to mimic something similar to Facebook's Friend suggestion. Simply, I am looking to find 2nd degree connections (friends of your friends that you do not have a connection with). I do want to keep this as a directed graph and identify the 2nd degree outward connections (the people your friends connect to).
I believe my dummy code is achieving this, but since the reference is on indices and not vertex labels, I was hoping you could help me modify the code to return useable names.
### create some fake data
library(igraph)
from <- sample(LETTERS, 50, replace=T)
to <- sample(LETTERS, 50, replace=T)
rel <- data.frame(from, to)
head(rel)
### lets plot the data
g <- graph.data.frame(rel)
summary(g)
plot(g, vertex.label=LETTERS, edge.arrow.size=.1)
## find the 2nd degree connections
d1 <- unlist(neighborhood(g, 1, nodes="F", mode="out"))
d2 <- unlist(neighborhood(g, 2, nodes="F", mode="out"))
d1;d2;
setdiff(d2,d1)
Returns
> setdiff(d2,d1)
[1] 13
Any help you can provide will be great. Obviously I am looking to stay within R.
You can index back into the graph vertices like:
> V(g)[setdiff(d2,d1)]
Vertex sequence:
[1] "B" "W" "G"
Also check out ?V for ways to get at this type of info through direct indexing.
You can use the adjacency matrix $G$ of the graph $g$ (no latex here?). One of the properties of the adjacency matrix is that its nth power gives you the number of $n$-walks (paths of length n).
G <- get.adjacency(g)
G2 <- G %*% G # G2 contains 2-walks
diag(G2) <- 0 # take out loops
G2[G2!=0] <- 1 # normalize G2, not interested in multiplicity of walks
g2 <- graph.adjacency(G2)
An edge in graph g2 represents a "friend-of-a-friend" bond.