I would like to generate a lattice with 100 nodes but would like to ensure that all nodes have the same number of neighbours.
However when I do:
d=graph.lattice(100,0,nei=10,directed=TRUE,circular=TRUE)
get.edgelist(d)
then I can see that many of the nodes do not have the same number of neighbours.
Is there any way to ensure that every node has the same number of connections assuming that the first column represents nodes and the second column connections?
This is because the default edge directions for graph.lattice are not well suited to directed graphs. What you can do is create an undirected graph and then convert it to a directed one:
d <- as.directed(graph.lattice(100, 0, nei=10, directed=FALSE, circular=TRUE))
unique(degree(d, mode="in"))
# [1] 20
unique(degree(d, mode="out"))
# [1] 20
If you want non-mutual edges, then the easiest (but somewhat less readable) solution is
d <- graph(sapply(1:100, function(i) {
  rbind(i, ((i+1):(i+10)-1) %% 100 + 1)
}))
unique(degree(d, mode="in"))
# [1] 10
unique(degree(d, mode="out"))
# [1] 10
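To see why the modular arithmetic gives every node exactly 10 outgoing and 10 incoming edges, the same edge list can be built and tabulated in base R, without igraph. This is only a sketch; the names n, k, el, out.deg and in.deg are illustrative and not part of the answer above.

```r
n <- 100  # number of nodes on the ring
k <- 10   # each node points to its next k neighbours

# One row per edge: node i -> i+1, ..., i+k, wrapping around the ring
el <- do.call(rbind, lapply(1:n, function(i) {
  cbind(from = i, to = ((i + 1):(i + k) - 1) %% n + 1)
}))

# Count how often each node appears as a source and as a target
out.deg <- table(factor(el[, "from"], levels = 1:n))
in.deg  <- table(factor(el[, "to"],   levels = 1:n))

unique(as.vector(out.deg))  # 10
unique(as.vector(in.deg))   # 10
```

Because node i only ever points "forward" along the ring, no edge has a reverse counterpart, which is why the degrees are 10 rather than 20.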
You can also create an edge list and build the graph from that. Assuming that you only count the neighbours a node links to (its out-neighbours in a directed graph), you could do something like:
el <- do.call(rbind,
              lapply(1:100,
                     function(e) {cbind(rep(e, 10),
                                        sample(setdiff(1:100, e), 10))}))
d <- graph.edgelist(el)
This links each node to 10 randomly chosen nodes other than itself.
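One thing worth checking with the random approach is that it fixes only the number of outgoing links; the number of incoming links per node is left to chance. A minimal base R sketch of that check follows (the seed and the names out.deg and in.deg are arbitrary choices for illustration):

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable
n <- 100

# Same construction as above: 10 random targets per node, never itself
el <- do.call(rbind, lapply(1:n, function(e) {
  cbind(rep(e, 10), sample(setdiff(1:n, e), 10))
}))

out.deg <- tabulate(el[, 1], nbins = n)  # links leaving each node
in.deg  <- tabulate(el[, 2], nbins = n)  # links arriving at each node

all(out.deg == 10)  # TRUE: every node sends exactly 10 links
range(in.deg)       # in general a spread of values, not a single number
```

If equal in-degree matters as well, the deterministic ring construction is the safer choice.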
I recently began working with R for social network analysis. Everything has gone well, and up until now I have found answers to my questions here or on Google. But not this time!
I am trying to find a way to calculate "vertex reciprocity" (% of reciprocal edges of each actor of the network). On igraph, reciprocity(g) works fine to calculate the reciprocity of the whole network, but it doesn't help me with the score per actor. Does anybody know what I could do?
Thank you!
I am going to assume that you have a simple graph, that is, one with no loops and no multiple links between nodes. In that case this is fairly easy to compute. What does it mean for a link to be reciprocated? When there is a link from a to b, there is a link back from b to a, so there is a path of length two from a to itself: a->b->a. How many such paths are there? If A is the adjacency matrix, then the entries of the matrix product A %*% A give the number of paths of length two between each pair of nodes. We only want the ones from a node to itself, so we want the diagonal of A %*% A. This counts a->b->a as one path, but you want to count it twice: once for the link a->b and once for b->a. So for each node you get the number of reciprocated links from 2*diag(A %*% A) (note the matrix product %*%; in R, A*A would multiply element by element). You then divide by the total number of links to and from the node, which is just its degree.
Let me show the computation with an example. Since you do not provide any data, I will use the Enron email data that is available in the 'igraphdata' package. It has loops and multiple links, which I will remove. It also has a few isolated vertices, which I will also remove. That will leave us with a connected, directed graph with no loops.
library(igraph)
library(igraphdata)
data(enron)
enron = simplify(enron)
## remove two isolated vertices
enron = delete_vertices(enron, c(72,118))
Now the reciprocity computation is easy.
EnronAM = as.matrix(as_adjacency_matrix(enron))
Path2 = diag(EnronAM %*% EnronAM)
degree(enron)
VertRecip = 2*Path2 / degree(enron)
Let's check it by walking through one node in detail. I will use node number 1.
degree(enron,1)
[1] 10
ENDS = ends(enron, E(enron))
E(enron)[which(ENDS[,1] == 1)]
+ 6/3010 edges from b72ec54:
[1] 1-> 10 1-> 21 1-> 49 1-> 91 1->104 1->151
E(enron)[which(ENDS[,2] == 1)]
+ 4/3010 edges from b72ec54:
[1] 10->1 21->1 105->1 151->1
Path2[1]
[1] 3
Node 1 has degree 10: 6 edges out and 4 edges in. Path2[1] shows that there are three paths of length 2 from node 1 back to itself:
1->10->1
1->21->1
1->151->1
That makes 6 reciprocated links and 4 unreciprocated links. The vertex reciprocity should be 6/10 = 0.6 which agrees with what we computed above.
VertRecip[1]
[1] 0.6
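For a version that can be checked entirely by hand, the same formula can be run on a tiny made-up graph in base R. The three-node graph below (edges 1->2, 2->1, 1->3, 3->2) and the names A, path2, deg and vert.recip are illustrative assumptions, not part of the Enron data.

```r
# Adjacency matrix: A[i, j] = 1 means there is an edge i -> j
A <- matrix(0, 3, 3)
A[cbind(c(1, 2, 1, 3), c(2, 1, 3, 2))] <- 1

# Closed paths of length two (a -> b -> a) starting at each node
path2 <- diag(A %*% A)

# Total degree = out-degree + in-degree
deg <- rowSums(A) + colSums(A)

vert.recip <- 2 * path2 / deg
vert.recip
# [1] 0.6666667 0.6666667 0.0000000
```

Nodes 1 and 2 share the only mutual pair, so 2 of their 3 links are reciprocated; node 3 has no mutual pair and scores 0.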
Using igraph I can count the number of triangles a given vertex is part of, but I can't find a way to simply count the number of unique triangles within a network. For instance, we create a network that forms two distinct triangles: A-B-D and B-C-E.
library(igraph)
edges <- data.frame("vertex1" = c("A","A","B","B","B","C"),"vertex2"= c("B","D","D","C","E","E"))
example_graph <- graph_from_data_frame(edges, directed = FALSE)
If I run sum(count_triangles()) I get a result of 6
> sum(count_triangles(example_graph))
[1] 6
This makes sense because this is merely summing the number of triangles each vertex belongs to: A = 1, B = 2, C = 1, D = 1, E = 1.
However, we can see that there are only two distinct triangles:
> triangles(example_graph)
+ 6/5 vertices, named, from 9c62b6b:
[1] B A D B C E
Is there a way to count only unique triangles in the graph? So that I get an answer of 2 to the above? In my actual data I have thousands of vertices and a few million edges so calculating the number of unique triangles manually isn't an option. Should I simply use length(triangles(example_graph))/3 ?
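Since every triangle has exactly three corners, each one is counted three times in the per-vertex sum and contributes three entries to the vertex list, so dividing either by 3 should give the number of distinct triangles. As an independent cross-check on the small example, the count can also be obtained in base R from the adjacency matrix, using the fact that the diagonal of A^3 counts closed walks of length three. This is a sketch for verification only; a dense matrix product like this would not scale to millions of edges, and the names v, A, A3 and per.vertex are illustrative.

```r
edges <- data.frame(vertex1 = c("A", "A", "B", "B", "B", "C"),
                    vertex2 = c("B", "D", "D", "C", "E", "E"),
                    stringsAsFactors = FALSE)

# Symmetric 0/1 adjacency matrix of the undirected graph
v <- sort(unique(c(edges$vertex1, edges$vertex2)))
A <- matrix(0, length(v), length(v), dimnames = list(v, v))
A[cbind(edges$vertex1, edges$vertex2)] <- 1
A[cbind(edges$vertex2, edges$vertex1)] <- 1

# Each triangle through a vertex yields two closed walks of length 3
A3 <- A %*% A %*% A
per.vertex <- diag(A3) / 2
per.vertex
# A B C D E
# 1 2 1 1 1

sum(per.vertex) / 3  # 2 distinct triangles
```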
I have a directed graph (grafopri1fase1). It has no loops and has a tree structure (not a binary tree).
I have an array of nodes (meterdiretti) that I have extracted from the graph (grafopri1fase1) by matching a condition.
For each node in meterdiretti, I would like to know how many nodes lie below it in the tree.
The result I would like to have is a matrix with the following format:
first column------------ second column
meterdiretti[1] -------- total amount of nodes reachable starting from meterdiretti[1]
meterdiretti[2] -------- total amount of nodes reachable starting from meterdiretti[2]
....
meterdiretti[n] ----------total amount of nodes reachable starting from meterdiretti[n]
I am taking a punt at what you want; it would be good if you could add a reproducible example to your question.
I think what you want is to count the descendants of a node. You can do this with neighborhood.size and the mode="out" argument.
library(igraph)
# create a binary tree with 17 nodes
g <- graph.tree(17, children = 2)
plot(g, layout=layout.reingold.tilford)
# test on a single node; subtract 1 because the count includes the node itself
neighborhood.size(g, vcount(g), 1, "out") - 1
# [1] 16
# apply over a few nodes
neighborhood.size(g, vcount(g), c(1,4,7), "out") - 1
# [1] 16 4 2
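To make it clearer what is being counted, here is a base R sketch that walks the same 17-node binary tree level by level and counts descendants directly. It assumes the usual breadth-first numbering in which node i has children 2i and 2i+1; the names children and count.descendants are hypothetical helpers, not igraph functions, and the loop relies on the graph being a tree (no node is reachable twice).

```r
n <- 17

# children[[i]] holds the child IDs of node i (assumed layout: 2i and 2i+1)
children <- lapply(1:n, function(i) {
  ch <- c(2 * i, 2 * i + 1)
  ch[ch <= n]
})

# Count all nodes reachable by following child links from v
count.descendants <- function(v) {
  total <- 0
  frontier <- v
  while (length(frontier) > 0) {
    frontier <- unlist(children[frontier])  # move one level down
    total <- total + length(frontier)
  }
  total
}

sapply(c(1, 4, 7), count.descendants)
# [1] 16  4  2
```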
I have a table with shortest paths obtained with:
g<-barabasi.game(200)
geodesic.distr <- table(shortest.paths(g))
geodesic.distr
# 0 1 2 3 4 5 6 7
# 117 298 3002 2478 3342 3624 800 28
I then build a matrix with 100 rows and same number of columns as length(geodesic.distr):
geo<-matrix(0, nrow=100, ncol=length(unlist(labels(geodesic.distr))))
colnames(geo) <- unlist(labels(geodesic.distr))
Now I run 100 experiments where I create preferential attachment-based networks with
for(i in seq(1:100)){
  bar <- barabasi.game(vcount(g))
  geodesic.distr <- table(shortest.paths(bar))
  distance <- unlist(labels(geodesic.distr))
  for(ii in distance){
    geo[i,ii]<-WHAT HERE?
  }
}
and for each experiment, I'd like to store in the matrix how many paths I have found.
My question is: how to select the right column based on the column name? In my case, some names produced by the simulated network may not be present in the original one, so I need not only to find the right column by its name, but also the closest one (suppose my max value is 7, I may end up with a path of length 9 which is not present in the geo matrix, so I want to add it to the column named 7)
There is actually a problem with your approach. The length of the geodesic.distr table is stochastic, and you are allocating a matrix to store 100 realizations based on a single run. What if one of the 100 runs gives you a longer geodesic.distr vector? I assume you want to make the allocated matrix bigger in this case. Or, even better, you want to run the 100 realizations first and allocate the matrix once you know its size.
Another potential problem is that if you do table(shortest.paths(bar)), then you are (by default) considering undirected distances, so you will end up with a symmetric matrix and count all distances (except for self-distances) twice. This may or may not be what you want.
Anyway, here is a simple way, with the matrix allocated after the 100 runs:
dists <- lapply(1:100, function(x) {
  bar <- barabasi.game(vcount(g))
  table(shortest.paths(bar))
})
maxlen <- max(sapply(dists, length))
geo <- t(sapply(dists, function(d) c(d, rep(0, maxlen-length(d)))))
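The padding above relies on every table starting at distance 0 and having no gaps. If you would rather place each count by its column name, as the question asks, one possible sketch in base R is shown below. The two small hand-made tables stand in for the simulated distance tables, and the names all.names and maxdist are illustrative.

```r
# Two toy "distance tables" of different lengths
dists <- list(table(c(0, 1, 1, 2)),
              table(c(0, 1, 2, 2, 3)))

# Column names cover every distance from 0 up to the largest one seen
maxdist <- max(sapply(dists, function(d) max(as.integer(names(d)))))
all.names <- as.character(0:maxdist)

# Fill each row by name, so missing distances simply stay at 0
geo <- t(sapply(dists, function(d) {
  out <- setNames(numeric(length(all.names)), all.names)
  out[names(d)] <- as.vector(d)
  out
}))

geo
#      0 1 2 3
# [1,] 1 2 1 0
# [2,] 1 1 2 1
```

Indexing with names(d) is what selects the right column for each count, independent of its position in the table.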
Is there a method or a class in igraph to do this procedure quickly and effectively?
Let's assume that your graph is in g and the set of vertices to be used is in sampled (which is a vector consisting of zero-based vertex IDs).
First, we select the set of edges where at least one endpoint is in sampled:
all.vertices <- (1:vcount(g)) - 1
es <- E(g) [ sampled %--% all.vertices ]
es is now an "edge sequence" object that consists of the edges of interest. Next, we take the edge list of the graph (which is an m x 2 matrix) and select the rows corresponding to the edges:
el <- get.edgelist(g)[as.vector(es)+1, ]
Here, as.vector(es) converts the edge sequence into a vector of the edge IDs it contains, which we then use to select the corresponding rows of the edge list (the trailing comma is what selects whole rows). Note that we had to add 1 to the edge IDs because R vectors are indexed from 1 but igraph edge IDs start from zero.
Next, we construct the result from the edge list. graph() expects the endpoints as one flat vector of consecutive pairs, hence the transpose:
g1 <- graph(t(el), vcount(g), directed=is.directed(g))
Note that g1 will contain exactly as many vertices as g. You can take the subgraph consisting of the sampled vertices as follows:
g1 <- subgraph(g1, sampled)
Note to users of igraph 0.6 and above: igraph 0.6 will switch to 1-based indexing instead of 0-based, so there is no need to subtract 1 from all.vertices and there is no need to add 1 to as.vector(es). Furthermore, igraph 0.6 will contain a function called subgraph.edges, so one could simply use this:
g1 <- subgraph.edges(g, es)
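The selection step itself, keeping every edge with at least one endpoint in the sampled set, can be illustrated in base R on a plain edge list. This is only a sketch of the idea with 1-based IDs; the five-edge ring and the name keep are made up for the example.

```r
# A small ring 1-2-3-4-5-1 as a two-column edge list (1-based vertex IDs)
el <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 1))
sampled <- c(1, 2)

# TRUE for edges that touch at least one sampled vertex
keep <- el[, 1] %in% sampled | el[, 2] %in% sampled

el[keep, , drop = FALSE]
#      [,1] [,2]
# [1,]    1    2
# [2,]    2    3
# [3,]    5    1
```

Using drop = FALSE keeps the result a two-column matrix even when only one edge is selected.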