I am trying to improve my understanding of eigenvector centrality. This overview from the University of Washington was very helpful, especially when read in conjunction with this R code. However, when I use evcent(graph_from_adjacency_matrix(A)), the result differs.
Here is the code:
library(matrixcalc)
library(igraph)
# specify the adjacency matrix
A <- matrix(c(0,1,0,0,0,0,
1,0,1,0,0,0,
0,1,0,1,1,1,
0,0,1,0,1,0,
0,0,1,1,0,1,
0,0,1,0,1,0 ),6,6, byrow= TRUE)
EV <- eigen(A) # compute eigenvalues and eigenvectors
max(EV$values) # find the maximum eigenvalue
centrality <- data.frame(EV$vectors[,1])
names(centrality) <- "Centrality"
print(centrality)
B <- A + diag(6) # Add self loops
EVB <- eigen(B) # compute eigenvalues and eigenvectors
# they are the same as EV(A)
c <- matrix(c(2,3,5,3,4,3)) # Degree of each node + self loop
ck <- function(k){
n <- (k-2)
B_K <- B # B is the original adjacency matrix, w/ self-loops
for (i in 1:n){
B_K <- B_K%*%B #
#print(B_K)
}
c_k <- B_K%*%c
return(c_k)
}
# derive EV centrality as k -> infinity
# k = 100
ck(100)/frobenius.norm(ck(100)) # .09195198, .2487806, .58115487, .40478177, .51401731, .40478177
# Does igraph match?
evcent(graph_from_adjacency_matrix(A))$vector # No: 0.1582229 0.4280856 1.0000000 0.6965127 0.8844756 0.6965127
The rank correlation is the same, but it is still bothersome that the values are not the same. What is going on?
The result returned by igraph is not wrong, but note that there are subtleties to defining eigenvector centrality, and not all implementations handle self-loops in the same way.
Please see what I wrote here.
One way to define eigenvector centrality is simply as "the leading eigenvector of the adjacency matrix". But this is imprecise without specifying what the adjacency matrix is, especially what its diagonal elements should be when there are self-loops present. Depending on application, diagonal entries of the adjacency matrix of an undirected graph are sometimes defined as the number of self-loops, and sometimes as twice the number of self-loops. igraph uses the second definition when computing eigenvector centrality. This is the source of the difference you see.
A more intuitive definition of eigenvector centrality is that the centrality of each vertex is proportional to the sum of its neighbours' centralities. Thus the details of the computation hinge on who the neighbours are. Consider a single vertex with a self-loop. It is its own neighbour, but how many times? We can traverse the self-loop in both directions, so it is reasonable to say that it is its own neighbour twice. Indeed, its degree is conventionally taken to be 2, not 1.
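To make the two diagonal conventions concrete, here is a minimal sketch in base R. The three-vertex path with a self-loop on vertex 1 and the helper name leading_vector are illustrative assumptions, not taken from the question. The sketch only compares the leading eigenvectors of the two candidate adjacency matrices; it does not call igraph.

```r
# Hypothetical example graph: path 1 - 2 - 3, plus one self-loop on vertex 1.
# Convention 1: a self-loop contributes 1 to the diagonal.
A_once <- matrix(c(1, 1, 0,
                   1, 0, 1,
                   0, 1, 0), 3, 3, byrow = TRUE)
# Convention 2: a self-loop contributes 2 to the diagonal.
A_twice <- A_once
A_twice[1, 1] <- 2

# Leading eigenvector, rescaled so that its largest entry is 1.
leading_vector <- function(M) {
  e <- eigen(M, symmetric = TRUE)  # eigenvalues come back in decreasing order
  v <- abs(e$vectors[, 1])         # entries share one sign for a connected graph
  v / max(v)
}

v_once  <- leading_vector(A_once)
v_twice <- leading_vector(A_twice)
print(rbind(v_once, v_twice))
```

The two rows differ, so which convention a package uses changes the reported centralities whenever self-loops are present.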
You will find that different software packages treat self-loops differently when computing the eigenvector centrality. In igraph, we made a choice by looking at the intuitive interpretation of eigenvector centrality rather than rigidly following a formal definition, with no regard for the motivation behind that definition.
Note: What I wrote above refers to how eigenvector centrality computations work internally, not to what as_adjacency_matrix() returns. as_adjacency_matrix() adds one (not two) to the diagonal for each self-loop.
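Separately from the self-loop convention, the two vectors quoted in the question are normalised differently: the power-iteration result is divided by its Frobenius (Euclidean) norm, while the igraph values have maximum 1. A quick check in base R, using only the numbers quoted in the question, and assuming the sixth power-iteration value equals the fourth (0.40478177), since vertices 4 and 6 are symmetric in A:

```r
# Values copied from the question (sixth entry assumed equal to the fourth).
power_iter <- c(0.09195198, 0.2487806, 0.58115487,
                0.40478177, 0.51401731, 0.40478177)
igraph_out <- c(0.1582229, 0.4280856, 1.0000000,
                0.6965127, 0.8844756, 0.6965127)

# Rescale so that the largest entry is 1, as in the igraph output.
rescaled <- power_iter / max(power_iter)
print(round(rescaled, 4))
print(max(abs(rescaled - igraph_out)))  # largest entrywise difference
```

If the printed difference is at rounding level, the two results describe the same centrality vector up to scaling.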
I've read that betweenness centrality is defined as the number of times a vertex lies on the shortest path of the other pairs of nodes.
However, in case weights have a positive meaning (i.e. the more the weight of an edge the merrier), then how does one define betweenness centrality?
In this case, is there another way to calculate betweenness centrality? Or is it simply interpreted in a different way?
Computing the betweenness centrality of a vertex v relies on the following fraction, for any u and w: s(u,w,v) / s(u,w) where s(u,w,v) is the number of shortest paths between u and w that involve v, and s(u,w) is the total number of shortest paths between u and w.
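As a rough illustration of this fraction, here is a brute-force sketch in base R for a small unweighted, undirected graph. The helper names bfs_counts and betweenness_by_definition and the example 4-cycle are assumptions made for this sketch. It relies on the identity s(u,w,v) = s(u,v) * s(v,w) when d(u,v) + d(v,w) = d(u,w), and s(u,w,v) = 0 otherwise.

```r
# Breadth-first search from s: distances and number of shortest paths to each vertex.
bfs_counts <- function(A, s) {
  n <- nrow(A)
  dist <- rep(Inf, n)
  sigma <- rep(0, n)
  dist[s] <- 0
  sigma[s] <- 1
  frontier <- s
  while (length(frontier) > 0) {
    nxt <- integer(0)
    for (u in frontier) {
      for (w in which(A[u, ] != 0)) {
        if (is.infinite(dist[w])) {          # first time w is reached
          dist[w] <- dist[u] + 1
          nxt <- c(nxt, w)
        }
        if (dist[w] == dist[u] + 1) {        # u precedes w on a shortest path
          sigma[w] <- sigma[w] + sigma[u]
        }
      }
    }
    frontier <- nxt
  }
  list(dist = dist, sigma = sigma)
}

# Sum of s(u,w,v) / s(u,w) over unordered pairs {u, w} not containing v.
betweenness_by_definition <- function(A) {
  n <- nrow(A)
  info <- lapply(seq_len(n), function(s) bfs_counts(A, s))
  b <- numeric(n)
  for (v in seq_len(n)) for (u in seq_len(n)) for (w in seq_len(n)) {
    if (u >= w || u == v || w == v) next
    d_uw <- info[[u]]$dist[w]
    if (is.infinite(d_uw)) next              # u and w are not connected
    if (info[[u]]$dist[v] + info[[v]]$dist[w] == d_uw) {
      b[v] <- b[v] + info[[u]]$sigma[v] * info[[v]]$sigma[w] / info[[u]]$sigma[w]
    }
  }
  b
}

# Example: a 4-cycle 1 - 2 - 3 - 4 - 1.
C4 <- matrix(c(0, 1, 0, 1,
               1, 0, 1, 0,
               0, 1, 0, 1,
               1, 0, 1, 0), 4, 4, byrow = TRUE)
b_c4 <- betweenness_by_definition(C4)
print(b_c4)
```

In the 4-cycle, each pair of opposite vertices has two shortest paths, and each of the other two vertices lies on one of them, so every vertex collects 1/2.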
With positive edge weights, I would suggest that you count each shortest path with its own weight: replace s(u,w,v) by the sum of weights of shortest paths between u and w that involve v; and s(u,w) by the sum of weights of all shortest paths between u and w.
Then, you have to define the weight of paths, and this depends on what you have in mind. You may for instance consider the sum of edge weights, their product, their minimum or maximal value, etc.
Warning: this definition still relies on shortest unweighted paths; if longer paths with higher weights exist, they will be ignored, which means that graph structure prevails. This may not be satisfactory.
Note: if edges have integer weights and the weight of a path is the product of its edge weights, this approach is roughly equivalent to using the classical definition on a multigraph (an unweighted graph in which several edges may join the same pair of vertices).
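A minimal sketch of this suggestion in base R, under one specific choice: the weight of a path is taken to be the product of its edge weights. The helper names weighted_counts and weighted_betweenness and the example weights are assumptions. Shortest paths are still determined by hop count, as the warning above points out.

```r
# BFS by hop count; wsum[w] is the sum, over all shortest paths from s to w,
# of the product of edge weights along the path.
weighted_counts <- function(W, s) {
  n <- nrow(W)
  dist <- rep(Inf, n)
  wsum <- rep(0, n)
  dist[s] <- 0
  wsum[s] <- 1
  frontier <- s
  while (length(frontier) > 0) {
    nxt <- integer(0)
    for (u in frontier) {
      for (w in which(W[u, ] > 0)) {
        if (is.infinite(dist[w])) {
          dist[w] <- dist[u] + 1
          nxt <- c(nxt, w)
        }
        if (dist[w] == dist[u] + 1) {
          wsum[w] <- wsum[w] + wsum[u] * W[u, w]
        }
      }
    }
    frontier <- nxt
  }
  list(dist = dist, wsum = wsum)
}

# Betweenness where each shortest path counts with its product weight.
weighted_betweenness <- function(W) {
  n <- nrow(W)
  info <- lapply(seq_len(n), function(s) weighted_counts(W, s))
  b <- numeric(n)
  for (v in seq_len(n)) for (u in seq_len(n)) for (w in seq_len(n)) {
    if (u >= w || u == v || w == v) next
    d_uw <- info[[u]]$dist[w]
    if (is.infinite(d_uw)) next
    if (info[[u]]$dist[v] + info[[v]]$dist[w] == d_uw) {
      b[v] <- b[v] + info[[u]]$wsum[v] * info[[v]]$wsum[w] / info[[u]]$wsum[w]
    }
  }
  b
}

# Example: a 4-cycle 1 - 2 - 3 - 4 - 1 where edge 1-2 has weight 3, the others weight 1.
W <- matrix(c(0, 3, 0, 1,
              3, 0, 1, 0,
              0, 1, 0, 1,
              1, 0, 1, 0), 4, 4, byrow = TRUE)
wb <- weighted_betweenness(W)
print(wb)
```

Because the edge weights here are integers, the result should coincide with the classical betweenness of the multigraph in which the edge 1-2 is tripled, in line with the note above.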
Is there a graph algorithm for solving the following problem:
Given a weighted undirected graph G (all weights are positive), a start node N and a total weight W*. Generate a random cycle through the graph, starting and ending at node N, of which the total weight (the summed weight of all the edges) approximates the given weight W*.
One could see this as generating the cycle that best approximates W*, but generating a cycle that approximates W* within some margin of error is also fine.
If you want a simple cycle, you want an approximation algorithm for the travelling salesman problem. I believe there are known hardness results, indicating that this is NP-hard for general graphs, but there is a wide range of heuristics; you can check the literature.
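One very simple heuristic, sketched below in base R, is random sampling: grow random simple paths from N, and whenever the current vertex is adjacent to N, record the cycle obtained by closing the path. This is a naive illustration, not one of the TSP approximation algorithms mentioned above, and it comes with no approximation guarantee. The function name approx_cycle, the parameter tries, and the example graph are assumptions.

```r
# Sample random simple paths from `start`; keep the closed cycle whose total
# weight is closest to `target`. W is a symmetric weight matrix (0 = no edge).
approx_cycle <- function(W, start, target, tries = 500) {
  best <- NULL
  best_err <- Inf
  for (t in seq_len(tries)) {
    path <- start
    total <- 0
    repeat {
      cur <- path[length(path)]
      # A simple cycle needs at least 3 distinct vertices before closing.
      if (length(path) >= 3 && W[cur, start] > 0) {
        err <- abs(total + W[cur, start] - target)
        if (err < best_err) {
          best_err <- err
          best <- c(path, start)
        }
      }
      nbrs <- setdiff(which(W[cur, ] > 0), path)  # unvisited neighbours only
      if (length(nbrs) == 0) break
      nxt <- nbrs[sample.int(length(nbrs), 1)]
      total <- total + W[cur, nxt]
      path <- c(path, nxt)
    }
  }
  list(cycle = best, error = best_err)
}

# Example: a 4-cycle with edge weights 3, 1, 1, 1 (total weight 6).
W <- matrix(c(0, 3, 0, 1,
              3, 0, 1, 0,
              0, 1, 0, 1,
              1, 0, 1, 0), 4, 4, byrow = TRUE)
set.seed(1)
res <- approx_cycle(W, start = 1, target = 6)
print(res)
```

Accepting any cycle whose error falls below a chosen margin, instead of keeping only the best one, would give the "within some margin of error" variant from the question.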
Hi, so basically let's say I have a network (A) and I want to find its betweenness centrality.
I used: centr_betw(graph, directed = FALSE, normalized = TRUE)
This returned every node with the value:
[1] 1.827102e+04 3.554450e+04 5.000000e-01 9.524383e+04
[5] 0.000000e+00 0.000000e+00 1.078184e+05 4.768125e+04
I really want to know what these numbers mean.
It also shows the betweenness centralization of the whole network and a max value. Let's say the network (A) as a whole has a betweenness centrality of 0.04. What can you say about this network (A) when it is compared to a random network with a betweenness centrality of 0.001?
MUCH THANKS GUYS
Quite a bit of information can be found simply if you type ?centr_betw. In particular, centr_betw returns a list of three components: res, centralization, theoretical_max.
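For instance, the three components can be inspected as follows. The star graph is an illustrative assumption, not the asker's network; in a star every shortest path between two leaves passes through the centre, so the centre should carry all of the betweenness.

```r
library(igraph)

# Illustrative graph (an assumption): an undirected star on 5 vertices.
g <- make_star(5, mode = "undirected")

out <- centr_betw(g, directed = FALSE, normalized = TRUE)
names(out)            # names of the returned components
out$res               # one betweenness score per vertex
out$centralization    # graph-level centralization
out$theoretical_max   # denominator used for the normalisation
```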
Each element of res is the betweenness centrality of a corresponding vertex i computed in this manner. Specifically, given a shortest path between some vertices j and k (not equal to i), i is considered to be more central if this shortest path includes i. Going over all possible pairs of j and k we can find this betweenness centrality of i.
Further, centralization and theoretical_max concern the Freeman centralization. centralization is C_x, which measures how central the network's most central vertex is in relation to how central all the other vertices are. theoretical_max is the denominator of C_x, providing the maximal possible value of the numerator across all networks with the same number of vertices.
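A small base R sketch of how C_x is assembled from the per-vertex scores. It assumes an undirected graph with raw (unnormalised) betweenness values, and that the numerator is maximised by a star, which gives (n - 1)^2 (n - 2) / 2 for n vertices. The helper name freeman_centralization is hypothetical.

```r
# Hypothetical helper: Freeman centralization from raw undirected betweenness scores.
freeman_centralization <- function(b) {
  n <- length(b)
  numerator <- sum(max(b) - b)                  # gap to the most central vertex
  # Assumed maximum of the numerator, attained by a star on n vertices.
  theoretical_max <- (n - 1)^2 * (n - 2) / 2
  numerator / theoretical_max
}

star_scores  <- c(6, 0, 0, 0, 0)  # star on 5 vertices: centre lies on choose(4, 2) = 6 paths
cycle_scores <- rep(0.5, 4)       # 4-cycle: every vertex is equally central

freeman_centralization(star_scores)
freeman_centralization(cycle_scores)
```

The star sits at one extreme (one vertex dominates) and the cycle at the other (no vertex stands out), which is the scale on which values such as 0.04 and 0.001 are being compared.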
So, if network A has Freeman centralization 0.04 and network B has 0.001, then we may say that the most central vertex of A is significantly more central than the most central vertex of B. If B is random (i.e., Erdős-Rényi), then that makes sense, because in a big enough network all vertices should play a pretty similar role.