What is the computational complexity of breadth-first and depth-first traversal in terms of the number of vertices v and the number of edges e when the graph is represented as an adjacency matrix?
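The complexity is O(v^2), independent of e: finding the neighbors of a single vertex means scanning its entire row of the adjacency matrix, which takes O(v) time, and this scan is done once for each of the v vertices.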
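A minimal Python sketch of breadth-first search over an adjacency matrix makes the bound concrete. The function name bfs_matrix and the list-of-lists representation are assumptions for illustration, not something given in the question.

```python
from collections import deque

def bfs_matrix(adj, start):
    """Breadth-first traversal of a graph stored as an adjacency matrix.

    adj is a square list of lists with adj[u][w] truthy when u and w are adjacent
    (an assumed representation for this sketch).
    """
    n = len(adj)
    visited = [False] * n
    visited[start] = True
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        # The whole row is scanned, so this loop costs O(v) per dequeued vertex
        # no matter how few neighbours u actually has.
        for w in range(n):
            if adj[u][w] and not visited[w]:
                visited[w] = True
                queue.append(w)
    return order

# Small example graph with edges 0-1, 0-2, 1-3.
example = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
bfs_order = bfs_matrix(example, 0)
```

Each vertex is dequeued at most once and each dequeue scans v matrix entries, which gives the O(v^2) total. Depth-first search has the same row scan, so the same bound applies.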
How is decay centrality defined for a bipartite graph? I am unable to find a clear definition. All I found is https://www.centiserver.org/centrality/Decay_Centrality/, which wasn't really helpful.
Also, is there a good implementation of decay centrality for graphs in Python? I managed to find only networkx (https://networkx.org/documentation/stable/index.html), and it does not have decay centrality, though it does have the other common centrality measures such as degree, closeness, betweenness, and eigenvector centrality.
The definition of decay centrality given on the website you linked should work for bipartite and nonbipartite graphs alike.
Here’s how decay centrality is computed. Given a node v whose centrality you’re interested in, your first step is to pick a parameter δ between 0 and 1. The closer δ is to zero, the more emphasis you’ll place on nodes close to v. The closer δ is to 1, the more emphasis you place on nodes further from v.
Next, compute the distance from v to each other node x in the graph. (I’m not specifically familiar with networkx, but this should be easy to compute via breadth-first search if the graph doesn’t have edge weights, Dijkstra’s algorithm if the graph has edge weights and they’re nonnegative, or the Bellman-Ford algorithm if the graph has edge weights which can be negative.) Notationally, let’s have d(v, x) denote the distance from v to x that you computed.
Finally, for each node x other than v, compute δ^d(v, x) (that is, δ raised to the power d(v, x)), and add up all those values. That final number is the decay centrality for v.
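The three steps above can be sketched in Python for an unweighted graph. This is only an illustration: the function name decay_centrality and the dict-of-neighbours representation are assumptions, not a networkx API, and nodes that cannot be reached from v are simply skipped (treated as contributing 0).

```python
from collections import deque

def decay_centrality(graph, v, delta):
    """Decay centrality of node v in an unweighted graph.

    graph maps each node to an iterable of its neighbours (assumed format).
    delta is the decay parameter, expected to lie strictly between 0 and 1.
    """
    # Breadth-first search gives d(v, x) for every node x reachable from v.
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # Sum delta ** d(v, x) over all reachable nodes x other than v.
    return sum(delta ** d for x, d in dist.items() if x != v)

# Path graph 0 - 1 - 2, which is also bipartite.
path = {0: [1], 1: [0, 2], 2: [1]}
middle = decay_centrality(path, 1, 0.5)   # 0.5 + 0.5
end = decay_centrality(path, 0, 0.5)      # 0.5 + 0.25
```

Nothing in the computation depends on the graph being bipartite or not, which matches the remark that the definition applies to both cases. For weighted graphs the breadth-first search would have to be swapped for Dijkstra or Bellman-Ford as described above.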
I am trying to improve my understanding of eigenvector centrality. This overview from the University of Washington was very helpful, especially when read in conjunction with this R code. However, when I use evcent(graph_from_adjacency_matrix(A)), the result differs.
The below code
library(matrixcalc)
library(igraph)
# specify the adjacency matrix
A <- matrix(c(0,1,0,0,0,0,
1,0,1,0,0,0,
0,1,0,1,1,1,
0,0,1,0,1,0,
0,0,1,1,0,1,
0,0,1,0,1,0 ),6,6, byrow= TRUE)
EV <- eigen(A) # compute eigenvalues and eigenvectors
max(EV$values) # find the maximum eigenvalue
centrality <- data.frame(EV$vectors[,1])
names(centrality) <- "Centrality"
print(centrality)
B <- A + diag(6) # Add self loops
EVB <- eigen(B) # compute eigenvalues and eigenvectors
# they are the same as EV(A)
c <- matrix(c(2,3,5,3,4,3)) # Degree of each node + self loop
ck <- function(k){
n <- (k-2)
B_K <- B # B is the original adjacency matrix, w/ self-loops
for (i in 1:n){
B_K <- B_K%*%B #
#print(B_K)
}
c_k <- B_K%*%c
return(c_k)
}
# derive EV centrality as k -> infinity
# k = 100
ck(100)/frobenius.norm(ck(100)) # .09195198, .2487806, .58115487, .40478177, .51401731, .40478177
# Does igraph match?
evcent(graph_from_adjacency_matrix(A))$vector # No: 0.1582229 0.4280856 1.0000000 0.6965127 0.8844756 0.6965127
The rank correlation is the same, but it is still bothersome that the values are not the same. What is going on?
The result returned by igraph is not wrong, but note that there are subtleties to defining eigenvector centrality, and not all implementations handle self-loops in the same way.
Please see what I wrote here.
One way to define eigenvector centrality is simply as "the leading eigenvector of the adjacency matrix". But this is imprecise without specifying what the adjacency matrix is, especially what its diagonal elements should be when there are self-loops present. Depending on application, diagonal entries of the adjacency matrix of an undirected graph are sometimes defined as the number of self-loops, and sometimes as twice the number of self-loops. igraph uses the second definition when computing eigenvector centrality. This is the source of the difference you see.
A more intuitive definition of eigenvector centrality is that the centrality of each vertex is proportional to the sum of its neighbours' centralities. Thus the details of the computation hinge on who the neighbours are. Consider a single vertex with a self-loop. It is its own neighbour, but how many times? We can traverse the self-loop in both directions, so it is reasonable to say that it is its own neighbour twice. Indeed, its degree is conventionally taken to be 2, not 1.
You will find that different software packages treat self-loops differently when computing the eigenvector centrality. In igraph, we made a choice by looking at the intuitive interpretation of eigenvector centrality rather than rigidly following a formal definition, with no regard for the motivation behind that definition.
Note: What I wrote above refers to how eigenvector centrality computations work internally, not to what as_adjacency_matrix() returns. as_adjacency_matrix() adds one (not two) to the diagonal for each self-loop.
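As a side check on the numbers in the question, here is a small pure-Python power iteration on A + I, in the spirit of the ck() function but written from scratch (the helper name leading_eigenvector is hypothetical). For the loop-free matrix A from the question, rescaling the unit-length result so that its largest entry is 1 reproduces the igraph vector quoted there, so for this input the two vectors differ only in how they are scaled. The self-loop convention described above becomes relevant when the graph passed to igraph actually contains self-loops.

```python
import math

# Adjacency matrix A from the question (no self-loops).
A = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration on (matrix + I), returning a unit-length vector.

    Adding the identity shifts every eigenvalue by 1 but leaves the
    eigenvectors unchanged, as noted in the question's code.
    """
    n = len(matrix)
    x = [1.0] * n
    for _ in range(iterations):
        # y = (matrix + I) x
        y = [x[i] + sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(value * value for value in y))
        x = [value / norm for value in y]
    return x

unit_length = leading_eigenvector(A)                       # Euclidean norm 1
max_scaled = [value / max(unit_length) for value in unit_length]  # largest entry 1
print([round(value, 7) for value in max_scaled])
```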
Is there a relation between edges and nodes? How could it be expressed in asymptotic notation?
I've got a probabilistic adjacency matrix (probability that i knows j), and I want to calculate eigenvector centrality for all i. The graph is directed.
Because the graph is directed, the adjacency matrix isn't symmetric. Because the adjacency matrix isn't symmetric, the result depends on whether the matrix is transposed. I suppose that one is the adjacency matrix for being linked to, and the other is the adjacency matrix for linking to others. Which is which?
Here's a dummy example demonstrating the issue:
set.seed(333)
N=4
adj = matrix(runif(N^2),N)
diag(adj)<-0
A = graph.adjacency(adj,weighted=TRUE)
evcent(A,directed=TRUE)$vector
A = graph.adjacency(t(adj),weighted=TRUE)
evcent(A,directed=TRUE)$vector
For directed graphs, matrix element A[i,j] represents the edge from vertex i to vertex j. See also http://en.wikipedia.org/wiki/Adjacency_matrix
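To see why transposing changes the result, here is a pure-Python sketch on a tiny directed graph, using the convention above that A[i][j] = 1 means an edge from i to j. The helper names and the example graph are made up for illustration, and the sketch does not claim to reproduce what evcent does internally. Iterating with the transpose scores a vertex by the vertices that point to it, while iterating with A itself scores it by the vertices it points to.

```python
import math

def power_iteration(matrix, iterations=200):
    """Return a unit-length approximation of the dominant eigenvector of matrix."""
    n = len(matrix)
    x = [1.0] * n
    for _ in range(iterations):
        y = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(value * value for value in y))
        x = [value / norm for value in y]
    return x

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

# Edges: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0 (strongly connected, hypothetical example).
A = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]

# (A^T x)[j] sums x over the vertices i with an edge i -> j: "being linked to".
incoming = power_iteration(transpose(A))
# (A x)[i] sums x over the vertices j with an edge i -> j: "linking to others".
outgoing = power_iteration(A)

most_linked_to = incoming.index(max(incoming))
most_linking = outgoing.index(max(outgoing))
```

In this example vertex 2, which receives two edges, ranks highest in the incoming version, and vertex 0, which sends two edges, ranks highest in the outgoing version.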
I need to calculate the trace of a matrix to the power of 3 and 4 and it needs to be as fast as it can get.
The matrix here is an adjacency matrix of a simple graph, therefore it is square, symmetric, its entries are always 1 or 0 and the diagonal elements are always 0.
Optimization is trivial for the trace of the matrix to the power of 2:
We only need the diagonal entries (i,i) for the trace, skip all others
As the matrix is symmetric these entries are just the entries of the i-th row squared and summed up
And as the entries are just 1 or 0 the square-operation can be skipped
Another idea I found on wikipedia was summing up all elements of the Hadamard product, i.e. entry-wise multiplication, but I don't know how to extend this method to the power of 3 and 4.
See http://en.wikipedia.org/wiki/Trace_(linear_algebra)#Properties
Maybe I'm just blind but I can't think of a simple solution.
In the end I need a C++ implementation, but I think that's not important to the question.
Thanks in advance for any help.
The trace is the sum of the eigenvalues and the eigenvalues of a matrix power are just the eigenvalues to that power.
That is, if l_1,...,l_n are the eigenvalues of your matrix then trace(M^p) = l_1^p + l_2^p +...+l_n^p.
Depending on your matrix you may want to go with computing the eigenvalues and then summing. If your matrix has low rank (or can be well approximated with a low rank matrix) you can compute the eigenvalues very cheaply (a partial eigendecomposition has complexity O(n*k^2) where k is the rank).
Edit: You mention in the comments that it's 1600x1600 in which case finding all the eigenvalues should be no problem. Here's one of many C++ codes that you can use for this http://code.google.com/p/redsvd/
Ok, I just figured this one out myself.
The important thing I did not know was this:
If A is the adjacency matrix of the directed or undirected graph G, then the matrix A^n (i.e., the matrix product of n copies of A) has an interesting interpretation: the entry in row i and column j gives the number of (directed or undirected) walks of length n from vertex i to vertex j. This implies, for example, that the number of triangles in an undirected graph G is exactly the trace of A^3 divided by 6.
(Copied from http://en.wikipedia.org/wiki/Adjacency_matrix#Properties)
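One way to use the walk-counting property together with the Hadamard-product idea from the question is sketched below in Python (the function name closed_walk_traces is an assumption, and a C++ version would follow the same loops). For a symmetric A, trace(A^3) is the sum of the entries of A^2 multiplied entry-wise with A, and trace(A^4) is the sum of the squared entries of A^2, so A^2 is the only matrix product that has to be formed. This is a dense illustration, not a tuned implementation.

```python
def closed_walk_traces(adj):
    """Return (trace(A^2), trace(A^3), trace(A^4)) for a symmetric 0/1 matrix adj."""
    n = len(adj)
    # A^2 is computed once; entry (i, j) counts the walks of length 2 from i to j.
    squared = [
        [sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    trace2 = sum(squared[i][i] for i in range(n))
    # trace(A^3) = sum_ij (A^2)_ij * A_ji, and A_ji = A_ij by symmetry.
    trace3 = sum(squared[i][j] * adj[i][j] for i in range(n) for j in range(n))
    # trace(A^4) = sum_ij (A^2)_ij * (A^2)_ji = sum of squared entries of A^2.
    trace4 = sum(squared[i][j] ** 2 for i in range(n) for j in range(n))
    return trace2, trace3, trace4

# Triangle graph on three vertices.
triangle = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
traces = closed_walk_traces(triangle)
triangle_count = traces[1] // 6
```

For the triangle graph the eigenvalues are 2, -1, -1, so the traces can be cross-checked against the eigenvalue formula from the other answer: 4 + 1 + 1, 8 - 1 - 1, and 16 + 1 + 1.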
Retrieving the number of paths of a given length from node i to i for all n nodes can essentially be done in O(n) when dealing with sparse graphs and using adjacency lists instead of matrices.
Nevertheless, thanks for your answers!