I have a network model of trade between nodes, where each edge weight is the total trade flow between two nodes, so the edges are undirected.
I would like to cluster the network using modularity maximization, but the binary (0/1) adjacency matrix it needs does not adequately represent the different edge weights. Is there a way to deal with this, or is modularity maximization simply the wrong method?
Happy for any hints!
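For what it's worth, modularity is not tied to a 0/1 adjacency matrix: the usual weighted generalization replaces edge counts with edge weights and node degrees with node strengths (the sum of incident edge weights). Below is a minimal sketch in plain Python, assuming non-negative weights, no self-loops, and each undirected edge listed once. The function name and the toy trade network are made up for illustration.

```python
from collections import defaultdict

def weighted_modularity(edges, community):
    """Weighted modularity Q of an undirected graph.

    edges: iterable of (u, v, w) with w >= 0, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    strength = defaultdict(float)   # sum of incident edge weights per node
    internal = defaultdict(float)   # total weight of edges inside each community
    total = 0.0                     # m: total edge weight of the graph
    for u, v, w in edges:
        strength[u] += w
        strength[v] += w
        total += w
        if community[u] == community[v]:
            internal[community[u]] += w
    comm_strength = defaultdict(float)  # summed strength of each community
    for node, s in strength.items():
        comm_strength[community[node]] += s
    # Q = sum over communities c of  W_c / m - (S_c / 2m)^2
    return sum(internal[c] / total - (comm_strength[c] / (2 * total)) ** 2
               for c in comm_strength)

# Hypothetical toy network: two tightly trading triangles joined by a weak link.
toy_edges = [("a", "b", 10), ("b", "c", 10), ("a", "c", 10),
             ("d", "e", 10), ("e", "f", 10), ("d", "f", 10),
             ("c", "d", 1)]
two_blocks = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_block = {n: 0 for n in two_blocks}
q_two = weighted_modularity(toy_edges, two_blocks)
q_one = weighted_modularity(toy_edges, one_block)
```

Splitting the toy network into its two triangles scores clearly higher than putting everything in one community, which is the behaviour a modularity-maximizing method exploits.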
Related
I have seen a paper that uses a non-binary adjacency matrix to define the weights of node connections. The weights are ratios in the range [0,1]. Can these weights also be considered edge features? If so, what is the difference between having an adjacency matrix and an edge feature matrix?
It all depends on how you utilise this information. You can use a binary adjacency matrix to define a graph, but you can also interpret it as a fully connected graph with 0/1 features. The same goes for weights in [0,1]: depending on the semantics, they can mean the probability of observing an edge (with 0 being no edge), or they can be seen as float features on a fully connected graph. Depending on what you do with this interpretation, you can end up with neural nets of different representational power, inductive biases, etc. So unfortunately "it all depends".
I am analysing a weighted network with R igraph. The network is based on a correlation matrix, i.e. the weights range from -1 to +1. This particular network is undirected, but I am also interested in the more general directed case.
Based on this network I would like to perform a community detection to group "similar" nodes together. I know there is a whole bunch of community detection methods in R igraph.
See for example here or here.
But none of these cases deals with negative weights.
Is there an implementation in igraph (or in some other R package) that can deal with directed networks that have negative weights? Any hints are much appreciated.
I am not 100% sure whether this violates any assumptions, but as a workaround I set all negative edge weights to zero before running Louvain community detection with igraph in R. That way, negative edges at least never contribute to community membership.
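library(igraph)
# Assumes the edge weights of graph g are stored in its `width` edge attribute.
E(g)$width <- pmax(E(g)$width, 0)  # clamp negative weights to zero
g.louv <- cluster_louvain(g, weights = E(g)$width)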
Note: this applies only to undirected graphs (I overlooked this detail of the question, sorry)
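If you start from the correlation matrix rather than from an existing igraph object, the same workaround can be applied while building the edge list. The following is only a sketch in plain Python; the function name and the small matrix are invented for illustration. It drops non-positive correlations entirely, which for a modularity-based method should have the same effect as keeping those edges with weight 0.

```python
def positive_edges(corr, names):
    """Build an undirected edge list from a symmetric correlation matrix,
    keeping only strictly positive correlations."""
    edges = []
    n = len(names)
    for i in range(n):
        for j in range(i + 1, n):   # upper triangle only: each pair once
            w = corr[i][j]
            if w > 0:
                edges.append((names[i], names[j], w))
    return edges

# Hypothetical 3x3 correlation matrix.
corr = [[1.0, 0.8, -0.3],
        [0.8, 1.0, 0.1],
        [-0.3, 0.1, 1.0]]
edges = positive_edges(corr, ["x", "y", "z"])
```

Note that this throws away the information carried by negative correlations; it does not treat them as repulsion between nodes.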
As far as I understand, there is classical eigenvector centrality and there are variants such as Katz centrality or PageRank. I wonder whether the latter is the "latest stage" in the evolution of eigenvector centrality and therefore always superior. Or are there conditions under which one should use one or the other? If so, what would those conditions be?
Might be a little bit late, but
Eigenvector centrality assumes that nodes with more important connections are themselves important. For example, people who know the president are probably important. Mathematically, the centrality scores are given by the eigenvector belonging to the largest eigenvalue of the adjacency matrix.
The problem with eigenvector centrality is that it does not handle directed graphs well: a node with no incoming edges gets zero centrality and therefore passes nothing on to the nodes it points to, so many nodes can end up with zero centrality despite having many outgoing edges. Katz centrality fixes this by adding a small bias term so that no node has strictly zero centrality, which in turn affects the centralities of neighboring nodes as well.
However, the problem with Katz centrality is that when a node becomes very central in a network, it passes its full centrality along all of its outgoing links, making all those nodes very central too. For example, even though people who know the president are important, not all of them are (the president's driver, for example). To fix this, PageRank divides the centrality a node passes on by its out-degree, so each outgoing link carries only a share of it; otherwise it works like Katz centrality.
In conclusion: if the graph is undirected, use eigenvector centrality. If the graph is directed, the choice between Katz and PageRank depends on the situation. If you want extremely central nodes to strongly influence their neighbors, use Katz; otherwise, use PageRank.
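To make the three update rules concrete, here is a rough sketch in plain Python using simple fixed-point iteration. It assumes a directed graph given as a dict mapping every node to a list of its successors, with centrality flowing along the edge direction. The function names, the parameter values (`alpha`, `beta`, `damping`, the iteration count) and the toy graph are my own choices for illustration, not taken from any library.

```python
def _incoming(graph):
    """Map each node to the list of nodes that link to it."""
    incoming = {v: [] for v in graph}
    for u, targets in graph.items():
        for v in targets:
            incoming[v].append(u)
    return incoming

def eigenvector_centrality(graph, iters=200):
    # x_v <- sum of x_u over in-neighbours u, then rescale to sum 1.
    # Assumes the graph has at least one cycle; otherwise every score
    # decays to zero and the rescaling divides by zero.
    incoming = _incoming(graph)
    x = {v: 1.0 for v in graph}
    for _ in range(iters):
        new = {v: sum(x[u] for u in incoming[v]) for v in graph}
        norm = sum(new.values())
        x = {v: s / norm for v, s in new.items()}
    return x

def katz_centrality(graph, alpha=0.1, beta=1.0, iters=200):
    # x_v <- alpha * sum of x_u over in-neighbours + beta (the bias term).
    # alpha must be below 1 / (largest eigenvalue) for this to converge.
    incoming = _incoming(graph)
    x = {v: 0.0 for v in graph}
    for _ in range(iters):
        x = {v: alpha * sum(x[u] for u in incoming[v]) + beta for v in graph}
    return x

def pagerank(graph, damping=0.85, iters=200):
    # Like Katz, but each node splits its score evenly over its out-links.
    # Nodes without out-links spread their score uniformly over all nodes.
    incoming = _incoming(graph)
    n = len(graph)
    x = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        dangling = sum(x[u] for u in graph if not graph[u])
        x = {v: (1 - damping) / n
                + damping * (sum(x[u] / len(graph[u]) for u in incoming[v])
                             + dangling / n)
             for v in graph}
    return x

# Toy graph: "s" has only outgoing links; a, b, c form a small cycle structure.
g = {"s": ["a"], "a": ["b"], "b": ["a", "c"], "c": ["a"]}
ev = eigenvector_centrality(g)
kz = katz_centrality(g)
pr = pagerank(g)
```

In this toy graph the node `s`, which has no incoming links, ends up with an eigenvector score of zero, while Katz and PageRank both give it a small positive baseline score.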
You cannot really compare these three, because they are based on different perspectives and definitions of centrality. PageRank uses the eigenvector centrality concept to determine how important a website is (read this).
For instance:
In eigenvector centrality we use the right eigenvector in the power iteration algorithm. In the PageRank algorithm, we are interested in the in-links of nodes, not the out-links (directed graph), so instead of the right eigenvector we use the left eigenvector. See: Eigenvector centrality
Also read: Katz centrality
I am running community detection on graphs, using the different community detection algorithms implemented in igraph, listed here:
1. edge.betweenness.community (w,-d)
2. walktrap.community (w,-d)
3. fastgreedy.community (w)
4. spinglass.community (w,d, not for unconnected graph)
5. infomap.community (w,d)
6. label.propagation.community (w)
7. multilevel.community (w)
8. leading.eigenvector.community (w)
I have two types of graph: one is directed and weighted, the other is undirected and unweighted.
Four of the algorithms (1, 2, 4, 5) can be used for both, but the fourth one (spinglass) gives me an error because my graph is not connected, so three remain.
Now I want to compare them using the evaluation metrics provided at http://lab41.github.io/Circulo/. As far as I could find, igraph offers modularity and compare.communities (the metrics listed at http://www.inside-r.org/packages/cran/igraph/docs/compare.communities are "vi", "nmi", "split.join", "rand" and "adjusted.rand").
What I am wondering about:
Is there any other algorithm that is implemented in igraph and is not in the list, and which one would give me overlapping communities as well?
Which of these metrics can be used for weighted and directed graphs, and is there an implementation in igraph?
Also, which metric can be used for which algorithm? In one of the articles I went through, on edge betweenness, the evaluation used the ground truth: the result was compared against a graph with known communities.
Thank you in advance.
Yes, there are many algorithms which are not in the igraph package. To name one: RG+, presented in "Cluster Cores and Modularity Maximization" in 2010.
Modularity is by far the best measure for evaluating communities.
edge.betweenness simply gives you the betweenness centrality values of all the edges. It is not a measure for evaluating communities, but it can be used to find them.
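On the comparison metrics from the question: measures such as "rand" compare two membership vectors only, so they never look at edge weights or direction and can be applied to the output of any of the algorithms. As a sketch of what such a metric computes, here is the plain (unadjusted) Rand index in Python. The function name and the example partitions are made up; this is an illustration of the definition, not igraph's implementation.

```python
from itertools import combinations

def rand_index(part_a, part_b):
    """Fraction of node pairs on which two partitions agree.

    part_a, part_b: dicts mapping the same nodes to community labels.
    A pair "agrees" if both partitions put the two nodes together,
    or both put them apart.
    """
    agree = 0
    pairs = 0
    for u, v in combinations(sorted(part_a), 2):
        same_a = part_a[u] == part_a[v]
        same_b = part_b[u] == part_b[v]
        if same_a == same_b:
            agree += 1
        pairs += 1
    return agree / pairs

# Hypothetical results of two algorithms on the same four nodes.
p1 = {1: "x", 2: "x", 3: "y", 4: "y"}
p2 = {1: 0, 2: 0, 3: 0, 4: 1}
relabelled = {1: "B", 2: "B", 3: "A", 4: "A"}   # same grouping as p1
ri_diff = rand_index(p1, p2)
ri_same = rand_index(p1, relabelled)
```

The label names themselves do not matter, only which nodes are grouped together. When one of the two partitions is a known ground truth, the same function measures how well an algorithm recovered it.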
I have a JUNG graph containing about 10K vertices and 100K edges, and I'd like to get a measure of similarity between any pair of vertices.
The vertices represent concepts (e.g. dog, house, etc), and the links represent relations between concepts (e.g. related, is_a, is_part_of, etc).
The vertices are densely inter-linked, so a shortest-path approach doesn't give good results (the shortest paths are always very short).
What approaches would you recommend to rank the connectivity between vertices?
JUNG has some algorithms to score the importance of vertices, but I can't tell whether it offers any measure of similarity between two vertices.
SimPack also seems promising.
Any hints?
The centrality scores don't measure similarity of pairs of vertices, but some kind of (depending on the method) centrality of single nodes of the network in general. Therefore this approach is possibly not what you want.
SimPack indeed has a nice goal set out, but for graphs it implements isomorphism-based comparisons, which compare multiple graphs for similarity rather than pairs of nodes within one given graph. Therefore it is out of scope for now.
What you are seeking are so-called graph clustering methods (also called network module determination or network community determination methods), which divide the graph (network) into multiple partitions so that the nodes in each partition are more strongly interconnected with each other than with nodes of other partitions.
The most classic method is perhaps the betweenness centrality clustering of Newman & Girvan, where you can exploit the dendrogram for similarity calculation, and it is in JUNG. Of course there are throngs of methods nowadays. You may want to try (shameless plug) our ModuLand method, or read the fine table of module detection algorithms at the end of the Electronic Supplementary Material. ModuLand is a family of overlapping graph clustering methods; that is, its result for each node is a vector containing the strengths with which the node belongs to each cluster of the network. Pairwise node similarity is easy to derive from pairs of these node-to-cluster vectors.
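As an illustration of that last step, here is a small Python sketch that turns node-to-cluster membership vectors into a pairwise similarity. Cosine similarity is my assumption here; it is one common choice, not necessarily what ModuLand itself uses. The membership values are invented.

```python
import math

def cosine_similarity(p, q):
    """Cosine of the angle between two membership vectors of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0   # a node belonging to no cluster is similar to nothing
    return dot / (norm_p * norm_q)

# Hypothetical strengths of belonging to three clusters.
membership = {
    "dog":   [0.9, 0.1, 0.0],
    "cat":   [0.8, 0.2, 0.0],
    "house": [0.0, 0.1, 0.9],
}
sim_dog_cat = cosine_similarity(membership["dog"], membership["cat"])
sim_dog_house = cosine_similarity(membership["dog"], membership["house"])
```

Nodes that share most of their cluster memberships score close to 1, and nodes in disjoint clusters score close to 0, regardless of how short the shortest path between them is.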
Graph clustering is non-trivial, and you may need to adapt whichever method you choose to get very precise domain-specific results, but that's up to the reader ;) Good luck!