Visualizing Multiple Networks in R using igraph

I need to visualize a very large number (10k) of distinct networks, all on the same page, and label each node with a type (binary). Aside from a few, each network is relatively small. Link lengths/weights are not important for this dataset, and a high degree of overlap on the bigger networks is fine as long as the density is evident.
I have gone over some of the igraph documentation and have been able to create individual graphs from a node/link list. However, I would like some insight, if possible, on how to turn this large number of networks (currently just 10k arrays containing node identity and type) into 10k plots, and on whether this is a feasible task with igraph.
Any insight is greatly appreciated.
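This is feasible with plain igraph, though slow at this scale. One hedged sketch is to build each graph from its edge list and write the plots to a multi-page PDF, with a grid of small panels per page. The container names `edge_lists` and `node_types` below are placeholders for however the 10k arrays are actually stored:

```r
library(igraph)

# Hypothetical containers: `edge_lists` is a list of two-column data frames
# (one per network) and `node_types` is a parallel list of named 0/1 vectors
# giving each node's binary type.
plot_networks <- function(edge_lists, node_types, file = "networks.pdf") {
  pdf(file, width = 20, height = 20)
  par(mfrow = c(10, 10), mar = rep(0.2, 4))   # 100 small panels per page
  for (i in seq_along(edge_lists)) {
    g <- graph_from_data_frame(edge_lists[[i]], directed = FALSE)
    V(g)$color <- ifelse(node_types[[i]][V(g)$name] == 1, "tomato", "steelblue")
    plot(g, layout = layout_with_fr(g),
         vertex.size = 4, vertex.label = NA, edge.arrow.size = 0)
  }
  dev.off()
}
```

Enlarging the PDF dimensions and the `mfrow` grid puts more networks on a single page, if everything really has to land on one page.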

Related

What is the most efficient way to represent a directed, dynamic, massive 3D graph?

I know that there are many techniques to represent graphs.
Suppose I have a massive directed 3D graph with at most 100,000 nodes.
Suppose each node of the graph has three pieces of information:
A 30-character string as a label
Floating-point values as coordinates
Three integer values
The graph is dynamic, i.e., connections change frequently and the nodes frequently change their coordinates.
What would be the most efficient way to represent this graph in computer memory so that I can apply mathematical operations on each node?
Should I use data structures, or should I use big-data analytics or ML?
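As a point of reference for igraph (the library this page centres on), such per-node payloads can be held as vertex attributes, with edges added and removed by id as the graph changes; whether this is efficient enough at 100,000 nodes with frequent updates would need benchmarking. A rough sketch with randomly generated placeholder data:

```r
library(igraph)

n <- 100000
g <- sample_gnm(n, 3 * n, directed = TRUE)       # random stand-in for the real topology

# Per-node payload stored as vertex attributes
V(g)$label <- sprintf("node_%025d", seq_len(n))  # placeholder 30-character labels
V(g)$x <- runif(n); V(g)$y <- runif(n); V(g)$z <- runif(n)
V(g)$i1 <- sample.int(100, n, replace = TRUE)
V(g)$i2 <- sample.int(100, n, replace = TRUE)
V(g)$i3 <- sample.int(100, n, replace = TRUE)

# Dynamic updates: attributes change in place, edges are added/removed by id
V(g)$x <- V(g)$x + rnorm(n, sd = 0.01)
g <- delete_edges(g, sample(ecount(g), 100))     # drop 100 random edges
g <- add_edges(g, sample(vcount(g), 200))        # add 100 edges (endpoint pairs)
```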

Optimal number of graph vertices when conducting an experiment

I want to conduct an experiment on graph drawing algorithms, and for this purpose I have to generate graphs, but I don't know what the optimal number of vertices is: should it be 100 or 200? What is the largest number of vertices that humans can understand and comprehend? How can I decide this? Do you have any ideas or papers that would be useful? I searched Google Scholar and several other paper search engines but did not find anything.
Thanks in advance
This is a very broad question. Size and type of graphs may depend on the research focus.
The GDToolkit (with which I am not affiliated) publishes several graph-drawing test-case collections from the academic literature, which might be a starting point.
In general, graph drawing gets more interesting as the number of vertices grows, especially once labelling comes into play.
A vertex count of up to 100 (maybe more in graphs with structure that can be exploited geometrically) has the benefit that you can ask humans to lay out the graph and compare their results with what the tested algorithms produce.
As for the maximum number of vertices that people can 'understand', there is no fixed limit: think of a 2D or 3D lattice, where the number of vertices up to which humans can grasp the essence of the graph is virtually unlimited.
There is, of course, a lot of leeway in what exactly you mean by 'understand'. In general, human respondents will be able to point out non-trivial properties of the graph, or form hypotheses about such properties, if some visual pattern shows up (this might be an interesting research topic in itself [I have not checked for existing work in this domain]; think of 'distorted' drawings of lattices or drawings of projections of higher-dimensional lattices).
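If the test graphs end up being generated rather than taken from a collection, igraph can produce both random graphs and lattices at any vertex count, so the size can be varied systematically. A small sketch, using the vertex counts discussed above as placeholder values:

```r
library(igraph)

sizes <- c(50, 100, 200)   # candidate vertex counts for the experiment

test_graphs <- lapply(sizes, function(n) {
  side <- round(sqrt(n))
  list(
    random  = sample_gnm(n, 2 * n),        # sparse random graph with n vertices
    lattice = make_lattice(c(side, side))  # roughly n-vertex 2D grid
  )
})

# Quick visual check of one instance
plot(test_graphs[[2]]$random, layout = layout_with_fr,
     vertex.size = 4, vertex.label = NA)
```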

Which community detection algorithms to choose in igraph?

I am trying to cluster this network. The vertices are tags and the edges depict the co-occurrence of these tags. The edge widths show the number of times a tag pair occurs; 'energy' and 'electricity' occur together the most.
I tried using community detection algorithms in R, especially edge.betweenness.community, which gives a modularity of 0.35 on this network. fastgreedy.community did not work on this weighted-edge graph. Is there another algorithm somebody could suggest for this specific case? I am a novice in both graph theory and R.
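For comparison, current igraph exposes several community-detection methods that accept edge weights under the cluster_* names. A hedged sketch, assuming `g` is the weighted tag graph with the co-occurrence counts stored in the `weight` edge attribute:

```r
library(igraph)

# `g` stands for the weighted tag network from the question.
results <- list(
  louvain  = cluster_louvain(g, weights = E(g)$weight),
  walktrap = cluster_walktrap(g, weights = E(g)$weight),
  infomap  = cluster_infomap(g, e.weights = E(g)$weight),
  # note: edge betweenness treats weights as distances, not affinities
  edge_btw = cluster_edge_betweenness(g, weights = E(g)$weight)
)

sapply(results, modularity)    # compare modularity across methods
membership(results$louvain)    # cluster assignment per tag
```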

Clustering in Gephi (Louvain Method)

I have started to work with gephi to help me display a dataset.
The dataset contains:
tags (terms for a certain picture) as nodes
Normalized Google Similarity Distance between those tags as edges with a weight (between 0 and 1)
Every tag is connected to every other tag, as long as they both belong to the same picture. So I have one cluster of nodes and edges for every picture.
I have now imported this dataset to gephi in the following format:
nodes: id, label
edges: target, source, weight (between 0 and 1)
There are about 500 nodes and 6,000 edges.
My problem is that after importing all those nodes and edges, the graph looks rather bunched up, with no real order. The cluster for each picture is mixed in with the clusters of the other pictures.
Using Modularity as the partition algorithm (which should use the Louvain method), the graph gets colored, with each color representing a picture. I can then untangle this mess using the Force Atlas 2 layout.
I now have a colored graph with something like 15 clusters (each cluster represents one picture).
Now I want to cluster within those clusters again, grouping tags (nodes) according to their Normalized Google Distance (the edge weights), which should yield groups of tags that are similar in meaning.
I hope you guys understand what I want to accomplish.
I can also upload a picture to clarify it.
Thanks a lot
I don't think you can do that with the standard version of Gephi. You would need to develop a plugin to implement the very last step of your process.
Gephi is good for visualizing and browsing graphs, but (for now) there are more complete tools when it comes to processing topological properties. For instance, the igraph library (available in C, R and Python) might be more appropriate for you. Note that you can use a file format compatible with both Gephi and igraph, which allows you to use both tools on the same data.
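As a hedged illustration of that interchange, a GraphML file exported from Gephi can be read directly into igraph in R, where the weighted Louvain method is available as cluster_louvain; the file names below are placeholders:

```r
library(igraph)

# "tags.graphml" is a placeholder for a GraphML file exported from Gephi
g <- read_graph("tags.graphml", format = "graphml")

# Weighted Louvain clustering on the NGD-based edge weights
comm <- cluster_louvain(g, weights = E(g)$weight)
table(membership(comm))                 # cluster sizes
V(g)$community <- membership(comm)

# Round-trip back to Gephi with the communities stored as a node attribute
write_graph(g, "tags_with_communities.graphml", format = "graphml")
```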
I was able to solve my problem. I had to import each of these 15 clusters on its own. In this way I could use the Modularity method on just those few.

Clustered Graphs Visualization Techniques

I need to visualize a relatively large graph (6K nodes, 8K edges) that has the following properties:
Distinct clusters: approximately 50-100 nodes per cluster, with moderate interconnectivity at the cluster level
Minimal (5-10 inter-cluster edges per cluster) interconnectivity between clusters
Let global edge overlap = the edge overlap caused by directly visualizing a graph whose clusters are {A, B, C, D, E} and whose edges form the pentagram over those clusters (which is non-planar, by the way, and will definitely generate edge overlap if you draw it out directly).
Let local edge overlap = the same as above, but where {A, B, C, D, E} are just nodes.
I need to visualize graphs with the above in a way that satisfies the following requirements
No global edge overlap (i.e., edge overlap caused by inter-cluster structure is not okay)
Local edge overlap within a cluster is fine
Anyone have thoughts on how to best visualize a graph with the requirements above?
One solution I've come up with to deal with the global edge overlap is to make sure a cluster A can have at most one direct edge to another cluster B during visualization. Any additional inter-cluster edges A -> C, A -> D, ... are disconnected, and additional nodes/edges A -> A_C, C -> C_A, A -> A_D, D -> D_A, ... are created instead.
Anyone have any thoughts?
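One way to prototype the connector-node idea from the question in plain igraph is sketched below; the `cluster` vertex attribute, the vertex names, and the stub-naming scheme are assumptions about how the data is stored:

```r
library(igraph)

# Sketch: every inter-cluster edge (u, v) is removed; each endpoint instead
# gets an edge to a stub vertex named after the cluster pair (A_C in cluster A,
# C_A in cluster C). Assumes vertex attributes `name` and `cluster` exist.
split_intercluster <- function(g) {
  cl    <- V(g)$cluster
  ep    <- ends(g, E(g), names = FALSE)          # endpoint indices per edge
  inter <- which(cl[ep[, 1]] != cl[ep[, 2]])     # ids of inter-cluster edges
  for (e in inter) {
    u <- ep[e, 1]; v <- ep[e, 2]
    for (side in list(c(u, v), c(v, u))) {
      stub <- paste0(cl[side[1]], "_", cl[side[2]])        # e.g. "A_C"
      if (!stub %in% V(g)$name)
        g <- add_vertices(g, 1, name = stub, cluster = cl[side[1]])
      g <- add_edges(g, c(V(g)$name[side[1]], stub))       # endpoint -> stub
    }
  }
  delete_edges(g, inter)                         # drop the original inter-cluster edges
}
```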
Prefuse has some good graph drawing algorithms built in and it seems to handle fairly large graphs relatively well. You might try Flow Map Layout, which is built on top of Prefuse.
Given your objectives, I think that the Fruchterman-Reingold algorithm does a pretty decent job of preventing edge overlap. See, for example, this screenshot of a network consisting of multiple components drawn using the Fruchterman-Reingold algorithm. igraph has built-in support for this algorithm (as does NetworkX, I believe) and is really fast.
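In current igraph for R that algorithm is exposed as layout_with_fr, and for a graph whose clusters form separate components, layout_components can apply it per component and then merge the drawings so the clusters sit side by side. A small sketch on generated stand-in data:

```r
library(igraph)

# Five disconnected clusters as a stand-in for the real data
g <- disjoint_union(lapply(1:5, function(i) sample_gnp(60, 0.08)))

# Fruchterman-Reingold per component, merged so the components do not overlap
coords <- layout_components(g, layout_with_fr)
plot(g, layout = coords, vertex.size = 3, vertex.label = NA)
```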
There is a program built on top of Prefuse called SocialAction. You have to request the code from the author, but it does a lot of statistical analysis on the graph for you, such as identifying subgraphs. I've used it on a graph with more than 18,000 nodes, and although it is very slow at that scale it still works.
Although it may be silly to ask at this point, have you tried out http://www.graphviz.org/?
I haven't seen too many graph visualization tools that support visually separating clusters within a graph. One option might be to take a look at WilmaScope. It looks to have some support for cluster-based layouts.
The organic layout in the yFiles framework handles clustered graphs fairly well. Try it first in yEd to see whether it does what is needed. It is probably reasonable to use nested graphs (also known as groups) for each cluster. The organic layout has a feature called Group Layout Policy, which can be used if the layout needs to follow different principles for inter-cluster and intra-cluster edges, with incremental layouting. With some effort, the graph can be translated into GraphML to avoid manual work.