Applications of large graph layout

I've been playing with large graph layout for a while now. What are some applications for the layout, visualization, and interaction with large graphs? Large in this case means at least tens of thousands of nodes.
I'd like to find something that doesn't just look cool but is actually useful.

I believe Tulip is designed to visualize huge graphs; some of the screenshots might give you ideas for applications. There is also a similar conversation here, but in reverse: there the question is which existing tools are best for huge graphs. The post by Scott has a number of links which might contain useful galleries.

The best tools for large graphs are mental mathematical tools from topology and differential geometry, such as fiber bundles, Ricci flows, or manifold surgery, as well as dimensionality reduction via either compression techniques or linear-algebraic methods. Anyone who sits there drawing a zillion nodes and edges without any analysis of structural redundancy is basically just trying to make pretty pictures with no purpose or understanding.
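For what it's worth, here is a minimal sketch of one of those "linear algebraic" routes: a spectral layout that uses eigenvectors of the graph Laplacian as coordinates (assuming numpy; the dense eigendecomposition is for illustration only, and a sparse eigensolver would be needed at the tens-of-thousands-of-nodes scale):

# Spectral layout sketch: place each node at the coordinates given by the 2nd
# and 3rd eigenvectors of the graph Laplacian, i.e. dimensionality reduction by
# plain linear algebra rather than by iterating a force simulation.
import numpy as np

def spectral_layout(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A      # combinatorial Laplacian
    vals, vecs = np.linalg.eigh(L)      # eigenvalues returned in ascending order
    return vecs[:, 1:3]                 # skip the constant eigenvector

# toy example: a ring of 100 nodes comes out as a circle
n = 100
edges = [(i, (i + 1) % n) for i in range(n)]
print(spectral_layout(n, edges)[:5])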

Related

Fast volume representation, modification and polygonisation

I am looking for ideas for algorithms and data structures for representing volumetric objects. I am working on a sculpting system, like Sculptris or Mudbox, and want to find a good implementation strategy.
I currently have a very nice dynamic half-edge mesh system to collapse/subdivide faces. It works very well and is incredibly fast, but since it is a surface algorithm, it is not easy to robustly change topology.
So I want to go back to the drawing board and implement a proper volumetric system. My first idea was some kind of octree representation for the volume and marching cubes to polygonise it.
However, I have a few problems with this. First, marching cubes often produces small or thin triangles, something that is highly undesirable (the reason why comes later). Second, I want to polygonise the volume only in the area being edited, and at different levels of detail. For example, I may want a low-res sphere, but with a few tiny high-res bumps. I can easily get that kind of subdivision behaviour with my current surface-based system, but I can't envision how I could do it robustly with marching cubes.
Another problem is that the actual triangular mesh is further subdivided on the GPU for smooth surfaces, so I need neighbourhood information too. Again, I already have this with the current half-edge system, but with a volume polygonisation system I imagine it would take a lot of extra processing to find that connectivity information. This is the reason thin triangles are bad.
So I have a lot of constraints, and I am asking this community for ideas or pertinent papers to read. I was thinking about surface nets to avoid the small/thin triangle problem. Also, I have a feeling kd-trees may be better for storing multiresolution volumes, since they seem more flexible than octrees.
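For reference, here is a minimal sketch of the marching-cubes baseline described above (assuming numpy and scikit-image; it does nothing about the thin-triangle or adaptive-resolution concerns, it only shows the volume-to-mesh step on a dense grid):

# Sample a signed distance field on a regular grid and polygonise its zero
# isosurface with marching cubes. The grid resolution and the sphere SDF are
# placeholders; a real sculpting system would use a sparse/adaptive structure.
import numpy as np
from skimage import measure

res = 64
xs = np.linspace(0.0, 1.0, res)
x, y, z = np.meshgrid(xs, xs, xs, indexing='ij')
sdf = np.sqrt((x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2) - 0.35

verts, faces, normals, values = measure.marching_cubes(sdf, level=0.0,
                                                       spacing=(1.0 / res,) * 3)
print(len(verts), "vertices,", len(faces), "triangles")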
Anyway, any ideas/suggestions very welcome.

Metrics for hierarchical graphs connectivity

This is my first question on Stack Overflow. This is not really a programming question, but since most of us have to deal with theoretical problems at some point and there might be some graph theory specialists around, I thought I would give it a go.
I am currently doing some research on multilingual websites and I found some interesting patterns in the website structure. The graphs below are the website graphs of two different multilingual websites. Sorry, I don't have enough rep points to post images so I leave them as links. I used the Force Atlas algorithm for the layout. Vertices are colored according to the page language. The shaded areas correspond to the subgraphs of a specific language.
Here is the graph of the website where different language versions of the same content are very closely linked. Hence the planes representing the different language versions are overlapping.
http://www.ai.soc.i.kyoto-u.ac.jp/~julien/phd/images/tight.png
In this second graph, we have a website where the language versions are almost independent, and thus there is almost no overlap.
http://www.ai.soc.i.kyoto-u.ac.jp/~julien/phd/images/loose.png
So here is my question:
Is there a specific metric to quantify this overlap? If so, what is it named?
Since I used a force-based layout, the amount of overlap essentially reflects the number of edges between the language subgraphs. So I guess something like taking the ratio of the number of edges within a subgraph to the number of edges going outside/coming inside that subgraph might do the trick. I am sure I am not the first to get this idea, so I was wondering whether this metric has a name. I could then Google it from there :)
Thank you in advance!
It sounds like what you're looking for is Network Modularity. Given a graph, and a partition (breaking the graph into disjoint subgraphs), the modularity is defined as:
The fraction of the edges that fall within the given groups minus the expected such fraction if edges were distributed at random.
Modularity was the basis of some of the first community detection algorithms on networks, which try to find sets of nodes that are densely connected. Recently, though, modularity has been shown to be a poor metric for community detection because of resolution limits that fail to detect small groups or break apart well-defined groups in certain cases (see this paper).
There are now also approaches other than modularity, designed to overcome the limitations mentioned by job, such as surprise, or the B- and C-scores (designed to be significance indices).
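To make both ideas concrete, here is a small sketch (assuming networkx; the two-language toy graph is invented) that computes the naive within/between edge ratio from the question alongside the modularity of the same partition:

# Compare the naive edge-ratio metric with modularity on a toy bilingual site.
import networkx as nx
from networkx.algorithms.community import modularity

G = nx.Graph()
G.add_edges_from([("en1", "en2"), ("en2", "en3"), ("en1", "en3"),  # within English
                  ("fr1", "fr2"), ("fr2", "fr3"),                  # within French
                  ("en1", "fr1")])                                 # cross-language link

partition = [{"en1", "en2", "en3"}, {"fr1", "fr2", "fr3"}]

# naive metric from the question: edges inside a group vs. edges crossing groups
internal = sum(1 for u, v in G.edges() if any(u in p and v in p for p in partition))
external = G.number_of_edges() - internal
print("internal/external ratio:", internal / external)

# modularity of the same partition, as suggested in the answer
print("modularity:", modularity(G, partition))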

What's the fastest force-directed network graph engine for large data sets?

We currently have a dynamically updated network graph with around 1,500 nodes and 2,000 edges. It's ever-growing. Our current layout engine uses Prefuse - the force directed layout in particular - and it takes about 10 minutes with a hefty server to get a nice, stable layout.
I've looked a little at GraphViz's sfdp algorithm, but haven't tested it yet...
Are there faster alternatives I should look at?
I don't care about the visual appearance of the nodes and edges - we process that separately - just putting x, y on the nodes.
We do need to be able to tinker with the layout properties for specific parts of the graph, for instance, applying special tighter or looser springs for certain nodes.
Thanks in advance, and please comment if you need more specific information to answer!
EDIT: I'm particularly looking for speed comparisons between the layout engine options. Benchmarks, specific examples, or just personal experience would suffice!
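One way to get rough numbers of your own (a sketch only; it assumes networkx is installed, plus pygraphviz and Graphviz for the sfdp line, and the timings obviously depend on your machine):

# Time two layout engines on a random graph of roughly the size described above.
import time
import networkx as nx

G = nx.gnm_random_graph(1500, 2000, seed=42)        # ~1,500 nodes, 2,000 edges

t0 = time.time()
pos = nx.spring_layout(G, iterations=50)            # Fruchterman-Reingold-style force layout
print("spring_layout: %.1f s" % (time.time() - t0))

t0 = time.time()
pos = nx.nx_agraph.graphviz_layout(G, prog="sfdp")  # Graphviz's multilevel sfdp
print("sfdp:          %.1f s" % (time.time() - t0))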
I wrote a JavaScript-based graph drawing library VivaGraph.js.
It calculates the layout and renders a graph with 2K+ vertices and 8.5K edges in ~10-15 seconds. If you don't need the rendering part, it should be even faster.
Here is a video demonstrating it in action: WebGL Graph Rendering With VivaGraphJS.
An online demo is available here. WebGL is required to view the demo, but it is not needed to calculate graph layouts. The library also works under node.js, so it could be used as a service.
Example of API usage (layout only):
var graph = Viva.Graph.graph(),
    layout = Viva.Graph.Layout.forceDirected(graph);
graph.addLink(1, 2);
layout.run(50); // runs 50 iterations of graph layout
// print results:
graph.forEachNode(function(node) { console.log(node.position); });
Hope this helps :)
I would have a look at OGDF, specifically http://www.ogdf.net/doku.php/tech:howto:frcl
I have not used OGDF, but I do know that the Fast Multipole Multilevel layout is a high-performance algorithm, and with the kinds of runtimes involved in force-directed layout at the node counts you're dealing with, that matters a lot.
One reason, among others, that the algorithm is fast: the fast multipole method. The fast multipole method approximates the dense all-pairs force sums (in effect, large matrix-vector products) so that they cost close to linear time instead of O(n²), at the price of a small, controllable error. Ideally, you'd have code from something like this: http://mgarland.org/files/papers/layoutgpu.pdf, but I can't find it anywhere; maybe a CUDA solution isn't up your alley anyway.
Good luck.
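For intuition about why multipole-style approximations matter at this scale, here is an illustrative sketch of the simpler Barnes-Hut cousin of the same idea (plain Python, not OGDF code; the constants are arbitrary): a quadtree groups distant nodes into single pseudo-nodes so the all-pairs repulsion sum costs roughly O(n log n) instead of O(n^2).

import random

THETA = 0.8   # opening angle: bigger = faster but less accurate

class Quad:
    """A square cell [x, x+s) x [y, y+s) of the layout plane."""
    def __init__(self, x, y, s):
        self.x, self.y, self.s = x, y, s
        self.mass = 0                 # number of points in this cell
        self.cx = self.cy = 0.0       # centre of mass of those points
        self.children = None          # four sub-cells once the cell is split
        self.point = None             # the single point held by a leaf

    def insert(self, p):              # note: coincident points are not handled
        if self.children is None and self.point is None:
            self.point = p            # empty leaf: store the point here
        else:
            if self.children is None: # occupied leaf: split, push both points down
                self._split()
                old, self.point = self.point, None
                self._child_for(old).insert(old)
            self._child_for(p).insert(p)
        self.cx = (self.cx * self.mass + p[0]) / (self.mass + 1)
        self.cy = (self.cy * self.mass + p[1]) / (self.mass + 1)
        self.mass += 1

    def _split(self):
        h = self.s / 2
        self.children = [Quad(self.x,     self.y,     h),
                         Quad(self.x + h, self.y,     h),
                         Quad(self.x,     self.y + h, h),
                         Quad(self.x + h, self.y + h, h)]

    def _child_for(self, p):
        h = self.s / 2
        return self.children[(p[0] >= self.x + h) + 2 * (p[1] >= self.y + h)]

def repulsion(quad, p, k=0.01):
    """Approximate repulsive force on p exerted by every point stored in quad."""
    if quad.mass == 0 or quad.point is p:
        return 0.0, 0.0
    dx, dy = p[0] - quad.cx, p[1] - quad.cy
    d2 = dx * dx + dy * dy + 1e-9
    # leaf, or far enough away: treat the whole cell as one pseudo-node
    if quad.children is None or quad.s * quad.s < THETA * THETA * d2:
        f = k * quad.mass / d2
        return f * dx, f * dy
    fx = fy = 0.0
    for child in quad.children:
        cfx, cfy = repulsion(child, p, k)
        fx, fy = fx + cfx, fy + cfy
    return fx, fy

# demo: approximate the repulsion on one of 10,000 random points
points = [(random.random(), random.random()) for _ in range(10000)]
root = Quad(0.0, 0.0, 1.0)
for p in points:
    root.insert(p)
print(repulsion(root, points[0]))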
The Gephi Toolkit might be what you need: some of its layouts are very fast yet of good quality: http://gephi.org/toolkit/
30 seconds to 2 minutes is enough to lay out such a graph, depending on your machine.
You can use the ForceAtlas layout, or the Yifan Hu Multilevel layout.
For very large graphs (50K+ nodes and 500K+ links), the OpenOrd layout will be the better choice.
In a commercial scenario, you might also want to look at the family of yFiles graph layout and visualization libraries.
Even the JavaScript version of it can perform layouts for thousands of nodes and edges using different arrangement styles. The "organic" layout style is an implementation of a force-directed layout algorithm, similar in nature to the one used in Neo4j's browser application. But there are a lot more layout algorithms available that can give better visualizations for certain types of graph structures and diagrams. Depending on the settings and the structure of the problem, some of the algorithms take only seconds, while more complex implementations can also bring your JavaScript engine to its knees. The Java- and .NET-based variants still perform quite a bit better as of today, but the JavaScript engines are catching up.
You can play with these algorithms and settings in this online demo.
Disclaimer: I work for yWorks, which is the maker of these libraries, but I do not represent my employer on SO.
I would take a look at http://neo4j.org/. It's open source, which is beneficial in your case since you can customize it to your needs. The GitHub account can be found here.

Convert graph in to data points using Mechanical Turk?

I looked around but did not see anyone using Mechanical Turk for this. I've heard of the service, but never used it before. I need to take the following graph and digitize it so I get a list of data points for each line (noting that there are two Y-axes, so the scale depends on which line we are talking about). This is pretty time-consuming for me, and I saw other posts on Stack Overflow about digitizing software doing a poor job at this. Would Mechanical Turk be well suited to my task?
Here is the graph for reference: http://www.yourpicturehost.com/dyno_hbspeed.jpg
It depends on how many of these you have. Mechanical Turk could work quite well, but you'd have to check the accuracy carefully (e.g. by re-plotting the graphs and comparing them yourself).
If you have a lot, though, you should be able to design an image-processing algorithm to pick up the data.
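A rough sketch of that image-processing route (the curve colour and the axis calibration values below are invented; you would read the calibration points off the actual chart, and JPEG noise may call for more robust colour matching):

# Pull one curve out of a chart image: mask pixels near the curve's colour,
# take the median row per column, and map pixel coordinates to data coordinates.
import numpy as np
from PIL import Image

img = np.asarray(Image.open("dyno_hbspeed.jpg").convert("RGB")).astype(int)

target = np.array([200, 30, 30])                 # approximate RGB of one curve (assumption)
mask = np.abs(img - target).sum(axis=2) < 90     # pixels close to that colour

# hypothetical calibration: pixel positions of two known values on each axis
x_px, x_val = (80, 980), (0.0, 9000.0)           # x axis (rpm)
y_px, y_val = (620, 40), (0.0, 160.0)            # left y axis (horsepower)

points = []
for col in range(img.shape[1]):
    rows = np.nonzero(mask[:, col])[0]
    if rows.size:                                 # the curve crosses this column
        row = float(np.median(rows))
        x = x_val[0] + (col - x_px[0]) * (x_val[1] - x_val[0]) / (x_px[1] - x_px[0])
        y = y_val[0] + (row - y_px[0]) * (y_val[1] - y_val[0]) / (y_px[1] - y_px[0])
        points.append((x, y))

print(points[:5])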

Generating a picture/graphic of a graph

In working on a shortest path algorithm across a network I would like to generate a picture of the network. I'd like to represent nodes (circles), links (lines), cost to traverse the link (number in the middle of the link line), and capacity of the link (number on the link line next to the node it represents) in the picture. Is there any library/software out there that would help to automate creating this picture?
I can do this manually in Visio or with some drawing application but I'd like to generate them from code as I change/tweak the network.
Sounds like a job for GraphViz: it generates graphs from a short text description file. I've used it to produce connected node graphs, and I believe it should be possible to add link labels, as you require.
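For example (a sketch; the nodes, costs, and capacities are invented), you can emit a DOT description from code with the traversal cost as the edge label and the capacity as a taillabel next to the originating node, then render it with Graphviz:

# Write a DOT file for a small network; "label" carries the link cost and
# "taillabel" puts the capacity next to the node the link leaves from.
links = [("A", "B", 4, 10),
         ("B", "C", 2, 7),
         ("A", "C", 9, 3)]   # (from, to, cost, capacity) - invented values

lines = ["digraph network {", "    node [shape=circle];"]
for a, b, cost, cap in links:
    lines.append('    %s -> %s [label="%d", taillabel="%d"];' % (a, b, cost, cap))
lines.append("}")

with open("network.dot", "w") as f:
    f.write("\n".join(lines))
# then render it from the command line:  dot -Tpng network.dot -o network.png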
If you're using python, Nodebox draws pretty graphs.
One of the big problems in displaying networks like this is figuring out where to put the nodes on the display screen. If arranging nodes is logically simple given your network, then an off-the-shelf product is likely to suit your needs.
If the arrangements are much more complicated, you may have to accept a certain amount of manual intervention to get this to work with off-the-shelf stuff, or bite the bullet and program the whole thing yourself.
.NET is one choice, and once you've mastered the Graphics class it's easy to use and plenty fast for something like this. However, there are probably better languages/platforms than .NET for something graphics-oriented like this.
Update: .NET is much better for 2D graphics than I knew. The key is finding a fast workaround to the pitifully slow GetPixel() and SetPixel() methods in the Bitmap class. Once you can read and write individual pixels easily and quickly, you can do whatever you want as a programmer.
Did you by chance check out the R programming language? I'm not positive, but I believe you can produce images of graphs with it. r-project.org
There are a bunch of visualizations of various algorithms here: Algorithmics Animation Workshop
