Tree Graph Layout Algorithm - graph

I have an acyclic graph - a tree.
Each node of the graph will have some attributes that I need to show in a glyph.
Let us say the size of each glyph is 100x100 pixels.
I am looking for an algorithm that is fast and efficient for up to 100,000 nodes that will lay out the nodes in xy space.

JUNG has a few different layout algorithms for trees that should work fine: TreeLayout, obviously, but you might also look at RadialTreeLayout and BalloonLayout.
http://jung.sourceforge.net/doc/api/edu/uci/ics/jung/algorithms/layout/package-summary.html
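Independent of JUNG, here is a minimal Python sketch of a linear-time tree layout, to show roughly what such an algorithm does. It is not JUNG's implementation: it simply gives every leaf its own column and centers each parent over its first and last child. The function name `layout_tree`, the `children` dictionary format, and the 20-pixel `gap` are assumptions made for illustration.

```python
def layout_tree(children, root, glyph=100, gap=20):
    """Return {node: (x, y)} top-left glyph corners in pixels.

    children maps a node to the ordered list of its child nodes
    (hypothetical input format); leaves may be missing from the dict.
    """
    step = glyph + gap          # distance between neighbouring columns and rows
    pos = {}
    next_leaf = 0               # index of the next free leaf column
    # Iterative post-order walk, so very deep trees do not hit
    # Python's recursion limit.
    stack = [(root, 0, False)]
    while stack:
        node, depth, children_done = stack.pop()
        kids = children.get(node, [])
        if not kids:
            # A leaf takes the next free column.
            pos[node] = (next_leaf * step, depth * step)
            next_leaf += 1
        elif not children_done:
            # Revisit this node after all of its children are placed.
            stack.append((node, depth, True))
            for kid in reversed(kids):
                stack.append((kid, depth + 1, False))
        else:
            # Center the parent between its first and last child.
            x = (pos[kids[0]][0] + pos[kids[-1]][0]) / 2
            pos[node] = (x, depth * step)
    return pos
```

Each node is pushed at most twice, so the running time is linear in the number of nodes. The drawing is wider than what a compacting algorithm such as Reingold-Tilford would produce, because every leaf occupies a full column.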

Related

How to redistribute graph elements to maximize readability?

I have a graph that consists of nodes. Each node can have multiple parents and/or children. I want to display that graph and connections between nodes.
But I don't know how to redistribute the nodes to maximize readability. Currently I'm facing the following problems:
Node connections cross each other too much, even where crossings are unnecessary and could be avoided
Connections between nodes are visually too long
Some connections have the same angle, so they overlap and appear as one line
Connections between column i and column i-2 (and further away) sometimes go straight through elements in column i-1
Also, I can shift nodes only vertically, not horizontally, because the number of columns is limited.
To make it easier for myself I tried to place nodes in a grid-like pattern. And I've managed to group them by columns. But then I somehow need to iterate through columns and compare them with other columns to re-arrange stuff. And I don't know where to start.
UPD: I may be wrong but I feel like my problem with graph alignment is somehow related to a typical graph problem of the shortest path. Except that in my case there are multiple paths that should be calculated at the same time and some nodes can be passed only once.
On the image below you can see a nearly ideal redistribution that I made by just scribbling stuff on paper (direction left-to-right shows parent-to-child connections).
This is a graph layout and drawing problem. You can take one of the following two approaches:
Use existing libraries: There are many graph layout libraries available, for example GraphViz, Gephi, and D3.js. You can use their APIs directly, or you can find applications and tools built on top of them. To get the best layout, though, you need a guess at which layout family fits your data, e.g. layered graph layout (good for dense but layered graphs such as flowcharts), tree layout (used when the graph is actually a tree or forest; there are many variants), radial tree layout (again for trees, but in a polar coordinate system), or force-directed layout (a good starting point when you don't know which visual structure will best represent the data). All of these layouts have many customization parameters, such as spacing between nodes, spacing between nodes and edges, and the overall aspect ratio of the drawing.
GraphViz
Gephi
Implement layout algorithms yourself
Detailed coverage of graph drawing algorithms for different families can be found here
Graph Drawing Handbook
If you don't want to get into details, here are quick start points
For graphs: Do a topological sort and place the nodes in layers as dictated by the topological order. This gives you a very good starting point and helps you avoid unnecessary crossings. A grid can be a good idea here, but place the nodes in the grid in topological order.
Alternatively, find a spanning tree of the graph, use a tree layout to draw the spanning tree, and then add the remaining edges.
For trees: Use a recursive bottom-up approach for placing subtrees. For radial trees, do a rectilinear layout and then transform it to a polar coordinate system.
For an unknown family: Use a force-directed method. Define a force between two nodes (e.g. a spring force) and then iterate until the system reaches equilibrium.
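The layering step suggested for graphs can be sketched as follows. This is a minimal Python illustration that assumes the graph is a DAG given as a list of `(parent, child)` pairs; the name `assign_layers` is hypothetical. It processes nodes in topological order (Kahn's algorithm) and puts each node one column to the right of its deepest parent, so every edge points left-to-right.

```python
from collections import defaultdict, deque


def assign_layers(edges):
    """Map each node to a column index so that every edge goes to a later column."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        successors[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    layer = {n: 0 for n in nodes}
    # Start from the nodes without parents (sorted only for reproducibility).
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v in successors[u]:
            # A child sits at least one column after each of its parents.
            layer[v] = max(layer[v], layer[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle; no topological order exists")
    return layer
```

This only chooses the column of each node. The vertical order inside a column, which determines how many edges cross, still has to be chosen separately.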
Finding the best automatic visualization of a graph is a very interesting area, and people are trying many ML techniques here.
You could implement force directed drawing. Or you could use a graph drawing library that already supports force directed drawing, such as D3's force directed layout.
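For a sense of what implementing it yourself involves, here is a small pure-Python sketch that loosely follows the Fruchterman-Reingold scheme: all node pairs repel, edges attract, and a cooling "temperature" caps how far a node may move per iteration. The function name `force_layout`, the canvas size, the iteration count and the cooling factor are all assumptions picked for illustration, not tuned values.

```python
import math
import random


def force_layout(nodes, edges, width=1000.0, height=1000.0, iterations=200, seed=0):
    """Return {node: (x, y)} after a fixed number of force iterations."""
    nodes = list(nodes)
    rng = random.Random(seed)  # seeded so repeated runs give the same drawing
    pos = {n: [rng.uniform(0, width), rng.uniform(0, height)] for n in nodes}
    k = math.sqrt(width * height / max(len(nodes), 1))  # preferred edge length
    temp = width / 10  # largest allowed move per iteration, cooled over time

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair pushes apart with force k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Attraction: each edge pulls its endpoints together with force distance^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node, capped by the temperature and clamped to the canvas.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temp)
            pos[n][0] = min(width, max(0.0, pos[n][0] + dx / length * step))
            pos[n][1] = min(height, max(0.0, pos[n][1] + dy / length * step))
        temp *= 0.95
    return {n: (p[0], p[1]) for n, p in pos.items()}
```

The all-pairs repulsion makes each iteration quadratic in the number of nodes, which is one reason to prefer a library implementation for larger graphs.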

Graph partitioning based on nodes and edges weights

I have a graph G=(V,E) in which both edges and nodes have weights. I want to partition this graph into equal-sized partitions. The size of a partition is defined as sum(vi)-sum(ej), where vi is a node inside that partition and ej is an edge between two nodes in that partition. In my problem the graph is very dense (almost complete). Is there any approximation algorithm for that?
This is somewhat similar to bin packing with overlapping objects, where all bins have the same size. The weight of a node is its size, and the weight of an edge shows how much two objects can overlap.
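To make the size definition concrete, here is a small Python sketch that evaluates sum(vi)-sum(ej) for one candidate partition. The data layout is an assumption: node weights in a dictionary, and each undirected edge stored once under a `(u, v)` tuple. The name `partition_size` is hypothetical.

```python
def partition_size(part, node_weight, edge_weight):
    """Size of one partition: node weights minus weights of edges inside it."""
    members = set(part)
    total_nodes = sum(node_weight[v] for v in members)
    # Only edges with both endpoints inside the partition count as overlap.
    total_overlap = sum(
        w for (u, v), w in edge_weight.items() if u in members and v in members
    )
    return total_nodes - total_overlap
```

For example, with node weights a=5, b=4, c=3 and edge weights (a,b)=2, (b,c)=1, the partition {a, b} has size 5+4-2 = 7.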
I think the METIS program can solve this problem.
You can download it from this link:
http://glaros.dtc.umn.edu/gkhome/views/metis
It is well documented and very fast.

shortest path search in a map represented as 2d shapes

I have a small library of a few shortest path search algorithms. They were developed for simple undirected graphs (the normal representation: vertices and edges). Now I'd like to apply them to a slightly different scenario, where the maps are represented as 2-dimensional shapes connected by shared edges (edges of the polygons, that is). In this scenario, the search can start or end either at a map object or at some point (x,y). What would be the best approach? Try to apply the algorithms to the shapes directly? Or try to extract a 'normal' graph out of the shapes (I have preprocessing time available)? Any advice would be much appreciated, as I'm really not sure which way to go, and I don't have enough time (and skill) to explore many options...
Thanks a lot
What's the "path" you're looking for? A list of the shapes to traverse? (Otherwise you just draw a straight line between start+end points.)
It's easy to preprocess it into a format where the shapes are vertices and are connected by edges when the shapes share a polygon side. Then, just pass it off to your existing library to get the answer.
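The preprocessing step described above could look roughly like this in Python. It assumes each shape is a polygon given as an ordered list of `(x, y)` vertices, and that two shapes sharing a side use exactly the same two endpoints for it; the name `shape_graph` and the input format are hypothetical.

```python
from collections import defaultdict


def shape_graph(shapes):
    """Build {shape: set of neighbouring shapes} from polygons sharing a side.

    shapes maps a shape name to its polygon vertices in order.
    """
    side_owners = defaultdict(list)
    for name, points in shapes.items():
        for i in range(len(points)):
            a, b = points[i], points[(i + 1) % len(points)]
            # frozenset ignores direction, so (a, b) and (b, a) are the same side.
            side_owners[frozenset((a, b))].append(name)

    adjacency = {name: set() for name in shapes}
    for owners in side_owners.values():
        for i, u in enumerate(owners):
            for v in owners[i + 1:]:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency
```

The resulting adjacency dictionary is an ordinary undirected graph whose vertices are shapes, so it can be handed to an existing shortest path routine. A start or end point given as (x,y) would first have to be mapped to the shape that contains it.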

Clustered Graphs Visualization Techniques

I need to visualize a relatively large graph (6K nodes, 8K edges) that has the following properties:
Distinct Clusters. Approximately 50-100 Nodes per cluster and moderate interconnectivity at the cluster level
Minimal (5-10 inter-cluster edges per cluster) interconnectivity between clusters
Let global edge overlap = The edge overlaps caused by directly visualizing a graph of Clusters = {A, B, C, D, E}, Edges = {Pentagram of those clusters, which is non-planar by the way and will definitely generate edge overlap if you draw it out directly}
Let Local Edge Overlap = the above but { A, B, C, D, E } are just nodes.
I need to visualize graphs with the above in a way that satisfies the following requirements
No global edge overlap (i.e. edge overlaps caused by inter-cluster edges are not okay)
Local edge overlap within a cluster is fine
Anyone have thoughts on how to best visualize a graph with the requirements above?
One solution I've come up with to deal with the global edge overlap is to make sure a cluster A can only have a max of 1 direct edge to another cluster (B) during visualization. Any additional inter-cluster edges between cluster A -> C, A -> D, ... are disconnected and additional node/edges A -> A_C, C -> C_A, A -> A_D, D -> D_A... are created.
Anyone have any thoughts?
Prefuse has some good graph drawing algorithms built in, and it seems to handle fairly large graphs relatively well. You might try Flow Map Layout, which is built on top of Prefuse.
Given your objectives, I think that the Fruchterman-Reingold algorithm does a pretty decent job of preventing edge overlap. See for example this screenshot of a network consisting of multiple components drawn using the Fruchterman-Reingold algorithm. IGraph has built-in support for this algorithm (as does Networkx I believe) and is really fast.
There is a program built on top of Prefuse called SocialAction. You have to request the code from the author, but it does a lot of statistical analysis on the graph for you, such as identifying subgraphs. I've used it on a graph with more than 18,000 nodes, and although it is very slow at that scale it still works.
Although it may be silly to ask at this point, have you tried http://www.graphviz.org/ ?
I haven't seen too many graph visualization tools that support separating clusters within a graph visually. One option might be to take a look at WilmaScope. It looks to have some support for cluster based layouts.
The Organic layout in the yFiles framework handles clustered graphs fairly well. Try it first in yEd to see if it does what you need. It is probably reasonable to use nested graphs (also known as groups) for each cluster. Organic layout has a feature called Group Layout Policy, which can be used with incremental layout if inter-cluster and intra-cluster edges need to be laid out using different principles. With some effort, you can translate the graph into GraphML to avoid manual work.

Graph Drawing With Weighted Edges

I'm looking to build an algorithm (or reuse one) that organizes nodes and edges on a 2 dimensional canvas where edges can have corresponding weights.
Any starting material and info would be helpful.
What would the weights do to affect their placement on your canvas?
That being said, you might want to look into graphviz and, more specifically, the DOT language, which organizes nodes on a canvas.
Many graph visualization frameworks use a force-based simulation, in which all nodes exert a repulsive force against each other (with their mass being their size), and edges exert tension on the nodes they connect. This creates aesthetically-arranged graph visualizations.
Although again, I'm not sure where you want node "weights" to come into play. Do you want weighted nodes to be closer to the center? To be larger? Farther apart?
Many graph/network layout algorithms are implicitly capable of handling weighted networks, but you may need to do some pre-processing and tweak the implementation to get it to work. Usually the first step is to determine whether your weights represent "similarities" (usually interpreted to mean that stronger weights should place nodes closer together) or "dissimilarities" (stronger weights = farther apart). The most common case is the former, so you will need to translate them to dissimilarities, often by subtracting each edge value from the maximum observed edge value in the network. The matrix of dissimilarity values for each edge can then be fed to the algorithm and interpreted as the desired distance in the layout space for each edge (i.e. "spring lengths"), usually after multiplying by some constant to transform to display units (pixels).
If you tell me what language you are using, I may be able to point you to some code examples.
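As a language-neutral illustration, the similarity-to-dissimilarity conversion described above can be sketched in a few lines of Python. The name `spring_lengths`, the `scale` constant and the `min_length` floor are assumptions: the floor is added here only so that the strongest edge does not end up with a target length of zero.

```python
def spring_lengths(similarity, scale=100.0, min_length=10.0):
    """Convert edge similarities into desired edge lengths in pixels.

    similarity maps an edge, e.g. a (u, v) tuple, to its weight.
    """
    strongest = max(similarity.values())
    # Stronger similarity -> smaller dissimilarity -> shorter spring.
    return {
        edge: min_length + (strongest - weight) * scale
        for edge, weight in similarity.items()
    }
```

For example, with weights 3 and 1, `scale=50` and `min_length=10`, the stronger edge gets a target length of 10 pixels and the weaker one 110 pixels. These lengths can then be passed to a spring-based layout that accepts per-edge rest lengths.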
