Partition graph into groups of neighbours having the same class

Using JGraphT, I would like to partition a graph into groups where each group consists of a connected sub-graph of vertices that have the same "class" (denoted using colors below).
Example -- the desired groups are in red:
I think this is a rather simple demand and yet I can't find a (built-in) way to do this. I notice there is a PartitioningImpl class, that one constructs using a List<Set<V>> classes, yet I don't see a way to use this to partition a graph.
Ideally, I'd provide something with my graph and vertex classes (a map of V-->Integer for instance) and it would return something like a List<Set<V>> of partitioned vertex groups.

Sometimes you just cannot avoid writing some code
LOOP over classes
    LOOP over nodes that are in class
        Copy node to new graph
    LOOP over edges that connect nodes in class
        Copy edge to new graph
    LOOP over connected components in new graph
        Save component as a group graph for class
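The loop above can be sketched without any graph library. This is a minimal sketch, assuming the graph is given as an undirected adjacency map; the class name ClassGroups and the method partition are hypothetical and not part of JGraphT. Rather than copying each class into a new graph, it flood-fills from every unvisited vertex and only crosses edges whose endpoints share a class, which should yield the same groups.

```java
import java.util.*;

/** Sketch only: hypothetical helper working on a plain adjacency map, not JGraphT. */
class ClassGroups {

    /**
     * @param adjacency undirected graph: every vertex maps to its neighbours
     * @param classes   class label of every vertex
     * @return one set per connected group of same-class vertices
     */
    static <V> List<Set<V>> partition(Map<V, Set<V>> adjacency, Map<V, Integer> classes) {
        List<Set<V>> groups = new ArrayList<>();
        Set<V> visited = new HashSet<>();
        for (V start : adjacency.keySet()) {
            if (!visited.add(start)) {
                continue; // already assigned to a group
            }
            Integer label = classes.get(start);
            Set<V> group = new LinkedHashSet<>();
            Deque<V> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                V v = queue.remove();
                group.add(v);
                for (V w : adjacency.getOrDefault(v, Collections.emptySet())) {
                    // Only follow edges whose endpoints share a class.
                    if (Objects.equals(label, classes.get(w)) && visited.add(w)) {
                        queue.add(w);
                    }
                }
            }
            groups.add(group);
        }
        return groups;
    }
}
```

Each vertex and each edge is visited a constant number of times, so the cost is linear in the size of the graph. A vertex whose neighbours all belong to other classes ends up in a group of its own.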

This is a fairly simple approach using JGraphT:
First remove edges linking neighbouring vertices that belong to different classes and then use ConnectivityInspector on the reduced graph to find connected components, which form the groups.
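SimpleGraph<V, E> graph; // given by user
Map<V, Integer> classes; // given by user
// Collect every edge whose endpoints belong to different classes.
List<E> toRemove = new ArrayList<>();
graph.edgeSet().forEach(e -> {
    V a = graph.getEdgeSource(e);
    V b = graph.getEdgeTarget(e);
    if (!classes.get(a).equals(classes.get(b))) {
        toRemove.add(e);
    }
});
// Note: this modifies the given graph; work on a copy if the original is still needed.
graph.removeAllEdges(toRemove);
// The connected components of the reduced graph are the groups.
ConnectivityInspector<V, E> ci = new ConnectivityInspector<>(graph);
List<Set<V>> groups = ci.connectedSets();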

Related

Efficiently perform graph contraction based on a criterion

I have a directed weighted simple graph. I want to contract every node that has equal node value with another node that is directly connected to it. After contraction parallel edges would be made into one with the sum of the weights.
What is the most efficient way/algorithm for doing this? My graph is stored as an adjacency list if that would alter the answer.
If you are allowed to create a new graph and do not need to do this in place, a union-find data structure might help you here: https://en.wikipedia.org/wiki/Disjoint-set_data_structure
This structure allows you to define a representative vertex for each set of vertices that get merged together. Then you create your new graph on this set of representatives and use the union-find structure to create the edges of the new graph.
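The union-find idea can be sketched as follows. This is a minimal sketch under stated assumptions: vertices are numbered 0..n-1, the class Contraction, its nested Edge class, and the method contract are hypothetical names, and edges that end up inside a merged group are dropped (the question does not say what should happen to them). Union by rank is omitted for brevity.

```java
import java.util.*;

/** Sketch only: hypothetical names, vertices are assumed to be numbered 0..n-1. */
class Contraction {

    /** Directed weighted edge introduced for this sketch. */
    static final class Edge {
        final int from;
        final int to;
        final double weight;

        Edge(int from, int to, double weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    /** Finds the representative of x, shortening the path as it goes. */
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving
            x = parent[x];
        }
        return x;
    }

    /**
     * @param values node value of each vertex
     * @param edges  directed weighted edges of the original graph
     * @return summed weight per (representative from, representative to) pair
     */
    static Map<List<Integer>, Double> contract(int[] values, List<Edge> edges) {
        int[] parent = new int[values.length];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        // Pass 1: merge the endpoints of every edge whose endpoints share a value.
        for (Edge e : edges) {
            if (values[e.from] == values[e.to]) {
                parent[find(parent, e.from)] = find(parent, e.to);
            }
        }
        // Pass 2: map edges onto representatives, summing parallel edges.
        Map<List<Integer>, Double> merged = new HashMap<>();
        for (Edge e : edges) {
            int a = find(parent, e.from);
            int b = find(parent, e.to);
            if (a != b) { // assumption: edges inside a merged group are dropped
                merged.merge(List.of(a, b), e.weight, Double::sum);
            }
        }
        return merged;
    }
}
```

Both passes touch each edge once, so with an adjacency list as input the work is close to linear in the number of edges.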

Break up graph into smallest sub-components of 2-nodes or greater

I wish to be able to separate my graph into subcomponents such that the removal of any single node would create no further subcomponents (excluding single nodes). As an example, see the two images below.
The first image shows the complete graph. The second image shows the subcomponents of the graph when it has been split into the smallest possible subcomponents. As can be seen from the second image, the vertex names have been maintained. I don't need the new structure to be a single graph; it can be a list of graphs, or even a list of the nodes in each component.
The component of nodes 4-5-6 remains as removing any of the three nodes will not create a new component as the node that was broken off will only be a single node.
At the moment I am trying to put together an iterative process, that removes nodes sequentially in ascending degree order and recurses into the resultant new components. However, it is difficult and I imagine someone else has done it better before.
You say you want the "smallest subcomponents of 2 nodes or greater", and that your example has the "smallest possible subcomponents". But what you actually meant is the largest possible subcomponents such that the removal of any single node would create no further subcomponents, right? Otherwise you could just separate the graph into a collection of all of the 2-graphs.
I believe, then, that your problem can be described as finding all "biconnected components" (aka maximal biconnected subgraphs of a graph): https://en.wikipedia.org/wiki/Biconnected_component
As you said in the comments, igraph has the function biconnected_components(g), which will solve your problem. :)
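For readers who cannot use igraph, here is a minimal sketch of the classic depth-first approach to biconnected components (Hopcroft and Tarjan), assuming a simple undirected graph with vertices numbered 0..n-1 and given as adjacency lists. The class name Biconnected and its method components are hypothetical. The recursion depth can reach the number of vertices, so very large graphs may need an iterative variant.

```java
import java.util.*;

/** Sketch only: hypothetical class, assumes a simple undirected graph on 0..n-1. */
class Biconnected {
    private final List<List<Integer>> adj;
    private final int[] disc; // discovery time, 0 means unvisited
    private final int[] low;  // lowest discovery time reachable from the subtree
    private final Deque<int[]> edgeStack = new ArrayDeque<>();
    private final List<Set<Integer>> result = new ArrayList<>();
    private int time = 0;

    private Biconnected(List<List<Integer>> adj) {
        this.adj = adj;
        this.disc = new int[adj.size()];
        this.low = new int[adj.size()];
    }

    /** Returns the vertex set of every biconnected component (a bridge counts as one). */
    static List<Set<Integer>> components(List<List<Integer>> adj) {
        Biconnected b = new Biconnected(adj);
        for (int v = 0; v < adj.size(); v++) {
            if (b.disc[v] == 0) {
                b.dfs(v, -1);
            }
        }
        return b.result;
    }

    private void dfs(int u, int parent) {
        disc[u] = low[u] = ++time;
        for (int v : adj.get(u)) {
            if (disc[v] == 0) { // tree edge
                edgeStack.push(new int[] {u, v});
                dfs(v, u);
                low[u] = Math.min(low[u], low[v]);
                if (low[v] >= disc[u]) { // u separates the subtree of v
                    Set<Integer> component = new TreeSet<>();
                    int[] e;
                    do {
                        e = edgeStack.pop();
                        component.add(e[0]);
                        component.add(e[1]);
                    } while (e[0] != u || e[1] != v);
                    result.add(component);
                }
            } else if (v != parent && disc[v] < disc[u]) { // back edge to an ancestor
                edgeStack.push(new int[] {u, v});
                low[u] = Math.min(low[u], disc[v]);
            }
        }
    }
}
```

Vertices that separate components (articulation points) appear in more than one returned set, and isolated vertices appear in none, which matches the "excluding single nodes" requirement in the question.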

How to get all Graph Nodes in ArangoDB without Start-Node

In OrientDB, you can get the whole graph with just 'Select From v'.
So far, I have used AQL in ArangoDB with a start node:
for v,e,p IN 2 ANY 'user/188802' graph 'a' return p
Now I want to get all graph nodes in ArangoDB without specifying a start node. How can I do that?
Graphs are a grouping of Edge collections. Each Edge collection references _from and _to documents which are stored in Document collections.
The graph traversal queries expect you to have a starting position and it returns the results for that single starting position.
It is possible to identify all possible starting positions, and then run graph traversals over those positions.
You'll need to know the names of the document collections that make up your graph. You can then insert them into an AQL query like this:
FOR vertex IN UNION(
    (FOR v IN document_collection_1 RETURN v._id),
    (FOR v IN document_collection_2 RETURN v._id),
    (FOR v IN document_collection_3 RETURN v._id)
)
    FOR v, e IN 1..5 OUTBOUND vertex GRAPH 'my_graph_name' OPTIONS { uniqueVertices: true }
        RETURN DISTINCT [
            {
                _from: e._from,
                _to: e._to
            }
        ]
Remember that in ArangoDB it is possible for a document collection to be bound to more than one graph, so you'll need to ensure you identify all document collections that are part of the graph.
This query will then extract an array of objects that contain all links defined in the graph. This query focuses only on vertices with edges that are part of the graph. If the vertex has no edge on it, it won't appear in the output as it is not part of the graph.

How to randomly pick a vertex or edge from graph of jGraphT

I have created a Graph with a set of edges I have (4000K Edges and 4K nodes).
Now I want to take 10% of the edges from the corpus to create a train and test data set.
I want to pick an edge at random and verify whether the vertices of this edge have an edge to a random vertex. If so, I will remove that edge from the graph and also write it to a test file, so that later I can predict the edges of the test file using some similarity function.
Logic is I am trying to predict A->C, given A->B and B->C.
Now the problem is that I cannot find a way to randomly pick an edge and randomly pick a vertex in JGraphT. My vertex names are strings containing random numbers.
Does anyone have a solution for this?
There is a possibility. See the example first:
DirectedGraph<String, DefaultEdge> graph = new DefaultDirectedGraph<String, DefaultEdge>(DefaultEdge.class);
Object[] vertexSet = graph.vertexSet().toArray();
Object[] edgeSet = graph.edgeSet().toArray();
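String someRndNode = (String) vertexSet[getSomeRandomNumberBetween(0, vertexSet.length)];
DefaultEdge someRndEdge = (DefaultEdge) edgeSet[getSomeRandomNumberBetween(0, edgeSet.length)];
You simply copy the vertex set and the edge set of your graph into arrays, draw a random index for each array, and read the element at that index. Here getSomeRandomNumberBetween is a placeholder for your own helper; its upper bound must be exclusive, otherwise the index can run past the end of the array.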
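The same idea can be written as a small reusable helper. This is a minimal sketch using only java.util; the class RandomPick and its methods pickOne and sample are hypothetical names. With JGraphT you would pass in graph.vertexSet() or graph.edgeSet(). The sample method is one possible way to draw the 10% of edges mentioned in the question in a single step.

```java
import java.util.*;

/** Sketch only: hypothetical helper, works on any collection such as a vertex or edge set. */
class RandomPick {

    /** Returns one uniformly random element; copies the collection, so O(n) per call. */
    static <T> T pickOne(Collection<T> items, Random rnd) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("empty collection");
        }
        List<T> list = new ArrayList<>(items);
        // nextInt(bound) returns a value in [0, bound), so the index is always valid.
        return list.get(rnd.nextInt(list.size()));
    }

    /** Returns a random sample holding the given fraction of the items (rounded down). */
    static <T> List<T> sample(Collection<T> items, double fraction, Random rnd) {
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, rnd);
        int k = (int) (shuffled.size() * fraction);
        return new ArrayList<>(shuffled.subList(0, k));
    }
}
```

If you pick many single elements from a large graph, copying the set on every call gets expensive; in that case copy it into a list once and index into that list repeatedly.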

Neo4j and Lucene multikey queries (or: should I use Cypher?)

I have a graph in Neo4j in which nodes represent random points in the plane, each node having the coordinates stored as properties (x and y, the value type is double).
I create the nodes and index them:
IndexManager index = graph.index();
Index<Node> nodesIndex = index.forNodes("points");
for (int i = 0; i < points.length; i++) {
    Point p = points[i];
    Node n;
    Transaction tx = graph.beginTx();
    try {
        n = graph.createNode();
        n.setProperty("x", p.getX());
        n.setProperty("y", p.getY());
        nodesIndex.add(n, "x", new ValueContext(p.getX()).indexNumeric());
        nodesIndex.add(n, "y", new ValueContext(p.getY()).indexNumeric());
        tx.success();
    } finally {
        tx.finish();
    }
}
Now, what I need to do, is query for the nodes which are in a square area. So for example I made this query:
http://localhost:7474/db/data/index/node/points?query=x:[0.0 TO 3.0] AND y:[0.0 TO 3.0]
And this is the response:
Node
    Properties
        y 1.0
        x 14.0
    Node info
        self /db/data/node/10
Node
    Properties
        y 1.0
        x 2.0
    Node info
        self /db/data/node/7
Node
    Properties
        y 1.0
        x 6.0
    Node info
        self /db/data/node/8
Node
    Properties
        y 1.0
        x 7.0
    Node info
        self /db/data/node/9
[Etc...]
As you see it is not working. And I don't understand why (maybe I need to configure the index?).
Note that I don't have to use Lucene. If there is a way to gather that information with Cypher (starting from a node centered in the square area), that would actually be better, because I also need the relationships between the nodes found.
Additional information
If it matters, the graph represents a Delaunay triangulation of a random set of points in the plane. In more abstract terms, I need to "extract" the entire subgraph that lies in a given area.
Any help is really appreciated!
I am afraid you can't do this via Cypher. There is no way Cypher can infer that you want to use a numeric Value context for this query (could probably be in some indexing meta info in future releases), and you need that to be able to query Lucene the way you want to. Easiest way over REST is probably to use Groovy, see custom sorting (same issue) at http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html#rest-api-send-an-arbitrary-groovy-script---lucene-sorting
