Neo4j - Adding logic to graph traversal

My question in short: can I modify the traversal logic used by Neo4j, i.e. control which edges are traversed and which aren't during the reachability computation?
Full description:
I am considering migrating from our current DB to Neo4j, and I am wondering whether Neo4j is a good fit for the following task:
We have large graphs with about 10M simple nodes; their only attribute is a single id.
We also have 3 kinds of edges: "standard", "opening" and "closing". The "opening" and "closing" edges also have a "color" attribute by which they are matched. Each "opening" edge has exactly one matching "closing" edge. For example, if there is one opening edge colored "3", there is also exactly one closing edge colored "3".
We need to solve reachability between two nodes, where the rules of traversal are fairly simple:
You can go through standard edges as you want, and you can go through opening edges as you want, as long as you keep the order of the visited "opening" edges in a stack. BUT (that's the tricky part) when you come to a junction that has several "closing" edges, you MUST go through the closing edge that matches the last "opening" edge encountered, and then pop that "opening" edge off the stack.
For example:
A-[Standard]->B-[Open color:3]->C-[Standard]->D-[Close color:3]->E
and also
D-[Close color:4]->F
Note that D has two "closing" edges, with different colors.
By the rules defined above, E is reachable from A, because color 3 is on top of the stack when the traversal arrives at D.
F, however, is not reachable from A.
Can Neo4j be configured for such graph traversal logic?
Thanks!!

This is possible by implementing your own PathExpander and passing it to the TraversalDescription. As Michael Hunger pointed out, BranchState can be used to optimize your expander so that you don't have to re-check the full path on each expansion; instead, each traversal branch carries some boiled-down (immutable, mind you) state, and the expander can pass a modified state on to each next step.
Unfortunately the Neo4j manual lacks good examples of using branch state. This sounds like an awesome use case for it, though!
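To make the branch-state idea concrete, here is a minimal sketch in plain Python rather than the Neo4j Java API. It is not a PathExpander, just an illustration of the rule from the question: every search branch carries an immutable tuple as its color stack, opening edges push, and a closing edge may only be taken when its color matches the top of the stack. Everything named here is hypothetical (the function reachable, the edge tuple layout, the max_stack bound). Further assumptions on my part: a closing edge cannot be taken with an empty stack, the target counts as reached whatever is left on the stack, and max_stack exists only so the search terminates on graphs with cycles of opening edges.

```python
from collections import deque

def reachable(edges, start, goal, max_stack=32):
    """edges: iterable of (src, dst, kind, color) with kind in
    {"standard", "open", "close"}; color is ignored for standard edges."""
    adjacency = {}
    for src, dst, kind, color in edges:
        adjacency.setdefault(src, []).append((dst, kind, color))

    start_state = (start, ())            # (node, immutable color stack)
    seen = {start_state}
    queue = deque([start_state])
    while queue:
        node, stack = queue.popleft()
        if node == goal:
            return True
        for dst, kind, color in adjacency.get(node, []):
            if kind == "standard":
                next_stack = stack                   # stack unchanged
            elif kind == "open":
                if len(stack) >= max_stack:          # assumed bound, see above
                    continue
                next_stack = stack + (color,)        # push
            elif kind == "close":
                if not stack or stack[-1] != color:  # must match last open
                    continue
                next_stack = stack[:-1]              # pop
            else:
                raise ValueError(f"unknown edge kind: {kind}")
            state = (dst, next_stack)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False

# The example from the question.
EDGES = [
    ("A", "B", "standard", None),
    ("B", "C", "open", 3),
    ("C", "D", "standard", None),
    ("D", "E", "close", 3),
    ("D", "F", "close", 4),
]
```

With these edges, reachable(EDGES, "A", "E") is True and reachable(EDGES, "A", "F") is False. The visited set is keyed on (node, stack) rather than on the node alone, because arriving at the same node with a different stack is a genuinely different state.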

Related

AWS Neptune: More performant to drop Edges before Vertices?

Using Neptune Engine 1.0.5.1 and Apache TinkerPop 3.5.2.
My question is in regard to the performance of Vertex removal - it is not about the loading of the Vertices.
We have a cron job that clears out a limited number (1000) of "expired" Vertices.
We get hold of and store the vertices to be removed in a Set.
We then remove these via a g.V([vertices]).sideEffect(drop()).next().
This works fine.
All of the Vertices to be removed will have 1 inE and 1 outE.
These Edges obviously get automatically removed when the linked Vertex is removed.
I am wondering, though, whether Neptune (under the hood) would be more performant if we got hold of and removed the Edges first, and then removed the Vertices.
I just wondered if anyone out there (mainly using Neptune, but it is probably a "thing" with other graph databases too) has looked into this and has any hard evidence either way.
Many thanks
As far as Amazon Neptune is concerned: if you are just doing a single-threaded drop of 1,000 vertices where each only has one incident edge, then what you are doing is fine. If you were dropping thousands (or more) of vertices in a multi-threaded fashion, different threads might try to get locks on the same object in the database and collide. In such cases, dropping the edges first can avoid those conflicts, and therefore the retries, which can improve performance.

Alloy model - how to display cycle in visualization of graph?

We have an Alloy model intended to find possible deadlocks in a system. It is set up such that when a counterexample is found, that implies the system being modeled may have a deadlock, i.e. a circular dependency between nodes in the graph. The problem is that the visual model of the graph is so complicated that it's nearly impossible to find the cycle representing the deadlock. If there were a way to highlight the cycle, or at least to highlight arcs in the graph that are directed "up" rather than "down", this would help us visualize things better (since in the model we have, a deadlock-free system has all arcs directed downward). Is there a way to highlight or selectively plot the nodes and arcs that create the counterexample?
The first thing that comes to my mind is that when Alloy shows an instance of a predicate, the various arguments to the predicate can be styled specially. So you might try (1) defining a predicate that is the inverse of your assertion, i.e. one that holds when a deadlock is present and which assigns a named role to the nodes in the cycle, and (2) setting the style to display those nodes in a distinct color or shape. You may be able to hide everything that's not in the cycle, or gray it out.

How can the number of strongly connected components of a graph change if a new edge is added

Exercise: 22.5-1 CLRS
How can the number of strongly connected components of a graph change if a new
edge is added?
Somewhere the answer is given as follows. If a new edge is added, one of two things could happen:
1) If the new edge connects two vertices that belong to a strongly connected component, the number of strongly connected components will remain the same.
2) If, instead, the edge connects two strongly connected components, and the edge is in the reverse direction of an existing path between the two components, then a new strongly connected component will be made, increasing the number of components.
I think the second point is incorrect.
Let's say we have two strongly connected components C and C'.
a) If there is no edge between them, or an edge C->C' already exists, and the new edge goes C->C', then the number of components does not change.
b) If an edge C->C' exists and the new edge goes C'->C, then C' is merged into C, decreasing the number of strongly connected components by 1, as every vertex becomes reachable from every other.
Please correct me if I am wrong.
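You're exactly correct. The answer you quoted is wrong in its description: adding an edge can never increase the number of strongly connected components. It either leaves the number unchanged or decreases it (possibly by more than one, if the new edge closes a cycle that runs through several components). Once all possible edges have been added, there's just a single strongly connected component left: the entire graph.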
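If you want to convince yourself on a small graph, here is a rough sketch that counts strongly connected components with Kosaraju's two-pass algorithm (my choice of algorithm; CLRS describes the same idea in section 22.5). The function name count_sccs and the example graph are made up for illustration.

```python
def count_sccs(vertices, edges):
    """Count strongly connected components (Kosaraju, iterative DFS)."""
    forward = {v: [] for v in vertices}
    backward = {v: [] for v in vertices}
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)

    # Pass 1: record vertices in order of increasing DFS finishing time.
    order, seen = [], set()
    for root in vertices:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(forward[root]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:                      # all neighbours explored
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph, latest-finished vertex first.
    count, assigned = 0, set()
    for root in reversed(order):
        if root in assigned:
            continue
        count += 1
        assigned.add(root)
        stack = [root]
        while stack:
            node = stack.pop()
            for nxt in backward[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
    return count

# Two components: C = {a, b, c} and C' = {d, e}, joined by the edge c->d.
VERTICES = ["a", "b", "c", "d", "e"]
BASE = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e"), ("e", "d"), ("c", "d")]
```

Here count_sccs(VERTICES, BASE) gives 2. Adding an edge inside C (a->c) or another edge in the C->C' direction (a->e) leaves it at 2, while adding a reverse edge (e->a) merges the two components and gives 1.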

A* Pathfinding - closest to unwalkable destination

I already have an A* Implementation that works. The problem is that if you pick a destination that is unwalkable, no path is returned. I want to be able to get the 'closest' I can get.
The preferable option would be completely dynamic (not just checking the 8 tiles around the destination to try to find one). That way, even if they click an unwalkable tile surrounded by a huge square of unwalkable tiles, it will still get as close as it can.
While the simple answers provided here MIGHT be sufficient, I think it depends on your game type and what you're trying to achieve.
For example, take this play field (sorry, I'm reusing the same software I used to show you the fog of war :)):
As you can see, an Angry Chicken is blocking the path between the left side and the right side. The Angry Chicken could be anything... if it's a static obstacle, then going with the lowest h node might be enough, but if it's a dynamic object (like a locked door, drawbridge, etc.) the following examples might help you find out how you want to solve your problem.
If we set the destination for our Hero on the other side:
We need to think about what we want the path to be, since obviously we can't reach the destination. Using a standard heuristic like Manhattan distance or Euclidean distance, you will get this result:
This might be good enough, but if there's any way our little Hero could interact with the chicken to pass, it doesn't make sense at all. What you want is this:
How can you do this? Well, an easy way is to pathfind on hierarchical graphs. This sounds complicated but it isn't. First, you want to be able to build a new set of high level nodes and edges, where each high level node contains multiple grid nodes (or nodes of whatever other representation you use, it wouldn't change a thing).
As you can see, we now have a right blue node and a left red node. The arrow represents the edge between the two nodes. How do you build this graph, you ask? It's easy: simply start from an open node, expand all of its neighbors and add them to a high level node. When you're done, open the dynamic nodes that could lead to another part of the graph and do the same.
Now, when you ask for a path from our Hero to the red X, you first do the pathfinding on the high level... is there a way from the blue node to the red node? Yes! Through the chicken.
You can now easily know how to navigate on the blue side by going to the edge that will allow you to cross, which is the chicken.
If it was just a plain wall, you could determine very quickly, by visiting a single node, that there is NO way to reach the other side and then handle it the way you want, possibly still performing an A* and returning the lowest h node.
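The high level graph described above could be built roughly like this. This is only a sketch under my own assumptions: the grid is a list of strings where "." is walkable, "#" is a static wall and "D" is a single-tile dynamic obstacle (the chicken), movement is 4-directional, and the names build_regions and region_links are invented for the example.

```python
from collections import deque

def neighbours(r, c, rows, cols):
    """Yield the 4-connected neighbours that lie inside the grid."""
    for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

def build_regions(grid):
    """Flood-fill the walkable tiles; each fill becomes one high level node."""
    rows, cols = len(grid), len(grid[0])
    region_of, next_id = {}, 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "." or (r, c) in region_of:
                continue
            region_of[(r, c)] = next_id
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for cell in neighbours(cr, cc, rows, cols):
                    if grid[cell[0]][cell[1]] == "." and cell not in region_of:
                        region_of[cell] = next_id
                        queue.append(cell)
            next_id += 1
    return region_of

def region_links(grid, region_of):
    """High level edges: dynamic tiles that touch two different regions."""
    rows, cols = len(grid), len(grid[0])
    links = {}
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "D":
                continue
            touching = sorted({region_of[n] for n in neighbours(r, c, rows, cols)
                               if n in region_of})
            for i, a in enumerate(touching):
                for b in touching[i + 1:]:
                    links.setdefault((a, b), []).append((r, c))
    return links

WITH_CHICKEN = ["..#..",
                "..D..",
                "..#.."]
PLAIN_WALL = ["..#..",
              "..#..",
              "..#.."]
```

For WITH_CHICKEN the two sides end up in different regions joined by one link at tile (1, 2), so the high level query says "yes, through the chicken". For PLAIN_WALL there are no links at all, so you know the other side is unreachable without running a grid-level search.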
While searching, you could keep a pointer to the tile with the lowest h-value seen so far; then, if no path is found, simply generate a path to that tile instead.
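A minimal sketch of that idea, not tied to your existing implementation: a small grid A* that remembers the expanded tile with the lowest h and falls back to it when the destination cannot be reached. The grid format ("#" is unwalkable), 4-directional movement with unit cost, the Manhattan heuristic, tie-breaking on the lower g, and the name astar_closest are all assumptions made for the example; the start tile is assumed to be walkable.

```python
import heapq

def astar_closest(grid, start, goal):
    """Return a path to goal, or to the reachable tile closest to it."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):                       # Manhattan distance to the goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    g_cost = {start: 0}
    came_from = {start: None}
    best = start                       # lowest-h tile expanded so far
    open_heap = [(h(start), 0, start)]
    closed = set()
    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current in closed:
            continue
        closed.add(current)
        # Prefer lower h; break ties with the shorter route (lower g).
        if (h(current), g) < (h(best), g_cost[best]):
            best = current
        if current == goal:
            break
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                came_from[nxt] = current
                heapq.heappush(open_heap, (new_g + h(nxt), new_g, nxt))

    # Walk the parent pointers back from the best tile (the goal if reached).
    path, node = [], best
    while node is not None:
        path.append(node)
        node = came_from[node]
    return path[::-1]

GRID = [".....",
        ".....",
        "..###",
        "..###"]
```

With this grid, asking for the unwalkable tile (3, 4) from (0, 0) returns a path that ends at (1, 4), the reachable tile nearest to it. Note the cost: when the destination is unreachable, the whole reachable area gets explored before the fallback kicks in.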

Fixed length path between two graph nodes

Is there an algorithm that will, if given two nodes on a graph, find a route between them that takes the specified number of hops? Any node can be connected to any other.
The points at the moment are located in 2D space, so I'm not sure if a graph is the best approach.
Have you tried iterative-deepening DFS?
If you have nodes and are seeking routes in terms of hops, then a graph is probably the right approach. I'm not sure I understand what you are trying to do and what the constraints are, though, especially with respect to "Any node can be connected to any other", which seems a bit open ended.
Disregarding that, however, and assuming some graph representation:
It seems like you could start at the first node and do a depth-first search from there, terminating a branch of the search if (a) the number of hops taken exceeds your specified number or (b) it has arrived at the second node. This will find the first (not the only) path connecting the two nodes in at most that many hops.
If it has to be exactly the specified number of hops, terminate any branch of the search once the hops have gone over, and terminate with success only if you arrive at the second node with exactly that many hops taken.
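For the exact-hops variant, a depth-limited DFS could look something like the sketch below. It assumes the graph is a dict mapping each node to a list of its neighbours and that a node may not be revisited within one path (a simple path); the name path_with_exact_hops is made up.

```python
def path_with_exact_hops(graph, start, goal, hops):
    """Return one simple path from start to goal with exactly `hops` edges,
    or None if there is none."""
    path = [start]
    on_path = {start}                  # nodes on the current branch only

    def dfs(node, remaining):
        if remaining == 0:             # hop budget used up: succeed only at goal
            return node == goal
        for nxt in graph.get(node, ()):
            if nxt in on_path:
                continue
            # Reaching the goal early is a dead end for a simple path,
            # because it could not be visited a second time.
            if nxt == goal and remaining != 1:
                continue
            path.append(nxt)
            on_path.add(nxt)
            if dfs(nxt, remaining - 1):
                return True
            path.pop()
            on_path.remove(nxt)
        return False

    return list(path) if dfs(start, hops) else None

GRAPH = {"A": ["B", "D"], "B": ["C"], "C": ["D"], "D": []}
```

With this graph, path_with_exact_hops(GRAPH, "A", "D", 3) returns ["A", "B", "C", "D"], one hop gives ["A", "D"], and two hops gives None. The worst case is exponential in the number of hops, so this is only practical for small hop counts or sparse graphs.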
Dumb approach (the data structure is an array of stacks). This is basically a breadth-first search (BFS) to depth N, except that if loops are allowed (you did not clarify, but I assume they are), you don't exclude visited nodes from further searching.
Push the starting node onto the stack stored in the array at index 0 (index = depth).
For each level/index "l" from 0 to N-1:
For each node on the stack stored at level "l", find all its neighbors and push them onto the stack stored at level "l+1".
Important: if your task allows paths that contain loops, do NOT check whether you have already visited a node you add. If it does not allow loops, use a hash of visited nodes so that no node is added twice (note that one global visited set only finds shortest routes; to find loop-free routes of exactly N hops, track the visited nodes per path instead).
Stop when you finish level "N-1".
Loop over all the nodes you just added to the stack at index "N" and look for your destination node. If it is found: success; if not, there is no such path.
Please note that if by "every node can be connected" you are implying a FULLY CONNECTED graph, then there exists a mathematical answer that does not involve actually visiting nodes (however, the formula is too long to write in the text-entry field of StackOverflow).
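The level-by-level search described above (the "dumb approach") can be sketched as follows, for the case where loops are allowed. One deliberate deviation: instead of a stack per level, each level here is a dict, so a node is stored at most once per level along with one predecessor. That keeps a level from growing beyond the number of nodes while still allowing a node to reappear at later levels. The dict-of-lists graph format and the name walk_with_exact_hops are assumptions for the example.

```python
def walk_with_exact_hops(graph, start, goal, hops):
    """Return one walk (nodes may repeat) from start to goal that uses
    exactly `hops` edges, or None if no such walk exists."""
    # levels[d] maps every node reachable in exactly d hops to one
    # predecessor at level d - 1.
    levels = [{start: None}]
    for _ in range(hops):
        next_level = {}
        for node in levels[-1]:
            for neighbour in graph.get(node, ()):
                # No global "visited" check: loops are allowed.
                next_level.setdefault(neighbour, node)
        levels.append(next_level)

    if goal not in levels[hops]:
        return None

    # Follow the stored predecessors back from the goal to the start.
    walk = [goal]
    for depth in range(hops, 0, -1):
        walk.append(levels[depth][walk[-1]])
    return walk[::-1]

LINE = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
```

On this three-node line, two hops from "A" to "C" gives ["A", "B", "C"], three hops gives None (every walk from "A" to "C" has an even number of edges), and four hops gives a walk that bounces back once before arriving.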

Resources