Are there any suggested JavaScript alternatives to Python's pygraph or NetworkX? Note that visualization is not necessary (I would even prefer not to have it).
The library should be able to parse a format capable of retaining labeling and attributes on nodes and edges (DOT, GraphML?). It should support operations such as:
Listing nodes and edges.
Given a node, the edges which point in/out to/from it.
Given a node or edge, return the attached attributes.
Given two nodes that are connected, determine the most complete path. When running this operation a predicate function should be provided to determine if a node should be included in the search or not.
To put it in context, the browser-based application will traverse the graph from a predetermined start node. Each node holds an attribute 'userValue' which is compared against conditions (rules?) held as attributes on the node's out-edges. For the traversal to continue, the edge condition must evaluate to true against 'userValue'. The graph will always contain a predetermined start and end (or goal) node.
You could try
JSNetworkX
A port of the NetworkX graph library to JavaScript
http://felix-kling.de/JSNetworkX/
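Independently of the library chosen, the traversal described in the question needs very little machinery. Here is a minimal sketch in Python (the language of the NetworkX original), assuming a hypothetical in-memory layout: `nodes` maps a node id to its attributes, and `out_edges` maps a node id to a list of `(target, condition)` pairs where `condition` is a predicate on the source node's 'userValue'. None of these names come from JSNetworkX or NetworkX.

```python
from collections import deque

def traverse(nodes, out_edges, start, goal):
    """Breadth-first walk that follows an edge only when its condition
    accepts the source node's userValue. Returns a path or None."""
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parents back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        user_value = nodes[node]["userValue"]
        for target, condition in out_edges.get(node, []):
            if target not in parent and condition(user_value):
                parent[target] = node
                queue.append(target)
    return None

# Hypothetical example data.
nodes = {
    "start": {"userValue": 5},
    "a": {"userValue": 1},
    "b": {"userValue": 9},
    "goal": {"userValue": 0},
}
out_edges = {
    "start": [("a", lambda v: v > 3), ("b", lambda v: v > 7)],
    "a": [("goal", lambda v: v == 1)],
    "b": [("goal", lambda v: True)],
}
result = traverse(nodes, out_edges, "start", "goal")
```

With this data the edge to "b" is rejected because 5 is not greater than 7, so the walk reaches the goal through "a".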
I have read the documentation about anonymous traversals. I understand that they can be started with __ and used inside step modulators, but I don't understand them conceptually. Why can't we use a normal traversal spawned from a graph traversal source inside step modulators? For example, in the following Gremlin code to create an edge:
this.g
    .V(fromId) // get vertex of id given for the source
    .as("fromVertex") // label as fromVertex to be accessed later
    .V(toId) // get vertex of id given for destination
    .coalesce( // evaluates the provided traversals in order and returns the first traversal that emits at least one element
        inE(label) // check incoming edge of label given
            .where( // conditional check to check if edge exists
                outV() // get the source (out) vertex of the edge to check
                    .as("fromVertex")), // against staged vertex
        addE(label) // add edge if not present
            .property(T.id, id) // with given id
            .from("fromVertex")) // from source vertex
    .next(); // end traversal to commit to graph
Why are __.inE() and __.addE() anonymous? Why can't we write this.g.inE() and this.g.addE() instead? Either way, the compiler does not complain. So what special benefit does an anonymous traversal give us here?
tldr; Note that in 3.5.0, users are prevented from utilizing a traversal spawned from a GraphTraversalSource and must use __ so it is already something you can expect to see enforced in the latest release.
More historically speaking....
A GraphTraversalSource, your g, is meant to spawn new traversals from start steps with the configurations of the source assigned. An anonymous traversal is meant to take on the internal configurations of the parent traversal it is assigned to as it is spawned "blank". While a traversal spawned from g can have its internal configuration overwritten, when assigned to a parent, it's not something that is really part of the design for it to always work that way, so you take a chance in relying on that behavior.
Another point is that from the full list of Gremlin steps, only a few are actually "start steps" (i.e. addV(), addE(), inject(), V(), E()) so in building your child traversals you can really only ever use those options. As you often need access to the full list of Gremlin steps to start a child traversal argument, it is better to simply prefer __. By being consistent with this convention, it prevents confusion as to why child traversals "sometimes start with g and other times start with __" if they are used interchangeably throughout a single traversal.
There are perhaps other technical reasons why the __ is required. An easy one to see that doesn't require a ton of explanation can be demonstrated in the following Gremlin Console snippet:
gremlin> __.addV('person').steps[0].class
==>class org.apache.tinkerpop.gremlin.process.traversal.step.map.AddVertexStep
gremlin> g.addV('person').steps[0].class
==>class org.apache.tinkerpop.gremlin.process.traversal.step.map.AddVertexStartStep
The two traversals do not produce analogous steps. If using g in place of __ works today, it is by coincidence and not by design, which means that it could break in the future.
My current project features a set of nodes with inputs and outputs. Each node can take its input values and generate some output values. Those outputs can be used as inputs for other nodes. To minimize the amount of computation needed, node dependencies are checked on application start. When updating the nodes, they are updated in the reverse order they depend on each other.
That said, the nodes resemble a directed graph. I am using iterative DFS (no recursion to avoid stack overflows in huge graphs) to work out the dependencies and create an order for updating the nodes.
I further want to avoid cycles in the graph, because cyclic dependencies will break the updater algorithm and cause an endless loop.
There are recursive approaches to finding cycles with DFS by tracking nodes on the recursion stack, but is there a way to do it iteratively? I could then embed the cycle search in the main dependency resolver to speed things up.
There are plenty of cycle-detection algorithms available online. The simplest ones are augmented versions of Dijkstra's algorithm. You maintain a list of visited nodes and costs to get there. In your design, replace the "cost" with the path to get there.
In each iteration of the algorithm, you grab the next node on the "active" list and look at each node that follows it in the graph (i.e. each of its dependencies). If that node is already on the path that led to the current node, then you have a cycle, and the path you maintained shows the loop. Merely being on the "visited" list is not enough in a directed graph, because two branches can legitimately share a dependency.
Is that enough to get you moving?
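As one possible way to do this without recursion, here is a sketch of an iterative DFS with the usual three-state marking (unvisited, on the current path, finished). It is a swapped-in standard technique rather than the exact bookkeeping described above. The function name and the `graph` layout (a dict mapping each node to the list of nodes it depends on) are assumptions for illustration. The same pass yields an update order, so it can be embedded in the dependency resolver.

```python
def order_or_cycle(graph):
    """Return (order, None) if the graph is acyclic, else (None, cycle).

    `order` lists every node after all of its dependencies.
    `cycle` is a list of nodes that starts and ends with the same node.
    """
    WHITE, GRAY, BLACK = 0, 1, 2    # unvisited, on current path, finished
    color = {}
    order = []
    for root in graph:
        if color.get(root, WHITE) != WHITE:
            continue
        color[root] = GRAY
        path = [root]                               # nodes on the current DFS path
        stack = [iter(graph.get(root, ()))]         # one child iterator per path node
        while stack:
            for child in stack[-1]:                 # resumes where it left off
                state = color.get(child, WHITE)
                if state == GRAY:                   # back edge: child is on the path
                    return None, path[path.index(child):] + [child]
                if state == WHITE:
                    color[child] = GRAY
                    path.append(child)
                    stack.append(iter(graph.get(child, ())))
                    break                           # descend into the child first
            else:                                   # all children handled
                stack.pop()
                done = path.pop()
                color[done] = BLACK
                order.append(done)
    return order, None
```

For example, `order_or_cycle({"a": ["b", "c"], "b": ["c"], "c": []})` reports no cycle even though "c" is reached twice, while `{"a": ["b"], "b": ["c"], "c": ["a"]}` returns the loop a, b, c, a.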
Try a timestamp. Add a meta timestamp and set it to zero on your nodes.
Previous answer (not applicable):
When you start a search, increment or grab a time() stamp. Then, when
you visit a node, compare it to the current search timestamp. If it
is the same, then you have found a cycle. If not then set the stamp
to current.
Next search, increment again.
Ok, this is how I'm assuming you are performing your DFS search:
Add Root node to a stack (for searching) and a vector (for updating).
Pop the stack and add children of the current node to the stack and to the vector
loop until stack is empty
reverse iterate the vector and update values (by referencing child nodes)
The problem: Cycles will cause the same set of nodes to be added to the stack.
Solution 1: Use a boolean/timestamp to see if the node has been visited before adding to the DFS search stack. This will eliminate cycles, but will not resolve them. You can spit out an error and quit.
Solution 2: Use a timestamp, but increment it each time you pop the stack. If a child node has a timestamp set, and it is less than the current stamp, you have found a cycle. Here's the kicker. When iterating over the values backwards, you can check the timestamps of the child nodes to see if they are greater than the current node. If less, then you've found a cycle, but you can use a default value.
In fact, I think Solution 1 can be resolved the same way by never following more than one child when updating the value and setting all nodes to a default value on start. Solution 2 will give you a warning while evaluating the graph whereas solution 1 only gives you a warning when creating the vector.
Given a directed graph with multiple start nodes and multiple end nodes, I need to form paths that visit every reachable edge, but I cannot visit any edge (or vertex) more than once during a single pass. [This is to electrically test every connection in a network by sending signals from start to end nodes, but I cannot allow paths to short together.]
Because I cannot re-visit edges during a single pass:
I can safely ignore the cycles in the graph.
I know each path I form will block other paths.
Consequently, I cannot visit every reachable edge in one pass, so multiple passes are necessary.
From context, I know that the minimum number of passes will be the maximum number of edges entering any vertex. Once I finish a given pass, I am free to re-visit edges that were visited in previous passes, but never-visited edges are the ones that I most want to visit.
I would like to visit "many" edges per pass, so that I can reduce the total number of passes, but I do not strictly need to minimize the number of passes.
Any suggestions on algorithms to accomplish this? It sounds a little like the route inspection problem, except that my graph is directed.
It is not clear from the question whether you have one or many start points and one or many end points. For simplicity let me assume a "one-to-many" network. Then your requirement (not visiting any edge or vertex more than once) means you actually generate a spanning tree of your graph with the given root.
A simple but not 100% solution that comes to mind is the following:
Assign some initial weights to the edges and apply a random spanning tree algorithm. Then decrease the weight (actually, relative probability) of visited edges. It is very likely that all edges will be visited.
In the case of a "many-to-many" connection you can play with different starting points. If some sources are not connected to some sinks the algorithm would throw an exception. If this is not what you expect, you can run a regular DFS first to collect all reachable vertices into some set; then you can use this set as a filter to form a boost::filtered_graph.
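A rough sketch of the weighted random-tree idea for the one-to-many case follows. It is written in plain Python rather than Boost, and every name in it is hypothetical. It grows each tree by repeatedly picking a frontier edge with probability proportional to its weight, which is one simple way to draw a random tree (not a uniformly distributed one), and it halves the weight of every edge it uses so that later passes favour edges that have not been tested yet.

```python
import random

def random_tree_pass(edges, root, weight, rng):
    """Grow one random tree from `root`; each vertex is entered at most once."""
    in_tree = {root}
    tree = []
    while True:
        # Edges leaving the tree towards a vertex not yet in it.
        frontier = [e for e in edges if e[0] in in_tree and e[1] not in in_tree]
        if not frontier:
            return tree
        edge = rng.choices(frontier, weights=[weight[e] for e in frontier])[0]
        tree.append(edge)
        in_tree.add(edge[1])

def cover_edges(edges, root, max_passes=200, decay=0.5, seed=0):
    """Run passes until every edge was used once or max_passes is reached."""
    rng = random.Random(seed)
    weight = {e: 1.0 for e in edges}
    covered, passes = set(), []
    while covered != set(edges) and len(passes) < max_passes:
        tree = random_tree_pass(edges, root, weight, rng)
        passes.append(tree)
        for e in tree:
            covered.add(e)
            weight[e] = max(weight[e] * decay, 1e-9)  # keep weights positive
    return passes, covered

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
passes, covered = cover_edges(edges, "a")
```

Edges that can never belong to a tree rooted at the chosen start (for example edges pointing back into the root) stay uncovered, which is why the loop is bounded by `max_passes`.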
My question, in short, is whether I can modify the traversal logic used by Neo4j: how can I control which edges are traversed and which aren't during the reachability computation?
Full description:
I am considering migrating from our current DB to neo4j, and I am wondering if neo4j is a good fit for the following task:
We have large graphs with about 10M simple nodes - their attribute is only a single id.
We also have 3 kinds of edges: "standard", "opening" and "closing". The "opening" and "closing" ones also have a "color" attribute, so they are matched. Each "opening" edge has exactly one matching "closing" edge. For example, if there is one opening edge colored "3", there is also exactly one closing edge colored the same.
We need to solve a reachability between two nodes where the rules of traversal are fairly simple:
You can go through standard edges as you want, you can go through opening edges as you want, while maintaining the order of visited "opening" edges in a stack BUT (that's the tricky part) when you come to a junction that has several "closing" edges, you MUST go through the closing edge which matches the last "open" edge encountered, and then pop out that "open" edge from the stack.
For example:
A-[Standard]->B-[Open color:3]->C-[Standard]->D-[Close color:3]->E
and also
D-[Close color:4]->F
note that D has two "closing" edges, with different colors.
By the rules defined above, E is reachable from A, as the color stack has [3] on its top when D is reached.
F, however, is not reachable from A.
Can neo4j be configured for such graph traversal logic?
Thanks!!
This is possible by implementing your own PathExpander and passing to the TraversalDescription. As Michael Hunger pointed out: BranchState can be used to optimize your expander so that you don't have to check the full path each expansion, but instead some kind of boiled down (immutable mind you) state that each traversal branch carries. The expander can pass on a modified state to each next step.
Unfortunately the neo4j manual lacks good examples of using branch state. This sounds like an awesome usage though!
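To make the idea of per-branch state concrete, here is a language-neutral sketch in plain Python. It does not use the Neo4j API. It searches over (node, stack) pairs, where the immutable tuple plays the role of the state each branch would carry. The edge tuple layout, the function name and the `max_stack` bound (which keeps the search finite if opening edges form a cycle) are assumptions made for the example.

```python
from collections import deque

def reachable(edges, source, target, max_stack=32):
    """edges: iterable of (src, dst, kind, color) with kind in
    {"standard", "open", "close"}; color is None for standard edges."""
    out = {}
    for src, dst, kind, color in edges:
        out.setdefault(src, []).append((dst, kind, color))

    start = (source, ())            # empty color stack at the source
    seen = {start}
    queue = deque([start])
    while queue:
        node, stack = queue.popleft()
        if node == target:
            return True
        for dst, kind, color in out.get(node, []):
            if kind == "standard":
                next_stack = stack
            elif kind == "open":
                if len(stack) >= max_stack:
                    continue        # assumed bound on nesting depth
                next_stack = stack + (color,)
            else:                   # "close": must match the top of the stack
                if not stack or stack[-1] != color:
                    continue
                next_stack = stack[:-1]
            state = (dst, next_stack)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False

example = [
    ("A", "B", "standard", None),
    ("B", "C", "open", 3),
    ("C", "D", "standard", None),
    ("D", "E", "close", 3),
    ("D", "F", "close", 4),
]
```

On the example from the question, `reachable(example, "A", "E")` is true and `reachable(example, "A", "F")` is false. Note that the visited set must hold (node, stack) pairs rather than nodes alone, since the same node can be worth revisiting with a different stack.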
Is there an algorithm that will, if given two nodes on a graph, find a route between them that takes the specified number of hops? Any node can be connected to any other.
The points at the moment are located in 2D space, so I'm not sure if a graph is the best approach.
Have you tried iterative-deepening DFS?
If you have nodes and are seeking to find routes in terms of hops, then a graph is probably the right approach. I'm not sure I understand what you are trying to do and what the constraints are, though, especially with respect to "Any node can be connected to any other", which seems a bit open ended.
Disregarding that, however; with some graph representation:
It seems like you could start at the first node and do a depth-first search from there, terminating a branch of the search if (a) the hops taken exceed your specified number or (b) you have arrived at the second node. This will find the first (not the only) path connecting the two nodes in at most that many hops.
If it has to be exactly the specified number of hops, terminate any branch of the search once its hop count reaches that number, and terminate with success if that branch has also arrived at the second node.
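A minimal sketch of the exact-hop variant, assuming the graph is a dict mapping each node to a list of its neighbors. The function name and the `allow_revisits` flag are hypothetical. The search is iterative and keeps a whole path per stack entry, which is simple but can take exponential time on dense graphs.

```python
def find_path_with_hops(graph, start, goal, hops, allow_revisits=True):
    """Depth-first search for a path from start to goal with exactly `hops` edges."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) - 1 == hops:       # hop budget used up: accept or drop the branch
            if path[-1] == goal:
                return path
            continue
        for neighbor in graph.get(path[-1], ()):
            if allow_revisits or neighbor not in path:
                stack.append(path + [neighbor])
    return None

line = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
```

On the three-node line above, `find_path_with_hops(line, "a", "c", 2)` returns the direct route, a 3-hop route does not exist, and a 4-hop route exists only when revisiting nodes is allowed.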
Dumb approach: (data structure is array of stacks). This is basically doing Breadth First Search (BFS) to depth N, except that if loops are allowed (you did not clarify but I assume they are), you don't exclude the visited nodes from further searching.
Push starting node on a stack stored in the array at index 0 (index=depth)
For each level/index "l" 0-N:
For each node on a stack stored at level "l", find all its neighbors, and push them onto a stack stored in level "l+1".
Important: if your task allows finding paths that contain loops, do NOT check if you already visited any node you add. If it does not allow loops, use a hash of visited nodes to not add any node twice.
Stop when you end level "N-1".
Loop over all the nodes you just added to stack at index "N" and find your destination node. If found: success, if not, no such path.
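The level-by-level steps above can be sketched as follows for the case where loops are allowed. Sets are used instead of stacks so that each level holds a node only once, and only the current level is kept because the question asks whether a route exists. The function name and the dict-of-lists graph layout are assumptions.

```python
def reachable_in_exactly(graph, start, goal, hops):
    """True if some walk of exactly `hops` edges leads from start to goal.
    Nodes may be revisited, so no visited set is kept."""
    level = {start}                     # nodes reachable in exactly 0 hops
    for _ in range(hops):
        level = {nbr for node in level for nbr in graph.get(node, ())}
        if not level:                   # dead end: nothing reachable at this depth
            return False
    return goal in level

line = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
```

Each level costs at most one scan over the edges, so the whole check is linear in the number of hops times the number of edges. Recovering the actual route would additionally require storing every level, as in the array-of-stacks description.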
Please note that if by "every node can be connected" you are implying a FULLY CONNECTED graph, then there exists a mathematical answer not involving actually visiting nodes
(however, the formula is too long to write in the text-entry field of StackOverflow)