Using the AQL traversal - graph

I want to use AQL to traverse a graph,I start from a vertex, I don't know how many layers of relationships, I want to go through the end,I want this traversal not to be influenced by direction,Is there any way to do it?Thank you for coming to help me.

Following edges in both, inbound and outbound direction, can be achieved with ANY. Unlimited traversal depth is not directly supported, but you can use an arbitrary high number for the maximum depth.
The same question was answered on GitHub by Simran-B.

Related

Is there a version of gremlin "bothE" that isn't actually "eitherE"?

I want to find all vertecies which has an edge to a specific vertex but the vertex should also have an edge back to the other vertex.
The following query does not behave in the way i want, it seems that bothE() means either in or out or both, not explicitly that both an incoming and outgoing edge must be present.
g.V(id).bothE().outV()
I also don't want the starting vertex (g.v(id)) to be returned. I've looked through the documentation but there's so many complex queries available that it'll take me a very long time to find this one simple thing i need. Any help is appreciated
You need something like this
g.V(id).as('a').out().where(out().as('a'))
Which basically says, find everything I am connected to by an outgoing edge that also has an outgoing edge back to me.

Does an Increased Number of Node Types Impact Performance of Graph DBs?

I am in the process of creating a graph database, a simple one for movies with several types of information like the actors, producers, directors and so on.
What I would like to know is, is it better to break down your nodes to a more granular level? For example, is it better to have two kinds of nodes for 'actors' and 'directors' or is it better to have one node, say 'person' and use different kinds of relationships like 'acted_in' and 'directed'? Does this even matter at all?
Further, is there any impact on the traversal queries? Does having more types of nodes mean that the traversal is slower?
Note: I intend to implement this using the Gremlin console in Amazon Neptune.
The answer really is it depends. If I were building such a model I would break out the key "nouns" into their own nodes. I would also label the edges appropriately such as ACTED_IN or DIRECTED.
The performance of any graph query depends on how much data it will need to touch (the fan out factor as you go from depth to depth).
The best advice I can give you is think about the questions you will need the graph to answer and try to design your data model so that writing those queries is as easy as possible. Don't be afraid to iterate multiple times on your data model also. That is common and expected.
Properties can be useful when you want to add a unique piece of information to a node - perhaps the birthday of the director.
Edge properties can be useful for filtering out unneeded edges but edge labels can also. In some cases you may find a label such as DIRECTED-IN-2005 is a useful short cut to avoid checking a label and a property on an edge.

Gremlin: how to get all of the graph structure surrounding a single vertex into a subgraph

I would like to get all of the graph structure surrounding a single vertex into a subgraph.
The TinkerPop documentation shows how to do this for a fixed number of traversal steps.
My question: are there any recipes for getting the entire surrounding structure (which may include cycles) without knowing ahead of time how many traversal steps are needed?
I have updated this question for the benefit of anyone who might land on this question, here is a gremlin statement that will capture an arbitrary graph structure that surrounds a vertex with id='xxx'
g.V('xxx').repeat(out().simplePath()).until(outE().count().is(eq(0))).path()
-- this incorporates Stephen Mallete's suggestion to use simplePath within the repeat step.
That example uses repeat()...times() but you could easily replace times() with until() and not know the number of steps ahead of time. See more information in the Reference Documentation on how repeat() works to see how to express different types of flow control and traversal stop conditions.

ArangoDB anonymous graph traversal

I am planing to use ArangoDB and I am faced with a problem I don't know how to solve. I would like to do simple traversals but in my case but there are two requirements that I don't know how to solve:
I will not know in advance the type of vertices than an edge will connect to. I want to be able to connect edge of one type to any vertex on any side.
For one vertex, I want to retrieve all connected vertices (depth 1) no matter the edge type.
For the requirement 1, an example would be a Tag vertex (to tag some entity with some information) and I want to be able to tag any vertex using i.e. HasTag edge in a named graph. From what I currently see is that I need to define the "From" collections ("To" collection is the Tag collection) and this is limited to 10 collections. Since I could have 100 or more From collections I don't see how to solve this with named graphs.
Option would be to use anonymous graphs but then I have a problem in the second requirement. I also want to have an option, when given a vertex, to find all connected vertices (depth = 1) no matter the type of an edge. In an anonymous graph I would need to specify all of the edge collections in a query and again, there could be 100 or more of them. I don't know if there is a limit to this number but I would assume there is one - maybe I'm mistaken since I haven't yet tried it out.
Has anyone any idea how to solve this with ArrangoDB? I really like the database but I would like it to be more "typeless", that is, that I wouldn't have to define the type of vertex collection an edge can connect to.
Best regards
Tomaz
You can have more than 10 vertex collections in a named graph. The limitation of 10 only exists in the webUI. Creating the named graph over the ArangoShell or the server console will work.

Graph exploration: does the choice to use incoming edges or outgoing edges affect performance?

I have been tinkering with Graphs for some time, with the objective that I implement appropriate portions of the server-side stack using them. I have used Scala-Graph and Neo4J, and I am learning Spark GraphX. In almost all the applications I have implemented, the model has been that of a Property Graph (Node -> Edge -> Node, with attributes).
When designing the graph (DAGs to be precise), if I spot a strong and directed relationship between two nodes, I set up an edge from one node to one node. This is obvious and intuitive. If a Person likes a Site, an edge with property 'likes' connects them. Thus:
[Nirmalya] -- (Likes) --> [StackOverFlow]
[John] -- (Likes) --> [StackOverFlow]
[Ted] -- (Likes) --> [GoogleGroups ]
[Nirmalya] -- (Likes) --> [Neo4J]
Now, using outgoing edges, I can easily find out which sites Nirmalya likes.
But, when I want to find out who else likes what Nirmalya likes (i.e.,John), I tend to think that I should create an edge from Site-type Node to Person-type Node also (with property 'isLikedBy'), so that the path is obvious and the traversal is intuitive. Every Person and Site must be connected in both the directions, so that I can reach the other from either to answer queries like this one.
[Nirmalya] -- (Likes) --> [StackOverFlow] -- (IsLikedBy) --> [John]
But from many examples given by experts, I see that this is not prescribed. Instead, this is achieved by making use of operators like incoming. In other words, if two Nodes have an edge set up between them, I don't need to set both the directions of the edge explicitly (just 'likes' is sufficient, 'isLikedBy' is superfluous). Implementation of adjacency matrix makes this possible perhaps but I get a bit confused because I am being allowed to derive a contra-direction even when that direction is not explicit in the DAG.
My question is where is the gap in my understanding? Is it that 'IsLikedBy' direction should ideally be present, but we are optimizing? Alternatively, is it that there can be UseCases where such bidirectional edges are necessary and I need to spot them? Am I completely missing a theoretical underpinning?
I will be glad to become wiser.
I think it depends on the software. I can speak for Neo4j, but not for the other tools that you mentioned ;)
In Neo4j relationships are designed to be traversable both forwards and backwards without a performance cost. This applies both to traversing in the Java APIs as well as using Cypher. You can query both specifying a direction of incoming/outgoing as well as querying for relationships without concern for the direction and it should also be the same performance characteristics.

Resources