How to specify the breadth-first or depth-first strategy in Gremlin

The depth-first strategy is the default strategy in Gremlin unless specified otherwise. How do I specify the breadth-first strategy?
My Gremlin query is as follows:
g.V().repeat(__.in('edgelabel').simplePath()).times(3).path()

Currently, repeat() defaults to BFS, which is unfortunate given that all other OLTP queries are DFS. There's a ticket to address that: https://github.com/apache/tinkerpop/pull/838
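To make the difference between the two strategies concrete, here is a minimal Python sketch (not Gremlin, and not how TinkerPop implements repeat()) that walks the same toy graph breadth-first and depth-first. The graph and the function names bfs_order and dfs_order are hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical toy graph as an adjacency list (not from the original question).
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["e"],
    "d": [],
    "e": [],
}

def bfs_order(graph, start):
    """Visit vertices level by level using a FIFO queue."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for n in graph[v]:
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return order

def dfs_order(graph, start):
    """Follow one branch to its end before backtracking, using a LIFO stack."""
    seen, order, stack = set(), [], [start]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        # Push neighbours in reverse so the first neighbour is explored first.
        stack.extend(reversed(graph[v]))
    return order

print(bfs_order(GRAPH, "a"))  # ['a', 'b', 'c', 'd', 'e']
print(dfs_order(GRAPH, "a"))  # ['a', 'b', 'd', 'c', 'e']
```

Both walks reach the same vertices; only the order differs. That order matters once a query stops early, for example with limit(1).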

Related

Gremlin shortest path

1- What is the difference between using the ShortestPathVertexProgram and running something like "g.V(1).repeat(out().simplePath()).until(hasId(5)).path().limit(1)" using the TraversalVertexProgram?
This comes down to the more general question: is it worth implementing a parallelized algorithm (if possible at all) in a Pregel-like fashion using low-level message-passing primitives, as can be seen in the code of the ShortestPathVertexProgram, or is it better to simply write it as a traversal and execute it as a parallel traversal? What is the complexity difference between the two approaches?
My intuition is that the former would be faster, but the VertexProgram API seems hard to use and is not well documented, and there is no thorough tutorial for it comparable to those for the GraphX Pregel API.

How to write LIKE queries in Gremlin for Neptune, given that Neptune does not support lambda functions

Is there any way to write LIKE queries such as '%match%' in Gremlin without using lambda functions?
Neptune does not support lambda functions.
There are usually ways to express lambdas with Gremlin steps. Indeed, it is often better to do so because graph providers can't optimize the portions of your queries that contain lambdas (as it is just arbitrary code).
Typically, the nature of the lambda contents determines whether or not it can be expressed easily with Gremlin steps. If the lambda uses a third-party library (e.g. a JDBC driver) that abstracts a bunch of complex or custom behavior, then expressing such concepts is usually not possible with just Gremlin steps.
For string comparisons like %match%, TinkerPop has long left that sort of support to the graph providers (e.g. the DSE Graph full-text search API). Each has its own way of expressing text searches, and you would use those provider-specific APIs in your applications.
As you have found, Neptune does not have such constructs at this time so there is little recourse for that capability. If you really need that feature, I'm afraid that you will have to be satisfied with a startsWith type of query:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().has('name',between('m','mz'))
==>v[1]
or choose a different graph system. Note that there has been recent discussion in the community about making text-based search a first-class feature of the Gremlin language, but no decision has really been made at this point.
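The reason between('m','mz') behaves like a startsWith check can be sketched in plain Python. This is only an illustration of the string-range comparison, not Neptune or TinkerPop code; the helper name between is reused here for readability, and the names are those of the vertices in the "modern" toy graph from the console session above.

```python
# Vertex names from TinkerPop's "modern" toy graph used in the console session.
NAMES = ["marko", "vadas", "lop", "josh", "ripple", "peter"]

def between(values, lower, upper):
    """Keep values with lower <= value < upper (lower inclusive, upper exclusive)."""
    return [v for v in values if lower <= v < upper]

print(between(NAMES, "m", "mz"))  # ['marko']
```

The range is only an approximation of a prefix match: a hypothetical name such as "mzungu" starts with "m" but sorts after "mz", so it would be missed. Picking an upper bound that sorts after every expected value avoids that.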

Cypher and Gremlin querying

I'm doing research on graph query languages. My understanding is that Gremlin is dedicated to traversal querying and that Cypher is efficient and easier to use, but I can't find a concrete example that differentiates them.
Can someone give me examples of queries that can be written in Cypher but not in Gremlin, or the opposite?
Thanks
It's not a matter of what one language can do that the other language can't. They're both complete enough that you can do any kind of graph query in either. The question is simply how hard you'll have to work to make it happen, and whether the result will be performant, readable, and easy to change.
Cypher is a declarative language, meaning that you declare what you want to see, and the engine figures out how to get that data for you. Gremlin is largely imperative, meaning that you specify how to traverse the graph. This tends to make Gremlin more brittle,

SQL like query language for Apache TinkerPop

Background: I have been using NEO4J and its Cypher Query till now and am looking to move to Apache TinkerPop to support multiple Graph DB.
In the Cypher query language, to find my friends I would write this query:
MATCH (you {name:"You"})-[:FRIEND]->(yourFriends)
RETURN you, yourFriends
Now I am looking for a query language that works with Gremlin and is similar to the one already used in my code.
From what I have seen, Gremlin has a script like "g.v(12).outE('knows').inV", but this is not similar to SQL syntax, which is what I am looking for.
Note: I am NOT looking for SQL connectivity, I am just looking for a SQL-like script.
TLDR;
The short answer to your question is that for TinkerPop-enabled databases you will need to write your queries in Gremlin; there is currently no SQL-like language.
Details
Gremlin differs from SQL and Cypher in multiple ways, but a significant one is that Gremlin is an imperative language while SQL and Cypher are declarative languages. In Gremlin you define how you want to traverse your graph, whereas in SQL/Cypher you define what you want and the engine optimizes the traversal for you.
For example the Cypher query you have above would be written in Gremlin as:
g.V().has('name', 'You').as('you').
  out('friend').as('yourFriends').
  select('you', 'yourFriends')
Currently you would need to translate your Cypher queries to Gremlin to work against any of the TinkerPop-enabled databases, including JanusGraph, CosmosDB, DSE Graph, AWS Neptune, and others. All the current providers can be found here: Tinkerpop Providers
Daniel Kuppitz has written a site that teaches you how to migrate from writing SQL queries to writing Gremlin ones. It is available here: SQL2Gremlin

Tinkerpop - Is it better to use Redis for key-value property indexes or to use KeyIndexableGraph

Pretty straightforward question but I can't find the info that I want - is it advisable to use the KeyIndexableGraph of tinkerpop or to roll your own super performant key/index solution on the most performant and specialized stores like redis to get the node/edge locations you need?
It would appear to me that Redis should be better here as a technology that only focuses on key/value lookups and then pass the address in to the graph but I'd like to justify the costs.
The promise from TinkerPop is that index lookups should be log(n) on articles that are indexed with the property, which is pretty good. Is it possible to do better in Redis, or is the constant factor much better there than in the graph lookup?
Edit: I realized later this isn't really an intelligent question - Redis is an in memory store so is bounded by memory. Looking up a graph node location is still going to require a second lookup of the node in the graph.
It is important to remember that aside from TinkerGraph (an in-memory graph), TinkerPop is not a graph database on its own. KeyIndexableGraph is an interface that is implemented by the underlying graph database (Titan, Neo4j, OrientDB, etc.) using that graph's index capability. Therefore, you should make your indexing choice based on the capabilities of the underlying graph database itself.
Generally speaking, implementing Redis for indexing purposes for the graphs that do implement KeyIndexableGraph seems like an unnecessary layer. I would guess that it will complicate your programming without much benefit.
Here is the difference:
Databases like OrientDB have approximately O(log2 n) lookup times on an index.
Redis has O(1), i.e. constant-time, lookup.
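To put the O(log2 n) figure in perspective, this small Python sketch estimates how many comparison steps a logarithmic index lookup needs as the number of keys grows. It is a back-of-the-envelope illustration only: the function name approx_tree_index_steps is hypothetical, and real index implementations in OrientDB or any other database will differ in their constants.

```python
import math

def approx_tree_index_steps(n):
    """Rough number of comparisons for a binary-search style lookup over n keys."""
    return max(1, math.ceil(math.log2(n)))

for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, approx_tree_index_steps(n))
# 1000 10
# 1000000 20
# 1000000000 30
```

Going from a thousand to a billion keys adds only about twenty steps, so the asymptotic gap to a constant-time lookup is modest. It has to be weighed against the second lookup in the graph that the question's edit already points out.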

Resources