Azure Cosmos Graph DB Compound Conditional Query - azure-cosmosdb

Using Azure Cosmos DB's graph database, I have a sample DB (Sample Graph Database). I am trying to expand a simple query into one that returns all the "Person" nodes that have "Skill Ratings" nodes where MS_OFFICE=FAIL AND .NET=TRUE.
Current Query Thus Far:
ENGLISH DESCRIPTION: Find "Person Nodes" whose "Skill Ratings" Node match MS_OFFICE=FAIL
GREMLIN QUERY: g.V().hasLabel("Person").as("PersonNode")
.out("scored").as("SkillNode")
.has("skill_Name","MS_OFFICE")
.has("skill_Value","FAIL")
.select("PersonNode").by("Name")
How do I modify the existing query to include the second condition? Am I just missing a specific traversal step in the TinkerPop documentation?

I think you just need something like this:
g.V().hasLabel("Person").
  where(out("scored").
        has("skill_Name","MS_OFFICE").
        has("skill_Value","FAIL")).
  values("Name")
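The query above covers only the MS_OFFICE condition. To require both conditions, one option is a second where() filter for the .NET rating, since each where() checks independently for a matching neighbour. Below is a minimal Gremlin Console sketch. It runs against an in-memory TinkerGraph rather than Cosmos DB, the vertex label 'SkillRating' and the two sample people are made up for illustration, the property names follow the question, and 'TRUE' is assumed to be stored as a string.

```groovy
// In-memory TinkerGraph stands in for Cosmos DB here (an assumption).
graph = TinkerGraph.open()
g = graph.traversal()

// Hypothetical data: Alice has both ratings, Bob only the MS_OFFICE one.
g.addV('Person').property('Name', 'Alice').as('a').
  addV('SkillRating').property('skill_Name', 'MS_OFFICE').property('skill_Value', 'FAIL').as('s1').
  addV('SkillRating').property('skill_Name', '.NET').property('skill_Value', 'TRUE').as('s2').
  addE('scored').from('a').to('s1').
  addE('scored').from('a').to('s2').iterate()

g.addV('Person').property('Name', 'Bob').as('b').
  addV('SkillRating').property('skill_Name', 'MS_OFFICE').property('skill_Value', 'FAIL').as('s').
  addE('scored').from('b').to('s').iterate()

// A person must pass both filters, so only Alice is expected to be returned.
g.V().hasLabel('Person').
  where(out('scored').has('skill_Name', 'MS_OFFICE').has('skill_Value', 'FAIL')).
  where(out('scored').has('skill_Name', '.NET').has('skill_Value', 'TRUE')).
  values('Name')
```

Chained where() filters behave like a logical AND; wrapping both sub-traversals in a single and(...) step should be equivalent.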

Related

Tinkerpop Gremlin is it better to query with hasId or to search by property values

Using Tinkerpop Gremlin (Neptune DB), is there a preferred/"faster" way to query?
For example, let's say I have a graph containing the node:
label: Student
id: 'student/12345'
studentId: '12345'
name: 'Bob'
Is there a preferred query? (for this example let's say we know the field 'studentId' value, which is also part of the id)
g.V().filter('studentId', '12345')
vs
g.V().filter(hasId(TextP.containing('12345')))
or using "has"/"hasId" vs "filter"?
g.V().has('studentId', '12345')
vs
g.V().hasId(TextP.containing('12345'))
So there seem to be two questions here: one about filter() vs has(), and the other about using the vertex id versus a property.
The answer to the first question depends on the underlying database implementation and what it has or has not optimized. In general, and in Neptune, I would suggest using the g.V().has('studentId', '12345') pattern to filter on a property, as it is optimized and easier to read.
The answer to the second question also depends on the database implementation, as not all databases allow vertex ids to be set. In Neptune, setting ids is allowed, and a direct lookup by id (e.g. g.V('student/12345') or g.V().hasId('student/12345')) is the fastest way to look something up, as it is a single index lookup. One thing to note is that in Neptune vertex/edge id values need to be globally unique, so you need to ensure that only one vertex or edge has a given id.
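As a rough illustration of the lookups discussed above, here is a Gremlin Console sketch against an in-memory TinkerGraph (an assumption; Neptune's indexes and optimizer are not modelled). It uses the full id of the example vertex, 'student/12345'.

```groovy
graph = TinkerGraph.open()
g = graph.traversal()

// Recreate the example vertex with a user-supplied id.
g.addV('Student').property(T.id, 'student/12345').
  property('studentId', '12345').property('name', 'Bob').iterate()

// Direct lookup by the full vertex id.
g.V('student/12345').values('name')

// Lookup by property value, scoped to the Student label.
g.V().has('Student', 'studentId', '12345').values('name')

// Partial match on the id: a text predicate rather than an exact id lookup.
g.V().hasId(TextP.containing('12345')).values('name')
```

All three should return Bob on this toy graph. The difference is in how the database can resolve them, which is why the exact-id and has() forms are the ones recommended above.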

How to return only the id of a vertex in Cosmos Graph

I am importing data from SQL Server to Cosmos Gremlin Graph. During the process, I need to search for a vertex by its old id, as the new id is needed to create an edge. I only want the id not all the other properties.
I found the answer.
You can use the SubmitWithSingleResultAsync(query) method with a query like g.V().has('elementTypeId','1').has('importId',123456).properties('id').value()
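For comparison, TinkerPop also has a dedicated id() step that emits only the element id. Below is a minimal Gremlin Console sketch on an in-memory TinkerGraph with made-up sample data (the 'element' label is hypothetical); whether Cosmos DB returns the same shape through Gremlin.Net was not checked here.

```groovy
graph = TinkerGraph.open()
g = graph.traversal()

// Hypothetical imported vertex; the property names follow the query above.
g.addV('element').property('elementTypeId', '1').property('importId', 123456).iterate()

// id() emits just the vertex id, without any other properties.
g.V().has('elementTypeId', '1').has('importId', 123456).id()
```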

Gremlin OLAP traversal query error regarding local star-graph

I'm trying to execute an OLAP traversal on a query that needs to check if a vertex has a neighbour of certain type.
I keep getting:
Local traversals may not traverse past the local star-graph on GraphComputer
my query looks something like:
g.V().hasLabel('label1').
  where(_.out().hasLabel('label2'))
I'm using the TraversalVertexProgram.
Needless to say, when running the same query in OLTP mode there is no problem.
Is there a way to execute such logic?
That is a limitation of the TinkerPop OLAP GraphComputer. It operates on 'star-graph' objects, meaning a vertex and its connected edges only, and uses a message-passing engine internally. So you have to rewrite your query.
Option 1: start from label2 and go to label1. This should return the same result.
g.V().hasLabel('label2').in().hasLabel('label1')
Option 2: try to use unique edge labels and check the edge label instead.
g.V().hasLabel('label1').where(_.outE('label1_label2'))
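Both rewrites can be tried out in the Gremlin Console. The sketch below is a minimal example on an in-memory TinkerGraph with one made-up edge, using withComputer() to force OLAP execution. The dedup() on the first option is an addition: a label1 vertex with several label2 neighbours would otherwise appear once per neighbour.

```groovy
graph = TinkerGraph.open()
g = graph.traversal()

// Hypothetical data: one label1 vertex connected to one label2 vertex.
g.addV('label1').as('a').
  addV('label2').as('b').
  addE('label1_label2').from('a').to('b').iterate()

// OLAP traversal source backed by the default graph computer.
olap = graph.traversal().withComputer()

// Option 1: start at label2 and walk the edges backwards.
olap.V().hasLabel('label2').in().hasLabel('label1').dedup()

// Option 2: test only the incident edge, which is part of the star graph.
olap.V().hasLabel('label1').where(outE('label1_label2'))
```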

Create Vertex only if "from" and "to" vertex exists

I want to create 1000+ Edges in a single query.
Currently, I am using the AWS Neptune database and Gremlin.Net to create them.
The issue I am facing is speed: it takes a very long time because of the HTTP requests.
So I am planning to combine all of my queries into one string and execute it in a single shot.
_g.AddE("allow").From(_g.V().HasLabel("person").Has("name", "name1")).To(_g.V().HasLabel("phone").Where(__.Out().Has("sensor", "nfc"))).Next();
There is a chance that the "To" (target) vertex is not in the database. When that is the case, this query fails. So I had to check whether that vertex exists, using hasNext(), before executing the query.
As of now it works fine, but when I combine all 1000+ edge creations into one query, is it possible to write it so that it doesn't break if the "To" (target) vertex is not found?
You should look at using the Element Existence pattern for each vertex as shown in the TinkerPop Recipes.
In your example you would replace this section of your query:
_g.V().HasLabel("person").Has("name", "name1")
with something like this (I don't have a .NET environment to test the syntax):
__.V().Has("person", "name", "name1").Fold().
  Coalesce<Vertex>(__.Unfold<Vertex>(), __.AddV("person").Property("name", "name1"))
This will act as an Upsert and either return the existing vertex or add a new one with the name property. This same pattern can then be used on your To step to ensure that it exists before the edge is created as well.
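If the goal is to skip the edge rather than create the missing vertex, another option (a different technique from the upsert above) is to look up both endpoints inside one traversal and only then call addE(). When either lookup finds nothing, the traversal simply ends without adding anything. Below is a minimal sketch in Gremlin Console syntax against an in-memory TinkerGraph; the 'reader' vertex and 'has' edge are made up for illustration, and the behaviour on Neptune through Gremlin.Net is assumed to match.

```groovy
graph = TinkerGraph.open()
g = graph.traversal()
g.addV('person').property('name', 'name1').iterate()

// Adds the 'allow' edge only when both endpoints are found.
addAllow = {
  g.V().has('person', 'name', 'name1').as('p').
    V().hasLabel('phone').where(out().has('sensor', 'nfc')).
    addE('allow').from('p').iterate()
}

addAllow()                          // no phone yet: nothing is added, no error
g.E().hasLabel('allow').count()     // 0

// Create the hypothetical target and try again.
g.addV('phone').as('ph').
  addV('reader').property('sensor', 'nfc').as('r').
  addE('has').from('ph').to('r').iterate()

addAllow()
g.E().hasLabel('allow').count()     // 1
```

Chaining many of these lookups in one traversal needs care, because an empty lookup ends the whole traversal and nothing after it runs.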

Best way to limit edges to the same group of vertices using Gremlin against Azure Cosmos DB graph

I need to get vertices filtered by a specific predicate on the properties, and all the edges (with a particular label, and perhaps some predicate on the properties of an edge) existing between them.
This is for an Azure Cosmos DB graph, and the solution should be a single Gremlin query.
So far I am thinking of something along the lines of:
g.V().has('property1', value1).has('property2', value2).as('vertices')
.outE().as('edges').inV().has('property1', value1).has('property2', value2)
.select('vertices','edges')
Is there a better way to achieve this?
Given the description and your comment, this traversal should work for you:
g.V().has('property1', value1).has('property2', value2).
  aggregate('v').
  outE(). /* add edge filters here */
  inV().where(within('v')).
  path().
  union(union(limit(local, 1),
              tail(local, 1)).dedup().aggregate('vertices'),
        range(local, 1, 2).aggregate('edges')).
  cap('vertices','edges').next()
