In Gremlin, I want to retrieve specific properties from a collection of vertices. Which properties to return is determined by an array property (listing the allowed property names) on a vertex adjacent to the documents' parent vertex.
I am not sure if it is even possible to do this in one query. Below is an example graph. In practice there would be many Projects; each Project has a single Template (which holds the list of allowed properties) and a collection of Documents. In the example, because the Template connected to ProjectA lists only property1 and property2, I want to get back only property1 and property2 from the Documents.
g.addV('Project').property('name', 'ProjectA').as('p').
......1> addV('Template').property('name','TemplateA').property(list,'allowed', 'property1').property(list,'allowed', 'property2').as('t').
......2> addV('Document').property('name', 'DocumentA').property('property1', '1').property('property2', '2').property('property3', '3').as('d1').
......3> addV('Document').property('name', 'DocumentB').property('property1', '1').property('property2', '2').property('property3', '3').as('d2').
......4> addE('hasDocuments').from('p').to('d1').
......5> addE('hasDocuments').from('p').to('d2').
......6> addE('hasTemplate').from('p').to('t')
You can do it in one query by labeling steps with as() and filtering with where().
gremlin> g.V().
......1> hasLabel('Project').as('project').
......2> out('hasTemplate').
......3> values('allowed').as('allowed_properties').
......4> select('project').
......5> out('hasDocuments').as('docs').
......6> properties().as('selected_properties').
......7> label().
......8> where(eq('allowed_properties')).
......9> select('selected_properties')
==>vp[property1->1]
==>vp[property1->1]
==>vp[property2->2]
==>vp[property2->2]
Interesting question. It will take some Gremlin gymnastics:
gremlin> g.V().has('name','ProjectA').
......1> aggregate('p').by(out('hasTemplate').values('allowed').fold()).
......2> out('hasDocuments').
......3> elementMap().
......4> map(unfold().as('kv').
......5> filter(select('p').
......6> unfold().unfold().as('p1').
......7> where('p1',eq('kv')).
......8> by().by(keys)).
......9> group().by(keys).by(select(values)))
==>[property2:2,property1:1]
==>[property2:2,property1:1]
At line 1, we aggregate() the template keys to "p". Then we elementMap() all the documents which will grab all of their keys/values. We then start to filter each map at line 4 by unfold()-ing each one to its key/value pairs and then matching those keys against the ones in "p". We then use the common group() pattern on line 9 to merge the key/value pairs back into a Map.
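To see what ends up in "p", the by() modulator from line 1 can be run on its own. This is only a sketch and assumes the sample graph from the question has been loaded into g:

```groovy
// Sketch, assuming the sample graph from the question is loaded.
// Runs only the traversal used in the by() modulator on line 1:
// it collects the Template's allowed keys into a single list.
g.V().has('name','ProjectA').
  out('hasTemplate').
  values('allowed').
  fold()
```

Because aggregate() stores that list as a single element of "p", the filter on line 6 needs two unfold() steps to get back to the individual keys.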
Why can't a step label set with as() before fold() be referenced after fold() in Apache Gremlin?
E.g.:
g.V().hasLabel('country').has('name', 'Japan').fold()
.coalesce(__.unfold(), __.addV('country').property('name', 'Japan')).as('country')
.outE('has').inV().hasLabel('state').has('name', 'A').fold()
.coalesce(__.unfold(), __.addV('state').property('name', 'A').addE('has').from('country'))
What is the alternative to the as() step?
In your query, the steps before the second fold() filter out all traversers when the state does not exist yet, and fold() itself discards the path history. So when addE() is called, no value is bound to the label 'country'.
You can change your query slightly so that the traverser labeled 'country' is not filtered out. The simple rewrite below does that.
gremlin> g.V().
......1> hasLabel('country').
......2> has('name', 'Japan').
......3> fold().
......4> coalesce(__.unfold(), __.addV('country').property('name', 'Japan')).
......5> as('country').
......6> coalesce(
......7> outE('has').inV().hasLabel('state').has('name', 'A'),
......8> __.addV('state').property('name', 'A')).
......9> addE('has').from('country')
==>e[18][14-has->16]
gremlin> g.V().valueMap()
==>[name:[A]]
==>[name:[Japan]]
gremlin> g.E()
==>e[18][14-has->16]
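The loss of a step label across fold() can be reproduced in isolation. A minimal sketch, assuming TinkerPop's "modern" toy graph (not part of the original question):

```groovy
// Sketch on the TinkerPop "modern" toy graph (an assumption; any graph with a matching vertex will do).
g = TinkerFactory.createModern().traversal()

// The label 'a' is still in the path here, so select('a') can resolve it.
g.V().has('name','marko').as('a').out('knows').select('a')

// fold() is a reducing barrier: it emits a new traverser without path history,
// so select('a') has nothing to resolve and this traversal should yield no results.
g.V().has('name','marko').as('a').fold().select('a')
```

This is why the rewrite above places as('country') after the fold()/coalesce() pair and avoids a second fold() before addE().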
How can I access a vertex that is created inside coalesce() along with an edge in Gremlin?
In the query below, when the element already exists, coalesce() returns the vertex to the next step (e.g. adding the state); when it does not, it returns the newly created edge. How can I access the newly created vertex outside coalesce()? I tried as() and select(), but I get the error "Edge cannot be cast to Vertex".
g.V().hasLabel('country').has('name', 'Japan').fold()
.coalesce(__.unfold(), __.addV('country').property('name', 'Japan')).as('country')
//Adding state
.coalesce(outE('has').inV().hasLabel('state').has('name', 'B'),
__.addV('state').as('s').property('name', 'B').addE('has').from('country').select('s')).as('state')
//Adding city/town
.coalesce(outE('has').inV().hasLabel('city_town').has('name', 'F'),
__.addV('city_town').property('name', 'F').addE('has').from('state')).as('city_town')
Here the two child traversals within coalesce() do not return the same type of value.
The first child traversal, outE('has').inV().hasLabel('state').has('name', 'B'), returns a vertex.
The second child traversal, __.addV('state').property('name', 'B').addE('has').from('country'), returns an edge.
So when the edge exists your query returns a vertex, but when the edge doesn't exist it returns an edge.
You could modify your query to return the vertex in both cases.
# First run creates the required entities.
gremlin> g.V().
......1> hasLabel('country').
......2> has('name', 'Japan').
......3> fold().
......4> coalesce(__.unfold(), __.addV('country').property('name', 'Japan')).
......5> as('country').
......6> coalesce(
......7> outE('has').inV().hasLabel('state').has('name', 'B'),
......8> __.addV('state').
......9> property('name', 'B').as('targetVertex').
.....10> addE('has').from('country').
.....11> select('targetVertex'))
==>v[2]
# Second run accesses the existing entities.
gremlin> g.V().
......1> hasLabel('country').
......2> has('name', 'Japan').
......3> fold().
......4> coalesce(__.unfold(), __.addV('country').property('name', 'Japan')).
......5> as('country').
......6> coalesce(
......7> outE('has').inV().hasLabel('state').has('name', 'B'),
......8> __.addV('state').
......9> property('name', 'B').as('targetVertex').
.....10> addE('has').from('country').
.....11> select('targetVertex'))
==>v[2]
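As a variation on the same idea, the creating branch can step from the new edge back to its incoming vertex instead of labeling and selecting it. This is a sketch under the same assumptions as the queries above (same labels and property names), not something taken from the original answer:

```groovy
// Sketch: both coalesce() branches end on the 'state' vertex.
g.V().
  hasLabel('country').
  has('name', 'Japan').
  fold().
  coalesce(__.unfold(), __.addV('country').property('name', 'Japan')).
  as('country').
  coalesce(
    // Existing state: already a vertex.
    __.outE('has').inV().hasLabel('state').has('name', 'B'),
    // New state: the edge runs from 'country' to the new vertex,
    // so inV() on the new edge returns the state vertex.
    __.addV('state').
       property('name', 'B').
       addE('has').from('country').
       inV())
```

Since the result is a vertex in either case, a following as('state') and a further coalesce() for the city/town level can be chained in the same way.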
Before the new ReferenceElementStrategy default setting, Gremlin query g.E() would return edge ID, label, inV, outV, and all the properties. What query should I use to return the same data with ReferenceElementStrategy enabled? g.E().valueMap().with(WithOptions.tokens).by(unfold()) returns everything except inV and outV.
For right now the answer is project():
gremlin> g.E(12).union(valueMap(true),
......1> project('inV','outV','inVLabel','outVLabel').
......2> by(inV().id()).
......3> by(outV().id()).
......4> by(inV().label()).
......5> by(outV().label())).unfold().
......6> group().
......7> by(keys).
......8> by(select(values))
==>[inV:3,id:12,inVLabel:software,weight:0.2,outVLabel:person,label:created,outV:6]
but as of the next release, 3.4.4, it will be elementMap():
gremlin> g.E(11).elementMap()
==>[id:11,label:created,IN:[id:3,label:software],OUT:[id:4,label:person],weight:0.4]
Right now I am able to generate a query to create as many vertices and edges as I want.
e.g.
g.V().
addV('vert1').as('a').
addV('vert2').as('b').
addE('has').from('a').to('b')
^^^^^^^^^^^^^ This works. Easy enough, right? Now let's create a Gremlin query that only creates these vertices if their label is unique, then creates an edge between the two.
g.V().has(label,'vert1').fold().
coalesce(
unfold(),
addV('vert1')
).as('a').
V().has(label,'vert2').fold().
coalesce(
unfold(),
addV('vert2')
).as('b').
addE('has').from('a').to('b')
^^^^^^^^^^^^^ This does not work.
Hopefully you can understand what I am trying to do, though. Can anyone help me?
Thanks
You have a fold(), which is a ReducingBarrierStep, following your step label as('a'), so the path history to "a" is lost after that step. You can read more about this aspect of Gremlin here.
You just need to rewrite your query to account for that. One way might be to aggregate() the value of "a" rather than simply naming the step "a":
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.V().
......1> has(label,'vert1').fold().
......2> coalesce(unfold(),
......3> addV('vert1')).aggregate('a').
......4> V().has(label,'vert2').fold().
......5> coalesce(unfold(),
......6> addV('vert2')).as('b').
......7> select('a').unfold().
......8> addE('has').to('b')
==>e[2][0-has->1]
If you need to return all the elements, just project() the returned edge and transform the results as necessary:
gremlin> g.V().
......1> has(label,'vert1').fold().
......2> coalesce(unfold(),
......3> addV('vert1')).aggregate('a').
......4> V().has(label,'vert2').fold().
......5> coalesce(unfold(),
......6> addV('vert2')).as('b').
......7> select('a').unfold().
......8> addE('has').to('b').
......9> project('e','in','out').
.....10> by().
.....11> by(inV()).
.....12> by(outV())
==>[e:e[2][0-has->1],in:v[1],out:v[0]]
Of course, using a select() at the end might not be so bad either:
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.V().
......1> has(label,'vert1').fold().
......2> coalesce(unfold(),
......3> addV('vert1')).aggregate('a').
......4> V().has(label,'vert2').fold().
......5> coalesce(unfold(),
......6> addV('vert2')).as('b').
......7> select('a').unfold().
......8> addE('has').to('b').as('x').
......9> select('a','b','x')
==>[a:[v[0]],b:v[1],x:e[2][0-has->1]]
What is the best way to get values with same property key?
EDIT: Sorry for changing the question; my requirement was to get an employee from either of the departments.
I need to fetch all the employees who work for the IT or Sales departments and are managed by the manager with id 123.
I have used
g.V().has('managerId',123).out('manages').as('employee')
.out('worksFor').has('departmentName','IT','Sales')
.select('employee')
where out('worksFor') gives the department.
Can we do this in a has() step, or should we use a union() step like
g.V().has('managerId',123).out('manages').as('employee').out('worksFor')
.union(__.has('departmentName','IT'),__.has('departmentName','Sales'))
.select('employee')
You are probably only missing the within predicate which is also explained in the context of the has step in the TinkerPop documentation:
g.V().has('managerId',123).out('manages').as('employee').out('worksFor').
has('departmentName',within('IT','Sales')).select('employee')
Edit: After reading stephen's answer I noticed that I overlooked the "and" in your question:
employees who work for IT and Sales
That of course makes my answer invalid. I will still leave it here in case you actually meant "or", as indicated by your later use of the union step.
Here's a sample graph:
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV().property("managerId",123).as("manager").
......1> addV().property("employee","A").as("A").
......2> addV().property("employee","B").as("B").
......3> addV().property("department", "IT").as("it").
......4> addV().property("department", "Sales").as("sales").
......5> addE("manages").from("manager").to("A").
......6> addE("manages").from("manager").to("B").
......7> addE("worksFor").from("A").to("it").
......8> addE("worksFor").from("B").to("it").
......9> addE("worksFor").from("A").to("sales").iterate()
In this case, employee A is in both "Sales" and "IT", but employee B is only in "IT". Since you said you wanted employees who work in both departments, employee A should be returned from the query and B should be filtered out.
Note that the use of within yields an incorrect answer in that case:
gremlin> g.V().has('managerId',123).
......1> out('manages').
......2> where(out('worksFor').
......3> has('department',within('IT','Sales'))).
......4> valueMap()
==>[employee:[A]]
==>[employee:[B]]
Here is the approach if you want both departments:
gremlin> g.V().has('managerId',123).
......1> out('manages').
......2> where(out('worksFor').
......3> has('department','Sales')).
......4> where(out('worksFor').
......5> has('department','IT')).
......6> valueMap()
==>[employee:[A]]
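The two chained where() filters can also be written as a single and() step. This is a sketch, assuming the sample graph built above (property keys "managerId", "employee" and "department"):

```groovy
// Sketch: keep only employees that have a worksFor edge to BOTH departments.
g.V().has('managerId',123).
  out('manages').
  and(out('worksFor').has('department','Sales'),
      out('worksFor').has('department','IT')).
  valueMap()
```

Each child traversal of and() must yield at least one result for the employee to pass, which is the same condition the two where() steps express.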