Gremlin order on Map results - gremlin

I have the below query:
g.V('1')
.union(
out('R1')
.project('timestamp', 'data')
.by('completionDate')
.by(valueMap().by(unfold()))
out('R2')
.project('timestamp', 'data')
.by('endDate')
.by(valueMap().by(unfold()))
)
How can I order the UNION results by timestamp?
I've tried using ".order().by('timestamp')" but this only works on traversals and UNION returns a MAP object.

Here are a couple of ways to approach it. First, you could just use your code as-is and then order() by the "timestamp":
g.V('1').
union(out('R1').
project('timestamp', 'data').
by('completionDate').
by(valueMap().by(unfold())),
out('R2').
project('timestamp', 'data').
by('endDate').
by(valueMap().by(unfold()))).
order().by(select('timestamp'))
Note the difference is to select() the key from the Map that you want to sort on. Versions after 3.4.5 will work more as you expect and you can simply do by('timestamp') for a Map as well as an Element.
I think that a more readable approach however would be to go with this approach:
g.V('1').
out('R1','R2').
project('timestamp', 'data').
by(coalesce(values('endDate'), values('completionDate'))).
by(valueMap().by(unfold())).
order().by(select('timestamp'))
You might need to enhance the by(coalesce(...)) depending on the nature of your schema, but hopefully you get the idea of what I'm trying to do there.

Related

Gremlin/Tinkerpop - is there a way to add metadata to a union step so I know which query the resulting traversal came from?

This is a little strange, but I have a situation where it'd be beneficial for me to know which traversal an element came from.
For a simple example, something like this:
.union(
select('parent').out('contains'), //traversal 1
select('parent2').out('contains') //traversal 2
)
.dedup()
.project('id','traversal')
.by(id())
.by( //any way to determine which traversal it came from? or if it was in both? )
Edit: One thing I found is that I can use Map with Group/By to get partly there:
.union(
select('parent').out('contains')
.map(group().by(identity()).by(constant('t1'))),
select('parent2').out('contains')
.map(group().by(identity()).by(constant('t2'))),
)
.dedup() //Dedup isn't gonna work here because each hashmap will be different.
.project('id','traversal')
.by( //here I can't figure out how to read a value from the hashmap inline )
The above query without the project/by piece returns this:
[{v[199272505353083909]: 't1'}, {v[199272515180338177]: 't2'}]
Or is there a better way to do this?
Thanks!
One simple approach might be to just fold the results. If you get back an empty list you will know you did not find any on that "branch":
g.V('44').
union(out('route').fold().as('a').project('res','branch').by().by(constant('b1')),
out('none').fold().as('b').project('res','branch').by().by(constant('b2')))
which yields
{'res': [v[8], v[13], v[20], v[31]], 'branch': 'b1'}
{'res': [], 'branch': 'b2'}
UPDATED after discussion in comments to include an alternative approach that uses nested union steps to avoid the project step inside the union. I still think I prefer the project approach unless the performance when measured is not good.
g.V('44').
union(local(union(out('route').fold(),constant('b1')).fold()),
local(union(out('none').fold(),constant('b2')).fold()))
which yields
[[v[8], v[13], v[20], v[31]], 'b1']
[[], 'b2']

Gremlin : How do you find vertex and edges when some edges does not exists

I am new to gremlin.
I am facing issue in fetching the vertex and edges when sometimes edge from a vertex does not exists.
for example bellow query works fine if it gets all the vertex and edges.
but for one use case edge
`.outE("PRODUCES").`as`("produces"))`
does not exists in db.
in that case bellow query doesnt return any result.
I need your help to resolve this issue.
when edges does not exit then i want input_entity and processed_by in result.
janusGraph.traversal().V()
.has("isActive", "true")
.hasLabel("ENTITY").`as`("input_entity")
.outE("PROCESSED_BY").`as`("processed_by")
.inV().`as`("job")
.outE("PRODUCES").`as`("produces")
.select<String>("job").outE("HAS_STATE")
.`as`("job_state_edge").inV().hasLabel("JOB_STATE").`as`("job_state")
.select<String>("input_entity").outE("HAS_STATE")
.`as`("input_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("input_entity_state")
.select<String>("input_entity").outE("CONNECTS_TO").`as`("connects_to")
.inV().hasLabel("ENTITY").has("entityName", TextP.startingWith(rootNamespace))
.`as`("output_entity").outE("HAS_STATE")
.`as`("output_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("output_entity_state")
.select<String>("input_entity","output_entity","processed_by","produces","job","job_state","input_entity_state","output_entity_state","input_entity_state_edge","output_entity_state_edge","job_state_edge","connects_to")
.by(elementMap<Element, Any>()).toList()
with optional
janusGraph.traversal().V()
.has("isActive", "true")
.hasLabel("ENTITY").`as`("input_entity")
.outE("HAS_STATE").`as`("input_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("input_entity_state")
.select<String>("input_entity").outE("PROCESSED_BY").`as`("processed_by")
.inV().`as`("job").outE("HAS_STATE").`as`("job_state_edge").inV().hasLabel("JOB_STATE").`as`("job_state")
.select<String>("job")
.optional(
outE("PRODUCES").`as`("produces")
.select<String>("input_entity").outE("CONNECTS_TO").`as`("connects_to")
.inV().hasLabel("ENTITY").has("entityName", TextP.startingWith(rootNamespace))
.`as`("output_entity").outE("HAS_STATE").`as`("output_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("output_entity_state"))
.select<String>("input_entity","output_entity","processed_by","produces","job","job_state","input_entity_state","output_entity_state","input_entity_state_edge","output_entity_state_edge","job_state_edge","connects_to")
.by(elementMap<Element, Any>()).toList()
There are two Gremlin steps that can help in cases like this. When you have a part of a query that may or may not exist, but you either want the results up to that point if it does not exist or the results afterwards if it does exist, you can wrap that part of the query in an optional step.
For example :
g.V('3').optional(out())
Will either return V['3'] or the adjacent vertices if out yields results.
In cases where you want to select a value that may not exist, you can do something like this:
coalesce(select('a'),constant('No results'))
EDITED to add:
If you need to return multiple results, rather than just using select try a project('a','b,',c') type of approach where each by modulator for the project can contain its own coalesce step.

How can we reach to Key/value pairs after group by function on Gremlin?

I have a simple gremlin query, first of all, I aggregate data, then I want to reach aggregated data and I want to use it on other operations. But after group-by operation, I couldn’t reach them as key/value pair.
GraphTraversal t = graph.V().hasLabel("App").as("a")
.inE("RANKS").as("r")
.outV().as("k")
.choose(__.select("k").by("countryCode").is(__.in(...)),
__.math("1.0 / r").by("rank1"),
__.math("1.0 / r").by("rank2"))
.as("score")
.select("a").aggregate("ap")
.select("k").by("countryCode").aggregate("country")
.select("a", "k").by("appId").by("countryCode")
.group().by("grp_res").by(__.select("score").sum().as("sum_score"))
.cap("ap", "country", "grp_res")
.V().hasLabel("App").where(P.within("ap")).as("app")
.select("app", "country", "ak").by("appId").by().by();
The last line .select("app", "ak").by("appId").by() couldn’t be reachable askey/value pair after group-by. How can I reach them? Do you have any suggestion?
Now it looks:
{app=1, country=US, ak={{a=1, k=US}=363.0, {a=2, k=US}=544.5}}
{app=2, country=US, ak={{a=1, k=US}=363.0, {a=2, k=US}=544.5}}
But expected output is:
{app=1, country=US, ak=363.0}
{app=2, country=US, ak=544.5}
Solved like:
group().by(...)
.select(Column.values).as("grp_result")
.select("grp_result").select("score").as("score")
...
I don't see where you define "ak" in your query but the basic issue is that you have nested maps where the keys are themselves a map. So you will need to select from the map using that exact key or order the groups differently to have simpler keys. Then would do something like select('ak').select('key').

Adobe AEM Querybuilder Debugger - Multiple Paths and Multiple Nodenames

I am using querybuilder debugger and want to do a search where "nodename=.pdf OR nodename=.doc*" and "path=/content/dam/1 OR path=/content/dam/2".
I have been trying to find an example but no luck on the web. What I have below is not quite right - just wondering what I am missing.
The query does work but there is a huge difference in the amount of time that it runs when compared with when I just query using one nodename instead of 2.
Thanks in advance,
Jerry
type=dam:asset
mainasset=true
1_group.p.or=true
1_group.1.nodename=*.pdf
1_group.2.nodename=*.doc*
2_group.p.or=true
2_group.1_path=/content/dam/1
2_group.2_path=/content/dam/2
p.limit=-1
orderby=path
I thought maybe something as simple as this might work but no luck....
type=dam:asset
mainasset=true
group.p.or=true
group.1_nodename=*.doc*
group.1_path=/content/dam/1
group.2_nodename=*.doc*
group.2_path=/content/dam/2
group.3_nodename=*.pdf
group.3_path=/content/dam/1
group.4_nodename=*.pdf
group.4_path=/content/dam/2
p.limit=-1
orderby=path
Try splitting your query if this won't affect the behaviour you're trying to achieve.
path=/content/dam/1
type=dam:asset
mainasset=true
group.1.nodename=*.pdf
group.2.nodename=*.doc*
p.limit=-1
orderby=path
path=/content/dam/2
type=dam:asset
mainasset=true
group.1.nodename=*.pdf
group.2.nodename=*.doc*
p.limit=-1
orderby=path

Assign query using 'match()' to subgraph

I have a JanusGraph database with a graph structure as follows:
(Paper)<-[AuthorOf]-(Author)
I'm want to use Gremlin's match clause to query the data and assign the results to a subgraph. This is what I have so far:
g.V().match(
__.as('a').has('Paper','paperTitle', 'The name of my paper'),
__.as('a').in('AuthorOf').outV().as('b')).
select('b').values()
This query returns what I want, the Authors of the paper I'm for which I'm searching. However, I want to assign the results to a subgraph so I can export it using:
sg.io(IoCore.graphml()).writeGraph("/home/ubuntu/myresults.graphml")
Previously, I've achieved this with a different query structure like this:
sg = g.V().has('paperTitle', 'The name of my paper').
inE('AuthorOf').subgraph('sg1').
outV().
cap('sg1').
next()
Is there away to achieve the same results using the 'match()' statement?
After a little trial and error I was able to create a working solution:
sg = g.V().match(
__.as('a').has('Paper','paperTitle', 'ladle pouring guide'),
__.as('a').inE('AuthorOf').subgraph('sg').outV().as('b')).
cap('sg').next()
At first, I was trying to use the 'select' statement to isolate the subgraph. After reviewing the documentation on 'subgraph' and learning more about side-effects in gremlin I realized it wasn't necessary.

Resources