Is it possible to get which or condition resulted true in traversals? - gremlin

We have a photo-sharing service in which a user can allow or deny other sets of users permission to view a photo. We expose this service as an API, /view?caller=userId&photoId=photoId. We use the AWS Neptune graph database service to maintain this authorization, through the TinkerPop Java library.
For code maintainability, we fetch the possible paths from methods in other classes and call the canUserView method from outside.
public boolean canUserView(User user, String photoId) {
    return graph.V(user.getId()).hasLabel("user").or(getPossibleTraversals(user)).hasNext();
}

private GraphTraversal<Object, Vertex>[] getPossibleTraversals(User user) {
    List<GraphTraversal<Vertex, Vertex>> traversals = collectTraversalsFromExternal();
    return traversals.toArray(GraphTraversal[]::new);
}
collectTraversalsFromExternal() queries our other datastore, and we form the query based on the result.
At the end of every or traversal we inject a unique constant value to identify the traversal.
We were using .union() earlier to get the result and the constant value, but due to performance issues we now use .or().
This might be a dumb question: is it possible to find out which traversal was true?

Using the air routes data set, here is one way you can achieve what you are looking for.
The or will filter out all airports not in either Texas or Georgia. After that, the choose step returns a constant indicating which or path was taken. You can of course do something more interesting than return a constant value in your query.
gremlin> g.V(1,3,12).values('region')
==>US-GA
==>US-TX
==>US-NY
gremlin> g.V(1,3,12).
......1> or(has('region','US-GA'),
......2> has('region', 'US-TX')).
......3> choose(has('region','US-GA'),constant(1),constant(2))
==>1
==>2

Related

Reflect on Gremlin traversal type (Edge, Vertex, Property) in CHOOSE step, possible?

I am extending sparql-to-gremlin code to support fully and partially unbound predicate queries that can be used by automated processes to explore the graph structure. The idea being that you could just connect to some graph DB and ask a fully unbound query with some limit and get vertex properties, edge types, edge properties, etc. That can then be explored more.
I can now solve a fully unbound query and can solve one that has the subject bound to a vertex. Now I am trying to put it together into a multi-literal query and finding that the Gremlin MATCH Step would need to reflect on the type of Traversal before it can decide which steps would actually apply. For example, if the Traversal results in a Vertex, asking for out/in edges and properties makes sense; if it’s an Edge, though, asking for out/in edges does not make sense and actually results in errors about an unexpected type being thrown.
Thus the question: is it possible to write a kind of “switch” statement that would reflect on the type and then only ask for things that make sense in that context?
Here’s one type of SPARQL query that I am trying to support (based on the Graph of the Gods described here https://old-docs.janusgraph.org/0.1.0/getting-started.html):
https://old-docs.janusgraph.org/0.1.0/images/graph-of-the-gods-2.png
SELECT ?BATTLE ?PRED ?VALUE
WHERE {
vid:6 ep:battled ?BATTLE .
?BATTLE ?PRED ?VALUE .
}
Here we are starting from a vertex with id 6, grabbing the outgoing edge reference with “battled” label, then grabbing all possible properties of the edge along with their values.
Here the vertex with id 6 is Hercules, which has 3 outgoing edges with label “battled” going to the vertices with id 9 (Nemean), 10 (Hydra) and 11 (Cerberus). I would want ?PRED to be bound to v:id (edge id), v:label (edge label), v:time (edge time property value), v:place (edge place property value), eps:battled (an extension to sparql-to-gremlin relating edge to an IN vertex).
I think that I follow your problem and I don't think I have a good answer for you. At the moment, Gremlin isn't terribly good with type detection and the issue remains open on TINKERPOP-2234. The typical workaround for most people when they have a mixed set of elements in a stream is to use a step like coalesce() or choose() to act as a form of switch statement and then figure out some filter that can identify the object type. So here's some mixed results that I've contrived:
gremlin> g.V().union(outE(),__.in())
==>e[9][1-created->3]
==>e[7][1-knows->2]
==>e[8][1-knows->4]
==>v[1]
==>v[1]
==>v[4]
==>v[6]
==>e[10][4-created->5]
==>e[11][4-created->3]
==>v[1]
==>v[4]
==>e[12][6-created->3]
and then I test with hasLabel() for labels I know to belong to vertices only, then everything else must be an edge:
gremlin> g.V().union(outE(),__.in()).choose(hasLabel('person','software'), values('name'), values('weight'))
==>0.4
==>0.5
==>1.0
==>marko
==>marko
==>josh
==>peter
==>1.0
==>0.4
==>marko
==>josh
==>0.2
Not ideal obviously but it typically resolves most people's problems. Hopefully we will see TINKERPOP-2234 solved for 3.5.0.
Another possible workaround is to use a lambda which works well for some use cases though we try to avoid them when possible:
gremlin> g.V().union(outE(),__.in()).choose(filter{it.get() instanceof Vertex}, values('name'), values('weight'))
==>0.4
==>0.5
==>1.0
==>marko
==>marko
==>josh
==>peter
==>1.0
==>0.4
==>marko
==>josh
==>0.2

Some way to count elements in a property in gremlin?

Hi, I have a node that represents an email, and one of the properties is the recipients (e.g. { 'john@doe.com', 'max@example.com' }).
Is there a way that I can count how many recipients each email has?
So assuming:
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV().property('emails','["x@x.com","y@y.com"]')
==>v[0]
I'd start by saying that you should probably parse that JSON to multi-properties if your graph supports it because then you get a more natural approach to dealing with that data. It would be something like:
g.V(0L).values('emails').count()
Gremlin simply doesn't have native methods for parsing JSON so that leaves you with two options I guess:
Use a lambda and a Groovy JsonSlurper
Just return the JSON string and parse it on the client to get your count in your native programming language.
If you were using a lambda it would look like this:
gremlin> json = new groovy.json.JsonSlurper()
==>groovy.json.JsonSlurper@421a4ee1
gremlin> g.V(0L).values('emails').map{json.parseText(it.get())}
==>[x@x.com,y@y.com]
gremlin> g.V(0L).values('emails').map{json.parseText(it.get())}.count(local)
==>2
Note that this assumes your graph supports lambdas and that you can make use of JsonSlurper in that environment. We typically try to get folks to avoid lambdas so your best choice would be to model your data better (i.e. multiproperties or a List) or to process the JSON locally.
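For the second option (parsing on the client), here is a minimal sketch in Python using only the standard library. It assumes the property value has already been fetched as a plain string; the count_recipients helper and the sample addresses are made up for illustration and are not part of any Gremlin driver.

```python
import json


def count_recipients(raw_property: str) -> int:
    """Parse a JSON-array property value and return how many entries it has."""
    recipients = json.loads(raw_property)
    if not isinstance(recipients, list):
        raise ValueError("expected a JSON array of recipients")
    return len(recipients)


# Stand-in for the string a query such as g.V(0).values('emails') would return.
raw_emails = '["a@example.com","b@example.com"]'
print(count_recipients(raw_emails))  # prints 2
```

The traversal only has to return the raw string, so no lambda support is needed on the server.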

How to add a default value for empty traversals in gremlin?

I'm working on a gremlin query that navigates along several edges and eventually produces a String. Depending on the graph content, this traversal may be empty. If the traversal ends up empty, I want to return a default value instead.
Here's what I am currently doing:
GraphTraversal<?, ?> traversal = g.traversal().V().
    // ... fairly complex navigation here...
    // eventually, we arrive at the target vertex and use its name
    .values("name")
    // as we don't know if the target vertex is present, lets add a default
    .union(
        identity(),  // if we found something we want to keep it
        constant("") // empty string is our default
    )
    // to make sure that we do not use the default if we have a value...
    .order().by(s -> ((String)s).length(), Order.decr)
    .limit(1)
This query works, but it is fairly convoluted - all I want is a default if the traversal ends up not finding anything.
Does anybody have a better proposal? My only restriction is that it has to be done within gremlin itself, i.e. the result must be of type GraphTraversal.
You can probably use coalesce() in some way:
gremlin> g.V().has('person','name','marko').coalesce(has('person','age',29),constant('nope'))
==>v[1]
gremlin> g.V().has('person','name','marko').coalesce(has('person','age',231),constant('nope'))
==>nope
If you have more complex logic in mind for determining if something is found or not then consider choose() step.

In TinkerPop can vertex properties contain complex objects?

I am playing with TinkerGraph and gremlin-scala and I see that it is capable of persisting complex objects:
case class InnerObj(a: Int, b: String)
case class ComplexObj(a: Int, b: InnerObj)
case class SuperComplexObj(a : String, b: ComplexObj)

class GremlinQueriesSpec extends FlatSpec
  with ScalaFutures with MustMatchers {
  behavior of "Gremlin queries"
  it must "be able to persist complex objects containing collections" taggedAs Integration in {
    val g = TinkerGraph.open()
    implicit val graph = g.asScala
    val user = RandomData.randomUserDataAggregate
    graph + user
    graph.V().toCC[UserDataAggregate].toList() must be eq List(user)
  }
}
However, the docs are not completely clear to me. On the one hand, there's not much structure available for property values besides lists, sets, and metaproperties. On the other hand, the docs say:
A Property denotes a key/value pair associated with an Edge. A
property is much like a Java8 Optional in that a property can be not
present (i.e. empty). The key of a property is always a String and the
value of a property is an arbitrary Java object. Each underlying graph
engine will typically have constraints on what Java objects are
allowed to be used as values.
OK, it looks like it depends on the implementation. But is it possible to work with nested objects in Gremlin queries?
It is indeed dependent on the implementation. You are using TinkerGraph which can in turn store any Java object so you're free to put whatever you like in there:
gremlin> g.addV().property('function',{it.length()})
==>v[2]
gremlin> g.V().sideEffect{println(it.get().value('function')("four"))}
4
==>v[2]
Crazy, right? Of course, you will need to consider issues like serialization if you start sticking random odds and ends in there and need to persist those objects or push them over the wire (like through Gremlin Server).
There is no problem with a nested object as a value for a TinkerGraph property. Just be careful. Really stop to think about your schema before going too deep down the path of storing complex Java objects as properties. Perhaps it would be better to model those objects as first-class elements of the graph so that Gremlin traversals can work over them directly.

Transactions in Tinkerpop3 DSE

This is a question about the transaction semantics in DSE Graph via the TinkerPop3 Java API.
As I understand it, DseSession.executeGraph(String query) is executed in a single transaction.
But what is the mechanism to achieve the same using the TinkerPop3 Java API?
Say I want to add a vertex, then drop a few edges, then add a new vertex, and so on.
I tried
g.V("{~label=company, id=12345}").drop().addV("company").property("id", "1").next()
This didn't work as I intended (dropping the company and adding one in a single transaction). The first vertex did indeed get deleted, but the new company was never created.
I got the exception below:
Exception in thread "main" java.util.NoSuchElementException
at com.datastax.dse.graph.internal.DseRemoteTraversal.nextTraverser(DseRemoteTraversal.java:60)
The drop() step "kills" the traverser you are deleting, meaning nothing is returned from drop(). You can see that if you try this:
gremlin> g.V().has('name','marko').drop().next()
java.util.NoSuchElementException
Type ':help' or ':h' for help.
Display stack trace? [yN]
Without a traverser there is nothing to feed forward to your addV() step that follows. There might be a nicer way to do this, but you can work around that problem by using inject() to add a throwaway value as a traverser that will feed into your addV(). In this case, you deleted one thing, so you need to put one traverser back to trigger the single call to addV() thus:
gremlin> g.V().has('name','marko').drop().
......1> inject(1).
......2> addV('person').property('name','stephen').next()
==>v[13]
I also wonder if this is a nice way to do it:
gremlin> g.V().has('name','marko').
......1> union(drop(),
......2> addV('person').property('name','stephen'))
==>v[13]
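To make the drop()/inject() explanation above more concrete, here is a toy model in plain Python, with generators standing in for traversal steps. It is only an analogy under stated assumptions: drop, inject, add_v and run are hypothetical helpers written for this sketch, not TinkerPop or driver APIs, and the "graph" is just a list of names.

```python
from typing import Iterable, Iterator, List, Tuple


def drop(stream: Iterable[str], graph: List[str]) -> Iterator[str]:
    """Toy drop(): delete every incoming element and pass nothing downstream."""
    for element in stream:
        graph.remove(element)
    yield from ()  # no traverser survives this step


def inject(stream: Iterable[object], *values: object) -> Iterator[object]:
    """Toy inject(): add extra values to whatever arrives from upstream."""
    yield from values
    yield from stream


def add_v(stream: Iterable[object], graph: List[str], name: str) -> Iterator[str]:
    """Toy addV(): create one vertex per incoming traverser."""
    for _ in stream:
        graph.append(name)
        yield name


def run(use_inject: bool) -> Tuple[List[str], List[str]]:
    graph = ["marko"]
    matches = [v for v in graph if v == "marko"]  # stands in for has('name','marko')
    stream: Iterable[object] = drop(matches, graph)
    if use_inject:
        stream = inject(stream, 1)  # one throwaway traverser
    created = list(add_v(stream, graph, "stephen"))
    return created, graph


print(run(use_inject=False))  # nothing reaches add_v, so no vertex is created
print(run(use_inject=True))   # the injected value triggers a single add_v call
```

In this model, add_v runs once per item it receives, which is why a single throwaway value is enough to trigger exactly one creation.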
