Transactions in Tinkerpop3 DSE - gremlin

I am trying to understand the transaction semantics of DSE Graph when using the TinkerPop3 Java API.
As I understand it, a call to DseSession.executeGraph(String query) is executed in a single transaction.
But what is the mechanism to achieve the same using the TinkerPop3 Java API?
Say I want to add a vertex, then drop a few edges, then add a new vertex, and so on.
I tried:
g.V("{~label=company, id=12345}").drop().addV("company").property("id", "1").next()
This didn't work as I intended (dropping the company and adding a new one in a single transaction). The first vertex was indeed deleted, but the new company was never created.
I got the exception below:
Exception in thread "main" java.util.NoSuchElementException
at com.datastax.dse.graph.internal.DseRemoteTraversal.nextTraverser(DseRemoteTraversal.java:60)

The drop() step "kills" the traverser you are deleting, meaning nothing is returned from drop(). You can see that if you try this:
gremlin> g.V().has('name','marko').drop().next()
java.util.NoSuchElementException
Type ':help' or ':h' for help.
Display stack trace? [yN]
Without a traverser there is nothing to feed forward to your addV() step that follows. There might be a nicer way to do this, but you can work around that problem by using inject() to add a throwaway value as a traverser that will feed into your addV(). In this case, you deleted one thing, so you need to put one traverser back to trigger the single call to addV() thus:
gremlin> g.V().has('name','marko').drop().
......1> inject(1).
......2> addV('person').property('name','stephen').next()
==>v[13]
I also wonder if this is a nice way to do it:
gremlin> g.V().has('name','marko').
......1> union(drop(),
......2> addV('person').property('name','stephen'))
==>v[13]
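To make the mechanics a bit more concrete, here is a rough analogy in plain Java streams. It does not use the TinkerPop API at all: the List standing in for the graph, the dropAll helper and the class name are all invented for illustration, and a stream is only a loose model of a traversal. The point it tries to show is that a step which emits nothing leaves later steps with no input, and that adding one throwaway value back in (the role inject(1) plays above) is enough to trigger exactly one downstream call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.stream.Stream;

// Toy model only: a List<String> stands in for the graph, a Stream for the traversal.
class DropAnalogy {

    // Models has(name).drop(): deletes matching entries as a side effect and
    // lets nothing through, so downstream steps receive no input.
    static Stream<String> dropAll(List<String> store, String name) {
        // Iterate over a snapshot so removing from 'store' is safe.
        return new ArrayList<>(store).stream()
                .filter(v -> v.equals(name))
                .filter(v -> { store.remove(v); return false; });
    }

    // Models drop().addV(...).next(): the add never runs and next() throws.
    static String dropThenAdd(List<String> store, String oldName, String newName) {
        return dropAll(store, oldName)
                .map(ignored -> { store.add(newName); return newName; })
                .iterator().next();
    }

    // Models drop().inject(1).addV(...).next(): the injected value triggers one add.
    static String dropInjectThenAdd(List<String> store, String oldName, String newName) {
        return Stream.<Object>concat(dropAll(store, oldName), Stream.of(1))
                .map(ignored -> { store.add(newName); return newName; })
                .iterator().next();
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(List.of("marko"));
        try {
            dropThenAdd(store, "marko", "stephen");
        } catch (NoSuchElementException e) {
            System.out.println("nothing left to feed the add step, store = " + store);
        }
        store = new ArrayList<>(List.of("marko"));
        System.out.println(dropInjectThenAdd(store, "marko", "stephen") + ", store = " + store);
    }
}
```

In the first call the old entry is removed but nothing is added, which mirrors what the question describes: the delete happened, the add did not, and next() had nothing to return.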

Related

Is it possible to get which or condition resulted true in traversals?

We have a photo-sharing service in which a user can allow or deny other sets of users permission to view a photo. We expose this as an API, /view?caller=userId&photoId=photoId. We use the AWS Neptune graph database service to maintain this authorization data, via the TinkerPop Java library.
For maintainability, we fetch the possible paths from methods in other classes and call the canUserView method from outside.
public boolean canUserView(User user, String photoId) {
    return graph.V(user.getId()).hasLabel("user").or(getPossibleTraversals(user)).hasNext();
}

private GraphTraversal<Object, Vertex>[] getPossibleTraversals(User user) {
    List<GraphTraversal<Vertex, Vertex>> traversals = collectTraversalsFromExternal();
    return traversals.toArray(GraphTraversal[]::new);
}
collectTraversalsFromExternal() queries our other datastore, and we form the query based on the result.
At the end of every or() child traversal we inject a unique constant value to identify that traversal.
We were using .union() earlier to get both the result and the constant value, but due to performance issues we now use .or().
This might be a dumb question: is it possible to find out which traversal evaluated to true?
Using the air routes data set, here is one way you can achieve what you are looking for.
The or() step filters out all airports that are in neither Texas nor Georgia. After that, the choose() step returns a constant indicating which or() path was taken. You can of course do something more interesting than return a constant value in your query.
gremlin> g.V(1,3,12).values('region')
==>US-GA
==>US-TX
==>US-NY
gremlin> g.V(1,3,12).
......1> or(has('region','US-GA'),
......2> has('region', 'US-TX')).
......3> choose(has('region','US-GA'),constant(1),constant(2))
==>1
==>2

gremlin fold() unfold() throwing errors: "Vertex with id 70 was removed" and "The incoming object is not removable"

This query works:
choose(V().hasLabel("user"), V().hasLabel("user").drop())
But it searches for the "user" vertices twice. I want to search only once, so I changed it to this:
choose(V().hasLabel("user").fold(), unfold().drop())
If I do that trick multiple times in my query it throws:
Error: ResponseError: Server error: Vertex with id 70 was removed. (500)
I don't understand what that means. Maybe fold() does not overwrite previous fold() calls; I don't know.
Is there an alternative to fold() unfold() for this use case? I've tried:
choose(V().hasLabel("user").as("u"), select("u").drop())
but that does not work. It seems .as() doesn't save anything when it is called inside choose().
I also tried:
choose(V().hasLabel("user").store("u"), select("u").drop())
But that throws another error I don't understand: The incoming object is not removable
Your first attempt with:
choose(V().hasLabel("user"), V().hasLabel("user").drop())
is not as expensive as you think. The "if" portion of choose() (i.e. the first child traversal) does not iterate all users; it returns true as soon as one "user" vertex is found. Your "then" child traversal will iterate all "user" vertices. Also, you don't provide a false branch to choose(), so in that case I believe that you will end up calling the drop() traversal in either situation:
gremlin> g.inject(1).choose(V().hasLabel('no-exist'),constant(1))
==>1
gremlin> g.inject(1).choose(V().hasLabel('no-exist'),constant(1),constant(2))
==>2
Of course, I'd wonder why you need to do an if/then here at all because calling V().hasLabel("user").drop() without choose() will remove all "user" vertices found, or if none are found, just do nothing.
For this traversal:
choose(V().hasLabel("user").fold(), unfold().drop())
note that V().hasLabel("user").fold() will always return "true", because fold() is a reducing step that returns a List of the items in the stream. If the stream is empty you get an empty List, and thus choose() will take the true path. In any event, your unfold() is not unrolling what is returned from the first choose() parameter; it unfolds the same Traverser object that was handed to the choose() step. You don't show what comes before the choose(), so I can't say what that is.
I'm not completely sure but based on your remaining traversal examples, I think you might be misusing choose() in general. You don't seem to have need for an if/then or switch style operation. With Gremlin you don't need to check if something is present in order to remove it and it is in fact cheaper not to do so as mentioned earlier.

Reflect on Gremlin traversal type (Edge, Vertex, Property) in CHOOSE step, possible?

I am extending sparql-to-gremlin code to support fully and partially unbound predicate queries that can be used by automated processes to explore the graph structure. The idea being that you could just connect to some graph DB and ask a fully unbound query with some limit and get vertex properties, edge types, edge properties, etc. That can then be explored more.
I can now solve a fully unbound query and can solve one that has the subject bound to a vertex. Now I am trying to put it together into a multi-literal query and finding that the Gremlin match() step would need to reflect on the type of the traverser before it can decide which steps actually apply. For example, if the traversal results in a Vertex, asking for out/in edges and properties makes sense; if it is an Edge, though, asking for out/in edges does not make sense and actually results in errors about an unexpected type being thrown.
Thus the question: is it possible to write a kind of "switch" statement that would reflect on the type and then only ask for things that make sense in that context?
Here’s one type of SPARQL query that I am trying to support (based on the Graph of the Gods described here https://old-docs.janusgraph.org/0.1.0/getting-started.html):
https://old-docs.janusgraph.org/0.1.0/images/graph-of-the-gods-2.png
SELECT ?BATTLE ?PRED ?VALUE
WHERE {
vid:6 ep:battled ?BATTLE .
?BATTLE ?PRED ?VALUE .
}
Here we are starting from a vertex with id 6, grabbing the outgoing edge reference with “battled” label, then grabbing all possible properties of the edge along with their values.
Here the vertex with id 6 is Hercules, which has 3 outgoing edges with label "battled" going to the vertices with id 9 (Nemean), 10 (Hydra) and 11 (Cerberus). I would want ?PRED to be bound to v:id (edge id), v:label (edge label), v:time (edge time property value), v:place (edge place property value), and eps:battled (an extension to sparql-to-gremlin relating an edge to its IN vertex).
I think that I follow your problem and I don't think I have a good answer for you. At the moment, Gremlin isn't terribly good with type detection and the issue remains open as TINKERPOP-2234. The typical workaround for most people when they have a mixed set of elements in a stream is to use a step like coalesce() or choose() to act as a form of switch statement and then figure out some filter that can identify the object type. So here are some mixed results that I've contrived:
gremlin> g.V().union(outE(),__.in())
==>e[9][1-created->3]
==>e[7][1-knows->2]
==>e[8][1-knows->4]
==>v[1]
==>v[1]
==>v[4]
==>v[6]
==>e[10][4-created->5]
==>e[11][4-created->3]
==>v[1]
==>v[4]
==>e[12][6-created->3]
and then I test with hasLabel() for labels I know to belong to vertices only, then everything else must be an edge:
gremlin> g.V().union(outE(),__.in()).choose(hasLabel('person','software'), values('name'), values('weight'))
==>0.4
==>0.5
==>1.0
==>marko
==>marko
==>josh
==>peter
==>1.0
==>0.4
==>marko
==>josh
==>0.2
Not ideal obviously but it typically resolves most people's problems. Hopefully we will see TINKERPOP-2234 solved for 3.5.0.
Another possible workaround is to use a lambda which works well for some use cases though we try to avoid them when possible:
gremlin> g.V().union(outE(),__.in()).choose(filter{it.get() instanceof Vertex}, values('name'), values('weight'))
==>0.4
==>0.5
==>1.0
==>marko
==>marko
==>josh
==>peter
==>1.0
==>0.4
==>marko
==>josh
==>0.2

Gremlin console addV seem not to add vertices

In Gremlin Console, in order to add a vertex, I do the following:
// One time initialization
graph = TinkerGraph.open()
g = graph.traversal()
// Add the vertex
g.addV('somelabel')
And in the console I get:
==>v[0]
But if I try to traverse the vertex:
g.V(0)
I get nothing in the console, as if the id were wrong.
Proof that nothing is returned:
g.V(0).count()
==>0
If I define the id myself instead:
g.addV('somelabel').property(id, 1)
Everything works fine:
g.V(1)
==>v[1]
But I would rather not define the ids myself...
What am I doing (or thinking) wrong?
Software version is JanusGraph 0.2.2, Apache TinkerPop 3.2.9
You are not using JanusGraph here but TinkerGraph, an in-memory graph store that is often used for testing or simple examples.
TinkerGraph uses long ids by default, which means that it cannot find your vertex when you pass an int id. It should work when you use a parameter of type long:
gremlin> g.addV('somelabel')
==>v[0]
gremlin> g.V(0)
gremlin> g.V(0L)
==>v[0]
The configuration section for TinkerGraph explains how this configuration can be changed to use different types for its ids.
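If it helps to see why 0 and 0L are treated as different ids, here is a small plain-Java sketch. It does not reproduce TinkerGraph's internals; the map standing in for the vertex store and the names IdLookup, storeWithLongId and find are made up for illustration, and the assumption that the lookup is an equality match on the boxed id is mine, though it is consistent with the console output above. What is certain is the Java behaviour it relies on: a boxed Integer is never equal to a boxed Long, even when both hold the same number.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a map keyed by Long ids stands in for the vertex store.
class IdLookup {

    // Builds a store holding one "vertex" under the Long id 0.
    static Map<Object, String> storeWithLongId() {
        Map<Object, String> vertices = new HashMap<>();
        vertices.put(0L, "v[0]"); // the literal 0L is autoboxed to java.lang.Long
        return vertices;
    }

    // Looks up by whatever boxed type the caller passes in.
    static String find(Map<Object, String> vertices, Object id) {
        return vertices.get(id);
    }

    public static void main(String[] args) {
        Map<Object, String> vertices = storeWithLongId();
        System.out.println(find(vertices, 0));   // Integer key, no match
        System.out.println(find(vertices, 0L));  // Long key, match
        System.out.println(Long.valueOf(0L).equals(Integer.valueOf(0)));
    }
}
```

The same reasoning explains why g.V(1) worked after property(id, 1): in that case the id was stored as an int and looked up as an int.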

Why g.V() in gremlin is a GraphTraversal<Vertex, Vertex>?

g.V() is usually used to start a traversal. But since this is the starting step, why is its start type Vertex?
g is the TraversalSource. By specifying V() you are saying that the traversal starts at a set of elements of type Vertex, just as specifying E() would start it at a set of elements of type Edge. http://tinkerpop.apache.org/docs/current/reference/#the-graph-process and http://tinkerpop.apache.org/docs/current/reference/#traversal are pretty clear on how traversals are built up.
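The two type parameters can be read as "what the traversal starts with" and "what it currently ends with". The toy class below is not TinkerPop code; ToyTraversal, start and map are hypothetical names used only to sketch the idea in plain Java. Right after the start step nothing has been transformed yet, so both type parameters are the same, and only later steps change the second one.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Toy model only: S is the start type, E is the current end type.
class ToyTraversal<S, E> {
    private final List<S> starts;
    private final Function<S, E> steps;

    private ToyTraversal(List<S> starts, Function<S, E> steps) {
        this.starts = starts;
        this.steps = steps;
    }

    // Plays the role of g.V(): no step applied yet, so start and end types match.
    static <T> ToyTraversal<T, T> start(List<T> elements) {
        return new ToyTraversal<>(elements, Function.identity());
    }

    // Adding a step keeps S fixed but may change the end type from E to E2.
    <E2> ToyTraversal<S, E2> map(Function<E, E2> step) {
        return new ToyTraversal<>(starts, steps.andThen(step));
    }

    // Runs every start element through the accumulated steps.
    List<E> toList() {
        return starts.stream().map(steps).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ToyTraversal<String, String> begin = ToyTraversal.start(List.of("marko", "josh"));
        ToyTraversal<String, Integer> lengths = begin.map(String::length);
        System.out.println(lengths.toList());
    }
}
```

In the same spirit, g.V() is a GraphTraversal<Vertex, Vertex> because it both starts and (so far) ends with vertices; a later step such as values('name') would keep the start type and change the end type.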
