Can't update HashMap in Titan created via RexPro - gremlin

I'm running Titan 0.3.2 with Cassandra as a data store. I'm also using rexpro-python for interacting with Titan over RexPro.
Issue & Question:
If I run this Python code:
>>> import rexpro
>>> conn = rexpro.RexProConnection('localhost', 8184, 'graph')
>>> conn.execute('g.addVertex(null, node_dict)', {'node_dict':{'my_dict':{}}})
{'_type': 'vertex', '_id': '2280164', '_properties': {'my_dict': {}}}
>>> conn.execute('g.commit()')
I can look up the resulting node in the Gremlin console:
gremlin> g.v(2280164).map
==>{my_dict={}}
And it looks like the my_dict map is created properly:
gremlin> g.v(2280164).my_dict.getClass()
==>class java.util.HashMap
However, I'm unable to update my_dict with a new key & value:
gremlin> g.v(2280164).my_dict['abc'] = 123
==>123
gremlin> g.commit()
==>null
gremlin> g.v(2280164).map
==>{my_dict={}}
If I first reset my_dict to a new object in the console, my attempt to add key abc works as expected:
gremlin> g.v(2280164).my_dict = [:]
gremlin> g.v(2280164).my_dict['abc'] = 123
==>123
gremlin> g.v(2280164).map
==>{my_dict={abc=123}}
My question: how can I update my_dict with a new KV pair on a vertex created via RexPro?
Attempted workarounds/solutions:
This same issue happens when the vertex is created with a non-parameterized version of the python script:
>>> conn.execute('g.addVertex([my_dict:[:]])')
Also, not sure if relevant for this issue, but it looks like the new map created in the console is a LinkedHashMap (whereas the python/rexpro code created a HashMap):
gremlin> g.v(2280164).my_dict.getClass()
==>class java.util.LinkedHashMap

Updating the dictionary/map that comes off a vertex property circumvents the database level since you are directly modifying the value on the heap.
In other words, Titan does not know that you updated the map and therefore does not persist the change.
Always treat property values as immutable, even though they may be mutable Java objects, because such mutations are invisible to the database.
This will work:
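v = g.v(2280164)                   // the vertex from the question
newdict = v.my_dict.clone()        // copy the stored map instead of mutating it in place
newdict['hello'] = 'other'
v.setProperty('my_dict', newdict)  // write the copy back so Titan sees the change
g.commit()
v.map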
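Since the question drives Titan from rexpro-python, the same copy-then-setProperty pattern could be sent over RexPro. This is only a sketch: it relies on the conn.execute(script, params) call shown in the question, the helper name update_map_property and the binding names (vertex_id, prop_key, entry_key, entry_value) are made up for illustration, and I'm assuming the vertex id can be passed as an integer binding.

```python
# Gremlin-Groovy script executed server-side. The lowercase names
# (vertex_id, prop_key, ...) are bindings supplied via the params dict;
# they are illustrative and not part of any library API.
UPDATE_MAP_SCRIPT = """
v = g.v(vertex_id)
current = v.getProperty(prop_key)
updated = current == null ? new HashMap() : new HashMap(current)
updated.put(entry_key, entry_value)
v.setProperty(prop_key, updated)
g.commit()
updated
"""


def update_map_property(conn, vertex_id, prop_key, entry_key, entry_value):
    """Copy the stored map, add one entry, and write it back with setProperty.

    `conn` is anything with an execute(script, params) method, such as the
    RexProConnection used in the question.
    """
    params = {
        "vertex_id": vertex_id,
        "prop_key": prop_key,
        "entry_key": entry_key,
        "entry_value": entry_value,
    }
    return conn.execute(UPDATE_MAP_SCRIPT, params)


# Hypothetical usage (requires a running Rexster/Titan server):
# import rexpro
# conn = rexpro.RexProConnection('localhost', 8184, 'graph')
# update_map_property(conn, 2280164, 'my_dict', 'abc', 123)
```

The important part is that the map written back is a new object handed to setProperty, so the change goes through the database rather than only touching the value on the heap.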

I tried to recreate this entirely in the REPL and couldn't. I even explicitly created a java.util.HashMap and it seemed to work fine with Titan/Cassandra. I think LinkedHashMap is the default for Groovy when you do [:].
gremlin> g = TitanFactory.open('bin/cassandra.local')
==>titangraph[cassandrathrift:127.0.0.1]
gremlin> g.addVertex()
==>v[4]
gremlin> g.v(4).my_dict = new java.util.HashMap()
gremlin> g.v(4).map
==>{my_dict={}}
gremlin> g.v(4).my_dict['abc'] = 123
==>123
gremlin> g.v(4).map
==>{my_dict={abc=123}}
gremlin> g.v(4).my_dict.getClass()
==>class java.util.HashMap
gremlin> g.commit()
==>null
gremlin> g.v(4).map
==>{my_dict={abc=123}}
What happens if you don't parameterize your RexPro request? In other words, do you get different results if you do:
>>> conn.execute('g.addVertex([my_dict:[:]])')
If that works, it might be a workaround for you. I know the recommendation is to "parameterize requests" via RexPro, but since this script is basically static, it will cache nicely in the script engine without much additional cost per request.


Concatenate Gremlin GraphTraversal result with string

In this very simple example I am trying to add a new vertex which should be labeled like an existing vertex but with some prefix attached:
g.V(1).addV('prefix_' + label()).valueMap(true)
What am I missing here? It's clearly not a String, but how would I serialize that?
gremlin> g.V(1).label()
==>Person
gremlin> g.V(1).constant(label())
==>[LabelStep]
Gremlin today does not provide a built-in string concatenation function. It would be nice if it did. That means your best alternative today is to use an inline closure/lambda. Here is an example using TinkerGraph and the air-routes graph.
gremlin> g.V(3).map {"prefix_" + it.get().label}.as('a').addV(select('a'))
==>v[60867]
gremlin> g.V(60867).label()
==>prefix_airport
Note that not all graph databases allow closures so this cannot be assumed to work universally on any TinkerPop enabled Graph DB backend.
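Where closures are not allowed, one alternative is to do the concatenation on the client in two steps: read the label, then create the vertex. A minimal sketch, assuming a gremlinpython-style traversal source g is passed in; the helper name add_prefixed_vertex is hypothetical, and the two requests are not atomic.

```python
def add_prefixed_vertex(g, vertex_id, prefix="prefix_"):
    """Create a vertex whose label is `prefix` + the label of an existing vertex.

    `g` is assumed to be a gremlinpython-style traversal source. The string
    concatenation happens on the client, so no server-side lambda is needed.
    """
    # First round trip: fetch the label of the existing vertex.
    label = g.V(vertex_id).label().next()
    # Second round trip: add the new vertex with the prefixed label.
    return g.addV(prefix + label).next()
```

The trade-off is an extra round trip, and another client could change or remove the source vertex between the two requests.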

Variables - Gremlin on Neptune

I'm on the Gremlin console and connected to an AWS Neptune DB
g.V().hasLabel('Risk').toList()
provides the expected results, but when I try to assign the results to a variable
risks = g.V().hasLabel('Risk').toList()
I get the message below. The token recognition error is clearly thrown on the variable name, because when I change the variable name, the position and the "at" part of the message change.
{"requestId":"650d7c4b-44d1-43f8-b9a3-fbf085cc3ead","code":"MalformedQueryException","detailedMessage":"Query parsing failed at line 1, character position at 3, error message : token recognition error at: 'ks'"}
I tried the same thing on a Tinkerpop database and the variable assignment works. What am I missing?
As you noticed, you cannot assign a variable when connected to Neptune with the console in :remote console mode. One thing you can do, with the console in local mode but connected to the server, is issue a query such as :> g.V().limit(1); the result of the query will then be accessible via a special variable maintained by the console called result. Here is a simple example:
gremlin> :> g.V().count()
==> 3653
gremlin> result
==> result{object=3653 class=java.lang.String}
gremlin> println result['object']
[3653]
gremlin> r = result['object'][0]
gremlin> println r
3653
Here is an example with a list as the result.
gremlin> :> g.V('1','2').values('city').toList()
==>Atlanta
==>Anchorage
gremlin> result
==>result{object=Atlanta class=java.lang.String}
==>result{object=Anchorage class=java.lang.String}
gremlin> result.each {println "City was : " + it['object']}
City was : Atlanta
City was : Anchorage
I sometimes find this useful if you want to use Groovy to post process some results in the console when the ability to assign a variable is not available.

Query ACL graph using gremlin with parent hierarchy check

my ACL graph
In summary, every user or group that has access to Folder1 also has access to File1.
Use case #1: checking access for user1 on Folder1, which works fine with the Gremlin below:
g.V().has('user','userId','user1').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(outE('has_permission').has('permission','VS_F').
inV().has('folder','folderId','Folder1')).hasNext()
Use case #2: checking access for user1 on File1. How can I modify the Gremlin above to achieve this? I'm pretty new to Gremlin and doing a POC on JanusGraph for my employer.
First things first: You should always start with a script that people can run to create your sample graph. Also, formatting the code in your question wouldn't hurt.
With that said, here's the script that creates your graph:
g = TinkerGraph.open().traversal()
g.addV('user').property('userId','user1').as('u1').
addV('user').property('userId','user2').as('u2').
addV('user').property('userId','user3').as('u3').
addV('group').property('groupId','group1').as('g1').
addV('group').property('groupId','group2').as('g2').
addV('group').property('groupId','group3').as('g3').
addV('folder').property('folderId','folder1').as('f1').
addV('file').property('fileId','file1').
addE('in_folder').to('f1').
addE('member_of').from('u1').to('g1').
addE('member_of').from('u2').to('g2').
addE('member_of').from('u3').to('g3').
addE('member_of').from('g3').to('g1').
addE('has_permission').from('g1').to('f1').
addE('has_permission').from('u2').to('f1').iterate()
...and the query you're looking for:
g.V().has('file','fileId','file1').
until(inE('has_permission')).
repeat(out('in_folder')).as('folder').
V().has('user','userId','user1').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(out('has_permission').where(eq('folder'))).hasNext()
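To make the logic of that traversal easier to follow, here is the same check modelled in plain Python over the sample graph. This is only an illustrative sketch, not how JanusGraph evaluates the query: the dictionaries and function names are invented, and for simplicity only the first in_folder parent is followed.

```python
# Edges of the sample graph, grouped by label (source -> list of targets).
IN_FOLDER = {"file1": ["folder1"]}
MEMBER_OF = {
    "user1": ["group1"],
    "user2": ["group2"],
    "user3": ["group3"],
    "group3": ["group1"],
}
HAS_PERMISSION = {"group1": ["folder1"], "user2": ["folder1"]}


def resource_with_permission(resource):
    """Walk in_folder edges upward until a vertex has an incoming
    has_permission edge (the until(...).repeat(out('in_folder')) part)."""
    granted = {t for targets in HAS_PERMISSION.values() for t in targets}
    seen = set()
    while resource is not None and resource not in granted and resource not in seen:
        seen.add(resource)
        parents = IN_FOLDER.get(resource, [])
        resource = parents[0] if parents else None  # simplification: first parent only
    return resource if resource in granted else None


def principals(user):
    """The user plus every group reachable over member_of (emit + repeat)."""
    found, stack = set(), [user]
    while stack:
        p = stack.pop()
        if p not in found:
            found.add(p)
            stack.extend(MEMBER_OF.get(p, []))
    return found


def has_access(user, resource):
    """True if the user, or any group it belongs to, has permission on the
    first resource in the folder chain that carries a permission."""
    target = resource_with_permission(resource)
    if target is None:
        return False
    return any(target in HAS_PERMISSION.get(p, []) for p in principals(user))


print(has_access("user1", "file1"))   # user1 -> group1 -> folder1
print(has_access("group2", "file1"))  # group2 has no permission edge
```

The first half of the Gremlin query corresponds to resource_with_permission, and the second half (starting at V().has('user', ...)) corresponds to principals plus the final membership test.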

How to Generate TinkerPOP ByteCode?

I wonder if anyone can give me some pointers (that is, where to start) on how to implement a translation from, let's say, a query language A to TinkerPop graph traversals. Let's assume that A's semantics intuitively translate to a subset of TinkerPop traversals. In other words, what I am asking is:
Where can I find the instruction set of the TinkerPop virtual machine?
What does TinkerPop virtual machine Bytecode look like?
Is there any API documentation that helps with that?
The documentation says it is easy to generate the Bytecode but does not go into great detail about the Bytecode, its shape, and so on.
I am hoping that someone could help with that: the section "Graph Language Provider" is empty in the current documentation.
On the JVM you can get a Bytecode object from any traversal with asAdmin().getBytecode() as follows:
gremlin> g.V().hasLabel('person').out().in().tree().asAdmin().getBytecode()
==>[[], [V(), hasLabel(person), out(), in(), tree()]]
Converted to GraphSON format the Bytecode format looks like this (example from the IO documentation):
{
"@type" : "g:Bytecode",
"@value" : {
"step" : [ [ "V" ], [ "hasLabel", "person" ], [ "out" ], [ "in" ], [ "tree" ] ]
}
}
The full instruction set is basically bound to the JVM at this time and is simply the list of Gremlin steps plus related expressions/tokens (e.g. P, T, etc). We are currently working on getting Gremlin defined more as a specification first rather than having it bound to the JVM as it is today, but that will take some time to complete.
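For a translator that lives outside the JVM, a starting point could be to emit that GraphSON structure directly. The sketch below is a toy under stated assumptions: the function name to_graphson_bytecode is made up, it uses the "@type"/"@value" keys of GraphSON, and it only handles steps whose arguments are plain strings (numbers, predicates such as P, and nested traversals need their own typed GraphSON wrappers).

```python
import json


def to_graphson_bytecode(steps):
    """Turn a list of (step_name, *string_args) tuples into a GraphSON
    Bytecode dictionary. Only string arguments are supported in this sketch."""
    return {
        "@type": "g:Bytecode",
        "@value": {"step": [list(step) for step in steps]},
    }


# The traversal g.V().hasLabel('person').out().in().tree() as a step list.
steps = [("V",), ("hasLabel", "person"), ("out",), ("in",), ("tree",)]
bytecode = to_graphson_bytecode(steps)
print(json.dumps(bytecode, indent=2))
```

A front end for language A would then only need to produce the step list; serialization stays in one place.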
Note that you are talking about development of a Gremlin Compiler. There already is an example which is just about ready for release as I write this in sparql-gremlin - pre-release documentation can be found here. This module takes the SPARQL query language and converts it to Gremlin Bytecode.
gremlin> graph = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> g = graph.traversal(SparqlTraversalSource)
==>sparqltraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.sparql("""SELECT ?name ?age
WHERE { ?person v:name ?name . ?person v:age ?age }
ORDER BY ASC(?age)""")
==>[name:vadas,age:27]
==>[name:marko,age:29]
==>[name:josh,age:32]
==>[name:peter,age:35]
The code is not terribly complex - perhaps you could look to it for inspiration. If you have further questions please consider asking them on gremlin-users mailing list. It would be great to see more Gremlin Compilers available. There are currently others (SQL and Cypher), but I believe that the SPARQL compiler is the only one that is Bytecode based at this time.

Have Gremlin-console show all the methods available for a specific object?

In gremlin-console, is there a way to show all the methods available for a specific object?
For example, In gremlin-console if I type g.V().hasLabel("person") and I want to see what methods I can chain/call for the object returned by g.V().hasLabel("person"). How do you do that?
The answer is to use the <Tab> key.
gremlin> "test".c
capitalize() center( charAt( chars() codePointAt( codePointBefore( codePointCount( codePoints() collectReplacements( compareTo(
compareToIgnoreCase( concat( contains( contentEquals( count(
However, I'm finding that it is not working for something like g.V().o which I'd hoped would have shown out(). Apparently, the groovy shell (which is what the Gremlin Console is based on) doesn't seem to want to do the auto-complete on a fluent API. It seems to only work on the first object on which you are calling the method:
gremlin> g.
E( V( addV( addV() close() inject( tx() withBindings( withBulk(
withComputer( withComputer() withPath() withRemote( withSack( withSideEffect( withStrategies( withoutStrategies( anonymousTraversalClass
bytecode graph strategies
gremlin> x = g.V();[]
gremlin> x.o
option( optional( or( order( order() otherV() out( outE( outV()
gremlin> x.o
That stinks...that's not really a TinkerPop issue - we rely on the groovysh for that functionality. Not much we can do there I don't think....
Of course, you are using DSE Graph which means you have access to DataStax Studio which not only has the auto-complete that you're looking for but also schema support (and more!). I'd suggest that you switch to that.
