Query ACL graph using Gremlin with parent hierarchy check

My ACL graph:
In summary, every user or group that has access to Folder1 also has access to File1.
Use case #1: checking access for user1 on Folder1, which works fine with the Gremlin below:
g.V().has('user','userId','user1').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(outE('has_permission').has('permission','VS_F').
inV().has('folder','folderId','Folder1')).hasNext()
Use case #2: checking access for user1 on File1. How can I modify the Gremlin above to achieve this? I'm pretty new to Gremlin and am doing a POC on JanusGraph for my employer.

First things first: You should always start with a script that people can run to create your sample graph. Also, formatting the code in your question wouldn't hurt.
With that said, here's the script that creates your graph:
g = TinkerGraph.open().traversal()
g.addV('user').property('userId','user1').as('u1').
addV('user').property('userId','user2').as('u2').
addV('user').property('userId','user3').as('u3').
addV('group').property('groupId','group1').as('g1').
addV('group').property('groupId','group2').as('g2').
addV('group').property('groupId','group3').as('g3').
addV('folder').property('folderId','folder1').as('f1').
addV('file').property('fileId','file1').
addE('in_folder').to('f1').
addE('member_of').from('u1').to('g1').
addE('member_of').from('u2').to('g2').
addE('member_of').from('u3').to('g3').
addE('member_of').from('g3').to('g1').
addE('has_permission').from('g1').to('f1').
addE('has_permission').from('u2').to('f1').iterate()
...and the query you're looking for:
g.V().has('file','fileId','file1').
until(inE('has_permission')).
repeat(out('in_folder')).as('folder').
V().has('user','userId','user1').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(out('has_permission').where(eq('folder'))).hasNext()
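To make the logic of that traversal easier to follow, here is a minimal plain-Python sketch of the same two phases: first climb in_folder edges from the file to the nearest vertex that something has permission on, then check the user and every group reachable through member_of. The dictionaries mirror the sample graph above; the function names are made up for illustration and nothing here talks to a graph database.

```python
# Plain-Python model of the sample graph above (no Gremlin server involved).
MEMBER_OF = {
    "user1": ["group1"],
    "user2": ["group2"],
    "user3": ["group3"],
    "group3": ["group1"],
}
HAS_PERMISSION = {"group1": ["folder1"], "user2": ["folder1"]}
IN_FOLDER = {"file1": "folder1"}


def permission_holder(resource):
    """Climb in_folder edges until a vertex with an incoming has_permission edge is found."""
    granted = {target for targets in HAS_PERMISSION.values() for target in targets}
    while resource is not None and resource not in granted:
        resource = IN_FOLDER.get(resource)  # None once there is no parent folder
    return resource


def has_access(user, resource):
    """Check the user and every group reachable via member_of (the emit/repeat part)."""
    target = permission_holder(resource)
    if target is None:
        return False
    seen, stack = set(), [user]
    while stack:
        principal = stack.pop()
        if principal in seen:
            continue  # guard against cyclic group membership
        seen.add(principal)
        if target in HAS_PERMISSION.get(principal, []):
            return True
        stack.extend(MEMBER_OF.get(principal, []))
    return False
```

Like the Gremlin query, this stops at the first ancestor that carries any permission, so it assumes permissions are granted at a single level of the folder hierarchy.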


Need a query to retrieve complete graph

I am trying to retrieve all the nodes and their properties in a parent-child hierarchy, nested within each other.
Since I am new to Gremlin and graph databases, I am having a really tough time getting it done.
Please suggest a solution; if you could walk me through it, that would be great.
Following is my structure
And I am trying to keep the response as clean as possible.
I am using Cosmos DB and the Gremlin.NET API for this.
I tried the following, but it gave me the response as key-value pairs:
g.V("some_id").repeat(out()).emit().tree().path()
g.V("some_id").emit().repeat(both().simplePath()).dedup()
Any kind of suggestion would be great.
I'm not sure what format you want your result in, but path(), tree() or subgraph() would typically give you the graph structure. Since you are using CosmosDB, your only options are path() and tree(), as subgraph() does not appear to be supported.
Using this sample graph as a simple tree:
g.addV().property(id, '1').as('1').
addV().property(id, '2a').as('2a').
addV().property(id, '2b').as('2b').
addV().property(id, '3a').as('3a').
addV().property(id, '4a').as('4a').
addE('child').from('1').to('2a').
addE('child').from('1').to('2b').
addE('child').from('2a').to('3a').
addE('child').from('3a').to('4a')
you can see the effect of path() which basically gathers the contents of each step Gremlin took:
gremlin> g.V('1').repeat(out()).emit().path()
==>[v[1],v[2a]]
==>[v[1],v[2b]]
==>[v[1],v[2a],v[3a]]
==>[v[1],v[2a],v[3a],v[4a]]
Since I used out() we don't see the edges, but that is easily remedied by making a small adjustment to directly consume edges into the path history:
gremlin> g.V('1').repeat(outE().inV()).emit().path()
==>[v[1],e[0][1-child->2a],v[2a]]
==>[v[1],e[1][1-child->2b],v[2b]]
==>[v[1],e[0][1-child->2a],v[2a],e[2][2a-child->3a],v[3a]]
==>[v[1],e[0][1-child->2a],v[2a],e[2][2a-child->3a],v[3a],e[3][3a-child->4a],v[4a]]
Taken together with duplication removed on your application side you have a complete graph with path().
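As a rough sketch of what that application-side deduplication could look like, the snippet below collapses the paths into a set of distinct parent/child pairs. It assumes each path has already been converted to a plain list of vertex ids (how you get there depends on your driver), and edges_from_paths is just an illustrative name.

```python
def edges_from_paths(paths):
    """Collect the distinct (parent, child) pairs seen across all paths."""
    edges = set()
    for path in paths:
        # Pair each vertex with the one that follows it in the same path.
        for parent, child in zip(path, path[1:]):
            edges.add((parent, child))
    return edges


# Vertex ids taken from the path() output above.
paths = [
    ["1", "2a"],
    ["1", "2b"],
    ["1", "2a", "3a"],
    ["1", "2a", "3a", "4a"],
]
unique_edges = sorted(edges_from_paths(paths))
```

For the sample graph, unique_edges contains the four child edges exactly once each, even though the edge from 1 to 2a appears in three of the paths.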
Replacing path() with tree() will essentially do that deduplication by maintaining the tree structure of the path history:
gremlin> g.V('1').repeat(out()).emit().tree()
==>[v[1]:[v[2b]:[],v[2a]:[v[3a]:[v[4a]:[]]]]]
gremlin> g.V('1').repeat(outE().inV()).emit().tree()
==>[v[1]:[e[0][1-child->2a]:[v[2a]:[e[2][2a-child->3a]:[v[3a]:[e[3][3a-child->4a]:[v[4a]:[]]]]]],e[1][1-child->2b]:[v[2b]:[]]]]
The Tree is just represented as a Map where each key acts like a root and each value is another Tree (i.e. the branches from it). It is perhaps better visualized this way:
gremlin> g.V('1').repeat(out()).emit().tree().unfold()
==>v[1]={v[2b]={}, v[2a]={v[3a]={v[4a]={}}}}
gremlin> g.V('1').repeat(out()).emit().tree().unfold().next().value
==>v[2b]={}
==>v[2a]={v[3a]={v[4a]={}}}
If neither of these structures is suitable and subgraph() is not available, you can technically just capture and return the edges you traverse as the low-level elements of your subgraph, as described in this blog post.
Given the comments on this answer, I also present the following option, which uses group():
gremlin> g.V('1').emit().
......1> repeat(outE().group('a').by(outV()).by(inV().fold()).inV()).cap('a').unfold()
==>v[1]=[v[2a], v[2b]]
==>v[3a]=[v[4a]]
==>v[2a]=[v[3a]]
It's not exactly a "tree", but if you know the root (in this case v[1]) you can find its key in the Map. The values are the children. You can then look up each of those keys in the Map to find out whether they have children, and so on. For example, we can look up v[2b] and find that it has no children, while looking up v[2a] reveals a single child, v[3a]. Gremlin can be pretty flexible in getting answers if you can be somewhat flexible in how you deal with the results.
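That lookup procedure can be sketched on the client side in a few lines. The example assumes the group() result has been converted to a plain dict from vertex id to a list of child ids, and that the data really is a tree (no cycles, otherwise the recursion would not terminate). build_tree is a hypothetical helper name.

```python
def build_tree(children, root):
    """Recursively nest each child under its parent; leaves map to an empty dict."""
    return {child: build_tree(children, child) for child in children.get(root, [])}


# Same content as the group() output above, keyed by vertex id.
children = {"1": ["2a", "2b"], "3a": ["4a"], "2a": ["3a"]}
tree = {"1": build_tree(children, "1")}
```

Here tree ends up with the same nested shape that tree() printed earlier, with 2b as a leaf and 4a nested under 3a under 2a.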

In Gremlin, how can I use the SubgraphStrategy when submitting a script

I am issuing Gremlin queries to an AWS Neptune database as follows:
client = Client(f"wss://{self.host}:{self.port}/gremlin", "g")
client.submit("g.V()...")
This works fine and I get the expected results.
I would like to include a SubgraphStrategy when issuing these queries. (I'm using a SubgraphStrategy to ignore nodes marked as deleted.) I can do this when I build the query dynamically, like this:
g = traversal().withRemote(remoteConn).withStrategies(
SubgraphStrategy(
vertices=__.hasNot("is_deleted"), edges=__.hasNot("is_deleted")
)
)
g.V()...
I can't figure out how to specify the subgraph strategy when issuing the query as a string. For example, I've tried this:
client = Client(f"wss://{self.host}:{self.port}/gremlin", "g")
client.submit('g.withStrategies(SubgraphStrategy.build().vertexProperties(hasNot("is_deleted")).V()...')
Does anybody know how to do this?
Neptune doesn't allow for that Java syntax that uses .build() when creating strategies, but I think it will support the Groovy syntax that was introduced at TinkerPop 3.4.9:
g.withStrategies(ReadOnlyStrategy,
new SubgraphStrategy(vertexProperties: __.hasNot('endTime')))
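Applied to the is_deleted case from the question, a small Python helper could build the script string that is then passed to client.submit(). This is only a sketch: it assumes Neptune accepts the Groovy-style constructor with the vertices and edges keys (the same names used in the gremlinpython SubgraphStrategy above), and with_subgraph_filter is a made-up name.

```python
def with_subgraph_filter(traversal_body, flag="is_deleted"):
    """Prefix a Gremlin script with a Groovy-style SubgraphStrategy hiding flagged elements."""
    # Assumption: the server accepts `new SubgraphStrategy(key: traversal, ...)`.
    strategy = (
        "new SubgraphStrategy("
        f"vertices: __.hasNot('{flag}'), edges: __.hasNot('{flag}'))"
    )
    return f"g.withStrategies({strategy}).{traversal_body}"


script = with_subgraph_filter("V().count()")
# The resulting string would be submitted as in the question: client.submit(script)
```

Keeping the strategy prefix in one place means every string query gets the same filter, which mirrors how the bytecode-based g is configured once and then reused.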

Concatenate Gremlin GraphTraversal result with string

In this very simple example I am trying to add a new vertex which should be labeled like an existing vertex but with some prefix attached:
g.V(1).addV('prefix_' + label()).valueMap(true)
What am I missing here? It's clearly not a String, but how would I serialize that?
gremlin> g.V(1).label()
==>Person
gremlin> g.V(1).constant(label())
==>[LabelStep]
Gremlin today does not provide a built-in string concatenation function. It would be nice if it did. That means your best alternative today is to use an inline closure/lambda. Here is an example using TinkerGraph and the air-routes graph.
gremlin> g.V(3).map {"prefix_" + it.get().label}.as('a').addV(select('a'))
==>v[60867]
gremlin> g.V(60867).label()
==>prefix_airport
Note that not all graph databases allow closures, so this cannot be assumed to work on every TinkerPop-enabled graph database backend.

Querying ACL/permissions graph using gremlin?

My permissions graph looks like this:
In this situation,
user1 has permission on folder1 through Group1.
user2 has direct permissions without any group, though the user is part of group2 where group2 doesn't have access over folder1.
user3 has permission through group hierarchy, not the direct group to folder access.
I was able to write separate gremlin queries to determine whether a user has permission through one of the groups and user direct permission.
Checking permission through group
g.V().has('user','userId','user1').emit().repeat(out('member_of'))
.outE('has_permission').has('permission','p1').inV()
.has('folder','folderId','folder1').hasNext()
User-direct permission
g.V().has('user','userId','user2')
.outE('has_permission').has('permission','p1').inV()
.has('folder','folderId','folder1').hasNext()
But I couldn't figure out how to write a single query that checks both direct and group-based permissions to determine whether the user has permission.
Can someone help me out here?
Your graph:
g = TinkerGraph.open().traversal()
g.addV('user').property('userId','user1').as('u1').
addV('user').property('userId','user2').as('u2').
addV('user').property('userId','user3').as('u3').
addV('group').property('groupId','group1').as('g1').
addV('group').property('groupId','group2').as('g2').
addV('group').property('groupId','group3').as('g3').
addV('folder').property('folderId','folder1').as('f1').
addE('member_of').from('u1').to('g1').
addE('member_of').from('u2').to('g2').
addE('member_of').from('u3').to('g3').
addE('member_of').from('g3').to('g1').
addE('has_permission').from('g1').to('f1').
addE('has_permission').from('u2').to('f1').iterate()
A general solution to your problem:
g.V().has('user','userId',<userId>).
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(out('has_permission').has('folder','folderId',<folderId>)).hasNext()
Traversal executed on the sample graph:
gremlin> g.V().has('user','userId','user1').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(out('has_permission').has('folder','folderId','folder1')).hasNext()
==>true
gremlin> g.V().has('user','userId','user2').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(out('has_permission').has('folder','folderId','folder1')).hasNext()
==>true
gremlin> g.V().has('user','userId','user3').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(out('has_permission').has('folder','folderId','folder1')).hasNext()
==>true
Thanks, Daniel.
I extended the query above slightly to check whether the user has a specific permission.
Below is the answer for anyone else who is looking for the same thing:
gremlin> g.V().has('user','userId','user3').
emit().
until(__.not(outE('member_of'))).
repeat(out('member_of')).
filter(outE('has_permission').has('permission','V').inV().has('folder','folderId','folder1')).hasNext()
==>true

Paging or Using Skip in Azure Cosmos

Has anyone been able to determine the equivalent of Gremlin's skip() in Azure Cosmos? It's not listed in Microsoft's documentation, and I was thinking the documentation is just outdated. I tried a query such as g.V().hasLabel('the_label').has('the_property', eq('the_value')).skip(some_number) and it fails with the error: Unable to find any method 'skip'.
From your link in the Apache TinkerPop documentation:
The skip()-step is analogous to range()-step save that the higher end range is set to -1.
with these examples:
gremlin> g.V().values('age').order().skip(2)
==>32
==>35
gremlin> g.V().values('age').order().range(2, -1)
==>32
==>35
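For paging, the same idea extends to a bounded range. The following is a minimal sketch that turns a zero-based page number and a page size into range(low, high) bounds and appends them to a query string. It assumes range() is available on your Cosmos DB account; page_range and paged_query are illustrative names, not part of any driver.

```python
def page_range(page, page_size):
    """Return (low, high) bounds for a zero-based page; high is exclusive."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    low = page * page_size
    return low, low + page_size


def paged_query(base_query, page, page_size):
    """Append a range() step so only one page of results is returned."""
    low, high = page_range(page, page_size)
    return f"{base_query}.range({low}, {high})"


first_page = paged_query("g.V().hasLabel('the_label')", 0, 25)
```

Pages are only stable if the traversal has a deterministic order, so in practice you would add an order() step before range().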
