The Gremlin documentation says:
Many graph vendors do not allow the user to specify an element ID and
in such cases, an exception is thrown.
I assume this refers only to specifying an ID when creating a new vertex or edge, not to the overall use of IDs in queries. So which Gremlin implementations allow specifying an ID along with vertex or edge creation, and which do not?
It's easier to list the graph databases that do allow ID assignment than those that don't, as most graph databases do not allow you to specify the ID when you create a vertex or edge.
I'm only aware of two that allow you to specify the id: TinkerGraph and elastic-gremlin. The rest do not support that.
You can always check what a graph supports by calling the features() method on the Graph instance:
http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#_features
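A minimal sketch of that check in the Gremlin Console, assuming TinkerGraph as the graph under test (the variable names graph and v are only illustrative):

```groovy
graph = TinkerGraph.open()

// Ask the graph whether it accepts an id supplied by the caller
graph.features().vertex().supportsUserSuppliedIds()
graph.features().edge().supportsUserSuppliedIds()

// When the feature is supported, the id can be passed at creation time
v = graph.addVertex(T.id, 1, T.label, 'person')
v.id()
```

Running the same two features() calls against another provider's Graph instance shows whether it would reject the addVertex call.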
Related
Does Gremlin support unique properties other than ID property?
Trying to figure out if there's an equivalent in Gremlin to defining a field unique in Postgres for example.
I could always query the graph first to see if the property exists, but I'm trying to figure out if there's a more efficient, elegant way.
This is something that Apache TinkerPop leaves up to the database implementor, and it does vary by implementation. I see you tagged the question with amazon-neptune. Currently Neptune only enforces the unique ID constraints for vertices and edges.
It's possible in a future release that additional schema constraint capabilities will be added, but at the present time you would have to monitor/control that in your application logic.
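One common way to do that check in a single traversal is the fold/coalesce/unfold pattern. The sketch below assumes a hypothetical 'person' label and 'email' property key; it is not a real constraint, so concurrent writers could still create duplicates.

```groovy
// Return the existing vertex with this email, or create it if none exists
// ('person' and 'email' are assumed names, not part of the question)
g.V().has('person', 'email', 'alice@example.com').
  fold().
  coalesce(unfold(),
           addV('person').property('email', 'alice@example.com'))
```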
We are stamping user permissions as a property (of SET cardinality) on each node and edge. I'm wondering what the best way is to apply the has step to all the visited nodes and edges for a given Gremlin traversal.
Take a very simple traversal query:
// Flights from London Heathrow (LHR) to airports in the USA
g.V().has('code','LHR').out('route').has('country','US').values('code')
I want to add has('permission', 'team1') to all the vertices and edges visited while traversing with the above query.
There are two approaches you may consider:
1. Write a custom TraversalStrategy
2. Develop a Gremlin DSL
For a TraversalStrategy you would develop one similar to SubgraphStrategy or PartitionStrategy which would take your user permissions on construction and then automatically inject the necessary has() steps after out() / in() sorts of steps. The drawback here is that your TraversalStrategy must be written in a JVM language and if using Gremlin Server must be installed on the server. If you intend to configure this TraversalStrategy from the client-side in any way you would need to build custom serializers to make that possible.
For a DSL you would create new navigational steps for out() / in() sorts of steps and they would insert the appropriate combination of navigation step and has() step. The DSL approach is nice because you could write it in any programming language and it would work, but it doesn't allow server-side configuration and you must always ensure clients use the DSL when querying the graph.
We are stamping user permission as a property (of SET cardinality) on each nodes and edges.
As a final note, by "SET cardinality" I assume that you mean multi-properties. Edges don't allow for those so you would only be able to stamp such a property on vertices.
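For reference, this is roughly the traversal that either approach would have to produce for the query in the question, followed by a sketch that uses the built-in SubgraphStrategy mentioned above to apply the same filters. It assumes a recent TinkerPop 3.x release, a provider that accepts the strategy, and a single-valued 'permission' property on edges; sg is just an illustrative variable name.

```groovy
// Hand-written equivalent: out('route') is split into outE().inV()
// so that the edge can be filtered as well as the vertices
g.V().has('code', 'LHR').has('permission', 'team1').
  outE('route').has('permission', 'team1').
  inV().has('permission', 'team1').has('country', 'US').
  values('code')

// SubgraphStrategy restricts every step to matching vertices and edges
sg = g.withStrategies(SubgraphStrategy.build().
       vertices(has('permission', 'team1')).
       edges(has('permission', 'team1')).
       create())
sg.V().has('code', 'LHR').out('route').has('country', 'US').values('code')
```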
I'd like to create different graphs for different domains, so some kind of namespace or schema is needed, just like the "schema" concept in an RDBMS. Does Gremlin support namespaces or something similar?
Thanks
There is no notion of a schema name in the Gremlin language that is exactly like what you typically have in SQL. Your Gremlin query is bound to the graph to which you connect. If you have two or more domains then you either:
Create one graph per domain in which you can't traverse across those domains (you'd have to combine results after traversals - without explicit edges, i.e. joins, to connect the domains Gremlin has no way to do those sorts of queries), or
Create one large graph to house both domains and then constrain your traversal to the domain (in TinkerPop this is sometimes accomplished with PartitionStrategy)
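The second option can be sketched with PartitionStrategy as follows. The partition key '_domain', the partition value 'sales', and the vertex data are assumptions for illustration, and the builder methods shown assume a TinkerPop 3.x release that exposes readPartitions.

```groovy
// Traversal source that only writes to and reads from the 'sales' domain
sales = g.withStrategies(PartitionStrategy.build().
          partitionKey('_domain').
          writePartition('sales').
          readPartitions('sales').
          create())

// New elements are tagged with _domain = 'sales' automatically
sales.addV('customer').property('name', 'acme')

// Reads through this source see only elements in the 'sales' partition
sales.V().count()
```

A second traversal source built the same way with a different partition value would act as the other "namespace" over the same graph.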
I have a collection of JSON documents in CosmosDB that can contain references to other documents in the collection (by id).
I'd like to automatically manage graph edges between these documents by using triggers that run whenever a doc is created/updated/deleted.
Can I access the Gremlin API from JavaScript inside the trigger function?
Is there any documentation for triggers in the context of graphs? I couldn't find any.
A dirtier alternative would be to just "manually" create the edge document in the trigger, but this would break if the CosmosDB team changes the underlying format of the documents describing the edges.
The Cosmos DB Trigger will probably work and it will give you a set of Documents which you might need to process first.
Since the Trigger is listening to the Change Feed you will get Documents that represent any insertion / update on the Collection. In the case of a Graph, these can be Vertices or Edges, so you might need to first detect what type of Document it is to work with it.
As for persisting the new relationship, the DocumentDB Output binding might not work for you, because like you said, the internal representation might change. But what you can do is include in your Azure Function some C# / Node Gremlin library and use it to talk to the Cosmos DB Graph API directly.
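Whichever client library the function uses, the Gremlin statement it submits for each reference could look like the sketch below. The ids 'doc-a' and 'doc-b' and the edge label 'references' are hypothetical, and a partitioned graph may also need the partition key in the vertex lookups.

```groovy
// Create an edge from the referencing document to the referenced one
g.V('doc-a').addE('references').to(g.V('doc-b'))
```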
Graph API is currently not supported within UDFs/database triggers/stored procedures, and we don't have a timeline for when this will be supported.
The next best approach is to manually create the graph elements, as you described.
I’m just playing with the Graph API in Cosmos DB, which uses the Gremlin syntax for queries.
I have a number of users (vertices) in the graph, and each has ‘knows’ relationships to other users. Some of these are out edges (outE) and others are in edges (inE), depending on how the relationship was created.
I’m now trying to create a query which will return all ‘knows’ relationships for a given user (Vertex).
I can easily get the ID of either inE or outE via:
g.V('7112138f-fae6-4272-92d8-4f42e331b5e1').inE('knows')
g.V('7112138f-fae6-4272-92d8-4f42e331b5e1').outE('knows')
where '7112138f-fae6-4272-92d8-4f42e331b5e1' is the Id of the user I’m querying, but I don’t know ahead of time whether this is an in or out edge, so want to get both (e.g. if the user has in and out edges with the ‘knows’ label).
I’ve tried using a projection and OR operator and various combinations of things e.g.:
g.V('7112138f-fae6-4272-92d8-4f42e331b5e1').where(outE('knows').or().inE('knows'))
but it’s not getting me back the data I want.
All I want out is a list of the IDs of all inE and outE that have the label ‘knows’ for a given vertex.
Or is there a simpler/better way to model bi-directional associations such as ‘knows’ or ‘friendOf’?
Thanks
You can use the bothE step in this case:
g.V('7112138f-fae6-4272-92d8-4f42e331b5e1').bothE('knows')
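To get only the edge IDs, as the question asks, an id() step can be appended. This is a sketch that reuses the vertex ID from the question. The where() attempt returned the vertex itself because where() only filters the current traverser and does not change what is emitted.

```groovy
// IDs of all 'knows' edges touching the vertex, regardless of direction
g.V('7112138f-fae6-4272-92d8-4f42e331b5e1').bothE('knows').id()

// The users at the other end of those edges
g.V('7112138f-fae6-4272-92d8-4f42e331b5e1').both('knows')
```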