I am looking for upsert functionality in Gremlin.
My client program receives a stream of (personId, favoriteMovieNodeId) pairs. For each pair it needs to look up the movie vertex by favoriteMovieNodeId, then upsert a Person vertex and create the [favoriteMovie] edge.
The following traversal creates duplicate Person nodes:
g.V().has(label,'movies').has('uid',$favoriteMovieNodeId).as('fm')
.addV('Person').property('personId', $personId).addE('favMovie').to('fm')
Is there a way to check for the existence of a node based on its properties before adding it? I can't seem to find documentation on this very basic graph function that's part of every underlying graph db.
If the movie is guaranteed to exist, then it's:
g.V().has('movies','uid',$favoriteMovieNodeId).as('fm').
coalesce(V().has('Person','personId', $personId),
addV('Person').property('personId', $personId)).
addE('favMovie').to('fm')
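To make the get-or-create behaviour of coalesce() concrete, here is a minimal plain-Python model of it. This is not Gremlin and involves no graph database; the `people` dict, the `edges` list, and the `upsert_person` helper are hypothetical stand-ins chosen for illustration.

```python
# Plain-Python model of the get-or-create pattern (not Gremlin, no database involved).
# `people` is a hypothetical stand-in for the Person vertices, keyed by personId.

def upsert_person(people, edges, person_id, movie_id):
    """Return the existing Person for person_id, or create it, then link it to the movie."""
    person = people.get(person_id)   # first branch of coalesce(): try the lookup
    if person is None:               # second branch: runs only when the lookup found nothing
        person = {"label": "Person", "personId": person_id}
        people[person_id] = person
    edges.append((person_id, "favMovie", movie_id))  # the edge is added in both cases
    return person

people, edges = {}, []
upsert_person(people, edges, "p1", "m1")
upsert_person(people, edges, "p1", "m2")  # same person again: no duplicate vertex
print(len(people), len(edges))
```

As in the traversal above, only the vertex is protected against duplication; the edge is added on every call, so an edge upsert would need its own existence check.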
We are trying to model customer details in a graph database so that a single query can fetch a customer's details, such as address, phone, email, etc. We have built it using 'has Address' and 'has Email' edges.
g.addV('member').property('id','CU10611972').property('CustomerId', 'CU10611972').property('TIN', 'xxxx').property('EntityType', 'Person').property('pk', 'pk')
g.addV('email').property('id','CU10611972E').property('pk', 'pk')
g.addV('primary').property('id','CU10611972EP').property('EmailPreference','Primary').property('EmailType', 'Home').property('EmailAddress', 'SNEHA#GMAIL.COM').property('pk', 'pk')
g.V('CU10611972').addE('has Email').to(g.V('CU10611972E'))
g.V('CU10611972E').addE('has Primary Email').to(g.V('CU10611972EP'))
This is how we have built the email relation to the customer. Similarly, we have relations to Address and Phone. Right now we use this command to fetch the JSON for this customer's email:
g.V('CU10611972').out('has Email').out('has Primary Email')
For the complete customer details we use a union over each vertex: Phone, Email and Address.
Could you please suggest whether there is a more efficient way to query these details?
This really comes down to two things:
General graph data modelling
Things the graph DB you are using does and does not support.
With Gremlin there are a few ways to model this data for a single vertex.
If the database supports it, have a list of names like ['home','mobile'] and use metaproperties to attach a phone number to each.
A lot of the Gremlin implementations I am aware of have chosen not to support metaproperties. In these cases you have a couple of options.
(a) Have a property for 'Home' and another for 'Mobile'. If either is not known, you could either not create that property or give it a value such as "unknown".
(b) Use prefixed strings such as ["Home:123456789","Mobile:123456789"], store them in a set or list (multi-properties), and access them in Gremlin using the startingWith predicate, for example g.V(id).properties('phone').hasValue(startingWith('Mobile')).value()
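The filtering that option (b) relies on can be sketched in plain Python. This only models what a prefix match does to a list of values; the `phones` list and the `values_with_prefix` helper are hypothetical, and the numbers are made up for illustration.

```python
# Hypothetical multi-valued 'phone' property stored as prefixed strings.
phones = ["Home:123456789", "Mobile:987654321"]

def values_with_prefix(values, prefix):
    """Mimic hasValue(startingWith(prefix)): keep entries whose text starts with prefix."""
    return [v for v in values if v.startswith(prefix)]

mobile = values_with_prefix(phones, "Mobile")
# The caller still has to strip the prefix to get the bare number.
numbers = [v.split(":", 1)[1] for v in mobile]
print(mobile, numbers)
```

One consequence of this design is that the prefix becomes part of the stored value, so every reader of the property has to know the convention and strip it.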
The properties in my graph are dynamic: there can be any number of properties on the vertices. This also means that when I do a search, I will not know which property to look in. Is it possible in Gremlin to query the graph to find all vertices that have any property with a given value?
For example, suppose name and desc are the properties. If the incoming search request is 'test', the query would be g.V().has('name', 'test').or().has('desc', 'test'). How can I achieve similar functionality when I do not know what properties exist? I need to be able to search across all the properties and check whether any of their values is 'test'.
You can do this using the following syntax:
g.V().properties().hasValue('test')
However, with any size dataset I would expect this to be a very slow traversal to perform as it is the equivalent of asking an RDBMS "Find me any cell in any column in any table where the value equals 'test'". If this is a high frequency request I would suggest looking at refactoring your graph model or using a database optimized for searches such as Elasticsearch.
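The cost difference can be illustrated with a small plain-Python model. It assumes a hypothetical in-memory `vertices` dict and contrasts a full scan, which looks at every property of every vertex, with an inverted index from value to vertex ids, which is roughly the kind of structure a search-oriented store maintains. It is a sketch of the idea, not of how any particular product is implemented.

```python
from collections import defaultdict

# Hypothetical vertices with arbitrary, differing property sets.
vertices = {
    1: {"name": "test", "desc": "first"},
    2: {"title": "other", "note": "test"},
    3: {"name": "unrelated"},
}

def scan_for_value(vertices, value):
    """Full scan: inspects every property of every vertex on each search."""
    return sorted(vid for vid, props in vertices.items() if value in props.values())

def build_value_index(vertices):
    """Inverted index from property value to vertex ids, built once up front."""
    index = defaultdict(set)
    for vid, props in vertices.items():
        for val in props.values():
            index[val].add(vid)
    return index

index = build_value_index(vertices)
scanned = scan_for_value(vertices, "test")
looked_up = sorted(index["test"])
print(scanned, looked_up)
```

Both approaches return the same vertex ids, but the scan repeats its work on every request while the index pays that cost once and must then be kept up to date on writes.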
I want to add a property constraint on a specific vertex label to disallow null values, or to prevent insertion of a vertex without specific properties.
I added the name property to the person vertex label as shown below, so a person vertex accepts no properties other than name. I also need a constraint on the value so that it cannot be null:
mgmt = graph.openManagement()
person = mgmt.makeVertexLabel('person').make()
name = mgmt.makePropertyKey('name').dataType(String.class).cardinality(Cardinality.SET).make()
mgmt.addProperties(person, name)
mgmt.commit()
The problem is:
A vertex with the label person must always have a name property; otherwise the vertex should not be created.
Is this achievable in JanusGraph?
It is currently not possible to enforce the presence of certain property keys for certain vertex or edge labels in JanusGraph. However, this would be a good addition for the schema constraints that were introduced in JanusGraph 0.3.0. So, feel free to create an issue with JanusGraph for this feature request.
Until something like this is implemented in JanusGraph, you will probably have to enforce this with logic in the application that inserts the data.
If you for some reason cannot or don't want to implement this in your application (e.g. because you don't control all applications that insert data in your graph), then you could also implement your own TinkerPop TraversalStrategy that checks every addV step to ensure that the property is also added. These strategies are evaluated for all traversals and can change (e.g. as an optimization) the steps of the traversal or even throw an exception if the traversal is not legal which would be the correct behaviour in your case. JanusGraph itself would probably also implement a strategy to add these additional schema constraints.
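An application-side check of the kind described above could look roughly like the following plain-Python sketch. It does not use any JanusGraph or TinkerPop API; the `REQUIRED` rule table and the `validate_vertex` helper are hypothetical names, and the function is meant to be called before the vertex is written.

```python
# Hypothetical rule table: vertex label -> property keys that must be present and non-null.
REQUIRED = {"person": {"name"}}

def validate_vertex(label, properties):
    """Raise ValueError if a required property is missing or None; otherwise return the properties."""
    missing = [k for k in sorted(REQUIRED.get(label, ())) if properties.get(k) is None]
    if missing:
        raise ValueError(f"{label} vertex is missing required properties: {missing}")
    return properties

ok = validate_vertex("person", {"name": "alice"})
try:
    validate_vertex("person", {"age": 30})  # no name: should be rejected
    rejected = False
except ValueError:
    rejected = True
print(ok, rejected)
```

A check like this only protects writes that go through the application that runs it, which is the limitation the TraversalStrategy approach is meant to address.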
I have this requirement in ArangoDB AQL: I have a graph built from a document collection for the nodes and an edge collection for the directed edge relations.
I want to pass a subset of the nodes as input to an AQL query and get all the traversals between them (the subgraph) as the output.
How can I achieve this in AQL?
That is how I want to find the relations between the given nodes. Please comment if more details are needed.
This is the query I know so far:
FOR v IN 1..1 INBOUND[or OUTBOUND] 'Collection/_key' EdgeCollection
OPTIONS {bfs: true}
RETURN v
I'd recommend reviewing the queries on the ArangoDB sample page, which shows how graph queries are performed and how to review the results.
In your sample query above you are only returning v (vertex information) as in FOR v IN.
That returns only the last vertex of every path the query finds; it doesn't return edge or path information.
For that you need to test with FOR v, e, p IN and it will return extra information about the last edge (e), and the path (p) it took.
In particular look at the results of p as it contains a JSON object that holds path information, which is a collection of vertices and edges.
By iterating through that data you should be able to extract the information you require.
AQL gives you many tools to aggregate, group, filter, de-duplicate, and reduce data sets, so make sure you look at the wider language functions and practice building more complex queries.
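Iterating through a path result on the client side might look like the following Python sketch. It assumes each `p` arrives as a JSON object with a `vertices` list and an `edges` list, and that edge documents carry `_from` and `_to`; the sample `path` value, the collection name `nodes`, and the helper names are made up for illustration.

```python
# Hypothetical value of `p` for one result row: a list of vertices and a list of edges.
path = {
    "vertices": [{"_id": "nodes/a"}, {"_id": "nodes/b"}, {"_id": "nodes/c"}],
    "edges": [
        {"_from": "nodes/a", "_to": "nodes/b"},
        {"_from": "nodes/b", "_to": "nodes/c"},
    ],
}

def visited(path):
    """Return the ids of the vertices on the path, in order."""
    return [v["_id"] for v in path["vertices"]]

def hops(path):
    """Return the (from, to) pair of each edge on the path, in order."""
    return [(e["_from"], e["_to"]) for e in path["edges"]]

print(visited(path))
print(hops(path))
```

Collecting the hops from every returned path and de-duplicating them is one way to assemble the subgraph that connects the input nodes.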
In my case there are two vertex labels: User and Seller.
Registering a user creates a new vertex using a custom vertex id:
g.addV(label,'User', 'id', '123456789', 'name', 'User1').next();
When a user logs in and registers a new business, they become a Seller.
Now I want the same vertex id to also be part of the Seller label, which is not achievable.
If I create a new vertex with the Seller label, the whole graph (hasmobile, hasaddress) stored against the User vertex will not be accessible from the Seller vertex.
Is there a way to achieve this in DSE Graph?
The semantics of TinkerPop and DSE Graph (as well as most graph implementations, with Neo4j the only exception I can think of) do not allow a vertex to have multiple labels. You might think of it as analogous to why a row doesn't exist in multiple tables in a SQL database.
There are multiple ways you could resolve this. Make a "Person" vertex (instead of "User" or "Seller") then:
Infer whether they are a "User" or "Seller" from some aspect of the data related to them. Perhaps that is done with an "isSeller" or "isUser" property. Then you would query for sellers with
g.V().hasLabel('Person').has('isSeller',true)
Create a sub-type system where you connect your "Person" vertex to a "User" vertex or "Seller" vertex. You can then do things like "find a user only if they are a seller" with
g.V().hasLabel('User').has('someid','12345').
filter(__.in('isAPerson').out('isASeller'))
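The first option, a single "Person" label with a role property, can be modelled in plain Python as follows. This is only an in-memory illustration of the data shape; the `people` list and the `sellers` helper are hypothetical, and the second id is made up.

```python
# Hypothetical in-memory vertices: one 'Person' label, with the role kept as a property.
people = [
    {"label": "Person", "id": "123456789", "name": "User1", "isSeller": False},
    {"label": "Person", "id": "987654321", "name": "User2", "isSeller": True},
]

def sellers(vertices):
    """Mimic hasLabel('Person').has('isSeller', true)."""
    return [v for v in vertices if v["label"] == "Person" and v.get("isSeller") is True]

before = [v["id"] for v in sellers(people)]
# Registering a business flips the flag on the same vertex, so its id stays the same.
people[0]["isSeller"] = True
after = [v["id"] for v in sellers(people)]
print(before, after)
```

Because the role change is a property update rather than a new vertex, anything already attached to that vertex (such as hasmobile or hasaddress edges) remains reachable.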