1. How can I prevent nodes from having relationships with themselves in a Neo4j graph?
2. How can I force two nodes to have only one relationship with each other? I mean, if node A has a relationship with node B this way: A--->B, there should not be any B--->A.
I know that in a graph everything is up to me, so I can handle both of these myself. But consider a Java API through which the user says which nodes and relationships should be created: how can I prevent the user from doing 1 and 2?
Is there any way to handle this in Neo4j?
You can implement a TransactionEventHandler that enforces your constraints and register it with your GraphDatabaseService instance. A TransactionEventHandler can inspect the contents of the current transaction and, if necessary, veto the commit; see http://docs.neo4j.org/chunked/stable/transactions-events.html.
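The check such a handler would run before commit can be sketched in plain Python. This is only an illustration of the logic, not Neo4j API code: `check_new_edges`, `ConstraintViolation`, and the tuple representation of relationships are hypothetical names chosen for the example, and rejecting a duplicate A--->B is an assumption about what "only one relationship" means.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (start node id, end node id); hypothetical representation


class ConstraintViolation(Exception):
    """Raised to veto a batch of relationship creations."""


def check_new_edges(existing: Set[Edge], created: Iterable[Edge]) -> Set[Edge]:
    """Return the edge set after adding `created`, or raise to veto the batch."""
    result = set(existing)
    for start, end in created:
        # Rule 1: no node may have a relationship with itself.
        if start == end:
            raise ConstraintViolation(f"self-relationship on {start}")
        # Rule 2: at most one relationship between two nodes, in either direction.
        # Rejecting a repeated start->end edge as well is an assumption.
        if (end, start) in result or (start, end) in result:
            raise ConstraintViolation(f"{start} and {end} are already related")
        result.add((start, end))
    return result
```

In a real handler the `existing` set would not be held in memory; the check would instead look at the relationships of each node touched by the transaction.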
We are stamping user permissions as a property (of SET cardinality) on each node and edge. We are wondering what the best way is to apply a has step to all the visited nodes/edges for a given Gremlin traversal query.
For example, take a very simple traversal query:
// Flights from London Heathrow (LHR) to airports in the USA
g.V().has('code','LHR').out('route').has('country','US').values('code')
We want to add has('permission', 'team1') to all the vertices and edges visited during the traversal in the query above.
There are two approaches you may consider.
Write a custom TraversalStrategy
Develop a Gremlin DSL
For a TraversalStrategy you would develop one similar to SubgraphStrategy or PartitionStrategy which would take your user permissions on construction and then automatically inject the necessary has() steps after out() / in() sorts of steps. The drawback here is that your TraversalStrategy must be written in a JVM language and, if you are using Gremlin Server, must be installed on the server. If you intend to configure this TraversalStrategy from the client side in any way, you would need to build custom serializers to make that possible.
For a DSL you would create new navigational steps for out() / in() sorts of steps and they would insert the appropriate combination of navigation step and has() step. The DSL approach is nice because you could write it in any programming language and it would work, but it doesn't allow server-side configuration and you must always ensure clients use the DSL when querying the graph.
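Both approaches come down to the same idea: every navigation step is paired with a permission filter. A minimal sketch of that pairing in plain Python, over a hypothetical in-memory graph rather than TinkerPop, might look like this. `TinyGraph` and `permitted_out` are names invented for the example, and the Python sets used for permissions say nothing about how the property would be stored in a real graph.

```python
from typing import Dict, List, Set


class TinyGraph:
    """Hypothetical in-memory graph; stands in for a real graph database."""

    def __init__(self) -> None:
        self.vertices: Dict[str, dict] = {}
        self.edges: List[dict] = []

    def add_vertex(self, code: str, country: str, permission: Set[str]) -> None:
        self.vertices[code] = {"country": country, "permission": permission}

    def add_edge(self, src: str, label: str, dst: str, permission: Set[str]) -> None:
        self.edges.append(
            {"src": src, "label": label, "dst": dst, "permission": permission}
        )


def permitted_out(graph: TinyGraph, start: str, label: str, team: str) -> List[str]:
    """An out(label) step that only follows edges and reaches vertices visible to `team`."""
    # The starting vertex itself must be visible to the team.
    if team not in graph.vertices[start]["permission"]:
        return []
    return [
        edge["dst"]
        for edge in graph.edges
        if edge["src"] == start
        and edge["label"] == label
        and team in edge["permission"]  # filter on the edge
        and team in graph.vertices[edge["dst"]]["permission"]  # filter on the target
    ]
```

A DSL step would wrap the real out() and has() steps in the same way, so callers never write the permission filter by hand.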
We are stamping user permission as a property (of SET cardinality) on each nodes and edges.
As a final note, by "SET cardinality" I assume that you mean multi-properties. Edges don't allow for those so you would only be able to stamp such a property on vertices.
I am attempting to model account access in a graph DB.
The account can have multiple users and multiple features. A user can have access to many accounts. Each account can give access to only part of the features for each user.
One way I see it is to represent access for each user through relationship attributes, this allows having a shared feature node.
user_1 has access to account_1-feature_1 and account_2-feature_2. user_1 does not have access to account_1-feature_2 even though it is enabled for the account.
Another way to model the same access, but without relationship attribute is to create account specific feature nodes.
Question 1: which of these 2 ways is a more 'proper' modeling in the graph DB world?
Now to make things more interesting the account can also have parts which can be accessed by multiple accounts and a certain feature should be able to be scoped down to only be accessible for specific part by user.
In this example user_1 can access account_1 only for part_a feature_1.
To me it seems like defining an attribute on relationship is the way to go for being able to scope down user access by feature & by part of the account. However, reading neo4j powerpoints this would be one of the code smells of relationships having "Lots of attribute-like properties". Is there a better way to approach such problem in a graph?
I could be wrong here, but here are my thoughts. Option 1 definitely sounds like the better way from a modeling perspective; however, I don't see how you can keep the data consistent without building heavy machinery to do it. For example, if someone deletes Account1.Feature1 and does not update the edge from User1 -> Account1, then you end up with stale RBAC rules in the system. You think you have access to something, but in reality that "thing" does not exist anymore. Option 2 may not seem very attractive from a data model perspective, but it does keep your data consistent. If you delete Account1.Feature1, the edge is automatically deleted in the same transaction.
The only con is that you incur additional cost at insertion, because you need to insert a lot more nodes than in Option 1. For an RBAC system, I think it's a fair compromise.
The same comment applies to the second half of your question as well.
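The consistency argument for Option 2 can be illustrated with a small in-memory sketch. It is not tied to any particular graph database; `AccessGraph` and its methods are hypothetical names, and the "Account.Feature" string is just one assumed way to identify an account-specific feature node.

```python
from typing import Set, Tuple


class AccessGraph:
    """Option 2: one feature node per account, access stored as user -> node edges."""

    def __init__(self) -> None:
        self.feature_nodes: Set[str] = set()
        self.edges: Set[Tuple[str, str]] = set()  # (user, feature node)

    def add_feature(self, account: str, feature: str) -> str:
        node = f"{account}.{feature}"
        self.feature_nodes.add(node)
        return node

    def grant(self, user: str, node: str) -> None:
        if node not in self.feature_nodes:
            raise KeyError(node)
        self.edges.add((user, node))

    def delete_feature(self, node: str) -> None:
        # Removing the node also removes every access edge that points at it,
        # so no grant can outlive the feature it refers to.
        self.feature_nodes.discard(node)
        self.edges = {edge for edge in self.edges if edge[1] != node}

    def can_access(self, user: str, account: str, feature: str) -> bool:
        return (user, f"{account}.{feature}") in self.edges
```

With Option 1, the feature restriction would live in a property on the User -> Account edge, and nothing in the structure itself would force that property to be cleaned up when the feature goes away.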
I am planning to use ArangoDB and I am faced with a problem I don't know how to solve. I would like to do simple traversals, but in my case there are two requirements that I don't know how to meet:
I will not know in advance the type of vertices that an edge will connect to. I want to be able to connect an edge of one type to any vertex on either side.
For one vertex, I want to retrieve all connected vertices (depth 1) no matter the edge type.
For requirement 1, an example would be a Tag vertex (to tag some entity with some information): I want to be able to tag any vertex using, e.g., a HasTag edge in a named graph. From what I currently see, I need to define the "From" collections (the "To" collection is the Tag collection), and this is limited to 10 collections. Since I could have 100 or more From collections, I don't see how to solve this with named graphs.
One option would be to use anonymous graphs, but then I have a problem with the second requirement. I also want to have an option, when given a vertex, to find all connected vertices (depth = 1) no matter the type of edge. In an anonymous graph I would need to specify all of the edge collections in a query and, again, there could be 100 or more of them. I don't know if there is a limit to this number, but I would assume there is one; maybe I'm mistaken, since I haven't tried it out yet.
Does anyone have an idea how to solve this with ArangoDB? I really like the database, but I would like it to be more "typeless", that is, I would rather not have to define the type of vertex collection an edge can connect to.
Best regards
Tomaz
You can have more than 10 vertex collections in a named graph. The limitation of 10 only exists in the webUI. Creating the named graph over the ArangoShell or the server console will work.
I am designing a graph database for eligibility rules. Some eligibility rules require that a user select 2 particular products (Product A and Product B) to qualify for Product C.
Is it possible to create a graph edge with 2 starting nodes?
I would think this would break what I think is the fundamental building block of a graph db - its adjacency list. But if this was possible, it would be very powerful for my application.
Update 6/16
More specifically, I'm looking to create a directed edge with 2 starting nodes, and 1 ending node. So, in biz rules terms: IF Node=A AND Node=B THEN Node=C. The real world relationship is this: If customer buys Product A and Product B, then customer qualifies for Product C.
Usually, to model a hypergraph in Neo4j, you end up creating an intermediate "group node" that connects all of the nodes you want to connect, then bridging off of that node to the other node. It's not a true hypergraph, but rather a representation of a hypergraph using the tools provided.
Here's an example:
http://www.markhneedham.com/blog/2013/10/22/neo4j-modelling-hyper-edges-in-a-property-graph/
Yes, you can have multiple starting nodes in Neo4j; I'm not sure about other graph databases.
START a=node(0), b=node(1)
RETURN a,b
You should refer to http://docs.neo4j.org/chunked/stable/query-start.html for more details. Starting from Neo4j 2.0, the start node is optional: Cypher will try to infer starting points from your query based on labels and the WHERE clause.
Edit
I have edited the answer based on the updated question. What you need is a hypergraph. As Wes Freeman mentioned, to model a hypergraph in Neo4j you will need to create an intermediate node that connects your other two nodes and the third node. In your scenario a user will have a PURCHASED relationship with the two products (A and B), something like (:User {Id: 1})-[:PURCHASED]->(:Product {Name: 'A'}). Then you will have to create an intermediate node like ProductQualifier (I am very bad at naming things) with a relationship from the user, like (:User {Id:1})-[:QUALIFIES]->(:ProductQualifier {Id:1}). The ProductQualifier will then have 3 relationships: two to Product A and Product B respectively, and a third to Product C:
(:Product {Name: 'B'})<-[:HAS]-(:ProductQualifier {Id:1})-[:HAS]->(:Product {Name: 'A'})
(:ProductQualifier {Id:1})-[:ELIGIBLE]->(:Product {Name: 'C'})
This ought to do what you want.
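To make the role of the intermediate node concrete, here is a minimal sketch in plain Python rather than Cypher. The `Qualifier` class and `eligible_products` function are hypothetical names for illustration: each qualifier plays the part of the ProductQualifier node, with its `required` set standing for the HAS relationships and `unlocks` for the ELIGIBLE relationship.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Qualifier:
    """Intermediate node: HAS edges to required products, ELIGIBLE edge to one product."""

    required: FrozenSet[str]
    unlocks: str


def eligible_products(purchased: Set[str], qualifiers: Iterable[Qualifier]) -> Set[str]:
    """Return every product unlocked by a qualifier whose required products were all purchased."""
    return {q.unlocks for q in qualifiers if q.required <= purchased}


# IF Node=A AND Node=B THEN Node=C, expressed through one intermediate node.
RULES = [Qualifier(required=frozenset({"A", "B"}), unlocks="C")]
```

In the graph itself the same check becomes a match on a ProductQualifier whose HAS targets are all among the products the user has PURCHASED.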
A second approach is to use a database that inherently supports hypergraphs, something like HyperGraphDB, which spares you the burden of creating the extra node. I haven't had any occasion to use it, though I have wanted to try it out for quite some time, so I don't know much about its APIs or its limitations; however, it is a fairly well-known graph database.
Note: As mentioned, I am very bad at naming things. You should probably change the label names to something more suitable for your business model.
I have an interesting situation. I am allowing users to provide their own data sources to be imported into neo4j. The data sources could be the same across different users, but I would like cypher queries to only query nodes and relations specified by a particular user's sources.
I can think of several ways to do this:
Separate neo4j instances for each user
Tag nodes and relationships by user
Currently node duplicates are prevented by indexes, so I would have to alter that approach, since a node which already exists simply gets a new relationship to it. The number of relationships to a node is used in my analysis, so separating relationships by user is important.
I will have to update an existing graph database to account for these new attributes. I'm thinking that tagging relationships might be the way to go. Any thoughts pro/con against this approach? This way I can include the user tag as a relationship parameter.
Thoughts?
Henry
You can tag all your users with labels and use these even to tag the source:
http://docs.neo4j.org/chunked/preview/query-match.html#match-get-all-nodes-with-a-label
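Since the question relies on per-user relationship counts, a small sketch of what tagging relationships by user buys you may help. This is plain Python over a list of tuples, not Cypher; the `(start, end, owner)` representation and the `degree_for_user` function are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Tuple

# (start node, end node, owning user); hypothetical representation of a tagged relationship
Relationship = Tuple[str, str, str]


def degree_for_user(relationships: Iterable[Relationship], user: str) -> Counter:
    """Count relationships per node, looking only at those tagged with `user`."""
    counts: Counter = Counter()
    for start, end, owner in relationships:
        if owner == user:
            counts[start] += 1
            counts[end] += 1
    return counts
```

Shared nodes stay deduplicated, while two users importing the same source each get their own relationship, so one user's counts are not inflated by another's data.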