I was going through this paper: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
In the section on replication, they mention that each key k is assigned to a coordinator node. Is this coordinator node one of the actual physical nodes in the context of consistent hashing? Or are there specific nodes among the physical nodes that are assigned the task of being coordinator nodes?
They also mention that the preference list contains more than N nodes. However, if a key is replicated to N-1 physical nodes, does this preference list contain virtual nodes too?
I am a little confused here.
That 2007 paper covers "Dynamo", which is a database that came before the DynamoDB service. You can't read it to understand how DynamoDB works.
Here's a paper about DynamoDB, if you like reading conference papers instead of documentation: https://www.usenix.org/system/files/atc22-elhemali.pdf
I'm fairly new to Gremlin and I'm trying to query a graph starting at my vertex Customer, which is related to various nodes amongst which is the Account node. And so, I want to retrieve all nodes related to the Customer node + all nodes related to the Account node connected to it.
As you can see in the image, my Customer node is related to the account node via the has_account edge. I would like to get all the nodes adjacent to that account node.
[Image: Customer node]
As I said, I'm fairly new to Neptune, so what I've tried aside from the most basic visualizations is:
g.V('id').outE().inV().outE().inV().path()
And that gives me the nodes adjacent to the Account node but omits the other nodes adjacent to the Customer node. I've also tried some other groupings and mappings but I can't seem to make it work.
In the query you wrote above, you are starting at a vertex id and then traversing to all of its connections (the first outE().inV()), and then to the connections of those connections (the second outE().inV()). If a connection does not have any connections of its own, then it will be filtered out.
If you would like to return both the 1st and optionally 2nd connections, then I would look at using the optional() step for the 2nd+ level connections that may or may not exist like this:
g.V('id').outE().inV().optional(outE().inV()).path()
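To see why the first query drops some neighbours and the optional() version keeps them, here is a minimal plain-Python sketch of the two behaviours. It is not Gremlin and does not call any Neptune or TinkerPop API. The vertex names and the adjacency list are invented for illustration, and edges are left out of the paths for brevity.

```python
# Toy adjacency list; every vertex name here is an assumption for illustration.
graph = {
    "customer": ["account", "address"],
    "account": ["card", "branch"],
    "address": [],
    "card": [],
    "branch": [],
}


def two_hop_paths(start):
    """Mimics two mandatory hops: a path survives only if both hops exist."""
    return [
        [start, first, second]
        for first in graph[start]
        for second in graph[first]
    ]


def optional_second_hop_paths(start):
    """Mimics an optional second hop: keep the one-hop path when no second hop exists."""
    paths = []
    for first in graph[start]:
        seconds = graph[first]
        if seconds:
            paths.extend([start, first, second] for second in seconds)
        else:
            paths.append([start, first])
    return paths
```

In this toy graph, "address" has no outgoing connections, so it disappears from two_hop_paths("customer") but is kept as a two-element path by optional_second_hop_paths("customer").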
The vault contains data extracted from the ledger that is considered relevant to the node’s owner, stored in a relational model that can be easily queried and worked with. If both exist on the same node, is it the case that we have direct access only to the vault database (vault data), and that the distributed ledger itself is reachable only through Corda's internal API?
You can think of the entire distributed ledger in Corda as being a combination of everyone's segmented view of the ledger. For example, if on a particular network, Node A and Node B were to transact, these two would each hold representations of the states involved in the transaction in each of their vaults.
If Node C and Node D were to also transact, likewise they would store the states relevant to their transaction. If the network were comprised entirely of just Nodes A, B, C and D then together all these states would form the entire distributed ledger with each node holding only the states relevant to them.
Each node can access its own segment of the distributed ledger by querying for data from its vault.
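The A/B and C/D example can be modelled in a few lines of plain Python. This is only a sketch of the idea and does not use any Corda API. The transaction records, state names and the build_vaults helper are all hypothetical.

```python
from collections import defaultdict

# Each toy transaction lists its participants and the state it produces.
# All names are assumptions for illustration.
transactions = [
    {"participants": {"A", "B"}, "state": "state-AB"},
    {"participants": {"C", "D"}, "state": "state-CD"},
]


def build_vaults(txs):
    """Give each participant only the states from transactions it took part in."""
    vaults = defaultdict(set)
    for tx in txs:
        for node in tx["participants"]:
            vaults[node].add(tx["state"])
    return dict(vaults)


vaults = build_vaults(transactions)

# In this model the "entire ledger" is just the union of every node's vault.
ledger = set().union(*vaults.values())
```

Here vaults["A"] holds only the state from the A/B transaction, and no single node holds the full ledger set.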
We have a use case where we need a LinearState that moves through the statuses below.
Initiated -> Updated -> Queried -> Resolved -> Accepted -> Settled
We have multiple nodes as signing nodes in Queried, Resolved, Settled states. We need the state to be updated from Queried to Resolved to Accepted if and only if all the involved nodes have had their queries resolved.
Not every node might have a query. So if one of the nodes accepts it without any queries, it doesn’t mean that the others have to accept it. They might still have a query. How do we handle this state change where each node might have a different say in the same state?
You can proceed in two steps:
1. Write the contract logic so that all the involved nodes are required signers.
2. Write the flow logic so that a node only signs if it doesn't have a query.
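The split between the two steps can be sketched in plain Python. This is a model of the idea only, not Corda contract or flow code. REQUIRED_SIGNERS, contract_verify and collect_signatures are hypothetical names, and open_queries is an assumed mapping from node name to its unresolved queries.

```python
# Hypothetical set of nodes that must all sign the state transition.
REQUIRED_SIGNERS = {"NodeA", "NodeB", "NodeC"}


def contract_verify(signers):
    """Contract rule: the transition is valid only if every required node signed."""
    return REQUIRED_SIGNERS <= set(signers)


def collect_signatures(open_queries):
    """Flow rule: a node signs only if it has no unresolved query."""
    return {node for node in REQUIRED_SIGNERS if not open_queries.get(node)}
```

With this split, a single node that still has a query withholds its signature, so the contract check fails and the state cannot advance until that query is resolved.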
I am designing a graph database for eligibility rules. Some eligibility rules require that a user select 2 particular products (Product A and Product B) to qualify for Product C.
Is it possible to create a graph edge with 2 starting nodes?
I would think this would break what I think is the fundamental building block of a graph db - its adjacency list. But if this was possible, it would be very powerful for my application.
Update 6/16
More specifically, I'm looking to create a directed edge with 2 starting nodes, and 1 ending node. So, in biz rules terms: IF Node=A AND Node=B THEN Node=C. The real world relationship is this: If customer buys Product A and Product B, then customer qualifies for Product C.
Usually, to model a hypergraph in Neo4j, you end up creating an intermediate "group node" that connects all of the nodes you want to connect, then bridging off of that node to the other node. It's not a true hypergraph, but rather a representation of a hypergraph using the tools provided.
Here's an example:
http://www.markhneedham.com/blog/2013/10/22/neo4j-modelling-hyper-edges-in-a-property-graph/
Yes, you can have multiple starting nodes in Neo4j; I'm not sure about other graph databases.
START a=node(0), b=node(1)
RETURN a,b
You should refer to http://docs.neo4j.org/chunked/stable/query-start.html for more details. Starting from Neo4j 2.0 the start node is optional: Cypher will try to infer starting points from your query based on the label and where clause.
Edit
I have edited the answer based on the updated question. What you need is a hypergraph. As Wes Freeman mentioned, to model a hypergraph in Neo4j you will need to create an intermediate node that connects your other two nodes and the third node. In your scenario a user will have a PURCHASED relationship with the two products (A and B), something like (:User {Id: 1})-[:PURCHASED]->(:Product {Name: 'A'}). Then you will have to create an intermediate node like ProductQualifier (I am very bad at naming things) with a relationship from the user, like (:User {Id:1})-[:QUALIFIES]->(:ProductQualifier {Id:1}). The ProductQualifier will then have 3 relationships, two to Product A and B respectively and a third to Product C:
(:Product {Name: 'B'})<-[:HAS]-(:ProductQualifier {Id:1})-[:HAS]->(:Product {Name: 'A'})
(:ProductQualifier {Id:1})-[:ELIGIBLE]->(:Product {Name: 'C'})
This ought to do what you want.
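To check that the intermediate-node layout really expresses "A AND B implies C", here is a small plain-Python sketch that evaluates the rule over the same relationships. It does not use Neo4j or Cypher. The edge triples, the "qualifier-1" identifier and the eligible_products helper are assumptions for illustration.

```python
# Toy edges as (source, relationship, target) triples, mirroring the
# HAS and ELIGIBLE relationships described above. Names are illustrative.
edges = [
    ("qualifier-1", "HAS", "A"),
    ("qualifier-1", "HAS", "B"),
    ("qualifier-1", "ELIGIBLE", "C"),
]


def eligible_products(purchased, edges):
    """Return products unlocked by qualifiers whose HAS products were all purchased."""
    required, unlocks = {}, {}
    for source, rel, target in edges:
        if rel == "HAS":
            required.setdefault(source, set()).add(target)
        elif rel == "ELIGIBLE":
            unlocks.setdefault(source, set()).add(target)
    result = set()
    for qualifier, needed in required.items():
        # The qualifier fires only when every required product is present.
        if needed <= set(purchased):
            result |= unlocks.get(qualifier, set())
    return result
```

A customer who bought only Product A gets nothing from this qualifier, while one who bought both A and B gets Product C.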
A second approach is to use a database that inherently supports hypergraphs, something like HyperGraphDB, which removes the burden of creating the extra node. I haven't had occasion to use it, though I have wanted to try it out for quite some time, so I don't know much about its APIs or its limitations. It is, however, a fairly well-known graph database.
Note: As mentioned, I am very bad at naming things. You should probably change the label names to something more suitable for your business model.
Hi all, I'm playing around with OrientDB to evaluate its inclusion in a new project.
Here is my problem.
Looking at the use cases I'm going to have a lot of super nodes (nodes which will have at least 5-10k outgoing relations) and I think that those nodes could be an irritating bottleneck on highly concurrent access.
The entire database must serve 20 departments, every department owns a partition of the data and those "blocks" are not accessible from the other departments.
Every department's partition shares about 60% of the data structure schema, and the other 40% of the schema is department independent...
At system level I have a couple of agents which have complete read access to the graph for data analysis and profiling, and every department can have its own profiling agent which will profile only its own partition's data.
Now, my question is:
Is it possible to create "independent" subgraphs inside an OrientDB graph database?
Thanks to everybody for the time and the help.
Marco
You can model this use case inside your domain as a graph:
root -> * departments -> other nodes
In this way each department traverses the graph starting from its own department node.
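A minimal plain-Python sketch of that layout, assuming a simple directed adjacency list. It does not use the OrientDB or Blueprints APIs, and all node names and the reachable helper are invented for illustration.

```python
from collections import deque

# Toy graph: root -> departments -> other nodes (all names are assumptions).
org_graph = {
    "root": ["dept-sales", "dept-hr"],
    "dept-sales": ["order-1", "order-2"],
    "dept-hr": ["employee-1"],
    "order-1": ["customer-1"],
    "order-2": [],
    "employee-1": [],
    "customer-1": [],
}


def reachable(start):
    """Breadth-first traversal returning every node reachable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in org_graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen
```

A department agent would call reachable("dept-hr") and see only its own partition, while a system-level agent starting at "root" sees everything. Note that this isolates partitions only as long as no edges cross between departments; it is a modelling convention, not an access-control mechanism.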
To use something already done look at this post by Marko Rodriguez (the main author of Blueprints and Gremlin language): http://thinkaurelius.com/2012/04/06/multitenant-graph-applications/
And this recent project to run a partition graph on top of OrientDB's Blueprints implementation: https://github.com/tinkerpop/blueprints/wiki/Partition-Implementation