Graph or Relational DB specifically recursion

I am about to develop a solution for a customer where the basic entity is a member, and members can have multiple different social relationships with other members. For instance, let's say we have four types of members: Doctors, Specialists, Nurses and Patients. One or more Doctors can consult one or more Specialists, one or more Doctors can treat one or more Patients, and one or more Doctors can be in charge of one or more Nurses. If I were to use a relational DB, a high degree of recursion would be necessary (as all entities must be members), whereas recursion is not necessary in a graph data model.
Is it then safe to say that it is better to use a graph database for a social type of application where members may have different or overlapping roles?

A graph database would be good at modelling these kinds of relationships. There are a few ways that you might model it. You could think of a vertex as being a Member, with different edges from Member to other Members representing the types of relationships:
Member --consults--> Member (physician to specialist)
Member --reportsTo--> Member (nurse to physician)
Member --diagnoses--> Member (physician to patient)
Obviously a Member may have as many edges of any type as needed (e.g. multiple "consults" edges to specialists). In a more complex model, you might also see a Member as being an "identity" for a person, such that your model looks like:
Member --actsAsPhysician--> Physician
Member --actsAsSpecialist--> Specialist
Physician --consults--> Specialist
In this approach the "consults" edge can only exist on a "Physician" vertex, thus you provide some constraints as to what vertex types can be expected to have what kind of edges.
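As a rough illustration of that constraint idea, here is a minimal in-memory sketch in plain Python. It is not tied to any particular graph database; the `Graph` class, the `ALLOWED_EDGES` table and the vertex ids are all made up for the example.

```python
from collections import defaultdict

# Hypothetical schema: which (source label, target label) pair each edge type may connect.
ALLOWED_EDGES = {
    "actsAsPhysician": ("Member", "Physician"),
    "actsAsSpecialist": ("Member", "Specialist"),
    "consults": ("Physician", "Specialist"),
}


class Graph:
    def __init__(self):
        self.labels = {}                     # vertex id -> label
        self.out_edges = defaultdict(list)   # vertex id -> [(edge type, target id)]

    def add_vertex(self, vertex_id, label):
        self.labels[vertex_id] = label

    def add_edge(self, source, edge_type, target):
        # Reject edges whose endpoints do not have the labels the schema expects.
        expected = ALLOWED_EDGES[edge_type]
        actual = (self.labels[source], self.labels[target])
        if actual != expected:
            raise ValueError(f"{edge_type} must connect {expected}, got {actual}")
        self.out_edges[source].append((edge_type, target))

    def neighbours(self, source, edge_type):
        # Follow only edges of the requested type.
        return [t for (e, t) in self.out_edges[source] if e == edge_type]


graph = Graph()
graph.add_vertex("alice", "Member")
graph.add_vertex("alice-as-physician", "Physician")
graph.add_vertex("bob", "Member")
graph.add_vertex("bob-as-specialist", "Specialist")
graph.add_edge("alice", "actsAsPhysician", "alice-as-physician")
graph.add_edge("bob", "actsAsSpecialist", "bob-as-specialist")
graph.add_edge("alice-as-physician", "consults", "bob-as-specialist")
```

With this sketch, trying to add a "consults" edge directly from the Member vertex "alice" raises a ValueError, because only a Physician vertex may be the source of that edge type.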
Graphs give you a lot of flexibility to model data as it exists in the real world, as you are really just describing entities and the relationships among them. I'd encourage you to look at http://tinkerpop.com for a collection of tools that are helpful in building graph applications independent of the graph database you choose.

If everyone is a member, then Member is central to the data model from a relational perspective. There is no need for recursive SQL select statements:
Member ->---<- Doctor ->---<- Patients
One or more Members are one or more Doctors; one or more Doctors treat one or more Patients.
Your model will have a lot of many-to-many relationships and a lot of relationship tables. For instance, the Treats relation could contain attributes such as Treatment Period, Ailment, etc.
Your solution could be implemented in any topology. While the network/graph topology is faster than the set topology, the graph data model, once implemented, is almost impossible to change. History tells us it is unwise to build rigid business applications, so research the pros and cons of each and make a decision.
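To make the relationship-table idea concrete, here is a small sketch using Python's built-in sqlite3 module. The table and column names (member, doctor, patient, treats, treatment_period, ailment) and the sample rows are assumptions for illustration only; the point is that the lookup is a plain join, not a recursive query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE member  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE doctor  (member_id INTEGER PRIMARY KEY REFERENCES member(id));
CREATE TABLE patient (member_id INTEGER PRIMARY KEY REFERENCES member(id));
-- Relationship table for the many-to-many "treats" relation, with its own attributes.
CREATE TABLE treats (
    doctor_id        INTEGER NOT NULL REFERENCES doctor(member_id),
    patient_id       INTEGER NOT NULL REFERENCES patient(member_id),
    treatment_period TEXT,
    ailment          TEXT,
    PRIMARY KEY (doctor_id, patient_id, ailment)
);
""")

# Made-up sample data: two doctors treating the same patient.
conn.executemany("INSERT INTO member VALUES (?, ?)",
                 [(1, "Dr. Adams"), (2, "Dr. Baker"), (3, "Carol")])
conn.executemany("INSERT INTO doctor VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO patient VALUES (3)")
conn.executemany("INSERT INTO treats VALUES (?, ?, ?, ?)",
                 [(1, 3, "2024-01", "flu"), (2, 3, "2024-02", "sprain")])

# Who treats Carol? Two ordinary joins through the relationship table.
rows = conn.execute("""
    SELECT d.name
    FROM treats t
    JOIN member d ON d.id = t.doctor_id
    JOIN member p ON p.id = t.patient_id
    WHERE p.name = ?
    ORDER BY d.name
""", ("Carol",)).fetchall()
doctors_of_carol = [name for (name,) in rows]
```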

Related

What is the difference between a Domain model and a conceptual class diagram in UML

I have an assignment for school where I'm asked to represent the system of a company that I am to upgrade with a domain model, and to draw a conceptual class diagram with the four most important use cases of the system. I don't really understand the difference between the two. Can someone help me?

In short
"Domain model" and "conceptual class diagram" mean different things to different people. There is no universal authoritative definition of these terms.
Nevertheless, objectively a domain is more than interrelated classes. If we consider that conceptual means independent of any solution implementation, we can claim that a conceptual class diagram is a subpart of the domain model.
Some more arguments
A domain model describes the elements of the real world for which a piece of software shall provide a solution.
For example, for a real estate application:
you’d have “business objects” such as real estate assets (houses, flats, …), owners, tenants, sellers, buyers, agents, contracts, payments, geographical regions, etc.
But you would also have domain logic, such as the lifecycle of things: at first a party can be a prospect for an asset, then an interested prospect after a visit, then a tenderer if a bid is submitted, then a buyer if the bid is accepted. The domain model can also describe business rules, e.g. if a tenderer proposes a price below the price demanded by the seller, the agent has to ensure agreement of the seller before continuing negotiations.
DDD practitioners would also remind us that domain entities and aggregates (the things) are related to domain events that express what happens to entities and aggregates.
Hence, a domain is more than interrelated classes. If you’re not convinced, imagine a Model-View-Controller application where the Model would ignore the business logic: would it be useful?
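To illustrate the kind of domain logic a class diagram alone would not show, the party lifecycle from the real estate example could be sketched roughly as follows. The names `PartyStatus`, `TRANSITIONS`, `advance` and the event strings are invented for this sketch.

```python
from enum import Enum


class PartyStatus(Enum):
    PROSPECT = "prospect"
    INTERESTED_PROSPECT = "interested prospect"
    TENDERER = "tenderer"
    BUYER = "buyer"


# Hypothetical event names; each entry is one allowed step in the lifecycle.
TRANSITIONS = {
    (PartyStatus.PROSPECT, "visit"): PartyStatus.INTERESTED_PROSPECT,
    (PartyStatus.INTERESTED_PROSPECT, "submit_bid"): PartyStatus.TENDERER,
    (PartyStatus.TENDERER, "bid_accepted"): PartyStatus.BUYER,
}


def advance(status, event):
    """Return the next status, or raise if the event is not allowed in this status."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed for a {status.value}") from None


status = PartyStatus.PROSPECT
for event in ("visit", "submit_bid", "bid_accepted"):
    status = advance(status, event)
```

The classes (party, asset, bid) would appear in a class diagram, but the allowed order of the steps is dynamic behaviour that belongs to the wider domain model.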
The term “conceptual” means something abstract that is independent of any concrete implementation. In this regard, a conceptual class diagram refers in principle to a diagram of classes that describes the domain, independently of any concrete/implementable solution.
As a consequence, a conceptual class diagram in UML only describes a static subpart of the domain model, because by construction the class diagram is designed for representing the static structure of classes. UML foresees other diagrams to describe the dynamic aspects of a system or its domain, such as activity diagrams, sequence diagrams or state diagrams, which let you focus on some dynamic parts of the domain.
So a conceptual class diagram can only be a part of the domain model.
You’ll nevertheless find articles and peers who use the term “domain model” to refer to the “model of the entities of a domain”. This is a misleading shortcut in the language.

Constraintless conceptual model of organization with constraints?

We've been given an assignment in which we are to create a conceptual model, described by a text document. There are a number of constraints given in the document, but we have also been instructed not to use constraints in the model.
We have been able to work around a few constraints, but there is one that we've been unable to tackle. I've made up a scenario that is somewhat similar to the part of the assignment that we're having issues with.
You've been tasked to create a model of the structure of a game studio. The company consists of a number of departments, and each department has at least one employee. Each employee works at a single department. There are three different types of employees: developers, designers and engineers.
In addition to this, there are a number of leadership roles that employees can have: Head of Department, Deputy Head of Department, CTO or CEO (Yes, CTO and CEO are roles that regular employees have). Each department must have 1 Head of Department and at least one Deputy Head of Department.
In addition to this, there can only be one CTO and one CEO, and these roles can only be held by engineers. Each employee can only have a single leadership role.
To solve this, we've made up an additional, abstract entity: BasicRole. This entity is a specialisation of LeadershipRole, and is a generalisation of the three roles that any employee can hold. That solves one of the problems, and now we can simply create appropriate associations between Designer/Developer and BasicRole.
However, we also want Engineer to have an association with BasicRole in addition to associations to CEO and CTO. Adding those associations results in a conceptual model that looks like this:
However, this is problematic because now we're saying that an engineer can have anywhere between 0 and 3 roles.
We've considered including Company as an entity and adding associations between Company and CTO/CEO, to specify that way that the company can only have one of each, but we've been told over and over during this course not to include the thing that we're modeling as an entity in the model.
Now, it seems as if all our problems could be solved with constraints (if we were to go ahead and read up on those), with some sort of xor for the three associations. However, seeing as we've been instructed not to use constraints in the conceptual model, we're at a loss.

If you associate your Engineer to LeadershipRole (with multiplicity 0..1) removing your two relationships from Engineer to CTO, CEO and LowerRole you will get the expected result:
Each employee can only have a single leadership role.
Since LeadershipRole is abstract, it has to be one of CEO, CTO, HeadOfDepartment or DeputyHOD, but due to the multiplicity it can't be more than one at the same time.
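If it helps to see this outside of UML, here is a rough Python analogy (not part of the conceptual model itself). The class names follow the ones used above; the `title` method and the `Engineer` dataclass are assumptions made for the sketch. The optional attribute plays the part of the 0..1 multiplicity.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


class LeadershipRole(ABC):
    """Abstract: cannot be instantiated, only its concrete subclasses can."""

    @abstractmethod
    def title(self) -> str: ...


class CEO(LeadershipRole):
    def title(self) -> str:
        return "CEO"


class CTO(LeadershipRole):
    def title(self) -> str:
        return "CTO"


class HeadOfDepartment(LeadershipRole):
    def title(self) -> str:
        return "Head of Department"


class DeputyHOD(LeadershipRole):
    def title(self) -> str:
        return "Deputy Head of Department"


@dataclass
class Engineer:
    name: str
    # Multiplicity 0..1: either no leadership role or exactly one concrete role.
    leadership_role: Optional[LeadershipRole] = None


eve = Engineer("Eve", CTO())
sam = Engineer("Sam")
```

Note that this only mirrors the "at most one role per engineer" part; it says nothing about there being only one CTO or CEO in the whole company.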
The "we've been told over and over during this course not to include the thing that we're modelling as an entity in the model" statement is correct if you're designing the code-level documentation, but it is normal to include an entity representing the whole organisation you're modelling. In other words: don't put System (or whatever you call your system) in your system's model. But Company is something you model within your system.

2 options for "Each department must have 1 Head of Department and at least one Deputy Head of Department."
Redefinition
Nested notation
2 options for "these roles can only be held by engineers"
Redefinition
Generalization
Total 4=2*2 options

association relationship in UML

As I read through Appendix 1 of Roger Pressman's software engineering book, I came across the statement that
an association between two classes means that there is a structural
relationship between them
What does "structural relationship" mean?

UML differentiates 'structural' and 'behavioural' models. Class Diagrams, Package Diagrams and a few others capture the structural aspects. State/Sequence/Activity Diagrams capture behavioural aspects.
'Structural' means it holds over time. For example, the association between Order and OrderLines ("Order consists of 1 or more OrderLines / OrderLine is part of exactly one Order"). Or Dog and Person ("Dog is owned by exactly one Person / Person owns many Dogs"). Used well, Associations capture invariant rules from the problem domain. To use the Dog example: the association says a Dog can't ever be owned by more than one Person at any given time. Doesn't matter if the Dog is running, sitting, or eating: it must have exactly one Owner. Note also the owner could change over time: but there can never be more than one at any point.
An alternative is to think of Associations as the kind of thing that might be captured using foreign keys in a relational database.
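As a small sketch of that foreign-key view, using Python's built-in sqlite3 module: the Dog/Person association becomes a mandatory owner_id column. Table names, column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE dog (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    -- A single NOT NULL column: exactly one owner at any point in time.
    owner_id INTEGER NOT NULL REFERENCES person(id)
);
""")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
conn.execute("INSERT INTO dog VALUES (1, 'Rex', 1)")

# The owner may change over time, but there is still exactly one.
conn.execute("UPDATE dog SET owner_id = 2 WHERE id = 1")
current_owner = conn.execute(
    "SELECT p.name FROM dog d JOIN person p ON p.id = d.owner_id WHERE d.id = 1"
).fetchone()[0]


def is_allowed(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


ownerless_dog_allowed = is_allowed("INSERT INTO dog VALUES (2, 'Stray', NULL)")
unknown_owner_allowed = is_allowed("INSERT INTO dog VALUES (3, 'Ghost', 99)")
```

Both of the last two inserts are rejected: a dog without an owner violates NOT NULL, and a dog pointing at a non-existent person violates the foreign key.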
hth.

How does representing all information in Nodes vs Attributes affect storage and computation?

When using graph databases (in my case Neo4j), we can represent the same information in many ways: making each entity a Node and connecting all entities through relationships, or just adding the entities to the attribute list of a Node.
Following are two different representations of the same data.
Overall, which mechanism is suitable in which conditions?
My use case involves traversing the Database from different nodes until 4 depths and examining the information through connected nodes or attributes (based on which approach it is).
One query of interest may be, "Who are the friends of John who went to Stanford?"
What is the difference in terms of storage and computation?

Normally, properties are loaded lazily and are more expensive to hold in cache, especially strings. Nodes and relationships are most effective for traversal, especially since the relationship types are stored together with the relationship records and thus don't trigger property loads when used in traversals.
Also, a balanced graph (that is, one without many dense nodes with over, say, 10K relationships) is most effective to traverse.
I would try to model most of the recurring properties as nodes connected to the entities, thus using the graph itself to index on these values, instead of having to fall back on filtering on property values or indexing the property with an expensive index lookup.

The first one is much better, since you're querying on entities such as Stanford, and that entity is related to many person nodes. My opinion is that modelling as nodes is more intuitive and easier to query on. "Find all persons who went to Stanford" would not be very easy to do in your second model, as you don't have a place to start traversing from.
I'd use attributes mainly to describe the node/entity and use them to filter results from the query, e.g. "Who are the friends of John who went to Stanford in the year 2010?" In this case, the year attribute would just be used to trim the results. It depends on your use case: if year is really important and drives a lot of queries, or is used to represent a timeline, you could even model the year as a node attached to Stanford.
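A plain-Python analogy of the two models (not Neo4j code, and with made-up names and data) may make the difference easier to see. In the node model, Stanford is something you can start from; in the attribute model, you have to scan and filter.

```python
# Hypothetical data for illustration only.
friends = {"John": {"Mary", "Li", "Raj"}}

# Model 1: the school is a node, and "attended" relationships point at it.
attended = {"Stanford": {"Mary", "Raj", "Zoe"}, "MIT": {"Li"}}

# Model 2: the school is just an attribute on each person.
properties = {
    "Mary": {"school": "Stanford"},
    "Li": {"school": "MIT"},
    "Raj": {"school": "Stanford"},
    "Zoe": {"school": "Stanford"},
}


def friends_at_school_via_nodes(person, school):
    # Start from both ends (the person and the school node) and intersect.
    return friends[person] & attended[school]


def friends_at_school_via_properties(person, school):
    # Visit each friend and filter on the property value.
    return {f for f in friends[person] if properties[f].get("school") == school}


def everyone_at_school_via_nodes(school):
    # The school node is the starting point; no scan needed.
    return attended[school]


def everyone_at_school_via_properties(school):
    # No starting point: every person has to be examined (or an index is needed).
    return {p for p, props in properties.items() if props.get("school") == school}


via_nodes = friends_at_school_via_nodes("John", "Stanford")
via_properties = friends_at_school_via_properties("John", "Stanford")
```

Both models give the same answers; what differs is where the query can start and how much has to be inspected along the way.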

How do you design an OLAP Database?

I need a mental process to design an OLAP database...
Essentially for standard relational it'd be (loosely):
Identify Entities
Identify Relationships
Identify Properties of Entities
For each property:
Ensure property can be related to only one entity
Ensure property is directly related to entity
For OLAP databases, I understand the terminology, the motivation and the structure; however, I have no clue as to how to decompose my relational model into an OLAP model.

Identify Dimensions (or By's)
These are anything that you may want to analyse/group your report by. Every table in the source database is a potential Dimension. Dimensions should be hierarchical if possible, e.g. your Date dimension should have a year, month, day hierarchy. Similarly, Location should have, for example, a Country, Region, City hierarchy. This will allow your OLAP tool to calculate aggregations more efficiently.
Identify Measures
These are the KPIs, or the actual numerical information your client wants to see. They are usually capable of being aggregated, so any non-flag, non-key numeric field in the source database is a potential measure.
Arrange in star schema, with Measures in the center 'Fact' table, and FK relations to applicable Dimension tables. Measures should be stored at the lowest dimension hierarchy level.
Identify the 'Grain' of the fact table. This is essentially the 'level of detail' held. It is usually determined by the reporting requirements, the data granularity available in the source and the performance requirements of the reporting solution. You may identify the grain as you go, or you may approach it as a final step once all the important data has been identified. I tend to have a final step to ensure the grain is consistent between my fact tables.
The final step is identifying slowly changing dimensions, and the requirements for these. For example, if the customer dimension includes an element of their address and they move, how is that to be handled?
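The steps above can be sketched with a tiny star schema, here using Python's built-in sqlite3 module rather than a real OLAP tool. The table names (dim_date, dim_location, fact_sales), the sales_amount measure and the sample rows are all assumptions for illustration. The grain of the fact table in this sketch is one row per day and city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions, each with a hierarchy (year > month > day, country > region > city).
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER, day INTEGER
);
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY, country TEXT, region TEXT, city TEXT
);
-- Central fact table: foreign keys to the dimensions plus the measure,
-- stored at the lowest hierarchy level (day, city).
CREATE TABLE fact_sales (
    date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
    location_key INTEGER NOT NULL REFERENCES dim_location(location_key),
    sales_amount REAL NOT NULL
);
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)",
                 [(20240105, 2024, 1, 5), (20240220, 2024, 2, 20)])
conn.executemany("INSERT INTO dim_location VALUES (?, ?, ?, ?)",
                 [(1, "France", "Brittany", "Rennes"),
                  (2, "France", "Occitanie", "Toulouse")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(20240105, 1, 100.0), (20240105, 2, 50.0), (20240220, 1, 25.0)])

# Report "sales by month": group the measure by a higher level of the Date hierarchy.
sales_by_month = conn.execute("""
    SELECT d.year, d.month, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.year, d.month
    ORDER BY d.year, d.month
""").fetchall()

# Report "sales by city": the same fact table, grouped by another dimension.
sales_by_city = conn.execute("""
    SELECT l.city, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_location l ON l.location_key = f.location_key
    GROUP BY l.city
    ORDER BY l.city
""").fetchall()
```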

One important point in identifying the Dimensions and Measures is the final cardinality that you choose for the model.
Let's say that data is entered into your relational database throughout the day.
Maybe you don't need to visualize or aggregate the measures by hour, or even by day. You can choose a weekly granularity, or monthly, etc.
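As a rough sketch of what choosing a coarser grain means in practice, the snippet below rolls timestamped entries up to one row per ISO week before they would be loaded into a fact table. The `entries` data and the variable names are invented for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical source rows: (timestamp of data entry, measure value).
entries = [
    (datetime(2024, 1, 1, 9, 30), 10.0),   # Monday, ISO week 1
    (datetime(2024, 1, 3, 17, 0), 5.0),    # Wednesday, ISO week 1
    (datetime(2024, 1, 8, 8, 15), 7.5),    # Monday, ISO week 2
]

# Weekly grain: the hour and day detail is dropped, one total per (year, week).
weekly_totals = defaultdict(float)
for timestamp, amount in entries:
    iso_year, iso_week, _weekday = timestamp.isocalendar()
    weekly_totals[(iso_year, iso_week)] += amount
```

The trade-off is the usual one: fewer rows and faster aggregation, but reports can no longer drill down below the week.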
