Reverse engineer a physical data model from an SAP HANA Calculation View in PowerDesigner

I created a multidimensional data model in SAP HANA as a calculation view of type Cube with star join. In this calculation view I only used calculation views of type Dimension, which contain the dimension tables and the changes I made to them (e.g. building hierarchies).
I now need to present a conceptual data model with all the dependencies. In PowerDesigner it is possible to reverse engineer physical data models, but when I follow the procedure described by SAP I get the physical tables as a result without any relationships between them. I imported all calculation views and the necessary tables.
Does this happen because I connected only the views and not the tables themselves, and is there a way to solve this?
Thank you very much for reading this. :)

SAP PowerDesigner can read SAP HANA information models (see the online help: Calculation Views (HANA)).
This allows for impact analysis, i.e. the dependencies on source tables and views are captured.
However, the SAP HANA information views are usually not considered part of a logical data model, as they are rather part of analytical applications.
As for the missing join conditions in the reverse-engineered data model: if the model is reversed from the database runtime objects, that is, the tables and views currently in the database, you will usually find that foreign key relationships are not implemented as database constraints.
Instead, SAP products implement the join definition either in the application layer (SAP Netweaver dictionary) or in the repository via view definitions and CDS associations.
See PowerDesigner and HANA for details on this.
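If you need the relationships to show up in the reverse-engineered physical model anyway, one possible workaround is to declare the foreign keys explicitly on the runtime tables (or add them manually as references in the PowerDesigner model). A minimal plain-SQL sketch, assuming hypothetical FACT_SALES and DIM_CUSTOMER tables that are not part of the original question:

-- Hypothetical star-schema tables; adjust names and columns to your model.
CREATE COLUMN TABLE DIM_CUSTOMER (
    CUSTOMER_ID   INTEGER PRIMARY KEY,
    CUSTOMER_NAME NVARCHAR(100)
);

CREATE COLUMN TABLE FACT_SALES (
    SALES_ID    INTEGER PRIMARY KEY,
    CUSTOMER_ID INTEGER,
    AMOUNT      DECIMAL(15,2)
);

-- Declaring the join as a real foreign key constraint makes the relationship
-- visible to reverse engineering tools such as PowerDesigner.
ALTER TABLE FACT_SALES
    ADD CONSTRAINT FK_SALES_CUSTOMER
    FOREIGN KEY (CUSTOMER_ID) REFERENCES DIM_CUSTOMER (CUSTOMER_ID);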

Related

Constructing an OLAP Cube in Snowflake

I am trying to reconstruct a Cognos Transformer cube in Snowflake.
1. Do we have an option to build an OLAP cube in Snowflake (like SSAS, Cognos Transformer)?
2. Any recommendations on what the approach should be or which steps to follow?
Currently there is no option similar to an SSAS cube in Snowflake. Once data is loaded into the database, Snowflake allows you to query it in a way similar to traditional OLTP databases.
For data aggregation, Snowflake offers a rich set of built-in functions. Together with UDFs, stored procedures and materialized views, we can build custom solutions for precomputed data aggregations.
For data analysis we still have to rely on third-party tools. Snowflake provides a variety of connectors to access its database objects from other analytical tools.
There are plans in the near future to introduce an integrated tool for data aggregation, analysis and reporting.
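As a sketch of the precomputed-aggregation approach, the following uses a Snowflake materialized view over a hypothetical SALES table (table and column names are assumptions, not from the question):

-- Hypothetical fact table.
CREATE TABLE SALES (
    SALE_DATE DATE,
    REGION    VARCHAR,
    PRODUCT   VARCHAR,
    AMOUNT    NUMBER(12,2)
);

-- Precompute daily totals per region; Snowflake keeps the materialized
-- view up to date as the base table changes.
CREATE MATERIALIZED VIEW SALES_DAILY_MV AS
SELECT SALE_DATE, REGION, SUM(AMOUNT) AS TOTAL_AMOUNT
FROM SALES
GROUP BY SALE_DATE, REGION;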
Use TM1 to build your OLAP cube, then run Cognos over the top of the TM1 cube.
TM1 will have no problem shaping your Snowflake data into an OLAP structure.
Snowflake is not a multidimensional database; it offers analytical statements like GROUP BY CUBE, as Oracle also does, but this is more like a matrix of aggregations. There is no drill-down or drill-up as offered by SSAS cubes, PowerCubes and other multidimensional databases (MDDBs).
An option could be to simulate OLAP by creating ad hoc aggregations and using JavaScript to drill down / drill up. But in my experience, operations equivalent to drilling often take more than 10 seconds (unless extremely high resources are available). Snowflake is probably not the best solution for such use cases.
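For reference, a GROUP BY CUBE statement as mentioned above returns all aggregation combinations as one flat result set; a sketch against a hypothetical SALES table:

-- Hypothetical SALES(REGION, PRODUCT, AMOUNT) table.
-- CUBE returns totals for every combination of REGION and PRODUCT,
-- plus subtotals per REGION, per PRODUCT, and a grand total:
-- a flat matrix of aggregates rather than a navigable cube.
SELECT REGION, PRODUCT, SUM(AMOUNT) AS TOTAL_AMOUNT
FROM SALES
GROUP BY CUBE (REGION, PRODUCT);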

Time dependent Master data via History tables in SAP HANA

I was looking for the best way to capture historical data in HANA for master data tables without the VALID_TO and VALID_FROM fields.
From my understanding, we have 2 options here.
1. Create a custom history table and run a stored procedure that populates it from the original table. Here we compromise the real-time reporting capability on top of this table.
2. Enable the history table flag in SLT for this table so that SLT creates it as a history table, which solves this problem.
Option 2 looks like a clear winner to me but I would like your thoughts on this as well.
Let me know.
Thanks,
Shyam
You asked for thoughts...
I would not use history tables for modeling time-dependent master data. That is not the way history tables work: think of them as system-versioned temporal tables that use commit IDs for the validity range. There are several posts on this topic in the SAP community.
Most applications I know need application-time validity ranges instead (or sometimes both). Therefore I would rather model the time dependency explicitly using VALID_FROM / VALID_TO columns. This gives you the option, for example, to model temporal joins in calculation views or to query the data using "standard" SQL. ETL tools like EIM SDI or BODS also offer transformations for populating such time-dependent tables, for example "table comparison" or "history preserving". Search the web for "slowly changing dimensions" for the underlying concepts.
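As an illustration of the explicit VALID_FROM / VALID_TO approach, here is a minimal temporal-join sketch in plain SQL, assuming hypothetical CUSTOMER_HIST and ORDERS tables:

-- Hypothetical time-dependent master data table.
CREATE COLUMN TABLE CUSTOMER_HIST (
    CUSTOMER_ID    INTEGER,
    CUSTOMER_GROUP NVARCHAR(20),
    VALID_FROM     DATE,
    VALID_TO       DATE
);

-- Hypothetical transaction table.
CREATE COLUMN TABLE ORDERS (
    ORDER_ID    INTEGER,
    CUSTOMER_ID INTEGER,
    ORDER_DATE  DATE,
    AMOUNT      DECIMAL(15,2)
);

-- Temporal join: pick the master data version that was valid
-- at the time of each transaction.
SELECT o.ORDER_ID, o.AMOUNT, c.CUSTOMER_GROUP
FROM ORDERS o
JOIN CUSTOMER_HIST c
  ON c.CUSTOMER_ID = o.CUSTOMER_ID
 AND o.ORDER_DATE BETWEEN c.VALID_FROM AND c.VALID_TO;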
In the future, temporal tables as defined in SQL:2011 could be an option as well, but I do not know when those will be available in HANA.

Table Relations in GUI using SQL Developer

I have created a connection to the database in SQL Developer. There I can see lots of tables with different dependencies and constraints applied. It is very confusing and time-consuming to inspect the details of each table manually. I would like a GUI view so that I can easily identify which table is the master and see the dependencies of all the other tables. Does SQL Developer provide any kind of tool for this, or is there another method?
You can generate an ER diagram for the objects and their relations:
1. File -> Data Modeler -> Import -> Data Dictionary
2. Choose the database and schema that contain the objects.
3. Choose the objects.

Relational behavior against a NoSQL document store for ODBC support

The first assertion is that document-style NoSQL databases such as MarkLogic and MongoDB should store each piece of information in a nested/complex object.
Consider the following model
<patient>
  <patientid>1000</patientid>
  <firstname>Johnny</firstname>
  <claim>
    <claimid>1</claimid>
    <claimdate>2015-01-02</claimdate>
    <charge><amount>100</amount><code>374.3</code></charge>
    <charge><amount>200</amount><code>784.3</code></charge>
  </claim>
  <claim>
    <claimid>2</claimid>
    <claimdate>2015-02-02</claimdate>
    <charge><amount>300</amount><code>372.2</code></charge>
    <charge><amount>400</amount><code>783.1</code></charge>
  </claim>
</patient>
In the relational world this would be modeled as a patient table, claim table, and claim charge table.
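For comparison, a minimal relational sketch of that model (table and column names are illustrative, derived from the XML above; the CHARGE_ID surrogate key is an assumption):

-- One row per patient.
CREATE TABLE PATIENT (
    PATIENT_ID INTEGER PRIMARY KEY,
    FIRST_NAME VARCHAR(100)
);

-- One row per claim, linked to its patient.
CREATE TABLE CLAIM (
    CLAIM_ID   INTEGER PRIMARY KEY,
    PATIENT_ID INTEGER REFERENCES PATIENT (PATIENT_ID),
    CLAIM_DATE DATE
);

-- One row per charge, linked to its claim.
CREATE TABLE CLAIM_CHARGE (
    CHARGE_ID INTEGER PRIMARY KEY,
    CLAIM_ID  INTEGER REFERENCES CLAIM (CLAIM_ID),
    AMOUNT    DECIMAL(10,2),
    CODE      VARCHAR(10)
);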
Our primary desire is to simultaneously feed downstream applications with this data and also perform analytics on it. Since we don't want to write a complex program for every measure, we should be able to put a tool on top of this. For example, Tableau claims to have a native connection to MarkLogic, which goes through ODBC.
When we create views using range indexes on our document model, the SQL run against them in MarkLogic returns excessive repeating rows. The charge amounts are also double-counted by sum functions. It does not work.
The thought is that through these index, view, and possibly fragment techniques of MarkLogic, we can define a semantic layer that resembles a relational structure.
The documentation hints that you should create one object per table, but this seems to go against the preferred document-database structure.
What is the data modeling and application pattern to store large amounts of document data and then provide a turnkey analytics tool on top of it?
If the ODBC connection is always going to return bad data and not be aware of relationships, then the claim that these tools have ODBC support against NoSQL is not true.
References
https://docs.marklogic.com/guide/sql/setup
https://docs.marklogic.com/guide/sql/tableau
http://www.marklogic.com/press-releases/marklogic-and-tableau-build-connection/
https://developer.marklogic.com/learn/arch/data-model
For your question: "What is the data modeling and application pattern to store large amounts of document data and then provide a turnkey analytics tool on top of it?"
The rule of thumb I use is that when I want to count "objects", I model them as separate documents. So if you want to run queries that count patients, claims, and charges, you would put them in separate documents.
That doesn't mean we're constraining MarkLogic to only relational patterns. In UML terms, a one-to-many relationship can be a composition or an aggregation. In a relational model, I have no choice but to model those as separate tables. But in a document model, I can do separate documents per object or roll them all together - the choice is usually based on how I want to query the data.
So your first assertion is partially true - in a document store, you have the option of nesting all your related data, but you don't have to. Also note that because MarkLogic is schema-agnostic, it's straightforward to transform your data as your requirements evolve (corb is a good option for this). Certain requirements may require denormalization to help searches run efficiently.
Brief example - a person can have many names (aliases, maiden name) and many addresses (different homes, work address). In a relational model, I'd need a persons table, a names table, and an addresses table. But I'd consider the names to be a composite relationship - the lifecycle of a name equals that of the person - and so I'd rather nest those names into a person document. An address OTOH has a lifecycle independent of the person, so I'd make that an address document and toss an element onto the person document for each related address. From an analytics perspective, I can now ask lots of interesting questions about persons and their names, and persons and addresses - I just can't get counts of names efficiently, because names aren't in separate documents.
I guess MarkLogic is a little atypical compared to other document stores. It works best when you don't store an entire table as one document, but rather one record per document. MarkLogic indexing is optimized for this approach and handles searching across millions of documents easily that way. You will see that as soon as you store records as documents, results in Tableau will improve greatly.
Splitting documents into such small fragments also allows for higher performance and a lower footprint. MarkLogic doesn't hold the data as persisted DOM trees that allow random access. Instead, it streams the data in a very efficient way and relies on index resolution to pull relevant fragments quickly.
HTH!

Oracle - How to manage RDF schema and instance data in multiple tables?

I need to store the RDF schema in one table and the RDF instance data in another table in Oracle.
How can I do this?
How do I configure joseki-config.ttl to work with multiple models? An example would help me understand the solution.
Is there any possibility of creating a single model for multiple tables?
Please let me know.
You need to use SDB with Joseki and Oracle. Then you can have persistent datasets (collections of models). There is an example of an SDB configuration in the Joseki download, in joseki-config-sdb.ttl.
SDB controls the database table layout. A model is stored under the default graph name or in the named graphs table. There is no control over other layouts without changing the code of SDB.
Note that TDB, a custom database layer for Jena, scales better and is faster than using a relational database over JDBC. Fuseki is the new version of Joseki.
The Jena users mailing list is jena-users@incubator.apache.org.
