MicroStrategy intelligent cube data aggregation - report

I created a cube that contains five attributes and a metric, and I want to create a document from this cube with different visualisations for each attribute. The problem is that data in the cube is aggregated based on all attributes in the cube, so when you add a grid with one attribute and the metric, the numbers will not be correct.
Is there any way to make the metric dynamically aggregate depending on the attribute in use?

This depends on what kind of metric you have in the cube. The best way to achieve aggregation across all attributes is obviously to have the most granular, least aggregated data in the cube, but understandably this is not always possible.
If your metric is a simple SUM metric, then you can set the dynamic aggregation settings on the metric to just do SUM, and it should perform SUMs appropriately regardless of the attributes you place on your document/report, unless your attribute relationships are not set up correctly or there are many-to-many relationships between some of those attributes.
If your metric is a distinct count metric, then the approach is slightly different and has been covered previously in a few places. Here is one, written for an older version of MicroStrategy, but the logic can still be applied to newer versions:
http://community.microstrategy.com/t5/tkb/articleprintpage/tkb-id/architect/article-id/1695
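To see why a plain SUM survives re-aggregation while a distinct count does not, here is a small illustration outside MicroStrategy itself, as a TypeScript sketch with made-up data:

```typescript
// Toy fact rows: two attributes (region, product), a SUM-able metric
// (revenue) and an id used for a distinct count (customerId).
interface FactRow { region: string; product: string; revenue: number; customerId: number; }

const rows: FactRow[] = [
  { region: "East", product: "A", revenue: 10, customerId: 1 },
  { region: "East", product: "B", revenue: 20, customerId: 1 },
  { region: "West", product: "A", revenue: 30, customerId: 2 },
];

// Re-aggregating SUM is safe: the sum of per-product sums equals the grand total.
const byProduct = new Map<string, number>();
for (const r of rows) byProduct.set(r.product, (byProduct.get(r.product) ?? 0) + r.revenue);
const sumOfSums = [...byProduct.values()].reduce((a, b) => a + b, 0); // 60
const grandTotal = rows.reduce((a, r) => a + r.revenue, 0);           // 60, identical

// Re-aggregating a distinct count is NOT safe: summing per-product distinct
// customer counts double-counts customer 1, who appears under both products.
const perProduct = new Map<string, Set<number>>();
for (const r of rows) {
  if (!perProduct.has(r.product)) perProduct.set(r.product, new Set());
  perProduct.get(r.product)!.add(r.customerId);
}
const summedDistinct = [...perProduct.values()].reduce((a, s) => a + s.size, 0); // 3
const trueDistinct = new Set(rows.map(r => r.customerId)).size;                  // 2
```

This is why a distinct count metric needs the special handling described in the linked article rather than a simple dynamic aggregation setting.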

Related

Local Cube - Is there a reason to use OLTP's grain?

I am building a local OLAP cube based on data gathered from several OLTP sources. Please note that I am doing this programmatically and do not have access to tools like SSAS or MDX-based tools.
My requirements are somewhat different than the operational requirements of the OLTP system users. I know that "in theory" it would be preferable to retain the most atomic grain available to me, but I don't see a reason to include the lowest level of data in the cube.
For example (I am simplifying), I have a measure field like "Price". Additionally, each sales fact has a Version attribute with values such as:
List (Original/Initial)
Initial Quote
Adjusted Quote
Sold
These describe the internal development of our pricing and are critical to the reports that I create.
However, for my reporting purposes, I will always want to know the value of all Versions whenever I am referencing a given transaction. Therefore, I am considering pivoting measures like Price by Version in the cube (Version will still be its own entity in the data model), resulting in measures like:
PriceList
PriceQuotedInitial
PriceQuotedAdjusted
PriceSold
Since only one Version is ever effective at a given point in time, we do not need to aggregate across multiple Versions.
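For concreteness, a minimal sketch of the pivot I am describing (TypeScript; the types and field names are illustrative rather than my actual schema):

```typescript
// One source row per (transaction, Version); the pivot yields one wide row per transaction.
interface SourceRow {
  transactionId: number;
  version: "List" | "InitialQuote" | "AdjustedQuote" | "Sold";
  price: number;
}
interface PivotedRow {
  transactionId: number;
  priceList?: number;
  priceQuotedInitial?: number;
  priceQuotedAdjusted?: number;
  priceSold?: number;
}

const columnFor: Record<SourceRow["version"], keyof Omit<PivotedRow, "transactionId">> = {
  List: "priceList",
  InitialQuote: "priceQuotedInitial",
  AdjustedQuote: "priceQuotedAdjusted",
  Sold: "priceSold",
};

function pivotByVersion(rows: SourceRow[]): PivotedRow[] {
  const out = new Map<number, PivotedRow>();
  for (const r of rows) {
    const row = out.get(r.transactionId) ?? { transactionId: r.transactionId };
    // At most one row exists per (transaction, Version), so this is a plain
    // assignment with no aggregation across Versions.
    row[columnFor[r.version]] = r.price;
    out.set(r.transactionId, row);
  }
  return [...out.values()];
}
```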
Known Advantages
Since this will be a local cube file, it appears this approach would:
Simplify the creation of several required calculated measures that compare Price across different Versions (creating calculated measures at various levels of aggregation would not be an issue if I were doing this with MDX).
Reduce the number of records by a factor of between 3 and 6, which would significantly boost performance for a local cube.
Known Disadvantages
While the data model will match the business process, the cube would not store the data at the most atomic level. An analyst would need to distinguish between Versions by Measure selection, and could not filter by Version - they would always get all available Versions.
This approach will greatly increase the number of Measures. For example, there is not just one Price we are tracking, but several price components and other Measures we track for each transaction. So if we track a dozen true Measures for each transaction, that might end up being 50-60 Measures if I take this approach.
I understand that for very large Fact tables it would be preferable to factor all possible fields out of the Fact table into Dimensions for performance purposes. However, I am not sure whether this is the case when using a local cube, as in all likelihood I will put fewer than 50,000 records into any given cube file, given the limitations of local cubes.
Are there other drawbacks to this approach that I'm missing?

Why is this so in Crossfilter?

In the Crossfilter documentation, it states the following.
a grouping intersects the crossfilter's current filters, except for the associated dimension's filter. Thus, group methods consider only records that satisfy every filter except this dimension's filter. So, if the crossfilter of payments is filtered by type and total, then group by total only observes the filter by type.
What is the reasoning behind that and what is the way around it?
The reason is that Crossfilter is designed for filtering on coordinated views. In this scenario, you are usually filtering on a dimension that is visualized, and you want to see the other dimensions change based on your filter. But the dimension where the filter is defined should stay constant, partly because updating it would be redundant (the filter mechanism is usually displayed visually already) and partly because you don't want your dimension values to jump around while you are trying to filter on them.
In any case, to get around it you can define two identical dimensions on the same attribute. Use one dimension for filtering and the other for grouping. This way, as far as Crossfilter is concerned, your filtering dimension and your grouping dimension are separate.
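A minimal sketch of this workaround, assuming the payments example from the documentation and the maintained crossfilter2 package:

```typescript
import crossfilter from "crossfilter2";

interface Payment { type: string; total: number; }

const payments: Payment[] = [
  { type: "cash", total: 90 },
  { type: "tab",  total: 100 },
  { type: "visa", total: 200 },
];

const cf = crossfilter(payments);

// Two identical dimensions over the same attribute:
// one is only ever filtered, the other is only ever grouped.
const totalFilterDim = cf.dimension(d => d.total);
const totalGroupDim  = cf.dimension(d => d.total);

// Groups built on totalGroupDim DO observe filters applied via totalFilterDim,
// because a group is only exempt from its own dimension's filter.
const countByTotal = totalGroupDim.group();

totalFilterDim.filter([100, 300]); // keep payments with 100 <= total < 300
console.log(countByTotal.all());   // the 90 bucket now has a count of 0
```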

Ranges in Drupal Views

I have a content type that stores two numerical values, effectively the minimum and maximum of a range.
I would like to configure the views module filter so that it will display nodes where the node range is contained within or overlaps a range specified in the view.
Views does not allow mixed OR and AND filters. You can configure an existing filter to show all nodes where N > 30 AND N < 50 (the 'between' operator, excluding 30 and 50).
If you want more complex filters, e.g. filters that carry business logic or filters that create either/or conditions, you can define them yourself through hook_views. This is badly documented and requires a lot of googling and reading of existing filter code.
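The condition such a custom filter needs to express is simple boolean logic: two ranges overlap (partially or by containment) exactly when each one starts no later than the other ends. A sketch of the predicate, in TypeScript purely for illustration, since a real Views filter would emit the equivalent condition in SQL:

```typescript
// Node stores [nodeMin, nodeMax]; the view's exposed filter supplies [filterMin, filterMax].
function rangesOverlap(nodeMin: number, nodeMax: number,
                       filterMin: number, filterMax: number): boolean {
  return nodeMin <= filterMax && nodeMax >= filterMin;
}

rangesOverlap(10, 20, 15, 30); // true:  partial overlap
rangesOverlap(10, 20, 12, 18); // true:  filter range contained in node range
rangesOverlap(10, 20, 25, 30); // false: disjoint ranges
```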

How do I add KPI targets to my cube that are at a higher grain to my fact table?

I have a simple star schema with two dimensions: course and student. My fact table is an enrolment on a course. I have KPI Values set up which use data in the fact table (e.g. the percentage of students that completed a course). All is working great.
However, I now need to add KPI Goals that are at a different grain to the fact table. The goals are at the course level, but should also work at department level, and for whatever combination of dimension attributes is selected. I have the numerators and denominators for the KPI Goals, so I want to aggregate these when there are multiple courses involved - before dividing to get the actual percentage goal.
How can I implement this? From my understanding I should only have one fact table in my star schema. So in that case would I perhaps store the higher grain values in the fact table? Or would I create a dimension with these values in? Or some alternative solution?
In most cases I would expect KPI measures to be calculated from the existing measures in your cube, so can you get away from the idea of fact table changes and just set the KPIs up as calculated members in the cube or in MDX?
Your issue is complicated by the KPI granularity being different, yes... but I would just hide the KPI measures when such a level of granularity is being displayed. You can implement this within the calculated measure definition too.
For example, I have used ISLEAF() to detect whether a measure is about to be shown at the bottom level and return blank/NULL if so. Or you can check the level number of any relevant dimensions.
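Whichever route you take, the arithmetic described in the question (sum the numerators and denominators separately, then divide once) is the right way to roll a ratio up. A small TypeScript sketch of that calculation, with hypothetical field names:

```typescript
interface CourseGoal { course: string; department: string; goalNumerator: number; goalDenominator: number; }

// Roll a ratio KPI goal up to department level: aggregate numerator and
// denominator separately, then divide once. Averaging per-course percentages
// would weight small and large courses equally and give the wrong answer.
function departmentGoal(goals: CourseGoal[], department: string): number {
  const inDept = goals.filter(g => g.department === department);
  const num = inDept.reduce((sum, g) => sum + g.goalNumerator, 0);
  const den = inDept.reduce((sum, g) => sum + g.goalDenominator, 0);
  return den === 0 ? NaN : num / den;
}
```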

How do you design an OLAP Database?

I need a mental process to design an OLAP database...
Essentially for standard relational it'd be (loosely):
Identify Entities
Identify Relationships
Identify Properties of Entities
For each property:
Ensure property can be related to only one entity
Ensure property is directly related to entity
For OLAP databases, I understand the terminology, the motivation and the structure; however, I have no clue as to how to decompose my relational model into an OLAP model.
Identify Dimensions (or By's)
These are anything that you may want to analyse or group your report by. Every table in the source database is a potential Dimension. Dimensions should be hierarchical if possible; e.g. your Date dimension should have a year, month, day hierarchy, and similarly Location should have, for example, a Country, Region, City hierarchy. This will allow your OLAP tool to calculate aggregations more efficiently.
Identify Measures
These are the KPIs, or the actual numerical information your client wants to see. They are usually capable of being aggregated, so any non-flag, non-key numeric field in the source database is a potential Measure.
Arrange these in a star schema, with the Measures in the central 'Fact' table and FK relations to the applicable Dimension tables. Measures should be stored at the lowest dimension hierarchy level.
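As a concrete picture of that arrangement, here is a minimal star schema sketched as TypeScript types (the table and column names are purely illustrative):

```typescript
// Dimension tables: one row per member, with hierarchy levels as columns.
interface DimDate     { dateKey: number; day: number; month: number; year: number; }
interface DimLocation { locationKey: number; city: string; region: string; country: string; }
interface DimProduct  { productKey: number; product: string; category: string; }

// Fact table: a foreign key to each dimension plus the measures, stored at
// the lowest hierarchy level (day, city, product).
interface FactSales {
  dateKey: number;     // FK -> DimDate
  locationKey: number; // FK -> DimLocation
  productKey: number;  // FK -> DimProduct
  salesAmount: number; // measure
  unitsSold: number;   // measure
}
```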
Identify the 'Grain' of the fact table; this is essentially the 'level of detail' held. It is usually determined by the reporting requirements, the data granularity available in the source, and the performance requirements of the reporting solution. You may identify the grain as you go, or you may approach it as a final step once all the important data has been identified. I tend to have a final step to ensure the grain is consistent between my fact tables.
The final step is identifying slowly changing dimensions and the requirements for handling them. For example, if the customer dimension includes an element of their address and they move, how is that to be handled?
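For the address example, one common option is a 'Type 2' slowly changing dimension, which keeps one row per version of the member with validity dates; a minimal sketch, again with illustrative field names:

```typescript
// Type 2 slowly changing dimension: when a customer moves, the current row is
// closed off and a new row is added. Facts join on the surrogate key, so each
// fact keeps pointing at the version of the customer that was valid at the time.
interface DimCustomer {
  customerKey: number;  // surrogate key, unique per version
  customerId: string;   // business key, shared by all versions of the customer
  city: string;
  validFrom: Date;
  validTo: Date | null; // null marks the current version
}
```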
One important point when identifying the Dimensions and Measures is the final cardinality that you are electing for the model.
Let's say that data entry into your relational database happens throughout the day. You may not need to visualize or aggregate the measures by hour, or even by day; you can choose a weekly granularity, or monthly, etc.
