I have an OLAP cube containing a numeric measure that represents a foreign key to a parent table. I would like to display the 'name' field from the parent table instead of the measure in my reports. Is it possible?
Do you have a dimension based on this table? If you have one, then it should work by design (though not as a measure but as a dimension attribute member name).
If you do not have such a dimension, then that information (the name attribute) is not included in the cube.
Also consider that a typical MDX query is not at the granularity level of the fact table. For a specific coordinate, say Year 2007 in a fact table that contains monthly data, there would probably be several values of that measure (one per fact row), and the default behavior is to aggregate them using one of the available aggregation functions. An aggregated foreign key, such as a sum of keys, has no meaning.
If, on the other hand, it was a dimension key, then you could display the count of fact records in Year 2007, or the different dimension members that crossjoin with Year 2007 in the fact table.
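To make the difference concrete, here is a minimal sketch in plain Python rather than MDX. The table contents and the names `parent`, `facts`, `sum_as_measure`, and `count_by_member` are made up for illustration; they are not part of any OLAP product's API.

```python
from collections import defaultdict

# Hypothetical parent (dimension) table: key -> name
parent = {1: "North", 2: "South", 3: "West"}

# Hypothetical monthly fact rows: (year, month, parent_key)
facts = [
    (2007, 1, 1),
    (2007, 2, 3),
    (2007, 3, 3),
    (2008, 1, 2),
]

def sum_as_measure(rows, year):
    """Aggregate the key as if it were a measure; the total means nothing."""
    return sum(key for y, _, key in rows if y == year)

def count_by_member(rows, year):
    """Treat the key as a dimension key: count fact rows per member name."""
    counts = defaultdict(int)
    for y, _, key in rows:
        if y == year:
            counts[parent[key]] += 1
    return dict(counts)

print(sum_as_measure(facts, 2007))   # 7, a sum of keys, not a usable value
print(count_by_member(facts, 2007))  # {'North': 1, 'West': 2}
```

The second function is roughly what a dimension built on the parent table gives you: names on the axis and a count (or another real measure) in the cells.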
Could you be more specific in your question?
Regards,
Hrvoje
Related
I am new to OLAP. If I have two fact tables, can they share the same dimension table?
A good example would be if I have tables fact1 and fact2: can they both have a foreign key into a single Date dimension (dimDate) table? Or do I need to (or should I) create a separate dimDate dimension table for each fact?
To me, and based on my research, I don't see any downside to sharing a dim table, but I wanted to check.
Thanks!
They can, and should.
That's the whole point of conformed dimensions, keeping the attributes in a single place, so as to avoid multiple versions of truth coming from different fact tables.
So a single date dimension, with all the necessary attributes for each fact table, which is then linked from each fact table that needs it.
Same for a customer dimension. If you have a sales fact table that needs customer info such as billing address and a marketing fact table that holds info about campaigns each customer can benefit from, you would combine all those attributes in a single table. Some customers may not be referenced in the marketing fact table, others may not be referenced in the sales fact table, but all would exist in the single customer dimension, which is your single source of truth about who your customers are.
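As a rough illustration of the shared date dimension, the sketch below uses plain Python dictionaries in place of tables. The names `dim_date`, `fact1`, `fact2`, the `date_key` column, and the measures `sales` and `visits` are assumptions for the example only.

```python
# One shared (conformed) date dimension, keyed by a hypothetical date key.
dim_date = {
    20240101: {"year": 2024, "month": 1},
    20240215: {"year": 2024, "month": 2},
}

# Two hypothetical fact tables, each with a foreign key into dim_date.
fact1 = [
    {"date_key": 20240101, "sales": 100.0},
    {"date_key": 20240215, "sales": 50.0},
]
fact2 = [
    {"date_key": 20240101, "visits": 7},
    {"date_key": 20240101, "visits": 3},
]

def total_by_month(fact_rows, measure):
    """Sum a measure per (year, month), using attributes from the shared dimension."""
    totals = {}
    for row in fact_rows:
        d = dim_date[row["date_key"]]
        key = (d["year"], d["month"])
        totals[key] = totals.get(key, 0) + row[measure]
    return totals

print(total_by_month(fact1, "sales"))   # {(2024, 1): 100.0, (2024, 2): 50.0}
print(total_by_month(fact2, "visits"))  # {(2024, 1): 10}
```

Because both facts resolve their dates through the same table, "January 2024" means the same thing in both results, which is the point of a conformed dimension.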
I'm trying to figure out the best way to link a dimension to a fact table and having some trouble finding an example in the documentation. All of my data sources are csv files and I have the following 2 data tables:
Areas table with columns: Area,DateTime,Cost (Area,DateTime are unique)
Company table with columns: CompanyID,Area (CompanyID is unique and represents a dimension)
I would like to link the Company dimension to the Cost measure from the Areas table. However, it seems that I can only link the Company dimension to the Areas measure group through the dimension key, which is CompanyID. Is there a way around this, or do I need to add a CompanyID column to my Areas csv file prior to loading?
thanks
I'm not really sure what you want to achieve, but it looks as if you could use a many-to-many cube to create a link from Area to Company (even though it might be a one-to-one relation).
1) Create a many-to-many cube using your Company table (Advanced/Many-to-Many)
2) Bind the Company dimension to the fact table (Areas) using the defined many-to-many relation.
Some documentation here (the first image is wrong).
hope it helps.
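The idea behind that link can be sketched outside the tool. The snippet below is not the product's many-to-many feature, just plain Python showing how the Company table can act as a bridge from CompanyID to Area so that Cost can be read per company. The sample rows and the function name `cost_by_company` are hypothetical.

```python
# Areas fact rows: (Area, DateTime, Cost)
areas = [
    ("A1", "2024-01-01T00:00", 10.0),
    ("A1", "2024-01-02T00:00", 5.0),
    ("A2", "2024-01-01T00:00", 8.0),
]

# Company table used as a bridge: CompanyID -> Area
company_area = {"C1": "A1", "C2": "A1", "C3": "A2"}

def cost_by_company(area_rows, bridge):
    """Total Cost per area, then look it up for each company via the bridge."""
    cost_per_area = {}
    for area, _, cost in area_rows:
        cost_per_area[area] = cost_per_area.get(area, 0.0) + cost
    return {cid: cost_per_area.get(area, 0.0) for cid, area in bridge.items()}

print(cost_by_company(areas, company_area))
# {'C1': 15.0, 'C2': 15.0, 'C3': 8.0}
```

Note that C1 and C2 both see the full cost of area A1. That is the usual many-to-many behavior: per-company values are not additive, so the grand total should still come from the Areas rows (23.0 here), not from summing the companies.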
I am dealing with a roster with 15,000 unique employees. Depending on their 'Designation' they either impact performance or do not. The issue is, these employees could change their designation any day. The roster is as simple as this:
AgentID
AgentDesignation
Date
I feel like I would be violating some Normalization rules if I just have duplicate values (the agent has the same designation from the previous day, for example). Would I really want to create a new row for each date even if the Designation is the same? I want to always be able to get the agent's correct designation on a particular date.
All calculations are done with Excel, probably with Vlookup. Anyone have some tips?
The table structure you propose would not be a violation of normalization -- it contains a PRIMARY KEY (AgentID, Date) and a single attribute that is dependent on all elements of the key (AgentDesignation). Furthermore, it's easy (using the PRIMARY KEY constraint) to ensure that there is one-and-only-one designation per agent per day. The fact that many PRIMARY KEY values will yield the same dependent value does not mean the database is not correctly normalized.
An alternative approach using date ranges would likely result in fewer rows but guaranteeing integrity would be harder and searches for a particular value would be costlier.
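A small sketch of the two layouts, in Python rather than Excel, under the assumption that a designation stays in force until the next change. The sample agent, the designation labels, and the names `daily`, `changes`, and `designation_on` are invented for illustration.

```python
from bisect import bisect_right
from datetime import date

# Layout 1: one row per agent per day, keyed by (AgentID, Date).
daily = {
    ("A100", date(2024, 1, 1)): "Impacting",
    ("A100", date(2024, 1, 2)): "Impacting",
    ("A100", date(2024, 1, 3)): "Non-impacting",
}

# Layout 2: one row per change, sorted by effective date for each agent.
changes = {
    "A100": [
        (date(2024, 1, 1), "Impacting"),
        (date(2024, 1, 3), "Non-impacting"),
    ],
}

def designation_on(agent_id, day):
    """Return the designation in force on `day`, or None if none is recorded yet."""
    history = changes.get(agent_id, [])
    starts = [start for start, _ in history]
    i = bisect_right(starts, day)  # number of changes effective on or before `day`
    return history[i - 1][1] if i else None

# Layout 1 is a direct key lookup; layout 2 needs a range search.
print(daily[("A100", date(2024, 1, 2))])         # Impacting
print(designation_on("A100", date(2024, 1, 2)))  # Impacting
print(designation_on("A100", date(2024, 1, 3)))  # Non-impacting
```

The daily layout maps directly onto an exact-match lookup on AgentID plus Date, while the range layout needs an approximate-match search and extra care that the ranges for an agent never overlap.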
I have a simple star schema with 2 dimensions; course and student. My fact table is an enrolment on a course. I have KPI Values set up which use data in the fact table (e.g. percentage of students that completed course). All is working great.
I now need to add KPI Goals though that are a different grain to the fact table. The goals are at the course level, but should also work at department level, and for whatever combination of dimension attributes are selected. I have the numerator and denominators for the KPI Goals so want to aggregate these when there are multiple courses involved - before dividing to get the actual percentage goal.
How can I implement this? From my understanding I should only have one fact table in my star schema. So in that case would I perhaps store the higher grain values in the fact table? Or would I create a dimension with these values in? Or some alternative solution?
In most cases I would expect KPI measures to be calculated from the existing measures in your cube, so can you get away from the idea of fact table changes, and just set up KPIs as calculated members in the cube or MDX?
Your issue is complicated by the KPI granularity being different, yes...but I would just hide KPI measures when such a level of granularity was being displayed. You can implement this within the calculated measure definition too.
For example, I have used ISLEAF() to detect if a measure is about to be shown at the bottom level, and return blank/NULL. Or you can check the level number of any relevant dimensions.
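For the aggregation the question describes (sum the goal numerators and denominators over the selected courses, then divide), a minimal Python sketch follows. It is not MDX; the course names, departments, numbers, and the function `goal_pct` are hypothetical, and it returns None where a cube would show a blank.

```python
# Hypothetical goal rows at course grain: course -> (department, numerator, denominator)
goals = {
    "Math101": ("Science", 45, 50),
    "Phys101": ("Science", 15, 150),
    "Hist101": ("Arts", 30, 40),
}

def goal_pct(courses):
    """Sum numerators and denominators first, then divide (a ratio of sums)."""
    num = sum(goals[c][1] for c in courses)
    den = sum(goals[c][2] for c in courses)
    return None if den == 0 else num / den

def courses_in(department):
    """All courses whose goal row belongs to the given department."""
    return [c for c, (dept, _, _) in goals.items() if dept == department]

print(goal_pct(["Math101"]))             # 0.9
print(goal_pct(courses_in("Science")))   # 0.3, i.e. 60 / 200
print(goal_pct([]))                      # None (nothing selected)
```

Averaging the two Science course percentages instead would give about 0.5, which over-weights the small course; summing first is why the numerator and denominator need to be kept as separate additive values.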
I need a mental process to design an OLAP database...
Essentially for standard relational it'd be (loosely):
Identify Entities
Identify Relationships
Identify Properties of Entities
For each property:
Ensure property can be related to only one entity
Ensure property is directly related to entity
For OLAP databases, I understand the terminology, the motivation and the structure; however, I have no clue as to how to decompose my relational model into an OLAP model.
Identify Dimensions (or By's)
These are anything that you may want to analyse or group your report by. Every table in the source database is a potential Dimension. Dimensions should be hierarchical if possible, e.g. your Date dimension should have a year, month, day hierarchy. Similarly, Location should have, for example, a Country, Region, City hierarchy. This will allow your OLAP tool to calculate aggregations more efficiently.
Identify Measures
These are the KPIs or the actual numerical information your client wants to see. They are usually capable of being aggregated, so any non-flag, non-key numeric field in the source database is a potential measure.
Arrange in star schema, with Measures in the center 'Fact' table, and FK relations to applicable Dimension tables. Measures should be stored at the lowest dimension hierarchy level.
Identify the 'Grain' of the fact table. This is essentially the 'level of detail' held. It is usually determined by the reporting requirements, the data granularity available in the source and the performance requirements of the reporting solution. You may identify the grain as you go, or you may approach it as a final step once all the important data has been identified. I tend to have a final step to ensure the grain is consistent between my fact tables.
The final step is identifying slowly changing dimensions, and the requirements for these. For example if the customer dimension includes an element of their address and they move, how is that to be handled.
One important point in identifying the Dimensions and Measures is the final granularity (level of detail) that you choose for the model.
Let's say that data is entered into your relational database throughout the day.
Maybe you don't need to visualize or aggregate the measures by hour, or even by day. You can choose a weekly or monthly granularity, etc.
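As a sketch of what choosing a coarser grain means in practice, the Python snippet below rolls timestamped source rows up to day, ISO week, or month before they would be loaded into a fact table. The sample rows and the function name `roll_up` are made up for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical source rows entered throughout the day: (timestamp, amount)
rows = [
    (datetime(2024, 1, 5, 9, 30), 10.0),
    (datetime(2024, 1, 5, 17, 0), 2.5),
    (datetime(2024, 1, 20, 8, 0), 4.0),
    (datetime(2024, 2, 1, 12, 0), 1.0),
]

def roll_up(source_rows, grain):
    """Sum amounts at the chosen grain: 'day', 'week' (ISO week) or 'month'."""
    totals = defaultdict(float)
    for ts, amount in source_rows:
        if grain == "day":
            key = ts.date().isoformat()
        elif grain == "week":
            iso = ts.isocalendar()  # (ISO year, ISO week number, weekday)
            key = f"{iso[0]}-W{iso[1]:02d}"
        elif grain == "month":
            key = f"{ts.year}-{ts.month:02d}"
        else:
            raise ValueError(f"unknown grain: {grain}")
        totals[key] += amount
    return dict(totals)

print(roll_up(rows, "day"))    # {'2024-01-05': 12.5, '2024-01-20': 4.0, '2024-02-01': 1.0}
print(roll_up(rows, "month"))  # {'2024-01': 16.5, '2024-02': 1.0}
```

A coarser grain means fewer fact rows and faster queries, but once the data is stored at month level the hourly or daily detail cannot be recovered from the cube.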