this question is very specific to the Concur implementation of the IBM Cognos Report Studio tool since it primarily focuses on the data model used therein.
It contains business travel expense information including the travel itineraries, which are the main source of this report:
Itinerary Example
My goal now is to create a report showing which destination countries the employees traveled to (all countries if more than one country was visited in a single itinerary), how many single business trips were taken to that destination country/countries, the average duration of the business trips to that destination country/countries and the number of all trips. If possible, duration of stay by country would be great, but i have no idea how to go about that. Mockup
Using a repeater based on the field [Arrival Country] from the itinerary, i managed to get something that looks what i am trying to achieve, but it somehow does not include the home country once i cut out the other identifying columns (Itinerary Key, Departure Country, Arrival Country).
I then did a count(distinct([Itinerary Key] for [Arrival Country Repeater] which gave me numbers, but i am not really sure they are correct in this case.
Repeater
Also, as soon as i add query calculations to include the average duration, the repeater fields go blank.
Is there another way to get the report i want to build?
Are there major flaws with my attempt?
Thanks a ton for any and all suggestions!
Related
Im learning dimensional modeling and Im trying to create a model. I was thinking about a social media platform which rates hotels. The platform has following data:
hotel information: name and address
a user can rate hotels (1-5 points)
a user can write comments
platform stores the date of the comments
hotel can answer via comment and it stores the date of it
the platform stores the total number of each rating level (i.e.: all rates with 1 point, all rates with 2 point etc.)
platform stores information of the user: sex, name, total number of votes he/she made and address
First, I tried to define which information belongs to a dimension or fact table
(here I also checked which one is additive/semi additive/non-additive)
I realized my example is kind of difficult, because it’s hard to decide if it belongs to a fact table or dimension.
I would like to hear some advice. Would someone agree with my model?
This is how I would model it:
Hotel information -> hotel dimension
User rating -> additive fact – because I can aggregate them with all dimensions
User comment -> semi additive? – because I can aggregate them with the date dimension (I don’t know if my argument is correct, but I know I would have new comments every day, which is for me a reason to store it in a fact table
Answer as comment -> same handling like with the user comments
Date of comment-> dimension
Total Number of all votes (1/2/3/4/5) -> semi-additive facts – makes no sense to aggregate them, since its already total but I would get the average
User information sex and name, address -> user-dimension
User Information: total number of votes -> could be dimension or fact. It depends how often it changes. If it changes often, I store it in a fact. If its not that often, then dimension
I still have question, hope someone can help me:
My Question: should I create two date dimensions, or can I store both information in one date dimension?
2nd Question: each user and hotel just have one address. Are there arguments, to separate the address dimension in a own hierarchy? Can I create a 1:1 relationship to a user dimension and address dimension?
For your model, it looks well considered, but here are some thoughts:
User comment (and answers to comments): they are an event to be captured (with new ones each day, as you mention) so are factual, with dimensionality of the commenter, type of comment, date, and the measure is at least a 'count' which is additive. But you don't want to store big text in a fact so you would put that in a dimension by itself which is 1:1 with the fact, for situations where you need to query on the comment itself.
Total Number of all votes (1/2/3/4/5) are, as you say, already aggregates, mostly for performance. Totals should be easy from the raw data itself so maybe not worthwhile to store them at all. You might also consider updating the hotel dimension with columns (hotel A has 5 '1' votes and 4 '2' votes) that you'd update as you go on, for easy filtering and categorisation.
User Information: total number of votes: it is factual information about a user (dimension) and it depends on whether you always just want to 'find it out' about a person or whether you are likely to use it to filter other information (i.e. show me all reviews for users who have made 10-20 votes). In that case you might store the total in the user dimension (and/or a banding, like 'number of reviews range' with 10-20, 20-30). You can update dimensions often if you need to, but you're right, it could still just live as a fact only.
As for date dimensions, if the 'grain' is 'day' then you only need one dimension, that you refer to from multiple facts.
As for addresses, you're right that there are arguments on both sides! Many people separate addresses into their own dimension, referred to from the other dimensions that use them. Kimball suggests you can do that behind the scenes if necessary, but prefers for each dimension to have its own set of address columns(but modelled as consistently as possible).
I have a stock dimension set up on several items. I don't want to add dimensions I just want a current stock dimension that is enabled to be recognised during financial update of stock. I thought I could create a new dimension group and add the same dimensions as the previous one except I have checked serial numbers to be part of financial stock update. I just want to know if this is ok?
You can set up a new tracking dimension group where the flag "Financial inventory" is active for dimension "Serial number", yes. Since the gui allows you to do that, this is ok from a technical point of view (if you are asking if this makes sense as a business requirement, this would be the wrong place to ask that).
You may run into problems if you want to change the tracking dimension group on an existing product. I think you can only do this as long as the product has not been used in any transactions yet.
I have two database tables
this is a sample database of a Ticketing system.
Figure 1: Sample table of air ticket.
Figure 2: Sample table of tax.
Requirement:
When ticket is made from the interface, it has multiple taxes of different names every time.
How can I store this information i.e. 'n' number of taxes for each ticket with different names every time.
I have tried to make many to many relationship but the problem is:
For each ticket if the tax is not setup, then need to add the tax first.
Any optimal solution for this?
"the problem is: For each ticket if the tax is not setup, then need to
add the tax first."
This is not a real-life problem. In real life governments declare taxes well in advance of collecting them, This gives organizations sufficient time to amend their systems which need to handle taxes. Tax is never a surprise.
"But this is very tiring solution for the end user.... to make bunch
of tax setup for each ticket"
This sort of thing is reference data, and is the duty of the system developer (hint: that's you) to populate the reference data tables. Or at least provide a screen where the user can create or amend various taxes. This is a different function from defining a ticket type.
The Ticket Creation screen should have a drop-down list (or similar widget) displaying all the existing taxes, which allows the user to pick the relevant one(s). If you reall think it's necessary you can include a link to the Create Tax screen, but that really is a very confusing workflow.
If the commentators are correct, and this is a ticket purchasing function, then your design is seriously wrong. Sales taxes must be included automatically to the cost of the purcahse as part of the transaction. Otherwise nobody would pay any tax.
I have a database that tracks employee’s data for the current year and previous years.
In examples below I will use calendar years (years start in Jan and end in Dec)- this is not always the case, some users have their year running from July to June- or April to March, etc.
There are many tables that, with a few heavy calculations, make a view of the employee’s data at this point in time.
This current year’s data is what users look at mostly. But data from previous years impact on this year (so, a change made to the 2008 data- will have a knock on effect on 2009, and then into 2010 and so onto this year).
Obviously, this has a negative impact on performance when viewing reports as viewing this year’s data will mean trudging through all the previous years- calculating and creating views until the end result is found. As the application ages, this problem will get worse and worse- say in 2015, anybody using the system from its inception (2008) will be waiting for a long time to get their data.
We plan to freeze previous years data so instead of having data from 2008, 2009, 2010 and this year’s data- we would have one block with the previous year’s data (with all calculations done for those previous years) and this year’s data.
In this way, we would have the end results data for all the previous year’s already calculated and we would only need to add to this year to get the final result.
Obviously, we would have to prevent users from entering/updating data in previous years.
My question is what is the best way to achieve this? I presume you would need some process that waits until the new year and does some calculations.
Thanks in advance,
ViperMAN.
the approach you describe is normally referred as data archiving, you can have some queries a DBA runs manually every year the first working day after the new year party so the calculated data is prepared and stored.
Also, your application needs to deny users to modify previous years data, if I have got it right.
One approach I was thinking works as follows:
Create a table that holds the result of the previous years calculations.
Prevent all addtions/deletions/updates to previous years from the app tier.
Change reporting so that queries would consult this table instead of trudging along, calculating everything out each time.
Have a daily process that would:
Check if today was the first day of an employees year-
If yes, get all of the elapsed year's data and add them to all the previous years.
Obviously, this is a simlified version- but one that I think could work.
Thoughts?
If you have data that should not be edited, and you can define what that data is, then I would use a combination of stored procedures and security settings to ensure the old data stays accurate.
If you used stored procedures as a filter, you can have logic in your stored procedure that checks the record against the current DateTime and only allows the update if everything fits your requirements.
I want to develop some crosstab also know as pivot reports in Asp.net with x-axis and y-axis being dynamics, allowing grouping by row and column, for example: have products in y-axis and date in x-axis having in body number of sells of a given product in a given date, if date in x-axis are years, i want subtotals for each month for a product (row) and subtotals of sells of all products in date (column)
I know there are products available to build reports, but i am using Mysql, so Reporting Service is not an option. It's not necessary for the client build additional reports, i think the simplest solution is having a control to display such information and not using crystal report (which is not free) or something more complex, i want to know if is there an available free control to reach my goal.
Well, does anybody know a control or have a different idea,
thanks in advance.
You don't need SQL Server to run Reporting Services reports.
gotreportviewer.com have all the info (look for Generate RDLC dynamically - Matrix in the right column)