I recently came across the star schema and its advantages. I am new to Power BI and database modeling.
I am making a Power BI Report.
I planned hours for every member of my team on each project, multiplied those hours by each member's hourly rate, and got the budget for every project for the year. This is my first table. It has the fields EmployeeName, Hours, Date, ProjectName, TotalCost.
I get the monthly timesheet from SAP after the team members have entered their hours. The fields are nearly the same: EmployeeName, PostingDate, Hours, Cost, ProjectName.
Now, I need to make a report where I see the difference between what was planned and what I get from my Timesheet. And how much budget is remaining for the next two months.
I should be able to filter this for each employee, each project and between any month.
I do not know what should be my fact table and what should be my dimension tables.
How do I use a star schema in this case?
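One possible arrangement, sketched here in plain Python rather than Power BI, is to treat the planned hours and the SAP timesheet as two fact tables that share the same employee, project, and date dimensions. The keys, sample rows, and function names below are hypothetical, and the date dimension is reduced to a month string to keep the sketch short.

```python
# Dimension tables: one row per employee / project (hypothetical sample data).
dim_employee = {1: "Ann", 2: "Bob"}
dim_project = {10: "Alpha", 11: "Beta"}

# Fact rows: (employee_key, project_key, month, hours, cost).
fact_planned = [
    (1, 10, "2024-01", 40, 2000.0),
    (2, 10, "2024-01", 20, 800.0),
    (1, 11, "2024-02", 10, 500.0),
]
fact_actual = [
    (1, 10, "2024-01", 35, 1750.0),
    (2, 10, "2024-01", 25, 1000.0),
]


def cost_by_project(fact_rows, employee_key=None, months=None):
    """Sum cost per project name, optionally filtered by employee and months."""
    totals = {}
    for emp, proj, month, _hours, cost in fact_rows:
        if employee_key is not None and emp != employee_key:
            continue
        if months is not None and month not in months:
            continue
        name = dim_project[proj]
        totals[name] = totals.get(name, 0.0) + cost
    return totals


def remaining_budget(employee_key=None, months=None):
    """Planned cost minus actual cost per project, using the same filters on both facts."""
    planned = cost_by_project(fact_planned, employee_key, months)
    actual = cost_by_project(fact_actual, employee_key, months)
    projects = sorted(set(planned) | set(actual))
    return {p: planned.get(p, 0.0) - actual.get(p, 0.0) for p in projects}


print(remaining_budget())
print(remaining_budget(employee_key=1))
```

Because both fact tables point at the same dimension keys, one filter on employee, project, or month applies to planned and actual figures at once.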
I am doubting myself on how I should approach this problem.
My users are able to record many parts of their day, including activities, mood, health measurements (heart bpm, glucose), exercise, and meals.
I originally thought that I should create one document per entry (i.e. one entry per day). However, data is rarely displayed to the user day by day; it is mostly shown month by month (charts).
Should I model my Firestore DB in relation to my views or would it be better to just save each entry for each day and then just query?
I am just thinking that it will be more efficient in many parts of the app to have the entries grouped by month than by day.
Am I thinking this right or is there really no benefit? (i.e. maybe the amount of data transferred offsets the costs of unnecessary queries).
If you plan to save one entry per day and then query for the result, the more documents you have, the more documents you will read. That increases your reads per day, so it may be more expensive than grouping month by month.
To answer your last question:
For example, say you have a collection with 100 documents
and you query 20 of them.
That counts as only 20 reads, not 100 as one might expect.
As a reminder, Firebase lets you read up to 50k documents per day for free; beyond that limit it costs $0.06 per 100k reads.
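The arithmetic above can be sketched as follows. The free quota and price are the figures quoted in this answer and may be out of date; the function names are made up for illustration.

```python
# Figures quoted in the answer above; check current pricing before relying on them.
FREE_READS_PER_DAY = 50_000
PRICE_PER_100K_READS = 0.06  # US dollars


def reads_for_month_chart(days_in_month, grouped_by_month):
    """Document reads needed to draw one monthly chart.

    One document per day means one read per day shown; one document
    per month means a single read.
    """
    return 1 if grouped_by_month else days_in_month


def daily_read_cost(reads_today):
    """Cost in dollars of a day's reads once the free quota is used up."""
    billable = max(0, reads_today - FREE_READS_PER_DAY)
    return billable / 100_000 * PRICE_PER_100K_READS


print(reads_for_month_chart(30, grouped_by_month=False))
print(reads_for_month_chart(30, grouped_by_month=True))
print(daily_read_cost(150_000))
```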
I hope it will help you.
Have a nice day !
I have recently taken over a web development project for a local car rental company and need help finding out how to calculate the Daily, Weekly, and Monthly cost of a vehicle.
The previous developer used a plugin that allowed you to create "Pricing Schemes" where you define a day range and its price:
$19.99/day, $99.99/week, $299.99/month:
Day 1-5 = $19.99
Day 5-6 = $16.665
Day 6-7 = $14.284
Day 7-8 = $14.9975
and so on...
Sadly the developer left no notes on how he got these numbers, and each pricing scheme he made only extends to the 31st day. This causes an issue when a user wants to rent a car for longer than a month (which is common).
What I need help finding out is the equation he used to get these numbers so I can add on to the pricing schemes and create others if the need arises. I will add a screenshot of a full pricing scheme for reference below.
Any help with this would be greatly appreciated and I will be available to answer any questions if my question is not clear enough. Thank you!
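The four values listed are consistent with one simple rule: take the cheapest total for the rental length using any mix of daily, weekly, and monthly prices, then divide by the number of days. This is inferred only from the values shown, not confirmed by the original developer, and the 30-day month is an assumption. A minimal sketch, working in cents to avoid rounding drift:

```python
def cheapest_total_cents(days, daily=1999, weekly=9999, monthly=29999, month_days=30):
    """Cheapest total (in cents) for `days` rental days.

    Assumes a week covers 7 days and a month covers `month_days` days,
    and that a week or month price may cover fewer days than its full length.
    """
    cost = [0] * (days + 1)
    for n in range(1, days + 1):
        cost[n] = min(
            cost[n - 1] + daily,                      # add one more day
            cost[max(n - 7, 0)] + weekly,             # cover the last 7 days with a week
            cost[max(n - month_days, 0)] + monthly,   # cover the last month with a month price
        )
    return cost[days]


def per_day_rate(days, **prices):
    """Effective price per day in dollars for a rental of `days` days (days >= 1)."""
    return cheapest_total_cents(days, **prices) / days / 100


for d in (5, 6, 7, 8):
    print(d, per_day_rate(d))
```

For 6 days the weekly price (99.99) is cheaper than six daily prices (119.94), giving 99.99 / 6 = 16.665 per day; for 8 days it is one week plus one day, (99.99 + 19.99) / 8 = 14.9975. The same function can be called with any number of days beyond 31, though whether the plugin handled long rentals this way is unknown.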
I am investigating whether to use Firebase for my next project. I spent several days reading and building a proof-of-concept project. In the demo project I built a shopping cart. In the admin section I can create products, and the client can buy them. When the client checks out, I push to the closed-orders node a complex object which stores all the data for the deal, like this simplified version:
closed-orders
  -order_id
    -date
    -client_id
    -products
      -product_id
        -sale_price
        -delivery_price
        -quantity
      -product_id
        -sale_price
        -delivery_price
        -quantity
      ....more products sold in this order
  next order....
It is easy to do it that way, and I can access every order and show it in the admin section, but I also want to run queries for total sales, sales by product, and profit.
Example questions:
1. What is the total quantity sold of every product from date1 to date2?
2. What is the total turnover from date1 to date2?
3. What is the profit from date1 to date2?
I want to answer these questions without downloading the whole dataset into the browser, of course, because I do not think I can afford to pay for that much bandwidth. Orders for one year could number in the tens of thousands :)
I read about Elasticsearch and keen.io, but I am not sure exactly what functionality they offer or whether they would answer my questions in a bandwidth-friendly way.
I'm using the Google Analytics Reporting API and I can’t find information about what the last “end date” with data in Analytics is.
For example, let's suppose you want to retrieve the last month’s data.
When do you have to perform the query?
The first day of the current month?
...or the second one?
...or maybe the third one?
One more question: is the returned data for days in Pacific Time?
Google Analytics API is supposed to have access to the same data you have in the interface.
Google says that data can take up to 24h to process. The time it really takes to update the data depends on the type and size of the account. Small accounts are updated multiple times a day and can have data available in just a few hours. Once you reach 1M hits a month you are moved to a different mode where the data on your account is updated only once a day. Google Analytics Premium customers get updates more often, even for large amounts of traffic.
There's no way to tell through the API exactly when the last hit was processed. You can query the data for today by the hour and see for yourself, though.
Usually you don't care and just want to make sure that the data you're querying has been fully processed for that day.
So if you query data for yesterday there's a chance it has not been completely updated. For example, if it's midnight, the data for yesterday is only a couple of minutes old and probably hasn't been completely processed yet. The safest bet in this case is to query data for 2 days ago.
So if today is 2012-06-15 and you want to get 1 month of data, a safe approach is to query data with start-date=2012-05-13 and end-date=2012-06-13. This will most of the time give you data for days that have been fully processed, but it's not 100% safe either. Google Analytics has had outages in the past where data took longer than that to process, although these are not usual. When you get the data out, it's really hard to tell just from the API whether the data for those days has been fully processed or not; using the 2 days ago idea just makes it more likely that it has.
The days are aggregated according to the timezone settings configured on the Google Analytics profile.
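The date arithmetic in that example can be sketched like this. It only computes the start-date and end-date strings and makes no API call; the function names and the default 2-day lag are illustrative.

```python
import calendar
from datetime import date, timedelta


def one_month_before(d):
    """Same day of the previous month, clamped to that month's last day."""
    year, month = (d.year, d.month - 1) if d.month > 1 else (d.year - 1, 12)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def safe_query_range(today, lag_days=2):
    """Return (start-date, end-date) strings ending `lag_days` before today."""
    end = today - timedelta(days=lag_days)
    start = one_month_before(end)
    return start.isoformat(), end.isoformat()


print(safe_query_range(date(2012, 6, 15)))  # ('2012-05-13', '2012-06-13')
```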
I have a database that tracks employee’s data for the current year and previous years.
In the examples below I will use calendar years (years start in January and end in December). This is not always the case: some users have their year running from July to June, or April to March, etc.
There are many tables that, with a few heavy calculations, make a view of the employee’s data at this point in time.
This current year’s data is what users look at mostly. But data from previous years impact on this year (so, a change made to the 2008 data- will have a knock on effect on 2009, and then into 2010 and so onto this year).
Obviously, this has a negative impact on performance when viewing reports as viewing this year’s data will mean trudging through all the previous years- calculating and creating views until the end result is found. As the application ages, this problem will get worse and worse- say in 2015, anybody using the system from its inception (2008) will be waiting for a long time to get their data.
We plan to freeze previous years' data, so instead of having data from 2008, 2009, 2010 and this year’s data, we would have one block with the previous years' data (with all calculations done for those previous years) and this year’s data.
In this way, we would have the end results for all the previous years already calculated, and we would only need to add this year's data to get the final result.
Obviously, we would have to prevent users from entering/updating data in previous years.
My question is what is the best way to achieve this? I presume you would need some process that waits until the new year and does some calculations.
Thanks in advance,
ViperMAN.
The approach you describe is normally referred to as data archiving. A DBA can run some queries manually every year, on the first working day after the new year party, so that the calculated data is prepared and stored.
Also, your application needs to prevent users from modifying previous years' data, if I have understood correctly.
One approach I was thinking works as follows:
Create a table that holds the result of the previous years calculations.
Prevent all additions/deletions/updates to previous years from the app tier.
Change reporting so that queries would consult this table instead of trudging along, calculating everything out each time.
Have a daily process that would:
Check if today is the first day of an employee's year.
If yes, get all of the elapsed year's data and add them to all the previous years.
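The daily process above can be sketched as follows, assuming each employee's year starts on the first day of a known month and that the "result" is a single running total per employee. Both are simplifications, and all names are hypothetical.

```python
from datetime import date


def is_first_day_of_year(today, year_start_month):
    """True when `today` is day 1 of the month in which the employee's year starts."""
    return (today.month, today.day) == (year_start_month, 1)


def run_daily_rollover(today, year_start_months, current_totals, frozen_totals):
    """Fold the elapsed year's total into the frozen total for each employee
    whose new year starts today. Returns the employees that were rolled over.

    year_start_months: employee -> month number (1-12) in which their year starts
    current_totals:    employee -> total for the year that just ended
    frozen_totals:     employee -> precomputed total of all earlier years
    """
    rolled = []
    for employee, start_month in year_start_months.items():
        if is_first_day_of_year(today, start_month):
            elapsed = current_totals.get(employee, 0)
            frozen_totals[employee] = frozen_totals.get(employee, 0) + elapsed
            current_totals[employee] = 0  # the new year starts empty
            rolled.append(employee)
    return rolled


starts = {"ann": 1, "bob": 7}        # ann: Jan-Dec, bob: Jul-Jun
current = {"ann": 120, "bob": 80}
frozen = {"ann": 1000, "bob": 900}
print(run_daily_rollover(date(2011, 7, 1), starts, current, frozen))  # ['bob']
```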
Obviously, this is a simplified version, but one that I think could work.
Thoughts?
If you have data that should not be edited, and you can define what that data is, then I would use a combination of stored procedures and security settings to ensure the old data stays accurate.
If you used stored procedures as a filter, you can have logic in your stored procedure that checks the record against the current DateTime and only allows the update if everything fits your requirements.
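As a rough illustration of that date check, here is a runnable sketch using Python's built-in sqlite3 module. SQLite has no stored procedures, so a BEFORE UPDATE trigger stands in for the filter; the table, column, and trigger names are hypothetical, and it assumes calendar years, whereas the question notes that some users' years start in other months.

```python
import sqlite3


def make_db():
    """In-memory database with one frozen (2008) row and one current-year row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entries (
            id INTEGER PRIMARY KEY,
            entry_date TEXT NOT NULL,   -- ISO date, e.g. 2008-03-01
            hours REAL NOT NULL
        );
        -- Reject any update to a row dated before 1 January of the current year.
        CREATE TRIGGER block_old_updates
        BEFORE UPDATE ON entries
        WHEN OLD.entry_date < strftime('%Y-01-01', 'now')
        BEGIN
            SELECT RAISE(ABORT, 'rows from previous years are frozen');
        END;
        INSERT INTO entries (entry_date, hours) VALUES ('2008-03-01', 8.0);
        INSERT INTO entries (entry_date, hours) VALUES (date('now'), 4.0);
    """)
    return conn


def try_update(conn, entry_id, hours):
    """Attempt the update; return False if the trigger rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("UPDATE entries SET hours = ? WHERE id = ?", (hours, entry_id))
        return True
    except sqlite3.DatabaseError:
        return False


conn = make_db()
print(try_update(conn, 1, 1.0))  # row from 2008: rejected
print(try_update(conn, 2, 6.0))  # row from the current year: allowed
```

In a database with real stored procedures, the same comparison would live inside the procedure, with direct table updates denied through security settings.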