I have a database that tracks employee data for the current year and previous years.
In the examples below I will use calendar years (years start in January and end in December), but this is not always the case: some users have their year running from July to June, or April to March, etc.
There are many tables that, through a few heavy calculations, produce a view of the employee's data at this point in time.
This current year's data is what users mostly look at, but data from previous years impacts this year (so a change made to the 2008 data will have a knock-on effect on 2009, then 2010, and so on up to this year).
Obviously, this has a negative impact on performance when viewing reports, as viewing this year's data means trudging through all the previous years, calculating and creating views until the end result is found. As the application ages, this problem will get worse and worse: say in 2015, anybody using the system from its inception (2008) will be waiting a long time to get their data.
We plan to freeze previous years' data so that instead of having data from 2008, 2009, 2010 and this year, we would have one block with the previous years' data (with all calculations done for those previous years) plus this year's data.
In this way, we would have the end-result data for all the previous years already calculated, and we would only need to add this year to get the final result.
Obviously, we would have to prevent users from entering/updating data in previous years.
My question is: what is the best way to achieve this? I presume you would need some process that waits until the new year and does some calculations.
Thanks in advance,
ViperMAN.
The approach you describe is normally referred to as data archiving. You can have some queries that a DBA runs manually every year on the first working day after the New Year party, so that the calculated data is prepared and stored.
Also, your application needs to prevent users from modifying previous years' data, if I have understood it correctly.
One approach I was thinking of works as follows:
Create a table that holds the results of the previous years' calculations.
Prevent all additions/deletions/updates to previous years from the app tier.
Change reporting so that queries consult this table instead of trudging along, calculating everything out each time.
Have a daily process that would:
Check if today is the first day of an employee's year.
If yes, get all of the elapsed year's data and add it to the previous years' totals.
Obviously, this is a simplified version, but one that I think could work.
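A minimal sketch of that daily process, assuming a pre-calculated prior_year_totals table and per-year results in a yearly_results table (all table, column and function names here are hypothetical, and the heavy per-year calculation is reduced to a SUM for brevity):

import datetime
import sqlite3  # stand-in for whatever RDBMS the application actually uses

def roll_up_elapsed_year(conn: sqlite3.Connection, today: datetime.date) -> None:
    """For every employee whose reporting year restarted today, fold the year
    that just ended into the frozen prior-years block."""
    cur = conn.execute(
        "SELECT id FROM employees WHERE year_start_month = ? AND year_start_day = ?",
        (today.month, today.day),
    )
    for (emp_id,) in cur.fetchall():
        # Add the elapsed year's calculated result onto the frozen block ...
        conn.execute(
            """
            UPDATE prior_year_totals
               SET total = total + (SELECT COALESCE(SUM(total), 0)
                                      FROM yearly_results
                                     WHERE employee_id = ? AND frozen = 0)
             WHERE employee_id = ?
            """,
            (emp_id, emp_id),
        )
        # ... and mark those years as frozen so the app tier rejects further edits.
        conn.execute(
            "UPDATE yearly_results SET frozen = 1 WHERE employee_id = ?",
            (emp_id,),
        )
    conn.commit()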
Thoughts?
If you have data that should not be edited, and you can define what that data is, then I would use a combination of stored procedures and security settings to ensure the old data stays accurate.
If you use stored procedures as a filter, you can have logic in the stored procedure that checks the record against the current DateTime and only allows the update if everything fits your requirements.
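The date check itself is just a comparison against the start of the employee's (possibly non-calendar) reporting year. Here is a small sketch of that logic in Python, purely to pin down what the procedure would have to decide (the month-based year start is an assumption):

import datetime

def current_year_start(today: datetime.date, year_start_month: int) -> datetime.date:
    """Start of the reporting year that `today` falls in, for a year beginning
    on the first of `year_start_month` (e.g. 7 for a July-June year)."""
    year = today.year if today.month >= year_start_month else today.year - 1
    return datetime.date(year, year_start_month, 1)

def update_allowed(record_date: datetime.date, today: datetime.date,
                   year_start_month: int) -> bool:
    # Only records falling in the current reporting year may be changed;
    # anything earlier belongs to a frozen year and is rejected.
    return record_date >= current_year_start(today, year_start_month)

# July-June year, checked on 15 Aug 2011:
print(update_allowed(datetime.date(2011, 3, 1), datetime.date(2011, 8, 15), 7))   # False
print(update_allowed(datetime.date(2011, 7, 10), datetime.date(2011, 8, 15), 7))  # True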
I am doubting myself on how I should approach this problem.
My users are able to record many parts of their day, including activities, mood, health measurements (heart BPM, glucose), exercise, and meals.
I originally thought that I should create one document per entry (i.e. one entry per day). However, data is rarely displayed to the user on a day-by-day basis; it is mostly shown month by month (charts).
Should I model my Firestore DB in relation to my views or would it be better to just save each entry for each day and then just query?
I am just thinking that it will be more efficient in many parts of the app to have the entries grouped by month rather than by day.
Am I thinking about this right, or is there really no benefit? (i.e. maybe the amount of data transferred offsets the cost of the unnecessary queries).
If you plan to save each entry as its own daily document and then query to find the result, then the more documents you have, the more documents you will read. That will increase your reads per day, and so it may be more expensive than a month-by-month structure.
To answer your last question:
Let's take an example: if you have a collection with 100 documents and you query 20 of them, it will only count as 20 reads, not 100 as we might expect.
Just a reminder that with Firebase you can read up to 50k documents per day for free; after this limit it is $0.06 per 100k reads.
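To make the comparison concrete, here is a rough sketch with the Python client (the collection, field and document names are just examples, not anything from your app): a month of per-day documents costs up to ~31 reads, while a single per-month document costs one read.

from google.cloud import firestore

db = firestore.Client()

# Option A: one document per day -> one read per returned document.
# (An equality + range filter like this needs a composite index.)
daily = (
    db.collection("entries")
      .where("uid", "==", "user123")
      .where("date", ">=", "2021-03-01")
      .where("date", "<", "2021-04-01")
      .stream()
)
march_days = [doc.to_dict() for doc in daily]   # up to ~31 reads

# Option B: one document per month -> a single read, but the document
# must stay under Firestore's 1 MiB size limit.
month_doc = db.collection("months").document("user123_2021-03").get()  # 1 read
march = month_doc.to_dict()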
I hope it will help you.
Have a nice day!
This question is very specific to the Concur implementation of the IBM Cognos Report Studio tool, since it primarily focuses on the data model used therein.
That data model contains business travel expense information, including the travel itineraries, which are the main source of this report:
Itinerary Example
My goal now is to create a report showing which destination countries the employees traveled to (all countries if more than one country was visited in a single itinerary), how many single business trips were taken to that destination country or countries, the average duration of the business trips to that destination country or countries, and the number of all trips. If possible, duration of stay by country would be great, but I have no idea how to go about that. (See mockup.)
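To pin down the aggregation I am after, outside of Cognos it would be roughly the following, assuming one row per itinerary leg (the data and column names are invented for illustration):

import pandas as pd

# Invented stand-in for the itinerary data: one row per leg.
legs = pd.DataFrame({
    "itinerary_key":   [1, 1, 2, 3, 3],
    "arrival_country": ["France", "Germany", "Germany", "France", "Spain"],
    "nights":          [2, 3, 4, 1, 2],
})

per_country = legs.groupby("arrival_country").agg(
    trips=("itinerary_key", "nunique"),   # distinct business trips per destination
    avg_nights=("nights", "mean"),        # average duration of stay in that country
)
per_country["share_of_all_trips"] = per_country["trips"] / legs["itinerary_key"].nunique()
print(per_country)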
Using a repeater based on the field [Arrival Country] from the itinerary, I managed to get something that looks like what I am trying to achieve, but it somehow does not include the home country once I cut out the other identifying columns (Itinerary Key, Departure Country, Arrival Country).
I then did a count(distinct [Itinerary Key] for [Arrival Country Repeater]), which gave me numbers, but I am not really sure they are correct in this case.
Repeater
Also, as soon as I add query calculations to include the average duration, the repeater fields go blank.
Is there another way to get the report I want to build?
Are there major flaws with my attempt?
Thanks a ton for any and all suggestions!
Is there a way to simply show the change of a value over the selected time period? All I'm interested in is the offset of the last value compared to the initial one. The values can vary above and below these over the time period; that is not really relevant (and would be an exception in my case).
For an initial value of 100 and a final value of 105, I'd expect a single stat box displaying 5%.
I have the feeling I'm missing something obvious, but I can't find a way to display this deceptively simple metric.
Edit:
I'm trying to create a scripted Grafana dashboard that will automatically populate disk consumption growth for all our various volumes. The data is already in Graphite, but for purposes of capacity management and finance planning (which projects/departments get billed), it would be helpful for managers to have a simple and coarse overview of which volumes grow outside expected parameters.
The idea was to create a list of single-stat values with color coding that could easily be scrolled through to find abnormalities. Disk usage would obviously never be negative, but volatility in usage between the start and end of the time period would be lost in this view. That's not a big concern for us as this is all shared storage and such usage is expected to a certain degree.
The perfect solution would be to have the calculations change dynamically based on the selected time period.
I'm thinking that this is not really possible (at least not easily) with just Graphite and Grafana, and I have started looking for alternative methods. We might have to implement a different reporting system for this purpose.
Edit 2
I've tried implementing the suggested solution from Leonid, and it works after a fashion. The calculations seem somewhat off from what I expected, though.
My test dashboard looks as follows:
If I were to calculate the change manually, I'd end up with roughly 24% change between the start value (7.23) and the end value (8.96), i.e. (8.96 - 7.23) / 7.23 ≈ 24%. Graphite calculates this as 19%. There is probably a reason for the discrepancy, probably something to do with it being a time series and not discrete values?
As a side note: the example is only 30 days, even though the most interesting number would be a year. We don't have quite a year of data in Graphite yet, and having a 30-day view is also interesting. It seems I have to implement several dashboards with static times.
You certainly can do that for some fixed period. For example, the following query takes the absolute difference between the current metric value and the value the metric had one minute ago (i.e. the initial value) and then calculates it as a percentage of the initial value.
asPercent(absolute(diffSeries(my_metric, timeShift(my_metric, '1m'))), timeShift(my_metric, '1m'))
I believe you can't do that for the time period selected in the Grafana picker.
But is that really what you need? It seems strange because, as you said, the value can change in both directions. Maybe standard deviation would be more suitable for you? It's available in Graphite as the stdev function.
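If computing it outside Graphite functions is acceptable, you could also pull the raw series over the render API for whatever range you like and take first versus last yourself. A rough sketch (the URL and metric name are placeholders, and null datapoints are simply skipped):

import requests

GRAPHITE = "http://graphite.example.com"   # placeholder

def percent_change(target: str, frm: str = "-30d") -> float:
    """Offset of the last datapoint relative to the first, in percent."""
    resp = requests.get(
        f"{GRAPHITE}/render",
        params={"target": target, "from": frm, "format": "json"},
        timeout=10,
    )
    resp.raise_for_status()
    datapoints = resp.json()[0]["datapoints"]          # [[value, timestamp], ...]
    values = [v for v, _ in datapoints if v is not None]
    first, last = values[0], values[-1]
    return (last - first) / first * 100

print(percent_change("my_metric"))   # e.g. (8.96 - 7.23) / 7.23 * 100 ≈ 23.9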
I'm trying to implement the graph from this blog post: http://blog.neo4j.org/2012/02/modeling-multilevel-index-in-neoj4.html
Inside the blog post is a schematic of the graph and a query on how to find a range of events. However, in my use case I don't have a set of consecutive days. So, for example, the current state of the graph could be that I have day nodes from 12-7-2013 (12 July 2013) to 12-8-2013 (12 August 2013). Then, when adding an event on 12-7-2014, I'm missing all the in-between days for a whole year!
The first problem is that if I start to write queries that generate those days, it might become very slow (the application needs to be responsive). The second problem is that I end up with days on which no event could have taken place, and so I have unneeded data in my database.
So my question is: how can I get a range of events without using the NEXT relation between days?
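What I imagine is something along these lines: keep a sortable timestamp on each event and ask for the range directly, instead of walking day nodes. A rough sketch of the shape I mean, using the Python driver (the label and property names are purely illustrative):

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def events_between(start_ts: int, end_ts: int):
    """Fetch events in [start_ts, end_ts) by an indexed timestamp property,
    with no NEXT chain between day nodes."""
    query = (
        "MATCH (e:Event) "
        "WHERE e.timestamp >= $start AND e.timestamp < $end "
        "RETURN e ORDER BY e.timestamp"
    )
    with driver.session() as session:
        return [record["e"] for record in session.run(query, start=start_ts, end=end_ts)]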
I need a global interval template for recurring events.
I am constructing a schedule management web app. I have a set of events happening periodically up to a certain moment in time. For example, I have train schedules. They repeat themselves every week within certain winter or summer periods. Using the Date module, I would have to enter the beginning and ending dates of, let's say, the summer period for each train route.
What I want to do is simply add a taxonomy term which would hold the repetition interval, with some holiday exclusions...
This is more of a Drupal philosophy question, yet I think that others have run into similar issues before.
Maybe I am looking at this problem from the wrong angle; could somebody lend me a fresh set of eyes?