Reversal of PeopleSoft Time and Labor Stat Holiday Payout causes summing errors

(Note that my knowledge of Time and Labor processing is limited. This is something implemented previously by someone else that needs fixing, not a question about a general T&L implementation.)
Background
The business has a Stat Holiday Bank policy: employees who are not scheduled to work on a holiday get a 7.5-hour bank credit. If the credit is not taken within 90 days, it expires and results in a payout.
The initial implementation of this T&L Rule did not correctly detect cases where a Banked Stat Holiday due for expiration had already been taken and, as a result, often paid out on the wrong dates.
Approach for Rule reimplementation.
The Banked and Taken events need to be matched up, rather than just applying 90-day aging rules. A sequence of #1 Banked, #2 Banked, #3 Taken really means that #1 Banked is cancelled out by #3 Taken, so that only #2 Banked is up for expiration. In this context, a Payout really is a form of Taken.
So the rule was rewritten as follows:
Load all Record.TL_COMPLEAV_TBL rows since the last 0-balance date, up to the T&L Batch's POI_START_DT, into a work table.
Add the current Batch's incoming TL_IPT1 rows for Taken into the work table.
Rank events by age, i.e. Banked #1, Banked #2, Banked #3; Taken #1, Taken #2.
Cross out Banked events that have a matching Taken.
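To illustrate the matching, here is a rough sketch of the logic in Python (the real rule runs as SQL/Application Engine steps against the work table; the event rows, dates, and structures below are made up, and each Taken is assumed to fully consume one 7.5-hour Banked credit):

    from datetime import date

    # Hypothetical event rows (date, type, hours); in the real rule these come from
    # the TL_COMPLEAV_TBL history plus the incoming TL_IPT1 Taken rows in the work table.
    events = [
        (date(2015, 1, 1),  'BANKED', 7.5),   # Banked #1
        (date(2015, 2, 16), 'BANKED', 7.5),   # Banked #2
        (date(2015, 3, 2),  'TAKEN',  7.5),   # Taken #1 - cancels out Banked #1
    ]

    banked = sorted((e for e in events if e[1] == 'BANKED'), key=lambda e: e[0])
    taken  = sorted((e for e in events if e[1] == 'TAKEN'),  key=lambda e: e[0])

    # Each Taken (a Payout counts as a Taken) crosses out the oldest unmatched Banked.
    unmatched_banked = banked[len(taken):]

    # Whatever remains and is older than 90 days as of the batch date is due for payout.
    batch_date = date(2015, 5, 20)   # stand-in for the batch's POI_START_DT
    payouts = [b for b in unmatched_banked if (batch_date - b[0]).days > 90]
    print(payouts)   # only Banked #2 (2015-02-16) is up for payout

Partial quantities would need the matching to be done at the hours level rather than event by event, but the shape of the logic is the same.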
What works
So far, so good. I can generate Payouts at the appropriate times when I examine the actual dates on which employees earned and took banked time off. I compared the TL_IPT1 rows that the original implementation was emitting against my own, and the contents match on all fields except the payout date.
What doesn't work
When we re-process Time and Labor via Process.TL_TIMEADMIN, the new rule does not regenerate the payouts that had been done incorrectly. In fact, it does not send anything to TL_IPT1.
Time and Labor, however, picks up that the payout is no longer happening and generates a reversal: -7.5 hrs to Payout, on the same date as the original payout.
So far, so good.
However, the next time Process.TL_TIMEADMIN is run, if I have a banked-time-earned event, things go wrong.
Section.TL_COMPTIME.DA000 kicks in, does some calculations, and calls DD000. Instead of, say, adding 7.5 earned hours to a banked 15 to give 22.5, it sets the bank to 0 hours.
Oddly enough, I had been surprised that the vanilla T&L implementation supports the notion of expiry and aging in TL_COMPLEAV_TBL but, according to the business owner, does nothing with it. Section.TL_COMPTIME.DA000, which I haven't reviewed in detail yet, seems to be intended to keep track of Taken and Earned, so it almost looks as if it then performs the aging that the custom rule has already done.
By doing nothing when there is nothing to do, am I taking the wrong approach to supporting reversals? Should I instead insert a row into TL_IPT1 with TL_QUANTITY=0 for that day when I see that there was a previous incorrect payout?
I am fairly confident that my payout calculations are correct - they are the same as the previous ones, only the dates differ. But I don't know how to handle reversals.

Related

Single-stat percentage change from initial value in graphite/grafana?

Is there a way to simply show the change of a value over the selected time period? All I'm interested in is the offset of the last value compared to the initial one. The values can vary above and below these over the time period, but that's not really relevant (and would be an exception in my case).
For an initial value of 100 and a final value of 105, I'd expect a single-stat box displaying 5%.
I have the feeling I'm missing something obvious, but I can't find a way to display this deceptively simple result.
Edit:
I'm trying to create a scripted Grafana dashboard that will automatically populate disk consumption growth for all our various volumes. The data is already in Graphite, but for purposes of capacity management and finance planning (which projects/departments get billed) it would be helpful for managers to have a simple and coarse overview of which volumes grow outside expected parameters.
The idea was to create a list of single-stat values with color coding that could easily be scrolled through to find abnormalities. Disk usage would obviously never be negative, but volatility in usage between the start and end of the time period would be lost in this view. That's not a big concern for us as this is all shared storage and such usage is expected to a certain degree.
The perfect solution would be to have the calculations change dynamically based on the selected time period.
I'm thinking that this is not really possible (at least not easily) to do with just Graphite and Grafana and have started looking for alternative methods. We might have to implement a different reporting system for this purpose.
Edit 2
I've tried implementing the suggested solution from Leonid, and it works after a fashion. The calculations seem somewhat off from what I expected, though.
My test dashboard looks as follows:
If I were to calculate the change manually, I'd end up with roughly 24% change between the start (7.23) and end (8.96) values. Graphite calculates this as 19%. There's probably a reason for the discrepancy, probably something to do with it being a time series and not discrete values?
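For what it's worth, my back-of-the-envelope arithmetic is below; the 19% happens to be what you get if you divide by the final value rather than the initial one, but I haven't verified that this is what Graphite/Grafana is actually doing:

    start, end = 7.23, 8.96
    print((end - start) / start * 100)   # ~23.9% - change relative to the initial value
    print((end - start) / end * 100)     # ~19.3% - change relative to the final value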
As a sidenote: the example is only 30 days, even though the most interesting number would be a year. We don't have quite a year of data in Graphite yet, and having a 30-day view is also interesting. It seems I'll have to implement several dashboards with static time ranges.
You certainly can do that for some fixed period. For example, the following query takes the absolute difference between the current metric value and the value that metric had one minute ago (i.e. the initial value) and then calculates it as a percentage of the initial value.
asPercent(absolute(diffSeries(my_metric, timeShift(my_metric, '1m'))), timeShift(my_metric, '1m'))
I believe you can't do that for the time period selected in the Grafana picker.
But is that really what you need? It seems strange because, as you said, the value can change in both directions. Maybe standard deviation would be more suitable for you? It's available in Graphite as the stdev function.

How can I use the occurrences to calculate the end date of a recurring event in EAS

Can anyone tell me the best way of calculating the end date of a recurring event from the number of occurrences and the pattern in which the event occurs?
For example:
I have an event whose start date is 10/07/2014 (a Tuesday) and which occurs every week on Tuesday. This event will end after 10 occurrences (say). So my method should return the end date: 12/09/2014.
The method should also handle more complex situations, for example an event that occurs yearly on the first Monday of October and has a total of 10 occurrences.
(This isn't an answer which gives you a complete solution by any means, but hopefully it's a step in the right direction.)
Good luck. I've worked on an ActiveSync implementation, and recurrent events are fundamentally painful. You'll need to think about all kinds of corner cases - if something occurs every month on the 30th, what happens in February? What happens if it happens at 1.30am, and the clocks go forward or backward in the event's time zone so that 1.30am happens 0 or 2 times for a particular day?
Noda Time can help with this, but it doesn't provide a complete solution, partly because all the requirements will vary so much.
The important types you'll need to know about are LocalDate and LocalDateTime to provide time-zone-neutral dates/times, and Period which represents a not-necessarily-fixed period of time, such as "1 month". That will help with things like "add a week" - and there are methods on LocalDate for things like "next Monday after this date". It gets harder for events which are "weekly, on Monday and Wednesday" - you'll want to step through the weeks, working out which days occur within a particular week, until you've gone through all the events you need.
Noda Time 2.0 has the concept of "adjusters" which will make life somewhat simpler for things like "the first Monday of October" but everything you need to do can be done with Noda Time 1.3. (Don't wait for Noda Time 2.0, which I wouldn't expect to be released for another 6 months at least.)
I think my strongest pieces of advice would be:
Keep it simple. Focus on getting the right results first, then work out any optimizations you need. (For example, don't try to "guess" when the 100th instance of an event will occur - stepping through 100 instances with simple steps will be slower, but get you to the right answer. Do measure the performance, but make sure you have good tests before you optimize.)
Introduce your own types to represent exactly what you know about the event. Use the Noda Time types where they match, of course, but don't be tempted to use an existing type just because it's quite like what you're trying to represent. The small differences will bite you eventually.
Make sure you know what you actually want the results to be. Write lots of tests. Date and time work is a naturally data-oriented domain, so invest in making it as easy as possible to write tests for all the corner cases you should be thinking of. (And you really should be thinking about them. Pay particular attention to leap years and time zones.)
Be aware that time arithmetic doesn't follow the normal rules of arithmetic - x + 1 month + 1 month isn't the same as x + 2 months (try it with x = January 31st).
If/when behaviour surprises you, do come back to ask specific questions here. There aren't very many of us working on Noda Time, but questions tend to be answered quickly :)
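To make the "keep it simple, step through instances" advice concrete, here's a rough sketch in plain Python dates rather than Noda Time (weekly pattern only, no time zone handling, purely illustrative):

    from datetime import date, timedelta

    def weekly_end_date(start, occurrences):
        """Step through the occurrences one at a time and return the date of the last one."""
        current = start
        for _ in range(occurrences - 1):   # the start date is occurrence #1
            current += timedelta(weeks=1)
        return current

    # The example from the question: 10 weekly occurrences starting Tuesday 10/07/2014
    print(weekly_end_date(date(2014, 10, 7), 10))   # 2014-12-09

Monthly or "first Monday of October" patterns need a smarter step function (which is where Noda Time's date handling helps), but the shape of the loop stays the same.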

How does collection sampling affect the "live" stats for Google Analytics?

We've noticed lately that as our site is growing, our data in Google Analytics is getting less reliable.
One of the places we've noticed this most strongly is on the "Realtime Dashboard".
When we were getting 30k users per day, it would show about 500-600 people on line at a time. Now that we are hitting 50k users per day, it's showing 200-300 people on line at a time.
(Other custom metrics from within our product show that the user behavior hasn't changed much; if anything, users are currently spending longer on the site than ever!)
The daily totals in analytics are still rising, so it's not like it's just missing the hits or something... Does anyone have any thoughts?
The only thing I can think of is that there is probably a difference in interpretation of what constitutes a user being online.
How do you determine if the user is online?
Unless there is explicit login/logout tracking, is it possible that it assumes a user has gone if there is no user-generated event or request from the browser within an interval of X seconds?
If that is the case then it may be worthwhile adding a hidden iframe with some JavaScript code that keeps sending a request every t seconds.
You can't compare instant measures of unique, concurrent users to different time-slices of unique users.
For example, you could have a small number of concurrent unique users (say 10) and a much higher daily unique users number like 1000, because 1000 different people were there over the course of the day, but only 10 at any given time. The number of concurrent users isn't correlated to the total daily uniques, the distribution over the course of the day may be uneven and it's almost apples and oranges.
This is the same way that monthly uniques and daily uniques can't be combined, but average daily uniques are a lower bound for monthly uniques.

Confusion over Google Analytics (GA) Absolute Unique Visitors data

GA Unique Visitors data isn't making sense to me. From the GA FAQ we get the following definition for 'Visits vs. Visitors':
"The initial session by a user during any given date range is considered to be an additional visit and an additional visitor. Any future sessions from the same user during the selected time period are counted as additional visits, but not as additional visitors. "
The part that I can't resolve with the GA graph is "Any future sessions from the same user during the selected time period are counted as additional visits, but not as additional visitors". For the graph below covering a 30-day period, I would understand the GA definition to mean that the data represents uniqueness across all 30 days, right? But if you look at the screen shot below, you see a regular pattern for each week over the 30-day period the report covers. From that, it seems the numbers we are seeing associated with each of the days of the graph (e.g. 3.92% (4142) for Tuesday, September 8) is a count of unique visitors just in the context of that one day - i.e. without correlating their uniqueness to the rest of the days in the 30-day period. If the graph actually showed uniqueness across the 30-day period, I would expect the daily numbers to start high in the early days of the period and decrease over the 30-day period as the number of already-seen visitors (i.e. returning visitors) increases, no?
What am I missing here?
UPDATE
Helpful clue from Jonathan S. below got me on the right track.
I think I understand now what the daily bar graph values mean, but it's a little counter-intuitive, and I'd bet it's not what some others might be assuming either. The report states "39,822 Absolute Unique Visitors" at the top, which means just that: over the 30-day period we saw this many uniques. Fair enough. The confusing part is that the daily (or weekly) bar values in the graph below are not mutually exclusive uniques as I had assumed, but are values relative only to the 39,822 total - i.e. there is overlap between the unique visitor counts across any group of days. This means the sum of the daily % values > 100% and the sum of the daily count values > 39,822. The algorithm is: when you visit for the first time in the 30-day period, call that "today", you add 1 to the total (39,822) and 1 to the "today" bar value. When you show up again "tomorrow", you are NOT counted again in the total, but ARE counted as 1 in the "tomorrow" bar value.
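A tiny simulation of that counting rule (a made-up visit log, just to convince myself the daily bars can legitimately sum to more than the total):

    visits = [                 # (day, visitor_id) - made-up data
        (1, 'a'), (1, 'b'),
        (2, 'a'), (2, 'c'),
        (3, 'a'), (3, 'b'),
    ]

    total_uniques = len({visitor for _, visitor in visits})
    daily_uniques = {day: len({v for d, v in visits if d == day})
                     for day in {d for d, _ in visits}}

    print(total_uniques)    # 3 - the "Absolute Unique Visitors" style total
    print(daily_uniques)    # 2 uniques per day - the daily bars, summing to 6 > 3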
(Screenshot: http://img.skitch.com/20090922-djti81ejj5gqn575ibf8cj1e8x.jpg)
I believe it's just an issue of grouping. The top right of the graph has 3 icons to group by day, week, or month. It's currently grouping by day. So if I visit your site today and come back tomorrow, I'll be counted once for each day.
I tried looking at the month view for one of my sites but it didn't give me much meaningful data. I believe the above should answer your original confusion though.
Is it possible that you're searching for something that doesn't exist anymore? Unique Visitors/Visits is old terminology. Check: https://www.seroundtable.com/google-analytics-sessions-users-18424.html
Then check how sessions and users are defined:
Sessions ("ex-visits", it's very detailed): https://support.google.com/analytics/answer/2731565?hl=en&ref_topic=1012046
Users in Google Analytics reporting are defined as "Users who have initiated at least one session during the date range". So IMHO it's not about 30 days, it's about the SELECTED date range.
I hope this helps.

ASP.Net + SQL Server Design/Implementation Suggestions

I'm constructing a prototype site for my company, which has several thousand employees, and I'm running into a wall regarding the implementation of a specific requirement.
To simplify it with an example, let's say each user has a bank account with interest. Every 5 minutes or so (this can vary) the interest pays out. When the user hits the site, they see a timer counting down to when the interest is supposed to pay out.
The wall I'm running into is that it just feels dirty to have a windows service (or whatever) constantly hitting the database looking for accounts that need to 'pay out' and take action accordingly.
This seems to be the only solution in my head right now, but I'm fairly certain a service running a query to retrieve a sorted result set of accounts that need to be 'paid out' just won't cut it.
Thank you in advance for any ideas and suggestions!
Rather than updating records, just calculate the accrued interest on the fly.
This sort of math is pretty straightforward, and the calculations are very likely to be orders of magnitude faster than continuous updating.
Something like the following:
WITH depositswithperiods AS (
    SELECT accountid,
           depositamount,
           -- number of completed 5-minute accrual periods since the deposit
           FLOOR(DATEDIFF(n, deposit_timestamp, GETDATE()) / 5) AS accrualperiods,
           interestrate
    FROM deposits
)
SELECT accountid,
       SUM(depositamount) AS TotalDeposits,
       -- compound each deposit for the number of periods it has accrued
       SUM(depositamount * POWER(1 + interestrate, accrualperiods)) AS Balance
FROM depositswithperiods
GROUP BY accountid
I assumed compounded interest above, and no withdrawals.
The addition of withdrawals would require creating a group of deposits for each time period, taking the sum of those to get a net deposit for each time period, and then calculating the interest on those groups.
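A rough sketch of that grouping, in Python rather than SQL for brevity (the period numbers, rate, and transactions are made up; withdrawals are simply negative amounts):

    from collections import defaultdict

    # Hypothetical transactions: (accrual_period, amount); withdrawals are negative.
    transactions = [(0, 100.0), (0, 25.0), (1, 50.0), (2, -30.0)]
    rate = 0.01            # interest per accrual period (illustrative)
    current_period = 4

    # Net deposit per period...
    net_by_period = defaultdict(float)
    for period, amount in transactions:
        net_by_period[period] += amount

    # ...then compound each period's net amount for the periods it has been in the account.
    balance = sum(net * (1 + rate) ** (current_period - period)
                  for period, net in net_by_period.items())
    print(round(balance, 2))   # ~150.99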
I don't know if the interest analogy will hold for your actual use case. If the database doesn't need to be kept up to date for all users at all times, you could apply the AddInterest operation multiple times at once when you need an up-to-date value. That is, whenever the value is displayed, or when the account balance is about to change.
You could do a single nightly update for all accounts.
A good thing to think about when doing this kind of thing is DateTime.
If you are charged 10 pence a minute for a phone call, there isn't a computer sitting there counting every second and working out minutes... It just records the date/time at the start and the date/time at the end.
As others suggest, just calculate it when the user tries to view it.
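In code, that might look something like this minimal sketch (the period length and rate are stand-ins for whatever the real policy is):

    from datetime import datetime

    PERIOD_MINUTES = 5         # how often interest "pays out" (from the question)
    RATE_PER_PERIOD = 0.001    # illustrative rate, not from the question

    def current_balance(principal, deposited_at, now=None):
        """Derive the balance from the stored timestamp - nothing is updated per tick."""
        now = now or datetime.utcnow()
        elapsed = int((now - deposited_at).total_seconds() // (PERIOD_MINUTES * 60))
        return principal * (1 + RATE_PER_PERIOD) ** elapsed

    # One hour after a 100.00 deposit = 12 elapsed periods
    print(current_balance(100.0, datetime(2015, 1, 1, 12, 0), datetime(2015, 1, 1, 13, 0)))

The countdown timer the user sees can be derived on the client from the same stored timestamp, so nothing needs to poll the database.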
