I have created an XMLA script to process the cube incrementally. It uses type "ProcessUpdate" for dimensions and "ProcessAdd" for measure group partitions. I am facing an issue with distinct count. Let me take an example:
Order Id CustId Amount
1 C1 100.00
2 C2 200.00
3 C3 300.00
4 C4 400.00
5 C5 500.00
If we browse the cube, SSAS shows the sum of orders as 1500.00 and the distinct customer count across all orders as 5. Now we add a new fact record to cancel one order, e.g.:
Order Id CustId Amount
3 C3 -300.00
After incremental processing, the sum of orders shows as 1200.00, which is correct. But the distinct customer count stays the same at 5, which is incorrect.
I understand that rows get appended during incremental processing, which works for the sum operation but breaks the distinct count. I want to know whether there is any way to remove order #3 from all aggregate operations while still processing incrementally.
The distinct customer count remaining at five is correct, because SSAS doesn't know that a -300.00 row means the customer shouldn't count. Even if you processed the cube fully, it would still show 5 distinct customers.
This isn't to do with incremental processing; it's to do with how SSAS handles distinct count. The measure is simply "count distinct customer IDs in the fact table", and C3 is there twice: once with a sale of 300.00 and once with a sale of -300.00.
You need to reconsider how to handle this, ideally at the stage where you load your data warehouse. You could handle it in MDX by excluding anyone whose net sales are zero or less, but then the whole distinct count calculation is done in MDX and will be much, much slower.
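If you do go the MDX route, a minimal sketch would look something like the following; the [Customer].[Customer Id] hierarchy, the [Amount] measure, and the [Sales] cube name are placeholders for whatever your cube actually uses:

WITH MEMBER Measures.[Net Customer Count] AS
    COUNT(
        FILTER(
            -- keep only customers whose net amount is positive
            [Customer].[Customer Id].[Customer Id].MEMBERS,
            [Measures].[Amount] > 0
        )
    )
SELECT Measures.[Net Customer Count] ON COLUMNS
FROM [Sales]

As said above, this evaluates every customer member at query time, so expect it to be far slower than a stored distinct count measure.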
I have a dataset (a view) that has a numeric field "WR_EST_MHs". If that field exceeds a certain number of man-hours (120 or 60, depending on two other fields' values), I need to split it out into constituent records and spread those hours over future weeks.
The OH_UG_Key and 1kMCM_Flag fields determine the threshold for splitting. For example, if OH_UG_Key = 1 AND 1kMCM_Flag = 'N' and WR_EST_MHs > 120, then spread the WR_EST_MHs value over as many records as necessary, in 120 MH increments, changing only the WRSchedDate and WRSchedDate_Key fields (advancing each by one week).
Each OH_UG / 1kMCM_Flag / WR_EST_MHs scenario is as follows:
This is an example of what I need to do:
I thought that something like this might work, but I haven't worked with levels before:
with cte as
(select * from "STJOF"."vfactScheduledWAWork"
)
select WR_Key, WP_Key, WRSchedDate, DistSA_Key_Hash, CrewHQ_Key_Hash, Priority_Key_Hash, JobType_Key_Hash, WRStatus_Key_Hash, PerfBy_Key, OHUG_Key, "1kMCM_Flag", WR_EST_MHs
from cte cross join table(cast(multiset(select level from dual
                                        connect by level <= ceil(WR_EST_MHs / 120)
                                       ) as sys.odcinumberlist))
order by WR_Key;
I also thought this could be done with a "tally table", which I have a little experience with. I really don't know where to begin on this one.
So I would say that a "Tally Table" will work if it is applied correctly. (Or, in this case, a tally view.)
First, break the hour-breakout logic into a function so we don't have CASE WHEN expressions everywhere, like so:
CREATE OR REPLACE FUNCTION get_hour_breakout(in_ohug_key IN NUMBER, in_1kmcm_flag IN VARCHAR2, in_tot_hours IN NUMBER)
RETURN NUMBER
IS
  hours NUMBER;
BEGIN
  hours :=
    CASE
      WHEN in_ohug_key = 2 AND in_1kmcm_flag = 'N' AND in_tot_hours > 60 THEN 60
      WHEN in_ohug_key = 2 AND in_1kmcm_flag = 'Y' AND in_tot_hours > 60 AND in_tot_hours <= 120 THEN 60
      WHEN in_ohug_key = 2 AND in_1kmcm_flag = 'Y' AND in_tot_hours > 120 THEN 120
      ELSE 120
    END;
  RETURN hours;
END get_hour_breakout;
This way, if the hour breakout logic changes, it can be tweaked in one place.
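As a quick sanity check of the function (using scenario values from above; any SQL client will do):

-- OHUG_Key = 2, flag 'Y', 500 total hours: falls into the >120 branch
SELECT get_hour_breakout(2, 'Y', 500) AS breakout FROM dual;
-- returns 120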
Second, join to a dynamic "tally" view like so:
select wr_key,
       WP_Key,
       wrscheddate + idxkey.nnn*7 wrscheddate,
       to_char(wrscheddate + idxkey.nnn*7, 'yyyymmdd') WRSchedDate_Key,
       OHUG_Key,
       kMCM_Flag,
       case when (wr_est_mhs - idxkey.nnn*get_hour_breakout(ohug_key, kmcm_flag, wr_est_mhs)) >= get_hour_breakout(ohug_key, kmcm_flag, wr_est_mhs)
            then get_hour_breakout(ohug_key, kmcm_flag, wr_est_mhs)
            else wr_est_mhs - idxkey.nnn*get_hour_breakout(ohug_key, kmcm_flag, wr_est_mhs)
       end wr_est_mhs
  from yourView vwrk
 inner join (SELECT ROWNUM-1 nnn
               FROM (SELECT 1 just_a_column
                       FROM dual
                    CONNECT BY LEVEL <= 52
                    )
            ) idxkey
    on vwrk.wr_est_mhs / get_hour_breakout(ohug_key, kmcm_flag, wr_est_mhs) > idxkey.nnn
By using CONNECT BY LEVEL we, in effect, generate a set of zero-indexed rows; then, by joining to it where the hours divided by the breakout exceed the index number, we get the right number of rows for each source row.
For example, if the function returns 120 and the hours are 100, you get a single row, so it stays 1 to 1. If the function returns 120 and the hours are 500, however, you get 5 rows, because 500/120 = 4.1666…, which in the join gives rows 4, 3, 2, 1 and 0. Then the rest is simple math to determine the number of hours per breakout.
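To see that mechanism in isolation, here is a self-contained sketch with hard-coded numbers (500 hours, a 120-hour breakout) that runs against dual:

SELECT nnn,
       CASE WHEN 500 - nnn*120 >= 120 THEN 120
            ELSE 500 - nnn*120
       END AS hours_this_increment
  FROM (SELECT ROWNUM - 1 nnn
          FROM dual
        CONNECT BY LEVEL <= 52)
 WHERE 500 / 120 > nnn
 ORDER BY nnn;

This returns nnn = 0 through 4 with 120, 120, 120, 120 and 20 hours respectively, which sums back to the original 500.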
This could also be improved by moving the function call into the lower view so it is only evaluated once per row. And the inline tally view could be made into its own view, depending on how much maintainability you need to build in.
I am building a report in Power BI for all the calls we do.
Down below is a simplified version of the data I work with (we produce 1250 calls per hour, so my data is a lot bigger than this).
Every row is a call attempt: the first column defines who the attempt was to, the second the week it was made in, the third the attempt number, and the last the status of the phone call.
For example: we called ID 1 two times, the first time in week 1, which ended in status 310 (meaning callback), and the second time in week 2, which ended in status 710 (positive conversion).
The problem: I want to make a count of all the people (Call ID's) who are still waiting on a phone call (last call status = 310).
If I use: CALCULATE(DISTINCTCOUNT(data[ID]), data[Status] = 310) the result = 3. Which makes sense: PBI counts 3 times a 310 status.
But it should count only 1, because IDs 1 & 2 have already been called back and have a positive result (710 & 711). So it needs to look at the highest attempt number.
So I tried: CALCULATE(DISTINCTCOUNT(Blad1[ID]), FILTER(Blad1,MAX(Blad1[Attempt])), Blad1[Status] = 310) But this also results in a count of 3.
I've found solutions in which you make a calculated column, but I also want to combine this with a slicer on the weeknumber, so I can check what the callbacks in a specified week are.
So basically I need PBI to count the IDs whose record with the max attempt number has a certain status (310). Does anybody know how I can do this?
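For what it's worth, one common pattern is to iterate the distinct IDs and test the status of each ID's last attempt inside the current filter context, so a week slicer still applies. A sketch, assuming the table is named data with the columns described above (the measure name is made up):

Callbacks Outstanding =
SUMX (
    VALUES ( data[ID] ),
    -- per ID: find the latest attempt within the current filters,
    -- then the status recorded on that attempt
    VAR LastAttempt = CALCULATE ( MAX ( data[Attempt] ) )
    VAR LastStatus = CALCULATE ( MAX ( data[Status] ), data[Attempt] = LastAttempt )
    RETURN IF ( LastStatus = 310, 1, 0 )
)

With the sample data described above this returns 1, because only one ID's latest attempt ended in status 310.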
I was trying to find which customer has the most records in a table. The RANK function was suggested to me, but it wasn't useful for finding the exact record, so I used the following snippet:
select count(customerkey),customerkey
FROM FILEMAPPERTEMPLATE
group by customerkey;
Result:
CNT  CUSTOMERKEY
  1  298,254
  1  299,732
  2  246,027
 43  197,053
  1  299,745
  1  299,751
 60  271,623
Though I am able to find how many records are attributed to each customerkey in the table, I couldn't find (after executing the query) the single record with the maximum count for a customer. Please help. I want only
60 271,623
as the result.
select *
  from (select count(customerkey) cnt, customerkey
          from FILEMAPPERTEMPLATE
         group by customerkey
         order by cnt desc)
 where rownum < 2;
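If you're on Oracle 12c or later, the row-limiting clause says the same thing more directly, and WITH TIES keeps every customer tied for the top count:

select count(customerkey) cnt, customerkey
  from FILEMAPPERTEMPLATE
 group by customerkey
 order by cnt desc
 fetch first 1 rows with ties;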
I'm running the following on my sqlite3 DB, but the result is not limited to the last 3 records. It is returning the average for all records.
SELECT AVG(time) FROM tbl_aa ORDER BY ID LIMIT 3
Any thoughts?
Use a subquery to get the last 3 records and then calculate the average on them:
select avg(time) from
(
    SELECT time
    FROM tbl_aa
    ORDER BY ID DESC
    LIMIT 3
) x
LIMIT restricts the number of rows in your result set; AVG, however, is an aggregate calculated over the entire set and returns a single row, so in your original query the LIMIT never has anything to cut.
I'm making a report in Cognos Report Studio and I'm having a bit of trouble getting a count that I need. What I need to do is count the number of IDs for a department, but I need to split the count between initiated and completed. If an ID occurs more than once, it is to be counted as completed; the others, of course, are initiated. So I'm trying to count the number of occurrences of each distinct ID. Here is the query I've made in SQL Developer:
SELECT
COUNT((CASE WHEN COUNT(S.RFP_ID) > 8 THEN MAX(CT.GCT_STATUS_HISTORY_CLOSE_DT) END)) AS "Sales Admin Completed"
,COUNT((CASE WHEN COUNT(S.RFP_ID) = 8 THEN MIN(CT.GCT_STATUS_HISTORY_OPEN_DT) END)) as "Sales Admin Initiated"
FROM
ADM.B_RFP_WC_COVERAGE_DIM S
JOIN ADM.B_GROUP_CHANGE_REQUEST_DIM CR
ON S.RFP_ID = CR.GCR_RFP_ID
JOIN ADM.GROUP_CHANGE_TASK_FACT CT
ON CR.GROUP_CHANGE_REQUEST_KEY = CT.GROUP_CHANGE_REQUEST_KEY
JOIN ADM.B_DEPARTMENT_DIM D
ON D.DEPARTMENT_KEY = CT.DEPARTMENT_RESP_KEY
WHERE CR.GCR_CHANGE_TYPE_ID = '20'
AND S.RFP_LOB_IND = 'WC'
AND S.RFP_AUDIT_IND = 'N'
AND CR.GCR_RECEIVED_DT BETWEEN '01-JAN-13' AND '31-DEC-13'
AND D.DEPARTMENT_DESC = 'Sales'
AND CT.GCT_STATUS_IND = 'C'
GROUP BY S.RFP_ID ;
Now this works, but I'm not sure how to translate that into Cognos. I tried a CASE that looked like this (this code uses basic names such as dept instead of D.DEPARTMENT_DESC):
CASE WHEN dept = 'Sales' AND count(ID for {DISTINCT ID}) > 1 THEN count(distinct ID) END
I'm using count(distinct ID) instead of count(maximum(close_date)), but the results would be the same anyway. The "AND" is where I think it's being lost; it obviously isn't the proper way to count occurrences. But I'm hoping I'm close. Is there a way to do this with a CASE? Or at all?
--EDIT--
To make my question more clear, here is an example:
Say I have this data in my table
ID
---
1
2
3
4
2
5
5
6
2
My desired count output would be:
Initiated Completed
--------- ---------
4 2
This is because two of the distinct IDs (2 and 5) occur more than once, so they are counted as Completed. The ones that occur only once are counted as Initiated. I am able to do this in SQL Developer, but I can't figure out how to do it in Cognos Report Studio. I hope this better explains my issue.
Oh, I didn't quite get it originally; amending the answer.
But it's still easiest to do with 2 queries in Report Studio. The key point is that you can use a query as a source for another query, guaranteeing proper group-bys and calculations.
So if you have the ID list in a table in Report Studio, you create:
Query 1 with data items:
ID,
count(*) or count(1) as count_occurrences,
status (initiated or completed) with the formula: if (count_occurrences > 1) then ('completed') else ('initiated').
After that you create Query 2, using Query 1 as its source, with just 2 data items:
[Query1].[Status]
Count with formula: count([Query1].[ID])
That will give you the result you're after.
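For reference, the same two-step logic in plain SQL (my_ids is a made-up table name holding the ID column) looks like this:

-- inner query = Query 1 (one row per ID with its derived status),
-- outer query = Query 2 (count of IDs per status)
select status, count(id) as id_count
  from (select id,
               case when count(*) > 1 then 'completed'
                    else 'initiated'
               end as status
          from my_ids
         group by id)
 group by status;

Against the sample data in the question this returns initiated = 4 and completed = 2.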
Here's a link to doco on how to nest queries:
http://pic.dhe.ibm.com/infocenter/cx/v10r1m0/topic/com.ibm.swg.ba.cognos.ug_cr_rptstd.10.1.0.doc/c_cr_rptstd_wrkdat_working_with_queries_rel.html?path=3_3_10_6#cr_rptstd_wrkdat_working_with_queries_rel