Get share of dimension member in calculated measure - olap

I'm not sure if what I'm trying to do is possible or if I need to change my data model.
I have a dimension containing the different amounts a customer can loan, and I want to see the share of a certain amount compared to total sales.
Pseudo code:
member [Measures].[Share 5000] as 'count([Amount].[5000])/([Measures].[Total Sales], [Time].CurrentMember)'

I assume you want the 5000 included in your calculation? So if 10 customers loaned 5000 and the total sales is 100000, the share is (5000*10) / 100000 = 0.5.
First you will need a 'number of customers' measure. I don't know whether it already exists in your data model; if not, you will have to add it.
Then you can write your calculation something like this:
member [Measures].[Share 5000] as
'
(([Amount].[Total Amount].[5000],[Measures].[Number of Customers])*5000) /
([Amount].[Total Amount],[Measures].[Total Sales])
'
You don't need to include the Time.CurrentMember in your calculation as it does not make any difference in this case. If you put Time in the rows or columns it will be automatically included.
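The arithmetic behind the calculated member can be checked outside the cube. The following is only a sketch in Python, not MDX; the function name and parameters are hypothetical and stand in for the 'Number of Customers' and 'Total Sales' measures mentioned above.

```python
def share_of_amount(amount, number_of_customers, total_sales):
    """Share of total sales contributed by loans of one amount."""
    if total_sales == 0:
        # No sales in this slice, so the share is undefined.
        return None
    return (amount * number_of_customers) / total_sales


# The example from the answer: 10 customers loaned 5000, total sales 100000.
print(share_of_amount(5000, 10, 100000))  # 0.5
```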


How to access unaggregated results when aggregation is needed due to dataset size in R

My task is to get total inbound leads for a group of customers, leads by month for the same group of customers and conversion rate of those leads.
The dataset I'm pulling from is 20 million records so I can't query the whole thing. I have successfully done the first step (getting the total lead count for each org) with this:
inbound_leads <- domo_get_query('6d969e8b-fe3e-46ca-9ba2-21106452eee2',
auto_limit = TRUE,
query = "select org_id,
COUNT(*)
from table
GROUP BY org_id
ORDER BY org_id")
DOMO is the BI tool I'm pulling from, and domo_get_query is an internal function from a custom library my company built. It takes a query argument (a MySQL query) and various others which aren't important right now.
Sample data looks like this:
org_id, inserted_at, lead_converted_at
1 10/17/2021 2021-01-27T03:39:03
2 10/18/2021 2021-01-28T03:39:03
1 10/17/2021 2021-01-28T03:39:03
3 10/19/2021 2021-01-29T03:39:03
2 10/18/2021 2021-01-29T03:39:03
I have looked through many online aggregation tutorials, but none of them seem to cover how to get data that is needed pre-aggregation from a dataset that has to be aggregated in order to be accessed in the first place. An example is the number of leads per month per org: in the above sample, aggregating by org_id removes the ability to see more than one instance of org_id 1. Maybe I just don't understand this well enough to know the right questions to ask. Any direction appreciated.
If you're unable to fit your data in memory, you have a few options. You could process the data in batches (e.g. one year at a time) so that each batch fits in memory. You could use a package like chunked to help.
But in this case I would bet the easiest way to handle your problem is to solve it entirely in your SQL query. To get leads by month, you'll need to truncate your date column and group by org_id, month.
To get conversion rate for leads in those months, you could add a column (in addition to your count column) that is something like:
sum(case when lead_converted_at is not null then 1 else 0 end) as convert_count
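As a rough illustration of the grouping described above, here is a sketch that runs the same shape of query against an in-memory SQLite database from Python. It makes several assumptions: the table name leads is made up, the dates are stored as ISO strings (the sample shows a different format for inserted_at), and the rows are invented for the example. In MySQL the month would more likely be produced with DATE_FORMAT(inserted_at, '%Y-%m') than with substr.

```python
import sqlite3

# Invented rows: (org_id, inserted_at, lead_converted_at); None = not converted.
SAMPLE_ROWS = [
    (1, "2021-10-17", "2021-10-27T03:39:03"),
    (2, "2021-10-18", None),
    (1, "2021-10-17", None),
    (3, "2021-11-19", "2021-11-29T03:39:03"),
    (2, "2021-10-18", "2021-10-29T03:39:03"),
]


def leads_by_month(rows):
    """Return (org_id, month, lead_count, convert_count) per org and month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leads (org_id INTEGER, inserted_at TEXT, lead_converted_at TEXT)"
    )
    conn.executemany("INSERT INTO leads VALUES (?, ?, ?)", rows)
    query = """
        SELECT org_id,
               substr(inserted_at, 1, 7) AS month,  -- truncate date to YYYY-MM
               COUNT(*) AS lead_count,
               SUM(CASE WHEN lead_converted_at IS NOT NULL THEN 1 ELSE 0 END)
                   AS convert_count
        FROM leads
        GROUP BY org_id, substr(inserted_at, 1, 7)
        ORDER BY org_id, month
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for org_id, month, lead_count, convert_count in leads_by_month(SAMPLE_ROWS):
    # Conversion rate is converted leads divided by all leads in the group.
    print(org_id, month, lead_count, convert_count, convert_count / lead_count)
```

Because the aggregation happens in the database, only one row per org and month comes back, so the 20 million source records never need to fit in memory.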

Tableau Weighted Average Per Capita Calc not aggregating right

I am trying to create a simple revenue per person calc that works with different filters within the data. I have it working for a single record, however, it breaks and aggregates incorrectly with multiple records.
The formula I have now is simply Sum([Revenue]) / Sum([Attendance]). This works when I only have a single event selected. However, as soon as I select multiple shows it aggregates and doesn't do the weighted avg.
I'm making some assumptions here, but hopefully this will help you out. I've created an .xlsx file with the following data:
Event Revenue Attendance
Event 1 63761 6685
Event 2 24065 3613
Event 3 69325 4635
Event 4 41996 5414
Inside Tableau I've created the calculated column for Rev Per Person.
Finally, in the Analysis dropdown I've enabled Show Column Grand Totals. This gives me the following:
Simple Fix
The problem is that all of the column totals are being calculated using the SUM aggregation. This is the desired behavior for Revenue and Attendance, but for Rev Per Person, you want to display the average.
In Analysis/ Totals / Total All Using you can configure the default aggregation. Here we don't want to set all of them though; but it's useful to know. Leave that where it is, and instead click on the Rev Per Person Grand Total value and change it from 'Automatic' to 'Average'.
Now you'll see a number much closer to the expected value.
But it's not exactly what you expect. The average of all the Rev Per Person values gives us $9.73; but if you take the total Revenue / total Attendance you'd expect a value of $9.79.
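The gap between the two figures can be reproduced outside Tableau. This is a small Python sketch using the four events from the table above; the function names are made up for the illustration.

```python
# (event, revenue, attendance) taken from the sample table.
EVENTS = [
    ("Event 1", 63761, 6685),
    ("Event 2", 24065, 3613),
    ("Event 3", 69325, 4635),
    ("Event 4", 41996, 5414),
]


def average_of_ratios(events):
    """Unweighted mean of each event's revenue per person."""
    ratios = [revenue / attendance for _, revenue, attendance in events]
    return sum(ratios) / len(ratios)


def ratio_of_sums(events):
    """Total revenue divided by total attendance (attendance-weighted)."""
    total_revenue = sum(revenue for _, revenue, _ in events)
    total_attendance = sum(attendance for _, _, attendance in events)
    return total_revenue / total_attendance


print(round(average_of_ratios(EVENTS), 2))  # 9.73
print(round(ratio_of_sums(EVENTS), 2))      # 9.79
```

The second figure weights each event by its attendance, which is why large events pull the total toward their own revenue per person.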
Slightly More Involved Fix
First - undo the simple fix. We'll keep all of the totals at 'Default'. Instead, we'll modify the Rev Per Person calculation.
IF Size() > 1 THEN
// Grand Total
SUM([Revenue]/[Attendance])
ELSE
// Regular View
SUM([Revenue])/SUM([Attendance])
END
Size() is being used to determine if the calculation is being done for an individual cell or not.
More information on Size() and similar functions can be found on Tableau's website here - https://onlinehelp.tableau.com/current/pro/desktop/en-us/functions_functions_tablecalculation.html
Now I see the expected value of $9.79.

Remove total value for one column in PowerBI

I have a table visualisation in PowerBI that sums the top 10 products sold by sales quantity. I have a calculated column which shows the rate of sale, using other fields from the data source:
(quantity / # stores with product) / weeks on sale
The ROS calculates correctly, but it still sums and appears in the total row. The number of stores and number of weeks are set to 'Don't Summarize', but they still add together and give some meaningless number in the total row. If I set ROS to 'Don't Summarize' to remove it from the total row, the summing of the rest of the table, and therefore the filter I have on top N by quantity, drops out.
It is very frustrating! Is there an option somewhere to simply not display the total for a field? I don't want to remove the total row completely, as the sums of the other fields (e.g. Qty, Value, Margin) are useful to see. It seems very strange that it is so difficult to do something so minor.
Additional info:
Qty is a SUM field.
Stores is not summarized and simply refers to the average number of stores that stock that product over the weeks of the trading season
Weeks is not summarized and refers to the weeks that have passed in the trading season.
Example data:
Item   Qty    Stores  Weeks  ROS
Itm1   600    390     2      0.77
Itm2   444    461     2      0.48
Itm3   348    440     2      0.40
Total  1,392  1,291*  6*     1.65*
Fields marked with a * are those where the sum is a meaningless figure unrelated to the data. I do not actually need Stores and Weeks to show in the table, so the fact that they sum does not matter. However, ROS is essential, but the sum part is totally irrelevant and I do not want it to show. Any ideas? I am open to the idea of using R to overcome the lack of flexibility in the standard tables although my knowledge in this area is fairly limited.
I suspect you've made a common mistake - using a Calculated Column for ROS where you should've used a Measure.
If you rebuild that calculation as a Measure, then you can wrap the HASONEVALUE function around it, with the objective of showing a blank when there are multiple Item values in context (the Total row).
Roughly the Measure formula would be:
ROS = IF ( HASONEVALUE ( Mytable[Item] ) , << calculation >> , BLANK() )
I would also replace your use of / with the DIVIDE function, to avoid divide by zero errors.
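To make the intent of that measure concrete, here is a Python sketch of the same logic. It is not DAX and does not call any Power BI API; the function names and the row layout are assumptions for illustration. The "context" is modelled as the list of rows a table cell sees: one item for a normal row, all items for the Total row.

```python
def safe_divide(numerator, denominator):
    """Return None instead of raising on a zero denominator (the role DIVIDE plays)."""
    if denominator == 0:
        return None
    return numerator / denominator


def rate_of_sale(rows_in_context):
    """(qty / stores) / weeks, but blank (None) unless exactly one item is in context."""
    items = {row["item"] for row in rows_in_context}
    if len(items) != 1:
        # More than one item means this is the Total row: show nothing.
        return None
    row = rows_in_context[0]
    per_store = safe_divide(row["qty"], row["stores"])
    if per_store is None:
        return None
    return safe_divide(per_store, row["weeks"])


# Rows from the example data in the question.
TABLE = [
    {"item": "Itm1", "qty": 600, "stores": 390, "weeks": 2},
    {"item": "Itm2", "qty": 444, "stores": 461, "weeks": 2},
    {"item": "Itm3", "qty": 348, "stores": 440, "weeks": 2},
]

print(round(rate_of_sale([TABLE[0]]), 2))  # 0.77
print(rate_of_sale(TABLE))                 # None, i.e. a blank total
```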
You can remove individual totals for columns in tables and matrix objects in a round-about way by using field formatting.
Click the object, go to formatting, click the field formatting accordion, select the column or columns you want to affect from the drop-down list, set the font color to white, set 'apply to values' to off, and set 'apply to totals' to on.
A bit tedious if you have many columns, but you will have, in effect, whited-out the column totals.
Heads up, you might still have a problem with exporting data, though.
Cheers
Click on the table -> Fields -> expand the value field you don't want to include -> Select "Don't Summarize." This will exclude it from the "Total" row.
Select the "Don't Summarize" option for the metrics you don't want totalled.
Select the table you want to change
In the Visualizations pane:
Go to Format,
Find the Field Formatting option,
Choose the field you don't want to summarize.
Turn off 'apply to header',
Turn off 'apply to values',
Turn ON 'apply to total',
Change the font color to white.

Calculate Multiple Correlation Between Several Products

I want to get the correlation between several items based on historical orders and sales, in order to build a recommendation model for every new order (recommending products based on the correlation between the selected product and the others). My idea is to write a query that pivots my data so that each order is a row with the total quantity of each of its items, and then to calculate the correlation between items.
I have attached an Excel sheet with sample data for my case.
The numbers in the product columns are the total quantity of each product in each order. For example, order 131245 had 1.96 of product 11 and 3.91 of product 27, and so on. I want to get the correlation between all products, based on the orders and the items in them.
Is this idea a useful way to get the correlation, or should I use a different value to calculate it?
Does anyone have an idea about that?
It is up to you to decide what the correlation should be. If products A and B are always bought together, then the correlation would be 1. But what do you want to do if sometimes only product A is bought and sometimes both A and B? There is no single right code for this problem.
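One possible reading of "correlation" here is the Pearson correlation between the quantity columns of the pivoted order table. The sketch below assumes that reading, treats a product missing from an order as quantity 0, and uses invented orders and product names; it is one option among several, not the definitive measure for a recommender.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, or None if one is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def correlation_matrix(orders, products):
    """Correlate every pair of products across orders (pivoted: order x product)."""
    columns = {
        p: [quantities.get(p, 0.0) for quantities in orders.values()]
        for p in products
    }
    return {
        (a, b): pearson(columns[a], columns[b])
        for a in products
        for b in products
        if a < b
    }


# Invented orders: order id -> {product: total quantity in that order}.
ORDERS = {
    1: {"A": 1.0, "B": 2.0},
    2: {"A": 2.0, "B": 4.0, "C": 1.0},
    3: {"A": 3.0, "B": 6.0},
}

for pair, value in correlation_matrix(ORDERS, ["A", "B", "C"]).items():
    print(pair, value)
```

In this made-up data A and B always appear together in proportional quantities, so their correlation is 1, which matches the case described in the answer. Whether zero-filling absent products is appropriate is exactly the modelling decision the answer says is up to you.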

Subtract payment until amount due is zero

Using powershell I read in a text file that contains the check amount. I then create a query and get the amount due. The problem comes in because buyer could have multiple balances for different products. So they could write a check that covered A but not B and C.
$remainAmount = $currentAmount[0] - $checkAmount
How can I do this without producing a negative number, so that it stops subtracting when zero is reached?
One solution would be to use the [Math]::Max() function like this:
$remainamount = [Math]::Max($currentamount[0] - $checkamount,0)
That will give you the higher of the two numbers, so if they still owe something it gives that, or it gives 0.
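For the case in the question where one check has to be spread over several balances, the same clamping idea can be applied balance by balance. This is a sketch in Python rather than PowerShell, and the function name and the order in which balances are paid off are assumptions.

```python
def apply_payment(balances, payment):
    """Pay balances off in order; return (remaining balances, unused payment)."""
    remaining = []
    for balance in balances:
        # Never apply more than is owed or more than is left of the check.
        applied = min(balance, payment)
        remaining.append(balance - applied)
        payment -= applied
    return remaining, payment


# A check of 120 against balances A=100, B=50, C=25 covers A and part of B.
print(apply_payment([100, 50, 25], 120))  # ([0, 30, 25], 0)
```

Using min on the amount applied guarantees no balance goes below zero, which is the same effect as clamping the result with a max against 0.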
