Power BI equivalent to SQL HAVING clause - report

I'm a trainee working with databases.
I'm working on a Power BI report based on a SQL query where all of the needed joins are already included, so I'm working within one dataset.
I have made a table visual that shows the transaction number (the invoice number) and the name of the person who made that transaction. My problem lies in creating a measure that will filter that table. It should work like a HAVING clause in SQL (at least that's how my boss described it).
I would like this measure to force the table to show only people who have made more than two transactions (they have more than two invoice numbers, so there are more than two rows for that person).
I tried to do it by writing a measure like this:
Measure =
COUNTAX ( Query1; COUNTA ( [Salesman] ) > 2 )
Or like this:
Measure 2 =
FILTER ( Query1; COUNTA ( Query1[Salesman] ) > 2 )
But I only got a bar graph showing how many transactions were made by each person. When I add this measure to the table, I see a value of 1 for each row.
I'm new to Power BI and DAX, so this is quite a big hurdle for me. Can someone share their knowledge to help solve this problem? I would be much obliged.

I found a solution to my problem.
I created a second query that counts transactions for each person, together with their names, and created a relationship between my two queries. Next I added the counting column to my table of data from query one and applied a filter on it. After that the column can simply be hidden, and it works perfectly.
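As a rough sketch, that counting query can be expressed in plain SQL like this (the table and column names here are placeholders based on the question, not the actual source):
SELECT
    Salesman,
    COUNT(InvoiceNumber) AS TransactionCount  -- number of invoices (rows) per person
FROM Query1Source
GROUP BY Salesman;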
On top of that, I created a measure and made a chart using it. It looks nice and clear.
The measure looks like this:
Measure =
COUNTAX ( Query1; COUNTA ( [Salesman] ) )
I filtered on this measure too to get the wanted result.
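For what it's worth, a more direct DAX equivalent of a HAVING filter is a measure that stays blank unless the count passes the threshold. This is only a sketch based on the names in the question (Query1, the threshold of 2, and the made-up measure name), using the same semicolon argument separator as above:
TransactionsOverTwo =
VAR TransactionCount = COUNTROWS ( Query1 )
RETURN
    IF ( TransactionCount > 2; TransactionCount )
Because table visuals hide rows where every measure is blank, putting this measure on the table keeps only salesmen with more than two rows; alternatively, a plain COUNTROWS ( Query1 ) measure can be used as a visual-level filter with the condition "is greater than 2".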

Related

I want to combine two tables and display them as one table

I want to combine tables ① and ②, but I don't know how.
I created a model like this.
① CalendarMaster (fields: Index, Date, CalenderHoliday, CompanyHoliday)
② AttendanceMaster (fields: Index, EmployeeId, Date, GoingTime, LeavingTime)
I want to join them using the Date of CalendarMaster and the Date of AttendanceMaster as the key.
I want to know what kind of model to use and where to write the SQL query script.
Once the tables are joined into one,
I want to display [Date, CalenderHoliday, CompanyHoliday, GoingTime, LeavingTime] in a single table.
I looked at various sites and tried relationships, but it didn't work, so could someone please help?
I have been stuck on this for another week.
Waiting for advice.
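For reference, the combination described above can be sketched as a single SQL query joined on Date (column names are taken from the field lists above; a LEFT JOIN keeps calendar days that have no attendance rows):
SELECT
    c.Date,
    c.CalenderHoliday,
    c.CompanyHoliday,
    a.GoingTime,
    a.LeavingTime
FROM CalendarMaster AS c
LEFT JOIN AttendanceMaster AS a
    ON a.Date = c.Date;
If there are several employees, the AttendanceMaster side would normally also be filtered or grouped by EmployeeId; where this query lives (in the source database, or as the source query of a combined table) depends on the tool being used.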

Tableau - Getting count from 2 different data sources and combining into one total

I am a Tableau newbie and am trying to see if this is possible or not. I have 2 separate data sources where the same employees are listed: one is for closed cases and the other is for open cases. These data sources have some of the same columns, but for the most part they are different.
Is it possible to aggregate the case count for each employee on the closed and open data sources into a single column? For instance, if an employee has 50 closed cases and 23 open cases, I want it to show 73 for them.
I attempted to play around with joins/unions, but these didn't work properly and duplicated the data most of the time.
I think this is a great chance to leverage blends.
I have created a workbook with the Sample Superstore Excel dataset. This dataset has three sheets. I'll use the Orders and Returns sheets to demonstrate how we can calculate the net orders using blends.
The dataset I'm using can be found here.
Start by connecting to the Orders and Returns sheets separately. Once you're done with this step, you should see the two data sources at the top of your Data pane.
In this example, I'll calculate the Net Returns by Category. In your case, you're after the Total Cases by Employee, so just imagine Employee in place of Category.
Next, drag Category from the Orders data source onto the view, then select the Returns data source and click the chain icon to blend on Order ID.
You will need a common column between the two tables in order to blend.
Once blended I'll go back to the primary data source (indicated by the blue check mark) and create the Net Orders calculation.
This calculation uses the dot notation - similar to what you might see in SQL - to reference our other table.
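As an illustration, the blended calculation might look something like this (assuming the secondary connection is simply named Returns; the exact fields depend on your data):
// Net Orders: distinct orders in the primary Orders source minus returned orders from the blended Returns source
COUNTD([Order ID]) - COUNT([Returns].[Order ID])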
To double check that our calculation is working properly, we can drag the components of this calculation onto the view and do the math.
Of course, once you are satisfied you can remove all but your blended calculation.
Blending isn't ideal in most cases, but you could try it. Bring in each data source separately and "join" them within your workbook on Employee, or ideally an Employee_id. Click the little chain icon once you have them both loaded and are on a worksheet tab. Then you can sum the counts by employee. Blending sometimes presents issues with calculated fields across the two data sources, but this is what I would try first.
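Translated to the question, with (say) the closed-cases source as primary and the open-cases source blended on Employee, the combined total could be a single calculated field along these lines (data source and field names are assumptions):
// Total cases per employee: closed cases from the primary source plus open cases from the blended secondary source
SUM([Number of Records]) + SUM([Open Cases].[Number of Records])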

Adding new fields to historical tables in BigQuery

I'm getting daily exports of Google Analytics data into BigQuery and these form the basis for our main reporting dataset.
Over time I need to add new columns for additional things we use to enrich the data - say, a mapping from URL to 'reporting category', for example.
This is easy to just add as a new column onto the processed tables (there are about 10 processing steps at the moment for all the enrichment we do).
The issue is when stakeholders then ask: can we add that new column to the historical data?
Currently I then need to rerun all the daily jobs, which is very slow and costly.
This is coming up frequently enough that I'm seriously thinking about redesigning my data pipelines to account for the fact that I often need to essentially drop and recreate ALL the data from time to time, when I need to add a new field or correct old dirty data or something.
I'm just wondering if there are better ways to:
Add a new column to an old table in BQ (I would be happy to do this by hand in these instances, since I can just join the new column on using the GA [hit_key] I have defined, which is basically a row key).
(Less common) Update existing tables based on some WHERE condition.
Just wondering what the best practices are and whether anyone has had similar issues where you basically need to update a historical schema, and whether there are ways to do it without just dropping and recreating everything, which is essentially what I'm currently doing.
To be clearer on my current approach: I'm taking the [ga_sessions_yyyymmdd] table and making a series of [ga_data_prepN_yyyymmdd] tables, where I either add new columns at each step or reduce the data in some way. There are now 11 of these steps, and each time I'm taking all 100 or more columns along for the ride. This is what I'm going to try to design away from, as currently 90% of the columns at each stage don't even need to be touched; they could just be joined back on at the end, maybe based on hit_key or something.
It's a little bit messy to try and pick apart, though.
Adding new columns to the schema of the existing historical tables is possible, but the values for the newly added columns will be NULLs. If you do need to populate values into these columns, probably the best approach is to use an UPDATE DML statement. More details on how to try it out are here: Does BigQuery support UPDATE, DELETE, and INSERT (SQL DML) statements?
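As a sketch in BigQuery standard SQL, assuming a small mapping table keyed on the hit_key mentioned in the question (all table and column names here are placeholders):
-- Add the new, initially NULL column to an existing historical table
ALTER TABLE my_dataset.ga_data_prep_20170101
  ADD COLUMN reporting_category STRING;
-- Back-fill it from the mapping table using the hit_key row key
UPDATE my_dataset.ga_data_prep_20170101 AS t
SET reporting_category = m.reporting_category
FROM my_dataset.url_category_map AS m
WHERE t.hit_key = m.hit_key;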

R Models with Factors in Tableau

I'm attempting to build a model for sales in R that is then integrated into Tableau so I can look at the predictions as they relate to the actual values. The model I'm building for sales is in R, and I'm trying to integrate it into Tableau by creating a calculated field that uses the model to give the predicted value for each record using the SCRIPT_REAL function in Tableau. The records are all coming from a MySQL database connection. The issue that I'm having comes from using factors in my model (for example, month).
If I want to group all of the predictions by day of week, Tableau can't perform the calculation because it tries to aggregate each field I'm using before passing it into the model. When it tries to aggregate month, not all of the values are the same, so it instead returns a "*". Obviously a prediction value then can't be reached, because there is no value associated with a "*". Essentially what I'm trying to do is get a prediction value for each record that I have, and then aggregate those prediction values in various ways.
Okay, now I can understand a little bit better what you're talking about. A twbx with dummy data (and even a dummy model, as long as it generates the same problem you're facing) would help even more, but let me try to say a couple of things.
One thing that is important to understand is that SCRIPT functions are like table calculations, i.e., they are performed only with aggregated fields, they are computed last (after all aggregations, measures and regular calculations) and you can define the level of aggregation you want.
So, if you want to display values on a daily basis, put your date field on the page, go to the day level, and for the calculation partition by DAY(date_field). If you want it by week, same thing.
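As a very rough sketch of the shape such a calculation takes (the model object sales_model and the field names are assumptions for illustration, and the fitted model is assumed to already exist in the Rserve session):
// Every input to SCRIPT_REAL must be aggregated; ATTR() passes the month label through within each partition
SCRIPT_REAL("predict(sales_model, data.frame(month = .arg1, daily_sales = .arg2))",
    ATTR([Month]), SUM([Sales]))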
I find table calculations (including R scripts) very useful when they are an end in themselves, i.e. the calculation is the answer. They're not so useful (or rather, not so easily manipulable) when they're a means, i.e. an intermediate step before a final calculation to get to the answer. That is mainly because the level of aggregation is based on the fields that are on the page. So, for instance, if I have multiple orders from different clients and want to assess the average order value by customer, a table calculation is great: WINDOW_AVG(SUM(order_value)) partitioned by customer. If, for some reason, I want to sum all these values, then it's tricky. I can't do it directly, as the average order value is not stored anywhere and cannot be retrieved without all the clients being on the page. So what I usually do is create the table with all customers, export it to mdb, and reconnect in Tableau.
I said all this because it might be your problem, when you say "Tableau can't perform the calculation because it tries to aggregate each field I'm using before passing it into the model". Yes, Tableau does that and there's nothing you can do about it other than figuring out a way around it. Creating an intermediate table in Tableau, exporting it, and connecting to it again in Tableau might be an answer. Performing the calculations in R, exporting the results, and then connecting to them in Tableau might be another way.
But again, without actually seeing what you're trying to do, it's hard to say what you need to do.

Array calculation in Tableau, maxif routine

I'm fairly new to Tableau, and I'm struggling to build some routines that could be easily implemented in Excel (though they would take forever for big sets of data).
So here is the deal, consider a dataset with the following fields:
int [id_order] -> id of the sales order (deepest level, there are only unique entries of id_order)
int [id_client] -> as I want to know who bought it
date [purchase_date] -> when the customer bought the product
What I want to know is, for each order, when was the last time (if ever) that the client bought something. In other words, what is the highest purchase_date for that client that is smaller than the current purchase_date?
In Excel, the approach is simple (but again, not efficient):
{=MAX(IF(id_client=B1,IF(purchase_date<C1,purchase_date)))}
Is there a way to do this kind of calculation in Tableau?
You can do this in Tableau using table calculations. They take a little time to understand how to use well, but are very powerful and flexible. I posted a sample Tableau workbook for a similar question in an answer for SO question Find first time a condition is met
Your situation is similar, but with the extra complication that you want to repeat the analysis for each client id, so you might want to try a recursive approach using the Previous_Value() function instead of the approach used in that example - though I'm not certain that previous_value() will fit your situation.
Still, it might be helpful to download the example workbook I mentioned to get an idea how table calculations can address similar problems.
Just to register the solution, in case someone has the same question.
So, basically the solution I found uses a table calculation, which is not calculated until it's used on a sheet (and is only calculated in the context of that sheet). That's a little bit limiting, so what I do is create a sheet with all the fields I need (plus whatever is necessary for the table calculation), then export the data (to mdb) and connect to this new file.
So, for my example, the right table calculation is (let's name it last_order_date):
LOOKUP(MAX([purchase_date]),-1)
Explanation: the MAX() is necessary because LOOKUP() (like all table calculations) does not work with the data directly, only with aggregations. You can use SUM, AVG, MAX, ATTR, whatever suits you. As in my case there is only one corresponding row, any of these functions will do just fine and return the same value.
The -1 indicates that I'm looking for the element immediately before the current entry (of the table, as you define it). If it were FIRST(), it would go for the first entry of the table, and LAST() would go for the last.
Now, I have to put it on a sheet. So I'll bring the fields id_client, id_order, purchase_date and last_order_date.
Then I have to define the parameters of my table calculation last_order_date (Edit Table Calculation). I'll go to Compute using and choose Advanced. Now I'll set Partitioning to id_client and Addressing to all the rest. What will that do? It means Tableau will create a temporary table for each id_client, and the table calculation will use those tables as its scope.
Additionally, I will sort by the field purchase_date, using Max (again, the aggregation issue) and ascending order, to guarantee my entries are in chronological order.
Now, what will it do? For each entry it will look at the table for that id_client and find the purchase_date immediately before the current entry (the one being assessed), which is exactly what I need.
To avoid Tableau spending processing power on the visualization, I often put all the fields on Detail (and leave nothing on screen) and use a bar chart (it's good because it lets me see the data). Then I export it to mdb and connect to it again. Unfortunately, Tableau doesn't directly export to tde.
