Create Expression in Report Builder 3.0 Report to sum a column - aggregate-functions

I created a report with 3 columns, Department, Ticket Count, Ticket Number. It groups Department names and the second column shows 1 instance of the Ticket Count. The last column shows all of the ticket numbers.
I added a row that shows the grand total of all of the departments displayed.
This is the data set results :
Department TicketCount TicketNumber
D1 3 12345
D1 3 22345
D1 3 32345
I group the Department and the TicketCount so that the display is like this:
Department TicketCount TicketNumber
D1 3 12345
22345
32345
I want to add a ticket total at the end but the result is always adding all of the ticket counts and not just one.
So the Total displayed is 9 not 3.
I need to create an expression that picks the distinct TicketCounts of the departments and sums them.
The function DistinctCount returns the correct number of counts when I have multiple departments but not the values.
I tried the RunningValue function but it adds all of the values in the column.
=RunningValue(Fields!ReopenedTicketCount.Value, sum, Nothing)
I need to create a function that sums the distinct values of the ticket counts of each department.
Can anyone point me in the direction as to the functions that I need to use?

I figured this out. I just used the COUNT function on the third column to get the grand total.

Related

Impala Using CASE to determine if an ID from one Tableis in another Table

Background:
Hey everyone! I'm hoping you can help me with something that I've been trying to figure out. I have a dataset/table called customer_universe that shows all of our in scope customers. Every row/cust_id in that table is unique.
Let's say this table has 60,000 total rows. Every cust_id entry in this table is unique so total rows = unique row count.
There is also a dataset that I created (customer_sport_product_purch) that lists out all of customers (from the customer_universe table) and any of the 3 in-scope sports products they purchased along with a purchase date. This tables only contains customers who have purchased one of the three sport products but since there are three sport products and a customer may have purchased multiple, cust_id field does not contain only unique customers.
Let's say this table has 46,000 total rows but only 25,000 unique customer.
Goal Query Output:
I need to write a query that lists out every customer in the customer_universe table and one more column with a binary (1/0) value that will indicate if they have purchased a sport product or not.
So this query output should have a total of 60000 records and only two columns.
Environment and Attempted Solutions Details
I'm currently building these queries using Impala in Hue. I'm trying to use a case statement to get me my desired result but I'm getting the error message provided below.
Customer_universe Table:
Cust_ID
Customer_Since
1
02-20-2019
2
01-13-2020
3
06-17-2012
4
06-19-2021
5
06-06-2017
Customer_sport_product_purch Table:
Cust ID
Product
Purch_Dt
1
Basketball
01-01-2022
1
BoxGlove
02-01-2020
5
BoxGlove
12-15-2019
Desired Query Output:
Cust_ID
Sport_Purch
1
1
2
0
3
0
4
0
5
1
Queries I've attempted and the Error Messages I've Received:
Query 1:
SELECT a.cust_id,
case when (a.cust_id in (select distinct b.cust_id from DB.customer_sport_purch b)
then 1 else 0 end as Sport_Purch
FROM DB.customer_universe
GROUP BY cust_id;
Error Message 1:
Error while compiling statement: FAILED: SemanticException [Error 10249]: line 2:72 Unsupported SubQuery Expression 'cust_id': Currently SubQuery expressions are only allowed as Where Clause predicates
Query 2:
SELET a.cust_id,
case when (a.cust_id in sportPurch) then 1 else 0 end as Sport_Purch
FROM DB.customer_universe a,
(select distinct cust_id from DB.customer_sport_purch) sportPurch
GROUP BY a.cust_id;
Error Message 2:
Error while compiling statement: FAILED: ParseException line 2:36 cannot recognize input near 'sportPurch' ')' 'then' in expression specification
Other Considerations:
I cannot bring bring the customer_sport_table.cust_id values into a text file and have the query read from file since those values will change frequently and need to be able to just re-execute queries.
Thanks in advance!

update one table with 2 where conditions in the same and one condition in another table

I have 2 tables fees and students. i want to update one field of fees with 3 WHERE conditions, i.e, 2 conditions in table 'fees' and 1 condition in table 'students'.
I tried many queries like
UPDATE fees, students SET fees.dues= 300 WHERE fees.month= November
AND fees.session= 2017-18 AND students.class= Nursery
It gives me error like java.sql.SQLException: near",": syntax error
I am using sqlite as database. Please suggest me a query or let me correct this query.
Thanks
You cannot join tables in a UPDATE command in SQLite. Therefore, use a sub-query in the where condition
UPDATE fees
SET dues = 300
WHERE
month = November AND
session = 2017-18 AND
student_id IN (SELECT id FROM students WHERE class=Nursery)
Also, I am not sure about the types of your columns. String literals must be enclosed in single quotes ('). The expression 2017-18 would yield the number 2017 minus 18 = 1999. Should it be a string literal as well?
UPDATE fees
SET dues = 300
WHERE
month = 'November' AND
session = '2017-18' AND
student_id IN (SELECT id FROM students WHERE class='Nursery')

Data table - apply the same function on several columns to create new data table columns

I am working with data.table package. I have a data table which represents users actions on a website. Let's say that every user can visit a website, and perform multiple actions on it. My original data table is of actions (every row is an action) and I want to aggregate this information into a new data table, grouped by user visits (every visit has a unique ID). There are some fields which are shared by the actions of the same visit - for example - the user name, the user status, the visit number etc. At least one of the actions of each visit contains this info (not necessarily all of the actions). I want to retrieve, for each visit (= group of actions with the same visit ID), the value of this field, and set it to the visit in the visits new data table. For example, if I have the following original data table:
VisitID ActionNum UserName UserStatus VisitNum ActionType
aaaaaaa 1 John Active 5 x
aaaaaaa 2 Active y
aaaaaaa 3 John 5 z
bbbbbbb 1 NonActive w
bbbbbbb 2 Dan 7 t
I want to have a visits data table, as following:
VisitID UserName UserStatus VisitNum
aaaaaaa John Active 5
bbbbbbb Dan NonActive 7
I created a function that works on subset of data table (only the rows of the visit) and a field, and this function should be applied on several fields (UserName, UserStatus, VisitNum).
getGeneralField<- function(visitDT,field){
vec = visitDT[,get(field)]
return (unique(vec[vec != ""])[1])
}
The problem is that every trial to apply this function on .SD when by=VisitID results in something different than I planned... What is the best way to do it? I used !="" in order to avoid blank cells.
We specify the columns of interest in .SDcols, grouped by 'VisitID', loop through the columns in .SDcols (lapply(.SD, ...) and get the first non-blank element
dt[, lapply(.SD, function(x) x[nzchar(x)][1]), by = VisitID, .SDcols = 3:5]

Conditionally matching elements in multiple columns of two large datasets with each other

I have two very large datasets for demand and returns of products (about 4 million entries per dataset, but unequal length). The first dataset gives [1] the date of demand, [2] the id of the customer and [3] the id of the product. The second dataset gives the [1] date of return, [2] the id of the customer and [3] the id of the product.
Now I would like to match all demands for given customers and products with the returns of the same customer and product. Pairs of product types and customers are not unique, because customer can demand a product multiple times. Therefore, I want to match a demand for a product with the earliest return in the dataset. It can also happen that some products are not returned, or that some products are returned which have not been demanded (because customers return items that were demanded before the starting data in the dataset).
To that end I've written the following code:
transactionNumber = 1:nrow(demandSet) #transaction numbers for the demandSet
matchedNumber = rep(0, nrow(demandSet)) #vector of which values in the returnSet correspond to the transactions in the demandSet
for (transaction in transactionNumber){
indices <- which(returnSet[,2]==demandSet[transaction,2]&returnSet[,3]==demandSet[transaction,3])
if (length(indices)>0){
matchedNumber[transaction] <- indices[which.min(returnSet[indices,][,1])] #Select the index of the transaction with the minimum date
}
}
However, this takes around a day to compute. Anyone have a better suggestion? Note that the suggestions from match two columns with two other columns do not work here, since match() overflows memory.
As a working example consider
demandDates = c(1,1,1,5,6,6,8,8)
demandCustIds = c(1,1,1,2,3,3,1,1)
demandProdIds = c(1,2,3,4,1,5,2,6)
demandSet = data.frame(demandDates,demandCustIds,demandProdIds)
returnDates = c(1,1,4,4,4)
returnCustIds = c(4,4,1,1,1)
returnProdIds = c(5,7,1,2,3)
returnSet = data.frame(returnDates,returnCustIds,returnProdIds)
(This actually doesn't work completely correctly, since transaction 7 is incorrectly matched with return 4, however for the sake of the question lets assume this I what I want... I can fix this later)
require(data.table)
DD<-data.table(demandSet,key="demandCustIds,demandProdIds")
DR<-data.table(returnSet,key="returnCustIds,returnProdIds")
DD[DR,mult="first"]
demandCustIds demandProdIds demandDates returnDates
1: 1 1 1 4
2: 1 2 1 4
3: 1 3 1 4
4: 4 5 NA 1
5: 4 7 NA 1

Using two datasets in a single report using SQL server reporting service

I need to show a report of same set of data with different condition.
I need to show count of users registered by grouping region, country and userType, I have used drill down feature for showing this and is working fine. Also the reported data is the count of users registered between two dates. Along with that I have to show the total users in the system using the same drill down that is total users by region, country and usertype in a separate column along with each count (count of users between two date)
so that my result will be as follwsinitialy it will be like
Region - Country - New Reg - Total Reg - User Type 1 - UserType2
+ Region1 2 10 1 5 1 5
+ Region2 3 7 2 4 1 3
and upon expanding the region it will be like
Region - Country - New Reg - Total Reg - User Type 1 - UserType2
+ Region1 2 10 1 5 1 5
country1 1 2 1 2 - -
country2 1 8 1 8 - -
+ Region2 3 7 2 4 1 3
Is there a way I can show my report like this, I have tried with two data sets one with conditional datas and other with non conditional but it didn't work, its always bing total number of regiostered users for all the total reg columns
Unless I'm mistaken, you're trying to create an expandable table, with different grouping levels? Fortunately, this can be easily done in SSRS if you know where to look. The totals on your example don't seem to match up in the user columns, so I may have misunderstood the problem.
For starters, set up your query to produce a single dataset like this:
Region Country New Reg - Total Reg - User Type 1 - User Type 2
Region1 country1 1 2 1
Region1 country2 1 8 1
Region2 country3 2 4 1 1
Region2 country4 1 3 1
Now that you've got that, you want to set up a new table with the fields "NewReg", "TotalReg", "UserType1" and "UserType2". Then right-click the table row, and go to "Add Group > Row Group > Parent Group". Select "Country" in the Group by and click okay. Then, repeat this process and select "Region". This time however, tick the "Add group header" box. This will insert another row above the original.
Now, for each of your fields ("NewReg", "TotalReg" etc), click in the new row above and select the field again. this will automaticaly add a Sum(FieldName) value into the cell. This will add together all the individual row totals and present a new, grouped by region row when you run the report.
That should give you the table you require with the data aggregated correctly, so all you need to do is manage the show/hide the detail rows on demand.
To do this, select your detail row (the original row) and right-click "> Row visibility". Set this to "Hide". Now, select the cell that contains the "Region" and take note of its ID using Properties (for now, let's assume it's called "Region"). Click back onto your detail row and look at the properties window. At the bottom you'll see a "Visibility" setting. In there, set "InitialToggleState" to False and "ToggleItem" to the name of your region group's cell (i.e. "Region").
Now all that should be left is to do the formatting etc and tidy up.
I have solved this problem by taking all the records from DB and filtering the records to collect new reg count by using an expression as following
=Sum(IIF(Fields!RegisteredOn.Value >Parameters!FromDate.Value and Fields!RegisteredOn.Value < Parameters!EndDate.Value , 1,0))

Resources