SQLite 5 tables, Join, Sum, GroupBy - sqlite

I am stuck on an issue that I cannot resolve. I have a SQLite DB with five tables and the following columns:
Tab1 = {Job_ID, Company_ID, Source_ID}
Tab1_Category = {Job_ID, Category_ID}
Category = {ID, First_level, Second_level}
Tab2 = {Job_ID, Log_Date, Clicks, Applications}
Source = {ID, Name}
I created a sample DB:
CREATE TABLE TAB1 (
`job_id` INTEGER,
`company_id` INTEGER,
`source_id` INTEGER
);
INSERT INTO TAB1
(`job_id`, `company_id`, `source_id`)
VALUES
('1', '222', '2'),
('2', '222', '1'),
('3', '222', '1'),
('4', '222', '1'),
('5', '255', '3');
CREATE TABLE TAB1_CATEGORY (
`job_id` INTEGER,
`category_id` INTEGER
);
INSERT INTO TAB1_CATEGORY
(`job_id`, `category_id`)
VALUES
('1', '31'),
('2', '36'),
('3', '33'),
('3', '35'),
('4', '32'),
('4', '31'),
('5', '34');
CREATE TABLE CATEGORY (
`id` INTEGER,
`first_level` VARCHAR(3),
`second_level` VARCHAR(3)
);
INSERT INTO CATEGORY
(`id`, `first_level`, `second_level`)
VALUES
('30', 'sss', 'aaa'),
('31', 'sss', 'aaa'),
('32', 'sss', 'bbb'),
('33', 'ggg', 'ccc'),
('34', 'ggg', 'ddd'),
('35', 'ggg', 'eee'),
('36', 'hhh', 'fff');
CREATE TABLE SOURCE (
`id` INTEGER,
`name` VARCHAR(3)
);
INSERT INTO SOURCE
(`id`, `name`)
VALUES
('1', 'mmm'),
('2', 'nnn'),
('3', 'ooo');
CREATE TABLE TAB2 (
`job_id` INTEGER,
`log_date` VARCHAR(10),
`clicks` INTEGER,
`applications` INTEGER
);
INSERT INTO TAB2
(`job_id`, `log_date`, `clicks`, `applications`)
VALUES
('1', '01-01-1999', '6', '2'),
('1', '02-01-1999', '7', '3'),
('1', '03-01-1999', '9', '1'),
('2', '02-01-1999', '4', '1'),
('2', '05-01-1999', '8', '2'),
('3', '03-01-1999', '9', '0'),
('4', '05-01-1999', '5', '3'),
('4', '06-01-1999', '4', '1'),
('5', '01-01-1999', '1', '0'),
('5', '03-01-1999', '3', '1');
I need the following results with one query:
list of all JOB_ID (Tab1) and Company_ID (Tab1) where First_level(from table category) is "ggg" or "sss" and Name (from table Source) is "mmm"
sum of clicks and sum of applications (Tab2) per Job_ID
count of distinct Second_level (from table Category)
sum of applications for each company_ID (a company_ID can have many Job_ids)
This is what I did so far, but it is not working the way I want:
SELECT t1.job_id, t1.company_id,
SUM(t2.clicks), SUM(t2.applications), COUNT(DISTINCT c.second_level)
FROM TAB1 t1
JOIN SOURCE s ON s.id = t1.source_id
JOIN TAB1_CATEGORY tc ON t1.job_id = tc.job_id
JOIN CATEGORY c ON tc.category_id = c.id
JOIN TAB2 t2 ON t1.job_id = t2.job_id
WHERE c.first_level IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.job_id
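What I get is the sum of all clicks/applications and not the sum per job_id:
job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level)
-----: | ---------: | -------------: | -------------------: | -----------------------------:
3 | 222 | 18 | 0 | 2
4 | 222 | 18 | 8 | 2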
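And this is what I want to get:
job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level) | Total Appl per company
-----: | ---------: | -------------: | -------------------: | -----------------------------: | ---------------------:
3 | 222 | 9 | 0 | 2 | 4
4 | 222 | 9 | 4 | 2 | 4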

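First you must aggregate inside TAB2 (one row per JOB_ID) and only then join (with INNER joins). Otherwise every TAB2 row is repeated once for each matching category of the job, which inflates the sums.
You also need the SUM() window function for the column Total Appl per company: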
SELECT t1.JOB_ID, t1.COMPANY_ID,
t2.total_clicks, t2.total_apps,
COUNT(DISTINCT c.SECOND_LEVEL) count_second_level,
SUM(t2.total_apps) OVER (PARTITION BY t1.COMPANY_ID) [Total Appl per company]
FROM TAB1 t1
INNER JOIN SOURCE s ON s.ID = t1.SOURCE_ID
INNER JOIN TAB1_CATEGORY tc ON t1.JOB_ID = tc.JOB_ID
INNER JOIN CATEGORY c ON tc.CATEGORY_ID = c.ID
INNER JOIN (
SELECT JOB_ID, SUM(CLICKS) total_clicks, SUM(APPLICATIONS) total_apps
FROM TAB2
GROUP BY JOB_ID
) t2 ON t1.JOB_ID = t2.JOB_ID
WHERE c.FIRST_LEVEL IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.JOB_ID, t1.COMPANY_ID, t2.total_clicks, t2.total_apps
See the demo.
Results:
> job_id | company_id | total_clicks | total_apps | count_second_level | Total Appl per company
> -----: | ---------: | -----------: | ---------: | -----------------: | ---------------------:
> 3 | 222 | 9 | 0 | 2 | 4
> 4 | 222 | 9 | 4 | 2 | 4
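The effect of joining before aggregating can be reproduced with a small script. This is only a sketch, assuming Python's standard sqlite3 module linked against SQLite 3.25 or later (needed for window functions); the variable names are illustrative and not part of the original answer. It first counts how many joined rows each job produces, then runs the naive query and the pre-aggregated one on the question's sample data.

```python
import sqlite3

# In-memory copy of the question's sample data (values inserted as integers).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TAB1 (job_id INTEGER, company_id INTEGER, source_id INTEGER);
INSERT INTO TAB1 VALUES (1,222,2),(2,222,1),(3,222,1),(4,222,1),(5,255,3);
CREATE TABLE TAB1_CATEGORY (job_id INTEGER, category_id INTEGER);
INSERT INTO TAB1_CATEGORY VALUES (1,31),(2,36),(3,33),(3,35),(4,32),(4,31),(5,34);
CREATE TABLE CATEGORY (id INTEGER, first_level TEXT, second_level TEXT);
INSERT INTO CATEGORY VALUES (30,'sss','aaa'),(31,'sss','aaa'),(32,'sss','bbb'),
    (33,'ggg','ccc'),(34,'ggg','ddd'),(35,'ggg','eee'),(36,'hhh','fff');
CREATE TABLE SOURCE (id INTEGER, name TEXT);
INSERT INTO SOURCE VALUES (1,'mmm'),(2,'nnn'),(3,'ooo');
CREATE TABLE TAB2 (job_id INTEGER, log_date TEXT, clicks INTEGER, applications INTEGER);
INSERT INTO TAB2 VALUES (1,'01-01-1999',6,2),(1,'02-01-1999',7,3),(1,'03-01-1999',9,1),
    (2,'02-01-1999',4,1),(2,'05-01-1999',8,2),(3,'03-01-1999',9,0),
    (4,'05-01-1999',5,3),(4,'06-01-1999',4,1),(5,'01-01-1999',1,0),(5,'03-01-1999',3,1);
""")

# Joins and filter shared by all three queries below.
JOINS = """
    FROM TAB1 t1
    JOIN SOURCE s ON s.id = t1.source_id
    JOIN TAB1_CATEGORY tc ON tc.job_id = t1.job_id
    JOIN CATEGORY c ON c.id = tc.category_id
"""
FILTER = "WHERE c.first_level IN ('ggg', 'sss') AND s.name = 'mmm'"

# Joined rows per job: (number of categories) x (number of TAB2 rows).
rows_per_job = conn.execute(f"""
    SELECT t1.job_id, COUNT(*)
    {JOINS}
    JOIN TAB2 t2 ON t2.job_id = t1.job_id
    {FILTER}
    GROUP BY t1.job_id
    ORDER BY t1.job_id
""").fetchall()

# Naive version: SUM() runs over the multiplied rows.
naive = conn.execute(f"""
    SELECT t1.job_id, t1.company_id, SUM(t2.clicks), SUM(t2.applications),
           COUNT(DISTINCT c.second_level)
    {JOINS}
    JOIN TAB2 t2 ON t2.job_id = t1.job_id
    {FILTER}
    GROUP BY t1.job_id, t1.company_id
    ORDER BY t1.job_id
""").fetchall()

# Pre-aggregated version: TAB2 is collapsed to one row per job before joining.
fixed = conn.execute(f"""
    SELECT t1.job_id, t1.company_id, t2.total_clicks, t2.total_apps,
           COUNT(DISTINCT c.second_level),
           SUM(t2.total_apps) OVER (PARTITION BY t1.company_id)
    {JOINS}
    JOIN (SELECT job_id, SUM(clicks) AS total_clicks, SUM(applications) AS total_apps
          FROM TAB2 GROUP BY job_id) t2 ON t2.job_id = t1.job_id
    {FILTER}
    GROUP BY t1.job_id, t1.company_id, t2.total_clicks, t2.total_apps
    ORDER BY t1.job_id
""").fetchall()

print(rows_per_job)  # [(3, 2), (4, 4)]
print(naive)         # [(3, 222, 18, 0, 2), (4, 222, 18, 8, 2)]
print(fixed)         # [(3, 222, 9, 0, 2, 4), (4, 222, 9, 4, 2, 4)]
```

Job 3 has two categories and one TAB2 row, job 4 has two categories and two TAB2 rows, so the naive sums come out exactly doubled. Note that the company total only covers the jobs that survive the filter.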

Related

Oracle left join with Union gives duplicate records

I have two tables, payment and payment_info. I am trying to get the ordnums with a left join and a union, but it gives me duplicate records. My table data is below.
Payment Table:
id | invoice | cardpay | gpay | phonepe
-- | ------- | ------- | ----------- | -------
1 | 4567 | 0000123 | null | null
2 | 4567 | null | dummy#dummy | null
3 | 4567 | null | null | P#dummy
4 | 4568 | 0000124 | null | null
Payment_info Table:
ordnum | payment_method | payment_value
------ | -------------- | -------------
101 | C | 0000123
102 | G | dummy#dummy
103 | C | 0000124
Query:
select pinfo.ordnum from
payment p
left join payment_info pinfo
on (select payment_info.ordnum
from payment_info
where payment_info.payment_method = 'C'
and payment_info.payment_value = p.cardpay
union
select payment_info.ordnum
from payment_info
where payment_info.payment_method = 'G'
and payment_info.payment_value = p.gpay
union
select payment_info.ordnum
from payment_info
where payment_info.payment_method = 'P'
and payment_info.payment_value = p.phonepe ) = pinfo.ordnum
where p.invoice = '4567'
Result: It gives 3 duplicate records.
ordnum
101
101
101
Expected result: 101, 102, null
Can you please explain why it generates duplicate records, and how I can solve this? It works well with "OR", but that would cause a performance issue, so any other solution would be really helpful.
Also, in SQL Server it works fine.
How about something simpler?
Sample data:
with
  payment (id, invoice, cardpay, gpay, phonepe) as
    (select 1, 4567, '0000123', null         , null      from dual union all
     select 2, 4567, null     , 'dummy#dummy', null      from dual union all
     select 3, 4567, null     , null         , 'P#dummy' from dual union all
     select 4, 4568, '0000124', null         , null      from dual
    ),
  payment_info (ordnum, payment_method, payment_value) as
    (select 101, 'C', '0000123'     from dual union all
     select 102, 'G', 'dummy#dummy' from dual union all
     select 103, 'C', '0000124'     from dual
    )
Query begins here: outer join it is, but simplified as coalesce returns the first non-null value. This should be OK because there can be only one payment "method" (cardpay, gpay or phonepe) per ID.
select p.id, i.ordnum
from payment p left join payment_info i on
  coalesce(p.cardpay, p.gpay, p.phonepe) = i.payment_value
where p.invoice = 4567;
        ID     ORDNUM
---------- ----------
         1        101
         2        102
         3
You can use a single LEFT OUTER JOIN:
SELECT i.ordnum
FROM payment p
LEFT OUTER JOIN payment_info i
ON ( (i.payment_method, i.payment_value) IN (
('C', p.cardpay),
('G', p.gpay),
('P', p.phonepe)
) )
WHERE p.invoice = 4567;
or:
SELECT i.ordnum
FROM payment p
LEFT OUTER JOIN payment_info i
ON ( (i.payment_method = 'C' AND i.payment_value = p.cardpay)
OR (i.payment_method = 'G' AND i.payment_value = p.gpay)
OR (i.payment_method = 'P' AND i.payment_value = p.phonepe)
)
WHERE p.invoice = 4567;
Which, for the sample data:
CREATE TABLE payment (id, invoice, cardpay, gpay, phonepe) AS
SELECT 1, 4567, '0000123', NULL , NULL FROM DUAL UNION ALL
SELECT 2, 4567, NULL , 'dummy#dummy', NULL FROM DUAL UNION ALL
SELECT 3, 4567, NULL , NULL , 'P#dummy' FROM DUAL UNION ALL
SELECT 4, 4568, '0000124', NULL , NULL FROM DUAL;
CREATE TABLE payment_info (ordnum, payment_method, payment_value) AS
SELECT 101, 'C', '0000123' FROM DUAL UNION ALL
SELECT 102, 'G', 'dummy#dummy' FROM DUAL UNION ALL
SELECT 103, 'C', '0000124' FROM DUAL;
Both output:
ORDNUM
101
102
null
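The two join styles above (coalesce on the value only, versus matching method and value together) agree on the sample data but not in general. The following is a rough sketch of that difference, assuming Python's sqlite3 module instead of Oracle; the extra payment_info row with ordnum 104 is hypothetical and was not in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (id INTEGER, invoice INTEGER, cardpay TEXT, gpay TEXT, phonepe TEXT);
INSERT INTO payment VALUES
    (1, 4567, '0000123', NULL, NULL),
    (2, 4567, NULL, 'dummy#dummy', NULL),
    (3, 4567, NULL, NULL, 'P#dummy'),
    (4, 4568, '0000124', NULL, NULL);
CREATE TABLE payment_info (ordnum INTEGER, payment_method TEXT, payment_value TEXT);
INSERT INTO payment_info VALUES
    (101, 'C', '0000123'), (102, 'G', 'dummy#dummy'), (103, 'C', '0000124');
""")

# Join that checks the payment method together with the value.
METHOD_AWARE = """
    SELECT p.id, i.ordnum
    FROM payment p
    LEFT JOIN payment_info i
      ON (i.payment_method = 'C' AND i.payment_value = p.cardpay)
      OR (i.payment_method = 'G' AND i.payment_value = p.gpay)
      OR (i.payment_method = 'P' AND i.payment_value = p.phonepe)
    WHERE p.invoice = 4567
    ORDER BY p.id, i.ordnum
"""

# Join on the first non-null payment column only; the method is ignored.
COALESCE_ONLY = """
    SELECT p.id, i.ordnum
    FROM payment p
    LEFT JOIN payment_info i
      ON COALESCE(p.cardpay, p.gpay, p.phonepe) = i.payment_value
    WHERE p.invoice = 4567
    ORDER BY p.id, i.ordnum
"""

before_aware = conn.execute(METHOD_AWARE).fetchall()
before_coalesce = conn.execute(COALESCE_ONLY).fetchall()

# Hypothetical row: a 'G' record whose value happens to equal a card value.
conn.execute("INSERT INTO payment_info VALUES (104, 'G', '0000123')")
after_aware = conn.execute(METHOD_AWARE).fetchall()
after_coalesce = conn.execute(COALESCE_ONLY).fetchall()

print(before_aware)     # [(1, 101), (2, 102), (3, None)]
print(before_coalesce)  # [(1, 101), (2, 102), (3, None)]
print(after_aware)      # [(1, 101), (2, 102), (3, None)]
print(after_coalesce)   # [(1, 101), (1, 104), (2, 102), (3, None)]
```

If payment values are guaranteed to be unique across methods, the shorter coalesce join is enough; otherwise the method-aware condition is the safer choice.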

Kusto | KQL: Expand dynamic column to all combinations of two ( Couples | Tuples )

I have a scenario where I am trying to create a view that shows me all the unique couples of values per key. For example:
datatable(Key:string, Value:string)[
'1', 'A',
'2', 'B',
'2', 'C',
'3', 'A',
'3', 'B',
'3', 'C',
'3', 'C']
| sort by Key, Value asc
| summarize Tuples=make_set(Value) by Key
Result:
Key Tuples
1 ["A"]
2 ["B","C"]
3 ["A","B","C"]
Desired Result:
Key Tuples
1 ["A"]
2 ["B","C"]
3 ["A","B"]
3 ["A","C"]
3 ["B","C"]
How can I achieve this in KQL?
Here's a way that is neither very elegant nor very efficient. It uses an inner self join to get all combinations per Key:
datatable(Key:string, Value:string)
[
'1', 'A',
'2', 'B',
'2', 'C',
'3', 'A',
'3', 'B',
'3', 'C',
'3', 'C'
]
| distinct Key, Value
| as hint.materialized=true T1
| join kind=inner T1 on Key
| where Value != Value1
| project Key, Tuple = tostring(array_sort_asc(pack_array(Value, Value1)))
| distinct Key, Tuple
| as hint.materialized=true T2
| union (
T1
| where Key !in ((T2 | project Key)) | project Key, Tuple = tostring(pack_array(Value))
)
| order by Key asc, Tuple asc
Key Tuple
1 ["A"]
2 ["B","C"]
3 ["A","B"]
3 ["A","C"]
3 ["B","C"]
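For readers who want to check the expected output outside Kusto, the same logic can be restated in a few lines of Python. This is not KQL and not a replacement for the query above, just a sketch of the rule it implements: deduplicate values per key, emit every unordered pair, and keep a single-value key as a one-element tuple. The function name unique_pairs is made up for this illustration.

```python
from itertools import combinations

# Same rows as the datatable in the question, including the duplicate ('3', 'C').
rows = [("1", "A"), ("2", "B"), ("2", "C"),
        ("3", "A"), ("3", "B"), ("3", "C"), ("3", "C")]

def unique_pairs(rows):
    """Return (key, values) entries: all 2-combinations per key, or the lone value."""
    values_by_key = {}
    for key, value in rows:
        # A set drops duplicate values, like `distinct Key, Value`.
        values_by_key.setdefault(key, set()).add(value)

    result = []
    for key in sorted(values_by_key):
        values = sorted(values_by_key[key])
        if len(values) < 2:
            # Keys with a single value have no pair; keep them as they are.
            result.append((key, values))
        else:
            # combinations() yields each unordered pair once, already sorted.
            result.extend((key, list(pair)) for pair in combinations(values, 2))
    return result

pairs = unique_pairs(rows)
for key, values in pairs:
    print(key, values)
```

A key with n distinct values produces n*(n-1)/2 rows, so the output grows quadratically for keys with many values, which is also true of the self join.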

Oracle Query on 3 tables with 2 outer joins

I'm having some trouble writing a query that seems like it should be simple, but the solution is evading me.
We have three tables (simplified for the purpose of this question):
persons - a table of user names:
per_id number(10) - primary key, populated by a sequence
user_name varchar2(50)
user_id varchar2(15) - unique, basically the employee ID
work_assignments - kind of like crew assignments, but more general:
wa_id number(10) - primary key, populated by a sequence
wa_name varchar2(25)
current_assignments - which users have which work_assignments; the average per user is about 25 work assignments, but some "lucky" individuals have upwards of 150:
wa_id number(10)
per_id number(10)
I'm trying to write a query that will compare the work_assignments for two users, in a total of three columns. The results should look like this:
WA_Name User_Name1 User_Name2
Crew A Bob Joe
Crew B Joe
Crew C Bob
Basically, every work_assignment that either of the two users has, with the name(s) of the user(s) who have it.
Here's the closest I could come up with (well, I did come up with an ugly query with 3 subqueries that does the job, but it seems like there should be a more elegant solution):
select distinct * from (
select wa.wa_name work_assignment,
per.user_name user_name1,
per2.user_name user_name2
from work_assignments wa join current_assignments ca on wa.wa_id = ca.wa_id
join current_assignments ca2 on wa.wa_id = ca2.wa_id
left outer join persons per on per.per_id = ca.per_id and per.user_id = 'X12345'
left outer join persons per2 on per2.per_id = ca2.per_id and per2.user_id = 'Y67890'
)
where user_name1 is not null or user_name2 is not null
order by 1;
The problem with this one is that if both users have a work assignment, it shows 3 records: one for Bob, one for Joe, and one for both:
WA_Name User_Name1 User_Name2
Crew A Bob Joe
Crew A Joe
Crew A Bob
Please help!
Thanks,
Dan
I created a set of sample data/tables
drop table persons;
drop table work_assgn;
drop table curr_assgn;
create table persons(
per_id number(10) not null
, user_name varchar2(10) not null
, user_id varchar2(10) not null
)
;
insert into persons values( 1, 'Bob', 'X123' );
insert into persons values( 2, 'Joe', 'Y456' );
insert into persons values( 3, 'Mike', 'Z789' );
insert into persons values( 4, 'Jeff', 'J987' );
commit;
create table work_assgn(
wa_id number(10) not null
, wa_name varchar2(25)
)
;
insert into work_assgn values( 10, 'Crew A' );
insert into work_assgn values( 20, 'Crew B' );
insert into work_assgn values( 30, 'Crew C' );
insert into work_assgn values( 40, 'Crew D' );
commit;
create table curr_assgn(
wa_id number(10) not null
, per_id number(10) not null
)
;
insert into curr_assgn values( 10, 1 );
insert into curr_assgn values( 10, 2 );
insert into curr_assgn values( 20, 2 );
insert into curr_assgn values( 30, 1 );
insert into curr_assgn values( 40, 4 );
commit;
select * from persons;
select * from work_assgn;
select * from curr_assgn;
So the data looks like
PERSONS
PER_ID USER_NAME USER_ID
---------- ---------- ----------
1 Bob X123
2 Joe Y456
3 Mike Z789
4 Jeff J987
WORK_ASSGN
WA_ID WA_NAME
---------- -------------------------
10 Crew A
20 Crew B
30 Crew C
40 Crew D
CURR_ASSGN
WA_ID PER_ID
---------- ----------
10 1
10 2
20 2
30 1
40 4
One approach may be to use a PIVOT
with assignment as
(
select p.user_id, p.user_name, a.wa_name
from persons p
join curr_assgn c
on p.per_id =c.per_id
join work_assgn a
on a.wa_id = c.wa_id
where p.user_id in ( 'X123', 'Y456' )
)
select * from assignment
pivot
( max(user_name) for user_id in ( 'X123', 'Y456' )
)
;
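On databases without a PIVOT clause, the same shape can usually be produced with conditional aggregation. The sketch below assumes Python's sqlite3 module rather than Oracle and reuses the sample tables from this answer; the column aliases user_name1 and user_name2 are chosen to match the layout asked for in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE persons (per_id INTEGER, user_name TEXT, user_id TEXT);
INSERT INTO persons VALUES (1,'Bob','X123'),(2,'Joe','Y456'),(3,'Mike','Z789'),(4,'Jeff','J987');
CREATE TABLE work_assgn (wa_id INTEGER, wa_name TEXT);
INSERT INTO work_assgn VALUES (10,'Crew A'),(20,'Crew B'),(30,'Crew C'),(40,'Crew D');
CREATE TABLE curr_assgn (wa_id INTEGER, per_id INTEGER);
INSERT INTO curr_assgn VALUES (10,1),(10,2),(20,2),(30,1),(40,4);
""")

# One output row per work assignment; each CASE picks the name of one user
# and MAX() collapses the group to that name (or NULL if the user is absent).
comparison = conn.execute("""
    SELECT a.wa_name,
           MAX(CASE WHEN p.user_id = 'X123' THEN p.user_name END) AS user_name1,
           MAX(CASE WHEN p.user_id = 'Y456' THEN p.user_name END) AS user_name2
    FROM persons p
    JOIN curr_assgn c ON p.per_id = c.per_id
    JOIN work_assgn a ON a.wa_id = c.wa_id
    WHERE p.user_id IN ('X123', 'Y456')
    GROUP BY a.wa_name
    ORDER BY a.wa_name
""").fetchall()

for row in comparison:
    print(row)
# ('Crew A', 'Bob', 'Joe')
# ('Crew B', None, 'Joe')
# ('Crew C', 'Bob', None)
```

Crew D does not appear because neither of the two users is assigned to it, which matches the requirement of listing only assignments that at least one of them has.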

select distinct sum of item that can exists in several tables

Let's say that I have 3 tables: Articles1, Articles2, Articles3.
It's possible that the same article group exists in two of these tables.
I only want to sum the amount for each article group that exists in Articles1 and does not exist in the other tables.
Tables:
Articles1
| Id | ArticleName | Amount |
-----------------------------------------
'1' 'Apple' '2'
'2' 'Orange' '2'
'3' 'Banana' '3'
Articles2
| Id | ArticleName | Amount |
-----------------------------------------
'1' 'Apple' '2'
'2' 'Orange' '2'
Articles3
| Id | ArticleName | Amount |
-----------------------------------------
'1' 'Apple' '2'
'2' 'Orange' '2'
My code:
SELECT SUM(a1.Amount)
FROM Articles1 a1
LEFT OUTER JOIN Articles2 a2
ON a1.Id = a2.Id
LEFT OUTER JOIN Articles3 a3
ON a1.Id = a3.Id
WHERE a1.Id <> a2.Id OR a1.Id <> a3.Id
GROUP BY a1.ArticleName
I modified your query to:
select sum(a1.Amount)
FROM Articles1 a1
WHERE
a1.Id not in (select Id from Articles2) and
a1.Id not in (select Id from Articles3)
GROUP BY a1.ArticleName
This returns 3 as output i.e. only for Banana.
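One caveat with NOT IN is worth knowing: if the subquery returns a NULL, the predicate never evaluates to true and the outer query returns no rows. A NOT EXISTS form avoids that. The sketch below assumes Python's sqlite3 module and integer amounts; the row with a NULL Id in Articles2 is hypothetical and only there to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Articles1 (Id INTEGER, ArticleName TEXT, Amount INTEGER);
INSERT INTO Articles1 VALUES (1,'Apple',2),(2,'Orange',2),(3,'Banana',3);
CREATE TABLE Articles2 (Id INTEGER, ArticleName TEXT, Amount INTEGER);
INSERT INTO Articles2 VALUES (1,'Apple',2),(2,'Orange',2);
CREATE TABLE Articles3 (Id INTEGER, ArticleName TEXT, Amount INTEGER);
INSERT INTO Articles3 VALUES (1,'Apple',2),(2,'Orange',2);
""")

NOT_IN = """
    SELECT a1.ArticleName, SUM(a1.Amount)
    FROM Articles1 a1
    WHERE a1.Id NOT IN (SELECT Id FROM Articles2)
      AND a1.Id NOT IN (SELECT Id FROM Articles3)
    GROUP BY a1.ArticleName
"""

NOT_EXISTS = """
    SELECT a1.ArticleName, SUM(a1.Amount)
    FROM Articles1 a1
    WHERE NOT EXISTS (SELECT 1 FROM Articles2 a2 WHERE a2.Id = a1.Id)
      AND NOT EXISTS (SELECT 1 FROM Articles3 a3 WHERE a3.Id = a1.Id)
    GROUP BY a1.ArticleName
"""

# On the question's data both forms agree.
not_in_clean = conn.execute(NOT_IN).fetchall()
not_exists_clean = conn.execute(NOT_EXISTS).fetchall()

# Hypothetical row with a NULL Id: "3 NOT IN (1, 2, NULL)" is unknown, not true.
conn.execute("INSERT INTO Articles2 VALUES (NULL, 'Unknown', 1)")
not_in_with_null = conn.execute(NOT_IN).fetchall()
not_exists_with_null = conn.execute(NOT_EXISTS).fetchall()

print(not_in_clean)          # [('Banana', 3)]
print(not_exists_clean)      # [('Banana', 3)]
print(not_in_with_null)      # []
print(not_exists_with_null)  # [('Banana', 3)]
```

If Id is declared NOT NULL (for example as a primary key) in all three tables, the two forms are equivalent and the NOT IN version above is fine.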

Get the most recent record for each user where value is 'K', action id is null or its state is 1

I have the following tables in SQL Server:
user_id, value, date, action_id
----------------------------------
1 A 1/3/2012 null
1 K 1/4/2012 null
1 B 1/5/2012 null
2 X 1/3/2012 null
2 K 1/4/2012 1
3 K 1/3/2012 null
3 L 1/4/2012 2
3 K 1/5/2012 3
4 K 1/3/2012 null
action_id, state
----------------------------------
1 0
2 1
3 1
4 0
5 1
I need to return the most recent record for each user where the value is 'K', the action id is either null or its state is set to 1. Here's the result set I want:
user_id, value, date, action_id
----------------------------------
3 K 1/5/2012 3
4 K 1/3/2012 null
For user_id 1, the most recent value is B and its action id is null, so I consider this the most recent record, but its value is not K.
For user_id 2, the most recent value is K, but action id 1 has state 0, so I fall back to X, but X is not K.
user_id 3 and 4 are straightforward.
I'm interested in Linq to SQL query in ASP.NET, but for now T-SQL is fine too.
The SQL query would be :
Select Top 1 T1.* from Table1 T1
LEFT JOIN Table2 T2
ON T1.action_id = T2.action_id
Where T1.Value = 'K' AND (T1.action_id is null or T2.state = 1)
Order by T1.date desc
LINQ Query :
var result = context.Table1.Where(T1=> T1.Value == "K"
&& (T1.action_id == null ||
context.Table2
.Where(T2=>T2.State == 1)
.Select(T2 => T2.action_id).Contains(T1.action_id)))
.OrderByDescending(T => T.date)
.FirstOrDefault();
Good Luck !!
This query will return desired result set:
SELECT
*
FROM
(
SELECT
user_id
,value
,date
,action_id
,ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date DESC) RowNum
FROM
testtable
WHERE
value = 'K'
) testtable
WHERE
RowNum = 1
You can also try the following approach if the user_id and date combination is unique.
Make sure to get the order of the predicates in the join right so that indexes can be used:
SELECT
testtable.*
FROM
(
SELECT
user_id
,MAX(date) LastDate
FROM
testtable
WHERE
value = 'K'
GROUP BY
user_id
) tblLastValue
INNER JOIN
testtable
ON
testtable.user_id = tblLastValue.user_id
AND
testtable.date = tblLastValue.LastDate
This would select the top entries for all users as described in your specification, as opposed to TOP 1 which just selects the most recent entry in the database. I'm assuming here that your tables are named users and actions:
WITH usersactions as
(SELECT
u.user_id,
u.value,
u.date,
u.action_id,
ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY u.date DESC, u.action_id DESC) as row
FROM users u
LEFT OUTER JOIN actions a ON u.action_id = a.action_id
WHERE
u.value = 'K' AND
(u.action_id IS NULL OR a.state = 1)
)
SELECT * FROM usersactions WHERE row = 1
Or if you don't want to use a CTE:
SELECT * FROM
(SELECT
u.user_id,
u.value,
u.date,
u.action_id,
ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY u.date DESC, u.action_id DESC) as row
FROM users u
LEFT OUTER JOIN actions a ON u.action_id = a.action_id
WHERE
u.value = 'K' AND
(u.action_id IS NULL OR a.state = 1)
) useractions
WHERE row = 1
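Where the value = 'K' test is applied changes the result. The question's expected output excludes user 1, whose latest record is B, which suggests ranking the eligible records first and checking the value afterwards. The following sketch compares both orders; it assumes Python's sqlite3 module (SQLite 3.25+ for ROW_NUMBER), the table names users and actions from the answer above, and ISO-formatted dates so that string ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, value TEXT, date TEXT, action_id INTEGER);
INSERT INTO users VALUES
    (1,'A','2012-01-03',NULL),(1,'K','2012-01-04',NULL),(1,'B','2012-01-05',NULL),
    (2,'X','2012-01-03',NULL),(2,'K','2012-01-04',1),
    (3,'K','2012-01-03',NULL),(3,'L','2012-01-04',2),(3,'K','2012-01-05',3),
    (4,'K','2012-01-03',NULL);
CREATE TABLE actions (action_id INTEGER, state INTEGER);
INSERT INTO actions VALUES (1,0),(2,1),(3,1),(4,0),(5,1);
""")

# Rank every eligible record (no action, or action in state 1), then keep the
# newest one per user only if its value is 'K'.
rank_then_filter = conn.execute("""
    WITH eligible AS (
        SELECT u.user_id, u.value, u.date, u.action_id,
               ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY u.date DESC) AS rn
        FROM users u
        LEFT JOIN actions a ON a.action_id = u.action_id
        WHERE u.action_id IS NULL OR a.state = 1
    )
    SELECT user_id, value, date, action_id
    FROM eligible
    WHERE rn = 1 AND value = 'K'
    ORDER BY user_id
""").fetchall()

# Filter on 'K' before ranking: returns the newest 'K' record per user,
# even when a newer non-'K' record exists (user 1).
filter_then_rank = conn.execute("""
    WITH eligible AS (
        SELECT u.user_id, u.value, u.date, u.action_id,
               ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY u.date DESC) AS rn
        FROM users u
        LEFT JOIN actions a ON a.action_id = u.action_id
        WHERE u.value = 'K' AND (u.action_id IS NULL OR a.state = 1)
    )
    SELECT user_id, value, date, action_id
    FROM eligible
    WHERE rn = 1
    ORDER BY user_id
""").fetchall()

print(rank_then_filter)
# [(3, 'K', '2012-01-05', 3), (4, 'K', '2012-01-03', None)]
print(filter_then_rank)
# [(1, 'K', '2012-01-04', None), (3, 'K', '2012-01-05', 3), (4, 'K', '2012-01-03', None)]
```

Only the first variant reproduces the two rows listed in the question; which one is wanted depends on whether a newer non-K record should hide an older K record.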
