Kusto | KQL: Expand dynamic column to all combinations of two ( Couples | Tuples ) - azure-data-explorer

I have a scenario where I am trying to create a view that shows me all the unique couples of values per key. For example:
datatable(Key:string, Value:string)[
'1', 'A',
'2', 'B',
'2', 'C',
'3', 'A',
'3', 'B',
'3', 'C',
'3', 'C']
| sort by Key, Value asc
| summarize Tuples=make_set(Value) by Key
Result:
Key Tuples
1 ["A"]
2 ["B","C"]
3 ["A","B","C"]
Desired Result:
Key Tuples
1 ["A"]
2 ["B","C"]
3 ["A","B"]
3 ["A","C"]
3 ["B","C"]
How can I achieve this in KQL?

Here's a way that is neither particularly elegant nor efficient: it uses an inner self-join to get all combinations per Key, then adds back the keys that have only a single value.
datatable(Key:string, Value:string)
[
'1', 'A',
'2', 'B',
'2', 'C',
'3', 'A',
'3', 'B',
'3', 'C',
'3', 'C'
]
| distinct Key, Value
| as hint.materialized=true T1
| join kind=inner T1 on Key
| where Value != Value1
| project Key, Tuple = tostring(array_sort_asc(pack_array(Value, Value1)))
| distinct Key, Tuple
| as hint.materialized=true T2
| union (
T1
| where Key !in ((T2 | project Key))
| project Key, Tuple = tostring(pack_array(Value))
)
| order by Key asc, Tuple asc
Result:
Key Tuple
1 ["A"]
2 ["B","C"]
3 ["A","B"]
3 ["A","C"]
3 ["B","C"]
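The self-join logic above can be cross-checked outside Kusto. The following is a minimal Python sketch, not part of the original answer, that mirrors the same steps on the sample data: take distinct rows, pair values that share a key, normalise the order of each pair, and add back keys that have only a single value. The variable names are illustrative only.

```python
# Sample rows from the question: (Key, Value)
rows = [("1", "A"), ("2", "B"), ("2", "C"),
        ("3", "A"), ("3", "B"), ("3", "C"), ("3", "C")]

# Step 1: distinct Key, Value (the role of T1 in the KQL query)
t1 = sorted(set(rows))

# Step 2: inner self-join on Key, keep pairs of different values,
# and sort each pair so that (B, C) and (C, B) collapse into one tuple
t2 = {
    (key, tuple(sorted((value, other))))
    for key, value in t1
    for other_key, other in t1
    if key == other_key and value != other
}

# Step 3: keys that produced no pair keep their single value
keys_with_pairs = {key for key, _ in t2}
singles = {(key, (value,)) for key, value in t1 if key not in keys_with_pairs}

result = sorted(t2 | singles)
for key, tup in result:
    print(key, list(tup))
```

Like the KQL version, the nested loop is quadratic in the number of values per key, which is why the answer describes the approach as not especially efficient.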

Related

Join values of two tables that represent the missing values between those tables in SQLITE

I have the following table
CREATE TABLE holes (`tournament_id` INTEGER, `year` INTEGER, `course_id` INTEGER, `round` INTEGER, `hole` INTEGER, `front` INTEGER, `side` INTEGER, `region` INTEGER);
With the following data sample
INSERT INTO holes (`tournament_id`, `year`, `course_id`, `round`, `hole`, `front`, `side`, `region`) VALUES
('33', '2016', '895', '1', '1', '12', '5', 'L'),
('33', '2016', '895', '1', '2', '18', '10', 'R'),
('33', '2016', '895', '1', '3', '15', '7', 'R'),
('33', '2016', '895', '1', '4', '11', '7', 'R'),
('33', '2016', '895', '1', '5', '18', '7', 'L'),
('33', '2016', '895', '1', '6', '28', '5', 'L'),
('33', '2016', '895', '1', '7', '21', '12', 'R');
In addition, I have another table tournaments
CREATE TABLE tournaments (`tournament_id` INTEGER, `year` INTEGER, `R1` INTEGER, `R2` INTEGER, `R3` INTEGER, `R4` INTEGER);
With data
INSERT INTO tournaments VALUES
(33, 2016, 715, 715, 895, 400);
The values for R1, R2, R3 and R4 represent the ids of the courses.
I want the columns tournament_id, year and course_id that are missing in table holes based on all the possible values of table tournaments.
With the help of this answer I tried the following:
WITH h AS (
SELECT DISTINCT tournament_id, year, course_id
FROM holes)
SELECT t.tournament_id, t.year
FROM tournaments t
WHERE NOT EXISTS (
SELECT *
FROM h
WHERE h.tournament_id = t.tournament_id
AND h.year = t.year
AND h.course_id IN (t.R1, t.R2, t.R3, t.R4)
);
demo
The above goes a long way but I also want the h.course_id that is/are missing. Desired result:
33 2016 715
33 2016 400
These combinations of tournament_id, year and course_id are not present in holes. However, they do exist because they are present in tournaments.
For this requirement you need a resultset consisting of all the values of the Rx columns which you can get with UNION in a CTE.
Then you can use NOT EXISTS to get all the combinations of tournament_id, year and course_id that do not exist in holes:
WITH cte AS (
SELECT tournament_id, year, R1 AS course_id FROM tournaments
UNION
SELECT tournament_id, year, R2 FROM tournaments
UNION
SELECT tournament_id, year, R3 FROM tournaments
UNION
SELECT tournament_id, year, R4 FROM tournaments
)
SELECT c.*
FROM cte c
WHERE NOT EXISTS (
SELECT *
FROM holes h
WHERE (h.tournament_id, h.year, h.course_id) = (c.tournament_id, c.year, c.course_id)
);
See the demo.
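To try the UNION plus NOT EXISTS approach locally, here is a small Python sketch using the standard `sqlite3` module with an in-memory database. It assumes the column names from the question's schema (tournament_id, year, course_id), uses a trimmed-down holes table, and spells the comparison out with AND instead of a row-value comparison so that it also runs on older SQLite builds. The ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE holes (tournament_id INTEGER, year INTEGER, course_id INTEGER, hole INTEGER);
INSERT INTO holes VALUES (33, 2016, 895, 1), (33, 2016, 895, 2), (33, 2016, 895, 3);
CREATE TABLE tournaments (tournament_id INTEGER, year INTEGER,
                          R1 INTEGER, R2 INTEGER, R3 INTEGER, R4 INTEGER);
INSERT INTO tournaments VALUES (33, 2016, 715, 715, 895, 400);
""")

missing = con.execute("""
WITH cte AS (
    -- unpivot R1..R4 into one course_id column; UNION removes the duplicate 715
    SELECT tournament_id, year, R1 AS course_id FROM tournaments
    UNION SELECT tournament_id, year, R2 FROM tournaments
    UNION SELECT tournament_id, year, R3 FROM tournaments
    UNION SELECT tournament_id, year, R4 FROM tournaments
)
SELECT c.tournament_id, c.year, c.course_id
FROM cte c
WHERE NOT EXISTS (
    SELECT 1 FROM holes h
    WHERE h.tournament_id = c.tournament_id
      AND h.year = c.year
      AND h.course_id = c.course_id
)
ORDER BY c.course_id DESC
""").fetchall()

print(missing)  # [(33, 2016, 715), (33, 2016, 400)]
```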

SQLite 5 tables, Join, Sum, GroupBy

I am desperately working on an issue that I cannot resolve. I have a SQLite DB with five tables and corresponding columns:
Tab1 = {Job_ID, Company_ID, Source_ID}
Tab1_Category = {Job_ID, Category_ID}
Category = {ID, First_level, Second_level}
Tab2 = {Job_ID, Log_Date, Clicks, Applications}
Source = {ID, Name}
I created a sample db:
CREATE TABLE TAB1 (
`job_id` INTEGER,
`company_id` INTEGER,
`source_id` INTEGER
);
INSERT INTO TAB1
(`job_id`, `company_id`, `source_id`)
VALUES
('1', '222', '2'),
('2', '222', '1'),
('3', '222', '1'),
('4', '222', '1'),
('5', '255', '3');
CREATE TABLE TAB1_CATEGORY (
`job_id` INTEGER,
`category_id` INTEGER
);
INSERT INTO TAB1_CATEGORY
(`job_id`, `category_id`)
VALUES
('1', '31'),
('2', '36'),
('3', '33'),
('3', '35'),
('4', '32'),
('4', '31'),
('5', '34');
CREATE TABLE CATEGORY (
`id` INTEGER,
`first_level` VARCHAR(3),
`second_level` VARCHAR(3)
);
INSERT INTO CATEGORY
(`id`, `first_level`, `second_level`)
VALUES
('30', 'sss', 'aaa'),
('31', 'sss', 'aaa'),
('32', 'sss', 'bbb'),
('33', 'ggg', 'ccc'),
('34', 'ggg', 'ddd'),
('35', 'ggg', 'eee'),
('36', 'hhh', 'fff');
CREATE TABLE SOURCE (
`id` INTEGER,
`name` VARCHAR(3)
);
INSERT INTO SOURCE
(`id`, `name`)
VALUES
('1', 'mmm'),
('2', 'nnn'),
('3', 'ooo');
CREATE TABLE TAB2 (
`job_id` INTEGER,
`log_date` VARCHAR(10),
`clicks` INTEGER,
`applications` INTEGER
);
INSERT INTO TAB2
(`job_id`, `log_date`, `clicks`, `applications`)
VALUES
('1', '01-01-1999', '6', '2'),
('1', '02-01-1999', '7', '3'),
('1', '03-01-1999', '9', '1'),
('2', '02-01-1999', '4', '1'),
('2', '05-01-1999', '8', '2'),
('3', '03-01-1999', '9', '0'),
('4', '05-01-1999', '5', '3'),
('4', '06-01-1999', '4', '1'),
('5', '01-01-1999', '1', '0'),
('5', '03-01-1999', '3', '1');
I need the following results with one query:
list of all Job_ID (Tab1) and Company_ID (Tab1) where First_level (from table Category) is "ggg" or "sss" and Name (from table Source) is "mmm"
sum of clicks and sum of applications (Tab2) per Job_ID
count of distinct Second_level (from table Category)
sum of applications for each Company_ID (a Company_ID can have many Job_IDs)
This is what I did so far, but it is not working the way I want:
SELECT t1.job_id, t1.company_id,
SUM(t2.clicks), SUM(t2.applications), COUNT(DISTINCT c.second_level)
FROM TAB1 t1
JOIN SOURCE s ON s.id = t1.source_id
JOIN TAB1_CATEGORY tc ON t1.job_id = tc.job_id
JOIN CATEGORY c ON tc.category_id = c.id
JOIN TAB2 t2 ON t1.job_id = t2.job_id
WHERE c.first_level IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.job_id
What I get are sums of clicks/applications that are too high, not the correct sums per job_id:
job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level)
3 | 222 | 18 | 0 | 2
4 | 222 | 18 | 8 | 2
And this is what I want to get:
job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level) | Total Appl per company
3 | 222 | 9 | 0 | 2 | 4
4 | 222 | 9 | 4 | 2 | 4
First you must aggregate inside TAB2 and then join (with INNER joins).
Also you need SUM() window function for the column Total Appl per company:
SELECT t1.JOB_ID, t1.COMPANY_ID,
t2.total_clicks, t2.total_apps,
COUNT(DISTINCT c.SECOND_LEVEL) count_second_level,
SUM(t2.total_apps) OVER (PARTITION BY t1.COMPANY_ID) [Total Appl per company]
FROM TAB1 t1
INNER JOIN SOURCE s ON s.ID = t1.SOURCE_ID
INNER JOIN TAB1_CATEGORY tc ON t1.JOB_ID = tc.JOB_ID
INNER JOIN CATEGORY c ON tc.CATEGORY_ID = c.ID
INNER JOIN (
SELECT JOB_ID, SUM(CLICKS) total_clicks, SUM(APPLICATIONS) total_apps
FROM TAB2
GROUP BY JOB_ID
) t2 ON t1.JOB_ID = t2.JOB_ID
WHERE c.FIRST_LEVEL IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.JOB_ID, t1.COMPANY_ID, t2.total_clicks, t2.total_apps
See the demo.
Results:
> job_id | company_id | total_clicks | total_apps | count_second_level | Total Appl per company
> -----: | ---------: | -----------: | ---------: | -----------------: | ---------------------:
> 3 | 222 | 9 | 0 | 2 | 4
> 4 | 222 | 9 | 4 | 2 | 4
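The inflated sums in the question come from join fan-out: each TAB2 row is repeated once per matching category row, so SUM counts it several times. The sketch below is an illustration using Python's standard `sqlite3` module and the sample data from the question; it runs the naive join next to the pre-aggregated version. It assumes a SQLite build with window function support (3.25 or later), and the alias `total_appl_per_company` is a stand-in for the bracketed alias in the answer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE TAB1 (job_id INTEGER, company_id INTEGER, source_id INTEGER);
INSERT INTO TAB1 VALUES (1,222,2),(2,222,1),(3,222,1),(4,222,1),(5,255,3);
CREATE TABLE TAB1_CATEGORY (job_id INTEGER, category_id INTEGER);
INSERT INTO TAB1_CATEGORY VALUES (1,31),(2,36),(3,33),(3,35),(4,32),(4,31),(5,34);
CREATE TABLE CATEGORY (id INTEGER, first_level TEXT, second_level TEXT);
INSERT INTO CATEGORY VALUES (30,'sss','aaa'),(31,'sss','aaa'),(32,'sss','bbb'),
    (33,'ggg','ccc'),(34,'ggg','ddd'),(35,'ggg','eee'),(36,'hhh','fff');
CREATE TABLE SOURCE (id INTEGER, name TEXT);
INSERT INTO SOURCE VALUES (1,'mmm'),(2,'nnn'),(3,'ooo');
CREATE TABLE TAB2 (job_id INTEGER, log_date TEXT, clicks INTEGER, applications INTEGER);
INSERT INTO TAB2 VALUES
    (1,'01-01-1999',6,2),(1,'02-01-1999',7,3),(1,'03-01-1999',9,1),
    (2,'02-01-1999',4,1),(2,'05-01-1999',8,2),(3,'03-01-1999',9,0),
    (4,'05-01-1999',5,3),(4,'06-01-1999',4,1),
    (5,'01-01-1999',1,0),(5,'03-01-1999',3,1);
""")

# Naive version: TAB2 rows are multiplied by the number of category rows per job.
naive = con.execute("""
SELECT t1.job_id, SUM(t2.clicks), SUM(t2.applications)
FROM TAB1 t1
JOIN SOURCE s ON s.id = t1.source_id
JOIN TAB1_CATEGORY tc ON tc.job_id = t1.job_id
JOIN CATEGORY c ON c.id = tc.category_id
JOIN TAB2 t2 ON t2.job_id = t1.job_id
WHERE c.first_level IN ('ggg', 'sss') AND s.name = 'mmm'
GROUP BY t1.job_id
ORDER BY t1.job_id
""").fetchall()

# Pre-aggregated version: TAB2 is collapsed to one row per job before joining.
fixed = con.execute("""
SELECT t1.job_id, t1.company_id, t2.total_clicks, t2.total_apps,
       COUNT(DISTINCT c.second_level) AS count_second_level,
       SUM(t2.total_apps) OVER (PARTITION BY t1.company_id) AS total_appl_per_company
FROM TAB1 t1
JOIN SOURCE s ON s.id = t1.source_id
JOIN TAB1_CATEGORY tc ON tc.job_id = t1.job_id
JOIN CATEGORY c ON c.id = tc.category_id
JOIN (SELECT job_id, SUM(clicks) AS total_clicks, SUM(applications) AS total_apps
      FROM TAB2 GROUP BY job_id) t2 ON t2.job_id = t1.job_id
WHERE c.first_level IN ('ggg', 'sss') AND s.name = 'mmm'
GROUP BY t1.job_id, t1.company_id, t2.total_clicks, t2.total_apps
ORDER BY t1.job_id
""").fetchall()

print(naive)  # [(3, 18, 0), (4, 18, 8)]
print(fixed)  # [(3, 222, 9, 0, 2, 4), (4, 222, 9, 4, 2, 4)]
```

Note that the per-company total only covers jobs that pass the WHERE filter, which matches the desired result in the question.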

Kusto summarize unique occurrences of the value in the column

I have the following dataset:
let t1 = datatable(id:string, col1:string, col2:string)
[
'1', 'ValueA', 'AT',
'2', 'ValueC', 'AT',
'3', 'ValueA', 'AT',
'4', 'ValueB', 'AT',
'1', 'ValueC', 'v-username',
];
t1
| summarize (Id) by col1
My goal is to count occurrences of values in col1 per Id. Because ID=1 occurs twice, I need to decide whether to take ValueA or ValueC. This is decided by value of col2. If col2 startswith "v-" then take Value from this row.
When I use "summarize (Id) by col1" I am getting:
ValueA,2
ValueC,2
ValueB,1
ValueD,1
Total:6
Expected result is:
ValueA,1
ValueC,2
ValueB,1
ValueD,1
Total:5
Is it possible to achieve with Kusto?
a. When you run `... | summarize (id) by col1` you should get a semantic error, as there's no aggregation function specified (e.g. you could have run `... | summarize dcount(id) by col1`).
b. It's not clear where `ValueD, 1` in your expected result comes from, as your datatable expression includes no record with ValueD.
c. If I had to guess the solution to your question, despite a and b, this would be my guess:
let t1 = datatable(id:string, col1:string, col2:string)
[
'1', 'ValueA', 'AT',
'2', 'ValueC', 'AT',
'3', 'ValueD', 'AT',
'4', 'ValueB', 'AT',
'1', 'ValueC', 'v-username',
];
t1
| summarize c = dcount(id) by col1
| as T
| union (T | summarize c = sum(c) by col1 = "Total")
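For readers who want to verify the numbers without a Kusto cluster, here is a rough Python equivalent of the guessed query above: a distinct count of id per col1, followed by a "Total" row. It is only a sketch of that guess, run on the answer's modified sample data (with ValueD for id 3), and it does not implement the "prefer the row whose col2 starts with v-" rule from the question.

```python
# Sample rows from the answer's datatable: (id, col1, col2)
rows = [
    ("1", "ValueA", "AT"),
    ("2", "ValueC", "AT"),
    ("3", "ValueD", "AT"),
    ("4", "ValueB", "AT"),
    ("1", "ValueC", "v-username"),
]

# summarize c = dcount(id) by col1
ids_by_col1 = {}
for id_, col1, _col2 in rows:
    ids_by_col1.setdefault(col1, set()).add(id_)
counts = {col1: len(ids) for col1, ids in ids_by_col1.items()}

# union (T | summarize c = sum(c) by col1 = "Total")
counts["Total"] = sum(counts.values())

for col1, c in counts.items():
    print(col1, c)
```

One caveat: in Kusto, `dcount()` is an estimate for large cardinalities, whereas the set-based count here is exact.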

Joining Table Relation into one output SQLite

I am using SQLite and have two tables, Profile and Options. Profiles can differ from options depending on the code (see my example below). Basically, users can create custom profiles, and hence the code is different.
How can I pull profile options into options list so I have a single output? The only relationship I have is based on code, but some profiles have codes not listed in options. I want codes not listed in Options to also be included and those that exist not be duplicated.
Code below see my comment on expected output.
Also created a fiddle here. http://sqlfiddle.com/#!5/3e657c/1/0
CREATE TABLE profile (id INTEGER PRIMARY KEY, profileId INTEGER, value integer, type text, "name" text, "min" integer, "max" integer, "justment" text, "sortOrder" INTEGER, "code" text);
INSERT INTO "profile" ("id", "profileId", "value", "type", "name", "code") VALUES
('1','1', '0', 'c', 'John', 'test_001'),
('2', '1','0', 'c', 'Peter', 'test_002'),
('3','1', '0', 'c', 'Custom Record', 'cust_003');
CREATE TABLE options (id INTEGER PRIMARY KEY , value integer, type text, "name" text, "min" integer, "max" integer, "justment" text, "sortOrder" INTEGER DEFAULT 0, "code" text);
INSERT INTO "options" ("id", "value", "type", "name", "code") VALUES
('1', '0', 'c', 'John', 'test_001'),
('2', '0', 'c', 'Peter', 'test_002'),
('3', '0', 'c', 'Paul', 'test_003'),
('4', '0', 'c', 'Tim', 'test_004');
Expected Output single list no duplicates
|Name|
John
Peter
Paul
Tim
Custom Record
Not sure if this is even possible, but appreciate any insight. Probably going to have to do this with a loop in PHP, but if there is any SQL way it would be appreciated.
You can do it with UNION ALL and NOT EXISTS in the 2nd query:
select id, value, type, name, code from options
union all
select id, value, type, name, code from profile p
where not exists (
select 1 from options o
where p.code = o.code
)
You can change the select list to return the columns that you need.
See the demo.
Results:
| id | value | type | name | code |
| --- | ----- | ---- | ------------- | -------- |
| 1 | 0 | c | John | test_001 |
| 2 | 0 | c | Peter | test_002 |
| 3 | 0 | c | Paul | test_003 |
| 4 | 0 | c | Tim | test_004 |
| 3 | 0 | c | Custom Record | cust_003 |
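The UNION ALL plus NOT EXISTS pattern can be checked with Python's standard `sqlite3` module. The sketch below is an illustration on reduced versions of the two tables (only the id, name and code columns are kept, which is an assumption made for brevity) and selects just the name column, as in the expected output of the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT, code TEXT);
INSERT INTO options VALUES
    (1, 'John', 'test_001'), (2, 'Peter', 'test_002'),
    (3, 'Paul', 'test_003'), (4, 'Tim', 'test_004');
CREATE TABLE profile (id INTEGER PRIMARY KEY, profileId INTEGER, name TEXT, code TEXT);
INSERT INTO profile VALUES
    (1, 1, 'John', 'test_001'), (2, 1, 'Peter', 'test_002'),
    (3, 1, 'Custom Record', 'cust_003');
""")

names = [row[0] for row in con.execute("""
SELECT name FROM options
UNION ALL
-- only profile rows whose code has no counterpart in options
SELECT name FROM profile p
WHERE NOT EXISTS (SELECT 1 FROM options o WHERE o.code = p.code)
""")]

print(names)
```

UNION ALL is enough here because the NOT EXISTS filter already prevents duplicates by code; a plain UNION would additionally have to compare every output row.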

Some records are missing from my SQL count query

I'm trying to get a count with this query. I would like it to show every nameHost, even when the count is 0, but with this query the nameHost values whose count is 0 are not shown.
Could you help me, please?
select a.nameHost, count(b.job_name) as cuenta
from jobsDefinition b
right join cmdmzpre.nodes a
on b.node_id in (a.nodeid,a.nameHost)
where b.app not like 'UNPLAN' group by a.nameHost order by a.nameHost desc;
Example of tables:
nodes
=======
nameHost nodeid
--------- -------
a a
b b
b f
e g
jobsDefinition
================
node_id job_name
---------- -----------
a fruit
b apple
c iron
a banana
f orange
The output would be:
a 2 (fruit,banana)
b 2 (apple,orange)
e 0
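As mentioned in my comment, from your dataset, **a** should be **2**.
Use this query, which starts from nodes and LEFT JOINs the job definitions, so hosts without matching jobs are kept with a count of 0. Note that a WHERE condition on a column of the outer-joined table (such as b.app in your query) discards the NULL rows the outer join produces; a filter like that belongs in the ON clause instead: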
---Table Data start ----
WITH nodes (nameHost, nodeid)
AS (SELECT 'a', 'a' FROM DUAL
UNION ALL
SELECT 'b', 'b' FROM DUAL
UNION ALL
SELECT 'b', 'f' FROM DUAL
UNION ALL
SELECT 'e', 'g' FROM DUAL),
jobdef (node_id, job_name)
AS (SELECT 'a', 'fruit' FROM DUAL
UNION ALL
SELECT 'b', 'apple' FROM DUAL
UNION ALL
SELECT 'c', 'iron' FROM DUAL
UNION ALL
SELECT 'a', 'banana' FROM DUAL
UNION ALL
SELECT 'f', 'orange' FROM DUAL)
--Table data end---
--Actual Query
SELECT n.namehost,
COUNT (jd.node_id) AS Cnt,
LISTAGG (jd.job_name, ',') WITHIN GROUP (ORDER BY 1) JB_NM
FROM nodes n
LEFT JOIN jobdef jd
ON n.nodeid = jd.node_id
GROUP BY n.namehost
ORDER BY namehost;
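The answer's query targets Oracle (DUAL, LISTAGG). As a rough way to try the same LEFT JOIN and COUNT idea without an Oracle instance, the sketch below translates it to SQLite via Python's standard `sqlite3` module. Dropping LISTAGG and keeping only the count is a simplification of this sketch, not part of the original answer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE nodes (nameHost TEXT, nodeid TEXT);
INSERT INTO nodes VALUES ('a', 'a'), ('b', 'b'), ('b', 'f'), ('e', 'g');
CREATE TABLE jobdef (node_id TEXT, job_name TEXT);
INSERT INTO jobdef VALUES
    ('a', 'fruit'), ('b', 'apple'), ('c', 'iron'), ('a', 'banana'), ('f', 'orange');
""")

counts = con.execute("""
SELECT n.nameHost,
       COUNT(jd.node_id) AS cnt   -- counts only matched rows, so unmatched hosts get 0
FROM nodes n
LEFT JOIN jobdef jd ON n.nodeid = jd.node_id
GROUP BY n.nameHost
ORDER BY n.nameHost
""").fetchall()

print(counts)  # [('a', 2), ('b', 2), ('e', 0)]
```

Counting `jd.node_id` rather than `*` matters: `COUNT(*)` would report 1 for host e, because the outer join still yields one NULL-extended row for it.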

Resources