Join values of two tables that represent the missing values between those tables in SQLite

I have the following table
CREATE TABLE holes (`tournament_id` INTEGER, `year` INTEGER, `course_id` INTEGER, `round` INTEGER, `hole` INTEGER, `front` INTEGER, `side` INTEGER, `region` INTEGER);
With the following data sample
INSERT INTO holes (`tournament_id`, `year`, `course_id`, `round`, `hole`, `front`, `side`, `region`) VALUES
('33', '2016', '895', '1', '1', '12', '5', 'L'),
('33', '2016', '895', '1', '2', '18', '10', 'R'),
('33', '2016', '895', '1', '3', '15', '7', 'R'),
('33', '2016', '895', '1', '4', '11', '7', 'R'),
('33', '2016', '895', '1', '5', '18', '7', 'L'),
('33', '2016', '895', '1', '6', '28', '5', 'L'),
('33', '2016', '895', '1', '7', '21', '12', 'R');
In addition, I have another table tournaments
CREATE TABLE tournaments (`tournament_id` INTEGER, `year` INTEGER, `R1` INTEGER, `R2` INTEGER, `R3` INTEGER, `R4` INTEGER);
With data
INSERT INTO tournaments VALUES
(33, 2016, 715, 715, 895, 400);
The values of R1, R2, R3 and R4 represent the ids of the courses.
I want the combinations of tournament_id, year and course_id that are possible according to table tournaments but are missing from table holes.
With the help of this answer I tried the following:
WITH h AS (
SELECT DISTINCT tournament_id, year, course_id
FROM holes)
SELECT t.tournament_id, t.year
FROM tournaments t
WHERE NOT EXISTS (
SELECT *
FROM h
WHERE h.tournament_id = t.tournament_id
AND h.year = t.year
AND h.course_id IN (t.R1, t.R2, t.R3, t.R4)
);
demo
The above goes a long way but I also want the h.course_id that is/are missing. Desired result:
33 2016 715
33 2016 400
These combinations of tournament_id, year and course_id are not present in holes. However, they do exist because they are present in tournaments.

For this requirement you need a resultset consisting of all the values of the Rx columns which you can get with UNION in a CTE.
Then you can use NOT EXISTS to get all the combinations of tournament_id, year and course_id that do not exist in holes:
WITH cte AS (
SELECT tournament_id, year, R1 AS course_id FROM tournaments
UNION
SELECT tournament_id, year, R2 FROM tournaments
UNION
SELECT tournament_id, year, R3 FROM tournaments
UNION
SELECT tournament_id, year, R4 FROM tournaments
)
SELECT c.*
FROM cte c
WHERE NOT EXISTS (
SELECT *
FROM holes h
WHERE (h.tournament_id, h.year, h.course_id) = (c.tournament_id, c.year, c.course_id)
);
See the demo.
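If you want to check the idea locally, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the column names from the question (tournament_id, year, course_id), uses a trimmed-down holes table, and compares the columns one by one instead of using a row value comparison. The function name missing_courses is made up for this example.

```python
import sqlite3

def missing_courses(conn):
    """Return (tournament_id, year, course_id) triples that appear in
    tournaments (via R1..R4) but have no matching row in holes."""
    sql = """
    WITH cte AS (
        SELECT tournament_id, year, R1 AS course_id FROM tournaments
        UNION
        SELECT tournament_id, year, R2 FROM tournaments
        UNION
        SELECT tournament_id, year, R3 FROM tournaments
        UNION
        SELECT tournament_id, year, R4 FROM tournaments
    )
    SELECT c.tournament_id, c.year, c.course_id
    FROM cte c
    WHERE NOT EXISTS (
        SELECT 1
        FROM holes h
        WHERE h.tournament_id = c.tournament_id
          AND h.year = c.year
          AND h.course_id = c.course_id
    )
    ORDER BY c.course_id
    """
    return conn.execute(sql).fetchall()

# Trimmed sample data: only the columns needed for the comparison.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE holes (tournament_id INTEGER, year INTEGER, course_id INTEGER, hole INTEGER);
INSERT INTO holes VALUES (33, 2016, 895, 1), (33, 2016, 895, 2);
CREATE TABLE tournaments (tournament_id INTEGER, year INTEGER,
                          R1 INTEGER, R2 INTEGER, R3 INTEGER, R4 INTEGER);
INSERT INTO tournaments VALUES (33, 2016, 715, 715, 895, 400);
""")

rows = missing_courses(conn)
print(rows)  # [(33, 2016, 400), (33, 2016, 715)]
```

UNION (rather than UNION ALL) removes the duplicate 715 that comes from R1 and R2, so each missing course is reported once.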

Related

How to replace None values in Python SQLite?

I am working on the 8 Week SQL Challenge with the following data. I am using Python and SQLite.
c.execute('DROP TABLE IF EXISTS runner_orders')
c.execute('CREATE TABLE runner_orders (order_id INTEGER, runner_id INTEGER, pickup_time VARCHAR(19), distance VARCHAR(7), duration VARCHAR(10), cancellation VARCHAR(23))')
more_order2 = [ ('1', '1', '2020-01-01 18:15:34', '20km', '32 minutes', ''),
('2', '1', '2020-01-01 19:10:54', '20km', '27 minutes', ''),
('3', '1', '2020-01-03 00:12:37', '13.4km', '20 mins', 'NULL'),
('4', '2', '2020-01-04 13:53:03', '23.4', '40', 'NULL'),
('5', '3', '2020-01-08 21:10:57', '10', '15', 'NULL'),
('6', '3', 'null', 'null', 'null', 'Restaurant Cancellation'),
('7', '2', '2020-01-08 21:30:45', '25km', '25mins', 'null'),
('8', '2', '2020-01-10 00:15:02', '23.4 km', '15 minute', 'null'),
('9', '2', 'null', 'null', 'null', 'Customer Cancellation'),
('10', '1', '2020-01-11 18:50:20', '10km', '10minutes', 'null')]
c.executemany('INSERT INTO runner_orders VALUES (?,?,?,?,?,?)', more_order2)
conn.commit()
I am trying to replace the '' values with null with this code:
query = '''
UPDATE runner_orders
SET cancellation = null
WHERE cancellation IN ('NULL', '')
'''
c.execute(query)
query2 = '''
SELECT * FROM runner_orders
'''
for match in c.execute(query2):
print(match)
For some reason, when I output the values, I got None in all the NULL spaces and it does not replace them with null. How do I fix this?
null is not a string. SQL uses three-valued logic: true, false, and null. SQL null roughly means "unknown" or "no value".
If you look at your table in the SQLite client you should see that the value is null. Python translates SQL null into its own equivalent, None, so the None values in your output mean the update worked.
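A small sketch to confirm this, using an in-memory database and only two of the columns from the question. One assumption on my side: I compare with lower(cancellation), because SQLite's default string comparison is case-sensitive, so IN ('NULL', '') would leave the lowercase 'null' strings untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE runner_orders (order_id INTEGER, cancellation VARCHAR(23))")
c.executemany(
    "INSERT INTO runner_orders VALUES (?, ?)",
    [(1, ""), (3, "NULL"), (6, "Restaurant Cancellation"), (7, "null")],
)

# Turn empty strings and the text 'NULL'/'null' into a real SQL NULL.
c.execute(
    "UPDATE runner_orders SET cancellation = NULL "
    "WHERE lower(cancellation) IN ('null', '')"
)
conn.commit()

# sqlite3 hands SQL NULL back to Python as None.
rows = c.execute(
    "SELECT order_id, cancellation FROM runner_orders ORDER BY order_id"
).fetchall()

# IS NULL only matches real NULLs, never the string 'null'.
null_count = c.execute(
    "SELECT COUNT(*) FROM runner_orders WHERE cancellation IS NULL"
).fetchone()[0]

print(rows)        # [(1, None), (3, None), (6, 'Restaurant Cancellation'), (7, None)]
print(null_count)  # 3
```

The IS NULL count is the reliable check: it only counts rows that hold a real NULL, regardless of how the client prints them.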

Kusto | KQL: Expand dynamic column to all combinations of two ( Couples | Tuples )

I have a scenario where I am trying to create a view that shows me all the unique couples of values per key. For example:
datatable(Key:string, Value:string)[
'1', 'A',
'2', 'B',
'2', 'C',
'3', 'A',
'3', 'B',
'3', 'C',
'3', 'C']
| sort by Key, Value asc
| summarize Tuples=make_set(Value) by Key
Result:
Key Tuples
1 ["A"]
2 ["B","C"]
3 ["A","B","C"]
Desired Result:
Key Tuples
1 ["A"]
2 ["B","C"]
3 ["A","B"]
3 ["A","C"]
3 ["B","C"]
How can I achieve this in KQL?
Here's a way, neither elegant nor efficient, that uses an inner self-join to get all combinations per Key:
datatable(Key:string, Value:string)
[
'1', 'A',
'2', 'B',
'2', 'C',
'3', 'A',
'3', 'B',
'3', 'C',
'3', 'C'
]
| distinct Key, Value
| as hint.materialized=true T1
| join kind=inner T1 on Key
| where Value != Value1
| project Key, Tuple = tostring(array_sort_asc(pack_array(Value, Value1)))
| distinct Key, Tuple
| as hint.materialized=true T2
| union (
T1
| where Key !in ((T2 | project Key)) | project Key, Tuple = tostring(pack_array(Value))
)
| order by Key asc, Tuple asc
Key Tuple
1 ["A"]
2 ["B","C"]
3 ["A","B"]
3 ["A","C"]
3 ["B","C"]
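This is not KQL, but if you want to cross-check the expected output outside Kusto, the same pairing logic can be sketched in Python with itertools.combinations. The function name tuples_per_key is made up for this example, and it assumes that keys with a single distinct value should be kept as one-element lists, as in the desired result.

```python
from itertools import combinations

# Same sample rows as the datatable in the question.
rows = [("1", "A"), ("2", "B"), ("2", "C"),
        ("3", "A"), ("3", "B"), ("3", "C"), ("3", "C")]

def tuples_per_key(rows):
    """Return (key, values) rows: every unordered pair of distinct values
    per key, or the single value when a key has only one."""
    values = {}
    for key, value in rows:
        values.setdefault(key, set()).add(value)  # de-duplicate per key

    result = []
    for key in sorted(values):
        vals = sorted(values[key])
        if len(vals) < 2:
            result.append((key, vals))
        else:
            result.extend((key, list(pair)) for pair in combinations(vals, 2))
    return result

result = tuples_per_key(rows)
for key, tup in result:
    print(key, tup)
```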

SQLite 5 tables, Join, Sum, GroupBy

I am desperately working on an issue that I cannot resolve. I have a SQLite DB with five tables and corresponding columns:
Tab1 = {Job_ID, Company_ID, Source_ID}
Tab1_Category = {Job_ID, Category_ID}
Category = {ID, First_level, Second_level}
Tab2 = {Job_ID, Log_Date, Clicks, Applications}
Source = {ID, Name}
I created a sample db:
CREATE TABLE TAB1 (
`job_id` INTEGER,
`company_id` INTEGER,
`source_id` INTEGER
);
INSERT INTO TAB1
(`job_id`, `company_id`, `source_id`)
VALUES
('1', '222', '2'),
('2', '222', '1'),
('3', '222', '1'),
('4', '222', '1'),
('5', '255', '3');
CREATE TABLE TAB1_CATEGORY (
`job_id` INTEGER,
`category_id` INTEGER
);
INSERT INTO TAB1_CATEGORY
(`job_id`, `category_id`)
VALUES
('1', '31'),
('2', '36'),
('3', '33'),
('3', '35'),
('4', '32'),
('4', '31'),
('5', '34');
CREATE TABLE CATEGORY (
`id` INTEGER,
`first_level` VARCHAR(3),
`second_level` VARCHAR(3)
);
INSERT INTO CATEGORY
(`id`, `first_level`, `second_level`)
VALUES
('30', 'sss', 'aaa'),
('31', 'sss', 'aaa'),
('32', 'sss', 'bbb'),
('33', 'ggg', 'ccc'),
('34', 'ggg', 'ddd'),
('35', 'ggg', 'eee'),
('36', 'hhh', 'fff');
CREATE TABLE SOURCE (
`id` INTEGER,
`name` VARCHAR(3)
);
INSERT INTO SOURCE
(`id`, `name`)
VALUES
('1', 'mmm'),
('2', 'nnn'),
('3', 'ooo');
CREATE TABLE TAB2 (
`job_id` INTEGER,
`log_date` VARCHAR(10),
`clicks` INTEGER,
`applications` INTEGER
);
INSERT INTO TAB2
(`job_id`, `log_date`, `clicks`, `applications`)
VALUES
('1', '01-01-1999', '6', '2'),
('1', '02-01-1999', '7', '3'),
('1', '03-01-1999', '9', '1'),
('2', '02-01-1999', '4', '1'),
('2', '05-01-1999', '8', '2'),
('3', '03-01-1999', '9', '0'),
('4', '05-01-1999', '5', '3'),
('4', '06-01-1999', '4', '1'),
('5', '01-01-1999', '1', '0'),
('5', '03-01-1999', '3', '1');
I need the following results with one query:
list of all Job_ID (Tab1) and Company_ID (Tab1) where First_level (from table Category) is "ggg" or "sss" and Name (from table Source) is "mmm"
sum of clicks and sum of applications (Tab2) per Job_ID
count of distinct Second_level (from table Category)
sum of applications for each Company_ID (a Company_ID can have many Job_IDs)
This is what I did so far, but it is not working the way I want:
SELECT t1.job_id, t1.company_id,
SUM(t2.clicks), SUM(t2.applications), COUNT(DISTINCT c.second_level)
FROM TAB1 t1
JOIN SOURCE s ON s.id = t1.source_id
JOIN TAB1_CATEGORY tc ON t1.job_id = tc.job_id
JOIN CATEGORY c ON tc.category_id = c.id
JOIN TAB2 t2 ON t1.job_id = t2.job_id
WHERE c.first_level IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.job_id
What I get are inflated sums of clicks/applications instead of the correct sums per job_id:
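job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level)
-----: | ---------: | -------------: | -------------------: | -----------------------------:
3 | 222 | 18 | 0 | 2
4 | 222 | 18 | 8 | 2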
And this is what I want to get:
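job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level) | Total Appl per company
-----: | ---------: | -------------: | -------------------: | -----------------------------: | ---------------------:
3 | 222 | 9 | 0 | 2 | 4
4 | 222 | 9 | 4 | 2 | 4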
First you must aggregate inside TAB2 and then join (with INNER joins).
Also you need SUM() window function for the column Total Appl per company:
SELECT t1.JOB_ID, t1.COMPANY_ID,
t2.total_clicks, t2.total_apps,
COUNT(DISTINCT c.SECOND_LEVEL) count_second_level,
SUM(t2.total_apps) OVER (PARTITION BY t1.COMPANY_ID) [Total Appl per company]
FROM TAB1 t1
INNER JOIN SOURCE s ON s.ID = t1.SOURCE_ID
INNER JOIN TAB1_CATEGORY tc ON t1.JOB_ID = tc.JOB_ID
INNER JOIN CATEGORY c ON tc.CATEGORY_ID = c.ID
INNER JOIN (
SELECT JOB_ID, SUM(CLICKS) total_clicks, SUM(APPLICATIONS) total_apps
FROM TAB2
GROUP BY JOB_ID
) t2 ON t1.JOB_ID = t2.JOB_ID
WHERE c.FIRST_LEVEL IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.JOB_ID, t1.COMPANY_ID, t2.total_clicks, t2.total_apps
See the demo.
Results:
job_id | company_id | total_clicks | total_apps | count_second_level | Total Appl per company
-----: | ---------: | -----------: | ---------: | -----------------: | ---------------------:
3 | 222 | 9 | 0 | 2 | 4
4 | 222 | 9 | 4 | 2 | 4
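To see why aggregating TAB2 first matters, here is a minimal sketch with Python's sqlite3 module, reduced to job 4 from the sample data (two categories, two log rows). The variable names joined_first and aggregated_first are only for illustration. Joining before summing repeats every TAB2 row once per category, which doubles the sums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TAB1_CATEGORY (job_id INTEGER, category_id INTEGER);
INSERT INTO TAB1_CATEGORY VALUES (4, 32), (4, 31);
CREATE TABLE TAB2 (job_id INTEGER, clicks INTEGER, applications INTEGER);
INSERT INTO TAB2 VALUES (4, 5, 3), (4, 4, 1);
""")

# Join first, then SUM: 2 categories x 2 log rows = 4 joined rows,
# so every click and application is counted twice.
joined_first = conn.execute("""
    SELECT tc.job_id, SUM(t2.clicks), SUM(t2.applications)
    FROM TAB1_CATEGORY tc
    JOIN TAB2 t2 ON t2.job_id = tc.job_id
    GROUP BY tc.job_id
""").fetchall()

# Aggregate TAB2 per job first, then join: one pre-summed row per job.
aggregated_first = conn.execute("""
    SELECT tc.job_id, t2.total_clicks, t2.total_apps
    FROM TAB1_CATEGORY tc
    JOIN (
        SELECT job_id, SUM(clicks) AS total_clicks, SUM(applications) AS total_apps
        FROM TAB2
        GROUP BY job_id
    ) t2 ON t2.job_id = tc.job_id
    GROUP BY tc.job_id, t2.total_clicks, t2.total_apps
""").fetchall()

print(joined_first)      # [(4, 18, 8)]
print(aggregated_first)  # [(4, 9, 4)]
```

The doubled values (18 and 8) are the ones reported in the question, and the pre-aggregated values (9 and 4) are the desired ones.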

Kusto summarize unique occurrences of the value in the column

I have following dataset:
let t1 = datatable(id:string, col1:string, col2:string)
[
'1', 'ValueA', 'AT',
'2', 'ValueC', 'AT',
'3', 'ValueA', 'AT',
'4', 'ValueB', 'AT',
'1', 'ValueC', 'v-username',
];
t1
| summarize (Id) by col1
My goal is to count occurrences of values in col1 per Id. Because ID=1 occurs twice, I need to decide whether to take ValueA or ValueC. This is decided by value of col2. If col2 startswith "v-" then take Value from this row.
When I use "summarize (Id) by col1" I am getting:
ValueA,2
ValueC,2
ValueB,1
ValueD,1
Total:6
Expected result is:
ValueA,1
ValueC,2
ValueB,1
ValueD,1
Total:5
Is it possible to achieve with Kusto?
a. When you run "... | summarize (id) by col1" you should get a semantic error, as there's no aggregation function specified (e.g. you could have run "... | summarize dcount(id) by col1").
b. It's not clear where "ValueD, 1" in your expected result comes from, as your datatable expression includes no record with ValueD.
c. If I had to guess the solution to your question, despite a and b, this would be my guess:
let t1 = datatable(id:string, col1:string, col2:string)
[
'1', 'ValueA', 'AT',
'2', 'ValueC', 'AT',
'3', 'ValueD', 'AT',
'4', 'ValueB', 'AT',
'1', 'ValueC', 'v-username',
];
t1
| summarize c = dcount(id) by col1
| as T
| union (T | summarize c = sum(c) by col1 = "Total")

perform a where in query in bookshelf.js

I want to perform a WHERE - IN query/operation, but a normal where gives an error.
I want this
select * from `calendar_event_rsvp` where `event_id` in ('1', '2', '3')
But below code leads to
select * from `calendar_event_rsvp` where `event_id in` = '1', '2', '3'
Code
CalendarEventRSVP.forge()
.where({
"event_id": event_ids
})
How do I do this in bookshelf.js?
Try to add the operator:
CalendarEventRSVP.forge()
.where('event_id', 'in', event_ids)
Or use knex's whereIn:
CalendarEventRSVP.forge()
.query({whereIn: {event_id: event_ids}})
Try the query() function on your model:
CalendarEventRSVP.query(function(qb){
qb.where('event_id' , 'in' , [1,2,3,4]) ;
})
.fetchAll()
.then();
