How can I summarize a count() of multiple tables and include a 0 instead of null when no rows are returned - azure-data-explorer

I want to count the number of rows that have been generated today across multiple tables, then summarize this information in a table with the source table name and the count. I have been able to do this using union withsource.
However, if no rows are returned from a table then it is missed out of the summary. Instead I want to return a count of 0 if there are no rows and still include the table in the summary. Is this possible?
I think I need make-series but I can't figure out how to format the query.
Example
Today is: 2022-11-21
TableOne
RowId | TimeGenerated
------|--------------
1 | 2022-11-21
2 | 2022-11-21
3 | 2022-11-19
TableTwo
RowId | TimeGenerated
------|--------------
1 | 2022-11-19
2 | 2022-11-18
Kusto
union withsource=source TableOne, TableTwo
| where TimeGenerated >= ago(1d)
| summarize dcount(TimeGenerated) by source
Actual Output
source | dcount_TimeGenerated
-------|---------------------
TableOne | 2
Desired Output
source | dcount_TimeGenerated
-------|---------------------
TableOne | 2
TableTwo | 0

set query_now = datetime(2022-11-22);
let TableOne = view()
{
datatable(RowId:int, TimeGenerated:datetime)
[
1 ,datetime(2022-11-21)
,2 ,datetime(2022-11-21)
,3 ,datetime(2022-11-19)
]
};
let TableTwo = view()
{
datatable(RowId:int, TimeGenerated:datetime)
[
1 ,datetime(2022-11-18)
,2 ,datetime(2022-11-17)
]
};
union withsource=source TableOne, TableTwo
| where TimeGenerated >= ago(3d)
| summarize dcount(TimeGenerated) by source
| join kind=rightouter datatable(source:string)["TableOne", "TableTwo"] on source
| project source = source1, dcount_TimeGenerated = coalesce(dcount_TimeGenerated, 0)
source | dcount_TimeGenerated
-------|---------------------
TableOne | 2
TableTwo | 0

Related

SQLite - Merge 2 tables according to modified date, insert a new row if necessary

I have a table having an ID column, this column is a primary key and unique as well. In addition, the table has a modified date column.
I have the same table in 2 databases and I am looking to merge both into one database. The merging scenario in a table is as follows:
Insert the record if the ID is not present;
If the ID exists, only update if the modified date is greater than that of the existing row.
For example, having:
Table 1:
id | name | createdAt | modifiedAt
---|------|------------|-----------
1 | john | 2019-01-01 | 2019-05-01
2 | jane | 2019-01-01 | 2019-04-03
Table 2:
id | name | createdAt | modifiedAt
---|------|------------|-----------
1 | john | 2019-01-01 | 2019-04-30
2 | JANE | 2019-01-01 | 2019-04-04
3 | doe | 2019-01-01 | 2019-05-01
The resulting table would be:
id | name | createdAt | modifiedAt
---|------|------------|-----------
1 | john | 2019-01-01 | 2019-05-01
2 | JANE | 2019-01-01 | 2019-04-04
3 | doe | 2019-01-01 | 2019-05-01
I've read about INSERT OR REPLACE, but I couldn't figure out how the date condition can be applied. I also know that I could loop through each pair of matching rows and compare the dates manually, but that would be slow. Is there an efficient way to accomplish this in SQLite?
I'm using sqlite3 on Node.js.
The UPSERT notation added in SQLite 3.24 makes this easy:
INSERT INTO table1(id, name, createdAt, modifiedAt)
SELECT id, name, createdAt, modifiedAt FROM table2 WHERE true
ON CONFLICT(id) DO UPDATE
SET (name, createdAt, modifiedAt) = (excluded.name, excluded.createdAt, excluded.modifiedAt)
WHERE excluded.modifiedAt > modifiedAt;
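To try the UPSERT statement above on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes both tables are reachable from one connection (for two database files, ATTACH would be needed first) and that the bundled SQLite library is 3.24 or later. The helper names build_merge_demo and merge_newer_rows are made up for this illustration.

```python
import sqlite3

def build_merge_demo():
    """Create table1 and table2 in memory with the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    for table in ("table1", "table2"):
        conn.execute(
            f"CREATE TABLE {table} ("
            "id INTEGER PRIMARY KEY, name TEXT, createdAt TEXT, modifiedAt TEXT)"
        )
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?, ?)",
        [(1, "john", "2019-01-01", "2019-05-01"),
         (2, "jane", "2019-01-01", "2019-04-03")],
    )
    conn.executemany(
        "INSERT INTO table2 VALUES (?, ?, ?, ?)",
        [(1, "john", "2019-01-01", "2019-04-30"),
         (2, "JANE", "2019-01-01", "2019-04-04"),
         (3, "doe", "2019-01-01", "2019-05-01")],
    )
    return conn

def merge_newer_rows(conn):
    """Insert missing ids from table2; update existing ids only if table2 is newer."""
    conn.execute(
        """
        INSERT INTO table1(id, name, createdAt, modifiedAt)
        SELECT id, name, createdAt, modifiedAt FROM table2 WHERE true
        ON CONFLICT(id) DO UPDATE
        SET (name, createdAt, modifiedAt) =
            (excluded.name, excluded.createdAt, excluded.modifiedAt)
        WHERE excluded.modifiedAt > modifiedAt
        """
    )
    conn.commit()

conn = build_merge_demo()
merge_newer_rows(conn)
merged = conn.execute(
    "SELECT id, name, createdAt, modifiedAt FROM table1 ORDER BY id"
).fetchall()
```

The date comparison works here because the modifiedAt values are ISO-8601 strings, which sort in chronological order when compared as text.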
First create the table Table3:
CREATE TABLE Table3 (
id INTEGER,
name TEXT,
createdat TEXT,
modifiedat TEXT,
PRIMARY KEY(id)
);
and then insert the rows like this:
insert into table3 (id, name, createdat, modifiedat)
select id, name, createdat, modifiedat from (
select * from table1 t1
where not exists (
select 1 from table2 t2
where t2.id = t1.id and t2.modifiedat >= t1.modifiedat
)
union all
select * from table2 t2
where not exists (
select 1 from table1 t1
where t1.id = t2.id and t1.modifiedat > t2.modifiedat
)
)
This uses a UNION ALL for the 2 tables and gets only the needed rows with NOT EXISTS, which is a very efficient way to check the condition you want.
I have >= instead of > in the WHERE clause for Table1 in case the 2 tables have a row with the same id and the same modifiedat values.
In this case the row from Table2 will be inserted.
If you want to merge the 2 tables into Table1 you can use REPLACE:
replace into table1 (id, name, createdat, modifiedat)
select id, name, createdat, modifiedat
from table2 t2
where
not exists (
select 1 from table1 t1
where (t1.id = t2.id and t1.modifiedat > t2.modifiedat)
)

How to rank rows in a table in sqlite?

How can I create a column that has ranked the information of the table based on two or three keys?
For example, in this table the rank variable is based on Department and Name:
Dep | Name | Rank
----+------+------
1 | Jeff | 1
1 | Jeff | 2
1 | Paul | 1
2 | Nick | 1
2 | Nick | 2
I have found this solution, but it is for SQL in general and I don't think it applies to my case: all my information is in one table, whereas the responses seem to use SELECT and JOIN to combine information from different tables.
Thank you in advance
You can count how many rows come before the current row in the current group:
UPDATE MyTable
SET Rank = (SELECT COUNT(*)
FROM MyTable AS T2
WHERE T2.Dep = MyTable.Dep
AND T2.Name = MyTable.Name
AND T2.rowid <= MyTable.rowid);
(The rowid column is used to differentiate between otherwise identical rows. Use the primary key, if you have one.)
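As a quick check of the UPDATE above, the following sketch runs it against the sample rows with Python's built-in sqlite3 module. The table layout (MyTable with Dep, Name, Rank and the implicit rowid) is assumed from the question, and the helper names are hypothetical.

```python
import sqlite3

def build_rank_demo():
    """Create MyTable in memory with the sample rows; Rank starts out NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Dep INTEGER, Name TEXT, Rank INTEGER)")
    conn.executemany(
        "INSERT INTO MyTable (Dep, Name) VALUES (?, ?)",
        [(1, "Jeff"), (1, "Jeff"), (1, "Paul"), (2, "Nick"), (2, "Nick")],
    )
    return conn

def fill_rank(conn):
    """Set Rank to the number of rows in the same (Dep, Name) group up to this row."""
    conn.execute(
        """
        UPDATE MyTable
        SET Rank = (SELECT COUNT(*)
                    FROM MyTable AS T2
                    WHERE T2.Dep = MyTable.Dep
                      AND T2.Name = MyTable.Name
                      AND T2.rowid <= MyTable.rowid)
        """
    )
    conn.commit()

conn = build_rank_demo()
fill_rank(conn)
ranked = conn.execute(
    "SELECT Dep, Name, Rank FROM MyTable ORDER BY rowid"
).fetchall()
```

The correlated subquery is evaluated once per row, so on large tables an index on (Dep, Name) would probably help.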

SQLite subquery: "IN" the result of the outer query

I have two tables user and pair. I want to get the number of duplicate pairs (a, b) for each user.name.
user
name | id
-------------
"Alice" | 0
"Bob" | 1
"Alice" | 2
pair
id | a | b
-----------
0 | 0 | 1
0 | 1 | 3
1 | 0 | 1
2 | 1 | 3
In the above example, the result should be:
name | id | c
-------------------
"Alice" | 0,2 | 1
"Bob" | 1 | 0
When there is only one id for each user, I can do this:
SELECT name, id, (
SELECT COUNT(*) FROM pair JOIN pair AS p USING (id, a, b)
WHERE id = user.id AND pair.rowid < p.rowid
) AS c FROM user;
When there are multiple ids, I can get the correct result from the query below, but it is quite slow when there are more rows and more subqueries.
SELECT name, GROUP_CONCAT(id), (
WITH t AS (SELECT id FROM user AS u WHERE name = user.name)
SELECT COUNT(*) FROM pair JOIN pair AS p USING (a, b)
WHERE pair.id IN t AND p.id IN t AND pair.rowid < p.rowid
) AS c FROM user GROUP BY name;
Is there a simple and efficient way to do this, such as changing the WHERE clause from pair.id = user.id to pair.id IN <<the user.id list>>?
/* This will not work! "Error: no such table: user.id" */
SELECT name, GROUP_CONCAT(id), (
SELECT COUNT(*) FROM pair JOIN pair AS p USING (a, b)
WHERE pair.id IN user.id AND p.id IN user.id AND pair.rowid < p.rowid
) AS c FROM user GROUP BY name;
The GROUP BY name operation can be sped up if the database is able to go through the rows in order, without having to sort the table.
This can be done with an index on the name column (the other column makes this a covering index, which helps only a little more):
CREATE INDEX user_name_id_index ON user(name, id);
The query looks up pair rows by their id, a, and b values; these lookups can be sped up with an index on these columns:
CREATE INDEX pair_id_a_b_index ON pair(id, a, b);
To help the query optimizer make better decisions when selecting indexes, run ANALYZE.
The query optimizer gets improved constantly; get the newest SQLite version, if possible.
To check how your queries are executed, look at the output of the EXPLAIN QUERY PLAN command.
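The index, ANALYZE, and EXPLAIN QUERY PLAN steps can be tried together in a small sketch with Python's built-in sqlite3 module. The schema is assumed from the question, the helper names are made up, and the exact plan wording differs between SQLite versions, so no expected output is shown.

```python
import sqlite3

def build_pair_demo():
    """Create the user and pair tables with both suggested indexes, then ANALYZE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (name TEXT, id INTEGER);
        CREATE TABLE pair (id INTEGER, a INTEGER, b INTEGER);
        INSERT INTO user VALUES ('Alice', 0), ('Bob', 1), ('Alice', 2);
        INSERT INTO pair VALUES (0, 0, 1), (0, 1, 3), (1, 0, 1), (2, 1, 3);
        CREATE INDEX user_name_id_index ON user(name, id);
        CREATE INDEX pair_id_a_b_index ON pair(id, a, b);
        ANALYZE;
        """
    )
    return conn

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]

conn = build_pair_demo()
plan = query_plan(conn, "SELECT name, GROUP_CONCAT(id) FROM user GROUP BY name")
for step in plan:
    print(step)
```

If a plan line mentions an index name, that index is being used for the step; a line about a temporary B-tree for GROUP BY would suggest the rows are being sorted instead.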

Passing result variable to nested SELECT statement in Sqlite

I have the following query which works:
SELECT
SoftwareList,
Count (SoftwareList) as Count
FROM [assigned]
GROUP BY SoftwareList
This returns the following result set:
*SoftwareList* | *Count*
--------------------------
Office XP | 3
Adobe Reader | 3
Dreamweaver | 2
I can also run the following query:
SELECT
GROUP_CONCAT(LastSeen) as LastSeen
FROM [assigned]
WHERE SoftwareList = 'Dreamweaver';
Which would return the following result set:
*LastSeen*
----------
2007-9-23,2012-3-12
I wish to combine both of these queries into one, so that the following results are returned:
*SoftwareList* | *Count* | *LastSeen*
--------------------------------------------------------
Office XP | 3 | 2001-2-12,2008-3-19,2002-2-17
Adobe Reader | 3 | 2008-2-12,2009-3-20,2007-3-16
Dreamweaver | 2 | 2007-9-23,2012-3-12
I am trying this but don't know how to refer to the initial SoftwareList variable within the nested statement:
SELECT
SoftwareList,
Count (SoftwareList) as Count,
(SELECT
GROUP_CONCAT(LastSeen) FROM [assigned]
WHERE SoftwareList = SoftwareList
) as LastSeen
FROM [assigned]
GROUP BY SoftwareList;
How can I pass SoftwareList which is returned for each row, into the nested statement?
I think this is what you want:
SELECT SoftwareList, COUNT(SoftwareList) AS Count, GROUP_CONCAT(LastSeen)
FROM assigned GROUP BY SoftwareList
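Here is a small sketch of that single-query approach using Python's built-in sqlite3 module. The assigned table is rebuilt from the sample output in the question, and the helper names are hypothetical. SQLite does not guarantee the order of values inside GROUP_CONCAT, so the sketch sorts them before comparing.

```python
import sqlite3

def build_assigned_demo():
    """Create the assigned table in memory with rows taken from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assigned (SoftwareList TEXT, LastSeen TEXT)")
    conn.executemany(
        "INSERT INTO assigned VALUES (?, ?)",
        [("Office XP", "2001-2-12"), ("Office XP", "2008-3-19"),
         ("Office XP", "2002-2-17"), ("Adobe Reader", "2008-2-12"),
         ("Adobe Reader", "2009-3-20"), ("Adobe Reader", "2007-3-16"),
         ("Dreamweaver", "2007-9-23"), ("Dreamweaver", "2012-3-12")],
    )
    return conn

def software_summary(conn):
    """Return (SoftwareList, count, comma-separated LastSeen) per software title."""
    return conn.execute(
        "SELECT SoftwareList, COUNT(SoftwareList), GROUP_CONCAT(LastSeen) "
        "FROM assigned GROUP BY SoftwareList"
    ).fetchall()

conn = build_assigned_demo()
summary = {
    name: (count, sorted(seen.split(",")))
    for name, count, seen in software_summary(conn)
}
```

Because both aggregates run over the same group, no nested SELECT is needed to pass SoftwareList around.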

How to use ROW_NUMBER in sqlite

Here is my query given below.
select * from data where value = "yes";
My id column is auto-incremented; the result of this query is shown below.
id || value
1 || yes
3 || yes
4 || yes
6 || yes
9 || yes
How can I use ROW_NUMBER in SQLite so that I get the result shown below?
NoId || value
1 || yes
2 || yes
3 || yes
4 || yes
5 || yes
ROW_NUMBER AS NoId.
SQLite release 3.25.0 adds support for window functions:
2018-09-15 (3.25.0)
Add support for window functions
Window Functions :
A window function is a special SQL function where the input values are taken from a "window" of one or more rows in the results set of a SELECT statement.
SQLite supports the following 11 built-in window functions:
row_number()
The number of the row within the current partition. Rows are numbered starting from 1 in the order defined by the ORDER BY clause in the window definition, or in arbitrary order otherwise.
So your query could be rewritten as:
select *, ROW_NUMBER() OVER(ORDER BY Id) AS NoId
from data
where value = 'yes';
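The window-function query can be checked with Python's built-in sqlite3 module, as in the sketch below. It assumes the bundled SQLite library is 3.25.0 or later (sqlite3.sqlite_version reports the actual version). The rows with value 'no' are invented filler to reproduce the id gaps from the question, and the helper names are made up.

```python
import sqlite3

def build_data_demo():
    """Create the data table; 'no' rows are filler that explains the id gaps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY AUTOINCREMENT, value TEXT)")
    values = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes"]
    conn.executemany("INSERT INTO data (value) VALUES (?)", [(v,) for v in values])
    return conn

def numbered_yes_rows(conn):
    """Number the matching rows 1..n in id order with ROW_NUMBER()."""
    return conn.execute(
        "SELECT ROW_NUMBER() OVER (ORDER BY id) AS NoId, value "
        "FROM data WHERE value = 'yes' ORDER BY id"
    ).fetchall()

conn = build_data_demo()
numbered = numbered_yes_rows(conn)
```

The WHERE clause is applied before the window function, so the numbering covers only the filtered rows.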
Try this query
select id, value, (select count(*) from tbl b where a.id >= b.id) as cnt
from tbl a
| id | value | cnt |
--------------------
| 1 | yes | 1 |
| 3 | yes | 2 |
| 4 | yes | 3 |
| 6 | yes | 4 |
| 9 | yes | 5 |
I adapted the fiddle answer slightly and got exactly the expected result:
select id, value ,
(select count(*) from data b where a.id >= b.id and b.value='yes') as cnt
from data a where a.value='yes';
result
1|yes|1
3|yes|2
4|yes|3
6|yes|4
9|yes|5
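For SQLite builds older than 3.25, the correlated COUNT(*) query above can be tried with Python's built-in sqlite3 module. This is only a sketch: the 'no' rows are invented filler to reproduce the id gaps, and the helper names are hypothetical.

```python
import sqlite3

def build_counting_demo():
    """Create the data table with explicit ids so the gaps match the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany(
        "INSERT INTO data (id, value) VALUES (?, ?)",
        [(1, "yes"), (2, "no"), (3, "yes"), (4, "yes"), (5, "no"),
         (6, "yes"), (7, "no"), (8, "no"), (9, "yes")],
    )
    return conn

def rows_with_running_count(conn):
    """Emulate ROW_NUMBER by counting matching rows whose id is <= the current id."""
    return conn.execute(
        """
        SELECT a.id, a.value,
               (SELECT COUNT(*) FROM data b
                WHERE a.id >= b.id AND b.value = 'yes') AS cnt
        FROM data a
        WHERE a.value = 'yes'
        ORDER BY a.id
        """
    ).fetchall()

conn = build_counting_demo()
counted = rows_with_running_count(conn)
```

The filter has to appear in both the outer query and the subquery; otherwise the 'no' rows would be counted and the numbering would show the same gaps as id.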
UPDATE: sqlite3 version 3.25 now supports window functions including:
row_number() over(order by id)
The ROW_NUMBER() window function can also be used with an empty OVER() clause, like so (credit to @forpas):
select *, ROW_NUMBER() OVER() AS NoId
from data
where value = 'yes';
SELECT (SELECT COUNT(*)
FROM main AS t2
WHERE t2.col1 < t1.col1)
+ (SELECT COUNT(*)
FROM main AS t3
WHERE t3.col1 = t1.col1 AND t3.col1 < t1.col1) AS rowNum,
*
FROM Table_name t1
WHERE rowNum=0
ORDER BY t1.col1 ASC
