Kusto: Full table scan on join even when join conditions are time-based? - azure-data-explorer

I have a Kusto Query like:
(Events
| take 1)
| join kind=leftouter Sensor_Data on $left.start_timestmp == $right.timestmp, someotherfield
and it will never return. The right side of the join has several billion entries.
If I do a
Events
| take 1
and use the result in the where clause of Sensor_Data it returns in no time.
The MS Support Team explains that this query requires a full table scan of the Sensor_Data table. The join parameters are not taken into consideration by the query optimizer.
Question:
Is the Kusto Query Optimizer really not able to optimize queries based on the join condition? To me it sounds a little bit like 1999 to have to first do the left side of the query manually and then do the right side manually as well. Is there some hint or similar to make this run fast?

Consider rewriting your query as follows (for example) and see if that helps it perform better:
let x = toscalar(Events | take 1 | project pack("ts", start_timestmp, "sof", someotherfield));
Sensor_Data
| where timestmp == todatetime(x["ts"])
| where someotherfield == tostring(x["sof"])

Related

join kusto database with a function from another database

Edit:
Below are more details about what I am trying to achieve.
I have 1 cluster with two databases. Database1 contains Table1 with the below info
DB2 contains a table with the below info:
UniqueID contains the same data as userID on table1.
In DB2, I have a function that performs some filtering on Unique ID.
I know that I can join both tables as the below screenshot, but my production example is way more complex than that and requires me to use the function.
My goal is to run the Table1 query and pass the userID to the function getUserProperties, to get a merged output.
Something Like this which does not work :)
Finally found it :) It was very simple :)
let UID = 7;
database('db1').Table1
| join kind=leftouter database('db2').Table2 on $left.userID==$right.UniqueID
| where userID == toscalar(getUserProperties(UID))

Using the results of the output of one query as table names

I have written the following code, which extracts the names of tables that have used storage in Sentinel. I'm not sure if Kusto is capable of doing this, but essentially I want to use the values stored in "Out" as table names, e.g. union(Out) or search in (Out) *
let MyTables =
Usage
| where Quantity > 0
| summarize by DataType ;
let Out =
MyTables
| summarize make_list(DataType);
No, this is not possible. The tables referenced by the query should be known during query planning. A possible workaround can be generating the query and invoking it using the execute_query plugin.
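The "generating the query" half of that workaround can also be done on the client side. The following is a minimal sketch in Python, not the execute_query plugin itself: it assumes the table names were already fetched by a first query (the Usage query above) and only builds the text of a second query. The helper names quote_table_name and build_union_query are hypothetical, and the table names in the usage example are placeholders.

```python
def quote_table_name(name: str) -> str:
    """Wrap a table name in KQL bracket quoting, e.g. ['Syslog']."""
    # Refuse names that would break out of the quoted identifier.
    if "'" in name or "\\" in name or not name.strip():
        raise ValueError(f"unsupported table name: {name!r}")
    return "['" + name + "']"


def build_union_query(table_names, tail=""):
    """Build the text of a 'union' query over the given table names.

    table_names: names returned by a first query (hypothetical input).
    tail: optional KQL operators appended after the union, e.g. "| count".
    """
    names = sorted(set(table_names))  # de-duplicate, stable order
    if not names:
        raise ValueError("at least one table name is required")
    query = "union " + ", ".join(quote_table_name(n) for n in names)
    if tail:
        query += "\n" + tail
    return query


if __name__ == "__main__":
    # Placeholder names standing in for the DataType values.
    print(build_union_query(["Syslog", "SecurityEvent", "Syslog"], "| count"))
```

The generated text would then be sent as a second, separate query, so the tables are known by the time that query is planned.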

Ambiguous column name, joining three tables

I am using "DB Browser for SQLite" to test my SQL query and I get the error:
ambiguous column name: Devices.Label
I have a database with three tables:
I want to get a result with the latest timestamp from each device, so something like:
Hot | 1545074701 | 25.8
Hot | 1545071101 | 23.2
Cool | 1545071101 | 24.0
My query looks like:
SELECT Devices.Label, max(Humidity.Timestamp), Humidity.H, max(Temperature.Timestamp), Temperature.C
FROM Temperature, Humidity
INNER JOIN Devices ON Temperature.DeviceID = Devices.DeviceID
INNER JOIN Devices ON Humidity.DeviceID = Devices.DeviceID;
What am I missing? Have I not specified from which table I am taking the info?
And the next step would of course be to get the proper result from the query, this will then be sent as a JSON object to an Android application.
Thanks to sticky bit I changed my query to:
SELECT
DevHum.Label, MAX(Humidity.Timestamp), Humidity.H, Temperature.C
FROM
Temperature, Humidity
INNER JOIN
Devices AS DevTemp ON Temperature.DeviceID = DevTemp.DeviceID
INNER JOIN
Devices AS DevHum ON Humidity.DeviceID = DevHum.DeviceID;
Which seems to do the trick.
I decided to split it up into two queries, one for Hot and one for Cool, as combining them is above my level right now.
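For the follow-up goal (the latest reading per device in one query), one possible approach is a correlated subquery per measurement table. This is only a sketch: the table definitions were not included in the post, so the schema below is an assumption reconstructed from the column names in the queries, the sample rows are invented for illustration, and latest_readings is a hypothetical helper name.

```python
import sqlite3

# Assumed schema, inferred from the column names used in the question.
SCHEMA = """
CREATE TABLE Devices (DeviceID INTEGER PRIMARY KEY, Label TEXT);
CREATE TABLE Temperature (DeviceID INTEGER, Timestamp INTEGER, C REAL);
CREATE TABLE Humidity (DeviceID INTEGER, Timestamp INTEGER, H REAL);
"""

# Each device is joined to its newest Temperature row and newest Humidity row.
# Every column is qualified with an alias, so no name is ambiguous.
LATEST_SQL = """
SELECT d.Label, t.Timestamp, t.C, h.Timestamp, h.H
FROM Devices AS d
LEFT JOIN Temperature AS t
       ON t.DeviceID = d.DeviceID
      AND t.Timestamp = (SELECT MAX(t2.Timestamp) FROM Temperature AS t2
                         WHERE t2.DeviceID = d.DeviceID)
LEFT JOIN Humidity AS h
       ON h.DeviceID = d.DeviceID
      AND h.Timestamp = (SELECT MAX(h2.Timestamp) FROM Humidity AS h2
                         WHERE h2.DeviceID = d.DeviceID)
ORDER BY d.Label;
"""


def latest_readings(conn):
    """Return (Label, temp_ts, C, hum_ts, H) with the newest rows per device."""
    return conn.execute(LATEST_SQL).fetchall()


def make_sample_db():
    """Build an in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Devices VALUES (?, ?)",
                     [(1, "Hot"), (2, "Cool")])
    conn.executemany("INSERT INTO Temperature VALUES (?, ?, ?)",
                     [(1, 1545071101, 23.2), (1, 1545074701, 25.8),
                      (2, 1545071101, 24.0)])
    conn.executemany("INSERT INTO Humidity VALUES (?, ?, ?)",
                     [(1, 1545074701, 40.0), (2, 1545071101, 55.5)])
    return conn


if __name__ == "__main__":
    for row in latest_readings(make_sample_db()):
        print(row)
```

The LEFT JOINs keep a device in the output even when it has no rows in one of the measurement tables; an INNER JOIN would drop such devices.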

App Insights 'join' that returns only the first result

I have a job running hourly (at slightly different times) and logging metrics into Application Insights.
I want to trigger an alert based on the metrics from the latest job run.
let metrics = customMetrics | where ... | extend run = bin(timestamp, 1m);
let latestRun = metrics | top 1 by run desc;
metrics | join latestRun on run
Looking at metrics I can see this query should be returning 8 results. But it returns only the first of them. Why?
Surprisingly, this is by design - the query language does not use inner joins by default, instead it uses "innerunique" joins.
Switching to join kind=inner fixed my original query.
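The difference can be modelled outside Kusto. The sketch below is plain Python, not Kusto code: it imitates "innerunique" by keeping one left-side row per join key before doing an ordinary inner join. Kusto may keep any row for a key; keeping the first one here is an assumption of this sketch, and the sample rows are invented.

```python
def inner_join(left, right, key):
    """Every left row paired with every right row that has the same key."""
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def innerunique_join(left, right, key):
    """Model of 'innerunique': de-duplicate the left side on the key first.

    This sketch keeps the first row seen per key (an assumption).
    """
    seen = set()
    deduped = []
    for row in left:
        if row[key] not in seen:
            seen.add(row[key])
            deduped.append(row)
    return inner_join(deduped, right, key)


# Invented sample: 3 metrics from an earlier run, 8 from the latest run.
metrics = (
    [{"run": "2024-01-01T09:04", "name": f"m{i}"} for i in range(3)]
    + [{"run": "2024-01-01T10:05", "name": f"m{i}"} for i in range(8)]
)
# Equivalent of: metrics | top 1 by run desc
latest_run = [max(metrics, key=lambda r: r["run"])]

if __name__ == "__main__":
    print(len(innerunique_join(metrics, latest_run, "run")))  # default join kind
    print(len(inner_join(metrics, latest_run, "run")))        # kind=inner
```

With metrics on the left side, all eight rows of the latest run share one key value, so the de-duplication step leaves a single row to match.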

How to increase performance on a sqlite3 database that uses Qt?

I am using QSqlQuery on a sqlite3 database. To fetch a particular item, I was populating the result from 4 different tables. I thought joining the tables would increase the performance/speed and get the result faster. So I joined 2 tables initially, but it takes longer to fetch the data after joining the tables (?)
Any suggestion on how to improve the performance would be really appreciated. Also, I was looking at http://qt-project.org/doc/qt-4.8/qsqlquery.html, where it is mentioned that using setForwardOnly would increase the performance on some databases. Any idea if it would work for SQLite3?
Thanks!
According to this link,
http://sqlite.org/cvstrac/wiki?p=PerformanceTuning
SQLite implements JOIN USING by translating the USING clause into some extra WHERE clause terms. It does the same with NATURAL JOIN and JOIN ON. So while those constructs might be helpful to the human reader, they don't really make any difference to SQLite's query optimizer.
I was wrong to join two tables and expect the fetch to be faster; that does not help with an SQLite database. Instead, using a "where" clause and joining the two results directly definitely has some positive impact on the performance.
(example:
select * from A, B where A.id = B.id and A.id = 1; instead of
select * from A inner join B on A.id = B.id where A.id = 1)
SQLite translates the second statement into the first before compiling, so you can save a small amount of CPU time by using the first statement directly.
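The two ways of writing the join can be compared directly with Python's built-in sqlite3 module. This is a small sketch under stated assumptions: the tables A and B, their columns, and the sample rows are invented, compare_forms is a hypothetical helper, and the explicit form uses a plain inner JOIN ... ON. It runs both statements and collects the EXPLAIN QUERY PLAN output so the plans can be inspected side by side.

```python
import sqlite3

WHERE_FORM = "SELECT * FROM A, B WHERE A.id = B.id AND A.id = 1"
JOIN_FORM = "SELECT * FROM A INNER JOIN B ON A.id = B.id WHERE A.id = 1"


def make_db():
    """In-memory database with two small invented tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE B (id INTEGER, value TEXT);
        CREATE INDEX B_id ON B (id);
    """)
    conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, "one"), (2, "two")])
    conn.executemany("INSERT INTO B VALUES (?, ?)",
                     [(1, "x"), (1, "y"), (2, "z")])
    return conn


def compare_forms(conn):
    """Return (rows, plan) for each form of the query."""
    result = {}
    for label, sql in (("where", WHERE_FORM), ("join", JOIN_FORM)):
        rows = sorted(conn.execute(sql).fetchall())
        # The last column of each EXPLAIN QUERY PLAN row is the plan text.
        plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql)]
        result[label] = (rows, plan)
    return result


if __name__ == "__main__":
    for label, (rows, plan) in compare_forms(make_db()).items():
        print(label, rows)
        for step in plan:
            print("   ", step)
```

For larger tables, an index on the join column (as on B.id here) is likely to matter far more than which of the two syntaxes is used.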
