Get maximal value per Azure Data Explorer table from tables with same schema - azure-data-explorer

I have multiple telemetry tables (Metric_1, Metric_2, Metric_3), and all of them have the same schema (e.g. they all contain a Timestamp column). I'd like to get the most recent timestamp per table.
I found that a union with a wildcard is possible, but the query
union Metric_*
| summarize Max = max(Timestamp)
never actually finished.
The query
Metric_1
| top 1 by Timestamp
takes no time. But even summarize on a single table takes forever (I killed it after 2 minutes):
Metric_1
| summarize Max = max(Timestamp)
Can you explain the time difference and suggest how to accomplish what I need? The outcome should be:
Table | MaxTimestamp
Metric_1 | Date1
Metric_2 | Date2
Metric_3 | Date3

Related

Aggregate/Summarize Timeseries data in Azure Data Explorer using Kusto

I have a requirement to regularize/aggregate data that is polled every 1 second into 1-minute intervals. I also have two columns that need to be carried through the aggregation, say SensorName and SensorValue. I am able to bin the timestamp to 1 minute, but I am not able to get the corresponding two columns. How do I do that? Below is the query I used and the output I get.
Table
| where TimeStamp between (datetime(2020-09-01)..datetime(2020-09-30))
| summarize by bin(TimeStamp, 1min)
Based on my understanding of the question (I could be wrong, as there's no clear specification of sample input/schema and matching output), you could try following this example. It calculates the average sensor value for each sensor name, using an aggregation span of 1 minute:
Table
| where TimeStamp between (datetime(2020-09-01)..datetime(2020-09-30))
| summarize avg(SensorValue) by SensorName, bin(TimeStamp, 1min)
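To make the grouping concrete, here is a rough Python sketch of what that summarize computes: each reading is assigned to the start of its minute, and the values are averaged per (SensorName, minute) pair. The sample readings and the helper name bin_minute are made up for illustration; this only mirrors the logic and is not how Kusto executes the query.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample readings: (SensorName, TimeStamp, SensorValue)
readings = [
    ("temp", datetime(2020, 9, 1, 0, 0, 1), 10.0),
    ("temp", datetime(2020, 9, 1, 0, 0, 2), 20.0),
    ("temp", datetime(2020, 9, 1, 0, 1, 5), 30.0),
    ("humidity", datetime(2020, 9, 1, 0, 0, 30), 50.0),
]


def bin_minute(ts):
    """Round a timestamp down to the start of its minute, like bin(TimeStamp, 1min)."""
    return ts.replace(second=0, microsecond=0)


# Collect the values for every (sensor name, minute) group.
groups = defaultdict(list)
for name, ts, value in readings:
    groups[(name, bin_minute(ts))].append(value)

# Average each group, like avg(SensorValue).
averages = {key: sum(values) / len(values) for key, values in groups.items()}

for (name, minute), avg in sorted(averages.items()):
    print(name, minute.isoformat(), avg)
```

Both grouping keys end up in the output, which is the part the original query was missing: it had the bin in the by clause but no aggregate and no SensorName.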

kusto query - how to group by date and also group by name

In the query below I am looking at the 80th percentile of the duration of one API (foo/bar1) called in a given date range, so that I can see if there is any spike or degradation.
let dataset = requests
| where name == "GET foo/bar1"
and timestamp between(datetime("2020-10-15") .. datetime('2020-10-28'));
dataset
| summarize loadTime = round(percentile(duration, 80)) by format_datetime(timestamp, 'yyyy-MM-dd')
| order by timestamp desc
The challenge I'm facing is that there can be more than one API (there are about 150 in my environment). I also want to get the 80th percentile for each of those APIs, but I'm having difficulty working out how to do it, or whether it is even possible.
I may have figured this out: remove the 'name' filter from dataset, then add 'name' to the grouping section at the end of the summarize line.
let dataset = requests
| where timestamp between(datetime("2020-10-25") .. datetime('2020-10-28'));
dataset
| summarize loadTime = round(percentile(duration, 80)) by format_datetime(timestamp, 'yyyy-MM-dd'), name
| order by timestamp desc

Count by minute in Riak TS

I'm trying to grasp the recently added group by in Riak TS.
I'm unable to find a way to group my results by minute, e.g. count. I'll show an example below.
CREATE TABLE FreightMinuteResult
(
    result VARCHAR NOT NULL,
    time TIMESTAMP NOT NULL,
    PRIMARY KEY (
        (QUANTUM(time, 1, 'm')),
        time
    )
)
Inserts
INSERT INTO FreightMinuteResult VALUES ('Novo', '2017-12-07 12:03:45Z');
INSERT INTO FreightMinuteResult VALUES ('Novo', '2017-12-07 12:04:45Z');
INSERT INTO FreightMinuteResult VALUES ('Novo', '2017-12-07 12:05:45Z');
INSERT INTO FreightMinuteResult VALUES ('Novo', '2017-12-07 12:05:46Z');
Query
select count(*) from FreightMinuteResult where time > '2017-12-07 12:01:00Z' and time < '2017-12-07 12:06:00Z' group by time;
The result is
+--------+--------------------+
|COUNT(*)|        time        |
+--------+--------------------+
|   1    |2017-12-07T12:04:45Z|
|   1    |2017-12-07T12:03:45Z|
|   1    |2017-12-07T12:05:45Z|
|   1    |2017-12-07T12:05:46Z|
+--------+--------------------+
How to count the number of occurrences per minute using Riak TS?
Thanks.
The quantum is used to organize the data in the backend to streamline query operations, while group by uses the exact value of the specified field. The timestamps 2017-12-07T12:05:45Z and 2017-12-07T12:05:46Z occur in the same minute and will therefore be stored in the same location on disk, but they are still stored as distinct second-resolution timestamp values that will be grouped separately.
If you want to be able to group by the minute you will need to either round the timestamps when inserting, or modify your table to include a minute field that can be grouped.
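As a sketch of the first option (rounding before insert), the snippet below truncates each timestamp to its minute on the client side and counts per minute. It is plain Python with no Riak client involved; the helper names parse and floor_to_minute are hypothetical, and the rows are the four inserts from the question.

```python
from collections import Counter
from datetime import datetime, timezone

# The four rows from the question: (result, time)
rows = [
    ("Novo", "2017-12-07 12:03:45Z"),
    ("Novo", "2017-12-07 12:04:45Z"),
    ("Novo", "2017-12-07 12:05:45Z"),
    ("Novo", "2017-12-07 12:05:46Z"),
]


def parse(text):
    """Parse the 'YYYY-MM-DD HH:MM:SSZ' strings used in the inserts as UTC."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%SZ").replace(tzinfo=timezone.utc)


def floor_to_minute(ts):
    """Drop seconds and microseconds so all readings in one minute share a value."""
    return ts.replace(second=0, microsecond=0)


# Count occurrences per rounded minute.
counts = Counter(floor_to_minute(parse(time_text)) for _, time_text in rows)

for minute, count in sorted(counts.items()):
    print(minute.isoformat(), count)
```

If the rounded value is what gets stored (or stored in an extra minute column next to the exact time), a group by on that field should then put 12:05:45 and 12:05:46 in the same group.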

Empty result in association parent-children when there are not children

I have the following schema:
hours table: this table holds "constant" data. It never changes because it only stores the schedulable hours
hour (int)
----
8
9
10
appointments table
hour (int) | date (text)
--------------------------
10 | 25/08/2015
In my application I want to show only the hours that are available for a new appointment, based on an hour-date filter. For example, for these days:
25/08/2015: available hours are 8 and 9 because 10 is already taken
26/08/2015: available hours are 8, 9 and 10 because there are no appointments on that date.
At the beginning I was using this query:
select h.hour
from hours h, appointment a
where h.hour != a.hour and a.date = 'the-date';
This query only works if there are appointments on the given date; for dates without any appointments it returns an empty result. I could handle this in the application, but I am trying to exhaust all of the database's possibilities first.
You could use an outer join, but a subquery might be easier to understand:
SELECT hour
FROM hours
WHERE hour NOT IN (SELECT hour
                   FROM appointment
                   WHERE date = ?)
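Here is a small way to try the subquery against the sample data from the question. It assumes SQLite (through Python's sqlite3 module) only because that is easy to run; the question does not say which database is in use. The function name available_hours is made up, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hours (hour INTEGER);
    CREATE TABLE appointment (hour INTEGER, date TEXT);
    INSERT INTO hours VALUES (8), (9), (10);
    INSERT INTO appointment VALUES (10, '25/08/2015');
""")


def available_hours(date):
    """Return the hours that have no appointment on the given date."""
    rows = conn.execute(
        "SELECT hour FROM hours "
        "WHERE hour NOT IN (SELECT hour FROM appointment WHERE date = ?) "
        "ORDER BY hour",  # ORDER BY is an addition for a stable result order
        (date,),
    )
    return [hour for (hour,) in rows]


print(available_hours("25/08/2015"))  # [8, 9]
print(available_hours("26/08/2015"))  # [8, 9, 10]
```

One thing to watch: if appointment.hour can ever be NULL for the chosen date, NOT IN matches nothing and the result is empty, so the column should be NOT NULL or the subquery should filter the NULLs out.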

Getting the sum of a category in a specific month in sqlite

I am trying to get the sum of all categories from a certain month from my transactions table in my sqlite database. Here is how the table is set up...
| id | transactionDate | transactionAmount | transactionCategory | transactionAccount |
Now, I want to specify three things:
The account name
The month
The year
And get the sum of the transactionAmount grouped by transactionCategory from the specified account, year, and month.
Here is what my SELECT statement looks like...
SELECT SUM(transactionAmount) AS total, transactionDate, transactionCategory
FROM transactions
WHERE transactionAccount=? AND strftime('%m', transactionDate)=? AND strftime('%y', transactionDate)=?
GROUP BY transactionCategory ORDER BY transactionCategory
Unfortunately, this returns zero rows. I am able to get accurate results if I don't try and select the month and year, but I would like to see the data from specific ranges of time...
I figured out the issue. I was simply formatting the year incorrectly. It should have been strftime('%Y', transactionDate)=? NOT strftime('%y', transactionDate)=? - the difference being a capital Y vs. a lowercase one.
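For anyone who wants to see the difference, here is a minimal sketch using Python's sqlite3 module. It assumes transactionDate is stored as ISO text (YYYY-MM-DD), which strftime needs in order to parse it, and that the month is passed as a zero-padded string such as '08'. The sample rows and the function name category_totals are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, transactionDate TEXT, transactionAmount INTEGER, "
    "transactionCategory TEXT, transactionAccount TEXT)"
)
# Hypothetical sample rows; dates are assumed to be stored as YYYY-MM-DD.
conn.executemany(
    "INSERT INTO transactions "
    "(transactionDate, transactionAmount, transactionCategory, transactionAccount) "
    "VALUES (?, ?, ?, ?)",
    [
        ("2015-08-03", 10, "food", "checking"),
        ("2015-08-20", 20, "food", "checking"),
        ("2015-08-01", 500, "rent", "checking"),
        ("2015-09-01", 500, "rent", "checking"),
        ("2015-08-05", 99, "food", "savings"),
        ("2014-08-05", 7, "food", "checking"),
    ],
)


def category_totals(account, month, year, year_format="%Y"):
    """Sum amounts per category for one account, month ('08') and year ('2015')."""
    rows = conn.execute(
        "SELECT transactionCategory, SUM(transactionAmount) AS total "
        "FROM transactions "
        "WHERE transactionAccount = ? "
        "AND strftime('%m', transactionDate) = ? "
        "AND strftime(?, transactionDate) = ? "
        "GROUP BY transactionCategory ORDER BY transactionCategory",
        (account, month, year_format, year),
    )
    return rows.fetchall()


print(category_totals("checking", "08", "2015"))        # uses '%Y'
print(category_totals("checking", "08", "2015", "%y"))  # the original mistake
```

With '%Y' the query returns the food and rent totals for August 2015 on the checking account. With '%y' the year expression never equals '2015', so no rows come back, which matches the behaviour described in the question.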
