Kusto Query Language - Round datetime to nearest month using bin - azure-data-explorer

I have many logs, each with its own timestamp, and I am trying to count them on a monthly basis.
Here is a sample table and query using bin(30d):
datatable(Date:datetime, Log:string)[
datetime(2018-02-02T15:14),"log1",
datetime(2018-03-23T12:14),"log2",
datetime(2018-03-24T16:14),"log3",
datetime(2019-04-26T15:14),"log4"]
| summarize count(Log) by bin(Date,30d)
The output I want:
Date count_Log
2018-02 00:00:00.0000000 1
2018-03 00:00:00.0000000 2
2019-04 00:00:00.0000000 1
The output I get:
Date count_Log
2018-01-17 00:00:00.0000000 1 //see the date, it shows JAN but the log is of Feb
2018-03-18 00:00:00.0000000 2
2019-04-12 00:00:00.0000000 1
I need the summary by month. How can I do that? I accept the bin size as a parameter with values like 1h, 1d, 7d, 10d, etc., but there is no timespan for one month.
How can I do it without having to extract the month and year manually?

You can use the startofmonth() function, which rounds each datetime down to the first day of its month.
For example:
datatable(Date: datetime, Log: string)
[
datetime(2018-02-02T15:14), "log1",
datetime(2018-03-23T12:14), "log2",
datetime(2018-03-24T16:14), "log3",
datetime(2019-04-26T15:14), "log4"
]
| summarize count() by startofmonth(Date)
Column1 count_
2018-02-01 00:00:00.0000000 1
2018-03-01 00:00:00.0000000 2
2019-04-01 00:00:00.0000000 1
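To see why bin(Date, 30d) produces dates such as 2018-01-17, here is a minimal Python sketch (not KQL) that contrasts fixed-width bins with truncation to the start of the calendar month. It assumes that fixed-width bins are aligned to multiples of the bin width counted from 0001-01-01, which is consistent with the output shown in the question. The helper names bin_fixed and start_of_month are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumption: fixed-width bins are counted from this origin.
ORIGIN = datetime(1, 1, 1)

def bin_fixed(ts, width):
    """Round ts down to a multiple of width measured from ORIGIN."""
    return ts - (ts - ORIGIN) % width

def start_of_month(ts):
    """Round ts down to midnight on the first day of its month."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

logs = [
    datetime(2018, 2, 2, 15, 14),
    datetime(2018, 3, 23, 12, 14),
    datetime(2018, 3, 24, 16, 14),
    datetime(2019, 4, 26, 15, 14),
]

# Fixed 30-day bins do not follow month boundaries.
fixed_counts = Counter(bin_fixed(ts, timedelta(days=30)) for ts in logs)
# Calendar-month bins always start on the first of the month.
monthly_counts = Counter(start_of_month(ts) for ts in logs)

for label, counts in (("30d bins", fixed_counts), ("monthly", monthly_counts)):
    print(label)
    for bucket, n in sorted(counts.items()):
        print(" ", bucket, n)
```

A month is not a fixed number of days, so no fixed-width bin can line up with month boundaries; truncating to the first of the month avoids the problem.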

Related

Kusto query Past 7days off each day log count with respect to timestamp

I'm not an expert in Kusto queries; can someone help me with this?
I need the log count for each of the past 7 days, based on the timestamp column of the table.
For example, if today is Wednesday: Wednesday log count - 50
Tuesday log count - 105
Monday log count - 65 ...
and so on for each of the past 7 days.
If we assume today's date is 9/9/22, I need the log count for each day from 2/9/22 to 8/9/22.
If I understood correctly, you could try this:
TableName
| where Timestamp >= startofday(ago(7d))
| where Timestamp < startofday(now())
| summarize count() by startofday(Timestamp)
result example:
Timestamp count_
2022-09-02 00:00:00.0000000 2269683165695
2022-09-03 00:00:00.0000000 1950416520792
2022-09-04 00:00:00.0000000 1921780739235
2022-09-05 00:00:00.0000000 2161205348826
2022-09-06 00:00:00.0000000 2294524001765
2022-09-07 00:00:00.0000000 2351404246099
2022-09-08 00:00:00.0000000 2386512393024
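The window logic of that query can be sketched in Python (not KQL) to make the boundaries explicit: the window starts at midnight seven days ago and ends at midnight today, so today's partial day is excluded. The function names and the sample timestamps are hypothetical, and "now" is fixed to 9 September 2022 to match the question.

```python
from collections import Counter
from datetime import datetime, timedelta

def start_of_day(ts):
    """Round ts down to midnight."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def daily_counts(timestamps, now):
    """Count timestamps per day for the 7 full days before today."""
    window_start = start_of_day(now - timedelta(days=7))  # inclusive
    window_end = start_of_day(now)                        # exclusive
    counts = Counter(
        start_of_day(ts).date()
        for ts in timestamps
        if window_start <= ts < window_end
    )
    return dict(sorted(counts.items()))

now = datetime(2022, 9, 9, 14, 30)
sample = [
    datetime(2022, 9, 1, 23, 59),  # just before the window: excluded
    datetime(2022, 9, 2, 0, 0),    # first instant of the window
    datetime(2022, 9, 8, 12, 0),
    datetime(2022, 9, 8, 23, 59),  # last day of the window
    datetime(2022, 9, 9, 0, 0),    # today: excluded
]
result = daily_counts(sample, now)
for day, n in result.items():
    print(day, n)
```

Days with no rows do not appear in the result at all, which is also how a plain summarize by day behaves.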

Select data based on a day in a month, "If date is lesser than a day in a month"

I have a dataset with a date column like this:
id date
1 2018-02-13
2 2019-02-28
3 2014-01-23
4 2021-10-28
5 2022-01-23
6 2019-09-28
I want to select the rows whose date falls before 15th February, regardless of the year.
I tried an ifelse statement that defines 15th February for each year in my data, but I want to know if there is an easier way to do it.
Here's the result I'm expecting:
id date
1 2018-02-13
3 2014-01-23
5 2022-01-23
Thanks in advance
You could use lubridate
library(lubridate)
library(dplyr)
d %>% filter(month(date)<2 | (month(date)==2 & day(date)<=15))
Output:
id date
1 1 2018-02-13
2 3 2014-01-23
3 5 2022-01-23
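Here is another option. Using format(), check whether the month and day (formatted as "%m%d") fall between January 1st and February 15th. Note that between() is not part of base R; it is provided by packages such as dplyr and data.table.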
df[between(format(df$date, "%m%d"), "0101", "0215"),]
Output
id date
1 1 2018-02-13
3 3 2014-01-23
5 5 2022-01-23
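The same year-independent comparison can be sketched in Python by comparing (month, day) pairs, which sort the same way as the "%m%d" strings above. This is only an illustration using the sample data from the question; the function name on_or_before is hypothetical. Like both answers, it includes 15th February itself.

```python
from datetime import date

records = [
    (1, date(2018, 2, 13)),
    (2, date(2019, 2, 28)),
    (3, date(2014, 1, 23)),
    (4, date(2021, 10, 28)),
    (5, date(2022, 1, 23)),
    (6, date(2019, 9, 28)),
]

def on_or_before(d, month, day):
    """True if d falls on or before the given month/day in its own year."""
    # Tuples compare element by element: month first, then day.
    return (d.month, d.day) <= (month, day)

selected = [(i, d) for i, d in records if on_or_before(d, 2, 15)]
for i, d in selected:
    print(i, d)
```

To exclude 15th February itself, replace <= with < in on_or_before.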

Kusto Query: Get the latest date in a column

I have a Kusto table in which one of the columns (ReceivedDate) holds datetime values. How can I find the latest date (in this case 2021-07-27 00:00:00.0000000)? I want to store this value in a variable so that I can use it in a where clause.
ReceivedDate
2021-07-21 00:00:00.0000000
2021-07-22 00:00:00.0000000
2021-07-23 00:00:00.0000000
2021-07-24 00:00:00.0000000
2021-07-25 00:00:00.0000000
2021-07-26 00:00:00.0000000
2021-07-27 00:00:00.0000000
You can try the following, which uses toscalar() to store the maximum ReceivedDate in a variable:
let dt = toscalar(TableName | summarize max(ReceivedDate));
OtherTableName
| where Timestamp > dt
| ...

How to query SQLite data hourly?

The data table looks like the following:
ID DATE
1 2020-12-31 10:10:00
2 2020-12-31 20:30:00
3 2020-12-31 20:50:00
4 2021-01-02 17:10:00
5 2021-01-02 17:20:00
6 2021-01-02 17:30:00
7 2021-01-03 23:10:00
..
I would like to query only the last entry per hour per day, so that the result looks like this:
ID DATE
1 2020-12-31 10:10:00
3 2020-12-31 20:50:00
6 2021-01-02 17:30:00
7 2021-01-03 23:10:00
..
I tried to look for hourly query and found the following
strftime('%H', " + DATE + ", '+1 hours')
However, I'm not sure how to use it properly (e.g. with GROUP BY, and how to ensure it takes the latest entry of each hour), so it would be great to have some help here!
You can do it with ROW_NUMBER() window function:
SELECT ID, DATE
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY strftime('%Y%m%d%H', DATE) ORDER BY DATE DESC) rn
  FROM tablename
)
WHERE rn = 1
ORDER BY ID
Instead of strftime('%Y%m%d%H', DATE) you could also use substr(DATE, 1, 13).
For versions of SQLite previous to 3.25.0 which do not support window functions you can do it with NOT EXISTS:
SELECT t1.*
FROM tablename t1
WHERE NOT EXISTS (
  SELECT 1
  FROM tablename t2
  WHERE strftime('%Y%m%d%H', t2.DATE) = strftime('%Y%m%d%H', t1.DATE)
    AND t2.DATE > t1.DATE
)
Results:
ID DATE
1 2020-12-31 10:10:00
3 2020-12-31 20:50:00
6 2021-01-02 17:30:00
7 2021-01-03 23:10:00
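To try this out locally, the following sketch runs the NOT EXISTS variant through Python's built-in sqlite3 module against an in-memory copy of the sample data. The column types and the added ORDER BY are assumptions made for the example; the NOT EXISTS form was chosen because it does not depend on window-function support in the bundled SQLite version.

```python
import sqlite3

rows = [
    (1, "2020-12-31 10:10:00"),
    (2, "2020-12-31 20:30:00"),
    (3, "2020-12-31 20:50:00"),
    (4, "2021-01-02 17:10:00"),
    (5, "2021-01-02 17:20:00"),
    (6, "2021-01-02 17:30:00"),
    (7, "2021-01-03 23:10:00"),
]

conn = sqlite3.connect(":memory:")
# Assumed schema: the question does not show the table definition.
conn.execute("CREATE TABLE tablename (ID INTEGER PRIMARY KEY, DATE TEXT)")
conn.executemany("INSERT INTO tablename VALUES (?, ?)", rows)

# Keep a row only if no later row exists within the same day and hour.
query = """
SELECT t1.ID, t1.DATE
FROM tablename t1
WHERE NOT EXISTS (
  SELECT 1
  FROM tablename t2
  WHERE strftime('%Y%m%d%H', t2.DATE) = strftime('%Y%m%d%H', t1.DATE)
    AND t2.DATE > t1.DATE
)
ORDER BY t1.ID
"""
last_per_hour = conn.execute(query).fetchall()
for row in last_per_hour:
    print(row)
conn.close()
```

Because the DATE values are stored as ISO-8601 text, comparing them as strings with > gives the same order as comparing them chronologically.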

Creating a SQLite query

I have a SQLite database and want to write a query that groups records whose DateTime values are within 60 minutes of each other. The hard part is that the grouping is cumulative: each record is compared with the previous one, so three records with DateTimes 2019-12-14 15:40:00, 2019-12-14 15:56:00 and 2019-12-14 16:55:00 would all fall into one group. Please see the Hands table and the desired output of the query below to understand the requirement.
Database Table "Hands"
ID DateTime Result
1 2019-12-14 15:40:00 -100
2 2019-12-14 15:56:00 1000
3 2019-12-14 16:55:00 -2000
4 2012-01-12 12:00:00 400
5 2016-10-01 21:00:00 900
6 2016-10-01 20:55:00 1000
Desired output of query
StartTime Count Result
2019-12-14 15:40:00 3 -1100
2012-01-12 12:00:00 1 400
2016-10-01 20:55:00 2 1900
You can use some window functions to indicate at which record a new group should start (because of a datetime difference with the previous that is 60 minutes or larger), and then to turn that information into a unique group number. Finally you can group by that group number and perform the aggregation functions on it:
with base as (
  -- firstInGroup is 1 when the gap to the previous row is at least one hour
  -- (or when there is no previous row), otherwise 0
  select DateTime, Result,
         coalesce(cast((
           julianday(DateTime) - julianday(
             lag(DateTime) over (order by DateTime)
           )
         ) * 24 >= 1 as integer), 1) as firstInGroup
  from Hands
), step as (
  -- the running sum of the flags gives every group its own number
  select DateTime, Result,
         sum(firstInGroup) over (
           order by DateTime rows
           between unbounded preceding and current row) as grp
  from base
)
select min(DateTime) DateTime,
       count(*) Count,
       sum(Result) Result
from step
group by grp;
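The same idea (flag the first row of each group, then number the groups) can be sketched in plain Python, which may help when checking the query's logic step by step. This is an illustration using the sample Hands data, not a replacement for the SQL; the function name group_sessions and the max_gap parameter are hypothetical.

```python
from datetime import datetime, timedelta

hands = [
    ("2019-12-14 15:40:00", -100),
    ("2019-12-14 15:56:00", 1000),
    ("2019-12-14 16:55:00", -2000),
    ("2012-01-12 12:00:00", 400),
    ("2016-10-01 21:00:00", 900),
    ("2016-10-01 20:55:00", 1000),
]

def group_sessions(rows, max_gap=timedelta(minutes=60)):
    """Return (start_time, count, total) per chain of rows with gaps below max_gap."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), result) for ts, result in rows
    )
    groups = []
    previous = None
    for ts, result in parsed:
        # A new group starts at the first row or after a gap of max_gap or more.
        if previous is None or ts - previous >= max_gap:
            groups.append([ts, 0, 0])
        groups[-1][1] += 1
        groups[-1][2] += result
        previous = ts
    return [(str(start), count, total) for start, count, total in groups]

sessions = group_sessions(hands)
for session in sessions:
    print(session)
```

Each row is compared only with the row immediately before it, so a group can span far more than 60 minutes as long as no single gap reaches 60 minutes.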
