I have the following query:
customEvents
| summarize count(datepart("Second", timestamp) )
by toint(customMeasurements.Latency)
This is counting the number of seconds past the minute and grouping it by an integer Latency.
How do I add an order by operator to this to order by these columns?
To do this, you need to give the columns aliases.
You alias a column by prefixing its expression with column_alias=.
customEvents
| summarize Count=count(datepart("Second", timestamp) )
by Latency=toint(customMeasurements.Latency)
Then we can reference the columns by their aliases:
customEvents
| summarize Count=count(datepart("Second", timestamp) )
by Latency=toint(customMeasurements.Latency)
| order by Latency asc nulls last
I've looked at many answers on SO concerning situations related to this but I must not be understanding them too well as I didn't manage to get anything to work.
I have a table with the following columns:
timestamp (PK), type (STRING), val (INT)
I need to get the most recent 20 entries from each type and average the val column. I also need the COUNT() as there may be fewer than 20 rows for some of the types.
I can do the following if I want to get the average of ALL rows for each type:
SELECT type, COUNT(val), AVG(val)
FROM user_data
GROUP BY type
But I want to limit each group COUNT() to 20.
From here I tried the following:
SELECT type, (
SELECT AVG(val) AS ave
FROM (
SELECT val
FROM user_data AS ud2
WHERE ud2.timestamp = ud.timestamp
ORDER BY ud2.timestamp DESC
LIMIT 20
)
) AS ave
FROM user_data AS ud
GROUP BY type
But the returned average is not correct. The values it returns are as if the statement is only returning the average of a single row for each group (it doesn't change regardless of the LIMIT).
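Your subquery correlates on timestamp, which is the primary key, so it matches exactly one row for each outer row; that is why the LIMIT has no effect. In SQLite (3.25 or later, which added window functions), you can use ROW_NUMBER() in a subquery to pick out the most recent entries for each type before computing the average and count.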
SELECT
type,
AVG(val),
COUNT(1)
FROM (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY type
ORDER BY timestamp DESC
) rn
FROM
user_data
) t
WHERE rn <= 20
GROUP BY type
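As a quick way to check the ROW_NUMBER() approach, here is a minimal sketch using Python's built-in sqlite3 module. The helper name recent_stats, the sample rows, and the limit of 2 (instead of 20, to keep the data small) are assumptions made for illustration, and it assumes the bundled SQLite is 3.25 or newer.

```python
import sqlite3

def recent_stats(rows, limit):
    """Return (type, average val, row count) over the `limit` newest rows of each type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_data (timestamp INTEGER PRIMARY KEY, type TEXT, val INTEGER)"
    )
    conn.executemany("INSERT INTO user_data VALUES (?, ?, ?)", rows)
    query = """
        SELECT type, AVG(val), COUNT(*)
        FROM (
            SELECT type, val,
                   ROW_NUMBER() OVER (
                       PARTITION BY type
                       ORDER BY timestamp DESC
                   ) AS rn
            FROM user_data
        ) AS t
        WHERE rn <= ?
        GROUP BY type
        ORDER BY type
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result

# Hypothetical sample data: three rows of type "a", one row of type "b".
sample_rows = [(1, "a", 10), (2, "a", 20), (3, "a", 30), (4, "b", 5)]
stats = recent_stats(sample_rows, 2)
print(stats)  # [('a', 25.0, 2), ('b', 5.0, 1)]
```

Type "a" averages only its two newest rows, and type "b" reports a count of 1 because it has fewer rows than the limit.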
How do you perform the equivalent of an SQL sum SELECT SUM(column_name) FROM table_name in Kusto Query Language for Azure Data Explorer?
app("your-app").tableName
| summarize sum(columnToSum)
You don't need a "by" clause in your summarize, but you can add one to group the results, for example:
app("your-app").tableName
| summarize sum(columnToSum) by columnToGroupBy
I have data in a table for azure data explorer, let's say the following columns:
Day, non-unique-ID, Message-Content
What I want as an output is a table containing:
Day, Count of records per day, distinct Count of non-unique-ID per day
I know how to get one or the other:
summarize count() by Day
summarize dcount(non-unique-ID) by Day
but I don't know how to get a table containing both of those columns, because summarize will only let me run a single aggregate query per command.
You can use multiple aggregation functions in the same summarize operator; all you have to do is separate them with commas. So this will work:
summarize count(), dcount(non-unique-ID) by Day
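For readers coming from SQL, the same "several aggregates in one grouping" idea can be sketched with Python's sqlite3 module. This is only an analogue, not Kusto: the table name events, its columns, and the sample rows are assumptions, and COUNT(DISTINCT ...) is exact whereas Kusto's dcount() returns an estimate.

```python
import sqlite3

def daily_counts(rows):
    """Return (day, number of records, number of distinct ids) for each day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (day TEXT, id TEXT, message TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    # Two aggregates in one GROUP BY, separated by a comma.
    result = conn.execute(
        """
        SELECT day, COUNT(*) AS records, COUNT(DISTINCT id) AS distinct_ids
        FROM events
        GROUP BY day
        ORDER BY day
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: id "x" appears twice on the first day.
sample_events = [
    ("2021-01-01", "x", "m1"),
    ("2021-01-01", "x", "m2"),
    ("2021-01-01", "y", "m3"),
    ("2021-01-02", "z", "m4"),
]
counts = daily_counts(sample_events)
print(counts)  # [('2021-01-01', 3, 2), ('2021-01-02', 1, 1)]
```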
Is there a way to use summarize to group 3 or more columns? I've been able to successfully get data from 1 or 2 columns then group by another column, but it breaks when trying to add a 3rd. This question asks how to add a column, but only regards adding a 2nd, not a 3rd or 4th. Using the sample help cluster on Azure Data Explorer and working with the Covid19 table, ideally I would be able to do this:
Covid19
| summarize by Country, count() Recovered, count() Confirmed, count() Deaths
| order by Country asc
And return results like this
But that query throws an error "Syntax Error. A recognition error occurred. Token: Recovered. Line: 2, Position: 36"
I had the right basic idea, but count() takes no column argument and the by clause has to come after the aggregations. To aggregate each column, use a function such as sum, dcount, or max:
Covid19
| summarize sum(Recovered), sum(Confirmed), sum(Deaths) by Country
| order by Country asc
Another example:
Covid19
| where Timestamp == max_of(Timestamp, Timestamp)
| summarize confirmedCases = max(Confirmed), active = max(Active), recovered = max(Recovered), deaths = max(Deaths) by Country
| order by Country asc
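In this example I'm taking the maximum of each selected column per country. Note that max_of(Timestamp, Timestamp) compares the column with itself row by row, so the where clause keeps every row with a non-null Timestamp rather than filtering down to the latest data. It is max() in summarize that picks the values, and that matches the latest data only for columns that never decrease. When using summarize, every column that is not in the by clause needs an aggregate function, so I used max on each one.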
I've been searching the page for possible solutions but I can't find one anywhere. What I need is pretty simple: I need multiple rows to be displayed as one. I have tried || + ||, etc.
select c_category_in, c_data_services, c_dispositivos, c_averia as 'Sub-Category', count(*) as 'Total'
from tickets
group by c_category_in,c_averia,c_data_services,c_dispositivos
having (Total > 1)
Based on your comments I would recommend taking a UNION of two separate groupings:
Grouping the data by c_data_services
Grouping the data by c_dispositivos
This results in the following SELECT:
select c_category_in, c_data_services as 'Sub-Category', count(*) as 'Total'
from tickets
group by c_category_in, c_data_services
having (Total > 1)
union all
select c_category_in, c_dispositivos as 'Sub-Category', count(*) as 'Total'
from tickets
group by c_category_in, c_dispositivos
having (Total > 1)
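The UNION ALL idea can be tried out with a small sketch in Python's sqlite3 module. Several things here are assumptions for illustration: the reduced three-column tickets table, the sample rows, the helper name union_counts, and the added IS NOT NULL filters (without them, each half would also return a group for the rows where that column is NULL). The alias is written sub_category and the HAVING repeats COUNT(*) to keep the SQL portable.

```python
import sqlite3

def union_counts(rows):
    """Count tickets per (category, sub-category), taking the sub-category from either column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (c_category_in TEXT, c_data_services TEXT, c_dispositivos TEXT)"
    )
    conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT c_category_in, c_data_services AS sub_category, COUNT(*) AS total
        FROM tickets
        WHERE c_data_services IS NOT NULL   -- assumption: skip rows without this value
        GROUP BY c_category_in, c_data_services
        HAVING COUNT(*) > 1
        UNION ALL
        SELECT c_category_in, c_dispositivos AS sub_category, COUNT(*) AS total
        FROM tickets
        WHERE c_dispositivos IS NOT NULL    -- assumption: skip rows without this value
        GROUP BY c_category_in, c_dispositivos
        HAVING COUNT(*) > 1
        ORDER BY 1, 2
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: each ticket fills only one of the two sub-category columns.
sample_tickets = [
    ("internet", "dsl", None),
    ("internet", "dsl", None),
    ("internet", "fiber", None),
    ("hardware", None, "router"),
    ("hardware", None, "router"),
    ("hardware", None, "modem"),
]
union_result = union_counts(sample_tickets)
print(union_result)  # [('hardware', 'router', 2), ('internet', 'dsl', 2)]
```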
The COALESCE function returns the first non-NULL value:
SELECT c_category_in,
COALESCE(c_data_services, c_dispositivos) AS SubCategory,
COUNT(*) AS Total
FROM tickets
GROUP BY c_category_in, SubCategory
HAVING Total > 1
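Here is a comparable sketch of the COALESCE variant, again using Python's sqlite3 module with an assumed three-column tickets table and made-up rows. It assumes each ticket fills only one of the two columns; if both were filled, COALESCE would keep only c_data_services, so the result could differ from the UNION ALL approach.

```python
import sqlite3

def coalesce_counts(rows):
    """Count tickets per (category, first non-NULL sub-category column)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (c_category_in TEXT, c_data_services TEXT, c_dispositivos TEXT)"
    )
    conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", rows)
    # The COALESCE expression is repeated in GROUP BY instead of relying on the alias.
    result = conn.execute(
        """
        SELECT c_category_in,
               COALESCE(c_data_services, c_dispositivos) AS sub_category,
               COUNT(*) AS total
        FROM tickets
        GROUP BY c_category_in, COALESCE(c_data_services, c_dispositivos)
        HAVING COUNT(*) > 1
        ORDER BY c_category_in, sub_category
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: one sub-category column is NULL on every row.
coalesce_tickets = [
    ("internet", "dsl", None),
    ("internet", "dsl", None),
    ("internet", "fiber", None),
    ("hardware", None, "router"),
    ("hardware", None, "router"),
    ("hardware", None, "modem"),
]
coalesce_result = coalesce_counts(coalesce_tickets)
print(coalesce_result)  # [('hardware', 'router', 2), ('internet', 'dsl', 2)]
```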