I've got an sqlite table which contains start/stop timestamps. I would like to create a query which returns a total elapsed time from there.
Right now I have a SELECT (e.g. SELECT t, type FROM event WHERE t>0 AND (name='start' OR name='stop') AND eventId=xxx ORDER BY t) which returns a table that looks something like this:
+---+-----+
|t |type |
+---+-----+
| 1|start|
| 20|stop |
|100|start|
|150|stop |
+---+-----+
The total elapsed time in the above example would be (20-1)+(150-100) = 69.
One idea I had was this: I could run two separate queries, one for the "start" fields and one for the "stop" fields, on the assumption that they would always line up like this:
+---+---+
|(1)|(2)|
+---+---+
| 1| 20|
|100|150|
+---+---+
(1) SELECT t FROM EVENT where name='start' ORDER BY t
(2) SELECT t FROM EVENT where name='stop' ORDER BY t
Then it would be simple (I think!) to just sum the differences. The only problem is, I don't know if I can join two separate queries like this. I'm familiar with joins that combine every row with every other row and then eliminate those that don't match some criterion. In this case, the criterion is that the row index is the same, but that isn't a database field: it's the position of each row in the output of two separate SELECTs, so there is no column I can join on.
Or perhaps there is some other way to do this?
I do not use SQLite but this may work. Let me know.
SELECT SUM(CASE WHEN type = 'stop' THEN t ELSE -t END) FROM event
This assumes the only values in type are start/stop.
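To check the idea against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table and its two columns (t, type) are assumptions modelled on the rows shown in the question, not the asker's real schema. It runs the signed sum from above with an explicit filter on type, plus an alternative that pairs each start with the earliest later stop, which avoids relying on row order.

```python
import sqlite3

# Hypothetical in-memory table mirroring the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (t INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO event (t, type) VALUES (?, ?)",
    [(1, "start"), (20, "stop"), (100, "start"), (150, "stop")],
)

# Signed sum: add every stop time, subtract every start time.
# The WHERE clause keeps other event types out of the ELSE branch.
signed_sum = conn.execute(
    "SELECT SUM(CASE WHEN type = 'stop' THEN t ELSE -t END) "
    "FROM event WHERE type IN ('start', 'stop')"
).fetchone()[0]

# Explicit pairing: match each start with the earliest stop after it.
paired_sum = conn.execute(
    """
    SELECT SUM(
        (SELECT MIN(s.t) FROM event AS s
          WHERE s.type = 'stop' AND s.t > e.t) - e.t
    )
    FROM event AS e
    WHERE e.type = 'start'
    """
).fetchone()[0]

print(signed_sum, paired_sum)
```

Both queries assume every start has a matching stop. With an unmatched start, the signed sum silently subtracts it, while the pairing query contributes NULL for that row and SUM skips it.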
I have two Kusto tables in the same database, Open_Work_Items and Closed_Work_Items that appear respectively like so:
Item ID | Opened Date
1234 | <DateTime>
Item ID | Closed Date
1234 | <DateTime>
My issue is that I cannot remove work items from Open_Work_Items once the Item ID appears in Closed_Work_Items, but I would still like to query which work items are open. This means I need to find the distinct Item IDs in Open_Work_Items that do not appear in Closed_Work_Items, but I do not know which Kusto function(s) I can use to do so.
I've looked at Tabular and Scalar Operators, but I'm not understanding how I can combine them to get what I want here. Any help/advice would be appreciated!
I think I figured it out:
Open_Work_Items | where Item_ID !in (Closed_Work_Items)
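The logic here is an anti-join: keep the IDs from one table that are absent from the other. The following is only a Python model of that logic, not Kusto; the row values are made up for illustration.

```python
# Hypothetical rows standing in for the two Kusto tables: (item_id, date).
open_work_items = [
    (1234, "2023-01-01"),
    (1235, "2023-01-02"),
    (1235, "2023-01-03"),  # duplicate ID, to show the "distinct" part
    (1236, "2023-01-04"),
]
closed_work_items = [(1234, "2023-01-05")]

# Collect the IDs that have been closed.
closed_ids = {item_id for item_id, _ in closed_work_items}

# Distinct open IDs with no matching closed row.
still_open = sorted({item_id for item_id, _ in open_work_items} - closed_ids)

print(still_open)
```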
I'm using bag_unpack to explode the customDimensions column in the AppInsights traces table and want to "shape" the resultant table. All is fine if there are rows to work with. If there are not, subsequent operations that reference the exploded columns fail. For example (I boiled it down to an isolated repro),
datatable (Date:datetime, JSON:string )
[datetime(1910-06-11), '{"key": "1"}', datetime(1930-01-01), '{"key": "2"}',
datetime(1953-01-01), '{"key": "3"}', datetime(1997-06-25), '{"key": "4"}']
| where Date > datetime(2000-01-01)
| project parsed = parse_json(JSON)
| evaluate bag_unpack(parsed)
| project-rename value = key
// lots more data shaping here
Since the where filters out all rows, there is nothing to unpack. OK, that's fine but the data shaping ops (e.g., project-rename) fail saying
project-rename: Failed to resolve column reference 'key'
If you change the date in the where to be say 1900-01-01 then everything works as expected.
Note as well that if you remove the bag_unpack and project-rename some other column instead, it works fine with no rows. For example,
datatable (Date:datetime, JSON:string )
[datetime(1910-06-11), '{"key": "1"}', datetime(1930-01-01), '{"key": "2"}',
datetime(1953-01-01), '{"key": "3"}', datetime(1997-06-25), '{"key": "4"}']
| where Date > datetime(2000-01-01)
| project-rename value = JSON
I can see how the unpack creates the columns so if it didn't run the column doesn't get created but at the same time, why run the project at all if there are no rows?
In theory I could move the where down but I'm not sure if the query planning will recognize that and only do the subsequent project/data shaping on the reduced set of rows (filtered by the where). I've got a lot of rows and typically only need to operate on a few of them.
Pointers on how to work with bag_unpack and empty tables? Or columns that may or may not be there?
You could use the column_ifexists() function: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/columnifexists
For example:
... | project value = column_ifexists("key", "")
I am writing a Kusto query to display the status of build results in a time chart. That is, the first column will show the time in 5-minute steps and the remaining columns will hold the count for each build status (success, failed, in progress).
Once I do all the filters, I am using the below query
| summarize count= count() by Status ,bin(timestamp(), 1h)
| render timechart
It says unknown function and I am not sure how to display a time chart. So for each status how do I get the count for every 5 mins. Thanks for any inputs.
It seems the issue is that you are using function notation when telling the "bin" function which column to use, instead of simply providing the name of the column. In other words, remove the parentheses after the column name timestamp, as follows (for 5-minute buckets, use 5m instead of 1h):
T
| summarize count = count() by Status, bin(timestamp, 1h)
| render timechart
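To make the binning step concrete, here is a small Python model of what "count per status per 5-minute bucket" means. It is not Kusto: bin_down is a hypothetical helper that rounds a timestamp down to a multiple of the bucket width, and the sample rows are invented.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_down(ts, width):
    """Round ts down to a multiple of width, counted from datetime.min."""
    return ts - (ts - datetime.min) % width


# Invented sample rows: (timestamp, status).
rows = [
    (datetime(2024, 1, 1, 10, 1), "success"),
    (datetime(2024, 1, 1, 10, 4), "failed"),
    (datetime(2024, 1, 1, 10, 4, 30), "success"),
    (datetime(2024, 1, 1, 10, 7), "success"),
]

# One counter entry per (bucket start, status) pair.
five_minutes = timedelta(minutes=5)
counts = Counter((bin_down(ts, five_minutes), status) for ts, status in rows)

for (bucket, status), n in sorted(counts.items()):
    print(bucket, status, n)
```

Each (bucket, status) pair corresponds to one point on the chart, with one series per status.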
If I have too many columns and a bunch of them start with similar strings, is there a way in Kusto to select them based on this pattern, such as using wildcards?
e.g. Assuming we have some of the columns like datafield1, datafield2 ... , something like the following would be helpful
mytable | project datafield*
I know that this is not syntactically valid, so is there any workaround for achieving this easily?
project-keep does exactly what you want:
mytable | project-keep datafield*
I select order IDs for a model in a subsearch and then select the materials for each order ID, so I get a list of every Material and the number of times it was part of an order. I want to display the most common materials as a percentage of all orders. So I need the number of orders in which each material was found, divided by the total number of orders.
sourcetype=file1 [subsearch... ->returns Orders] |
here I need to select the total amount of orders like:
stats dc(Orders) as totalamount by Orders|
stats dc(Orders) as anz by Material|
eval percentage= anz/totalamount|
sort by percentage desc
How can I compute the total number of orders within the search?
I assume your base search returns both Orders and Material anyway.
You need to use eventstats to get the total count. The code below should work:
index=foo sourcetype=file1 [subsearch... ->returns Orders]
| stats count(Orders) as order_material_count by Material
| eventstats sum(order_material_count) as totalCount
| eval percentage=(order_material_count*100)/totalCount
| fields Material, order_material_count, percentage
| sort - percentage
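To show the arithmetic outside Splunk, here is a small Python sketch with invented (order, material) rows. It computes two percentages: the share of all order/material rows, which is what dividing by the summed per-material counts gives, and the share of distinct orders, which is what the question describes. Which denominator is wanted is an assumption the reader has to settle.

```python
from collections import defaultdict

# Invented sample rows: (order_id, material).
rows = [
    ("o1", "steel"), ("o1", "wood"),
    ("o2", "steel"),
    ("o3", "steel"), ("o3", "glass"),
]

# Group the distinct orders that contain each material.
orders_by_material = defaultdict(set)
for order_id, material in rows:
    orders_by_material[material].add(order_id)

total_rows = len(rows)                            # sum of per-material row counts
total_orders = len({order for order, _ in rows})  # distinct orders

# Share of all (order, material) rows that mention this material.
pct_of_rows = {
    m: sum(1 for _, mat in rows if mat == m) * 100 / total_rows
    for m in orders_by_material
}
# Share of distinct orders that contain this material.
pct_of_orders = {
    m: len(orders) * 100 / total_orders
    for m, orders in orders_by_material.items()
}

for m in sorted(pct_of_orders, key=pct_of_orders.get, reverse=True):
    print(m, round(pct_of_rows[m], 2), round(pct_of_orders[m], 2))
```

With this data, steel appears in every order (100% of orders) but accounts for only 60% of the rows, so the two denominators answer different questions.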
Try the streamstats command.
index=foo sourcetype=file1 [subsearch... ->returns Orders]
| streamstats count(Orders) as totalamount
| stats count(Orders) as anz by Material
| eval percentage=(anz*100)/totalamount
| sort - percentage