I have an Application Insights query, and in this query I want to join/combine several columns into a single column for display. How can this be accomplished?
I want to combine ip, city, state, country.
customEvents
| where timestamp >= ago(7d)
| where (itemType == 'customEvent')
| where name == "Signin"
| project timestamp, customDimensions.appusername, client_IP,client_City,client_StateOrProvince, client_CountryOrRegion
| order by timestamp desc
strcat is your friend, with whatever strings you want as separators (I just use spaces in the example):
| project timestamp, customDimensions.appusername,
    location = strcat(client_IP, " ", client_City, " ", client_StateOrProvince, " ", client_CountryOrRegion)
Also, the | where (itemType == 'customEvent') in your query is unnecessary, as everything in the customEvents table is already a customEvent. You only need a filter like that on itemType if you combine multiple tables somehow (like union requests, customEvents, or a join somewhere in your query that references multiple tables).
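For example, a minimal sketch of that multi-table case, where the itemType filter actually does some work:

union requests, customEvents
| where timestamp >= ago(7d)
// the union contains both 'request' and 'customEvent' rows, so the filter now matters
| where itemType == 'customEvent'
| count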
I am trying to understand how to properly design a DynamoDB schema. I've read a few articles and watched some YouTube videos, but, to be honest, I don't yet feel quite comfortable.
This is what I am trying to design properly:
two entities, "location" (id & name) and "vehicle" (id & name)
a location can have 0-n vehicles
a vehicle can be in 0-1 locations
Access patterns:
get a list of all available locations (id & name)
get a list of all available vehicles and their current location (id, name, location-id, location-name)
get a list of all vehicles in a given location (id, name)
I've read about adjacency lists, and because there will be n-m relations I've decided to give them a try.
This is what I've come up with:
# | PK (GSI1-SK) | SK (GSI1-PK) | DATA
==|======================|====================|==============
1 | LOCATION#locationId1 | A | locationName1
2 | LOCATION#locationId2 | A | locationName2
3 | LOCATION#locationId1 | VEHICLE#vehicleId1 |
4 | LOCATION#locationId1 | VEHICLE#vehicleId2 |
5 | LOCATION#locationId2 | VEHICLE#vehicleId3 |
6 | VEHICLE#vehicleId1 | A | vehicleName1
7 | VEHICLE#vehicleId2 | A | vehicleName2
8 | VEHICLE#vehicleId3 | A | vehicleName3
#1-2 & #6-8 are my entity records, those with additional data for the entity itself (e.g. its name).
#3-5 is an example of how I would design a relationship. I've added an inverted GSI in order to be able to search in both ways.
Back to my access patterns:
get a list of all available locations (id & name)
query GSI1 for SK=A and PK begins with LOCATION#
get a list of all available vehicles and their current location (id, name, location-id, location-name)
query GSI1 for SK=A and PK begins with VEHICLE#
for each result item, query GSI1 for SK=VEHICLE#vehicleId and PK begins with LOCATION#
for each result item, query table for PK=LOCATION#locationId and SK=A
... this doesn't seem right
get a list of all vehicles in a given location (id, name)
query table for PK=LOCATION#locationId and SK begins with VEHICLE#
for each result item, query table for PK=VEHICLE#vehicleId and SK=A
... this doesn't seem right
Adjacency lists look like a nice and clean way to design complex relationships, but either I am doing something wrong (probably) or they come with a lot of queries that are necessary to look things up.
Any advice is appreciated.
I modelled this in DynamoDB Workbench:
Main Index (PK -> SK)
GSI1 (PK1 -> SK)
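(The Workbench screenshots aren't reproduced here, so below is my sketch of an item layout that satisfies the queries that follow; the exact key values and the A sort key are assumptions based on the description.)

# | PK          | SK       | PK1          | DATA
==|=============|==========|==============|===================================
1 | LOC#loc1    | A        | ALL#LOCATION | locationName1
2 | LOC#loc2    | A        | ALL#LOCATION | locationName2
3 | ALL#VEHICLE | VEH#veh1 | LOC#loc1     | vehicleName1, loc1, locationName1
4 | ALL#VEHICLE | VEH#veh2 | LOC#loc1     | vehicleName2, loc1, locationName1
5 | ALL#VEHICLE | VEH#veh3 | LOC#loc2     | vehicleName3, loc2, locationName2
6 | ALL#VEHICLE | VEH#veh4 |              | vehicleName4 (no location, hence no PK1)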
In order to:
"get a list of all available locations (id & name)"
select * from GS1 where PK1="ALL#LOCATION"
get a list of all available vehicles and their current location (id, name, location-id, location-name)
select * from MAIN-INDEX where PK="ALL#VEHICLE"
get a list of all vehicles in a given location (id, name)
select * from GSI1 where PK1="LOC#ID"
Several things to note here:
It's important to distribute the traffic across all partition keys. I'm using "ALL#" partition keys in this design; ideally you shard that somehow, and there are several tricks for it, like using dates or timestamps truncated to the beginning of the day. You can also randomly spread records across a fixed number of "ALL#" partition keys and then query one of them at random, if your use case allows it. If you have millions of locations this is probably OK. That's how you make these decisions: think about the traffic and the behaviour of the data.
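For illustration only (a sketch, not part of the model above): with a random shard suffix 0-9 added on write, a full listing becomes one query per shard, or a single random shard if an approximate answer is acceptable:

select * from MAIN-INDEX where PK="ALL#VEHICLE#0"
select * from MAIN-INDEX where PK="ALL#VEHICLE#1"
...
select * from MAIN-INDEX where PK="ALL#VEHICLE#9"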
In order to use both indexes I put the "ALL#LOCATION" and the "ALL#VEHICLE" partition keys in different indexes.
Notice that vehicle 4 doesn't have a PK1. See what happens to GSI1. This is what's called a sparse index.
I denormalized the vehicle-location relationship. Assuming that the location ID and the location name are immutable, it's OK to do this; the problem is when the attributes you denormalize are mutable. Avoid that if possible.
I am trying to find what's causing the higher RU usage on Cosmos DB. I enabled Log Analytics on the Doc DB and ran the Kusto query below to get the RU consumption by collection name.
AzureDiagnostics
| where TimeGenerated >= ago(24h)
| where Category == "DataPlaneRequests"
| summarize ConsumedRUsPer15Minute = sum(todouble(requestCharge_s)) by collectionName_s, _ResourceId, bin(TimeGenerated, 15m)
| project TimeGenerated, ConsumedRUsPer15Minute, collectionName_s, _ResourceId
| render timechart
We have only one collection in the Doc DB account (prd-entities), which is represented by the red line in the chart. I am not able to figure out what the blue line represents.
Is there a way to get more details about the empty collection name's RU usage (i.e., the blue line)?
I'm not sure, but I don't think an empty collection actually costs RUs.
From my testing, I found that when I execute your Kusto query I also get the 'empty collection', but when I look at the line details, all of those rows correspond to operations that exist on my side. What I mean here is that we shouldn't sum by collectionName_s, especially when you only have one collection in total; you may try to use requestResourceId_s instead.
When using requestResourceId_s, there are still some rows with no id, but they cost 0 RUs.
AzureDiagnostics
| where TimeGenerated >= ago(24h)
| where Category == "DataPlaneRequests"
| summarize ConsumedRUsPer15Minute = sum(todouble(requestCharge_s)) by requestResourceId_s, bin(TimeGenerated, 15m)
| project TimeGenerated, ConsumedRUsPer15Minute, requestResourceId_s
| render timechart
Actually, you can check which operations the requestCharge_s values come from: look at the details in Results (not in the Chart) and order by collectionName_s; then you'll see the requests that produce the 'empty collection' line, and you can judge whether those requests exist in your collection.
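A sketch of that drill-down as a query (OperationName and requestResourceType_s are my assumption of useful grouping columns here; adjust to whatever your DataPlaneRequests rows actually carry):

AzureDiagnostics
| where TimeGenerated >= ago(24h)
| where Category == "DataPlaneRequests"
// keep only the rows behind the 'empty collection' line
| where isempty(collectionName_s)
| summarize ConsumedRUs = sum(todouble(requestCharge_s)) by OperationName, requestResourceType_s
| order by ConsumedRUs desc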
I have a lot of events in the traces section of Application Insights. I'm interested in two events, "Beginning" and "End"; they each have the same operation Id, as they're logged in sets.
Sometimes the "End" event won't exist, as there will have been a problem with the application we're monitoring.
We can say, for the sake of argument, that we have these fields that we're interested in: timestamp, eventName, operationId.
How can I calculate the exact time between the two timestamps for the pair of events for all unique operation Ids in a timespan?
My initial thought was to get the distinct operationIds from traces where the eventName is "Beginning"... But that's as far as I get, as I'm not really sure how to perform the rest of the operations required (namely the calculation, and checking whether the "End" event even exists).
let operations =
traces
| where tostring(customDimensions.eventName) == "Beginning" // == for comparison, and cast the dynamic value to string
| distinct operation_Id;
Any help would be greatly appreciated!
EDIT: I'm obviously thinking about this all wrong. What I'm after is non-unique operationIds. This will filter out missing "End" events.
If I could then merge the resulting results together based on that id, I would then have two timestamps, which I could operate on.
So, I figured it out after some coffee and time to think.
Ended up with:
let a =
traces
| summarize count() by operation_Id; // how many trace events each operation logged
let b =
a
| where count_ == 2 // keep only operations that have both a "Beginning" and an "End" event
| project operation_Id;
let c =
traces
| where operation_Id in (b)
| join kind = inner (traces) on operation_Id // self-join pairs every event with the other event of its operation
| order by timestamp, timestamp1
| project evaluatedTime = (timestamp1 - timestamp), operation_Id, timestamp;
c
| where evaluatedTime > timespan(0) // the self-join also produces zero/negative pairings; keep the positive one
| project seconds = evaluatedTime / time(1s), operation_Id, timestamp
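For what it's worth, here's a more compact sketch of the same idea (assuming the event name really lives in customDimensions.eventName, as in the question):

traces
| where tostring(customDimensions.eventName) in ("Beginning", "End")
| summarize beginTime = minif(timestamp, tostring(customDimensions.eventName) == "Beginning"),
            endTime = maxif(timestamp, tostring(customDimensions.eventName) == "End"),
            events = count()
    by operation_Id
| where events == 2 // operations missing their "End" event drop out here
| project operation_Id, timestamp = beginTime, seconds = (endTime - beginTime) / 1s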
I have an Application Insights Analytics query that looks like this...
requests
| summarize count() by bin(duration, 1000)
| order by duration asc nulls last
...which gives me something like this, which shows the number of requests binned by duration in seconds, recorded in Application Insights.
----------------------
| duration | count_  |
----------------------
|        0 |    1000 |
|     1000 |     500 |
|     2000 |     200 |
----------------------
I would like to able to add another column which shows the count of exceptions from all requests in each bin.
I understand that extend is used to add additional columns, but to do so I would have to reference the 'outer' expression to get the bin constraints, which I don't know how to do. Is this the best way to do this? Or am I better off trying to join the two tables together and then doing the summarize?
Thanks
As you suspected, extend will not help you much here. What you need is to run join kind=leftouter on the operation IDs (leftouter is needed so you won't drop requests that did not have any exceptions):
requests
| join kind=leftouter (
exceptions
| summarize exceptionsCount = count() by operation_Id
) on operation_Id
| summarize count(), sum(exceptionsCount) by bin(duration, 1000)
| order by duration asc nulls last
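One caveat with the leftouter join: requests that had no exceptions carry an empty exceptionsCount, so (if I recall Kusto's sum() semantics right) a bin with no exceptions at all shows an empty sum rather than 0. If you'd rather see an explicit 0, a minimal variation:

requests
| join kind=leftouter (
    exceptions
    | summarize exceptionsCount = count() by operation_Id
) on operation_Id
| extend exceptionsCount = coalesce(exceptionsCount, 0) // treat 'no exceptions' as 0
| summarize count(), sum(exceptionsCount) by bin(duration, 1000)
| order by duration asc nulls last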
I want to create a page in Drupal to report some basic forum information. I thought I'd use Views, but Views only lets you set one "entity" type per view, while forum topics are made up of nodes and comments (aka topics and replies).
Ideally, I'd like a single view that lists all forum nodes and comments together in a single table (sorted by date), along with a total number of both combined, if possible. Is there a way to do that with Views?
Update: What I'm looking for is something like this:
-------------------------------------------------------
| User | Post | Type | Date |
-------------------------------------------------------
| amy | post text appears here | post | 1/5/01 |
| bob | comment text appears here | comment | 1/5/01 |
| amy | another comment here | comment | 1/5/01 |
| cid | another post appears here | post | 1/4/01 |
| dave | yet another comment here | comment | 1/4/01 |
-------------------------------------------------------
total posts + comments: 5
I'm not sure what you really want. You could display nodes plus their comment count, or nodes and comments at the same level, but then there's no combined total because they are all separate items. Or do you want to show each comment separately, together with the number of comments in that thread?
If the latter, that might not be trivial.
Basically, you could create a UNION SELECT query and query both the node and the comment table. It could look like this:
(SELECT 'node' AS type, n.nid AS id, n.title AS title, nncs.comment_count AS comment_count, n.created AS timestamp
   FROM {node} n
   INNER JOIN {node_comment_statistics} nncs ON n.nid = nncs.nid)
UNION
(SELECT 'comment' AS type, c.cid AS id, c.subject AS title, cncs.comment_count AS comment_count, c.timestamp AS timestamp
   FROM {comments} c
   INNER JOIN {node_comment_statistics} cncs ON c.nid = cncs.nid)
ORDER BY timestamp DESC LIMIT 10;
That will return a result containing: node/comment | id | title | comment_count | timestamp.
See http://dev.mysql.com/doc/refman/5.1/en/union.html for more information about UNION.
You can then theme that as a table.
Hints:
If you need more data, either extend the query or use node_load()/comment_load().
You could also join {node} in the second query and use the node title instead of the comment subject.
That query is going to be slow, because the union forces a filesort. It might actually be faster to execute two separate queries and then mangle them together in PHP if you have a large number of nodes/comments.
It turns out the Tracker 2 module provides enough of what I needed.