Extend a column's value in the same table - azure-data-explorer

I have the datatable below, where each ParentId refers to a WId value in the same table, so the two columns are related to each other. The State shown here belongs to WId. I want to extend another column, ParentIdState, which should be the State of ParentId. (That State also exists in the same table.) How can I do so?
datatable(WId:int, WType:string, Link:string, ParentId:dynamic, State:string)
[
374075, "Deliverable", "Link", dynamic(315968), "Started",
]
Updating further for clarification -
datatable(WId:int, WType:string, Link:string, ParentId:dynamic, State:string)
[
374075, "Deliverable", "Link", dynamic(315968), "Started",
315968, "Parent", "Link", dynamic(467145), "Planned"
]
ParentId is dynamic because it's extracted from JSON. In the above datatable, each ParentId is actually a WId value that has its own row and details. My intent is to extend the table with another column, ParentState, that holds the State of the row whose WId equals ParentId. For example, the row for 374075 should get ParentState = "Planned".

You should use join or lookup.

I believe you could join 2 tables:
1) the one you provided, with a small modification: the type of ParentId is changed from dynamic to int (the same type as WId, since the join is performed on it).
2) a simplified version of table 1), with only 2 columns: WId and State.
let data = datatable(WId:int, WType:string, Link:string, ParentId:dynamic, State:string)
[
374075, "Deliverable", "Link", dynamic(315968), "Started",
315968, "Parent", "Link", dynamic(467145), "Planned"
]
| extend ParentId = toint(ParentId); // to make sure the type of ParentId is the same as WId
data
| join kind=leftouter (data | project WId, State) on $left.ParentId == $right.WId
| project WId, WType, Link, ParentId, State, ParentState = State1
There might be some optimization to be done here (for example by using materialize), but I'm not entirely sure.
You can also achieve the same with lookup:
data
| lookup (data | project WId, State) on $left.ParentId == $right.WId
| project WId, WType, Link, ParentId, State, ParentState = State1
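If you want to sanity-check the logic outside Kusto, the same left self-join can be sketched in Python with the standard-library sqlite3 module. This is not Kusto; it only mirrors the join above on the two sample rows from the question. The table name work_items and the function name parent_states are made up for the illustration, and ParentId is stored as a plain integer.

```python
import sqlite3

def parent_states():
    """Return (WId, State, ParentState) rows using a left self-join."""
    con = sqlite3.connect(":memory:")
    # work_items is a hypothetical name; columns follow the question's datatable.
    con.execute(
        "CREATE TABLE work_items (WId INTEGER, WType TEXT, ParentId INTEGER, State TEXT)"
    )
    con.executemany(
        "INSERT INTO work_items VALUES (?, ?, ?, ?)",
        [
            (374075, "Deliverable", 315968, "Started"),
            (315968, "Parent", 467145, "Planned"),
        ],
    )
    rows = con.execute(
        """
        SELECT c.WId, c.State, p.State AS ParentState
        FROM work_items AS c
        LEFT JOIN work_items AS p ON c.ParentId = p.WId
        ORDER BY c.WId
        """
    ).fetchall()
    con.close()
    return rows

print(parent_states())
```

The row for 315968 has no parent row in the sample data, so its ParentState comes back as None, just as a leftouter join leaves it empty.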

Related

How to check if Kusto input query parameters exist in a table?

I am passing a list of node IDs to a Kusto query as input. Right now I am printing the rows of a table after checking if the rows' node IDs are in the input list. This is a way to check if the input node IDs are valid and have corresponding rows in the table. This is my current query:
declare query_parameters(nodeIds:string);
let nodes = todynamic(parse_json(nodeIds));
let serverNames = database('xyz').abc
| where NodeId in~ (nodes)
| project NodeId, DeviceName;
serverNames;
I would like to instead print all the input nodes, valid or invalid - and if they are valid, their corresponding rows in the table. How can I do that?
(Note: I changed the names of the database and table to preserve anonymity. Thanks.)
I've tried iff, case and !in~.
Instead of using the in operator, you can use a left outer join.
For example:
let T = datatable(NodeId:string, DeviceName:string) [
"n1","d1",
"n2","d2",
"n5","d5",
"n6","d6"
]
;
let nodeIds = '["n1","n3","n4","n6"]'
;
let nodes =
print nodes = parse_json(nodeIds)
| mv-expand NodeId = nodes to typeof(string)
| project NodeId
;
let serverNames =
nodes
| join kind=leftouter hint.strategy=broadcast (
T
| project NodeId, DeviceName
) on NodeId
;
serverNames
| NodeId | NodeId1 | DeviceName |
| ------ | ------- | ---------- |
| n1 | n1 | d1 |
| n6 | n6 | d6 |
| n4 | | |
| n3 | | |
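The idea behind the left outer join (keep every input ID, attach table data only where it exists) can also be sketched in plain Python. This is only an illustration of the logic, not Kusto. The function name match_nodes is hypothetical, it assumes NodeId is unique in the table, and it compares IDs case-sensitively, unlike in~ in the original query.

```python
import json

def match_nodes(node_ids_json, table_rows):
    """Pair every input node ID with its DeviceName, or None when it is absent."""
    # Assumes each NodeId appears at most once in table_rows.
    device_by_node = {node_id: device for node_id, device in table_rows}
    return [
        (node_id, device_by_node.get(node_id))
        for node_id in json.loads(node_ids_json)
    ]

# Same sample data as the Kusto example above.
T = [("n1", "d1"), ("n2", "d2"), ("n5", "d5"), ("n6", "d6")]
result = match_nodes('["n1","n3","n4","n6"]', T)
print(result)
```

Invalid IDs (n3 and n4 here) are kept in the result with None in place of a device name.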

How to retrieve custom property corresponding to another property in azure

I am trying to write a kusto query to retrieve a custom property as below.
I want to retrieve the count of each pkgName together with its corresponding organization. So far I can retrieve the count per pkgName; the code is attached below.
let mainTable = union customEvents
| extend name =replace("\n", "", name)
| where iif('*' in ("*"), 1 == 1, name in ("*"))
| where true;
let queryTable = mainTable;
let cohortedTable = queryTable
| extend dimension = customDimensions["pkgName"]
| extend dimension = iif(isempty(dimension), "<undefined>", dimension)
| summarize hll = hll(itemId) by tostring(dimension)
| extend Events = dcount_hll(hll)
| order by Events desc
| serialize rank = row_number()
| extend dimension = iff(rank > 10, 'Other', dimension)
| summarize merged = hll_merge(hll) by tostring(dimension)
| project ['pkgName'] = dimension, Counts = dcount_hll(merged);
cohortedTable
Please help me to get the organization along with each pkgName projected.
Please try this simple query:
customEvents
| summarize counts = count() by pkgName = tostring(customDimensions.pkgName), organization = tostring(customDimensions.organization)
Please feel free to modify it to meet your requirement.
If the above does not meet your requirement, try creating another table that contains the pkgName-to-organization relationship, then use the join operator to join the tables. For example:
//create a table which contains the relationship
let temptable = customEvents
| summarize by pkgName=tostring(customDimensions.pkgName),organization=tostring(customDimensions.organization);
//then use the join operator to join these tables on the pkgName column.

SQLite and multiple insert clean

I would like to populate a freshly created Table in a SQLite DB.
In this table, some keys are references to other tables and I'd like not to hard-code these references
-> I'm currently using a "mapping" table in order to fetch ids using names (~ constants emulation)
The problem is: this solution works but is very verbose
Minimal working example: (storing dictionary words, using foreign keys to a category table)
-- Tables creation
CREATE TABLE categories(
id INTEGER PRIMARY KEY,
name TEXT
);
CREATE TABLE words(
id INTEGER PRIMARY KEY,
id_category INTEGER NOT NULL,
name TEXT,
FOREIGN KEY(id_category) REFERENCES categories(id)
);
CREATE TABLE CONSTANTS(
name TEXT PRIMARY KEY,
value INTEGER NOT NULL
);
INSERT INTO categories(name) VALUES("noun");
INSERT INTO CONSTANTS(name, value) VALUES("category_noun", last_insert_rowid());
INSERT INTO categories(name) VALUES("abreviation");
INSERT INTO CONSTANTS(name, value) VALUES("category_abreviation", last_insert_rowid());
INSERT INTO categories(name) VALUES("character");
INSERT INTO CONSTANTS(name, value) VALUES("category_character", last_insert_rowid());
And now, the core of the problem: it is too verbose.
This example has only one foreign key and a few inserts, just to illustrate the problem:
INSERT INTO words(id_category, name) VALUES
((SELECT value FROM CONSTANTS WHERE name = "category_noun"),
"hello"),
((SELECT value FROM CONSTANTS WHERE name = "category_abreviation"),
"SO"),
((SELECT value FROM CONSTANTS WHERE name = "category_abreviation"),
"user"),
((SELECT value FROM CONSTANTS WHERE name = "category_character"),
"!")
;
I would like to have something looking like this pseudo-sqlite code:
-- same table creations as before
INSERT INTO words(id_category, name) VALUES
-- Fetch constants once
(
CAT_NOUM = (SELECT value FROM CONSTANTS WHERE name = "category_noun"),
CAT_ABREV = (SELECT value FROM CONSTANTS WHERE name = "category_abreviation"),
CAT_CHAR = (SELECT value FROM CONSTANTS WHERE name = "category_character")
)
-- Fill the table, using constants
(CAT_NOUM, "Hello"),
(CAT_ABREV, "SO"),
(CAT_NOUM, "user"),
(CAT_CHAR, "SO"),
...
;
I'm wondering if:
- there is already a SQLite solution to this problem;
- I should use something like sed to replace a hard-coded string like __SED__CAT_NOUM with its grepped value in the SQLite script;
- doing this programmatically would be the right way.
It is better to use INSERT...SELECT with UNION ALL instead of INSERT...VALUES:
INSERT INTO words(id_category, name)
SELECT value, 'hello' FROM CONSTANTS WHERE name = 'category_noun' UNION ALL
SELECT value, 'SO' FROM CONSTANTS WHERE name = 'category_abreviation' UNION ALL
SELECT value, 'user' FROM CONSTANTS WHERE name = 'category_abreviation' UNION ALL
SELECT value, '!' FROM CONSTANTS WHERE name = 'category_character';
Or use Row Values to join to CONSTANTS:
INSERT INTO words(id_category, name)
SELECT c.value, t.column2
FROM CONSTANTS C INNER JOIN (
VALUES ('category_noun', 'hello'),
('category_abreviation', 'SO'),
('category_abreviation', 'user'),
('category_character', '!')
) t ON t.column1 = c.name;
Results:
SELECT * FROM words;
| id | id_category | name |
| --- | ----------- | ----- |
| 1 | 1 | hello |
| 2 | 2 | SO |
| 3 | 2 | user |
| 4 | 3 | ! |
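On the "programmatically" option from the question: a minimal sketch in Python with the standard-library sqlite3 module, assuming it is acceptable to look up each category id by name directly in categories instead of going through CONSTANTS. The function name load_words and the UNIQUE constraint on categories.name are additions for this illustration, not part of the original schema.

```python
import sqlite3

def load_words(pairs):
    """Insert (category_name, word) pairs, resolving category ids by name."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE categories(id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE words(
            id INTEGER PRIMARY KEY,
            id_category INTEGER NOT NULL REFERENCES categories(id),
            name TEXT
        );
        INSERT INTO categories(name) VALUES ('noun'), ('abreviation'), ('character');
        """
    )
    # INSERT ... SELECT resolves the foreign key once per row; no id is hard-coded.
    con.executemany(
        "INSERT INTO words(id_category, name) "
        "SELECT id, ? FROM categories WHERE name = ?",
        [(word, category) for category, word in pairs],
    )
    rows = con.execute(
        "SELECT c.name, w.name FROM words AS w "
        "JOIN categories AS c ON c.id = w.id_category ORDER BY w.id"
    ).fetchall()
    con.close()
    return rows

pairs = [("noun", "hello"), ("abreviation", "SO"),
         ("abreviation", "user"), ("character", "!")]
print(load_words(pairs))
```

One caveat of this approach: a pair with an unknown category name inserts nothing and raises no error, so you may want to compare row counts afterwards.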

Select several event params in a single row for Firebase events stored in Google BigQuery

I'm trying to perform a very simple query for Firebase events stored in Google BigQuery but I'm not able to find a way to do it.
In the Android app, I'm logging an event like this:
Bundle params = new Bundle();
params.putInt("productID", productId);
params.putInt(FirebaseAnalytics.Param.VALUE, value);
firebaseAnalytics.logEvent("productEvent", params);
So, in BigQuery I have something like this:
___________________ _______________________ ____________________________
| event_dim.name | event_dim.params.key | event_dim.params.int_value |
|___________________|_______________________|____________________________|
| productEvent | productID | 25 |
| |_______________________|____________________________|
| | value | 12353 |
|___________________|_______________________|____________________________|
When I get the data from this table I get two rows:
___________________ _______________________ ____________________________
|event_dim.name | event_dim.params.key | event_dim.params.int_value |
|___________________|_______________________|____________________________|
| productEvent | productID | 25 |
| productEvent | value | 12353 |
But what I really need is a SELECT clause from this table to get the data as below:
___________________ _____________ _________
| name | productID | value |
|___________________|_____________|_________|
| productEvent | 25 | 12353 |
Any idea or suggestion?
You can pivot the values into columns like this
SELECT
event_dim.name as name,
MAX(IF(event_dim.params.key = "productID", event_dim.params.int_value, NULL)) WITHIN RECORD productID,
MAX(IF(event_dim.params.key = "value", event_dim.params.int_value, NULL)) WITHIN RECORD value,
FROM [events]
In case you want to generate this command using SQL, see this solution: Pivot Repeated fields in BigQuery
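The MAX(IF(...)) trick above is ordinary conditional aggregation, and it can be tried locally without BigQuery. Below is a rough sketch in Python with the standard-library sqlite3 module, using CASE WHEN in place of IF. The flattened table event_params, its columns event_id and param_key, and the function name pivot_params are assumptions made for this illustration; the real Firebase export stores params as a repeated field instead.

```python
import sqlite3

def pivot_params():
    """Turn key/value parameter rows into one row per event."""
    con = sqlite3.connect(":memory:")
    # Hypothetical flattened layout: one row per (event, parameter).
    con.execute(
        "CREATE TABLE event_params "
        "(event_id INTEGER, name TEXT, param_key TEXT, int_value INTEGER)"
    )
    con.executemany(
        "INSERT INTO event_params VALUES (?, ?, ?, ?)",
        [
            (1, "productEvent", "productID", 25),
            (1, "productEvent", "value", 12353),
        ],
    )
    rows = con.execute(
        """
        SELECT name,
               MAX(CASE WHEN param_key = 'productID' THEN int_value END) AS productID,
               MAX(CASE WHEN param_key = 'value' THEN int_value END) AS value
        FROM event_params
        GROUP BY event_id, name
        """
    ).fetchall()
    con.close()
    return rows

print(pivot_params())
```

Each CASE yields NULL for rows with a different key, and MAX ignores NULLs, so every output column picks up exactly the value for its own key.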
Using standard SQL (uncheck "Use Legacy SQL" under "Show Options" in the UI), you can express the query as:
SELECT
event_dim.name as name,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "productID") AS productID,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "value") AS value
FROM `dataset.mytable` AS t,
t.event_dim AS event_dim;
Edit: updated example to include int_value as part of value based on the comment below. Here is a self-contained example that demonstrates the approach as well:
WITH T AS (
SELECT ARRAY_AGG(event_dim) AS event_dim
FROM (
SELECT STRUCT(
"foo" AS name,
ARRAY<STRUCT<key STRING, value STRUCT<int_value INT64, string_value STRING>>>[
("productID", (10, NULL)), ("value", (5, NULL))
] AS params) AS event_dim
UNION ALL
SELECT STRUCT(
"bar" AS name,
ARRAY<STRUCT<key STRING, value STRUCT<int_value INT64, string_value STRING>>>[
("productID", (13, NULL)), ("value", (42, NULL))
] AS params) AS event_dim
)
)
SELECT
event_dim.name as name,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "productID") AS productID,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "value") AS value
FROM T AS t,
t.event_dim AS event_dim;

How can I properly handle an sqlite table self reference?

I have created an sqlite db for a small cattle herd.
CREATE TABLE Animals
(
animal_id PRIMARY KEY,
animal_name CHAR(15) NULL,
date_born DATE NULL,
f_parent REFERENCES Animals (animal_id) NULL,
m_parent REFERENCES Animals (animal_id) NULL,
date_purchased DATE NULL,
registered BIT NOT NULL,
gender CHAR(1) NOT NULL CHECK(gender IN ("M","F")),
breed INTEGER NOT NULL REFERENCES breed (breed_id)
);
CREATE TABLE termination (term_key INTEGER PRIMARY KEY, animal_id INTEGER, term_date DATE, sold BIT, price SMALLMONEY, comp_market_price SMALLMONEY, comp_market_tier TEXT);
I have this statement:
SELECT a1.animal_id, a1.animal_name, a1.f_parent, t1.term_date, t1.price, t1.comp_market_price, t1.comp_market_tier
FROM Animals AS a1, termination AS t1
WHERE a1.animal_id = t1.animal_id
AND a1.f_parent NOT NULL;
results are:
id#|'animal name'|'parent id#'|date sold ...
15|some name|4|2014-05-26 ...
...
which is correct and what I wanted, except that in place of 'parent id#' I want the parent's name. The parent id# is a key in the same table as the offspring (as you can see from my create statement above), but I can't figure out how to deal with this self-reference. I know the issue is rather common, and I've tried views, multiple joins, etc. to no avail. Please show a code snippet of how I can print the same results with the parent's name in place of the parent id#/key no.
thank you very much!
Something like this maybe?
select a.animal_id, a.animal_name,
(select animal_name
from animals
where a.f_parent = animal_id) as parent,
t.term_date, t.price, t.comp_market_price, t.comp_market_tier
from animals as a, termination as t using(animal_id)
where a.f_parent not null;
or this? (better execution plan)
select a.animal_id as id, a.animal_name as name,f.animal_name as mother,
t.term_date, t.price, t.comp_market_price, t.comp_market_tier
from animals as a, termination as t using(animal_id),
animals f
where a.f_parent = f.animal_id
and a.f_parent is not null;
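To try the self-join quickly, here is a small sketch in Python with the standard-library sqlite3 module. It uses a trimmed-down version of the schema and explicit JOIN ... ON syntax instead of the comma joins above; the animal names 'Daisy' and 'Clover' and the function name offspring_with_mother are invented for the example.

```python
import sqlite3

def offspring_with_mother():
    """List terminated animals with their mother's name instead of her id."""
    con = sqlite3.connect(":memory:")
    # Reduced schema: only the columns needed to show the self-reference.
    con.executescript(
        """
        CREATE TABLE Animals (
            animal_id INTEGER PRIMARY KEY,
            animal_name TEXT,
            f_parent INTEGER REFERENCES Animals (animal_id)
        );
        CREATE TABLE termination (
            term_key INTEGER PRIMARY KEY,
            animal_id INTEGER,
            term_date TEXT
        );
        INSERT INTO Animals VALUES (4, 'Daisy', NULL), (15, 'Clover', 4);
        INSERT INTO termination (animal_id, term_date) VALUES (15, '2014-05-26');
        """
    )
    rows = con.execute(
        """
        SELECT a.animal_id, a.animal_name, f.animal_name AS mother, t.term_date
        FROM Animals AS a
        JOIN termination AS t ON t.animal_id = a.animal_id
        JOIN Animals AS f ON f.animal_id = a.f_parent
        """
    ).fetchall()
    con.close()
    return rows

print(offspring_with_mother())
```

The table is simply joined to itself under a second alias (f), and the inner join already drops animals whose f_parent is NULL.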
