Query ADX table name dynamically - azure-data-explorer

I have a need to be able to query Azure Data Explorer (ADX) tables dynamically, that is, using application-specific metadata that is also stored in ADX.
If this is even possible, the way to do it seems to be via the table() function. In other words, it feels like I should be able to simply write:
let table_name = <non-trivial ADX query that returns the name of a table as a string>;
table(table_name) | limit 10
But this query fails since I am trying to pass a variable to the table() function, and "a parameter, which is not scalar constant string can't be passed as parameter to table() function". The workaround provided doesn't really help, since all the possible table names are not known ahead of time.
Is there any way to do this all within ADX (i.e. without multiple queries from the client) or do I need to go back to the drawing board?

if you know the desired output schema, you could potentially achieve that using union (note that in this case, the result schema will be the union of all tables, and you'll need to explicitly project the columns you're interested in)
let TableA = view() { print col1 = "hello world"};
let TableB = view() { print col1 = "goodbye universe" };
let LabelTable = datatable(table_name:string, label:string, updated:datetime)
[
"TableA", "MyLabel", datetime(2019-10-08),
"TableB", "MyLabel", datetime(2019-10-02)
];
let GetLabeledTable = (l:string)
{
toscalar(
LabelTable
| where label == l
| order by updated desc
| limit 1
)
};
let table_name = GetLabeledTable('MyLabel');
union withsource = T *
| where T == table_name
| project col1

Related

How do I sort by names or postcodes using a drop-down list to determine what to sort by?

Currently, I am trying to create some software on Progress OpenEdge that sorts by customer's names or account codes.
So essentially, a box will open up when the program runs, the user will select "Name" from the drop-down list, and the program will display all the names in the database from alphabetical order.
Or, they will pick "Account" from the drop-down list, and it will display all the account codes in numeric order. I have attached a picture of the program here:
And this is currently the code I am using to print the results:
However, I'm not sure what I need to add for the others. Would I need IF statements, such as:
OR IF [drop down list] = "Account" THEN or something like that?
Any help would be appreciated.
While you can perform a static order by with a convulted set of if statements, it is a lot cleaner with a dynamic query.
To expand on Tom and Stefan's answers, you can use a query. The ABL lets you create a lot of things with static constructs and use them as dynamic. I think that in this case, you want to so something like the below.
Note that you can do build the query string either using OPEN QUERY qry FOR EACH ... or QUERY qry:QUERY-PREPARE('FOR EACH ... ') ; both will work equally well.
What I think you'd want is
(a) having a static definition of the query (ie DEFINE QUERY) since the cost of adding the buffer(s) to the query is done at compile time, not run time, and
(b) accessing the buffer fields statically (ie slmast.name rather than b::name )
define query qry for slmast.
define variable wc as character no-undo.
if condition eq true then
wc = "WHERE kco = s-kco AND warecode = lv-warecode AND pcode = fi-pcode AND name = fi-name BY name".
else
wc = "WHERE TRUE".
/* alternate
if condition then
open query qry for each slmast no-lock WHERE kco = s-kco AND warecode = lv-warecode AND pcode = fi-pcode AND name = fi-name BY name.
else
open query qry for each slmast no-lock.
*/
query qry:query-prepare(wc).
open query qry.
query qry:get-first().
do while available slmast:
/* do stuff with the buffer */
{&OUT} slmast.name.
query qry:get-next().
end.
query qry:query-close().
Using static constructs as far as possible means that you have less cleanup code to write and the code becomes more readable (IMO).
There are multiple ways to loop through the query results: using DO WHILE NOT QUERY qry:QUERY-OFF-END works as well as AVAILABLE slmast or b:AVAILABLE (if using a purely dynamic query).
As Stefan says, a dynamic query is what you want. This might help get you started:
define variable wc as character no-undo.
define variable q as handle no-undo.
define variable b as handle no-undo.
/* run your UI to get selection criteria amd then
* create a WHERE clause as appropriate
*/
wc = "WHERE kco = s-kco AND warecode = lv-warecode AND pcode = fi-pcode AND name = fi-name BY name".
create buffer b for table "slmast".
create query q.
q:set-buffers( b ).
q:query-prepare( substitute( "FOR EACH slmast NO-LOCK &1", wc )).
q:query-open().
do while q:get-next():
display
b:buffer-field( "name" ):buffer-value
b:buffer-field( "acode" ):buffer-value
b:buffer-field( "pcode" ):buffer-value
b:buffer-field( "trunmtd" ):buffer-value
b:buffer-field( "turnytd" ):buffer-value
.
end.

Improve Kusto Query - mailbox audit log search

I am trying to identify shared mailboxes that aren't in use. Checked "Search-MailboxAuditLog" already and some mailboxes do not return any results even tho auditing enabled, but can see activity in Azure sentinel.
Is there a way to improve below Kusto code? (During testing tried mailboxes with activities but sometimes do not get any results from the query)
With Kusto, Is there a way to loop through "mbs" like powershell "foreach ( $item in $mbs)"?
Thanks,
let mbs = datatable (name: string)
[
"xxx1#something.com",
"xxx2#something.com",
"xxx3#something.com",
];
OfficeActivity
| where OfficeWorkload == "Exchange" and TimeGenerated > ago(30d)
| where MailboxOwnerUPN in~ (mbs)
| distinct MailboxOwnerUPN
Update : Need help with the query
Input would be list of shared mailbox UPNs
Output would be list of shared mailboxes with any activity, example MBs with any action in “Operation" filed
"in" doesn't work on datatables (tabular inputs) like that; it is not a "filter", it is an "operator". The "where" is effectively the "foreach" you are referring to.
Given the sample input, the query could probably be written as:
OfficeActivity //tabular input with many records
| TimeGenerated > ago(30d) //Filter records to window of interest first
| where OfficeWorkload == "Exchange" //foreach row
| where MailboxOwnerUPN in~ ( //foreach row
"xxx1#something.com","xxx2#something.com","xxx3#something.com"
)
| distinct MailboxOwnerUPN
You can see it in the docs at https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/inoperator#arguments where "col" is the "column to filter"

How would I return a count of all rows in the table and then the count of each time a specific status is found?

Please forgive my ignorance on sqlalchemy, up until this point I've been able to navigate the seas just fine. What I'm looking to do is this:
Return a count of how many items are in the table.
Return a count of many times different statuses appear in the table.
I'm currently using sqlalchemy, but even a pure sqlite solution would be beneficial in figuring out what I'm missing.
Here is how my table is configured:
class KbStatus(db.Model):
id = db.Column(db.Integer, primary_key=True)
status = db.Column(db.String, nullable=False)
It's a very basic table but I'm having a hard time getting back the data I'm looking for. I have this working with 2 separate queries, but I have to believe there is a way to do this all in one query.
Here are the separate queries I'm running:
total = len(cls.query.all())
status_count = cls.query.with_entities(KbStatus.status, func.count(KbStatus.id).label("total")).group_by(KbStatus.status).all()
From here I'm converting it to a dict and combining it to make the output look like so:
{
"data": {
"status_count": {
"Assigned": 1,
"In Progress": 1,
"Peer Review": 1,
"Ready to Publish": 1,
"Unassigned": 4
},
"total_requests": 8
}
}
Any help is greatly appreciated.
I don't know about sqlalchemy, but it's possible to generate the results you want in a single query with pure sqlite using the JSON1 extension:
Given the following table and data:
CREATE TABLE data(id INTEGER PRIMARY KEY, status TEXT);
INSERT INTO data(status) VALUES ('Assigned'),('In Progress'),('Peer Review'),('Ready to Publish')
,('Unassigned'),('Unassigned'),('Unassigned'),('Unassigned');
CREATE INDEX data_idx_status ON data(status);
this query
WITH individuals AS (SELECT status, count(status) AS total FROM data GROUP BY status)
SELECT json_object('data'
, json_object('status_count'
, json_group_object(status, total)
, 'total_requests'
, (SELECT sum(total) FROM individuals)))
FROM individuals;
will return one row holding (After running through a JSON pretty printer; the actual string is more compact):
{
"data": {
"status_count": {
"Assigned": 1,
"In Progress": 1,
"Peer Review": 1,
"Ready to Publish": 1,
"Unassigned": 4
},
"total_requests": 8
}
}
If the sqlite instance you're using wasn't built with support for JSON1:
SELECT status, count(status) AS total FROM data GROUP BY status;
will give
status total
-------------------- ----------
Assigned 1
In Progress 1
Peer Review 1
Ready to Publish 1
Unassigned 4
which you can iterate through in python, inserting each row into your dict and adding up all total values in another variable as you go to get the total_requests value at the end. No need for another query just to calculate that number; do it manually. I bet it's really easy to do the same thing with your existing second sqlachemy query.

Efficient joining the most recent record from another table in Entity Framework Core

I am comming to ASP .NET Core from PHP w/ MySQL.
The problem:
For the illustration, suppose the following two tables:
T: {ID, Description, FK} and States: {ID, ID_T, Time, State}. There is 1:n relationship between them (ID_T references T.ID).
I need all the records from T with some specific value of FK (lets say 1) along with the related newest record in States (if any).
In terms of SQL it can be written as:
SELECT T.ID, T.Description, COALESCE(s.State, 0) AS 'State' FROM T
LEFT JOIN (
SELECT ID_T, MAX(Time) AS 'Time'
FROM States
GROUP BY ID_T
) AS sub ON T.ID = sub.ID_T
LEFT JOIN States AS s ON T.ID = s.ID_T AND sub.Time = s.Time
WHERE FK = 1
I am struggling to write an efficient equivalent query in LINQ (or the fluent API). The best working solution I've got so far is:
from t in _context.T
where t.FK == 1
join s in _context.States on t.ID equals o.ID_T into _s
from s in _s.DefaultIfEmpty()
let x = new
{
id = t.ID,
time = s == null ? null : (DateTime?)s.Time,
state = s == null ? false : s.State
}
group x by x.id into x
select x.OrderByDescending(g => g.time).First();
When I check the resulting SQL query in the output window when executed it is just like:
SELECT [t].[ID], [t].[Description], [t].[FK], [s].[ID], [s].[ID_T], [s].[Time], [s].[State]
FROM [T] AS [t]
LEFT JOIN [States] AS [s] ON [T].[ID] = [s].[ID_T]
WHERE [t].[FK] = 1
ORDER BY [t].[ID]
Not only it selects more columns than I need (in the real scheme there are more of them). There is no grouping in the query so I suppose it selects everything from the DB (and States is going to be huge) and the grouping/filtering is happening outside the DB.
The questions:
What would you do?
Is there an efficient query in LINQ / Fluent API?
If not, what workarounds can be used?
Raw SQL ruins the concept of abstracting from a specific DB technology and its use is very clunky in current Entity Framework Core (but maybe its the best solution).
To me, this looks like a good example for using a database view - again, not really supported by Entity Framework Core (but maybe its the best solution).
What happens if you try to do a more straight forward translation to LINQ?
var latestState = from s in _context.States
group s by s.ID_T into sg
select new { ID_T = sg.Key, Time = sg.Time.Max() };
var ans = from t in _context.T
where t.FK == 1
join sub in latestState on t.ID equals sub.ID_T into subj
from sub in subj.DefaultIfEmpty()
join s in _context.States on new { t.ID, sub.Time } equals new { s.ID, s.Time } into sj
from s in sj.DefaultIfEmpty()
select new { t.ID, t.Description, State = (s == null ? 0 : s.State) };
Apparently the ?? operator will translate to COALESCE and may handle an empty table properly, so you could replace the select with:
select new { t.ID, t.Description, State = s.State ?? 0 };
OK. Reading this article (almost a year old now), Smit's comment to the original question and other sources, it seems that EF Core is not really production ready yet. It is not able to translate grouping to SQL and therefore it is performed on the client side, which may be (and in my case would be) a serious problem. It corresponds to the observed behavior (the generated SQL query does no grouping and selects everything in all groups). Trying the LINQ queries out in Linqpad it always translates to a single SQL query.
I have downgraded to EF6 following this article. It required some changes in my model's code and some queries. After changing .First() to .FirstOrDefault() in my original LINQ query it works fine and translates to a single SQL query selecting only the needed columns. The generated query is much more complex than it is needed, though.
Using a query from NetMage's answer (after small fixes), it results in a SQL query almost identical to my own original SQL query (there's only a more complex construct than COALESCE).
var latestState = from s in _context.States
group s by s.ID_T into sg
select new { ID = sg.Key, Time = sg.Time.Max() };
var ans = from t in _context.T
where t.FK == 1
join sub in latestState on t.ID equals sub.ID into subj
from sub in subj.DefaultIfEmpty()
join s in _context.States
on new { ID_T = t.ID, sub.Time } equals new { s.ID_T, s.Time }
into sj
from s in sj.DefaultIfEmpty()
select new { t.ID, t.Description, State = (s == null ? false : s.State) };
In LINQ it's not as elegant as my original SQL query but semantically it's the same and it does more or less the same thing on the DB side.
In EF6 it is also much more convenient to use arbitrary raw SQL queries and AFAIK also the database views.
The biggest downside of this approach is that full .NET framework has to be targeted, EF6 is not compatible with .NET Core.

DynamoDB “OR” conditional Range query

Let's assume my table looks like:
Code |StartDate |EndDate |Additional Attributes...
ABC |11-24-2015 |11-26-2015 | ....
ABC |12-12-2015 |12-15-2015 | ....
ABC |10-05-2015 |10-10-2015 | ....
PQR |03-24-2015 |03-27-2015 | ....
PQR |05-04-2015 |05-08-2015 | ....
Provided a Code (c) and a date range (x, y), I need to be able to query items something like:
Query => (Code = c) AND ((StartDate BETWEEN x AND y) OR (EndDate BETWEEN x AND y))
I was planning to use a Primary Key as a Hash and Range Key (Code, StartDate) with an additional LSI (EndDate) and do a query on it.
I am not sure if there is a way to achieve this. I don't want to use the SCAN operation as it seems to scan the entire table which could be very costly.
Also, would like to achieve this in a single query.
One option would be to do this using QUERY and a FilterExpression. No need to define the LSI on this case. You would have to query by Hash Key with the EQ operator and then narrow the results with the Filter Expression. Here is an example with the Java SDK:
Table table = dynamoDB.getTable(tableName);
Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
expressionAttributeValues.put(":x", "11-24-2015");
expressionAttributeValues.put(":y", "11-26-2015");
QuerySpec spec = new QuerySpec()
.withHashKey("Code", "CodeValueHere")
.withFilterExpression("(StartDate between :x and :y) or (EndDate between :x and :y)")
.withValueMap(expressionAttributeValues);
ItemCollection<QueryOutcome> items = table.query(spec);
Iterator<Item> iterator = items.iterator();
while (iterator.hasNext()) {
System.out.println(iterator.next().toJSONPretty());
}
See Specifying Conditions with Condition Expressions for more details.
Additionally, although the previous query only uses the Hash Key , you can still group the records with the Range Key containing the dates in the following format:
StartDate#EndDate
Table Structure:
Code DateRange |StartDate |EndDate
ABC 11-24-2015#11-26-2015 |11-24-2015 |11-26-2015
ABC 12-12-2015#12-15-2015 |12-12-2015 |12-15-2015
ABC 10-05-2015#10-10-2015 |10-05-2015 |10-10-2015
PQR 03-24-2015#03-27-2015 |03-24-2015 |03-27-2015
PQR 05-04-2015#05-08-2015 |05-04-2015 |05-08-2015
This way If you happen to query only by Hash Key you would still get the records sorted by the dates. Also, I believe it is a good idea to follow the advice given about the unambiguous date format.nu

Resources