Azure Stream Analytics: Bad Request when calling an Azure Machine Learning function, even though the Azure ML service is called fine from C#

We have an Azure Machine Learning web service that is called fine from a C# program. It also works fine when called as an HTTP POST (with headers and a JSON string in the body). However, in Azure Stream Analytics you have to create a function to call an ML service, and when this function is called in ASA, it fails with Bad Request.
The documentation for the ML service gives the following sample request:
Request Body
Sample Request
{
  "Inputs": {
    "input": [
      {
        "device": "60-1-94-49-36-c5",
        "uid": "5f4736aabfc1312385ea09805cc922",
        "weight": "9-9-9-9-9-8-9-8-9-9-9-9-9-9-9-9-9-8-9-9-8-8-9-9-9-9-9-9-9-9-9-9-9-9-9-8-9-9-9-9-9-9-9-9-9-9-9-9-9-9-9-9-8-9-9-9-9-9-9-9-9-9-9-9-9-9-9-9-9-9-9-8-9-9-9-9-8-9-9-9-8-9-9-9-9-9-9-9-9-9-8-9-9-9-9-8-8-16-16-15-16-16-15-15-16-15-15-15-15-16-15-15-16-15-15-9-15-15-15-15-15-15-15-9-15-16-15-15-9-15-16-16-16-15-15-15-15-15-15-15-15-16-16-15-9-15-15-15-16-15-16-15-15-15-15-15-16-15-15-16-16-15-15-15"
      }
    ]
  },
  "GlobalParameters": {}
}
The Azure Stream Analytics function (that calls the ML service above) has this signature:
FUNCTION SIGNATURE
SmartStokML2018Aug17 ( device NVARCHAR(MAX) ,
uid NVARCHAR(MAX) ,
weight NVARCHAR(MAX) ) RETURNS RECORD
Here the function expects 3 string arguments (NVARCHAR, as shown) and NOT a full JSON string.
The 3 parameters (device, uid and weight) have been passed in several different string formats: as JSON strings built with JSON.stringify() in a UDF, and as plain values with just the data and no field names ("device", "uid", "weight"). But all calls to the ML service fail.
WITH QUERY1 AS (
SELECT DEVICE, UID, WEIGHT,
udf.jsonstringify( concat('{"device": "',try_cast(device as nvarchar(max)), '"}')) jsondevice,
udf.jsonstringify( concat('{"uid": "',try_cast(uid as nvarchar(max)), '"}')) jsonuid,
udf.jsonstringify( concat('{"weight": "',try_cast(weight as nvarchar(max)), '"}')) jsonweight
FROM iothubinput2018aug21 ),
QUERY2 AS (
SELECT IntellistokML2018Aug21(JSONDEVICE, JSONUID, JSONWEIGHT) AS RESULT
FROM QUERY1
)
SELECT *
INTO OUT2BLOB20
FROM QUERY2
Most of the errors are:
ValueError: invalid literal for int() with base 10: '\\" {weight:9'\n\r\n\r\n
In what format does the ML Service expect these parameters to be passed in?
Note: the queries have been tried with ASA Compatibility Levels 1.0 and 1.1.

In an ASA function, you don't need to construct the JSON input to Azure ML yourself. You just specify your event fields directly. E.g.:
WITH QUERY1 AS (
SELECT IntellistokML2018Aug21(DEVICE, UID, WEIGHT) AS RESULT
FROM iothubinput2018aug21
)
SELECT *
INTO OUT2BLOB20
FROM QUERY1

As mentioned in Dushyant's post, you don't need to construct the JSON input for Azure ML. However, I've noticed that your input is nested JSON with an array, so you need to extract the fields in your first step.
Here is an example:
WITH QUERY1 AS(
SELECT
GetRecordPropertyValue(GetArrayElement(inputs.input,0),'device') as device,
GetRecordPropertyValue(GetArrayElement(inputs.input,0),'uid') as uid,
GetRecordPropertyValue(GetArrayElement(inputs.input,0),'weight') as weight
FROM iothubinput2018aug21 )
Please note that if you can have several records in the "Inputs.input" array, you can use CROSS APPLY to read all of them (in my example above I assumed there is only one); see the sketch below.
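For reference, a minimal sketch of the CROSS APPLY variant, assuming the same input and field names as above ("e" and "flattened" are just illustrative aliases):
WITH QUERY1 AS (
    SELECT
        GetRecordPropertyValue(flattened.ArrayValue, 'device') AS device,
        GetRecordPropertyValue(flattened.ArrayValue, 'uid') AS uid,
        GetRecordPropertyValue(flattened.ArrayValue, 'weight') AS weight
    FROM iothubinput2018aug21 AS e
    -- one output row per element of the Inputs.input array
    CROSS APPLY GetArrayElements(e.inputs.input) AS flattened
)
SELECT * FROM QUERY1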
More information on querying JSON here: https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parsing-json
Let us know if it works for you.
JS (Azure Stream Analytics)

It turns out the ML service is expecting devices with a KNOWN MAC ID. If a device is passed in with an UNKNOWN MAC ID, then there is a failure in the Python script. This should be handled more gracefully.
Now there are errors related to batch processing of rows:
"Error": "- Condition 'The number of events in Azure ML request ID 0 is 28 but the
number of results in the response is 1. These should be equal. The Azure ML model
is expected to score every row in the batch call and return a response for it.'
should not be false in method
'Microsoft.Streaming.CalloutProcessor.dll
!Microsoft.Streaming.Processors.Callout.AzureMLRRS.ResponseParser.Parse'
(Parse at offset 69 in file:line:column <filename unknown>:0:0\r\n)\r\n",
"Message": "An error was encountered while calling the Azure ML web service. An
error occurred when parsing the Azure ML web service response. Please check your
Azure ML web service and data model., - Condition 'The number of events in Azure ML
request ID 0 is 28 but the number of results in the response is 1. These should be
equal. The Azure ML model is expected to score every row in the batch call and
return a response for it.' should not be false in method
'Microsoft.Streaming.CalloutProcessor.dll
!Microsoft.Streaming.Processors.Callout.AzureMLRRS.ResponseParser.Parse' (Parse at
offset 69 in file:line:column <filename unknown>:0:0\r\n)\r\n, :
OutputSourceAlias:query2Callout;",
Type": "CallOutProcessingFailure",
"Correlation ID": "2f87188e-1eda-479c-8e86-e2c4a827c6e7"
I am looking into this article for guidance:
Scale your Stream Analytics job with Azure Machine Learning functions: https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/stream-analytics/stream-analytics-scale-with-machine-learning-functions.md

I am unable to add a comment to the original thread regarding this, so replying here:
"The number of events in Azure ML
request ID 0 is 28 but the number of results in the response is 1. These should be
equal"
ASA's call-out to Azure ML is modeled as a scalar function. This means that every input event needs to generate exactly one output. In your case, it seems that you are generating one output for 28 input events. Can you modify your logic to generate an output per input event? A sketch of what that could look like in the scoring script follows.
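For illustration only, here is a minimal sketch of an Azure ML Studio (classic) Python scoring entry point (azureml_main) that returns exactly one row per input row, with a graceful fallback for unknown devices; KNOWN_DEVICES and the trivial predict() stub are hypothetical stand-ins for the real model logic:
import pandas as pd

# Hypothetical lookup of known device MAC IDs (an assumption, not taken from the real model).
KNOWN_DEVICES = {"60-1-94-49-36-c5"}

def predict(row):
    # Placeholder for the real model scoring call.
    return "scored:" + row["uid"]

def azureml_main(dataframe1=None, dataframe2=None):
    # Score every row of the batch so the response has as many rows as the request.
    def score_row(row):
        if row["device"] not in KNOWN_DEVICES:
            return "unknown device"  # graceful fallback instead of raising an exception
        return predict(row)
    scored = dataframe1.apply(score_row, axis=1)
    return pd.DataFrame({"Scored Labels": scored}),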

Regarding the JSON format:
{ "Inputs":{ "input":[ { "device":"60-c5", "uid":"5f422", "weight":"9--15" } ] }, "GlobalParameters":{ } }
All the extra markup will be added by ASA when calling AML. Do you have a way of inspecting the input received by your AML web service? For example, you could modify your model code to write the incoming data to blob storage.
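As a debugging aid only, here is a minimal sketch that dumps whatever the scoring script receives to blob storage, assuming the classic azure-storage SDK (BlockBlobService) is importable in the Azure ML Python environment; the account, key and container names are placeholders:
from azure.storage.blob import BlockBlobService

def azureml_main(dataframe1=None, dataframe2=None):
    # Write the incoming batch to a blob so the exact input format can be inspected.
    blob = BlockBlobService(account_name="mystorageaccount", account_key="<storage key>")
    blob.create_blob_from_text("debug", "aml-input.csv", dataframe1.to_csv(index=False))
    return dataframe1,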

AML calls are expected to follow scalar semantics - one output per input.

Related

Recursive, Non-Dynamic (Refreshable) Web API via Power BI

I am trying to write a recursive web API call in PBI to collect all 27,515 records; the OData feed has a limit of 1,000 rows per request. I need this data to be refreshable in the PBI service, therefore these 28 requests via M code cannot be formulated in a dynamic way: PBI only allows static (non-dynamic) sources for refresh within the service. Below I will share two pieces of M code: 1. one that is considered a dynamic data source (not what I need, but it pulls all 27,515 records correctly), and 2. one that is a static data source (which returns an incorrect number of rows, 19,000, but is the type of data source that I need for this refresh problem).
Noteworthy: upon the initial API call I receive a table named "d" (shown in the image below) with two rows: one row titled "results", which contains all of the data (1,000 rows) I need per request, and a second row titled "__next", which has the next API URL with an embedded skiptoken from the current call's worth of data. This skiptoken tells the API which rows to skip so that the next request doesn't deliver the data we have already collected.
[Image: Table d, Initial Table]
M Code for Dynamic Data Source: This dynamic data source is pulling the correct number of records in 28 requests (up to 1,000 records per request) totaling 27,515 rows.
= List.Generate( ()=> Json.Document(Web.Contents("https://my_instance/odata/v2/Table?$format=JSON&$paging=snapshot"))[d],
each Record.HasFields(_, "results")= true,
each try Json.Document(Web.Contents(_[__next]))[d] otherwise [df=[__next="dummy_variable"]])
M Code for Static Data Source: This static data source is the type that I need for refreshing in PBI service (I confirmed it does refresh in the service), but is returning an incorrect number of rows, 19,000 versus 27,515. This code is calling 19 requests versus the needed 28 requests. I believe the error lies in the Query portion where I am attempting to call the next API URL with the skiptoken from the previous request.
= List.Generate( ()=> Json.Document(Web.Contents("https://my_instance/odata/v2/Table?$format=JSON&$paging=snapshot"))[d],
each Record.HasFields(_, "results")= true,
each try Json.Document(Web.Contents("https://my_instance/odata/v2/Table?$format=JSON&$paging=snapshot", [Query=[q=_[__next]]]))[d] otherwise [df=[__next="dummy_variable"]])
Does anyone see an error in the static code for iteratively calling each new request via the table [d], which has a row labeled [results] (all the data) and another row labeled [__next] (the next URL with the skiptoken from the previous API call)?
To be clear, in Web.Contents the URL must be static, but you can freely use dynamic components in the optional RelativePath option (as in this simple example function). This is how you can generate dynamic web API queries that still refresh in the service without the error you are seeing with respect to dynamic queries:
(current_page as text) =>
let
data = Web.Contents(
"https://my_instance/api/v2/endpoint", // static!
[
RelativePath = "?page="&current_page // dynamic!
]
)
in
data
So if you can split out the relative path of your __next value and feed it into such a function, it will be OK for automatic refreshes in the Power BI service; a sketch of that split follows.
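For illustration, a minimal sketch of that split under the same assumptions as above (the base URL, the [d]/[__next] field names and the getNextPage name are placeholders): keep the base URL static and pass only the remainder of __next as RelativePath:
getNextPage = (next_url as text) =>
    let
        // static base URL; everything after it becomes the dynamic RelativePath
        base = "https://my_instance/odata/v2/",
        relative = Text.Replace(next_url, base, ""),
        data = Json.Document(Web.Contents(base, [RelativePath = relative]))[d]
    in
        data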

Submitting time data to wikibase with `wbcreateclaim` results in "invald-snak" error

I am currently trying to populate a Wikibase instance via POST requests. For this purpose I use the requests library in Python together with the MediaWiki API.
So far I managed to create claims with different datatypes (like String, Quantity, Wikidata items, Media ...). The general scheme I use is this (with different value strings for each datatype):
import requests
session = requests.Session()
# authenticate and obtain a csrf_token
parameters = {
'action': 'wbcreateclaim',
'format': 'json',
'entity': 'Q1234',
'snaktype': 'value',
'property': 'P12',
'value': '{"time": "+2022-02-19T00:00:00Z", "timezone": 0, "precision": 11, "calendarmodel": "http://www.wikidata.org/entity/Q1985727"}',
'token': csrf_token,
'bot': 1,
}
r = session.post(api_url, data=parameters)
print(r.json())
Every attempt to insert data of type time leads to an invalid-snak error (info: "invalid snak data.").
The following changes did not solve the problem:
submitting the value string as dictionary value (without the single quotes),
putting the numeric values into (double) quotes,
using a local item for the calendarmodel ("http://localhost:8181/entity/Q73"),
adding before and after keys in the dictionary,
omitting timezone, precision, calendarmodel and combinations thereof,
formatting the time string as 2022-02-19,
submitting the request with administrator rights (though the error message does not suggest a problem with insufficient user rights).
Do you know what I'm doing wrong here? What must the value field look like?
I am aware that special libraries and interfaces exist for these tasks, but I do want to use the Wikidata API directly with the requests library in Python.
Thank you very much for your help!
Installed software versions:
MediaWiki: 1.36.3
PHP: 7.4.27
MariaDB 10.3.32-MariaDB-1:10.3.32+maria~focal
ICU 67.1
It works if the value string is generated from the dictionary via json.dumps().
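For example, a minimal sketch of the working request, reusing the parameters from the question and building the value with json.dumps() (api_url and csrf_token obtained as before):
import json
import requests

session = requests.Session()
# authenticate and obtain a csrf_token, as in the question

time_value = {
    "time": "+2022-02-19T00:00:00Z",
    "timezone": 0,
    "precision": 11,
    "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
}

parameters = {
    "action": "wbcreateclaim",
    "format": "json",
    "entity": "Q1234",
    "snaktype": "value",
    "property": "P12",
    # serialize the dictionary instead of hand-writing the JSON string
    "value": json.dumps(time_value),
    "token": csrf_token,
    "bot": 1,
}
r = session.post(api_url, data=parameters)
print(r.json())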

When are writeFields specified in Firestore requests and what replaces them?

The simulator now displays an error message when trying to access request.writeFields.
Before that, writeFields in Firestore Security Rules simply did not work in real requests.
The message states the following:
The simulator only simulates client SDK calls; request.writeFields is always null for these simulations
Does this mean that writeFields are only specified in HTTP requests?
The documentation only states this:
writeFields: List of fields being written in a write request.
A problem that arises from this
I am searching for something that replaces this property because it is "always null".
To my knowledge, request.resource.data in update also contains fields that are not in the request but are already in the document.
Example
// Existing document:
document:
- name: "Peter"
- age: 52
- profession: "Baker"
// Update call:
document:
- age: 53
// request.resource.data in allow update contains the following:
document:
- name: "Peter"
- age: 53
- profession: "Baker"
But I only want age.
EDIT Mar 4, 2020: Map.diff() replaces writeFields functionality
The Map.diff() function gives the difference between two maps:
https://firebase.google.com/docs/reference/rules/rules.Map#diff
To use it in rules:
// Returns a MapDiff object
map1.diff(map2)
A MapDiff object has the following methods
addedKeys() // a set of strings of keys that are in after but not before
removedKeys() // a set of strings of keys that are in before but not after
changedKeys() // a set of strings of keys that are in both maps but have different values
affectedKeys() // a set of strings that's the union of addedKeys() + removedKeys() + changedKeys()
unchangedKeys() // a set of strings of keys that are in both maps and have the same value in both
For example:
// This rule only allows updates where "a" is the only field affected
request.resource.data.diff(resource.data).affectedKeys().hasOnly(["a"])
EDIT Oct 4, 2018: writeFields is no longer supported by Firestore and its functionality will eventually be removed.
writeFields is still valid, as you can see from the linked documentation. What the error message in the simulator is telling you is that it's unable to simulate writeFields, as it only works with requests coming from client SDKs. The simulator itself seems to be incapable of simulating requests exactly as required in order for writeFields to be tested. So, if you write rules that use writeFields, you'll have to test them by using a client SDK to perform the read or write that would trigger the rule.
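For illustration, a minimal sketch of what such a rule might look like (the collection and field names are just examples, and as noted above writeFields is deprecated, so treat this as historical); you would then have to exercise it from a client SDK rather than the simulator:
service cloud.firestore {
  match /databases/{database}/documents {
    match /people/{id} {
      // only allow updates that write the "age" field and nothing else
      allow update: if request.writeFields.hasOnly(["age"]);
    }
  }
}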

querying across application insights resource with REST API

I have data stored in 3 different Application Insights resources. Thanks to the cross-resource query feature added last year (https://azure.microsoft.com/en-us/blog/query-across-resources/), it is possible to query those 3 Application Insights resources at once with the app identifier.
I'm trying to execute this query through the Application Insights REST API (https://dev.applicationinsights.io) for a very basic need from a static HTML page (no backend), but without luck.
I suspect that the app identifier isn't supported. Is that actually the case? Is there any workaround for my use case (no backend)?
Here is an example with the query in the body. My queries are quite complex and have a lot of let statements and therefore passing the query in the body is easier. There are some PowerShell quirks in the example below, but I'll update with a C# example tomorrow.
The let statement in the example below is pretty pointless, it's mostly there to show that you can do complex queries with let expressions etc.
AppId is the Application Insights resource ID - and NOT the instrumentation key. The API key is just a long string and you can create up to 10 of them AFAIK.
You will find both the ID and the keys under API Access (it's easy to get them confused). When you use the app() function, use the app ID.
$app1Id = "GUID"
$app2Id = "GUID"
$app1Key = "string"
$app2Key = "string"
# EXAMPLE: "X-Api-Key" = "key1:GUID1,key2:GUID2"
$headers = @{ "X-Api-Key" = "${app1Key}:$app1Id,${app2Key}:$app2Id"; "Content-Type" = "application/json" }
# EXAMPLE: "query" = "union app('GUID1').something, app('GUID2').something | limit 5"
$query = @{ "query" = "let days=1d;union app('$app1Id').exceptions,app('$app2Id').exceptions | where timestamp > ago(days)" }
$body = ConvertTo-Json $query | % { [regex]::Unescape($_) }
$result = Invoke-RestMethod "https://api.applicationinsights.io/v1/apps/$app1Id/query" -Headers $headers -Body $body -Method POST
The query above will return all the exceptions for the two Application Insights resources for the last day. At the time of writing you can query across up to 10 resources, with 200 requests per 30 seconds or a max of 86,400 requests per day (UTC). Other limits apply if you use AAD authentication.
NOTE: the extra {} in the header is a PowerShell quirk in regards to variables and the use of the colon char, and as you can see in the example you should not bracket the keys in the header :)
Checked with the dev team that owns that service:
You should be able to put the api key in as apiKey1:appId1,apiKey2:appId2 in the api key box and this should work.
The [object ProgressEvent] response is a bug in the explorer, which should really have shown you an error.
And as a workaround, if all you need is to see the data, you could always do the queries inside the Azure portal itself in workbooks for any of the AI resources, or also from the Analytics portal, and those wouldn't require the API key at all.
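Since the question mentions a static HTML page with no backend, here is a minimal browser-side sketch of the same call using fetch, assuming the API allows cross-origin requests from your page; the IDs, keys and query are placeholders:
const appId = "GUID1";
const headers = {
  // comma-separated key:appId pairs, as described above
  "X-Api-Key": "key1:GUID1,key2:GUID2",
  "Content-Type": "application/json",
};
const body = JSON.stringify({
  query: "union app('GUID1').exceptions, app('GUID2').exceptions | where timestamp > ago(1d)",
});

fetch(`https://api.applicationinsights.io/v1/apps/${appId}/query`, { method: "POST", headers, body })
  .then((r) => r.json())
  .then((result) => console.log(result))
  .catch((err) => console.error(err));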

AWS API Gateway - change to 404 if query returns nothing

I have a Dynamodb table with a few fields - my_id is the PrimaryKey. In the API gateway I set up a response with a method that takes in a parameter {my_id}.
Then I have an Integration Request mapping template that takes the passed in parameter and queries the table to return all the fields that match.
Then I have an Integration response mapping template that cleans up the returned items the way I want.
This all works perfectly.
The thing I can't figure out is this: if the parameter that is passed in doesn't match anything in the table, how do I change the response from a 200 status to a 404?
From what I can tell when the passed in parameter doesn't match anything it doesn't cause an error, it just doesn't return anything.
It seems like I need to change the mapping template on the Integration response to first check if the params are empty and then somehow tell it to change the response status.
I can find info about this type of thing with people using Lambda, but I am not using Lambda - just the Dynamodb table and the API Gateway.
You can use a mapping template to convert the response that you get from DynamoDB and override the response code. You can find more details at https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-override-request-response-parameters.html
If you are using CloudFormation, the integration response is configured like this:
IntegrationResponses:
  - StatusCode: "200"
    ResponseTemplates:
      application/json: |
        {
          "payload" : {
          }
        }
and the response template below overrides the status code when DynamoDB returns no Item:
IntegrationResponses:
  - StatusCode: "200"
    ResponseTemplates:
      application/json: |
        #set($inputRoot = $input.path('$'))
        #if($inputRoot.toString().contains("Item"))
        $input.json("$")
        #set($context.responseOverride.status = 200)
        #else
        #set($context.responseOverride.status = 404)
        #end
API Gateway currently supports mapping the status code using the status code of the integration response (here, the DynamoDB response code). The only workaround is to use a Lambda function which outputs different error messages that can be mapped using an error regex: http://docs.aws.amazon.com/apigateway/latest/developerguide/how-to-method-settings-execution-console.html
