Disable token breaks on punctuation in LUIS.ai

I am working with the Language Understanding service (LUIS.ai) from Microsoft Cognitive Services.
Whenever text is parsed by LUIS, whitespace tokens are always inserted around punctuation.
This behavior is intentional, according to the documentation:
"English, French, Italian, Spanish: token breaks are inserted at any whitespace, and around any punctuation."
For my project, I need to preserve the original query string without these token breaks, because some entities trained for my model include punctuation, and stripping the extra whitespace from the parsed entities is annoying and a bit hacky.
Example of this behavior: [screenshot of a tokenized utterance in the labeling view, not preserved]
Is there a way to disable this? It would save quite a bit of effort.
Thanks!!

Unfortunately, there is currently no way to disable this. The good news is that the returned predictions refer to the original string, not the tokenized one you see in the example labeling process.
The documentation on understanding the JSON response shows that the example output preserves the original "query" string, and that each extracted entity carries zero-based character indices ("startIndex", "endIndex") into the original string. This lets you work with the indices instead of the parsed entity phrases.
{
  "query": "Book me a flight to Boston on May 4",
  "intents": [
    {
      "intent": "BookFlight",
      "score": 0.919818342
    },
    {
      "intent": "None",
      "score": 0.136909246
    },
    {
      "intent": "GetWeather",
      "score": 0.007304534
    }
  ],
  "entities": [
    {
      "entity": "boston",
      "type": "Location::ToLocation",
      "startIndex": 20,
      "endIndex": 25,
      "score": 0.621795356
    },
    {
      "entity": "may 4",
      "type": "builtin.datetime.date",
      "startIndex": 30,
      "endIndex": 34,
      "resolution": {
        "date": "XXXX-05-04"
      }
    }
  ]
}
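As a rough illustration, the indices can be used to slice the original text out of "query". This is a minimal Python sketch, not part of any LUIS SDK; the helper names `entity_text` and `original_entities` are made up for this example, and it assumes "endIndex" is inclusive, which is what the sample above suggests (20 to 25 covers "Boston").

```python
import json

# The sample response from above, reduced to the fields used here.
response = json.loads("""
{
  "query": "Book me a flight to Boston on May 4",
  "entities": [
    {"entity": "boston", "type": "Location::ToLocation",
     "startIndex": 20, "endIndex": 25},
    {"entity": "may 4", "type": "builtin.datetime.date",
     "startIndex": 30, "endIndex": 34}
  ]
}
""")


def entity_text(query, entity):
    """Return the entity exactly as it was typed in the original query.

    Assumption: endIndex is inclusive (as in the sample), hence the +1.
    """
    return query[entity["startIndex"]:entity["endIndex"] + 1]


def original_entities(resp):
    """Pair each entity type with its untokenized, original-case text."""
    return [(e["type"], entity_text(resp["query"], e)) for e in resp["entities"]]


for entity_type, text in original_entities(response):
    print(entity_type, repr(text))
```

Because the slice comes from "query", it keeps the original casing and punctuation, and the whitespace-normalized "entity" value can be ignored.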


Firestore Security Rules Really Spaghetti-Like?

While working on Firestore Security Rules, I found out that there is no way to specify read/write access at the field level.
All you can do is specify access at the document/collection level.
But doesn't this force really weird database structures?
Consider this example:
[
  {
    "id": 15,
    "name": "room1",
    "color": "red",
    "owner": "Tim"
  },
  {
    "id": 642,
    "name": "room2",
    "color": "green",
    "owner": "Charles"
  },
  {
    "id": 989,
    "name": "room3",
    "color": "blue",
    "owner": "Jane"
  }
]
In this example I want to allow, say, Jane to read the fields id, name, and owner of every entry in the collection, but I don't want her to see the field color for rooms that belong to other people.
This would of course be possible with a data structure like this:
[
  {
    "id": 15,
    "name": "room1",
    "owner": "Tim",
    "private_values": {
      "color": "red"
    }
  },
  {
    "id": 642,
    "name": "room2",
    "private_values": {
      "color": "green"
    },
    "owner": "Charles"
  },
  {
    "id": 989,
    "name": "room3",
    "private_values": {
      "color": "blue"
    },
    "owner": "Jane"
  }
]
All I did was move the "private" values (in this case only the color) into a separate collection.
This way I can set one rule for the root object and another rule for the private_values object.
Even though this is entirely possible, I wouldn't consider it especially clean when extrapolated to a bigger example with, say, more groups of users who all need to see different fields.
Is there a cleaner and better way to do this than the one I just explained, or is there anything else I missed?
Regards

You didn't miss anything. This is exactly what you're supposed to do. Documents in Firestore are the most granular unit for operations.
Note that you also cannot read a partial document (you must read all the fields if you want to read any of them). If you write a Cloud Function that triggers when a document changes, you always receive the contents of the entire document, and you can't write a trigger for when an individual field changes.
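Since access is granted per document, the private fields have to live in a document of their own before a separate rule can guard them. The split can be done mechanically before writing. Below is a minimal Python sketch of that step only; it does not call any Firestore API, and the names `split_document` and `private_fields` are hypothetical.

```python
from typing import Any, Dict, Iterable, Tuple


def split_document(
    doc: Dict[str, Any], private_fields: Iterable[str]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one record into a public part and a private part.

    Each part would then be written as its own document, so that a
    document-level rule can apply to each one independently.
    """
    private = set(private_fields)
    public_part = {k: v for k, v in doc.items() if k not in private}
    private_part = {k: v for k, v in doc.items() if k in private}
    return public_part, private_part


room = {"id": 989, "name": "room3", "color": "blue", "owner": "Jane"}
public_doc, private_doc = split_document(room, {"color"})
print(public_doc)   # fields every reader may see
print(private_doc)  # fields only the owner may see
```

With more than two audiences, the same idea extends to one private document per group of readers, which is where the structure starts to feel heavy, as the question points out.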

Correct invalid json for use with Json.Net

I have some JSON that I have no control over. It comes from a third-party supplier, and the quotes are not escaped properly, which results in malformed JSON. I have asked them to correct it, but in the meantime I would like to be able to use it.
var json = "{
  "news": {
    "headline": "Headline",
    "items: [
      {
        "title": "title1",
        "description": "description1",
      },
      {
        "title": "title2",
        "description": "description2",
      },
      {
        "title": "title3",
        "description": "description "with quotes" in the middle",
      },
    ]
  }
}";
I am trying to use DeserializeObject with it
var obj = JsonConvert.DeserializeObject<MyClass>(json);
Ideally, I would like all three items in my deserialised object, but even two would be better than the DeserializeObject just blowing up because the JSON is badly formatted.
Is there a possible correction which can be applied? I have looked at regexes but it's difficult to come up with something that could work with long and complex examples with many more items than this simplified version.
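One possible stopgap is a line-based repair pass before deserializing. The sketch below is Python rather than C#, and it only illustrates the idea under strong assumptions: every string property sits on its own line in the form "key": "value", and the only defects are unescaped inner quotes and trailing commas. The function names are made up, and the sample uses a well-formed "items" key. Feeds that do not follow this layout will defeat it.

```python
import json
import re

# One property per line:  "key": "value"  with an optional trailing comma.
_PAIR = re.compile(r'^(\s*"[^"]+"\s*:\s*")(.*)("\s*,?\s*)$')


def escape_inner_quotes(text):
    """Escape quotes that appear inside a string value (heuristic)."""
    fixed = []
    for line in text.splitlines():
        match = _PAIR.match(line)
        if match:
            head, value, tail = match.groups()
            # Escape only quotes that are not already escaped.
            value = re.sub(r'(?<!\\)"', r'\\"', value)
            line = head + value + tail
        fixed.append(line)
    return "\n".join(fixed)


def drop_trailing_commas(text):
    """Remove a comma directly before } or ] (would also hit such text inside strings)."""
    return re.sub(r',(\s*[}\]])', r'\1', text)


def repair(text):
    return drop_trailing_commas(escape_inner_quotes(text))


broken = '''{
  "news": {
    "headline": "Headline",
    "items": [
      {
        "title": "title3",
        "description": "description "with quotes" in the middle",
      },
    ]
  }
}'''

data = json.loads(repair(broken))
print(data["news"]["items"][0]["description"])
```

The same two regular-expression passes could be ported to C# and applied to the string before it is handed to JsonConvert.DeserializeObject.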

Adobe Analytics API - Real Time Classification

I need to get a classified eVar from the Omniture real-time API, exclude some values, and then break it down by sitesection.
I tried this query:
{
  "reportDescription": {
    "source": "realtime",
    "reportSuiteID": "**RSID**", //MY REPORT SUITE
    "metrics": [{
      "id": "instances"
    }],
    "elements": [{
      "id": "evar", //MY EVAR
      "top": 100,
      "classification": "Real Time", //CLASSIFICATION NAME
      "search": {
        "type": "NOT",
        "keywords": ["somevalue"] //THE VALUE TO EXCLUDE
      }
    },{
      "id" : "sitesection",
      "top" : 1
    }],
    "dateGranularity": "minute:1",
    "dateFrom": "-1 minute"
  }
}
But in the JSON response I still see "somevalue", as if it had not been excluded.
The strange thing is that if I remove the breakdown (by sitesection), the classification filter seems to work fine.
Is it not possible to use a classification filter when a breakdown is used in a real-time report? I can't find any documentation about that.
Another thing: if I request a report with the classification but without any search, I receive a response, but it contains a lot of "::Unspecified::" entries. These "::Unspecified::" entries seem to be the most recent data that Omniture received from my web pages. I think this means that classifications are not applied in real time, even though you can use them in a real-time report.

Wrong intent in Alexa Skill Request when using the simulator

I set up my intents using this intent schema:
{
  "intents": [
    {
      "intent": "StartIntend"
    },
    {
      "intent": "AMAZON.YesIntent"
    },
    {
      "intent": "AMAZON.NoIntent"
    }
  ]
}
My sample utterances look like this (they are in German):
StartIntend Hallo
StartIntend Moin
StartIntend Guten Tag
Why does the Amazon Developer Console generate the following request, when I use the utterance "Yes" or "Ja"?
{
  "session": {
    "sessionId": "SessionId...",
    "application": {
      "applicationId": "amzn1.ask.skill...."
    },
    "attributes": {},
    "user": {
      "userId": "amzn1.ask.account...."
    },
    "new": true
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "EdwRequestId...",
    "locale": "de-DE",
    "timestamp": "2017-02-17T21:07:59Z",
    "intent": {
      "name": "StartIntend",
      "slots": {}
    }
  },
  "version": "1.0"
}
Whatever I enter, it always uses the intent StartIntend.
Why is that? What have I forgotten or done wrong?

The schema and utterances look correct.
I tried to reproduce what you are seeing with the following steps:
1. Copied them as-is into a new skill on my account.
2. Selected the North America region on the Configuration page.
3. Set the Lambda to point to an existing Lambda that I have. For testing purposes I just need a valid ARN; I'm going to ignore the response anyway.
4. Entered "Yes" into the service simulator.
It did indeed send the AMAZON.YesIntent to the Lambda, so I conclude that there is nothing wrong with the data you posted.
I tried entering "Ja", which resulted in StartIntend, but I would expect that, since "Ja" is not "Yes" in North America.
Have you set the region to Europe, and entered a Lambda for the Europe region?

I discussed this with Amazon Support. After some experiments, it turned out that you have to write "ja" in lowercase. It seems to be a bug in the simulator itself.

When creating the skill in the Alexa Skills Kit, you need to choose the correct language, i.e. German.
Everything else seems to be correct.

Serialized Entities displaying only ID

I'm using JMSSerializer and FOSRestBundle. I have a fairly typical object graph, including some recursion.
What I would like to accomplish is that included objects, either beyond a certain depth or in general, are listed only with their ID, but are serialized with all their data when requested directly.
So, for example:
Users => Groups => Users
When requesting /user/1, the result should be something like:
{ "id": 1, "name": "John Doe", "groups": [ { "id": 10 }, { "id": 11 } ] }
While when I request /group/10 it would be:
{ "id": 10, "name": "Groupies", "users": [ { "id": 1 }, { "id": 2 }, { "id": 4 } ] }
With @MaxDepth I can hide the included arrays completely, so I get
{ "id": 1, "name": "John Doe", "groups": [] }
But I would like to include just the IDs, so that the REST client can fetch the objects if it needs them, consult its cache, or do whatever else it wants.
I know I can cobble this together manually using groups, but for consistency reasons I was wondering whether I can somehow enable this behaviour across my entire application, maybe even with a reference to the max depth, so that I can control where to include IDs and where to include full objects.

For the sake of those finding this:
I found no other solution, but doing this with groups works just fine and gives me the result I was looking for.
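To make the target output concrete, here is a small language-neutral illustration in Python. It is not JMSSerializer code and does not use groups; `serialize` and `max_depth` are hypothetical names. It only shows the rule "full data down to a given depth, bare IDs below it", which also terminates on the Users => Groups => Users cycle.

```python
from typing import Any, Dict


def serialize(entity: Dict[str, Any], max_depth: int = 1, depth: int = 0) -> Dict[str, Any]:
    """Serialize fully down to max_depth; below that, emit only the id."""
    if depth >= max_depth:
        return {"id": entity["id"]}
    result: Dict[str, Any] = {}
    for key, value in entity.items():
        if isinstance(value, dict):
            # A single related object.
            result[key] = serialize(value, max_depth, depth + 1)
        elif isinstance(value, list):
            # A collection of related objects (or plain values).
            result[key] = [
                serialize(item, max_depth, depth + 1) if isinstance(item, dict) else item
                for item in value
            ]
        else:
            result[key] = value
    return result


# A cyclic graph: the user is in the group and the group lists the user.
john = {"id": 1, "name": "John Doe", "groups": []}
groupies = {"id": 10, "name": "Groupies", "users": [john]}
john["groups"].append(groupies)

print(serialize(john))
print(serialize(groupies))
```

In the PHP application, serialization groups produce the same shape, for example by putting only the id property in the group that is used for nested objects.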
