Forms Recognizer can't identify fields without : as keys - microsoft-cognitive

I've been using Forms Recognizer for some days now and can't get it to recognize the keys in my forms.
I want to use it to extract the answers given by students in a test...here is an example.
I can't change the structure of the sheet students fill because it is a national exam and I don't have access to who organizes it.
So I trained a model as recommended on Microsoft documentation and used it to "read" the forms and it gets most of the answers, but it all comes as values of a key "Tokens"
{
"key": [
{
"text": "__Tokens__",
"boundingBox": [
0,
0,
0,
0,
0,
0,
0,
0
]
}
],
"value": [
{
"text": "01",
"boundingBox": [
110.1,
826.6,
125.6,
826.6,
125.6,
816.8,
110.1,
816.8
],
"confidence": 1
},
{
"text": "A",
"boundingBox": [
148.2,
834.4,
160.6,
834.4,
160.6,
816.8,
148.2,
816.8
],
"confidence": 1
},
{
"text": "26",
"boundingBox": [
229.4,
828.6,
246,
828.6,
246,
816.8,
229.4,
816.8
],
"confidence": 1
},
{
"text": "B",
"boundingBox": [
268.6,
834.4,
277.8,
834.4,
277.8,
816.8,
268.6,
816.8
],
"confidence": 1
}
Then I recreated the structure on excel but with : after the numbers and trained another model. I also printed some copies of it and filled in to test and Form Recognizer understood the numbers as keys.
{
"key": [
{
"text": "01:",
"boundingBox": [
270.4,
1625.4,
313,
1625.4,
313,
1600.5,
270.4,
1600.5
]
}
],
"value": [
{
"text": "A",
"boundingBox": [
350.7,
1620.9,
368.8,
1620.9,
368.8,
1587,
350.7,
1587
],
"confidence": 1
}
]
},
{
"key": [
{
"text": "26:",
"boundingBox": [
520.2,
1624.2,
552.8,
1624.2,
552.8,
1600.5,
520.2,
1600.5
]
}
],
"value": [
{
"text": "E",
"boundingBox": [
604.6,
1618.8,
625.8,
1618.8,
625.8,
1587,
604.6,
1587
],
"confidence": 1
}
]
}
Does anyone know some way to recognize the number fields as keys without the : ?

Form Recognizer will not consider the row numbers as keys unless specifically marked as keys, hence it currently does not discover them as keys.

Related

Here.com Route API - V8 - State Mileage

We're trying to migrate from routes v7 to routes v8. Using v8, how can we get a breakdown of miles per US State?
In version 7.2 we could do
https://route.ls.hereapi.com/routing/7.2/calculateroute.json?apiKey=API_KEY&mode=fastest;truck&excludecountries=MEX,CAN&metricSystem=imperial&routeattributes=sm,sc&instructionFormat=text&truckType=tractorTruck&trailersCount=1&waypoint0=geo!33.90251,-81.13206&waypoint1=geo!39.80203,-105.08759
And the state codes would be in the summaryByCountry element:
"summaryByCountry": [
{
"distance": 189887,
"trafficTime": 8239,
"baseTime": 8206,
"flags": [
"motorway",
"builtUpArea"
],
"text": "The trip takes 118 mi and 2:17 h.",
"travelTime": 8206,
"country": "South Carolina",
"_type": "RouteSummaryByCountryType"
},
...
In version 8, a similar request:
https://router.hereapi.com/v8/routes?apiKey=API_KEY&origin=32.20618,-110.96474&destination=40.391537,-104.681168&routingMode=fast&transportMode=truck&avoid[features]=ferry&exclude[countries]=MEX,CAN&units=imperial&return=polyline,summary,actions,instructions&spans=countryCode,length,truckAttributes,notices&truck[trailerCount]=1&via=40.014984,-105.270546
"spans": [
{
"offset": 0,
"truckAttributes": [
"open"
],
"length": 1460740,
"countryCode": "USA"
},
{
"offset": 14050,
"truckAttributes": [
"open",
"tollRoad"
],
"length": 272,
"countryCode": "USA"
},
{
"offset": 14053,
"truckAttributes": [
"open"
],
"length": 23153,
"countryCode": "USA"
}
v7: summaryByCountry
v8: Not present. If spans are requested with spans=countryCode,length, then information about the distance in each country can be retrieved. No plans to support in any other manner.
Response in v8 contains :
spans": [
{
"offset": 0,
"truckAttributes": [
"open"
],
"length": 76817,
"countryCode": "USA"
}
],
link to migration guide : https://developer.here.com/documentation/routing-api/migration_guide/index.html

Cuckoo report , meaning of report

'''
{
"category": "process",
"status": 1,
"stacktrace": [],
"api": "CreateThread",
"return_value": 176,
"arguments": {
"thread_identifier": 1228,
"function_address": "0x004021ce",
"flags": 0,
"parameter": "0x0012fef0",
"stack_size": 0
},
"time": 1647881189.051382,
"tid": 1836,
"flags": {}
}
'''
this is part of report of cuckoo,what's the meaning of the "status":1,sometimes it's "status":0.

Trouble accessing AWS Rekognition facial analysis using ``paws`` package

This is not a data analysis issue, so I don't have a data to reproduce.
I installed the paws package from this Github page to extract facial features (i.e., smile) via Amazon Rekognition. I am doing it as a part of a study to test performance across Microsoft Azure and Face++. By the way, I replaced "AccessKeyHere" and "SecretKeyHere" with appropriate security IDs.
library(paws)
Sys.setenv(
AWS_ACCESS_KEY_ID = "AccessKeyHere",
AWS_SECRET_ACCESS_KEY = "SecretKeyHere",
AWS_REGION = "us-east-1"
)
ec2 <- paws::ec2()
resp <- ec2$run_instances(
ImageId = "ami-f973ab84",
InstanceType = "t2.micro",
KeyName = "default",
MinCount = 1,
MaxCount = 1,
TagSpecifications = list(
list(
ResourceType = "instance",
Tags = list(
list(Key = "webserver", Value = "production")
)
)
)
)
Unfortunately, I get this error:
Error: InvalidKeyPair.NotFound: The key pair 'default' does not exist
I tried following through the Setting Up Credentials document in the Github page without success.
The results I want would look something along the lines of this (taken directly from Amazon demo):
{
"FaceDetails": [
{
"BoundingBox": {
"Width": 0.20394515991210938,
"Height": 0.4204871356487274,
"Left": 0.1556132435798645,
"Top": 0.11629478633403778
},
"AgeRange": {
"Low": 20,
"High": 38
},
"Smile": {
"Value": true,
"Confidence": 98.88771057128906
},
"Eyeglasses": {
"Value": true,
"Confidence": 99.87944030761719
},
"Sunglasses": {
"Value": true,
"Confidence": 99.51188659667969
},
"Gender": {
"Value": "Female",
"Confidence": 99.98441314697266
},
"Beard": {
"Value": false,
"Confidence": 99.99455261230469
},
"Mustache": {
"Value": false,
"Confidence": 99.99205017089844
},
"EyesOpen": {
"Value": true,
"Confidence": 100
},
"MouthOpen": {
"Value": true,
"Confidence": 99.64435577392578
},
"Emotions": [
{
"Type": "ANGRY",
"Confidence": 0.5140029191970825
},
{
"Type": "DISGUSTED",
"Confidence": 0.36493897438049316
},
{
"Type": "SURPRISED",
"Confidence": 1.5832388401031494
},
{
"Type": "CALM",
"Confidence": 7.553433418273926
},
{
"Type": "CONFUSED",
"Confidence": 2.7683539390563965
},
{
"Type": "SAD",
"Confidence": 0.1280381977558136
},
{
"Type": "HAPPY",
"Confidence": 87.08799743652344
}
],
"Landmarks": [
{
"Type": "eyeLeft",
"X": 0.23317773640155792,
"Y": 0.2868470251560211
},
{
"Type": "eyeRight",
"X": 0.3252476453781128,
"Y": 0.27732565999031067
},
{
"Type": "mouthLeft",
"X": 0.2494768351316452,
"Y": 0.4339924454689026
},
{
"Type": "mouthRight",
"X": 0.32560691237449646,
"Y": 0.42571622133255005
},
{
"Type": "nose",
"X": 0.29963040351867676,
"Y": 0.3560841381549835
},
{
"Type": "leftEyeBrowLeft",
"X": 0.18990693986415863,
"Y": 0.25858017802238464
},
{
"Type": "leftEyeBrowRight",
"X": 0.2559714913368225,
"Y": 0.23907452821731567
},
{
"Type": "leftEyeBrowUp",
"X": 0.22477854788303375,
"Y": 0.23571543395519257
},
{
"Type": "rightEyeBrowLeft",
"X": 0.3101874887943268,
"Y": 0.23408983647823334
},
{
"Type": "rightEyeBrowRight",
"X": 0.3540191650390625,
"Y": 0.24142536520957947
},
{
"Type": "rightEyeBrowUp",
"X": 0.3341374397277832,
"Y": 0.2246120721101761
},
{
"Type": "leftEyeLeft",
"X": 0.21425437927246094,
"Y": 0.28872400522232056
},
{
"Type": "leftEyeRight",
"X": 0.2506107687950134,
"Y": 0.28627288341522217
},
{
"Type": "leftEyeUp",
"X": 0.23298975825309753,
"Y": 0.2797400951385498
},
{
"Type": "leftEyeDown",
"X": 0.2338254302740097,
"Y": 0.29329705238342285
},
{
"Type": "rightEyeLeft",
"X": 0.3053741455078125,
"Y": 0.2805119752883911
},
{
"Type": "rightEyeRight",
"X": 0.33686137199401855,
"Y": 0.2753002941608429
},
{
"Type": "rightEyeUp",
"X": 0.3239244222640991,
"Y": 0.2698554992675781
},
{
"Type": "rightEyeDown",
"X": 0.32346177101135254,
"Y": 0.28338298201560974
},
{
"Type": "noseLeft",
"X": 0.27390313148498535,
"Y": 0.37751662731170654
},
{
"Type": "noseRight",
"X": 0.3062724471092224,
"Y": 0.373584508895874
},
{
"Type": "mouthUp",
"X": 0.29330143332481384,
"Y": 0.4100639820098877
},
{
"Type": "mouthDown",
"X": 0.2929871082305908,
"Y": 0.4546505808830261
},
{
"Type": "leftPupil",
"X": 0.23317773640155792,
"Y": 0.2868470251560211
},
{
"Type": "rightPupil",
"X": 0.3252476453781128,
"Y": 0.27732565999031067
},
{
"Type": "upperJawlineLeft",
"X": 0.14384371042251587,
"Y": 0.3039131164550781
},
{
"Type": "midJawlineLeft",
"X": 0.1776188313961029,
"Y": 0.4594067335128784
},
{
"Type": "chinBottom",
"X": 0.2889330983161926,
"Y": 0.5328735709190369
},
{
"Type": "midJawlineRight",
"X": 0.3430669903755188,
"Y": 0.441012978553772
},
{
"Type": "upperJawlineRight",
"X": 0.3498701751232147,
"Y": 0.28120794892311096
}
],
"Pose": {
"Roll": -4.4155192375183105,
"Yaw": 10.105213165283203,
"Pitch": 0.32932278513908386
},
"Quality": {
"Brightness": 60.6755256652832,
"Sharpness": 94.08262634277344
},
"Confidence": 99.99998474121094
}
]
}
If I could advance to this stage, it would be fantanstic. But it would be even nicer if the extracted data could look consistent with my Microsoft Azure results:
anger contempt disgust fear happiness neutral sadness surprise
emotion 0 0 0 0 0 1 0 0
emotion1 0 0 0 0 0 0.997 0.002 0
emotion2 0 0.001 0 0 0 0.994 0.004 0.001
emotion3 0 0 0 0 0 0.965 0.035 0
The error is with this line:
KeyName = "default",
It is referring to an Amazon EC2 Key Pair that should be attached to the Amazon EC2 instance. However, there is no keypair named default. Therefore, it fails.
To fix it, instead of default you should use the name of a Keypair that has been created. You can see a list of keypairs in the EC2 management console. You could also remove this line (without specifying a KeyName), but then you would not be able to login to the instance.

Watson doesn't get zeros

Let's say I have a conversation services configured in IBM Watson ready to recognize a number given in words and in pieces. For example, if I have the number 1320, it can be sent as thirteen twenty or thirteen two zero, etc.
In the first case I'll get something like this from a conversation service:
{
// ...
"entities": [
{
"entity": "sys-number",
"location": [
0,
5
],
"value": "13",
"confidence": 1,
"metadata": {
"numeric_value": 13
}
},
{
"entity": "sys-number",
"location": [
6,
12
],
"value": "20",
"confidence": 1,
"metadata": {
"numeric_value": 20
}
}
]
// ...
}
In the second case (thirteen two zero):
{
// ...
"entities": [
{
"entity": "sys-number",
"location": [
0,
5
],
"value": "13",
"confidence": 1,
"metadata": {
"numeric_value": 13
}
},
{
"entity": "sys-number",
"location": [
6,
14
],
"value": "2",
"confidence": 1,
"metadata": {
"numeric_value": 2
}
}
]
// ...
}
The big question here is: Where is my zero?
I know this question has been asked more than once, but none of the answers I found solved my current issues.
I've seen examples where a regular expression could be used, but that's for actual numbers, here I have words and Watson is the one who actually guesses the number.
Is there a way to obtain a third entry in my entities for that zero? or another work arround? or configuration I may be lacking?
I just tried this in Watson Assistant and it now gets the zero. With #sys-numbers enabled my utterance is thirteen two zero and I get these entities back:
entities: [
0: {
entity: "sys-number",
location: [0, 8],
value: "13",
confidence: 1,
metadata: {numeric_value: 13}
}
1: {
entity: "sys-number",
location: [9, 12],
value: "2",
confidence: 1,
metadata: {numeric_value: 2}
}
2: {
entity: "sys-number",
location: [13, 17],
value: "0",
confidence: 1,
metadata: {numeric_value: 0}
}
]
It might be the entities matching has been improved.

Oracle Reading a JSON_LIST of JSON_LIST using PLJSON

I'm trying to read a JSON object with has nested lists. Which looks like this:
[{
"id": 70070037001,
"text": "List 1",
"isleaf": 0,
"children": [
{
"oid": 100,
"text": "Innerlistobject100",
"isleaf": 0,
"children": [
{
"sid": 1000,
"text": "Innerlistobject1000",
"isleaf": 1
},
{
"sid": 2000,
"text": "Innerlistobject2000",
"isleaf": 1
}
]
},
{
"oid": 200,
"text": "Innerlistobject200",
"isleaf": 0,
"children": [
{
"sid": 1000,
"text": "Innerlistobject1000",
"isleaf": 1
},
{
"sid": 2000,
"text": "Innerlistobject2000",
"isleaf": 1
}
]
}
]
}]
ref: https://sourceforge.net/p/pljson/discussion/935365/thread/375c0293/ - where the person is creating the object, but I want to do the opposite and read it.
Do I have to iterate like this (note name is children within children):
Declare
l_Children_List json_list;
JSON_Obj json;
l_Child_JSON_Obj json;
Begin
IF (JSON_Obj.exist ('children')) THEN
IF (JSON_Obj.get ('children').is_array)
l_Children_List := json_list (JSON_Obj.get ('children'));
FOR i IN 1 .. l_Children_List.COUNT
IF (JSON_Obj.exist ('children')) THEN
IF (JSON_Obj.get ('children').is_array)
l_Children_List := json_list (JSON_Obj.get ('children'));
FOR i IN 1 .. l_Children_List.COUNT
jSON_child_val := l_Children_List.get (i);
l_Child_JSON_Obj := json (jSON_child_val );
LOOP
End If;
LOOP
End If;
End;
with json_example as (
select '{
"id": 70070037001,
"text": "List 1",
"isleaf": 0,
"children": [
{
"oid": 100,
"text": "Innerlistobject100",
"isleaf": 0,
"children": [
{
"sid": 1000,
"text": "Innerlistobject1000",
"isleaf": 1
},
{
"sid": 2000,
"text": "Innerlistobject2000",
"isleaf": 1
}
]
},
{
"oid": 200,
"text": "Innerlistobject200",
"isleaf": 0,
"children": [
{
"sid": 1000,
"text": "Innerlistobject1000",
"isleaf": 1
},
{
"sid": 2000,
"text": "Innerlistobject2000",
"isleaf": 1
}
]
}
]
}' as json_document
from dual
)
SELECT tab.*
FROM json_example a
join json_table (a.json_document, '$'
COLUMNS
(id NUMBER PATH '$.id'
,text VARCHAR2(50) PATH '$.text'
,isleaf NUMBER PATH '$.isleaf'
,NESTED PATH '$.children[*]'
COLUMNS
(oid NUMBER PATH '$.oid'
,otext VARCHAR2(150) PATH '$.text'
,oisleaf NUMBER PATH '$.isleaf'
,NESTED PATH '$.children[*]'
COLUMNS
(sid NUMBER PATH '$.sid'
,stext VARCHAR2(250) PATH '$.text'
,sisleaf NUMBER PATH '$.isleaf'
)
)
)
) tab on 1=1

Resources