I am using a Form Recognizer custom model with the labelling tool, https://fott.azurewebsites.net/.
I am trying to OCR an NRIC card, which contains gender information with a value like 'M' or 'F'.
However, the labelling tool fails to OCR this value in some images.
May I know the root cause? Is it because the value is a single character, or could there be another reason?
Is there any way to fix the problem?
If OCR did not detect the character, then you won't be able to label it in the FOTT tool. This looks like an OCR miss on the single-character gender value.
Could you also try https://fott-preview.azurewebsites.net/, which has a more recent OCR version?
I've read many pages that point out that many office applications allow for this by typing the code followed by Alt + X, but frequently, I want to insert a symbol when I'm not in one of those applications. Is there a universal way to achieve this?
The character map is useless, unless you have time to manually search through all the characters available.
I posted the question at Super User, and basically, the response I got there was to use Alt codes for the symbols. However, I discovered that, on the whole, these only work for the first 256 Alt codes. So basically, the answer to my question is "No, there's not a good way."
I have been exploring Azure Form Recognizer for one of my projects, where we want to perform OCR on handwritten text.
The problem is that when we give scanned images to the tool, it sometimes doesn't recognize the text written on them at all, even when it is clearly written. I tried multiple types of images, applying enhancement and using both black-and-white and colored copies, but it doesn't work.
Sometimes it recognizes the values of two fields as one. This leads to incorrect data, where one field is completely blank and the other holds the first field's value along with its own.
When there is NO VALUE in the tagged field in the testing data, it tries to read the value from some other place that is not even close to that field, or is sometimes untagged.
Could you please help with these queries?
Thanks in advance.
Can you please also share sample forms? Please make sure the data is anonymized and does not contain any real data.
Please contact customer service to debug this issue.
Thanks,
Neta - MSFT
I have a question: is it possible to convert a PostScript (PS) file into a Word (doc) file using ASP.NET? If yes, how can we do it in C# code?
I don't know of any tool that will convert PostScript to Word. Not only that, but you certainly can't reliably do anything except render the whole thing to an image and insert that as a graphic.
Up to a point you can extract text. What is it you actually want to do?
I'm trying to build a search that is similar to that on Google (with regards to exact match encapsulated in double quotes).
Let's use the following phrase for an example
"phrase search" single terms [different phrase]
Currently if I use the following code
Dim searchTermsArray As String() = searchTerms.Split(New String() {" ", ",", ";"}, StringSplitOptions.RemoveEmptyEntries)
For Each entry In searchTermsArray
Response.Write(entry & "<br>")
Next
my output is
"phrase
search"
single
terms
[different
phrase]
but what I really need is to build a key value pair
phrase search | table1
single | table1
terms | table1
different phrase | table2
where table1 is a table with general info, and table2 is a table of "tags" similar to that on stackoverflow.
Can anybody point me in the right direction on how to properly capture the input?
What you are trying to do is not trivial. Implementing a search "similar to Google's" goes far beyond parsing the search string.
I'd suggest not reinventing the wheel and instead using production-ready solutions such as Apache Lucene.NET or Apache Solr. These handle both parsing and full-text search.
But if you only need to parse this kind of string, then you should really consider the solution Pete pointed to.
Regex is your friend. See this question
Depending on how fancy you plan on getting, you might consider the search grammar/implementation that's included with Irony.
http://irony.codeplex.com/
Search string parsing is a non-regular problem. That means that while a regular expression can get deceptively close, it won't take you all the way there without using proprietary extensions, building an unmaintainable mess of an expression, leaving nasty edge cases open that don't work how you'd like, or some combination of the three.
Instead, there are three correct ways to handle this:
1. Use a third-party solution like Lucene.
2. Build a grammar via something like ANTLR.
3. Build your own state machine.
For a problem of this level (and assuming that search is core enough to what you're doing that you really want to implement it yourself), I'd probably go with option 3. This makes more sense when you realize that regular expressions are themselves instructions for how to set up state machines. All you're doing is building that right into your code. This should also let you tune performance and features without having to add a larger lexer component to your code.
For an example of how you might do this take a look at my answer to this question:
Reading CSV files in C#
What I would do is build a state machine to parse the string character by character. This will be the easiest way to implement a fully correct solution, and should also result in the fastest code.
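As a rough illustration of the state-machine approach, here is a minimal sketch in JavaScript (the same structure should port to VB.NET or C# without much change). The function name `parseSearchTerms`, the state names, and the decision to treat an unclosed quote or bracket as if it were closed are all assumptions made for this example; `table1` and `table2` are simply the names used in the question.

```javascript
// Table names taken from the question's desired output.
const GENERAL_TABLE = "table1";
const TAG_TABLE = "table2";

// Characters that separate single terms, matching the question's Split call.
const DELIMITERS = new Set([" ", ",", ";"]);

// Hypothetical helper: walks the input one character at a time and returns
// an array of { term, table } pairs.
function parseSearchTerms(input) {
  const results = [];
  let state = "outside"; // "outside" | "term" | "quoted" | "bracketed"
  let buffer = "";

  // Store the buffered text (if any) against a table and reset the buffer.
  const emit = (table) => {
    const text = buffer.trim();
    if (text.length > 0) results.push({ term: text, table });
    buffer = "";
  };

  for (const ch of input) {
    switch (state) {
      case "outside":
        if (ch === '"') state = "quoted";
        else if (ch === "[") state = "bracketed";
        else if (!DELIMITERS.has(ch)) { buffer = ch; state = "term"; }
        break;
      case "term":
        if (DELIMITERS.has(ch)) { emit(GENERAL_TABLE); state = "outside"; }
        else if (ch === '"') { emit(GENERAL_TABLE); state = "quoted"; }
        else if (ch === "[") { emit(GENERAL_TABLE); state = "bracketed"; }
        else buffer += ch;
        break;
      case "quoted":
        // Everything up to the closing quote is one phrase.
        if (ch === '"') { emit(GENERAL_TABLE); state = "outside"; }
        else buffer += ch;
        break;
      case "bracketed":
        // Everything up to the closing bracket is one tag.
        if (ch === "]") { emit(TAG_TABLE); state = "outside"; }
        else buffer += ch;
        break;
    }
  }

  // End of input: flush what is left. An unclosed quote or bracket is
  // treated as closed (an assumption; rejecting the input is also valid).
  if (state === "bracketed") emit(TAG_TABLE);
  else emit(GENERAL_TABLE);

  return results;
}

const parsed = parseSearchTerms('"phrase search" single terms [different phrase]');
for (const { term, table } of parsed) {
  console.log(`${term} | ${table}`);
}
// phrase search | table1
// single | table1
// terms | table1
// different phrase | table2
```

Each state only needs to know which characters end it, so adding features later (escaped quotes, a minus sign for exclusion) means adding a state or a branch rather than rewriting a pattern.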
Is there any reliable way to check whether a user has entered Arabic words into a form and is trying to submit it? Can JavaScript handle this, or can only a server-side script such as .NET do it?
I'm thinking that, if possible, the script should directly prevent the user from entering Arabic words into the form and show an alert pop-up.
Please share any examples if you have any idea how to do it.
Thanks
In Unicode, Arabic characters fall in specific ranges. You can use a regular expression in JavaScript to check whether a string contains any characters in those ranges. (You could also do that in C#.) Here's a really helpful tool that will let you select the ranges you want to search for and create a JS-compatible regex for that.
For example, [\u0600-\u06FF\u0750-\u077F] will match any characters that fall in the Unicode ranges for "Arabic" and/or "Arabic Supplement".
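A minimal sketch of that check, assuming only the two ranges given above are of interest. The function name `containsArabic` and the form wiring shown in the comment are hypothetical.

```javascript
// Matches any character in the "Arabic" (U+0600-U+06FF) or
// "Arabic Supplement" (U+0750-U+077F) Unicode blocks.
const ARABIC_PATTERN = /[\u0600-\u06FF\u0750-\u077F]/;

// Hypothetical helper: returns true if the text contains at least one
// character from the ranges above.
function containsArabic(text) {
  return ARABIC_PATTERN.test(text);
}

// In a browser, this could be hooked into a form's submit event, e.g.:
//
//   form.addEventListener("submit", (event) => {
//     if (containsArabic(input.value)) {
//       event.preventDefault();
//       alert("Arabic characters are not allowed in this field.");
//     }
//   });
//
// (`form` and `input` are assumed to be existing DOM elements.)
```

Because client-side checks can be bypassed, the same range test would normally be repeated on the server as well.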
You could use the Google Ajax Language API to detect this. Here is an example.