What are the differences between Computer Vision API v1.0 and v2.0? - microsoft-cognitive

Both have their own documentation and I see only small wording differences between those. Is there a list of things that have actually changed? Has OCR, for example, improved in version 2.0, or is it the same apart from, I guess, the handwriting recognition? Some kind of changelog would really make a difference.

The only difference between v1.0 and v2.0 is the revised /recognizeText endpoint, which has a breaking change in its input/output. All other endpoints are exactly the same. Also, if you have a key in an up-to-date pricing tier (including free), your API key will work in both versions.
As you may know, the Computer Vision API has two different OCR endpoints. The /ocr endpoint runs the older recognition engine with broader language coverage. The newer /recognizeText endpoint, which in v1.0 handled handwritten text, in v2.0 covers both handwritten and printed text using a newer engine. The /recognizeText endpoint remains English-only for now. You select between handwritten/printed modalities using the mode query parameter. See documentation here.
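As a rough illustration of selecting the modality with the mode query parameter, here is a minimal Python sketch that only builds the request URL and headers and sends nothing. The regional host pattern, the helper name build_recognize_text_request, and the mode values "Handwritten" and "Printed" are assumptions on my part and should be checked against the documentation.

```python
from urllib.parse import urlencode

# Assumed mode values for the v2.0 /recognizeText endpoint; verify against the docs.
ALLOWED_MODES = ("Handwritten", "Printed")


def build_recognize_text_request(region, api_key, mode="Printed", version="v2.0"):
    """Return (url, headers) for a /recognizeText call without sending it."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    # Assumed host pattern for a regional Cognitive Services endpoint.
    base = f"https://{region}.api.cognitive.microsoft.com/vision/{version}/recognizeText"
    url = f"{base}?{urlencode({'mode': mode})}"
    headers = {
        # The subscription key is passed as a request header.
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    }
    return url, headers


url, headers = build_recognize_text_request("westus", "YOUR_KEY", mode="Handwritten")
print(url)
```

The returned pair can be handed to whatever HTTP client you already use. Keeping the version as a parameter makes it easy to compare the v1.0 and v2.0 paths, but remember that the input/output of /recognizeText differs between them.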
As for documenting changes, there isn't one obvious place for this unfortunately. One option is to check the swagger repo.

Related

Does the Bokeh library have a JavaScript API?

It has a so-called JS client, but all the docs and demos are written from the point of view of a Python developer.
Does Bokeh have a standalone, non-Python JavaScript API, and is it used by anyone in non-Python environments?
Does Bokeh have a standalone, non-Python JavaScript API
As of late 2019: Somewhat! (See below for more context)
and is it used by anyone in non-python environments?
Yes, definitely, though pure-BokehJS usage levels are still low compared to Python APIs. Improving the JS story is a 2020 goal.
A little history
The Bokeh project was started in 2012 with the explicit goal of providing Python developers a way to publish interactive visualizations on the web, without themselves having to get into "web tech", i.e. JavaScript. As such, the BokehJS library (which has always existed) was originally a largely undocumented implementation detail. It didn't really help that the Bokeh developers themselves were not JS experts at the time. (Some of us still are not!)
As things progressed, and features like CustomJS callbacks and the ability to make custom extensions were added, the BokehJS side of things became more and more publicly exposed. That said, until fairly recently, BokehJS development has been very fast and furious and we were not in any position to provide guarantees around core API stability or expend resources on documentation that would likely be out of date very quickly. As two examples, in the last year BokehJS was completely re-written in TypeScript, which rendered any old CoffeeScript extensions or callbacks deprecated. Additionally the entire layout system was re-vamped to afford much higher performance.
Current status
For some time, there has been a fairly stable "high level" API for BokehJS, and you can find details of that in the Developing with JavaScript chapter of the users guide. Additionally, all the low level "models" and their properties are 100% aligned between Python and JS, so the Python Reference Guide actually has all the information you might need to use models on the JS side as well.
We are very interested in improving BokehJS for pure JS usage in the coming year. We have been getting some very helpful issues from folks actually using BokehJS directly. Some major hurdles will be overcome with the upcoming 2.0 release, but there will still be work to do to really provide a great user-experience for JS devs. This is actually a fantastic opportunity for any interested JS devs to have a big impact by offering their input, advice, and collaboration. Anyone so interested should head over to the Bokeh project Discourse.

recognizing QR codes with the Computer Vision API

I'm in the design stages of writing a Xamarin Forms app that will process camera pictures of legal documents. Is it possible to use the Computer Vision API to detect QR codes and parse them?
I was looking for the same thing and reviewed the documentation again, but didn't find any suitable operation within Computer Vision up to version 2.1. In the end I settled for this service: http://goqr.me/api/ which does exactly that and seems to work quite well. I tried ZXing, too, but with unsatisfactory results.
Hope that helps.
Joerg.

Realm and RxSwift connectivity

I've been looking at options for persistence when using RxSwift, and Realm was looking attractive due to its relative simplicity and the availability of some extensions in the community repo.
Unfortunately, although I can get Realm and RxSwift working nicely in Xcode 8b6, things go seriously wrong as soon as you try to connect them together, as RxRealm does not currently compile (there seems to be more going wrong with it than the Grand Renaming, as far as I can tell).
Is there a workaround that is reliable? I can't believe for a moment that there isn't, I just can't find a resource at present. I was thinking of converting the Result object into a Set or Array and making this Observable, but I'm not sure if the contents (Realm Objects) are going to be handled correctly. Knowing my luck, I suspect not!
There's a Pull Request towards the RxRealm project adding Swift 3 support: https://github.com/RxSwiftCommunity/RxRealm/pull/26
I suggest you try using that.
More generally, targeting an Xcode beta will by definition give you a less stable software ecosystem, since no one is submitting apps with that and it's a moving target (often with weekly breaking changes). So if you want stable software, use stable tools. Realm and RxRealm both support Swift 2.2 quite well, so using that will give you the best experience.

Using Windows Tablet PC Input to implement handwriting recognition

I want to write an app (initially for Windows) that includes handwriting-to-text recognition. I want to use the Windows built-in Tablet PC Input. My question is: is there a way to capture the strokes as an image, send these to the OCR engine used by the Tablet Input, and return the recognised text?
Or, are there any good open source handwriting libraries that could be used directly?
The primary development language is Qt.
I am not aware of any open source or free software libraries for handwriting recognition, so I wrote an adapter. My target was my tablet PC running Linux, but part of my solution can also be used directly on Windows, although you will need to adapt it to your needs.
You will need to read through the licenses for the components I used and validate your own use of them.
The source is available here: Ink2Text project
Part of this solution is a server which uses the XP Handwriting Recognition libraries to interpret the strokes which make up handwriting. As an aside, this does not use OCR - it uses connected graphs of the flow of the strokes.
Another complementary project provides a client handwriting widget: Stylus/Handwriting Input Panel. This is written in Java, and it's GPL3. It accepts the handwriting and sends it off to the server. Unless you wish to use it as is, it's of value solely to see the data format for the ink, although that's simple enough and you can probably deduce that with just the Ink2Text source code.
An earlier solution used the S/HIP with my MS Ink Server, which accepted input over regular network connections. That may also be useful depending on your architecture, but requires a running copy of Windows.
This system provides very good recognition of printed and cursive handwriting.
I will answer questions about it only in its associated SourceForge forums, so that others may benefit from the answers as well - please don't ask here.
Cheers,
Bret
I want to be wrong, but unfortunately, there is no available open-source offline handwriting recognition system even close to MS' or Apple's Ink.
On Windows you can play with Ink Recognition (About Handwriting Recognition, Advanced Recognition Sample). A C++ interface is available, but it is not as well documented as the .NET implementation, so you will need to put in more effort and do a lot of research to achieve what you want.
For other systems (Windows included) there is a way to use Tesseract-OCR with your application. See Tesseract's base API. For better recognition quality, you can train Tesseract and use your own trained data.
If you do not want to spend your time on the R&D tasks above, you can use paid solutions such as MyScript SDK, WritePad SDK and so on.

Text to Speech in ASP.NET

I would like to do some Japanese text-to-speech on my dedicated Windows 2003 x64 server with the .NET Framework, using C#.
I found something on Google, but it requires installing a lot of files on the server, which I don't like for stability reasons. Is there another option, like a linked DLL or something?
You can use Microsoft Speech SDK. It's a set of COM APIs containing TTS and SR engines. I'm not sure if it contains Japanese TTS though.
What you most likely want is the Microsoft Speech Server, especially if your website is going to encounter any decent load or volume.
From the site:
"A speech platform, MSS contains all the server components for deploying telephony (voice-only) and multimodal (voice/visual) applications. MSS combines Web technologies, speech-processing services, and telephony capabilities into a single system."
There is also a dedicated Microsoft Speech community which will likely help you get started in this realm. Also, I'm not sure what the latest version is...2004 R2?
This article has a decent diagram outlining the various components. Looks like a good fit for integration with an ASP Web Application.
Using SAPI in an ASP.NET website is impossible: the sound would be played on the server :S
It seems that Microsoft Speech Server is needed
...
Or not? With ASP.NET it is possible to run a command-line exe on the server to save an mp3, then stream that mp3, right? (How to do that? I will try to figure it out.)
I will go this way and let you know the result :)
Edit: this is how I solved it:
How to save text-to-speech as a wav with Microsoft SAPI?
I save the generated voice to a wav file, then embed it in the page and play it in a Flash player.
COOL!!
Use the Microsoft Speech Library and see the article Text to Speech with the Microsoft Speech Library and SDK version 5.1 on CodeProject. Also see Giving Computers a Voice on Coding4Fun.
The System.Speech.Synthesis namespace has been part of the framework since .NET 3.0. However, it has internal dependencies on the Speech SDK COM libraries (it chooses the correct version depending on the host OS), so I would recommend prototyping the work before you jump in.
The class you should probably look at first is System.Speech.Synthesis.SpeechSynthesizer (whitepaper and example code)
Warning: I have personally experienced issues using the speech APIs in an ASP.NET environment whereby the request that returned the audio data never returned. Despite heavy debugging I was never able to resolve the issue and the feature was dropped. I have had an unresolved support case with Microsoft for 12 months now.
