Google Cloud Translation offers Neural Machine Translation (NMT) and Phrase-Based Machine Translation (PBMT). The documentation states that the "nmt model is used if the language pair is supported, otherwise the PBMT model is used" (https://cloud.google.com/translate/docs/advanced/translating-text-v3).
Does the Google Translate web interface (https://translate.google.com) work the same way? Does it use exactly the same NMT model as the Cloud Translation API?
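For context, in the v3 API the model can also be named explicitly in the request rather than left to the default selection quoted above. The following is only a sketch that builds the request fields without making a network call; the helper names and the project ID are hypothetical, and the field names assume the v3 `TranslateTextRequest` layout used by the Python client.

```python
def nmt_model_path(project_id: str, location: str = "global") -> str:
    """Build the resource name of the general NMT model (v3 naming scheme)."""
    return f"projects/{project_id}/locations/{location}/models/general/nmt"


def build_translate_request(project_id: str, text: str, target: str,
                            location: str = "global") -> dict:
    """Assemble the fields of a v3 translateText request (no network access).

    The helper itself is hypothetical; it only shows where the model
    name goes in the request.
    """
    return {
        "parent": f"projects/{project_id}/locations/{location}",
        "contents": [text],
        "target_language_code": target,
        "mime_type": "text/plain",
        "model": nmt_model_path(project_id, location),
    }


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(build_translate_request("my-project", "Hello", "de"))
```

This says nothing about what translate.google.com uses internally; it only shows how the API side lets you pin the model.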
I am trying out Azure Cognitive Services OCR to scan an identity document. It works fairly well, but I was wondering whether it is possible to train the OCR engine, or somehow link it to a learning service, to improve character recognition?
I don't think you can train Azure OCR, but there is a newer Azure service called Form Recognizer, which gives better results than the previous OCR service and can also be trained on custom data.
I'm wondering which algorithms are used in Google Cloud Translation API to detect the language?
And how is the confidence level calculated?
Thank you for your help!
At this point, Google does not publicly disclose the model it uses for language detection in the Translation API.
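Although the detection model and the way the confidence is computed are not documented, the confidence value itself is returned with each detected language, so you can at least inspect it. Below is a minimal sketch that picks the most confident candidate from a response shaped like the v3 `detectLanguage` REST response (`languages[].languageCode` and `languages[].confidence`). The helper name is hypothetical and the sample values are made up for illustration.

```python
from typing import Tuple


def best_detection(response: dict) -> Tuple[str, float]:
    """Return (language code, confidence) of the most confident candidate."""
    languages = response.get("languages", [])
    if not languages:
        raise ValueError("response contains no detected languages")
    top = max(languages, key=lambda item: item["confidence"])
    return top["languageCode"], top["confidence"]


if __name__ == "__main__":
    # Made-up sample data, not real API output.
    sample = {
        "languages": [
            {"languageCode": "nl", "confidence": 0.25},
            {"languageCode": "en", "confidence": 0.75},
        ]
    }
    print(best_detection(sample))
```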
I am using the Google Translate neural network (amazing improvement) via the Google Cloud Translation API in SDL Trados to process technical translations.
Of course, the output needs heavy post-editing, mostly for terminology and sometimes for style. I would really like the neural network to learn from this post-editing, but there seems to be no way to feed my edits back.
It is possible when using the web interface manually (translate.google.com).
The Google Translator Toolkit (not updated in years) allowed the use of a shared public TM, but that is now obsolete with the neural network.
Can I somehow feed translations back to Google Cloud Translation API to train it?
Their FAQ states this:
"Does Google use my data for training purposes?
No, Google does not use the content you translate to train and improve our machine translation engine. In order to improve the quality of machine translation, Google needs parallel text - the content along with the human translation of that content."
As you pointed out, the documentation on confidentiality states that Google does not use your data for training as a background (transparent) process, for the following reasons:
Confidentiality: for confidentiality reasons, content submitted to the Translation API will not be used for training the model.
Non-feasibility: the neural network model behind the Translation API would need both the untranslated content and the translation suggested by the user in order to learn from it, so it is not possible to train the model with just the untranslated text.
Moreover, there is currently no way to suggest translations to the API in order to train the model in a more customized way.
As a side note, you might want to keep an eye on AutoML, a new Google Cloud Platform product that is still in alpha; you can request access by filling in the form on its main page. It will allow the creation of custom machine learning models without requiring the dedication and expertise that more complex products such as ML Engine demand. The first product of the AutoML family to be launched will be AutoML Vision, but it is possible that similar products will appear for some of the other ML-related APIs in the platform, such as the Translation API, which is the one you are interested in.
Also feel free to visit the Google Cloud Big Data and Machine Learning Blog from time to time to keep up with the latest news in this field. If you are interested in AutoML, its release and presentation will probably be covered in an article on the blog too.
So, in summary: no, you currently cannot feed suggested translations back to the Translation API, but in the future you might be able to do so, or at least have your own custom models.
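In the meantime, it may be worth keeping your post-edits as parallel text (each source segment paired with its corrected translation), since that is the kind of data the FAQ says training needs. Here is a minimal sketch that writes such pairs as tab-separated lines. The function name and the TSV layout are assumptions chosen for illustration, not a format the Translation API accepts today.

```python
import csv
import io
from typing import Iterable, Tuple


def to_parallel_tsv(pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (source, post-edited target) pairs as one TSV line each.

    Pairs with an empty side are skipped, because a segment without its
    translation is not parallel text.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if source and target:
            writer.writerow([source, target])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up example segments.
    edits = [
        ("Tighten the bolt.", "Ziehen Sie die Schraube fest."),
        ("Untranslated segment", ""),  # skipped: no target side
    ]
    print(to_parallel_tsv(edits), end="")
```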
I want to create an application in which I capture images of people in my family and identify who each person is. Can I use the Vision API to create a cloud database that stores different pictures of each family member, labelled with their names, so that when I send a request to the API it checks the image against the database and identifies which family member it is, rather than just detecting the faces in it? Can I train it to do so?
It is possible to train the classifier. More details can be found in this article: "How to train and classify images using Google Cloud Machine Learning and Cloud Dataflow".