Difference between the Google Translate API URLs

I am building an open-source Chrome extension based on Google Translate (here).
I have read the other questions about the Google Translate API (like this one and this one), but I still don't have my answer.
I found several URLs for Google Translate, like these:
https://translate.googleapis.com/translate_a/single?client=gtx&sl=en&tl=fr&dt=t&q=father&ie=UTF-8&oe=UTF-8
https://clients5.google.com/translate_a/t?client=dict-chrome-ex&sl=en&tl=fr&dt=t&q=father
It seems all the URLs are different combinations of three parts:
a base URL:
https://translate.googleapis.com/translate_a/
https://translate.google.com/translate_a/
https://clients5.google.com/translate_a/
the first path segment after translate_a/: either single or t
the client parameter, which can be gtx, t or dict-chrome-ex [or apparently any ID]
So far I have seen differences in the JSON returned.
This https://translate.googleapis.com/translate_a/single?client=gtx&sl=en&tl=fr&dt=t&q=father&ie=UTF-8&oe=UTF-8 returns this JSON:
[[["père","father",null,null,1]],null,"en"]
While this https://clients5.google.com/translate_a/t?client=dict-chrome-ex&sl=en&tl=fr&dt=t&q=father returns this JSON:
{"sentences":[{"trans":"père","orig":"father","backend":1},{"src_translit":"ˈfäT͟Hər"}],"dict":[{"pos":"noun","terms":["père"],"entry":[{"word":"père","reverse_translation":["father","dad","parent","papa"],"score":0.70910621,"previous_word":"le","gender":1}],"base_form":"father","pos_enum":1},{"pos":"verb","terms":["engendrer","concevoir"],"entry":[{"word":"engendrer","reverse_translation":["generate","engender","give rise to","beget","breed","father"],"synset_id":[52561],"score":0.00017133754},{"word":"concevoir","reverse_translation":["design","conceive","devise","plan","form","father"],"synset_id":[52561],"score":4.8327973e-05}],"base_form":"father","pos_enum":2}],"src":"en","alternative_translations":[{"src_phrase":"father","alternative":[{"word_postproc":"père","score":1000,"has_preceding_space":true,"attach_to_next_token":false}],"srcunicodeoffsets":[{"begin":0,"end":6}],"raw_src_segment":"father","start_pos":0,"end_pos":0}],"confidence":1,"ld_result":{"srclangs":["en"],"srclangs_confidences":[1],"extended_srclangs":["en"]},"query_inflections":[{"written_form":"father","features":{"number":2}},{"written_form":"fathers","features":{"number":1}}],"target_inflections":[{"written_form":"père","features":{"gender":1,"number":2}},{"written_form":"pères","features":{"gender":1,"number":1}},{"written_form":"père","features":{"number":2}},{"written_form":"pères","features":{"number":1}}]}
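Since the two endpoints return differently shaped JSON, code that may talk to either one needs to normalize the result. Below is a minimal Python sketch that does so using only the two sample payloads shown above (the second one trimmed). The function name extract_translation is made up for illustration, and reading the third array element as the detected source language is an assumption based on the sample, since these endpoints are undocumented.

```python
import json

# Sample payloads copied from the responses above (the second one is trimmed).
SINGLE_GTX = '[[["père","father",null,null,1]],null,"en"]'
T_DICT = '{"sentences":[{"trans":"père","orig":"father","backend":1},{"src_translit":"ˈfäT͟Hər"}],"src":"en"}'


def extract_translation(payload):
    """Return (translated_text, source_language) for either response shape.

    Hypothetical helper: the field meanings are inferred from the samples,
    not from any official documentation.
    """
    data = json.loads(payload)
    if isinstance(data, list):
        # Array shape: data[0] is a list of segments [translated, original, ...].
        text = "".join(seg[0] for seg in data[0] if seg and seg[0])
        # Assumption: the third element holds the detected source language.
        source = data[2] if len(data) > 2 else None
        return text, source
    # Object shape: "sentences" mixes translation and transliteration entries,
    # so only entries that carry a "trans" key are used.
    text = "".join(s["trans"] for s in data.get("sentences", []) if "trans" in s)
    return text, data.get("src")


print(extract_translation(SINGLE_GTX))
print(extract_translation(T_DICT))
```

Both calls yield the same pair for the sample data, so the rest of an extension would not need to care which URL produced the response.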
So my question is: other than the returned JSON, what are the differences between the combinations given above?
In which case should I use one rather than another? Is one of them deprecated, or does one support more requests?
For the meaning of the queries: https://stackoverflow.com/a/29537590/3154274

As for your actual question, I'm not sure there are any meaningful differences other than what you have stated, and it would be difficult to determine if and when any of them are deprecated.
Given that the APIs are undocumented and don't appear to be intended for this kind of use, I don't think any of them should be considered for the development of a real application.
However, for solving your problem of finding a free human language translation API, I would recommend the Azure Translator Text API which provides translation of 2 million characters per month as part of their free tier.
For your specific use case, where I assume there may be a high amount of duplicate translations, I feel that caching the results would provide a significant benefit in reducing your usage amount.
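To illustrate the caching idea, here is a small Python sketch of an in-memory cache wrapped around a translation call. CachedTranslator and fake_backend are hypothetical names; fake_backend only stands in for whichever real API you end up calling, and a real extension would probably want persistent storage rather than a dict.

```python
from typing import Callable, Dict, Tuple


class CachedTranslator:
    """Wrap a translate function so repeated requests skip the backend."""

    def __init__(self, translate: Callable[[str, str, str], str]):
        self._translate = translate
        self._cache: Dict[Tuple[str, str, str], str] = {}
        self.backend_calls = 0  # counts requests that actually hit the backend

    def translate(self, text: str, source: str, target: str) -> str:
        # Surrounding whitespace is stripped so "father " and "father" share an entry.
        key = (text.strip(), source, target)
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._translate(key[0], source, target)
        return self._cache[key]


def fake_backend(text: str, source: str, target: str) -> str:
    # Stand-in for a real API call (assumption: replace with your own client).
    return f"[{source}->{target}] {text}"


translator = CachedTranslator(fake_backend)
translator.translate("father", "en", "fr")
translator.translate("father ", "en", "fr")  # served from the cache
print(translator.backend_calls)
```

Only cache misses count against your quota, so the more duplicate lookups your users make, the more the free tier stretches.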

Related

How to specify gender in Google Cloud Translation API

I am using the Google Cloud Translation API in one of my projects. I want to specify the gender for the translation. I am unable to find anything about this in Google Cloud Translation. I have also searched a lot on the Internet but have not found any way to do this. I know how to specify the gender in the Google Text to Speech API using SSML, but I need it for the translation. Any help will be highly appreciated.
After much searching I have discovered that there is currently no way to do this.
I have made a feature request along these lines at the invitation of GCP support.
The documentation indicates that feature requests are prioritised by how often an issue is starred, so for now my best answer is to star the issue here so that they know how many people are interested in this.
Looking for the same...
As it is NMT (Neural Machine Translation), it reacts to context.
I tried many combinations and found that this works well so far (says, not 'to', not 'talk').
Examples are EN > ES
However, sometimes its effect doesn't reach far in the translation.
So you have to stick the 'prefix' before each sentence.
Sometimes you get irregular behavior (see lower case "estoy"). And when you change something irrelevant (to you, but not to the model) ... voilà!
So the final version (for now) is:
I guess the point is:
Understanding how it works (Machine Learning Language Models)
The model (algorithm) they use is evolving, so you need to keep an eye on it, as what works today may break tomorrow.
Once you get the response you will have to filter out your 'prefix', but that is not too difficult.
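A rough Python sketch of the prefix round trip described above: add the prefix to every sentence before sending, then cut it off the translated result. The prefix "She says:" is only a hypothetical example (the exact wording that works has to be found by experiment), and the Spanish string below is an illustrative input, not real API output. The sketch assumes the colon survives translation, which may not always hold.

```python
PREFIX = "She says:"  # hypothetical prefix; the wording that works best may differ


def add_prefix(sentences, prefix=PREFIX):
    """Put the context prefix in front of each sentence before translating."""
    return [f"{prefix} {sentence}" for sentence in sentences]


def strip_prefix(translated, separator=":"):
    """Drop everything up to the first separator in a translated sentence.

    If the separator did not survive translation, the text is returned as-is.
    """
    head, sep, tail = translated.partition(separator)
    return tail.strip() if sep else translated.strip()


print(add_prefix(["I am tired."]))
print(strip_prefix("Ella dice: Estoy cansada."))  # illustrative string only
```

Splitting on the separator rather than on the translated prefix text avoids having to know in advance how the prefix itself gets translated.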
Please comment if you find better ways (or the API gets updated).
Related info: https://ai.googleblog.com/2018/12/providing-gender-specific-translations.html

Translate API - different result from the web service

When using the Translation API, I get a different (and worse) translation than when I use translate.google.com.
I am working on a project for a client, and the client was dissatisfied with the translation and noticed the difference.
Do these two services use different engines? I read that the API uses nmt-mode now, and that translate.google.com already uses the same engine.
Both set to translate from Norwegian to English.
Any more information that can clear this up?
Thanks!
Differences between the results from translate.google.com and the Translation API are expected behavior; they can arise from maintenance tasks and from the logic used by the internal processes. However, the engines used by each service seem to be private information.
Based on this, it is normal to see some variance when using the API. As a workaround, you can use the model parameter to specify which of the available models to use; see the official "Specifying a model" documentation for detailed information about this option.
Almost 3 years later, the problem still remains!
So I was trying to translate a dataset with the Google Translate API, but in the end it failed to translate some texts to the target language (in my case, Persian/Farsi). So I decided to check them to see if there's a pattern and maybe translate them using the web version of Google Translate.
While doing so, I found that the web version could actually translate some of those untranslated texts, BUT not all. Looking for a reason for this behaviour, I found that most of them were names rather than sentences. But as we know, names can easily be written in the target language's characters as the translation. So why doesn't the API transliterate those names while the web version does? Perhaps this image explains it:
[Image: verified translation]
As can be seen, some translations have a badge indicating that the translation has been verified, while some others don't.
So to recap, my guess is that maybe the API is set to only use verified translations, but as for the web version, even unverified translations are allowed since you can edit or report them.

Google Translate API Pricing and Language Auto-detect Efficiency

I have the following three questions
I want to use Google's API to translate text. I know that Google charges separately for translation and detection. Google Translate also supports two ways to translate:
i) By specifying both source and target, as in
https://www.googleapis.com/language/translate/v2?key=INSERT-YOUR-KEY&source=en&target=de&q=Hello%20world&q=My%20name%20is%20Jeff
ii) By specifying just the target, where the source is auto-detected,
like this https://www.googleapis.com/language/translate/v2?key=INSERT-YOUR-KEY&target=de&q=Hello%20world
My question is: if I call the API as in the second example, will I be charged for both detection and translation, or just translation?
Is it more efficient to specify both source and target than just the target, or are there any downsides to using the second way above?
How many words should be sent to Google Translate API to detect a language reliably?
Thanks
I translate using the second approach most of the time (not telling Google the source language), and they only charge for the translation, not for the detection.
However, be aware that if your source text is already in the target language, Google will attempt to translate it anyway. Sometimes this leads to confusing results, or at least to an unnecessary translation, since you already had the text in the desired language.
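For reference, here is a small Python sketch that builds both kinds of request URL from the question (with and without source) and adds a guard for the same-language case mentioned above. The function names are made up for illustration, and the guard can only help when you already know the source language from somewhere else, such as your own metadata.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.googleapis.com/language/translate/v2"


def build_translate_url(key, target, texts, source=None):
    """Build a v2 request URL; omit `source` to let the API auto-detect it."""
    params = [("key", key)]
    if source is not None:
        params.append(("source", source))
    params.append(("target", target))
    # One q parameter per text, as in the first example URL of the question.
    params.extend(("q", text) for text in texts)
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


def needs_translation(known_source, target):
    """Skip the request when the text is already known to be in the target language."""
    return known_source is None or known_source != target


print(build_translate_url("INSERT-YOUR-KEY", "de", ["Hello world"]))
print(needs_translation("de", "de"))
```

Checking needs_translation before calling the API avoids paying for, and possibly getting odd output from, a translation you did not need.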

When would you use 'Real' translation messages in Symfony2?

The Symfony documentation says:
Using Real or Keyword Messages
This example illustrates the two different philosophies when creating messages to be translated:
$translated = $translator->trans('Symfony2 is great');
$translated = $translator->trans('symfony2.great');
< snip >
The choice of which method to use is entirely up to you, but the "keyword" format is often recommended.
http://symfony.com/doc/current/book/translation.html
So when would you use 'Real' messages?
You really have to decide for yourself. It's partly a matter of taste and partly a matter of your translation workflow.
Real messages are good when you don't want the overhead of maintaining an additional translation file (for the origin language). Furthermore, if you forget to translate some of the messages, you'd still see a valid message in the origin language. It's also somewhat easier to translate from an original message rather than a keyword.
Keywords are better when messages are changing often, especially with long texts. You abstract away the purpose of a message from the actual text.
EDIT: there's one more scenario where you could argue that real messages are better than keys: when your website supports only one language but with multiple variations, like en_GB and en_US. Most of the messages will be the same; only a few will vary. So most of the messages could be left as they are, and only the ones that actually differ between GB and US put into translation files. That would require much less work than an approach using keys (assuming, of course, that your messages don't change very often).
One use case for the real format I could come up with is when messages are created by users via the UI: it would be silly to force them to come up with keywords for each phrase they want to translate.
I haven't had such a need yet, so I always use the keyword format.
For the most part I agree with @Jakub Zalas's answer; however, the last line is a bit off.
Keywords are better whenever messages may change at all, not just when they change often. This is outlined in the docs themselves:
The second method is handy because the message key won't need to be changed in every translation file if you decide that the message should actually read "Symfony2 is really great" in the default locale.
If the message changes and you have used the message itself as the key, you have to change every piece of code that uses this message to reflect the change. More places to change mean more potential bugs. Message keys give us leverage against that.
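The trade-off can be sketched with a toy lookup in Python. This is not Symfony's API: trans here is a hypothetical stand-in that, like the fallback behaviour described in the answers above, returns the message id itself when no translation is found, and the catalogs are made-up dictionaries.

```python
def trans(message_id, catalog):
    """Look up message_id; fall back to the id itself when it is missing."""
    return catalog.get(message_id, message_id)


# "Real" message style: the English text doubles as the key.
fr_real = {"Symfony2 is great": "J'aime Symfony2"}

# Keyword style: a stable key, with the English text kept in its own catalog.
fr_keyword = {"symfony2.great": "J'aime Symfony2"}
en_keyword = {"symfony2.great": "Symfony2 is really great"}  # reworded in one place

# Rewording the real message silently breaks the French lookup...
print(trans("Symfony2 is really great", fr_real))
# ...while the keyword keeps resolving in every catalog.
print(trans("symfony2.great", fr_keyword))
print(trans("symfony2.great", en_keyword))
# A missing keyword translation is easy to spot, because the raw key shows up.
print(trans("symfony2.great", {}))
```

With real messages the fallback is at least readable text in the original language; with keywords a missing entry is obvious, which is the flip side discussed in the answers here.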
Real messages offer little benefit. IMO you can use them if you are sure your application will always be single-language and you want to save a few minutes in development.
The keyword format has the advantage that, if you have to translate your website, you'll see immediately when a translation is missing.
To facilitate translations, I personally use JMSTranslationBundle.

Recommended method to download tweets based on search terms and store

I would like to download tweets based on certain search terms. I'm aware of HTTP GET and such techniques, but I'm not sure the best way to create a simple executable that downloads the tweets and saves them for subsequent analysis.
Any ideas? I'm a basic programmer - if you say "use curl" I know roughly what you mean but not how to set up an application to run curl commands!
Hence my dilemma.
Thanks in advance!
You absolutely can do it in C# or any other language.
From a very rudimentary standpoint, the Twitter API wiki will tell you how, but I know that's not what you're really asking.
I would suggest getting familiar with a good API library such as Tweetsharp, which has methods not only for getting your typical timelines but also for search. The advantage of this (aside from not having to handle your own serialization, etc.) is that it unifies the timeline and search calls, which are actually slightly different APIs.
The downside of this approach, though, is that you won't be able to port it directly to a Mac, unless you write it using Silverlight.
The upside of this approach is that Tweetsharp gives you a number of options for how it returns the data, which in turn gives you a number of options for how to save it.

Resources