I am trying to use Google Cloud Vision with TEXT_DETECTION to do OCR on a seven-segment display, but I am getting pretty lousy results, mostly because it seems to think the text is in a different language. The locale it typically reports is "zh" or "ja".
Is there a specific hint that I can give Cloud Vision which might produce better results?
For example, this image (not included here)
produces this output:
"locale" : "ja",
...
...
"description" : "ココ\n"
I have also tried to preprocess the image by increasing the contrast, applying a Gaussian blur, and even eroding it to fill in the spaces between the segments, but without much luck.
Any help/pointers would be appreciated.
Try adding this to your JSON request:
"languageHints": ["en"]
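For the languageHints suggestion above: in the Vision REST request, the field sits inside the imageContext object of each request entry. A minimal Python sketch of building such a body follows. The function name build_text_detection_request is hypothetical, and there is no guarantee that an "en" hint improves results on seven-segment digits.

```python
import base64
import json


def build_text_detection_request(image_bytes, language_hints):
    """Build a JSON body for a TEXT_DETECTION call with language hints.

    Hypothetical helper: only the field names (image.content, features,
    imageContext.languageHints) come from the Vision REST API.
    """
    return {
        "requests": [
            {
                # Image content is sent base64-encoded.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
                # The hint goes under imageContext, not at the top level.
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


if __name__ == "__main__":
    body = build_text_detection_request(b"raw image bytes go here", ["en"])
    print(json.dumps(body, indent=2))
```

The resulting body would be POSTed to the images:annotate endpoint with your own credentials.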
I wonder if there is a list of the possible labels returned by Google's object localization: 'human', 'dog', 'cat', etc.
Knowing all the possible labels returned by Google's object localization service would help us use the service more efficiently. For example, if we are searching our database for images with hats, we first send our images to the API, and then we need to know every hat-related label that Google might return. Searching only for the word "hat" in the labels will miss images for which Google's object recognition returned "sombrero".
There is no extensive list available which has all the possible labels used in Google object localization. If you feel that list would be highly beneficial you may post a feature request in Google's issuetracker.
In any case, notice that Google object localization results contain a machine-generated identifier (MID) corresponding to a label's Google Knowledge Graph entry. Therefore, you may perform calls to the GKG API to check similar possible results.
For example, if you perform the call for Sombrero
https://kgsearch.googleapis.com/v1/entities:search?query=sombrero&key=<yourAPIKey>&limit=5&indent=True
you will obtain the results: Sombrero, Hat, Sun Hat, Sombrero Galaxy, Straw Hat.
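The call above can be sketched in Python as follows. The helper names kg_search_url and entity_names are hypothetical, and the response shape used in entity_names (itemListElement[].result.name) is an assumption based on the Knowledge Graph Search API's JSON-LD output, so check it against a real response.

```python
from urllib.parse import urlencode

KG_SEARCH_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def kg_search_url(query, api_key, limit=5):
    """Build the entities:search URL used in the example above."""
    params = {"query": query, "key": api_key, "limit": limit, "indent": "True"}
    return KG_SEARCH_ENDPOINT + "?" + urlencode(params)


def entity_names(response):
    """Collect entity names from a parsed response.

    Assumed shape: {"itemListElement": [{"result": {"name": ...}}, ...]}.
    """
    names = []
    for item in response.get("itemListElement", []):
        name = item.get("result", {}).get("name")
        if name is not None:
            names.append(name)
    return names


if __name__ == "__main__":
    # Replace the placeholder with a real API key before fetching the URL.
    print(kg_search_url("sombrero", "<yourAPIKey>"))
```

The names collected this way could then be used as extra search terms next to "hat" when filtering stored labels.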
I would like to store a set of images on my Google Cloud Services Bucket and compare an image against that set using the Vision API.
Is this possible?
The closest thing I could find in my research is creating a searchable image set https://cloud.google.com/solutions/image-search-app-with-cloud-vision but I can't see how I can leverage this to do what I want.
Ideal Scenario
I take an image on my device and send it in a JSON object to the Vision endpoint. That image is then compared against the image set in my Bucket, and a similarity score is returned for each image in my Bucket.
Cloud Vision gives you a match percentage against a "label", not the specific image.
There is no universal measure of similarity between two images. Every similarity algorithm uses the formula its authors thought would work best for their own needs.
When I used Cloud Vision to find the most similar image in a set, the formula I ended up using was probably this one:
https://drive.google.com/file/d/0B3BLwu7Vb2U-SVhKYWVMR2JvOFk/view?usp=sharing
But when I need to match by visual similarity rather than by labels, I use my gem for the IDHash perceptual hashing algorithm: https://github.com/Nakilon/dhash-vips
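To illustrate the general idea of perceptual hashing, here is a minimal Python sketch of a plain difference hash (dHash). It is not the IDHash variant implemented in the gem above. It also assumes the image has already been converted to grayscale and downscaled to a small grid (for example 9x8 pixels), which a real pipeline would do with an image library.

```python
def dhash(gray):
    """Difference hash of a small grayscale grid.

    `gray` is a list of rows of pixel intensities. Each row contributes
    one bit per adjacent pixel pair: 1 if brightness increases left to
    right, 0 otherwise.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left < right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes (lower = more similar)."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    img_a = [[1, 2, 3], [3, 2, 1]]
    img_b = [[1, 2, 3], [1, 2, 3]]
    print(hamming(dhash(img_a), dhash(img_b)))
```

Comparing one uploaded image against a stored set then reduces to computing the Hamming distance to each stored hash and sorting by it.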
I used the computer vision API on an image. The word "pizza" was returned in the description of the image, and the only connection to pizza I can find is a pizza company logo on a napkin. The word "birthday" was also returned. Is there any way to figure out whether "pizza" was returned because of the company logo, or whether it was a guess associated with the word "birthday"?
This depends on how much detail the API gives you back. If it lets you observe the intermediate outputs of the classifier used to categorize the image, you can see which parts of the image produce high output values. The pizza company logo on the napkin, depending on how large it appears, is quite likely to be the cause.
If you are using a more open API and classifier, such as Keras and the networks provided under keras.applications, you can use what are called "class activation maps" to see which parts of the image cause the result.
If you find the above too hard to do, one easy way to investigate is to crop parts of the image in a loop and pass each crop to the API. I suspect that "birthday" is related to a distributed feature, so you might not be able to find where it comes from, whereas "pizza" might come from the logo or some other part of the image.
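The cropping loop could look roughly like the Python sketch below. The classify callback is hypothetical: it stands in for "crop this box out of the image, send the crop to the API, and return the set of labels that came back". The window and stride values are arbitrary choices.

```python
def crop_boxes(width, height, window, stride):
    """Yield (left, top, right, bottom) boxes sliding over the image."""
    for top in range(0, max(height - window, 0) + 1, stride):
        for left in range(0, max(width - window, 0) + 1, stride):
            yield (left, top, min(left + window, width), min(top + window, height))


def regions_with_label(width, height, window, stride, classify, label):
    """Return the boxes whose crop still produces `label`.

    `classify(box)` is a hypothetical callback returning a set of labels
    for the crop described by `box`.
    """
    return [
        box
        for box in crop_boxes(width, height, window, stride)
        if label in classify(box)
    ]


if __name__ == "__main__":
    # Fake classifier: pretends the logo sits in the top-right quadrant.
    fake = lambda box: {"pizza"} if box == (2, 0, 4, 2) else set()
    print(regions_with_label(4, 4, 2, 2, fake, "pizza"))
```

If "pizza" only comes back for crops that contain the napkin, the logo is the likely cause. A label such as "birthday" may not come back for any single crop.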
Say that I have images and I want to generate labels for them in Spanish. Does the Google Cloud Vision API allow selecting the language in which the labels are returned?
Label Detection
Google Cloud Vision APIs do not allow configuring the result language for label detection. You will need to use a different API like Cloud Translation API to perform that operation instead.
OCR (Text detection)
If you're interested in text detection in your image, Google Cloud Vision APIs support Optical Character Recognition (OCR) with automatic language detection in a broad set of languages listed here.
For TEXT_DETECTION and DOCUMENT_TEXT_DETECTION requests, you can provide the languageHints parameter in the request to get better results in cases where the language is unknown and/or not easily detectable.
languageHints[]
string
List of languages to use for TEXT_DETECTION. In most cases, an empty
value yields the best results since it enables automatic language
detection. For languages based on the Latin alphabet, setting
languageHints is not needed. In rare cases, when the language of the
text in the image is known, setting a hint will help get better
results (although it will be a significant hindrance if the hint is
wrong). Text detection returns an error if one or more of the
specified languages is not one of the supported languages.
The DetectedLanguage information is available in the response and identifies the language along with a confidence value.
Detected language for a structural component.
JSON representation
{
"languageCode": string,
"confidence": number,
}
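A minimal Python sketch of reading these values out of a parsed response follows. It assumes the REST response shape fullTextAnnotation.pages[].property.detectedLanguages[] for a single entry of "responses"; the function name detected_languages is hypothetical.

```python
def detected_languages(annotate_response):
    """Return (languageCode, confidence) pairs found at page level.

    `annotate_response` is assumed to be one element of the "responses"
    array, already parsed from JSON into a dict.
    """
    langs = []
    full_text = annotate_response.get("fullTextAnnotation", {})
    for page in full_text.get("pages", []):
        for lang in page.get("property", {}).get("detectedLanguages", []):
            langs.append((lang.get("languageCode"), lang.get("confidence")))
    return langs


if __name__ == "__main__":
    sample = {
        "fullTextAnnotation": {
            "pages": [
                {"property": {"detectedLanguages": [
                    {"languageCode": "ja", "confidence": 0.8}
                ]}}
            ]
        }
    }
    print(detected_languages(sample))
```

A low confidence value, or an unexpected language code, is a hint that setting languageHints may be worth trying.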
I would like to know if it's possible to make a navigation function that, before creating the path, checks an array of undesired points and, if the best route touches one of them, tries to find an alternative path.
For example, if I have a busy street (known from my database, not from a traffic service), I take an array of points along the whole street (one at every change of direction), and I want to avoid these points (that is, this street).
I made this kind of list of points, but I cannot find a way to get alternatives when the path is computed.
I saw in the Google Maps API documentation that an avoid option already exists, but it can only be used to avoid tolls or highways; I cannot find an option to avoid coordinates.
Thanks
Not available at present. Vote (star) the enhancement request:
https://code.google.com/p/gmaps-api-issues/issues/detail?id=214