I have a set of a hundred or so images (eventually this will be a few thousand). From my app I want to be able to take a picture, upload it to Firebase, and check whether the picture matches one of the images from the set, and if so which one. Does ML Kit provide a suitable way to do this? I also saw that there is now a Google Cloud Vision API, but that might be overkill? Are there already some open source projects doing something similar?
You can now create custom image classification models from your own training data using ML Kit. Check out https://firebase.google.com/docs/ml-kit/automl-vision-edge
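Once such a custom model is trained and deployed in the cloud, a server-side prediction call could look roughly like the sketch below. This is only an illustration of the idea, assuming the google-cloud-automl Python client; the project and model IDs are placeholders, and the 0.6 confidence threshold is an arbitrary choice.

```python
def best_match(predictions, threshold=0.6):
    """Pick the highest-scoring label, or None if nothing clears the threshold."""
    top = max(predictions, key=lambda p: p[1], default=None)
    return top[0] if top and top[1] >= threshold else None

def classify_image(image_bytes, project_id, model_id):
    """Send an image to a deployed AutoML Vision model and return (label, score) pairs."""
    # Imported here so best_match stays usable without the client installed.
    from google.cloud import automl_v1 as automl  # pip install google-cloud-automl
    client = automl.PredictionServiceClient()
    name = client.model_path(project_id, "us-central1", model_id)
    payload = {"image": {"image_bytes": image_bytes}}
    response = client.predict(name=name, payload=payload)
    return [(p.display_name, p.classification.score) for p in response.payload]
```

You would train the model with one class per reference image (or per product), call classify_image on the uploaded photo, and use best_match to decide whether it corresponds to anything in your set.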
I was wondering if there is any documentation available for training Google's AutoML Vision to recognize specific logos.
At this point I can only find documentation about object detection.
Google already has a feature to detect popular product logos within an image - Detect logos. However, it is managed by Google and you won't find all logos there; it recognizes well-known worldwide brands like General Electric, Shell, and Nike.
There is also an IssueTracker feature request that could add the possibility of customizing Vision API logo detection. You can star it to show you would also like this feature.
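For the built-in detector, a minimal Python sketch of the Detect logos call could look like this (assuming the google-cloud-vision client and a local image file; the 0.5 score cutoff is my own choice, not an API default):

```python
def confident_logos(annotations, min_score=0.5):
    """Keep only logo annotations whose confidence clears min_score."""
    return [(a["description"], a["score"])
            for a in annotations if a["score"] >= min_score]

def detect_logos(image_path):
    """Run the Cloud Vision logo detector on a local image file."""
    # Imported here so confident_logos stays usable without the client installed.
    from google.cloud import vision  # pip install google-cloud-vision
    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.logo_detection(image=image)
    return [{"description": a.description, "score": a.score}
            for a in response.logo_annotations]
```

If your logos are well-known brands, this may already be enough; otherwise you are back to training your own model as described below.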
To achieve what you want - recognizing less well-known logos - you would need to train an AutoML model, as you mention.
The AutoML Vision API Tutorial has some general information about training; however, for your scenario I would suggest following the Building Image Detection with Google Cloud AutoML guide, which uses image detection (open it in a private browser window if you hit access limits).
It walks through all the steps, from Gathering Images through Image Labelling, Model Training, and Model Evaluation, to Outputs and Conclusions.
In this case you would be using AutoML from scratch, so you would need to train it with all of your logos.
Hi everyone,
I have a somewhat unusual topic here, and I hope I am addressing the right people.
I'm working on a personal project. I recently became a G Suite customer and would like to handle my document and media management via Google Drive. The document management works well so far, and with the help of Google Cloud Search I can easily find my documents across platforms.
Since I personally take a lot of pictures, I was wondering whether I could use Google products to classify my pictures automatically. My approach was to use the label detection of the Vision API to store the five most likely labels as metadata. Using that metadata, when I search for, say, architecture or animal, I can then find all images tagged with one of those terms in a single search. The concept should of course be extendable to location and text detection.
I have already tried to automate this via services like integromat.com that label the photos, but unfortunately without success.
So now we come to the current situation: since I realized that active interaction with the Google Cloud is essential, I am looking for help from an experienced community. I hope someone here has a good or inspiring idea.
Maybe one more hint before proposals are made: Google Photos is great and can do something like this, but it doesn't integrate with Google Cloud Search, and managing RAW files there would be terrible.
You can achieve what you want using the following approach:
Build a web/mobile app to upload photos to Google Drive or Cloud Storage.
Use the Google Vision API to extract metadata (e.g. labels) from each image before uploading it to Drive/Cloud Storage.
Use the Google Cloud Search REST API to index the extracted metadata, along with the image URL, into Cloud Search.
Create a custom Search Interface to search and display your indexed images.
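The labelling step above can be sketched in Python as follows. This is only an outline, assuming the google-cloud-vision client; the shape of build_search_item is my assumption, since the actual structure depends on the schema you register with Cloud Search.

```python
def top_labels(annotations, n=5):
    """Return the n most likely label descriptions, best first."""
    ranked = sorted(annotations, key=lambda a: a["score"], reverse=True)
    return [a["description"] for a in ranked[:n]]

def label_image(image_bytes):
    """Fetch label annotations for an image via the Cloud Vision API."""
    # Imported here so the pure helpers stay usable without the client installed.
    from google.cloud import vision  # pip install google-cloud-vision
    client = vision.ImageAnnotatorClient()
    response = client.label_detection(image=vision.Image(content=image_bytes))
    return [{"description": a.description, "score": a.score}
            for a in response.label_annotations]

def build_search_item(image_url, labels):
    """Shape the metadata for indexing (field names here are a placeholder schema)."""
    return {"name": image_url, "structuredData": {"labels": labels}}
```

Searching for "architecture" would then match any indexed item whose labels list contains that term.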
The steps above should point you in the right direction for implementing the solution. Let me know if you need further help with it.
I have been trying to use a custom keyword with the Speech Devices SDK, but I have had problems when I use my own custom keyword and deploy to an Android phone (the standard keywords work better, but still not as well as I need or would expect in a commercial application). The screenshots on the linked page imply that you can "Add training data to train keyword model", but that option doesn't appear when I use Speech Studio.
My suspicion is that the speech files automatically generated by Speech Studio are not good enough to train a model for users with accents (like myself).
We have not yet widely enabled KWS model adaptation.
The Custom Keyword generated from the portal aims to be sufficient for an initial trial; it is not currently at the level required for commercial applications.
We are enabling the ability to upload data to adapt the model; this is being trialed with customers before a wider roll-out. The upload is on the Custom Keyword page, not the Custom Speech page.
Thank you for using the speech SDK!
Did you follow the instructions here:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-devices-sdk-create-kws
And here (for how to prepare the data):
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-test-data#upload-data
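For reference, consuming the model generated by those steps from the Python Speech SDK looks roughly like this sketch (the model file name is a placeholder; Speech Studio exports custom keyword models as .table files):

```python
def keyword_model_path(name):
    """Normalize a model name to the .table file Speech Studio exports."""
    return name if name.endswith(".table") else name + ".table"

def listen_for_keyword(model_file):
    """Block until the custom keyword is heard on the default microphone."""
    # Imported here so keyword_model_path stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk  # pip install azure-cognitiveservices-speech
    model = speechsdk.KeywordRecognitionModel(model_file)
    recognizer = speechsdk.KeywordRecognizer()
    result = recognizer.recognize_once_async(model).get()
    return result.reason == speechsdk.ResultReason.RecognizedKeyword
```

Note that this only exercises the model; improving its accuracy for accented speech depends on the adaptation upload discussed above.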
I would like to implement a webcrawler that would be able to, given a photo of an object, search for it in a specific website.
I'm looking into Firebase ML Kit to do the job, more specifically the object detection section, but it is still unclear to me whether that would be the right tool.
Does anyone know how I could implement that?
Comparing photos (or searching for photos) is not a use-case that ML Kit offers a pre-built model for.
The closest I can think of is running the ML Kit label extraction model on both images, and then checking whether both return the same label(s). But even then you would need to have all photos available locally already.
If you want something closer, you'll have to build a model yourself. But even then, you're likely going to need to build an extensive web service that crawls web sites and pre-labels a lot of images.
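The label-comparison idea above can be sketched like this. The Jaccard similarity and the 0.5 threshold are my own arbitrary choices, and the label lists are assumed to come from running label extraction on each image:

```python
def label_overlap(labels_a, labels_b):
    """Jaccard similarity between two sets of label strings (1.0 = identical sets)."""
    a, b = set(labels_a), set(labels_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def same_subject(labels_a, labels_b, threshold=0.5):
    """Crude heuristic: treat two photos as matching if enough labels overlap."""
    return label_overlap(labels_a, labels_b) >= threshold
```

Keep in mind this compares what the photos depict (e.g. "shoe", "furniture"), not whether they show the exact same product, so it is at best a coarse pre-filter.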
Hi and thank you for looking into this.
(Disclaimer: I have little-to-no technical background and would like to find the least complex solution. Ideally, only "connecting" different out-of-the-box components and no coding.)
HIGH-LEVEL PROBLEM:
I have trained a model for text classification using Google AutoML. I want to make this model available on a website, i.e. I want visitors to be able to enter their text and receive the model's predicted class.
CONSIDERATIONS SO FAR: AutoML allows us to deploy the model via a REST API, and I understand that what I want are the API's PUT and GET functions (right?). Ideally, I would use some form of plug-in or script to create an input field for the user that accepts the PUT and then delivers the GET.
Are you aware of any services for this? I'm also happy to host the website in a content management system like WordPress.
I'm very open regarding other approaches to solving my problem and highly appreciate any constructive input.
Many thanks!
AutoML Documentation: https://cloud.google.com/natural-language/automl/docs/predict
EDIT Jan 10: There is another question related to this, and a repo is shared which supposedly provides a solution. I'm not able to access the repo, but the question might help you understand my issue: Is there a way to use Google's AutoML with JavaScript?
EDIT Jan 16: I have learned that, in order to provide input to the model, the POST method should be used instead of PUT.
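To make that concrete, here is a minimal Python sketch of such a POST request. The endpoint shape and field names follow my reading of the linked AutoML Natural Language v1 REST docs; the project and model IDs are placeholders, and you still need an OAuth access token for the Authorization header.

```python
def build_predict_request(project_id, model_id, text):
    """Assemble URL and JSON body for an AutoML Natural Language predict call."""
    url = (f"https://automl.googleapis.com/v1/projects/{project_id}"
           f"/locations/us-central1/models/{model_id}:predict")
    body = {"payload": {"textSnippet": {"content": text, "mimeType": "text/plain"}}}
    return url, body

def predicted_class(response_json):
    """Pull the highest-scoring class name out of a predict response."""
    payload = response_json.get("payload", [])
    best = max(payload, key=lambda p: p["classification"]["score"], default=None)
    return best["displayName"] if best else None
```

On a server you would send it with something like requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"}) and show predicted_class of the JSON response to the visitor; a WordPress plug-in or small backend script would wire the input field to that call.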