Detecting a product within an image - google-cloud-vision

I want to detect a product within an image using Cloud Vision. If the product is too small relative to the image, the algorithm does not detect it. For example, if I use an image of just the product, it is correctly labeled as a product, but when I use an image of a person holding that product, the API returns plenty of (good) information about the person and fails to identify the object. Is there a way to force it?
You can use this image to test it using the Cloud Vision Web UI: https://img.bleacherreport.net/img/images/photos/003/758/947/hi-res-bc77cb085652783632d48c378e0a0ffb_crop_north.jpg?h=533&w=800&q=70&crop_x=center&crop_y=top
If I scan the entire image, it returns one label 'product' among other things. But if I crop just the Coca-Cola out of the image and scan the cropped image, it returns a lot more detail, e.g. Coca-Cola, soft drink, etc. How can I get the details of the product if it occupies only a small portion of a larger image?

You can use Object Localization, which, as stated, can detect less prominent objects. I ran it on the image you provided, and it returned 'bottle' for the cola. It also returns the boundingPoly vertices for each object, which, as you've noted, you can use to crop the image and get a better detection.
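The crop step can be sketched as below. This is a minimal illustration, not the API's own helper: it assumes the bounding polygon arrives as vertices normalized to the 0..1 range, and the function name `crop_box` and the `padding` parameter are hypothetical.

```python
from typing import List, Tuple


def crop_box(normalized_vertices: List[Tuple[float, float]],
             image_width: int,
             image_height: int,
             padding: float = 0.05) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels for a bounding polygon.

    Assumption: each vertex is an (x, y) pair normalized to the 0..1 range.
    `padding` is a fraction of the image size added on every side.
    """
    xs = [x for x, _ in normalized_vertices]
    ys = [y for _, y in normalized_vertices]
    # Grow the box a little so the product is not clipped, then clamp to the image.
    left = max(0.0, min(xs) - padding)
    top = max(0.0, min(ys) - padding)
    right = min(1.0, max(xs) + padding)
    bottom = min(1.0, max(ys) + padding)
    return (int(left * image_width), int(top * image_height),
            int(right * image_width), int(bottom * image_height))
```

The resulting tuple has the (left, upper, right, lower) layout that Pillow's `Image.crop` accepts, so the cropped region can be sent back to the API as a second request.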

You need to pass the feature type PRODUCT_SEARCH with the request (the type may otherwise default to TYPE_UNSPECIFIED), so that the API knows it should detect products rather than people or other prominent objects in view.
See Searching for Products and Managing Products and Reference Images, which explain that you have to upload reference images of your products before you can use that feature, because the model needs to learn those products first.

Related

What is the proper way to tag pages in Google Analytics?

I don't even know if "tagging pages" is what I mean.
Essentially, I have a large education website with many types of pages. Specifically, I want to tag our program pages by faculty, level, etc. For example, the Biology program page would be tagged with Science (as its faculty), and Undergraduate (as its level). It's possible that a program could belong to multiple faculties and/or levels (Psychology, for instance, is both a Science program and an Arts program). There is nothing in the URL to signify faculty or level. The website is built in Drupal, in case you know of any modules that could facilitate this.
I want to understand how different faculties/levels/etc perform. I will be building reports in Google Data Studio.
Any guidance would be appreciated!
What you are looking for is called 'content grouping'. If the information is not in the URL, you can define rules that run when the page loads and pass the information to Analytics with the pageviews.
You can find more information here:
https://support.google.com/analytics/answer/2853423?hl=en
You can then read this information in Data Studio.
Because of your multi-value needs, nothing in GA is going to satisfy your requirements out of the box. You will have to do some post-processing, and I am not familiar enough with Data Studio to know where its limits are in that regard.
As the previous poster suggested, Content Grouping is the standard way to create custom aggregations of pages. You can have multiple content groupings, such as Faculty and Level, but a page can be in only one group per grouping (not the clearest terminology but it appears to be what Google uses).
A different option is Custom Dimensions. There are two options here. One is to create custom dimensions for Level and Faculty. Each page can still have only one value per dimension, but you could send a comma-delimited string when a department is in multiple faculties (for instance) and then pull it apart again in a spreadsheet.
The second option is to create a custom dimension for Department directly, and associate each department to the appropriate one or more faculties and levels in your reporting.
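The post-processing for the comma-delimited approach could look roughly like the sketch below. It assumes you have exported rows of (dimension value, pageviews) from your reporting tool; the function name and the row layout are illustrative, not something GA or Data Studio provides.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def pageviews_by_value(rows: Iterable[Tuple[str, int]],
                       sep: str = ",") -> Dict[str, int]:
    """Split multi-valued dimension strings and total pageviews per value.

    Assumption: each row is (raw_dimension_string, pageviews), where the raw
    string may hold several values, e.g. "Science, Arts".
    """
    totals: Dict[str, int] = defaultdict(int)
    for raw, views in rows:
        for value in raw.split(sep):
            value = value.strip()
            if value:
                # A page in two faculties counts toward both of them.
                totals[value] += views
    return dict(totals)
```

Because a multi-valued page is counted once per value, the per-faculty totals will add up to more than the site total. That is expected with this approach.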
How you set the custom dimensions or content grouping will depend on your implementation of GA. If you are using the Google Analytics Drupal module, it lists setting custom dimensions as a supported feature. If you are using Google Tag Manager, you can set the dimension value in your tags directly, though of course it will need to decide what value to set, based either on fully enumerated rules you write or on something it can read out of the page. Here is some Tag Manager documentation: Content Grouping via GTM; Custom Dimensions via GTM.
If the department is present in the page in some consistently marked-up way you can grab it; if not the Metatag module or one of its schema.org extensions might be able to provide you a spot to set a value for GTM to retrieve.

Comparing device captured image with image in Google Cloud Bucket

I would like to store a set of images on my Google Cloud Services Bucket and compare an image against that set using the Vision API.
Is this possible?
The closest thing I could find in my research is creating a searchable image set https://cloud.google.com/solutions/image-search-app-with-cloud-vision but I can't see how I can leverage this to do what I want.
Ideal Scenario
I take an image on my device and send it in a JSON object to the Vision endpoint. The image is then compared against the image set in my Bucket, and a similarity score is returned for each image in the Bucket.
Cloud Vision gives you a match percentage against a "label", not the specific image.
There is no universal measure of similarity between two images. Each similarity algorithm uses whatever formula its author thought would work best for their own needs.
When I used Cloud Vision to find the most similar image in a set, the formula I ended up using was probably this one:
https://drive.google.com/file/d/0B3BLwu7Vb2U-SVhKYWVMR2JvOFk/view?usp=sharing
But when I need to match by visual similarity rather than by labels, I use my gem for the IDHash perceptual hashing algorithm: https://github.com/Nakilon/dhash-vips
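To give a feel for how perceptual hashing compares images, here is a sketch of a plain difference hash (dHash). Note that this is a simpler relative of the technique, not the IDHash algorithm implemented by the gem above. It also assumes the image has already been converted to grayscale and shrunk to a small grid (for example 9x8 pixels) by some other tool.

```python
from typing import List, Sequence


def dhash_bits(gray: Sequence[Sequence[int]]) -> List[int]:
    """Plain dHash: one bit per horizontally adjacent pixel pair.

    Assumption: `gray` is a small grid of grayscale intensities (rows of ints),
    already resized by the caller. A bit is 1 when the left pixel is brighter.
    """
    return [1 if row[x] > row[x + 1] else 0
            for row in gray
            for x in range(len(row) - 1)]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits; a smaller distance means more similar images."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

To rank a set of images, compute the hash of each stored image once, hash the captured image the same way, and sort by Hamming distance. What distance counts as "similar" is a threshold you would have to tune for your own data.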

How to distinguish stack DICOM images from overview image?

I have a stack of DICOM coronal images where I have used the Image Position (Patient)(0020,0032) tag to sort the images in correct order.
However, the stack also contains an "overview" image showing how the coronal slices were generated from an axial stack - see attachment.
Obviously I want to automatically skip this overview image when sorting the stack - does anyone know how to utilize DICOM tags (which?) to distinguish this one from the rest of the stack?
Sorting on Image Position (Patient)(0020,0032) seems correct.
Sorting on other tags like:
(0008,0012) Instance Creation Date and (0008,0013) Instance Creation Time
(0008,0022) Acquisition Date and (0008,0032) Acquisition Time
may not work: the technician may acquire intermediate images afterward if they were missing from an earlier scan sequence.
(0020,0012) Acquisition Number and (0020,0013) Instance Number may not work for the same reason.
So the tag you chose for sorting looks correct.
Now, the images you are interested in are "AXIAL", and you want to exclude "OVERVIEW" images from the stack.
I am not sure "OVERVIEW" is the correct term. Do you mean a Topogram/Scout/Scanogram/Localizer/Patient Protocol image or something similar? Anyway, I will continue with your term.
Check the (0008,0008) Image Type attribute. For "AXIAL" images, it should contain the value "AXIAL", generally in the third position, something like the following:
ORIGINAL\PRIMARY\AXIAL
OR
DERIVED\SECONDARY\AXIAL
For "OVERVIEW" images, this value will either be absent or be different (most probably "LOCALIZER", assuming a CT image).
Please note that only the first two values are mandatory; values beyond that are optional.
Reference:
ftp://dicom.nema.org/MEDICAL/dicom/2016a/output/chtml/part03/sect_C.7.6.html#sect_C.7.6.1.1.2
https://dicom.innolitics.com/ciods/ct-image/ct-image/00080008
ftp://dicom.nema.org/MEDICAL/dicom/2016a/output/chtml/part03/sect_C.8.16.html#table_C.8-129
Another option, which might be more reliable than the Image Type attribute, is to check Image Orientation (Patient) (0020,0037), since localizers are usually perpendicular to the stack they reference.
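Both checks can be combined in a short sketch. It works on plain dictionaries so that it stays independent of any DICOM library. The keys mirror the attribute names discussed above, and the function names, the `reference_iop` parameter, and the tolerance value are assumptions for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def slice_normal(iop: Sequence[float]) -> Tuple[float, float, float]:
    """Cross product of the row and column direction cosines (0020,0037)."""
    rx, ry, rz, cx, cy, cz = iop
    return (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def stack_only(slices: List[Dict], reference_iop: Sequence[float],
               tolerance: float = 0.01) -> List[Dict]:
    """Drop localizer/overview images and sort the rest along the stack normal.

    Assumption: each slice is a dict with "ImageOrientationPatient" (6 floats),
    "ImagePositionPatient" (3 floats) and optionally "ImageType" (list of str).
    `reference_iop` is the orientation of a slice known to belong to the stack.
    """
    ref = slice_normal(reference_iop)
    kept = []
    for s in slices:
        if "LOCALIZER" in s.get("ImageType", []):
            continue  # explicitly marked as a localizer
        normal = slice_normal(s["ImageOrientationPatient"])
        if abs(dot(normal, ref)) < 1.0 - tolerance:
            continue  # not parallel to the stack, e.g. a perpendicular overview
        kept.append(s)
    # Project each position onto the stack normal to get the slice order.
    return sorted(kept, key=lambda s: dot(s["ImagePositionPatient"], ref))
```

Sorting by the projection of Image Position (Patient) onto the normal also works for oblique stacks, where sorting on a single coordinate would not.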

How to determine why a word was included in description from vision api

I used the computer vision api on an image. The word pizza was returned in describing the image and the only connection to pizza I can make is a pizza company logo on a napkin. The word birthday was also returned. Is there any way to figure out if the word pizza was returned because of the company logo, or it was a guess associated with the word birthday?
This depends on how much detail the API gives you back. If it lets you observe the intermediate outputs of the classifier used to categorize the image, you can see which parts of the image produce high output values. The pizza company logo on a napkin, depending on how large it appears, is quite likely to cause this.
If you are using a more open API and classifier, like Keras and the networks provided under keras.applications, you can use what are called "class activation maps" to see which parts of the image cause the result.
If you find the above too hard to do, one easy way to investigate is to crop parts of the image in a loop and pass each crop to the API. I suspect that "birthday" might be related to a distributed feature, so you might not be able to find where it comes from, whereas "pizza" might come from the logo or some other part of the image.
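The cropping loop could be organized roughly as follows. This sketch only generates the crop rectangles. The window size and stride are arbitrary choices, and the call that sends each crop to the vision API is left out because it depends on the service you use.

```python
from typing import List, Tuple


def crop_windows(width: int, height: int,
                 window: int, stride: int) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes that slide across the image.

    Assumption: square windows of `window` pixels, moved `stride` pixels at a
    time; boxes are clipped to the image if the window is larger than it.
    """
    boxes = []
    for top in range(0, max(height - window, 0) + 1, stride):
        for left in range(0, max(width - window, 0) + 1, stride):
            boxes.append((left, top,
                          min(left + window, width),
                          min(top + window, height)))
    return boxes
```

Send each crop to the API and record which boxes still return "pizza". If only the boxes covering the napkin do, the logo is the likely cause. If the image size minus the window is not a multiple of the stride, the right and bottom edges are not fully covered, so choose the values accordingly.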

Google store locator library limit markers in right hand panel

I'd like to limit the number of markers that appear on the map in the right hand panel to something like 10 at any zoom level.
How can this be achieved?
The library and examples can be found here:
http://storelocator.googlecode.com/git/index.html
I am following the code example given here:
http://storelocator.googlecode.com/git/examples/panel.html
There is a code reference here:
http://storelocator.googlecode.com/git/reference.html
But it's still not clear to me exactly how I can customise the example I am following so that it only shows a maximum of 10 markers at any one time.
EDIT : Why I want to do this
I sell a product wholesale to many salons. With this map I am trying to show customers which salons they can go to buy the products I supply.
However, in the example given by Google, the full list of salons appears as markers on the map. This is not good, because competitors could then glean an entire list of salons to which they can market competing products.
The solution I'd like is to show a maximum of 10 markers at a time, namely whichever are closest to the entered address.
For me, the example (http://storelocator.googlecode.com/git/examples/panel.html) always shows at most 10 entries in the panel. There is a hardcoded limit of 10, so changing it is not possible without modifying store-locator.min.js.
But if you want to display fewer than 10 entries, you can do it with CSS:
/* limit the displayed entries to 5 */
.store-list li:nth-child(n+6){display:none}
If you want a higher limit (or need compatibility with IE<9), edit this part of store-locator.min.js (line 28):
m=e.min(10,c[E]);
(set the 10 to the desired value)
To limit the number of results overall, edit this line in MedicareDataSource.prototype.parse_:
for (var i = 1, row; row = rows[i]; i++)
and set it to
for (var i = 1, row; i < XXX && (row = rows[i]); i++)
(where XXX is the limit +1, so e.g. setting XXX to 11 will apply a limit of 10)
There are a few general approaches, and the better solution depends a bit on the total number of stores you have and on how hard you want to make it for someone to scrape them.
You could continue to use the static data feed like in this example (which means sending all stores to the browser on load), and then add some logic to only display the closest 10 (such as setting the map to null for all markers that aren't also shown in the panel), but this is not a good solution if:
there are lots of stores (more than a thousand or so) since it will be unnecessarily slow to load them all when only displaying a few.
you don't want someone to look at your code and just grab the full CSV you're sending down the wire with all your data.
Given your scraping concern, a better method is probably to implement the store locator with a dynamic data source that returns only the closest N records for a given lat/lng, so you don't expose the entire list at once. Using Google services, you could use Maps Engine, which has an API, and the store locator includes a Google Maps Engine example you could start with. The security concern here is that if these queries are publicly available for anyone to hit directly, the table is also public, and someone could query for the full table. So you'd want to put a proxy in between to prevent that kind of query hack (although of course someone could just feed you lots of locations to eventually get all your stores if they really wanted to).
Other options (again just looking at Google's stack, although there are lots of alternatives for this kind of thing, like CartoDB and many more) include App Engine's Search API, which also returns the N closest items (but would require some server-side coding, which Maps Engine would not), or even putting the data into Google spreadsheets and implementing a basic Script -> Web Service, where your script takes the lat/lng and does some basic math to find the closest stores.
Again if you don't love the server side aspect then Maps Engine is probably your best bet for a quick start especially given there's a working sample in the storelocator code.
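Whichever backend you pick, the "basic math to find the closest" is the same. Here is a minimal server-side sketch, assuming each store is a dictionary with `lat` and `lng` keys in degrees. The function names and the data layout are hypothetical, and the haversine formula treats the Earth as a sphere, which is accurate enough for ranking nearby stores.

```python
import math
from typing import Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometres between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_stores(stores: List[Dict], lat: float, lng: float,
                   limit: int = 10) -> List[Dict]:
    """Return at most `limit` stores, nearest first, for the given location."""
    ranked = sorted(stores,
                    key=lambda s: haversine_km(lat, lng, s["lat"], s["lng"]))
    return ranked[:limit]
```

Only the returned slice would be sent to the browser, so the full list never leaves the server in one response. As noted above, a determined scraper can still query many locations, so rate limiting on the proxy is worth considering.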
