How to use Google Cloud Search and Vision API together - google-cloud-vision

Hi everyone,
I have a somewhat unusual topic, and I hope I am addressing the right people here.
I'm working on a personal project. I recently became a G Suite customer and would like to map my document and media management via Google Drive. The document management works well so far and with the help of Google Cloud Search I can easily find my documents across platforms.
Since I take a lot of pictures, I was wondering whether I could use Google products to classify them automatically. My approach was to use the label detection of the Vision API and store the 5 most likely labels as metadata. Using that metadata, a single search for, say, architecture or animal would then find all images tagged with that term. The concept should of course be extendable to location and text detection.
I have already tried to build an automation that labels the photos via services like integromat.com, but unfortunately without success.
Now to the current situation. Since I realized that working directly with Google Cloud is essential, I am looking for help from an experienced community. I hope that someone here has a good or inspiring idea.
One more note before any suggestions: Google Photos is great and can do something like that, but it doesn't integrate with Google Cloud Search, and managing RAW files with it would be terrible.

You can achieve what you want using the following approach:
Build a web/mobile app to upload photos to Google Drive or Cloud Storage.
Use the Google Vision API to extract metadata from each image before uploading it to Drive/Cloud Storage.
Use the Google Cloud Search REST API to index the extracted metadata, along with the image URL, in Cloud Search.
Create a custom search interface to search and display your indexed images.
The above steps should point you in the right direction for implementing the solution. Let me know if you need further help with it.
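As a rough illustration of the Vision API step, here is a minimal Python sketch that builds a label-detection request body for the REST `images:annotate` method and picks the highest-scoring labels out of a response. It makes no network call. The function names and the shape of the record returned by `build_search_record` are hypothetical and only for illustration; in particular, that record is not the actual Cloud Search item schema, which you would need to take from the Cloud Search documentation.

```python
import base64


def build_label_request(image_bytes, max_labels=5):
    """Build the JSON body for a Vision API images:annotate label request."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_labels}],
            }
        ]
    }


def top_labels(annotate_response, max_labels=5):
    """Return the highest-scoring label descriptions, lowercased."""
    annotations = annotate_response["responses"][0].get("labelAnnotations", [])
    ranked = sorted(annotations, key=lambda a: a.get("score", 0.0), reverse=True)
    return [a["description"].lower() for a in ranked[:max_labels]]


def build_search_record(image_url, labels):
    """Hypothetical metadata record; NOT the real Cloud Search item schema."""
    return {"url": image_url, "labels": labels, "text": " ".join(labels)}


# Hand-written sample in the shape of an images:annotate response.
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Building", "score": 0.91},
                {"description": "Architecture", "score": 0.97},
                {"description": "Sky", "score": 0.85},
            ]
        }
    ]
}

labels = top_labels(sample_response)
record = build_search_record("https://example.com/photo.jpg", labels)
print(record)
```

The request body would be sent to the Vision API with your own credentials, and the resulting record mapped onto whatever schema you define for your Cloud Search data source.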

Related

Firebase - Machine Learning and interest tracking to create an algorithm for sorting posts

One of my applications includes user-generated posts and functions in a similar way to Instagram. When a user opens the app they see a feed of posts sorted by date. This works when there is just one small demographic using the app, but as the user base becomes more diverse, not everyone is interested in the same posts. This is why apps like TikTok and Instagram have algorithms to decide which posts to show to a user. Where do I even start with this? I understand that there need to be tags on each post for what they are about (this is where I think I can use machine learning) and then each user's information needs to include their interests (I'm not sure what can be used to change this as they like or dislike posts). Is there a simple pre-built way of doing this, or are there any examples? It seems to be a pretty big secret that mostly big tech companies understand and use.
You could use Google's Cloud Vision API for images (https://cloud.google.com/vision) and the Video Intelligence API for videos (https://cloud.google.com/video-intelligence/docs).
The Video Intelligence API could handle images too, from a byte stream.
Build a Firebase function that analyzes posted media with these APIs.
Build the rest of the logic from there: find a way to detect users' interests from the posts they interact with, and save those interests.
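One simple way to picture the "detect and save interests" part is a per-user table of tag weights that goes up on a like and down on a dislike, with the feed sorted by the summed weight of each post's tags. The Python sketch below is only an illustration of that idea under those assumptions; the function names, the `tags` field, and the fixed step size are all hypothetical, and it is not how any particular app ranks its feed.

```python
from collections import defaultdict


def update_interests(interests, post_tags, liked, step=1.0):
    """Raise the weight of each tag on a like, lower it on a dislike."""
    delta = step if liked else -step
    for tag in post_tags:
        interests[tag] += delta
    return interests


def score_post(interests, post_tags):
    """Sum of the user's weights for the tags on a post."""
    return sum(interests.get(tag, 0.0) for tag in post_tags)


def rank_feed(interests, posts):
    """Sort posts by score, highest first; equal scores keep their input order."""
    return sorted(posts, key=lambda p: score_post(interests, p["tags"]), reverse=True)


# Tags would come from the media analysis step; these are made up.
interests = defaultdict(float)
update_interests(interests, ["dog", "park"], liked=True)
update_interests(interests, ["car"], liked=False)

posts = [
    {"id": 1, "tags": ["car"]},
    {"id": 2, "tags": ["dog"]},
    {"id": 3, "tags": ["food"]},
]
ranked_ids = [p["id"] for p in rank_feed(interests, posts)]
print(ranked_ids)
```

Because the sort is stable, passing in posts that are already ordered by date keeps newer posts first among those with the same score.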

Import.io - Can it replace Kimonolabs

I use Kimonolabs right now for scraping data from websites that have the same goal. To make it easy, let's say these websites are online shops selling stuff online (actually they are job websites with online application possibilities, but technically it looks a lot like a webshop).
This works great. For each website a scraper API is created that goes through the available advanced search page to crawl all product URLs. Let's call this API the 'URL list'. Then a 'product API' is created for the product detail page that scrapes all necessary elements, e.g. the title, product text and specs like the brand, category, etc. The product API is set to crawl daily using all the URLs gathered in the 'URL list'.
Then our own service fetches the gathered information for all products from the Kimonolabs JSON endpoint.
However, Kimonolabs will quit its service at the end of February 2016 :-(. So, I'm looking for an easy alternative. I've been looking at import.io, but I'm wondering:
Does it support automatic updates (letting the API scrape hourly/daily/etc.)?
Does it support fetching all product URLs from a paginated advanced search page?
I'm tinkering around with the service. Basically, it seems to extract data via the same easy process as Kimonolabs. However, it's unclear to me whether paginating through the URLs necessary for the product API, and automatically keeping it up to date, are supported.
Are there any import.io users here who can advise whether import.io is a useful alternative for this? Maybe even give some pointers in the right direction?
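For readers who want to see the two-stage setup from the question in code, here is a minimal Python sketch: one function walks a paginated search page to build the 'URL list', and a second one runs an extractor over each product page. It is not tied to Kimonolabs, import.io, or any other service. The `fetch`, `is_product_url`, and `extract` callables, the `{page}` URL pattern, and the rule of stopping at the first page with no new product links are all assumptions made for illustration.

```python
import re
from urllib.parse import urljoin

HREF_RE = re.compile(r'href="([^"]+)"')


def collect_product_urls(fetch, search_url_pattern, is_product_url, max_pages=50):
    """Walk pages 1, 2, 3, ... of a paginated search and collect product URLs."""
    seen = []
    for page in range(1, max_pages + 1):
        page_url = search_url_pattern.format(page=page)
        html = fetch(page_url)
        found_new = False
        for href in HREF_RE.findall(html):
            url = urljoin(page_url, href)  # resolve relative links
            if is_product_url(url) and url not in seen:
                seen.append(url)
                found_new = True
        if not found_new:  # assumed stop rule: a page with nothing new
            break
    return seen


def scrape_products(fetch, product_urls, extract):
    """Run the caller-supplied extractor on each product page."""
    return [dict(extract(fetch(url)), url=url) for url in product_urls]


# In-memory stand-in for a website, so the sketch runs without network access.
PAGES = {
    "http://shop.example/search?page=1":
        '<a href="/job/1">A</a><a href="/about">x</a><a href="/job/2">B</a>',
    "http://shop.example/search?page=2": '<a href="/job/3">C</a>',
    "http://shop.example/search?page=3": '<a href="/about">x</a>',
    "http://shop.example/job/1": "<h1>One</h1>",
    "http://shop.example/job/2": "<h1>Two</h1>",
    "http://shop.example/job/3": "<h1>Three</h1>",
}


def fake_fetch(url):
    return PAGES.get(url, "")


def extract_title(html):
    return {"title": re.search(r"<h1>(.*?)</h1>", html).group(1)}


urls = collect_product_urls(
    fake_fetch, "http://shop.example/search?page={page}", lambda u: "/job/" in u
)
products = scrape_products(fake_fetch, urls, extract_title)
print(products)
```

In a real setup `fetch` would perform an HTTP request, and the whole thing would be run on a schedule (for example daily) to keep the data current.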
Look into Portia. It's an open source visual scraping tool that works like Kimono.
Portia is also available as a service and it fulfills the requirements you have for import.io:
automatic updates, by scheduling periodic jobs to crawl the pages you want, keeping your data up-to-date.
navigation through pagination links, based on URL patterns that you can define.
Full disclosure: I work at Scrapinghub, the lead maintainer of Portia.
Maybe you want to give Extracty a try. It's a free web scraping tool that allows you to create endpoints that extract any information and return it in JSON. It can easily handle paginated searches.
If you know a bit of JS, you can write CasperJS endpoints and integrate any logic that you need to extract your data. It has a similar goal to Kimonolabs and can solve the same problems (if not more, since it's programmable).
If Extracty does not meet your needs, you can check out these other market players that aim for similar goals:
Import.io (as you already mentioned)
Mozenda
Cloudscrape
TrooclickAPI
FiveFilters
Disclaimer: I am a co-founder of the company behind Extracty.
I'm not very fond of Import.io, but it seems to me that it allows pagination through bulk input URLs. Read here.
So far there has not been much progress in getting a whole website through the API:
Chain more than one API/Dataset: It is currently not possible to fully automate the extraction of a whole website with Chain API.
For example if I want data that is found within category pages or paginated lists. I first have to create a list of URLs, run Bulk Extract, save the result as an import data set, and then chain it to another Extractor. Once set up, I would like to be able to do this in one click, more automatically.
P.S. If you are somehow familiar with JS you might find this useful.
Regarding automatic updates:
This is a beta feature right now. I'm testing this for myself after migrating from Kimonolabs. You can enable this for your own APIs by appending &bulkSchedule=1 to your API URL. Then you will see a "Schedule" tab. In the "Configure" tab, select "Bulk Extract" and add your URLs; after this, the scheduler will run daily or weekly.

Is there an API call to get a list of saved places in Google Maps?

I have a ton of saved places that appear on my Google Maps - but there is no way to manage, filter or search them. Is there a way to access these locations by API?
I scanned the maps api and can't find any reference. Is there another Google API that makes this available?
There is a REST API that can retrieve the saved places:
http://www.google.com/bookmarks/?output=xml
Visit this link to get more information.
https://www.google.com/bookmarks/
There are also API endpoints like:
https://www.google.com/bookmarks/find?q=conf&output=xml&num=10000
https://www.google.com/bookmarks/lookup?
But it seems they have been deprecated, and most of the documentation is no longer available. Use them at your own risk.
Currently the list of saved places in My Maps is not available via an API. There is a feature request tracking this that you can follow: https://code.google.com/p/gmaps-api-issues/issues/detail?id=2953
2022: I created a gist for parsing saved places from a shared list via Python. It is really unstable because it's a quick-and-dirty solution, but maybe it will help someone: https://gist.github.com/ByteSizedMarius/8c9df821ebb69b07f2d82de01e68387d
Edit: The above answer did not yet take pagination into consideration. Please see my answer here.

Creating an indoor map

I wonder if someone here can help. In my web application I'm trying to create a map section.
The objective of the map section is to show an indoor search like the one in the attached picture from Yahoo Maps. Does anyone know how they created the tenant names and floor levels on the map itself?
http://s30.postimg.org/4dh7mlpfl/Yahoo_Maps.png
I think the best answer for this one is going to depend on which mapping framework you were interested in using.
If you're using Yahoo Maps: Yahoo got that indoor map data from Nokia's HERE platform. As far as I know, they don't offer an editor for the indoor mapping data. The major mapping platforms often have some self-service mechanism to add or correct mapping data. If you are set on having this available and you are using Yahoo's maps, you might want to contact someone at Nokia's HERE and see how you might be able to get that to happen.
With that being said, you can do something like that with Google Maps as well. They have information and a way to add the interior layout of a building here. I haven't used it...I just know that it exists, so I can't speak to it in much detail.
There is also some support for this kind of thing in OpenStreetMap. I would post a link to an example of it, but stackoverflow says I can't post more than 2 links unless I have more than 10 reputation. (Sorry...I'm still relatively new to posting on here.)

Access to old, no longer available, feed entries

I am working on a project that requires reliable access to historic feed entries which are not necessarily available in the current feed of the website. I have found several ways to access such data, but none of them give me all the characteristics I need.
Look at this as a brainstorm. I will tell you how much I have found and you can contribute if you have any other ideas.
Google AJAX Feed API - will limit you to 250 items
Unofficial Google Reader API - Perfect but unofficial and therefore unreliable (and perhaps quasi-illegal?). Also, the authentication seems to be tricky.
Spinn3r - Costs a lot of money
Spidering the internet archive at the site of the feed - Lots of complexity, spotty coverage, only useful as a last resort
Yahoo! Feed API or Yahoo! Search BOSS - The first looks more like an aggregator, meaning I'd need a different registration for each feed and the second should give more access to Yahoo's data but I can find no mention of feeds.
(thanks to Lou Franco) Bloglines Sync API - Besides the problem of needing an account and being designed more as an aggregator, it does not have a way to add feeds to the account. So no retrieval of arbitrary feeds. You need to manually add them through the reader first.
Other search engines/blog search/whatever?
This is a really irritating problem as we are talking about semantic information that was once out there, is still (usually) valid, yet is difficult to access reliably, freely and without limits. Anybody know any alternative sources for feed entry goodness?
Bloglines has an API to sync accounts
http://www.bloglines.com/services/api/sync
You have to make an account and subscribe to the feed you want to download, but then you can download based on date, which can be way in the past. Not sure of the terms.
The best answer I've found so far is this: Google Reader's unofficial API turns out to have a public access point for its feeds, which means there is no authentication needed. Usage is as follows:
http://www.google.com/reader/public/atom/feed/{your feed uri here}?n=1000
Replace the text in the squigglies (including the squigglies themselves) with the feed URI you're interested in. More information about the precise arguments can be found here:
http://blog.martindoms.com/2009/10/16/using-the-google-reader-api-part-2/
But remember to use the /public/ URL if you don't want to deal with authentication.
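As a small illustration, the Python sketch below builds a URL following the pattern above and parses entry titles and dates out of an Atom document. It makes no network call and does not assume the endpoint is still reachable. Percent-encoding the feed URI before inserting it is an assumption on my part (the answer only says to substitute the URI), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
PUBLIC_BASE = "http://www.google.com/reader/public/atom/feed/"


def public_feed_url(feed_uri, n=1000):
    """Fill the feed URI and item count into the URL pattern from the answer.

    Assumption: the feed URI is percent-encoded before insertion.
    """
    return f"{PUBLIC_BASE}{quote(feed_uri, safe='')}?n={n}"


def parse_entries(atom_xml):
    """Return the title and updated timestamp of each Atom entry."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return entries


# Hand-written Atom sample standing in for a downloaded feed.
SAMPLE_ATOM = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<entry><title>First post</title><updated>2009-10-16T00:00:00Z</updated></entry>"
    "<entry><title>Second post</title><updated>2009-10-17T00:00:00Z</updated></entry>"
    "</feed>"
)

url = public_feed_url("http://example.com/feed.xml", n=10)
entries = parse_entries(SAMPLE_ATOM)
print(url)
print(entries)
```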
