Getting search rank of freebase topics - freebase

Is there a way to get the search rank of Freebase topics? We want to order the Freebase topics in our database according to their search rank and popularity. I know the Metaweb search API returns topics in order of search rank, but that applies only to the results for a given query string. We want to apply that logic to the topics that exist in our database.

The Freebase Search API ranks topics mostly by how well the search query matches keywords in a topic's title and description. To get the same feature in your own database, you'll have to write your own search ranking code or use a library like Lucene.
You might also be interested in a related discussion that happened on the Freebase mailing list last month about how to rank topics based on overall measures of popularity.
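As a rough illustration of what "write your own search ranking code" could look like, here is a minimal sketch that scores topics by keyword hits in the title and description. The function names, the dict layout of a topic, and the `title_weight` value are assumptions made for this example; this is not how the Freebase Search API actually computes its ranking.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_topic(query, title, description, title_weight=3.0):
    """Count query-term hits, weighting title hits above description hits.

    The weight of 3.0 is an arbitrary assumption, not a Freebase value.
    """
    query_terms = set(tokenize(query))
    title_hits = sum(1 for tok in tokenize(title) if tok in query_terms)
    desc_hits = sum(1 for tok in tokenize(description) if tok in query_terms)
    return title_weight * title_hits + desc_hits

def rank_topics(query, topics):
    """Return topics (dicts with 'title' and 'description') sorted best-first."""
    return sorted(
        topics,
        key=lambda t: score_topic(query, t["title"], t["description"]),
        reverse=True,
    )
```

For anything beyond a small dataset, a full-text engine such as Lucene would likely handle tokenization, stemming, and scoring better than a hand-rolled function like this.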

Related

Which services to use to periodically load data from multiple data sources, aggregate it, and provide fast search?

Please propose a solution design for my case. The data comes from various sources: some from APIs, some from CSV files. A user will search using filters.
Example: product data (source 1) and product reviews (source 2). A user will search for a product by its name and rating.
Considerations:
Product reviews will keep changing, so the search should reflect the latest reviews.
In the future, more sources and additional filters will be added. Ultimately, only products will be shown to users.
Product prices will change. The search results should show up-to-date information.
The search should be very fast.
I looked into using Apache Airflow, MongoDB, and Elasticsearch. However, I felt I was overcomplicating the solution.

Is it possible to retrieve LinkedIn Posts / Shares based on keywords from the API?

I know scraping LinkedIn is not permitted. But I see that you're able to retrieve shares based on someone's profile ID, but is it also possible to retrieve shares based on keywords?
Or is there a workaround where I can first get all profile IDs / vanity names based on a specific search and then get the shares / post content based on these IDs?
Thanks!
As far as I know, LinkedIn is very strict about scraping its data. I don't think there is a way to retrieve shares by keyword. You can retrieve all of them and then filter the results by keyword.

How can I Count Instances of Search Terms in Google Data Studio?

I'm working with internal site search terms from Google Analytics in Google Data Studio. I need to count how many times users searched specific terms on the website. The problem is, the data is case sensitive and users often misspell words when they search, so that won't get tallied in a normal count function. For example, "careers", "Careers", "cAREERS", and "carers" are all different searches. What formula can I use to easily count how many times users searched different terms?
First add a calculated field that applies LOWER to the search term. Then add a field with a CASE WHEN expression that maps each known misspelling to the correct term.
Another route would be to create a "sounds like" field. BigQuery offers a nice function for this, SOUNDEX. Data Studio does not offer something like that, but you can build a similar function with regular expressions: keep the first character of the word followed by only the vowels of the word, removing duplicated vowels first.
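To make the "sounds like" idea concrete, here is a minimal Python sketch of one possible reading of it: lowercase the term, collapse repeated letters, then keep the first character plus the remaining vowels. It is written in Python only to illustrate the logic; the function names are made up for this example, and in Data Studio the same steps would have to be expressed with calculated fields and regular-expression functions.

```python
import re
from collections import Counter

VOWELS = set("aeiou")

def sounds_like_key(term):
    """Build a coarse key: first character, then the vowels that follow it."""
    word = term.strip().lower()
    if not word:
        return ""
    # Collapse runs of the same letter first, e.g. "ee" becomes "e".
    collapsed = re.sub(r"(.)\1+", r"\1", word)
    return collapsed[0] + "".join(ch for ch in collapsed[1:] if ch in VOWELS)

def count_terms(terms):
    """Count search terms grouped by their sounds-like key."""
    return Counter(sounds_like_key(t) for t in terms)
```

With this key, "careers", "Careers", "cAREERS", and "carers" all fall into the same group. The key is deliberately coarse, so unrelated words can collide (for example, "cafe" maps to the same key as "careers"); checking the grouped terms by hand before trusting the counts is advisable.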

Google freebase score

I am just wondering whether the score returned by the Google Freebase API can be compared across different query entities. For example, can we set a threshold so that results above a certain score are considered highly relevant?
Thanks!

Freebase scoring with data dumps

If you use Freebase search to get matches for any entity by name, you will get results sorted by relevance score. Try for example Taj Mahal.
I'm trying to get similar results using Freebase data dumps, so in my database 'Taj Mahal' related topics would be sorted by relevance, i.e. building comes first, musician comes next and so on.
Are there any suggestions on how to achieve this without querying the Freebase Search API?
The wiki page on relevance score that you linked to says:
Freebase entities have an inherent relevance score (ranking) computed during indexing that is function of its inbound and outbound link counts in Freebase and Wikipedia. Some popular Freebase entities also have a popularity score computed by Google. By default, both scores are combined together during queries.
That should give you a pretty good idea of where to start. Freebase in-degree and out-degree can be computed directly from the dump, but Wikipedia in/out-degrees would require using the Wikipedia dump (or Freebase's WEX dump). The "popularity score computed by Google" piece is obviously something that you're not going to be able to replicate.
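The Freebase-only part of that idea can be sketched as follows. This assumes the dump has already been parsed into (subject, predicate, object) tuples and that you supply your own `is_entity` check to tell entity IDs apart from literal values; both are assumptions for this example. The log-based combination in `link_score` is also a placeholder, since the quoted wiki text does not state the actual function Freebase uses.

```python
import math
from collections import Counter

def degree_counts(triples, is_entity):
    """Count inbound and outbound entity-to-entity links.

    `triples` is any iterable of (subject, predicate, object) tuples.
    `is_entity` is a caller-supplied (hypothetical) predicate that returns
    True when the object is an entity ID rather than a literal.
    """
    in_degree = Counter()
    out_degree = Counter()
    for subject, _predicate, obj in triples:
        if is_entity(obj):
            out_degree[subject] += 1
            in_degree[obj] += 1
    return in_degree, out_degree

def link_score(entity, in_degree, out_degree):
    """Combine the two degrees; this formula is an assumption, not Freebase's."""
    return math.log1p(in_degree[entity]) + math.log1p(out_degree[entity])
```

Entities that share a name, such as the several "Taj Mahal" topics, could then be sorted by `link_score` in descending order. Wikipedia link counts, if you compute them from the Wikipedia or WEX dump, could be folded in the same way.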
