Getting information of linked topics on a single request - freebase

So, say I search for a City using Freebase API. Say, San Francisco:
https://www.googleapis.com/freebase/v1/topic/m/0d6lp?limit=20&filter=/common/topic/description&filter=/common/topic/article&filter=/location/location/geolocation&filter=/location/location/containedby&filter=/travel/travel_destination/tourist_attractions
I get a bunch of data, including '/location/location/containedby', which lists the entities that contain this one. This is how I can find out which state and country the city belongs to.
The problem is that I only get each entity's name and mid, but not its '/common/topic/notable_for', so I have to make a separate query per entity, asking just for the notable_for property, to find out which of them is a country, a state, or something else I don't need.
For example, this is one of those queries, which determines that the United States of America is a country:
https://www.googleapis.com/freebase/v1/topic/m/09c7w0?filter=/common/topic/notable_for
This is executed 3 to 6 times per city.
Is there a way to tell the API to include more information about the entities linked from a topic? In this case, to include '/common/topic/notable_for' on the linked entities. It would save a lot of queries, and time for the end user.
Thank you for your time!

You can actually get these results using the new output parameter on the Freebase Search API. Like this:
query=/m/0d6lp
output=(description /location/location/geolocation (/location/location/containedby notable))
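As a rough sketch, the two parameters above could be combined into a single request URL like this. The endpoint path is an assumption based on the googleapis.com URLs in the question, and the snippet only builds the URL; it does not call the service.

```python
from urllib.parse import urlencode

# Assumed endpoint, modelled on the /freebase/v1/topic URLs in the question.
SEARCH_ENDPOINT = "https://www.googleapis.com/freebase/v1/search"


def build_search_url(mid: str, output: str) -> str:
    """Return a Search API URL carrying the query and output parameters."""
    return SEARCH_ENDPOINT + "?" + urlencode({"query": mid, "output": output})


url = build_search_url(
    "/m/0d6lp",
    "(description /location/location/geolocation "
    "(/location/location/containedby notable))",
)
print(url)
```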

I'd suggest using the MQL Read API if you want better control over the information returned. Then you can specify nested queries that ask for the containedby locations to be returned with their types (or you can explicitly filter to only those which are a country or a state).
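A nested MQL query along those lines might look like the sketch below. The mqlread endpoint path and the /location/country type id are assumptions on my part, and the code only builds the request URL rather than sending it.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint, modelled on the /freebase/v1/topic URLs in the question.
MQLREAD_ENDPOINT = "https://www.googleapis.com/freebase/v1/mqlread"

# Ask for the city and, nested inside it, every containing location
# together with all of its types (None and [] mean "fill this in").
nested_query = {
    "mid": "/m/0d6lp",
    "name": None,
    "/location/location/containedby": [{
        "mid": None,
        "name": None,
        "type": [],
    }],
}


def containers_of_type(mid: str, type_id: str) -> dict:
    """Variant that only returns containing locations of one type."""
    return {
        "mid": mid,
        "/location/location/containedby": [{
            "mid": None,
            "name": None,
            "type": type_id,  # e.g. "/location/country" (assumed type id)
        }],
    }


def build_mqlread_url(query: dict) -> str:
    """Serialize the query as JSON and attach it as the query parameter."""
    return MQLREAD_ENDPOINT + "?" + urlencode({"query": json.dumps(query)})


print(build_mqlread_url(nested_query))
print(build_mqlread_url(containers_of_type("/m/0d6lp", "/location/country")))
```

Either way, the types of the linked locations come back in the same response, so the 3 to 6 follow-up requests per city are not needed.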

Related

How to structure a firestore database?

I'm creating a Flutter app with Firestore and am stuck on a data-structure question. A user can visit multiple cities; I have 3,000 cities and 100,000 users, and I want to be able to add visits and query which users have visited a particular city.
My current implementation is users/userId/userDetails and cities/cityName/cityDetails.
Given what little I've learned from the video guides provided by Firebase; it depends on whether you just want to list the users (ie, just some basic information on them) that have visited a city; or, if what you want is to list users by the city/cities that they've visited.
For the first case, where you want to just list each city's visitors, you'd be better off creating a document that contains a list of visitors by their id. This would make it so that when you consult the cities, you don't have to retrieve the massive list of visitors every time you call it.
Alternatively, for the second scenario, you might want to do the reverse, where each user has a (sub) document containing the list of cities they've visited. This would be especially useful if the list of cities is expected to be particularly large, or if the userDetails file contains a lot of information about the user (lots of fields).
If the userDetails document doesn't contain a lot of details, more specifically if the number of fields is rather small and you don't expect the user to visit more than, say, 150 cities, you could include the list as one of the regular fields inside userDetails, as an array (if you don't care about the order) or as a Map (if you plan to have each city be a key, with the value being when the city was visited).
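To make the two shapes concrete, here is a minimal sketch that uses plain Python dicts as stand-ins for Firestore documents. It makes no Firestore SDK calls, and the user ids, city names, and the "visited" field name are all made up for illustration.

```python
from collections import defaultdict

# Stand-in for users/{userId} documents: visited cities kept as a map
# field, with the city as key and the visit date as value.
users = {
    "user_1": {"name": "Ana", "visited": {"paris": "2021-05-01", "rome": "2022-03-10"}},
    "user_2": {"name": "Ben", "visited": {"rome": "2020-08-15"}},
}


def build_city_visitors(user_docs: dict) -> dict:
    """Derive the reverse shape: one visitor-id list per city."""
    visitors = defaultdict(list)
    for user_id, doc in user_docs.items():
        for city in doc["visited"]:
            visitors[city].append(user_id)
    return {city: sorted(ids) for city, ids in visitors.items()}


city_visitors = build_city_visitors(users)
print(city_visitors["rome"])
```

The first shape answers "which cities has this user visited?" with one document read; the derived shape answers "who has visited this city?" the same way.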
That being said, I'm advising this under the assumption that, in the paths you've stated, the final entry is a document (cityDetails and userDetails), as that is what I believe makes the most sense (with userId and cityName as the collections).
If I've misunderstood your structure and it's actually "cities" as a document, "cityName" as a collection and "cityDetails" as a document; please let me know so I may fix my answer.

Filtering results with Geofire + Firebase

I'm trying to figure out how to add a filter to a Geofire query.
Suppose I have restaurants in different categories, and I want to add the category to my query. How do I go about this?
One way I have now is to query the keys with Geofire, loop through each key to get the restaurant, and insert the matching restaurants into an array.
This seems so inefficient. Is there any other way to go about this?
Ideally I will have the filtered results, and only load each item when they're about to be shown.
Cheers!
Firebase queries can only filter by one condition. Geofire already does quite some "magic" to allow it to filter on both longitude and latitude. Adding another property to that equation might be possible, but is well beyond what Geofire handles by default. See GeoFire: How to add extra conditions within the query?
If you only ever want to access one category at a time, you can put the restaurants in a top-level node per category and point Geofire to one category.
/category1
    item1
        g: "pns0h0mf2u"
        l: [-53.435719, 140.808716]
    item2
        g: "u417k3dwub"
        l: [56.83069, 1.94822]
/category2
    item3
        g: "8m3rz3s480"
        l: [30.902225, -166.66809]
/items
    item1: ...
    item2: ...
    item3: ...
In the above example, we have two categories: category1 with 2 items and category2 with just 1 item. For each item, we see the data that Geofire uses: a geohash (g) and the latitude and longitude (l). We also keep a single list with the other properties of these 3 items.
But more commonly, you simply do the extra filtering in client-side code. If you're worried about the performance of that: measure it, share the code, JSON data and measurements.
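As a rough illustration of the client-side approach, the sketch below filters the keys a Geofire radius query might return against already-loaded item data. The keys, item names, and the "category" field are hypothetical, and no Firebase calls are made.

```python
# Keys that a Geofire radius query might have returned (hypothetical).
nearby_keys = ["item1", "item2", "item3"]

# Item properties as they might be stored under /items (hypothetical).
items = {
    "item1": {"name": "Pasta Place", "category": "italian"},
    "item2": {"name": "Sushi Spot", "category": "japanese"},
    "item3": {"name": "Pizza Corner", "category": "italian"},
}


def filter_by_category(keys, item_data, category):
    """Keep only the nearby keys whose item has the wanted category."""
    return [
        key for key in keys
        if key in item_data and item_data[key].get("category") == category
    ]


print(filter_by_category(nearby_keys, items, "italian"))
```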
This is an old question, but I've seen it in a few places on the web, so I thought I might share one trick I've used.
The Problem
If you have a large collection in your database, maybe containing hundreds of thousands of keys, for example, it might not be feasible to grab them all. If you're trying to filter results based on location in addition to other criteria, you're stuck with something like:
1. Execute the location query
2. Loop through each returned geofire key and grab the corresponding data in the database
3. Check each returned piece of data to see if it matches the other criteria
Unfortunately, that's a lot of network requests, which is quite slow.
More concretely, let's say we want to get all users within e.g. 100 miles of a particular location that are male and between ages 20 and 25. If there are 10,000 users within 100 miles, that means 10,000 network requests to grab the user data and compare their gender and age.
The Workaround:
You can store the data you need for your comparisons in the geofire key itself, separated by a delimiter. Then, you can just split the keys returned by the geofire query to get access to the data. You still have to filter through them, but it's much faster than sending hundreds or thousands of requests.
For instance, you could use the format UserID*gender*age, which might look something like facebook:1234567*male*24. The important points are:
- Separate data points by a delimiter.
- Use a valid character for the delimiter: "It can include any unicode characters except for . $ # [ ] / and ASCII control characters 0-31 and 127."
- Use a character that is not going to be found elsewhere in your database. I used *, but that might not work for you. Do not use any characters from -0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz, since those are fair game for keys generated by firebase's push().
- Choose a consistent order for the data. In this case, UserID first, then gender, then age.
You can store up to 768 bytes of data in firebase keys, which goes a long way.
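Here is a minimal sketch of the packing and unpacking, assuming the UserID*gender*age format above. The function names and sample keys are made up for illustration, and the check for ASCII control characters is left out for brevity.

```python
from typing import NamedTuple

DELIMITER = "*"
FORBIDDEN_KEY_CHARS = set(".$#[]/")  # control characters are not checked here
MAX_KEY_BYTES = 768


class PackedUser(NamedTuple):
    user_id: str
    gender: str
    age: int


def pack_key(user_id: str, gender: str, age: int) -> str:
    """Join the fields into one key, always in the same order."""
    parts = [user_id, gender, str(age)]
    for part in parts:
        if DELIMITER in part or FORBIDDEN_KEY_CHARS & set(part):
            raise ValueError(f"invalid character in key part: {part!r}")
    key = DELIMITER.join(parts)
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        raise ValueError("key exceeds 768 bytes")
    return key


def unpack_key(key: str) -> PackedUser:
    """Split a key back into its fields."""
    user_id, gender, age = key.split(DELIMITER)
    return PackedUser(user_id, gender, int(age))


def filter_keys(keys, gender, min_age, max_age):
    """Filter geo query results locally, without extra network requests."""
    matches = []
    for key in keys:
        user = unpack_key(key)
        if user.gender == gender and min_age <= user.age <= max_age:
            matches.append(user.user_id)
    return matches


geo_keys = [
    pack_key("facebook:1234567", "male", 24),
    pack_key("facebook:7654321", "female", 22),
    pack_key("facebook:1111111", "male", 31),
]
print(filter_keys(geo_keys, "male", 20, 25))
```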
Hope this helps!

What is the best way to manage relations in ElasticSearch?

Sorry if this question has been asked before, but I couldn't find a clear answer on this subject.
I'm having trouble creating my Elasticsearch index; I'm not really sure how to manage relations properly.
Let's say we have the following entities:
Product
    id
    reference
Book
    id
    name
    product_id
Shirt
    id
    color
    product_id
StockItem
    id
    supplier_id
    product_id
    quantity
I'd like to:
- Find a shirt by its color
- Find all books supplied by supplier_id 5
I wasn't able to find out whether I'm supposed to use multiple queries, nested objects, parent/child relations, etc. I couldn't find a proper tutorial that says "do it that way".
At the moment I'm working with nested objects, but I find it quite dirty to redefine, in each of my types, all the data I need.
Do you have any advice on this?
Thanks.
The key to searching and modeling relationships in Elasticsearch is to denormalize. This is because Lucene has a flat data model with no built-in support for relationships in your data.
Think of it from the perspective of your search results. What is the thing being searched for? What shows up in your search results? That is the thing you are searching against. If you want to filter or sort those things based on the values in a related object, then you need to pull those values in at indexing time.
If you're searching for shirts and want to filter by color, then your shirt documents should all have a color field on them. If you are searching books and want to filter to a certain supplier, then you should include the supplier name or ID as a field on your book documents.
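For the entities in the question, denormalizing at indexing time could look roughly like the sketch below. The sample rows and the supplier_ids field name are made up, and the query body is only an example of a filter on that field; nothing here talks to a cluster.

```python
# Hypothetical rows from the relational store.
products = {1: {"reference": "REF-001"}, 2: {"reference": "REF-002"}}
books = [
    {"id": 10, "name": "Dune", "product_id": 1},
    {"id": 11, "name": "Emma", "product_id": 2},
]
stock_items = [
    {"id": 100, "supplier_id": 5, "product_id": 1, "quantity": 3},
    {"id": 101, "supplier_id": 7, "product_id": 1, "quantity": 1},
    {"id": 102, "supplier_id": 7, "product_id": 2, "quantity": 9},
]


def book_documents(book_rows, product_rows, stock_rows):
    """Build one flat document per book, pulling in related values."""
    suppliers_by_product = {}
    for stock in stock_rows:
        suppliers_by_product.setdefault(stock["product_id"], set()).add(stock["supplier_id"])
    return [
        {
            "id": book["id"],
            "name": book["name"],
            "reference": product_rows[book["product_id"]]["reference"],
            "supplier_ids": sorted(suppliers_by_product.get(book["product_id"], ())),
        }
        for book in book_rows
    ]


docs = book_documents(books, products, stock_items)

# Query body that filters book documents down to supplier 5.
books_by_supplier_5 = {
    "query": {"bool": {"filter": [{"term": {"supplier_ids": 5}}]}}
}
print(docs)
```

The cost is that a change to a stock item means reindexing the affected book documents, which is the usual trade-off of denormalizing.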
Your choice of language and ES client may make this easier. For example, in Ruby, you can index the results of arbitrary method calls, allowing you to dynamically fetch from other associated models while indexing your data.
Nested structures or a parent/child relation are your best bet. I hope this blog will help.

How to retrieve resources based on different conditions using GET in RESTful api?

In REST, we can access resources using the GET method, which is fine if I know the key of my resource. For example, to get a transaction, I pass the transaction_id and get the resource for that transaction. But when I want to access all transactions between two dates, how should I write my REST method using GET?
For getting the transaction with a given transaction_id: GET /transaction/id
For getting transactions between two dates: ???
Also, if there are other conditions I need, like the latest 10 transactions or the oldest 10 transactions, how should I write my URL, which is the main key in REST?
I tried to look on Google but was not able to find a way that is completely RESTful and answers my questions, so I'm posting here. I have a clear understanding of POST and DELETE, but if I want to do a similar update using PUT for some resource based on a condition, how do I do it?
There are collection and item resources in REST.
If you want to get a representation of an item, you usually use a unique identifier:
/books/123
/books/isbn:32t4gf3e45e67 (not a valid isbn)
or with a template:
/books/{id}
/books/isbn:{isbn}
If you want to get a representation of a collection, or a reduced collection you use the unique identifier of the collection and add some filters to it:
/books/since:{fromDate}/to:{toDate}/
/books/?since="{fromDate}"&to="{toDate}"
The filters can go into the path or into the query string part of the URL.
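Applied to the transactions from the question, the query string variant might be built like the sketch below. The since and to names follow the book example above, while sort and limit are hypothetical parameter names for the "latest 10" case.

```python
from urllib.parse import urlencode


def collection_url(base, since=None, to=None, sort=None, limit=None):
    """Build a collection URL, adding only the filters that were given."""
    candidates = {"since": since, "to": to, "sort": sort, "limit": limit}
    filters = {name: value for name, value in candidates.items() if value is not None}
    return base + ("?" + urlencode(filters) if filters else "")


# All transactions between two dates.
print(collection_url("/transactions", since="2024-01-01", to="2024-01-31"))
# The latest 10 transactions: newest first, capped at 10.
print(collection_url("/transactions", sort="-date", limit=10))
```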
In the response you should add links with these URLs (aka HATEOAS), which REST clients can follow. You should use link relations, for example IANA link relations, to describe those links, and linked data, for example schema.org, to describe the data in your representation. There are other vocabularies as well, for example GoodRelations, and of course you can write your own vocabulary for your application.

Which URL describes the resource the best?

/competitions/1/clubs/5/players
/players/search?club_id=5
/players?club_id=5
When should I use a first-class URL for a resource, and when should I use a nested URL?
Update 1
Thanks for the answers so far. I'll try to clarify things a little further.
Competition and Club have a many-to-many relationship. Clubs can participate in multiple competitions. I guess that would make Club a first class entity, so the way to access a club would be for instance:
/clubs/33
But I also need to be able to access clubs that participate in a specific competition, so I need something like this too:
/competitions/2/clubs
But someone mentioned that making a resource accessible via multiple URIs isn't recommended. Doesn't this violate that?
Also, I presume a URI like this would not be preferable:
/competitions/2/clubs/33/players/5
But rather use this:
/clubs/33/players/5
Club has a one-to-many relationship with Player.
/competitions/1/clubs/5/players
As a URI is the identifier of a single resource, I would say the general rule is that if it is an object, it gets a 'first-class URL'.
I tend to use query parameters only when limiting/filtering lists, for example, /competitions/1/clubs/5/players?gender=MALE.
I use path elements if the relation "feels" tree- or directory-like (for example, a club has players: /clubs/berlin/players). Parameters are more like "tags"; I often use them for search filters (e.g. defenders of the 'berlin' club older than 22: /clubs/berlin/players?position=defender&age=22).
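One way to picture that split is the small sketch below, which separates a URL into the path segments that identify the resource and the query parameters that filter it. The function name is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def split_resource_and_filters(url: str):
    """Return (path segments, filter dict) for a resource URL."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    filters = dict(parse_qsl(parts.query))
    return segments, filters


segments, filters = split_resource_and_filters(
    "/clubs/berlin/players?position=defender&age=22"
)
print(segments)  # the tree-like part: which collection we are in
print(filters)   # the tag-like part: how the collection is narrowed
```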
I design the URL structure by 'domain-importance'. The most basic concepts should go to the root. If possible, don't nest the URL structure too deeply. I try not to duplicate or create alias collections that represent identical resources (that costs double maintenance in code and documentation).
Generally putting /clubs as root feels more natural: /clubs/{club_id}/players
I would only expose players through /competitions/{comp_id}/clubs/{club_id}/players if that set of players differs from /clubs/{club_id}/players, e.g. if during a competition several players are blocked or didn't make it into the match squad.
What do you mean by /competitions? Is it a tournament or a single match? If it is a single match with two clubs, maybe use home and away domain concepts: /competitions/{comp_id}/home-club and /competitions/{comp_id}/away-club.
Update-1 Answer
Here are my thoughts on your updated question:
I guess /competitions/2/clubs is a subset of /clubs, since not every club competes in every competition. So the two resources are different, and two URLs are fine.
Thinking again, /competitions/2/clubs/33/players/5 should also be fine (but it is important to avoid duplication in the server code). This URL should even be mandatory when the returned resource is a subset of /clubs/33/players (e.g. players are injured or the team-size limit has been hit for a specific competition).
I wouldn't put ID numbers in the URL. They mean something only to those who actually know what they mean; for everyone else they are meaningless numbers.
You should always choose descriptive and related words for your URL, because the URL helps convey information about the linked resource.
Instead of using meaningless ID numbers, choose a unique name representing the name of the team or the competition, for example
/competitions/worldcup/clubs/usa/players
But if you really need to send that kind of anonymous data in the URL, then I would prefer to see them in a query.
Use only meaningful text for the URL.

Resources