I am looking to extract a list of tourist attractions, along with their city, state, and country, from Freebase. The property that holds the location is "/location/location/containedby". The object it points to can have different types: "/location/location" or "/base/biblioness/bibs_location". If the object has the type "/base/biblioness/bibs_location", I can read its "city", "state", etc. directly. However, if the object only has the type "/location/location", I need to follow its "containedby" field and repeat the logic above.
My question is: can I perform a conditional query in Freebase, along the lines of: if type == "/location/location", get xyz; if type == "/base/biblioness/bibs_location", get abc?
MQL:
[{
"type": "/travel/tourist_attraction",
"id": null,
"name": null,
"name~=": "^San Diego",
"/location/location/containedby": {
"type": "/base/biblioness/bibs_location",
"name": null,
"id": null
},
"/location/location/geolocation": [{
"id": null,
"latitude": null,
"longitude": null
}]
}]
MQL doesn't support conditional logic, but you can query all the information you're potentially interested in, making the subqueries optional so they don't filter the results, and then inspect what you get back. This requires conditional code in your result processor, but you won't have to make multiple queries. For example, you could query multiple levels of /location/location/containedby as well as /base/biblioness/bibs_location/state and whatever else you want.
Before you spend too much time on this, though, you might want to check how well populated /base/biblioness/bibs_location is. It looks to me like it has fewer than 2K entities.
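Since the conditional has to live in the result processor, a minimal Python sketch of that step might look like the following. It assumes the query asked for an optional bibs_location subquery and a nested, optional containedby chain. The row shape, the "bibs" label, the city/state/country keys, and the function name resolve_place are all illustrative assumptions, not actual Freebase output.

```python
def resolve_place(attraction):
    """Pick location details from one result row (assumed shape).

    Prefers the bibs_location data when present; otherwise walks up the
    optional containedby chain and collects the names it finds.
    """
    bibs = attraction.get("bibs")  # hypothetical label for the optional bibs_location subquery
    if bibs:
        return {
            "source": "bibs_location",
            "city": bibs.get("city"),
            "state": bibs.get("state"),
            "country": bibs.get("country"),
        }

    # Fall back to the nested containedby names, nearest container first.
    chain = []
    node = attraction.get("containedby")
    while node:
        chain.append(node.get("name"))
        node = node.get("containedby")
    return {"source": "containedby", "chain": chain}


# Hand-written sample rows, not real Freebase data.
rows = [
    {"name": "Attraction A",
     "bibs": {"city": "San Diego", "state": "California", "country": "United States of America"},
     "containedby": None},
    {"name": "Attraction B",
     "bibs": None,
     "containedby": {"name": "Balboa Park",
                     "containedby": {"name": "San Diego", "containedby": None}}},
]
for row in rows:
    print(row["name"], resolve_place(row))
```

The branch order encodes the preference from the question: use bibs_location when it exists, and only then fall back to walking containedby.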
I'm wondering if it is possible to get all info about a specific date from Freebase.
I can easily retrieve info about a date for a specific type, for example, to grab all persons of interest who were born on a specific date:
[{
"type":"/people/person",
"limit":1000,
"sort":"name",
"name":null,
"guid":null,
"timestamp":null,
"/people/person/date_of_birth":"1955-02-24"
}]
Is it possible to grab all types? I'm after things like people born on that date (which I have), major events (start of a war, assassination of a person of interest, etc), and so on.
Essentially I want to match all fields that are dates and return the full information about that entry, regardless of type.
Reflection is what you need here:
[{
"/type/reflect/any_value": [{
"type": "/type/datetime",
"value": "1955-02-24",
"link": {
"source": {
"id": null
},
"master_property": null
}
}]
}]
A couple of notes on that. First, the MQL manual I've linked to is somewhat bitrotted in its details, but it is still the best documentation that exists on MQL. Second, I'm pretty sure there is an MQL bug: if you use "*": null, or more specifically "target_value": null, in the link clause above, the outer value you specified is ignored... so don't do that :-)
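Because the reflection query returns matches of every type, the results usually need sorting by the property that produced each match. Here is a rough Python sketch of that step, assuming rows shaped the way the query above requests them. The function name group_by_property and the sample ids are made up for illustration.

```python
from collections import defaultdict


def group_by_property(reflect_rows):
    """Group source ids by the property that links them to the date value."""
    grouped = defaultdict(list)
    for row in reflect_rows:
        for hit in row.get("/type/reflect/any_value", []):
            link = hit.get("link") or {}
            prop = link.get("master_property")
            source_id = (link.get("source") or {}).get("id")
            grouped[prop].append(source_id)
    return dict(grouped)


# Hand-written sample in the shape the query asks for; the ids are placeholders.
sample = [
    {"/type/reflect/any_value": [
        {"type": "/type/datetime", "value": "1955-02-24",
         "link": {"source": {"id": "/en/example_person"},
                  "master_property": "/people/person/date_of_birth"}}]},
    {"/type/reflect/any_value": [
        {"type": "/type/datetime", "value": "1955-02-24",
         "link": {"source": {"id": "/en/example_event"},
                  "master_property": "/time/event/start_date"}}]},
]
print(group_by_property(sample))
```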
Let's say I want to get all movies in which at least two (different) actors called "John" played:
Example query:
[{
"type":"/film/film",
"name":null,
"limit":10,
"/film/film/initial_release_date":"2005",
"starring":[{
"a:actor": [{
"type": "/film/actor",
"name": null,
"name~=": "John"
}],
"b:actor": [{
"type": "/film/actor",
"name": null,
"name~=": "John"
}]
}]
}]
If you run the example query, you will see that it will list movies with only one "John" in them. How can I fix my query to exclude these results with duplicated children?
In general, you'll have to do the filtering client-side. MQL queries are tree-shaped rather than general graphs, so one part of a query can't refer to another part.
In this case, you could look for films which have more than one "John" acting in them; however, MQL doesn't allow you to filter on a derived property like "count", so the best you can do is to reverse sort based on the count and then just stop processing as soon as you hit the first entry with "count": 1. However, that query times out if you remove the fixed 1935 release date (sorting in MQL kills performance), so you're probably stuck with just simple client-side filtering.
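The simple client-side filtering mentioned above could be sketched in Python as follows. It assumes each film row carries a "starring" list whose entries hold an "actor" with "id" and "name"; that shape, the function name films_with_two_johns, and the sample data are assumptions for illustration. The whole-word match only approximates what "name~=" does.

```python
def films_with_two_johns(films, needle="John", minimum=2):
    """Keep films whose cast has at least `minimum` distinct matching actors."""
    kept = []
    for film in films:
        matches = set()
        for performance in film.get("starring", []):
            actor = performance.get("actor") or {}
            name = actor.get("name") or ""
            # Whole-word, case-insensitive match on the actor's name.
            if needle.lower() in name.lower().split():
                # Deduplicate on id so one actor with several roles counts once.
                matches.add(actor.get("id"))
        if len(matches) >= minimum:
            kept.append(film["name"])
    return kept


# Hand-written sample rows, not real Freebase data.
films = [
    {"name": "Film A", "starring": [
        {"actor": {"id": "/m/a1", "name": "John Doe"}},
        {"actor": {"id": "/m/a2", "name": "John Roe"}}]},
    {"name": "Film B", "starring": [
        {"actor": {"id": "/m/a1", "name": "John Doe"}},
        {"actor": {"id": "/m/a1", "name": "John Doe"}}]},
    {"name": "Film C", "starring": [
        {"actor": {"id": "/m/a3", "name": "Jane Doe"}}]},
]
print(films_with_two_johns(films))
```

Counting distinct ids rather than rows is what removes the duplicated-child results the question complains about.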
Trying to get some movies and their genres but leave out any records that contain the genre "Thriller" in the array of genres.
How do I not only ignore the genre key itself for "Thriller", but squelch that entire movie result? With my current query, Thriller is removed from the array of genres, but the parent object (film) is still displayed.
Here's my current workup in the query editor:
http://tinyurl.com/d2g54lj
[{
"type": "/film/film",
"limit": 5,
"name": null,
"/film/film/genre": [],
"/film/film/genre!=": "Thriller"
}]
The answer provided is correct, but changes some other stuff in the query too. Here's the direct equivalent to the original query:
[{
"type": "/film/film",
"limit": 5,
"name": null,
"genre": [],
"x:genre": {"name":"Thriller",
"optional":"forbidden"}
}]
The important part is the "optional":"forbidden". The default property used is "name", but we need to specify it explicitly when we use a subclause (which is what lets us specify the "optional" keyword). Using ids instead of names, as @kook did, is actually more reliable, so that's an improvement, but I wanted people to be able to see the minimum necessary to fix the broken query.
We can abbreviate the property name to "genre" from "/film/film/genre" since "type":"/film/film" is included (we also never need to use /type/object for properties like /type/object/name).
Answering my own question.
So the trick is to not use the != (but not) operator, but to actually flip it on its head and use the "|=" (one of) operator with 'forbid', like so:
[{
"type": "/film/film",
"limit": 5,
"name": null,
"/film/film/genre": [{
"id": null,
"optional": true
}],
"forbid:/film/film/genre": {
"id|=": [
"/en/thriller",
"/en/slapstick"
],
"optional": "forbidden"
}
}]
Thanks to the following post:
Freebase query - exclusion of certain values
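For anyone generating such queries from code, here is a small Python sketch that builds the same structure for an arbitrary list of genre ids and serializes it to JSON. The function name build_genre_exclusion_query is made up for illustration; the query shape simply mirrors the one above.

```python
import json


def build_genre_exclusion_query(excluded_ids, limit=5):
    """Build an MQL query (as Python data) that drops films with any excluded genre."""
    return [{
        "type": "/film/film",
        "limit": limit,
        "name": None,
        # Genres to return for each film that survives the filter.
        "/film/film/genre": [{"id": None, "optional": True}],
        # A second, prefixed clause on the same property: any match on the
        # listed ids forbids the whole film.
        "forbid:/film/film/genre": {
            "id|=": list(excluded_ids),
            "optional": "forbidden",
        },
    }]


query = build_genre_exclusion_query(["/en/thriller", "/en/slapstick"])
print(json.dumps(query, indent=2))
```

Python's None serializes to JSON null, so the printed text can be pasted into a query editor as-is.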
I'm trying to get all events in a geo bounding box (that approximately covers France), but I want to exclude all recurring events, so I don't get heaps of French Tennis opens and the like. For this I used the following in my query.
"/time/event/instance_of_recurring_event": {
"id": null,
"optional": "forbidden"
}
However, I've noted Cannes film festivals appear (the individual events for each year), because they do not have the instance_of_recurring_event property set. I can however see that the Recurring Event "Cannes Film Festival" has links to the 2006, 2007, 2008 (etc) film festival events, so I thought I might be able to eliminate them using some reflection. What I have so far is:
[{
"name": null,
"id": null,
"/time/event/instance_of_recurring_event": {
"id": null,
"optional": "forbidden"
},
"/time/event/locations": [{
"geolocation": {
"latitude>": 43.2,
"latitude<": 49.68,
"longitude>": -5.1,
"longitude<": 7.27
}
}],
"/type/reflect/any_reverse": [{
"id": null,
"estimate-count": null,
"name": null,
"/time/recurring_event/current_frequency": null
}]
}]
This allows me to see that the 2008 Cannes film festival is linked to by the Cannes Film Festival subject (that has a yearly recurrence), but I don't know if there's any way to use that to eliminate the 2008 Cannes film festival from my list. Does that make sense?
Try here for the query editor.
Thanks for any help!
Try this: http://tinyurl.com/3okuuzw
A couple of changes:
I added "type": "/time/event" so that you'll only get objects of that type. Your query did not restrict by type, and in Freebase you can assert a property on an object without the type. This is a minor change and probably won't have a big effect.
The /film/film_festival_event type, of which the Cannes festival is an instance, has a property /film/film_festival_event/festival pointing to the festival series.
I added a clause at the end of the query to exclude objects that have that property set with the assumption that they are recurring events.
This will only work for film festivals, but you can re-use the same pattern for other properties.
[{
"name": null,
"mid": null,
"type" :"/time/event",
"/time/event/instance_of_recurring_event": {
"id": null,
"optional": "forbidden"
},
"/time/event/locations": [{
"geolocation": {
"latitude>": 43.2,
"latitude<": 49.68,
"longitude>": -5.1,
"longitude<": 7.27
}
}],
"/film/film_festival_event/festival": [{
"mid": null,
"optional": "forbidden",
"limit" : 0
}]
}]
Some additional points:
a. You should use "mid" instead of "id" if you want to store the identifiers in your db or re-use them in any way later. mid is a stronger identifier than id since it survives merges and other data transformations. It's also faster to ask for mid instead of id - actually makes a big difference when the result set is large.
b. "limit" : 0 says "don't return this clause at all in the results". I think you still need the mid because you have to have at least one property in a clause that has other directives (limit and optional in this case).
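To reuse the same pattern for other properties, one option is to attach the forbidden clauses programmatically. The Python sketch below is only an illustration: the helper name forbid_properties is invented, and which properties actually mark an event as part of a series is something you would have to check per type.

```python
import copy
import json


def forbid_properties(query, property_ids):
    """Return a copy of `query` with a forbidden clause for each property."""
    result = copy.deepcopy(query)
    for prop in property_ids:
        # "limit": 0 keeps the clause out of the results; "mid" is there
        # because a clause with directives needs at least one property.
        result[0][prop] = [{"mid": None, "optional": "forbidden", "limit": 0}]
    return result


base = [{"name": None, "mid": None, "type": "/time/event"}]
series_properties = [
    "/time/event/instance_of_recurring_event",
    "/film/film_festival_event/festival",
]
extended = forbid_properties(base, series_properties)
print(json.dumps(extended, indent=2))
```

Working on a deep copy leaves the base query untouched, so it can be combined with different property lists.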
Can I get all topics that have a null value in a specific field?
For example, all people whose date_of_birth is null?
[{
"type": "/people/person",
"date_of_birth":null,
"name":null
}]
You need to use the "optional":"forbidden" directive:
[{
"type": "/people/person",
"date_of_birth": {
"value": null,
"optional": "forbidden"
},
"name": null,
"id": null
}]
(I added "id":null so that the Query Editor gives clickable links)
Note that a query has a default "limit": 100. If you want more results, add an explicit limit clause. If that times out, you'll need to use an MQL cursor.
If you need to deal with lots of results, the undocumented envelope parameter "page" provides more flexibility than "cursor", allowing you to move forward, back, or access a page at random, as opposed to just going forward like you can with the cursor.
The "optional": "forbidden" clause is the key to lots of useful queries. The "!everything" == "nothing" equivalency is just one of the most common ones.
Tom
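The forward-only cursor loop mentioned above can be sketched in Python like this. The transport is deliberately left out: fetch_page is a hypothetical caller-supplied function that performs one MQL read and returns the page of results plus the next cursor. The convention that an empty cursor requests the first page and a falsy cursor marks the end is an assumption to verify against the API you are calling.

```python
def read_all(query, fetch_page):
    """Collect every result by following the cursor until it is exhausted.

    `fetch_page(query, cursor)` is supplied by the caller (hypothetical here)
    and must return a (results, next_cursor) pair.
    """
    results = []
    cursor = ""  # assumed: an empty cursor asks for the first page
    while True:
        page, cursor = fetch_page(query, cursor)
        results.extend(page)
        if not cursor:  # assumed: a falsy cursor marks the last page
            return results


def fake_fetch(query, cursor):
    """Stand-in for a real MQL read, serving three canned pages."""
    pages = {
        "": (["a", "b"], "c1"),
        "c1": (["c", "d"], "c2"),
        "c2": (["e"], False),
    }
    return pages[cursor]


print(read_all([{"type": "/people/person"}], fake_fetch))
```

Passing the fetch function in keeps the loop testable without a live service.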