How to query a list of OpenData endpoints

How can I query a list of endpoints provided by an OpenData site using Socrata?

The simplest way to grab that is to request /data.json, which returns a listing of all the datasets on the domain; for example, https://data.seattle.gov/data.json.
The identifiers listed in data.json can then be used with our SODA API endpoints.
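For instance, a minimal Python sketch (using the requests library, and assuming the standard data.json catalog fields) that lists every dataset on data.seattle.gov:

    import requests

    # Fetch the data.json catalog that Socrata domains expose at /data.json.
    catalog = requests.get("https://data.seattle.gov/data.json").json()

    # "dataset", "identifier" and "title" follow the data.json catalog format;
    # inspect the response if your domain returns a different structure.
    for dataset in catalog.get("dataset", []):
        print(dataset.get("identifier"), dataset.get("title"))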

Related

How to add an ontology to `virtuoso-sparql-endpoint-quickstart`?

I'm new to the Docker world. I want to query an ontology locally. I have already configured virtuoso-sparql-endpoint-quickstart.
It works, and my endpoint is http://localhost:8890/sparql.
Now I want to query my own ontology (not DBpedia). Can I still use the same endpoint? How can I add my ontology to Virtuoso?
Please note that an ontology is a vocabulary used to describe one or more classes of entities. The descriptions themselves are typically referred to as instance data, and queries are usually run over such instance data. (There are a few ontologies used to describe ontologies, and these descriptions are also instance data, and queries might be made against them.)
There are a number of ways to load data into Virtuoso. The most useful for most people is the Bulk Load facility. For most purposes, you'll want to load your data into one or more distinct Named Graphs, so that queries can be scoped to one, some, or all of those Named Graphs.
Any and all queries can be made against the same http://localhost:8890/sparql endpoint. Results will vary depending on the Named Graphs identified in your query.
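For example, a minimal Python sketch using the SPARQLWrapper library; the graph URI http://example.org/my-ontology is only a placeholder for whichever Named Graph you loaded your data into:

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Same local Virtuoso endpoint as before; only the graph scoping changes.
    sparql = SPARQLWrapper("http://localhost:8890/sparql")
    sparql.setReturnFormat(JSON)

    # <http://example.org/my-ontology> is a placeholder Named Graph URI.
    sparql.setQuery("""
        SELECT ?s ?p ?o
        FROM <http://example.org/my-ontology>
        WHERE { ?s ?p ?o }
        LIMIT 10
    """)

    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["s"]["value"], row["p"]["value"], row["o"]["value"])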

Face API - create a person using a faceid list

I'm using Face.detect() to check if there are faces on a bunch of image files. After this, I collect the detected faceIds and call Face.group() to obtain groups of faces, which will group the faces by person.
Then I would like to create Persons using these lists of faceIds, but can't find the correct API method to do so. I can only create Person faces by re-uploading the images.
Is there a way to create Person (faces) using the previously obtained faceIds returned by Face.detect()?
This is not possible with the API today. Incidentally, a feature request for this exists already, and you may want to upvote/comment there.
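For reference, the detect-and-group part of the flow described in the question can be sketched against the REST API roughly as follows; the endpoint, key, and image URLs are placeholders, and the response fields should be checked against the current Face API documentation:

    import requests

    # Placeholders: substitute your own Face API endpoint and subscription key.
    ENDPOINT = "https://YOUR-RESOURCE.cognitiveservices.azure.com"
    HEADERS = {"Ocp-Apim-Subscription-Key": "YOUR-KEY",
               "Content-Type": "application/json"}

    def detect(image_url):
        # Face - Detect returns a list of faces, each with a "faceId".
        resp = requests.post(ENDPOINT + "/face/v1.0/detect",
                             params={"returnFaceId": "true"},
                             headers=HEADERS, json={"url": image_url})
        return [face["faceId"] for face in resp.json()]

    face_ids = []
    for url in ["https://example.com/photo1.jpg", "https://example.com/photo2.jpg"]:
        face_ids.extend(detect(url))

    # Face - Group clusters the collected faceIds by apparent person.
    groups = requests.post(ENDPOINT + "/face/v1.0/group",
                           headers=HEADERS, json={"faceIds": face_ids}).json()
    print(groups["groups"], groups["messyGroup"])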

Practical usage for linked data

I've been reading about linked data and I think I understand the basics of publishing it, but I'm trying to find real-world, practical (and best-practice) uses for linked data. Many books and online tutorials talk a lot about RDF and SPARQL, but not about dealing with other people's data.
My question is, if I have a project with a bunch of data that I output as RDF, what is the best way to enhance (or correctly use) other people's data?
If I create an application for animals and I want to use data from the BBC wildlife page (http://www.bbc.co.uk/nature/life/Snow_Leopard), what should I do? Should I crawl the BBC wildlife page for RDF and save the contents to my own triplestore, query the BBC with SPARQL (I'm not sure that this is actually possible with the BBC), or take the URI for my animal (owl:sameAs) and curl the content from the BBC website?
This also raises the question: can you programmatically add linked data? I imagine you would have to crawl the BBC wildlife page unless they provide an index of all the content.
If I wanted to add extra information, such as location, for these animals (http://www.geonames.org/2950159/berlin.html), again, what is considered the best approach? owl:habitat (a fake predicate) Brazil? And curl the RDF for Brazil from the GeoNames site?
I imagine that linking to the original author is the best way, because your data can then be kept up to date, which, judging from the slides of a BBC presentation (http://www.slideshare.net/metade/building-linked-data-applications), is what the BBC does. But what if the author's website goes down or is too slow? And if you were to index the author's RDF, I imagine your owl:sameAs would point to a local copy of the RDF.
Here's one potential way of creating and consuming linked data.
If you are looking for an entity (i.e., a 'Resource' in Linked Data terminology) online, see if there is a Linked Data description of it. One easy place to find one is DBpedia. For Snow Leopard, one URI that you can use is http://dbpedia.org/page/Snow_leopard. As you can see from the page, there are several object and property descriptions. You can use them to create a rich information platform.
You can use SPARQL in two ways. Firstly, you can directly query a SPARQL endpoint on the web where there might be some data; the BBC had one for music, though I'm not sure whether they do for other information, and DBpedia can be queried using snorql. Secondly, you can retrieve the data you need from these endpoints and load it into your triple store using the INSERT and INSERT DATA features of SPARQL 1.1. To access the SPARQL endpoints from your triple store, you will need to use the SERVICE feature of SPARQL. The second approach protects you from being unable to execute your queries when a publicly available endpoint is down for maintenance.
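As a rough Python sketch of the first approach (SPARQLWrapper is one common library for this; the query simply pulls a sample of statements about the Snow Leopard resource):

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Query DBpedia's public SPARQL endpoint directly.
    sparql = SPARQLWrapper("http://dbpedia.org/sparql")
    sparql.setReturnFormat(JSON)
    sparql.setQuery("""
        SELECT ?property ?value
        WHERE { <http://dbpedia.org/resource/Snow_leopard> ?property ?value }
        LIMIT 50
    """)

    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["property"]["value"], row["value"]["value"])
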
To programmatically add the data to your triplestore, you can use one of the existing libraries. In Python, RDFlib is useful for such applications.
To enrich the data with that sourced from elsewhere, there can again be two approaches. The standard way of doing it is using existing vocabularies. So, you'd have to look for the habitat predicate and just insert this statement:
dbpedia:Snow_leopard prefix:habitat geonames:Berlin .
If no appropriate ontologies are found to contain the property (which is unlikely in this case), one needs to create a new ontology.
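As a rough RDFlib sketch of this kind of enrichment (the ex:habitat predicate is a stand-in for whatever existing or newly created property you settle on):

    from rdflib import Graph, Namespace, URIRef

    g = Graph()
    DBR = Namespace("http://dbpedia.org/resource/")
    EX = Namespace("http://example.org/ontology/")  # placeholder namespace

    # Pull DBpedia's description of the resource into the local graph
    # (rdflib content-negotiates the RDF serialization).
    g.parse("http://dbpedia.org/resource/Snow_leopard")

    # Enrich it with a statement pointing at the GeoNames resource;
    # "habitat" is the hypothetical predicate discussed above.
    g.add((DBR.Snow_leopard, EX.habitat,
           URIRef("http://sws.geonames.org/2950159/")))

    g.serialize("enriched.ttl", format="turtle")
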
If you want to keep your information current, then it makes sense to run your queries periodically. Using something such as DBpedia Live is useful in this regard.

Scraping BRfares for train fares

I am looking for advice. The following website
http://brfares.com/#home
provides fares information for UK train lines. I would like to use it to build a database of travel costs for seasons tickets from different locations. I have never done this kind of thing before but have experience with Python/Bash scripting and some HTML.
Viewing the source code for a typical query, the actual fare information is not displayed in index.html. Can anyone provide a pointer as to how to go about scraping (a new word for me) the information?
This is the URL for the query: http://brfares.com/querysimple?orig=SUY&dest=0415&rlc=
The response is a JSON object.
First you need to build a lookup table of all the destination codes. You can use the following link to do that: http://brfares.com/ac_loc?term=. Do it for all the letters of the alphabet and then parse the results into a unique list.
Then take the codes in pairs, execute the JSON query, parse the returned JSON, and feed the data into a database.
Now you can do whatever you want with that database.
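As a rough Python sketch of those steps using the requests library (the JSON field names "code" and "name" are assumptions, so inspect the real responses before relying on them):

    import requests
    import string

    BASE = "http://brfares.com"

    # 1. Build a lookup of location codes from the autocomplete endpoint,
    #    one request per letter, then de-duplicate via the dict keys.
    codes = {}
    for letter in string.ascii_lowercase:
        for entry in requests.get(BASE + "/ac_loc", params={"term": letter}).json():
            codes[entry.get("code")] = entry.get("name")  # assumed field names

    # 2. Query fares for a pair of codes, as in the example URL above,
    #    and parse the returned JSON before writing it to a database.
    fares = requests.get(BASE + "/querysimple",
                         params={"orig": "SUY", "dest": "0415", "rlc": ""}).json()
    print(fares)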

Getting information about linked topics in a single request

Say I search for a city using the Freebase API, for example San Francisco:
https://www.googleapis.com/freebase/v1/topic/m/0d6lp?limit=20&filter=/common/topic/description&filter=/common/topic/article&filter=/location/location/geolocation&filter=/location/location/containedby&filter=/travel/travel_destination/tourist_attractions
I get a bunch of data, including '/location/location/containedby', which lists the other entities this one is contained by. This is how I can find out which state and country the city belongs to.
The problem is that I only get those entities' names and mids, but not '/common/topic/notable_for'; therefore, I have to make a separate query for each entity, asking just for the notable_for property, to find out which of them is a country, a state, or other stuff I don't need.
For example, this is one of the queries, which determines that United States of America is a country:
https://www.googleapis.com/freebase/v1/topic/m/09c7w0?filter=/common/topic/notable_for
This is executed between 3 and 6 times per city.
Is there a way to tell the API to include more information about the linked entities on a certain topic? In this case, to include '/common/topic/notable_for' for the linked entities. It would save tons of queries, and time for the end user in my case.
Thank you for your time!
You can actually get these results using the new output parameter on the Freebase Search API. Like this:
query=/m/0d6lp
output=(description /location/location/geolocation (/location/location/containedby notable))
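For example, calling the Search API from Python with those parameters (an API key may also be required via the key parameter):

    import requests

    params = {
        "query": "/m/0d6lp",
        "output": "(description /location/location/geolocation"
                  " (/location/location/containedby notable))",
    }
    resp = requests.get("https://www.googleapis.com/freebase/v1/search",
                        params=params)
    print(resp.json())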
I'd suggest using the MQL Read API if you want better control over the information returned. You can then specify nested queries that ask for the containedby locations to be returned with their types (or you can explicitly filter to only those which are a country or a state).
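For instance, a rough sketch of that kind of nested MQL query from Python (again, an API key may be required; the property paths mirror the ones used above):

    import json
    import requests

    # Nested MQL query: return the city's containedby locations together with
    # their types, so countries and states can be told apart in one request.
    query = [{
        "id": "/m/0d6lp",
        "name": None,
        "/location/location/containedby": [{
            "id": None,
            "name": None,
            "type": []
        }]
    }]

    resp = requests.get("https://www.googleapis.com/freebase/v1/mqlread",
                        params={"query": json.dumps(query)})
    print(resp.json())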

Resources