Duplicated resources for the same film in the Linked Movie Database - linked-data

I'm using data exposed by http://www.linkedmdb.org/ via their SPARQL endpoint, and I noticed that for some films there are two resources. For example:
For the film Tales of Terror there are these two resources:
http://data.linkedmdb.org/resource/film/34038
http://data.linkedmdb.org/page/film/8359
For the film The Corruptor:
http://data.linkedmdb.org/resource/film/34739
http://data.linkedmdb.org/page/film/18849
What is the difference between the two resources representing the same film?
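One way to investigate is to ask the endpoint what it actually asserts about each URI. Below is a minimal sketch using Python's SPARQLWrapper library; the endpoint address is an assumption (check the site's documentation), and since /page/... URLs are typically just the human-readable view of a /resource/... URI, the sketch queries the /resource/ form of both film IDs.

```python
# Sketch: list all triples about each film URI from the LinkedMDB SPARQL
# endpoint. Endpoint URL and the /resource/ form of the second ID are assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://data.linkedmdb.org/sparql"  # assumed endpoint location

def describe(resource_uri):
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(f"""
        SELECT ?p ?o WHERE {{
            <{resource_uri}> ?p ?o .
        }}
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [(b["p"]["value"], b["o"]["value"])
            for b in results["results"]["bindings"]]

# Compare what the store says about the two apparent duplicates.
for uri in ["http://data.linkedmdb.org/resource/film/34038",
            "http://data.linkedmdb.org/resource/film/8359"]:
    print(uri)
    for p, o in describe(uri):
        print("  ", p, "->", o)
```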

Related

University data aggregation

I have a client who wants to build a web application targeted towards college students. They want the students to be able to pick which class they're in from a valid list of classes and teachers. Websites like koofers, schedulizer, and noteswap all have lists from many universities that are accurate year after year.
How do these companies aggregate this data? Do these universities have some api for this specific purpose? Or, do these companies pay students from these universities to input this data every year?
We've done some of this for a client, and in each case we had to scrape the data. If you can get an API, definitely use it, but my guess is that the vast majority will need to be scraped.
I would guess that these companies have some kind of agreements and use an API for data exchange. If you don't have access to that API, though, you can still build a simple web scraper that extracts the data for you.
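As a rough illustration of the scraping approach, here is a minimal sketch using requests and BeautifulSoup. The URL and the selectors are hypothetical placeholders; every university catalog page is structured differently, so you would adapt them per site.

```python
# Minimal scraping sketch: the catalog URL and CSS selectors below are
# hypothetical; adjust them to the real markup of the page you target.
import requests
from bs4 import BeautifulSoup

CATALOG_URL = "https://example.edu/course-catalog"  # hypothetical page

def scrape_courses(url):
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    courses = []
    # Assume each course sits in a table row with code, title and
    # instructor cells; real pages will differ.
    for row in soup.select("table.courses tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 3:
            courses.append({"code": cells[0],
                            "title": cells[1],
                            "instructor": cells[2]})
    return courses

if __name__ == "__main__":
    for course in scrape_courses(CATALOG_URL):
        print(course)
```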

Any free mapping service to display and filter 250000+ datapoints?

I participated in a hackathon in my city, and the traffic department made public a dataset with more than 250 thousand traffic-accident datapoints, each containing latitude, longitude, type of accident, vehicles involved, etc.
I tested displaying the data using the Google Maps API and Google Fusion Tables, but the usage limits were quickly reached with just the first two years of the 13 years of records.
The data for two years can be displayed and filtered here.
So my question is:
Which free online services could I use in order to interactively display and filter 250 thousand such datapoints as map layers?
It is important that the service be free, because we are volunteering our time for the non-profit public good. Currently our City Hall is implementing an API, but it is not ready yet, and it would be useful to show them some well-received use cases to build political pressure for further API development on THEIR server (especially remote querying of a database instead of crawling a bunch of .csv files, as it is now...).
An alternative would be to put everything on GitHub and load the whole dataset client-side to be manipulated with D3.js, for example, but that seems very inefficient both for the client/user and for the server.
Thanks for reading, and feel free to re-tag if needed.
You need Google Maps API for Business to achieve what you want, but it costs a lot of money.
However, in some cases you can get this Business Licence if you work for a non-profit organization. I can't find the exact eligibility rules for this free licence; I tried googling them but couldn't find anything. I only found this link; take a look and see if it answers your problem.
You should be able to do that with Google Fusion Tables. The limit is 100,000 points per table, but you can overlay 5 layers onto a single map, so in effect you can reach 500,000 points. I implemented the website below and have run it with over 200,000 points.
http://www.skyscan.co.uk/mapsearch.html
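If you go the Fusion Tables route, the 100,000-points-per-table limit means splitting the full 13-year dataset into several tables and overlaying them as layers. Here is a minimal pre-processing sketch, assuming the records live in a single CSV file; the file name is hypothetical.

```python
# Split one large accident CSV into <=100,000-row chunks, one per
# Fusion Table / map layer. The input file name is a placeholder.
import csv

MAX_ROWS_PER_TABLE = 100_000  # Fusion Tables point limit per table

def split_csv(path, prefix="accidents_part"):
    with open(path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        part, rows = 1, []
        for row in reader:
            rows.append(row)
            if len(rows) == MAX_ROWS_PER_TABLE:
                _write_part(prefix, part, header, rows)
                part, rows = part + 1, []
        if rows:
            _write_part(prefix, part, header, rows)

def _write_part(prefix, part, header, rows):
    out_path = f"{prefix}_{part}.csv"
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(rows)
    print(f"wrote {out_path} ({len(rows)} rows)")

split_csv("accidents.csv")  # hypothetical input file
```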

Travel APIs: how to integrate them all?

I may start working on a project very similar to Hipmunk.com, which pulls hotel cost information by calling different APIs (like expedia, orbitz, travelocity, hotels.com, etc.).
I did some research on this, but I am not able to find any unique hotel ID or any field to match hotels across the several APIs. Does anyone have experience with how to compare a hotel from expedia with one from orbitz or travelocity, etc.?
Thanks
EDIT: Google is also doing the same thing: http://www.google.com/hotelfinder/
From what I have seen of GDS systems and these APIs, there is rarely a unique identifier between systems for, e.g., hotels.
Airports, airlines and countries have unique ISO identifiers: http://www.iso-code.com/airports.2.html
I would guess you are going to have to have your own internal mapping to identify and disambiguate the properties.
:|
When you get started with hotel APIs, the choice of free ones isn't really that big; see e.g. here for an overview.
The most extensive and accessible one is Expedia's EAN (http://developer.ean.com/), which includes Sabre and Venere with unique IDs, but each is still structured differently.
That is, you are looking into different database tables.
You do get several identifiers such as name, address, and coordinates, which can serve for unique identification, assuming they are free of errors. Which is an assumption.
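Since there is no shared ID, one common approach is to match properties across feeds on normalized name similarity plus geographic proximity. A minimal sketch of that idea follows; the record layout, noise words, and thresholds are assumptions for illustration, not anything these APIs guarantee.

```python
# Sketch: decide whether two hotel records from different feeds refer to the
# same property, using normalized-name similarity plus coordinate distance.
import math
import re
from difflib import SequenceMatcher

def normalize(name):
    """Lower-case, strip punctuation, and drop noise words like 'hotel'."""
    name = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [t for t in name.split() if t not in {"hotel", "the", "inn"}]
    return " ".join(tokens)

def distance_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two lat/lon points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def same_hotel(a, b, name_threshold=0.85, max_km=0.3):
    """a and b are dicts with 'name', 'lat', 'lon' (assumed record layout)."""
    score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    close = distance_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
    return score >= name_threshold and close

# Illustrative records standing in for entries from two different feeds.
feed_a = {"name": "The Grand Central Hotel", "lat": 40.7484, "lon": -73.9857}
feed_b = {"name": "Grand Central Hotel", "lat": 40.7486, "lon": -73.9855}
print(same_hotel(feed_a, feed_b))  # True
```

In practice you would still keep an internal mapping table of confirmed matches, since fuzzy matching alone will produce both false positives and misses.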

Does anyone know of a service/db I can use for businesses and geocode?

I was wondering how and where companies like Foursquare/Gowalla find their lists of locations/businesses and keep them up to date.
Is it a web service? Do they buy a directory and enter it into a database?
This is from a comment I found at http://www.quora.com/Where-or-how-does-a-company-like-Foursquare-get-a-directory-of-all-locations-and-their-addresses
Companies usually get place data from one of the following:
Data licenses. Companies like Localeze, InfoUSA, Amacai, etc. license location data. Big players like TeleAtlas and Navteq serve as global aggregators of this data. There are also lots of small niche players that license e.g. restaurant data only, or ATM data only, on a per-country basis.
Crowd sourcing. Some companies crowd-source their data.
Open data sets. There are some data sets with a Creative Commons or other license from which location-related data can be extracted, e.g. GeoCommons and Wikipedia.
APIs. A number of companies provide APIs by which you can access data on the fly. These include GeoAPI.com, Google, Yelp, and others.
In general, this data is fragmented both in type (e.g. POI vs. neighborhood or geocode) and place (US vs. UK vs. South Africa vs. wherever).
Google has a geocoding service that's freely available for personal use.
For business use it costs a few $, but it's still pretty reasonable.
And the API is pretty straightforward:
http://code.google.com/apis/maps/documentation/javascript/v2/services.html
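For server-side use there is also Google's Geocoding web service, which is simpler than the JavaScript API linked above. A minimal sketch (you need your own API key, and the address is just an example):

```python
# Sketch: geocode an address with Google's Geocoding web service.
# Requires a valid API key; the sample address is only illustrative.
import requests

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def geocode(address, api_key):
    resp = requests.get(GEOCODE_URL,
                        params={"address": address, "key": api_key},
                        timeout=30)
    resp.raise_for_status()
    data = resp.json()
    if data["status"] != "OK":
        raise RuntimeError(f"Geocoding failed: {data['status']}")
    top = data["results"][0]
    loc = top["geometry"]["location"]
    return top["formatted_address"], loc["lat"], loc["lng"]

print(geocode("1600 Amphitheatre Parkway, Mountain View, CA", "YOUR_API_KEY"))
```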

How to visually design a mashup query for programmatic extraction

I'm developing an application that fetches various inputs from internet pages, where each information snippet comes from a different location (a mashup).
I would like to generate the mashup building blocks (snippets) through a visual tool.
Do you know of anything similar that can be used for such a project? (An already-made control, sample code, an article, etc.)
The preferred development environment is .NET, but that's not mandatory.
IMO the major challenge will be to extract the appropriate information from each feed in semantic form. Wikipedia describes mashups as:
There are many types of mashups, such as consumer mashups, data mashups, and enterprise mashups. The most common type of mashup is the consumer mashup, aimed at the general public.
Data mashups combine similar types of media and information from multiple sources into a single representation. One example is AlertMap, which combines data from over 200 sources related to severe weather conditions, biohazard threats, and seismic information, and displays them on a map of the world; another is Chicago Crime Map, which indicates the crime rate and location of crime in Chicago.
The classic mashup, Chicago crime, works because key information such as dates and geolocations is available semantically. Other types of common information are persons, organisations, and domain-specific identifiers.
When you have identified these, you may wish to consider the RDF-based tools that the semantic web community is developing. Note that governments are starting to publish their data in RDF, so I would see this as a key technology.
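As a tiny illustration of the RDF route, here is a sketch using Python's rdflib to load a dataset and pull out date and location properties with SPARQL. The dataset URL and the exact predicates are placeholders; whichever government dataset you pick will use its own vocabulary.

```python
# Sketch: load an RDF dataset and extract date/location-style properties.
# The source URL and predicates below are illustrative placeholders.
from rdflib import Graph

g = Graph()
# Any RDF serialization (RDF/XML, Turtle, N-Triples) that rdflib recognizes.
g.parse("http://example.gov/open-data/crimes.rdf")  # hypothetical dataset

query = """
    PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    PREFIX dc:  <http://purl.org/dc/elements/1.1/>
    SELECT ?item ?date ?lat ?long WHERE {
        ?item dc:date ?date ;
              geo:lat ?lat ;
              geo:long ?long .
    }
"""
for item, date, lat, long in g.query(query):
    print(item, date, lat, long)
```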
If your web pages do not expose semantic information directly, you will probably have to create screen scrapers and HTML parsers. That's not very glamorous; there are no special tools, and it tends to be just hard work.
