DBpedia

DBpedia is a project that aims to extract structured content from the information created as part of the Wikipedia project. The data is queried with SPARQL, an SQL-like query language for RDF.

DBpedia allows users to semantically query relationships and properties associated with Wikipedia resources, including links to other related datasets. DBpedia has been described by Tim Berners-Lee as one of the more famous parts of the decentralized Linked Data effort.

DBpedia Spotlight is a tool for annotating mentions of DBpedia resources in text. This provides a solution for linking unstructured information sources to the Linked Open Data cloud through DBpedia.

Examples

Example queries are available at dbpedia.org.

For example, imagine you were interested in the Japanese shōjo manga series Tokyo Mew Mew, and wanted to find the genres of other works written by its illustrator.

PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
SELECT ?who ?WORK ?genre WHERE {
  db:Tokyo_Mew_Mew dbprop:author ?who .
  ?WORK dbprop:author ?who .
  OPTIONAL { ?WORK dbprop:genre ?genre } .
}

To answer this query, DBpedia combines information from Wikipedia's entries on Tokyo Mew Mew, on Mia Ikumi, and on works such as Super Doll Licca-chan and Koi Cupid.
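
As a rough illustration of how such a query could be sent from a program, the following Python sketch uses only the standard library. It assumes that the public endpoint is https://dbpedia.org/sparql (the endpoint address is not given above) and that it returns the standard SPARQL JSON results format when asked through the Accept header. The function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
SELECT ?who ?WORK ?genre WHERE {
  db:Tokyo_Mew_Mew dbprop:author ?who .
  ?WORK dbprop:author ?who .
  OPTIONAL { ?WORK dbprop:genre ?genre }
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(results):
    """Flatten a SPARQL JSON results document into plain dicts.

    Variables left unbound by OPTIONAL are simply missing from a row.
    """
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query over HTTP and return the rows (needs network access)."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=30) as response:
        return rows_from_results(json.load(response))
```

Calling run_query(QUERY) should return one dict per result row. Because the genre is optional in the query, it is safer to read it with row.get("genre") than with row["genre"].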

How it works

Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables, categorisation information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried.

The DBpedia project uses the Resource Description Framework (RDF) to represent the extracted information. The resulting dataset consists of about 3 billion RDF triples: 580 million extracted from the English edition of Wikipedia and 2.46 billion from other language editions.
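
To make the extraction step more concrete, here is a deliberately simplified Python sketch that turns a made-up infobox into RDF-style triples of subject, property and value. This is not the DBpedia extraction code. The sample markup, the function name infobox_to_triples and the regular expressions are assumptions for illustration only, and real infoboxes are far messier.

```python
import re

RESOURCE = "http://dbpedia.org/resource/"
PROPERTY = "http://dbpedia.org/property/"

# Hypothetical, simplified infobox markup, made up for this example.
SAMPLE = """{{Infobox animanga
| author = [[Mia Ikumi]]
| genre  = Magical girl
}}"""

# One "| key = value" field per line.
FIELD = re.compile(r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.+?)[ \t]*$", re.MULTILINE)
# A value that is exactly one wiki link, e.g. [[Target]] or [[Target|label]].
LINK = re.compile(r"^\[\[([^\]|]+)(?:\|[^\]]*)?\]\]$")


def infobox_to_triples(title, wikitext):
    """Return (subject, property, value) triples for each infobox field."""
    subject = "<" + RESOURCE + title.replace(" ", "_") + ">"
    triples = []
    for key, value in FIELD.findall(wikitext):
        link = LINK.match(value)
        if link:
            # A link to another article becomes a link to another resource.
            obj = "<" + RESOURCE + link.group(1).replace(" ", "_") + ">"
        else:
            # Anything else is kept as a plain text value.
            obj = '"' + value + '"'
        triples.append((subject, "<" + PROPERTY + key + ">", obj))
    return triples
```

Calling infobox_to_triples("Tokyo Mew Mew", SAMPLE) gives one triple per field. The author field links to another resource, which is what lets a query follow the connection from a work to its author and on to other works.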