Wikidata:Data access

Note: This page will shortly receive an update with the content presented in the recent Data Reuse Days presentation. You can view the draft here.

This page is a starting point for obtaining data from Wikidata. For methods that can be used inside Wikimedia projects, please see the page about how to use data on Wikimedia projects.

Basic important things to know

Volunteers like these people – and you – make Wikidata

Wikidata offers a wide range of general data about our universe as well as links to other databases. The data is published under the CC0 "Public domain dedication" license. It can be edited by anyone and is maintained by Wikidata's editor community.

Changes to APIs and data formats used to access Wikidata are subject to the Stable Interface Policy. Changes to stable interfaces will be announced accordingly. Note that not all data sources mentioned on this page are considered stable interfaces.

How can I get data out of Wikidata?

There are several ways to access and edit the data from Wikidata. You can access data per item, or the entirety of the data as dumps.

Per-item access to data

Data can be accessed either via dereferenceable URIs following linked data standards, or through the MediaWiki API.

Linked Data interface

Meet Q42

Each item or property has a persistent URI that you obtain by appending its ID (such as Q42 or P31) to the Wikidata concept namespace: http://www.wikidata.org/entity/

For example, the concept URI of Douglas Adams is http://www.wikidata.org/entity/Q42. Note that this URI refers to the real-world person, not Wikidata's description of Douglas Adams. However, it is possible to use the concept URI to access data about Douglas Adams by simply using it as a URL. When you request this URL, it triggers an HTTP redirect that forwards the client to the data URL for Wikidata's data about Douglas Adams: https://www.wikidata.org/wiki/Special:EntityData/Q42. The namespace for Wikidata's data about entities is https://www.wikidata.org/wiki/Special:EntityData/
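The mapping between entity IDs, concept URIs and data URLs described above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the function names are hypothetical, and the ID pattern only covers the item (Q...) and property (P...) IDs mentioned on this page.

```python
import re

# Both prefixes are taken from the text above.
CONCEPT_NAMESPACE = "http://www.wikidata.org/entity/"
DATA_NAMESPACE = "https://www.wikidata.org/wiki/Special:EntityData/"

# Assumption: only item (Q...) and property (P...) IDs are handled here.
_ENTITY_ID = re.compile(r"[QP][1-9][0-9]*")


def _check_id(entity_id: str) -> str:
    """Return the ID unchanged, or raise if it does not look like Q42 / P31."""
    if not _ENTITY_ID.fullmatch(entity_id):
        raise ValueError(f"not an item or property ID: {entity_id!r}")
    return entity_id


def concept_uri(entity_id: str) -> str:
    """URI that identifies the real-world thing (not Wikidata's description)."""
    return CONCEPT_NAMESPACE + _check_id(entity_id)


def data_url(entity_id: str) -> str:
    """Format-neutral URL of Wikidata's data about the entity."""
    return DATA_NAMESPACE + _check_id(entity_id)


def entity_id_from_uri(uri: str) -> str:
    """Recover the entity ID from a concept URI."""
    if not uri.startswith(CONCEPT_NAMESPACE):
        raise ValueError(f"not a Wikidata concept URI: {uri!r}")
    return _check_id(uri[len(CONCEPT_NAMESPACE):])
```

For instance, concept_uri("Q42") gives the concept URI of Douglas Adams, and data_url("Q42") gives the data URL that the concept URI redirects to.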

Appending an entity's ID to this prefix creates the "abstract" (format-neutral) form of the data URL of the entity. When you request a Special:EntityData URL, the special page applies content negotiation to determine the format of Wikidata's output. If you open the URL in a normal web browser, an HTML page of Wikidata's data about the entity is displayed, because a web browser prefers HTML over other formats. Linked data clients receive Wikidata's data about the entity in a different format such as JSON or RDF, depending on the HTTP Accept: header of their request.
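As a rough sketch of content negotiation from a client's point of view, the following Python code uses only the standard library to request the format-neutral data URL with an explicit Accept: header. The function names and the User-Agent string are placeholders made up for this example, and the media types shown are the standard ones for JSON and Turtle; treat the exact set that the server honours as something to verify.

```python
import json
import urllib.request

DATA_NAMESPACE = "https://www.wikidata.org/wiki/Special:EntityData/"

# Hypothetical identifier for this sketch; replace it with one that
# describes your own tool and how to contact you.
USER_AGENT = "DataAccessSketch/0.1 (you@example.org)"


def build_entity_request(entity_id: str,
                         accept: str = "application/json") -> urllib.request.Request:
    """Prepare a request whose Accept header selects the output format."""
    return urllib.request.Request(
        DATA_NAMESPACE + entity_id,
        headers={"Accept": accept, "User-Agent": USER_AGENT},
    )


def fetch_entity_json(entity_id: str) -> dict:
    """Perform the request (network access) and decode the JSON body."""
    request = build_entity_request(entity_id)
    # urllib follows the redirect issued by Special:EntityData automatically.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling fetch_entity_json("Q42") should return the JSON document for Q42, while build_entity_request("Q42", "text/turtle") prepares a request for the same entity as Turtle.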

For cases in which it is inconvenient to use content negotiation (e.g. to view non-HTML content in a web browser), you can also access data about an entity in a specific format by extending the data URL with an extension suffix to indicate the content format that you want, such as .json, .rdf, .ttl, .nt or .jsonld. For example, https://www.wikidata.org/wiki/Special:EntityData/Q42.json leads to a JSON export for item Q42. Specific revisions can be obtained by appending a revision query parameter like so https://www.wikidata.org/wiki/Special:EntityData/Q42.json?revision=112.

By default, the RDF returned by the Linked Data interface is self-contained and includes descriptions of the other entities it refers to. Use ?flavor=dump to exclude such information.
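The format suffix and the revision and flavor query parameters can be combined when building a data URL. The helper below is a small sketch under that assumption: its name and signature are invented for illustration, and it only accepts the suffixes listed above.

```python
from typing import Optional
from urllib.parse import urlencode

DATA_NAMESPACE = "https://www.wikidata.org/wiki/Special:EntityData/"

# Suffixes named in the text above.
FORMATS = ("json", "rdf", "ttl", "nt", "jsonld")


def entity_data_url(entity_id: str, fmt: str,
                    revision: Optional[int] = None,
                    flavor: Optional[str] = None) -> str:
    """Build a format-specific Special:EntityData URL."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format suffix: {fmt!r}")
    params = {}
    if revision is not None:
        params["revision"] = revision   # a specific revision of the entity
    if flavor is not None:
        params["flavor"] = flavor       # e.g. "dump" to omit referenced entities
    url = f"{DATA_NAMESPACE}{entity_id}.{fmt}"
    return f"{url}?{urlencode(params)}" if params else url
```

With this helper, entity_data_url("Q42", "json", revision=112) reproduces the revision example above, and entity_data_url("Q42", "ttl", flavor="dump") requests Turtle without the descriptions of referenced entities.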

MediaWiki API

See the documentation of the API. Note that there are multiple ways to query Wikidata entities. See Help:CirrusSearch and Help:Extension:WikibaseCirrusSearch for documentation of the advanced query possibilities.

Caution: Some API modules, in particular those accessed via action=query, will return raw page content. For entity pages, that raw page content is not guaranteed to use any documented format or follow any standard structure. Raw page content should be treated as an opaque blob. For access to the canonical JSON form of entity pages, use the wbgetentities and wbsearchentities modules.
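To illustrate the recommended route, here is a hedged sketch of building a wbgetentities request and reading a label from its JSON response. The endpoint URL https://www.wikidata.org/w/api.php and the response shape used in label() are assumptions not stated on this page; check them against the API documentation. The helper names are hypothetical.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Assumption: the standard MediaWiki API endpoint of Wikidata.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def wbgetentities_url(ids: Iterable[str],
                      props: Optional[Iterable[str]] = None,
                      languages: Optional[Iterable[str]] = None) -> str:
    """Build a GET URL for action=wbgetentities; multiple values are joined with '|'."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "format": "json",
    }
    if props:
        params["props"] = "|".join(props)          # e.g. labels, claims
    if languages:
        params["languages"] = "|".join(languages)  # restrict label languages
    return f"{API_ENDPOINT}?{urlencode(params)}"


def label(response: dict, entity_id: str, language: str = "en") -> str:
    """Read one label from a decoded wbgetentities response (assumed shape)."""
    return response["entities"][entity_id]["labels"][language]["value"]
```

The resulting URL can be fetched with any HTTP client; decoding the body as JSON yields the dictionary that label() expects.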

Help: Don't hesitate to contact the community on Telegram or IRC if you need help to get a query working.

SPARQL endpoints

You can query the data in Wikidata through our SPARQL endpoint, the Wikidata Query Service. The service can be used either through its interactive web interface or programmatically, by submitting GET or POST requests to https://query.wikidata.org/sparql. RDF data can alternatively be accessed via a Linked Data Fragments[1] interface at https://query.wikidata.org/bigdata/ldf. See the user manual and local community pages for more information.
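A programmatic GET request to the SPARQL endpoint could look roughly like the Python sketch below. It assumes that the query is passed in a query parameter, that results are requested in the standard SPARQL JSON results format via the Accept: header, and that the wd: prefix is predefined by the service. The function names, the example query and the User-Agent string are illustrative only.

```python
import json
import urllib.request
from urllib.parse import urlencode

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical identifier for this sketch; use one that describes your tool.
USER_AGENT = "DataAccessSketch/0.1 (you@example.org)"

# Illustrative query: a few statements about Q42.
EXAMPLE_QUERY = "SELECT ?p ?o WHERE { wd:Q42 ?p ?o } LIMIT 5"


def build_sparql_request(query: str) -> urllib.request.Request:
    """Prepare a GET request carrying the query as a URL parameter."""
    url = f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": USER_AGENT,
    }
    return urllib.request.Request(url, headers=headers)


def rows(result: dict) -> list:
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def run_query(query: str) -> list:
    """Send the query (network access) and return the flattened rows."""
    with urllib.request.urlopen(build_sparql_request(query), timeout=60) as resp:
        return rows(json.load(resp))
```

Calling run_query(EXAMPLE_QUERY) should return up to five rows with the keys p and o. For long queries, a POST request avoids URL length limits.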

Bots

We welcome well-behaved bots

You can also access the API by using a bot. See Wikidata:Bots for more on bots.

Access to dumps

You can download dumps of the whole content of Wikidata. See the database dumps documentation.

Incremental updates and event streams

The Wikimedia recent changes event streams can be used to see entity changes in real time. The recent changes API is also available, but it is not recommended for new tools: it does not publish the changes themselves, and it incurs more load on the servers because each entity change must be looked up separately.
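The sketch below shows one way a client might pick Wikidata changes out of the recent changes event stream. It rests on several assumptions that are not stated on this page: that the stream is served as Server-Sent Events at https://stream.wikimedia.org/v2/stream/recentchange, that each event's data is a JSON object, and that Wikidata events carry "wiki": "wikidatawiki". The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Assumption: public Server-Sent Events endpoint for recent changes.
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode the JSON payloads of a Server-Sent Events stream.

    'data:' lines are collected until a blank line ends the event;
    other fields (event:, id:) and ':' comment lines are ignored.
    """
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            data_parts.append(line[len("data:"):].lstrip(" "))
        elif line == "" and data_parts:
            yield json.loads("\n".join(data_parts))
            data_parts = []


def wikidata_changes(events: Iterable[dict]) -> Iterator[dict]:
    """Keep only events that belong to Wikidata (assumed 'wiki' field)."""
    for event in events:
        if event.get("wiki") == "wikidatawiki":
            yield event
```

To consume the live stream, the decoded text lines of an open HTTP response for STREAM_URL can be passed to iter_sse_events(); a production client would also need reconnection handling, which is omitted here.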

Best practices to follow

Our logo

Wikidata offers its data for free under CC0, with no requirement to attribute. We would, however, greatly appreciate it if you mentioned Wikidata as the origin of your data. This will allow us to ensure that the project stays around for a long time and provides you with up-to-date, high-quality data. We will also promote the best projects using Wikidata's data. Some examples for attributing Wikidata: "Powered by Wikidata", "Powered by Wikidata Tags", "Powered by Wikidata data", "Powered by the magic of Wikidata", "Using Wikidata data", "With data from Wikidata", "Data from Wikidata", "Source: Wikidata", "Including data from Wikidata", ... You can also use one of the ready-made files from us.

You may use the Wikidata logo (see above), but should not do so in any way that implies endorsement by Wikidata, or the Wikimedia Foundation.

Please offer your users a way to report issues in the data and find a way to feed this back to Wikidata's editor community. We are currently working on streamlining this process. Until then please announce where you collect issues on the Project chat.

Examples and showcases

A number of great tools are being built on top of Wikidata. The external tools page collects them.

See also