CKAN DataStore and Data API

  • Rufus Pollock
  • 27 Mar 2012
Today we've got some big news: the arrival of a new version of the CKAN DataStore and Data API. This work will ship as part of CKAN v1.7, and is already live on the DataHub.

What is it?

The CKAN DataStore provides, as an integrated part of CKAN, a database for structured storage of data, together with a powerful Web-accessible, JSON-based Data API. Data can be automatically inserted into the DataStore from spreadsheet files in .csv or .xls (Excel) format that are uploaded or linked to in CKAN. Alternatively, it can be added directly via the Data API. The DataStore builds on the open-source ElasticSearch storage and search engine (which is in turn built on Lucene), so it inherits those technologies' scalability, reliability and search capabilities.
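To make the "structured storage" idea concrete, here is a minimal Python sketch of how rows in a .csv file map onto the JSON documents a JSON-based API works with. It is an illustration only, not CKAN's actual import code: the function names and the sample rows are made up for this post, and nothing is sent over the network.

```python
import csv
import io
import json


def csv_to_documents(csv_text):
    """Turn CSV text into a list of dicts, one per data row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def to_json_lines(documents):
    """Serialise each document as one JSON object per line."""
    return "\n".join(json.dumps(doc, sort_keys=True) for doc in documents)


# Made-up sample rows (hypothetical, not taken from any real dataset).
SAMPLE_CSV = "authority,score\nExampleshire,12.5\nSampleton,30.1\n"

documents = csv_to_documents(SAMPLE_CSV)
payload = to_json_lines(documents)
print(payload)
```

Note that the csv module reads every value as a string; a real import would also need to decide which columns are numbers or dates.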

Why the excitement?

The new DataStore and Data API significantly enhance existing CKAN capabilities, making it easy to do things like:
  • Link or upload a file and automatically get a powerful web API to your data in seconds.
  • Build apps and mashups quickly and easily. Using the DataStore means you don't need to spend time creating and hosting a data API, but can instead focus on building your app or mashup using CKAN's powerful JSON-based one.
  • Utilise the full capabilities of our powerful Recline data explorer, for example rich filtering and full-text search of datasets (when their data is in the DataStore).

How it works

Let's see the DataStore in action. Here is a dataset on thedatahub.org created a few weeks ago. It contains data that featured in an article in the Guardian, ranking English local authorities by various measures of poverty.

[IMG: Datastore 1]

The data is held in a resource that links to a Google Docs spreadsheet. CKAN's DataStore doesn't recognise these at the moment, but all is not lost - we simply download the spreadsheet from Google Docs as a .csv file and add it as a second resource to the CKAN dataset. Uploading a file to CKAN was covered in this blog post (though the interface has changed slightly), and the upload process is identical when using the DataStore. However, when editing the resource's details, there is now an option to enable the DataStore, which we select:

[IMG: Datastore 2]

When we save the changes and return to the dataset page, both resources are now shown. We select the new resource:

[IMG: Datastore 3]

Resources now have their own page in CKAN, so the page for the new resource is displayed. Because the .csv file is now in the DataStore, CKAN gives an interactive preview of the file, using the built-in Recline Data Explorer.

[IMG: Datastore 4]

Recline lets the user filter, search, graph and sort the data on the resource page. In the screenshot above, the user is about to sort local authorities by one of the given poverty measures. Watch this space for a post soon with more information about CKAN's data viewer.

The screenshot also shows that the resource now has a "Data API" button. This displays useful information about using the API for this resource, including the endpoints, some examples, and links to further information.
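As a rough idea of what a call to a resource's Data API might look like, here is a small Python sketch that builds a search URL and an ElasticSearch-style query body. The URL layout (`/api/data/<resource-id>/_search`) and the `q` and `size` parameters are assumptions based on the DataStore being built on ElasticSearch; the "Data API" button on the resource page is the authoritative source for the real endpoints. `RESOURCE-ID` and the `score` field are placeholders, and the code only builds strings and dicts without making any request.

```python
from urllib.parse import urlencode


def search_url(site, resource_id, text, size=10):
    """Build a full-text search URL for one resource (assumed URL layout)."""
    query = urlencode({"q": text, "size": size})
    return f"{site.rstrip('/')}/api/data/{resource_id}/_search?{query}"


def sorted_query_body(field, descending=True, size=10):
    """Build an ElasticSearch-style body that matches all rows, sorted by one field."""
    order = "desc" if descending else "asc"
    return {
        "query": {"match_all": {}},
        "sort": [{field: {"order": order}}],
        "size": size,
    }


# Placeholder values: substitute the resource id shown on the resource page.
url = search_url("http://thedatahub.org/", "RESOURCE-ID", "inner london", size=5)
body = sorted_query_body("score")
print(url)
print(body)
```

The sorted query mirrors what the Recline explorer does in the screenshot: return every row, ordered by one chosen measure.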

Find out more

For more information, check out the detailed docs or read through the 5-minute slideshow overview: