CKAN meets The DataTank
The DataTank is a data adapter for machine-readable data. It can take data in any of numerous formats, from a CSV file to a SPARQL endpoint, and create an HTTP interface on top of it. This interface is a REST API which lets you read the data in a different format, page it, and query it. The latest version of the DataTank was released two weeks ago.
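As a rough illustration of what "read the data in a different format, page it" could look like from a client's point of view, here is a minimal sketch that only builds a request URL. The URL layout (`<base>/<collection>/<resource>.<format>`), the `limit` and `offset` parameter names, and the helper `build_resource_url` are assumptions made for this example, not a statement of The DataTank's actual API; check your instance's documentation for the real routes.

```python
from urllib.parse import quote, urlencode


def build_resource_url(base_url, collection, resource, fmt="json",
                       limit=None, offset=None):
    """Build a read URL for a published resource.

    Assumed (hypothetical) layout:
        <base_url>/<collection>/<resource>.<fmt>?limit=..&offset=..
    """
    # Percent-encode each path segment so spaces and slashes stay safe.
    path = "/".join(quote(part, safe="") for part in (collection, resource))
    url = f"{base_url.rstrip('/')}/{path}.{fmt}"

    # Only include the paging parameters that were actually supplied.
    params = {key: value
              for key, value in (("limit", limit), ("offset", offset))
              if value is not None}
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    # Hypothetical host, collection, and resource names.
    print(build_resource_url("http://localhost/datatank", "transport",
                             "stations", "json", limit=50, offset=100))
```

Switching the `fmt` argument is all it takes to ask for another representation of the same resource under this assumed scheme, which is the point of putting one HTTP interface on top of many source formats.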
We asked ourselves: what if we could combine its power with CKAN, a great data registry which stores metadata for all kinds of data (machine-readable or not), with great search functionality and integrated data storage? This new CKAN extension is the answer.
To use the extension, you need to be running both The DataTank and CKAN. When you add a JSON or XML resource (file) to CKAN, it will automatically be added to your DataTank instance, making it instantly usable by app developers. The DataTank can also export metadata using DCAT.
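The decision described above, forwarding only JSON and XML resources, can be sketched as a small pure function. This is not the extension's actual code: the `ckan/<id>` identifier scheme, the payload keys (`type`, `uri`, `description`), and the function name `build_definition` are hypothetical. The input is assumed to be a CKAN resource dictionary with its usual `id`, `url`, `format`, `name`, and `description` keys.

```python
import json

# Formats the extension forwards, according to the text above.
SUPPORTED_FORMATS = {"json", "xml"}


def build_definition(resource):
    """Map a CKAN resource dict to (identifier, payload), or None.

    Returns None when the resource format is not one that gets forwarded.
    The identifier scheme and payload keys are assumptions for this sketch.
    """
    fmt = (resource.get("format") or "").strip().lower()
    if fmt not in SUPPORTED_FORMATS:
        return None

    identifier = f"ckan/{resource['id']}"
    payload = {
        "type": fmt,
        "uri": resource["url"],
        # Fall back to the resource name when no description was entered.
        "description": resource.get("description") or resource.get("name") or "",
    }
    return identifier, payload


if __name__ == "__main__":
    # Hypothetical resource, shaped like the dicts CKAN passes around.
    example = {
        "id": "abc123",
        "name": "Stations",
        "format": "JSON",
        "url": "http://example.org/stations.json",
    }
    print(json.dumps(build_definition(example), indent=2))
```

Keeping the mapping free of network calls makes it easy to test on its own; the step that actually sends the payload to a DataTank instance would sit in a separate function.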
We’re currently working on deeper, smarter integration of The DataTank into CKAN. We want to extend CKAN’s “add dataset” interface so that the user can add extra information about a file (for example, whether a CSV file has a header row), which will be added to The DataTank’s discovery document. All help in developing this further is welcome! If you can code in Python, know how to create extra fields in CKAN and know how to call an HTTP API, you’ll love contributing.
In the longer term, The DataTank has more features in the pipeline: reintroducing SPECTQL, a query language developed for an earlier version of The DataTank (TDT) that lets API sources be filtered and queried; automatically mapping machine-readable data whose model is known to RDF, using tdt/streamingrdfmapper; analytics on usage; and so on.
We’d love for more people to get involved in the project. Here are some suggestions:
- Check out our website
- Join the DataTank discussion list
- Fork The DataTank’s code
- Or fork the CKAN extension
We look forward to hearing from you!