The Open Knowledge Data Packager

Sean Hammond
09 Jun 2014

Today we’re launching the Open Knowledge Data Packager!

Data Packager is a web app for quickly creating and publishing Tabular Data Packages from collections of CSV files on your computer. You can register for a free user account and start creating data packages now, or take a look at a sample data package.

With Data Packager’s simple interface you can create a data package, upload CSV files to it, enter some metadata, and get a web page where users can explore and download your data package.

When you log in, you’ll be taken to your dashboard, where you’ll see a list of any packages you’ve created so far and an Add package button:

[Figure: My Data Packager dashboard]

Click the Add package button to create a new data package and you’ll be taken to a form where you can enter the title and other metadata for your package:

[Figure: Creating a new data package]

Click on Next: Add CSV files and you’ll be taken to a form where you can upload one or more CSV files to your data package:

[Figure: Uploading CSV files to a new data package]

Finally, click on Finish to create your data package. You’ll be taken to your data package’s page:

[Figure: Browsing your newly created data package]

You can publish the URL of this page, or send it to anyone you want to share your data package with.

Why Tabular Data Packages?

Tabular Data Packages (defined by the DataProtocols.org Tabular Data Package spec) are a simple and easy-to-use data publishing and sharing format for the web. A Tabular Data Package is a collection of CSV files with a datapackage.json file. The datapackage.json file contains metadata about the package (title of the package, description, keywords, license, etc.) and schemas for each of the package’s CSV files.
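To make the structure concrete, here is a minimal sketch of what a datapackage.json file might contain, built in Python. The package name, title, keywords and file name are made up for illustration, and the property names follow our reading of the Data Package spec, so check the spec itself for the authoritative list (the license entry is left out here for that reason).

```python
import json

# Hypothetical package: every name and value below is invented for illustration.
descriptor = {
    "name": "example-package",
    "title": "Example Package",
    "description": "One small CSV file published as a data package.",
    "keywords": ["example", "csv"],
    "resources": [
        {
            # Path of the CSV file, relative to datapackage.json
            "path": "cities.csv",
            # Schema describing the columns of that CSV file
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer"},
                    {"name": "city", "type": "string"},
                ]
            },
        }
    ],
}

# Serialise the descriptor as it would appear in datapackage.json
datapackage_json = json.dumps(descriptor, indent=2)
print(datapackage_json)
```

The point is only that one JSON document carries both the package-level metadata and a schema per CSV file.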

The format is a good compromise between CSV and Excel, providing the simplicity and ease-of-use of CSV with some of the expressivity of full-blown spreadsheets.

The schemas for the CSV files use the JSON Table Schema format, a simple format for tabular data schemas. It includes metadata for each of the CSV file’s columns (column name, type, description, etc.) and optional primary and foreign keys for the file.
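As a rough illustration, a JSON Table Schema for a single CSV file could look something like the Python dictionary below. The file, column names and the referenced "cities" resource are hypothetical, and the exact shape of the primaryKey and foreignKeys properties is our assumption about the spec rather than something this post defines.

```python
# Hypothetical schema for a "visits.csv" file; all names are illustrative.
schema = {
    "fields": [
        {"name": "visit_id", "type": "integer", "description": "Unique visit number"},
        {"name": "city_id", "type": "integer", "description": "City that was visited"},
        {"name": "visited_on", "type": "date", "description": "Day of the visit"},
    ],
    # Column that uniquely identifies each row
    "primaryKey": "visit_id",
    # Column that points at a column in another file of the same package
    "foreignKeys": [
        {
            "fields": "city_id",
            "reference": {"resource": "cities", "fields": "id"},
        }
    ],
}


def column_types(schema):
    """Return a {column name: type} mapping from a schema dictionary."""
    return {field["name"]: field["type"] for field in schema["fields"]}


print(column_types(schema))
```

The small column_types helper is not part of any spec; it just shows how easily the per-column metadata can be read back out.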

Data Packager Features

After you’ve created your data package and uploaded some CSV files to it, Data Packager has a few nice features for you…

Download data packages

The Download Data Package button on your data package’s page will download a ZIP file including all of your package’s CSV files and the datapackage.json file containing the metadata you entered for your package and files, plus schemas for each of your files.
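Once you have a ZIP file like this, reading it back takes only the standard library. The sketch below assumes that datapackage.json sits at the top level of the archive and that each resource has a path property; both are assumptions about the download layout. It builds a tiny stand-in archive in memory so the example runs without a real download.

```python
import csv
import io
import json
import zipfile


def build_sample_zip():
    """Build a tiny in-memory ZIP standing in for a downloaded package."""
    descriptor = {"name": "example-package", "resources": [{"path": "cities.csv"}]}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("datapackage.json", json.dumps(descriptor))
        archive.writestr("cities.csv", "id,city\n1,Berlin\n2,Lagos\n")
    buffer.seek(0)
    return buffer


def count_rows(zip_file):
    """Return {resource path: number of data rows} for each listed CSV file."""
    counts = {}
    with zipfile.ZipFile(zip_file) as archive:
        descriptor = json.loads(archive.read("datapackage.json"))
        for resource in descriptor["resources"]:
            text = archive.read(resource["path"]).decode("utf-8")
            rows = list(csv.reader(io.StringIO(text)))
            counts[resource["path"]] = len(rows) - 1  # do not count the header row
    return counts


print(count_rows(build_sample_zip()))  # {'cities.csv': 2}
```

With a real download you would pass the path of the ZIP file to count_rows instead of the in-memory buffer.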

Schema browser

Data Packager automatically generates a JSON Table Schema for each CSV file that you upload. The generated schema includes:

Column names for each of the file’s columns (taken from the CSV file’s header row, if it has one)
The type of the data in each column (string, number, date…), inferred from the values in the columns
Some descriptive statistics calculated for numerical columns (minimum and maximum values, mean, standard deviation…)
Temporal extents (earliest and latest dates) for date and time columns
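To give a feel for what this kind of schema generation involves, here is a minimal sketch of type inference and summary statistics in Python. It is not Data Packager's actual algorithm: the try-the-strictest-parser-first approach, the ISO date format, and the attribute names (min, max, mean, earliest, latest) are all assumptions made for the example.

```python
import csv
import io
import statistics
from datetime import date


def infer_type(values):
    """Guess a column type by trying the strictest parsers first."""
    def all_parse(parser):
        try:
            for value in values:
                parser(value)
            return True
        except ValueError:
            return False

    if not values:
        return "string"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "number"
    if all_parse(date.fromisoformat):
        return "date"
    return "string"


def describe_column(name, values):
    """Build a schema field with a type and, where it applies, some statistics."""
    field = {"name": name, "type": infer_type(values)}
    if field["type"] in ("integer", "number"):
        numbers = [float(value) for value in values]
        field.update(min=min(numbers), max=max(numbers), mean=statistics.mean(numbers))
    elif field["type"] == "date":
        dates = [date.fromisoformat(value) for value in values]
        field.update(earliest=min(dates).isoformat(), latest=max(dates).isoformat())
    return field


def infer_schema(csv_text):
    """Infer a schema from CSV text whose first row is a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = zip(*data)  # turn the rows into columns
    return {"fields": [describe_column(n, list(c)) for n, c in zip(header, columns)]}


# Made-up sample data, only for demonstration
sample_csv = "station,reading,measured_on\nA,1.5,2014-01-01\nB,2.5,2014-06-09\n"
generated = infer_schema(sample_csv)
print([(field["name"], field["type"]) for field in generated["fields"]])
```

A real implementation would also need to cope with missing values, other date formats and files without a header row, which this sketch ignores.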

By clicking on one of the CSV files on your data package’s page, you can browse the file’s schema using the schema browser. Each file’s page shows a preview of the CSV file contents, and by clicking on the columns in the preview you can inspect the schema for each column:

[Figure: The schema browser]

Schema editor

By clicking the Edit button on one of your CSV files’ pages, you can edit the file’s JSON Table Schema and add your own custom attributes. Data Packager validates all the changes that you make and gives helpful error messages if you try to save an invalid schema.

[Figure: The schema editor]
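The kind of checks a schema validator might run can be sketched in a few lines of Python. This is an illustration only, not the validation Data Packager performs: the list of allowed types and the wording of the error messages are assumptions, and each field is assumed to be a dictionary.

```python
# Hypothetical set of accepted column types, chosen for this example
ALLOWED_TYPES = {"string", "number", "integer", "date", "time", "datetime", "boolean", "any"}


def validate_schema(schema):
    """Return a list of human-readable problems; an empty list means valid."""
    fields = schema.get("fields")
    if not isinstance(fields, list) or not fields:
        return ["schema must have a non-empty 'fields' list"]

    errors = []
    names = []
    for index, field in enumerate(fields):
        name = field.get("name")
        if not name:
            errors.append(f"field {index} has no name")
            continue
        if name in names:
            errors.append(f"duplicate field name: {name}")
        names.append(name)
        field_type = field.get("type", "string")
        if field_type not in ALLOWED_TYPES:
            errors.append(f"field {name} has unknown type: {field_type}")

    # A primary key may be one column name or a list of column names
    key = schema.get("primaryKey")
    if key is not None:
        key_fields = [key] if isinstance(key, str) else key
        for key_field in key_fields:
            if key_field not in names:
                errors.append(f"primaryKey refers to unknown field: {key_field}")
    return errors


good = {"fields": [{"name": "id", "type": "integer"}], "primaryKey": "id"}
bad = {"fields": [{"name": "id", "type": "whole-number"}], "primaryKey": "code"}
print(validate_schema(good))
print(validate_schema(bad))
```

Returning all problems at once, rather than stopping at the first, is what makes it possible to show a complete list of error messages next to the editor.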

Primary and foreign keys

If you add primary and foreign keys to a CSV file’s schema, they’ll also be shown on the file’s page.

[Figure: Primary and foreign keys]

API

All of Data Packager’s features can also be used via its JSON API.
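Since Data Packager is built with CKAN, a reasonable starting point is CKAN's Action API, where each action is called by posting a JSON dictionary to /api/3/action/<action name>. The sketch below only builds such a request and does not send it. The site URL and package name are placeholders, and we are assuming the standard CKAN package_show action is available; any Data Packager-specific actions are not shown here.

```python
import json
import urllib.request


def build_action_request(site_url, action, params, api_key=None):
    """Build (but do not send) a CKAN-style Action API request."""
    url = f"{site_url.rstrip('/')}/api/3/action/{action}"
    request = urllib.request.Request(
        url,
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if api_key:
        # CKAN reads the user's API key from the Authorization header
        request.add_header("Authorization", api_key)
    return request


# Placeholder site URL and package name, for illustration only
request = build_action_request(
    "https://datapackager.example.org", "package_show", {"id": "example-package"}
)
print(request.get_method(), request.full_url)
```

To actually call the API you would pass the request to urllib.request.urlopen and decode the JSON response, using the URL of a real Data Packager site.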

Open Source

Data Packager is 100% open source. You can:

Deploy your own Data Packager site - just follow our instructions to install Data Packager on an Ubuntu server
Contribute to the Data Packager source code on GitHub - send us a pull request!
Report bugs using our issue tracker

Built with CKAN

Data Packager is built using CKAN, the highly customisable open source data portal platform. All Data Packager features are implemented by a CKAN extension, ckanext-datapackager.
