There have been some changes of plan and a lot of work behind the scenes since the last round-up of CKAN news back in July. Major areas of work include the new Datastore and significant overhauls to CKAN’s web user interface. The release of CKAN 1.8 has been delayed, in order that it can include the new Datastore. Some details are below.
CKAN’s Datastore provides structured storage and querying of data, and underpins CKAN’s previews as well as other custom-made applications to access and process data. In current releases of CKAN, the Datastore is built on elasticsearch, an open-source search engine. This made the Datastore quick to build, and it is good for providing full-text search of a data resource. However, elasticsearch is limited in what it can do, and we have wanted for a while to replace it with a full database-backed system. We’ve now had a chance to spend some time on this – David has built a new Datastore based on PostgreSQL, and Dominik has been adding improvements and working on the migrations.
The new Datastore will have a full SQL search API, enabling more complicated queries, including queries that join data across different data tables. (We’ve also found some reliability problems with elasticsearch which it will hopefully solve.)
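To give a flavour of the kind of query a SQL-backed store makes possible, here is a minimal sketch of a join across two data tables. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, it does not use the Datastore API itself, and the table names, column names and figures are all made up.

```python
import sqlite3

# Hypothetical example: two resources stored as separate tables.
# sqlite3 stands in for PostgreSQL here; this is not the Datastore API.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spending (region TEXT, amount REAL);
    CREATE TABLE population (region TEXT, population INTEGER);
    INSERT INTO spending VALUES ('north', 250.0), ('south', 80.0), ('east', 120.0);
    INSERT INTO population VALUES ('north', 1000), ('south', 2000), ('east', 1500);
""")


def join_resources(connection, min_amount):
    """Join the two made-up tables on their shared region column."""
    query = """
        SELECT s.region, p.population, s.amount
        FROM spending AS s
        JOIN population AS p ON p.region = s.region
        WHERE s.amount > ?
        ORDER BY s.region
    """
    return connection.execute(query, (min_amount,)).fetchall()


rows = join_resources(conn, 100)
for region, population, amount in rows:
    print(region, population, amount)
```

A query like this, combining a filter with a join across tables, is the sort of thing that is awkward to express against a search engine but natural in SQL.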
The new Datastore is one of several major changes in progress to the code base. As a result, work has started on a code branch for version 2.0, numbered to reflect the fact that it will be a significant change. The changes include the new Jinja templating system (and a resulting change to all the default templates), which will not cause much visible change for users, but will make life much easier for anyone wanting to run their own customised version of CKAN. Legacy support for the old templates will be included, so that instances with existing customisations can keep them until they switch to the new system.
CKAN’s default user interface hasn’t changed significantly in 5 years and sadly needed attention. That changed dramatically for the better recently when the team gained a front end developer – first Aron and now JohnM – who have worked with Toby on improving the UI. The results can be seen on the CKAN demo site. They will be integrated into CKAN’s default UI in version 2.0.
Organizations and authorization
CKAN 2.0 will also include a new access authorization system. At its heart will be Organizations, a new feature to enable a smoother workflow for data publishers. An Organization will be a collection of datasets and users, so that a dataset can only be changed by users in the relevant Organization. (For a community instance like the DataHub, all users will belong to a default Organization.)
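The rule described above can be sketched roughly as follows. This is a toy model of the idea only, not CKAN's implementation: the `Organization` class, the `can_edit` function and the example names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    """Hypothetical model: a named collection of users and datasets."""
    name: str
    members: set = field(default_factory=set)
    datasets: set = field(default_factory=set)


def can_edit(user, dataset, organizations):
    """A user may change a dataset only if both belong to the same Organization."""
    return any(
        dataset in org.datasets and user in org.members
        for org in organizations
    )


# Made-up example data.
transport = Organization("transport", {"alice"}, {"bus-routes"})
health = Organization("health", {"bob"}, {"hospital-list"})
orgs = [transport, health]

print(can_edit("alice", "bus-routes", orgs))     # alice is in the owning Organization
print(can_edit("alice", "hospital-list", orgs))  # alice is not in the owning Organization
```

In this model, a community instance would correspond to a single default Organization containing every user and dataset, so the check passes for everyone.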
The release of CKAN 1.8 has been delayed, in order that it can include the new Datastore. To anyone who’s been waiting eagerly for 1.8, apologies! On reflection, we decided that asking existing sites running 1.7.1 to upgrade now, only to have to upgrade again shortly afterwards when the new Datastore is added, would mean unnecessary effort for little gain. The good news is that the Datastore has now been added to the 1.8 branch and will be undergoing testing over the next couple of weeks. An added complication is that sites using the old elasticsearch Datastore will need a way to migrate to the new one.
Comings and goings
Our first front end developer Aron left recently, and John Martin has joined the team to replace him. Dominik, a university student from Potsdam who joined us as an intern for the summer, will be going back to college soon, but we hope to be seeing more of him! We also said farewell recently to Ross, who made a great contribution during his time on the team as a developer, and Toby too is imminently off to pastures new. All the best to them from the CKAN team.