Blog

ESRC Consumer Data Research Centre

  • alexsingleton
  • 28 Oct 2015
Integral to our activities as part of the ESRC Consumer Data Research Centre, we spent the summer working on a project to create a searchable catalogue of the various data holdings that we are assembling. These include retailer data that we have negotiated access to, but also a wealth of value-added open data products. The site is available here: data.cdrc.ac.uk

One aspect that we are especially pleased with is the introduction of data stores for each local authority in the UK. Each has its own URL; Liverpool, for example, can be found here: data.cdrc.ac.uk/lad/liverpool

We do not believe in simple replication of data sources available elsewhere, so we have added value to each open data deposit by re-engineering it into new formats that are optimized for simple analysis, and which we hope will lower barriers to entry. As of 5/10/15 we had created 8,738k separate data items covering a very wide variety of topics. Not every local authority has the resources to create its own datastore, and for those which do, we hope that what we have created will be complementary. We have also linked many of the outputs through to our mapping interface, which is available here: maps.cdrc.ac.uk

Some Technical Bits....

Given the location of this blog post, here are some technical details. For the development we used the CKAN platform, as it is open source and widely used in the other data stores that we were familiar with. We have, however, made considerable customisations to the off-the-shelf platform. Our CKAN development infrastructure was built on Docker images for all the services that CKAN relies upon, including a service management and configuration system. We were also dealing with multiple uploads that had been created using R, Python or PostGIS, so we scripted a bulk dataset uploading tool.

Some specific customisation:

For the products/topics/LADs/National/Regional search tabs:
  • Added support for filters based on products/topics/LADs
  • Added groups/labels for Open/Safeguarded/Secure datasets.
  • Added an interactive map on the front page (based on our maps.cdrc.ac.uk platform)
  • Added a Twitter feed
  • Added a blog proxy to a WordPress blog aggregator
  • Added download tracking
  • Improved the efficiency of the CKAN code for the group listing
  • Added a GeoJSON preview on the dataset pages
  • Prevented users who are not logged in from downloading; unfortunately we need this functionality to provide usage data to our funding body (sorry!)
  • Added system-wide notification messages
  • Added Google Analytics tracking code
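For readers curious what a bulk dataset uploader might look like, here is a minimal sketch against CKAN's Action API (the `package_create` action, with the API key sent in the `Authorization` header). It is not the actual tool mentioned above: the function names, the example URL and the dataset fields are assumptions made purely for illustration.

```python
import json
import urllib.request


def dataset_payload(name, title, notes="", tags=()):
    """Build the JSON body for CKAN's package_create action."""
    return {
        "name": name,          # URL-safe identifier of the dataset
        "title": title,        # human-readable title
        "notes": notes,        # free-text description
        "tags": [{"name": t} for t in tags],
    }


def build_request(base_url, action, payload, api_key):
    """Prepare a POST request against the CKAN Action API (version 3)."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def bulk_upload(base_url, api_key, datasets, send=urllib.request.urlopen):
    """Create each dataset in turn and return the names that failed.

    `send` is injectable so the loop can be exercised without a live CKAN.
    """
    failed = []
    for ds in datasets:
        req = build_request(base_url, "package_create", dataset_payload(**ds), api_key)
        try:
            with send(req) as resp:
                body = json.load(resp)
            if not body.get("success"):
                failed.append(ds["name"])
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            failed.append(ds["name"])
    return failed
```

A call such as `bulk_upload("https://ckan.example.org", "MY-API-KEY", [{"name": "liverpool-demo", "title": "Liverpool demo"}])` (hypothetical host, key and dataset) would attempt one `package_create` call per entry and report the failures, so a long batch can be re-run for just the items that did not go through.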
Data blog
  • Customized the WordPress theme Cerauno to fit into CKAN
Other additions
  • A plugin was developed to improve the user registration form (https://github.com/esrc-cdrc/ckan-ckanext-userextra)
  • Added checkboxes for newsletter options and a dropdown menu for sectors
  • Customized the metadata with a third-party plugin, ckanext-scheming
  • Added a commenting system with a third-party plugin, ckanext-ytp-comments
  • Improved the user experience of the commenting system by changing the look and feel, and allowed in-place commenting and editing.
We also made some major bug fixes and improvements to CKAN, including:
  • Fixed the tracking system (broken by latest releases of CKAN)
  • Fixed the type system for groups
Besides these, there were various other small bug fixes and improvements to CKAN and third-party plugins. We hope that these and our continuing contributions have been of use, and that you enjoy our data store: data.cdrc.ac.uk

Thanks in particular go to Data Scientists Wen Li, Hai Nguyen and Michail Pavlis for their hard work; they spent much of the summer working on this project.