CKAN at OKFestival: raw data now!
Last week’s OKFest is finally over, after a hectic week of talks, workshops, films, hackathons, and more. You can read about highlights such as Hans Rosling’s brilliant talk over on the OKFN blog. The biggest challenge for me was being in two places at once on Wednesday afternoon: the CKAN workshop and a panel discussion in the Open Science stream on ‘Immediate access to raw data from experiments’, where I was on the panel, were running in nearby buildings at overlapping times. (Happily I more or less pulled it off.)
The CKAN workshop was a surprise hit, with over 30 people crowding round to hear about CKAN’s latest features and future directions. Some went on to ask questions of CKAN developers about installing, using the API, writing extensions, and more, while others joined a discussion with Antonio Acuña, head of data.gov.uk, about starting a users’ group and about data.gov.uk’s experiences and recommendations.
The experimental data session led to a lively and interesting discussion, chaired by Panton Fellow Sophie Kershaw. From the panel, I spoke about the advantages of publishing data as soon as possible. Researchers are the biggest re-users of their own data and stand to benefit most from publishing it – provided the publication platform chosen is simple to use and provides added value.
Next, Joss Winn of the Orbital project spoke about the platform they are developing (based on CKAN) to enable immediate access to various kinds of experimental data. He stressed that immediate access need not mean immediate publication – it may not be possible to publish the data now for various reasons. However, a good management system should help the researcher rather than be an extra burden, and should make it trivial to publish later at the flick of a switch. Finally, Mark Hahnel of Figshare pointed out that funders will increasingly look at all outputs from research they fund, not just publications – and this may mean that researchers are required to publish data.
Researchers often have reasons for not publishing data – some good and some bad. But this week Ben Goldacre’s new book provides a timely reminder that data left unpublished can lead to research that does not advance the cause of knowledge, and even actively retards it. Surely almost all scientists starting out in their careers hope to expand the frontiers of science, and there couldn’t be a clearer demonstration that one simple step will help: publish the data!