Using CKAN: building a Brazilian government data portal
Open Government Data took a stride forward in Brazil this year with the coming into force of the Access to Information Act and the National Open Data Infrastructure. These will require all public agencies to publish their data openly on the web, generating a volume of information never before seen in the history of Brazil. A key tool has been the government data portal at dados.gov.br, a CKAN portal that serves as a central point of access to all government data. Here are some reflections on how the portal was built and on our experiences of working with CKAN.
In 2011, the Director of the Brazilian Ministry of Planning asked the IT team to look into what Open Data might mean in Brazil and how it had been done in other countries, and to try out some experiments. In the course of our research we put in place a supplier database and another database for local government and NGO disbursements. These were experiments at the time, but they proved useful for us, the rest of government, and other citizens, so they are now definitive databases which we will continue to improve and maintain.
We thus had experience of the value of Open Data when the office of the Comptroller-General submitted the bill that would become the Access to Information Act, requiring all government departments to release data in accessible, open formats. We were delighted by this commitment to open data, but we also realised that with the many arms of government publishing data in different ways and in different places, it would be hard to find the data you need. Since we were in a department responsible for co-ordinating government action, and already had experience with releasing open data, we realised it was up to us to solve this problem. We therefore started work on an open data portal.
Choosing the technology
Several factors had to be considered when deciding on the best tool to use to create an open data catalogue. We had no budget at the time for buying in a commercial solution, and in any event, government IT policies recommend, and sometimes require, the use of free software in preference to proprietary software. We also knew the difficulties in centralising all data in one place, so we decided to keep it in a distributed infrastructure – where each agency or department is responsible for its own data, and a central catalogue uses the architecture of the web to refer to where the data can be found.
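To make the distributed idea concrete, here is a minimal sketch in Python of a central catalogue that stores only metadata and links, while each agency keeps hosting its own files. This is an illustration, not CKAN's actual data model: the class names, the dataset name and the agency.example URL are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """A pointer to data hosted by the publishing agency, not a copy of it."""
    url: str
    format: str


@dataclass
class DatasetEntry:
    """Catalogue metadata for one dataset; the data itself stays with its owner."""
    name: str
    publisher: str
    resources: List[Resource] = field(default_factory=list)


class Catalogue:
    """Central index mapping dataset names to where the data can be found."""

    def __init__(self) -> None:
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        # Only metadata is stored centrally; nothing is downloaded or copied.
        self._entries[entry.name] = entry

    def locate(self, name: str) -> List[str]:
        """Return the URLs where the named dataset is published."""
        return [resource.url for resource in self._entries[name].resources]


# Hypothetical example entry: the agency hosts the file, the catalogue links to it.
catalogue = Catalogue()
catalogue.register(
    DatasetEntry(
        name="example-suppliers",
        publisher="Example Agency",
        resources=[
            Resource(url="https://agency.example/data/suppliers.csv", format="CSV")
        ],
    )
)
urls = catalogue.locate("example-suppliers")
```

The point of the design is that an agency can update or move its data by changing a single link in the catalogue, without the central portal ever having to hold the data.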
CKAN was a tool which fitted these requirements perfectly, and in addition one member of the Open Data Team was already familiar with it from involvement in the Brazilian community open data portal. As CKAN was also used by the UK government at data.gov.uk, we were confident it would continue to be supported. We therefore decided to use it, and started planning the portal in collaboration with civil society stakeholders. When the law was approved in November 2011, dados.gov.br was already in beta, and all departments had 6 months to get their data online.
Our experience using CKAN has been positive throughout, from deploying an instance to working on extensions and translations. The beta version was created in a one-day open sprint of development on the portal (pictured below). The excellent installation documentation then meant that a person without much experience of Python or systems administration was able to put it online in a few hours. Where we did find minor bugs, because it is an open-source project, we could fix them – and submit our fixes back to the codebase to benefit other users.
In 2012 we wanted to make some customisations to the site. In an attempt to get a more data-focused home page, we wanted a ‘featured datasets’ extension showing three randomly selected datasets, and a tag list, as well as a news feed on the front page. The process of writing the extensions was very smooth: CKAN’s architecture is solid and easy to work with, and lends itself well to being extended, with a mechanism for creating plugins and using them in your application. Our only niggle was that the documentation for developing plugins was incomplete and was divided between two locations, the CKAN docs and the wiki, which made it a little difficult to find the information we needed. However, in creating the plugins we had a lot of support from the Open Knowledge Foundation and CKAN community, for which we are grateful.
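The selection logic behind a ‘featured datasets’ box can be sketched in a few lines of plain Python. This is not the code of our extension and it does not use CKAN's plugin interfaces; the function name pick_featured and the dataset names are hypothetical, and a real plugin would fetch the list of datasets from the catalogue rather than from a hard-coded list.

```python
import random
from typing import List, Optional, Sequence


def pick_featured(
    dataset_names: Sequence[str],
    count: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick up to `count` distinct datasets at random for the home page."""
    rng = rng or random.Random()
    # Never ask for more items than exist, so small catalogues still work.
    count = min(count, len(dataset_names))
    return rng.sample(list(dataset_names), count)


# Hypothetical dataset names, for illustration only.
all_datasets = ["suppliers", "disbursements", "budget", "contracts", "schools"]
featured = pick_featured(all_datasets)
```

Passing in a seeded random.Random instance makes the choice reproducible, which is handy when testing a template that renders the featured box.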
Another important factor is CKAN’s backward-compatibility. This is not automatic (it needs human maintenance), but it does work very well. Our catalogue has the same database as when it was first deployed: we have never needed to install an instance and populate it manually, and every time we have upgraded to a new version, we’ve been able to run some scripts to do the upgrade and get the benefit of the new functionality.
Localisation: CKAN in other languages
Finally, let me highlight the ease of using CKAN in a different language. It is very easy to change CKAN’s default from English to another language, simply by specifying the desired language in a configuration file, if there is already a translation of the interface into that language. If there is no translation for your language, there is a simple mechanism for translating it using the Transifex tool, which allows one or more contributors and reviewers to work on the translations. We have been involved in the Brazilian Portuguese translations for new versions of CKAN since we started working with it.
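As a rough illustration of the configuration change, the sketch below uses Python's standard configparser module to switch the default language in a small INI fragment. It assumes the setting is the ckan.locale_default option in the [app:main] section, which is how CKAN releases we are familiar with name it; check the documentation for your version. The sample configuration is a made-up fragment, not our production file, and in practice you would simply edit the line by hand.

```python
import configparser
import io

# A made-up fragment standing in for a CKAN .ini configuration file.
SAMPLE_CONFIG = """\
[app:main]
ckan.site_title = Example Portal
ckan.locale_default = en
"""


def set_default_locale(config_text: str, locale: str) -> str:
    """Return the configuration text with the default interface language changed."""
    # Interpolation is disabled so that '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(config_text)
    parser["app:main"]["ckan.locale_default"] = locale
    output = io.StringIO()
    parser.write(output)
    return output.getvalue()


updated = set_default_locale(SAMPLE_CONFIG, "pt_BR")
```

The change only has a visible effect if a translation for the chosen locale (here pt_BR, Brazilian Portuguese) ships with the installed version.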