Building tools for Open Data adoption

denis - September 11, 2015 in Community, Feature, Featured

At DataCats, we are focused on a simple problem — how do we make sure every single government has easy access to get up and running with Open Data? In other words, how do we make it as easy as possible for governments of all levels to start publishing open data?

The answer, as you might tell by this blog, is CKAN. But CKAN uses a very non-traditional technology stack, especially by government standards. Python, PostgreSQL, Solr, and Unix are not in the toolbox of most IT departments. This is true not only for local governments in Europe and North America, but also for almost all governments in the developing world.

Our answer to this problem is two software projects which, like CKAN, are Free and Open Source Software. The first is the eponymously named datacats, and the second is named CKAN Multisite. Together, the two projects aim to solve the operational difficulties in deploying and managing CKAN installations.

datacats is a command line library built on Docker, a popular new alternative to virtualization that is experiencing explosive growth in industry. It aims to help CKAN developers easily get set up and running with one or more CKAN development instances, as well as deploy those easily on any provider – be it Amazon, Microsoft Azure, Digital Ocean, or a plain old physical server data centre.

Our team has been using datacats to develop a number of large CKAN projects for governments here in Canada and around the world. Because the project is open source, we get word every week of another IT department somewhere that is trying it out.

CKAN Multisite is a companion project to datacats, targeted at system administrators who wish to manage one or more CKAN instances on their infrastructure. The project was very generously sponsored by U.S. Open Data. Multisite provides a simple API and a web interface through which starting, stopping, and managing CKAN servers is as simple as pressing a button. In essence it gives you your very own CKAN cloud.

CKAN is an open source project that many national and large city governments depend on as the cornerstone of their open data programs. We hope that these two open source projects will help the CKAN ecosystem continue to grow. If you are a sysadmin or a developer working on CKAN, give them a try and, if you have the appetite, consider contributing to the projects themselves.

Matthew Fullerton and some interesting CKAN extension development.

Steven De Costa - August 21, 2015 in Community, Extensions

Note: This is a re-post from one of our CKAN community contributors, Matthew Fullerton. He has been working on some interesting extensions, which are outlined below. You can support Matthew’s work by providing comments below, or you can link through to his GitHub profile to comment or get in touch there.

Styling GeoJSON data

The GeoView extension makes it easy to add resource views of GeoJSON data. In our extended extension, attributes of the features (lines, points) in the FeatureCollection are styled according to MapBox’s SimpleStyle spec.

Here’s an example where the file has been processed to add colors based on traffic flow state: https://smartlane.io/dataset/differentgeovisualizations/resource/49f0fcffb3c848c8b1c6ddc33e4a83fe

And another where the points are styled to (vaguely) look like colored traffic lights: https://smartlane.io/dataset/differentgeovisualizations/resource/a4e397adcbd948bfa77a296c5fcc9559 (watch out, it can take a while to load)
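As a rough illustration of the styling described above, here is a minimal Python sketch that adds SimpleStyle properties to the features of a FeatureCollection. The `flow_state` attribute name and the colour mapping are assumptions made for this example; they are not taken from the extension itself.

```python
import copy

# Hypothetical mapping from a traffic flow state to a colour.
FLOW_COLORS = {"free": "#00aa00", "slow": "#ffaa00", "jammed": "#cc0000"}
DEFAULT_COLOR = "#555555"


def style_features(collection, state_key="flow_state"):
    """Return a copy of a GeoJSON FeatureCollection with SimpleStyle keys added.

    `state_key` is an assumed attribute name; adapt it to your own data.
    """
    styled = copy.deepcopy(collection)
    for feature in styled["features"]:
        # GeoJSON allows "properties" to be null, so fall back to a dict.
        props = feature.get("properties") or {}
        feature["properties"] = props
        color = FLOW_COLORS.get(props.get(state_key), DEFAULT_COLOR)
        if feature["geometry"]["type"] in ("Point", "MultiPoint"):
            props["marker-color"] = color  # SimpleStyle key for points
        else:
            props["stroke"] = color        # SimpleStyle keys for lines
            props["stroke-width"] = 4
    return styled
```

A file processed this way stays valid GeoJSON, so viewers that do not know about SimpleStyle should simply ignore the extra properties.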

Realtime GeoJSON data

Using leaflet.realtime, an extension for the leaflet library that CKAN (GeoView) uses to visualize GeoJSON, maps can have changing points or colors/styles.

Here is an example of traffic lights changing according to pre-recorded data: https://smartlane.io/dataset/trafficlightstreamfrankfurtniederrad/resource/b6e4319ef29b480bad6d214a753d3c2d

I’ll try to add a demo with moving data points soon; it ought to work without any further code changes. The problem is often getting the live data in GeoJSON format… but we have a backend for preprocessing other data.

Realtime data plotting

By making only a few small changes, we are able to continuously update Graph views. You can see the changing (or not) temperature in our office here: https://smartlane.io/dataset/temperaturesensor/resource/bd6456385541499e861bf9c97e60f35a

That’s an example for ‘lines and points’ but it works for things like bar graphs too. Last week we had people competing to achieve the best time in a remote controlled robot race where their time was automatically displayed as a bar on a ‘leader board’. For good measure we had an automatically updating histogram of the times too. Updating the actual data in CKAN is easy thanks to the DataStore API.
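For readers who have not used the DataStore API, the following Python sketch shows roughly what pushing a new reading could look like. It only builds the request; the site URL, API key, resource id and record fields are placeholders, and the sketch assumes the `datastore_upsert` action with `method` set to `insert` so that rows are appended.

```python
import json
import urllib.request


def build_upsert_request(site_url, api_key, resource_id, records):
    """Build (but do not send) a request to CKAN's datastore_upsert action."""
    payload = {
        "resource_id": resource_id,
        "records": records,
        # "insert" appends rows; "upsert" would need a unique key on the table.
        "method": "insert",
    }
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/api/3/action/datastore_upsert",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# Placeholder values; replace them with your own site, key and resource.
request = build_upsert_request(
    "https://ckan.example.org",
    "my-api-key",
    "my-resource-id",
    [{"time": "2015-08-21T10:00:00", "celsius": 22.5}],
)
```

Passing `request` to `urllib.request.urlopen` would send it; a view that polls the resource should then pick up the new row.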

Matthew Fullerton

Freelance Software Developer and EXIST Stipend holder with the start up project “Tapestry” http://www.smartlane.de/

Two new CKAN extensions – Webhooks and Geopusher

Steven De Costa - August 16, 2015 in Extensions

Denis Zgonjanin recently shared the following update on two new extensions via the CKAN Dev mail list.

If you are working on CKAN extensions and would like to share details with other developers then post your updates via the mail list. We’ll always look at promoting the great work of community contributions via this blog :) If you have an interesting CKAN story to share feel free to ping @starl3n to organise a guest post.

From Denis:

Webhooks

A problem I’ve had personally is having my open data apps know when a dataset they’ve been using has been updated. You can of course poll CKAN periodically, but then you need cron jobs or a queue, and when you’re using a cheap PaaS like Heroku for your apps, integrating queues and cron is just an extra hassle.

This extension lets people register a URL with CKAN, which CKAN will call when a certain event happens – for example, a dataset update. The extension uses the built-in CKAN celery queue, so as to be non-blocking.

If you do end up using it, there are still a bunch of nice features to be built, including a simple web interface through which users can register webhooks (right now they can only be created through the action API).
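To make the idea concrete, here is a minimal sketch of the receiving side: a small Python app that accepts the call CKAN makes to a registered URL. It assumes the notification arrives as an HTTP POST with a JSON object in the body; the actual payload fields are not described here, so the handler only lists the keys it received.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body):
    """Decode a webhook body; return a dict, or None if it is not a JSON object."""
    try:
        data = json.loads(body.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    return data if isinstance(data, dict) else None


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        notification = parse_notification(self.rfile.read(length))
        if notification is None:
            self.send_response(400)
        else:
            # React to the event here, e.g. refresh a cached copy of the dataset.
            print("CKAN notified us, keys:", sorted(notification))
            self.send_response(204)
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted. The port is an arbitrary choice."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

The point of the design is that the app does no polling at all: it sits idle until CKAN calls it.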

Geopusher

So you know how you have a lot of Shapefiles and KML files in your CKANs (because government), but your users prefer GeoJSON? This extension will automatically convert shapefiles and KML into GeoJSON, and create a new GeoJSON resource within that dataset. There are some cases where this won’t work depending on complexity of SHP or KML file, but it works well in general.

This extension also uses the built-in celery queue to do its work, so for both of these extensions you will need to start the celery daemon in order to use them:

`paster --plugin=ckan celeryd -c development.ini`

Beauty behind the scenes

Tryggvi Björgvinsson - August 5, 2015 in Deployments, Extensions, Featured

Good things can often go unnoticed, especially if they’re not immediately visible. Last month the government of Sweden, through Vinnova, released a revamped version of their open data portal, Öppnadata.se. The portal still runs on CKAN, the open data management system. It even has the same visual feeling but the principles behind the portal are completely different. The main idea behind the new version of Öppnadata.se is automation. Open Knowledge teamed up with the Swedish company Metasolutions to build and deliver an automated open data portal.

Responsive design

In modern web development, one aspect of website automation called responsive design has become very popular. With this technique the website automatically adjusts the presentation depending on the screen size. That is, it knows how best to present the content given different screen sizes. Öppnadata.se got a slight facelift in terms of tweaks to its appearance, but the big news on that front is that it now has a responsive design. The portal looks different if you access it on mobile phones or if you visit it on desktops, but the content is still the same.

These changes were contributed to CKAN. They are now a part of the CKAN core web application as of version 2.3. This means everyone can now have responsive data portals as long as they use a recent version of CKAN.

New Öppnadata.se

Old Öppnadata.se

Data catalogs

Perhaps the biggest innovation of Öppnadata.se is how the automation process works for adding new datasets to the catalog. Normally with CKAN, data publishers log in and create or update their datasets on the CKAN site. CKAN has for a long time also supported something called harvesting, where an instance of CKAN goes out and fetches new datasets and makes them available. That’s a form of automation, but it’s dependent on specific software being used or special harvesters for each source. So harvesting from one CKAN instance to another is simple. Harvesting from a specific geospatial data source is simple. Automatically harvesting from a source you don’t know, and which may not even exist yet, is hard.

That’s the reality which Öppnadata.se faces. Only a minority of public organisations and municipalities in Sweden publish open data at the moment. So a decision hasn’t been made by a majority of the public entities for what software or solution will be used to publish open data.

To tackle this problem, Öppnadata.se relies on an open standard from the World Wide Web Consortium called DCAT (Data Catalog Vocabulary). The open standard describes how to publish a list of datasets and it allows Swedish public bodies to pick whatever solution they like to publish datasets, as long as one of its outputs conforms with DCAT.

Öppnadata.se actually uses a DCAT application profile which was specially created for Sweden by Metasolutions and defines in more detail what to expect, for example that Öppnadata.se expects to find dataset classifications according to the Eurovoc classification system.

Thanks to this effort significant improvements have been made to CKAN’s support for RDF and DCAT. They include application profiles (like the Swedish one) for harvesting and exposing DCAT metadata in different formats. So a CKAN instance can now automatically harvest datasets from a range of DCAT sources, which is exactly what Öppnadata.se does. For Öppnadata.se, the CKAN support also makes it easy for Swedish public bodies who use CKAN to automatically expose their datasets correctly so that they can be automatically harvested by Öppnadata.se. For more information have a look at the CKAN DCAT extension documentation.
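To give a feel for what "a list of datasets" in DCAT looks like, here is a small Python sketch that builds a minimal catalog as JSON-LD. It uses only a handful of core DCAT and Dublin Core terms; the titles and URLs are invented, and a real application profile such as the Swedish one will require more fields than shown here.

```python
import json


def build_catalog(title, datasets):
    """Return a minimal DCAT catalog as a JSON-LD document (a plain dict)."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Catalog",
        "dct:title": title,
        "dcat:dataset": [
            {
                "@type": "dcat:Dataset",
                "dct:title": dataset["title"],
                "dcat:distribution": [
                    {"@type": "dcat:Distribution", "dcat:accessURL": {"@id": url}}
                    for url in dataset["urls"]
                ],
            }
            for dataset in datasets
        ],
    }


# Invented example data for a hypothetical municipality.
catalog = build_catalog(
    "Example municipality open data",
    [{"title": "Budget 2015", "urls": ["https://example.org/budget-2015.csv"]}],
)
catalog_json = json.dumps(catalog, indent=2)
```

Any publishing system that can emit a document along these lines at a stable URL could, in principle, be harvested without the harvester knowing anything else about that system.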

Dead or alive

The Web is decentralised and always changing. A link to a webpage that worked yesterday might not work today because the page was moved. When automatically adding external links, for example links to resources for a dataset, you run the risk of adding links to resources that no longer exist.

To counter that, Öppnadata.se uses a CKAN extension called Dead or alive. It may not be the best name, but that’s what it does: it checks whether a link is dead or alive. The checking itself is performed by an external service called deadoralive. The extension simply hands that service the links it should check. In this way dead links are automatically marked as broken, and system administrators of Öppnadata.se can find problematic public bodies and notify them that they need to update their DCAT catalog (this is not automatic because nobody likes spam).
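The core of such a check is small. The sketch below is not the deadoralive service's code, just an illustration of the idea in Python: send a HEAD request and treat errors, timeouts and malformed URLs as "dead". The `opener` parameter is an addition for this example so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def is_alive(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if `url` answers with a non-error status, False otherwise."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        response = opener(request, timeout=timeout)
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, DNS failures, timeouts and malformed URLs all count as dead.
        return False
    try:
        return response.status < 400
    finally:
        response.close()
```

A real checker would probably retry before declaring a link broken, since a single failed request can be a temporary glitch.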

These are only the automation highlights of the new Öppnadata.se. Other changes were made that have little to do with automation but are still not immediately visible, so a lot of Öppnadata.se’s beauty happens behind the scenes. That’s also the case for other open data portals. You might just visit your open data portal to get some open data, but you might not realise the amount of effort and coordination it takes to get that data to you.

Image of Swedish flag by Allie_Caulfield on Flickr (cc-by)

CKAN 2.4 release and patch releases

davidread - July 22, 2015 in Releases, Uncategorized

We are happy to announce that CKAN 2.4 is now released. In addition, new patch releases for older versions of CKAN are now available to download and install.

CKAN 2.4

The 2.4 release brings a way to set the CKAN config via environment variables and via the API, which is useful for automated deployment setups. 2.4 also includes plenty of other improvements contributed by the CKAN developer community during the past 4 months, as detailed in the 2.4.0 CHANGELOG.
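As a sketch of how environment-based configuration could be used from a deployment script, the Python snippet below prepares an environment for a CKAN process. The variable names are the ones I recall from the CKAN configuration documentation; treat them as assumptions and check the docs for your version. All values are placeholders.

```python
import os


def ckan_environment(base_env, site_url, sqlalchemy_url, solr_url):
    """Return a copy of `base_env` with CKAN settings added as environment variables."""
    env = dict(base_env)
    env.update({
        "CKAN_SITE_URL": site_url,              # assumed to override ckan.site_url
        "CKAN_SQLALCHEMY_URL": sqlalchemy_url,  # assumed to override sqlalchemy.url
        "CKAN_SOLR_URL": solr_url,              # assumed to override solr_url
    })
    return env


# Placeholder values for a hypothetical containerised deployment.
env = ckan_environment(
    os.environ,
    site_url="https://data.example.org",
    sqlalchemy_url="postgresql://ckan:password@db/ckan",
    solr_url="http://solr:8983/solr/ckan",
)
```

The resulting `env` can be passed to `subprocess.Popen(..., env=env)` when starting CKAN, so the same config file can be reused across environments.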

If you have customizations or extensions, we suggest you trial the upgrade first in a test environment and refer to the changes in the changelog. Upgrade instructions are below.

CKAN patch releases

These new patch releases for CKAN 2.0.x, 2.1.x, 2.2.x and 2.3.x fix important bugs and security issues, so users are strongly encouraged to upgrade to the latest patch release for the CKAN version they are using.

For a list of the fixes included you can check the CHANGELOG:

Upgrading

For details on how to upgrade, see the following links depending on your install method:

If you find any issues, you can let the technical team know on the mailing list or in the IRC channel.

Some introductory presentations for CKAN

Steven De Costa - June 8, 2015 in Community, Presentations

Reposted from the CKAN Association LinkedIn group. Feel free to join if you use LinkedIn.

Thanks to Augusto Herrmann Batista and OK Brazil for allowing the following repost:

I recently presented a couple of “lightning courses” to introduce an audience to CKAN.

One was at the Linked Open Data Brasil conference in Florianópolis, Brazil, in November 2014. It is in Portuguese.

http://www.slideshare.net/AugustoHerrmannBatis/minicurso-de-ckan

The other one was presented at the IV Moscow Urban Forum, in Russia, in December 2014. This one is in English.

http://www.slideshare.net/AugustoHerrmannBatis/ckan-overview

Feel free to share and reuse, as they are CC-BY.

Bazinga! Minutes from the CKAN Association Steering Group – 1 April (no joke)

Steven De Costa - April 1, 2015 in Association, Featured

Readme.txt

The following minutes represent what the Steering Group discussed today, but please remember it’s also just a meeting (context: no real work is ever done in a meeting). The objective is to discuss and assign actions when needed, to make decisions when needed and to generally align everyone in the various ways each member is already supporting the CKAN project. Reading between the lines of this update there are a few points to call out and make mention of.

  1. The Steering Group (SG) are renewed with energy and determination. While the last meeting might have been some time ago we have set ourselves the objective of meeting weekly (after next week) because it is clear that the CKAN project is advancing rapidly and support from the SG needs to align with the velocity of the project without any risk of holding it back. Let’s add some buzzwords and suggest that the SG is aiming to bootstrap the project and intersect on multiple vectors to achieve maximum lift via regular and meaningful engagement with its project stakeholders (Please don’t take that last sentence seriously).
  2. ‘Distill out a 1-3 pager’ in relation to the business plan means getting lean and putting focus on the most essential parts of the CKAN Association business plan. Long docs with much wordage are great in some situations but in the case of the CKAN project we have an avid community of exceptionally bright people who are fine with the key objectives, strategies and tactics put forward in the most succinct way possible.
  3. If there is to be an operating model for the SG then it will be this: Say what is going to get done. Get it done+. Let everyone know it is done.
  4. Some awesome questions are answered at the end of this post.

+ In some cases things might not actually get done but we will strive to do the best we can. Yes, we’ll be transparent with goals. Yes, we’ll be happy to take any and all feedback. Yes, we are working for the CKAN project and are ultimately governed via public peer review by the project’s community.

CKAN Association Steering Group Meeting 1 April 2015

  • Present: Ashley Casovan (Chair), Steven De Costa, Rufus Pollock (Secretary)
  • Apologies: Antonio Acuna

Minutes

  1. Steering Group Goals (next quarter)
    1. Announce more clearly existence and purpose of Steering Group
      1. steering group email alias: steering-group@ckan.org (goes to group)
    2. Announce objectives which are
      1. Finalise business plan (have now had out for consultation for some time)
        1. Distill out a 1-3 pager
        2. Finalise and announce to list
        3. hangout on air to announce
      2. Community meetings
        1. Technical team to run at least one (general) developer community meeting in next 2m
        2. At least one users community meeting in next 2m
  2. Responsibilities of the SG
    1. Like a board – see http://ckan.org/about/steering-group/
    2. Similarities to Drupal Board: job is to support the community in moving the project forward – self-determination
  3. Review Actions
    1. https://trello.com/b/D6zxiuFJ/ckan-association-steering-group – primarily business plan and response to questions [note this Trello board is private]
  4. CKAN Event at IODC
    1. CIOs and CTO – CKAN is part of national and regional infrastructure
    2. efficiency gains on open data
    3. https://github.com/ckan/ideas-and-roadmap/issues/120
    4. Technical capability
  5. Review of student position description
    1. Ashley to send out to SG members for comment
  6. Meeting schedule: SG will meet weekly (for present) every Thursday at 12 noon UK (for 30m)
  7. Publishing minutes from this meeting – will aim to send asap

Your questions answered

Q: Is the SG interested in increasing transparency of the SG meetings? How will this be achieved?

SG: Yes. This was discussed and we would like to propose the next meeting be run in two parts. One part will be closed to attend to some regular business of the group with regard to coordinating efforts. The second half will be broadcast as a Hangout on Air for people to watch. We’ll aim to collect questions ahead of the meetings and address them during the broadcast with further options for an active Q&A session from the audience.

Q: Have the SG determined whether members of the association are yet contributing funds, or developers to the project? What are they? What happens if members don’t?

SG: There is ongoing work in this area. Most members are contributing in-kind (not exclusively developers). We’d be happy to make the pledged contributions public via the members listing on CKAN.org. At this time it is an honour system with regard to meeting membership obligations that are provided in-kind. If a member is suspected of not providing the expected level of in-kind contribution then the Steering Group will investigate and consider appropriate actions upon conclusion of such investigations.

Q: How does the SG see its role with respect to providing direction for the project?

SG: Support the community of both technical stakeholders and users in ways which allow them to act in concert to move the project forward in the direction these stakeholders determine to be best for the project.

Q: How is the SG raising more funds, other than membership, to further fund development of CKAN?

SG: This is a question the Steering Group is working through currently. Our focus is on the Business Plan and putting strategic objectives down for all to see via that document. Grant applications and the coordination of requirements to meet the needs of a group of platform owners is also being considered. With the latter the proposed approach is to release an expression of interest for funding support against specific development activities. Those who highly value such activities would be asked to help contribute to a pool of funds that would then see the development work paid for.

CKAN Association Steering Group – about to set sail!

Steven De Costa - March 31, 2015 in Association, Events, Featured

The CKAN Association Steering Group will be meeting in about 30 hours from now. I wanted to make sure we took the opportunity to ask for community questions regarding the CKAN project.

So, please comment here with any questions you might like discussed and/or answered by the steering group :)

This will be my first chance to catch up with everyone in the group so I will have lots of questions of my own. I’m also keen to provide updates on how I see things are going with regard to developing and extending the CKAN community and its reach with regard to communications activities. We have a modest starting point, so updates will be easy to provide. It would be great to get comments via this post on what more people would like to see. However, there are many action items incomplete from within the Community and Communications Team so I’ll also be reporting on that. We don’t yet have a list of CKAN vendors and this is clearly needed based on the number of CKAN Dev list requests regarding upgrade questions when planning a move to 2.3.

Some great positive indicators I see for the project are the number of people active on the CKAN Dev email list and the high volume of quality conversations that are taking place there. It appears the 2.3 release has been the catalyst needed for a fantastic reinvestment (at least publicly) from both the regular technical team members and the wider community of awesome people doing amazing stuff within their own open data projects. I would like those on the steering group to recognise this change and actively work to support ###MOAR###!

As a new member of the steering group I should introduce myself. You can see the bio attached to this post but for a fresh video-cast of something I’m involved in within my local area you can also take a look at the Australian Open Knowledge Chapter Board meeting that was held earlier today. The video is embedded below. I do actually mention the work I’ve been doing within the CKAN association at some point so please excuse the ‘inception’-like self referential nature of all this.

The main message here is – steering group meeting in about 30 hours. Please comment on this post to amplify your voice within that forum.

Rock on! Steven

Presenting public finance just got easier

Tryggvi Björgvinsson - March 20, 2015 in Extensions, Feature, Featured, Releases, Visualization

CKAN 2.3 is out! The world-famous data handling software suite which powers data.gov, data.gov.uk and numerous other open data portals across the world has been significantly upgraded. How can this version open up new opportunities for existing and coming deployments? Read on.

One of the new features of this release is the ability to create extensions that get called before and after a new file is uploaded, updated, or deleted on a CKAN instance.

This may not sound like a major improvement but it creates a lot of new opportunities. Now it’s possible to analyse the files (which are called resources in CKAN) and take them to new uses based on that analysis. To showcase how this works, Open Knowledge in collaboration with the Mexican government, the World Bank (via Partnership for Open Data), and the OpenSpending project have created a new CKAN extension which uses this new feature.

It’s actually two extensions. One, called ckanext-budgets, listens for creation and updates of resources (i.e. files) in CKAN, and when that happens the extension analyses the resource to see if it conforms to the data file part of the Budget Data Package specification. The Budget Data Package specification is a relatively new specification for budget publications, designed for comparability, flexibility, and simplicity. It’s similar to data packages in that it provides metadata around simple tabular files, like a csv file. If the csv file (a resource in CKAN) conforms to the specification (i.e. the columns have the correct titles), then the extension automatically creates the Budget Data Package metadata based on the CKAN resource data and makes the complete Budget Data Package available.

It might sound very technical, but it really is very simple. You add or update a csv file resource in CKAN and it automatically checks if it contains budget data in order to publish it on a standardised form. In other words, CKAN can now automatically produce standardised budget resources which make integration with other systems a lot easier.
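The conformance check described above boils down to comparing column titles. Here is a minimal Python sketch of that idea; it is not the code of ckanext-budgets, and the column titles in `REQUIRED_COLUMNS` are placeholders rather than the actual fields required by the Budget Data Package specification.

```python
import csv
import io

# Hypothetical column titles; the real required fields are defined by the
# Budget Data Package specification and are not reproduced here.
REQUIRED_COLUMNS = frozenset({"id", "amount", "code"})


def looks_like_budget_csv(csv_text, required=REQUIRED_COLUMNS):
    """Return True if the CSV header row contains every required column title."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    titles = {title.strip().lower() for title in header}
    return required <= titles
```

An extension hooked into resource creation and updates could run a check like this on each uploaded csv file and only generate the budget metadata when it returns True.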

The second extension, called ckanext-openspending, shows how easy such an integration around standardised data is. The extension takes the published Budget Data Packages and automatically sends them to OpenSpending. From there OpenSpending does its own thing, analyses the data, aggregates it and makes it very easy to use for those who use OpenSpending’s visualisation library.

So thanks to a perhaps seemingly insignificant extension feature in CKAN 2.3, getting beautiful and understandable visualisations of budget spreadsheets is now only an upload to a CKAN instance away (and can only get easier as the two extensions improve).

To learn even more, see this report about the CKAN and OpenSpending integration efforts.

If ‘Change’ had a favourite number…it would be 2.3

Steven De Costa - March 11, 2015 in Featured, Releases

There’s something about the number 2.3. It just rolls off the tongue with such an easy rectitude. Western families reportedly average 2.3 children; there were 2.3 million Americans out of work when Barack Obama took office; Starbucks go through 2.3 million paper cups a year. But the 2.3 that resonates with me most is 2.3 billion. That was the world population in the late 1940s, and growing. WWII was over and we were finally able to stand up, dust off the despair of war and Depression, bask in a renewed confidence in the future, and make a lot of babies. We were on the brink of something and what those babies didn’t know yet was that they would grow up to usher in a wave of unprecedented social, economic and technological change.

We are on the brink again. Open data is gaining momentum faster than the Baby Boomers are growing old and it has the potential to steer that wave of change in all manner of directions. We’re ready for the next 2.3. Enter CKAN 2.3.

Here are some of the most exciting updates:

  • Completely refactored resource data visualizations, allowing multiple persistent views of the same data and an interface to manage and configure them. Check the updated documentation to know more, and the “Changes and deprecations” section for migration details: http://docs.ckan.org/en/ckan-2.3/maintaining/data-viewer.html

  • Responsive design for the default theme, that allows nicer rendering across different devices

  • Improved DataStore filtering and full text search capabilities

  • Added new extension points to modify the DataStore behaviour

  • Simplified two-step dataset creation process

  • Ability for users to regenerate their own API keys

  • Changes on the authentication mechanism to allow more secure set-ups. See “Changes and deprecations” section for more details and “Troubleshooting” for migration instructions.

  • Better support for custom dataset types

  • Updated documentation theme, now clearer and responsive

If you are upgrading from a previous version, make sure to check the “Changes and deprecations” section in the CHANGELOG, especially regarding the authorization configuration and data visualizations.

To install the new version, follow the relevant instructions from the documentation depending on whether you are using a package or source install: http://docs.ckan.org/en/ckan-2.3/maintaining/installing/index.html

If you are upgrading an existing CKAN instance, follow the upgrade instructions: http://docs.ckan.org/en/ckan-2.3/maintaining/upgrading/index.html

We have also made available patch releases for the 2.0.x, 2.1.x and 2.2.x versions. It is important to apply these, as they contain important security and stability fixes. Patch releases are fully backwards compatible and really easy to install: http://docs.ckan.org/en/latest/maintaining/upgrading/upgrade-package-to-patch-release.html

Charting the CKAN boom.

The following graph charts population from 1800 to 2100 but we’re interested in the period from the mid-1940s when there was a marked boost in population growth.

World population estimates from 1800 to 2100. Sourced from Wikipedia: http://en.wikipedia.org/wiki/World_population. The growth from 2.3 billion in the 1940s is the Boom!

With the recent release of CKAN 2.3 we’re expecting a similar boost in community contributions. To add your voice to the community and boost the profile of the CKAN project please share a picture on twitter and include the hashtag #WeAreCKAN.
