Under the hood of CKAN
The following is a guest post from Christophe Guéret, member of the working group on EU Open Data.
CKAN.net is a community-based effort for creating an open catalog of public datasets. Using the web site, everyone is free to register datasets, thereby stating their existence and the possible links between them, along with extra metadata (license, author, …). One of the nice features of CKAN is that the data about the datasets is stored in a structured and consistent way. This allows for a direct export of this information as RDF data. It also won over Richard Cyganiak and Anja Jentzsch who, last year, decided to drop the wiki pages (1,2) they used to draw the LOD Cloud in favor of CKAN.
In addition to storing structured data, CKAN also offers a convenient API for accessing it. It’s a REST API that comes with several client bindings for Python, PHP, Perl, … and makes it easy to get a list of packages matching some criteria. This API can, for instance, be used to get a list of packages tagged as ‘lod-cloud’ and render them using protovis, as shown by Ed Summers and Richard Cyganiak on this site. Another interesting use is to get the same data and reformat it into something suitable for consumption by network analysis software (cf. this blog post). In addition to offering a wide range of visualisations for the network, these tools can also compute several metrics highlighting aspects of the graph that cannot be observed by looking at individual nodes.
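To give an idea of what the reformatting step might look like, here is a minimal Python sketch. It does not call the CKAN API itself: it assumes the package records have already been fetched and parsed into dictionaries. The record layout is an assumption made for illustration (a "name" field plus an "extras" mapping whose keys starting with "links:" name a target dataset and whose values hold a link count), and the function names and sample records are hypothetical. The output is a plain Source,Target,Weight edge list in CSV, a format that tools such as Gephi can usually import.

```python
import csv
import io

# Assumed convention for link extras; the post does not specify the layout.
LINK_PREFIX = "links:"


def packages_to_edges(packages):
    """Turn package metadata dicts into (source, target, weight) tuples."""
    edges = []
    for package in packages:
        source = package["name"]
        # Packages without extras simply contribute no edges.
        for key, value in package.get("extras", {}).items():
            if not key.startswith(LINK_PREFIX):
                continue  # not a link entry (e.g. license, author)
            target = key[len(LINK_PREFIX):]
            try:
                weight = int(value)
            except (TypeError, ValueError):
                weight = 1  # unparseable count: fall back to an unweighted edge
            edges.append((source, target, weight))
    return edges


def edges_to_csv(edges):
    """Serialise the edges as a CSV edge list with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()


# Made-up records, only to exercise the functions above.
sample_packages = [
    {"name": "dataset-a", "extras": {"links:dataset-b": "1200", "license": "cc-by"}},
    {"name": "dataset-b", "extras": {"links:dataset-a": "not-a-number"}},
    {"name": "dataset-c"},
]

if __name__ == "__main__":
    print(edges_to_csv(packages_to_edges(sample_packages)))
```

Keeping the conversion separate from the download means the same function can be fed from the REST API, from one of the client bindings, or from a local dump.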
Here are two examples of rendering, the first realised by Ed Summers and Richard Cyganiak:
The second realised by Rinke Hoekstra:
Visualisation of the clusters in the LOD Cloud, rendering done by Gephi:
If you haven’t tried it yet, go check the API and its documentation and start reformatting the data from CKAN.net to make something new out of it ;-)