Meetup Recap | The CKAN Ecosystem: Insights and Analysis from the POSE Team

Our July 2023 CKAN Monthly Live Meetup #19 presented an exclusive deep-dive into the comprehensive work of the POSE Project. Focused on strengthening the CKAN ecosystem, the team has been conducting a thorough analysis of data, engaging in insightful interviews, and hosting participatory workshops, all with the aim to improve CKAN's effectiveness as an open data, open government, and open science platform. This webinar provided a unique opportunity to delve into their preliminary findings.

  • Eager to get straight into the heart of our latest webinar? Click this link to access the recording.

Over the past nine months, the POSE Project team has carried out an in-depth exploration of the CKAN ecosystem, their first extensive effort to build relationships within the CKAN community. The work underlined the power of collaboration and community, and the critical role both play in navigating the ecosystem's complex dynamics.

An Interactive Session with the POSE Team

The session was led by Bob Gradeck and his colleagues Ross Reilly, Liz Monk, and Elise Silva from the University of Pittsburgh, along with Joel Natividad and Sami Baig from datHere. It spotlighted their contributions to the initiative, which is funded by the U.S. National Science Foundation’s Pathways to Enable Open-Source Ecosystems (POSE) program, and presented the preliminary findings of the team's intensive study of the CKAN ecosystem.

Missed the meetup? Here's a summary of the key insights:


Key Learnings from the POSE Project's Study

Bob kicked off the presentation by touching on a few points:

  1. Collaboration: The CKAN Ecosystem's Secret Sauce: Navigating the CKAN ecosystem isn't a solo journey; it's a buddy system, as both Bob and Joel highlighted. They acknowledged the indispensable role that partnerships played in working with a diverse, geographically widespread community and in building relationships within the ecosystem for the first time. These partnerships were the stepping stones to viewing CKAN through the lens of an ecosystem, a perspective shift that added immense value to their exploration.
  2. Understanding CKAN as an Ecosystem: With the help of the POSE Project, CKAN has begun to be perceived not just as a tool but as an ecosystem with vast potential: an interactive universe that thrives on collaboration.
  3. Sherlock Holmes-ing the CKAN Ecosystem: The POSE team turned detectives, conducting 20+ interviews and dissecting ecosystem artefacts such as usage patterns and deployed extensions, all to build a holistic understanding of how CKAN is used. The team is currently conducting a series of sense-making workshops to gather community feedback and further refine their understanding of the ecosystem's dynamics.
  4. Understanding the CKAN Community: Who makes up the CKAN ecosystem? Throughout their exploration, the team identified a range of roles within the CKAN ecosystem, including stewards, core team members, consultants, code contributors, community builders, adopters, and end-users. This kaleidoscope of roles was indicative of the vibrant community that powers the CKAN ecosystem. It's like a CKAN family with a shared dedication to the principles of open data.
  5. Looking Forward: As the project wraps up, the team aims to develop a Phase Two proposal for the NSF. The proposal would seek additional resources for the CKAN ecosystem and identify what can further strengthen it.

Decoding the Research: Insights and Learnings

Dr. Elise Silva from the University of Pittsburgh led a comprehensive study to delve deep into the CKAN ecosystem. This meticulous endeavour involved more than 20 interviews and featured input from over 30 individuals, encapsulating the diverse array of users that bring this ecosystem to life. This comprehensive representation enabled the research team to gather a well-rounded understanding of CKAN's multifaceted user base.

Using constructivist grounded theory, a qualitative analysis approach that delves into the subjective nature of the research and how the researcher engages with participant data, Dr. Silva and her team navigated through the data. Their method of analysis employed a 'hybrid open and closed coding' technique. It allowed them to begin the process with predetermined themes while simultaneously welcoming fresh, unexpected concepts that emerged as they explored the interviews. The interviews underscored the importance of onboarding, connecting with existing users, enhancing support, documentation, and overall accessibility within the ecosystem. Conversations around the value of open data and the urge for community building resonated significantly among the participants. They emphasized these core values as the reason behind their active participation in the CKAN ecosystem.

CKAN Instance Analysis: The Pulse of the Ecosystem

Ross Reilly presented an intriguing "CKAN health check". He delved into CKAN instances' size, lifespan, updates, and geographical distribution. His findings indicate a buzzing ecosystem, with active updates and a global presence. You can find out more in his report here: CKAN Instance Analysis

Despite the absence of a single, comprehensive list of CKAN instances, the team managed to compile a list from two websites: datashades.info and dataportals.org.

Key takeaways from the CKAN instance analysis:

  • Of the approximately 1,000 URLs examined, a total of 381 functioning CKAN instances were found.
  • CKAN instances were identified from 59 different countries, with a range in size from 0 to 1,000,000+ datasets.
  • One of the challenges was to determine whether the identified portals were indeed instances of CKAN. For this, the team used the API functions provided by CKAN, and any data portal that had a valid response to at least one API call was considered a CKAN instance.
  • Analysis of the size of the instances revealed a median of 242 datasets per instance, with a few instances hosting a disproportionately large number of datasets. The top three instances by number of datasets were the EUDAT B2Find data portal, with a staggering 1.12 million datasets, the Government of Indonesia’s Satu Data Indonesia, and the Geological Survey of Queensland Open Data Portal.
  • In terms of geographic distribution, CKAN instances were found to be most prevalent in Brazil and the USA, followed by Argentina, Australia, Canada, Germany, the UK, Spain, Italy, and Japan.
  • The estimated age of the instances, calculated using metadata creation dates, revealed that most instances were launched in the year 2017. Moreover, the instances appeared to be maintained regularly after launch.
  • Analysis of the CKAN versions in use revealed that most instances have not upgraded to the latest version (2.10), with a quarter still using a version below 2.7.
  • The findings suggest a need for outreach and technical assistance to help users upgrade to more secure and sophisticated versions of the platform. They also highlight the value of boosting the adoption of CKAN as a data management solution.
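
For readers who want to try something similar, here is a minimal sketch of the "valid response to at least one API call" check described above, plus the kind of summary figures quoted in the takeaways. It is an illustration under stated assumptions, not the POSE team's actual script: the two probed actions (status_show and package_search, both part of CKAN's Action API), the function names, and the injectable fetch parameter are our own choices.

```python
import json
import re
from statistics import median
from typing import Callable, Optional
from urllib.request import urlopen

# Assumption: the recap does not say which API calls the team used.
# These two CKAN Action API calls are simply convenient probes.
PROBE_ACTIONS = ("status_show", "package_search?rows=0")


def http_get_json(url: str, timeout: float = 10.0) -> dict:
    """Default fetcher: GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def probe_portal(base_url: str,
                 fetch: Callable[[str], dict] = http_get_json) -> Optional[dict]:
    """Return basic facts about a portal, or None if no probe succeeds.

    A portal counts as a CKAN instance when at least one probed
    action returns a JSON object with "success": true.
    """
    info = {"url": base_url, "version": None, "datasets": None}
    valid = False
    for action in PROBE_ACTIONS:
        url = f"{base_url.rstrip('/')}/api/3/action/{action}"
        try:
            payload = fetch(url)
        except Exception:
            continue  # unreachable, not JSON, HTTP error, and so on
        if not (isinstance(payload, dict) and payload.get("success") is True):
            continue
        valid = True
        result = payload.get("result")
        if isinstance(result, dict):
            if "ckan_version" in result:   # reported by status_show
                info["version"] = result["ckan_version"]
            if "count" in result:          # reported by package_search
                info["datasets"] = result["count"]
    return info if valid else None


def major_minor(version: Optional[str]) -> Optional[tuple]:
    """Parse '2.9.5' into (2, 9); return None if it cannot be parsed."""
    match = re.match(r"(\d+)\.(\d+)", version or "")
    return (int(match.group(1)), int(match.group(2))) if match else None


def summarize(instances: list) -> dict:
    """Median dataset count and share of known versions below 2.7."""
    sizes = [i["datasets"] for i in instances if i["datasets"] is not None]
    versions = [v for v in (major_minor(i["version"]) for i in instances) if v]
    below = sum(1 for v in versions if v < (2, 7))
    return {
        "instances": len(instances),
        "median_datasets": median(sizes) if sizes else None,
        "share_below_2_7": below / len(versions) if versions else None,
    }
```

To use it, call probe_portal on each candidate URL, keep the results that are not None, and pass that list to summarize. Versions are compared as integer tuples rather than strings because a plain string comparison would sort "2.10" before "2.7".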

CKAN Evolution: From a Timeline to Prime Time

Sami Baig introduced us to an interactive timeline of CKAN's journey. Imagine a movie reel of CKAN's development milestones, from its birth to its current version. You can see it here: CKAN Major Releases Timeline: A Journey of Continuous Improvement.

With the help of TimelineJS, an open-source tool from Northwestern University's Knight Lab, the team converted significant release notes into a timeline, starting from CKAN's creation at version 0.1 by Rufus Pollock in 2005-2006 and running to its ongoing development at version 2.10 and beyond. Key milestones include the solid foundation and launch of data.gov.uk (version 1.0), a major architectural overhaul (version 2.0) that drastically improved performance and scalability, and the transformative replacement of CKAN’s web framework, Pylons, with Flask, together with support for Python 3. The timeline serves as an effective visualization of CKAN's continuous commitment to the open data ecosystem over the past 17 years. The frequency of new releases and advancements signals a promising future for the platform. The team plans to maintain and update this timeline to provide an easy-to-understand historical view. Sami mentioned they'd soon be sharing a detailed blog post explaining the behind-the-scenes process of building this timeline. Stay tuned!

Sense-making Workshops: CKAN's Idea Factory

Over the past month, Liz Monk and her team have held multiple sense-making workshops. These workshops are based on the findings from the interviews they conducted over the past year, which focused on themes such as onboarding new contributors, CKAN adoption and maintenance, and community building. The workshops have not only provided a platform for sharing experiences and learning but also fostered new connections within the community, thus solidifying the inherent collaborative spirit of CKAN. Three of these insightful workshops have been completed to date, but don't worry if you've missed out so far - there's still one more to go. Slated for next week, the final session is set to continue gathering valuable thoughts and feedback, ensuring no stone is left unturned. You can register here: Workshop 4: Open Discussion.

And for those who couldn't attend the initial sessions or are keen to revisit the discussions, there's good news. The team has already published a recap of the first two workshops in a blog post. You can see it here: Sensemaking Workshop Recap.

Q&A session

There was an insightful exchange of thoughts rather than a traditional Q&A session. Here are some of the interesting moments:

Co-Steward's Insight: Building a Thriving Tech Movement with Purpose, Balance, and Strong Sense of Community

During the discussion, co-steward Steven De Costa shared his insights on the key elements driving the success and progress of a tech movement. He highlighted three as fundamental: a collective purpose, balance, and a strong sense of community. Steven believes these elements are as important as, if not more important than, the technical or functional direction of a project.

Reflecting on our project's journey, Steven painted a picture of the many ways these key aspects have taken shape within the project. As we have navigated the intricate process of product development, community building, and ecosystem mapping, harmony has started to resonate, affirming the balance he spoke of and pointing us all in a shared direction. This harmony is more than just a comfortable rhythm - it's a reinforcement of our technical work, a testament to the unity of purpose that we share.

A Lake of Potential Energy: Harnessing Collective Efforts

Steven offered an intriguing analogy comparing our project's journey to a vast lake filled with water. Over time, our collective efforts have accumulated like the waters of this lake, each initiative contributing its own stream. This 'water' represents huge potential energy, ready to be harnessed - much like hydroelectric power. Various ongoing initiatives, including the diligent work of Alex Gostev and the engaging CKAN Monthly Live meetups, act as channels through which we are funneling this energy, converting the potential into meaningful output. It's a dynamic process, where diverse streams of effort converge and amplify the collective power.

Looking Forward: What Does the Future Hold?

As we move forward, Steven asked a thought-provoking question - what do we, as a community, need to provide next to bolster confidence in our project's forthcoming plans? This query, which was left open for us to ponder, reflects the ongoing need for engagement, collaboration, and mutual support.

Values, Purpose, and Open Source: A Call for Alignment

Steven highlighted the crucial role of purpose and strong values in the evolution of the project. He emphasized that a project like ours, akin to an open-source entity, necessitates alignment around a set of shared values. This alignment does not come from social pressure but rather stems from good guidance, mentorship, and leading by example.

He stressed the importance of creating a safe space where people feel free to make mistakes, feel supported in their successes, and are ultimately unified by a common purpose. In his view, we are at a pivotal moment, where data management platforms have the potential to provide invaluable benefits to society through projects like open data platforms. They can be instrumental in combating disinformation, fostering trust in public institutions, and countering misinformation.

This struggle isn't necessarily aggressive; instead, it's a matter of making data available and accessible. Steven sees creating digital public goods, driven by this 'for good' purpose, as an ideal goal for our project.

A Balance Between Social Purpose and Technical Ambition

While acknowledging the inherent competitive aspect of any tech project and the importance of technical direction, Steven concluded by underscoring the immense value of the project's social purpose. As we move forward, this balance between technical ambition and social impact will continue to shape our journey, steering us toward meaningful and sustainable progress.

The Evolution of CKAN: Perspectives

Ivan Begtin, a long-time user of CKAN, offered insights on the platform's role within the shifting open data ecosystem. He noted growing competition due to the rise in diverse data portals and repositories, necessitating more comprehensive monitoring of CKAN installations. Despite the slowing momentum behind government transparency, Begtin observed an increasing trend towards open research data, an area where CKAN's presence is notably underrepresented. He proposed that CKAN needs to adapt, either by broadening its feature set or by narrowing its focus to target specific topics, thus questioning the continuing relevance of open government data as CKAN's primary focus.

In response, co-steward Steven De Costa acknowledged the evolving landscape of the open data ecosystem, while underlining that CKAN is driven by a robust community of contributors and adopters. The history of data catalogs and jurisdictional government data are embedded in CKAN’s DNA, giving it a defensible position. CKAN’s strength lies in its established user base, community, proven data management practices, perseverance, and adaptation to changes. He emphasized the importance of focusing on these strengths, rather than the weaknesses, to continue to move forward.

Steven highlighted CKAN's resilience and longevity, attributing these qualities to the dedication and care of the community and tech team. He praised CKAN's non-commercial nature, which allows the project to maintain its direction without being swayed by external financial interests. In his view, this characteristic makes CKAN a reliable platform for customers looking for a stable foundation for their long-term data management needs.

Conclusion

As we look to the future, we're excited to leverage the collective energy we've built, and we invite everyone in our community to join us as we navigate the next stages of our project. It's through these combined efforts that we will truly make a difference.

In the spirit of continued growth and learning, we invite you to register for the POSE team's last workshop, scheduled for next week. This engaging session promises to offer further insights into the evolution of the CKAN ecosystem.

Register now and continue making a difference in the open data world: Sensemaking Workshop Registration.

Let's bring our shared vision to life, together.


Useful links:

Recording available

Click this link to access the recording.