Enhancing DCAT support in CKAN (DCAT-AP v3, scheming integration, and more)
A review of the recent developments in CKAN's DCAT support, and how you can get involved
📖 Useful files:
- Meeting notes
- CivicDataLab's presentation
At our most recent CKAN Monthly Live meetup, we had the opportunity to hear from CivicDataLab, a pioneering organization in India that works at the intersection of data, technology, design, and social science. They work to harness the potential of the open-source movement to enable citizens to engage better with public reforms. Their mission revolves around empowering civil society organizations, governments, non-profits, think tanks, media houses, universities, and other stakeholders through data-driven decision-making.
Read on to discover the success stories behind Open Budgets India, Open City, and JusticeHub and how CivicDataLab leveraged CKAN to innovate in public policy and civic engagement - practical insights you can apply yourself!
We were happy to have a panel of dedicated professionals from CivicDataLab joining us: Deepthi Chand, DC (Co-Founder), Abhinav Singh (Tech Lead), Shoaib Ahmed (Associate Lead Engineer - FrontEnd), and Sai Krishna (Data Engineer).
CivicDataLab has a multi-disciplinary approach to solving societal issues. Starting with public finance, the organization has expanded its efforts to include law and justice, climate change, and digital public goods, among other areas. The CivicDataLab model is iterative: begin by opening up datasets, collaborate with various stakeholders to develop tailored data solutions, and then work on enhancing the data literacy of these stakeholders. Importantly, CivicDataLab prioritizes capacity-building for various stakeholders to adopt these data solutions. The cycle then recommences, to identify more data to liberate and new solutions to develop.
CivicDataLab introduced Open Budgets India (OBI), a platform that provides stakeholders with a unified view of budget documents from different levels of the government.
JusticeHub serves as a data exchange platform in the legal ecosystem. It diverges significantly from the OBI platform by taking customization to a higher level, particularly in dataset creation and presentation. It stands out for its customized user interface and metadata designed to meet the needs of various stakeholders.
CivicDataLab's Open City initiative is designed for ordinary citizens. It shifts the focus from merely presenting data to enhancing user engagement and education, particularly for those not traditionally data-savvy.
CivicDataLab faced several challenges in making data useful and accessible. First, the platforms had to do more than just list data: they had to provide useful insights tailored to different needs. Second, they had to be flexible enough to allow custom views based on the user's role or specific use case. Third, the way data is organized had to be more specific to the situation, going beyond basic categories such as 'resources' and 'datasets'. Finally, while the data may be stored in standard formats like CSV, it had to be displayed in various formats to meet different user needs.
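To make the last point concrete, here is a minimal Python sketch of how one CSV file could feed several presentations. It is only an illustration: the sample data, column names, and helper functions are hypothetical and are not taken from CivicDataLab's platforms.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample data: column names and values are illustrative only.
SAMPLE_CSV = """department,year,allocation
Health,2022,120
Health,2023,150
Education,2022,200
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def as_json_records(rows):
    """Serialize rows as JSON, e.g. for a chart or front-end component."""
    return json.dumps(rows, indent=2)


def totals_by(rows, key, value):
    """Sum a numeric column per category, e.g. for a summary card."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)


rows = read_rows(SAMPLE_CSV)
summary = totals_by(rows, "department", "allocation")
```

The same `rows` can be handed to a table view as-is, to a chart as JSON, or to a dashboard as per-category totals, so the stored format stays simple while the presentation varies by audience.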
During deployments, a key realization was the need for custom presentation methods to effectively reach different stakeholders, especially those who aren't tech-savvy. That’s when they began to explore decoupling CKAN's front end from its back end. This allows for a more customized interface while leveraging CKAN's robust data management strengths.
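As a rough sketch of what a decoupled setup can look like, the Python below queries CKAN's Action API (`package_search`) and reshapes the response for a custom front end. The instance URL, the `to_cards` helper, and the choice of fields are assumptions made for illustration; this is not CivicDataLab's actual service layer.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder URL for illustration; replace with a real CKAN instance.
CKAN_URL = "https://ckan.example.org"


def action_url(base_url, action, **params):
    """Build a CKAN Action API URL such as /api/3/action/package_search."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    return f"{url}?{urlencode(params)}" if params else url


def unwrap(envelope):
    """Return the 'result' of an Action API response, or raise on failure."""
    if not envelope.get("success"):
        raise RuntimeError(f"CKAN action failed: {envelope.get('error')}")
    return envelope["result"]


def to_cards(search_result):
    """Reduce package_search output to the fields a custom UI might show."""
    cards = []
    for pkg in search_result["results"]:
        formats = {r.get("format") or "" for r in pkg.get("resources", [])}
        cards.append({
            "id": pkg["name"],
            "title": pkg.get("title") or pkg["name"],
            "formats": sorted(formats - {""}),
        })
    return cards


def search_datasets(query, rows=10, fetch=None):
    """Query package_search; `fetch` can be swapped out for offline testing."""
    fetch = fetch or (lambda url: json.load(urlopen(url)))
    url = action_url(CKAN_URL, "package_search", q=query, rows=rows)
    return to_cards(unwrap(fetch(url)))
```

A front end built with any framework can then consume the trimmed-down cards instead of raw CKAN responses, while CKAN continues to own storage, metadata, and search. The same `action_url` helper works for other actions, for example `group_list`.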
CivicDataLab has gone the extra mile to create specialized dashboards to suit unique needs - Sector Dashboards, Constituency Dashboards, Budgets for Justice, Zombie Tracker, and Data 4 Districts, to name a few. These dashboards enable nuanced, domain-specific analysis and queries, serving as hubs for stakeholder engagement.
The key learnings from CivicDataLab's approach include:
The team is working on a specialized solution called Open Publishing (oPub). This effort is guided by core principles including privacy by design, design for scale, open for scale, and a commitment to open-source. The effort is focused on modularity and being data-driven, with the aim of adhering to diverse data standards, indicators, frameworks, custom licenses, etc. The team also envisions implementing customizable feedback loops and alerts, as well as incorporating user insights and real-time data analytics into the platform. The ultimate goal is to facilitate the easier and more robust adoption of the platform across various sectors, including climate, health, legal aid, and governance.
CivicDataLab's achievements mark a pivotal shift in how data can empower society. Using open data, they've made crucial information accessible to everyone, revolutionizing civic participation and transparency in governance. As they continue to innovate, they're setting a new standard for how data can drive social progress.
CKAN serves as the reliable data management backbone in their ventures, and its adaptability shows across diverse applications, from budget transparency to urban planning. The bottom line? A solid backend like CKAN is the foundation for creating user-focused experiences that truly make an impact. If you want to see how data can revolutionize society, keep an eye on CivicDataLab and the power of CKAN.
Q1: How much time did it take to go through all the processes?
A1: The projects have run for different lengths of time: eight years for Open Budgets India (OBI), three years for JusticeHub, and two years for Open City. The work presented is an accumulation of long-term learnings.
Q2: For the storytelling, did you use any other visualization tools to help facilitate the storytelling?
A2: WordPress was used for textual content moderation and country-specific content development. For visual elements, D3 was used for Union Budget Explorer, Apache Superset for some solutions, and Apache ECharts is being explored for specific use cases that require greater control over visual representation.
Q3: Are any of these innovations open source as well?
A3: All the platforms discussed are open source and come under the umbrella of the CivicDataLab organization.
Q4: When you engage with end users, how do you define data standardization with them?
A4: Different strategies are adopted depending on the dataset. For procurement data, Open Contracting Partnership standards were employed. For end-users, data standardization is not the focus; instead, they are interested in specific use cases. The objective is to align data standards with the insights that the end users wish to derive.
Q5: What other technologies are you using alongside CKAN?
A5: In addition to CKAN, Next.js and React are used for custom views. Python service layers are sometimes placed in between to process data. Other visualization tools like D3, ECharts, and Superset are also used. Django CMS and WordPress have also been deployed on certain platforms.
Q6: Regarding ETL, are you building your own pipelines, or using any technology?
A6: Initially, Airflow was used for ETL tasks. However, to gain more programmatic control over pipelines, the team switched to using Prefect.
Q7: Who are the current users of the platform? Is there a revenue stream, or is it a non-profit contribution to the general public?
A7: The platforms are public goods with no revenue streams derived directly from them. Revenue is primarily generated through engagements with government entities, often centered around data analysis and promoting the open data ecosystem.
Q8: Are you integrating WordPress with CKAN and could you elaborate more on that method?
A8: WordPress is integrated with CKAN at specific touchpoints, primarily for data exchange, such as fetching the list of groups or datasets. However, the authentication mechanisms have not been integrated; CKAN remains the primary authentication system.
Q9: What is the approach you take to do the data modeling?
A9: The approach to data modeling is dictated by the platform's specific use case and the insights it aims to provide. For example, in a sector dashboard, the sector becomes the first-class citizen of the data model. The overarching strategy is to develop data models that best serve the users' needs and the insights the platform seeks to convey.
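To illustrate the idea of making the sector the first-class citizen, here is a small Python sketch in which datasets are organized under sectors rather than the other way around. The class names, fields, and sample values are hypothetical and only show the shape of such a model, not CivicDataLab's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    sector: str
    year: int


@dataclass
class Sector:
    """The top-level entity: a dashboard page would be built per sector."""
    name: str
    datasets: list = field(default_factory=list)

    def years(self):
        """Distinct years covered by this sector's datasets, ascending."""
        return sorted({d.year for d in self.datasets})


def group_by_sector(datasets):
    """Index a flat list of datasets by sector name."""
    sectors = {}
    for d in datasets:
        sectors.setdefault(d.sector, Sector(d.sector)).datasets.append(d)
    return sectors


# Hypothetical sample data for illustration only.
sectors = group_by_sector([
    Dataset("health-budget-2022", "Health", 2022),
    Dataset("health-budget-2023", "Health", 2023),
    Dataset("education-budget-2023", "Education", 2023),
])
```

With this shape, a query such as "which years does the Health sector cover?" becomes `sectors["Health"].years()`, which mirrors the questions a sector dashboard user would actually ask.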