In Category on 11 Oct 2024
Enhancing DCAT support in CKAN (DCAT-AP v3, scheming integration, and more)
A review of the recent developments in CKAN's DCAT support, and how you can get involved
repeating_text fields.
You must be using CKAN 2.8 or later and a custom IPackageController plugin to index datasets with repeating subfields.
repeating_label may be used to provide a singular label for each group of subfields.
repeating_subfields contains a list of field definitions to repeat.
- field_name: submission
  label: Submissions
  repeating_label: Submission
  repeating_subfields:
  - field_name: date
    label: Date
    preset: date
    required: true
  - field_name: text
    label: Text
    preset: markdown
    required: true
  - field_name: flags
    label: Flags
    preset: multiple_checkbox
    choices:
    - label: Draft
      value: D
    - label: Approved
      value: A
    - label: Response Required
      value: R
Data stored in subfields is represented as lists of JSON objects in the API.
"submission": [
  {
    "date": "2021-02-01",
    "text": "an example submission",
    "flags": [
      "D"
    ]
  },
  {
    "date": "2021-02-05",
    "text": "another one",
    "flags": [
      "A",
      "R"
    ]
  }
],
Normal and custom validation rules apply, and errors are displayed in the form by referencing the subfield group and the field with the error, e.g. “Submission 2: Text: Missing value”.
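The error labelling described above can be pictured with a small standalone sketch. This is not ckanext-scheming's validator: `SUBFIELDS` and `subfield_errors` are hypothetical names, only the "required" rule is checked, and the labels are assumed to match the example schema.

```python
# Hypothetical, simplified stand-in for the subfield definitions in the schema.
SUBFIELDS = [
    {"field_name": "date", "label": "Date", "required": True},
    {"field_name": "text", "label": "Text", "required": True},
    {"field_name": "flags", "label": "Flags", "required": False},
]


def subfield_errors(groups, repeating_label="Submission"):
    """Return messages such as 'Submission 2: Text: Missing value'."""
    errors = []
    # Groups are numbered from 1 in the messages shown to the user.
    for index, group in enumerate(groups, start=1):
        for field in SUBFIELDS:
            value = group.get(field["field_name"])
            if field["required"] and not value:
                errors.append(
                    f"{repeating_label} {index}: {field['label']}: Missing value"
                )
    return errors


submissions = [
    {"date": "2021-02-01", "text": "an example submission", "flags": ["D"]},
    {"date": "2021-02-05", "text": "", "flags": ["A", "R"]},
]
print(subfield_errors(submissions))
```

Here the second group has an empty text value, so the only message produced refers to group 2 and the Text subfield.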
Use a custom before_index plugin and Solr schema to handle indexing of repeating subfields in the way that best suits your own site.
import ckan.plugins as p


class SubmissionsIndexPlugin(p.SingletonPlugin):
    """
    Map submission dataset fields to Solr fields
    """
    p.implements(p.IPackageController, inherit=True)

    def before_index(self, data_dict):
        flags = set()
        text = []
        for sub in data_dict.get('submission', []):
            text.append(sub['text'])
            flags |= set(sub['flags'])
        # replace list of dicts with plain text to prevent Solr errors
        data_dict['submission'] = '\n'.join(text)
        # index flags present in any submission
        data_dict['submission_flags'] = sorted(flags)
        return data_dict
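To see what this hook produces, the same mapping can be run outside CKAN. The following is a sketch only: `flatten_submissions` is a hypothetical standalone copy of the method body, applied to the sample submission data from the API example.

```python
def flatten_submissions(data_dict):
    """Standalone copy of the before_index logic, for illustration only."""
    flags = set()
    text = []
    for sub in data_dict.get("submission", []):
        text.append(sub["text"])
        flags |= set(sub["flags"])
    # The list of dicts becomes one newline-separated text value.
    data_dict["submission"] = "\n".join(text)
    # Every flag seen in any submission is kept once, in sorted order.
    data_dict["submission_flags"] = sorted(flags)
    return data_dict


indexed = flatten_submissions({
    "submission": [
        {"date": "2021-02-01", "text": "an example submission", "flags": ["D"]},
        {"date": "2021-02-05", "text": "another one", "flags": ["A", "R"]},
    ]
})
print(indexed["submission_flags"])  # ['A', 'D', 'R']
```

The dates are dropped in this mapping; a site that needs to search by date would have to index them into a field of its own.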
For submission_flags to accept multiple values we must add a multivalued field to our Solr schema <fields> configuration:
<field name="submission_flags" type="string" indexed="true" stored="true" multiValued="true"/>
These new fields will now be available for use with CKAN advanced search, e.g. submission:example or submission_flags:D.
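As a sketch of how such a query might be sent through the API, the snippet below builds a URL for CKAN's package_search action using only the standard library. The site address https://ckan.example.org is a placeholder and `search_url` is a hypothetical helper; no request is actually made.

```python
from urllib.parse import urlencode

# Placeholder site address; replace with your own CKAN instance.
CKAN_SITE = "https://ckan.example.org"


def search_url(query, rows=10):
    """Build a package_search URL for a Solr-style query string."""
    params = urlencode({"q": query, "rows": rows})
    return f"{CKAN_SITE}/api/3/action/package_search?{params}"


print(search_url("submission_flags:D"))
```

The colon in the query is percent-encoded by `urlencode`, so the field name and value can be passed as written in the search box.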
If you don’t need advanced search or faceting based on repeating subfields, you may use the included scheming_nerf_index plugin instead. This plugin passes repeating fields to Solr as JSON strings to prevent indexing errors and doesn’t require a customized Solr schema.
Future CKAN support for dynamic fields in Solr will simplify this required configuration.
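The JSON-string approach mentioned for scheming_nerf_index can be pictured roughly as follows. This is an assumption-laden sketch, not the plugin's actual code: `nerf_repeating_fields` is a hypothetical name, and it converts every list or dict value, whereas the real plugin may select fields differently.

```python
import json


def nerf_repeating_fields(data_dict):
    """Replace list and dict values with JSON strings (illustrative only)."""
    flattened = {}
    for key, value in data_dict.items():
        if isinstance(value, (list, dict)):
            # Nested structures become a single string value.
            flattened[key] = json.dumps(value)
        else:
            flattened[key] = value
    return flattened


doc = nerf_repeating_fields({
    "name": "example-dataset",
    "submission": [
        {"date": "2021-02-01", "text": "an example submission", "flags": ["D"]},
    ],
})
print(type(doc["submission"]).__name__)  # str
```

Because the nested data ends up as one opaque string, it is not useful for faceting or field-level search, which matches the trade-off described above.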
repeating_text fields from ckanext-repeating with a dynamic form with add and remove buttons.
- field_name: contributors
  label: Contributors
  preset: multiple_text
Data stored in repeating text fields is represented as lists of strings in the API.
"contributors": [
  "Person A",
  "Person B",
  "Person C"
],
required: true may be used to require at least one entry. Per-field and other types of validation are not yet implemented. Add a comment to the multiple text validation issue if you would like to work on this feature.
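The "at least one entry" rule can be expressed in a few lines. The sketch below is not the extension's validator: `missing_required_list` is a hypothetical helper, and ignoring blank strings is an assumption made for illustration.

```python
def missing_required_list(value):
    """Return True when a required list-of-strings field has no entries."""
    # Assumption: blank strings are ignored, as an empty form row would be.
    entries = [item for item in (value or []) if item.strip()]
    return len(entries) == 0


print(missing_required_list(["Person A", "Person B", "Person C"]))  # False
print(missing_required_list([]))  # True
```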