Earlier this month, the Statistical Information System Collaboration Community (SIS-CC) convened a workshop on Next Generation SDMX Data Modelling. Chaired by Edgardo Greising (ILO), the three-hour session brought together SIS-CC members, partners and invited experts – including session facilitators Yamil Vargas (IMF), Glenn Tice (BIS), György Gyomai and David Barraclough (OECD) – to explore emerging needs and co-investment opportunities for SDMX data modelling, with a particular focus on interoperability and integration across the .Stat Suite.
Identifying future modelling needs
After an introduction by Edgardo Greising, Yamil Vargas led an interactive brainstorming session aimed at generating ideas and solutions for data modelling in SDMX. The session fostered open dialogue and collaboration, drawing on the varied perspectives within the group to identify data modelling use cases, personas, needs, challenges and potential improvements, particularly through the use of AI.
The goal was to harness the collective expertise and diverse viewpoints of the participants to explore new ideas, address existing challenges and uncover innovative solutions. Through lively discussion, the group converged on four key areas, aligning technical and business perspectives to enable future prototypes and pilot projects.
Defining the ideal toolkit
Glenn Tice (BIS) then invited attendees to imagine an “ideal toolkit” for SDMX modelling. Ideas coalesced around:
- Graphical/visual UI to reduce dimensional-model complexity and support multiple artefacts in one view
- Multi-source integration, pulling in metadata and classifications from Excel, other tools, SDMX stores, the Global Registry, and the SDMX Global Discovery Service
- Re-use of concepts, code lists and classifications via embedded governance workflows
- FOSS, community-driven foundations with semantic versioning for artefact management
- Web-based, profile-aware designs to serve both high-capacity and low-capacity data producers/countries
- AI-powered expert assistance, offering guidance based on prior modelling solutions and simulating alternative design choices
- Governance hooks for enforcing harmonisation and annotating models throughout development
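Several of these ideas, notably semantic versioning for artefact management, assume consistent artefact references. As an illustrative sketch only (the regex and the `parse_ref` helper are inventions for this post, not part of any presented tool), SDMX's common convention of referencing a structural artefact as AGENCY:ID(VERSION) can be parsed so that versions compare numerically:

```python
import re

# SDMX structural artefacts are conventionally referenced as
# AGENCY:ARTEFACT_ID(VERSION), e.g. "SDMX:CL_FREQ(2.0)".
REF = re.compile(
    r"^(?P<agency>[A-Za-z0-9_.]+):(?P<id>[A-Za-z0-9_]+)"
    r"\((?P<version>[0-9]+(\.[0-9]+)*)\)$"
)

def parse_ref(text):
    """Split an artefact reference into (agency, id, version-tuple)."""
    m = REF.match(text)
    if not m:
        raise ValueError(f"not a valid artefact reference: {text!r}")
    version = tuple(int(part) for part in m.group("version").split("."))
    return m.group("agency"), m.group("id"), version

agency, artefact, version = parse_ref("SDMX:CL_FREQ(2.0)")
# Version tuples compare numerically, so tooling can pick the
# latest compatible version of a code list or concept scheme.
assert parse_ref("SDMX:CL_FREQ(2.1)")[2] > version
```

Numeric version tuples avoid the classic string-comparison trap where "10.0" sorts before "2.0".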
Automating model generation
Looking to the future, György Gyomai showcased two flagship scenarios for auto-generating SDMX data models:
- Gateway tool: allowing resource-constrained agencies to ingest existing non-SDMX datasets and spin up structural metadata automatically, thereby unlocking .Stat tools for first-time users;
- Data integration: providing rapid “lift-and-shift” of legacy data sources into SDMX repositories via parser-driven model creation.
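The core of the gateway idea can be sketched in a few lines. In this hypothetical example (the sample CSV, the `OBS_COLUMN` assumption and the `infer_structure` helper are inventions for illustration, not the pilots shown at the workshop), each non-measure column of a flat dataset becomes a candidate dimension, and its distinct values become a candidate code list:

```python
import csv
import io

# Hypothetical non-SDMX input: a flat CSV with one observation
# column and categorical columns that should become dimensions.
RAW = """country,year,sex,value
FR,2021,F,3.2
FR,2021,M,3.5
DE,2021,F,2.9
"""

OBS_COLUMN = "value"  # assumed name of the measure column

def infer_structure(text, obs_column):
    """Infer a minimal DSD-like structure: every non-measure column
    becomes a dimension, with a code list of its distinct values."""
    rows = list(csv.DictReader(io.StringIO(text)))
    dims = [col for col in rows[0] if col != obs_column]
    return {
        "dimensions": dims,
        "codelists": {d: sorted({r[d] for r in rows}) for d in dims},
        "primary_measure": obs_column,
    }

structure = infer_structure(RAW, OBS_COLUMN)
print(structure["dimensions"])            # ['country', 'year', 'sex']
print(structure["codelists"]["country"])  # ['DE', 'FR']
```

A real gateway would of course go further, for example mapping inferred code lists onto existing cross-domain code lists rather than minting new ones, but the draft structure alone is what unlocks the rest of the SDMX toolchain.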
ILO and EPAM then presented their pilot implementations, demonstrating how rule-based engines and machine-assisted workflows have the potential to reduce days of manual modelling work to a matter of hours.
Ensuring model quality
David Barraclough rounded off the technical sessions by illustrating how structural metadata quality checks can be woven into the model design process, and demoed a prototype tool that automates such checks against any web service. David also showed how the tool can be run retrospectively over a corpus of structural metadata to find and remediate quality issues, and how it can be configured separately for different organisations and use cases.
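Checks of this kind are typically rule-based. As a minimal sketch of the approach (the sample code list and the three rules below are made up for illustration and are not David's tool), each rule inspects an artefact and reports an issue, and organisations can enable different rule subsets:

```python
import re

# Hypothetical structural-metadata fragment: a code list as plain data.
CODELIST = {
    "id": "CL_SEX",
    "codes": [
        {"id": "F", "name": "Female"},
        {"id": "M", "name": "Male"},
        {"id": "total", "name": ""},  # lower-case id, missing name
    ],
}

# Each rule returns an issue message, or None if the artefact passes.
RULES = [
    lambda cl: None if re.match(r"^CL_[A-Z0-9_]+$", cl["id"])
    else f"code list id {cl['id']!r} does not follow the CL_ convention",
    lambda cl: None if all(re.match(r"^[A-Z0-9_]+$", c["id"]) for c in cl["codes"])
    else "one or more code ids are not upper-case alphanumeric",
    lambda cl: None if all(c["name"] for c in cl["codes"])
    else "one or more codes are missing a name",
]

def check(codelist, rules=RULES):
    """Run every enabled rule and collect the issues it reports."""
    return [msg for rule in rules if (msg := rule(codelist))]

issues = check(CODELIST)
print(issues)  # two issues: a non-conforming code id and a missing name
```

Because rules are just functions, per-organisation configuration reduces to choosing which rules to pass to `check`.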
Furthermore, if the SDMX Content-Oriented Guidelines were made easy to parse, AI tools could be trained to partially check the quality of models. David noted that:
“To enhance SDMX models, we need to improve quality, reduce costs, and accelerate delivery. Currently, organisations face a trade-off: relying on dedicated SDMX teams ensures high quality and interoperability but slows development, while involving non-SDMX experts may accelerate modelling but at the expense of consistency and quality. Embedding best practice checks directly into modelling tools can help realise a better overall quality without sacrificing development speed. This tool is designed to support that goal. Also, enabling AI to parse the SDMX best practices could further boost quality.”
Open discussion and next steps
Under Edgardo’s facilitation, participants shared feedback on tooling gaps, data validation challenges and training needs.
We warmly invite all SIS-CC members to review the workshop outcomes, experiment with the prototypes and share further insights. Together, we can advance a truly interoperable, user-centred approach to SDMX data modelling—one that upholds the SIS-CC values of openness, collaboration and quality.