Lunchtime presentation on classification management systems

Implementing a New Classification Management System at Statistics New Zealand
Andrew Hancock, Statistics New Zealand
Arofan Gregory, Metadata Technology North America
Background and Overview
• In 2010, SNZ started a ten-year program to
modernize its production of statistics
• Vision included moving away from paper-oriented production of classifications with slow
revision cycles to a more dynamic system
– Machine-processible formats
– Rapid and flexible release capability
• Centralized management to support the
production process
– Wanted a collaborative management system, not just
a repository
Modernization Objectives
• Replace legacy CARS repository
• Leverage a concept-oriented model (like
• Reduce proliferation of classification versions
• Revise process for management, storage, and
dissemination of classifications
• Standardize concepts and categories within
• “Aria” is the name chosen for the new
classification management system – it means
“concept” in Maori
• Under the development arrangement, Aria
will be available for licensing by other
statistical agencies
– Currently, only SNZ requirements are
implemented, but broader support is planned
• Implemented in Java for cross-platform use
Concept management
Downloads and Dissemination
Search Capabilities
“Statistical Standards”
Concept Management
• Concepts are very important in Aria
– Associated with categories
– Associated with levels
– Can be related to other concepts to create concept systems
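The associations listed above can be pictured with a small data-structure sketch. This is an illustration only, not Aria's actual model: the class names, fields, and the `relate` and `concept_system` helpers are all hypothetical, and relationships are assumed to be symmetric for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str
    label: str
    related: set = field(default_factory=set)  # ids of directly related concepts


@dataclass
class Category:
    code: str
    label: str
    concept_id: str  # the concept this category is associated with


@dataclass
class Level:
    depth: int
    name: str
    concept_id: str  # the concept this level is associated with


def relate(a, b):
    """Record a symmetric relationship between two concepts (an assumption)."""
    a.related.add(b.concept_id)
    b.related.add(a.concept_id)


def concept_system(start_id, concepts):
    """Return the ids of every concept reachable from start_id via relations."""
    seen, stack = set(), [start_id]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(concepts[cid].related - seen)
    return seen


# Hypothetical example data
concepts = {c.concept_id: c for c in (
    Concept("c1", "Industry"),
    Concept("c2", "Agriculture"),
    Concept("c3", "Occupation"),
)}
relate(concepts["c1"], concepts["c2"])
division = Level(depth=1, name="Division", concept_id="c1")
agri = Category(code="A", label="Agriculture", concept_id="c2")
system = concept_system("c2", concepts)  # concepts linked to "Agriculture"
```

Here the concept system is simply the set of concepts connected through recorded relationships; a real system would likely distinguish relationship types.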
Concepts (2)
Classifications (2)
Editing Functions
• Many typical functions: Split, Merge, Transfer,
Restore, Takeover, Replace/Breakdown, etc.
• Supports “views” (subsets)
• Target-view and source-view of concordances
• Can be auto-generated or created manually
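One way auto-generation of a concordance could work is to derive it from the recorded editing operations. The sketch below assumes a hypothetical edit log of `(operation, source_codes, target_codes)` tuples; the slides do not describe how Aria stores edits, so the function names and the log format are illustrative only.

```python
from collections import defaultdict


def build_concordance(edits):
    """Source-view: map each old code to the new codes that replace it."""
    mapping = defaultdict(set)
    for _operation, sources, targets in edits:
        for source in sources:
            mapping[source].update(targets)
    return {source: sorted(targets) for source, targets in mapping.items()}


def target_view(source_view):
    """Target-view: invert the mapping so each new code lists its old codes."""
    inverted = defaultdict(set)
    for source, targets in source_view.items():
        for target in targets:
            inverted[target].add(source)
    return {target: sorted(sources) for target, sources in inverted.items()}


# Hypothetical edit log between two versions of a classification
edits = [
    ("split", ["A1"], ["A11", "A12"]),
    ("merge", ["B1", "B2"], ["B3"]),
]
source_view = build_concordance(edits)
by_target = target_view(source_view)
```

A split produces a one-to-many entry and a merge a many-to-one entry, which is why both a source-view and a target-view of the same concordance are useful.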
Downloads and Dissemination
• Allows for many types of structured and
documentary formats
• Allows for download of comparison views of
classifications in several
• Access control hides classifications from users
who are not permitted to see them
Search Capabilities
• Search is implemented using Apache Solr
• Has Quick-Search capability
• Has full search capabilities
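As a rough illustration of what a search request against Solr looks like, the sketch below builds a URL for Solr's standard `select` handler using the common `q`, `rows`, and `wt` parameters. The host, the core name `classifications`, and the field name `label` are assumptions; the slides do not describe Aria's index schema or how it calls Solr.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, text, rows=10):
    """Build a query URL for Solr's select handler (no request is sent)."""
    params = {
        "q": f"label:{text}",  # hypothetical field name
        "rows": rows,          # maximum number of hits to return
        "wt": "json",          # ask for a JSON response
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical host and core name
url = solr_select_url("http://localhost:8983", "classifications", "agriculture")
```

A quick search could issue a query like this with a small `rows` value, while a full search would typically add filters and more fields.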
“Statistical Standards”
• SNZ has a concept of “standard” methods and ways of measuring
certain concepts.
• It is possible to attach additional information to specific concepts to
make this information easily available.
Underlying Models
• GSIM places huge emphasis on Concepts
– In GSIM, a Category is a Concept used in a particular role
– This is a very powerful feature of GSIM, and it allows many
new functions to be realized within a metadata system
• SKOS provides an RDF-based, flexible approach to
describing classifications
– Very flexible
– Supports inferencing based on known relationships
• Neuchatel provides an important model for many
aspects of classifications
• Sufficient information is held to express classifications
in DDI, SDMX, and other standard formats
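To make the SKOS point concrete, the sketch below writes a tiny classification as SKOS in Turtle syntax, using the standard `skos:ConceptScheme`, `skos:Concept`, `skos:broader`, `skos:topConceptOf`, `skos:notation`, and `skos:prefLabel` terms. The `to_skos_turtle` function, the `example.org` base URI, and the sample codes are hypothetical, and labels are assumed to contain no characters that need escaping. It is not a description of Aria's export.

```python
def to_skos_turtle(scheme_id, items, base="http://example.org/classification/"):
    """Serialize (code, label, parent_code_or_None) tuples as SKOS Turtle."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix ex: <{base}> .",
        "",
        f"ex:{scheme_id} a skos:ConceptScheme .",
    ]
    for code, label, parent in items:
        lines.append(f"ex:{code} a skos:Concept ;")
        lines.append(f"    skos:inScheme ex:{scheme_id} ;")
        lines.append(f'    skos:notation "{code}" ;')
        if parent is None:
            # Top-level items hang directly off the scheme
            lines.append(f"    skos:topConceptOf ex:{scheme_id} ;")
        else:
            # Lower-level items point at their parent item
            lines.append(f"    skos:broader ex:{parent} ;")
        lines.append(f'    skos:prefLabel "{label}"@en .')
    return "\n".join(lines)


# Hypothetical two-level classification
turtle = to_skos_turtle("industry", [
    ("A", "Agriculture", None),
    ("A01", "Dairy farming", "A"),
])
print(turtle)
```

Because the hierarchy is carried by `skos:broader` links, a reasoner can infer further relationships from the ones stated, which is the inferencing capability mentioned above.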
Aspects of Modernization (1)
• Increased support for process-oriented classification
– A management system, not a repository
– Supports workflow status and access control
– Enables very flexible, frequent updates to classifications
• Supports granular re-use
– Use of concepts and categories across entire collections of
– Allows for views - subsets of existing classifications (not
new versions)
– Reduces proliferation of classification versions
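The idea of a view as a subset, rather than a new version, can be sketched as follows. The `Classification`, `View`, and `make_view` names are hypothetical and the example data is invented; the point is only that a view refers to the categories of an existing classification instead of copying them.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    name: str
    version: str
    categories: dict  # code -> label


@dataclass
class View:
    name: str
    base: Classification
    codes: frozenset  # codes selected from the base classification

    def categories(self):
        """Resolve labels from the base classification at read time."""
        return {code: self.base.categories[code] for code in sorted(self.codes)}


def make_view(name, base, codes):
    """Create a view, rejecting codes the base classification lacks."""
    unknown = set(codes) - set(base.categories)
    if unknown:
        raise ValueError(f"codes not in {base.name}: {sorted(unknown)}")
    return View(name, base, frozenset(codes))


# Hypothetical example data
industry = Classification(
    "Industry", "1.0",
    {"A": "Agriculture", "B": "Mining", "C": "Manufacturing"},
)
primary = make_view("Primary industries", industry, ["A", "B"])
```

Since the view stores only codes, a label corrected in the base classification is reflected in every view without creating another classification version.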
Aspects of Modernization (2)
• As classifications are versioned, concordances can be
automatically generated
– The system knows when classifications are split, joined,
• Architecture is service-oriented
– Easy to integrate with existing systems
– Could fit into a CSPA architecture
• Uses the GSIM idea of Concepts to “know” more about
places where classifications are used in data
– New dissemination and production functionality can be
If you are interested in having a more-detailed
online demonstration of the system, please feel
free to contact us:
[email protected]