Scenario b(iii)
Disseminating "active" metadata from earlier phases as reference metadata
Premise
At least from the perspective of the ABS, Scenario b(i) (support for the end-to-end statistical
production process) results in some specific requirements and opportunities when it comes
to the dissemination phase.
Analysis and decision making related to statistical outputs can be facilitated if there is rich
information available related to how the data of interest was collected and processed,
including detailed information about data collection instruments (eg survey forms), variables
(input, intermediate and output), classifications, methods applied during the statistical
production process, process metrics (eg response rates and other data quality
measures/indicators) etc.
Rather than providing users with a single daunting "monolithic" report on all the "reference"
information that is available, research confirms that a "layered and linked" approach is more
effective. In this way a particular consumer of a statistical output can review concise
summary information, which may alert them to more detailed reference information
they should "drill into" based on their particular interests and needs related to
interpretation and use of the statistical output.
Both SDMX and DDI provide good support for "layering information" by supporting
resolvable, reliable references to additional information rather than having to include all
possible information "in line" in a single structured report/package.
Both
- for production efficiency, and
- to ensure reference metadata reflects what actually occurred, rather than representing
  separately maintained documentation which may be inaccurate (eg out of date,
  incomplete),
a key architectural principle from the ABS perspective is to ensure "reference" metadata
associated with statistical outputs is populated automatically based on (typically a relevant
subset of) metadata used actively in earlier stages of the statistical production process.
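The principle of populating reference metadata automatically from a relevant subset of active metadata can be illustrated with a minimal sketch. The `ActiveVariable` class, its fields, the `DISSEMINATED_FIELDS` selection and the sample values are all hypothetical stand-ins invented for illustration; they are not DDI or SDMX structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActiveVariable:
    """Hypothetical stand-in for a variable used actively in production."""
    name: str
    label: str
    stage: str           # "input", "intermediate" or "output"
    classification: str  # name of the classification used, "" if none

# Hypothetical choice of which fields are worth disseminating.
DISSEMINATED_FIELDS = ("label", "classification")

def derive_reference_metadata(variables, stages=("output",)):
    """Build reference metadata from the relevant subset of active metadata."""
    return {
        v.name: {field: getattr(v, field) for field in DISSEMINATED_FIELDS}
        for v in variables
        if v.stage in stages
    }

# Made-up sample content, for illustration only.
active = [
    ActiveVariable("AGE_RAW", "Age as reported", "input", ""),
    ActiveVariable("AGE_GRP", "Age group", "output", "Five-year age groups"),
]
reference = derive_reference_metadata(active)
```

Because the reference metadata is computed from what was actually used, there is no separately maintained copy that could drift out of date.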
Requirement
Consistent with Scenario b(i), in which metadata described using DDI is actively harnessed in
the statistical production process wherever that standard is most fit for purpose, Scenario b(iii)
requires that selected metadata used actively ("structurally") in DDI form in earlier phases of
the statistical production process can be expressed as SDMX reference metadata for
dissemination purposes.
It makes little difference whether such reference metadata is
- "published" to an externally accessible registry/repository, and referenced by an "inner
  shell" of reference metadata directly associated with a statistical output, or
- included in the "inner shell" of reference metadata directly associated with a statistical
  output.
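The equivalence of the two options can be sketched as follows: a consumer resolving the "inner shell" obtains the same kind of content whether it was included directly or published elsewhere and referenced. The `Inline` and `ByReference` classes, the attribute names, the `urn:example:` identifier and the repository content are assumptions made up for this sketch, not part of SDMX or DDI.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass(frozen=True)
class Inline:
    """Content carried directly in the inner shell."""
    text: str

@dataclass(frozen=True)
class ByReference:
    """Pointer to content published in an external registry/repository."""
    uri: str

def resolve(value: Union[Inline, ByReference], repository: Dict[str, str]) -> str:
    """Return the content, following the reference when there is one."""
    if isinstance(value, Inline):
        return value.text
    return repository[value.uri]

# Made-up repository content and identifiers, for illustration only.
repository = {"urn:example:method:imputation": "Description of the imputation method."}

inner_shell = {
    "COLLECTION_MODE": Inline("Description of the collection mode."),
    "IMPUTATION_METHOD": ByReference("urn:example:method:imputation"),
}

resolved = {name: resolve(value, repository) for name, value in inner_shell.items()}
```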
When it comes to fully "telling the story" of how a particular statistical output has been
produced, automating some elements of "capturing and retelling" that story is seen as likely
to rely on progress on Scenario b(iv) ("Drill back" from aggregates to
microdata).
In the meantime, however, a range of existing means of using DDI within Scenario b(i)
should result in active metadata, described using that standard, which can already be
presented as reference metadata via SDMX when it comes to dissemination.
Implications
It should be possible to identify and agree the sets of active metadata described using DDI
which would provide the greatest value if made available in the near term as reference
metadata using SDMX. This would provide a tangible starting point and should support a
quick, high value result. Having achieved this initial result, work would continue on
progressively designing, and automating the production of, other packages of reference
metadata.
Once initial "high value" targets for packages of reference metadata have been agreed,
work would commence on the details of how these should be represented as reference
metadata within SDMX. This is seen as likely to result in new standard Metadata Structure
Definitions (eg as part of Content Oriented Guidelines) that allow (at least a relevant subset
of) semantics from DDI to be expressed in a consistent way in SDMX.
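As a rough illustration of what expressing a subset of DDI semantics through an agreed structure might involve, the sketch below maps fields of a DDI-described item onto attribute identifiers. The `MSD_FROM_DDI` table, the attribute ids and the field names are hypothetical and heavily simplified; they do not reproduce an actual Metadata Structure Definition or the DDI schema.

```python
# Hypothetical, heavily simplified stand-in for a Metadata Structure Definition:
# attribute id -> the DDI-side field expected to supply it.
MSD_FROM_DDI = {
    "VAR_LABEL": "label",
    "UNIVERSE": "universe",
    "QUESTION_TEXT": "question_text",
}

def to_reference_report(ddi_item):
    """Fill the structure from a DDI-described item and list what is missing."""
    report, missing = {}, []
    for attribute_id, ddi_field in MSD_FROM_DDI.items():
        value = ddi_item.get(ddi_field)
        if value:
            report[attribute_id] = value
        else:
            missing.append(attribute_id)
    return report, missing

# Made-up sample content, for illustration only.
ddi_item = {"label": "Age group", "universe": "Usual residents"}
report, missing = to_reference_report(ddi_item)
```

Reporting the unfilled attributes separately makes it visible where the active metadata does not yet cover the agreed structure.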
It is noted this approach may
- have some connections with Scenario a(iii), and
- be seen as establishing within SDMX an increased basic "plug and play" capability with
  DDI, in a similar fashion to the way Scenario b(iv) foreshadows either or both standards
  potentially "plugging and playing" with other purpose-specific standards that may be
  necessary to support that scenario.
Once defined and agreed, such "standard" Metadata Structure Definitions might also be
populated by agencies (eg international organisations) which see insufficient business value
in harnessing DDI directly. In this case these other agencies would be providing reference
metadata that was only ever "documentary" in nature but would be populating a structure
shared with NSIs that populated their reference metadata based on actively used content
from earlier in the statistical production process. This should have significant benefits in
regard to consistency where consumers source data and reference metadata from both NSIs
and from other sources.
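The consistency benefit can be pictured with a small sketch in which reports from both kinds of source are checked against the same structure. The required attribute ids and the sample reports are invented for illustration and are not drawn from any agreed Metadata Structure Definition.

```python
# Hypothetical attribute ids standing in for an agreed, shared structure.
REQUIRED_ATTRIBUTES = frozenset({"VAR_LABEL", "UNIVERSE"})

def conforms(report):
    """True when every required attribute is present and non-empty."""
    return all(report.get(attribute) for attribute in REQUIRED_ATTRIBUTES)

# Populated automatically from active metadata (for example by an NSI).
nsi_report = {"VAR_LABEL": "Age group", "UNIVERSE": "Usual residents"}
# Authored by hand as documentation (for example by an international organisation).
documentary_report = {"VAR_LABEL": "Age group", "UNIVERSE": "Resident population"}

# Both reports can be consumed the same way, regardless of how they were produced.
comparable = conforms(nsi_report) and conforms(documentary_report)
```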