Project: Metadata Management
Title: Metadata Definitions
Version: 0.7
Date: 8th August 2013
Working Group: Emerging Technologies
PhUse
Emerging Technology Working Group
Metadata definitions
Document1
Page 1 of 37
Table of Contents
1 INTRODUCTION: PURPOSE OF THIS DOCUMENT
2 SCOPE
3 DEFINITIONS
3.1 METADATA MANAGEMENT
3.1.1 Metadata
3.1.1 Structural metadata
3.1.2 Descriptive metadata
3.1.3 Study Instance Metadata
3.1.1 Metadata repository
3.1.2 Metadata registry
3.1.3 Data element
3.1.4 Attribute
3.1.5 Class
3.1.6 Data type
3.2 MASTER DATA MANAGEMENT
3.2.1 Master Data
3.2.2 Master Data Management
3.2.3 Master Reference Data
3.2.4 Master Data Source System
3.2.5 Reference Data
3.2.6 Reference Data Management
3.3 CONTROLLED TERMINOLOGY, CODE SYSTEMS & VALUE SETS
3.3.1 Controlled Terminology/controlled vocabulary
3.3.1 Code system
3.3.1 Dictionary
3.3.2 Concept
3.3.3 Code
3.3.4 Concept domain
3.3.1 Code list
3.3.2 Value set
3.4 INTEROPERABILITY
3.4.1 Interoperability
3.4.2 Technical interoperability (“machine interoperability”)
3.4.3 Semantic interoperability
3.4.4 Process Interoperability
3.5 DATA AGGREGATION, INTEGRATION
3.5.1 Data pooling
3.5.2 Data aggregation
3.5.3 Data integration
4 INPUT (DRAFT MATERIAL THAT CAN BE USED – TO BE DELETED IN FINAL DOCUMENT)
4.1 METADATA MANAGEMENT
4.2 MASTER DATA MANAGEMENT
4.3 CONTROLLED TERMINOLOGY
4.4 INTEROPERABILITY
4.5 DATA AGGREGATION
5 REFERENCES & RELATED DOCUMENTS
6 APPENDICES
6.1 CDISC GLOSSARY
1 INTRODUCTION: purpose of this document
This document provides definitions, agreed within the PhUse CSS working group, for metadata
management and related topics across the industry. It is expected that these definitions will be re-used
in the FDA guidelines as cross-industry definitions.
To be of operational value, the document contains not only definitions but also a short description and
an example of use. Whenever possible, the definitions are built from existing definitions in FDA
guidance documents, the CDISC glossary, and cross-industry sources (e.g. Gartner). A reference to the
source definition is provided either directly with the definition or in the reference section.
This document is not intended to be exhaustive. It is intended to clarify the
most commonly used (and misused!) definitions in our industry around metadata and master data
management.
The CDISC glossary [CDISC1] (and the attached document) is used as a reference in this document. It is
expected that the reader of this document is familiar with the abbreviations and synonyms contained in
the CDISC glossary; these are not repeated here.
2 SCOPE
The following topic areas are in scope of this document:
• Metadata management
• Master data management
• Controlled terminology
• Data pooling, data integration, data aggregation
• Interoperability, semantic interoperability
Definitions are grouped by topic area to make this document easier to read and navigate.
3 DEFINITIONS
3.1 Metadata management
[Figure: types of metadata. Organization level: Metadata, comprising Structural Metadata and
Descriptive Metadata (Semantic Descriptive Metadata and Process Descriptive Metadata). Study level:
Study Metadata, comprising Study Structural Metadata and Study Descriptive Metadata.]
3.1.1 Metadata
Synonym
Definition & source
• Wikipedia. The term metadata refers to "data about data". The term is ambiguous,
as it is used for two fundamentally different concepts (types).
o Structural metadata is about the design and specification of data
structures and is more properly called "data about the containers of data";
o Descriptive metadata, on the other hand, is about individual instances of
application data, the data content. In this case, a useful description
would be "data about data content" or "content about content".
• ISO 11179. “Descriptive data about an object [ISO/IEC 20944-1]”. Thus, metadata
is a kind of data.
• Adrienne Tannenbaum, Metadata Solutions:
o "Metadata: the detailed description of the instance data; the format and
characteristics of populated instance data; instances and values depending
on the role of the metadata recipient." and "Instance data: That which is
input into a receiving tool, application, database, or simple processing
engine".
o Meta metadata: “The descriptive details of metadata; metadata qualities
and locations that allow tool-based processing and access; the basic
attributes of metadata solutions:”
Description
Metadata describe instance data.
• Instance data are data stored in a computer as the result of data entry by a
person or data processing by an application.
• Metadata can itself become instance data, described in turn by level 2 metadata
(or meta-metadata).
o Each CDISC standard, or instance of a standard, could be
considered an object. That object will have properties that describe
the operations that can be performed on it and by whom. For example, Global
SDTM objects (standard template definitions for SDTM standard
domains for each version of the standard) can be copied and a few
properties adjusted (instantiated at a compound level or study level to
force the inclusion of PERM variables and to define some of them, or some
EXP variables, as Mandatory). The available "Copy" operation, the
available "properties that can be changed" and the associated "values
permitted to change (from x to y)" are metadata elements to be used
by the corresponding MDR processing tool to instantiate that object.
o The relationships among standards can be considered meta-metadata,
so that "conversion" or "visualization" tools can relate data elements
as they move from one instance of data to another (mapping).
There are two types of metadata (see below for more detailed descriptions and examples):
• Structural metadata
• Descriptive metadata
Example
See structural metadata and descriptive metadata
Recommended definition
See structural metadata and descriptive metadata
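The three levels described above (instance data, metadata, meta-metadata) can be sketched as follows. This is a minimal illustration only: the dictionary keys, the variable name SYSBP and the "editable" flag are assumptions made for this example and are not taken from any CDISC standard or MDR tool.

```python
# Illustrative only: all keys and names below are hypothetical.

# Instance data: a value stored as the result of data entry or processing.
instance_data = {"SYSBP": 120}

# Metadata: describes the instance data.
metadata = {
    "SYSBP": {
        "label": "Systolic Blood Pressure",
        "data_type": "integer",
        "unit": "mmHg",
    },
}

# Meta-metadata (level 2): describes the metadata, which at this level is
# itself treated as instance data. Here it records which metadata
# properties a processing tool is permitted to change.
meta_metadata = {
    "label": {"data_type": "text", "editable": False},
    "data_type": {"data_type": "text", "editable": False},
    "unit": {"data_type": "text", "editable": True},
}


def describe(variable: str) -> str:
    """Render one instance value together with the metadata that describes it."""
    md = metadata[variable]
    return f"{md['label']} = {instance_data[variable]} {md['unit']}"


def editable_properties() -> list[str]:
    """List the metadata properties that may be changed, per the meta-metadata."""
    return sorted(name for name, rule in meta_metadata.items() if rule["editable"])
```

In this sketch, describe("SYSBP") combines levels 0 and 1, while editable_properties() reads level 2 to decide what a tool could adjust when instantiating an object.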
3.1.1 Structural metadata
Synonym
Standard metadata or Data Standard (a subset of structural metadata: legacy data, without
standards, also has structural metadata)
Definition & source
• http://en.wikipedia.org/wiki/Metadata
The design and specification of data structures (e.g. format, semantics) cannot
be “data about data”, because at design time the application contains no data. In
this case the correct description would be "data/information about the containers
of data".
• [FDA1]
Structural metadata is structured information that describes, explains, or
otherwise makes it easier to retrieve, use, or manage data.
Description
Structural metadata is what most people mean by metadata. Structural metadata is
said to “give meaning to data” or to put data “in context.”
Key components of structural metadata include data domains, data elements,
terminology, data mappings and transformations, and data derivations.
The successful usage of structural metadata requires data standards governance that
should include:
• workflows to address the creation and/or revision of structural metadata
• version control of structural metadata and study instance metadata (see definition
below)
• access control, by user role
Standards metadata (a subset of structural metadata) is the source from which the
study instance metadata (see below) is built.
Example
The number 120 itself is meaningless without structural metadata such as:
• The name of the variable (e.g. Systolic Blood Pressure) with its definition
• The unit related to this physical quantity (e.g. Systolic Blood Pressure Unit =
mmHg)
CDISC SDTM is the data standard approved across the industry for clinical data to be
transferred to the FDA.
• For instance, the variable “Sex” is described by a set of structural metadata such as
the label, data type (char), associated value sets (male, female, ...) and role in
SDTM.
• The metadata for the AE (Adverse Event) SDTM domain that is compliant with the
CDISC SDTM Implementation Guide (Version 3.1.3) consists of attributes such as
Variable Name, Variable Label, Type, Controlled Terms, Role, etc.
A data model, describing the classes, attributes, relationships and hierarchies,
constitutes the structural metadata of the underlying database.
Recommended definition
In pharmaceutical research, structural metadata describes the instance data that are
collected and derived during clinical research across different processes and systems.
As such, it facilitates clinical software re-use and thus business process efficiency.
Structural metadata is defined, maintained, and governed at the level of an
organisation (pharma company, CRO, CDISC, ...) across all projects; at the study level, it
is the study instance metadata, extracted from the structural metadata, that applies.
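As a hedged illustration of how structural metadata "gives meaning" to instance data, the sketch below describes the “Sex” variable from the example with a label, data type, role and value set, and checks instance values against it. The class name, field names and the role value are assumptions made for this example; the value set simply reuses the "male"/"female" wording above and is not copied from SDTM controlled terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableMetadata:
    """Structural metadata for one variable (field names are illustrative)."""
    name: str
    label: str
    data_type: str                    # "char" or "num" in this sketch
    role: str
    value_set: tuple[str, ...] = ()   # empty tuple: no enumerated values


# Illustrative values based on the "Sex" example; not copied from SDTM.
SEX = VariableMetadata(
    name="SEX",
    label="Sex",
    data_type="char",
    role="qualifier",
    value_set=("male", "female"),
)


def conforms(value: object, meta: VariableMetadata) -> bool:
    """Check an instance value against the structural metadata describing it."""
    if meta.data_type == "char" and not isinstance(value, str):
        return False
    if meta.data_type == "num" and not isinstance(value, (int, float)):
        return False
    # An empty value set means any value of the right type is accepted.
    return not meta.value_set or value in meta.value_set
```

The same metadata object can be reused by any tool that collects or validates the variable, which is the software re-use benefit named in the recommended definition.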
3.1.2 Descriptive metadata
Synonym
Process metadata (subset of descriptive metadata)
Semantic metadata (subset of descriptive metadata)
Definition & source
• http://en.wikipedia.org/wiki/Metadata
The individual instances of application data, the data content. In this case, a useful
description would be "data about data content" or "content about content".
• Ralph Kimball: "Process metadata describes the results of various operations in a
data warehouse."
Description
It is used in different contexts:
• Data operations and statistical analysis (semantic metadata): additional content
on the data that supports further analysis of the data. For instance, patient
population in the context of a clinical trial is descriptive metadata.
• Software implementation (process metadata): describes the results of various
operations happening in an application, be it a data warehouse or any other
application. This includes:
o processes used to reformat (convert) or transcode content
o all information needed to support data lineage & traceability
o details of origin and usage (including start and end times for creation,
updates and access)
Descriptive metadata is often a key enabler in deriving business value from data
through both direct relationships and indirect relationships between instance data. In
effect, it creates the “how”, “where”, “who”, and “when” for the instance data.
Example
• “How”: how the instance data is used within the information flow
• “Where”: source of the instance data
• “Who”: who created, modified and approved the instance data
• “When”: versioning information of the instance data
• Data operations and statistical analysis (semantic metadata): patient population,
indication, therapeutic area
• Software implementation (process metadata):
o metadata needed for the effective management of version control for
structural metadata: UserID who executed the last modification, date of
the last modification, UserID who approved the last modification
o metadata needed for the effective management of instance data:
  - what is the source of the data, and in which system(s) it is authored
  - which transformations happened to the data, how, when, and by whom
o metadata needed for managing access control: the different roles for
accessing information and which actions they can perform (create,
read, update, delete)
o audit trail: who accessed which information, and when
Recommended definition
In pharmaceutical research, descriptive metadata describes process or domain-specific
information about instance data collected and derived during clinical research. It
provides conceptual, contextual, and processing information for instance data and as
such descriptive metadata is a key enabler in deriving business value from instance
data. It can also provide greater depth and more insight about the "container" of the
data, whether it is a file, document, or representation.
Descriptive metadata is itself defined by structural metadata; it is generated by
systems or people.
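The “how”, “where”, “who” and “when” of process metadata can be sketched as a small record attached to each change of an instance value. This is an assumption-laden illustration, not a description of any particular system: the class names, the "EDC" source label and the user ID are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ProcessMetadata:
    """Descriptive (process) metadata for one operation on instance data."""
    how: str        # operation applied within the information flow
    where: str      # source system of the instance data
    who: str        # user who performed the operation
    when: datetime  # time of the operation


@dataclass
class TrackedValue:
    """An instance value together with the process metadata of its history."""
    value: float
    history: list[ProcessMetadata] = field(default_factory=list)

    def update(self, new_value: float, *, how: str, where: str, who: str,
               when: Optional[datetime] = None) -> None:
        """Change the value and record how, where, by whom and when."""
        self.value = new_value
        stamp = when or datetime.now(timezone.utc)
        self.history.append(ProcessMetadata(how, where, who, stamp))

    def lineage(self) -> list[str]:
        """Return a readable trace of the operations applied to the value."""
        return [f"{m.who} {m.how} in {m.where}" for m in self.history]
```

For example, calling update(16.0, how="converted mmHg to kPa", where="EDC", who="user01") on a TrackedValue(120.0) changes the value and leaves one history entry that supports lineage and audit-trail queries.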
3.1.3 Study Instance Metadata
Synonym
Study Data Standards or Study Specific Structural metadata (subset of Study Instance
metadata)
Definition & source
(no source found)
Study Instance metadata is a defined grouping of metadata that serves as the most
complete representation of the metadata that defines an individual study.
Description
It is commonly thought of as the set of metadata that is actually consumed by the
clinical technology platform to facilitate processes that are more automated and
consistent.
Study Instance Metadata consists of Structural metadata and some Descriptive
metadata to support the management of the Study Instance Metadata:
• Example of Study Instance Structural metadata: the subset of SDTM data domains and
variables needed to collect and derive instance data for a specific study.
• Example of Study Instance Descriptive metadata: for a Statistical Computing
Environment (SCE) that is leveraging metadata to automate the production of
TLFs, the Study Instance Descriptive metadata could include study-specific
selections that help the SCE process the metadata, such as the selection of BY
variables to determine appropriate breaks for a table in that particular study.
The Study Instance Structural Metadata is extracted from the Structural metadata
maintained at the enterprise/organisation level; it is therefore a subset of the enterprise
Structural metadata.
The Study Instance Metadata is exported to and consumed by the clinical data
platform to ensure maximal automation and consistency of the processes for trial
design, execution, storage, analysis, and submission.
Example
See above
Recommended definition
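The extraction of Study Instance Structural metadata as a subset of organisation-level Structural metadata can be sketched as below. The function name and the dictionary layout are assumptions for this example, and the domain/variable lists are a small illustrative selection rather than a complete or authoritative SDTM definition.

```python
# Organisation-level structural metadata: domain -> variables.
# Illustrative subset only; not a complete SDTM definition.
ORG_STRUCTURAL_METADATA = {
    "DM": ["STUDYID", "USUBJID", "SEX", "BRTHDTC"],
    "AE": ["STUDYID", "USUBJID", "AETERM", "AESTDTC"],
    "VS": ["STUDYID", "USUBJID", "VSTESTCD", "VSORRES"],
}


def extract_study_metadata(
    org_metadata: dict[str, list[str]],
    selection: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Return the study-level subset, rejecting anything not defined centrally."""
    study: dict[str, list[str]] = {}
    for domain, variables in selection.items():
        if domain not in org_metadata:
            raise KeyError(f"domain {domain!r} is not defined at organisation level")
        unknown = [v for v in variables if v not in org_metadata[domain]]
        if unknown:
            raise ValueError(f"{domain}: not defined at organisation level: {unknown}")
        study[domain] = list(variables)
    return study
```

Rejecting unknown domains and variables is one way to guarantee that the study instance metadata stays a true subset of the enterprise metadata; a real system might instead route such requests through a governance workflow.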
3.1.1 Metadata repository
Synonym
Metadata registry
Definition & source
• http://datadictionary.blogspot.com/2008/03/metadata-repositories-vs-metadata.html
Definitions from the Dr. Data Dictionary site: a place, room, or container where something
is deposited or stored. Note that there is nothing in this definition about the quality of
the things being stored or the process to check whether new incoming items are
duplicates of things already in the repository. If I have 100 users they could each
define "Customer" as they see fit and put their own definition into the metadata
repository as their own definition. No problems.
• http://en.wikipedia.org/wiki/Metadata_repository
“A Metadata repository is a database created to gather, store, and distribute
contextual information about business data, when documented it is known as
metadata. This contextual information of business data include meaning and content,
policies that govern, technical attributes, specifications that transform, and programs
that manipulate.
The metadata repository is responsible for physically storing and cataloging metadata.
The metadata that is stored should be generic, integrated, current, and historical.
Generic for a metadata repository means that the meta model should store the
metadata by generic terms instead of storing it by an applications-specific defined
way, so that if your data base standard changes from one product to another the
physical meta model of the metadata repository would not need to change.
Integration of the metadata repository allows all entities of the enterprise business to
view all metadata subject areas. The metadata repository should also be designed so
that current and historical metadata both can be accessed. Metadata repositories used
to be referred to as a data dictionary.”
• http://en.wikipedia.org/wiki/Data_dictionary
A data dictionary, or metadata
repository, as defined in the IBM Dictionary of Computing, is a "centralized repository
of information about data such as meaning, relationships to other data, origin, usage,
and format." The term may have one of several closely related meanings pertaining to
databases and database management systems (DBMS):
o a document describing a database or collection of databases
o an integral component of a DBMS that is required to determine its structure
o a piece of middleware that extends or supplants the native data dictionary of a DBMS
http://www.springerreference.com/docs/html/chapterdbid/63927.html
http://www.uspto.gov/web/patents/patog/week13/OG/html/1388-4/US08407194
20130326.html
http://www.bls.gov/ore/pdf/st000010.pdf
Description
• Data store for Structural metadata, defined within an organization
• Study Instance Metadata are derived from the Structural metadata defined in a
Metadata repository, but are generally not stored in the MDR as they are study
specific
• Descriptive metadata are not stored in an MDR either
Example
• CDISC SHARE
• NCI caDSR
Recommended definition
A metadata repository (MDR) is a centralized repository of structural metadata, with
information about instance data such as semantics (meaning), relationships to other
data, origin, usage, and format.
When the emphasis is put on control of new metadata, through a specific registration
process with a well-identified administration/registration authority, the metadata
repository is often called a metadata registry.
The recommendation is to use the terms:
• Metadata registry when the software has a strong registration process
• Metadata repository when the software is more of a library with less emphasis on
registration
3.1.2 Metadata registry
Synonym
Metadata repository
Definition & source
• http://en.wikipedia.org/wiki/Metadata_registry
A metadata registry is a central
location in an organization where metadata definitions are stored and maintained in a
controlled method.
A metadata registry typically has the following characteristics:
o Protected environment where only authorized individuals may make changes
o Stores data elements that include both semantics and representations
o Semantic areas of a metadata registry contain the meaning of a data element with
precise definitions
o Representational areas of a metadata registry define how the data is represented
in a specific format, such as in a database or a structured file format (e.g., XML)
• http://datadictionary.blogspot.com/2008/03/metadata-repositories-vs-metadata.html
Definitions from the Dr. Data Dictionary site: a registry has the connotation of more than
just a shared dumping ground. Registries have the additional capability to create
workflow processes to check that new metadata is not a duplicate (for a given
namespace). One of the definitions from Webster is "an official record book". Note the
word "official".
• ISO/IEC 11179-3, Third edition, 2013-02-15
3.2.113 registry: information system for registration (3.2.108)
3.2.78 metadata registry (MDR): information system for registering metadata (3.2.74)
The structure of a metadata registry is specified in the form of a conceptual data
model. The metadata registry is used to keep information about data elements
and associated concepts, such as “data element concepts”, “conceptual domains”
and “value domains”.
Description
See above
Example
See above
Recommended definition
See above
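The distinction between a repository and a registry drawn in the two sections above can be sketched as follows: the repository stores whatever is deposited, while the registry only accepts entries from authorized users and rejects duplicates. Class and method names are hypothetical and chosen for this illustration; the duplicate check (case-insensitive name match) is a deliberately simple assumption, not a rule from ISO/IEC 11179.

```python
class MetadataRepository:
    """Stores definitions as deposited, with no control over duplicates."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str, str]] = []

    def deposit(self, author: str, name: str, definition: str) -> None:
        self.entries.append((author, name, definition))

    def definitions_of(self, name: str) -> list[str]:
        """Return every definition deposited under this name."""
        return [d for _, n, d in self.entries if n == name]


class MetadataRegistry:
    """Stores definitions through a controlled registration process."""

    def __init__(self, authorized: set[str]) -> None:
        self._authorized = set(authorized)
        self._entries: dict[str, str] = {}

    def register(self, author: str, name: str, definition: str) -> None:
        # Protected environment: only authorized individuals may make changes.
        if author not in self._authorized:
            raise PermissionError(f"{author} may not register metadata")
        # Registration workflow: refuse duplicates (simplified name match).
        key = name.strip().lower()
        if key in self._entries:
            raise ValueError(f"{name!r} is already registered")
        self._entries[key] = definition

    def definition_of(self, name: str) -> str:
        return self._entries[name.strip().lower()]
```

With the repository, two users can each deposit their own definition of "Customer"; with the registry, the second attempt fails and the single registered definition remains the official one.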
3.1.3 Data element
Synonym
Variable
(Note: the term “attribute” is also used interchangeably with Data Element when “attribute” is
a synonym of a variable or of the property of a class)
Definition
• [FDA1]
A data element is the smallest (or atomic) piece of information that is useful for
analysis (e.g., a systolic blood pressure measurement, a lab test result, a response
to a question on a questionnaire).
A data element is an atomic unit of data that has precise meaning or precise semantics.
• [CDISC1]
1. For XML, an item of data provided in a mark-up mode to allow machine
processing. [FDA - GL/IEEE]
2. Smallest unit of information in a transaction. [Center for Advancement of Clinical
Research]
3. A structured item characterized by a stem and response options together with a
history of usage that can be standardized for research purposes across studies
conducted by and for NIH. [NCI, caBIG]
NOTE: The mark up or tagging facilitates document indexing, search and retrieval,
and provides standard conventions for insertion of codes.
• [ISO/IEC 11179-4:2004, 3.4]
Unit of data for which the definition, identification, representation and permissible
values are specified by means of a set of attributes.
Description
The data element is a foundational concept in an ISO/IEC 11179 metadata registry. The
purpose of the registry is to maintain a semantically precise structure of data
elements.
Each Data element in an ISO/IEC 11179 metadata registry:
• should be registered according to the Registration guidelines (11179-6)
• will be uniquely identified within the register (11179-5)
• should be named according to Naming and Identification Principles (11179-5)
• should be defined by the Formulation of Data Definitions rules (11179-4)
• may be classified in a Classification Scheme (11179-2)
A Data Element is the most elementary unit of data that cannot be further subdivided
from a semantic point of view, as it is linked with a precise meaning.
A data element has different properties:
• An identification such as a data element name
• A clear definition / semantic description
• A data type
• Optional enumerated permissible values (value sets)
• One or more representation terms (synonyms)
• An author and registration authority who takes responsibility for the definition of
the data element
Example
Birth Date is a Data Element.
It is described by a set of properties:
• DE name: Birthdate
• Definition/description: date and time on which the subject is born
• Data type: date (mm/dd/yyyy – hh/mm/ss – time zone)
• Value sets: not applicable
• Synonyms: BRTHDTC in CDISC SDTM, birthdate in BRIDG
If Variable in SDTM is provided as a synonym of Data Element, then Data Element
would have a similar association to ItemDef as Variable to ItemDef in the Define-XML.
Recommended definition
A Data Element is the most elementary unit of data that cannot be further subdivided
from a semantic point of view, as it is linked with a precise meaning. The definition,
identification, representation and permissible values of a data element are specified
by means of a set of properties.
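The properties listed for a data element can be captured in a simple record, shown here for the Birth Date example. This is only a sketch: the class and field names are assumptions chosen to mirror the property list above, not an ISO/IEC 11179 data model.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """A data element described by a set of properties (illustrative names)."""
    name: str
    definition: str
    data_type: str
    permissible_values: tuple[str, ...] = ()                 # empty: not enumerated
    synonyms: dict[str, str] = field(default_factory=dict)   # context -> term
    registration_authority: str = ""


# The Birth Date example, using the property values given in the text.
BIRTH_DATE = DataElement(
    name="Birthdate",
    definition="Date and time on which the subject is born",
    data_type="date",
    synonyms={"CDISC SDTM": "BRTHDTC", "BRIDG": "birthdate"},
)


def synonym_in(element: DataElement, context: str) -> str:
    """Return the term used for this data element in a given standard or model."""
    return element.synonyms.get(context, element.name)
```

Keeping synonyms keyed by context is one possible way to relate the same data element across standards; falling back to the element name when no synonym exists is a design choice of this sketch.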
3.1.4 Attribute
Synonym
Property
(Note: the term “Data element” is also used interchangeably with attribute, but it is a
different concept)
Definition & source
http://en.wikipedia.org/wiki/Attribute_(computing)
In computing, an attribute is a specification that defines a property of an object,
element, or file. An attribute of an object usually consists of a name and a value; of an
element, a type or class name; of a file, a name and extension.
[Source: Understanding HL7 version 3: Andrew Hinchley]
Attributes are abstractions of the data captured about classes.
[Source: ISO 1087]
Attribute is short for attribute type and attribute value. Attribute type: category of
attribute values used as a criterion for the establishment of a concept system
[Source: “Medical Data Management”, Florian Leiner et al]
Attribute value: Value of an attribute type as observed for a particular object.
[Source: ISO 21090] Characteristic of an object that is assigned a name and a type
NOTE The value of an attribute can change during the lifetime of the object.
Description
A prerequisite for correct and proper use and interpretation of data is that both users
and owners of data have a common understanding of the meaning and representation
of the data. To facilitate this common understanding, a number of attributes of the
data have to be defined. Such attributes include the element’s name, data type,
caption presented to users, detailed description, and basic validation information such
as range checks.
An attribute describes a characteristic of an object/class in a logical model. If an attribute
represents the most elementary unit of data that cannot be further subdivided from a
semantic point of view, it can be considered a Data Element.
Attribute is an overloaded term. It is sometimes used as a synonym of Data Element or as
a synonym of a property of a Data Element. While the first usage may be correct in many
cases (see footnote 1), we suggest avoiding the second practice and using the term “property”
instead.
Example
In BRIDG:
• raceCode is an attribute of class Person (i.e. Person.raceCode)
• value is an attribute of DefinedObservationResult
Recommended definition
Properties of an object or class in a conceptual or logical data model.
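The difference between an attribute type (declared on the class) and an attribute value (observed for a particular object, and able to change during its lifetime) can be sketched with the Person.raceCode example. The Python class below is a stand-in written for this illustration, not the BRIDG definition of Person, and the values "CODE_A" and "CODE_B" are placeholders rather than real controlled terms.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Stand-in for a model class; only the attribute name follows the example."""
    raceCode: str


def attribute_types(cls: type) -> dict[str, str]:
    """Attribute name -> declared type name, read from the class itself."""
    return {f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(cls)}


def attribute_values(obj: object) -> dict[str, object]:
    """Attribute name -> value as observed for one particular object."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}
```

Here attribute_types(Person) describes the class, while attribute_values(person) describes one instance and reflects any later change to person.raceCode.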
3.1.5 Class
Synonym
Object
Definition & source
• http://en.wikipedia.org/wiki/Class_(computer_programming)
In object-oriented programming, a class is a construct that is used to define a distinct
type. The class is instantiated into instances of itself, referred to as class instances,
class objects, instance objects or simply objects. ... A class usually represents a noun,
such as a person, place or thing, or something nominalized. For example, a "Banana"
class would represent the properties and functionality of bananas in general. A single,
particular banana would be an instance of the "Banana" class, an object of the type
"Banana".
• [Source: ISO 21090]
class: descriptor for a set of objects with similar structure, behaviour and relationships
Description
Description of a set of objects that share the same attributes, operations, methods,
relationships, and semantics
Example
• StudySite class in the BRIDG model
• ManufacturedMaterial class in HL7 RIM: An Entity or combination of Entities
transformed for a particular purpose by a manufacturing process
Recommended definition
Description of a set of objects that share the same attributes, operations, methods,
relationships, and semantics.
A class has:
• An identifier such as a class name
• A clear object definition / semantic description
• One or more representation terms/words
• A list of Data Elements (also known as attributes)
• A list of related classes and a description of the relationship type(s)
• Any description, in addition to Data Elements, that allows the object to be mapped
within an application
Footnote 1: In an information model, like BRIDG, an attribute may have a data type like “ADDRESS”
which is a class. Such an attribute will not qualify as being a Data Element.
3.1.6 Data type
Synonym
Storage format
Definition & source
[Source: ISO 11404]
A data type is a classification identifying one of various types of data, such as
real-valued, integer or Boolean, that determines the possible values for that type; the
operations that can be done on values of that type; the meaning of the data; and the
way values of that type can be stored.
[Source: ISO 21090]
set of distinct values, characterized by properties of those values, and by operations
on those values
[Source: http://msdn.microsoft.com/]
Objects that contain data have an associated data type that defines the kind of data;
for example, character, integer, or binary, the object can contain. The following objects
have data types:
• Columns in tables and views.
• Parameters in stored procedures.
• Variables.
• Transact-SQL functions that return one or more data values of a specific data type.
• Stored procedures that have a return code, which always has an integer data type.
Description
Storage format in a database – not the display format in the user interface.
Data types define the kind of data – or the format – that can be included in a field (Data
Element, attribute or variable). There are two categories of data type:
• simple / primitive data types such as Boolean, Integer, Character – defined in
ISO 11404,
• abstract data types – defined in ISO 21090 – which define basic concepts that
are commonly encountered in healthcare in support of information exchange.
Abstract data types use the terminology, notations and data types
defined in ISO/IEC 11404, thus extending the set of data types defined in that
standard.
Example
• Primitive data types (ISO 11404): boolean, enumerated, character, time, integer,
real, …
• Abstract data types (ISO 21090): Address, PQ (for Physical Quantity), II (for
Instance Identifier), CD (Concept Descriptor), Range (low, high), Period (start, end)
Recommended definition
Data types define the format of the values that can be included in a specific Data Element
(or variable or attribute). There are two categories of data type:
• simple / primitive types such as Boolean, Integer, Character – defined in
ISO 11404,
• abstract data types such as Address, PQ (Physical Quantity) – defined in ISO
21090 – which use the terminology, notations and data types defined in
ISO/IEC 11404
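As a minimal sketch of the two categories, the Python fragment below contrasts primitive values with simplified abstract types. The class names PhysicalQuantity and Range and their fields are assumptions made for illustration, loosely modelled on the PQ and Range examples above; they are not the ISO 21090 definitions.

```python
from dataclasses import dataclass

# Primitive data types (in the spirit of ISO 11404): single built-in values.
is_randomised: bool = True
visit_number: int = 3


# Simplified abstract data types, loosely inspired by ISO 21090.
# Field names are assumptions made for this sketch only.
@dataclass(frozen=True)
class PhysicalQuantity:
    value: float
    unit: str


@dataclass(frozen=True)
class Range:
    low: PhysicalQuantity
    high: PhysicalQuantity

    def contains(self, q: PhysicalQuantity) -> bool:
        # Unit conversion is out of scope: all three units must match.
        if not (q.unit == self.low.unit == self.high.unit):
            raise ValueError("units differ")
        return self.low.value <= q.value <= self.high.value


normal_range = Range(PhysicalQuantity(60.0, "kg"), PhysicalQuantity(90.0, "kg"))
weight = PhysicalQuantity(72.5, "kg")
```

The abstract types are built from primitive ones (a float and a string), which mirrors the statement that abstract data types extend the primitive set rather than replace it.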
3.1.7 Value level metadata
(to be checked by Marcelina)
-- testCD Height & Weight have different format & unit
When you have a variable (column in a table) you will not apply the same properties across all the rows
=> you need to differentiate what properties
Synonym
Definition & source
CDISC
SAS
Description
“Value level metadata” is a specific term used in the CDISC Define-XML standard because of the
way some Data Elements (as per the definition of Data Element agreed above) are
organized in the CDISC SDTM, SEND or ADaM standards.
When Data Elements are part of a data structure that combines those elements with less
granularity of attribute definitions, the Data Elements have to be described individually.
This is the mechanism (implementation approach) used in the Define-XML standard.
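The idea can be sketched in a few lines of Python. This is an illustration only, not the Define-XML mechanism itself: the dictionary layout, the test codes HEIGHT and WEIGHT, and the units and decimal places are assumptions chosen to match the Height and Weight note above.

```python
# Variable-level metadata: one description for the whole result column.
variable_metadata = {"name": "RESULT", "label": "Result value", "data_type": "float"}

# Value-level metadata: properties that apply only when testCD has a given value.
# Test codes, units and decimal places are illustrative assumptions.
value_level_metadata = {
    "HEIGHT": {"unit": "cm", "decimals": 0},
    "WEIGHT": {"unit": "kg", "decimals": 1},
}


def describe(row: dict) -> str:
    """Format a result using the metadata selected by the row's test code."""
    vlm = value_level_metadata[row["testCD"]]
    return f'{row["result"]:.{vlm["decimals"]}f} {vlm["unit"]}'


rows = [
    {"testCD": "HEIGHT", "result": 172.0},
    {"testCD": "WEIGHT", "result": 68.3},
]
formatted = [describe(r) for r in rows]
```

The same column holds both results, but the unit and display precision are chosen per row from the test code, which is the differentiation the note above asks for.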
Set of values for a variable under a certain condition
• Questionnaire – multiple testCD (one for each question)
• Set of values for a variable under a specific condition (is this a value set?)
• testCD can be SystolicBP or DiastolicBP – questions related to this can be different
based on a different context
Difference between value set and value level metadata?
Example
VLM is a set of metadata that applies below the level of an SDTM variable (e.g.
testCDUnit can only be done at the level of the variable )
Recommended definition
3.2 Master data management
3.2.1 Master Data
Synonym
Master Reference Data
Definition & source
[Source: Gartner – Magic Quadrant for Master Data Management of Customer Data Solution]
http://www.gartner.com/technology/reprints.do?id=1-1CK9UDO&ct=121019&st=sb
Master data is the consistent and uniform set of identifiers and extended attributes
that describes the core entities of the enterprise, such as customers, prospects,
citizens, suppliers, sites, hierarchies and chart of accounts.
Description
• Master Data is business data that has a consistent meaning and definition, shared
across systems. It is produced in a “master system” as part of a transaction and
is used for reference and validation in transactions within other systems.
• Master Data – like any other data – are defined with structural metadata
Example
• Site identification information such as: Site ID, Site Name, Site Address, …
• Investigator identification attributes
• Study identification attributes
Recommended definition
3.2.2 Master Data Management
Synonym
Reference Data Management
Definition & source
[Source: Gartner – Magic Quadrant for Master Data Management of Customer Data Solution]
http://www.gartner.com/technology/reprints.do?id=1-1CK9UDO&ct=121019&st=sb
MDM is a technology-enabled discipline in which business and IT work together to
ensure the uniformity, accuracy, stewardship, semantic consistency and accountability
of the enterprise's official, shared master data assets.
[Source: Master Data Management]
Master Data Management (MDM) is the collective application of governance, business
processes, policies, standards and tools that facilitate consistency in data definition.
Description
MDM has the objective of providing processes for collecting, aggregating, matching,
consolidating, quality-assuring, persisting and distributing such data throughout an
organization to ensure consistency and control in the ongoing maintenance and
application use of this information.
Example
Recommended definition
3.2.3 Master Reference Data
Synonym
Definition & source
Description
A combination of Master Data and Reference Data. The governance of these two components is
quite different:
• reference data are often defined by external organizations and are defined at design time;
they are generally managed within a terminology server (or a metadata repository) as part
of all the code systems
• master data are created during application run time through a transaction and are stored
in the source system considered as the source of truth.
Example
Recommended definition
3.2.4 Master Data Source System
Synonym
Definition & source
Description
In the context of Master Reference Data Management this corresponds to the set of code
systems that are commonly used across many different systems and attributes
Example
• List of Country codes
• List of Therapeutic areas
Recommended definition
3.2.5 Reference Data
Synonym
Definition & source
Description
Example
Recommended definition
3.2.6 Reference Data Management
Synonym
Definition & source
Description
Example
Recommended definition
3.3 Controlled Terminology, code systems & value sets
In this section we limit the definitions to the terms most often used in clinical research operations,
to clear up the confusion between terms such as “code list”, “controlled terminology” and “dictionary”
(e.g. MedDRA).
[Figure: The components of controlled vocabularies, and how controlled vocabularies are described
and used, with an example from CDISC Terminology. Components shown: concepts, concept identifiers,
concept representation, designations, codes, code systems, code system versioning, value sets,
value set definition, value set versioning, and the ISO 21090 CD (Concept Descriptor) data type.
Example shown: value set C66731 (for SEX); concept “Women” / Female with code C16576 and
designation F (primary). Identifiers in define.xml (not machine processable): CDISC NCI EVS CT code;
other (machine processable): OID, URI. Inspired from Julie James, BlueWave Informatics.]
3.3.1 Controlled Terminology/controlled vocabulary
Synonym
Controlled vocabulary
Definition & source
[Source: CDISC]
CDISC Controlled Terminology is a set of standard value lists that are used throughout
the clinical research process from data collection through analysis and submission
History of alignment of CDISC terminology:
• NCI EVS (Enterprise Vocabulary Services) original terminology applicable to
SDTMIG (2005)
• HL7 EHR Clinical research functional profile linking HL7 standards with CDISC
CDASH (data collection standards)
• HITSP - (replaced by HITSC)
• ISO - in progress
• JIC - Future intention to align with JIC?
[Source: http://en.wikipedia.org/wiki/Controlled_vocabulary]
Controlled vocabularies provide a way to organize knowledge for subsequent retrieval.
Controlled vocabulary schemes mandate the use of predefined, authorised terms that
have been preselected by the designer of the vocabulary
[Source: Mapping from a Clinical Terminology to a Classification: AHIMA]
Controlled means that the content of the terminology is validated with careful quality
assurance procedures in place to ensure that the terminology is structurally sound,
biomedically accurate and consistent with current practice.
Controlled terminology in the context of Controlled Vocabulary:
[Source: Amy Warner, A Taxonomy Primer]
Controlled vocabularies … are organized lists of words and phrases, or notation
systems, that are used to initially tag content, and then to find it through
navigation or search.
Description
[Source: ISO Standard 1087] and [Medical Informatics: Computer Applications in
Healthcare and Biomedicine]
The terms terminology, vocabulary and nomenclature are often used
interchangeably by creators of coding systems and by authors discussing the
subject. ISO Standard 1087 (Terminology – Vocabulary) lists the various
definitions for these terms.
o Terminology: Set of terms representing the system of concepts of a
particular subject field
o Nomenclature: System of terms that is elaborated according to
pre-established naming rules
o Dictionary: Structured collection of lexical units, with linguistic
information about each of them
o Vocabulary: Dictionary containing the terminology of a subject field
A Controlled Terminology is a synonym of Controlled Vocabulary.
It is a set of standardized words and phrases (designations) used to refer to concepts.
• It has a defined scope or describes a specific domain
• It may support categorization, indexing, and retrieval of information (optional).
• A good terminology typically includes preferred terms and synonyms while
promoting consistency in preferred terms and in the assignment of the same terms
to similar content.
A controlled terminology – or code system – can be used for coding, i.e. the assignment of
a code alongside a verbatim term.
Example
ICD-9 CM, SNOMED CT, LOINC, MedDRA are all controlled terminologies AND code
systems.
CDISC CT is a controlled terminology but not a true code system because:
• there is no OID to represent all the CDISC CT as a unique, well identified set,
• governance: the organisation that publishes/manages it (NCI), with OID and
designation, is not the same as the one responsible for it (CDISC),
• it can be extended by the sponsor.
Recommended definition
Same as description
3.3.2 Code system
Synonym
Controlled Terminologies, Controlled Vocabularies, Coding schemes
(and sometimes also code lists, e.g. ISO country code)
Definition & source
[Source: ISO 21090]
managed collection of concept identifiers, usually codes, but sometimes more complex
sets of rules and references
NOTE They are often described as collections of uniquely identifiable concepts with
associated representations, designations, associations and meanings.
EXAMPLES ICD-9, LOINC and SNOMED-CT
Description
A Code System is a more strictly “regulated” controlled terminology
• A Code System may be described as “a collection of uniquely identifiable concepts
with associated representations, designations, associations, and meanings” (B for
Blue, Y for Yellow) – while a controlled terminology could be just a list of words
(Blue, Yellow, ..)
• A Concept should be unique in a given Code System and should have a unique
identifier (e.g. CUI – concept unique identifier), following the governance rules of
the Code System
• A Code System should have:
o an identifier (e.g. OID) that uniquely identifies the Code System.
o a description consisting of prose that describes the Code System, and may
include the Code System uses, maintenance strategy, intent and other
information of interest
o administrative information proper to the Code System, such as ownership,
source URL, and copyright information
o a code system version, as the code system could evolve over time (sometimes
with a change in the underlying concept)
A controlled terminology – or code system – can be used for coding, i.e. the assignment of
a code alongside a verbatim term.
Example
ICD-9 CM, SNOMED CT, LOINC, MedDRA, NCIT (NCI Thesaurus), ISO 3166 for
country codes
Note: CDISC CT is not a code system as it does not have strict version control and
governance (see above).
Recommended definition
A Code system – as a controlled terminology – is described as “a collection of uniquely
identifiable concepts with associated representations, designations, associations, and
meanings”. Each concept in a code system is unique. A code system has strict
governance rules to manage its content (and this is the main difference from a
controlled terminology, where there is no governance).
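The properties listed in the description can be sketched as a small data structure. This is an illustration under assumptions: the class and field names are invented for the sketch, the identifier is a placeholder rather than a registered OID, and the colour concepts reuse the B for Blue, Y for Yellow example above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    code: str                 # machine-readable identifier
    primary_designation: str  # preferred human-readable label
    synonyms: tuple = ()      # other designations of the same concept


@dataclass
class CodeSystem:
    identifier: str           # e.g. an OID; the value used below is a placeholder
    description: str
    owner: str                # administrative information
    version: str
    concepts: dict = field(default_factory=dict)

    def add(self, concept: Concept) -> None:
        # Governance rule from the text: a concept is unique within its code system.
        if concept.code in self.concepts:
            raise ValueError(f"duplicate code {concept.code}")
        self.concepts[concept.code] = concept

    def designation(self, code: str) -> str:
        return self.concepts[code].primary_designation


colours = CodeSystem(
    identifier="placeholder-oid",  # hypothetical, not a registered OID
    description="Illustrative colour codes",
    owner="Example sponsor",
    version="1.0",
)
colours.add(Concept("B", "Blue"))
colours.add(Concept("Y", "Yellow"))
```

A plain list of words (Blue, Yellow) would carry none of the identifier, version or uniqueness checks, which is the distinction drawn above between a controlled terminology and a code system.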
3.3.3 Dictionary
Synonym
Controlled Terminology/Controlled vocabulary
Definition & source
-
Description
Often used in clinical data management for MedDRA, this is an overloaded term
with different meanings in different contexts. We therefore suggest avoiding it
and using the proper wording, i.e. controlled terminology or code system
Example
MedDRA
Recommended definition
Do not use!
3.3.4 Concept
Synonym
Definition & source
[Source: ISO 21090]
unitary mental representation of a real or abstract thing; an atomic unit of thought
NOTE 1 It should be unique in a given code system.
NOTE 2 A concept can have synonyms in terms of representation and it can be a
primitive or compositional term.
Description
• A Concept is a unitary mental representation of a real or abstract thing – an atomic
unit of thought – within a specific context
• The purpose of defining the concept is to share meaning in information exchange
• Concepts constitute the smallest semantic entities with which models are built. The
authors and the readers of a model use concepts and their relationships to build
and understand the models; these are what matter to the human user of models.
• A concept can be labelled with a code (machine readable) and/or a designation
(human readable); a collection of codes constitutes a code system
• Concepts and real world objects are defined at different levels (an object is an actual
thing that exists, while a concept is a mental thing)
Example
• real “unit of thought”: apple, pomme (when we need a more refined definition such as
green or red apple, the concept can be refined)
• abstract “unit of thought”: love
Recommended definition
A Concept is a unitary mental representation of a real or abstract thing – an atomic
unit of thought; a concept can be labelled with a code and/or a designation
3.3.5 Code
Synonym
Definition & source
[Source: ISO 21090]
concept representation published by the author of a code system as part of the code
system, being an entity of that code system
Description
• A Code is a machine processable Concept Representation published by the author
of a Code System as part of the Code System
• It is the preferred unique identifier (unambiguous) for that concept in that Code
System for the purpose of communication (preferred machine-readable identifier),
and is used in the 'code' property of an ISO 21090 CD data type
• Codes are sometimes meaningless identifiers, and sometimes they are mnemonics
that imply the represented concept to a human reader.
Note:
• a concept representation has a code and one or more designations. If there is
more than one designation of the same concept, these are synonyms of each
other.
o In a code system that has synonyms, it is useful to have a “primary
designation” assigned by the code system provider.
o This is helpful in maintenance, because if a change is needed then this can
be done without needing to retire and re-author the whole concept;
whereas if there is no primary designation, it is difficult to decide whether
making a change to “one of the synonyms” means retiring and
re-authoring the whole concept.
• a decode is generally used as the (primary) designation of a concept
Example
• MedDRA code – meaningless identifier – “10040589” (Shoplifting)
• ISO (2-letter) country codes – mnemonic – GB = Great Britain
• In CDISC Vocab:
o C16576 is the code for Female in CDISC Vocab CT
o F is the designation for Female
o Female might be another designation (it is a synonym of F, and should
ideally be the primary designation as it is more human readable)
Recommended definition
Meaningless identifiers of a concept, which should ideally be linked with a designation
(or decode) which is human readable/meaningful
3.3.6 Code list
Synonym
Value set, Code system (e.g. ISO country code)
Definition & source
Description
Code lists within a database are implementations of a CT. The coded value is
operational and not necessarily part of the CT. For example, a code list 1=Male,
2=Female is the sponsor application of the CDISC terminology for SEX containing the value
list (Male, Female).
Example
Recommended definition
Do not use – not precise enough – use either code system or value set as appropriate
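The 1=Male, 2=Female example can be sketched as follows. The variable and function names are assumptions for illustration; the point is only that the stored integers belong to the sponsor database, while the terms come from the terminology.

```python
# Terms of the terminology (the value list named in the text).
SEX_TERMS = {"Male", "Female"}

# Sponsor code list: operational coded values stored in the database.
# The integers are local to the sponsor and are not part of the terminology.
sponsor_sex_codelist = {1: "Male", 2: "Female"}


def decode(stored_value: int) -> str:
    """Translate a stored sponsor code into its terminology term."""
    term = sponsor_sex_codelist[stored_value]
    if term not in SEX_TERMS:
        raise ValueError(f"{term!r} is not a term of the terminology")
    return term


decoded = [decode(v) for v in (2, 1, 2)]
```

Another sponsor could store 0 and 1 for the same two terms, which is why the term “code list” alone does not say whether the code system or the value set is meant.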
4
RESTART HERE
4.1.1 Value set
[Figure: data model relating class, dataElement, valueSetDEBinding, valueSet, valueSetDefinition
(valueSetType = intensional or valueSetType = extensional), codedConcept and codeSystem, with
cardinalities 0,n / 1,n / 0,1 / 1,1 on the associations.]
Synonym
Code list
Definition & source
[Source: ISO 21090]
that which represents a uniquely identifiable set of valid concept representations,
where any concept representation can be tested to determine whether or not it is a
member of the value set
NOTE A concept representation can be a single concept code or a post-coordinated
combination of codes.
Description
• A Value Set represents a uniquely identifiable set of valid concepts in context, i.e.
bound to a specific data element.
• A value set draws from one or more code systems
• Example: most SDTM value sets can be extended with sponsor defined concepts
(which need to be defined as part of the sponsor code system)
– LBTESTCD is a value set that can be extended
– AESEV cannot be extended
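A minimal sketch of an extensional value set with a membership test and an extensibility flag is given below. The ValueSet class, its fields and the member codes are assumptions chosen for illustration; only the names LBTESTCD and AESEV and their extensibility come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ValueSet:
    name: str
    members: set      # codes drawn from one or more code systems
    extensible: bool  # may the sponsor add concepts?
    sponsor_extensions: set = field(default_factory=set)

    def contains(self, code: str) -> bool:
        """Membership test: any concept representation can be checked."""
        return code in self.members or code in self.sponsor_extensions

    def extend(self, code: str) -> None:
        if not self.extensible:
            raise ValueError(f"{self.name} cannot be extended")
        self.sponsor_extensions.add(code)


# Member codes below are illustrative, not a complete list.
lbtestcd = ValueSet("LBTESTCD", {"ALB", "GLUC"}, extensible=True)
aesev = ValueSet("AESEV", {"MILD", "MODERATE", "SEVERE"}, extensible=False)

lbtestcd.extend("SPONSOR1")  # hypothetical sponsor-defined test code
```

Keeping sponsor extensions in a separate set makes it visible which members come from the published terminology and which were added locally.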
Questions for discussion: (CNE, coded no exceptions, or CWE, coded with exceptions)
How to control/describe the use of an existing CT (e.g. CDISC) within a particular
organisation where the data standards do not necessarily use the CDISC codes, i.e. the
sponsor CT is an instance of CDISC CT but modified in some way?
• Sponsors need to have properly governed code systems
• By taking CDISC CT and governing it (and potentially adding new concepts)
they can build a code system
• However, by adding new concepts the sponsor diverges from the industry
standards
In which case is it good to add new concepts to a code system? To be discussed next
time: does it make sense to add new concepts in a standalone fashion? You do not
have a standard any more.
Example
Recommended definition
4.2 Interoperability
4.2.1 Interoperability
Synonym
Definition & source
• ISO 11179: interoperability concerning the creation, meaning, computation, use,
transfer, and exchange of data [ISO/IEC 20944-1]
• ISO 11179: capability to communicate, execute programs, or transfer data among
various functional units in a manner that requires the user to have little or no
knowledge of the unique characteristics of those units [ISO/IEC 2382-1]
• IEEE: ability of two or more systems or components to exchange information and
to use the information that has been exchanged.
(Source:
http://www.ieee.org/education_careers/education/standards/standards_glossary.html)
Description
Example
Recommended definition
4.2.2 Technical interoperability (“machine interoperability”)
Synonym
Definition & source
Technical Interoperability: The focus of technical interoperability is on the conveyance
of data, not on its meaning. Technical interoperability encompasses the transmission
and reception of information that can be used by a person but which cannot be further
processed into semantic equivalents by software. Note that mathematical operations
can be -- and frequently are -- performed at the level of technical interoperability. A
good example is the use of a “check digit” to determine the integrity of a specific unit
of transmitted or keyed-in data. The same mathematical formula is performed at each
end of a transaction and the results compared to assure that the data was successfully
transmitted.
Technical interoperability moves data from system A to system B.
Synonyms: Functional, Syntactic, exchange
(Source: Coming to Terms: Scoping Interoperability for Health Care, HL7 EHR
Interoperability WG)
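The check digit mentioned in the definition can be illustrated with a short sketch. The source does not name a particular formula; the Luhn algorithm is used here as an assumption because it is widely known, and the send and receive functions are hypothetical stand-ins for system A and system B.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk the payload from right to left, doubling every second digit,
    # starting with the rightmost one.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def send(payload: str) -> str:
    """System A: append the check digit before transmission."""
    return payload + str(luhn_check_digit(payload))


def receive(message: str) -> bool:
    """System B: recompute the check digit and compare it with the one received."""
    payload, received_digit = message[:-1], int(message[-1])
    return luhn_check_digit(payload) == received_digit


message = send("7992739871")
```

Both ends apply the same formula without knowing what the digits mean, which is why this counts as technical rather than semantic interoperability.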
Description
Example
Recommended definition
4.2.3 Semantic interoperability
Synonym
Definition & source
Semantic Interoperability: To maximize the usefulness of shared information and to
apply applications like intelligent decision support systems, a higher level of
interoperability is required. This is called semantic interoperability which has been
defined as the ability of information shared by systems to be understood… so that
non-numeric data can be processed by the receiving system. Semantic interoperability
is a multi-level concept with the degree of semantic interoperability dependent on the
level of agreement on data content terminology and the content of archetypes and
templates used by the sending and receiving systems.
Semantic Interoperability ensures that system A and system B understand the data in
the same way
(Source: Coming to Terms: Scoping Interoperability for Health Care, HL7 EHR
Interoperability WG)
Description
Example
Recommended definition
4.2.4 Process Interoperability
Synonym
Definition & source
Process Interoperability: Process interoperability is an emerging concept that has
been identified as a requirement for successful system implementation into actual
work settings. It was identified during the project by its inclusion in academic papers,
mainly from Europe, and by its being highlighted by an Institute of Medicine (IOM)
report issued in July 2005 which identified this social or workflow engineering as key to
improving safety and quality in health care settings, and for improving benefits
realization. It deals primarily with methods for the optimal integration of computer
systems into actual work settings and includes the following:
• Explicit user role specification
• Useful, friendly, and efficient human-machine interface
• Data presentation/flow supports work setting
• Engineered work design
• Explicit user role specification
• Proven effectiveness in actual use
Process interoperability coordinates work processes, enabling the business processes
at the organizations that house system A and system B to work together. Process
interoperability is achieved when human beings share a common understanding, so
that business systems interoperate and work processes are coordinated.
Comment: The EU Interoperability Framework (EIF) defines organizational interoperability,
which might be the same as process interoperability?
(Sources: 1. Coming to Terms: Scoping Interoperability for Health Care, HL7 EHR
Interoperability WG and
2. Principles of Health Interoperability HL7 and SNOMED (Health Information
Technology Standards), author: Tim Benson, April 2012)
Description
Example
Recommended definition
4.3 Data aggregation, integration
4.3.1 Data pooling
POOLING is the act of pulling together different kinds of data on the same patient (or set of patients in a
clinical trial) to give a holistic representation of what was observed for each patient during the clinical
trial.
• Observed data are the foundation of the clinical trial and should accurately reflect what happened
during the course of the trial to the patients in the trial.
• Once a trial is completed and a database locked, the observed data should never change. It becomes
a historical record/fact of what occurred during the trial.
• Observed data is frequently manipulated to transfer it from one system to another or to facilitate
analysis and presentation of the data.
• Transformations are defined as data mappings to restructure the data format, but leave the data
itself unchanged. This often occurs since the format in which the data is collected will depend on the
source and the IT requirements for such data collection and storage. This is largely a rules-based
activity.
• Derivations are the use of mathematical or logical algorithms to change or to create new data
values or flags. Derivations also include imputations for missing data to facilitate statistical analysis
and inference.
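The difference between a transformation and a derivation can be sketched as follows. The record layout, column names and values are invented for illustration, and BMI is used as an assumed example of a derived value; neither comes from the source.

```python
# Observed data as collected; values are invented for illustration.
observed = [
    {"SUBJ": "001", "HT_CM": 180.0, "WT_KG": 81.0},
    {"SUBJ": "002", "HT_CM": 160.0, "WT_KG": 64.0},
]


def transform(record: dict) -> list:
    """Transformation: restructure one wide record into one row per test.

    The observed values themselves are left unchanged.
    """
    return [
        {"subject": record["SUBJ"], "testCD": "HEIGHT", "result": record["HT_CM"], "unit": "cm"},
        {"subject": record["SUBJ"], "testCD": "WEIGHT", "result": record["WT_KG"], "unit": "kg"},
    ]


def derive_bmi(record: dict) -> dict:
    """Derivation: create a new value (BMI, kg/m2) from observed height and weight."""
    height_m = record["HT_CM"] / 100
    bmi = round(record["WT_KG"] / height_m ** 2, 1)
    return {"subject": record["SUBJ"], "testCD": "BMI", "result": bmi, "unit": "kg/m2"}


transformed = [row for rec in observed for row in transform(rec)]
derived = [derive_bmi(rec) for rec in observed]
```

The transformed rows can be mapped back to the observed record value for value, whereas the derived rows contain values that were never collected.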
4.3.2 Data aggregation
4.3.3 Data integration
INTEGRATION is the storage of individual datasets in a common physical or virtual IT system. The
individual datasets remain distinct entities, but are located in the same IT
environment/infrastructure.
Project:
Title:
Version: 0.1
Metadata Management
Metadata Definitions
Working Group:
Emerging Technologies
Date: 22 April 2013
5 INPUT (draft material that can be used – to be deleted in final document)
5.1 Metadata management
Term: Metadata Management
Synonym: MEM
Definition: Metadata Management is a worldwide infrastructure composed of policies, procedures, standards, models,
skills, tools and training needed to promote the shareability of data throughout the enterprise and to our
customers.
5.2 Master data management
Term: Master Data
Definition: Master Data is business data that has a consistent meaning and definition to be shared across systems; this
applies particularly to data such as site identification, investigator identification, and study identification. It
is produced in a “master system” as part of a transaction and is used for reference and validation in
transactions within other systems.
Master Data – like any other data – are defined with structural metadata
Term: Master Data Management
Synonym: MDM
Definition: Master Data Management comprises a set of processes and tools that consistently defines and manages the
non-transactional data entities of an enterprise which are fundamental to the company’s business operations
(may include reference data). Master Data Management has the objective of providing processes for
collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing such data
throughout the enterprise to ensure consistency and control in the ongoing maintenance and application
use of this data. This is sometimes known as Reference Data Management.
Term: Master Reference Data
Definition: A combination of Master Data and Reference Data. The governance of these two components is however quite
different:
• reference data are often defined by external organizations and are defined at design time; they are
generally managed within a terminology server (or a metadata repository) as part of all the code
systems
• master data are created during application run time through a transaction and are stored in the
source system considered as the source of truth.
Term: Master Data Source System
Definition: Master Data Source System is the application that houses a master data “dimension” (or type of master data
such as site or investigator) for Perceptive Informatics. The system is available to all applications
(operational and information provisioning, including the Data Warehouse) across the enterprise.
Term: Reference Data
Definition: In the context of Master Reference Data Management this corresponds to the set of code systems that are
commonly used across many different systems and attributes
Term: Reference Data Management
Definition: Management of Reference Data
5.3 Controlled terminology
Term: Concept
Definition: A concept is a “unit of thought” within a particular domain – a unitary or atomic mental representation of a
real or abstract thing
Concepts, as abstract, language- and context-independent representations of meaning, are important for
the design and interpretation of static information models. They constitute the smallest semantic entities [2]
with which models are built. The authors and the readers of an information model use concepts and their
relationships to build and understand the models.
[2] As models are layered and developed, the size and description of the smallest semantic entity may change, to best meet the use case(s) and requirements, and to
show different views on reality
Code
A code is the machine-processable part of a concept representation, published by the author of a code system as part of the code system.
It is the preferred unique machine-readable identifier for that concept in that code system and is used in the 'code' property of an ISO 21090 CD data type.
Codes are sometimes meaningless identifiers, and sometimes mnemonics that imply the represented concept to a human reader; meaningless identifiers are advised, particularly in larger vocabulary systems.
Code system
A code system is a managed collection of concept representations, including codes and/or designations (human-readable text, or decode), but sometimes with more complex sets of rules, references (definitions), and relationships.
Although such collections may variously be referred to as terminologies, vocabularies, coding schemes, or even classifications, the ISO 21090 CD data type considers all of them ‘code systems’.
A code system is typically created for a particular purpose. It may consist of a finite collection, such as concepts that represent individual countries, colours, or states, or it may represent a broad and complex collection of concepts across a particular domain, e.g., SNOMED-CT, ICD, LOINC, and CPT. A code system should be uniquely identifiable; for ISO 21090-conformant uses, this identifier shall take the form of an ISO OID.
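As an illustration only, a code system of the finite kind described above could be sketched as follows. The class names, the OID value and the colour codes are hypothetical placeholders chosen for this example; they are not taken from ISO 21090 or from any published code system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ConceptRepresentation:
    code: str          # machine-processable identifier, unique within the code system
    designation: str   # human-readable text (decode)


@dataclass
class CodeSystem:
    oid: str           # unique identifier of the code system (hypothetical value below)
    name: str
    concepts: Dict[str, ConceptRepresentation] = field(default_factory=dict)

    def add(self, code: str, designation: str) -> None:
        # A code may appear only once within a given code system.
        if code in self.concepts:
            raise ValueError(f"duplicate code {code!r} in {self.name}")
        self.concepts[code] = ConceptRepresentation(code, designation)

    def designation_of(self, code: str) -> str:
        return self.concepts[code].designation


# Hypothetical finite code system; the OID is a placeholder, not a registered one.
colours = CodeSystem(oid="1.2.3.4.5.1", name="Example colours")
colours.add("C01", "Red")
colours.add("C02", "Blue")
```

The codes here are deliberately meaningless identifiers ("C01" rather than "RED"), in line with the advice given for larger vocabulary systems.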
Concept definition
A concept definition is the explanation of the meaning of the concept. The concept definition may be
provided wholly by the concept designation, with or without additional text etc. (see concept
representation), but particularly in large code systems that employ description logic or similar ontological
functionality, the full definition of the concept may require knowledge of its relationship to other concepts
within the code system.
Concept designation
A concept designation is a language symbol for a concept that is intended to convey the concept meaning to a human being. A concept designation may also be known as an appellation, symbol, or term, this latter being the most common synonym.
A concept designation is typically used to populate the 'displayName' property of an ISO 21090 CD data type.
Concept domain
A concept domain is a sentence or paragraph that defines the semantic space (the totality of meaning that can be expressed by the concepts that can be used) for the “thing” that a coded attribute in an information model is to encompass, plus examples of these “things”.
For example: an information model class is “car” and the coded attribute is “manufacturer”; the concept
domain is “The company that makes/markets the car to the general public; examples include General
Motors, Ford Motor Company and Mercedes-Benz”.
Concept identifier
A concept identifier is a vocabulary object that unambiguously and globally uniquely represents a concept within the context of a code system in a machine-readable way.
A concept identifier consists of: the OID for the code system + code (+ designation/display name).
To make a concept identifier human readable, the “display name” (the designation) is added thus: the OID for the code system + code (+ designation/display name). The designation (display name) is not mandatory in the ISO 21090 concept identifier, but it is considered good terminology practice to always carry the designation for safety reasons (data unscrambling etc.) [3].
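The composition described above (code system OID + code, with an optional display name) might be sketched as below. This is a minimal illustration under stated assumptions: the class name, the "#" separator used when printing, the OID and the code "FORD" are all hypothetical. The display name is excluded from comparison to reflect that it is carried for human readability and is not a mandatory part of the identifier.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ConceptIdentifier:
    code_system_oid: str   # OID of the code system
    code: str              # code within that code system
    # Optional designation; ignored when comparing or hashing identifiers.
    display_name: Optional[str] = field(default=None, compare=False)

    def __str__(self) -> str:
        # The "#" separator is an assumption made for this sketch only.
        base = f"{self.code_system_oid}#{self.code}"
        return f"{base} ({self.display_name})" if self.display_name else base


# Hypothetical OID and code, reusing the car manufacturer example.
with_name = ConceptIdentifier("1.2.3.4.5.2", "FORD", "Ford Motor Company")
without_name = ConceptIdentifier("1.2.3.4.5.2", "FORD")
```

Under this design choice, the two objects identify the same concept even though only one of them carries the display name.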
Concept representation
A concept representation is a vocabulary object that enables the description and manipulation of a concept in systems and applications (such as information models, XML schema).
A concept representation is minimally formed by putting together a code and a designation. However, a
concept representation in a code system may also be augmented with additional text, annotations,
references and other resources that serve to further identify and clarify what the concept is.
Value set
A value set is a uniquely identifiable set of valid concept identifiers that instantiate a concept domain in use (in an application, an XML instance, etc.), where any concept identifier used can be tested to determine whether it is a member of the value set at a specific point in time.
Value sets exist to instantiate the permissible content of a concept domain for a particular use: in an information model vocabulary binding, in analysis, in UI data collection (e.g., a pick list or drop-down box), etc.
A value set is useful only in the context of instantiation of an attribute in an information model, not as a stand-alone object (this is in contrast to a code system, which exists in its own right).
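The membership test "at a specific point in time" could be sketched as follows. Modelling time through per-member validity dates is an assumption of this example, as are the class names, the value set identifier, the OID, the codes and the dates; none of them come from ISO 21090.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class Member:
    code_system_oid: str
    code: str
    valid_from: date
    valid_to: Optional[date] = None   # None means the member is still valid


@dataclass(frozen=True)
class ValueSet:
    identifier: str                   # the value set is itself uniquely identifiable
    members: Tuple[Member, ...]

    def contains(self, code_system_oid: str, code: str, on: date) -> bool:
        """Return True if the concept identifier is a member on the given date."""
        for m in self.members:
            if (m.code_system_oid, m.code) != (code_system_oid, code):
                continue
            if m.valid_from <= on and (m.valid_to is None or on <= m.valid_to):
                return True
        return False


OID = "1.2.3.4.5.2"  # hypothetical code system OID
manufacturers = ValueSet(
    identifier="car-manufacturer-picklist",   # hypothetical identifier
    members=(
        Member(OID, "GM", date(2010, 1, 1)),
        Member(OID, "FORD", date(2010, 1, 1)),
        Member(OID, "MB", date(2012, 6, 1)),
    ),
)
```

A pick list for the “manufacturer” attribute would then offer only the members valid on the date of data collection, for example `manufacturers.contains(OID, "MB", date(2013, 4, 22))`.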
[3] Debate as to whether the display name should be carried in a concept identifier continues. There is a significant group who feel that the display name should not be carried.
5.4 Interoperability
Term | Synonym | Definition
Semantic Interoperability
FDA guidance
“Interoperability” means the ability to communicate and exchange data accurately, effectively,
securely, and consistently with different information technology systems, software applications, and
networks in various settings, and exchange data such that clinical or operational purpose and meaning
of the data are preserved and unaltered.
Technical interoperability describes the lowest level of interoperability whereby two different systems
or organizations exchange data so that the data are useful. There is nothing that defines how useful.
The focus of technical interoperability is on the conveyance of data, not on its meaning. Technical
interoperability supports the exchange of information that can be used by a person but not necessarily
processed further. When applied to study data, a simple exchange of nonstandardized data using an
agreed-upon file format for data exchange (e.g., SAS transport file) is an example of technical
interoperability.
Semantic interoperability describes the ability of information shared by systems to be understood, so
that nonnumeric data can be processed by the receiving system. Semantic interoperability is a multilevel concept with the degree of semantic interoperability dependent on the level of agreement on
data content terminology and other factors. With greater degrees of semantic interoperability, less
human manual processing is required, thereby decreasing errors and inefficiencies in data analysis. The
use of controlled terminologies and consistently defined metadata support semantic interoperability.
Process interoperability is an emerging concept that has been identified as a requirement for
successful system implementation into actual work settings. Simply put, it involves the ability of a
system to provide the right data to the right entity at the right point in a business process.
5.5 Data aggregation
6 REFERENCES & RELATED DOCUMENTS
Related Documents
Reference No. | Document Name | Filename
[FDA1] | Guidance for Industry. Providing Regulatory Submissions in Electronic Format - Standardized Study Data - DRAFT GUIDANCE. February 2012 | http://www.fda.gov/downloads/Drugs/Guidances/UCM292334.pdf
[CDISC1] | CDISC Glossary - 2009 | http://www.cdisc.org/stuff/contentmgr/files/0/08a36984bc61034baed3b019f3a87139/misc/act1211_011_043_gr_glossary.pdf
[ISO1] | ISO/IEC 11179 Metadata Registry (MDR) standard | Accessible on ISO site
[ISO2] | ISO 21090 Healthcare Data Type Standard | Accessible on ISO site (draft version available on Internet)
Status | Name | Company | Date | Signature
Author | | | |
Author | | | |
Author | | | |
Author | | | |
7 Appendices
7.1 CDISC glossary
cdisc_glossaryterms_version7.1_final_2008.doc
8 Parking log of implementation
Maintenance/governance of code system – including the need for primary designation