CDISC MDR Business Requirements Specification
3.1.3 Analysis Dataset Creation (single study)
High level requirement
The presumption is that a data collection specification already exists that
identifies all the concepts and variables to be included in the study.
Analysis dataset creation spans aspects of the protocol (the analysis section),
the study analysis plan and the submission analysis plan.
The deliverable is an analysis dataset specification that meets all the protocol,
study and submission requirements and is sufficient for writing the programs
required to create these datasets.
The aim is to write programs that produce ADaM compliant datasets and
traceability (results back to ADaM datasets, back to SDTM datasets, back to
CRF/eCRF) as efficiently as possible.
Problem to solve (+ root cause analysis)
Root cause diagram (branches: Process, People, Technology, External data
standards). Causes identified:
• No process to enforce collection of metadata when collecting data
• No process to ensure consistency in data collection across studies
• No tools to store the relationship between variables at a conceptual level
(e.g. SYSBP may be collected with site and position)
• Data collection tools do not support collection of metadata ???
• Additional burden on the data collection team, with the benefits going to
other people
• Protocol teams focus on ONE protocol and overlook the need for data
integration for submission (ISS and ISE) and further data mining
• CDISC SDTM does not manage different groupings in different contexts
(e.g. SYSBP with/without qualifiers)
• CDISC SDTM is limited to safety
• No consistency in the way variables are used across studies
• No information on how variables were linked together in data collection
• No agreement on terminology/code lists in clinical standards
• BRIDG is the conceptual model linking variables
High level storyboard
1. Generate SDTM datasets in a metadata driven way
2. Define analysis datasets and variables (including algorithms)
3. Create ADaM datasets and variables using the CMDR variables now provided
in SDTM datasets (via step 1)
4. Create define.xml with full traceability using the CMDR metadata and
metadata created and stored in steps 1-3 above
Detailed storyboard
SDTM
The starting point for creation of analysis datasets is a set of datasets containing
CMDR compliant variables populated with the data collected in the study. If the
datasets are SDTM compliant, no pre-processing of the data is required before
commencing analysis dataset creation. If that is not the case, then the first step
is to create SDTM compliant datasets.
Note that some SDTM domains (primarily safety domains) are well specified in
the SDTM Implementation Guide (SDTM IG). Many domains not covered by
the SDTM IG are likely to remain company specific due to different reporting
needs and different toolsets.
For consistent and efficient generation of SDTM datasets, a metadata driven
approach is employed:
1. CMDR concepts for which named SDTM domains exist in the SDTM IG will
already have that link established, as a default, within the concept (e.g. the
AE concept would be identified as matching the SDTM AE domain)
2. A study-based link between CMDR concepts and SDTM classes and domains
will be established in a study/compound/company-based metadata registry
for those CMDR concepts for which named SDTM domains do not exist in
the SDTM IG (and SDTM domains were not agreed during the population of
the CMDR)
3. Using the link (stored within the concept) between the BRIDG model and
the CMDR variables, generic programs would “map” CMDR variables to the
relevant SDTM variables within the appropriate SDTM domain (identified in
steps 1 and 2 above)
4. Any CMDR variables which do not fit into the relevant findings,
interventions or events domain would be auto-populated into domain-level
SUPPQUAL datasets to ensure consistent handling of variables
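The four steps above can be sketched roughly as follows. This is a minimal illustration only: the `Concept` class, the CMDR-side variable names (`AE_TERM`, `AE_START`, `AE_HOME_TREATED`) and the dictionary-based study registry are assumptions made for the example, not structures defined by this specification, and a real SUPPQUAL dataset has a fuller record layout than the plain dictionary used here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    """Hypothetical stand-in for a CMDR concept."""
    name: str
    default_domain: Optional[str] = None  # set when the SDTM IG names a domain
    variable_map: Dict[str, str] = field(default_factory=dict)  # CMDR var -> SDTM var


def resolve_domain(concept: Concept, study_registry: Dict[str, str]) -> str:
    """Steps 1-2: prefer the concept's default link, else the study-level registry."""
    if concept.default_domain is not None:
        return concept.default_domain
    if concept.name in study_registry:
        return study_registry[concept.name]
    raise LookupError(f"no SDTM domain linked to concept {concept.name!r}")


def map_record(concept: Concept, record: Dict[str, str],
               study_registry: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Steps 3-4: map linked variables; send the rest to the domain's SUPP dataset."""
    domain = resolve_domain(concept, study_registry)
    parent, supplemental = {}, {}
    for cmdr_var, value in record.items():
        target = concept.variable_map.get(cmdr_var)
        if target is None:
            supplemental[cmdr_var] = value  # no SDTM variable linked
        else:
            parent[target] = value
    return {domain: parent, f"SUPP{domain}": supplemental}


# Illustrative data only.
ae = Concept("AdverseEvent", "AE", {"AE_TERM": "AETERM", "AE_START": "AESTDTC"})
collected = {"AE_TERM": "Headache", "AE_START": "2010-03-01", "AE_HOME_TREATED": "Y"}
mapped = map_record(ae, collected, study_registry={})
```

In this sketch a concept without a default domain falls back to the study registry, and a concept linked in neither place raises an error rather than being silently dropped.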
CMDR
• Concept definition. A complex concept is defined for each SDTM domain
defined in the SDTM IG. This complex concept contains all the simple concepts
related to SDTM variables, which allows any relevant variable to be mapped to
an SDTM variable.
  • Note: SDTM variable names should be used as the gold standard variable
  names as much as possible.
• Concept definition. New complex concepts can be defined to cover domains
not yet covered in SDTM. The CMDR would then be the definitive source for
named SDTM domains. The SDTM IG would provide key training and guidance for
implementation, but the IG would no longer be maintained as the authoritative
source for the SDTM specification.
  • Note: a company-specific MDR should also be able to manage simple and
  complex concepts, e.g. in the case of new indications or proprietary
  information.
    • This should however be minimized.
    • The variables included in these company-specific concepts should be
    based on CMDR simple concepts/variables as much as possible.
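One possible way to picture the relationship between simple and complex concepts, and the preference for reusing CMDR content in company-specific concepts, is the sketch below. The class names, the `source` field and the `cmdr_share` measure are hypothetical and only serve the illustration; `XSCORE` is an invented company variable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SimpleConcept:
    name: str
    variable: str          # preferred variable name (SDTM name where one exists)
    source: str = "CMDR"   # "CMDR" or "company" (assumed labels)


@dataclass(frozen=True)
class ComplexConcept:
    name: str
    members: Tuple[SimpleConcept, ...]
    source: str = "CMDR"

    def variables(self) -> List[str]:
        """Variable names of all member simple concepts, in order."""
        return [m.variable for m in self.members]

    def cmdr_share(self) -> float:
        """Fraction of member concepts that are reused from the CMDR."""
        if not self.members:
            return 0.0
        reused = sum(1 for m in self.members if m.source == "CMDR")
        return reused / len(self.members)


# A CMDR complex concept matching an SDTM domain.
vital_signs = ComplexConcept("Vital signs", (
    SimpleConcept("Test code", "VSTESTCD"),
    SimpleConcept("Position", "VSPOS"),
    SimpleConcept("Location", "VSLOC"),
))

# A company-specific concept that reuses one CMDR variable and adds one of its own.
company_score = ComplexConcept("New indication score", (
    SimpleConcept("Subject identifier", "USUBJID"),
    SimpleConcept("Proprietary score", "XSCORE", source="company"),
), source="company")
```

A low `cmdr_share` on a company-specific concept would be a prompt to check whether existing CMDR simple concepts could be used instead.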
Analysis Datasets
Through review of the protocol analysis section and study/submission analysis
plans (which identify the analyses and displays that are required):
• Identify new analysis datasets required
• Identify new analysis variables required
• Identify algorithms required for analysis variables
  • may need to identify alternative algorithms required to demonstrate
  robustness of analysis to different algorithms
  • some algorithms may be data driven, e.g. a pre-defined algorithm doesn’t
  handle all situations that are present in the data
  • establish rules about handling of incomplete data
  • identify any variables required for statistical procedures, e.g. dummy
  variables for logistic regression
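As a rough illustration of the points above, an analysis variable could be registered with a primary algorithm and an alternative one for robustness checks, each with an explicit rule for missing values. The algorithm names, the "ignore missing values" rule and the `IS_` prefix for dummy variables are assumptions chosen for this sketch, not rules set by this specification.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence


def last_non_missing(values: Sequence[Optional[float]]) -> Optional[float]:
    """Assumed primary rule: last observed value, ignoring missing ones."""
    present = [v for v in values if v is not None]
    return present[-1] if present else None


def mean_non_missing(values: Sequence[Optional[float]]) -> Optional[float]:
    """Assumed alternative rule: mean of observed values, ignoring missing ones."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None


# Algorithms stored by name so a specification can refer to them.
ALGORITHMS = {"primary": last_non_missing, "sensitivity": mean_non_missing}


def derive(values: Sequence[Optional[float]], algorithm: str = "primary") -> Optional[float]:
    """Derive an analysis value with the named algorithm."""
    return ALGORITHMS[algorithm](values)


def dummy_variables(value: str, levels: List[str], reference: str) -> Dict[str, int]:
    """0/1 indicators for every non-reference level, e.g. for logistic regression."""
    return {f"IS_{level}": int(value == level) for level in levels if level != reference}


pre_dose = [120, None, 130, None]
baseline_primary = derive(pre_dose)
baseline_sensitivity = derive(pre_dose, "sensitivity")
indicators = dummy_variables("B", ["A", "B", "C"], reference="A")
```

Keeping the algorithms in a named table makes it straightforward to rerun a derivation under the alternative rule and compare results.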
ADaM
Create ADaM datasets using the metadata defined and stored in step 2 above, and
by carrying out the following:
• Identify protocol violators (using the CDISC inclusion/exclusion standard;
new analysis variables may be needed for this, e.g. to determine compliance or
to derive from the medical history whether the subject has had “cancer in
past N years?”)
• Set population flags (often set programmatically, but sometimes entered
manually)
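The two activities above might look like the following in outline. The medical history record layout (`category`, `start`), the use of "Y"/"N" values and the manual override argument are assumptions for the example; entries with incomplete dates are not handled here and would need the incomplete-data rules agreed for the study.

```python
from datetime import date
from typing import Dict, List, Optional


def years_before(reference: date, n_years: int) -> date:
    """Date n_years before reference (29 Feb falls back to 28 Feb)."""
    try:
        return reference.replace(year=reference.year - n_years)
    except ValueError:
        return reference.replace(year=reference.year - n_years, day=28)


def cancer_in_past_n_years(history: List[dict], reference: date, n_years: int) -> str:
    """'Y' if any cancer entry started within n_years before reference, else 'N'."""
    cutoff = years_before(reference, n_years)
    for entry in history:
        if entry["category"] == "CANCER" and cutoff <= entry["start"] <= reference:
            return "Y"
    return "N"


def per_protocol_flag(violations: Dict[str, str], manual: Optional[str] = None) -> str:
    """Population flag: a manual entry wins, otherwise derived from violation flags."""
    if manual is not None:
        return manual
    return "N" if "Y" in violations.values() else "Y"


# Illustrative data only.
history = [
    {"category": "CANCER", "start": date(2007, 6, 1)},
    {"category": "HYPERTENSION", "start": date(2009, 2, 10)},
]
screening = date(2010, 1, 15)
flag_5y = cancer_in_past_n_years(history, screening, 5)
flag_2y = cancer_in_past_n_years(history, screening, 2)
pp_flag = per_protocol_flag({"CANCER5Y": flag_5y})
```

The override argument reflects the note that population flags are usually derived programmatically but are sometimes entered manually.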
CMDR (for Analysis Datasets and ADaM creation)
• Concept definition. It is possible to define a complex concept for each
analysis dataset. This practice should however be limited to “standard
datasets” that are expected to be created regularly.
Very specific analysis datasets should be created either within the
company-specific MDR or in the study metadata registry.
• Variable definition. It is possible to specify new variables (linked to a simple
concept). When defining a new variable, an algorithm can be provided with it
if the variable is derived.
Note: in the first phase of the CMDR there is no further specification of
algorithms, and it will be hard to get agreement on definitions for many
analysis variables. In a second phase we may want to standardize
derivation algorithms, as they also define semantics.
• Note: Analysis Datasets and ADaM creation would use CMDR variable and
concept information in a basic lookup manner but would not provide
algorithms/coding information
Summary of needs
Define.xml
Generate define.xml that provides full traceability using CMDR metadata,
metadata derived and stored as part of the SDTM generation, and metadata
derived during the ADaM dataset creation
CMDR
• Variable search. It should be possible to retrieve variables and their full
definitions for inclusion in a define.xml file
• Define.xml could link to CMDR content rather than carrying
duplicated/downloaded information (but as this is more of a define.xml
requirement, it needs to be coordinated with CDISC TLC)
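To make the traceability idea concrete, the sketch below assembles a small XML fragment in which each variable carries its definition and the source variables it was derived from. It is deliberately simplified: the element and attribute names only loosely echo define.xml and do not conform to the Define-XML schema, and the input dictionary layout is an assumption for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_define_sketch(variables: List[Dict]) -> ET.Element:
    """Build a simplified, non-conformant define-like tree with source links."""
    root = ET.Element("Define")
    for var in variables:
        item = ET.SubElement(
            root, "ItemDef",
            OID=f"IT.{var['dataset']}.{var['name']}",
            Name=var["name"],
        )
        ET.SubElement(item, "Description").text = var["label"]
        origin = ET.SubElement(item, "Origin", Type=var["origin_type"])
        # One Source element per upstream variable gives the trace back to SDTM.
        for source in var.get("sources", []):
            ET.SubElement(origin, "Source").text = source
    return root


# Illustrative metadata only.
define = build_define_sketch([
    {"dataset": "ADVS", "name": "AVAL", "label": "Analysis Value",
     "origin_type": "Derived", "sources": ["VS.VSSTRESN"]},
    {"dataset": "ADVS", "name": "USUBJID", "label": "Unique Subject Identifier",
     "origin_type": "Predecessor", "sources": ["DM.USUBJID"]},
])
xml_text = ET.tostring(define, encoding="unicode")
```

If define.xml were to link to CMDR content instead of copying it, the `Description` text here could be replaced by a reference to the CMDR entry.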
CMDR
• CMDR concepts for which named SDTM domains exist will already have
that link established, as a default, within the concept (the AE concept would
be identified as matching the SDTM AE domain)
• Using the link (stored within the concept) between the BRIDG model and
the CMDR variables, generic programs would “map” CMDR variables to the
relevant SDTM variables
Study/compound/company-based metadata registry
• A study-based link between CMDR concepts and SDTM classes and domains
will be established in a study/compound/company-based metadata registry
for those CMDR concepts for which named SDTM domains do not exist