Define.XML
A case study describing how to produce Define.XML from within SAS
Dianne Weatherall
Sep 2012
Copyright © 2012 Quintiles
Topics for Discussion
Why do we create the Define?
Requirements to create the Define?
What is Define.XML?
How do we create the Define in SAS?
Summary
Note
• Examples and code are based on the Draft Define V2
• Slides are based on the SDTM Define, but similar principles can be applied to the ADaM Define
> Key differences are explained in the Draft Define V2
Why?
• To describe what is included in the electronic data transfer (datasets, variables, codelists, etc.), i.e. the metadata: an elaborate “PROC CONTENTS” and more
• To provide a user-friendly document with links for easy navigation and easy access to information
• To easily access the annotated CRF and datasets
• To have a standard format to describe the electronic data transfer
• Why XML?
> Machine readable format for use by software applications
> Browser-based report (through the use of an XSL stylesheet)
> Vendor neutral, platform independent
Requirements?
• Metadata in a usable format that is easy to capture and can be imported into SAS (recommended format: Excel)
> Dataset metadata (list of domains): name, description, class, structure, purpose, keys, location
> Variable metadata (list of variables): name, label, type, controlled terms / format, origin, role, comment
> Controlled terms: coded value, decode
> Value-level metadata: variable, where, type, controlled terms or format, origin, computation method
> Computational algorithms: reference name, name, computation method
• Data in XPT format
• Annotated CRF in PDF format
• Supporting documentation:
> Reviewer's guide / complex algorithms in PDF format
• Example: dataset metadata
• Example: variable metadata
• Example: controlled terms
> Usually includes all possibilities on the CRF, not only those in the data
> Needs to be set up from the CRF, not the data
• Example: value-level metadata
> Two types: SUPPQUAL (QVAL) and FINDINGS (ORRES)
> Can be generated from the data, especially electronic labs
• Example: computational algorithms
> Used for standard algorithms that can be applied across variables
> Very long derivations are better specified in a PDF file and linked from the variable comment
• Specification vs Metadata
> Specification
- Instructions to a programmer to get to the final SDTM / ADaM dataset
- Contains derivations that often refer to raw variable names (for programming)
- May contain SAS code in some cases
- May refer to other documents (e.g. the SAP)
> Metadata
- Describes the contents of the final SDTM/ADaM dataset
- Contains derivations that describe in words how the data was derived (for traceability and reviewer information)
- Does not contain SAS code and does not refer to raw variables
- Contains links to other documents
> To combine or not to combine?
- It makes sense to create one document that serves both purposes
» Prevents rework
» Consistency
» Less work
» BUT… make sure you separate programmer notes from reviewer comments
What?
• Define.XML with and without stylesheet
• Define-XML Version 2.0.0 can be used for SDTM and ADaM
• Terminology:
> Def: declaration of an object
> Ref: reference to an object
> OID: identifier of a metadata object
• Validation of the Define.XML:
> It must properly reference versions of the CDISC standards
> It must conform to the schemas
> It must meet all requirements of the Define-XML Specification
> It can be checked using OpenCDISC according to defined rules
• Structure of the XML file
> XML header, the ODM root element, Study and MetaDataVersion
> Linked supporting PDF documents (e.g. aCRF)
> Value List definitions (e.g. for Supplemental Qualifiers / Findings Tests)
> Where Clause definitions (referenced in Value List definitions)
> Dataset definitions (ItemGroup)
> Variable definitions (ItemDef)
> Code List definitions (CodeList)
> Methods definitions (MethodDef)
> Comments definitions (CommentDef)
> Leaf definitions
Preparation before writing to the XML file
> Create XPT files of the SDTM datasets
> Import the specifications into SAS and create a dataset containing the Dataset and Variable metadata
> Import the methods into SAS and merge them onto the metadata by method name
> XML does not accept certain special characters (e.g. &, < and >) as they are; convert them to their text equivalents (e.g. &amp;, &lt; and &gt;)
> Create a value-level metadata dataset by choosing the “TESTCD”, “PARMCD” and “QNAM” variables and obtaining the values of those variables from the data
> Create a codelist dataset by running through the variables and value-level variables that have codelists attached and obtaining the values of those variables
> Codelists: decide whether to set them up in the specs or to create them from the data
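Two of the preparation steps, converting special characters and deriving value-level rows from the data, could be sketched as follows. This is only an illustration under stated assumptions, not the presenter's code: the datasets WORK.VARMETA and WORK.VS, their variables and the sample values are hypothetical.

```sas
/* Hypothetical variable metadata with a comment containing special characters */
data work.varmeta;
  length dataset $8 variable $8 comment $200;
  dataset  = 'VS';
  variable = 'VSORRES';
  comment  = 'Values < 40 & > 200 were queried';
  output;
run;

/* Convert XML special characters to entities.
   & is replaced first so that the entities added afterwards are not escaped again. */
data work.varmeta_xml;
  set work.varmeta;
  length comment_xml $1000;
  comment_xml = tranwrd(comment,     '&', '&amp;');
  comment_xml = tranwrd(comment_xml, '<', '&lt;');
  comment_xml = tranwrd(comment_xml, '>', '&gt;');
  comment_xml = tranwrd(comment_xml, '"', '&quot;');
run;

/* Hypothetical SDTM data: one row per vital-signs result */
data work.vs;
  length vstestcd $8 vsorres $20;
  input vstestcd $ vsorres $;
  datalines;
DIABP 80
SYSBP 120
DIABP 85
;
run;

/* Value-level metadata: one row per distinct test code found in the data */
proc sql;
  create table work.valuelevel as
  select distinct 'VS'      as dataset  length=8,
                  'VSORRES' as variable length=8,
                  vstestcd  as wherevalue
  from work.vs
  order by wherevalue;
quit;
```

The escaped text (COMMENT_XML here) is what would later be written with PUT statements; the raw text stays available for other uses of the metadata.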
How?
• Create XML file within SAS using:
> DATA steps
> FILE and PUT statements
XML Header + ODM root element
> Simple FILE and PUT statements
> Study-specific parameters can be read in using macro parameters or derived (e.g. the creation date/time)
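A minimal sketch of writing the XML header and the ODM root element with FILE and PUT statements is given here. The output path, the FileOID value and the stylesheet name are hypothetical, and the namespace and attribute values follow Define-XML 2.0.0 as an assumption; they should be checked against the version of the specification actually used.

```sas
/* Hypothetical study-specific parameters */
%let outfile = %sysfunc(pathname(work))/define.xml;
%let fileoid = DEFINE.STUDY01;

data _null_;
  file "&outfile" lrecl=32767;          /* no MOD option: start a new file */
  length cdt $19 fileoid $40;
  cdt     = put(datetime(), e8601dt19.); /* derived creation date/time, ISO 8601 */
  fileoid = "&fileoid";
  put '<?xml version="1.0" encoding="UTF-8"?>';
  put '<?xml-stylesheet type="text/xsl" href="define2-0-0.xsl"?>';
  put '<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"';
  put '     xmlns:xlink="http://www.w3.org/1999/xlink"';
  put '     xmlns:def="http://www.cdisc.org/ns/def/v2.0"';
  put '     ODMVersion="1.3.2"';
  put '     FileType="Snapshot"';
  /* +(-1) moves the pointer back over the blank that list output adds */
  put '     FileOID="' fileoid +(-1) '"';
  put '     CreationDateTime="' cdt +(-1) '">';
run;
```

The closing `</ODM>` tag is not written here; in this approach it would be the last line appended once all other sections have been written.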
Study + MetaDataVersion
> Simple FILE and PUT statements; use the MOD option on the FILE statement to append to the text file
> Study-specific parameters can be read in using macro parameters or derived (e.g. the study name and description)
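Appending the Study and MetaDataVersion elements might look like the sketch below. The macro variable values, the OID pattern and the standard name and version are hypothetical placeholders; the element and attribute names are assumed from Define-XML 2.0.0.

```sas
/* Hypothetical study-specific parameters */
%let outfile   = %sysfunc(pathname(work))/define.xml;
%let studyname = STUDY01;
%let studydesc = Hypothetical study description;

data _null_;
  file "&outfile" mod lrecl=32767;   /* MOD appends to the existing text file */
  length studyname $40 studydesc $200;
  studyname = "&studyname";
  studydesc = "&studydesc";
  put '<Study OID="' studyname +(-1) '">';
  put '  <GlobalVariables>';
  put '    <StudyName>' studyname +(-1) '</StudyName>';
  put '    <StudyDescription>' studydesc +(-1) '</StudyDescription>';
  put '    <ProtocolName>' studyname +(-1) '</ProtocolName>';
  put '  </GlobalVariables>';
  put '  <MetaDataVersion OID="MDV.' studyname +(-1) '.SDTM"';
  put '    Name="Study ' studyname +(-1) ', Data Definitions"';
  put '    def:DefineVersion="2.0.0"';
  put '    def:StandardName="SDTM-IG"';
  put '    def:StandardVersion="3.1.2">';
run;
```

The matching `</MetaDataVersion>` and `</Study>` tags would be appended after all definitions have been written.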
Linked supporting PDF documents (e.g. aCRF) + Leaf definitions
> Simple FILE and PUT statements with the MOD option
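The document links could be written along these lines. The dataset WORK.DOCS, the leaf IDs and the file names are hypothetical; the def:AnnotatedCRF, def:DocumentRef and def:leaf names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical list of supporting documents */
data work.docs;
  length leafid $20 href $50 title $60;
  infile datalines dlm='|';
  input leafid $ href $ title $;
  datalines;
LF.blankcrf|blankcrf.pdf|Annotated CRF
LF.reviewersguide|reviewersguide.pdf|Reviewers Guide
;
run;

/* Reference to the annotated CRF */
data _null_;
  file "&outfile" mod lrecl=32767;
  put '    <def:AnnotatedCRF>';
  put '      <def:DocumentRef leafID="LF.blankcrf"/>';
  put '    </def:AnnotatedCRF>';
run;

/* One def:leaf per document; the ID must match the leafID used in the references */
data _null_;
  set work.docs;
  file "&outfile" mod lrecl=32767;
  put '    <def:leaf ID="' leafid +(-1) '" xlink:href="' href +(-1) '">';
  put '      <def:title>' title +(-1) '</def:title>';
  put '    </def:leaf>';
run;
```

The two steps are shown together for brevity. Following the file structure listed earlier, the document references belong near the start of MetaDataVersion and the leaf definitions at the end, so in a full program the second step would run after all other definitions.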
Value List definitions (e.g. Findings Tests) and Where Clause definitions
> Example: VSORRES
- Create a dataset in SAS of value-level variables and their values.
- Output an ItemRef element for each value of VSTESTCD (e.g. DIABP) that applies to VSORRES, and reference a where clause.
- Output a def:WhereClauseDef element for each of those values, using the same where clause OID referenced in the ItemRef.
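The value list and where clause steps for VSORRES could be sketched as below. The dataset WORK.VALUELEVEL and the OID naming pattern (VL., IT., WC. prefixes) are hypothetical; the element and attribute names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical value-level metadata: one row per VSTESTCD value */
data work.valuelevel;
  length testcd $8;
  input testcd $;
  datalines;
DIABP
SYSBP
;
run;

/* Value list: one ItemRef per test code, each pointing at a where clause */
data _null_;
  set work.valuelevel end=last;
  file "&outfile" mod lrecl=32767;
  if _n_ = 1 then put '    <def:ValueListDef OID="VL.VS.VSORRES">';
  put '      <ItemRef ItemOID="IT.VS.VSORRES.' testcd +(-1)
      '" OrderNumber="' _n_ +(-1) '" Mandatory="No">';
  put '        <def:WhereClauseRef WhereClauseOID="WC.VS.VSTESTCD.' testcd +(-1) '"/>';
  put '      </ItemRef>';
  if last then put '    </def:ValueListDef>';
run;

/* Where clauses: the OID is built exactly as in the WhereClauseRef above */
data _null_;
  set work.valuelevel;
  file "&outfile" mod lrecl=32767;
  put '    <def:WhereClauseDef OID="WC.VS.VSTESTCD.' testcd +(-1) '">';
  put '      <RangeCheck SoftHard="Soft" def:ItemOID="IT.VS.VSTESTCD" Comparator="EQ">';
  put '        <CheckValue>' testcd +(-1) '</CheckValue>';
  put '      </RangeCheck>';
  put '    </def:WhereClauseDef>';
run;
```

Building both OIDs from the same variable keeps the reference and its definition in step, which is the point of using the same where clause name in both places.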
Value List definitions (e.g. for Supplemental Qualifiers) and Where Clause definitions
> Example: SUPPDM.QVAL
- Output an ItemRef element for each value of SUPPDM.QNAM (e.g. RACE1) and reference a where clause.
- Output a def:WhereClauseDef element for each value of SUPPDM.QNAM (e.g. RACE1), using the same where clause OID referenced in the ItemRef.
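For supplemental qualifiers the same pattern applies, with the QNAM values taken from the data. The sketch below assumes a hypothetical WORK.SUPPDM dataset and a hypothetical OID naming pattern; element names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical SUPPDM data */
data work.suppdm;
  length qnam $8 qval $40;
  infile datalines dlm='|';
  input qnam $ qval $;
  datalines;
RACE1|ASIAN
RACE2|WHITE
RACE1|WHITE
;
run;

/* One row per distinct QNAM value found in the data */
proc sort data=work.suppdm(keep=qnam) out=work.qnams nodupkey;
  by qnam;
run;

/* Value list for QVAL: one ItemRef per QNAM value */
data _null_;
  set work.qnams end=last;
  file "&outfile" mod lrecl=32767;
  if _n_ = 1 then put '    <def:ValueListDef OID="VL.SUPPDM.QVAL">';
  put '      <ItemRef ItemOID="IT.SUPPDM.QVAL.' qnam +(-1)
      '" OrderNumber="' _n_ +(-1) '" Mandatory="No">';
  put '        <def:WhereClauseRef WhereClauseOID="WC.SUPPDM.QNAM.' qnam +(-1) '"/>';
  put '      </ItemRef>';
  if last then put '    </def:ValueListDef>';
run;

/* Matching where clauses: QNAM equals the qualifier name */
data _null_;
  set work.qnams;
  file "&outfile" mod lrecl=32767;
  put '    <def:WhereClauseDef OID="WC.SUPPDM.QNAM.' qnam +(-1) '">';
  put '      <RangeCheck SoftHard="Soft" def:ItemOID="IT.SUPPDM.QNAM" Comparator="EQ">';
  put '        <CheckValue>' qnam +(-1) '</CheckValue>';
  put '      </RangeCheck>';
  put '    </def:WhereClauseDef>';
run;
```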
Dataset definitions (ItemGroup)
• Dataset definitions:
> Dataset definition:
- ItemGroupDef element
- Includes references to variable definitions
- Includes a reference to the actual data file
> Variable references:
- ItemRef element
> Dataset location:
- def:leaf element
• Attributes:
> Repeating:
- No for datasets with one record per subject
- Yes for datasets with more than one record per subject
> Mandatory (set on each variable reference):
- Yes for Required variables
- No for Expected or Permissible variables
> Domain:
- Name of the domain
> IsReferenceData:
- Yes for reference datasets (no subjects)
- No for subject-level datasets
> Example: DM
- Output an ItemGroupDef element for each domain and include an ItemRef element for each variable. Include a link to the dataset.
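A dataset definition for DM could be written as in this sketch. The metadata dataset WORK.VARMETA, its variables and the OID pattern are hypothetical, and the attribute values are illustrative; attribute names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical variable metadata for DM (normally imported from the specs) */
data work.varmeta;
  length variable $8 mandatory $3 keyseq 8;
  input variable $ mandatory $ keyseq;
  datalines;
STUDYID Yes 1
USUBJID Yes 2
AGE No .
;
run;

data _null_;
  set work.varmeta end=last;
  file "&outfile" mod lrecl=32767;
  /* Opening tag and description once per dataset */
  if _n_ = 1 then do;
    put '    <ItemGroupDef OID="IG.DM" Domain="DM" Name="DM" SASDatasetName="DM"';
    put '      Repeating="No" IsReferenceData="No" Purpose="Tabulation"';
    put '      def:Structure="One record per subject" def:Class="SPECIAL PURPOSE"';
    put '      def:ArchiveLocationID="LF.DM">';
    put '      <Description><TranslatedText xml:lang="en">Demographics</TranslatedText></Description>';
  end;
  /* One ItemRef per variable; the trailing @ holds the line for the optional attribute */
  put '      <ItemRef ItemOID="IT.DM.' variable +(-1) '" OrderNumber="' _n_ +(-1)
      '" Mandatory="' mandatory +(-1) '"' @;
  if keyseq ne . then put ' KeySequence="' keyseq +(-1) '"' @;
  put '/>';
  /* Link to the XPT file, then close the dataset definition */
  if last then do;
    put '      <def:leaf ID="LF.DM" xlink:href="dm.xpt">';
    put '        <def:title>dm.xpt</def:title>';
    put '      </def:leaf>';
    put '    </ItemGroupDef>';
  end;
run;
```

With metadata for several domains, the same step could use BY-group processing (FIRST. and LAST. flags on the dataset name) instead of `_n_ = 1` and `end=last`.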
Variable definitions (ItemDef)
• Variable definitions:
> Variable definition:
- ItemDef element
> Other elements:
- Controlled terminology: CodeList
- Value-level metadata: def:ValueListDef
- Computational method: MethodDef
- Comments: def:CommentDef
- Origin: def:Origin
• Data types:
> text
> integer
> float
> datetime
> date
> time
> partialDate
> partialTime
> incompleteDatetime
> durationDatetime
> Example: DM
- Output an ItemDef element for each variable. For CRF page origins, include a link to the BLANKCRF and the page number. This comes from the metadata in the specs.
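Variable definitions with a CRF page origin might be written as sketched here. The metadata dataset, the page number and the leaf ID LF.blankcrf are hypothetical; the def:Origin, def:DocumentRef and def:PDFPageRef names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical variable metadata for DM; CRFPAGE is blank for non-CRF origins */
data work.varmeta;
  length variable $8 label $40 datatype $10 varlen 8 origin $10 crfpage $10;
  infile datalines dsd dlm='|' missover;
  input variable $ label $ datatype $ varlen origin $ crfpage $;
  datalines;
SEX|Sex|text|1|CRF|7
AGE|Age|integer|8|Derived|
;
run;

data _null_;
  set work.varmeta;
  file "&outfile" mod lrecl=32767;
  put '    <ItemDef OID="IT.DM.' variable +(-1) '" Name="' variable +(-1)
      '" DataType="' datatype +(-1) '" Length="' varlen +(-1)
      '" SASFieldName="' variable +(-1) '">';
  put '      <Description><TranslatedText xml:lang="en">' label +(-1)
      '</TranslatedText></Description>';
  /* CRF origins link to the annotated CRF leaf and carry the page number */
  if origin = 'CRF' then do;
    put '      <def:Origin Type="CRF">';
    put '        <def:DocumentRef leafID="LF.blankcrf">';
    put '          <def:PDFPageRef PageRefs="' crfpage +(-1) '" Type="PhysicalRef"/>';
    put '        </def:DocumentRef>';
    put '      </def:Origin>';
  end;
  else put '      <def:Origin Type="' origin +(-1) '"/>';
  put '    </ItemDef>';
run;
```

The OID written here has to match the ItemOID used in the ItemRef of the dataset definition.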
Code List definitions (CodeList)
> Example: AE.AESEV
- Output a CodeListRef element in the ItemDef element of each variable that has a codelist.
- Output a CodeList element for each codelist and include each value as a coded value. Use the same codelist OID as referenced in the ItemDef element.
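A codelist such as the one for AESEV could be written with BY-group processing, as in the sketch below. The dataset WORK.CODELIST, its values and the CL. OID prefix are hypothetical; element names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical codelist metadata, sorted by codelist name */
data work.codelist;
  length codelist $20 codedvalue $20 decode $40;
  infile datalines dlm='|';
  input codelist $ codedvalue $ decode $;
  datalines;
AESEV|MILD|Mild
AESEV|MODERATE|Moderate
AESEV|SEVERE|Severe
;
run;

data _null_;
  set work.codelist;
  by codelist;
  file "&outfile" mod lrecl=32767;
  /* Open the CodeList on the first row of each codelist */
  if first.codelist then
    put '    <CodeList OID="CL.' codelist +(-1) '" Name="' codelist +(-1)
        '" DataType="text">';
  put '      <CodeListItem CodedValue="' codedvalue +(-1) '">';
  put '        <Decode><TranslatedText xml:lang="en">' decode +(-1)
      '</TranslatedText></Decode>';
  put '      </CodeListItem>';
  /* Close it on the last row */
  if last.codelist then put '    </CodeList>';
run;
```

In the variable definition for AESEV, the matching reference under this naming pattern would be `<CodeListRef CodeListOID="CL.AESEV"/>` inside the ItemDef element.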
Methods definitions (MethodDef)
> Example: EG.EGDRVFL
- For variables with a method attached to them, include the MethodOID in the ItemRef element within the ItemGroupDef element.
- Output a MethodDef element for each method. Use the same OID as referenced in the ItemRef element. Include a link to the Complex Algorithms document if required.
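Method definitions could be generated as sketched below. The dataset WORK.METHODS, the method text, the MT. OID prefix and the leaf ID are hypothetical; element names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical method metadata; DOCLEAF is blank when no document is linked */
data work.methods;
  length methodname $20 methodtext $200 docleaf $30;
  infile datalines dsd dlm='|' missover;
  input methodname $ methodtext $ docleaf $;
  datalines;
EGDRVFL|Set to Y for derived results|LF.complexalgorithms
;
run;

data _null_;
  set work.methods;
  file "&outfile" mod lrecl=32767;
  put '    <MethodDef OID="MT.' methodname +(-1) '" Name="' methodname +(-1)
      '" Type="Computation">';
  put '      <Description><TranslatedText xml:lang="en">' methodtext +(-1)
      '</TranslatedText></Description>';
  /* Optional link to the Complex Algorithms document */
  if docleaf ne ' ' then
    put '      <def:DocumentRef leafID="' docleaf +(-1) '"/>';
  put '    </MethodDef>';
run;
```

Under this naming pattern, the ItemRef for EGDRVFL in the EG dataset definition would carry `MethodOID="MT.EGDRVFL"`.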
Comments definitions (CommentDef)
> Example: AE.AEENDY
- For variables with comments attached to them, include the def:CommentOID in the ItemDef element.
- Output a def:CommentDef element for each comment. Use the same OID as referenced in the ItemDef element.
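Comment definitions follow the same pattern. In this sketch the dataset WORK.COMMENTS, the comment text and the COM. OID prefix are hypothetical; element names are assumed from Define-XML 2.0.0.

```sas
%let outfile = %sysfunc(pathname(work))/define.xml;

/* Hypothetical comment metadata: one row per variable with a comment */
data work.comments;
  length dataset $8 variable $8 commenttext $200;
  infile datalines dlm='|';
  input dataset $ variable $ commenttext $;
  datalines;
AE|AEENDY|Study day of the AE end date relative to the reference start date
;
run;

data _null_;
  set work.comments;
  file "&outfile" mod lrecl=32767;
  put '    <def:CommentDef OID="COM.' dataset +(-1) '.' variable +(-1) '">';
  put '      <Description><TranslatedText xml:lang="en">' commenttext +(-1)
      '</TranslatedText></Description>';
  put '    </def:CommentDef>';
run;
```

The ItemDef for AEENDY would then carry `def:CommentOID="COM.AE.AEENDY"` so that the variable points at this comment.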
Summary
• Define.XML files help to describe what is in an electronic submission of data
• Define.XML is a user-friendly document when rendered using a stylesheet
• DATA steps and PUT statements can be used to create a Define.XML file in SAS from dataset, variable and value-level metadata together with codelist and method information