Weijin Gan
Manager, Macrostat
July 2012
Background
12 legacy studies of the same compound for a neurological disease; CSRs completed.
5 studies, Phase II - Phase III
7 studies, Phase I – Phase II
Objective: Convert legacy data to CDISC standards (SDTM/ADaM) for FDA submission.
(The submission includes 37 studies in total; 12 of them were done by our group.)
The challenges we face
• SDTM compliance
• ADaM compliance
• CSR validation
Data Flow
[Flow diagram. Inputs: raw data, blank CRF, SDTM IG, SAP, ADaM IG, legacy study CSR. Intermediate documents: annotated CRF, SDTM spec and ADaM spec (each an Excel mapping spreadsheet). Outputs: SDTM data, ADaM data, CSR QC report. Roles and reviews: statistician, programmer, CDISC expert review, sponsor review.]
Challenge - SDTM Compliance
• 2 core SDTM specs guide the SDTM mapping for all studies, supported by aCRF and SDTM spec reviews.
1) The SDTM IG provides very specific mapping rules and many examples.
2) Consult a CDISC expert for special cases.
3) Not every variable has a clear mapping solution in SDTM; some can be mapped in more than one way.
Challenge - SDTM Compliance
Case 1: --SP vs. --OTH
“specified” maps to --SP
“other, specify” maps to --OTH
Case 2: “Not submitted”
Challenge - SDTM Compliance
Case 3: CO vs. SUPP--
Comments on a domain could go to the CO domain or to the SUPP-- domain
Case 4: DA vs. EX
The Drug Accountability CRF may contain dosage information, which should go into the EX domain rather than the DA domain
Challenge - SDTM Compliance
• Validation tools: OpenCDISC, WebSDM
• All error messages from OpenCDISC reports need to be checked and resolved
• All warning messages from OpenCDISC should be reviewed and commented appropriately
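The rule above (errors must be resolved, warnings must carry a comment) can be tracked with a small triage script. This is a minimal sketch, not OpenCDISC's own report format: the `Finding` record, the rule IDs, and the messages are hypothetical and assume the validator report has already been exported into rows.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str        # hypothetical identifier, not a real validator rule ID
    severity: str       # "Error" or "Warning"
    message: str
    comment: str = ""   # reviewer's explanation, empty if none yet


def outstanding(findings):
    """Return the findings that still block sign-off.

    Errors always block (they have to be fixed in the data, after which
    they drop out of the next report). Warnings block until commented.
    """
    blocked = []
    for f in findings:
        if f.severity == "Error":
            blocked.append(f)
        elif f.severity == "Warning" and not f.comment.strip():
            blocked.append(f)
    return blocked


findings = [
    Finding("R001", "Error", "Missing required variable"),
    Finding("R002", "Warning", "Value not in codelist",
            comment="Sponsor-defined term, documented in define.xml"),
    Finding("R003", "Warning", "Unexpected variable order"),
]

for f in outstanding(findings):
    print(f.severity, f.rule_id, f.message)
```

Keeping the comment next to each warning makes it easy to carry the explanations forward when the validator is re-run on updated datasets.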
Challenge - SDTM Compliance
• Consistency checks across aCRF, SDTM spec, and SDTM datasets:
1) Variables in aCRF match SDTM spec
2) The list of Controlled Terminology matches the actual values in the datasets (Value level Metadata)
3) Variable role matches SDTM IG
4) Variable attributes match the datasets
The ultimate purpose is to keep the define.xml consistent with the SDTM.
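Checks of this kind can be sketched in a few lines. The example below is illustrative only and assumes the spec and the dataset metadata have already been read into plain dictionaries; the function name, the dictionary layout, and the sample values are assumptions, not the team's actual program.

```python
def check_consistency(spec_vars, dataset_vars, codelists, dataset_values):
    """Compare spec metadata with dataset metadata and values.

    spec_vars / dataset_vars: {variable name: (type, length)}
    codelists: {variable name: list of allowed terms}
    dataset_values: {variable name: list of values found in the data}
    """
    issues = []
    # Variables listed on one side only
    for name in sorted(set(spec_vars) - set(dataset_vars)):
        issues.append(f"{name}: in spec but not in dataset")
    for name in sorted(set(dataset_vars) - set(spec_vars)):
        issues.append(f"{name}: in dataset but not in spec")
    # Attributes of shared variables
    for name in sorted(set(spec_vars) & set(dataset_vars)):
        if spec_vars[name] != dataset_vars[name]:
            issues.append(
                f"{name}: spec {spec_vars[name]} != dataset {dataset_vars[name]}"
            )
    # Actual values against the controlled terminology list
    for name, allowed in sorted(codelists.items()):
        extra = sorted(set(dataset_values.get(name, [])) - set(allowed))
        if extra:
            issues.append(f"{name}: values not in codelist: {extra}")
    return issues


issues = check_consistency(
    spec_vars={"USUBJID": ("Char", 20), "SEX": ("Char", 1), "AGE": ("Num", 8)},
    dataset_vars={"USUBJID": ("Char", 20), "SEX": ("Char", 2), "AGEU": ("Char", 5)},
    codelists={"SEX": ["M", "F", "U"]},
    dataset_values={"SEX": ["M", "F", "Male"]},
)
for line in issues:
    print(line)
```

Because define.xml is generated from the spec, every issue listed here would otherwise surface as a mismatch between define.xml and the submitted datasets.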
Challenge - SDTM Compliance
Example of a consistency check across aCRF, SDTM spec, and SDTM
Challenge - ADaM Compliance
• Develop ADaM specification based on SAP
• ADaM spec underwent several internal and external reviews
• A SAS program checked about 183 rules from the ADaM IG,
such as:
1) Naming conventions
2) Variable attributes
3) Value-level metadata matches the ADaM dataset
4) Required variables in ADSL
5) Data Point Traceability
6) One-to-one mapping of variable value
……
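Three of the rule types listed above can be illustrated as follows. The original checks were written in SAS; this Python sketch is only an illustration. The naming pattern (at most 8 characters, starting with a letter) and the two-variable `ADSL_REQUIRED` set are simplifying assumptions, not the full rule set.

```python
import re

# Assumed naming rule: 1 to 8 characters, uppercase letters, digits or
# underscore, starting with a letter.
NAME_RE = re.compile(r"[A-Z][A-Z0-9_]{0,7}")

# Illustrative subset only; the real list of required ADSL variables is longer.
ADSL_REQUIRED = {"STUDYID", "USUBJID"}


def bad_names(variables):
    """Return variable names that violate the assumed naming rule."""
    return [v for v in variables if not NAME_RE.fullmatch(v)]


def missing_required(adsl_vars):
    """Return required ADSL variables that are absent."""
    return sorted(ADSL_REQUIRED - set(adsl_vars))


def not_one_to_one(rows, a, b):
    """Return (a, b) value pairs that break a one-to-one mapping."""
    a_to_b, b_to_a = {}, {}
    bad = set()
    for row in rows:
        x, y = row[a], row[b]
        # setdefault stores the first partner seen and returns it afterwards
        if a_to_b.setdefault(x, y) != y or b_to_a.setdefault(y, x) != x:
            bad.add((x, y))
    return sorted(bad)


rows = [
    {"PARAMCD": "SYSBP", "PARAM": "Systolic Blood Pressure (mmHg)"},
    {"PARAMCD": "DIABP", "PARAM": "Diastolic Blood Pressure (mmHg)"},
    {"PARAMCD": "SYSBP", "PARAM": "Systolic BP (mmHg)"},
]
print(bad_names(["USUBJID", "TRT01P", "avisit", "PARAMCD_LONG", "1STDOSE"]))
print(missing_required(["USUBJID", "AGE"]))
print(not_one_to_one(rows, "PARAMCD", "PARAM"))
```

The same one-to-one helper would apply to other paired variables, such as AVISIT and AVISITN.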
Challenge - ADaM Compliance
Example of ADaM compliance check
Challenge - CSR Validation
• Replicate key tables of the CSR
• Investigate and document discrepancies between the output from the CDISC datasets and the CSR tables. Create a QC report for each study.
Sample of QC report
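At its core, a QC report of this kind is a cell-by-cell comparison between the CSR table and the table replicated from the CDISC datasets. The sketch below is a hypothetical illustration, not the report shown on the original slide; the cell keys and the counts are invented for the example.

```python
def compare_tables(csr, replicated, tol=0.0):
    """Compare two tables given as {cell key: numeric value}.

    Returns one row per cell: (key, CSR value, replicated value, status),
    where status is MATCH, DIFF, or MISSING (cell absent on one side).
    """
    rows = []
    for key in sorted(set(csr) | set(replicated)):
        a, b = csr.get(key), replicated.get(key)
        if a is None or b is None:
            status = "MISSING"
        elif abs(a - b) <= tol:
            status = "MATCH"
        else:
            status = "DIFF"
        rows.append((key, a, b, status))
    return rows


# Invented cells: (table, row label, column label) -> count
csr = {
    ("AE", "Headache", "Drug"): 12,
    ("AE", "Nausea", "Drug"): 7,
    ("AE", "Dizziness", "Drug"): 3,
}
replicated = {
    ("AE", "Headache", "Drug"): 12,
    ("AE", "Nausea", "Drug"): 8,
}

report = compare_tables(csr, replicated)
for row in report:
    print(row)
```

Every DIFF or MISSING row then needs an investigated and documented explanation in the study's QC report.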
Challenge - CSR Validation
• Without the original program code and analysis dataset, investigating discrepancies was very difficult.
Possible reasons for discrepancies:
1) Different raw data
2) Different version of the WHO drug dictionary / MedDRA
3) Algorithm applied differently, e.g. visit mapping, LOCF, imputation of missing values
4) Data issues
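To illustrate reason 3, two reasonable readings of "LOCF" already give different analysis values. The sketch below is a made-up example, not the algorithm from any of the legacy studies: the `carry_baseline` switch is a hypothetical option that controls whether the baseline value may be carried into post-baseline visits.

```python
def locf(values, carry_baseline=True):
    """Last observation carried forward.

    values: observations ordered by visit, first entry is baseline,
    None marks a missing value. With carry_baseline=False the baseline
    value is never used to fill a later visit.
    """
    out = []
    last = None
    for i, v in enumerate(values):
        if v is not None:
            if i > 0 or carry_baseline:
                last = v          # this value may fill later gaps
            out.append(v)
        else:
            out.append(last)      # stays None if nothing can be carried
    return out


observed = [10.0, None, 12.0, None]
print(locf(observed, carry_baseline=True))
print(locf(observed, carry_baseline=False))
```

Without the original programs, a difference like the one at the second visit can only be traced by trying each plausible variant against the CSR numbers.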
Tremendous effort was spent to keep SDTM/ADaM consistent across the studies:
1. Core spec prepared at the beginning
2. Core team reviewed aCRF/spec across studies to check consistency
3. A SAS program checks the consistency of variable names and attributes across studies
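The cross-study check in step 3 was done in SAS; the idea can be sketched in Python as follows. The study identifiers and the dictionary layout are hypothetical, and the metadata is assumed to have been extracted from each study's datasets beforehand.

```python
from collections import defaultdict


def inconsistent_attributes(studies):
    """Find variables whose attributes differ between studies.

    studies: {study id: {variable name: (type, length, label)}}
    Returns {variable name: {attributes: [study ids using them]}} for
    every variable that appears with more than one attribute set.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for study, variables in studies.items():
        for name, attrs in variables.items():
            seen[name][attrs].append(study)
    return {
        name: dict(by_attrs)
        for name, by_attrs in seen.items()
        if len(by_attrs) > 1
    }


studies = {
    "STUDY01": {"USUBJID": ("Char", 20, "Unique Subject Identifier"),
                "AESEV": ("Char", 8, "Severity/Intensity")},
    "STUDY02": {"USUBJID": ("Char", 20, "Unique Subject Identifier"),
                "AESEV": ("Char", 10, "Severity/Intensity")},
}

diffs = inconsistent_attributes(studies)
for name, by_attrs in diffs.items():
    print(name, by_attrs)
```

Variables that appear in only one study are not flagged here; comparing each study against the core spec would catch those.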
Lesson Learned
• Checking SDTM/ADaM compliance still relies on manual review, especially in the early stages
Although tools such as OpenCDISC and WebSDM can check compliance, they only work after the SDTM/ADaM datasets have been created. Spotting issues early helps avoid rework after SDTM/ADaM creation. Finding major issues at early stages, e.g. in aCRF mapping or data structure, relies on the statistician's knowledge of CDISC. Tools help a lot, but full automation of compliance checks is not always possible.
Lesson Learned
• With general CDISC knowledge, the mapping rules for some special domains or endpoints can be learned quickly from a CDISC expert
• Additional effort is needed to QC and maintain the SDTM/ADaM spec if define.xml will be created.
Lesson Learned
• Preparing the programming team with CDISC knowledge is essential
Some useful knowledge:
ISO 8601 date format
Controlled Terminology: extensible vs. non-extensible codelists
--SEQ based on the keys specified in the CDISC Standards Library
Timing variables for BDS datasets: ADY, ADT, AVISIT
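Three of these items lend themselves to a short illustration: partial ISO 8601 dates, relative study day with no day 0, and sequence numbers derived from sort keys. This is a minimal sketch under stated assumptions; the helper names are invented, the reference date is whatever the spec defines, and the sort keys passed to `assign_seq` are examples rather than the keys from the standard.

```python
from datetime import date


def iso8601_partial(year=None, month=None, day=None):
    """Format a possibly incomplete date, keeping only the known parts."""
    if year is None:
        return ""
    if month is None:
        return f"{year:04d}"
    if day is None:
        return f"{year:04d}-{month:02d}"
    return f"{year:04d}-{month:02d}-{day:02d}"


def relative_day(adt, ref):
    """Study day relative to ref: the reference date is day 1, no day 0."""
    delta = (adt - ref).days
    return delta + 1 if delta >= 0 else delta


def assign_seq(records, keys):
    """Number records 1..n within each USUBJID after sorting by keys."""
    counters = {}
    out = []
    ordered = sorted(
        records, key=lambda r: (r["USUBJID"],) + tuple(r[k] for k in keys)
    )
    for rec in ordered:
        n = counters.get(rec["USUBJID"], 0) + 1
        counters[rec["USUBJID"]] = n
        out.append({**rec, "SEQ": n})
    return out


records = [
    {"USUBJID": "001", "AESTDTC": "2012-07-03", "AETERM": "Nausea"},
    {"USUBJID": "001", "AESTDTC": "2012-07-01", "AETERM": "Headache"},
    {"USUBJID": "002", "AESTDTC": "2012-07-02", "AETERM": "Headache"},
]
print(iso8601_partial(2012, 7))
print(relative_day(date(2012, 6, 30), date(2012, 7, 1)))
for rec in assign_seq(records, ["AESTDTC", "AETERM"]):
    print(rec["USUBJID"], rec["SEQ"], rec["AETERM"])
```

Sorting ISO 8601 strings as text works here because all dates are complete; partial dates would need an explicit ordering rule.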
Questions?