Regulatory Submission Datasets in the World of Evolving Standards

Dave Christiansen, DrPH
Christiansen Consulting, CDISC Founding Director
“Safety and the Critical Path”
2005 FDA/Industry Statistics Workshop, Sept 14-16, 2005, Washington, DC
© Copyright 2005, David H. Christiansen

Acknowledgments
• Sally Cassel, Lincoln Technologies
• Kaye Fendt, Data Quality Research Institute
• Wayne Kubick, Lincoln Technologies
• Rebecca Kush, CDISC
• Randy Levin, FDA
• Bob O’Neill, FDA
• Bill Qubeck, Pfizer
• Norm Stockbridge, FDA
• Steve Wilson, FDA
• CDISC ADaM and SDS Teams

Disclaimer
Views expressed in this presentation are those of the speaker and not necessarily those of the Food and Drug Administration, CDISC, or any other organization.

State of Clinical Trial Research - 1995
• FDA-regulated products accounted for about 25 cents of every consumer dollar spent in the United States
• Yet each company established its own clinical trials content standards, independent of other companies in the industry
• Technological advances were available to make the submission and review process more efficient

Challenges for Adoption of Standards
• Requirements for our industry not clearly defined and articulated
• Past efforts not focused on overall clinical data management requirements
• Organizations have focused on internal standards vs. industry-level standards; resistance to sharing internal standards
• May require a change in process for the organization
• Clinical data standards must accommodate the scientific context and complexity inherent in clinical research

Multiple Organizations with Shifting Standards
[Diagram: several organizations, each with its own operational database (A, B, CRO), statistical analysis (SAS or S+), and output and report, linked by in-licensing and out-licensing. Source: Kaye Fendt, 2001]

Initial Solutions to the Problem
• Remote Data Entry (RDE) processes emerged in the 1970s, but languished for 20 years without significantly impacting the clinical trials arena
• CANDAs/CAPLAs – too many different standards
• The FDA CARS (Computer Assisted Review of Safety) and SMART (Submission Management and Review Tracking) initiatives took initial steps to develop electronic review tools

Computer Assisted NDAs (CANDAs) and Computer Assisted Product License Applications (CAPLAs)
[Diagram: the CANDA from Company A, the CANDA from Company B, and the CAPLA from BioTech X each bundle their own Operational Database A with statistical analysis in SAS and Operational Database B with statistical analysis in S+.]

SMART Initiative at FDA
[Diagram: operational databases from Companies A, B, and C pass through conversion to the FDA standard database structure, producing an FDA standard database used by FDA tools. Source: Kaye Fendt, 2001]

Regulatory Environment
• Applicants were required to provide case report tabulations (CRTs) with submissions – 21 CFR 314.50
• Clinical review was primarily a paper-based task, even for CRTs
• 1992 PDUFA – the initial Prescription Drug User Fee Act added time-commitment pressures on reviewers

Regulatory Environment (cont.)
• Constant pressure for FDA scientists to make the “right” decisions in a timely fashion
• ICH / ESTRI discussions
• Electronic submission of CRFs/CRTs
• Guidance for Industry on Electronic Submissions – General Considerations, 1999

Setting was Perfect for …
• Development and acceptance of clinical trials content standards
• Industry acceptance and participation in standards development
• Regulatory participation in the standards development process

The Solution(s)
• 1990s: Electronic Data Capture (EDC) tools re-emerged as a serious interest
• 21 CFR 11 published in March 1997
• CDISC started in 1997
• FDA Guidance for Industry: Computerized Systems Used in Clinical Trials, published in April 1999
• FDA Guidance for Industry: Electronic Submission of NDAs/BLAs, 1999

A Shared Vision
[Diagram: data standards shared among Pharma, Biotech, CROs, Labs, Tech/Software, Other Vendors, Regulatory, and the Public. Source: Steve Wilson, 2002]

CDISC History
• Began in 1997 as a volunteer organization
• DIA Special Interest Area Community (SIAC) from 1998-1999
• Incorporated as a non-profit organization in 2000
• Members and sponsors today include over 150 companies (biopharmaceuticals, CROs, academic institutions, IT providers, etc.)
• Global reach, with CDISC Coordinating Committees in Europe and Japan

Clinical Data Interchange Standards Consortium (CDISC)
CDISC is an open, multidisciplinary, non-profit organization committed to the development of worldwide industry standards to support the electronic acquisition, exchange, submission and archiving of clinical trials data and metadata for medical and biopharmaceutical product development.
The CDISC mission is to lead the development of global, vendor-neutral, platform-independent standards to improve data quality and accelerate product development in our industry.

CDISC Collaborations with Food and Drug Administration (FDA)
• Liaisons on SDS, ADaM, SEND, Protocol Representation Teams
• SDTM referenced in eCTD Study Data Specification – July 2004
• Analysis Dataset Guidance under development at FDA with input from ADaM
• DEFINE.XML for SDTM submission metadata referenced in eCTD Study Data Specification – March 2005
• Co-chair of HL7 RCRIM Technical Committee with CDISC and HL7
(Source: Rebecca Kush, 2004)

FDA Cooperative Research and Development Agreement (CRADA)
• Warehouse physical design – IBM CRADA
• Data loader, front-end for reviewer – Lincoln Technologies CRADA
• Patient Profile Viewer – PPD Informatics CRADA
• Integrating animal tox data – PharmQuest CRADA
(Source: Randy Levin, 2004)

Alphabet Soup
ICH – International Conference on Harmonisation
• ICH has developed a Common Technical Document (CTD) that provides a harmonised structure and format for new product applications
• FDA has a draft guidance on an Electronic Common Technical Document (eCTD), including structures for datasets and programs
• ICH E3, E6 and E9 provide some models
• XML allows navigation and “smart” datasets

More Alphabet Soup
HIPAA - Health Insurance Portability and Accountability Act of 1996
• Standards for the electronic exchange, privacy and security of health information
• Collectively these are known as the Administrative Simplification provisions
HL7 - Health Level 7
• Electronic messaging standards for medical practice data
• HHS supports a standardized model of an electronic health record
• FDA is a sponsor
• CDISC and HL7 have a formal affiliation

More Alphabet Soup
SNOMED - Systematized Nomenclature of Medicine
• Purchased by HHS for $34M
• National Library of Medicine will make it available without charge throughout the U.S.
XML - eXtensible Markup Language
• Used by ICH for the electronic Common Technical Document (eCTD) backbone
• Used by FDA for the electronic Table of Contents (eTOC)
• Proposed by CDISC and FDA to replace PDF for metadata (DEFINE.XML)

More Alphabet Soup
JANUS
• Janus is intended to capture all clinical data collected from a clinical trial, along with enough of a machine-interpretable description of the study protocol to permit a high degree of automated analysis
• A database with structured data that will utilize tools being developed for FDA medical reviewers
• FDA specified vertical data structures for SDTM V3.1 datasets
• SDTM (and Janus) currently explicitly exclude Statistical Analysis Datasets

Primary Reviewer Tasks Involving Submission Datasets
Statisticians
• Replicate analyses
• Test assumptions
• Perform alternative analyses
Medical Reviewers
• View data used for a specific table
• View patient profiles
Auditors
• Compare source data values to CRFs or source documents
• Verify derivations

Submission Dataset Concepts
Datasets and documentation should be adequate to allow reviewers to answer the following questions:
(1) Do the submitted data and documentation clearly describe the conduct and results of the trial? (Can the reviewer understand the data and results?)
(2) Is the clinical evidence of sufficient quality to ensure that the reported results are accurate and true? (Does the reviewer believe the data and results?)

CDISC Data Models and the Clinical Trial Research Process, with Drafts as of May 2005
Data Sources
• Site CRFs
• Laboratories
• Contract Research Organizations
• Development Partners
Operational Data Interchange: ODM, LAB
Operational Database
• Metadata
• Study Data
• Audit Trail
• Archive
Submission Data Interchange: SMM, SDTM, ADaM, SEND
Regulatory Submission Datasets
• Machine Readable Metadata (Partial)
• Study Data Tabulations
• Statistical Analysis Datasets
• SEND
Abbreviations: ODM = Operational Data Model; LAB = Laboratory Data Model; SEND = Standards for the Exchange of Non-clinical Data; SMM = Submission Metadata Model; SDS = Submission Data Standards; ADaM = Analysis Dataset Models

Evolution of Case Report Tabulations
• Code of Federal Regulations: 21 CFR 314.50
• 1988 Guideline on the Statistical Sections
• 1997 Guidance on Archiving Data: 21 CFR 11
• 1999 Guidance on Providing Regulatory Submissions in Electronic Format
• ICH E3 - Structure and Content of Clinical Study Reports
• ICH Common Technical Document
• eCTD and Study Data Specification
• Guidance for Review Staff and Industry - Good Review Management Principles and Practices for PDUFA Products

Regulation and Guidance: Case Report Tabulations (CRTs)
21 CFR 314.50(f)(1)
• “The tabulations are required to include the data on each patient in each study, except that the applicant may delete those tabulations which the agency agrees, in advance, are not pertinent to a review of the drug's safety or effectiveness.”
The 1988 Guideline for the Format and Content of the Clinical and Statistical Sections of an Application defines CRTs as:
• “These case report tabulations contain, in an organized fashion, essentially all data (efficacy, safety, pharmacology) collected in the case report.”
• “…being entirely comprehensive, (they) serve as an archival or reference document, not as listings suitable for ordinary review.”
• “These tabulations are distinct from, and more extensive than, the tabulations of individual patient data called for as parts of the full reports of controlled clinical studies…”

Guidance: Data Listings
The 1988 Guideline defines patient data listings as:
• Demographic and baseline data, effectiveness data, and safety data from “full reports of controlled clinical studies and the safety portions of reports of all studies.”
• The data listings requested as part of the report (in an appendix to it) are focused on the particular variables critical to the analyses carried out, allowing the reviewer to examine the individual patient data underlying critical group measurements.
• These report listings are generally “subsets of relevant effectiveness and safety variables used in analyses and tables.”

1997 NDA Guidance: Archiving Submissions in Electronic Format
• 21 CFR Part 11 - Electronic Records; Electronic Signatures regulation provides for the voluntary submission of parts or all of an application in electronic format
• Case Report Tabulations may be submitted as PDF files in two forms:
  • Domain Profiles - commonly referred to as patient line listings or patient data listings, domain profiles consist of all data collected for a CRF domain (such as demographics, vital signs, labs, efficacy measures) from one study
  • Patient Profiles - one or more pages that contain all of the study data collected for an individual patient

1999 NDA Guidance: Providing Regulatory Submissions in Electronic Format
• Each dataset is a single SAS transport file and, in general, includes a combination of raw and derived data
• Each CRF domain (e.g., demographics, vital signs, adverse events) should be provided as a single dataset
• In addition, datasets suitable for reproducing and confirming analyses may also be needed
• Patient profiles can also be provided as PDF files

Common Usage of CRT until 2003
• CRTs were interpreted by many (including CDISC) as the CRF domain datasets
• Analysis datasets were not CRTs
• Listings were defined by some as the printed or PDF representation of a dataset with some additional “selection” variables
• There was no clear distinction between CRTs and data listings for datasets
In 2003 FDA interpreted 21 CFR 314.50(f)(1) as defining CRTs to include:
• Study Data Tabulations
• Statistical Analysis Datasets
• Data Listings
• Patient Profiles

International Conference on Harmonisation (ICH): “E3 Structure and Content of Clinical Study Reports”
ICH E3 study reports provide for:
• Selected Patient Data Listings (Appendix 16.2), including discontinued patients, protocol deviations, exclusions, demography, compliance, AEs, etc.
• Individual Patient Data Listings (Appendix 16.4): “Data listings (tabulations) of patient data utilized by the sponsor for statistical analyses and tables supporting conclusions and major findings. These data listings are necessary for the regulatory authority's statistical review, and the sponsor may be asked to supply these patient data listings in a computer-readable form.”

FDA Guidances Relating to the ICH Common Technical Document (CTD)
• M4: Common Technical Document for the Registration of Pharmaceuticals for Human Use
• M2: eCTD: Electronic Common Technical Document Specification
• ICH E3: Structure and Content of Clinical Study Reports
• Draft FDA eCTD Guidance: Providing Regulatory Submissions in Electronic Format - Human Pharmaceutical Product Applications and Related Submissions
  • This guidance makes recommendations regarding the use of the eCTD document information backbone files described in ICH M2 and M4 and the clinical study report content described in ICH E3

Draft eCTD Guidance: Case Report Tabulations
• Data tabulations
  • Data tabulation datasets
  • Data definitions
• Data listings
  • Data listing datasets
  • Data definitions
• Analysis datasets
  • Analysis datasets
  • Analysis programs
  • Data definitions
• Subject profiles
• IND safety reports

eCTD Study Data Specifications V 1.1 March, 2005
“Data tabulations are datasets in which each record is a single observation for a subject.”
• Specifications are located in the Study Data Tabulation Model (SDTM) developed by CDISC at www.cdisc.org/models/sds/v3.1/index.html.
• Each dataset is provided as a SAS Transport (XPORT) file.
“Data listings are datasets in which each record is a series of observations collected for each subject during a study or for each subject for each visit during the study organized by domain.”
• Currently, there are no further specifications for organizing data listing datasets.
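The difference between the two quoted record structures can be illustrated with a small sketch. It is not taken from the specification: the variable names follow SDTM vital-signs conventions (USUBJID, VISIT, VSTESTCD, VSORRES), but the subjects, visits, and values are invented, and `to_listing` is a hypothetical helper. It assumes a tabulation holds one observation per record and that a listing gathers the observations for each subject at each visit into one record.

```python
from collections import defaultdict

# Hypothetical vital-signs tabulation: one record per observation.
# Variable names follow SDTM conventions; all values are invented.
tabulation = [
    {"USUBJID": "001", "VISIT": "BASELINE", "VSTESTCD": "SYSBP", "VSORRES": "120"},
    {"USUBJID": "001", "VISIT": "BASELINE", "VSTESTCD": "DIABP", "VSORRES": "80"},
    {"USUBJID": "001", "VISIT": "WEEK 4", "VSTESTCD": "SYSBP", "VSORRES": "118"},
    {"USUBJID": "002", "VISIT": "BASELINE", "VSTESTCD": "SYSBP", "VSORRES": "135"},
]


def to_listing(records):
    """Pivot one-observation-per-record data into one record per subject per visit."""
    rows = defaultdict(dict)
    for rec in records:
        key = (rec["USUBJID"], rec["VISIT"])
        # Each test code becomes its own column in the listing record.
        rows[key][rec["VSTESTCD"]] = rec["VSORRES"]
    return [
        {"USUBJID": subject, "VISIT": visit, **tests}
        for (subject, visit), tests in sorted(rows.items())
    ]


listing = to_listing(tabulation)
for row in listing:
    print(row)
```

Four tabulation records collapse into three listing records here. A test that was not performed at a visit simply has no column in that record, which is one of the decisions a real listing layout would have to make explicitly.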
• General information about creating datasets can be found in the SDTM implementation guides referenced in the data tabulation dataset specifications.
• Each dataset is provided as a SAS Transport (XPORT) file.

eCTD Study Data Specifications V 1.1 March, 2005 (cont.)
“Analysis datasets are datasets created to support specific analyses. Programs are scripts used with selected software to produce reported analyses based on these datasets.”
• Each dataset is provided as a SAS Transport (XPORT) file.
• Programs should be provided as both ASCII text and PDF files and should include sufficient documentation to allow a reviewer to understand the submitted programs.
• It is not necessary to provide analysis datasets and programs that will enable the reviewer to directly reproduce reported results using agency hardware and software.
• Currently, there are no other specifications for creating analysis datasets.
“Subject profiles are displays of study data of various modalities collected for an individual subject and organized by time.”
• Each individual patient’s complete patient profile is in a single PDF file or a book-marked section of a single PDF file for all patients.

So what are CRTs?
• Original regulation was written in the era of paper submissions
• At one point, CRTs were collected or raw data
• Currently defined as all data submitted
• No clear distinction between data tabulations and listings
• No clear distinction between derived variables on data tabulations and analysis datasets

Statistical Review of Clinical Trials Data
• Efficacy and safety
• Confirmatory/exploratory – focus on evaluating sponsor’s results
• Check appropriateness of statistical models and conclusions – programs & analysis datasets
• Assess quality/completeness of data
• Evaluate the impact of sponsor’s analytical decisions – derived variables, missing/messy data (“quirks” – R. Helms) – sensitivity analyses
• Answer new, review-related statistical questions
• Communication with sponsors
• Archive results
(Source: Steve Wilson, 2005)

Statistical Review Environment
• No programmers
• Multiple projects
• Increasingly electronic world
• Understaffed
• Without documentation standards, every review is an adventure
(Source: Steve Wilson, 2005)

Submission Files
CRTs: data submitted to FDA
• Data Tabulations: observations in SDTM standard format
• Data Listings: domain views by subject, by visit
• Patient Profiles: complete view of all subject data
• Analysis Files: custom datasets to support an analysis
• Define: metadata description document
(Source: Steve Wilson, 2005)

SDTM & Analysis Files: Today’s Mantra
BOTH ARE NEEDED FOR REVIEW! (for now)
(Source: Steve Wilson, 2005)

Specifications: eCTD File Organization
[Figure not recoverable from the extracted text. Source: Steve Wilson, 2005]

SDTM & Analysis Datasets
• Currently, SDTM describes observations from a clinical trial
• SDTM data (with appropriate tools) are particularly useful in medical officer evaluation of safety
• It is well recognized that datasets used in the analysis have been restructured and contain additional information (derived variables, flags, comments, etc.)
• To facilitate communication between statistical reviewers and sponsors, there is a need to standardize the documentation and content of these datasets
• The CDISC ADaM Team has a guidance describing the documentation of analysis files
(Source: Steve Wilson, 2005)

Goals of Draft Guidance: Datasets & Documentation Designed for Review
• Enable reviewers to understand, replicate, explore, confirm, reuse, etc.
• Clear, unambiguous communication of decisions, analysis and results
• Underlying principles:
  • Can a reviewing statistician understand?
  • Can a reviewing statistician efficiently quality assure, validate, and analyze?
(Source: Steve Wilson, 2005)

Draft Guidance: Standard Metadata/Documentation
1. Analysis
2. Analysis Datasets
3. Analysis Variables
(Source: Steve Wilson, 2005)

Challenges
• Still need to get reviews done
• Transitioning from/adapting to current industry practice: next steps vs. “vision”
• Getting experience
• Work with minimal resources
• Good review practice
• Moving target – efficacy and safety
• Adapting to change – training/communication/resources/tools
• Science
• Communication: external and internal
• Maintaining/improving collaboration
(Source: Steve Wilson, 2005)

Good Review Management Principles and Practices for PDUFA Products
• New guidance for FDA review
• Defines FDA reviewing steps:
  • Application completeness
  • Pre-submission
  • Application receipt
  • Filing
  • Review planning
  • Review
  • Advisory Committee
  • Wrap-up and labeling
  • Action

Application Completeness
• “A complete application will receive a comprehensive and complete review within a specified time frame.”
• Must be readable and well organized
• Should eliminate the need for unplanned amendments
• Incomplete if it “meets the regulatory criteria for filing but lacks important information needed to complete the review and regulatory decision-making process, is disorganized, or does not conform to the recommended format for electronic submissions.”

Evolution of Analysis-Level Metadata from Statistical Models
• ANALYSIS NAME – A unique identifier for this analysis.
• DESCRIPTION – A text description of the contents of the display. This will normally contain more information than the title of the display.
• REASON – The rationale or authority for performing the analysis. Suggested controlled terminology will facilitate classification and searching.
• DATASET – The name of the analysis dataset(s) used, linked to the dataset file(s) for this analysis. May also include the specific selection criteria that identify the records selected for this analysis.
• DOCUMENTATION – Contains the information about how the analysis was performed.

Analysis-Level Metadata (cont.)
DOCUMENTATION – Contains the information about how the analysis was performed.
• Could be a text description, or a link to other documents:
  • Protocol
  • Statistical Analysis Plan (SAP)
  • Analysis generation program (i.e., a statistical software program used to generate the analysis result)
• Contents will depend on:
  • The level of detail required to describe the analysis
  • Whether or not the sponsor will be providing a corresponding analysis generation program
  • Sponsor-specific requirements and standards

Analysis Metadata Example
Subject Characteristics by Assigned Treatment Group for ITT Population

                                  Placebo        Active         Total
Number of subjects randomized     nn             nn             nn
Treatment Received
  Placebo                         nn             n              nn
  Active                          n              nn             nn
Age in Years, Mean±SD             xx±x.x         xx±x.x         xx±x.x
Age Groups, N(%)
  21-30                           nn(xx%)        nn(xx%)        nn(xx%)
  31-40                           nn(xx%)        nn(xx%)        nn(xx%)
  41-50                           nn(xx%)        nn(xx%)        nn(xx%)
  51+                             nn(xx%)        nn(xx%)        nn(xx%)
Race, N(%)
  Caucasian                       nn(xx%)        nn(xx%)        nn(xx%)
  Asian                           nn(xx%)        nn(xx%)        nn(xx%)
  ……                              nn(xx%)        nn(xx%)        nn(xx%)
Sex, N(%)
  Female                          nn(xx%)        nn(xx%)        nn(xx%)
  Male                            nn(xx%)        nn(xx%)        nn(xx%)
Baseline Height (cm), Mean±SD     xxx±xx.x       xxx±xx.x       xxx±xx.x
Baseline Weight (kg), Mean±SD     xxx.x±xx.xx    xxx.x±xx.xx    xxx.x±xx.xx
Baseline BMI (kg/m2), Mean±SD     xx.xx±x.xxx    xx.xx±x.xxx    xx.xx±x.xxx

Analysis Metadata Example
Analysis-level Metadata

Analysis name:  Table 1.1
Description:    Demographic and Subject Characteristics, ITT Population
Reason:         Pre-specified in Protocol
Dataset:        pathname/ADSL.xpt - select records where ITT=Y
Documentation:  SAP Section X.Y; pathname/Tab1_1.SAS

Analysis name:  Table 1.2
Description:    Subject Disposition Summary
Reason:         Pre-specified in Protocol
Dataset:        pathname/ADSL.xpt
Documentation:  FDA request xx.xxx

(Source: CDISC ADaM Team, 2005)

Analysis Program Documentation
• Programs used to generate an analysis using submitted Analysis Dataset(s) as input
• Programs may be used for several purposes:
  • Replicate analysis
  • Exploratory analysis
  • Auditing
• Programs may be used at different levels:
  • As documentation
  • As “code fragments”
  • Execute in FDA environment

Analysis Program Functionality
• Written documentation of the statistical process and the dataset analyzed
• Statistical software program code fragments that describe the statistical process and the analysis dataset used
• Statistical software programs that compute the results but do not format the results in the same manner as the table or figure in the final report
• Statistical software programs that exactly replicate the table or figure in the final report

Analysis Dataset Creation Documentation
• Documents the creation of the submitted Statistical Analysis Datasets
• Programs may be used for several purposes:
  • Replicate datasets
  • Create similar datasets for exploratory analysis
  • Auditing
• Programs may be used at different levels:
  • As documentation
  • As “code fragments”
  • Execute in FDA environment

Analysis Dataset Creation Documentation (cont.)
• The source of the Statistical Analysis Dataset should be clearly documented, allowing the reviewer to trace data items back to their source
• Documentation may depend on the source of the Statistical Analysis Datasets:
  • Created from the Study Data Tabulation datasets (sequential processing)
  • Created in a separate work process from the operational database (parallel processing)

Issues: Submission of SAS Programs
• Purpose?
  • Replicate analysis
  • Exploratory analysis
  • Auditing
• Which SAS programs?
  • Dataset creation programs
  • Analysis programs
• How will programs be used?
  • As documentation
  • As “code fragments”
  • Execute in FDA environment

Issues: Submission of SAS Programs (cont.)
• Sponsor/CRO work flows vary
• Proprietary programs
• Dataset size restrictions in Guidelines
• Standardized report programs are complicated
• Macros are difficult to transport and understand
• Need to start dialogue with FDA statisticians

Implementation in the Real World
“In theory, theory and practice are the same. In practice, they’re not.” - Yogi Berra
• How do we incorporate evolving standards into REAL work processes?
• Need to balance present needs with future gains
• “Transitioning from/adapting to current industry practice: next steps vs. ‘vision’” – Steve Wilson, FDA

Analysis Dataset Creation: Parallel and Sequential Data Flow
[Diagram: two data flows from the operational database (ODB). Sequential: operational database extraction programs produce the Study Data Tabulations, and analysis dataset creation programs then produce the Statistical Analysis Datasets from them. Parallel: operational database extraction programs produce the Study Data Tabulations, while separate operational database extraction and analysis dataset creation programs produce the Statistical Analysis Datasets directly from the ODB.]

So where are we?
• Technology is evolving
• Regulations are evolving
• Standards are evolving
• Even definitions are evolving

How can we survive?
• Start now; develop a plan that deals with the present and adapts for the future
• Design for flexibility
• Design with basic principles and concepts of clinical trials, statistics and data management in mind

Thank you
Dave Christiansen
Christiansen Consulting
davechristiansen@cableone.net
208/338-3808
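
Backup illustration: the five analysis-level metadata fields described earlier (analysis name, description, reason, dataset, documentation) can be pictured as a simple record. The sketch below is only one possible representation, not part of the draft guidance. The class name `AnalysisMetadata`, the helper `missing_fields`, and the "Table 9.9" entry are hypothetical; the Table 1.1 values are copied from the example slide, including its placeholder path and section number.

```python
from dataclasses import dataclass, fields


@dataclass
class AnalysisMetadata:
    """One analysis-level metadata entry (hypothetical representation)."""
    analysis_name: str   # unique identifier for the analysis
    description: str     # text description of the display contents
    reason: str          # rationale or authority for the analysis
    dataset: str         # analysis dataset(s) and any selection criteria
    documentation: str   # how the analysis was performed, or links to it


def missing_fields(entry):
    """Return the names of fields left blank in a metadata entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Values taken from the example slide; path and section are placeholders there too.
table_1_1 = AnalysisMetadata(
    analysis_name="Table 1.1",
    description="Demographic and Subject Characteristics, ITT Population",
    reason="Pre-specified in Protocol",
    dataset="pathname/ADSL.xpt - select records where ITT=Y",
    documentation="SAP Section X.Y; pathname/Tab1_1.SAS",
)

# Invented entry with two blanks, to show what a completeness check reports.
incomplete = AnalysisMetadata("Table 9.9", "", "FDA request", "pathname/ADSL.xpt", "")

print(missing_fields(table_1_1))
print(missing_fields(incomplete))
```

A check like this speaks to the first review question only (can the reviewer find what each analysis used); it says nothing about whether the documented analysis is appropriate.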