CDISC Board of Directors Meeting
10-11 Dec 2007

Creating a Clinical Data Element Dictionary
A Proposal

Preamble
CDISC has made progress on many fronts
  • There is a CDISC “brand”
CDISC has worked on strategies/plans over the years
  • Currently a strategy in place
  • Operational plans/objectives for 2008 in place
  • 2008 budget is in place
Fundamentally change some of CDISC’s approach

Preamble
This is a discussion first and foremost about WHAT we will do.
What one thing, if done well and consistently, would have the most impact on your business?
  Ken Blanchard, Mission Possible
What is ‘the pearl of great price’?
If we agree on WHAT, then we can discuss HOW.

Motivation for the WHAT
What standards work is done today – Lilly example
Lilly Data Element Standards
\\Rodan\rodan.grp\GCDS_EB_PUBLIC
3560 pages in our “Dictionary”
~25,000 variables
It’s all pdf (yuk !!!!)

Motivation for WHAT
Link to Analysis Dataset Standards
http://corpweb.d51.lilly.com/statmath/CoE/ADS/ADS_std.html
Thousands of pages of documentation in our total ADS specifications
For each study, CROs get hundreds of pages of requirements that describe the data elements that we want, variable names, valid values, formats, etc.

Motivation for WHAT
Dozens of people at Lilly and CROs communicate using these voluminous documents
CROs have dozens of people mapping data to company-specific formats, naming conventions, etc.
The CDISC Business Case is largely predicated on eliminating these activities
  • Reduce mapping data from one form to another to transfer or to integrate it.

Motivation for WHAT
http://www.wikihit.org/wiki/index.php/Main_Page
The Clinical Data Definitions created in WikiHIT are not completely useful for clinical research studies.
caDSR has some useful elements, but is a bit outdated and not entirely functional for what is needed for clinical research studies.
  • Too complex in all its detail
NCI EVS has some useful elements, but does not have all the information and functionality that is needed by companies involved in clinical research.
  • Data definitions do not have all information (e.g. valid values)

The Language of Clinical Trials
It is more important to share a common vocabulary than it is to have agreement on common grammatical rules.
Content is more important than structure.
Es ist wichtiger, einen allgemeinen Wortschatz, als zu teilen es Vereinbarung über allgemeine grammatische Richtlinien haben soll.
A common vocabulary more important for sharing than understanding of typical rules of grammar.

CDISC Adoption by Pharma
Make SDTM more useful, implementable
  • Need more specificity
  • Need more definitions on variables – data elements
Standard data elements – this is what FDA wants, what pharma wants, what CROs want
  • FDA under pressure to do something quickly?
CDISC dealing with healthcare, other standards organizations, etc.
  • Don’t let the perfect hold up the good
  • Need more focus from CDISC

CDISC Adoption by Pharma
It’s about saving dollars for pharma, CROs and labs by simplifying interchange of data.
It’s about helping companies and FDA integrate data from regulated clinical research studies.
CDISC Business case has little to do with healthcare at this point.

Motivation for WHAT
Summary
There is an enormous unmet need for more content.
  • The CDISC Business Case is largely dependent on well defined data element standards being broadly available.
Others are playing in this space, but do not meet the needs of pharma clinical research and regulatory submissions.
CDISC Terminology Program has primarily focused on controlled terminology supporting SDTM, but not the data elements themselves.

What Is a Data Element?
All the pieces of information (i.e. metadata) needed to unambiguously describe a concept
English dictionary analogy
  • Word – desk
  • Phonetic spelling – dĕsk
  • Part of speech – noun
  • Definition – a piece of furniture with a flat top for writing [could also be thought of as the concept]
  • Source – Latin, discus
  • etc.

Data Element
A Data Element is a unit of data for which definition, identification, representation, and permissible values are specified by means of a set of attributes; the smallest unit of data.
The purpose of a data element definition is to define a data element with words or phrases that describe, explain, or make definite and clear its meaning.

Data Elements – Vertical v. Horizontal
Vertical Data Set Structure

  Patient  Visit  Variable  Value
  1        1      HR        55
  1        1      SBP       128
  1        1      DBP       84
  1        2      HR        57
  1        2      SBP       122
  1        2      DBP       83

  • Valid Values for Variable are: HR, SBP, DBP.
  • A controlled terminology
  • For each ‘term’, provide the metadata to describe it:
  • Definition, units, valid values, etc.

Data Elements – Vertical v. Horizontal
Horizontal Data Set Structure

  Patient  Visit  HR  SBP  DBP
  1        1      55  128  84
  1        2      57  122  83

Each variable has a name (terminology) and a corresponding set of metadata to describe it (definitions, units, valid values, etc.)

Clinical Data Element for Pharma
  • Variable name (draft)
  • Label / concept
  • Valid values of the variable itself
  • Data type (num, char, date, …)
  • Units
  • Key words (e.g. biomarker, osteoporosis, …) – facilitate searches
  • Source / reference (as needed)
  • SDTM data domain
  • Regulatory requirement
[A team needs to define what are the essential metadata pieces of information that are parsimonious – enough to eliminate ambiguity, but few enough to be useful, consumable, understandable, burdenless.]
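
The vertical and horizontal layouts and the draft attribute list above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `DataElement` class, its field names, the labels, the units (beats/min, mmHg), the key words, and the "VS" domain code are not taken from the slides. Only the six HR/SBP/DBP observations come from the vertical table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """Hypothetical dictionary entry mirroring part of the draft attribute list."""
    name: str                          # variable name
    label: str                         # label / concept
    data_type: str                     # num, char, date, ...
    units: Optional[str] = None
    keywords: tuple = ()               # key words to facilitate searches
    sdtm_domain: Optional[str] = None  # SDTM data domain


# Illustrative entries; labels, units, key words and domain are assumptions.
DICTIONARY = {
    e.name: e
    for e in (
        DataElement("HR", "Heart rate", "num", "beats/min", ("vital signs",), "VS"),
        DataElement("SBP", "Systolic blood pressure", "num", "mmHg", ("vital signs",), "VS"),
        DataElement("DBP", "Diastolic blood pressure", "num", "mmHg", ("vital signs",), "VS"),
    )
}

# Vertical structure from the slide: (patient, visit, variable, value).
VERTICAL = [
    (1, 1, "HR", 55), (1, 1, "SBP", 128), (1, 1, "DBP", 84),
    (1, 2, "HR", 57), (1, 2, "SBP", 122), (1, 2, "DBP", 83),
]


def to_horizontal(rows, dictionary):
    """Pivot vertical rows into one record per (patient, visit).

    Rejects any variable that is not in the controlled terminology.
    """
    wide = {}
    for patient, visit, variable, value in rows:
        if variable not in dictionary:
            raise ValueError(f"{variable!r} is not in the controlled terminology")
        wide.setdefault((patient, visit), {})[variable] = value
    return [
        {"Patient": p, "Visit": v, **values}
        for (p, v), values in sorted(wide.items())
    ]


HORIZONTAL = to_horizontal(VERTICAL, DICTIONARY)
```

In this sketch the same dictionary serves both layouts: in the vertical form it is the list of valid values for the Variable column, and in the horizontal form each entry describes one column.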

Creating a Clinical Data Element Dictionary (CDED)
Task Force Members
  • Steve Ruberg
  • Bron Kisler
  • Scott Getzin
  • Doug Fridsma
  • Chris Chute
  • Sue Dubman
  • Dave Iberson-Hurst
  • Cara Willoughby

Proposal – WHAT - Unmet Need
Comprehensive, electronically accessible, organized dictionary of unambiguous data element standards for our industry
  • One of the most fundamental problems we all face within our own pharma companies, but even more acutely across the pharma industry/enterprise.
  • Consistent with Strategy Theme #2, #5, #6
THE place where people go for clinical data element standards.
THE thing for which CDISC is known ?!?!

Alignment and Focus
If additional funding can be secured, standards specific to therapeutic areas will become part of the extended CDASH scope.
  CDISC Press Release #33, 15 May 2007
KEY QUESTION
Given the importance of this area and the need to move quickly, should we re-prioritize and divert resources (people and $$) to this effort?

Alignment and Focus
FOCUS
Where do we focus?
  • ISO, AHIC, AHRQ, NLMEc, industry architecture, …
Initial focus on meeting pharma industry needs
  • If others want to piggyback on that effort, that is fine.
Initial focus on clinical data and clinical trial metadata
Initial focus on raw/observed data
  • There is a lot of territory to conquer within this focus area.
Other opportunities (pre-clinical data elements, derived data elements) can be explored in the future.

Impact on Other CDISC Teams
Clinical Data Element Dictionary (CDED)
Terminology, SDTM, CDASH, LAB and SEND all converge into a common approach focused on the data elements and their exquisite definition
  • Reduces need to harmonize CDISC models if they all utilize the same data element definitions
  • Harmonization happens “on the front end” rather than after the fact
The transport standard for carrying standardized content (ODM, HL7, SAS, other???) can be whatever
BRIDG – work continues as is
  • Tightly coordinate standard data elements with BRIDG efforts

Creating a Clinical Data Element Dictionary (CDED)
[Diagram. Labels shown: Initial Inputs; Content Standards (CDASH, SDTM, LAB, Protocol, TB?, CV?, Other Existing?); CDED; Transport Standards (SAS, ODM, HL7); 80%; 20%.]

Proposal HOW - Business model
An open, electronic, peer production environment with appropriate governance
Like MedDRA, but open and free
Like Wikipedia, but more governance
Like LINUX, but more granular and dynamic
CDISC must adopt a more flexible and rapid development process

Clinical Data Element Standards
Governance
[Diagram. Labels shown: Template; Submission; Review; Final; Anyone.]
Downloadable (define.xml)
Searchable – text, key words
  • search shows status (submit, review, final)

Governance for the CDED
[Organization chart]
Governing Board: 2 Full-Time CDISC Employees, Lead 1, Lead 2, Lead 3, ..., Lead k
Team 1: Lead 1, ~6-8 SMEs
Team 2: Lead 2, ~6-8 SMEs
Team 3: Lead 3, ~6-8 SMEs
...
Team k: Lead k, ~6-8 SMEs

Proposal WHO - CDISC
CDISC has the opportunity to assert an even greater leadership role in this arena.
Leverage CDISC’s strengths – Strategy Theme #1
  • Independence
  • Consensus building
  • Strong pharmaceutical / clinical research expertise
  • Global recognition
Place substantial priority and focus on this effort
“The pearl of great price”

Proposal WHEN - ASAP
The time is right to charge ahead aggressively
There is a large, unmet business need
FDA and others are looking for a “content leader”
CDISC has ongoing terminology efforts
Technology is in place (e.g. wikis)
Mindset is in place (i.e. people can work virtually)
Others are advancing on this front and we may be left out

Budget
Transition personnel to this effort
Continue/finalize ongoing CDASH efforts
Redirect some Terminology Team efforts
Need part-time Governance team members
  • Contracted for ~25% of their time
SMEs for TA or data domains
Leverage CROs, software members of CDISC

Summary
There remains a clear need to have unambiguous clinical data element standards (CDES)
  • Considerable efforts still spent on exchanging data
  • Considerable efforts still spent on integrating data
  • Needed across the drug development industry
  • Broad set of data domains (safety, efficacy, outcomes, PK, etc.)
  • Independent of strategies related to messaging or transport technologies
Let’s act decisively and move quickly.

Benefits of Using Documented CDEs
Facilitates common data collection by defining content and scope.
Supports semantic data relationships.
Defines valid values for enumerated data.
Improves understanding of data.
Simplifies and documents data analysis.
Provides historical context for data collections.
Encourages reuse of existing data structures.
Facilitates sharing of data across organizational entities.
Facilitates integration of data across studies.
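
The governance slide describes entries that carry a status (submit, review, final) and a text and key word search that shows that status. A minimal Python sketch of that idea follows. The `Entry` class, the `advance` and `search` functions, and the sample entries are hypothetical names chosen for illustration; they do not describe any existing CDISC tool.

```python
from dataclasses import dataclass, field

# Status values as listed on the governance slide, in order.
STATUSES = ("submit", "review", "final")


@dataclass
class Entry:
    """Hypothetical dictionary entry tracked through the governance process."""
    name: str
    definition: str
    keywords: set = field(default_factory=set)
    status: str = "submit"

    def advance(self):
        """Move to the next status; an entry that is already final stays final."""
        i = STATUSES.index(self.status)
        self.status = STATUSES[min(i + 1, len(STATUSES) - 1)]


def search(entries, text):
    """Case-insensitive match on name, definition, or key words.

    Returns (name, status) pairs so every hit shows its governance status.
    """
    needle = text.lower()
    return [
        (e.name, e.status)
        for e in entries
        if needle in e.name.lower()
        or needle in e.definition.lower()
        or any(needle in k.lower() for k in e.keywords)
    ]


# Illustrative entries only.
entries = [
    Entry("HR", "Heart rate", {"vital signs"}),
    Entry("SBP", "Systolic blood pressure", {"vital signs", "blood pressure"}),
]
entries[0].advance()  # HR moves from submit to review
hits = search(entries, "vital")
```

Returning the status with every hit lets a reader tell at a glance whether a definition is still under discussion or has been finalized.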