SDTM Implementation - Best Practices
Pantaleo Nacci, Head Statistical Safety & Epidemiology/PV
Biometristi Italiani Associati (BIAS) Seminar: CDISC, SDTM and ADaM: Moving from theory to practice
SAS Institute - Milano, 14 March 2014

Agenda
The story so far
Benefits of implementing SDTM
Basic requirements
Legacy data conversion: a missed opportunity?
The way forward
Conclusions

The CDR Project
In 2010 Novartis Vaccines (NVx) initiated a very ambitious project to pool together all legacy studies for which data are electronically available in-house, as well as all new EDC trials, within the framework of a Clinical Data Repository (CDR).
A slightly adapted variant of CDISC SDTM 3.1.2 was selected as the common standard for final storage, analysis and reporting, which also allows study metadata to be stored.
All legacy study data are being remapped to this standard in two phases, and their metadata retrieved (mostly from protocols) and entered, for a total of around 400 studies.

The CDR Project (2)
All EDC panels have been redesigned to collect data in a format compliant with the CDASH standards, which allows easy remapping to SDTM.
After a careful and lengthy selection process among three candidates, the solution chosen as the basis of the CDR was SAS Drug Development (SDD) as a hosted environment.
The remapping is done using custom-written SAS programs rather than SAS Clinical Data Integration (CDI), as initially planned.

The CDR Project (3)
(Re-)coding of adverse events, concomitant medications, medical history, etc. is performed in Oracle Thesaurus Management System (TMS).
Our home-grown Standard Reporting Software suite of validated SAS programs was adapted to run in the new SDD environment AND to use the new CDISC data structures.
Last but not least, all our existing processes were migrated to the new platform, and several new ones mapped and documented.

Two years later...
More than two years after the CDR went live, all new clinical studies are now running completely in it:
• Data get pushed automatically into the system at regular intervals
• SDTM data are then obtained directly from the CDASH dumps nightly, so that the latest version is always available online
• Deviations from the NVx standard are dealt with using a set of study-specific 'rules', and approval by a technical committee is necessary
Standard Reporting Software has finally been tested on real data, necessary fixes have been applied, and new enhancements are actively being worked on as we speak.
Special techniques are being developed to speed up operations in SDD.

Two years later... (2)
Phase 2 of the Legacy Data Conversion sub-project has started, and data from 191 legacy studies, dating back as far as 1992, are now available for use, in addition to those from CDR-native studies.
Utilities to create and maintain the complete and ongoing dynamic 'oceans' have been fully validated, and jobs to maintain them are scheduled to run nightly.
Other utilities to create static 'data poolings' have also been validated and are available for use.
• Several data poolings have now been created and released for analysis
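To make the pooling step concrete, here is a minimal, hedged sketch of how a static 'data pooling' could be assembled in SAS by stacking one SDTM domain across per-study libraries. All librefs and paths are invented for illustration; the actual CDR utilities also handle metadata, trial-design domains and MedDRA up-versioning.

/* Minimal sketch of a static 'data pooling': one SDTM domain (AE) is   */
/* stacked across per-study SDTM libraries into a snapshot library.     */
/* All librefs and paths below are hypothetical placeholders.           */
libname studya "/cdr/sdtm/study_a";              /* completed study A   */
libname studyb "/cdr/sdtm/study_b";              /* completed study B   */
libname pool   "/cdr/poolings/snapshot_2014q1";  /* static snapshot     */

data pool.ae;
   /* STUDYID in each SDTM domain already identifies the source study */
   set studya.ae
       studyb.ae;
run;

/* Order the pooled domain for downstream review and analysis */
proc sort data=pool.ae;
   by studyid usubjid aestdtc aedecod;
run;

The same pattern, repeated per domain and driven by a list of eligible studies, is essentially what turns an 'ocean' into a frozen pooling that analyses can always be traced back to.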
CDR - The Full Picture
[Architecture diagram showing the data flows around the CDR. Legacy systems being phased out: paper CRFs, scanned CRFs (ICRF), Clintrial (AE, MEDHX, CMED), TGROUPS, lab norms. Current inputs: eCRFs/EDC with periodic updates, CRO data, eDiaries, external labs, the serology lab/serology database, CTMS, randomization, exclusions and statistical populations, feeding (via an adapter layer) the Clinical Data Repository, i.e. CDISC data sets in SAS Drug Development. Outputs and uses: statistical analysis, monitor listings, submission data sets, pooled analysis, rapid pooling, data mining and signal detection. Coding is performed in TMS; SAE coding feeds the PV safety database and PV safety reports. Future wish: SAE -> EDC -> CDR -> PV DB, with ideally all coding in one system.]

Agenda
The story so far
Benefits of implementing SDTM
Basic requirements
Legacy data conversion: a missed opportunity?
The way forward
Conclusions

Regulatory Requirements
It was in October 2007, while attending the PhUSE annual conference in Lisbon, that I first heard about JANUS from an FDA speaker.
The message was clear even then: the need to adopt non-proprietary data standards was not a question of 'if', but only of 'when'.
It took longer than expected, but on 6 February 2014 FDA officially published several guidance documents for electronic regulatory submissions, with a 90-day window for public comment.
• Various CDISC data standards (CDASH, SDTM, SEND, ADaM) are specifically mentioned as required

Effects on Submission Reviews
Even after adopting the CDISC standards, pitfalls may remain in (therapeutic) areas that are not well characterized.
For example, given the degree of latitude still left to individual companies by the gaps in the current standards as regards vaccines data, an FDA reviewer might be faced with widely differing structures (and possibly codelists) representing the same data but coming from different companies.
• E.g., in vaccine trials solicited AEs can be stored in AE, CE, FA, etc.
This might result in longer review times, even with the help of JANUS or similar systems.

Evolution of the NVx Data Standards
How an unstable reference can ultimately be a liability
It is now almost exactly two years since the first native CDISC study data entered the CDR.
Only once that milestone was reached was it finally possible to fully test the reporting software, which had been developed using mock data.
• As expected, many issues, small and not so small, surfaced
The next slide shows an overview of the changes applied to the initial production version of the NVx SDTM standard up to June 2012.
• A few further changes, mostly the adoption of new domains, have been applied since then

Evolution of the NVx Data Standards (2)
Adapting to changing requirements is a must
[Overview chart of the changes to the NVx SDTM standard; not reproduced here.]

Evolution of the NVx Data Standards (3)
How a pilot phase can save a lot of time later
In practice, a few examples:
• Information which in the old data structures was contained in one variable is now spread over several ones, e.g.:
- EDTTEST = 'HI_B_BRISBANE08_CC' becomes
- LBCAT = 'IMMUNOLOGY'
- LBMETHOD = 'HEMAGGLUTINATION INHIBITION'
- LBTEST = 'INFLUENZA B BRISBANE/2008 CC AB'
• Re-assaying of samples, only identifiable using the date when an assay was run, was not accommodated until a new TSTDTC variable was added in SUPPLB
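As a hedged illustration of the first example above, the data step below sketches how the legacy composite test code might be split into the SDTM LB variables. The legacy libref/dataset name and the LBTESTCD value are invented; in practice the conversion was driven by lab metadata tables rather than hard-coded branches.

/* Sketch: splitting the legacy composite code EDTTEST into SDTM LB     */
/* variables. OLDLAB.EDT and the LBTESTCD value are illustrative only.  */
data lb_split;
   set oldlab.edt;                      /* hypothetical legacy dataset  */
   length lbcat $40 lbmethod $60 lbtest $40 lbtestcd $8;

   if edttest = 'HI_B_BRISBANE08_CC' then do;
      lbcat    = 'IMMUNOLOGY';
      lbmethod = 'HEMAGGLUTINATION INHIBITION';
      lbtest   = 'INFLUENZA B BRISBANE/2008 CC AB';
      lbtestcd = 'HIBBR08';             /* invented code; SDTM limit is 8 characters */
   end;
   /* ... one branch (or, better, one lab metadata row) per EDTTEST value ... */
run;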
Usage of SDTM Data
As mentioned earlier, two inherently dynamic data 'oceans' are maintained within the CDR and have been used multiple times to create static 'data poolings':
• Complete ocean: contains all available data from studies which are no longer active; data for newly completed studies are added to those of existing ones
• Ongoing ocean: contains all available data for studies not yet completed, and is recreated from scratch every night
• All AE/MH verbatims can be updated to the latest MedDRA version in one go
These 'data poolings' are used to support submissions, but also papers, etc., i.e., whenever there may be a need to go back to a known status to perform additional analyses.

Agenda
The story so far
Benefits of implementing SDTM
Basic requirements
Legacy data conversion: a missed opportunity?
The way forward
Conclusions

Basic Requirements - General
• The scope of such projects must be well defined at the beginning, so that all necessary functions are involved from early on
• Proper resourcing levels are needed from the beginning, and all systems involved must be identified and properly covered
• Think boldly when looking at the big picture and drawing up your long-term plans, but don't be ashamed to take only the steps you can afford at any given time
• It is extremely dangerous to have all knowledge on a certain topic concentrated in only one or two persons, and it can also create unnecessary bottlenecks at the wrong time
• A target reference framework must be clearly identified early on
• While it is possible to implement CDISC structures from scratch just by looking at the documentation, choosing the right system will make it easier to implement, e.g., the inevitable tweaks linked to the peculiarities of your legacy data

Basic Requirements – Legacy Data
• The criteria to include/exclude studies and the overall timelines should be well understood and shared by top management, to avoid creating excessive or plainly wrong expectations
• Putting study data together takes a lot of time, patience and attention to detail, even when most of the electronic data are already more or less in the right format
• Finding actual evidence about old studies can be a challenge (!)
• Original documentation (at least the annotated CRF and protocol/CSR, including all amendments) must be available for all studies before the coding starts, to avoid misinterpreting the data
• Metadata are as important as the data themselves, so, e.g., make absolutely sure your T-domains are correct (see the sketch after this list)
• If at all possible, plan for a pilot phase; it will most probably save a lot of time and grief later
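As a hedged illustration of the T-domain point, the step below builds a handful of Trial Summary (TS) records of the kind keyed in from protocols. The study identifier and all parameter values are invented examples; real TS datasets carry many more parameters per study.

/* Sketch: a few Trial Summary (TS) records entered from protocol        */
/* metadata. STUDY-001 and all values are invented examples.             */
data ts;
   length studyid $12 domain $2 tsparmcd $8 tsparm $40 tsval $80;
   infile datalines dlm='|' dsd;
   input studyid $ domain $ tsseq tsparmcd $ tsparm $ tsval $;
   datalines;
STUDY-001|TS|1|TITLE|Trial Title|Phase III influenza vaccine study
STUDY-001|TS|1|TPHASE|Trial Phase Classification|PHASE III TRIAL
STUDY-001|TS|1|SSTDTC|Study Start Date|1998-03-15
;
run;

Getting these few rows wrong for a legacy study is cheap to fix at entry time and expensive to fix once poolings and signalling runs depend on them.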
Agenda
The story so far
Benefits of implementing SDTM
Basic requirements
Legacy data conversion: a missed opportunity?
The way forward
Conclusions

What About All Those Studies in the Archive?
We decided that data from all new studies would be collected and stored according to CDISC standards, but this still left us with data from over 400 studies lying in our archives gathering dust.
In this case the skill set required for the remapping was different, needing both good knowledge of the new CDISC standards and a lot of experience with the old one(s).
As described earlier, the first step was to decide which studies to convert first.

Interpreting Old Data
A data inspection step was then undertaken, with the aim of identifying all the legacy datasets and variables used.
In our case we could further split the studies into pre- and post-Clintrial adoption:
• Pre-Clintrial studies were managed by several different CROs in the first half of the 1990s, so that existing knowledge today is extremely limited and fragmentary
• Post-Clintrial studies are, on the contrary, based on relatively minor variations of the current internal standard, which is known and understood by most members of the programming group
Study data were either re-extracted from Clintrial or retrieved from our electronic archives.

Necessary Tools
We correctly identified the need to retrieve all protocols, protocol amendments, (annotated) CRFs and CSRs from the archives, but underestimated the difficulty of getting them, especially the very old ones.
The initial programming approach, agreed with our Quality partners, was based on random sampling of results, but a radical change to double-programming was requested well into the process.
• This caused a sudden shortage of experienced programmers

Remapping Your Existing Data
Another source of possible issues for your SDTM implementation
The LDC Phase 1 exercise, restricted to 153 high-priority studies, took considerably longer than planned. Why?
• Not all studies had a protocol and CRF available at the time programming started, forcing the team into a dangerous guessing game
• The NVx data standard was still evolving rapidly, so that some important assumptions proved grossly wrong when new studies were designed and run in the new system, sometimes requiring extensive rewriting of almost-validated programs
• One domain in particular, LB/SUPPLB, proved a nightmare, as several old test codes, as well as free-text descriptions, were not included in the new so-called lab metadata tables (see the sketch after the next slide)

Remapping Your Existing Data (2)
Still something you want to do?
Even with all these issues and delays, this is still one of the core business reasons why NVx started the CDR project.
Safety signaling across vaccines, therapeutic areas and even single vaccine components can now be performed using data in the oceans.
• This is meaningful only for projects where all data are already available in SDTM
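As a hedged illustration of the LB/SUPPLB issue noted above, the query below shows one way the legacy lab test codes missing from the lab metadata tables could be listed up front, so gaps can be resolved before remapping starts. The librefs and dataset names (other than the EDTTEST variable itself) are invented.

/* Sketch: list legacy lab test codes with no match in the lab metadata  */
/* tables, so gaps can be resolved before remapping to LB/SUPPLB.        */
/* OLDLAB.EDT and META.LABTESTS are hypothetical names.                  */
proc sql;
   create table unmapped_codes as
   select distinct a.edttest
   from oldlab.edt as a
   left join meta.labtests as b
     on a.edttest = b.edttest
   where b.edttest is null;
quit;

title 'Legacy test codes missing from the lab metadata tables';
proc print data=unmapped_codes;
run;
title;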
Agenda
The story so far
Benefits of implementing SDTM
Basic requirements
Legacy data conversion: a missed opportunity?
The way forward
Conclusions

Future Developments
What's next on the CDR plate
The current implementation of the CDR is based on SDD 3.5, which does not exactly shine for speed.
• A new, revised version of SDD (4.3) has been out for a while, and we have been getting a feel for it for some time
The set of CDISC standards currently implemented is restricted to CDASH 1.1 and SDTM 3.1.2.
• Two main tasks have been identified here: upgrading to SDTM 3.1.3/3.2 and adopting ADaM 2.1
The Standard Reporting Software we use is still a revamped version of very old programs ported to SDD+SDTM.
• We are in the process of redesigning the reporting system from scratch, embedding in it the creation of ADaM datasets

Future Developments (2)
What's next on the CDR plate
Until now we have remapped only part of the available legacy data.
• Lessons learned from Phase 1 have been put to good use to re-engineer the process
• Many things have changed for the better in the CDR environment:
- The reference standard is much more stable
- OpenCDISC consistency checks are readily available to check progress all along
- Basic documentation has been retrieved for all remaining studies (no more guessing games!)
- Programmers know both sets of data structures better

What is Happening on the SDTM Front
Lately CDISC has been very active too!
SDTM 3.1 was released on 14 July 2004, SDTM 3.1.1 on 26 August 2005 and SDTM 3.1.2 on 12 November 2008.
After a hiatus of almost four years, during which more and more companies adopted it, SDTM 3.1.3 was finally released on 16 July 2012, followed by an utterly unexpected version 3.2 on 26 November 2013.
• According to CDISC's previous roadmap, we should have seen SDTM 3.1.4 last July, and 3.1.5 roughly one year later
• An updated roadmap was published in late February
The stated long-term aim is to have two updates per year, à la MedDRA.

CDISC Planned Activities and Releases
What we can expect for the next year
[CDISC roadmap chart; not reproduced here.]

Agenda
The story so far
Benefits of implementing SDTM
Basic requirements
Legacy data conversion: a missed opportunity?
The way forward
Conclusions

Conclusions
The old SAS server is still in service, but we are reducing the number of both business processes and users relying on it.
We have plenty of exciting developments on our plate.
CDR is a long-term project, which is evolving and will take years to reach a stable state.
More and more activities are now moving completely to the CDR.
Safety signaling using all available data for selected vaccines/products is now possible, and by the end of 2014 this should be possible for most of our major products currently on the market.

Conclusions (2)
Using capabilities already built into SAS, we should be able to create the necessary documentation for the submission data (e.g., define.xml) with much less effort than in the past, and with higher quality.
In the meantime SDTM is moving again.
Changes to existing processes always have a cost, and resources are not always a viable alternative to time.
It will be crucial to understand how much change is introduced by each new version, and to calculate the business impact in each company's scenario.
• It might make sense to skip some 'minor' versions
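As a hedged illustration of the define.xml point just above, the query below pulls variable-level metadata for the submission datasets from the SAS dictionary tables. This is only the raw input for the documentation step; the actual define.xml generation (e.g., via the SAS Clinical Standards Toolkit) is not shown, and the libref and path are invented.

/* Sketch: extract variable-level metadata for the submission datasets   */
/* from SAS dictionary tables, as raw input for define.xml generation.   */
/* The SDTMSUB libref and its path are hypothetical.                      */
libname sdtmsub "/cdr/submission/sdtm";

proc sql;
   create table define_variables as
   select memname as dataset,
          name    as variable,
          type,
          length,
          label
   from dictionary.columns
   where libname = 'SDTMSUB'
   order by memname, varnum;
quit;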
Useful Links
CDISC:
• http://www.cdisc.org
• Technical Plan Project Schedule
SAS Clinical Standards Toolkit:
• http://support.sas.com/rnd/base/cdisc/cst/
FDA/PhUSE Computational Science Symposium 2014:
• http://www.phuse.eu/css

Useful Links (2)
FDA:
• FDA-2014-D-0092 - Study Data Technical Conformance Guide and Data Standards Catalog
• FDA-2012-D-0097 - Guidance on Electronic Submissions: Standardized Study Data
• FDA-2014-D-0085 - Guidance on Submissions in Electronic Format - Submissions under the Federal Food, Drug, and Cosmetic Act

Questions?