National Cancer Institute

caDSR Briefing for Small Scale Harmonization Project
Denise Warzel
Associate Director, Core Infrastructure
caCORE Product Line Manager
July 25, 2007

NCI Center for Biomedical Informatics and Information Technology (CBIIT)
• The CBIIT is the NCI’s strategic and tactical arm for research information management
• We collaborate with both intramural and extramural groups
• Mission: to integrate and harmonize disparate research data
• Production, service-oriented organization, evaluated on customer and partner satisfaction

caDSR and EVS Distinctions
• caDSR is a metadata repository
  – Maintains the metadata that lets a user locate the correct data element, which defines the characteristics of a datum (an instance of a specific concept) in sufficient detail for it to be collected and stored on a computer
• EVS is a terminology server
  – Provides services for synonymy, mapping between vocabularies, hierarchical structures, subconcepts, superconcepts, roles, semantic types, etc.

Goals of the Registry
• Goals of tools development:
  – Simplify development and creation of ISO/IEC 11179 compliant metadata by Data Element Curators and UML Modelers
  – Simplify consumption of Data Elements and standard vocabularies by end users and application developers through APIs and web services
  – Enhance reuse of Data Elements across domains
  – Enable semantic consistency across research domains
  – Support metadata life-cycle and governance processes
• Created and maintained by NCI contractors under an open development model
• Available as an open-source download

Introduction to caDSR Tools
– CDE Browser to search for, view and download
– Form Builder to create user-specified collections of CDEs
  • Skip patterns, repeating groups, default values
– Side-by-Side Compare
– UML Model Browser to view and manage UML Model metadata
– CDE Curation Tool to create Data Elements
– Admin Tool to administer caDSR and curate content (“Power Users”)
– Sentinel Tool to generate end user ‘Alerts’ triggered by metadata changes
– Semantic Integration Workbench: semantic integration/annotation tools to annotate, transform and register metadata
– Batch Load to import Administered Items
  • Excel Loader (MS Excel)
  • Semantic Integration Workbench
  • UML Model Loader (XMI)
  • Case Report Form Loader (MS Excel)
Access, Develop, Manage, Consume

Curation Tool
• To Create, Edit or Version:
  • Data Element Concepts
  • Value Domains
  • Data Elements
• ISO 11179 Wizard
  – Construct ISO compliant Data Elements by building up the pieces
  • Builds Names and Definitions from underlying components
• “Get Associated”
  – Leverage ISO to retrieve related CDEs
• “Block Edit”
  • “Shopping cart”
  • Assign classification schemes
  • Versioning

Sentinel Tool
• Create “Alerts”
  – User-defined triggers based on data element metadata attributes
  – “Notify me of any change to the Value Domain for any CDE on the Adverse Event Form”
• Generates and emails a report of changes matching “Alert” criteria

CDE Browser: “Context Browsing”
• View, Search, Download
  – Shopping cart feature
• Form Builder to build / download Forms and Data Elements
• “Context Browsing” Tree
  – By Classification Schemes
  – By Forms
• Basic Search
  – CDE Basic Search Criteria
  – Google-like search
  – Search results sortable by clicking on column headings

CDE Browser: Advanced Search
• Advanced Search
  – Leverages ISO 11179 and Concept semantic attributes
    • Find all with “18254-3” permissible value
    • Find all with “Gene*”
    • Find all with “Released” workflow status
    • Find all with “Standard” Registration status
    • Etc.
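
The Advanced Search criteria listed above can be pictured as filters that are combined over a set of data elements. The following is only a minimal Python sketch under stated assumptions: the `DataElement` record, its field names, the `advanced_search` function and the sample records are all hypothetical and are not the caDSR schema or API. It shows a wildcard name match such as “Gene*” alongside exact matches on workflow status, registration status and permissible value.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class DataElement:
    """Hypothetical, simplified CDE record; not the actual caDSR schema."""
    long_name: str
    workflow_status: str
    registration_status: str
    permissible_values: List[str] = field(default_factory=list)


def advanced_search(
    elements: List[DataElement],
    name_pattern: Optional[str] = None,
    workflow_status: Optional[str] = None,
    registration_status: Optional[str] = None,
    permissible_value: Optional[str] = None,
) -> List[DataElement]:
    """Return the elements that satisfy every criterion that was supplied."""
    matches = []
    for de in elements:
        # Wildcard match on the name, e.g. "Gene*"; lowercasing both sides
        # makes the comparison case-insensitive.
        if name_pattern is not None and not fnmatchcase(
            de.long_name.lower(), name_pattern.lower()
        ):
            continue
        if workflow_status is not None and de.workflow_status != workflow_status:
            continue
        if (
            registration_status is not None
            and de.registration_status != registration_status
        ):
            continue
        if (
            permissible_value is not None
            and permissible_value not in de.permissible_values
        ):
            continue
        matches.append(de)
    return matches


# Made-up sample records, for illustration only.
SAMPLE = [
    DataElement("Gene Symbol", "Released", "Standard"),
    DataElement("Lab Test Code", "Released", "Candidate", ["18254-3"]),
    DataElement("Genetic Marker Name", "Draft New", "Standard"),
]

if __name__ == "__main__":
    for de in advanced_search(SAMPLE, name_pattern="Gene*"):
        print(de.long_name)
```

Criteria left as `None` are ignored, so the same function covers a single-criterion search (“find all with a given permissible value”) and a combined one (“Released” and “Standard”).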

Form Builder
• Create and Manage Forms
  – Organize CDEs into modules within a Form
  – Create skip patterns, default values, repeating groups
  – Attach PDF or Word documents
  – Classify Forms into groupings for specific end user communities
  – “Publish” / “Un-Publish” for Browser Catalog visibility
• “Printer Friendly” version
• Download CDEs

CDE Side-by-Side Compare
• CDE Side-by-Side Compare
  – Build shopping cart, compare CDE metadata side by side
  – Download to Excel spreadsheet

UML Model Browser
• View CDEs as part of a UML Domain Model
  • Classes
  • Attributes
  • Associations
• View contact information, link to UML source file, documentation, etc.

Administration Tool
• System Administration
  • User Accounts and Security
  • Lists of Values (LOVs) used in content creation
• Create “Framework”:
  • Conceptual Domains
  • Classification Schemes (basis for organizing CDEs in Browser)
• Define high level “Protocol”

Batch Loading
• OC caDSR default values:
  – Workflow status = "Released", always
  – Version = 1.0, always
  – Create Date = date loaded by Loader
  – Created by = EVS
  – Long Name = EVS Preferred Name
• Loader columns (datatype; nullability; mapped to):
  – EVS Preferred Name: VARCHAR2 (20); Not Null; Long Name and Preferred Name
  – Definition: VARCHAR2 (2000); Not Null; PreferredDefinition
  – Definition Source: VARCHAR2 (2000); Null; Definition Source
  – Database: VARCHAR2 (255); Not Null; Database
  – Context: VARCHAR2 (20); Not Null; Requestors Context
  – Preferred Name effective Begin Date: VARCHAR2 (30); Null; YY.MM.B
  – Change Note: VARCHAR2 (2000); Null; Text
  – Alternate Name Type: VARCHAR2 (20); Not Null; AlternateName.Type
• Example rows:
  – Celsius Scale
    • Definition: The temperature scale defined by the values 0 degree Celsius for the freezing point of water and 100 degrees Celsius for the boiling point of water. The Celsius degree (C) is the same size as a Kelvin and equal to (F - 32)/1.8. To convert Celsius to Fah
    • Definition Source: NCI; Database: NCI Thesaurus; Context: caBIG; Begin Date: 11/18/2004; Change Note: Requested by Dianne Reeves; Alternate Name Type: NCI_Concept_Code
  – HEENT
    • Definition: HEENT is the Head, Ears, Eyes, Nose and Throat, and is referred to as a body system on a physical or medical examination. The term is typically used as 'HEENT' in a physician or caregiver notes.
    • Definition Source: NCI; Database: NCI Thesaurus; Context: caBIG; Begin Date: 11/18/2004; Change Note: Requested by Dianne Reeves; Alternate Name Type: NCI_Concept_Code
  – Gracely Pain Unpleasantness Scale
    • Definition: The Gracely Pain Unpleasantness Scale is a visual analog scale of 0 to 20 used by a subject to define their pain unpleasantness experience. Together with the intensity scale these tools serve to differentiate the patient's sensory perception of pain inte
    • Definition Source: NCI; Database: NCI Thesaurus; Context: caBIG; Begin Date: 11/18/2004; Change Note: Requested by Dianne Reeves; Alternate Name Type: NCI_Concept_Code
• Semantic Integration Workbench and UML Loader
  – XMI representation of a UML Class Diagram
    • Class → Object Class
    • Attribute → Property
    • Enumeration → Value Domain
    • Mapped EVS Concepts
    • Data Element Concept, Value Domain and Data Element harmonized with existing content and created from the above
• Excel Loaders
  – Formatted MS Worksheet
  – Administered Items

ISO/IEC Administered Item
Administration Record and Common Attributes
• Unique Identifier
  • Registration Authority (RA)
  • Data Identifier (within RA)
  • Version
• Administrative Status
• Registration Status
• Creation Date
• Administrative Note(s)
• Effective Date
• Change Date(s)
• Change Description(s)
• Origin
• Until Date
• Created By
• Modified By
• Name(s)
• Definition(s)
• Stewardship Information
• Submitter Information
• Reference Document(s)
• Classifications

Additional Value Domain Attributes
• Datatype (ISO 11179)
  • Name, Description, Scheme Reference, Annotation (ISO 11179)
  • Codegen compatible, Comment
• Format (ISO 11179)
• Maximum Characters (ISO 11179)
• Unit of Measure (ISO 11179)
• Minimum Characters
• High Value
• Low Value
• Character Set
• If Enumerated:
  • Permissible Values (ISO 11179)
    • Value, Value Meaning, Begin Date, End Date (ISO 11179)
  • High Value, Low Value
• If Non-enumerated:
  • Reference Document pointing to External ‘Top Node’ Concept

Create For Standard Codes

Create For Standard Code Sets
– Use this if you want everyone to store the code the same way

caDSR URLs
• caDSR Home Page: http://ncicb.nci.nih.gov/core/caDSR
  – Browser and Form Builder: http://cdebrowser.nci.nih.gov/
  – Admin Tool: http://cadsradmin.nci.nih.gov/
  – Curation Tool: http://cdecurate.nci.nih.gov/
  – Sentinel Tool: http://cadsrsentinel.nci.nih.gov/
  – Freestyle search: http://freestyle.nci.nih.gov/
  – Semantic Integration Workbench: http://cadsrsiw.nci.nih.gov/
• caDSR Users, Developers ListServ
  – http://list.nih.gov to subscribe to caDSR_Users@list.nih.gov
  – http://list.nih.gov to subscribe to caDSR_Software_Developers@list.nih.gov
• caDSR Training Home Page
  – http://ncicb.nci.nih.gov/NCICB/core/caDSR/Training
• caDSR Training ListServ
  – http://list.nih.gov to subscribe to caDSR_Training-L@list.nih.gov

caCORE Reading Materials
• caCORE Homepage:
  – http://ncicb.nci.nih.gov/NCICB/infrastructure/cacore_overview
• caCORE User Application Manual:
  – ftp://ftp1.nci.nih.gov/pub/cacore/NCICBapplications/NCICBAppManual.pdf
• NCICB GFORGE
  – http://gforge.nci.nih.gov
• caGRID Browser
  – http://cagrid-browser.nci.nih.gov/cagrid-browser/
• caCORE Training
  – http://ncicb.nci.nih.gov/NCICB/training
• caDSR Business Rules
  – http://ncicb.nci.nih.gov/NCICB/infrastructure/cacore_overview/cadsr/business_rules
• caCORE Technical Guide:
  – ftp://ftp1.nci.nih.gov/pub/cacore/caCORE3.1_Tech_Guide.pdf
  – caCORE APIs
• caDSR_Users ListServ subscribe:
  – http://list.nih.gov
  – Send request for caDSR account to: ncicb@pop.nci.nih.gov
• caBIG home page: documentation about the Grid
  – http://cabig.nci.nih.gov

Questions?