A demo of an initial prototype of the project idea
Mustafa Yuksel & Gokce B. Laleci, SRDC

Motivation
• Currently, the clinical research and clinical care domains are largely disconnected because each uses different standards and terminology systems.
• In contrast to the CDISC standards used in clinical research, the most widely used content and messaging standards in clinical care are those of HL7.
• The terminology systems differ as well: MedDRA, WHODD and CDISC Terminology are commonly used in clinical research, while the prominent terminology systems in clinical care include SNOMED CT, LOINC, ICD-9 and ICD-10.
• The available integration efforts are mostly proprietary, custom-developed for a specific use case, and dependent on hard-coded n x n mappings among standards.
• For example, Electronic Data Capture (EDC) systems are usually not connected to the EHR systems used by health care providers. Clinicians have to manually copy the results of therapeutic procedures and examinations from an EHR system into the Case Report Form (CRF), which causes errors, work disruption and delays in reporting data.
April 17-18, 2012, SALUS Technical Kickoff Meeting

Visionary Scenario
• A new calcium channel blocker is marketed after a successful clinical trial period.
• The regulatory body receives adverse event reports indicating that this new drug causes swelling of the legs.
• The regulatory body decides to conduct a more extended post-market safety study (or asks the pharmaceutical company to do so) and prepares the study protocol in CDISC SDM:
  • Eligibility criteria: patients who have recently been treated with this new calcium channel blocker.
  • Collect all other symptoms, diagnoses, allergies and medications of the patient in the first visit.
• This protocol definition is sent to the health care providers in the SALUS clinical research community.
• Patient history documents conforming to the protocol definition, in different schemas such as HL7 CDA and CEN EN 13606, are sent by the hospitals to the regulatory body.
• These patient histories in CDA and 13606 are translated to the BRIDG model.
• The Form Manager processes the study design and identifies the items requested in the CRFs from their CDASH annotations.
• The patient history in BRIDG is queried through the predefined queries defined for each CDASH variable (these can be used to fill in CRFs semi-automatically).

Visionary Scenario (continued)
• After collecting significant data from some patients, the regulatory body prepares the statistical analysis data by semantically querying the collected data represented in the BRIDG model:
  • Number of patients who have experienced edema in the legs (MedDRA term 10014239) and who also
    • have a condition of heart failure (MedDRA term 10019279),
    • have a condition of primary pulmonary hypertension (MedDRA term 10036727),
    • have already been treated with a vasodilating agent (SNOMED CT term 58944007).
• Participating health care providers code observations with ICD-9 and SNOMED CT terms, record adverse events
through WHO-ART, and record the medications provided through RxNorm.
• After the analysis, it becomes clear that the adverse event incidents are mostly related to the underlying condition or current treatment of the patients…

Exploiting the Initial SALUS Semantic Framework
We have envisioned two use cases:
1. automatically filling in eCRFs
2. facilitating safety studies on EHR systems

The Components of the Initial Demo
i. The BRIDG DAM ontology, expressed in RDF, as the core ontology hosted in a knowledge base. We have developed the RDF representation of the BRIDG DAM v3.0.3 to be used as the core ontology, making the common shared semantics available in a formal, machine-processable form.
ii. Tools for semantic lifting of the content standards harmonized by the BRIDG initiative, including HL7 RIM based models and CDISC ODM based models, and for aligning these semantic models with the core ontology in the knowledge base.
iii. Tools for importing semantic representations of the terminology systems and biomedical ontologies, and for aligning these models with the core ontology.
iv. Tools to import clinical documents/messages into the SALUS knowledge base by automatically translating them to instances of the core ontology.
v. A library of SPARQL queries to retrieve clinical data corresponding to the CDASH data sets from the knowledge base.
vi. Tools for semantically mediating documents/messages represented in different clinical research and care standards to one another.

(A) BRIDG DAM as the common “model of meaning”
• The BRIDG DAM is an implementation-independent UML model that represents the common shared semantics of regulated clinical research studies, which may have different implementations.
• In 2003, CDISC and HL7 signed a two-year Memorandum of Understanding (MoU) to
work collaboratively on data exchange standards in domains of interest to both organizations, and to create a Domain Analysis Model (DAM) as an implementation-independent model of the shared semantics.
• A reverse engineering effort was initiated to create the DAM as an implementation-independent UML model:
  • from the already existing HL7 RCRIM messages
  • from the CDISC CDASH and SDTM data sets and ODM models
• Later, NCI (through its caBIG project) and the FDA joined the group.
• The BRIDG DAM is composed of five sub-domains: Protocol Representation, Study Conduct, Adverse Event, Regulatory and Common.
  • CDISC SDM and HL7 Study Design RMIM are both implementations of the Protocol Representation sub-domain.
• Hence, the BRIDG DAM is the best starting point for the core of our Semantic Framework.

Sample UML Model from BRIDG Study Conduct Sub Domain
[Figure: UML class diagram, view “CM: Comm...”. View description: The Common sub-domain represents the semantics that are common to all (or most) of the other sub-domains. For example, it includes semantics for such things as people, organizations, places and materials. Legend: Adverse Event, Common, Protocol Representation, Regulatory and Study Conduct sub-domains. Classes shown include BiologicEntity, Person, Animal, Organization, Subject, StudySubject, StudySite, ResearchStaff, HealthcareProvider, Material, Product, Drug, Document and DocumentVersion.]

Creating BRIDG Ontology
• We have created a complete RDF representation of the latest BRIDG DAM (v3.0.3):
  • UML -> XMI -> XSD -> RDF conversion
  • Utilization of several tools (Enterprise Architect, Visual Paradigm, TopBraid Composer)
  • Manual fine-tuning
  • It was quite an effort…
• In the end, the RDF representation of the BRIDG DAM is the core of the initial SALUS Semantic Framework, which we call the SALUS core ontology.
• Note that the SALUS core ontology has a living and expanding nature.

BRIDG Ontology

(B) Mapping Different Content Models to BRIDG DAM Ontology (Common Ontology)
• Medical summaries are available as XML files.
• First, we need to create semantic models of these content models:
  • The schemas are provided as XSD.
  • XSD2RDF normalization tools can be used.
  • We created RDF models of HL7 CDA and CEN 13606.
• Then these semantic models of the content models need to be mapped to the Common Ontology,
  • so that the mapping definitions can be used to translate medical summary instances into individuals of the SALUS Common Ontology.

(B) Mapping CCD “Past Medical History” section to “PerformedMedicalConditionResult” class in BRIDG

SPINMap Formalism
• SPINMap: a SPARQL-based language to represent mappings between RDF/OWL ontologies
  • Mappings can be used to transform instances of source classes into instances of target classes.
  • Mainly uses
the SPARQL CONSTRUCT form, which is particularly useful for defining rules that map from one graph pattern (in the WHERE clause) to another graph pattern.
• Based on SPIN (SPARQL Inferencing Notation), a W3C Submission
  • SPIN makes it easy to associate mapping rules with classes, and SPIN templates and functions can be exploited to define reusable building blocks for typical modeling patterns.
  • SPIN provides a vocabulary: a collection of properties and classes that can be used to link RDFS and OWL classes with SPARQL queries.
    • The class ex:Rectangle can define a property spin:rule that points to a SPARQL CONSTRUCT query that computes the value of ex:area based on the values of ex:width and ex:height.
    • The property spin:constraint may link the class ex:Square with a SPARQL ASK query that verifies that the width and height values are equal.
• SPINMap vocabulary (http://spinrdf.org/spinmap)
  • A collection of reusable design patterns that reflects typical best practices in ontology mapping.
  • Can be executed in conjunction with other SPARQL rules by any SPIN engine.

SPINMap vocabulary
• Context: groups together multiple mappings so that they have a shared target resolution algorithm. A context specifies:
  • the source class of the mapping
  • the target class of the mapping
  • the expression that delivers the target of the mapping. This expression can reference the variable ?source for the source resource and the variable ?targetClass for the type of the target. It is usually expressed through a TargetFunction.
• TargetFunction: the class of SPIN functions used to get the target resource of a mapping.
• Conditional CONSTRUCT statements (SPIN rules):
  • are bound to classes and contexts
  • map the datatype/object properties of the source and target classes
  • can make use of SPIN functions
  • can make use of the results of mappings defined through other contexts
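
To make the ex:Rectangle example mentioned under the SPIN vocabulary concrete, the following is a minimal sketch of a CONSTRUCT query that could be attached to the class through spin:rule. The ex: namespace URI is a placeholder chosen here and does not come from the slides. The variable ?this follows the SPIN convention of standing for each instance of the class the rule is attached to.

```sparql
# Hypothetical namespace for the ex: examples (an assumption, not from the slides)
PREFIX ex: <http://example.org/demo#>

# Rule body for ex:Rectangle: derive ex:area from ex:width and ex:height.
# A SPIN engine binds ?this to each instance of the class carrying the rule.
CONSTRUCT {
    ?this ex:area ?area .
}
WHERE {
    ?this ex:width ?width .
    ?this ex:height ?height .
    BIND ((?width * ?height) AS ?area) .
}
```

The ex:Square constraint would follow the same pattern, with an ASK query over ?this, ex:width and ex:height. SPINMap mapping rules are CONSTRUCT queries of this kind: the WHERE clause matches the source graph pattern and the CONSTRUCT template produces the target graph pattern.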

Sample Mapping
• Source classes: RecordTarget (hasPatientRole) -> PatientRole (hasPatient) -> Patient (hasRaceCode [CE], hasBirthTime [TS], hasAdministrativeGenderCode [CE]); CE (hasCodeSystem [UID], hasCode [CS], hasCodeSystemName [string]); CS, UID and TS each carry dtype:value.
• Target classes: StudySubject (performingBiologicalEntity) -> Person (raceCode, birthDate, administrativeGenderCode); CD (codeSystem, code, codeSystemName); Code and Uid carry dtype:value; TS (value) -> Class1 (dtype:value).
• Mapping contexts and their rules:
  • RecordTarget-StudySubject: performingBiologicalEntity = targetResource(RecordTarget, RecordTarget-Person)
  • RecordTarget-Person:
    • raceCode = targetResource(RecordTarget, RecordTarget-CD-1)
    • administrativeGenderCode = targetResource(RecordTarget, RecordTarget-CD-2)
    • birthDate = targetResource(RecordTarget, RecordTarget-TS)
  • RecordTarget-CD-1:
    • code = targetResource(RecordTarget, RecordTarget-Code)
    • codeSystem = targetResource(RecordTarget, RecordTarget-Uid)
    • codeSystemName = copy(RecordTarget.hasPatientRole.hasPatient.hasRaceCode.hasCodeSystemName)
  • RecordTarget-Code: dtype:value = copy(RecordTarget.hasPatientRole.hasPatient.hasRaceCode.hasCode.dtype:value)
  • RecordTarget-Uid: dtype:value = copy(RecordTarget.hasPatientRole.hasPatient.hasRaceCode.hasCodeSystem.dtype:value)
  • RecordTarget-TS: value = targetResource(RecordTarget, RecordTarget-Class1)
  • RecordTarget-Class1: dtype:value = copy(RecordTarget.hasPatientRole.hasPatient.hasBirthTime.dtype:value)

(D) Clinical data instance translation procedure

(D & F) Importing & Exporting Clinical Documents
[Figure: two-step procedure.
1. Defining the Mapping: an ontology mapping definition, expressed as a SPIN Map (SPARQL queries attached to classes), links a source ontology (HL7 Study Design RMIM as an ontology, or CDISC Study Design ODM as an ontology) to the target ontology (BRIDG DAM Ontology).
2. Instance Translation: an HL7 Study Design XSD instance (native XML conformant to the HL7 Study Design RMIM) is represented as an HL7 Study Design ontology instance, and the ontology mapping engine (SPIN engine) relates it to a BRIDG Study Design DAM ontology instance. A second panel relates a BRIDG Study Design DAM ontology instance and a CDISC Study Design ontology instance (native XML conformant to CDISC SDM ODM) through the same engine; it also carries the label “CEN 13606 XSD Instance”.]

(E) Aligning the standards harmonized by BRIDG (Data Sets) with the SALUS Core Ontology
• Clinical Data Acquisition Standards Harmonization (CDASH)
  • a link between the study data collected through eCRFs and the study data submitted to the regulatory bodies as SDTM datasets
  • a limited set of structured data used for any clinical trial, regardless of research sponsor or therapy area
  • 16 domains, including:
    • Adverse Events (problems)
    • Medications (prior and concomitant)
    • Demographics and subject characteristics
    • Medical History
    • Vitals / Physical Exam
    • ECG test results
    • Lab results
• Sites have always been asked to complete non-standard CRFs while patients are performing daily assessments, and CRFs are expected to be completed on time and accurately by the site; the variety of CRF questions and layouts is almost unlimited.
• The current 16 CDASH CRFs are associated with standard SDTM mappings and standard CDISC controlled terminology.
• The eCRF design time is shortened, as CDASH eCRFs can be pulled out of the EDC library as and when they are needed.
• Standard CDASH CRFs can be transformed to standard SDTM datasets using standard extract, transform, load (ETL) code.

CDASH Data set example

How CDASH Variables can be used within ODM messages

(E) Aligning the standards harmonized by BRIDG with the SALUS Core Ontology
• In the first case, the mappings between vocabularies termed “data sets” (as in the case of CDASH variables) and the BRIDG-based core ontology are addressed.
• This is quite straightforward, since it is possible to write SPARQL queries on top of the BRIDG DAM to retrieve the requested CDASH variable.
• We have developed a library of sample SPARQL queries to extract several CDASH variables.

An example SPARQL query to collect the fields of the Medical History data set in CDASH

    PREFIX sp: <http://spinrdf.org/sp#>
    PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
    PREFIX bridg: <http://bridgmodel.org/dam/3.0.3#>
    PREFIX bfn: <http://www.salus.eu/bridg-functions#>

    SELECT ?MHONGO ?MHSTDAT ?MHENDAT ?MHTERM ?MHTERM_CD ?MHTERM_CS ?MHTERM_CS_NAME
    WHERE {
        ?p a bridg:PerformedMedicalConditionResult .
        # MHONGO: value of the medical history indicator
        OPTIONAL {
            ?p bridg:medicalHistoryIndicator ?mhi .
            ?mhi bridg:value ?MHONGO .
        }
        # MHSTDAT / MHENDAT: bounds of the occurrence date range. Each part is
        # optional, and a variable is bound only once (SPARQL 1.1 does not allow
        # BIND to reassign a variable already used in the group).
        OPTIONAL {
            ?p bridg:occurrenceDateRange ?odr .
            OPTIONAL { ?odr bridg:low ?odrlow . BIND (bfn:getTSValue(?odrlow) AS ?lowval) }
            OPTIONAL { ?odr bridg:high ?odrhigh . BIND (bfn:getTSValue(?odrhigh) AS ?MHENDAT) }
            OPTIONAL { ?odr bridg:value ?odrval . BIND (bfn:getTSValue(?odrval) AS ?midval) }
            # Fall back to the single date value when neither bound is given
            BIND (IF(!bound(?lowval) && !bound(?MHENDAT), ?midval, ?lowval) AS ?MHSTDAT)
        }
        # MHTERM and its code, code system and code system name
        OPTIONAL {
            ?p bridg:value ?val .
            BIND (bfn:getCDCode(?val) AS ?MHTERM_CD) .
            BIND (bfn:getCDDisplayName(?val) AS ?MHTERM) .
            BIND (bfn:getCDCodeSystem(?val) AS ?MHTERM_CS) .
            BIND (bfn:getCDCodeSystemName(?val) AS ?MHTERM_CS_NAME) .
        }
    }

How and Where these SPARQL queries can be exploited
• The Study Design Model is represented in CDISC ODM, where it is also annotated with CDASH variables to specify the data to be collected through CRFs.
• The medical summaries are collected through SALUS.
  • They are mapped to SALUS Common Ontology instances.
• The EDC system can automatically parse the Study Design Model annotated with CDASH variables
  • and query the knowledge base, which already contains the medical history of the patient in the common ontology.
  • This is achieved using the pre-defined SPARQL queries for CDASH variables.
• This eliminates static XSLT-based mappings between medical histories and CDASH-annotated ODM messages representing CRFs (as proposed by IHE CRD)…

(D) Exploiting terminology systems within the SALUS Semantic Framework
• Imported the following terminology systems from BioPortal into the SALUS Knowledge Base:
  • ICD-9: 21,669 terms
  • ICD-10: 12,318 terms
  • WHO-ART: 1,724 terms
  • MedDRA: 69,389 terms
  • National Drug File (NDFRT): 40,104 terms
  • SNOMEDCT: Clinical findings (97,139 terms) + Pharmaceuticals / biologic products (17,100 terms)
  • RxNorm: 194,176 terms
  • Human Disease Ontology (DOID): 8,574 terms
    • It has references to other ontologies such as ICD and SNOMED CT through the DbXref property to indicate equivalences.
    • Those are processed to create additional mapping definitions.
• And 133,825 unique code mappings.
• Not very straightforward:
  • Usually it is not possible to download the full ontology through a single REST service due to timeouts.
  • The class names in an ontology are collected.
  • These classes are retrieved from BioPortal separately (100 classes each time).
  • Then these sub-ontologies are merged.
  • Some of the class UIDs were incorrect (for ICD); they were corrected manually.

(D) Aligning the Common Ontology with Terminology Ontologies
• To be able to automatically map the clinical data using
different terminology systems to one another, it is necessary to link the coded terms in SALUS core ontology instances representing clinical data collected from participating sites with the SALUS terminology ontology resources, and to utilize terminology reasoning while querying the collected clinical data.
Two heuristics that we have adopted on top of BioPortal ontologies:
  We automatically create instances of the BioPortal ontology classes and copy all non-rdfs and non-owl properties from the class definitions to the instances, to avoid OWL Full ontologies
  Within a term present in a terminology ontology retrieved from BioPortal, the original terminology system name is only implicitly given in the full URL of the term. However, we need to immediately get the encapsulating terminology system of any term. Therefore, we automatically run a SPARQL rule to add a “skos:inScheme” property to each instance in the terminology ontologies that we retrieve from BioPortal.
We maintain an upper ontology (SALUS Terminology Upper Ontology), in which the major terminology systems used in our system are represented as individuals of the “skos:ConceptScheme” class.
This way, we are able to execute a SPARQL rule to automatically bind a “CD” instance (a coded value) in the BRIDG model to the corresponding BioPortal ontology instance via the “salus:terminologyRef” property

Rule attached to the CD class:

  CONSTRUCT {
    ?this salus:terminologyRef ?codeRef .
  }
  WHERE {
    # code value and code system OID of the CD instance
    ?this p3.0:code ?code .
    ?code dtype:value ?codeValue .
    ?this p3.0:codeSystem ?codeSystem .
    ?codeSystem dtype:value ?codeSystemRef .
    BIND (str(?codeSystemRef) AS ?csr) .
    # concept scheme with the same OID
    ?codeOIDRef salus:oid ?csr .
    # terminology instance in that scheme with the same code
    ?codeRef skos:inScheme ?codeOIDRef .
    BIND (str(?codeValue) AS ?cv) .
    ?codeRef skos:notation ?cv .
  }

[Figure. Part A: a part of the SALUS core ontology based on BRIDG DAM, showing a PerformedMedicalConditionResult whose value is a CD with code (dtype:value 102574007) and codeSystem (dtype:value 2.16.840.1.113883.6.96). Part B: a part of the SNOMED CT ontology from BioPortal, showing the instance <http://purl.bioontology.org/ontology/SNOMEDCT#Ins_102574007> with rdf:type <http://purl.bioontology.org/ontology/SNOMEDCT/102574007>, skos:notation 102574007 and skos:inScheme salus:SNOMEDCT, plus an rdfs:subClassOf link involving <http://purl.bioontology.org/ontology/SNOMEDCT/102572006>. salus:SNOMEDCT (salus:oid 2.16.840.1.113883.6.96), salus:MedDRA, salus:ICD9 and salus:LOINC are of rdf:type skos:ConceptScheme. The rule adds salus:terminologyRef from the CD to the instance.]

Exploiting the Initial SALUS Semantic Framework
We have envisioned two use cases:
1. automatically fill in eCRFs
2. facilitate safety studies on EHR systems

The Knowledge Base
All the semantic artifacts are hosted in a knowledge base
The main consideration for the choice of the SALUS knowledge base is its performance, which is related directly to the complexity of the reasoning process
Our reasoning requirements:
Subsumption reasoning: Crucial to deduce matching coded terms that are aligned with different terminology ontology class instances, which in fact have the same ancestor in the terminology ontology. Example: “Acute heart failure” is a child of “heart failure” in SNOMED CT
Reasoning on equivalence of classes: In SALUS, the mappings of the terms in different terminology ontology classes to each other are represented through the “owl:equivalentClass” property. We should be able to classify individuals of a class also as individuals of its equivalent classes. Example: both MedDRA:10019279 and SNOMEDCT:84114007 mean “heart failure”
Reasoning on transitivity of properties: The “owl:equivalentClass” property is inherently a transitive property.
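The three reasoning requirements listed above can be illustrated outside the triple store. The following is a minimal sketch in plain Python, not the Virtuoso inference used in SALUS: equivalence and its transitivity are handled with a union-find structure, and subsumption with a walk over subclass links. The function names are hypothetical, and the exact pairing of the equivalence links is an assumption loosely based on the “Edema of legs” example used elsewhere in these slides.

```python
from collections import defaultdict

# Illustrative data only: the exact equivalence pairs are an assumption.
EQUIVALENT = [
    ("MedDRA:10014239", "MedDRA:10030105"),
    ("MedDRA:10030105", "WHOART:0401"),
    ("MedDRA:10014239", "SNOMEDCT:102574007"),
]
SUBCLASS_OF = [  # (child, parent)
    ("SNOMEDCT:26237000", "SNOMEDCT:102574007"),
    ("SNOMEDCT:102576009", "SNOMEDCT:102574007"),
]


def equivalence_groups(pairs):
    """Map each code to the set of codes equivalent to it (transitively)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for code in list(parent):
        groups[find(code)].add(code)
    return {code: members for members in groups.values() for code in members}


def matching_codes(query, equivalent_pairs, subclass_pairs):
    """Codes whose instances should be classified under `query`."""
    groups = equivalence_groups(equivalent_pairs)
    children = defaultdict(set)
    for child, parent in subclass_pairs:
        children[parent].add(child)

    matched, frontier = set(), [query]
    while frontier:
        code = frontier.pop()
        # equivalence (with transitivity), then subsumption via the children
        for same in groups.get(code, {code}):
            if same not in matched:
                matched.add(same)
                frontier.extend(children[same])
    return matched


if __name__ == "__main__":
    for code in sorted(matching_codes("MedDRA:10014239", EQUIVALENT, SUBCLASS_OF)):
        print(code)
```

Under these assumed links, a query for the MedDRA term also reaches the WHO-ART and SNOMED CT codes, including the SNOMED CT subclasses, which is the effect the knowledge base has to provide through inference.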
It should be possible for us to process transitive equivalences, in order to classify individuals of a class also as individuals of the classes that are deduced to be equivalent through transitivity.
When we calculate the transitive closure of the 133,825 unique code mappings that we retrieved from BioPortal, the number of mappings increases to 186,712

The Knowledge Base
Clearly, full RDF and OWL-DL reasoners support all our reasoning requirements and much more. However, due to the very large number of triples (around 4.7 million) to be reasoned on in the SALUS knowledge base, we have chosen Virtuoso.
Virtuoso supports a limited reasoning capability when compared to other RDF and OWL-DL reasoners; however, the limited set of constructs supported includes rdfs:subClassOf, rdfs:subPropertyOf, owl:sameAs, owl:TransitiveProperty and owl:equivalentClass, which fully address the SALUS Framework reasoning requirements.
In addition, we benefit from Protégé with FaCT++ reasoner support, only for calculating the transitive closure via the “owl:equivalentClass” property
It was not possible to run DL reasoning with other reasoners (Jena, OWLIM, FaCT++, Pellet, HermiT) when we load the BioPortal ontologies

Q1: All patients with history of “Edema of Legs”

  define input:inference "salus5"
  prefix bridg: <http://bridgmodel.org/dam/3.0.3#>
  prefix salus: <http://www.salus.eu/ontology/clinical#>
  prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  prefix dtype: <http://www.linkedmodel.org/schema/dtype#>
  prefix skos: <http://www.w3.org/2004/02/skos/core#>
  SELECT ?subject ?subjectBirthDate ?ProblemCodeValue ?ProblemcodeSystemName
         ?ProblemDisplayName ?StartingDate ?EndDate ?ProblemDate
  WHERE {
    # The only condition: the referenced term is classified as MedDRA 10014239
    ?terminologyCode rdf:type <http://purl.bioontology.org/ontology/MDR/10014239> .
    ?CodedValue salus:terminologyRef ?terminologyCode .
    ?performedObservationResult bridg:value ?CodedValue .
    # The rest bind values to the variables in the result set
    ?terminologyCode rdfs:label ?ProblemDisplayName .
    ?performedObservation bridg:resulted ?performedObservationResult .
    ?performedObservation bridg:involvedSubject ?subject .
    ?subject bridg:performingBiologicEntity ?performingBiologicalEntity .
    ?performingBiologicalEntity bridg:birthDate ?birthdate .
    ?birthdate bridg:value ?birthdatevalue .
    ?birthdatevalue dtype:value ?subjectBirthDate .
    ?CodedValue bridg:code ?ProblemCode .
    ?ProblemCode dtype:value ?ProblemCodeValue .
    ?CodedValue bridg:codeSystemName ?ProblemcodeSystemName .
    ?performedObservationResult bridg:occurrenceDateRange ?dateRange .
    # Optional dates come last so that each OPTIONAL extends an already bound ?dateRange
    OPTIONAL { ?dateRange bridg:value ?datevalue . ?datevalue bridg:value ?ProblemDate . }
    OPTIONAL { ?dateRange bridg:high ?high . ?high bridg:value ?EndDate . }
    OPTIONAL { ?dateRange bridg:low ?low . ?low bridg:value ?StartingDate . }
  }

Available Sample Patient Documents in the Knowledge Base

Coded terms appearing in the example patient summaries:
  Term                              Terminology   Code
  edema of ankle                    SNOMED CT     26237000
  edema of foot                     SNOMED CT     102576009
  edema of leg                      SNOMED CT     102574007
  edema                             WHO-ART       401
  heart failure                     ICD           428
  heart failure unspecified         ICD           428.9
  heart failure                     SNOMED CT     84114007
  acute heart failure               SNOMED CT     56675007
  chronic heart failure             SNOMED CT     48447003
  heart failure                     WHO-ART       496
  primary pulmonary hypertension    SNOMED CT     26174007
  primary pulmonary hypertension    ICD-9         416
  Dipyridamole 50MG TAB             RxNorm        197622
Rows: example patient summaries 1 to 10 (summary 10 is in 13606 format); an X marks each coded term present in a summary [X positions not recoverable]

None of the medical histories is coded with MedDRA term 10014239

1. Through terminology system ontologies and mappings downloaded from BioPortal
2. Instances are created to avoid OWL Full reasoning
3. After Medical Histories are uploaded into the SALUS Common Ontology, these references (salus:terminologyRef) are added through the rule attached to the CD class…
4. Through equivalence, subsumption and transitivity reasoning supported by Virtuoso
5. SELECT ?ProblemDisplayName
   WHERE {
     ?terminologyCode rdfs:label ?ProblemDisplayName .
     ?performedObservationResult bridg:value ?CodedValue .
     ?CodedValue salus:terminologyRef ?terminologyCode .
     ?terminologyCode rdf:type <http://purl.bioontology.org/ontology/MDR/10014239>
   }

[Figure: MedDRA:10014239 (Edema of legs), MedDRA:10030105 (Oedema legs), WHOART:0401 (Edema) and SNOMEDCT:102574007 (Edema of leg) are linked through equivalentClass; SNOMEDCT:26237000 (Edema of ankle) and SNOMEDCT:102576009 (Edema of foot) are subclasses of SNOMEDCT:102574007. The WHO-ART and SNOMED CT classes each have an instance, and Medical Histories 1 to 9 reference these instances via salus:terminologyRef.]

Facilitating safety studies on EHR systems
Q1: All patients with history of “Edema of Legs”
Q2: All patients with history of “Edema of Legs” AND “Heart Failure”
Q3: All patients with history of “Edema of Legs” AND history of “primary pulmonary hypertension”
(Q1 to Q3 are similar)
Q4: All patients with history of “Edema of Legs” AND actively using a “vasodilating agent”
  Vasodilating agent: SNOMEDCT 58944007
  Instance 8: Patient is using DIPYRIDAMOLE 50MG TAB (RxNorm: 197622)
  SNOMEDCT:58944007 <- subClassOf - SNOMEDCT:66859009 <- equivalentClass -> NDF:C24056 - ingredientOf -> NDF:C39726 <- equivalentClass -> RxNorm:197622

  define input:inference "salus5"
  prefix bridg: <http://bridgmodel.org/dam/3.0.3#>
  prefix salus: <http://www.salus.eu/ontology/clinical#>
  prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  prefix dtype: <http://www.linkedmodel.org/schema/dtype#>
  prefix owl: <http://www.w3.org/2002/07/owl#>
  SELECT ?subject ?subjectBirthDate ?MedicationCodeValue ?MedicationDisplayName
  WHERE {
    # Through domain-specific rules, not only the medication's product code
    # but also its active ingredients are checked
    ?termCode rdf:type <http://purl.bioontology.org/ontology/SNOMEDCT/58944007> .
    { ?termCode salus:ingredientOf ?drugClassA .
      ?drugClassA owl:equivalentClass ?drugClassB . }
    UNION
    { ?termCode salus:ingredientOf ?drugClassB }
    ?drugClassA owl:equivalentClass ?drugClassB .
    # The medication's coded representation is retrieved as ?medTerminologyCode
    ?medTerminologyCode rdf:type ?drugClassB .
    ?medTerminologyCode rdfs:label ?MedicationDisplayName .
    ?CodedValue salus:terminologyRef ?medTerminologyCode .
    ?classCode bridg:item ?CodedValue .
    ?product bridg:classCode ?classCode .
    ?agenta bridg:performing ?product .
    ?performedSubstanceAdministration bridg:usedConcomitantAgent ?agenta .
    ?performedSubstanceAdministration bridg:involvedSubject ?subject .
    # Query parameters are mapped to related fields, like date of birth
    ?subject bridg:performingBiologicEntity ?performingBiologicalEntity .
    ?performingBiologicalEntity bridg:birthDate ?birthdate .
    ?birthdate bridg:value ?birthdatevalue .
    ?birthdatevalue dtype:value ?subjectBirthDate .
    ?CodedValue bridg:code ?MedicationCode .
    ?MedicationCode dtype:value ?MedicationCodeValue .
    # Patients with a history of "Edema of Legs"
    ?performedObservation2 bridg:involvedSubject ?subject .
    ?performedObservation2 bridg:resulted ?performedObservationResult2 .
    ?performedObservationResult2 bridg:value ?CodedValue2 .
    ?CodedValue2 salus:terminologyRef ?terminologyCode2 .
    ?terminologyCode2 rdf:type <http://purl.bioontology.org/ontology/MDR/10014239>
  }

Performance Evaluation
On an average desktop computer (Intel Core 2 Duo 3 GHz CPU and 4 GB RAM), the semantic mediation of a medical history in CCD format to the SALUS core ontology takes approximately 110 seconds.
An example SPARQL query to check the underlying conditions of patients can be executed in under 7 seconds on the knowledge base hosting more than 4.7 million triples.
These results are quite encouraging for a real-life deployment of the initial Semantic Framework.

Thank you...