Query Health Concept-to-Codes (C2C) SWG Summary Outputs
Consensus Approved on 3/19/2012

Summary of Concept to Codes (C2C) Content in this Presentation
1. Final Recommendations from C2C
2. Key Themes Across All Presentations
3. Presentation Summaries

FINAL RECOMMENDATIONS

Summary of Decisions and Suggested Next Steps
1. The healthcare industry has a significant gap in creating, managing, and publishing "value sets" mapped to common concepts. The community recommends that the HITSC work with organizations such as NLM, NQF, and others to set up this capability.
2. For Query Health, one suggestion was to start with the NQF CQM-based value sets and improve from there. The community considers this a very good source, and Query Health will start from these value sets.
3. To share value sets in a standard way, Query Health will adopt the IHE SVS RESTful web service approach and will pilot it using the NQF value sets (a sketch follows the Next Steps list below).
4. The CEDD is a significant specification for promoting common definitions of queries across multiple data sources. Enhance the CEDD to include vocabulary mappings and value sets for elements where applicable, and map it to the reference implementation frameworks of i2b2, hQuery, and PopMedNet to ensure completeness.
5. Multiple tools and interfaces can be used for query composition, but simple, intuitive approaches are necessary, and the concept-hierarchy representation of i2b2 has been proven to work in the real world. Adopt the i2b2 approach and create the hierarchy for the Query Health pilots.

Next Steps
• Activity 1 will be performed by the Initiative Coordinator / ONC as part of their regular briefing to the HITSC.
• Activities 2 (NQF value sets) and 4 (CEDD enhancement) will be part of the CEDD work stream in the Clinical WG.
• Activities 3 (IHE SVS profile) and 5 (i2b2) will be part of the Technical WG.
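Recommendation 3 above adopts the IHE Sharing Value Sets (SVS) profile, whose Retrieve Value Set transaction is an HTTP GET keyed by the value set's OID. The sketch below is illustrative only: the repository URL and OID are hypothetical placeholders, and the response is assumed to follow the RetrieveValueSetResponse XML described in the profile.

```javascript
// Minimal sketch of an IHE SVS Retrieve Value Set call (endpoint and OID are placeholders).
const REPOSITORY = "https://valuesets.example.org/RetrieveValueSet"; // hypothetical repository
const VALUE_SET_OID = "2.16.840.1.113883.3.464.0001.49"; // hypothetical NQF-style OID

async function retrieveValueSet(oid) {
  // The SVS profile binds retrieval to an HTTP GET with the value set id as a parameter.
  const response = await fetch(`${REPOSITORY}?id=${encodeURIComponent(oid)}`);
  if (!response.ok) throw new Error(`SVS repository returned ${response.status}`);
  // The response is XML (RetrieveValueSetResponse); a consumer would parse the
  // Concept entries (code, codeSystem, displayName) out of it.
  return response.text();
}

retrieveValueSet(VALUE_SET_OID).then((xml) => console.log(xml));
```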
RECURRING KEY THEMES

All Recurring Key Themes
The list below includes themes that recurred across many of the presentations to date. The next two slides address how these may be incorporated into the Technical Expression.
1. All mappings are purpose- and goal-specific.
2. Hierarchy within mappings is important to ensure that concepts are appropriately mapped and can be drilled down to a granular level.
3. A centralized data dictionary or central terminology system is used to store mappings and is common practice in many healthcare organizations.
4. As standards such as ICD-9, SNOMED CT, and LOINC are modified and updated, ongoing maintenance of mappings can be challenging.
5. Identifying a best match or an alternate/default map is necessary, since concepts may not always have an exact map.
6. It is important to identify the context of a query to understand what information is being requested (for example, an ordered drug vs. an administered drug).
7. Ongoing maintenance of mappings is very resource intensive and requires dedicated, skilled resources such as clinicians and informaticists.
8. Many mapping tools and resources are publicly available or accessible; organizations can leverage them and further develop, refine, and maintain them for use.
9. Most concept-mapping tools maintain data in its original form.

Recurring Key Themes – Direct Impact on the Technical Framework
Of the nine themes above, those highlighted in blue on the original slide directly impact the Technical Framework of QH; all others are addressed on the next slide.

Recurring Key Themes – No Direct Impact on the Technical Framework, but Important to Consider
While not all of the key themes directly impact the Technical Framework, it is important to consider the remaining themes seen in the C2C presentation series (highlighted in green on the original slide).

PRESENTATION SUMMARIES

C2C Presentation Series
Presentation Schedule 2011–2012
1. 12/13/11 – Andy Gregorowicz – hQuery (Distributed Network)
2. 12/13/11 (cont. 12/20/11) – Shawn Murphy – i2b2/SHRINE (Distributed Network)
3. 12/20/11 – Stan Huff – Intermountain Health (Standard)
4. 12/20/11 (cont. 1/3/12) – Rick Biehl – DOQS: Data Warehousing and Concept Mapping (Standard)
5. 1/3/12 – Jeff Brown – PopMedNet (Distributed Network)
6. 1/3/12 – Olivier Bodenreider – National Library of Medicine (Tool)
7. 1/10/12 – Victor Beraja – Ibeza (Standard)
8. 1/10/12 – Rhonda Facile – CDISC SHARE (Standard)
9. 1/17/12 – Shaun Shakib – 3M Healthcare Data Dictionary (Tool)
10. 1/17/12 – David Baorto – NYP Terminology Services (Tool)
11. 1/17/12 – Zeshan Rajput – RELMA (Regenstrief) (Tool)
12. 1/24/12 – Rita Scichilone – AHIMA Concept Mapping Principles
13-14. 1/24/12 – Craig Stancl and Kevin Peterson – LexEVS / CTS 2 (Tools)
16. 1/24/12 – Jacob Reider – ONC Senior Policy Advisor (N/A)
17. 1/31/12 – Kevin Puscas – S&I Repository (Value Set Definition Mgmt)
18. 2/7/12 – Floyd Eisenberg – NQF (Value Sets)
19. 2/14/12 – Peter Hendler – Convergent Medical Terminology (Tool)

Questions for Consideration

Overview and Current Status
• Frameworks (e.g., i2b2, PMN, hQuery): How do you define concept mapping within your system (e.g., are you mapping between standards, or from standards to your local data dictionary)? Are there any internal mechanisms? Do you use any external tools? Are you able to maintain the integrity of the original data in its native form (i.e., data as collected and not modified)?
• Tools (e.g., RxNav, RELMA, LexEVS): How does your tool function? Are you able to maintain the integrity of the original data in its native form?
• Standards: How do your standards relate to concept mapping? Are you able to maintain the integrity of the original data in its native form?

Integration and Infrastructure
• Frameworks: How can you integrate with external tools for mapping (JavaScript library? Java? Web services API?)? How do you see your framework integrating with the QH Reference Implementation solution?
• Tools: How can your tool be leveraged? Are there any external APIs or other interfaces? How do you see your tool integrating with the QH Reference Implementation solution?
• Standards: What infrastructure is necessary to implement or utilize your standard? How do you see your standard integrating with the QH Reference Implementation solution?

Alignment to Query Health (asked of frameworks, tools, and standards alike)
• Where does the mapping occur: at the Data Source level, at the Information Requestor level, or both?
• Can it be easily implemented elsewhere?

Maintenance
• Frameworks and Tools: Who maintains your concept mapping tool? Who maintains the mappings, and how often are they released? What is the associated cost of maintenance?
• Standards: Who maintains the development of the standards? Who maintains the mappings, and how often are they released? What is the associated cost of maintenance and periodic releases?

AHIMA – Data Mapping Principles (Rita Scichilone)
Concept mapping, as defined by ISO, is "a process of defining a relationship between concepts in one coding system to concepts in another coding system in accordance with a documented rationale, for a given purpose." Documentation of the rationale and intended purpose of a map is essential before any type of mapping is attempted.
Maps should be understandable, reproducible, and useful. Map development between coded data systems serves various purposes and can be expensive and challenging to maintain and update. The challenge in ongoing maintenance is that it requires some element of human interaction to ensure semantic interoperability. Use of local terminology for data capture in EHRs is a major challenge for interoperability and health information exchange. Rita's presentation focused on an overview of mapping principles. Multiple data sources at the various targets for distributed queries pose a challenge for universal data mapping.

Critical Mapping Factors and Considerations for Best Practices
1. Mappings must be developed with a (query) source and a (query) end target in mind.
2. Careful consideration is required in interpreting and using maps.
3. Skilled resources and subject matter experts (SMEs) are necessary to develop and test the validity and accuracy of the maps.
4. Organizations must have a maintenance plan to keep current with changes in the standard codes. EHR systems may not always be mapped to the most recent versions of standard codes such as LOINC or SNOMED CT, which challenges semantic interoperability as mentioned above.
5. Confirm the degree of equivalence between the source and target, for example via rating scales and the cardinality of each map (the relationships between associated concepts, such as one-to-one or one-to-many).
6. It is helpful to have a best match and a default map in place; these specify how "other" and "unspecified" are handled within the maps.
7. A consensus management process via SMEs and a quality assurance plan are needed.
8. Maintain a good data dictionary and ensure data consistency.

The value proposition of maps is meaningful reuse, which leads to efficiencies if the maps are completed accurately. However, data mapping has its shortcomings, as concepts may not always translate properly from a local system to a standard system.

hQuery (Andy Gregorowicz)
hQuery is an open-source implementation of a framework for distributed queries of health data. It uses a simplified standard information model, based on nationally recognized data elements, to enable generic queries against, and summarized responses from, various clinical data sources. A query is expressed in JavaScript that interacts with a clinical EHR to extract information, which is then aggregated by hQuery. hQuery includes a graphical query composer.

hQuery operates on codes and code sets in JavaScript libraries, and a graphical user interface (GUI) facilitates building those libraries, which can easily be extracted and reused. Using the JavaScript libraries, hQuery runs through the clinical record and pulls any matching codes or code sets. There are no special facilities in hQuery for any type of "concept mapping" to take place. popHealth has used the spreadsheets released with the MU Stage 1 measures to generate concepts. The codes and spreadsheets for MU Stage 1 are available; however, popHealth is not able to redistribute the outputs of the concepts due to licensing restrictions. Spreadsheets with codes and concepts can be found at www.projectpophealth.org. hQuery is intended for the responder level of Query Health.
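As a rough illustration of the query style just described, the sketch below scans a patient record for codes in a small code set and emits a count for aggregation. The patient.conditions() method, the emit() hook, and the code set are stand-ins for illustration; they are not the actual hQuery library interface.

```javascript
// Hypothetical sketch of an hQuery-style JavaScript query (patient API and emit()
// are stand-ins, not the actual hQuery interface).
const diabetesCodes = { "ICD-9-CM": ["250.00", "250.02"] }; // illustrative code set

// Stand-in for the framework's aggregation hook.
const emit = (key, value) => console.log(key, value);

function map(patient) {
  // Scan the patient's coded conditions for any code in the code set.
  const matched = patient.conditions().filter((c) =>
    (diabetesCodes[c.codeSystem] || []).includes(c.code)
  );
  if (matched.length > 0) emit("diabetic_patients", 1); // summarized upstream into a count
}

// A toy record in place of a real EHR extract:
map({ conditions: () => [{ codeSystem: "ICD-9-CM", code: "250.00" }] });
```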
i2b2/SHRINE (Shawn Murphy)
SHRINE is a research network for distributed queries. From a user-interface perspective, a user builds a query, launches it, and waits to get an answer back. On the back end, the query is composed and goes through an aggregator that distributes it to SHRINE adaptors, which support mapping from a standard ontology to each local ontology. The query is then run against the Clinical Research Chart (CRC) in the local ontology's terminology, and the answers come back to the composer.

i2b2 allows the creation of formatted ontologies from the web services at the National Center for Biomedical Ontology (NCBO). For example, the global ontology used in a query may be SNOMED. To get SNOMED ontologies, i2b2 uses NCBO web services that bring SNOMED terms down into an i2b2 ontology to build a tree. Within the SNOMED view, an extraction workflow program can be invoked that goes to NCBO, grabs ontology terms, and places them in a database. The database processes these terms and translates them into an i2b2 metadata ontology table. This allows the latest SNOMED ontology to be pulled from the NCBO website. The key to i2b2 is the hierarchical path and the code or set of codes it ultimately maps to.

i2b2 is flexible enough to let organizations support and query multiple ontologies in their database through its terminology-merging process. i2b2 can also map from one ontology to another on the fly to support distributed queries. It is agnostic to specific coding systems and specific ontologies, because ontologies can be built to fit a particular organization's database. To get a standard ontology into an i2b2 implementation, it must be built from the NCBO web service, or i2b2 has to distribute an appropriate demo version.

i2b2/SHRINE (cont.)
Two possible use cases for i2b2 ontology services:
1. Mapping a global terminology to a local terminology for SHRINE queries.
2. Merging one terminology into another to enable queries using two terminologies simultaneously. For instance, i2b2 can automatically merge ICD-9 terms with a SNOMED ontology if the database has both, and the result can be queried selectively using i2b2. At this time all organizations have ICD-9 and will soon have ICD-10; within i2b2, ICD-9 can be merged into ICD-10 so they can be queried together.

Each site is responsible for mapping the standard ontology to its local ontology, so each site controls how its local terms are represented in relation to the standard ontology. Similar terms are often used differently at different hospitals. i2b2 does not map terms in a bi-directional fashion.
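The hierarchical path is what makes drill-down queries work: selecting a node implicitly selects every code beneath it, typically via a prefix match on the path stored in the ontology table. A minimal sketch of the idea follows; the paths and codes are invented for illustration and are not a real i2b2 ontology.

```javascript
// Sketch of i2b2-style hierarchical concept matching (paths and codes are illustrative).
// Querying a node selects the node and all of its descendants via a path-prefix match.
const ontology = [
  { path: "\\Diagnoses\\Endocrine\\Diabetes\\", code: "ICD9:250" },
  { path: "\\Diagnoses\\Endocrine\\Diabetes\\Type II\\", code: "ICD9:250.00" },
  { path: "\\Diagnoses\\Circulatory\\", code: "ICD9:390-459" },
];

function expandConcept(queryPath) {
  // Prefix match means "this node and everything below it" in the hierarchy.
  return ontology.filter((c) => c.path.startsWith(queryPath)).map((c) => c.code);
}

console.log(expandConcept("\\Diagnoses\\Endocrine\\Diabetes\\"));
// -> ["ICD9:250", "ICD9:250.00"]
```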
PopMedNet (Jeff Brown)
PopMedNet (PMN) is open-source software designed to facilitate the creation, operation, and governance of networks in which each network decides how data and queries are standardized. It can be thought of as a "transport mechanism" via which a query can be sent to and received from data partners. PMN typically standardizes formatting but for the most part avoids concept mapping, as there is no internal concept-mapping mechanism. Networks powered by PMN standardize the data and decide on the querying approach, which PMN facilitates. A data-model plug-in is possible to translate queries between models.

PMN has a web services API and plug-in architecture. The architecture leaves data partners (DPs) in control of whether, and on what terms, they participate in distributed query networks. It minimizes the need for extensive database expertise and for ongoing maintenance and management of complex data structures. PMN querying tools are network specific (SAS, SQL, etc.). Mappings are limited to industry-standard terminologies (NDC, ICD-9, HCPCS, LOINC). Currently, PMN is used in a variety of population health and public health surveillance projects, such as Mini-Sentinel (FDA), MDPHnet (ONC), and AHRQ projects.

Intermountain Health (Stan Huff)
Concept mapping at Intermountain is done mainly to support data transformation and conversion from local codes to standard codes. Intermountain mainly receives data from outside systems or from its own systems (owned by Intermountain but purchased from different vendors). The data is pulled into a common repository and typically has hierarchical structure and associative relationships. In concept mapping it is important to know both the concept and the information model, because there are many areas (outside of the lab) where knowing the code or set of codes does not always translate to the appropriate concepts.

Intermountain normalizes its multiple incoming streams of data to a standard form by bringing the data to an interface engine, normalizing it to a common structure, and transforming local codes to standard codes; for example, local lab tests and results go to LOINC, and local problem lists to SNOMED CT. Concept mapping should occur at the query responder's (data source) level rather than the query requestor's level, because the query responder understands its own data sets and can respond more accurately than the query requestor.

Internally, Intermountain uses the Dice coefficient for lexical mapping, but also uses RELMA (from Regenstrief) as an external mapping tool. Terminology services can also be utilized via Java and JavaScript. Ongoing model and terminology creation and internal maintenance of mappings requires approximately 4.5 FTEs (40% of the 14 FTEs dedicated to this work). Intermountain's mappings are maintained internally and are currently not shared or used by other entities. Most local mappings have 1:1 relationships; mappings from SNOMED to ICD-9, however, would have more general applicability on a large or national scale.

Data Oriented Quality Solutions – Data Warehousing (Rick Biehl)
Clinical research data warehousing is made up of 26 key dimensions. Each dimension has the same six-table database design, representing the logical sub-dimensions of the data warehouse: Context, Reference, Definition, Bridge, Group, and Hierarchy. Ontologies are mapped in the Hierarchy sub-dimension. Regardless of how many dimensions there are, the six-table design provides a standard way of looking at each one. The key to building healthcare dimensions is to standardize the way dimensions are structured so they can be federated and shared on a large scale. In terms of concept mapping, categorizing sub-dimensions within the database into some standard form is key to making this model work.

Three key elements are necessary for query creation and form the overarching data architecture: Category, Role, and Perspective. For example, to answer "How many analgesics were administered?", the query should be designed to pull all facts where a drug (category) was administered (role) and Analgesic (perspective) was available in any higher perspective. If all data warehouses are designed with the same three constructs in mind, queries can be written against various data models.

Biehl relies heavily on the Open Biomedical Ontology (OBO) group for his work. His model is designed to make it easier for query creators to search by concepts such as "heart disease," which on the back end are mapped to any and all relevant diagnosis codes, and therefore yield accurate query responses.
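As a rough sketch of the Category/Role/Perspective pattern just described, the snippet below answers the analgesics question against a few invented facts; the data and field names are illustrative only.

```javascript
// Sketch of the Category/Role/Perspective query pattern (facts are invented for illustration).
const facts = [
  { category: "Drug", role: "Administered", perspectives: ["Analgesic", "NSAID"], value: "Ibuprofen 400mg" },
  { category: "Drug", role: "Ordered", perspectives: ["Analgesic", "Opioid"], value: "Morphine 2mg" },
  { category: "Lab", role: "Resulted", perspectives: ["Chemistry"], value: "A1c 7.2%" },
];

// "How many analgesics were administered?" = category Drug, role Administered,
// with Analgesic anywhere in the perspective hierarchy.
const administeredAnalgesics = facts.filter(
  (f) => f.category === "Drug" && f.role === "Administered" && f.perspectives.includes("Analgesic")
);
console.log(administeredAnalgesics.length); // -> 1
```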
His model eliminates the need to perform searches by codes and allows more terminology-based searches.

Ibeza (Victor Beraja)
Concepts and codes do not always tell what part of the history or exam the information is coming from, which is why the context of the codes being returned is very important. Ibeza runs medical rules to inform patients and doctors at the point of care about clinical guidelines and insurance benefits, so an informed decision can be made about what is best for the patient. Clinical data concepts are mapped to SNOMED and/or LOINC, as available, within a structure that provides context; this way, the context from which the information was pulled is also captured. Query results for a particular question are only as good as how well the query has been formulated and whether the right concepts are being queried (in the right context).

Victor Beraja's presentation focused on the importance of context in queries and how to execute accurate searches. For example, billing data (such as diagnosis codes) may pull all information relevant to the number of tests conducted, but not the findings from those tests. A query written to identify all diabetic patients with macular edema will pull results (based on procedure and diagnosis codes) for the total number of tests conducted; however, it will not factor in the results of those tests, because that information is not grounded in the procedure or diagnosis codes. This is why it is important to understand the (clinical) context of queries to obtain the most accurate information possible.

CDISC SHARE (Rhonda Facile)
CDISC is a nonprofit organization that develops standards for clinical trials. The discussion focused on metadata repositories and metadata describing clinical data from clinical trials. The overarching purpose of the CDISC SHARE project is to take the various CDISC standards developed to date and create a metadata repository that is easily accessible and appropriately linked and formatted.

Some of the main goals of the CDISC SHARE project:
• Provide a consistent approach to standards definition and improve access to standards
• Speed up development of new clinical research content
• Facilitate data reuse
• Decrease costs: downloadable metadata could reduce standards maintenance costs and enable process improvement
• Deliver all of CDISC's existing and new content in both human- and machine-readable forms
• Facilitate alignment of clinical research and healthcare standards

To achieve these goals, there must be semantic interoperability, with all standards using the same definitions, and all standards must use the CDISC SHARE model that links the CDISC standards together. The metadata model was derived by reviewing models from Intermountain, openEHR, and SDTM, which led to the CDISC SHARE model. The resulting model will be appropriately layered and structured, linked together with CDISC standards, machine readable, and based on single definitions that are used consistently.

UMLS and RxNorm – National Library of Medicine (Olivier Bodenreider)
The National Library of Medicine (NLM) develops standards to support users of clinical standards and terminologies. NLM has developed normalization strategies specific to clinical text and terms; these go beyond string matching and use more linguistically based methods for normalizing strings into UMLS concepts.

What is UMLS? UMLS stands for Unified Medical Language System.
It integrates clinical vocabularies such as SNOMED CT (clinical terms) and MeSH (information science), to name a few, which makes it easier to translate one terminology directly to another. A total of 160 source vocabularies in 21 languages are housed in UMLS. The UMLS serves as a vehicle for the regulatory standards (HIPAA, HITSP, Meaningful Use) and includes all major clinical terminologies, such as LOINC, SNOMED CT, and RxNorm. UMLS is released twice a year, and RxNorm is released monthly. UMLS Terminology Services (UTS) is available as a graphical user interface (GUI) for UMLS.

What is RxNorm? RxNorm can be thought of as a mini-UMLS specifically targeted at drugs, one that creates standardized names for drugs in addition to mappings. RxNorm integrates names and codes from a dozen vocabularies and normalizes the names. RxNav is the graphical user interface for RxNorm.

Differences between UMLS and RxNorm:
• UMLS does not create terms, whereas RxNorm creates standardized terms for drugs in addition to creating the mappings.
• RxNorm and UMLS are both terminology integration systems. They provide some level of source transparency, which means that from UMLS the original sources can be reconstructed (this cannot generally be done for RxNorm).
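RxNav also fronts a public REST API for RxNorm. As a small illustration, the documented rxcui lookup resolves a drug name to its RxNorm identifier; the drug name here is just an example.

```javascript
// Looking up an RxNorm identifier (RxCUI) by drug name via the public RxNav REST API.
async function findRxcui(drugName) {
  const url = `https://rxnav.nlm.nih.gov/REST/rxcui.json?name=${encodeURIComponent(drugName)}`;
  const response = await fetch(url);
  const data = await response.json();
  // idGroup.rxnormId is a list of matching RxCUIs (absent if there is no match).
  return (data.idGroup && data.idGroup.rxnormId) || [];
}

findRxcui("aspirin").then((ids) => console.log(ids)); // e.g. ["1191"]
```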
RELMA – Regenstrief Institute (Zeshan Rajput)
Zeshan Rajput is a member of the SDC support team at Accenture and was not representing Regenstrief for this presentation. RELMA does one-off mappings, specifically to the LOINC standard. LOINC is used to code the questions a clinician may ask (for example, "What is my patient's …?"). RELMA translates a local set of concepts into a standard vocabulary and helps an organization consume a single vocabulary.

RELMA is a tool distributed with LOINC that can be downloaded from the Regenstrief website. It is a free resource and includes sample data for trying it out. Tools are available for mapping a local test (or set of tests) to LOINC codes. The Common Tests subset, the top 2,000 LOINC codes, can represent 95% of clinical care use. RELMA offers a Google-like search function, with a search box and manual identification of codes when necessary.

There are four basic ways to load local concepts for mapping, of which the best is to load directly from HL7 v2.x messages, which eliminates manual loading errors. Zeshan demonstrated the various ways to map local concepts in RELMA by walking through the LMOF (Local Master Observation File), which contains all the information an organization needs to represent a concept within its own information system, and he demonstrated both the manual and the automated ways RELMA can be used for concept mapping.

Success in mapping with RELMA depends on a few tried-and-true best practices: expand typical abbreviations, standardize the terms that are referenced, avoid administrative terms used to distinguish tests (stick to clinical terms), standardize time references, and include units of measure to increase the specificity of the automated mapping tool.

Mayo Clinic – LexEVS (Craig Stancl)
LexEVS 6.0 has been in development at Mayo Clinic for approximately the past 10 years. It is a comprehensive set of open-source software and services for tasks such as loading, publishing, and accessing vocabularies and ontological resources. It is built on a common information model, the LexGrid model, which represents multiple vocabularies and ontologies in one model. LexEVS's primary goal is to provide access to common repositories (Oracle, DB2, etc.), software components, APIs, and tools. The LexEVS model is based on HL7 standards, specifically the CTS 1 specifications, and on the model and definitions provided by ISO 11179 (the international standard for representing an organization's metadata in a metadata registry).

The LexGrid model represents the data and provides a mechanism for standard storage of controlled vocabularies and ontologies:
• It defines HOW vocabularies can be commonly formatted and represented.
• It provides the core representation for all data managed and retrieved through the LexEVS system.
• It provides the ability to build common repositories to store vocabulary content, along with common programming interfaces and tools to access and manipulate that content.
• Terminologies from widely varying resources such as RRF, OWL, and OBO can be loaded into a single database management system and accessed with a single API.

The mapping capabilities built into LexEVS have, for the most part, been based on experience with SNOMED CT. Mayo provides the API to do the mappings. A mapping relates a coded concept within a specified code system (the source) to a corresponding coded concept (the target) within the same or another code system, including identification of a specified association type. Queries can be based on the source or target text as well as the association type or qualifier, and they can specify how results are sorted by query type. Maps can be loaded into the LexEVS repository from a wide range of formats, including XML, OWL, and RRF.

Mayo Clinic – Common Terminology Services 2.0 (Kevin Peterson)
CTS 2 was developed by Mayo Clinic to provide a standard shared semantic model and API for the query, interchange, and update of terminological content: a standard way to query and store vocabulary content. CTS 2 is an API specification; it defines the semantics, syntax, and valid interactions that can occur. It is not software. Rather, it is a blueprint for building and using software. It is not specific to healthcare and can support terminology content from essentially any discipline. Some semantic web components, OWL and RDF, are built in.

Extensions are encouraged, as CTS 2 is not meant to limit what can be done with it; the purpose is to show how common things can be done consistently. CTS 2 is plug-in based, meaning that only the functionality deemed necessary has to be implemented. The Model View Controller (MVC) architecture pattern provides a REST-based framework for CTS 2; the REST implementation was one of the OMG platform-specific models of CTS 2 (http://www.omg.org/spec/CTS2/1.0/Beta1/20120906).

Integration points between LexEVS and CTS 2:
• CTS 2 interfaces could be implemented using LexEVS 6.0.
• Together they provide a standards-based ontology mapping tool.
• CTS 2 is an accepted OMG and HL7 standard.
• LexEVS is proven and used by NCI and NCBO.
• LexEVS and CTS 2 provide an open-source solution.
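Since CTS 2's REST binding exposes terminology content as addressable resources, a client call might look like the hedged sketch below. The base URL is a placeholder and the resource path is an assumption in the spirit of the CTS 2 REST binding; exact paths vary by implementation, so a real client would consult the service's own documentation.

```javascript
// Hedged sketch of reading a coded entity from a CTS2-style REST service.
const CTS2_BASE = "https://cts2.example.org/service"; // placeholder base URL

async function readEntity(codeSystem, code) {
  // Assumed resource pattern: code system + entity, requested as JSON.
  const url = `${CTS2_BASE}/codesystem/${codeSystem}/entity/${code}?format=json`;
  const response = await fetch(url);
  if (!response.ok) throw new Error(`CTS2 service returned ${response.status}`);
  return response.json();
}

readEntity("SNOMEDCT", "44054006").then((entity) => console.log(entity));
```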
3M – Healthcare Data Dictionary (Shaun Shakib)
The Healthcare Data Dictionary (HDD) is the vocabulary server used at 3M. The components of the system are an Enterprise Master Person Index, a Clinical Data Repository (CDR) for storing patient data, and the HDD itself, which is responsible for all structuring and encoding of data stored in the CDR. The HDD is a system that organizes, defines, and encodes concepts; it can be thought of as a metadata repository that defines the logical structure of instance data to make it computable. Its parts are:
• Medical Information Model – supplies structural definitions of what is stored in the CDR
• Knowledge Base – a semantic network of relationships between concepts
• Controlled Medical Vocabularies

The approach is to centralize mapping rather than maintain a network of point-to-point mappings, mainly because centralization makes it easier to maintain additions of new sources and updates. 3M has a concept-based controlled medical vocabulary and integrates both source standard terminologies and local interface terminologies through concept mapping. The HDD has roughly 2.4 million concepts: global, local (single enterprise), and shared.

For the actual mapping process, 3M uses some tooling and matching technology. SMEs who are clinical experts with domain knowledge and formal informatics training develop and maintain the mappings, using tools developed in-house. 3M also performs semantic mapping for drugs and has an internal terminology model for drugs and lab results. This internal model breaks a lab result concept down into its component attributes, following the LOINC model; 3M can then check whether concepts defined in the same way already exist in the HDD. Finally, the creation of new concepts requires inter-rater agreement and validation of all mappings.

3M – Healthcare Data Dictionary (cont.)
The combination of these pieces is a concept-based controlled medical vocabulary with various source systems mapped to it. All mappings in the HDD are at the concept-equivalence level. This means that if an incoming source has a different level of granularity or specificity than any concept in the HDD, the approach is to create a concept at the appropriate level of granularity and then use the semantic network to go from that concept to the closest concept that has a standard code associated with it (i.e., identify the best match, then the next-best alternative mapping).

Currently, the HDD uses CTS v1.2, which provides functional specifications for querying terminology servers in a standard way. Concepts that have changed in meaning or received updated codes are not deleted; they are inactivated. Having the HDD as a mapping tool allows 3M to code data and then, once a standard code becomes available, associate that code with the concept in the HDD. As for overlapping concepts, there is always a chance that observations in LOINC and SNOMED CT could overlap. A single concept in the HDD can have multiple source codes associated with it, so if there is overlap, all of the codes are simply associated with the same concept.
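In the spirit of the HDD description above, a concept-based dictionary entry might look like the sketch below. The structure and internal identifier are invented for illustration; the LOINC and SNOMED CT codes shown are real codes for hemoglobin A1c, but the entry as a whole is hypothetical.

```javascript
// Sketch of a concept-based dictionary entry: one concept, many associated source codes.
const concept = {
  conceptId: 48210, // hypothetical internal identifier
  preferredName: "Hemoglobin A1c measurement",
  status: "active", // superseded concepts are inactivated, never deleted
  codes: [
    { system: "LOINC", code: "4548-4" },
    { system: "SNOMED CT", code: "43396009" },
    { system: "LOCAL:LabSysA", code: "HGBA1C" }, // local interface term mapped to the same concept
  ],
};

console.log(concept.codes.map((c) => `${c.system} ${c.code}`).join(", "));
```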
NY Presbyterian Terminology Service (David Baorto)
What started as terminology management at NewYork-Presbyterian Hospital with the development of the Medical Entities Dictionary (MED) approximately 20 years ago has expanded into a terminology service at NYP (both the Cornell and Columbia hospitals). Concept mapping can be defined in various ways: local to local, local to standard, and standard to standard. There are hierarchical maps (information residing as a subclass or at a more granular level) and relational maps (mappings between the various standards). As seen in other presentations, concept mapping is goal specific, and the type of mapping necessary varies based on what requirements or results an information requestor is looking for. For this reason, concept mapping also needs to be flexible and support changes to evolving standards.

The integrity of the original data is retained at the granularity of the source system, and semantics about the data are maintained in the central terminology system. As data flows from source systems to various downstream systems, the central terminology can be leveraged to infer information from that data and obtain terminological knowledge. The MED connects in real time for production queries to certain systems and provides batch interval results for other vendor systems.

The mapping process: all codes (subclasses) are assigned to a class based on the hierarchy system. When a new code needs to be added, it is created in the MED and assigned to a class in the appropriate hierarchy. The mapping occurs neither at the data source nor at the information requestor level, but at the level of the terminology server.

NY Presbyterian Terminology Service – David Baorto (cont.)
For internal tools, NYP uses lexical approaches, hierarchy, and semantics. The terminology team builds some tools in Perl, but most of its work is centered on "service" rather than tools. For external tools, available crosswalks between ICD-9 and CPT help build the procedure hierarchy in the MED, and RELMA is used for LOINC. Content provision tools address how downstream systems speak to the central terminology: there is a web-based MED browser where searches can be conducted, concepts can be viewed in the MED one by one, and the hierarchy can be drilled into. Most of the tools are created internally at NYP. The MED used to be directly connected to the clinical information system for real-time interactions; Perl scripts are used for regular, pre-determined queries.

In terms of maintenance, once a terminology has been incorporated, the process is relatively easy, with regular weekly or monthly updates. For newly incorporated terminologies, maintenance and updating are more challenging. Currently, the ongoing work is managed by 3 FTEs using 2 Unix servers.

Convergent Medical Terminology (Kaiser) – Peter Hendler
CMT is an ongoing initiative at Kaiser Permanente that began in 1996 and is used to manage the standardization of clinical information. It is an enterprise terminology system structured as a hub-and-spoke model. The three spokes are: (1) internal clinical terms that providers see and that populate the EMR, (2) patient-friendly names easily understood via the patient portal, and (3) administrative terms (ICD-9/10, CPT, etc.). The hub they connect to is the reference terminology, SNOMED, though LOINC is also used for internal mappings. Kaiser donates the spreadsheets containing mappings between core SNOMED terms and the three types of terms above to the IHTSDO.

CMT covers domains such as diagnoses, lab results, immunizations, clinical observations, and nursing documentation. No matter what changes occur to standards or codes (such as the change from ICD-9 to ICD-10), clinicians and patients do not see changes in the EHR system they use; the majority of changes are on the back end. CMT is predominantly mapped to standard terminology (SNOMED/LOINC) and can be mapped to other terminologies as needed. It can also leverage SNOMED's structure, including its hierarchy and description logic (i.e., the formal definitions). It supports the requirements for standard terminology for Meaningful Use and health information exchange.
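A hub-and-spoke CMT entry as described above might be modeled like the sketch below. The structure is an assumption for illustration, with a real SNOMED code for type 2 diabetes at the hub; it is not Kaiser's actual record format.

```javascript
// Sketch of CMT's hub-and-spoke idea: one SNOMED concept (hub) carrying
// clinician, patient, and administrative "spokes".
const cmtEntry = {
  snomed: { code: "44054006", display: "Diabetes mellitus type 2" }, // hub: reference terminology
  clinicianName: "Type 2 diabetes mellitus", // what providers see in the EMR
  patientFriendlyName: "Type 2 diabetes", // what members see in the patient portal
  administrative: [{ system: "ICD-9-CM", code: "250.00" }], // billing/administrative spoke
};

// A back-end change (say, ICD-9 to ICD-10) only touches the administrative spoke;
// the clinician- and patient-facing names stay the same.
console.log(cmtEntry.patientFriendlyName);
```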
Kaiser's current infrastructure: a SNOMED name and code (1) is mapped to the appropriate ICD codes, (2) carries descriptions, namely the physician display name, what patients see, and synonyms, and (3) identifies the relationships. These IDs and mappings are all sent in the Epic (EHR) load file, which links to a database called prodnam. The purpose of prodnam is to distribute the CMT mappings regionally, since Kaiser is quite a large network.

The CMT query tool can pull information for a query in three ways: property-based (code or display name), hierarchy-based, and subsumption (role)-based. This allows a query to pull all the different possible results and increases the likelihood of accurate results. OWL (Web Ontology Language) is another project currently underway at KP: compared to the EL+ logic used by SNOMED, the description logic of OWL is more expressive and allows negation and disjunction.

VALUE SETS

Jacob Reider – ONC Senior Policy Advisor
Jacob Reider's presentation and discussion focused on ensuring that the C2C workgroup considers value sets as part of the series. His recommendation was to leverage currently ongoing work (the NQF retooled measures and value sets published in early 2010) as a starting point. Clinical decision support (CDS) and quality management (QM) activities will both rely heavily on the creation, maintenance, and sharing of value sets, and these needs appear identical to the needs of the QH project in the domain of "concepts to codes." Query Health should consider defining the list of codes that qualify a patient for inclusion in, or exclusion from, a given query, quality measure, or CDS intervention.

Consider the question "Has this patient with DIABETES had an A1C TEST?" How is "diabetes" defined? What is an A1c test?
– A value set is a list of ALL possible codes that could exist in the system that would "count." Such a list may be long and is ideally harmonized with other such lists, so that all payer entities define it consistently. If the definitions (lists) differ, the context of the information will always vary.
– The term "value set" is sometimes used to refer to a "convenience set." Convenience sets are not comprehensive; they aim to constrain a list of codes to a given specialty for the convenience of a given type of provider. There may, for example, be a "cardiology convenience set" listing the most common terms a cardiologist may use, which would presumably make lookup easier for a cardiologist.
– Some argue that as computers get better, stronger, and faster, such convenience sets are no longer relevant.
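To make the inclusion/exclusion idea concrete, the sketch below checks a patient's codes against a small value set. The codes are illustrative stand-ins, not an actual NQF value set.

```javascript
// Sketch of value-set membership: a query "counts" a patient when one of their
// codes appears in the value set (codes shown are illustrative only).
const diabetesValueSet = new Set(["ICD-9-CM|250.00", "ICD-9-CM|250.02", "SNOMED|44054006"]);

function qualifies(patientCodes) {
  // patientCodes: array of { system, code } drawn from the record
  return patientCodes.some((c) => diabetesValueSet.has(`${c.system}|${c.code}`));
}

console.log(qualifies([{ system: "ICD-9-CM", code: "250.00" }])); // -> true
// If two parties use different lists for "diabetes", the same patient can
// qualify under one definition and not the other - hence the push to harmonize.
```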
S&I Repository (Kevin Puscas)
The S&I Repository is a web-based platform consisting of three components: the Artifact Manager, the Category Manager, and the Information Browser. The architecture is an Alfresco application linking external sources, and it requires a link to the NLM UMLS to formulate searches based on accepted standard terms. The Repository can: (1) manage artifacts (documents, models, links, etc.), (2) facilitate artifact lifecycles (publishing and versioning), (3) establish contextual relationships and a taxonomy of tags between artifacts, and (4) search both content and metadata. It uses a "shopping cart" model, similar to Amazon.com, for reporting results, and it can manage links to web sources.

The Information Browser allows users to search artifacts after wiki collaboration, and the Artifact Manager supports searches. Artifacts can range from documents (PDF, Word, XML, etc.) and media (audio/visual: JPEG, TIFF, MP3, MOV, etc.) to links (URLs, value sets) and models. The Category Manager allows category groupings and the application of official meta-tags.

Vocabulary value sets are in a computable format defined by the HITSC. They use NLM's UMLS system, which ONC has defined as the source vocabulary. Value sets are composed of the information needed and how those values are defined. The vocabulary is not stored by the S&I Framework; it is retrieved from UMLS as needed. Once value sets are returned to the repository by UMLS, identifiers are attached and the results are reported in XML.

This version of the S&I Framework repository is a work in progress, not a finalized product. One major constraint is that it handles only enumerated value set definitions, not intensional value set definitions (values that change by criteria such as age and date). Only 14 of the 160 available vocabularies are included, although others can be added easily. Users must have a UMLS account to use it, which prevents copyright violations, and it is limited to the current UMLS version. Expected future capabilities include a full SVS profile to allow public access, intensional value set definitions, access to non-UMLS vocabularies, and alternative authoring mechanisms (e.g., CSV).

Value Sets – HIT Standards Committee Vocabulary Task Force Recommendations (Floyd Eisenberg, NQF)
Floyd's presentation focused on the recommendations of the HIT Standards Committee Vocabulary Task Force and Clinical Quality Workgroup. He reviewed the decisions that were made, what is meant by the term "value sets," and which code systems are recommended for use. When defining a quality measure or a query, each "element" within the query has to be very specific. The concepts to be queried, such as "Type II diabetes" or "coronary artery disease," usually have corresponding codes, and these groupings of codes are known as value sets. Smaller value sets can be combined and linked to a parent value set. The issue is that there are many different vocabularies to choose from when defining any particular concept within an EHR.

The Vocabulary Task Force recommendations suggest which vocabularies are appropriate for each category of information within the Quality Data Model. The primary recommended vocabularies are SNOMED CT, LOINC, and RxNorm. The International Classification of Functioning (limited use; some rehab hospitals use it), the Unified Code for Units of Measure (UCUM), and CVX (vaccine terminology) are also part of the recommendations. Transition vocabularies have also been identified for EHR systems: ICD-9, ICD-10-CM, ICD-10-PCS, CPT, and HCPCS are allowable.

The Quality Data Model (QDM) is an information model used at NQF to define concepts and categories of information. It allows querying of the state or context of the concepts within a query and how each is categorized (for example, ordered vs. administered vs. dispensed). Depending on which category a concept falls under, the Task Force recommended which vocabularies should be used. The rest of the presentation walked through the recommendations: (1) the recommended vocabulary, (2) the associated concept, and (3) how it is categorized in the QDM.
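A QDM-style data element as described above might look like the following sketch. The field names, OID, and code are assumptions for illustration, not the NQF format; the point is that the category/state pair captures context while the value set names the codes that count.

```javascript
// Sketch of a QDM-style data element (field names and identifiers are illustrative).
const qdmElement = {
  category: "Medication",
  state: "Administered", // "Ordered" or "Dispensed" would ask a different clinical question
  valueSetOid: "2.16.840.1.113883.3.464.0001.200", // hypothetical value set OID
};

// Two facts with the same code but different states match different elements:
const fact = { category: "Medication", state: "Ordered", code: "RxNorm:12345" }; // illustrative code
const matches = fact.category === qdmElement.category && fact.state === qdmElement.state;
console.log(matches); // -> false: an order is not an administration
```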
Summaries of Questions for Consideration – Distributed Query Networks

Summary of Distributed Query Networks

Overview and Current Status (How do you define concept mapping within your system? Are there internal mechanisms or external tools? Is the original data maintained in its native form?)
• PopMedNet: Facilitates the creation, operation, and governance of networks; each network decides how to standardize data and queries. PMN networks typically standardize formatting but avoid concept mapping, with some exceptions. Integrity: networks determine how to store data, and most data models maintain local codes even if a code is mapped.
• i2b2/SHRINE: The standard SHRINE vocabulary is mapped to the local data dictionary within each site's adaptor cell. Internal tools are used. Data is maintained in its original form and then mapped to various terminologies.
• hQuery: Concept mappings (if any) are created in the JavaScript passed in the query; there is no special support for this.

Integration and Infrastructure (How can you integrate with external mapping tools? How would you integrate with the QH Reference Implementation?)
• PopMedNet: Has a web services API and plug-in architecture; does NOT include any mapping capability. PMN querying tools are network specific (SAS, SQL, etc.), and mappings are limited to industry-standard terminologies (NDC, ICD-9, HCPCS, LOINC).
• i2b2/SHRINE: External mapping tools could be used to build the mapping tables.
• hQuery: JavaScript library.

Alignment to QH (Where does the mapping occur: Data Source level, Information Requestor level, or both? Can it be easily implemented elsewhere?)
• PopMedNet: PMN is the transport mechanism and governance tool and is agnostic to implementation decisions; networks will develop their own solutions for querying.
• i2b2/SHRINE: Mapping is performed at each site at the data source level; it could also be performed by a designated intermediary.
• hQuery: Information Requestor level.

Maintenance (Who maintains the concept mapping tool and the mappings, how often are they released, and at what cost?)
• PopMedNet: All network implementations use a dedicated coordinating center to oversee the integrity of the data and the use of the network; this requires substantial resources to ensure appropriate use.
• i2b2/SHRINE: i2b2/SHRINE personnel maintain the mapping tools, and local teams maintain the mappings. Cost depends on the number of mappings; most sites have at least half an FTE to maintain the i2b2 software, mappings, and database for SHRINE.
• hQuery: N/A; the codes and code systems of the underlying data are preserved.

Summaries of Questions for Consideration – Standards
Summary of Standards

Overview and Current Status (How do your standards relate to concept mapping? Is the original data maintained in its native form?)
• DOQS: Standardizes data warehousing into dimensions and sub-dimensions that can be easily stored, retrieved, managed, and queried.
• Intermountain Health: Internal tools perform lexical matching using the Dice coefficient; RELMA is used as an external tool. Most data can be preserved in its original form (log files will eventually be deleted); preservation of data is the responsibility of the sending system.
• NYP Terminology Services: Defined by the tasks (local to local, local to standard, etc.) and types (equivalency, hierarchical, relational) of the mappings. Integrity of the data is maintained by retrieving results according to the class or semantics.

Integration and Infrastructure (What infrastructure is necessary to implement or utilize your standard? How would it integrate with the QH Reference Implementation?)
• DOQS: The overarching data architecture should be categorized by Category, Role, and Perspective within each dimension.

Alignment to QH (Where does the mapping occur? Can it be easily implemented elsewhere?)
• Intermountain Health: Mapping should occur at the query responder's level, because the query responder understands its own data sets and can respond more accurately than the query requestor.
• DOQS: The standard describes how data should be organized for querying, so it can be implemented elsewhere if the particular pathway is followed.
• NYP Terminology Services: Mapping occurs at the level of the terminology server.

Maintenance (Who maintains the development of the standards and the mappings, how often are they released, and at what cost?)
• Intermountain Health: Maintains its own mappings, all internal to Intermountain. Cost: 14 FTEs who create models, terminology, and mappings, supporting 1,200 HL7 interfaces, 3 million patients, and 31,000 employees.
• DOQS: N/A.
• NYP Terminology Services: Once a terminology has been incorporated, maintenance consists of regular updates; new terminologies are more difficult (for example, two independent laboratories merging into a single laboratory information system). Staffing: 3 FTEs and 2 Unix servers.
Summary of Standards (cont.)

Overview and Current Status (How do your standards relate to concept mapping? Is the original data maintained in its native form?)
• CTS2: CTS2 is an official OMG standard (Beta 1); see the OMG concept mapping specification document (http://www.omg.org/spec/CTS2/1.0/Beta1/20120906). All data must be transformed into CTS2-compliant data objects; for many formats (such as SNOMED CT and HL7), the transformation itself will be standardized.
• Ibeza: Each clinical data concept is mapped to SNOMED and/or LOINC. The integrity of the original data is preserved by creating a dictionary of clinical terms offered to the public, so everyone can use the same terms in their clinical forms. Newly submitted terminology is reviewed by a team of experts, who determine whether the "new" term is added as new or as an alternate wording of an existing clinical term.

Integration and Infrastructure (What infrastructure is necessary? How would it integrate with the QH Reference Implementation?)
• CTS2: Implementations are independent of any particular technology infrastructure, but significant tooling to facilitate implementation exists (http://informatics.mayo.edu/cts2/framework/). CTS2 would provide a common set of APIs to enable the creation, query, and maintenance of concept mappings.
• Ibeza: The standards allow context queries of specific clinical data; for example, one could query the number of patients who had a dilated fundus exam with an exam of the macula for diabetic maculopathy.

Alignment to QH (Where does the mapping occur? Can it be easily implemented elsewhere?)
• CTS2: Can consume pre-mapped data from an arbitrary data source, and can also be used to create and incrementally modify new mapping data. Implementation is modular: CTS2 is a large specification, but only parts of it need to be implemented.
• Ibeza: Both, at the creation of the glossary of concepts mapped to SNOMED and LOINC. It can be easily implemented.

Maintenance (Who maintains the development of the standards and the mappings, how often are they released, and at what cost?)
• CTS2: OMG, HL7, and Mayo Clinic maintain development of the standards. CTS2 does not specify where mappings come from, how they are released, or what release cycle they follow. Depending on the CTS2 implementation, new releases would require a new load of the source content or an incremental load of the change set.
• Ibeza: A dedicated group of medical experts and engineers oversees the integrity and development of the standard; mappings are maintained by a dedicated group of medical experts on a quarterly basis.

Summaries of Questions for Consideration – Tools

Summary of Tools

Overview and Current Status (How does your tool function? Is the original data maintained in its native form?)
• UMLS/UTS, RxNorm/RxNav: Terminology integration systems. Source transparency: most original terminologies can be recreated from the UMLS, which is generally not the case for RxNorm.
• RELMA: Takes any of four sources of local observations and maps them to LOINC using automated or manual (facilitated) approaches. Since information must be put into LMOF format one way or another, the mapping does not affect the original data, as long as the original data is not deleted after conversion to LMOF.

Integration and Infrastructure (How can the tool be leveraged: external APIs or other interfaces? How would it integrate with the QH Reference Implementation?)
• UMLS: GUI is UTS; API is SOAP-based. RxNorm: GUI is RxNav; API is SOAP-based plus RESTful.
• RELMA: No API or interface. Representative of tools (another is RxNav) that facilitate adoption of a single standard. LOINC is also distributed as an .mdb file and could be directly imported into another interface and used to facilitate query composition, interpretation, or response.

Alignment to Query Health (Where does the mapping occur? Can it be easily implemented elsewhere?)
• UMLS/RxNorm: Includes all major clinical terminologies and bridges between the query (text, code) and the data source (standard code). Mapping most likely occurs at both the query composer (making sure the right question is being asked) and the data source (making sure the question is translated into local terms).
• RELMA: Specific to LOINC.
Maintenance (Who maintains the concept mapping tool and the mappings, how often are they released, and at what cost?)
• UMLS/RxNorm: NLM develops the UMLS and RxNorm (data plus tooling). Release schedule: UMLS twice yearly; RxNorm monthly. No fee to the end user (but a license agreement is required).
• RELMA: The Regenstrief Institute and multiple partners around the world maintain the tool; the mappings are created by each user of RELMA. The tool is updated with LOINC twice yearly.

Summary of Tools (cont.)

Overview and Current Status (How does your tool function? Is the original data maintained in its native form?)
• LexEVS 6.0: Provides a set of APIs that an application can use to create, query, and maintain concept mappings. All data must be transformed into LexEVS-compliant data objects.
• 3M Healthcare Data Dictionary: Mapping to a local data dictionary means centralized concept mapping, not point-to-point. The integrity of the original data can be maintained.

Integration and Infrastructure (How can the tool be leveraged: external APIs or other interfaces? How would it integrate with the QH Reference Implementation?)
• LexEVS 6.0: External APIs are provided; LexEVS would supply a set of APIs to enable the creation, query, and maintenance of concept mappings.
• 3M HDD: Internal tools with quality control: inter-rater agreement, internal terminology models, database triggers and constraints, and business logic in domain-specific tools. External technologies are used, but all tooling is in-house. Web services API: CTS v1.2. Alignment with the QH RI would be through native integration or through the API.

Alignment to Query Health (Where does the mapping occur? Can it be easily implemented elsewhere?)
• LexEVS 6.0: Can consume pre-mapped data from arbitrary data sources (OWL, RRF, LexGrid XML), and can also be used to create and incrementally modify new mapping data. LexEVS is an implementation and can easily be deployed elsewhere.
• 3M HDD: Both; local terminologies (terms and codes) are integrated with standards through concept mapping. Can be easily implemented.

Maintenance (Who maintains the concept mapping tool and the mappings, how often are they released, and at what cost?)
• LexEVS 6.0: Mayo Clinic maintains the LexEVS code, but it is open source and freely available to the community. LexEVS does not specify where mappings come from, how they are released, or what release cycle they follow. New releases would require a new load of the source content or an incremental load of the change set.
• 3M HDD: Domain-specific mapping tools are maintained by a development team, along with a team of nearly 30 SMEs with domain-specific clinical expertise and informatics training.