Clinical-Genomics HL7 SIG
The Tissue Typing Use Case
Amnon Shabo (1), Shosh Israel (2), Guy Karlebach (1)
(1) IBM Research Lab in Haifa, (2) Hadassah University Hospital
Presented by Amnon Shabo

SHAMAN = Secured Health and Medical Access Network
IMR = Integrated Medical Records Middleware
Haifa Labs: Integration of multiple sources of data; transformation to standards; full-text indexing
Watson/Yorktown Labs: Processing of personal genomic and proteomic data
In collaboration with the Hadassah University Hospital in Jerusalem

Types of Genomic Data
• DNA sequences
• Personal SNPs (Single Nucleotide Polymorphisms)
• Programmatic / manual annotation (e.g., SNP combination x could possibly lead to mutation y)
• Gene expression levels
• Proteomic (proteins translated w/SNPs)

The Case for Clinical-Genomics
• Clinical-Genomics: the use of information obtained from DNA sequencing, patterns of gene expression & resulting proteins for healthcare purposes
• Personalized Medicine
– Detect sensitivities/allergies beforehand
– Drug selection by clinicians
• Pharmacogenomics
– Improve drug development based on clinical-genomics correlations
– Personal customization of drugs
• Preventive Care

Gene Expression in Cancer
• Differences between normal tissue vs. premalignant lesion vs. neoplastic tissue
– markers of diagnostic value
– targets for drug research
– evolution of cancer
• Differences between responders vs. nonresponders for a standard therapy
• Development of drug resistance
• Correlation of gene expression patterns with presentation or evolution:
– long vs. short survivors
– metastatic vs. non-metastatic
– clinical or pathological grades

Differential Display
• Differences between the banding patterns of cDNA from tumor tissue and normal tissue on a polyacrylamide gel can point to a protein that could potentially be the target of a therapeutic antibody.
• DNA microarrays are also employed to examine the gene expression of thousands of potential antigens and determine which are present in abnormal (tumor) tissue but not in normal tissue.

Using Databases
• Vast databases of genetic information contribute to genomic research
• Searching for potential antigens can be as easy as an online search
• HLA Database example (part of the IMGT - international immunogenetics project): http://www.ebi.ac.uk/imgt/hla/

Clinical-Genomics Interrelations
Bi-directional relationships:
• Genomics -> Clinical
– Personal SNPs could be interpreted as mutations and thus indicate possible diseases/sensitivities
• Clinical -> Genomics
– Patient & family history leads to a genetic testing order
– Crosschecking of genomics results

SNPs Interpretation
• SNPs as known mutations (might imply the development of diseases)
• Unknown SNPs:
– in significant segments of the gene (possibly implying individual differences)
– in gene segments that translate to inactive parts of the proteins (thought to be insignificant)
• SNPs as normal polymorphisms

CG Uses: From Clinical to Forensic
These pictures show paternity casework autorads (autoradiographs): the left picture shows a case of paternity exclusion and the right one a case of paternity inclusion. Taken from the site of Genelex, a company which offers, among other genomic services, paternity testing (see http://www.genelex.com/).

Variety of Methods
STR (short tandem repeats)
STRs are short sequences that are easy to detect; their specific pattern of repetitions can identify a gene without the need to sequence the entire gene.

HL7 Specs for Clinical-Genomics
• Create a DIM for Clinical-Genomics
• Derive R-MIMs and message types
• Clinical-Genomic Documents (CDA L3!)
• Review / Utilize the following emerging bio-informatics standards
– BSML (Bioinformatic Sequence Markup Language)
– MAGE-ML (Microarray and Gene Expression Markup Language)
Problem: These standards are not necessarily patient-based.

BSML: Sequencing Markup
<Sequence id="_2" db-source="GMS" length="51" representation="raw" molecule="dna" topology="linear" alignment-sequence="_">
  <Feature-tables>
    <Feature-table>
      <Feature title="gms:sequence">
        <Interval-loc startpos="1" endpos="51" />
      </Feature>
      <Feature title="gms:new_fragment">
        <Interval-loc startpos="1" endpos="51" />
      </Feature>
      <Feature title="gms:annotation" value="possible somatic mutation cell line #4 end-11thxml" />
      <Feature title="/gms:new_fragment" />
      <Feature title="/gms:sequence"/>
    </Feature-table>
  </Feature-tables>
  <Seq-data>
    AGGAATCAGAAAGGACACTCTGGACTTCAGCCAACAGGATACCTGAGCTGA
  </Seq-data>
</Sequence>

MAGE-ML: Gene Expression
• Gene Description:
<reporter id="1051_g_at">
  <rep_des V="Source: Human melanoma antigen recognized by T-cells (MART-1) mRNA."
  />
</reporter>
• Gene Expression Levels:
<reporter id="32847_at" accession="U48959">
  <NormalizedIntensity value="0.235" />
  <Control value="230.972" />
  <Raw value="54.3" />
  <T-testPValue value="no replicates" />
  <PresentAbsentCall value="A" />
</reporter>

Analogy to Imaging Integration
HL7-DICOM relationship:
                      IMAGING               GENOMICS
Existing standards    DICOM                 BSML; MAGE; I3C efforts
Mass data             Pixels                Sequences; Gene Expression; Proteins
Summarized data       Radiologist Report    Genomicist Report

Current Experimentations at IBM Research
• A clinical point of view
– Bone-marrow transplantation center in Israel
  • Donor-recipient matching: tissue typing
  • Reporting to international BMT registry
• A research point of view
– Research center in Canada
  • Focusing on heart & lung diseases
  • Trying to find clinical-genomic interrelations
  • Using clinical data from patient records compared with healthy people
  • Using genomic data, mainly gene expression levels and proteins

Collaboration with Hadassah
• Information exchange
– Report to international registries (IBMTR)
• Standardization
– Transform to HL7-CDA documents (L.13)
• Indexing
– Index all data including semi-structured data
• Annotation …agctgaa… SNPs
– Integrating the personal genomic data
• Visualization
– Visualizing the integrated BMT documents

The BMT Procedure
Pre-BMT
– Matching a donor or autologous transplant
– Conditioning
  • Irradiation
  • Chemotherapy
  • GVHD (Graft vs.
Host Disease) Prophylaxis
BMT - Transplant
– Substance donated
  • Bone marrow
  • Peripheral blood stem cells
  • Cord blood stem cells
  • Donor lymphocytes
Post-BMT
– Control of GVHD and other complications
– Hematopoietic Reconstitution
– Engraftment and Chimerism

New Trends in BMT
Mini-allografts (mini-transplantations)
• Immunosuppression instead of total conditioning (destroying the entire immune system)
• Infusing donor lymphocytes to attack tumors, cancerous cells, autoimmune artifacts and infectious pathogens
• Stopping the donor lymphocytes, using 'suicide genes', once they are done with the source of the patient's disease, so that they won't attack the patient's normal cells
• Striking a balance between the 2 immune systems

The HLA-Typing Use Case
• HLA = Human Leukocyte Antigens; they determine the personal fingerprint distinguishing between self and non-self
• HLA-typing methods have moved from serology (antibodies) to molecular (DNA) typing and recently to DNA sequencing, yielding higher levels of typing resolution
• Common triggers: donor-recipient matching, familial relationships, disease association

Donor Matching
• HLA (Human Leukocyte Antigens)
– HLA Typing – DNA typing
– About 6 important loci, each of which can have dozens of different antigens (alleles)
– Haplotype – common set of antigens
• Relatives versus unrelated donation
• Donor banks
• Search engines
• Lack of donors for minorities

HLA Alleles in the Family

Differences in Antigens
Allelic polymorphism is concentrated in the peptide (antigen) binding site:
Class I: variable exons 2, 3, 4
Class II: variable exon 2

The HLA-Typing Triggers
• Donor-Recipient Matching
– Bone-marrow transplant
  • Full match (identical twin)
  • Avoid GVHD and promote GVM
  • Precise and personal match rather than full match
– Organ transplant (cross-match: antibodies)
  • Living donor: also HLA typing
before transplant
  • Select the best treatment for the individual patient-donor matching
  • HLA typing is done for post-transplant info.
• Forensic Scenarios
– Paternity disputes
– Crime suspects (HLA is one component of known genetic markers)

Personal Rather than Full Match
Personal match could be beneficial to new trends in BMT:
• HLA-A & B versus C:
– When there is a match in HLA-A & B, a mismatch in HLA-C might promote GVL (Graft vs. Leukemia)
• Mini-transplants:
– Avoid full match (even when an identical twin is available)

Data of Interest
• Class I allele sequences (all cells):
– HLA-A
– HLA-B
– HLA-C
• Class II allele sequences (certain cells of the immune system):
– HLA-DR (most important)
– HLA-DQ (its contribution is not proven, but it can verify the DR match since there is strong linkage)
– HLA-DP (usually not typed)
• One might sequence only the polymorphic segments (e.g., exon 2 in class II and exons 2-4 in class I; each exon is about 300 nucleotides), because SNPs in other segments are not important to the matching

New Naming Convention
• Letter designates the membrane locus
• Full allele name: eight digits
– First 2 digits define the allele family and, where possible, correspond to the serological family
– Third and fourth digits describe coding variation
– Fifth and sixth digits describe synonymous variation
– Seventh and eighth digits describe variation in introns

Sequencing Data Example: Generic Meta Data
– Local Names: DRB1*110101
– IMGT/HLA No: HLA00756
– Class: II
– Assigned: 01-AUG-1989
– Last Aligned: 17-OCT-2002
– Component Entries: AF029281 AJ297587
– Cell Sequence Derived From: 34A2, FPAF
– Known Ethnic Origin of Cells: Caucasoid
– Length: 801 bps

Sequencing Data Example: DRB1*110101
(IMGT-HLA SEQUENCE DATABASE.htm; figure label: SNPs)

Sequencing Data Example:
SNP-Resulted Protein Sequence
(IMGT-HLA SEQUENCE DATABASE.htm)

Sequencing Data Example: DRB1*110401
(IMGT-HLA SEQUENCE DATABASE2.htm; figure label: SNP)

Sequencing Data Example: SNP-Resulted Protein Sequence
(IMGT-HLA SEQUENCE DATABASE2.htm)

Testing Kit Output Example
- Sample ID
- Name
- Ethnic Group
- Donor/Patient
- Purpose of Test
- Test Date
- Test By
- Comments
Serology Results: HLA A: B: C: DR: DQ:
Kit-specific data: Kit Name, Kit Lot Number, Kit Expires, DNA Extraction, DNA Quality, DNA Concentration, Review Date, Reviewed By
Positive Lanes:

Tissue Typing Report
- Recipient
- Subject
- Specific Alleles
- Record Number
- Molecular Sample
- Date
- Disease
- Patient Result
- Specific Alleles
- Possible combinations
- Siblings
- Unrelated Donors

Search for Unrelated Donor
• Banks of potential donors (volunteers)
• Each donor was tested only for HLA Class I
• When a patient needs a donor:
– The transplant facility searches the donor banks to find a donor (direct access to the donor banks' databases)
– The search is based on Class I matching
– If appropriate donors are found, the searching transplant facility initiates a request to the respective donor banks, asking for Class II typing
– Each approached donor bank forwards the request to the tissue typing lab where the DNA samples reside
– Class II matching results are returned to the searching facility, and the donor with the best match in both Class I & II is approached

Search for Unrelated Donor (flow diagram)
1. Transplant Center (TC) searches for donors: patient Class I HLA -> Donor Banks -> Class I matching donors
2. TC chooses potential donors: request for HLA Class II typing -> Donor Bank -> Tissue Typing Lab (Class II typing) -> Class II matching donors
3. TC chooses best donor

Genomic Data in Clinical Docs
• A DNA testing device – raw DNA sequences
• Reports from service units, e.g., tissue typing, should answer questions such as patient-donor matching, paternity, etc.
• Embedding annotated results received from a DNA lab in a CDA document
• Linking genomic annotations and clinical data (external links?)

Matching Option Notations
• Different notations for coarse-grain results:
– Possibilities from the A24 antigen family could be represented differently by different kits testing the same patient DNA:
  • A*2402101-06/08-11N/13-15/17/18/20-23/25-36N
  • A*2402101-06/08-11N/13-15/17/18/20-23/25-31
– Pair combinations (inherited alleles):
  • Kit A: exact combination
  • Kit B: two possible combinations
  • DRB1*0402 AND DRB1*0408, or DRB1*0404/44 AND DRB1*0414

Report Example – Unrelated Donors
(Figure labels: The Patient, Unrelated Donor 1, Unrelated Donor 2, Unrelated Donor 3)

Class I vs. Class II Antigens
• A 4-digit resolution level is common in class II antigens, as they were discovered more recently
• It is desirable that class I antigens be reported in 4 digits as well, as they are more crucial to BMT success
• 4-digit reporting requires molecular and sequencing procedures
• 4-digit reporting is still not common in class I

Clinical-Genomic Data in CDA?
• What should go into a clinical document (extent of detail)?
• Programmatic and manual annotation at different levels?
• The users of such integrated documents: clinicians? genomicists? patients? Medico-ethical issues!
• HL7-Association semantics that represent the interrelations of clinical-genomics

First Attempts using CDA…
• GMS – Genetic Messaging System
– From the computational biology center at IBM Watson
– Example: integrating the genomic annotation and analysis of personal DNA sequences into the clinical document (CDA format)
<levelone>
  <clinical_document_header>
    <!--header structures per CDA-->
  </clinical_document_header>
  <body>
    <!--clinical content per CDA-->
    <!--GMS merges genomic data here-->
    <gms:dna sequence="2" base="802" locus="1">
    <gms:annotation> possible somatic mutation cell line #4 end-11th </gms:annotation>
    AGGAATCAGAAAGGACACTCTGGACTTCAGCCAACAGGATACCTGAGCTGA...
    <gms:automated_annotation>
  </body>
</levelone>

And the Work Just Begins…
• Use Cases in Detail & Taxonomy
• High-Level CG Model and HL7-DIM
• Messages
• Documents
• Prototyping info. exchange using specs
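
Appendix sketch: allele names and typing resolution
The "New Naming Convention" slide (locus letter, then up to eight digits read in pairs) and the "Class I vs. Class II Antigens" slide (2-digit vs. 4-digit reporting) can be illustrated with a small parser. This is only a sketch under stated assumptions: the names HlaAllele, parse_allele and same_at_resolution are hypothetical and not part of any HL7, IMGT or GMS specification, and suffixes such as "N", kit range notations, and odd-length names like A*2402101 are deliberately out of scope.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed textual form: locus, '*', then 2, 4, 6 or 8 digits (e.g. DRB1*110101).
_ALLELE = re.compile(r"^(?P<locus>[A-Z]+[0-9]*)\*(?P<digits>\d{2,8})$")


@dataclass(frozen=True)
class HlaAllele:
    locus: str                 # e.g. "A", "B", "DRB1"
    family: str                # digits 1-2: allele family
    coding: Optional[str]      # digits 3-4: coding variation
    synonymous: Optional[str]  # digits 5-6: synonymous variation
    intron: Optional[str]      # digits 7-8: variation in introns


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name into the four two-digit fields from the slide."""
    m = _ALLELE.match(name.strip())
    if m is None or len(m.group("digits")) % 2 != 0:
        raise ValueError(f"unsupported allele name: {name!r}")
    digits = m.group("digits")
    fields = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    fields += [None] * (4 - len(fields))  # pad fields that were not reported
    return HlaAllele(m.group("locus"), *fields)


def same_at_resolution(a: HlaAllele, b: HlaAllele, digits: int = 4) -> bool:
    """True if both alleles agree on the first `digits` digits of the same locus.

    An allele typed at a lower resolution than requested never matches,
    because the missing fields are unknown rather than equal.
    """
    if digits not in (2, 4, 6, 8):
        raise ValueError("digits must be 2, 4, 6 or 8")
    fa = (a.family, a.coding, a.synonymous, a.intron)[: digits // 2]
    fb = (b.family, b.coding, b.synonymous, b.intron)[: digits // 2]
    return a.locus == b.locus and None not in fa and fa == fb
```

For the two alleles shown in the sequencing examples, DRB1*110101 and DRB1*110401 belong to the same family (2-digit level) but differ in coding variation, so a 4-digit comparison separates them. Treating an unreported field as "no match" rather than "match" is a conservative design choice for this sketch, not a rule stated in the slides.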