Advances in genetic technologies in the identification of genetic disease in children
Dr Katie Snape
Specialist Registrar in Genetics
St George's Hospital

DNA and the genetic code
• Made up of 4 nucleotides or “bases”
  – A = Adenine
  – T = Thymine
  – C = Cytosine
  – G = Guanine

5’-ATGTGCATGCTAGCT-3’
3’-TACACGTACGATCGA-5’

Genetic variation
• Makes us unique – “polymorphisms”
• Is the basis for evolution
• Is the basis for disease
http://dee-annarogers.com

Genetic variation
• Large scale: detected by CYTOGENETIC ANALYSIS
  – Aneuploidy
  – Structural rearrangements
• Smaller scale: detected by DNA SEQUENCING
  – Base substitutions, e.g. a Single Nucleotide Polymorphism (SNP)
  – Small insertions and deletions

Cytogenetic analysis
• What used to happen…..

Fluorescent In-Situ Hybridisation
• Developmental delay
• Congenital heart disease
• Hypocalcaemia

AND NOW….

Array CGH
• An array is a glass slide onto which thousands of short sequences of DNA (probes) are spotted.

Submicroscopic chromosomal abnormalities
• Contiguous gene syndromes
  – Phenotype conferred by haploinsufficiency or gain of multiple different genes
• Common clinical features
  – Developmental delay
  – Facial dysmorphism
  – Congenital abnormalities

Interpretation
• Copy number variant vs pathogenic mutation
• Parental studies
  – Is the variant de novo?
  – Caution!
    • Is the parent also affected?
    • Is the phenotype variable?
• Genetic material in region
  – Does gain or loss of genes match phenotype?
• Comparison with other children
  – Decipher database

Array CGH
• Making more diagnoses than ever before but…
  – Can lead to clinical uncertainty
  – Do not over-interpret array findings
  – Remember WE ARE ALL INDIVIDUALS

DNA sequencing
• Genomic DNA
• Primer amplification of region of interest
• Cycle sequencing with fluorescently labelled chain terminator ddNTPs
• Capillary Electrophoresis (1 read/capillary)

Sanger sequencing
• 500-600 bp per reaction
• Takes > 1 year to sequence 1 gigabase (1/3 of human genome)
• Costs $0.10 per 1000 bases
• The Human Genome Project took > 10 years
• And now…..

Next Generation Sequencing (NGS)
• Multiple methodological approaches
• In practice….
  – Single molecule sequencing
  – Massively parallel sequencing
• Whole genome sequencing – in a week
• Targeted resequencing – “exome”

Single-molecule sequencing

Massively parallel sequencing
• Fragment DNA
• Amplify DNA fragments of interest
• Sequence DNA fragments in parallel
• Generate data containing 100 bp DNA reads
• Align DNA reads to reference genome
• Identify differences between sample and reference: “Variant calling”

The “Exome”
• The coding part of ~ 20,000 genes
• Most likely to harbour disease-causing mutations

Data Analysis
• 15-20 Gb of data per exome stored
• Files contain sequence reads of ~100 bases
• Need to align reads to reference genome
• Need to call variants seen in an individual sample

Alignment

Variant calling
• Reads = the strands of DNA which are aligned with the reference sequence
• Depth of coverage = number of reads covering a particular region of the exome
  – The deeper the coverage, the more accurate the results
  – Alterations within the middle of a read are more likely real than those at the end of a read

Clinical Applications
• Identification of novel disease genes in Mendelian disorders
• Identification of genetic susceptibility to common and complex disorders
• Rapid sequencing of multiple known genes
  – Diagnostic gene panels
• Guide therapeutics
  – Sequencing of cancer genomes
  – Pharmacogenetics

Identifying Mendelian disease genes
• Per genome ~ 3 million variants per sample
• Per exome ~ 20,000 variants per sample
  – How can we go from 20,000 to 1?
• Genes shared in multiple affected individuals
• Inheritance patterns in a family
• Look for RARE genetic variants
• De novo variants

Diagnostic gene panels
• Genetically heterogeneous disorders
  – Previously, sequential sequencing of genes
  – Time consuming and expensive
• NGS allows all known genes to be sequenced in parallel, e.g. for Noonan syndrome
  – PTPN11, SOS1, RAF1, KRAS, NRAS, BRAF, MEK1, MEK2, HRAS, SHOC2, CBL, SPRED1

Pitfalls
• Variants of uncertain clinical significance
• Incidental findings, e.g. mutations in genes for adult-onset conditions

Conclusions
• Unprecedented opportunities to identify genetic factors influencing disease
• Genetic technologies will become commonplace in diagnostics and therapeutics
• Array CGH and NGS are likely to become first-line diagnostic testing techniques in clinical paediatrics
• We should be cautious of over-interpretation of genetic data
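
Worked example: base pairing
The two strands shown on the "DNA and the genetic code" slide pair A with T and C with G. As a small illustration only, the Python sketch below derives the lower strand from the upper one. The function names are made up for this example and do not come from any sequencing library.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the paired base for each position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand written in its own 5' to 3' direction."""
    return complement(strand)[::-1]


top = "ATGTGCATGCTAGCT"           # 5'-...-3' strand from the slide
bottom = complement(top)           # the slide's 3'-...-5' strand
bottom_5_to_3 = reverse_complement(top)

print(bottom)         # TACACGTACGATCGA
print(bottom_5_to_3)  # AGCTAGCATGCACAT
```

Sequence files conventionally write every strand 5' to 3', which is why the reverse complement, rather than the plain complement, is the form usually seen in practice.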
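
Worked example: depth of coverage and variant calling
The "Variant calling" slide defines depth of coverage as the number of reads covering a position, and notes that differences in the middle of a read are more trustworthy than those at its ends. The toy Python sketch below shows that idea on the 15-base example sequence from the first slide. It assumes reads are already aligned, contain no insertions or deletions, and are given as (start position, sequence) pairs. The thresholds (min_depth, min_fraction, trim_ends) are illustrative assumptions, not values from the talk, and real variant callers use base qualities and statistical models instead.

```python
from collections import Counter, defaultdict


def pileup(reads, trim_ends=0):
    """Count the bases seen at each reference position.

    reads: list of (start, sequence) with a 0-based start and no indels.
    trim_ends: number of bases ignored at each end of every read
               (hypothetical knob reflecting that read ends are less reliable).
    """
    columns = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            if offset < trim_ends or offset >= len(seq) - trim_ends:
                continue
            columns[start + offset][base] += 1
    return columns


def call_variants(reference, reads, min_depth=3, min_fraction=0.3, trim_ends=0):
    """Report (position, reference base, observed base, depth) for each difference."""
    calls = []
    for pos, counts in sorted(pileup(reads, trim_ends).items()):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to trust this position
        for base, n in counts.items():
            if base != reference[pos] and n / depth >= min_fraction:
                calls.append((pos, reference[pos], base, depth))
    return calls


reference = "ATGTGCATGCTAGCT"
reads = [
    (0, "ATGTGCAT"),
    (2, "GTGCTTGC"),  # carries T instead of A at reference position 6
    (4, "GCTTGCTA"),  # carries T instead of A at reference position 6
    (5, "CATGCTAG"),
    (6, "ATGCTAGC"),
]

depth_at_6 = sum(pileup(reads)[6].values())
calls = call_variants(reference, reads)
print(depth_at_6)  # 5
print(calls)       # [(6, 'A', 'T', 5)]
```

Here two of the five reads covering position 6 show T rather than A, so the position is reported. With trim_ends=1 the read starting at position 6 no longer counts there, and the reported depth drops to 4.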
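
Worked example: going from 20,000 variants to 1
The "Identifying Mendelian disease genes" slide lists the filters used to narrow an exome: rare variants, de novo variants, and genes shared by several affected individuals. The Python sketch below strings those three filters together for child-mother-father trios. Everything in it is hypothetical: the Variant record, the gene names (GENE_A to GENE_D), the frequency cut-off of 0.001 and the requirement of two affected children are assumptions chosen for illustration, not figures from the talk.

```python
from collections import Counter
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str                     # placeholder gene name
    change: str                   # description of the DNA change
    population_frequency: float   # fraction of the general population carrying it


def rare_de_novo(child, mother, father, max_frequency=0.001):
    """Keep the child's variants that are rare and absent from both parents."""
    parental = {(v.gene, v.change) for v in mother} | {(v.gene, v.change) for v in father}
    return [
        v for v in child
        if v.population_frequency <= max_frequency
        and (v.gene, v.change) not in parental
    ]


def shared_candidate_genes(trios, min_children=2, max_frequency=0.001):
    """Genes with a rare de novo variant in at least min_children affected children."""
    counts = Counter()
    for child, mother, father in trios:
        genes = {v.gene for v in rare_de_novo(child, mother, father, max_frequency)}
        counts.update(genes)  # each child contributes at most once per gene
    return sorted(gene for gene, n in counts.items() if n >= min_children)


common = Variant("GENE_A", "c.100A>G", 0.20)       # polymorphism, too common
inherited = Variant("GENE_B", "c.50C>T", 0.0001)   # rare but present in a parent

trio_1 = (
    [common, inherited, Variant("GENE_C", "c.7G>A", 0.0)],  # child
    [common, inherited],                                     # mother
    [common],                                                # father
)
trio_2 = (
    [common, Variant("GENE_C", "c.19del", 0.0), Variant("GENE_D", "c.3T>C", 0.0)],
    [common],
    [],
)

candidates = shared_candidate_genes([trio_1, trio_2])
print(candidates)  # ['GENE_C']
```

Matching is done per gene rather than per exact variant, because unrelated affected children often carry different mutations in the same gene. Any gene that survives such a filter is only a candidate and still needs clinical interpretation.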