Teaching Biology Through Bioinformatics
Real world genomics research in your classroom

Kim B. Foglia
Division Ave. High School, Levittown

Stuart M. Brown, PhD
Director of Research Computing, NYU School of Medicine

AP Biology 2004-2005

Bioinformatics
  Use of computers to analyze genomic data for molecular biology research
    basic biological research
    medical research
    clinical medicine application
    taxonomy

Bioinformatics 2005-2006

Bioinformatics is a great way to learn biology
  Using computers to study primary biological data (genomes, proteins, other databases)
    students learn biology as a dynamic process of interpreting complex data
    students can reproduce current scientific work & ask new questions
    science is no longer just a collection of facts in a textbook, it’s a process of inquiry

Teaching Biology with Bioinformatics
  Bioinformatics tools lead directly to insights about genes, proteins, evolution
    “Hey! Most human genes have more bases of introns than they do exons.”
    “Hemoglobin sequences show that seals are closer to weasels than they are to whales.”
    “Protein shapes determine their function, so small changes can make a big difference.”

All you need…
  Students can work on bioinformatics questions at low cost
    only need Internet-connected computers
    most database tools are free on Internet
    unlimited data
      GenBank, protein structures, mutations, microarrays, etc.
    teacher knowledge & “comfort”
    questions to answer

What we offer…
  Teaching modules
    inquiry labs
    student & teacher lab handouts
    supporting teaching resources
      PowerPoint presentations
  Supplemental skills modules
  Download from Web site
    http://bio.kimunity.com

Modules in development
  Testing for Sickle Cell Anemia
    develop a genetic diagnostic test to screen for disease
  Endosymbiosis
    evolutionary history of mitochondria & chloroplasts
  Hemophilia Gene Therapy
    build a vector and insert correct gene
  Are Seals and Whales Related?
    studying evolutionary relationships

Using Bioinformatics in Medicine
Sickle Cell Anemia & the Hemoglobin Gene

Sickle Cell Anemia
  Most common genetic disease in US
    high incidence in African-Americans
    affects red blood cell structure & function
    single base mutation causing single amino acid change
      SNP = single nucleotide polymorphism

Symptoms
  Anemia
    jaundice, fatigue, paleness, shortness of breath
  Hypoxia (low oxygen) & capillary damage
    severe pain in organs & joints
    retinal damage (blindness)
  Delayed growth
    delayed puberty, stunted growth
  Infections
    more susceptible
    depressed immune system
    death from bacterial infections
  Stroke
    blocked small blood vessels in brain
    primarily in children

Sickle cell hemoglobin
  mutant hemoglobin (Hb S)

Cell biology
  Hb S molecules stick together
    form fibers under low blood oxygen levels
    deoxyhemoglobin sickles
    distortion of cells from normal round to sickle shape

Genetics
  Sickle cell mutation
    Hb S
    changes 6th amino acid of β hemoglobin chain
    normal glutamic acid → valine
  Recessive allele
    heterozygote
      Hb AS, normal, but carrier
    homozygote recessive
      Hb SS, sickle cell disease
  2 sickle cell carriers mate…
    each child has 1/4 chance of having the disease

              Hb A    Hb S
      Hb A    HbAA    HbAS
      Hb S    HbAS    HbSS

Prevalence in U.S.
  Carriers
    ~2 million Americans carry sickle cell trait
    1 in 14 African-Americans
  Disease
    ~72,000 Americans have disease
    ~1 in every 700 African-American babies born in U.S.
    has sickle cell disease

The Malaria Connection
  Sickle cell disease is surprisingly common for a potentially lethal genetic disease
  Heterozygote advantage
    heterozygotes are tolerant of malaria infection & do not suffer symptoms of sickle cell disease

Malaria

Prevalence of Malaria
Prevalence of Sickle Cell Anemia
  ~sickle cell movie~

Public health
  Many carriers of this mutant allele are not aware that they have it
    at risk of having children with the disease
  DNA test for sickle cell allele would benefit public health
    genetic counseling
    pre-natal testing

Your Assignment
  Develop a simple inexpensive DNA test for sickle cell allele
    develop DNA probe
      test for presence of sickle cell mutation
    use bioinformatics tools
      online databases of DNA sequences
        UCSC Genome Browser
      probe design tool
        Primer3

DNA review
  DNA double helix
    A–T, C–G
    base pair bonds can be broken by heating to 100°C
      separate strands
      denature, or melt

DNA probes
  Probe
    short, single stranded DNA molecule
    mix with denatured DNA
  DNA Hybridization
    probe bonds to complementary DNA sequence
  Label
    probe is labeled for easy detection
  [Figure: labeled probe G A T C A G T A G base-paired with genomic DNA C T A G T C A T C; strand ends marked 3’ and 5’]

Designing Probes
  Allele specific probes
    probes require matched sequences
    can detect single base differences in alleles
    single mis-matched base near middle of probe greatly reduces hybridization efficiency
  [Figure: labeled probe with one mismatched base (X) against genomic DNA C T A G T C A T C; strand ends marked 3’ and 5’]

Dot blot
  Genomic DNA
    denature DNA
    bind DNA from cells on filter paper
  DNA hybridization
    wash probe over filter paper
    if complementary sequence present, probe binds to genomic DNA
    expose on X-ray film
      dark spots show bound probe

Get hemoglobin sequence
  UCSC Genome Browser
    human genome database
    http://genome.ucsc.edu/
  UCSC Genome Browser home page
    click on link to Genome Browser
    in genome pulldown menu, choose “Human”
    for position text box, type “HBB” (hemoglobin β)
    hit “submit”

Genome Browser Results
  Listing of genes & sequences in database
    Click on “RefSeq” gene for HBB (NM_000518)

Chromosome view
  Position of HBB in genome
    at base 5.2 million on chromosome 11

Change view of chromosome
  Move & zoom tools
    zoom out ~30x to see more of chromosome 11

More Hb genes
  Cluster of hemoglobin genes on chromosome 11
    HBD, HBG1, HBG2 & HBE1
    what are these genes?

Get the DNA sequence
  Click on the HBB RefSeq gene
    HBB RefSeq summary page

HBB RefSeq gene summary page
  Click on “Genomic Sequence from assembly”

Formatting the sequence
  Sequence Formatting Options
    “exons in upper case, everything else in lower case”
    hit “submit”
  Genomic DNA
    lower case = introns
      spliced out of mRNA before translation
    upper case = exons
      translated into polypeptide chain

HBB DNA sequence

>hg16_refGene_NM_000518 range=chr11:5211005-5212610 5'pad=0 3'pad=0 revComp=TRUE
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACC
ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGG
CAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGgttggtat
caaggttacaagacaggtttaaggagaccaatagaaactgggcatgtgga
gacagagaagactcttgggtttctgataggcactgactctctctgcctat
tggtctattttcccacccttagGCTGCTGGTGGTCTACCCTTGGACCCAG
AGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG
CAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTG
ATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGT
GAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGgtgag
tctatgggacgcttgatgttttctttccccttcttttctatggttaagtt
catgtcataggaaggggataagtaacagggtacagtttagaatgggaaac
agacgaatgattgcatcagtgtggaagtctcaggatcgttttagtttctt
ttatttgctgttcataacaattgttttcttttgtttaattcttgctttct
ttttttttcttctccgcaatttttactattatacttaatgccttaacatt
gtgtataacaaaaggaaatatctctgagatacattaagtaacttaaaaaa
aaactttacacagtctgcctagtacattactatttggaatatatgtgtgc
ttatttgcatattcataatctccctactttattttcttttatttttaatt
gatacataatcattatacatatttatgggttaaagtgtaatgttttaata
tgtgtacacatattgaccaaatcagggtaattttgcatttgtaattttaa
aaaatgctttcttcttttaatatacttttttgtttatcttatttctaata
ctttccctaatctctttctttcagggcaataatgatacaatgtatcatgc
ctctttgcaccattctaaagaataacagtgataatttctgggttaaggca
atagcaatatctctgcatataaatatttctgcatataaattgtaactgat

  first 50 bases are untranslated “leader” sequence
  actual protein coding sequence starts at base 51
    starting with letters ATG

Get the mutant sequence
  Sickle cell mutation
    single base mutation
    6th amino acid: glutamic acid → valine
    need DNA sequence to design probe
  SNPs
    single nucleotide polymorphisms
    “variations and repeats” section: pack

SNPs of HBB gene
  several SNPs of HBB gene
    need mutation in exon
    near beginning of HBB protein
    rs334 = Hb S mutation

rs334
  Hb S sickle cell mutation
    “Sequence in Assembly” = normal sequence
    “Alternate Sequence” = sickle cell sequence

Align Hb A & Hb S sequences
  Line up sequences

    Normal:  catggtgcacctgactcctgAggagaagtctgccgttactg
    HBB:      ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGG
    Mutant:  catggtgcacctgactcctgTggagaagtctgccgttactg

  sequence fragment is enough to design DNA probes for normal & mutant sequences

Designing the probe
  Primer3
    free on Web from MIT
    http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi
    powerful tool for primer design
    paste in sequence fragment

Allele specific probes
  Need 2 probes
    normal allele probe
    sickle cell allele probe
    choose hybridization probes
  Customize probes
    12-16 bases
    40°-60°C
    longer probes are stable at higher temperatures

Your probes…
  Ready to order!
    Place an order at your local DNA lab!
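
The alignment and probe-design steps above can be checked by hand or with a few lines of code. The following is a minimal sketch, not the Primer3 procedure: it finds the base that differs between the normal and mutant fragments shown on the alignment slide, cuts a probe window centered on that base, and estimates a melting temperature with the Wallace rule of thumb (2°C per A or T, 4°C per G or C). The function names, the 15-base window, and the choice of the Wallace rule are assumptions made for illustration; Primer3 estimates melting temperature with its own thermodynamic model, so its numbers will differ.

```python
# Sketch only: helper names and the 15-base window are illustrative choices,
# not part of the original lab. Sequences are copied from the alignment slide.
NORMAL = "catggtgcacctgactcctgAggagaagtctgccgttactg".upper()
MUTANT = "catggtgcacctgactcctgTggagaagtctgccgttactg".upper()

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(seq_a, seq_b):
    """Return 0-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def centered_window(seq, center, length=15):
    """Cut `length` bases from seq with position `center` in the middle."""
    start = center - length // 2
    if start < 0 or start + length > len(seq):
        raise ValueError("window does not fit inside the sequence")
    return seq[start:start + length]


def reverse_complement(seq):
    """Return the strand that base-pairs with seq, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in °C for a short probe (Wallace rule)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    (snp,) = mismatch_positions(NORMAL, MUTANT)  # expect exactly one difference
    for label, seq in (("normal (Hb A)", NORMAL), ("sickle (Hb S)", MUTANT)):
        probe = centered_window(seq, snp)
        print(label, probe, "binds strand", reverse_complement(probe),
              "Tm ~", wallace_tm(probe), "C")
```

Placing the variable base in the middle of the window follows the "Designing Probes" slide: a mismatch near the middle of a probe reduces hybridization most. Both 15-base probes fall inside the 12-16 base and 40°-60°C ranges given on the "Allele specific probes" slide under this rough estimate.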

Extra credit
Advanced Assignments

Advanced Assignment #1
  Use the Web to research other “allele specific” genotyping methods
    ligase chain reaction
    primer extension
    TaqMan
  Design probes for one of these alternate technologies

Advanced Assignment #2
  PCR & Restriction Digest
    pre-natal testing
      for small samples it is necessary to use PCR to amplify the amount of genomic DNA before testing
    once you have a PCR-amplified DNA fragment of a gene, a restriction enzyme may be able to distinguish between alleles
  design PCR primers & find restriction enzyme that will locate sickle cell allele
    design with Primer3

Restriction enzymes
  NEBcutter
    http://tools.neb.com/NEBcutter2
    New England BioLabs
    screens DNA sequence against all restriction enzymes
  Webcutter
    similar program
    http://www.firstmarket.com/cutter/cut2.html

NEBcutter

Advanced Assignment #3
  Population genetics
    determine if sickle cell allele is in Hardy-Weinberg equilibrium in the U.S. African-American population
      ~2 million Americans carry sickle cell trait
      1 in 14 African-Americans is a carrier
      ~1 in every 700 African-American babies born in U.S. has sickle cell disease
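
One way to approach Advanced Assignment #3 is to work backward from the disease frequency. Under Hardy-Weinberg equilibrium the genotype frequencies are p² (Hb AA), 2pq (Hb AS) and q² (Hb SS), with p + q = 1. The sketch below assumes the two figures on the slide (1 in 700 births with disease, 1 in 14 carriers) are the only inputs and treats them as exact; it illustrates the arithmetic and is not a statistical test. The function and variable names are made up for this example.

```python
import math

# Figures taken from the slide; treated here as exact population frequencies.
OBSERVED_DISEASE = 1 / 700   # Hb SS frequency, q squared under Hardy-Weinberg
OBSERVED_CARRIERS = 1 / 14   # Hb AS frequency, 2pq under Hardy-Weinberg


def expected_from_disease(disease_freq):
    """Return (p, q, expected carrier frequency 2pq) implied by q^2 = disease_freq."""
    q = math.sqrt(disease_freq)   # frequency of the Hb S allele
    p = 1.0 - q                   # frequency of the Hb A allele
    return p, q, 2 * p * q


p, q, expected_carriers = expected_from_disease(OBSERVED_DISEASE)

if __name__ == "__main__":
    print(f"Hb S allele frequency q = {q:.4f}")
    print(f"Hb A allele frequency p = {p:.4f}")
    print(f"expected carriers 2pq   = {expected_carriers:.4f} "
          f"(about 1 in {1 / expected_carriers:.1f})")
    print(f"observed carriers       = {OBSERVED_CARRIERS:.4f} (1 in 14)")
```

Students can then discuss whether the expected and observed carrier frequencies are close enough to be consistent with equilibrium, and which Hardy-Weinberg assumptions (no selection, random mating, no migration) are open to question for this allele.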