Bio/CS 251 Bioinformatics
Homework 3 KEY
Due date: Monday, 2/5/07
20 points (Part 1)

1. A single strand of bacterial DNA contains the base sequence

         -35                    -10        +1
5’ CGTGTATTGACACTGGTGAGCCACTATCGTATATTCCCTAAGTGAGTATTGG 3’
3’ GCACATAACTGTGACCACTCGGTGATAGCATATAAGGGATTCACTCATAACC 5’
mRNA:                                   5’ AGUGAGUAUUGG... 3’

(-35 element: TTGACA; -10 element: TATATT; +1: the A at which the mRNA begins)

a. What is the complementary sequence? Draw or type this sequence just below and indicate its polarity.
   See above
b. Could this stretch of DNA, or part of this stretch, be transcribed in bacteria? If so, find and label the promoter elements and the start site for transcription.
   See above
c. Under the double-stranded DNA sequence, draw or type the RNA sequence that could be transcribed, and indicate its polarity. Note: show only the sequence of RNA that would be transcribed.
   See above
d. Which strand of the DNA serves as the coding strand (sense strand), and which serves as the template strand (antisense strand), for the synthesis of the RNA transcript for this hypothetical gene fragment?
   The coding strand is the “mRNA-like” strand, i.e., the top strand.
   The template strand is the “mRNA-unlike” strand, i.e., the bottom strand.

2. Examine the β-globin gene sequence in Slide #24 of today’s Powerpoint lecture (Bio 251 Central Dogma 06.PPT)

a. Is this a DNA or RNA sequence? Why?
   RNA sequence, because uracil (U) is present instead of thymine (T).
b. Does this sequence show a primary transcript, or a mature transcript? Explain.
   It shows a primary transcript, or heterogeneous nuclear RNA (hnRNA). This immature transcript contains introns that have not yet been spliced out.
c. What three steps must occur to form a mature mRNA?
   - 5’ cap must be added
   - 3’ poly(A) tail must be added
   - introns must be spliced out
d. How many exons and introns are found in the human β-globin gene?
   Three exons and two introns
e. Which end of the gene would contain the promoter sequence?
   5’ end
f. Does the sequence shown in the slide include a promoter sequence? Why or why not?
   This sequence cannot contain a promoter, because promoters are not transcribed and so are not present in the RNA. The promoter lies upstream of the first base of the mRNA.
g. What is the sequence of the stop codon for translation of the β-globin mRNA?
   UAA
h. Does the sequence shown in the slide include a 5’ UTR? A 3’ UTR? If so, how long is each of these sequences?
   5’ UTR: 50 nt long, preceding the start codon
   3’ UTR: 133 nt long, after the stop codon
i. What 2-base sequence serves as the boundary between an intron and an exon at the beginning of an intron? At the end of an intron?
   The 5’ splice junction of an intron always begins with GT.
   The 3’ splice junction of an intron always ends with AG.
j. What is unusual about the arginine codon that specifies amino acid #30 in the β-globin gene sequence? Explain how this codon differs from the arginine codon that specifies amino acid #104.
   Both Arg residues are specified by the codon AGG. Arg #30, however, is encoded by the last two bases of exon 1 plus the first base of exon 2, whereas Arg #104 is encoded by the last three bases of exon 2. In other words, the reading frame of a gene may be split in the middle of a codon by an intron... pretty cool, huh?!

3. A gene in the chicken that codes for a specific egg protein was isolated. In one set of experiments, the duplex DNA was denatured and allowed to hybridize with mature processed mRNA, which was isolated from cells that express the protein. The electron microscope revealed heteroduplex molecules formed between the mRNA and single DNA strands. Such a heteroduplex is shown here. Answer the following questions regarding this heteroduplex:

a. Is the DNA represented by the dark or light strand? Explain.
   DNA is represented by the dark strand. The mRNA, represented by the light strand, has undergone processing, and introns have been excised from it. These sequences are present in the DNA and form loops in the hybrid molecule because no complementary sequences are present in the mRNA to permit pairing.
b. How many introns appear to be present?
   7 introns and 8 exons
c. The longer strand shows an unpaired segment at each end. What do these represent? In other words, what elements of the gene would be found in each of these regions?
   At one end is the promoter region, which is not transcribed. At the other end is a segment containing terminator sequences that are not found in the mRNA.
d. The shorter strand shows an unpaired segment at one end. Is this unpaired strand at the 5’ or 3’ end of the gene? What does this unpaired strand represent?
   This represents the poly(A) tail at the 3’ end of the mRNA strand. It is added after transcription is completed, and there is no complement to it in the template strand.

4. Into how many different mRNAs can an hnRNA (a primary RNA transcript) be processed if it has three exons and if all of its 5’ splice junctions can be used with any downstream 3’ splice junction? Diagram the information content of each mRNA.
   It is possible to combine any 5’ splice junction (GT) with any downstream 3’ splice junction (AG) to perform a splicing event. Since there are only two 5’ and two 3’ splice junctions, the number of alternatively spliced transcripts is limited to only two, as shown above and below the primary transcript in the following diagram:

   mRNA:                1 - 3
   Primary transcript:  1 - 2 - 3
   mRNA:                1 - 2 - 3

5. How frequently would you expect to find the sequence of nucleotides provided below in a DNA molecule simply as the result of random chance? Assume that each of the four nucleotides occurs with the same frequency.
   5’ – G G A T C G T A G C C T A – 3’
   1/4^13 = once in every 6.71 x 10^7 nucleotides, i.e., once in every 67 million base pairs.

6. How many nucleotides long would a DNA sequence need to be in order for it not to be found by chance more than once in a genome whose size is 3 x 10^9 base pairs?
   16 nt long
   1/4^16 = one occurrence in every 4.3 x 10^9 nt

7. a. What sequence of amino acids would the following RNA sequence code for if it were to be translated by a ribosome?
   5’ AUG GGA UGU CGC CGA AUA 3’
      MET GLY CYS ARG ARG ILE
b. What sequence of amino acids would it code for if the first nucleotide were deleted and another ‘A’ were added to the 3’ end?
   5’ UGG GAU GUC GCC GAA UAA 3’
      TRP ASP VAL ALA GLU Stop

This is Part 2 and is worth 10 points. The following exercise will be due on Tuesday, Feb. 6 at 5 pm.

8. Using bioinformatic tools to locate information about gene structure:
The gene/protein mutS/MSH2 is universally conserved across all species, because it serves an essential function in DNA repair for all organisms. Using Entrez on the NCBI homepage (http://www.ncbi.nlm.nih.gov/entrez/), find the entries for the mutS/MSH2 gene from the following three species. (To find these genes easily, use the “Search” box, and choose a “Gene” search.)

Prokaryotic (bacteria):
   Staphylococcus aureus [Hint: search “mutS Staphylococcus aureus”]
Eukaryotic:
   Homo sapiens (human) [Hint: search “hMSH2 Homo sapiens”]
   Aspergillus nidulans (bread mold fungus) [Hint: search “mutS Aspergillus nidulans”, and study the first gene entry at the top of the list]

Do the following:

1. Print out the annotation page (Entrez Gene page) for each of the three gene versions.
   Here are the first several lines of each Entrez Gene page:

   1: mutS DNA mismatch repair protein mutS [ Staphylococcus aureus subsp. aureus USA300 ]
      GeneID: 3913979 updated 27-Jan-2007
   1: MSH5 mutS homolog 5 (E. coli) [ Homo sapiens ]
      GeneID: 4439 updated 20-Jan-2007
   1: AN5006.2 hypothetical protein [ Aspergillus nidulans FGSC A4 ]
      GeneID: 2872806 updated 01-Feb-2007

2. Compare the lengths of the mutS/MSH2 genomic and mRNA sequences for each species. (Why does the S. aureus annotation not include an mRNA sequence, and what will you use instead?) Construct a table to record this information in a clear and systematic manner.

                       Gene        mRNA
   S. aureus           2523 nt     2523 nt
   A. nidulans         6215 nt     4935 nt (XM_657518 4935 bp mRNA)
   H. sapiens MSH5     28811 nt    a. 2903 nt (NM_002441 2903 bp mRNA)
                                   b. 2796 nt (NM_025259 2796 bp mRNA)
                                   c. 2748 nt (NM_172165 2748 bp mRNA)
                                   d. 2745 nt (NM_172166 2745 bp mRNA)

   (If instead you used H. sapiens hMSH2, then you received full credit as well.)
   The human gene can be transcribed to produce 4 similar mRNA variants, as shown above.

3. Record the numbers and lengths of the exons and introns for each gene. The easiest way to get this information is as follows:
   a. From the Entrez Gene page, go to the “Display” pull-down box, and in place of “Full Report”, select “Gene Table”. Presto! You have all of the information you need to answer this question.

4. Evolution: given that the mutS/MSH2 gene codes for the same protein in each species, how can you account for the dramatic differences in length of these genes? For example, what is revealed by comparing the gene lengths in a prokaryote, a simple eukaryote, and a complex eukaryote? Do you think that introns are ancestral and have been lost over the course of evolution in some groups, or do you think that introns are recent and were gained during the evolution of more complex eukaryotes?
   Answer: Some of both, probably. Organisms ancestral to modern prokaryotes likely had some introns, and modern prokaryotes lost introns due to selective pressures that favored streamlining during the course of their evolution. As organisms evolved greater and greater complexity, it seems they gained more and longer introns. In other words, greater complexity (multicellularity, cell and tissue specialization) was accompanied by a corresponding proliferation and expansion of introns, along with the invention of more and different genes and more ways of splicing a single gene to create variant transcripts and proteins.

Here are the GenBank Full entries for each gene:

S. aureus 2523 nt

LOCUS       NC_007793               2523 bp    DNA     linear   BCT 26-JAN-2007
DEFINITION  Staphylococcus aureus subsp. aureus USA300, complete genome.
ACCESSION   NC_007793 REGION: 1306747..1309269
VERSION     NC_007793.1  GI:87159884
PROJECT     GenomeProject:16313
KEYWORDS    .
SOURCE      Staphylococcus aureus subsp. aureus USA300
  ORGANISM  Staphylococcus aureus subsp. aureus USA300
            Bacteria; Firmicutes; Bacillales; Staphylococcus.
REFERENCE   1  (bases 1 to 2523)
  AUTHORS   Diep,B.A., Gill,S.R., Chang,R.F., Phan,T.H., Chen,J.H., Davidson,M.G., Lin,F., Lin,J., Carleton,H.A., Mongodin,E.F., Sensabaugh,G.F. and Perdreau-Remington,F.
  TITLE     Complete genome sequence of USA300, an epidemic clone of community-acquired meticillin-resistant Staphylococcus aureus
  JOURNAL   Lancet 367 (9512), 731-739 (2006)
   PUBMED   16517273
REFERENCE   2  (bases 1 to 2523)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (11-FEB-2006) National Center for Biotechnology Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3  (bases 1 to 2523)
  AUTHORS   Diep,B.A., Gill,S.R., Chang,R.F., Phan,T.H., Chen,J.H., Davidson,M.G., Lin,F., Lin,J., Carleton,H.A., Mongodin,E.F., Sensabaugh,G.F. and Perdreau-Remington,F.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-JAN-2006) Department of Medicine, University of California, San Francisco, San Francisco, CA 94143, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final NCBI review. The reference sequence was derived from CP000255.
            COMPLETENESS: full length.
            FEATSTATS placeholder
FEATURES             Location/Qualifiers
     source          1..2523
                     /organism="Staphylococcus aureus subsp.
aureus USA300" /mol_type="genomic DNA" /strain="USA300; FPR3757" /sub_species="aureus" /db_xref="taxon:367830" gene 1..2523 /gene="mutS" /locus_tag="SAUSA300_1188" /db_xref="GeneID:3913979" CDS 1..2523 /gene="mutS" /locus_tag="SAUSA300_1188" /note="identified by match to protein family HMM PF00488; match to protein family HMM PF01624; match to protein family HMM PF05188; match to protein family HMM PF05190; match to protein family HMM PF05192; match to protein family HMM TIGR01070" /codon_start=1 /transl_table=11 /product="DNA mismatch repair protein mutS" /protein_id="YP_493885.1" /db_xref="GI:87160655" /db_xref="GeneID:3913979" >gi|87159884:1306747-1309269 Staphylococcus aureus subsp. aureus USA300, complete genome ATGTTTTATGAAGATGCCAAGGAGGCATCACGTGTACTTGAAATTACTTTAACTAAAAGAGATGCTAAAA AAGAAAATCCAATTCCGATGTGTGGTGTTCCGTATCATTCTGCAGATAGTTATATAGATACACTTGTTAA TAATGGATATAAAGTAGCTATTTGTGAACAGATGGAAGATCCGAAACAAACGAAAGGTATGGTTAGACGT GAGGTAGTAAGAATTGTGACTCCAGGAACTGTGATGGAGCAAGGTGGTGTAGATGATAAACAAAATAACT ATATTTTAAGTTTTGTTATGAATCAACCTGAAATTGCGCTTAGTTACTGTGATGTTTCTACTGGCGAATT AAAGGTTACACATTTTAATGATGAAGCGACTTTATTAAATGAAATTACGACGATAAACCCTAACGAAGTT GTTATCAATGACAATATTTCCGATAATTTAAAAAGACAAATTAATATGGTGACAGAAACAATAACAGTCA GGGAAACGTTATCATCAGAAATCTATAGTGTGAATCAAACTGAACATAAATTAATGTATCAAGCGACACA ATTATTGCTAGATTATATTCATCATACACAAAAACGTGATTTATCGCATATCGAGGATGTTGTTCAATAT GCAGCTATAGATTATATGAAAATGGATTTTTATGCTAAGAGAAACCTTGAGTTAACGGAAAGCATTCGAT TAAAATCAAAAAAAGGAACGCTACTTTGGCTAATGGACGAAACGAAAACACCAATGGGAGCACGCCGCTT AAAACAATGGATAGATAGACCACTAATAAGTAAAGAACAAATTGAAGCACGATTAGATATCGTTGATGAA TTTAGTGCTCATTTCATAGAAAGAGACACCTTAAGAACATATCTTAATCAAGTGTATGATATTGAACGTC TTGTTGGGCGTGTTAGTTACGGAAATGTTAATGCGAGAGATTTAATTCAACTTAAACATTCCATTTCTGA AATACCGAATATTAAAGCATTACTAAATTCTATGAATCAGAATACTCTTGTACAAGTTAATCAACTAGAA CCCCTTGATGATTTACTTGATATATTAGAACAGAGTTTAGTAGAAGAACCACCAATTTCAGTTAAAGATG GCGGACTATTCAAAGTTGGTTTTAATACGCAATTAGATGAATATCTTGAAGCTTCAAAAAACGGAAAAAC 
ATGGTTAGCAGAATTACAAGCCAAAGAAAGACAACGTACAGGAATAAAATCATTGAAAATAAGCTTTAAT AAAGTGTTTGGTTATTTTATAGAAATAACACGTGCCAACTTGCAAAATTTTGAACCAAGTGAATTTGGTT ATATGAGGAAGCAAACGTTATCGAATGCTGAACGTTTTATAACTGATGAACTTAAAGAAAAAGAAGATAT CATTTTAGGTGCGGAAGACAAAGCCATCGAATTAGAATATCAATTATTTGTTCAGCTACGTGAAGAAGTT AAAAAATATACTGAACGTTTACAACAACAAGCTAAAATTATTTCAGAGCTAGATTGTTTACAGAGCTTTG CAGAAATTGCTCAAAAATATAATTACACTAGGCCTTCATTTAGTGAAAATAAAACATTAGAATTAGTGGA ATCTAGGCACCCAGTAGTGGAAAGAGTAATGGATTATAATGACTATGTGCCTAATAATTGTCGATTAGAT AATGAAACATTTATATATTTAATTACAGGTCCGAATATGTCTGGTAAATCGACATATATGAGACAAGTTG CCATAATTAGTATAATGGCCCAAATGGGAGCTTATGTCCCTTGTAAAGAGGCAGTGTTACCTATATTTGA TCAAATATTCACTAGAATAGGTGCGGCAGATGATTTGGTTTCAGGTAAGAGTACGTTTATGGTAGAAATG CTAGAAGCACAAAAGGCATTAACTTATGCAACAGAGGATAGTTTGATTATTTTCGATGAAATTGGACGTG GTACTTCAACGTATGACGGTTTAGCTTTAGCGCAGGCAATGATAGAGTATGTAGCTGAAACATCACATGC TAAAACGTTATTTTCAACACATTATCATGAATTGACAACATTAGATCAAGCATTACCAAGTCTAAAAAAT GTTCACGTCGCTGCTAATGAATATAAAGGTGAACTTATATTCTTGCATAAAGTCAAAGATGGTGCAGTTG ACGATAGTTATGGTATTCAAGTTGCGAAATTAGCTGATTTACCTGAAAAAGTTATTAGCAGAGCACAAGT GATTCTAAGCGAGTTTGAAGCGTCTGCTGGTAAAAAATCATCGATATCAAATTTAAAAATGGTCGAAAAT GAACCTGAAATTAATCAAGAAAATTTAAACTTAAGTGTTGAAGAAACAACTGATACTTTATCTCAAAAAG ACTTTGAACAAGCATCATTTGATTTGTTTGAAAATGATCAAGAAAGCGAGATTGAACTACAAATTAAAAA TTTGAATTTATCTAATATGACACCAATTGAGGCATTGGTGAAGTTAAGTGAATTACAAAATCAATTAAAA TAG A.nidulans6215 nt LOCUS DEFINITION NT_107011 6215 bp DNA linear CON 31-JAN-2007 Aspergillus nidulans FGSC A4 chromosome III scaffold_5, whole ACCESSION VERSION KEYWORDS SOURCE ORGANISM REFERENCE AUTHORS TITLE JOURNAL PUBMED REFERENCE CONSRTM TITLE JOURNAL REFERENCE AUTHORS genome shotgun sequence. NT_107011 REGION: 2170653..2176867 NT_107011.1 GI:50058543 WGS. Aspergillus nidulans FGSC A4 Aspergillus nidulans FGSC A4 Eukaryota; Fungi; Ascomycota; Pezizomycotina; Eurotiomycetes; Eurotiales; Trichocomaceae; Emericella. 
REFERENCE   1  (bases 1 to 6215)
  AUTHORS   Galagan,J.E., Calvo,S.E., Cuomo,C., Ma,L.J., Wortman,J.R., Batzoglou,S., Lee,S.I., Basturkmen,M., Spevak,C.C., Clutterbuck,J., Kapitonov,V., Jurka,J., Scazzocchio,C., Farman,M., Butler,J., Purcell,S., Harris,S., Braus,G.H., Draht,O., Busch,S., D'Enfert,C., Bouchier,C., Goldman,G.H., Bell-Pedersen,D., Griffiths-Jones,S., Doonan,J.H., Yu,J., Vienken,K., Pain,A., Freitag,M., Selker,E.U., Archer,D.B., Penalva,M.A., Oakley,B.R., Momany,M., Tanaka,T., Kumagai,T., Asai,K., Machida,M., Nierman,W.C., Denning,D.W., Caddick,M., Hynes,M., Paoletti,M., Fischer,R., Miller,B., Dyer,P., Sachs,M.S., Osmani,S.A. and Birren,B.W.
  TITLE     Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae
  JOURNAL   Nature 438 (7071), 1105-1115 (2005)
   PUBMED   16372000
REFERENCE   2  (bases 1 to 6215)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (07-JUL-2004) National Center for Biotechnology Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3  (bases 1 to 6215)
  AUTHORS   Birren,B., Nusbaum,C., Abebe,A., Abouelleil,A., Adekoya,E., Ait-zahra,M., Allen,N., Allen,T., An,P., Anderson,M., Anderson,S., Arachchi,H., Armbruster,J., Bachantsang,P., Baldwin,J., Barry,A., Bayul,T., Blitshsteyn,B., Bloom,T., Blye,J., Boguslavskiy,L., Borowsky,M., Boukhgalter,B., Brunache,A., Butler,J., Calixte,N., Calvo,S., Camarata,J., Campo,K., Chang,J., Cheshatsang,Y., Citroen,M., Collymore,A., Considine,T., Cook,A., Cooke,P., Corum,B., Cuomo,C., David,R., Dawoe,T., Degray,S., Dodge,S., Dooley,K., Dorje,P., Dorjee,K., Dorris,L., Duffey,N., Dupes,A., Elkins,T., Engels,R., Erickson,J., Farina,A., Faro,S., Ferreira,P., Fischer,H., Fitzgerald,M., Foley,K., Gage,D., Galagan,J., Gearin,G., Gnerre,S., Gnirke,A., Goyette,A., Graham,J., Grandbois,E., Gyaltsen,K., Hafez,N., Hagopian,D., Hagos,B., Hall,J., Hatcher,B., Heller,A., Higgins,H., Honan,T., Horn,A., Houde,N., Hughes,L., Hulme,W., Husby,E., Iliev,I., Jaffe,D., Jones,C., Kamal,M., Kamat,A., Kamvysselis,M., Karlsson,E., Kells,C., Kieu,A., Kisner,P., Kodira,C., Kulbokas,E., Labutti,K., Lama,D., Landers,T., Leger,J., Levine,S., Lewis,D., Lewis,T., Lindblad-toh,K., Liu,X., Lokyitsang,T., Lokyitsang,Y., Lucien,O., Lui,A., Ma,L.J., Mabbitt,R., Macdonald,J., Maclean,C., Major,J., Manning,J., Marabella,R., Maru,K., Matthews,C., Mauceli,E., Mccarthy,M., Mcdonough,S., Mcghee,T., Meldrim,J., Meneus,L., Mesirov,J., Mihalev,A., Mihova,T., Mikkelsen,T., Mlenga,V., Moru,K., Mozes,J., Mulrain,L., Munson,G., Naylor,J., Newes,C., Nguyen,C., Nguyen,N., Nguyen,T., Nicol,R., Nielsen,C., Nizzari,M., Norbu,C., Norbu,N., O'donnell,P., Okoawo,O., O'leary,S., Omotosho,B., O'neill,K., Osman,S., Parker,S., Perrin,D., Phunkhang,P., Piqani,B., Purcell,S., Rachupka,T., Ramasamy,U., Rameau,R., Ray,V., Raymond,C., Retta,R., Richardson,S., Rise,C., Rodriguez,J., Rogers,J., Rogov,P., Rutman,M., Schupbach,R., Seaman,C., Settipalli,S., Sharpe,T., Sheridan,J., Sherpa,N., Shi,J., Smirnov,S., Smith,C., Sougnez,C., Spencer,B., Stalker,J., Stange-thomann,N., Stavropoulos,S., Stetson,K., Stone,C., Stone,S., Stubbs,M., Talamas,J., Tchuinga,P., Tenzing,P., Tesfaye,S., Theodore,J., Thoulutsang,Y., Topham,K., Towey,S., Tsamla,T., Tsomo,N., Vallee,D., Vassiliev,H., Venkataraman,V., Vinson,J., Vo,A., Wade,C., Wang,S., Wangchuk,T., Wangdi,T., Whittaker,C., Wilkinson,J., Wu,Y., Wyman,D., Yadav,S., Yang,S., Yang,X., Yeager,S., Yee,E., Young,G., Zainoun,J., Zembeck,L., Zimmer,A., Zody,M. and Lander,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-APR-2004) Broad Institute, 320 Charles Street, Cambridge, MA 02142, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final NCBI review. The reference sequence was derived from CH236924.
FEATURES Location/Qualifiers source 1..6215 /organism="Aspergillus nidulans FGSC A4" /mol_type="genomic DNA" /strain="FGSC A4" /db_xref="taxon:227321" /chromosome="III scaffold_5" ORIGIN 1 atgtcgtctc ggccagatct cagggtaagt tctttttagc ctagaaaccg aacagcttct 61 tactaacttt ctttctaaag gttgacgacg aagtcggctt tatccgcttc taccgctccc 121 tcgcctccga tgactcccat aataatgaaa caattcgcat cttcgaccgt ggggattggt 181 actcagccca cggcaaagaa gccgaattca tcgctcgcac agtctacaaa acaacctctg 241 tccttcgtaa tctcggccgc agcgaaacgg gcggcttgcc gtccgtcaca atgagcatta 301 ctgtcttccg taatttttta cgtgaggctc tattccggct aaataagagg attgagatct 361 ggggctccgc cggcacgggc aaagggcact ggaagaaggt taagcaggcg agtcccggaa 421 atctgcagga tgtggaggag gaattagggg caatgggtat ggagggaagt aacggagcgc 481 ccattatcat ggcagtgaag cttagtgcaa aggccgggga ggcgcgaaat gtaggtgttt 541 gttttgcaga tgcaagtgtg cgcgagcttg gtgtgagtga gttcctggac aatgatgttt 601 actcaaactt tgaggcgctt gttatccagc tcggtgtgaa agagtgtctc gttgtgcagg 661 atgtcaatcg gaaggatgtg gaggtggcca agatccgagc aatatgtgat aactgcggga 721 tagcgatatc ggagcgcccg gcatctgatt ttggggttaa ggatattgaa caggacctta 781 caaggttgct gagggatgag cggtcggctg ggacactgcc ggagacggag ctgaagcttg 841 cgatgggcgg tgcggcggcg ctaattcggt atttgggcgt gatgtcggat gcgacaaatt 901 tcgggcagta tcaactctac cagcatgatt tggcgcagta catgaagctc gatgcggcgg 961 cattgagagc tttgaatctt atgcctgggc cgagggatgg atcaaaatcg atgagtttat 1021 ttgggctgtt gaatcattgt aaaacgcctg ttgggagccg gttgctggca cagtggctga 1081 aacagccgtt aatggatctg gcggagattg aaaagcggca aaggcttgtt gaggcgtttg 1141 tcgtgagcac ggagcttcgg cagatgatgc aggaggagca tctacgatct attccggatc 1201 tgtatcggct tgcgaaacga ttccagcgaa aacaggcgaa tctggaagat gtagtgcgtg 1261 tgtatcaggt tgctattcgg ctgcctgggt ttgtgaactc tctggagaat gttatggatg 1321 aggagtacca gacgccgctt gagacagagt acacggccaa gctacgcaac cattcggcga 1381 gcctggcgaa actggaggag atggtcgaga cgacggttga tctggatgcc ctcgagaatc 1441 acgagttcat catcaagccc gaattcgatg atagtctgcg catcattcgc aaaaagctgg 1501 atcagttgcg ccatgatatg taccttgagc ataaggctgt 
cgcgagagac ctagatcagg 1561 aaatggacaa gaagctgttc ctggagaacc accgcgtgta cggatggtgt ttccgtctga 1621 cgcggaatga ggcgggttgc attcgcaaca agaaggccta ccaggagtgc tcaacgcaga 1681 agaacggtgt gtactttacc acatcgacga tgcaatctct ccgccgggaa catgatcagc 1741 tctcctccaa ttacaaccgc acccagacgg gacttgtctc ggaggttgtc aacgttgcag 1801 catcgtactg tccggtcctg gaacaactag ccggcgtcct ggctcacctc gatgtcattg 1861 tgagctttgc gcacgcctct gtacacgcgc caacagccta tacgaaaccc aagatccacc 1921 cgcgcggcac gggcaataca gtccttaaag aagcacgcca cccctgcatg gagatgcagg 1981 acgacatctc cttcataact aatgatgtct cccttatccg cgacgagtcc tcattcctta 2041 tcatcactgg ccccaatatg ggcggtaaat cgacctacat ccgcatgatt ggcgttatag 2101 cgctcatggc gcagataggc tgcttcgtgc cctgcaccga agcagagttg acgatctttg 2161 2221 2281 2341 2401 2461 2521 2581 2641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 5401 5461 5521 5581 5641 actgcatcct tggcggagat tcattgatga tttcagagca aactcacgac tcatcggcga agaaggtaac acgtcgctga aggagctgga ataagtactc agtggaagag ttatgaggga ttcaggcgct gcttctgtct tataggtcat tttttcgaaa gaaatcacgt gcggagacct ggcctgtgca tacagtaatc tgtgcgtaca attcatgcac gaggtgggat ggtggtgcgt cgctaacggc ttcttgcttg caatctgata tgtccatctc gacattttga tcccccggtc ttgtgcttgc gttgctaact tatttcgcaa tgcatgtatt tgattgttac gcctctcgat cgctgaaccc gggtattgga tggataagct ttccattcat acagcagcct agtcggctac ggatatggcc cagactttgt atatctctgc cgcagtatgt cccactatat tcgctctagg tcctcagtgg cagctgggct ttatcgagaa cagcgtgcat caagcttcgc cagtgcacat tcatcatttc tagattcttg tctttttcct taataatctc tctttgcgct tgcccgtgtt gctcgaaacg attgggccgc cattgtgacc acttgcagat cggaacaaca attgctctac gctcgtccgc ggattttacg gcaggaggaa tgcaattgag tttggtaaaa gtgagccatc gcaatggtgg tcaactgcca tcaacgagtc gcccatggtt ccacaaacac gcttagtttt gaacccgact gtctaattga cccttttcca ctcgtcacac tcctttaagc gggctcagac tcgtaccgat gttaaagagt ggtgtttccg caccatgcag 
gtcggcgcca cgtgttagag ctctcgcctg cagtcctcac gggatcgacg tactcgtccc tcaggttatt catttcgctt tacgatagaa caggctccat catcaaccag tattgggatc gacggtgtgg ggtcattacc tctgcaggtg gacggcacag caagttgcgg ccggttctac gtcgtacatt ttggactgtt gaagctgaat ctgggcaggt cgcgaatctc aggtgccagc atactctttc gctcttccac tgactatgac cctccctaca cttgaaagca gatgttgcgc ggtgcgagcg tcaaacatcc ggcactagca gaaatccgct cgatacccca gcgaacgaag cgcgttgaac ttcccagaga tccgctgatt gtcgaagaag gagccgggga ggggacgaga atccagagtt taggtagtta ggtaatggat atatattgcc tagagtcccg tcaacgctca aagctcaaag tgtccctcgt ctcgaagctg tcctcactct ggctatacga tatctttcaa taggacagca aaaccatctg ttgggcgata ctcgcgaaca atgagcgatg ggtgtaattg gatgttgagg cagccgcgga ccagttccgc aatatgcgag tcgtgcaagg atgtttcacc gctttggagg accgaggagg acggttgtca atcaattgcg cgcgtgaaga gacttgttcg cggatattcg ttggagtggc caggtggaca cagagaaagg aacagtctgt attgataatg gagggtctcc aatgagcttg aagattgatt caaccctatc atgcccatcg tacattgcat ctcttccgcg cttgaccagc gtcattgtat gcacttgaca gtcaaggatt attcgcagct tcaagtcagc cttacgacgg gctttggact agtctgtcaa aagacgaaaa caggcatctg aggtagtcaa ccgctggaaa ggagtgcgct gagagctgac agttgcaggc gtttctggtc tggcgttcgg ctgatataca atactacttt cccgcgcagt ctcttgaggt cgccaattac tacgtccctc gaccgtccac cggcatcttc attggccata gcgtcgcgct aagggatttg gcgcaaaagg ctaatgaccg actacgccgt gattgctacg ttggatggag tacgcgcggt atgtcgacaa gcatttttag atccgccgac tcccgaagat gaccggatcc acaagatgct gagcgcagaa agcatcttcc catacgaaat ggagcatgag ttctcggagt ttatagggct gagctcggcc ttcgccttca atgactggga ggcttgtggc ccaactgggt agcggacgat cggcatttct ctctctcaag ttcctcatat cccttttctc cagcgcggat ggaagaagcg tcctcctcgg tctatctcgc cttggctggc cgaggagact taaaggcgtt gacctctgag attcggtctc attcgcgaca gaacctgcac agagaagaga cgaccagtct catggcgcgg tgctgcgtca gttgaaagcg gcttgaggag gaatagagtc ttcggtttcg gtagattcta tctatgatgt ctgcgctacc cggagtaagt cgtctattgc tctgcctttt tttacagtgc atcgcatttg gtagcgggct tttgcttggg actcttgaaa tcttcctctc agagtattca ttgctccagc cgcggctcgt ggtattttgg gaattcggag tggcgagtgt tgctctccgc tctgtgcggt 
ggcgttcaat atactgtccg aaccaggatg tgctgcggat ggctcgcgcg atcacacaaa ggggaagctg tgtttcggag gtcctatgtc ggtcttccac agacgccgcg gcagttctgc aagtgtgaca aaacgatgtt ggcttaccaa ttcctggctg gggtgatctt tgtgtctggt catatacgtt ggatctcatt cttcaattgg gaacgtcctc gaccattctt gttcacgtcc tttcttgaat accaggttag tctacgttca tccttgatca gcctgggcga cacttccacg gtcgttgcat aagacccggc ttcggcatcc cagaaggccg gcgacgattg ctgctggtga aagaggcagg ttccagggga ggttatctat cgcaaggttg atttaatatg tgtatagtat gcgcaaatct tctttgttgt tactgctgga ccgtccgtga caatagtcta acatcaccgc atcacggcgt gggctccgtt ctttttcttt ggtaggctgg accctcgacc tttaaaggat ccatacgacc ctcgatttat ttgagaaaag gcgggtatct cggtcgtcaa gcctcgcatt cccgagacag gagtatatgt aagtcggacg caacgattag gagttggcgc atagaaaaga cgggtggttg ttttggcagt cgcgttatag gcgttgaagg tactggccta acctatcatg attatcggga atcaattata atggactggc ttcttgtggg gctaactggt atcggctgcg tccatcctga caacttacga cgcaaccgta tttactgttc gcgcggatgt cattttcctc tacactttat 5701 5761 5821 5881 5941 6001 6061 6121 6181 ggggggtggt gaagagcacg gtaagcccgg tccccctaac aacactacct tccatcgtcg tcgcgcaagt gcggtgtgtc gagggcttgc atcttcatat atactctacc tctctctctt tctccgtgcc ctctccgcaa cagtctctat atggagcgaa tggacagggt ggagagaagg taactgatca taattcagta cttcaaaatc atgttcgacc gtcatcttct agtatgcaat ttgactcagc ggcaatagca agagggtatc tccgcaggcg gactcaacga tggactaagt aatacttcca gcctctttac acagtatgct cggcgcgaaa ttgggatcgg gttga gcatccgctt tacattacat taattgtata acttgcccac cggccgcttc tcctgatgtc gtcaagtggt gattatgggg tgaactccgc ccaccttgag aacaagtcca cgcctccgga gtgcccccta cggccctcgc gggggtagtg atgaagatgg H. sapiens 28811 nt LOCUS DEFINITION ACCESSION VERSION KEYWORDS SOURCE ORGANISM AY943816 28811 bp DNA linear PRI 08-MAR-2005 Homo sapiens mutS homolog 5 (E. coli) (MSH5) gene, complete cds. AY943816 AY943816.1 GI:60459549 . Homo sapiens (human) Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo. 
REFERENCE   1  (bases 1 to 28811)
  AUTHORS   Livingston,R.J., Rieder,M.J., Rajkumar,N., Downing,T.K., Olson,A.N., Nguyen,C.P., Gildersleeve,H., Cassidy,C.M., Johnson,E.J., Swanson,J.E., McFarland,I., Yool,B., Park,C. and Nickerson,D.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-FEB-2005) Genome Sciences, University of Washington, 1705 NE Pacific, Seattle, WA 98195, USA
COMMENT     To cite this work please use: NIEHS-SNPs, Environmental Genome Project, NIEHS ES15478, Department of Genome Sciences, Seattle, WA (URL: http://egp.gs.washington.edu).
FEATURES             Location/Qualifiers
     source          1..28811
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     gene            1996..26853
                     /gene="MSH5"
     mRNA            join(1996..2058,2450..2609,3160..3283,4874..4954,
                     5105..5167,5901..6022,6136..6296,6543..6578,7234..7316,
                     9384..9429,15301..15439,15567..15629,20167..20295,
                     20550..20622,20768..20877,21099..21179,21427..21514,
                     21788..21977,22092..22218,22692..22841,23190..23264,
                     23474..23617,23820..23957,24114..24187,24422..25170,
                     25417..25557,26066..26161,26268..26357,26438..26853)
                     /gene="MSH5"
                     /product="mutS homolog 5 (E. coli)"
     CDS             join(2463..2609,3160..3283,4874..4954,5105..5167,
                     5901..6022,6136..6296,6543..6578,7234..7316,9384..9429,
                     15301..15439,15567..15629,20167..20295,20550..20622,
                     20768..20877,21099..21179,21427..21514,21788..21977,
                     22092..22218,22692..22841,23190..23264,23474..23617,
                     23820..23957,24114..24187,24422..24533)
                     /gene="MSH5"
                     /codon_start=1
                     /product="mutS homolog 5 (E.
coli)" /protein_id="AAX20111.1" /db_xref="GI:60459550" /translation="MASLGANPRRTPQGPRPGAASSGFPSPAPVPGPREAEEEEVEEE EELAEIHLCVLWNSGYLGIAYYDTSDSTIHFMPDAPDHESLKLLQRVLDEINPQSVVT SAKQDENMTRFLGKLASQEHREPKRPEIIFLPSVDFGLEISKQRLLSGNYSFIPDAMT ATEKILFLSSIIPFDCLLTPPGDLRFTPIPLLIPSQVRALGGLLKFLGRRRIGVELED YNVSVPILGFKKFMLTHLVNIDQDTYSVLQIFKSESHPSVYKVASGLKEGLSLFGILN RCHCKWGEKLLRLWFTRPTHDLGELSSRLDVIQFFLLPQNLDMAQMLHRLLGHIKNVP LILKRMKLSHTKVSDWQVLYKTVYSALGLRDACRSLPQSIQLFRDIAQEFSDDLHHIA SLIGKVVDFEGSLAENRFTVLPNIDPEIDEKKRRLMGLPSFLTEVARKELENLDSRIP SCSVIYIPLIGFLLSIPRLPSMVEASDFEINGLDFMFLSEEKLHYRSARTKELDALLG DLHCEIRDQETLLMYQLQCQVLARAAVLTRVLDLASRLDVLLALASAARDYGYSRPRY SPQVLGVRIQNGRHPLMELCARTFVPNSTECGGDKGRVKVITGPNSSGKSIYLKQVGL ITFMALVGSFVPAEEAEIGAVDAIFTRIHSCESISLGLSTFMIDLNQVAKAVNNATAQ SLVLIDEFGKGTNTVDGLALLAAVLRHWLARGPTCPHIFVATNFLSLVQLQLLPQGPL VQYLTMETCEDGNDLVFFYQVCEGVAKASHASHTAAQAGLPDKLVARGKEVSDLIRSG KPIKPVKDLLKKNQMENCQTLVDKFMKLDLEDPNLDLNVFMSQEVLPAATSIL" ORIGIN 1 61 121 181 241 301 361 421 481 541 601 661 721 781 841 901 961 1021 1081 1141 1201 1261 1321 1381 1441 1501 1561 1621 1681 1741 1801 1861 1921 1981 2041 2101 2161 2221 2281 tggagtgcaa ggggaggagt tcttgtgctt ccgacagaaa gattccatgc ggaccccttc gttccaactg acaggtttgg gccttgatat taatttacca tgtaccccca ctcctatctc ctgggcttgg cctgacactc cagatttgcc tggaggggat cctaacttct aagctgggga tttctccctc cacacagacg tccaacccct tctctgtctt tgagtgttct acctctgctt cataccctgt gcttatttat agctctgtat aaaaaataag cccaaacccc cactgcgcca cagagggcgg tccagactct ggatccgggc gctcacgcgc ccacctgtag gggaggagga acataagacg cctccgcgtg taaatacccg gaacacagaa ggggtggagt cctcttctgt ggctatcttt ctctgtgtgg ccattgaggg gtctgtgtcc cagcctctcc caatattgac ggtgacaggc aatcctgcaa agtagctcct taaactagtc cccagagctg aggggtaacc tttccaatta tgggttgggg ggatggtgca ataaccccac cacaaataag ggcccaaaat caggggtgtg gatcagaact gaagggttta acccaggaaa ttgaaacacc aaatattaac cgcctctggc ttccagggac atcaaatccc gaagtttgaa tcagctacgg ctctccaatc cggcgcgctc cgactcaggt agataagcgc tgcacgactc acaggaatga agaattcacc ctaaaacaga gacgtgatgt 
gggttttctc ggtgttcgtt gtgtgaattc gaggggatca aagagaagcc cggcctctct tttggctggc aacgctgccc cagaggtttc ctttccctct cctaatctct tcaccaacct attaacccag ctggttagca ggctggggag tttaaaggga ccccgcaaca ctttatggag aggcactaaa tagaagtgcg acttctctct aataataaat caggcacggt gctcatttag tcagagggta tctttaagtt tgggactacg agaaacagtg cgttctggac cagtccgctt aacagcggct cttttgcagg tactgaaaag gtgaggctgg gccccacagg gggtggggcg tcctgtgtcc gcttgaaact gctgctggaa ctgcttgtgg cccttgaact cctcatggtg aaacaactct ttaggtaaat tttctctcct tggcatgact tctcctggaa cctccatctc ctgggcttct tcacacccca cctccaagtt cccttaactc cagctaggtc gaatctcccc ttatatatat cacacacaca cagtgacttc tagttgccga aaggggtatg gccagaattt taggccctgt agggctgaga gtcttacttt caaatattaa tggcctcccc gaccctggtc agtgctagag ccgccccgaa ctcctcctcg aggagggcgg ctcgtggcgg gcgggaaaac ggtcctggcg gccctcagac cgtggagttt acagctctcc taaagaaagg accagcagtt gagggccttt gtaacatcct accctcaaaa acttctcagg ggggccagct acagctttat acccacaggg ccatccagca acctccctgt ctttccactc gattggaagg tctatagctc tgttccccca atctcacccc atctcagggt atatatatat cacacacaca attatgttca atgcatgaat ggcatgtccc gatgtaattc cgtgccatta cagaagtcct gtttgccagg cttaagagtt ctcaaaaccc cgaccttctc gcccggctgc ggcaaatagg ccctgtcgga ggcgcgtgcg tcggtcagcg gctgcgatgg cgtggttggc cccttccttc cccacaatct acgcccctca gagagacttg ggtggtttcc ttctctcctc gtaagggtat tctgcacaca gtcctctcct tgaagatcaa agctacagct tatcgtgcct gagccagggc ccctgcattt cctccccttc tgggtccctc cattgctcaa cctttcttgc caccatcttt actaggaaca attttttttc cacacacaca ccgctttgag gatagatacc agtaggggtg gaatgcttcc tgggggtggt gcttgtttcc cactgttcta gttgcaggaa ccgcaacggt gcgggcttcc taagcaacgg ccaatcagcg tctctaggct cgcgcacctc gggcgttctc cggcagctgg agaggcagag caaagggtaa gtactttagt gccctgcccc 2341 2401 2461 2521 2581 2641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 5401 5461 5521 5581 5641 5701 5761 5821 gcagccctgt 
cctctgtgaa tcatggcctc cctcctccgg aagtcgagga agttgatggg aatcttagca agcctgggca gtggcacgtg cgggaggcgg agcaagactc aagtctctta ctcagagatt tgattctaat aattcaggat ccagatgccc aattcctctg cacacacaca gctcaagtgc ttctcctgac agttttttgt aaactcctga gagtcaccac tgtgttgctg tatttggctt agcctcagac cagagaagaa aactaataga tctgttcatg acactatcac aactatagca ttatgaattc tgttcttttt cttggctcac agtagctggg ggcagggttt tgccttggcc atatatgacg tgtctgttct caccatccca tgcagagcac ggagggctat ttctccttcc gatgagaata aaaatgggga tgtcagacag acagcctccc gattttggta cccccaactc tgtgctcctc taacctgcag tgaggagtgg aagcctttta gcacttctat attaaggcat ccagacagat tatctttata cttgctctag gcttctattg agcagaagta tcgttgcttc cttaggagcg cttccccagc ggaggaggag aagttagaat ctttgggaga acatggcgaa cctgttatcc aggttgcaat cgtctcaaag cctcctgagt tgagaaaatg agtgatttcc acttgggcat cagaccacga ctctctggga cacacacaca agtggcgcaa tcaacctccc gtgtgttttt ccttgtgatc gcccagccat taaggaaata atggttctgg tgcttcaact gcaaggggga gtgagaacct agcgatccac actggggatt ctgactcaat atttcattat tgagacagtg tgcaacctcc attacaagaa caccacgttg tcccaaaggg tatttacaat ccaggggttt ccactcttaa tccctactcc gggttttctc aagttctgga tgactcgatt aggactaata aggtagaagg aggagcacag tctccttcct taccttcatc atgccctatg cttattggga tgacactttt tcaccaaacc aactttccta tagtgttact gtctactttc ttatatgtag ctttccatta tctcattctt cttagtgctt cgaaccgccc aacccaagga ccggccccag ctggccgagg aaaagagggt ctgaggcggg accccgtctc cagctactgg gagctgagat aaagaaagaa ggctgtttca attaaattat tttcttcctt tgcctactat gagcctcaag ttgcagatgt tatttttttt tcttggctca gagtagctgg agcacagacg cgcccacctt gttttactta cctgactctg atggttggaa catggaagaa gggagatgcc ctcactcata tcccatcacc aaatttcaac atattttaca ttaacaaata tcttgctccg gtctcccgga tctgccatca gctaggcttg cagggattat gtttcaggtg acagcctagt ctacttttct tagggaggaa ttagtcaaag tgagatcaat tctgggaaag tatggaatat actgagatgt agagcctaaa tttgctttgc atcacagatc acctgtcccc agcctctgct tggacagggt aaaaggcact ggtttacaat agttctatta ctcagccatt aataaaaaga actgcctgtg aaaatggggt tgcattctgc tcactttttg ggacaccgca tgccgggccc tctctgaggg 
tgggagccgg cggatcacct tactaaaaat gaaggctgag tgctccactg aaaaaaaaac cattcactaa ataagacatg gctggacaga gatactagtg cttctccaga gttacacaca ttctagacag ctgcagcctc gactacaggc gtgtttcacc ggcctcctaa cattaactca agtaatttgt agctcaaaat gggaaggcag aggctctttt cccaccaaca cacacacctc acgatatttg gttgcttcac tttgtgaggt tcactcaggc ttcaagcaat cgcctggcta tcttgagctc aggcatgagc cttcagattc gacaacatcc aaatctcaac atgtttttga acaaagatcc ccccagtctg cttggtaagg tccagggggc aaagaatgat agacctgaaa ctaactccct tcccctctgc ccaagatctc taagtcatgt tttattgttg gcctcagtga aagaacagga ataccattat tatctttctc gaattagact tgagcttggg gaaaaaattg gcgccaccct catccgcaga gggaccgaga cagggaggcc gagtagaaac gcgcggtggc gagctcagga ataaaaatta gcaggagaat cacttcagcc agggttggga atgggggtga gtaaacccta tccatctgtg actccactat gaggtgggga cacacacaca agtcttgctc cacctcctgg gtgtgccacc atgttggcca agtgctggga cctcactgtc taaaaaaaaa tgggcatctt ggtgtgtaga tgacaaccag cactccagga ctgctaggcc gcagggacaa agaggctccc tgttttttgg tggaagtgta tctcctgcct atttttatat ctggcctcca cactgtgcct agccctgggc agaacatccc ttctacctgt gaaggagagg tttaactcat ttgttacgag acttggtaaa tagaattggg agccttttct tcatattttt gttccggtgt cttatgtcat tcctgctccc ctagggatga gaattctccc cccttattat gtgtactatc tttgaccaaa aggctgtgct aagagtctga caagtcaaat agctacaaga accccggcct gcctccaagc cctggggcgg gaggaggagg ttgaatggag tcacacctgt gttggagacc gccgagcgtg cactgtaact tgggcgacag agagctgggc tgatgcctat cacttatgag tgtgctgtgg ccacttcatg tggaaccatg cacacacaca tgttacccag gttcaagcaa acacccagct gggtggtctc ctacaggtgt tagcatattt aaaaaagttt cactggtgag ggtcacatgg ctctctcagg agggcattaa ctacctcaca atcacatcca tcttttgttt tttgtttggt gtggtgccat cagtctcccg ttttagtaga gtgatctgcc ggccacaaat aaatcagtca acttccctct gttcccactg ggtaggaaga ttgatctctg tgccaaacag ggatagaggg tgagagggag ttcctccccc gccaagtgtg cccattcttt cctaaacctt taccctttaa gggcctcccc cattaagtta gatccataag ctaattagat atcctcaatt ttcagacaag aaatttggtt aatctctctt ccgttccctt 5881 5941 6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601 6661 6721 6781 6841 6901 6961 7021 7081 
7141 7201 7261 7321 7381 7441 7501 7561 7621 7681 7741 7801 7861 7921 7981 8041 8101 8161 8221 8281 8341 8401 8461 8521 8581 8641 8701 8761 8821 8881 8941 9001 9061 9121 9181 9241 9301 9361 tgcttgcctc ctccttcatc tccctttgac acaagtgcta cctggtgctc tcccaggttc gaactggaag gtgattcacc ggggtggggt aatctaaggg aaggaagtgg aggactcatc ctgtctctgg gtttccctaa tgagggaaag tgaggcaaag tggctcatgc aggagtttga gctgggtgtg aatccaggag acagagcgag atatcccctg gttaccccaa tcacccctca gtgtgcccca gatgtgtccc tccaaatttc tgtggctgta tgggtttgta gtatagagaa agttgtctct atctcagtaa gtcgcccagg ttcacaccat tgcctggcta tctcgatctc gcgtgagcca gtgtgtgtgt tcggctcaat gtaactggga acggggtttc gcctcagcct tttttacata aatcttgctg cgcctcttgg aactacaggc tttcgccatg gcctcccaaa ttgtattttt ttgtttcgtt gcgatctccg cgagtagctg agagactgcg ctgcctcggc ttgagtttta atcccttccc tttggctgcc atgcctcagc actccctgcc cctcaaatag ccagacgcca tgcctcctca gggctgaatt tgcagccccc gagcacttgg actataatgt ccaaccccaa gtggatgtgg ctaatgagac agttgtggaa tggtgaacat ggagggagaa tcctggggct agaggaagaa aaaagtaagg ctgtaatccc gacctgacca ttgtgcgcct gcagaggttg actccatctc tccccattcc actcctccat gtgtacaaag tccctcatct agcctccctt ttacctattt cagtgttgtt tgaatgtttc gagttttgta ggctttaggg gttgtgtcca ctggagtgca tctgctgcct atttttgtat ctgacctcgt ccgcgcctgg gagacgaagt gcaacctctg ttacaggcac accatgttgc cccaaggtgc tgtgtgttgg tgtcacccag gttcaagtga ctgcaccacc ttgcccaggc gtgctgggat ggaggggcgg ttgttttgag ctcactgcaa ggactacagg tttcaccatg ctcccagagt actaggtctg ccatctccat tttgtgccct ttagacacat ttatccctca gtctggagat tgactgccac cagtgagatt ctgggaggta aggagattta agggctgctg cagcgtcccc ccaaagtaat ctgtgaccca tttgggaaga cgaactcaga agatcaagac ggattaagtt attaagatct aagcataaag gacaaacctt cgcgctttgg atatggtaaa gtaatcccag cagtgagctg aaaaaaaaaa atttatcagt ttctcctcga tggccagtgg cacattacaa cccacttcac gtaccccccg gcatatcagc tactagttgg actgcataac tatattaggg ggtttttttt gtggtgcaat cagcctcctg ttttagtaga gatccgtcca ccagttgtgt ctcgctcttg cctcctgggt ctgccaccac ccaggctggt tgggattaca gctcttgagt gctggagtgc ttcttctgcc atgcccagct 
tggtctcaaa tataggcatg gagggctctt acggagtctt gctctgcctc tgcctgccac ttagccagga gctgggatta ctttgtgtat gtacggtaat cccaggccag aaacacattc caggaatcct aagcaaacaa tgagaaaatc ggtcctgggg ctggcctagc agatttaccc aagttcctgg atcctgggct gtgggattgg gtgggtcaag agactgggac ctgcttcctg acttacaggt taatgcccca ctctccttga atactagctt acatcaagat gaggccaagg accccgtctc ccactcagga aaattgcacc aaaaaaaaaa cctcaattct cagtgttcta actgaaggag agacctacca tcccattgtc ccccccaagc cattacttta gtacctgtta tgcctatttg acatctccgc tttttttttt atcagctcac agtagctagg gaacgggttt cctcggcctc ccagttttgt tcccccaggc tcaagcgatt gcccagctaa cttgaactcc ggcatgagcg tttttgtttg agtggtacaa tcatcctacc aatttttttg ctcctgagct agccaccctg gagttttttg gctctgtcac ccgggttcat catgcccggc tggtctcgat caggcgtgag ttttctggct cccagctcat cttcctcaac cattccctgt caacagatgc cgcctccttt ctcttcctct gataagggct cctggaaaat cgattccact gtcgaagaag ttaagaaatt gaggcctgaa tgctctagga aatattcaga cttttttgtt aaagaggtgg ataatcctaa aggaaaggga tcttttctat atgatctcgg cgggtggatc tactaaaaat ggctgaggca actgcactcc aaagacgtga tattcccttc cagattttta gggctcagcc gaaaagcaat agatatctct ttgagcatct ccaattctgt gggactttgg atttgtatag aaatatccat ttgatacaga tgcaagctct actacaggtg cactgtgtta ccaaagtgct gtgtgtgtgt tggagtgcaa ctcctgcctc tttttgtatt tgacctcagg accgcgcccg tttgtttgtt tttaggctca tcagcctcct tatttttagt caagtgatcc cccggccagc tgttttgttt ccaggctgga gccattctgc taattttttg ctcctgacca ccaccgtgcc aagtgtccct atttgtggcc aaccagcacc ccctgccttg cactgtaagt ctggaaacta cttccattat gggaggcggc agtaactttc gctgatcccc aatcggggtt tatgttgtag aagtaaagtg cacccgggag gagggggaca ttctgtcctc aggcatgctg tgaggctcta aggggggttt agggagaaac ctgggcgcgg gcctgaggtc ataaaaatta ggattgcttg agcctgggcg tctcaggagg aaaagtccaa agagtgagtc tctttggtag tggctccaaa ttcatgccaa tcccatactt gttccttccc gagaccttgt agtctttatc atagtttcat gtctcgctct gcctcctggg cccaccacga gccaggatga gggattacag gtgtgtgtgt tggtgcgatc agcctcctga tttagtagag ggatccactc gccgtccagt ttttagatgg ctgcaacctc gaatagctgg agagatggtg tcctgccttg tattgagttt gtttgtttat gtgcagtggc 
ttcagcctcc tatttttagt cgtgatccgt tggccagttc gtgagtgtcc aggcaccagc tctgacctgg taacaagttc ggggagagaa 9421 9481 9541 9601 9661 9721 9781 9841 9901 9961 10021 10081 10141 10201 10261 10321 10381 10441 10501 10561 10621 10681 10741 10801 10861 10921 10981 11041 11101 11161 11221 11281 11341 11401 11461 11521 11581 11641 11701 11761 11821 11881 11941 12001 12061 12121 12181 12241 12301 12361 12421 12481 12541 12601 12661 12721 12781 12841 12901 gctgctcagg actacatact gtagtagtaa agctttgtgt cccattgatg ggccgggcac tcacttgagg aaatgcaaaa taaggcagga ctgcacttca tgtaacacat ctgtgatgac atatcaccac ccagatctgt cttctcaagc gagatctaag tgttgcccag gttcaagcaa acacctggct gtctcgaact gagtcactgc ataaacctta aatgatgttc gccccttcta gccattatgg tttttttttt ttgctcactg tagctgggat tgggatttca tcggcctccc tttccttgtt gtagtctcag ccagcctggc tcatggcgca ctcgggaggt agagtgagat tgttaccctt acatagtgct agtgaatctg gtgtctgaca aacgcactgg atccttcccc atctctgagg cccacacatt gagaattctc aatggaatca actaaggaca ggaaatacct cacgcctata tcaagaacta agggcactgt tatgactttg cagtttggtt gtgtttagtg actgttctgg agccatctgg acagactgca tataagatgg cctacagagt tgagtgggtc aaagcctact agcaactgcc agtagctgat atacacagac agtggctcac tcaggagttt attagccaag gaatcgcttg gcctgggcaa ggctatgtta tctcaaattt ctgcagtttt tcctcttcct catgaacctt catgtcactc gctggagtac ttcttctgcc aatttttgta cctggtgatc gcctggccgt ggaggttttc agggtccatc agtcctttcc aaacatgtca gagacggagt caacctccac tacagggacc ccatgttggc aaagtgctgg ctgtacctat cattttggga caacttggtg tgcctctagt agaggttgaa tctgtcttaa tggcttttac tgtttatgtt cttgcctcca catggtcttg tgacatcatc tacctcacct tcatctccag cttcagtgct agtggagaag gacagggcat ggtgacatat ctaacgtaca atcccagcat aaagctgtac cctaggagtc ggcaagttgc taactaaagt cagcgcatgt tgggagccct atgctccgct atgtaagata catccttagt ctcacactct ccacacatac aatggcagta ctttactgag atgcatctca ttgcacacat acctgtaatc gagaccagct cgtgttggta aacccaggag cagagtgaga gcatggttat gtgtctctag cacaggcagc gcattccctt ggattgatgc cttttttttt aatggcacga tcagcctccc ttttcagtag cgcccacctc cactccactt ccattacctt acgtttaccc 
tactggatct tgcctgtgtt ctccctctgt ctcccaggtt caccaccatg cacgctggtc gattacaggc caaaatctta ggctgaggtg aaacttcacg accagctact gtgagcccag aaaaaaaaaa aggtacctgt tttctagttg tagctatgaa tacatagtag tctaaacaga cccgctgcaa gtaatcagca tcaggtgcag agagctggat gagatcatat ttattaatat aaaagatgta ttgggaggct aggcaccaag tggagacatg ttatctttta ctctcccagt gctttctaaa cagtgaccaa tcattcaaat caaatcttca acttgttcta catggccaat tacacactaa tacagattct cactggctaa ttttttgttg acaagcagca ccagcacttt ggccaacatg ggtgcctgta gtggaggttg ctccgtctca ctttagttat ttttgaactc tcaaactcag tgctggttaa tagaaacaaa tttttttttt tctctgctca aagtagctgg aggcggggtt ggcctcccaa tttaaatagc caggattaag cagttttaat accttatatt tatgctgctc cacccaggct caagcaattc cctggctaat ttgaactgct ataagccact catccaggtc ggtggatgat tctaccgaaa caggaggctg attgccccac aagtgcatcc aaggagttga cactgtcaca ctctatctag cactcaatgt gtggaaacct tgtgtatctt tctccaggaa atctttaatc ttaaggtcgg aacttgaaga ttctgtatac aataataata gaggcaggag aatatgactg attttgagac acggtttcat acgagcaggg gtggggatgg ggagcagagg ctggggtgtt ggacctagtg tgtagaaaag aagtatacag tgcatgaatt cacatacacc ctgcatttca tcagcgcagg ggaaaaaaca ggggggccaa gtaaagcccc attccagcta cagtgagcca aaaaaaaaaa agaaaacaca cgtatgtgaa agcatccaaa tggcattgct aaacctgtca ttgtgactga ctgcaacctc gattacaggc tcaccatgtt agtgctggga ctaagtagaa attagcatct ttccagactc ctagtcattt cttctggaaa ggagtgcagt tcctgcctca ttttgtattt gacctcgtga gtgcccggcc aggcgcggtg ttgaggtcag atacaaaaat aggcaggaga tgcactccag tcttcaaggt tgtgcacctt tcatgttaga tagctatacc gtgaacacaa ttgctagcct gtaggagtta tcggcagggt tcagccacag ggaggaatgc atcatcatat gaatttattt tagggccaga gattgcttga aatgtcacag ttagctgttc tttctcagtt cgtgagtcag ctatttacag tacctgaaac ctaacccaaa tgtgcacatg aatttgtggg ggatacccgg ccatatgcac accccaccta tccttataac tacacatata caaaatgtaa ggtgggtgaa atctctacta ctcaggagac agattgtgca aaaatgctaa cttcacattt tgttaattgc ctgatgccca ggcagtacac ttccaaaaca gtttcgctct cacttcccgg gcccaccacc ggtcaggctg ttacaggcat agaaaataac taagcagtat accttccaaa agggccactt gtcttttttt ggcgcggtct gcctccggag 
ttagtagaga tctgcccacc tggaaagtct gctcacgcct gagttcaaga tagcccagca attgcttgaa cctgggcaac gcaatccaac ctttgtgctc attagcagtc tgttacctca cgcaaatgta caggtgcaca atttaggata aatttaatta atgggaggga gtattcccca aatctaatga taatttatta tgtggtggct ggccaggagt tatgctctaa tcattggcgg gttaaattta agatacctaa actggcctac ccacccttga gtaactggcc ttggctctta ctcacaagtc aattagacaa 12961 13021 13081 13141 13201 13261 13321 13381 13441 13501 13561 13621 13681 13741 13801 13861 13921 13981 14041 14101 14161 14221 14281 14341 14401 14461 14521 14581 14641 14701 14761 14821 14881 14941 15001 15061 15121 15181 15241 15301 15361 15421 15481 15541 15601 15661 15721 15781 15841 15901 15961 16021 16081 16141 16201 16261 16321 16381 16441 acacagatga agacagagtc cacctcccgg cgccaacacg gccaggatgg gggattacag cctccaacac ccagcacttt tggccaacat tgcacccctg aggcggaggt agactctgtc gtggcttatg caggagcttg tagtaataca tgaggcaggc ccctatctct agctgctggg acccgagatc taataataat ggcgcggtgg aggtcaggag caaaaattag tgcagtaagc taaaaaataa tgtaatccca attctggcca gtggcgcata cgggaggtgg gcgagactct ccttaaatgt agcttatcat tctgttgctg tcaccgtctc cactgctctc ggtcttatcc agtagataca tagcacatcg aactgagatc gctatggttc tcagtttttt tcacatcaag gagggctgat gcggctgtaa aggtcagcga caggtaaagg ttattctacc tgccatgggg tcagcctccc tttttagtag ggtgatccac cggcctccct aatataatta ttttttattt ttgcagtggt tacctcagcc tgtattttta ctcaagtgat tgcccagcct gacatttatt tcactctgtc gtttacacca cccggctaat tcttgatctc gcgtgagcca acaaaaagat gggaggccga ggtgaaacct taatcccagc tgcagtgagc tcagaaaaaa cttgtaatcc agaacagcct aatggccagg agatcacctg actaaaaata gaggctgagg ccgccactgt aatacaaata ctcatgcctg ttcaagacca ctgagcacag caagattgtg aataaaaaaa gcactttggg acatagtgaa cctgcagtcc aggttgcagt gtctcaaaaa ccctccccaa ccctaatggg ttacccagag ttactcagcc cttctaaaat gttgaaactg tgtaaacaca tggatacata aagatgatag acacgtccga ctgctgcccc aacgtgcctg actgggcagt ccttgtctga ctggcaggtt ccctcagcct ctcttttttt cgatcttggc gagaagttgg agacaggatt ctgcctcagc cttttttttt acccccatgt ctttttcttt gcgatctcgg tcccaaatag gtagagatgg ctgcccatct 
agtatgtgtc tctgtatatg acccatcctg ttctgcctca tttgttgtgt ctgacctcgt ccgcacctgg gtaaataatt ggcaggtgga catctctact tactccagag tgagatcgca aaaaaaagat cagcactttg gggcaacata cgcagtagct aggtcacgat caaaatttag caggagaatc accctagcct tctatatatc taatcccagc gcctggccaa tggcgggtgc ctactgcact aagaataaga agccgaggcg accctgactc cagctactcg gagccgagat aaaaaaaaaa tccttttttc acagttatgt acactttcac tcagagtgag attgacaagc tgatatgtag cctgaatggg ttgccacaat atgtaacttg ctcatgacct agaatctgga tgagcccagg gggcttcttg ctgtagctga ctctacaagg gtattccaga tttttttgga tcactgcaac gattacaagc tcaccatgtt ctcccaaggt ttaactttgt acccaccatg ttttttttag ctcactgcaa ctgggattac gatttcacca cggcctccca atctatatct aatttatttt gagtgcagtg ccctcccgag ttttagtaga gacccgccct cctgaattta agccgggcgt acaccagagg aaaaatacaa gctgagacac ccactgcact gtaaataata ggaggccatg gcaagacctc catgcctgta tttgagacca ctgggcatgg acttaagcct gggcgacaga catcagccaa actttgggag gatggtgaaa ctgtaattcc ctagcctggg cactattggc ggcagatcac tactaaaaat ggaggctgag cgcatcactg acgaaagaat tctgcagtgt tttcacagga agctaaaaag ctgcagtgtt tccgttactt acacaattat taggacactg ccccagggac tagtaccccc gggggagctc catggctcag gtggagggca aggggcatta ttctgaaacg taaggccttc ctgtctgtac gacagcctcg ctccgcctcc gcccgccacc ggccaggctg gctgggatta attcaggaaa cagtttcaac acagagtctt cctctgcctc aggtgcccac tgttggccag aagtgctagg ttttctactt atttatttat gcctggctca tagctgggac gacggggttt ccttggcctg ttttcattta ggtggctcat tccggagttt aaattagccg gagaatcgct ccagcctgga atactatcgg gcaggaggac gtctctataa attccagcac gcctggccaa tggcacacac gtgaggcaga gcaagactcc tttaagaata gccgaggcgg ccccgtctct agaacctggg cgacagagca cgggtatggt gaggtcaaga acaacaatta gcacgaaaat cactccagcc aagacattgt tgaccattat agaatatgaa acatacaaac ggcacacaaa atatacatgg gctcacatct cacttgccac tgcaagcaca acccaaaccc agttctcgtc atgctgcatc gggaggtggg gagtgaggga catgaagttg cttcttgaat cctagacatg ctctgtcgcc tgggttcaag atgcctggca gtcttgaact caggtgtgaa atgtaaaaaa actttaactt gctctgtcgc ccaagttcaa caccacactg gctggtctca attataggtg tcccctcttg ttattttttg ttgcaagctc tataggtgcc 
caccgcgtta ccaaagtgct ttaggaaata gactgtaatc gagaccaggc ggagtggtgg tgaacctggg caacagagca gccaggtgca tgcttgaggc aaactattaa tttaggaggc cacagcgaaa ctgtagttcc ggttgcagtg atctcaaaaa agacatgccg gtggatcaca actaaaaata aggtggaggt agactctata gactcacgcc gatcgagatc gctgggcatg cacttgaacc tggcgacaaa tgttgaagcc tatgaattaa aagatgaatg tcatactgac tacctcaaca aatgacacac agcaattttc tacattccca ctttttggca tcacttccag tggacgtcat ggctcctggg gaaggaggtt agagaaaaca tcccacacca cccaaaagtc ctgtccaatt caggctgaag caattctgcc aatttttgta cctgacttca ccaccaggcc tatttagaat acgccaattt ccaggctgga gtgattctcc gagtgatttt aattcctggc ggagccaccg gattattttg 16501 16561 16621 16681 16741 16801 16861 16921 16981 17041 17101 17161 17221 17281 17341 17401 17461 17521 17581 17641 17701 17761 17821 17881 17941 18001 18061 18121 18181 18241 18301 18361 18421 18481 18541 18601 18661 18721 18781 18841 18901 18961 19021 19081 19141 19201 19261 19321 19381 19441 19501 19561 19621 19681 19741 19801 19861 19921 19981 tgggttttgt gagcagagtg cccaccttag tggattattt aatttctttt gagtgcagtg cctgcctcag tttttgtatt tgacctcgtg ctcacccggc cagtattgtt ttcttttttg agctcactgc agcagggacc tatttattta agcccaggct caagtgattc cccagctaat cgaactcccg cgtgagccat gctcaaggtg gctgagacta tctaccggtt ctcctggcaa ctggattttg ttctgtaaat atactgtttt gatgttggca tatttagttt gctcactgca gctgggactt gggatttcac ctcagcctcc ttctttagca cagaaatttc atcatataga agtcttgctc acctcccggg taggcgtgtg atgatggcca agtgctggga aacgcttctt aaatttcagt tatggtaaat tttgtttgtt agtggcgcga tcagcctcct acttttagta gtgatccacc ggccctgata ggccttttat tttgcacaac ctgtcttcct tttgttttgt tctggctata gctggaatta agggtttcac ctcagcctcc tcttaattgt tgtcgtttgt gtgcagtcat cctccagagt tgaagcaagt ttcttttttt gcatgatctc cctcccatgt tttagtagag atccacccgc caatatttca atcttacttt agacagggtc agcctcaaac acaggtgatg tttatttatt ggagtgcaat tcttgcctca tttgtatttt acctcagatg tgcacctgac gtctcaaact ttggtgtgag ccttctctat tttgttgaag ctgtttgcat taatatttag agaagtagag aatgttctga gaggtgagtc acctccgcct acaggcgcac catgttggcc caaagtgctg attacaaagt 
tgtaaagaga aatacagaaa tgtcaccagg ttgggtttca ccaccatgcc ggatggtctt ttacaggtgt gaatttgtgt atatgtgctg tagttcccta tgtttgtttg tctcggctca gagtgcctgg gagacagggt cgcctcagcc gccgatgagg agctatattt cctctatcaa tcctagactg tttgagacag acctccacct catgtgcatg catgttggtg caaagtgctg ctttgtagtt ttgttttttt atttcattgc aactgggact cccagccatt tttttttttt ggctcactgc agctgggata acagggtttc ctcggcctcc gggtaatttc aaaaattgta tcactctgtc tgttgggctc gccatcacac tatttattta ggtgtgatct gcctcccaag tagtagagac atctgcctgc caatttttta cctgagctca ccacgatgcc cttttctctc aaactagatt ttcctagatg ttctagaagc gaacataatg tgtttaatac tccctctgtc cctgagttca accaccacgc aggatggtct ggattacagg agtagcattt aacttccttt attgcttgat ctggagtgca agtgattctc tggctaattt gatctcttga gagccaccat catccgtgcc cagaaatgag acattctcca tttgagatgg ctgcaagctc gactacaggc ttcaccctct tcccaaagtg tttttttgtc ctctttctcc agcacctacc tgagcacatc agtctcgctc cccgggttca ccaccaagcc aggctgatct ggattacagg tcagtgtttg aaataaggtc agcctctaac ataagcctga atatcatttc gagatggagt aacctccgcc acaggcacat accatgttgg cagagtgctg taaaagaaaa ttatttggta acccaggctt aagcgatcct cgaactaagt tttatttttg cggctcaccg tagctgggat agggtttctc cccggcctcc ttttttgtag agtgatcctc cagtcagatg ttgttctttc attatttgtc tttttggcac atgattaggt tctgatatgt ctaaatctat gccaggctga agtgattctc ccagctaatt tgatctcttg cgtgaaccac aaatctctga tatgtactat atttccccca gtggcacaat ctgcctcagc ttgtattttt cctcgtgatc gcccagccct cagtggccgt caccccccac agatagccat agtcttgcac tgcctcccgg gcccgccacc taaccaggat ctgggattac attgttcttc cgactctgta acctcacttt tgggacaggg tgtcggccag agagattctc cagttaattt cgaactcctg catgagccac gtacagtgcc ctgctttgtc ttctgggctc gatgctgcac atccataaat ctcactctgt tcccaggttc gccaccatgc ccaggatggt ggattacagg ttatttttta tcatcaaata gagtgcaatg cccccctcag ttttattttt agacgggatt caacctccac tacaggtgcg cctgttggtc caaagtgctg agacaggatc ctgcttgggc atggccctag tttctttttc ttatagtgtt atttctctct tcagagtttt ccgattgtct tattcattta agtgcagtgg ctgcctcagc tttgtatttt acctcaggtg tacacccagc ttctttcttc ttggttgcca ctcttttttt cttggctcac ctcccgagta 
agtagagaca cacccgcctc tttttttttc gctaatccct ctttatttac gagatttttt tgttgcccag gttcgcgcca acgcctggct ggtctcaatc aggtgtgagc ttgtatcatt caaactcctt taaatcttct accatatctt gctggagtgc ctgcctcagc tttgtatttt acctcaggtg tgtgcctggc tctcactgtt acccatacta aagcaatcct ctggctttct atttcagtgt caccaggctg aagcgattct ccaagttttt cttgatctcc cgtgagccac aaaagaataa tctgaaattt gcacaattgt cctcctgagt tgcttgcatt ttgctcttgt ctcctgggtt tgccaccacg gggctggtct ggattacagg tcactatgtt ctcccaaagt tcctttttaa tctttttctt ttccattagc cttctatagt ttttttttca ctctttttct tttatttatt cacgatcttg ctcctgagta tagtagagac atccacccgc catctattaa atttattagc agtgatagaa tttgagacag tgcaacctcc gctgggacta gggtttcacc ggcctcccaa cccaatatgg gtaaccttcg tagctatcaa ttgttttttg gcaggagtgc ttctcctacc aattttttgt tcctgacctc caccgcgccc acagactcat tgttttagag gcatgtattt tttttgttta aatggcgtga ctcccaagta gagtagagac atctacccac caggaccata tctttttgcc 20041 20101 20161 20221 20281 20341 20401 20461 20521 20581 20641 20701 20761 20821 20881 20941 21001 21061 21121 21181 21241 21301 21361 21421 21481 21541 21601 21661 21721 21781 21841 21901 21961 22021 22081 22141 22201 22261 22321 22381 22441 22501 22561 22621 22681 22741 22801 22861 22921 22981 23041 23101 23161 23221 23281 23341 23401 23461 23521 tttgagatct tgtccctatc tcacagactg atccagctct ctcattggga gagaatgcag tgggatggtg gaggactgca gtcctaagaa gcttcacagt tgggcctgtg ggggatcttc ctcgtagaaa gagctggaga agggcaggag agacatactg tggggataca agcaggagag cccgcctgcc taagaccctc gcgttactct ctttaccagc gaaagatggg acccagtttc gcattgctgg attgtaagat atgcttccaa ttcctctgaa gggcctctgc ctcccagacc gctgtcttaa agtgctgccc cgaatccaga aggctacatc gcccactgca gaatgtggtg agcatatacc atctactcca gaccccgctc ttattctcca tgcctctgag cactgattct agtgtgtctg catgccttga tccctattca aggaggccga tctcccttgg ggtgggattg gcatgccctc gaagaatcaa agaggatgaa tgtatacagc ctttcccttg agtcgctggt atgaggggag tgggcctctg aaaggcagtg gccctctttg acgtggaccc tccctctttg accacctcaa tgtacagtgc ttcgggacat aagtagtgag tgtgcaagat ggaggaggca gggagaattg acagtactta cctccccaac 
agccctgcgc atagcaacca agcgaagact atctggactc agtgggtgta gtagataaga gggatctttt tagagtatct ttccatggta aacctctgta acagcagcac actgagacaa gaaggagagg tctcagagga gggacctgca gtttaaaaaa caggcccctc tgtcttgagg ctctccttca aggagacgct cccgagtatt gggactatgg atggcaggta ttctgggggt gacatcctct gggacaaagg tcaaacaggt cccctacttg ttcatgaaag ctggagagcc taggtacaca aaattagccc ttacctattt gcctcacttt ggtaggcttg aattggggca cctctccacc aggaagggga tgctgcatgc taccataaac caaagcgtgc tttattcaca tccaccttat ccttattgat aaactaagga gaatggaata caagtgcaga caggtggatg acatgccccc ttactgtgat cccgagctgg cctgggcctg tgcccaagag tagaaggaaa ggggaaacat gcagaacttc gggcccaagg tctcctcagg atagatcctg agtgatggag gggcaggaga gatgggactt ccgtattcct gccttcagat aaacttgtgg aagctccctc cctctttact gaggccagtg aggtgagtga tgcccaatat aggaaagaga aggaccaaga gaagctgcac ctgcgagatc agcagcagcc ctcttcctgc tctcagattg ctttcccctg gctgatgtac ggaccttgcc ctactcaagg agaatagagg tcatctatct gatggaactc gagggtcaaa gaggagaagc ccagccaact gaccatcacc atgctctaat ccactcccaa acagggctat ctcctgtttc cacctcagcc atcacattca gtagacgcca ttcatgatcg taatgggaaa cctttatact atttcttgaa acctcaccat ggccaactgt acccagcagg gaatttggaa ggggaaaatg gggctgtgtg ggggcatatg ggctcgcgct acatctttgt cttccctact atgtggcctg agggatgcct ttctctgatg aagggagtgc ggaagatatt agggaagtat agagctgagg tggactttga aaattgatga taccatcctt ctcacttttg cccagtttcc tcatgcagtg gtcttttggg ggcagcctga tagggtgggg ctccccagat actttgagat tgaggaaaat gggatctctc agtcagagtt gatgcaaagt tatcgtagtg cggggtgagg aggggaagga tctctgtctc tatctgcaac gaactgacct cagctacagt tcccgcctgg ccgcgttact cgggtggagg tgatccacaa tgtgcccgaa gtcatcactg cctgcagcct caggctcctg cacatccctg ggaactttcc gtatgtctct ggtcaggatt accctgtcca cacggcacca tggccctggt tcttcacacg acctcaacca ggaacccctg aaaagtgggg cccttgtttc tcaagaactt ggtcagtgcg tggcgaaagc agggaaccaa gaggaggatg ggcagaaaag gggtccccat tctggccgct ggccaccaac ggtctttgtt tcctcctttt gccgctccct acctgcacca acccagggag gaggtcaatt ctggagggtg aacaggacag gggcagcctt gagtgagtgt ggcaggtggt ataaccacgt ttactgaggt tcatctacat 
ggagatatta agaacatgaa aggtgtccag tggcttcctt taatggactg gagtcagcag ctctgtagtt aggggctgga ccacagcttt cccgaaccaa aaaagccaga ggggagtggg gctcactctg ctgtttccag ccagctccct gccaggtgct acgtcctgct ccccacaagt aatagacatg gccatgcgag cctttgtgcc gacccaactc gggcctctgg cagctcttct tgcttccacc gtggcccaaa gcccacgtcc cggggaggag tttctctttg ggccccaggc aggcagcttt aattcatagc ggtcaaaggg aaaatgctca agcactaagg atgtgagtca gcagtgcagt ttacgggctt agtgaacaat cacggtgagg aaggagcatg aaatagaaca ggctccgaat gtgctccgac tttctgagcc cttctgagtc tgtgtttctc gccgcagtcc tatcgccagc gtcagggaga ggataaagaa agagttaaag agggtgccag gctgaaaatc tgggtgtgga caccacagct gtcttccacc tgcccgcaag ccctctggtg ggcttatgaa cacttttttg taagtctcca ctttctattc gacttcatgg ctgaggaaga ttactctgag ggtggggtta gaacccctgt ggagctggat ggttatatgc caacttgggg actctatctt atccccctag tcctcaccca ggcacgagca ggctcttgcc ccttggggta aggggcccaa gtgcctctcc caactccaca atcagggaag cgtctcctgc cccattttct tcacatgttc ttccttcacc cgtgcctctt agacagagtc atgtgccatt cctgtctcct gtgccagcag tgcgaatcca aacaaaggga taacaggaaa tcagagataa ctgttggcaa agggagggca ccaatactaa gccactgcac ggagaaactg acagtgaggc cgagacaggg gctaacctct actggctggc ttgttcagct 23581 23641 23701 23761 23821 23881 23941 24001 24061 24121 24181 24241 24301 24361 24421 24481 24541 24601 24661 24721 24781 24841 24901 24961 25021 25081 25141 25201 25261 25321 25381 25441 25501 25561 25621 25681 25741 25801 25861 25921 25981 26041 26101 26161 26221 26281 26341 26401 26461 26521 26581 26641 26701 26761 26821 26881 26941 27001 27061 acaactgctg cggggacccc ggctccctca agggtttcag ccatggagac ttgcgaaggc tggctcgtgg ctttcattcc ccccttctct acttgatccg tggaaaagtg ttctcccctt ttctttctga tctgcccagg gttgccagac tgaacgtttt tccagtgtcc tatctccctc aagtttattt ccaaaacaca tgttgctgta cagagaggcc tcttctcaag tgacagaggc cagtgggagt gtgcccttgg gacccagggg ggaactctct cccgcctgcc accacacagt ggcgtgagag agagaacagg gaagaccatc tgggggctgc tagacacagc gaggcagcat tggaggaagc tgtggcctgg aaagggggta tggagagtac agagggatgg tgtcagcctc ccccccgtta ggtgagttta ggggtcagtg 
cactgaacaa acttgtggca gggaggagcc agtcacccag tacccgtgtc ggctcctttt gaagttcaga tgtctcatgt tctgggaaga gaaactgctc aggaaaagtg ttttgttttg ccacaactca agagtgctgg ccacaagggc caggctgggc gcacagagac gacaggaagg ctgtgaggat cagccatgcc caaggaggtg tagtcctact tccccactcc cagtggaaaa cgtatatggc ttcagggact aatccctaaa gatttggttg attagtggat catgagccag tccccagcct agacgcagag tgaagaaaga gagcaggagt ggaagactcc ctacaaaaca taggaacacc aggatgggtg ggctccaccc tgcaggctcc cccagagctt caatagatct tccccttgcc gaacccaact tcagcctctg atgccctgtg tgagggaggc ccctctgtgt ctgagaccac cgtaatgaca tggtctgcat attattccaa ttgtgctatt tggattttcc cctgtggtga caactcctcc gcccagattc ttgttttcag ctggcttaac ggctagttcc gcaacaggta cagaggcccc ccaaaggagg taaaccctcc tctgtttctt catgttggtg cctcattctc agacatgttt cagaaagcag cattgtggga tttctgagat ctgcagcctc gattacaggc ccctggtgca atttcccaga cacatccctt aggtgattga ggcaacgatc tcccacacag atgagatcca gggcctgggt ccttactcct cccatcaagc cccagtgtct cagccttcct tcttcaagat tccattctgc aagtttatga gaagtgctgc cctgagactc tttttagttt tattgtttct ccacagggga cggattctac gattagaggg ccagcctcag caaggcaggg ttcccacctc ctacacagtc cttcctttgg gccctttata ctccaggtcc aggggtggag tgattgcctt gcagggtctg acagcgacag gaatgggggg tctggagagg ggatgccacc tccattcaga tggggcaggg tgagggagat cacctgcctg caggaataga tctagaattt aaaaggtgaa tttagacttt agaaaacaca tgcaccaccc aacttcaaga atctgtttct agatggctca tctcactccc aactcaacag taaagtttct ccctatatga attgaaactg ctttctccaa atatccattg agggtctcac aacctcctgg gtgagccact gtatttggtg ggtggggatt cccttttctc tgatacactg ttgtcttctt ctgcccaggc aatgtgcaac ctaggtccac cccaccttct ctgtcaagga ttaccctctc ccagcacttt cccaggtttt cataaatctt aactggattt ctgctgccac cggtgggctg ctctagaaat ttagtctcaa acctgccctg cccaggatac aagacagagg acagacacag gtggagggga agagccatgg ctgctgctgc gtgagtatca tttccattca tcagtggcca agaaagggcc tcccagctac gagctgctac cagctgcatc aggaccaggg ggagagttaa aagtgttaag gggatttgga atggacaggg ggaggaactg ggagggtact gtggcagacg tctaacagat catctgtttg tgggaagttg gcgaatttcc aggattcaaa aggagggcag cctccaggag gcggggctgc 
ctaagcctgg ctaaaaaatg cctctgctcc catgcaaaaa tccttcagcc aaatgtcttt ccctctatcc tgtcgctcag gctcaagtga gcacccagcc aggagaccaa ggctcctcta cctccccaca tcttttattc ctatcaggtt tgggcttcct cacctccaca aggatttctg tgcttgttcc tttgctaaag tgcatcttct gcccttcaga ctgtgccaca gcgattttct ggaagatcct cagcatcctc ccatgccctc tttgtttcat aacaagagac cctcagtaaa ttcatgagaa ggtccaaggg caggaagggg gggaccagcc ggagccaggg cgctggggac gcccaacaag acttgagggc gtctgggttc ataacccaga gcaggatgca agcatggcca taggggccct agggaggaac tggtcaggga ttggtgttca tcactccatg aggctccatg atgtgctaaa gggacgaggg acctcaggtt ttacactcag cagaatctga gactagagag cctccagttc ggaaagacga gagccccacc ttgtcaaggc accaaggggc tgaaagagtc gctccaggta tgaaaacttc cgatctttct ttaaatacaa ggtttgtttc cagtcttgca gctgtagtgc tcctcctgcc ccagtctcga tctagctcct tcagaacaag ggattggcca tcttttaaga tgcgaaggtg gacaagcttg tcagagctcc acccttattt taggtctcag aagaaccaaa cctgcaactc aacccaccat gcctctcccc ctcttcttca aacctggact tgagagtcct tttgtttcct attaggaata taggaaagat aatacagtgt cgaacccctt agatggtctc cctgagaggc cgggctgcac ctctggcggg aagccgccaa aggtcccagg ccacagtgtt acactcagtg gccctactgt ggctctggag ggcctggttt tggtgaggta agggaatgtg tcatgagttg ttgttggggc gagatgaggg aagagtagtg gagatggagg gatccagatg ttcaccatgt agcctggtcg ttcatgagaa gggagttgtt tccccaagtc agggagcaga ctacagggct agcagaaagg caagaggccc agaagcccca gtgagtcaat atcttcttgg ttgaaatccc aaataaaact tcatagggtt gggtgttttg agtggttcga tcagcctccc agtttctaag 27121 27181 27241 27301 27361 27421 27481 27541 27601 27661 27721 27781 27841 27901 27961 28021 28081 28141 28201 28261 28321 28381 28441 28501 28561 28621 28681 28741 28801 aaaggaaagg ctgtaatccc gaccagcctg gcgtggtggc gaacctagga aatgagcaaa cccagctagg ccaatggggc caggaggcca gcagccagcc gtggccgtgg gaaaaggaag tccccggcct cgataccagg tgcagtcaca caggcggccc tcgctaaagc ggagaggagg tggccgggca aggtgaggga agctggcgat cactaagctg gccgcacttt gggcagccct gtctaggggt gtcccacggg tgacgtggga cttctaccag agattcggag gatgtgatgg agcactttgg gccaacatag gggcacctgt ggcagaggtt actacatctc agaggtaagg aggttgaaaa ggcctagcag 
ctgccccagc tgaggatcgg agaatgacag tgcccaccct agccggagga gtcaccatca caggccgact tccaggctgc attctgaggg agatgccaga gaagctgggg gcggagactg cagagaaggg cccctggcgt gtgcaggcgc ggacagcagc ctccaagggc gaaatgcggc cagctgggtc a agaaagaaaa gaggccgagg tgaaaccctg gatcccagct gcagcgagcc aaaaaaaaaa tcctaagacc tagtgctgga aagcagcacc cctgccttga gtcagatgag ggtgtgctag gtgatggaaa aagcatgagt ccacggaatc cattcagttc ccagagccta ccagtcggag gggagctgga ttgacgaaag agcgggactt ttcttcaggg ctcacctcca cgccccgctg gtgggcgaca acctggccta tgaggatccc tgaagacctg ccttcattgg caggcagatc tctctactaa acttgggagg gagattgcgc caacaacaac tatgtgacaa gacccatccc cctccaactg gtcaccaatg ccggtagggg agctgtactc gtagtggttc cgggggtact cggggccgct caggtgagcc gagtcgggac ggggacaggg gggttcatga tcctaaggtc tgctgcccgg aaggggccgc gaaggacagg cgtcctggcc gcgaggctgc gttcggcacc caggattggc ggacaggggc ctgggcacgg acctgaggtc aaatacaaaa ctgaggcagg cgctgcactc aaaaaagaga atgtgtccca tttagagccc tgccccacca tgaaggggga tggtgtgccg aaattaaacc ctcacctgcg gggttggctt gaatctggga ctggagagag gcctgcaggg gcagggcttg gcctcacctg aagatcctga ggccaagaaa tctaactctc gactacagtg aatcagctcc gaggagacct ctctgggacc tctggaaccc gaggagggga tggctcacgc aggagtttga aattagccgg agaatcgctt cagcctgggc aaaccttcat ggtcttctta gttgtgtcac ggggctgccc aaaggcaggg gtcctgtggg tacaccaccc gggctggggc ctcgtccccc cctccagcca gatataggcg gcacgggagc ggataagcat gagaggttgg gggcccgaga cccgaggggc tccagcccca ctaggctgag agggagaagg cgctccggag cctcgaagga aaccctgtca gaacattgtg