Bio 251 HW 3 KEY 2-8..

Bio/CS 251 Bioinformatics
Homework 3 KEY
Due date: Monday, 2/5/07
20 points (Part 1)
1. A single strand of bacterial DNA contains the base sequence
         -35                    -10        +1
5’ CGTGTATTGACACTGGTGAGCCACTATCGTATATTCCCTAAGTGAGTATTGG 3’
3’ GCACATAACTGTGACCACTCGGTGATAGCATATAAGGGATTCACTCATAACC 5’
mRNA:                                   5’ AGUGAGUAUUGG….3’
(Promoter elements: -35 box TTGACA, -10 box TATATT; +1 is the A that begins the mRNA.)
a. What is the complementary sequence? Draw or type this sequence just below and indicate its
polarity. See above
b. Could this stretch of DNA, or part of this stretch, be transcribed in bacteria? If so, find and label
the promoter elements and the startsite for transcription. See above
c. Under the double stranded DNA sequence, draw or type the RNA sequence that could be
transcribed, and indicate its polarity. Note: show only the sequence of RNA that would be
transcribed. See above
d. Which strand of the DNA serves as the coding strand (sense strand), and which serves as the
template strand (antisense strand), for the synthesis of the RNA transcript for this hypothetical
gene fragment?
The coding strand is the “mRNA-like” strand, i.e., the top strand
The template strand is the “mRNA-unlike” strand, i.e., the bottom strand
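The answers to parts a through c can be checked with a short script. This is a minimal Python sketch, not part of the assignment: the function names are illustrative, and the +1 site is assumed to be the first occurrence of AGTGAG on the top strand, as in the key's mRNA.

```python
# Top (coding) strand from Question 1, written 5' -> 3'.
TOP = "CGTGTATTGACACTGGTGAGCCACTATCGTATATTCCCTAAGTGAGTATTGG"

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Base-by-base complement; reads 3' -> 5' when the input reads 5' -> 3'."""
    return "".join(PAIR[base] for base in strand)

def transcript(coding, start):
    """RNA copy of the coding strand from 0-based index `start` to the end."""
    return coding[start:].replace("T", "U")

bottom = complement(TOP)            # template strand, 3' -> 5'
plus_one = TOP.index("AGTGAG")      # assumed +1 site (first base of the mRNA)
mrna = transcript(TOP, plus_one)    # 5' -> 3'

print("3'", bottom, "5'")
print("5'", mrna, "3'")
```

The transcript has the same sequence as the coding (top) strand with U in place of T, which is why the top strand is called "mRNA-like".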
2. Examine the β-globin gene sequence in Slide #24 of today’s PowerPoint lecture (Bio 251
Central Dogma 06.PPT)
a. Is this a DNA or RNA sequence? Why?
RNA sequence, because uracil (U) is present instead of thymine (T)
b. Does this sequence show a primary transcript, or a mature transcript? Explain.
It shows a primary transcript, or heterogeneous nuclear RNA (hnRNA). This immature
transcript contains introns that have not yet been spliced out.
c. What three steps must occur to form a mature mRNA?
- 5’ cap must be added
- 3’ poly(A) tail must be added
- introns must be spliced out.
d. How many exons and introns are found in the human β-globin gene?
Three exons and two introns
e. Which end of the gene would contain the promoter sequence?
5’ end
f. Does the sequence shown in the slide include a promoter sequence? Why or why not?
This sequence cannot contain a promoter, because promoters are not present in mRNA.
The promoter lies upstream of the first base of the mRNA.
g. What is the sequence of the stop codon for translation of the β-globin mRNA?
UAA
h. Does the sequence shown in the slide include a 5’ UTR? A 3’ UTR? If so, how long are
each of these sequences?
5’ UTR: 50 bp long, preceding the start codon
3’ UTR: 133 bp long, after the stop codon
i. What 2-base sequence serves as the boundary between an intron and an exon at the
beginning of an intron? At the end of an intron?
The 5’ splice junction (the beginning of an intron) almost always begins with GT (GU in the RNA)
The 3’ splice junction (the end of an intron) almost always ends with AG
j. What is unusual about the Arginine codon that specifies amino acid #30 in the β-globin gene
sequence? Explain how this codon differs from the Arginine codon that specifies amino acid
#104.
Both Arg residues are specified by the codon AGG.
Arg #30, however, is encoded by the last two bases of exon 1 + the first base of exon 2,
whereas Arg #104 is encoded by the last three bases of exon 2.
In other words, the reading frame of a gene may be split in the middle of a codon by an
intron….pretty cool, huh?!
3. A gene in the chicken that codes for a specific egg protein was isolated. In one set of
experiments, the duplex DNA was denatured and allowed to hybridize with mature processed mRNA,
which was isolated from cells that express the protein. The electron microscope revealed heteroduplex
molecules formed between the mRNA and single DNA strands. Such a heteroduplex is shown here.
Answer the following questions regarding this heteroduplex:
a. Is the DNA represented by the dark or light strand? Explain.
DNA is represented by the dark strand. The mRNA represented by the light strand has
undergone processing, and introns have been excised from it. These sequences are
present in the DNA and form loops in the hybrid molecule because no complementary
sequences are present in the mRNA to permit pairing.
b. How many introns appear to be present?
7 introns and 8 exons
c. The longer strand shows an unpaired segment at each end. What do these represent, in other
words, what elements of the gene would be found in each of these regions?
At one end is the promoter region that is not transcribed. At the other end is a segment
containing sequences in the terminator that are not found in the mRNA.
d. The shorter strand shows an unpaired segment at one end.
Is this unpaired strand at the 5’ or 3’ end of the gene? What does this unpaired strand represent?
This represents the poly(A) tail at the 3’ end of the mRNA strand. It is added after
transcription is completed, and there is no complement to it in the template strand.
4. Into how many different mRNAs can an hnRNA (a primary RNA transcript) be processed if it
has three exons and if all of its 5’ splice junctions can be used with any downstream 3’ splice
junction? Diagram the information content of each mRNA.
It is possible to combine any 5’ splice junction (GT) with any 3’ splice junction (AG) to
perform a splicing event. Since there are only two 5’ and two 3’ splice junctions, the
number of alternatively spliced transcripts is limited to only two, as shown above and
below the primary transcript in the following diagram:
mRNA (exon 2 skipped):   1     3
Primary transcript:      1  2  3
mRNA (all exons kept):   1  2  3
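The counting argument can be sketched in Python. This is an illustrative enumeration only: it assumes, as the answer does, that the first and last exons are always retained and that any subset of internal exons may be skipped; the function name is hypothetical.

```python
from itertools import combinations

def spliced_mrnas(n_exons):
    """List the exon orders obtainable when any 5' splice site may be joined
    to any downstream 3' splice site. The first and last exon are always
    kept (an assumption of this sketch); internal exons may be skipped."""
    internal = range(2, n_exons)
    products = []
    for kept_count in range(len(internal), -1, -1):
        for kept in combinations(internal, kept_count):
            products.append((1, *kept, n_exons))
    return products

for mrna in spliced_mrnas(3):
    print("-".join(str(exon) for exon in mrna))
```

For three exons this yields the two transcripts in the key (all exons kept, and exon 2 skipped); under the same assumptions each additional internal exon doubles the count.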
5. How frequently would you expect to find the sequence of nucleotides provided below in a DNA
molecule simply as the result of random chance? Assume that each of the four nucleotides
occurs with the same frequency.
5’ – G G A T C G T A G C C T A – 3’
1/4^13 = once in every 6.71 x 10^7 nucleotides, i.e., once in every 67 million base pairs.
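The arithmetic can be reproduced in a few lines of Python. This is a sketch under the question's own assumption that the four nucleotides are equally frequent and independent; the variable names are illustrative.

```python
SEQ = "GGATCGTAGCCTA"

n = len(SEQ)                 # number of positions that must all match
expected_spacing = 4 ** n    # one match expected per this many positions

print(f"length = {n} nt")
print(f"expected once in every {expected_spacing:,} nt ({expected_spacing:.2e})")
```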
6. How many nucleotides long would a DNA sequence need to be in order for it not to be
found by chance more than once in a genome whose size is 3 x 10^9 base pairs long?
16 nt long: 1/4^16 = one occurrence in every 4.3 x 10^9 nt
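The same reasoning can be turned around to find the shortest length whose expected spacing is at least the genome size. A minimal Python sketch, with a hypothetical function name and the same equal-frequency assumption as Question 5:

```python
def min_unique_length(genome_size, alphabet=4):
    """Smallest n for which alphabet**n >= genome_size, i.e. the expected
    number of chance occurrences in the genome is at most one."""
    n = 1
    while alphabet ** n < genome_size:
        n += 1
    return n

print(min_unique_length(3 * 10**9))
```

A 15-mer is too short because 4^15 is about 1.07 x 10^9, which is smaller than the genome, so it would be expected to occur by chance about three times.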
7. a. What sequence of amino acids would the following RNA sequence code for if it were to be
translated by a ribosome?
5’ AUG GGA UGU CGC CGA AUA 3’
MET GLY CYS ARG ARG ILE
b. What sequence of amino acids would it code for if the first nucleotide were deleted and another
‘A’ was added to the 3’ end?
5’ UGG GAU GUC GCC GAA UAA 3’
TRP ASP VAL ALA GLU Stop
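Both translations can be checked with a small script. The sketch below builds the standard genetic code from the conventional U, C, A, G ordering and reports one-letter amino acid codes, with "*" for a stop codon; the function and variable names are illustrative, not part of the assignment.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Translate from the first base in frame; stop after the first stop codon."""
    rna = rna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)

original = "AUG GGA UGU CGC CGA AUA"
# Part b: delete the first nucleotide and add an A to the 3' end.
shifted = original.replace(" ", "")[1:] + "A"

print(translate(original))
print(translate(shifted))
```

The single-base deletion shifts the reading frame, so every downstream codon changes and the added A completes a UAA stop codon.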
This is Part 2 and is worth 10 points.
The following exercise will be due on
Tuesday, Feb. 6 at 5 pm
8. Using bioinformatic tools to locate information about gene structure:
The gene/protein mutS/MSH2 is universally conserved across all species, because it serves an
essential function in DNA repair for all organisms.
Using Entrez on the NCBI homepage (http://www.ncbi.nlm.nih.gov/entrez/) find the entries for the
mutS/MSH2 gene from the following three species:
(To find these genes easily, use the “Search” box, and choose a “Gene” search)
Prokaryotic (bacteria):
Staphylococcus aureus [Hint: search “mutS Staphylococcus aureus”]
Eukaryotic:
Homo sapiens (human) [Hint: search “hMSH2 Homo sapiens”]
Aspergillus nidulans (bread mold fungus) [Hint: search “mutS Aspergillus nidulans”, and study the first gene entry at the top of the list]
Do the following:
1. Print out the annotation page (Entrez Gene page) for each of the three gene versions.
Here are the first several lines of each Entrez Gene page:
1: mutS DNA mismatch repair protein mutS [ Staphylococcus
aureus subsp. aureus USA300 ]
GeneID: 3913979 updated 27-Jan-2007
1: MSH5 mutS homolog 5 (E. coli) [ Homo sapiens ]
GeneID: 4439 updated 20-Jan-2007
1: AN5006.2 hypothetical protein [ Aspergillus nidulans FGSC A4 ]
GeneID: 2872806 updated 01-Feb-2007
2. Compare the lengths of the mutS/MSH2 genomic and mRNA sequences for each species.
(why does the S. aureus annotation not include an mRNA sequence, and what will you use
instead?)
Construct a table to record this information in a clear and systematic manner.
Species            Gene        mRNA
S. aureus          2523 nt     2523 nt
A. nidulans        6215 nt     4935 nt (XM_657518 4935 bp mRNA)
H. sapiens MSH5    28811 nt    a. 2903 nt (NM_002441 2903 bp mRNA)
                               b. 2796 nt (NM_025259 2796 bp mRNA)
                               c. 2748 nt (NM_172165 2748 bp mRNA)
                               d. 2745 nt (NM_172166 2745 bp mRNA)
(If instead you used H. sapiens hMSH2, then you received full credit as well)
The human gene can be transcribed to produce 4 similar mRNA variants, as shown above
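The recorded lengths can be compared directly by asking what fraction of each gene's length is represented in its mRNA. This is a rough Python sketch using only the numbers in the key; the dictionary layout and function name are illustrative, and human variant a is used as the representative transcript.

```python
# (gene length, mRNA length) in nucleotides, taken from the answer key.
lengths = {
    "S. aureus": (2523, 2523),
    "A. nidulans": (6215, 4935),
    "H. sapiens MSH5 (variant a)": (28811, 2903),
}

def mrna_fraction(gene_nt, mrna_nt):
    """Fraction of the gene's length that is represented in the mRNA."""
    return mrna_nt / gene_nt

for species, (gene_nt, mrna_nt) in lengths.items():
    share = mrna_fraction(gene_nt, mrna_nt)
    print(f"{species}: {share:.1%} of the gene length appears in the mRNA")
```

The remainder of each gene's length is intron sequence removed by splicing, which is the comparison the evolution question asks about.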
3. Record the numbers and lengths of the exons and introns for each gene.
The easiest way to get this information is as follows:
a. From the Entrez Gene page, go to the “Display” pull-down box, and
in place of “Full Report”, select “Gene Table”. Presto! You have all of the
information you need to answer this question.
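Exon and intron lengths can also be computed from the join(...) location of an mRNA feature in a GenBank record. The sketch below is illustrative only: the function names are hypothetical, it assumes a simple forward-strand location without complement(...) or partial-end markers, and the example uses just the first three intervals of the MSH5 mRNA feature listed in the human GenBank entry in this key.

```python
import re

def parse_join(location):
    """Return (start, end) pairs from a location such as join(1..5,9..12)."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]

def exon_intron_lengths(location):
    """Exon lengths, and lengths of the gaps (introns) between adjacent exons."""
    exons = parse_join(location)
    exon_lengths = [end - start + 1 for start, end in exons]
    intron_lengths = [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]
    return exon_lengths, intron_lengths

example = "join(1996..2058,2450..2609,3160..3283)"
exon_lengths, intron_lengths = exon_intron_lengths(example)
print("exons:", exon_lengths)
print("introns:", intron_lengths)
```

Coordinates are inclusive, so an exon spanning 1996..2058 is 63 nt long and the intron between 2058 and 2450 is 391 nt long.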
4. Evolution: given that the mutS/MSH2 gene codes for the same protein in each species, how
can you account for the dramatic differences in length of these genes?
For example, what is revealed by comparing the gene lengths in prokaryotes versus simple
eukaryote to complex eukaryote? Do you think that introns are ancestral and have been lost
over the course of evolution in some groups, or do you think that introns are recent and were
gained during the evolution of more complex eukaryotes?
Answer: Some of both, probably. Organisms ancestral to modern prokaryotes likely had some
introns, and modern prokaryotes lost introns due to selective pressures that favored streamlining
during the course of their evolution. As organisms evolved greater and greater complexity, it
seems they gained more and longer introns. In other words, greater complexity (multicellularity,
cell and tissue specialization) was accompanied by a corresponding proliferation and expansion of
introns along with the invention of more and different genes, and more ways of splicing a single
gene to create variant transcripts and proteins.
Here are the GenBank Full entries for each gene:
S. aureus 2523 nt
LOCUS       NC_007793               2523 bp    DNA     linear   BCT 26-JAN-2007
DEFINITION  Staphylococcus aureus subsp. aureus USA300, complete genome.
ACCESSION   NC_007793 REGION: 1306747..1309269
VERSION     NC_007793.1  GI:87159884
PROJECT     GenomeProject:16313
KEYWORDS    .
SOURCE      Staphylococcus aureus subsp. aureus USA300
  ORGANISM  Staphylococcus aureus subsp. aureus USA300
            Bacteria; Firmicutes; Bacillales; Staphylococcus.
REFERENCE   1 (bases 1 to 2523)
  AUTHORS   Diep,B.A., Gill,S.R., Chang,R.F., Phan,T.H., Chen,J.H.,
            Davidson,M.G., Lin,F., Lin,J., Carleton,H.A., Mongodin,E.F.,
            Sensabaugh,G.F. and Perdreau-Remington,F.
  TITLE     Complete genome sequence of USA300, an epidemic clone of
            community-acquired meticillin-resistant Staphylococcus aureus
  JOURNAL   Lancet 367 (9512), 731-739 (2006)
   PUBMED   16517273
REFERENCE   2 (bases 1 to 2523)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (11-FEB-2006) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3 (bases 1 to 2523)
  AUTHORS   Diep,B.A., Gill,S.R., Chang,R.F., Phan,T.H., Chen,J.H.,
            Davidson,M.G., Lin,F., Lin,J., Carleton,H.A., Mongodin,E.F.,
            Sensabaugh,G.F. and Perdreau-Remington,F.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-JAN-2006) Department of Medicine, University of
            California, San Francisco, San Francisco, CA 94143, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from CP000255.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..2523
                     /organism="Staphylococcus aureus subsp. aureus USA300"
                     /mol_type="genomic DNA"
                     /strain="USA300; FPR3757"
                     /sub_species="aureus"
                     /db_xref="taxon:367830"
     gene            1..2523
                     /gene="mutS"
                     /locus_tag="SAUSA300_1188"
                     /db_xref="GeneID:3913979"
     CDS             1..2523
                     /gene="mutS"
                     /locus_tag="SAUSA300_1188"
                     /note="identified by match to protein family HMM PF00488;
                     match to protein family HMM PF01624; match to protein
                     family HMM PF05188; match to protein family HMM PF05190;
                     match to protein family HMM PF05192; match to protein
                     family HMM TIGR01070"
                     /codon_start=1
                     /transl_table=11
                     /product="DNA mismatch repair protein mutS"
                     /protein_id="YP_493885.1"
                     /db_xref="GI:87160655"
                     /db_xref="GeneID:3913979"
>gi|87159884:1306747-1309269 Staphylococcus aureus subsp. aureus USA300,
complete genome
ATGTTTTATGAAGATGCCAAGGAGGCATCACGTGTACTTGAAATTACTTTAACTAAAAGAGATGCTAAAA
AAGAAAATCCAATTCCGATGTGTGGTGTTCCGTATCATTCTGCAGATAGTTATATAGATACACTTGTTAA
TAATGGATATAAAGTAGCTATTTGTGAACAGATGGAAGATCCGAAACAAACGAAAGGTATGGTTAGACGT
GAGGTAGTAAGAATTGTGACTCCAGGAACTGTGATGGAGCAAGGTGGTGTAGATGATAAACAAAATAACT
ATATTTTAAGTTTTGTTATGAATCAACCTGAAATTGCGCTTAGTTACTGTGATGTTTCTACTGGCGAATT
AAAGGTTACACATTTTAATGATGAAGCGACTTTATTAAATGAAATTACGACGATAAACCCTAACGAAGTT
GTTATCAATGACAATATTTCCGATAATTTAAAAAGACAAATTAATATGGTGACAGAAACAATAACAGTCA
GGGAAACGTTATCATCAGAAATCTATAGTGTGAATCAAACTGAACATAAATTAATGTATCAAGCGACACA
ATTATTGCTAGATTATATTCATCATACACAAAAACGTGATTTATCGCATATCGAGGATGTTGTTCAATAT
GCAGCTATAGATTATATGAAAATGGATTTTTATGCTAAGAGAAACCTTGAGTTAACGGAAAGCATTCGAT
TAAAATCAAAAAAAGGAACGCTACTTTGGCTAATGGACGAAACGAAAACACCAATGGGAGCACGCCGCTT
AAAACAATGGATAGATAGACCACTAATAAGTAAAGAACAAATTGAAGCACGATTAGATATCGTTGATGAA
TTTAGTGCTCATTTCATAGAAAGAGACACCTTAAGAACATATCTTAATCAAGTGTATGATATTGAACGTC
TTGTTGGGCGTGTTAGTTACGGAAATGTTAATGCGAGAGATTTAATTCAACTTAAACATTCCATTTCTGA
AATACCGAATATTAAAGCATTACTAAATTCTATGAATCAGAATACTCTTGTACAAGTTAATCAACTAGAA
CCCCTTGATGATTTACTTGATATATTAGAACAGAGTTTAGTAGAAGAACCACCAATTTCAGTTAAAGATG
GCGGACTATTCAAAGTTGGTTTTAATACGCAATTAGATGAATATCTTGAAGCTTCAAAAAACGGAAAAAC
ATGGTTAGCAGAATTACAAGCCAAAGAAAGACAACGTACAGGAATAAAATCATTGAAAATAAGCTTTAAT
AAAGTGTTTGGTTATTTTATAGAAATAACACGTGCCAACTTGCAAAATTTTGAACCAAGTGAATTTGGTT
ATATGAGGAAGCAAACGTTATCGAATGCTGAACGTTTTATAACTGATGAACTTAAAGAAAAAGAAGATAT
CATTTTAGGTGCGGAAGACAAAGCCATCGAATTAGAATATCAATTATTTGTTCAGCTACGTGAAGAAGTT
AAAAAATATACTGAACGTTTACAACAACAAGCTAAAATTATTTCAGAGCTAGATTGTTTACAGAGCTTTG
CAGAAATTGCTCAAAAATATAATTACACTAGGCCTTCATTTAGTGAAAATAAAACATTAGAATTAGTGGA
ATCTAGGCACCCAGTAGTGGAAAGAGTAATGGATTATAATGACTATGTGCCTAATAATTGTCGATTAGAT
AATGAAACATTTATATATTTAATTACAGGTCCGAATATGTCTGGTAAATCGACATATATGAGACAAGTTG
CCATAATTAGTATAATGGCCCAAATGGGAGCTTATGTCCCTTGTAAAGAGGCAGTGTTACCTATATTTGA
TCAAATATTCACTAGAATAGGTGCGGCAGATGATTTGGTTTCAGGTAAGAGTACGTTTATGGTAGAAATG
CTAGAAGCACAAAAGGCATTAACTTATGCAACAGAGGATAGTTTGATTATTTTCGATGAAATTGGACGTG
GTACTTCAACGTATGACGGTTTAGCTTTAGCGCAGGCAATGATAGAGTATGTAGCTGAAACATCACATGC
TAAAACGTTATTTTCAACACATTATCATGAATTGACAACATTAGATCAAGCATTACCAAGTCTAAAAAAT
GTTCACGTCGCTGCTAATGAATATAAAGGTGAACTTATATTCTTGCATAAAGTCAAAGATGGTGCAGTTG
ACGATAGTTATGGTATTCAAGTTGCGAAATTAGCTGATTTACCTGAAAAAGTTATTAGCAGAGCACAAGT
GATTCTAAGCGAGTTTGAAGCGTCTGCTGGTAAAAAATCATCGATATCAAATTTAAAAATGGTCGAAAAT
GAACCTGAAATTAATCAAGAAAATTTAAACTTAAGTGTTGAAGAAACAACTGATACTTTATCTCAAAAAG
ACTTTGAACAAGCATCATTTGATTTGTTTGAAAATGATCAAGAAAGCGAGATTGAACTACAAATTAAAAA
TTTGAATTTATCTAATATGACACCAATTGAGGCATTGGTGAAGTTAAGTGAATTACAAAATCAATTAAAA
TAG
A. nidulans 6215 nt
LOCUS       NT_107011               6215 bp    DNA     linear   CON 31-JAN-2007
DEFINITION  Aspergillus nidulans FGSC A4 chromosome III scaffold_5, whole
            genome shotgun sequence.
ACCESSION   NT_107011 REGION: 2170653..2176867
VERSION     NT_107011.1  GI:50058543
KEYWORDS    WGS.
SOURCE      Aspergillus nidulans FGSC A4
  ORGANISM  Aspergillus nidulans FGSC A4
            Eukaryota; Fungi; Ascomycota; Pezizomycotina; Eurotiomycetes;
            Eurotiales; Trichocomaceae; Emericella.
REFERENCE   1 (bases 1 to 6215)
  AUTHORS   Galagan,J.E., Calvo,S.E., Cuomo,C., Ma,L.J., Wortman,J.R.,
            Batzoglou,S., Lee,S.I., Basturkmen,M., Spevak,C.C., Clutterbuck,J.,
            Kapitonov,V., Jurka,J., Scazzocchio,C., Farman,M., Butler,J.,
            Purcell,S., Harris,S., Braus,G.H., Draht,O., Busch,S., D'Enfert,C.,
            Bouchier,C., Goldman,G.H., Bell-Pedersen,D., Griffiths-Jones,S.,
            Doonan,J.H., Yu,J., Vienken,K., Pain,A., Freitag,M., Selker,E.U.,
            Archer,D.B., Penalva,M.A., Oakley,B.R., Momany,M., Tanaka,T.,
            Kumagai,T., Asai,K., Machida,M., Nierman,W.C., Denning,D.W.,
            Caddick,M., Hynes,M., Paoletti,M., Fischer,R., Miller,B., Dyer,P.,
            Sachs,M.S., Osmani,S.A. and Birren,B.W.
  TITLE     Sequencing of Aspergillus nidulans and comparative analysis with A.
            fumigatus and A. oryzae
  JOURNAL   Nature 438 (7071), 1105-1115 (2005)
   PUBMED   16372000
REFERENCE   2 (bases 1 to 6215)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (07-JUL-2004) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3 (bases 1 to 6215)
  AUTHORS   Birren,B., Nusbaum,C., Abebe,A., Abouelleil,A., Adekoya,E.,
Ait-zahra,M., Allen,N., Allen,T., An,P., Anderson,M., Anderson,S.,
Arachchi,H., Armbruster,J., Bachantsang,P., Baldwin,J., Barry,A.,
Bayul,T., Blitshsteyn,B., Bloom,T., Blye,J., Boguslavskiy,L.,
Borowsky,M., Boukhgalter,B., Brunache,A., Butler,J., Calixte,N.,
Calvo,S., Camarata,J., Campo,K., Chang,J., Cheshatsang,Y.,
Citroen,M., Collymore,A., Considine,T., Cook,A., Cooke,P.,
Corum,B., Cuomo,C., David,R., Dawoe,T., Degray,S., Dodge,S.,
Dooley,K., Dorje,P., Dorjee,K., Dorris,L., Duffey,N., Dupes,A.,
Elkins,T., Engels,R., Erickson,J., Farina,A., Faro,S., Ferreira,P.,
Fischer,H., Fitzgerald,M., Foley,K., Gage,D., Galagan,J.,
Gearin,G., Gnerre,S., Gnirke,A., Goyette,A., Graham,J.,
Grandbois,E., Gyaltsen,K., Hafez,N., Hagopian,D., Hagos,B.,
Hall,J., Hatcher,B., Heller,A., Higgins,H., Honan,T., Horn,A.,
Houde,N., Hughes,L., Hulme,W., Husby,E., Iliev,I., Jaffe,D.,
Jones,C., Kamal,M., Kamat,A., Kamvysselis,M., Karlsson,E.,
Kells,C., Kieu,A., Kisner,P., Kodira,C., Kulbokas,E., Labutti,K.,
Lama,D., Landers,T., Leger,J., Levine,S., Lewis,D., Lewis,T.,
Lindblad-toh,K., Liu,X., Lokyitsang,T., Lokyitsang,Y., Lucien,O.,
Lui,A., Ma,L.J., Mabbitt,R., Macdonald,J., Maclean,C., Major,J.,
Manning,J., Marabella,R., Maru,K., Matthews,C., Mauceli,E.,
Mccarthy,M., Mcdonough,S., Mcghee,T., Meldrim,J., Meneus,L.,
Mesirov,J., Mihalev,A., Mihova,T., Mikkelsen,T., Mlenga,V.,
Moru,K., Mozes,J., Mulrain,L., Munson,G., Naylor,J., Newes,C.,
Nguyen,C., Nguyen,N., Nguyen,T., Nicol,R., Nielsen,C., Nizzari,M.,
Norbu,C., Norbu,N., O'donnell,P., Okoawo,O., O'leary,S.,
Omotosho,B., O'neill,K., Osman,S., Parker,S., Perrin,D.,
Phunkhang,P., Piqani,B., Purcell,S., Rachupka,T., Ramasamy,U.,
Rameau,R., Ray,V., Raymond,C., Retta,R., Richardson,S., Rise,C.,
Rodriguez,J., Rogers,J., Rogov,P., Rutman,M., Schupbach,R.,
Seaman,C., Settipalli,S., Sharpe,T., Sheridan,J., Sherpa,N.,
Shi,J., Smirnov,S., Smith,C., Sougnez,C., Spencer,B., Stalker,J.,
Stange-thomann,N., Stavropoulos,S., Stetson,K., Stone,C., Stone,S.,
Stubbs,M., Talamas,J., Tchuinga,P., Tenzing,P., Tesfaye,S.,
Theodore,J., Thoulutsang,Y., Topham,K., Towey,S., Tsamla,T.,
Tsomo,N., Vallee,D., Vassiliev,H., Venkataraman,V., Vinson,J.,
Vo,A., Wade,C., Wang,S., Wangchuk,T., Wangdi,T., Whittaker,C.,
Wilkinson,J., Wu,Y., Wyman,D., Yadav,S., Yang,S., Yang,X.,
Yeager,S., Yee,E., Young,G., Zainoun,J., Zembeck,L., Zimmer,A.,
Zody,M. and Lander,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-APR-2004) Broad Institute, 320 Charles Street,
            Cambridge, MA 02142, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from CH236924.
FEATURES             Location/Qualifiers
     source          1..6215
                     /organism="Aspergillus nidulans FGSC A4"
                     /mol_type="genomic DNA"
                     /strain="FGSC A4"
                     /db_xref="taxon:227321"
                     /chromosome="III scaffold_5"
ORIGIN
1 atgtcgtctc ggccagatct cagggtaagt tctttttagc ctagaaaccg aacagcttct
61 tactaacttt ctttctaaag gttgacgacg aagtcggctt tatccgcttc taccgctccc
121 tcgcctccga tgactcccat aataatgaaa caattcgcat cttcgaccgt ggggattggt
181 actcagccca cggcaaagaa gccgaattca tcgctcgcac agtctacaaa acaacctctg
241 tccttcgtaa tctcggccgc agcgaaacgg gcggcttgcc gtccgtcaca atgagcatta
301 ctgtcttccg taatttttta cgtgaggctc tattccggct aaataagagg attgagatct
361 ggggctccgc cggcacgggc aaagggcact ggaagaaggt taagcaggcg agtcccggaa
421 atctgcagga tgtggaggag gaattagggg caatgggtat ggagggaagt aacggagcgc
481 ccattatcat ggcagtgaag cttagtgcaa aggccgggga ggcgcgaaat gtaggtgttt
541 gttttgcaga tgcaagtgtg cgcgagcttg gtgtgagtga gttcctggac aatgatgttt
601 actcaaactt tgaggcgctt gttatccagc tcggtgtgaa agagtgtctc gttgtgcagg
661 atgtcaatcg gaaggatgtg gaggtggcca agatccgagc aatatgtgat aactgcggga
721 tagcgatatc ggagcgcccg gcatctgatt ttggggttaa ggatattgaa caggacctta
781 caaggttgct gagggatgag cggtcggctg ggacactgcc ggagacggag ctgaagcttg
841 cgatgggcgg tgcggcggcg ctaattcggt atttgggcgt gatgtcggat gcgacaaatt
901 tcgggcagta tcaactctac cagcatgatt tggcgcagta catgaagctc gatgcggcgg
961 cattgagagc tttgaatctt atgcctgggc cgagggatgg atcaaaatcg atgagtttat
1021 ttgggctgtt gaatcattgt aaaacgcctg ttgggagccg gttgctggca cagtggctga
1081 aacagccgtt aatggatctg gcggagattg aaaagcggca aaggcttgtt gaggcgtttg
1141 tcgtgagcac ggagcttcgg cagatgatgc aggaggagca tctacgatct attccggatc
1201 tgtatcggct tgcgaaacga ttccagcgaa aacaggcgaa tctggaagat gtagtgcgtg
1261 tgtatcaggt tgctattcgg ctgcctgggt ttgtgaactc tctggagaat gttatggatg
1321 aggagtacca gacgccgctt gagacagagt acacggccaa gctacgcaac cattcggcga
1381 gcctggcgaa actggaggag atggtcgaga cgacggttga tctggatgcc ctcgagaatc
1441 acgagttcat catcaagccc gaattcgatg atagtctgcg catcattcgc aaaaagctgg
1501 atcagttgcg ccatgatatg taccttgagc ataaggctgt cgcgagagac ctagatcagg
1561 aaatggacaa gaagctgttc ctggagaacc accgcgtgta cggatggtgt ttccgtctga
1621 cgcggaatga ggcgggttgc attcgcaaca agaaggccta ccaggagtgc tcaacgcaga
1681 agaacggtgt gtactttacc acatcgacga tgcaatctct ccgccgggaa catgatcagc
1741 tctcctccaa ttacaaccgc acccagacgg gacttgtctc ggaggttgtc aacgttgcag
1801 catcgtactg tccggtcctg gaacaactag ccggcgtcct ggctcacctc gatgtcattg
1861 tgagctttgc gcacgcctct gtacacgcgc caacagccta tacgaaaccc aagatccacc
1921 cgcgcggcac gggcaataca gtccttaaag aagcacgcca cccctgcatg gagatgcagg
1981 acgacatctc cttcataact aatgatgtct cccttatccg cgacgagtcc tcattcctta
2041 tcatcactgg ccccaatatg ggcggtaaat cgacctacat ccgcatgatt ggcgttatag
2101 cgctcatggc gcagataggc tgcttcgtgc cctgcaccga agcagagttg acgatctttg
2161 actgcatcct tgcccgtgtt ggtgcgagcg attcgcagct taaaggcgtt tctacgttca
2221 tggcggagat gctcgaaacg tcaaacatcc tcaagtcagc gacctctgag tccttgatca
2281 tcattgatga attgggccgc ggcactagca cttacgacgg attcggtctc gcctgggcga
2341 tttcagagca cattgtgacc gaaatccgct gctttggact attcgcgaca cacttccacg
2401 aactcacgac acttgcagat cgatacccca agtctgtcaa gaacctgcac gtcgttgcat
2461 tcatcggcga cggaacaaca gcgaacgaag aagacgaaaa agagaagaga aagacccggc
2521 agaaggtaac attgctctac cgcgttgaac caggcatctg cgaccagtct ttcggcatcc
2581 acgtcgctga gctcgtccgc ttcccagaga aggtagtcaa catggcgcgg cagaaggccg
2641 aggagctgga ggattttacg tccgctgatt ccgctggaaa tgctgcgtca gcgacgattg
2701 ataagtactc gcaggaggaa gtcgaagaag ggagtgcgct gttgaaagcg ctgctggtga
2761 agtggaagag tgcaattgag gagccgggga gagagctgac gcttgaggag aagaggcagg
2821 ttatgaggga tttggtaaaa ggggacgaga agttgcaggc gaatagagtc ttccagggga
2881 ttcaggcgct gtgagccatc atccagagtt gtttctggtc ttcggtttcg ggttatctat
2941 gcttctgtct gcaatggtgg taggtagtta tggcgttcgg gtagattcta cgcaaggttg
3001 tataggtcat tcaactgcca ggtaatggat ctgatataca tctatgatgt atttaatatg
3061 tttttcgaaa tcaacgagtc atatattgcc atactacttt ctgcgctacc tgtatagtat
3121 gaaatcacgt gcccatggtt tagagtcccg cccgcgcagt cggagtaagt gcgcaaatct
3181 gcggagacct ccacaaacac tcaacgctca ctcttgaggt cgtctattgc tctttgttgt
3241 ggcctgtgca gcttagtttt aagctcaaag cgccaattac tctgcctttt tactgctgga
3301 tacagtaatc gaacccgact tgtccctcgt tacgtccctc tttacagtgc ccgtccgtga
3361 tgtgcgtaca gtctaattga ctcgaagctg gaccgtccac atcgcatttg caatagtcta
3421 attcatgcac cccttttcca tcctcactct cggcatcttc gtagcgggct acatcaccgc
3481 gaggtgggat ctcgtcacac ggctatacga attggccata tttgcttggg atcacggcgt
3541 ggtggtgcgt tcctttaagc tatctttcaa gcgtcgcgct actcttgaaa gggctccgtt
3601 cgctaacggc gggctcagac taggacagca aagggatttg tcttcctctc ctttttcttt
3661 ttcttgcttg tcgtaccgat aaaccatctg gcgcaaaagg agagtattca ggtaggctgg
3721 caatctgata gttaaagagt ttgggcgata ctaatgaccg ttgctccagc accctcgacc
3781 tgtccatctc ggtgtttccg ctcgcgaaca actacgccgt cgcggctcgt tttaaaggat
3841 gacattttga caccatgcag atgagcgatg gattgctacg ggtattttgg ccatacgacc
3901 tcccccggtc gtcggcgcca ggtgtaattg ttggatggag gaattcggag ctcgatttat
3961 ttgtgcttgc cgtgttagag gatgttgagg tacgcgcggt tggcgagtgt ttgagaaaag
4021 gttgctaact ctctcgcctg cagccgcgga atgtcgacaa tgctctccgc gcgggtatct
4081 tatttcgcaa cagtcctcac ccagttccgc gcatttttag tctgtgcggt cggtcgtcaa
4141 tgcatgtatt gggatcgacg aatatgcgag atccgccgac ggcgttcaat gcctcgcatt
4201 tgattgttac tactcgtccc tcgtgcaagg tcccgaagat atactgtccg cccgagacag
4261 gcctctcgat tcaggttatt atgtttcacc gaccggatcc aaccaggatg gagtatatgt
4321 cgctgaaccc catttcgctt gctttggagg acaagatgct tgctgcggat aagtcggacg
4381 gggtattgga tacgatagaa accgaggagg gagcgcagaa ggctcgcgcg caacgattag
4441 tggataagct caggctccat acggttgtca agcatcttcc atcacacaaa gagttggcgc
4501 ttccattcat catcaaccag atcaattgcg catacgaaat ggggaagctg atagaaaaga
4561 acagcagcct tattgggatc cgcgtgaaga ggagcatgag tgtttcggag cgggtggttg
4621 agtcggctac gacggtgtgg gacttgttcg ttctcggagt gtcctatgtc ttttggcagt
4681 ggatatggcc ggtcattacc cggatattcg ttatagggct ggtcttccac cgcgttatag
4741 cagactttgt tctgcaggtg ttggagtggc gagctcggcc agacgccgcg gcgttgaagg
4801 atatctctgc gacggcacag caggtggaca ttcgccttca gcagttctgc tactggccta
4861 cgcagtatgt caagttgcgg cagagaaagg atgactggga aagtgtgaca acctatcatg
4921 cccactatat ccggttctac aacagtctgt ggcttgtggc aaacgatgtt attatcggga
4981 tcgctctagg gtcgtacatt attgataatg ccaactgggt ggcttaccaa atcaattata
5041 tcctcagtgg ttggactgtt gagggtctcc agcggacgat ttcctggctg atggactggc
5101 cagctgggct gaagctgaat aatgagcttg cggcatttct gggtgatctt ttcttgtggg
5161 ttatcgagaa ctgggcaggt aagattgatt ctctctcaag tgtgtctggt gctaactggt
5221 cagcgtgcat cgcgaatctc caaccctatc ttcctcatat catatacgtt atcggctgcg
5281 caagcttcgc aggtgccagc atgcccatcg cccttttctc ggatctcatt tccatcctga
5341 cagtgcacat atactctttc tacattgcat cagcgcggat cttcaattgg caacttacga
5401 tcatcatttc gctcttccac ctcttccgcg ggaagaagcg gaacgtcctc cgcaaccgta
5461 tagattcttg tgactatgac cttgaccagc tcctcctcgg gaccattctt tttactgttc
5521 tctttttcct cctccctaca gtcattgtat tctatctcgc gttcacgtcc gcgcggatgt
5581 taataatctc cttgaaagca gcacttgaca cttggctggc tttcttgaat cattttcctc
5641 tctttgcgct gatgttgcgc gtcaaggatt cgaggagact accaggttag tacactttat
5701 ggggggtggt atcttcatat taactgatca tccgcaggcg gcatccgctt tgaactccgc
5761 gaagagcacg atactctacc taattcagta gactcaacga tacattacat ccaccttgag
5821 gtaagcccgg tctctctctt cttcaaaatc tggactaagt taattgtata aacaagtcca
5881 tccccctaac tctccgtgcc atgttcgacc aatacttcca acttgcccac cgcctccgga
5941 aacactacct ctctccgcaa gtcatcttct gcctctttac cggccgcttc gtgcccccta
6001 tccatcgtcg cagtctctat agtatgcaat acagtatgct tcctgatgtc cggccctcgc
6061 tcgcgcaagt atggagcgaa ttgactcagc cggcgcgaaa gtcaagtggt gggggtagtg
6121 gcggtgtgtc tggacagggt ggcaatagca ttgggatcgg gattatgggg atgaagatgg
6181 gagggcttgc ggagagaagg agagggtatc gttga
H. sapiens 28811 nt
LOCUS       AY943816               28811 bp    DNA     linear   PRI 08-MAR-2005
DEFINITION  Homo sapiens mutS homolog 5 (E. coli) (MSH5) gene, complete cds.
ACCESSION   AY943816
VERSION     AY943816.1  GI:60459549
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 28811)
  AUTHORS   Livingston,R.J., Rieder,M.J., Rajkumar,N., Downing,T.K.,
            Olson,A.N., Nguyen,C.P., Gildersleeve,H., Cassidy,C.M.,
            Johnson,E.J., Swanson,J.E., McFarland,I., Yool,B., Park,C. and
            Nickerson,D.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-FEB-2005) Genome Sciences, University of Washington,
            1705 NE Pacific, Seattle, WA 98195, USA
COMMENT     To cite this work please use: NIEHS-SNPs, Environmental Genome
            Project, NIEHS ES15478, Department of Genome Sciences, Seattle, WA
            (URL: http://egp.gs.washington.edu).
FEATURES             Location/Qualifiers
     source          1..28811
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     gene            1996..26853
                     /gene="MSH5"
     mRNA            join(1996..2058,2450..2609,3160..3283,4874..4954,
                     5105..5167,5901..6022,6136..6296,6543..6578,7234..7316,
                     9384..9429,15301..15439,15567..15629,20167..20295,
                     20550..20622,20768..20877,21099..21179,21427..21514,
                     21788..21977,22092..22218,22692..22841,23190..23264,
                     23474..23617,23820..23957,24114..24187,24422..25170,
                     25417..25557,26066..26161,26268..26357,26438..26853)
                     /gene="MSH5"
                     /product="mutS homolog 5 (E. coli)"
     CDS             join(2463..2609,3160..3283,4874..4954,5105..5167,
                     5901..6022,6136..6296,6543..6578,7234..7316,9384..9429,
                     15301..15439,15567..15629,20167..20295,20550..20622,
                     20768..20877,21099..21179,21427..21514,21788..21977,
                     22092..22218,22692..22841,23190..23264,23474..23617,
                     23820..23957,24114..24187,24422..24533)
                     /gene="MSH5"
                     /codon_start=1
                     /product="mutS homolog 5 (E. coli)"
                     /protein_id="AAX20111.1"
                     /db_xref="GI:60459550"
/translation="MASLGANPRRTPQGPRPGAASSGFPSPAPVPGPREAEEEEVEEE
EELAEIHLCVLWNSGYLGIAYYDTSDSTIHFMPDAPDHESLKLLQRVLDEINPQSVVT
SAKQDENMTRFLGKLASQEHREPKRPEIIFLPSVDFGLEISKQRLLSGNYSFIPDAMT
ATEKILFLSSIIPFDCLLTPPGDLRFTPIPLLIPSQVRALGGLLKFLGRRRIGVELED
YNVSVPILGFKKFMLTHLVNIDQDTYSVLQIFKSESHPSVYKVASGLKEGLSLFGILN
RCHCKWGEKLLRLWFTRPTHDLGELSSRLDVIQFFLLPQNLDMAQMLHRLLGHIKNVP
LILKRMKLSHTKVSDWQVLYKTVYSALGLRDACRSLPQSIQLFRDIAQEFSDDLHHIA
SLIGKVVDFEGSLAENRFTVLPNIDPEIDEKKRRLMGLPSFLTEVARKELENLDSRIP
SCSVIYIPLIGFLLSIPRLPSMVEASDFEINGLDFMFLSEEKLHYRSARTKELDALLG
DLHCEIRDQETLLMYQLQCQVLARAAVLTRVLDLASRLDVLLALASAARDYGYSRPRY
SPQVLGVRIQNGRHPLMELCARTFVPNSTECGGDKGRVKVITGPNSSGKSIYLKQVGL
ITFMALVGSFVPAEEAEIGAVDAIFTRIHSCESISLGLSTFMIDLNQVAKAVNNATAQ
SLVLIDEFGKGTNTVDGLALLAAVLRHWLARGPTCPHIFVATNFLSLVQLQLLPQGPL
VQYLTMETCEDGNDLVFFYQVCEGVAKASHASHTAAQAGLPDKLVARGKEVSDLIRSG
KPIKPVKDLLKKNQMENCQTLVDKFMKLDLEDPNLDLNVFMSQEVLPAATSIL"
ORIGIN
1 tggagtgcaa gaacacagaa ctaaaacaga gcttgaaact taaagaaagg gagagacttg
61 ggggaggagt ggggtggagt gacgtgatgt gctgctggaa accagcagtt ggtggtttcc
121 tcttgtgctt cctcttctgt gggttttctc ctgcttgtgg gagggccttt ttctctcctc
181 ccgacagaaa ggctatcttt ggtgttcgtt cccttgaact gtaacatcct gtaagggtat
241 gattccatgc ctctgtgtgg gtgtgaattc cctcatggtg accctcaaaa tctgcacaca
301 ggaccccttc ccattgaggg gaggggatca aaacaactct acttctcagg gtcctctcct
361 gttccaactg gtctgtgtcc aagagaagcc ttaggtaaat ggggccagct tgaagatcaa
421 acaggtttgg cagcctctcc cggcctctct tttctctcct acagctttat agctacagct
481 gccttgatat caatattgac tttggctggc tggcatgact acccacaggg tatcgtgcct
541 taatttacca ggtgacaggc aacgctgccc tctcctggaa ccatccagca gagccagggc
601 tgtaccccca aatcctgcaa cagaggtttc cctccatctc acctccctgt ccctgcattt
661 ctcctatctc agtagctcct ctttccctct ctgggcttct ctttccactc cctccccttc
721 ctgggcttgg taaactagtc cctaatctct tcacacccca gattggaagg tgggtccctc
781 cctgacactc cccagagctg tcaccaacct cctccaagtt tctatagctc cattgctcaa
841 cagatttgcc aggggtaacc attaacccag cccttaactc tgttccccca cctttcttgc
901 tggaggggat tttccaatta ctggttagca cagctaggtc atctcacccc caccatcttt
961 cctaacttct tgggttgggg ggctggggag gaatctcccc atctcagggt actaggaaca
1021 aagctgggga ggatggtgca tttaaaggga ttatatatat atatatatat attttttttc
1081 tttctccctc ataaccccac ccccgcaaca cacacacaca cacacacaca cacacacaca
1141 cacacagacg cacaaataag ctttatggag cagtgacttc attatgttca ccgctttgag
1201 tccaacccct ggcccaaaat aggcactaaa tagttgccga atgcatgaat gatagatacc
1261 tctctgtctt caggggtgtg tagaagtgcg aaggggtatg ggcatgtccc agtaggggtg
1321 tgagtgttct gatcagaact acttctctct gccagaattt gatgtaattc gaatgcttcc
1381 acctctgctt gaagggttta aataataaat taggccctgt cgtgccatta tgggggtggt
1441 cataccctgt acccaggaaa caggcacggt agggctgaga cagaagtcct gcttgtttcc
1501 gcttatttat ttgaaacacc gctcatttag gtcttacttt gtttgccagg cactgttcta
1561 agctctgtat aaatattaac tcagagggta caaatattaa cttaagagtt gttgcaggaa
1621 aaaaaataag cgcctctggc tctttaagtt tggcctcccc ctcaaaaccc ccgcaacggt
1681 cccaaacccc ttccagggac tgggactacg gaccctggtc cgaccttctc gcgggcttcc
1741 cactgcgcca atcaaatccc agaaacagtg agtgctagag gcccggctgc taagcaacgg
1801 cagagggcgg gaagtttgaa cgttctggac ccgccccgaa ggcaaatagg ccaatcagcg
1861 tccagactct tcagctacgg cagtccgctt ctcctcctcg ccctgtcgga tctctaggct
1921 ggatccgggc ctctccaatc aacagcggct aggagggcgg ggcgcgtgcg cgcgcacctc
1981 gctcacgcgc cggcgcgctc cttttgcagg ctcgtggcgg tcggtcagcg gggcgttctc
2041 ccacctgtag cgactcaggt tactgaaaag gcgggaaaac gctgcgatgg cggcagctgg
2101 gggaggagga agataagcgc gtgaggctgg ggtcctggcg cgtggttggc agaggcagag
2161 acataagacg tgcacgactc gccccacagg gccctcagac cccttccttc caaagggtaa
2221 cctccgcgtg acaggaatga gggtggggcg cgtggagttt cccacaatct gtactttagt
2281 taaatacccg agaattcacc tcctgtgtcc acagctctcc acgcccctca gccctgcccc
2341
2401
2461
2521
2581
2641
2701
2761
2821
2881
2941
3001
3061
3121
3181
3241
3301
3361
3421
3481
3541
3601
3661
3721
3781
3841
3901
3961
4021
4081
4141
4201
4261
4321
4381
4441
4501
4561
4621
4681
4741
4801
4861
4921
4981
5041
5101
5161
5221
5281
5341
5401
5461
5521
5581
5641
5701
5761
5821
gcagccctgt
cctctgtgaa
tcatggcctc
cctcctccgg
aagtcgagga
agttgatggg
aatcttagca
agcctgggca
gtggcacgtg
cgggaggcgg
agcaagactc
aagtctctta
ctcagagatt
tgattctaat
aattcaggat
ccagatgccc
aattcctctg
cacacacaca
gctcaagtgc
ttctcctgac
agttttttgt
aaactcctga
gagtcaccac
tgtgttgctg
tatttggctt
agcctcagac
cagagaagaa
aactaataga
tctgttcatg
acactatcac
aactatagca
ttatgaattc
tgttcttttt
cttggctcac
agtagctggg
ggcagggttt
tgccttggcc
atatatgacg
tgtctgttct
caccatccca
tgcagagcac
ggagggctat
ttctccttcc
gatgagaata
aaaatgggga
tgtcagacag
acagcctccc
gattttggta
cccccaactc
tgtgctcctc
taacctgcag
tgaggagtgg
aagcctttta
gcacttctat
attaaggcat
ccagacagat
tatctttata
cttgctctag
gcttctattg
agcagaagta
tcgttgcttc
cttaggagcg
cttccccagc
ggaggaggag
aagttagaat
ctttgggaga
acatggcgaa
cctgttatcc
aggttgcaat
cgtctcaaag
cctcctgagt
tgagaaaatg
agtgatttcc
acttgggcat
cagaccacga
ctctctggga
cacacacaca
agtggcgcaa
tcaacctccc
gtgtgttttt
ccttgtgatc
gcccagccat
taaggaaata
atggttctgg
tgcttcaact
gcaaggggga
gtgagaacct
agcgatccac
actggggatt
ctgactcaat
atttcattat
tgagacagtg
tgcaacctcc
attacaagaa
caccacgttg
tcccaaaggg
tatttacaat
ccaggggttt
ccactcttaa
tccctactcc
gggttttctc
aagttctgga
tgactcgatt
aggactaata
aggtagaagg
aggagcacag
tctccttcct
taccttcatc
atgccctatg
cttattggga
tgacactttt
tcaccaaacc
aactttccta
tagtgttact
gtctactttc
ttatatgtag
ctttccatta
tctcattctt
cttagtgctt
cgaaccgccc
aacccaagga
ccggccccag
ctggccgagg
aaaagagggt
ctgaggcggg
accccgtctc
cagctactgg
gagctgagat
aaagaaagaa
ggctgtttca
attaaattat
tttcttcctt
tgcctactat
gagcctcaag
ttgcagatgt
tatttttttt
tcttggctca
gagtagctgg
agcacagacg
cgcccacctt
gttttactta
cctgactctg
atggttggaa
catggaagaa
gggagatgcc
ctcactcata
tcccatcacc
aaatttcaac
atattttaca
ttaacaaata
tcttgctccg
gtctcccgga
tctgccatca
gctaggcttg
cagggattat
gtttcaggtg
acagcctagt
ctacttttct
tagggaggaa
ttagtcaaag
tgagatcaat
tctgggaaag
tatggaatat
actgagatgt
agagcctaaa
tttgctttgc
atcacagatc
acctgtcccc
agcctctgct
tggacagggt
aaaaggcact
ggtttacaat
agttctatta
ctcagccatt
aataaaaaga
actgcctgtg
aaaatggggt
tgcattctgc
tcactttttg
ggacaccgca
tgccgggccc
tctctgaggg
tgggagccgg
cggatcacct
tactaaaaat
gaaggctgag
tgctccactg
aaaaaaaaac
cattcactaa
ataagacatg
gctggacaga
gatactagtg
cttctccaga
gttacacaca
ttctagacag
ctgcagcctc
gactacaggc
gtgtttcacc
ggcctcctaa
cattaactca
agtaatttgt
agctcaaaat
gggaaggcag
aggctctttt
cccaccaaca
cacacacctc
acgatatttg
gttgcttcac
tttgtgaggt
tcactcaggc
ttcaagcaat
cgcctggcta
tcttgagctc
aggcatgagc
cttcagattc
gacaacatcc
aaatctcaac
atgtttttga
acaaagatcc
ccccagtctg
cttggtaagg
tccagggggc
aaagaatgat
agacctgaaa
ctaactccct
tcccctctgc
ccaagatctc
taagtcatgt
tttattgttg
gcctcagtga
aagaacagga
ataccattat
tatctttctc
gaattagact
tgagcttggg
gaaaaaattg
gcgccaccct
catccgcaga
gggaccgaga
cagggaggcc
gagtagaaac
gcgcggtggc
gagctcagga
ataaaaatta
gcaggagaat
cacttcagcc
agggttggga
atgggggtga
gtaaacccta
tccatctgtg
actccactat
gaggtgggga
cacacacaca
agtcttgctc
cacctcctgg
gtgtgccacc
atgttggcca
agtgctggga
cctcactgtc
taaaaaaaaa
tgggcatctt
ggtgtgtaga
tgacaaccag
cactccagga
ctgctaggcc
gcagggacaa
agaggctccc
tgttttttgg
tggaagtgta
tctcctgcct
atttttatat
ctggcctcca
cactgtgcct
agccctgggc
agaacatccc
ttctacctgt
gaaggagagg
tttaactcat
ttgttacgag
acttggtaaa
tagaattggg
agccttttct
tcatattttt
gttccggtgt
cttatgtcat
tcctgctccc
ctagggatga
gaattctccc
cccttattat
gtgtactatc
tttgaccaaa
aggctgtgct
aagagtctga
caagtcaaat
agctacaaga
accccggcct
gcctccaagc
cctggggcgg
gaggaggagg
ttgaatggag
tcacacctgt
gttggagacc
gccgagcgtg
cactgtaact
tgggcgacag
agagctgggc
tgatgcctat
cacttatgag
tgtgctgtgg
ccacttcatg
tggaaccatg
cacacacaca
tgttacccag
gttcaagcaa
acacccagct
gggtggtctc
ctacaggtgt
tagcatattt
aaaaaagttt
cactggtgag
ggtcacatgg
ctctctcagg
agggcattaa
ctacctcaca
atcacatcca
tcttttgttt
tttgtttggt
gtggtgccat
cagtctcccg
ttttagtaga
gtgatctgcc
ggccacaaat
aaatcagtca
acttccctct
gttcccactg
ggtaggaaga
ttgatctctg
tgccaaacag
ggatagaggg
tgagagggag
ttcctccccc
gccaagtgtg
cccattcttt
cctaaacctt
taccctttaa
gggcctcccc
cattaagtta
gatccataag
ctaattagat
atcctcaatt
ttcagacaag
aaatttggtt
aatctctctt
ccgttccctt
5881
5941
6001
6061
6121
6181
6241
6301
6361
6421
6481
6541
6601
6661
6721
6781
6841
6901
6961
7021
7081
7141
7201
7261
7321
7381
7441
7501
7561
7621
7681
7741
7801
7861
7921
7981
8041
8101
8161
8221
8281
8341
8401
8461
8521
8581
8641
8701
8761
8821
8881
8941
9001
9061
9121
9181
9241
9301
9361
tgcttgcctc
ctccttcatc
tccctttgac
acaagtgcta
cctggtgctc
tcccaggttc
gaactggaag
gtgattcacc
ggggtggggt
aatctaaggg
aaggaagtgg
aggactcatc
ctgtctctgg
gtttccctaa
tgagggaaag
tgaggcaaag
tggctcatgc
aggagtttga
gctgggtgtg
aatccaggag
acagagcgag
atatcccctg
gttaccccaa
tcacccctca
gtgtgcccca
gatgtgtccc
tccaaatttc
tgtggctgta
tgggtttgta
gtatagagaa
agttgtctct
atctcagtaa
gtcgcccagg
ttcacaccat
tgcctggcta
tctcgatctc
gcgtgagcca
gtgtgtgtgt
tcggctcaat
gtaactggga
acggggtttc
gcctcagcct
tttttacata
aatcttgctg
cgcctcttgg
aactacaggc
tttcgccatg
gcctcccaaa
ttgtattttt
ttgtttcgtt
gcgatctccg
cgagtagctg
agagactgcg
ctgcctcggc
ttgagtttta
atcccttccc
tttggctgcc
atgcctcagc
actccctgcc
cctcaaatag
ccagacgcca
tgcctcctca
gggctgaatt
tgcagccccc
gagcacttgg
actataatgt
ccaaccccaa
gtggatgtgg
ctaatgagac
agttgtggaa
tggtgaacat
ggagggagaa
tcctggggct
agaggaagaa
aaaagtaagg
ctgtaatccc
gacctgacca
ttgtgcgcct
gcagaggttg
actccatctc
tccccattcc
actcctccat
gtgtacaaag
tccctcatct
agcctccctt
ttacctattt
cagtgttgtt
tgaatgtttc
gagttttgta
ggctttaggg
gttgtgtcca
ctggagtgca
tctgctgcct
atttttgtat
ctgacctcgt
ccgcgcctgg
gagacgaagt
gcaacctctg
ttacaggcac
accatgttgc
cccaaggtgc
tgtgtgttgg
tgtcacccag
gttcaagtga
ctgcaccacc
ttgcccaggc
gtgctgggat
ggaggggcgg
ttgttttgag
ctcactgcaa
ggactacagg
tttcaccatg
ctcccagagt
actaggtctg
ccatctccat
tttgtgccct
ttagacacat
ttatccctca
gtctggagat
tgactgccac
cagtgagatt
ctgggaggta
aggagattta
agggctgctg
cagcgtcccc
ccaaagtaat
ctgtgaccca
tttgggaaga
cgaactcaga
agatcaagac
ggattaagtt
attaagatct
aagcataaag
gacaaacctt
cgcgctttgg
atatggtaaa
gtaatcccag
cagtgagctg
aaaaaaaaaa
atttatcagt
ttctcctcga
tggccagtgg
cacattacaa
cccacttcac
gtaccccccg
gcatatcagc
tactagttgg
actgcataac
tatattaggg
ggtttttttt
gtggtgcaat
cagcctcctg
ttttagtaga
gatccgtcca
ccagttgtgt
ctcgctcttg
cctcctgggt
ctgccaccac
ccaggctggt
tgggattaca
gctcttgagt
gctggagtgc
ttcttctgcc
atgcccagct
tggtctcaaa
tataggcatg
gagggctctt
acggagtctt
gctctgcctc
tgcctgccac
ttagccagga
gctgggatta
ctttgtgtat
gtacggtaat
cccaggccag
aaacacattc
caggaatcct
aagcaaacaa
tgagaaaatc
ggtcctgggg
ctggcctagc
agatttaccc
aagttcctgg
atcctgggct
gtgggattgg
gtgggtcaag
agactgggac
ctgcttcctg
acttacaggt
taatgcccca
ctctccttga
atactagctt
acatcaagat
gaggccaagg
accccgtctc
ccactcagga
aaattgcacc
aaaaaaaaaa
cctcaattct
cagtgttcta
actgaaggag
agacctacca
tcccattgtc
ccccccaagc
cattacttta
gtacctgtta
tgcctatttg
acatctccgc
tttttttttt
atcagctcac
agtagctagg
gaacgggttt
cctcggcctc
ccagttttgt
tcccccaggc
tcaagcgatt
gcccagctaa
cttgaactcc
ggcatgagcg
tttttgtttg
agtggtacaa
tcatcctacc
aatttttttg
ctcctgagct
agccaccctg
gagttttttg
gctctgtcac
ccgggttcat
catgcccggc
tggtctcgat
caggcgtgag
ttttctggct
cccagctcat
cttcctcaac
cattccctgt
caacagatgc
cgcctccttt
ctcttcctct
gataagggct
cctggaaaat
cgattccact
gtcgaagaag
ttaagaaatt
gaggcctgaa
tgctctagga
aatattcaga
cttttttgtt
aaagaggtgg
ataatcctaa
aggaaaggga
tcttttctat
atgatctcgg
cgggtggatc
tactaaaaat
ggctgaggca
actgcactcc
aaagacgtga
tattcccttc
cagattttta
gggctcagcc
gaaaagcaat
agatatctct
ttgagcatct
ccaattctgt
gggactttgg
atttgtatag
aaatatccat
ttgatacaga
tgcaagctct
actacaggtg
cactgtgtta
ccaaagtgct
gtgtgtgtgt
tggagtgcaa
ctcctgcctc
tttttgtatt
tgacctcagg
accgcgcccg
tttgtttgtt
tttaggctca
tcagcctcct
tatttttagt
caagtgatcc
cccggccagc
tgttttgttt
ccaggctgga
gccattctgc
taattttttg
ctcctgacca
ccaccgtgcc
aagtgtccct
atttgtggcc
aaccagcacc
ccctgccttg
cactgtaagt
ctggaaacta
cttccattat
gggaggcggc
agtaactttc
gctgatcccc
aatcggggtt
tatgttgtag
aagtaaagtg
cacccgggag
gagggggaca
ttctgtcctc
aggcatgctg
tgaggctcta
aggggggttt
agggagaaac
ctgggcgcgg
gcctgaggtc
ataaaaatta
ggattgcttg
agcctgggcg
tctcaggagg
aaaagtccaa
agagtgagtc
tctttggtag
tggctccaaa
ttcatgccaa
tcccatactt
gttccttccc
gagaccttgt
agtctttatc
atagtttcat
gtctcgctct
gcctcctggg
cccaccacga
gccaggatga
gggattacag
gtgtgtgtgt
tggtgcgatc
agcctcctga
tttagtagag
ggatccactc
gccgtccagt
ttttagatgg
ctgcaacctc
gaatagctgg
agagatggtg
tcctgccttg
tattgagttt
gtttgtttat
gtgcagtggc
ttcagcctcc
tatttttagt
cgtgatccgt
tggccagttc
gtgagtgtcc
aggcaccagc
tctgacctgg
taacaagttc
ggggagagaa
9421
9481
9541
9601
9661
9721
9781
9841
9901
9961
10021
10081
10141
10201
10261
10321
10381
10441
10501
10561
10621
10681
10741
10801
10861
10921
10981
11041
11101
11161
11221
11281
11341
11401
11461
11521
11581
11641
11701
11761
11821
11881
11941
12001
12061
12121
12181
12241
12301
12361
12421
12481
12541
12601
12661
12721
12781
12841
12901
gctgctcagg
actacatact
gtagtagtaa
agctttgtgt
cccattgatg
ggccgggcac
tcacttgagg
aaatgcaaaa
taaggcagga
ctgcacttca
tgtaacacat
ctgtgatgac
atatcaccac
ccagatctgt
cttctcaagc
gagatctaag
tgttgcccag
gttcaagcaa
acacctggct
gtctcgaact
gagtcactgc
ataaacctta
aatgatgttc
gccccttcta
gccattatgg
tttttttttt
ttgctcactg
tagctgggat
tgggatttca
tcggcctccc
tttccttgtt
gtagtctcag
ccagcctggc
tcatggcgca
ctcgggaggt
agagtgagat
tgttaccctt
acatagtgct
agtgaatctg
gtgtctgaca
aacgcactgg
atccttcccc
atctctgagg
cccacacatt
gagaattctc
aatggaatca
actaaggaca
ggaaatacct
cacgcctata
tcaagaacta
agggcactgt
tatgactttg
cagtttggtt
gtgtttagtg
actgttctgg
agccatctgg
acagactgca
tataagatgg
cctacagagt
tgagtgggtc
aaagcctact
agcaactgcc
agtagctgat
atacacagac
agtggctcac
tcaggagttt
attagccaag
gaatcgcttg
gcctgggcaa
ggctatgtta
tctcaaattt
ctgcagtttt
tcctcttcct
catgaacctt
catgtcactc
gctggagtac
ttcttctgcc
aatttttgta
cctggtgatc
gcctggccgt
ggaggttttc
agggtccatc
agtcctttcc
aaacatgtca
gagacggagt
caacctccac
tacagggacc
ccatgttggc
aaagtgctgg
ctgtacctat
cattttggga
caacttggtg
tgcctctagt
agaggttgaa
tctgtcttaa
tggcttttac
tgtttatgtt
cttgcctcca
catggtcttg
tgacatcatc
tacctcacct
tcatctccag
cttcagtgct
agtggagaag
gacagggcat
ggtgacatat
ctaacgtaca
atcccagcat
aaagctgtac
cctaggagtc
ggcaagttgc
taactaaagt
cagcgcatgt
tgggagccct
atgctccgct
atgtaagata
catccttagt
ctcacactct
ccacacatac
aatggcagta
ctttactgag
atgcatctca
ttgcacacat
acctgtaatc
gagaccagct
cgtgttggta
aacccaggag
cagagtgaga
gcatggttat
gtgtctctag
cacaggcagc
gcattccctt
ggattgatgc
cttttttttt
aatggcacga
tcagcctccc
ttttcagtag
cgcccacctc
cactccactt
ccattacctt
acgtttaccc
tactggatct
tgcctgtgtt
ctccctctgt
ctcccaggtt
caccaccatg
cacgctggtc
gattacaggc
caaaatctta
ggctgaggtg
aaacttcacg
accagctact
gtgagcccag
aaaaaaaaaa
aggtacctgt
tttctagttg
tagctatgaa
tacatagtag
tctaaacaga
cccgctgcaa
gtaatcagca
tcaggtgcag
agagctggat
gagatcatat
ttattaatat
aaaagatgta
ttgggaggct
aggcaccaag
tggagacatg
ttatctttta
ctctcccagt
gctttctaaa
cagtgaccaa
tcattcaaat
caaatcttca
acttgttcta
catggccaat
tacacactaa
tacagattct
cactggctaa
ttttttgttg
acaagcagca
ccagcacttt
ggccaacatg
ggtgcctgta
gtggaggttg
ctccgtctca
ctttagttat
ttttgaactc
tcaaactcag
tgctggttaa
tagaaacaaa
tttttttttt
tctctgctca
aagtagctgg
aggcggggtt
ggcctcccaa
tttaaatagc
caggattaag
cagttttaat
accttatatt
tatgctgctc
cacccaggct
caagcaattc
cctggctaat
ttgaactgct
ataagccact
catccaggtc
ggtggatgat
tctaccgaaa
caggaggctg
attgccccac
aagtgcatcc
aaggagttga
cactgtcaca
ctctatctag
cactcaatgt
gtggaaacct
tgtgtatctt
tctccaggaa
atctttaatc
ttaaggtcgg
aacttgaaga
ttctgtatac
aataataata
gaggcaggag
aatatgactg
attttgagac
acggtttcat
acgagcaggg
gtggggatgg
ggagcagagg
ctggggtgtt
ggacctagtg
tgtagaaaag
aagtatacag
tgcatgaatt
cacatacacc
ctgcatttca
tcagcgcagg
ggaaaaaaca
ggggggccaa
gtaaagcccc
attccagcta
cagtgagcca
aaaaaaaaaa
agaaaacaca
cgtatgtgaa
agcatccaaa
tggcattgct
aaacctgtca
ttgtgactga
ctgcaacctc
gattacaggc
tcaccatgtt
agtgctggga
ctaagtagaa
attagcatct
ttccagactc
ctagtcattt
cttctggaaa
ggagtgcagt
tcctgcctca
ttttgtattt
gacctcgtga
gtgcccggcc
aggcgcggtg
ttgaggtcag
atacaaaaat
aggcaggaga
tgcactccag
tcttcaaggt
tgtgcacctt
tcatgttaga
tagctatacc
gtgaacacaa
ttgctagcct
gtaggagtta
tcggcagggt
tcagccacag
ggaggaatgc
atcatcatat
gaatttattt
tagggccaga
gattgcttga
aatgtcacag
ttagctgttc
tttctcagtt
cgtgagtcag
ctatttacag
tacctgaaac
ctaacccaaa
tgtgcacatg
aatttgtggg
ggatacccgg
ccatatgcac
accccaccta
tccttataac
tacacatata
caaaatgtaa
ggtgggtgaa
atctctacta
ctcaggagac
agattgtgca
aaaatgctaa
cttcacattt
tgttaattgc
ctgatgccca
ggcagtacac
ttccaaaaca
gtttcgctct
cacttcccgg
gcccaccacc
ggtcaggctg
ttacaggcat
agaaaataac
taagcagtat
accttccaaa
agggccactt
gtcttttttt
ggcgcggtct
gcctccggag
ttagtagaga
tctgcccacc
tggaaagtct
gctcacgcct
gagttcaaga
tagcccagca
attgcttgaa
cctgggcaac
gcaatccaac
ctttgtgctc
attagcagtc
tgttacctca
cgcaaatgta
caggtgcaca
atttaggata
aatttaatta
atgggaggga
gtattcccca
aatctaatga
taatttatta
tgtggtggct
ggccaggagt
tatgctctaa
tcattggcgg
gttaaattta
agatacctaa
actggcctac
ccacccttga
gtaactggcc
ttggctctta
ctcacaagtc
aattagacaa
12961
13021
13081
13141
13201
13261
13321
13381
13441
13501
13561
13621
13681
13741
13801
13861
13921
13981
14041
14101
14161
14221
14281
14341
14401
14461
14521
14581
14641
14701
14761
14821
14881
14941
15001
15061
15121
15181
15241
15301
15361
15421
15481
15541
15601
15661
15721
15781
15841
15901
15961
16021
16081
16141
16201
16261
16321
16381
16441
acacagatga
agacagagtc
cacctcccgg
cgccaacacg
gccaggatgg
gggattacag
cctccaacac
ccagcacttt
tggccaacat
tgcacccctg
aggcggaggt
agactctgtc
gtggcttatg
caggagcttg
tagtaataca
tgaggcaggc
ccctatctct
agctgctggg
acccgagatc
taataataat
ggcgcggtgg
aggtcaggag
caaaaattag
tgcagtaagc
taaaaaataa
tgtaatccca
attctggcca
gtggcgcata
cgggaggtgg
gcgagactct
ccttaaatgt
agcttatcat
tctgttgctg
tcaccgtctc
cactgctctc
ggtcttatcc
agtagataca
tagcacatcg
aactgagatc
gctatggttc
tcagtttttt
tcacatcaag
gagggctgat
gcggctgtaa
aggtcagcga
caggtaaagg
ttattctacc
tgccatgggg
tcagcctccc
tttttagtag
ggtgatccac
cggcctccct
aatataatta
ttttttattt
ttgcagtggt
tacctcagcc
tgtattttta
ctcaagtgat
tgcccagcct
gacatttatt
tcactctgtc
gtttacacca
cccggctaat
tcttgatctc
gcgtgagcca
acaaaaagat
gggaggccga
ggtgaaacct
taatcccagc
tgcagtgagc
tcagaaaaaa
cttgtaatcc
agaacagcct
aatggccagg
agatcacctg
actaaaaata
gaggctgagg
ccgccactgt
aatacaaata
ctcatgcctg
ttcaagacca
ctgagcacag
caagattgtg
aataaaaaaa
gcactttggg
acatagtgaa
cctgcagtcc
aggttgcagt
gtctcaaaaa
ccctccccaa
ccctaatggg
ttacccagag
ttactcagcc
cttctaaaat
gttgaaactg
tgtaaacaca
tggatacata
aagatgatag
acacgtccga
ctgctgcccc
aacgtgcctg
actgggcagt
ccttgtctga
ctggcaggtt
ccctcagcct
ctcttttttt
cgatcttggc
gagaagttgg
agacaggatt
ctgcctcagc
cttttttttt
acccccatgt
ctttttcttt
gcgatctcgg
tcccaaatag
gtagagatgg
ctgcccatct
agtatgtgtc
tctgtatatg
acccatcctg
ttctgcctca
tttgttgtgt
ctgacctcgt
ccgcacctgg
gtaaataatt
ggcaggtgga
catctctact
tactccagag
tgagatcgca
aaaaaaagat
cagcactttg
gggcaacata
cgcagtagct
aggtcacgat
caaaatttag
caggagaatc
accctagcct
tctatatatc
taatcccagc
gcctggccaa
tggcgggtgc
ctactgcact
aagaataaga
agccgaggcg
accctgactc
cagctactcg
gagccgagat
aaaaaaaaaa
tccttttttc
acagttatgt
acactttcac
tcagagtgag
attgacaagc
tgatatgtag
cctgaatggg
ttgccacaat
atgtaacttg
ctcatgacct
agaatctgga
tgagcccagg
gggcttcttg
ctgtagctga
ctctacaagg
gtattccaga
tttttttgga
tcactgcaac
gattacaagc
tcaccatgtt
ctcccaaggt
ttaactttgt
acccaccatg
ttttttttag
ctcactgcaa
ctgggattac
gatttcacca
cggcctccca
atctatatct
aatttatttt
gagtgcagtg
ccctcccgag
ttttagtaga
gacccgccct
cctgaattta
agccgggcgt
acaccagagg
aaaaatacaa
gctgagacac
ccactgcact
gtaaataata
ggaggccatg
gcaagacctc
catgcctgta
tttgagacca
ctgggcatgg
acttaagcct
gggcgacaga
catcagccaa
actttgggag
gatggtgaaa
ctgtaattcc
ctagcctggg
cactattggc
ggcagatcac
tactaaaaat
ggaggctgag
cgcatcactg
acgaaagaat
tctgcagtgt
tttcacagga
agctaaaaag
ctgcagtgtt
tccgttactt
acacaattat
taggacactg
ccccagggac
tagtaccccc
gggggagctc
catggctcag
gtggagggca
aggggcatta
ttctgaaacg
taaggccttc
ctgtctgtac
gacagcctcg
ctccgcctcc
gcccgccacc
ggccaggctg
gctgggatta
attcaggaaa
cagtttcaac
acagagtctt
cctctgcctc
aggtgcccac
tgttggccag
aagtgctagg
ttttctactt
atttatttat
gcctggctca
tagctgggac
gacggggttt
ccttggcctg
ttttcattta
ggtggctcat
tccggagttt
aaattagccg
gagaatcgct
ccagcctgga
atactatcgg
gcaggaggac
gtctctataa
attccagcac
gcctggccaa
tggcacacac
gtgaggcaga
gcaagactcc
tttaagaata
gccgaggcgg
ccccgtctct
agaacctggg
cgacagagca
cgggtatggt
gaggtcaaga
acaacaatta
gcacgaaaat
cactccagcc
aagacattgt
tgaccattat
agaatatgaa
acatacaaac
ggcacacaaa
atatacatgg
gctcacatct
cacttgccac
tgcaagcaca
acccaaaccc
agttctcgtc
atgctgcatc
gggaggtggg
gagtgaggga
catgaagttg
cttcttgaat
cctagacatg
ctctgtcgcc
tgggttcaag
atgcctggca
gtcttgaact
caggtgtgaa
atgtaaaaaa
actttaactt
gctctgtcgc
ccaagttcaa
caccacactg
gctggtctca
attataggtg
tcccctcttg
ttattttttg
ttgcaagctc
tataggtgcc
caccgcgtta
ccaaagtgct
ttaggaaata
gactgtaatc
gagaccaggc
ggagtggtgg
tgaacctggg
caacagagca
gccaggtgca
tgcttgaggc
aaactattaa
tttaggaggc
cacagcgaaa
ctgtagttcc
ggttgcagtg
atctcaaaaa
agacatgccg
gtggatcaca
actaaaaata
aggtggaggt
agactctata
gactcacgcc
gatcgagatc
gctgggcatg
cacttgaacc
tggcgacaaa
tgttgaagcc
tatgaattaa
aagatgaatg
tcatactgac
tacctcaaca
aatgacacac
agcaattttc
tacattccca
ctttttggca
tcacttccag
tggacgtcat
ggctcctggg
gaaggaggtt
agagaaaaca
tcccacacca
cccaaaagtc
ctgtccaatt
caggctgaag
caattctgcc
aatttttgta
cctgacttca
ccaccaggcc
tatttagaat
acgccaattt
ccaggctgga
gtgattctcc
gagtgatttt
aattcctggc
ggagccaccg
gattattttg
16501
16561
16621
16681
16741
16801
16861
16921
16981
17041
17101
17161
17221
17281
17341
17401
17461
17521
17581
17641
17701
17761
17821
17881
17941
18001
18061
18121
18181
18241
18301
18361
18421
18481
18541
18601
18661
18721
18781
18841
18901
18961
19021
19081
19141
19201
19261
19321
19381
19441
19501
19561
19621
19681
19741
19801
19861
19921
19981
tgggttttgt
gagcagagtg
cccaccttag
tggattattt
aatttctttt
gagtgcagtg
cctgcctcag
tttttgtatt
tgacctcgtg
ctcacccggc
cagtattgtt
ttcttttttg
agctcactgc
agcagggacc
tatttattta
agcccaggct
caagtgattc
cccagctaat
cgaactcccg
cgtgagccat
gctcaaggtg
gctgagacta
tctaccggtt
ctcctggcaa
ctggattttg
ttctgtaaat
atactgtttt
gatgttggca
tatttagttt
gctcactgca
gctgggactt
gggatttcac
ctcagcctcc
ttctttagca
cagaaatttc
atcatataga
agtcttgctc
acctcccggg
taggcgtgtg
atgatggcca
agtgctggga
aacgcttctt
aaatttcagt
tatggtaaat
tttgtttgtt
agtggcgcga
tcagcctcct
acttttagta
gtgatccacc
ggccctgata
ggccttttat
tttgcacaac
ctgtcttcct
tttgttttgt
tctggctata
gctggaatta
agggtttcac
ctcagcctcc
tcttaattgt
tgtcgtttgt
gtgcagtcat
cctccagagt
tgaagcaagt
ttcttttttt
gcatgatctc
cctcccatgt
tttagtagag
atccacccgc
caatatttca
atcttacttt
agacagggtc
agcctcaaac
acaggtgatg
tttatttatt
ggagtgcaat
tcttgcctca
tttgtatttt
acctcagatg
tgcacctgac
gtctcaaact
ttggtgtgag
ccttctctat
tttgttgaag
ctgtttgcat
taatatttag
agaagtagag
aatgttctga
gaggtgagtc
acctccgcct
acaggcgcac
catgttggcc
caaagtgctg
attacaaagt
tgtaaagaga
aatacagaaa
tgtcaccagg
ttgggtttca
ccaccatgcc
ggatggtctt
ttacaggtgt
gaatttgtgt
atatgtgctg
tagttcccta
tgtttgtttg
tctcggctca
gagtgcctgg
gagacagggt
cgcctcagcc
gccgatgagg
agctatattt
cctctatcaa
tcctagactg
tttgagacag
acctccacct
catgtgcatg
catgttggtg
caaagtgctg
ctttgtagtt
ttgttttttt
atttcattgc
aactgggact
cccagccatt
tttttttttt
ggctcactgc
agctgggata
acagggtttc
ctcggcctcc
gggtaatttc
aaaaattgta
tcactctgtc
tgttgggctc
gccatcacac
tatttattta
ggtgtgatct
gcctcccaag
tagtagagac
atctgcctgc
caatttttta
cctgagctca
ccacgatgcc
cttttctctc
aaactagatt
ttcctagatg
ttctagaagc
gaacataatg
tgtttaatac
tccctctgtc
cctgagttca
accaccacgc
aggatggtct
ggattacagg
agtagcattt
aacttccttt
attgcttgat
ctggagtgca
agtgattctc
tggctaattt
gatctcttga
gagccaccat
catccgtgcc
cagaaatgag
acattctcca
tttgagatgg
ctgcaagctc
gactacaggc
ttcaccctct
tcccaaagtg
tttttttgtc
ctctttctcc
agcacctacc
tgagcacatc
agtctcgctc
cccgggttca
ccaccaagcc
aggctgatct
ggattacagg
tcagtgtttg
aaataaggtc
agcctctaac
ataagcctga
atatcatttc
gagatggagt
aacctccgcc
acaggcacat
accatgttgg
cagagtgctg
taaaagaaaa
ttatttggta
acccaggctt
aagcgatcct
cgaactaagt
tttatttttg
cggctcaccg
tagctgggat
agggtttctc
cccggcctcc
ttttttgtag
agtgatcctc
cagtcagatg
ttgttctttc
attatttgtc
tttttggcac
atgattaggt
tctgatatgt
ctaaatctat
gccaggctga
agtgattctc
ccagctaatt
tgatctcttg
cgtgaaccac
aaatctctga
tatgtactat
atttccccca
gtggcacaat
ctgcctcagc
ttgtattttt
cctcgtgatc
gcccagccct
cagtggccgt
caccccccac
agatagccat
agtcttgcac
tgcctcccgg
gcccgccacc
taaccaggat
ctgggattac
attgttcttc
cgactctgta
acctcacttt
tgggacaggg
tgtcggccag
agagattctc
cagttaattt
cgaactcctg
catgagccac
gtacagtgcc
ctgctttgtc
ttctgggctc
gatgctgcac
atccataaat
ctcactctgt
tcccaggttc
gccaccatgc
ccaggatggt
ggattacagg
ttatttttta
tcatcaaata
gagtgcaatg
cccccctcag
ttttattttt
agacgggatt
caacctccac
tacaggtgcg
cctgttggtc
caaagtgctg
agacaggatc
ctgcttgggc
atggccctag
tttctttttc
ttatagtgtt
atttctctct
tcagagtttt
ccgattgtct
tattcattta
agtgcagtgg
ctgcctcagc
tttgtatttt
acctcaggtg
tacacccagc
ttctttcttc
ttggttgcca
ctcttttttt
cttggctcac
ctcccgagta
agtagagaca
cacccgcctc
tttttttttc
gctaatccct
ctttatttac
gagatttttt
tgttgcccag
gttcgcgcca
acgcctggct
ggtctcaatc
aggtgtgagc
ttgtatcatt
caaactcctt
taaatcttct
accatatctt
gctggagtgc
ctgcctcagc
tttgtatttt
acctcaggtg
tgtgcctggc
tctcactgtt
acccatacta
aagcaatcct
ctggctttct
atttcagtgt
caccaggctg
aagcgattct
ccaagttttt
cttgatctcc
cgtgagccac
aaaagaataa
tctgaaattt
gcacaattgt
cctcctgagt
tgcttgcatt
ttgctcttgt
ctcctgggtt
tgccaccacg
gggctggtct
ggattacagg
tcactatgtt
ctcccaaagt
tcctttttaa
tctttttctt
ttccattagc
cttctatagt
ttttttttca
ctctttttct
tttatttatt
cacgatcttg
ctcctgagta
tagtagagac
atccacccgc
catctattaa
atttattagc
agtgatagaa
tttgagacag
tgcaacctcc
gctgggacta
gggtttcacc
ggcctcccaa
cccaatatgg
gtaaccttcg
tagctatcaa
ttgttttttg
gcaggagtgc
ttctcctacc
aattttttgt
tcctgacctc
caccgcgccc
acagactcat
tgttttagag
gcatgtattt
tttttgttta
aatggcgtga
ctcccaagta
gagtagagac
atctacccac
caggaccata
tctttttgcc
20041
20101
20161
20221
20281
20341
20401
20461
20521
20581
20641
20701
20761
20821
20881
20941
21001
21061
21121
21181
21241
21301
21361
21421
21481
21541
21601
21661
21721
21781
21841
21901
21961
22021
22081
22141
22201
22261
22321
22381
22441
22501
22561
22621
22681
22741
22801
22861
22921
22981
23041
23101
23161
23221
23281
23341
23401
23461
23521
tttgagatct
tgtccctatc
tcacagactg
atccagctct
ctcattggga
gagaatgcag
tgggatggtg
gaggactgca
gtcctaagaa
gcttcacagt
tgggcctgtg
ggggatcttc
ctcgtagaaa
gagctggaga
agggcaggag
agacatactg
tggggataca
agcaggagag
cccgcctgcc
taagaccctc
gcgttactct
ctttaccagc
gaaagatggg
acccagtttc
gcattgctgg
attgtaagat
atgcttccaa
ttcctctgaa
gggcctctgc
ctcccagacc
gctgtcttaa
agtgctgccc
cgaatccaga
aggctacatc
gcccactgca
gaatgtggtg
agcatatacc
atctactcca
gaccccgctc
ttattctcca
tgcctctgag
cactgattct
agtgtgtctg
catgccttga
tccctattca
aggaggccga
tctcccttgg
ggtgggattg
gcatgccctc
gaagaatcaa
agaggatgaa
tgtatacagc
ctttcccttg
agtcgctggt
atgaggggag
tgggcctctg
aaaggcagtg
gccctctttg
acgtggaccc
tccctctttg
accacctcaa
tgtacagtgc
ttcgggacat
aagtagtgag
tgtgcaagat
ggaggaggca
gggagaattg
acagtactta
cctccccaac
agccctgcgc
atagcaacca
agcgaagact
atctggactc
agtgggtgta
gtagataaga
gggatctttt
tagagtatct
ttccatggta
aacctctgta
acagcagcac
actgagacaa
gaaggagagg
tctcagagga
gggacctgca
gtttaaaaaa
caggcccctc
tgtcttgagg
ctctccttca
aggagacgct
cccgagtatt
gggactatgg
atggcaggta
ttctgggggt
gacatcctct
gggacaaagg
tcaaacaggt
cccctacttg
ttcatgaaag
ctggagagcc
taggtacaca
aaattagccc
ttacctattt
gcctcacttt
ggtaggcttg
aattggggca
cctctccacc
aggaagggga
tgctgcatgc
taccataaac
caaagcgtgc
tttattcaca
tccaccttat
ccttattgat
aaactaagga
gaatggaata
caagtgcaga
caggtggatg
acatgccccc
ttactgtgat
cccgagctgg
cctgggcctg
tgcccaagag
tagaaggaaa
ggggaaacat
gcagaacttc
gggcccaagg
tctcctcagg
atagatcctg
agtgatggag
gggcaggaga
gatgggactt
ccgtattcct
gccttcagat
aaacttgtgg
aagctccctc
cctctttact
gaggccagtg
aggtgagtga
tgcccaatat
aggaaagaga
aggaccaaga
gaagctgcac
ctgcgagatc
agcagcagcc
ctcttcctgc
tctcagattg
ctttcccctg
gctgatgtac
ggaccttgcc
ctactcaagg
agaatagagg
tcatctatct
gatggaactc
gagggtcaaa
gaggagaagc
ccagccaact
gaccatcacc
atgctctaat
ccactcccaa
acagggctat
ctcctgtttc
cacctcagcc
atcacattca
gtagacgcca
ttcatgatcg
taatgggaaa
cctttatact
atttcttgaa
acctcaccat
ggccaactgt
acccagcagg
gaatttggaa
ggggaaaatg
gggctgtgtg
ggggcatatg
ggctcgcgct
acatctttgt
cttccctact
atgtggcctg
agggatgcct
ttctctgatg
aagggagtgc
ggaagatatt
agggaagtat
agagctgagg
tggactttga
aaattgatga
taccatcctt
ctcacttttg
cccagtttcc
tcatgcagtg
gtcttttggg
ggcagcctga
tagggtgggg
ctccccagat
actttgagat
tgaggaaaat
gggatctctc
agtcagagtt
gatgcaaagt
tatcgtagtg
cggggtgagg
aggggaagga
tctctgtctc
tatctgcaac
gaactgacct
cagctacagt
tcccgcctgg
ccgcgttact
cgggtggagg
tgatccacaa
tgtgcccgaa
gtcatcactg
cctgcagcct
caggctcctg
cacatccctg
ggaactttcc
gtatgtctct
ggtcaggatt
accctgtcca
cacggcacca
tggccctggt
tcttcacacg
acctcaacca
ggaacccctg
aaaagtgggg
cccttgtttc
tcaagaactt
ggtcagtgcg
tggcgaaagc
agggaaccaa
gaggaggatg
ggcagaaaag
gggtccccat
tctggccgct
ggccaccaac
ggtctttgtt
tcctcctttt
gccgctccct
acctgcacca
acccagggag
gaggtcaatt
ctggagggtg
aacaggacag
gggcagcctt
gagtgagtgt
ggcaggtggt
ataaccacgt
ttactgaggt
tcatctacat
ggagatatta
agaacatgaa
aggtgtccag
tggcttcctt
taatggactg
gagtcagcag
ctctgtagtt
aggggctgga
ccacagcttt
cccgaaccaa
aaaagccaga
ggggagtggg
gctcactctg
ctgtttccag
ccagctccct
gccaggtgct
acgtcctgct
ccccacaagt
aatagacatg
gccatgcgag
cctttgtgcc
gacccaactc
gggcctctgg
cagctcttct
tgcttccacc
gtggcccaaa
gcccacgtcc
cggggaggag
tttctctttg
ggccccaggc
aggcagcttt
aattcatagc
ggtcaaaggg
aaaatgctca
agcactaagg
atgtgagtca
gcagtgcagt
ttacgggctt
agtgaacaat
cacggtgagg
aaggagcatg
aaatagaaca
ggctccgaat
gtgctccgac
tttctgagcc
cttctgagtc
tgtgtttctc
gccgcagtcc
tatcgccagc
gtcagggaga
ggataaagaa
agagttaaag
agggtgccag
gctgaaaatc
tgggtgtgga
caccacagct
gtcttccacc
tgcccgcaag
ccctctggtg
ggcttatgaa
cacttttttg
taagtctcca
ctttctattc
gacttcatgg
ctgaggaaga
ttactctgag
ggtggggtta
gaacccctgt
ggagctggat
ggttatatgc
caacttgggg
actctatctt
atccccctag
tcctcaccca
ggcacgagca
ggctcttgcc
ccttggggta
aggggcccaa
gtgcctctcc
caactccaca
atcagggaag
cgtctcctgc
cccattttct
tcacatgttc
ttccttcacc
cgtgcctctt
agacagagtc
atgtgccatt
cctgtctcct
gtgccagcag
tgcgaatcca
aacaaaggga
taacaggaaa
tcagagataa
ctgttggcaa
agggagggca
ccaatactaa
gccactgcac
ggagaaactg
acagtgaggc
cgagacaggg
gctaacctct
actggctggc
ttgttcagct
23581
23641
23701
23761
23821
23881
23941
24001
24061
24121
24181
24241
24301
24361
24421
24481
24541
24601
24661
24721
24781
24841
24901
24961
25021
25081
25141
25201
25261
25321
25381
25441
25501
25561
25621
25681
25741
25801
25861
25921
25981
26041
26101
26161
26221
26281
26341
26401
26461
26521
26581
26641
26701
26761
26821
26881
26941
27001
27061
acaactgctg
cggggacccc
ggctccctca
agggtttcag
ccatggagac
ttgcgaaggc
tggctcgtgg
ctttcattcc
ccccttctct
acttgatccg
tggaaaagtg
ttctcccctt
ttctttctga
tctgcccagg
gttgccagac
tgaacgtttt
tccagtgtcc
tatctccctc
aagtttattt
ccaaaacaca
tgttgctgta
cagagaggcc
tcttctcaag
tgacagaggc
cagtgggagt
gtgcccttgg
gacccagggg
ggaactctct
cccgcctgcc
accacacagt
ggcgtgagag
agagaacagg
gaagaccatc
tgggggctgc
tagacacagc
gaggcagcat
tggaggaagc
tgtggcctgg
aaagggggta
tggagagtac
agagggatgg
tgtcagcctc
ccccccgtta
ggtgagttta
ggggtcagtg
cactgaacaa
acttgtggca
gggaggagcc
agtcacccag
tacccgtgtc
ggctcctttt
gaagttcaga
tgtctcatgt
tctgggaaga
gaaactgctc
aggaaaagtg
ttttgttttg
ccacaactca
agagtgctgg
ccacaagggc
caggctgggc
gcacagagac
gacaggaagg
ctgtgaggat
cagccatgcc
caaggaggtg
tagtcctact
tccccactcc
cagtggaaaa
cgtatatggc
ttcagggact
aatccctaaa
gatttggttg
attagtggat
catgagccag
tccccagcct
agacgcagag
tgaagaaaga
gagcaggagt
ggaagactcc
ctacaaaaca
taggaacacc
aggatgggtg
ggctccaccc
tgcaggctcc
cccagagctt
caatagatct
tccccttgcc
gaacccaact
tcagcctctg
atgccctgtg
tgagggaggc
ccctctgtgt
ctgagaccac
cgtaatgaca
tggtctgcat
attattccaa
ttgtgctatt
tggattttcc
cctgtggtga
caactcctcc
gcccagattc
ttgttttcag
ctggcttaac
ggctagttcc
gcaacaggta
cagaggcccc
ccaaaggagg
taaaccctcc
tctgtttctt
catgttggtg
cctcattctc
agacatgttt
cagaaagcag
cattgtggga
tttctgagat
ctgcagcctc
gattacaggc
ccctggtgca
atttcccaga
cacatccctt
aggtgattga
ggcaacgatc
tcccacacag
atgagatcca
gggcctgggt
ccttactcct
cccatcaagc
cccagtgtct
cagccttcct
tcttcaagat
tccattctgc
aagtttatga
gaagtgctgc
cctgagactc
tttttagttt
tattgtttct
ccacagggga
cggattctac
gattagaggg
ccagcctcag
caaggcaggg
ttcccacctc
ctacacagtc
cttcctttgg
gccctttata
ctccaggtcc
aggggtggag
tgattgcctt
gcagggtctg
acagcgacag
gaatgggggg
tctggagagg
ggatgccacc
tccattcaga
tggggcaggg
tgagggagat
cacctgcctg
caggaataga
tctagaattt
aaaaggtgaa
tttagacttt
agaaaacaca
tgcaccaccc
aacttcaaga
atctgtttct
agatggctca
tctcactccc
aactcaacag
taaagtttct
ccctatatga
attgaaactg
ctttctccaa
atatccattg
agggtctcac
aacctcctgg
gtgagccact
gtatttggtg
ggtggggatt
cccttttctc
tgatacactg
ttgtcttctt
ctgcccaggc
aatgtgcaac
ctaggtccac
cccaccttct
ctgtcaagga
ttaccctctc
ccagcacttt
cccaggtttt
cataaatctt
aactggattt
ctgctgccac
cggtgggctg
ctctagaaat
ttagtctcaa
acctgccctg
cccaggatac
aagacagagg
acagacacag
gtggagggga
agagccatgg
ctgctgctgc
gtgagtatca
tttccattca
tcagtggcca
agaaagggcc
tcccagctac
gagctgctac
cagctgcatc
aggaccaggg
ggagagttaa
aagtgttaag
gggatttgga
atggacaggg
ggaggaactg
ggagggtact
gtggcagacg
tctaacagat
catctgtttg
tgggaagttg
gcgaatttcc
aggattcaaa
aggagggcag
cctccaggag
gcggggctgc
ctaagcctgg
ctaaaaaatg
cctctgctcc
catgcaaaaa
tccttcagcc
aaatgtcttt
ccctctatcc
tgtcgctcag
gctcaagtga
gcacccagcc
aggagaccaa
ggctcctcta
cctccccaca
tcttttattc
ctatcaggtt
tgggcttcct
cacctccaca
aggatttctg
tgcttgttcc
tttgctaaag
tgcatcttct
gcccttcaga
ctgtgccaca
gcgattttct
ggaagatcct
cagcatcctc
ccatgccctc
tttgtttcat
aacaagagac
cctcagtaaa
ttcatgagaa
ggtccaaggg
caggaagggg
gggaccagcc
ggagccaggg
cgctggggac
gcccaacaag
acttgagggc
gtctgggttc
ataacccaga
gcaggatgca
agcatggcca
taggggccct
agggaggaac
tggtcaggga
ttggtgttca
tcactccatg
aggctccatg
atgtgctaaa
gggacgaggg
acctcaggtt
ttacactcag
cagaatctga
gactagagag
cctccagttc
ggaaagacga
gagccccacc
ttgtcaaggc
accaaggggc
tgaaagagtc
gctccaggta
tgaaaacttc
cgatctttct
ttaaatacaa
ggtttgtttc
cagtcttgca
gctgtagtgc
tcctcctgcc
ccagtctcga
tctagctcct
tcagaacaag
ggattggcca
tcttttaaga
tgcgaaggtg
gacaagcttg
tcagagctcc
acccttattt
taggtctcag
aagaaccaaa
cctgcaactc
aacccaccat
gcctctcccc
ctcttcttca
aacctggact
tgagagtcct
tttgtttcct
attaggaata
taggaaagat
aatacagtgt
cgaacccctt
agatggtctc
cctgagaggc
cgggctgcac
ctctggcggg
aagccgccaa
aggtcccagg
ccacagtgtt
acactcagtg
gccctactgt
ggctctggag
ggcctggttt
tggtgaggta
agggaatgtg
tcatgagttg
ttgttggggc
gagatgaggg
aagagtagtg
gagatggagg
gatccagatg
ttcaccatgt
agcctggtcg
ttcatgagaa
gggagttgtt
tccccaagtc
agggagcaga
ctacagggct
agcagaaagg
caagaggccc
agaagcccca
gtgagtcaat
atcttcttgg
ttgaaatccc
aaataaaact
tcatagggtt
gggtgttttg
agtggttcga
tcagcctccc
agtttctaag
27121 aaaggaaagg gatgtgatgg agaaagaaaa ccttcattgg ctgggcacgg tggctcacgc
27181 ctgtaatccc agcactttgg gaggccgagg caggcagatc acctgaggtc aggagtttga
27241 gaccagcctg gccaacatag tgaaaccctg tctctactaa aaatacaaaa aattagccgg
27301 gcgtggtggc gggcacctgt gatcccagct acttgggagg ctgaggcagg agaatcgctt
27361 gaacctagga ggcagaggtt gcagcgagcc gagattgcgc cgctgcactc cagcctgggc
27421 aatgagcaaa actacatctc aaaaaaaaaa caacaacaac aaaaaagaga aaaccttcat
27481 cccagctagg agaggtaagg tcctaagacc tatgtgacaa atgtgtccca ggtcttctta
27541 ccaatggggc aggttgaaaa tagtgctgga gacccatccc tttagagccc gttgtgtcac
27601 caggaggcca ggcctagcag aagcagcacc cctccaactg tgccccacca ggggctgccc
27661 gcagccagcc ctgccccagc cctgccttga gtcaccaatg tgaaggggga aaaggcaggg
27721 gtggccgtgg tgaggatcgg gtcagatgag ccggtagggg tggtgtgccg gtcctgtggg
27781 gaaaaggaag agaatgacag ggtgtgctag agctgtactc aaattaaacc tacaccaccc
27841 tccccggcct tgcccaccct gtgatggaaa gtagtggttc ctcacctgcg gggctggggc
27901 cgataccagg agccggagga aagcatgagt cgggggtact gggttggctt ctcgtccccc
27961 tgcagtcaca gtcaccatca ccacggaatc cggggccgct gaatctggga cctccagcca
28021 caggcggccc caggccgact cattcagttc caggtgagcc ctggagagag gatataggcg
28081 tcgctaaagc tccaggctgc ccagagccta gagtcgggac gcctgcaggg gcacgggagc
28141 ggagaggagg attctgaggg ccagtcggag ggggacaggg gcagggcttg ggataagcat
28201 tggccgggca agatgccaga gggagctgga gggttcatga gcctcacctg gagaggttgg
28261 aggtgaggga gaagctgggg ttgacgaaag tcctaaggtc aagatcctga gggcccgaga
28321 agctggcgat gcggagactg agcgggactt tgctgcccgg ggccaagaaa cccgaggggc
28381 cactaagctg cagagaaggg ttcttcaggg aaggggccgc tctaactctc tccagcccca
28441 gccgcacttt cccctggcgt ctcacctcca gaaggacagg gactacagtg ctaggctgag
28501 gggcagccct gtgcaggcgc cgccccgctg cgtcctggcc aatcagctcc agggagaagg
28561 gtctaggggt ggacagcagc gtgggcgaca gcgaggctgc gaggagacct cgctccggag
28621 gtcccacggg ctccaagggc acctggccta gttcggcacc ctctgggacc cctcgaagga
28681 tgacgtggga gaaatgcggc tgaggatccc caggattggc tctggaaccc aaccctgtca
28741 cttctaccag cagctgggtc tgaagacctg ggacaggggc gaggagggga gaacattgtg
28801 agattcggag a