Best non-maize hits - Purdue Genomics Wiki

(1-50343) Christopher Dugard
acttcatcaacatcgacctccgactccgatgcgccgtcagtaacttctaagcgccaagagcgcaagaagtttagtaagatccccttacactatcctcat
acatctagacatactccattactttccgttccattaggc
>gene1
Best maize hit:
AY664414.1 Zea mays cultivar B73 locus 9008
Best non-maize hits:
HM596859.1 Zea diploperennis bio-material USDA:GRIN:PI 462368 bz locus, partial sequence
HM596856.1 Zea luxurians bio-material USDA:GRIN:PI 1441933 bz locus, partial sequence
AC243120.1 Gossypium raimondii clone GR__Ba0131I15-hog, complete sequence
AAACCACCAACTTTTGATGGCGAAGATTATGCTAGGTGGAGTGATTTAATGAAATTTCATCTAACCTCACTCCAC
AAAAGTATATGGAATGTTGTTGAGTTTGGAGCACAGGTACCATCTATAGGGGATGAAGACTATGATACGGACG
AAGTGGCCCAAATCGAGCACTTCAACTCCCAAGCTACAACCATACTCCTCGCCTCTCTAAGCAAGGAGGAATACA
ACAAGGTGCAAGGGTTGAAGAATGCAAAGGAAATTTGGGACCTTCTCAAAACCGCGCACGAGGGTGATGAACT
CACCAAGATCACCATGCGGGAAACGATCGAGGGGGAGCTCGGTCGCTTTCGTCTTCGTCAAGGAGAGGAGCCA
CAAGATATGTACAACCGGCTCAAAACCTTGGTGAACCAAGTGCGCAACATCGGGAGTAAGAAGTGGGACGACC
ACGAGGTGGTTAAGGTTATTCTAAGATCTCTTATTTTCCTTAACCCCACTCAAGTTCAATTAATTCGTGGTAATCC
TAGATACCCACTAATGACTCCCGAGGAAGTAATCGGGAATTTTGTGAGCTTTGAATGTATGATTAAAGGATCAA
AGAAGATCAACGAGCTTGACGAACCCTCCACGTCCGAAGCACAACCGGTGGCATTTAAGGCGACGGAGGAGAA
GAAGGAGGAGTCTACACCAAGTAGACAACCAATCGACGCCTCCAAGCTCGACAATGAGGAAATGTCGCTCGTC
ATCAAGAGCTTCCGCCAAATCCTCAAACAAAGGAGAGGGAAAGACTACAAGTCCCACTCCAAGAAGGTTTGCTA
CAAATGTGGTAAGCCCGGTCATTTTATTGCAAAATGTCCTATATCTAGTGATAGTGACCGGGGTGACGACAAGA
AGGGTAGAAGGAAGGAGAAGAAGAAATACTACAAGAAGAGGGGCGGCGATGCCCATGTTTGTCGGGAGTGG
GACTCCGACGAGAGCTCCACCAACTCCTCCGACAACGAGGACGCCGCCAACATCGCCGTCACCAAGGGACTCCT
CTTCCCCAACGTCGGCCACAAGTGCCTCATGGCAAAGGACGGCAAAAAGAAGAAGGTTAAATCTAAATCCTCCA
CTAGATATGAGTCCTCTAGTGATGATAATGCTAGTGATGAGGAAGATAATTTGCGTACCCTTTGTGCCAACCTTA
ACATGGAACAAAAGGAAAAATTAAATGAATTAATTAGTGCTATTCATGAAAAGGATGACCTTTTGGATTCTCAA
GAGGACTTCCTAATTAAGGAGAACAAGAAACATGCTAAGGTTAAAAATGCTTATGCTCTAGAAATTGAGAAAT
GTGAAAAATTATCTAGTGAGCTAAGCACTTGCCATGAGACAATAGACAACCTTAGAAATGAAAATGCTAATTTG
TTAGCTAAGGTTGATTCTCATGTTTGTAATGTTTCAATTACCAATTCTAGAAATAATGATGATGATTTACTTGCTA
GAATTGAAGAATTGAACATTTCTCTTGCTAGCCTTAGGATTGAAAATGAAAAATTGCTTGCTAAGGCTAAAGAT
TTTGATGTTTGCAATGCTACTATTTCCGACCTTAGAACTAAGAATGACATGTTACAAGCTAAGGTTGTAGAATTA
AAATCTTGCAAACTCTCTACATCTATTGTTGAGCATGTATCTATTTGTACTAGATGTAGAGATGTTGATATTAATG
TTATTCATGATCACATATCTTTAATTAAACAACAAAATGATCATATAGCAAAATTAGATGCTAAAATTGCCGAGC
ATAACTTAGAAAATGAAAAATTTAAATTTGCTAGAAGTATGCTCTATAGTGGGAGACGCCCTGGCATCAAGGAT
GGCATTGGCTTCCAAAGGGGAGACGATGTCAAACTTAATGCCCCTCCTAAAAGATTGTCCAACTTTGTTAAGGG
CAAAGCTCCCATGCCTCAGGATAATGAGGGTTACATTTTATACCCTGCCGACTATCCCGAGGACAAAATTAGGA
GAATTCATTCTAGGAAGTCTCACTCTGGCCCTAATCATGCTTTTATGTATAAGGGTGAGACATCTAGCTCTAGGC
AACCAACTCGTGCTAAGTTGCCTAAGAAGAAAACTCCTAGTGCATCAAATGAACATAGCATTTCATTTAAAACTT
TTGATGCATCTTATGTGTTGACTAACAAATCCGGCAAAgtagttgccaaatatgttgggggcaaacacaaggggtcaaagacttgt
gtttgggtacccaaagttcttgtatctaatgccaaaggacccaaaaccatttgggtacctaaagtcaagaactaaacttgttttgtagGTTTATGCA
TCCGGGGGCTCAAGTTGGATACTCGACAGCGGATGCACAAACCATATGACAGGGGAGAAAAGGATGTTCTCCT
CCTACGAGAAAAATCAAGATCCCCAACGAGCGATCACATTCGGGGATGGAAACCAAggtttggtcaaaggattgggtaa
aattgctatatcccctgaccattccatttccaatgtttttcttgtagattcattagattacaatttgctttctgtatctcaattatgcaaaatgggctacaact
gtctctttactgatgtaGGTGTCACTGTCTTTAGAAGAAGTGATGATTCAATATCATTTAAGGGTGTGTTAGAGGGTCA
GCTATACTTAGTAGATTTTGATAGAGCTGAACTCGACACATGCTTAATTGCTAAGACTAACATGGGTTGGCTCTG
GCACCGCCGACTAGCCCATGTTGGGATGAAGAATCTTCATAAGCTTCTAAAGGGAGAACACATTTTAGGACTAA
CAAATGTTCATTTTGAGAAAGACAGGATTTGTAGCGCATGCCAAGCAGGAAAGCAAGTTGGTGCCCATCATCCA
CATAAGAAAATCATGACGACCGACAGGCCGCTTGAGCTACTCCACATGGATCTATTCGGCCCGATTGCTTACATA
AGCATCGGCGGGAGtaagtattgtcttgtaatagtggatgattattctcgcttcacttgggtgttctttttgcaggAAAAATCTCAAACCCA
AGAGACCTTAAAGGGATTCTTGAGACGGGCTCAAAATGA
gttcggcttaaggatcaagaaaattagaagcgacaacgggacggagttcaagaactctcaaatcgaaggcttccttaaggaggagggcatcaagca
tgagttctcttctgtaatactcaaaattgtgtacaaggaatatatagtgttttcctcatttgatgtgtctcatttgcatcataaaaagagcatacctgaaat
ttagagtttaattcaaacaaatgacaaaaaatgcatcatgttggagtttatatgtttgtgcattaaataaaataataatgttaatggtaataagataatg
attgctagaagttgaattaaaccctaaattaaaactagggttttcaaaaataagagaagaaaaaagggtataaaaatataaataatatactattatct
ccaagatttatattctcatttcacaatatgattggaggataaaaatctatacaatgttcgaatttaaaattcaaattcaaaccaaattttgaaatgaaga
aagaaaatagaaaaaaataaaaaggaaaaagaataaaagcctcatgggccgccaaaccatcatttcggcccgctccagcttcccacccgcgcagcc
cacacctgaagctcgcgccgacgtgtgggtcccgcttgtcagccgcttttacttcgcgcgcgcgctgggtgactctctgtctggtgggcccggggcgcca
gactcttcgtccacctccgtaacaaacgcgtgagaatcgaccgcgcctgagcaccgtaatctccggaattcggttgcgattggacctagtctcgcgggg
aaagtgggggcataagaagacccggcgtcgcccatcctctgccctcgctaatcttctcaaaatcgcacgcgcacagaaaacccagagcctcgcccat
gctcgagtccgccgtcgccggcttgtgcttgcaacgccacttgatgtccgggaggtagttagggagctgcgcgagcgcactaggaaggcatagcaatc
ctcaatttggtggtcgggacctcggagggtggttgatttcgcgtcgtcgctggaacaccgccgcggcgccgcatcgtgccgcggacccagctctgcaca
gctcaaacataggtaagaaaaccctaggccatttcgctttgatctcagtaccgtgtagcgcgtttcaatttgcgagttggggcacgggttcgccggattg
gatgaccacggcggagcgccgccgtggggatcgcggcgctgcgggggtcctaggtcccgaggtgggggaagagcgtcgggaccgtccgatctgggc
gaacggctgcgattagaagttagcgtaccccttcgccactttggtcgggtaccgttgatctcgaatctaacggtggtggttagatctggtttcataaactc
tgggccgtcggatcccgatccggcggttggtaacgcataccggttcggggttgagtgatctaatctgagccctcagattctaatcgtgcggcccacatct
acggataccccttcgcagggactttttgcttaagaaaccctggatttattagaaaccaacccgcggtccacaggaacagtccacagagtcttggaaac
acttgcaccgaaacccctgagttttctgggattcgaggcctagttcaggggatattaaaaaccgagaaaatgattatagaatttagtttttaatacaaa
aacaattctagaaacttgtaaaattcatagaaattccatttttgacccaaattgagccattccaatttctaaaattttgtaataatattctctaccatttag
agtcactgttttgacatgaactctgatagaacattaatttaacatttaatcctaatttaatcacattaaaccttaggaaattcataacttgaaacctataa
ctccaaatttagtgattccagttcctatgatctcattttaatgtatagatttttactgtatattttattcatctgtttggtgtgatgttgattttggctatactat
gtatgtattgtgttgatgcgaatagacgagcaagccactgtggaaactgaggttcaacaagtagaagtagctgagcaggagctcattgaaggcaagt
tgtgcccttgatcacttacttttcccaaccatgttcttattaattttaatgatctgcataggtaaattttgatgggagactttatgttaccccagttttgatta
cctctataccttgttcacccctgaaataattttggggagggcgatcagagtgcttttgtggatgggtatggagttacactacacatgattacaatgatatt
atcttaatattacactggtcatgttaagatcattaaattaatgggaacatggagcgacaaccgggtaaaacagtggtacctcaagggtataatgggac
ggccttggctgggtaattaggaaagctagtggaagactaccttacccgaaaggggcaagggcagtaggggagaggtcagtgcggggaggtccctgg
ttgattttgctgggatggctgtcagccaggaaccctgaactggatcttcctataaactgtagcgggttttcggaagctagtggaactttgtaaaggcctc
gtagtggatccctagccattcacctcggtagtgtctaaggtccttgcaaacccaggcgacatgggatacacgacttgtgggtaaagatgcgcaacctct
gcagagtgtaaaactagtatactagccgtgctcacggtcaagagcggctcggaccctcacatgattaattatggaacttaaatttaatttgacattgca
tcgcatttgggattattttactattactgttctttattattattaaggtttggtatttacttacacttagtaattgctaataaaattttgaccaacttataaaa
gcaatgctcagcctcaacctctatttcattgatcagccttacactacatgaactcccacctttggtgagttcatgccacattattccccacgacttgttgag
ctatgaacgtatgtgagctcactcttgctgtctcacacccccccacaggagaagatcaggtggtcgaagaggagctgcctaacactgaggagttcgat
ctgatctaggtggcgtgtctcggtcgacattggcgccgacgatccttagttcattttatacttattgttttcttttgtaataagacttccgctatgtaataaa
tactctgatgtattatgacatttatctctatacactctgttattatatatgttgtcttcttggcgcatgtatgagatgcacccggctttgtcctttaaaaccgg
gtgttacatcttctccctacacccctcaacaaaatggtgtagtggagaggaagaatagaactcttttggacatggcgagaaccatgcttgatgatacag
acttcggatcggtttgggcgagcggtcacacgctgctacgcatcacgctatatctacatcgatctcagaagacatctatgactctaccggtaaaagcca
tattcataatttagaagtctgnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnngatgaaattcaaagctttgaattcgtataattaggtatatgtggagtacttcagtg
aaccaatggactacaaaactgagccgagctaagaattgtggacattattatcacatgaaatactccttggtttactaattactaagagaatgttgtcaa
ggaataacaacaaataccacaattagtgctattgggtataatgatatatttgctcaaataaatgattcatgccccgtatcctttccagttcaaggaaag
aaggaaactcaaatgcaatttagaccactgtaagatgaaccaataaagttattctgctacctagcaccatggtgtatctaatcaataaaaaacaagga
tgatggttttatttataaagtggtcaactatattattttaaacaaattcagatcaggcaaggacttgactttgacacacaaaatgagtatttaccttacta
catcatctgaggagcttaaattttttaccacgttagccaataggctttggcagaagttaaaatgtggggcagtatctaataaagcatccatgcaacgga
caaccagagtatgatagtgggg
>gene2 (reverse strand)
Best maize hit:
NM_001158676.1 Zea mays LOC100285786 (si618065b11), mRNA >gb|EU973949.1| Zea mays clone 424851 nuclear ribonuclease Z mRNA, complete cds
Best non-maize hits:
XM_002453594.1 Sorghum bicolor hypothetical protein, mRNA
XM_002436496.1 Sorghum bicolor hypothetical protein, mRNA
NM_001052815.1 Oryza sativa Japonica Group Os02g0214300 (Os02g0214300) mRNA, complete cds
XM_003571604.1 PREDICTED: Brachypodium distachyon nuclear ribonuclease Z-like (LOC100834210), mRNA
TTACTTCTCCAACGATATCAATTTCTGTAGGTAAGCctaaaaattcacaaaagaaatccagctccaaccattgtttgattttctttataa
gcatgtatgtcagcagggagaaaaccaattgaaaagtctacagtggtgcagagttgggacagagcgaggttggtttacccacCTTAAGCAGTG
GGCGCAGCCACTCTTTGCACTTGTAAGAGCTGTGCAAGAACGCGATAGCTGGTTTTCTTCTAGGGACGGGGTCG
TCAAGCTTCAGCAGAAGTACACGTACTCGTCCTTGTTCACCTAGctgctttgaaaggaaaaaagtagtgatagtttctggttaga
aaaatagaggggaaaataaaaaccagcagtaaggcgttgactcggtagtttagtgtggggaatacatgtgagtctggatctgagaacctagaaaaa
aggatgtcgactatcaagggtgcacccaagtattcagcagagggatgcaatggagtctgcatgtatgtggagtaagagtccactattatattcttgctc
tagatctcgtgaggctgcttagatgcgattggctcctgtaggtcgtaggtgacacgatctggctgccggttacCTCAGTGATTAGGTTAAAAT
TTTCTTCAACCAGCTTCTCCACCCAGTTTTCCACCTCTTTACTGGGGCACGACTCCATTGGATTTGCctgagattaccac
agacaatcaaccctgtaggttcaaccattttagcagcacattcatttaccaagagtccaagatactgaatgtgaaccaaagccctttcagtttacacga
taatttgaagaacatagaaagagatgcagacctcgacatttaggatagatgacaccaaaaagaaagggtagcaatatccaacgttaacatcaacaa
ggtacaaaaagttttggcatacaattctacttaatttgtctaacaagtcattcaaattgcataacaatatcacatgaaaatgtgacgagacagaacaag
aaaaaaacctggaagaaaacagtctctgcatgcccatagagattgcggcggacagtcttggttctaggaagatttgggtcagaagctacagacCTG
TGTTCGCTCCCTAGCAGTGTGGTGCACCTCGTTTGGATAGACACAATAATTGTCAGAGACCTCctgaatcaaaccagg
aatagtattgttaacttgtaagcaaaataaaggatgtttgagggatggcaaaataaatgatgtttgagacatgctcttgtgctttgcaactttgggttgg
aagacatctgatccatcatggaaaatcttatggtggtgaagttgtggtagctttattctctgtaattctgagcggcctgtaagcgactttattacacagat
gttttgctctatataaattgttctgtcatgtgtatccttatctatgttttttttcaaaatttgtagtggtttgaatcaagcagctaccaacttctattcattcga
ccaagggcgcattgctgcttatatagactatatgagatgattagtcgttcaacttccagcaccaatcaagaagggacaaccttaccccaggtgcaagg
gaatattgaatttcgaaatgtgtactttagttacctgtctcgtcctgaaatcccaatattaagtggtttcttcctcactgtaccgacgagaaaaactgtgg
cacttgttggtagaaatggctcaggaaaagcagtgtaattcctctgatggagagattctatgacccaacattaggtaagagcgttgagaagcatctttt
gttttccacttctatttccactcaaaagactactattatttttcaggtgaagttcttttggatggggaaaacataaaaaaattgaaagtagaatggttaag
aagccaaattcgtctggtgacacaagaaccagccttattgagtttgagcatttaggaaaatattgcttatggaagatctgctacctttgatcagataga
agaggcagcaaaaacagcccatgcccatgggttcatcagttcacttgaaaagggggacgaaacccaggtaaaggaggttaccatatatcttattgttt
atcaaggctagactgcttaggcattgaatctctggtactcaccttttgaaattctgtaggttggtcgagctggtctaacactcactgatgaacataaaat
caaaatttctattgctcgtgctgtgctttcgaatccatccattattttgcttgatgaggtcaccggaggacttgattttgaagctgagaaggctgtacaag
aagcattagatgttctgatgttgggaaggtcaaccattataattgctagacgtctttgcctcattaaaaatgctgattatatagctgtaatggaggaggg
ccatcttgttgaaatggggacacatgatgaacttctaaacttggatggccatactcccttcaacaagttactaaaccatgcttcaatttttgaactccaa
agccaccaagccagacagacCTCCACACCTGATAGCTTGAGCCGCTTGATCTCGCTCCCTGGTAGGCCAATGAACTCCT
GCTTAAGCTTCTGCTTCACCGAGTATATCACATACCCctgcatacacacaatttaattgagactccatccaaccattgcaggattata
acagatggattcacaacatacacagttccaagaagtgaagtgaaccaattcatctcacCTGGCTGGGTATGGCGTGATAGGTCCTGAA
AGCCCTGACCTTGAGGTCCCTCCTGAGCTCGTACTCCTCCCTGACCTCCAGGGGGACCAGGTTGTGGTTGAGCTC
AGACTGGTCCAT
ggcgcggcgatgcacctcaaatagccgctccacgaggtcccagaggcacgctgggacgaagatggtgggcggacgcagcctgaagagcccccgcta
gccacatacatgggaaggcccccgatgtggtcgaggtgcccgtgagagacgaagaggaactcctgcgagacggcgcactgcgggcatcggccgatg
tcgaaggcgaggctcagcgtcgggaagatgacgcaggtttcttgaccgccgatggagacgaggaagcgggaataggagaggagctgcaggacgca
acctccatagtgtactcgtcgcagcttttcgccattccggcgccgaggagagaggaggggggcttaagggtcgcggtgggctgtgcgggcgattcccg
cgcgcgtagggagggaatcgcggcggcgggggtcgtgcacgtgatccgcgggggagggaatcgcggcggcggggggtcgtgcgcgtgatccgcgg
ggggagggcaatcgcggggggcgtgatccgcggcagggagggggagcgggggcagtctacagtgcacccttaatagttagtagagataacacaca
tatgtacataagttattgtgttattatacggttccgttgcaacgcatgggcactcacctagttatatatatgaggtcagtatattgggagcctaaagcttc
caatagttacaggaagcccagccaccccctggtccacacccgacaaccaggcacatgcggtcgggtgcgtcccccgccatgcaccaggtcaaacctc
cccacgcgtgtaagtcaaaccccggtccaacacccaacctccgcgtgggacgcacaacataaattgtagttctgttttttgcgcaaatagagtaaatag
ttcacagaaataacgtaaatattttgtagagatagcataaaaactccgcccaactactggaccaacaccacatgtgcagacacatgcatgcagatcct
atttttatgatagatccaataattatggcagatcatggaagtttctattgcatgcgtgtaaattttggatgacattttcgtttccactgcatacgcgcaaaa
attggattgatccataaatataggatcgctccaacagccatacagctaggcatgttagaacccaatttatgatatatactaatattaattattctacaca
gtgtagatattttatgtattcaatatactacgtaaaaaattgttttctgttgcatgcgcacaaaatttaggatgacaggaatttgcgatgaaagaaaatg
gattgacatgcatgggaaaacaaaaaaatcttattcaaatgttttatttcctccataaatatatgatcgctccaacacccatgcaggtagactcgttaga
accagtttatgatatatattgatactaattatgttacacggtgtagatacttatgcaatacgtacactacataaatatatggcatgtggtttactggtgga
cacatcttcatattcgagcgtaaaaaccagcagttatgagcgtaaatactgttccaacatgaaagctattgaaaacctcttgatgttggcatagatatgt
atacaaagtgacataaaaatatgtacgcgattgtatgtcatcactccaccgcgaacatgcattcaggcgcgtcccccaacagacgcatgcatgcaatc
ctatttttatggtagatctggaaaatatggcaaccactaggcacatgcatgaggttcctaatttcatgcagatccaaaaattatgacaggtcgtgagag
tatccattgcatgcgcacaattttgaacgacattttcgttcctacagcatacacacaaaaactggatggctggtatcacacaaagcataaatcattata
gaaaattgcataaatatatacaacgtacatcataatcataattaataaagaaggctcaaccaacagccgcatgcataaacaaaggattgcaggtcat
caccggaggacgtcaccgaggccaccagatcggaagagttcgtaaacgaaggatgcccaaaccgtcgtgtagaccgccaaccctcgcagctcgcgt
ctcgcgtcaccgatgtcgccgctggagccacccgtgcctagcgtggcaaccatcgctagagacggacgcgcctcatggggtgccaccgttgctcggtc
gtcagggtcaccaccgctcgcccgcctcgggccccatcatcgattgccctacgtattagggagctgccggcaccacgcctccgatttgggaaccaccgc
cgccgcacctccaaatcgggtgtaacaccccaaaattttattttgggcacttttcaagataatacaaacatcttgaaacaaaatatctttaataggattt
tcctcactttgtaagccacgtctcctgaaataaatataactcatagtaaaggatcaaatgaacagatcttatataagttaaattattcttttaaataatat
atttagagatattctccaaataggttaagttaaaacatctcttactgaaaattttacactatgaatctttattcataaaataattatttccttgtgaggagc
tacatatctaaaagccctttcttagataggataataaattaattaattaatcaaatgatatatgcatctcatgctaggtcttattttgtggtgcatctcgaa
tttgaactctgaatccaaattaaaattttgaatttgagttagtacataaaatagaaaatgaaagaaaggaaaatagaaaatgaaattagaaaagagg
aaacccacatatgggccggttttctcacttctcggcccacctctcccgcacggcccgcttcccttgtccacgctgcgatgaaagaagtgggaggagaag
tatcagacttgccctaggaaaaaagttgggggcactcagacaaagacaccgaccttgcttcatgacccacacaacgtgacattccaacaaccagcga
tttgtcgaccaacccgtctgcgaggtctattcaaaaaggagaccgaagagaaagggggctaagagtgatcgacaagtgaacgaccatccggtgcttc
taaaggagggcgaaggtgctcgactcttccccaagaagctacatgttgcatcaaaggtagccttcatcgccttatgtgaagggtatctaggcgctaag
cccaacctcactttatggcaatacttcttttgtgtcgagctgttgaggaagaagaaggagcgaggaataatggaatcatggtcgatcaggtgcgccagt
atccgccttcacagcggaaggtcacgtgagtatatcccaacccccttttccttatataacaagggttggcacatgtgatggttctacctgcgagacatcg
atcgagagccctcgttggggttaccgcttcttagcatcgatcggaatgtcccgttcgaggctccctaaactagagccacgatgtcacgcctgaagagaa
agttcgaatccagggacatctagcgtgggtgcatatcctctaagagaaaggcgtcacaagagcggacattatctataaaacctctggtgttacgagaa
ctaaaactttagcatgacatcatatgcactgcattatattatgaatgacttacccaaaatgcattcactaggctaaaatttcaaaacaaaaacatgtga
tgtgatgcttggttgagtatacggtctagcaaggggattcttaaccctatgtaggaatgaatctcttcaagggctagaagcatgtgaccttatgatttga
gcacaaaggagtgatcttaccaagcttggtagccatgtttaagaaataatgaattggagaggttaaacacttaatgtagcttaggatactcaaaactct
aatttggagccccaaaaccctaattagtgccctatagggtgatttcaaattctatgcactttttgccaaaagtgtatatatcaaagtggtagagtaacaa
aaaccaagaaacttttatatttggaggtttccaagttttgtggtgaaacttggagtaatttgtaaaagttcataagcacctctagtcttgaactctgaaaa
tagtgttgatgagccaactttgggctcttgtatcacttaatctataagggattttttggaagtcattatggcgaagttgtagagctactatagtagtccaa
attttattaagtgacttagcccaaaacctatatgaaatttggagaaaaatgcccttgaagtcgggctgtcagtcgaagggaagtctgaacctagactga
cagtggcatgagcctaactttggagccattttatgcttgatccattgagcaaatgaccaagatccttatgacaaagttgaagctgatatataggagaac
aactttgatgcagtggggttggtagttcactgcataaaaatcgaagaaaaaggtccctaaagtcgctttgtcaggcgccctgaattgggtgtggcatgg
aacaataattgaatcatatcccaggttcactgaacccgtcgtcggcgtcctcacgtcgtgatccgagcttggcttgaagattcgtgtgaaggaacaaat
atccgtggtgaaaagatcatcgggcagggtttggattggctcacgggcggaatcattcgtcgtacccacttgcacggcgacgtggccgagcagccatc
gtcgtggtggttgcagtctcccgctgttacgccatgccgaggtccattttaccacgtcatcgtggtcgtggcgagtagggaattattgccctgaaccgtta
caccgtcttgagccaccatgaccaagttccatgaaagctcttgatgccgtcgctggcatactcattcaaaattcacgacatgtgatgctggttcctttaa
attgaatgcttccctcgctccaccctcttgttacgaacccagcgcaccacccctcactcccgatcatcaaccatagcttgagaatttcgtatttccctgga
atcggttggaacaccatcgtgaaacacctcgccgtggccagcccttttcaggttagttcatgtccctctagtccccattttagtacctctctgacctctaga
tgattacggtttcgaccaattgaactctaccgcgttatagccctgggaacgtcgtgttgtcgctcgggaatgctgtcgtggacacccctcacgtcgatcg
gctcttccgggtcttctccgtccaaattcacatgatcaccatagttctagtgagtcactgatcccttccgggtcgtttgattgaactctaccggcttgtgta
gtccaaattcgccggagctcggattctgctattgtcatgggcacacatgaattagggttggattattcaaattaacttaagagaagtgctcaggtagcct
ctgggacg
>gene3
Best maize hits:
NM_001148386.1 Zea mays uncharacterized LOC100274000 (LOC100274000), mRNA >gb|BT042177.1| Zea mays full-length cDNA clone ZM_BFb0153F14 mRNA, complete cds
EU966203.1 Zea mays clone 292556 single myb histone 1 mRNA, complete cds
AY271659.1 Zea mays single myb histone 1 (Smh1) mRNA, complete cds
Best non-maize hits:
XM_002455819.1 Sorghum bicolor hypothetical protein, mRNA
XM_003569238.1 PREDICTED: Brachypodium distachyon uncharacterized LOC100830626 (LOC100830626), mRNA
AK365152.1 Hordeum vulgare subsp. vulgare mRNA for predicted protein, complete cds, clone: NIASHv2031H21
CT830863.1 Oryza sativa (indica cultivar-group) cDNA clone:OSIGCRA110E03, full insert sequence
JF951953.1 Aegilops tauschii clone TaMYB70 MYB-related protein mRNA, complete cds
AK334835.1 Triticum aestivum cDNA, clone: WT011_D14, cultivar: Chinese Spring
NM_001049977.1 Oryza sativa Japonica Group Os01g0589300 (Os01g0589300) mRNA, complete cds
AJ495788.1 Oryza sativa Japonica Group partial mRNA for MYB19 protein
ATGTCCGTGTACTTTCTCCTGCTAGGGAGGGTCACCGGCGACGAGAATGCACGGCTCTGGCCAggtgcttgaccggca
catggtcggggacccggagttgtaaagaccagagaaggggaaggcttatgtgaagttgtcagtgacacattggaacagtgccacagactctttattg
gttgtttaaaacatctaggatctttgcgcaaaatcgtcagcgctggcgcaggcacaagcgtgtttccctgttcgtggaccggtttgggctagattcagcc
caatcctgttcattattttctcttttccttttcctggtaacttgtgaaatacataaaaaatagtagaaaaatgattaaaacatggaccaattttactagact
ccaagaaatatgtagtatttaataaaaatacttctaagatttttaattcaaattataaaatgtatagcgtaaggtacttgagcatagcttcttagaatttt
agaaatattttaataatcccaaaaatcctaaaactttttgggtagacttaaaatgtttgttttgaaccttgagtaaagtttgaacttatttgaacactgttt
gactaggaaaccaataataagcccaaagaattaaaccctttaatacctagggagtctaggagtagttttgtaagtagaacatgaaaatcatcactttg
ctataatttggtagcccaagaagattagttgaactaatcctaaaatattatctagctaaaagtcttggtagttaataggcttcatgaaggaaaattgtat
tcaagacacaacttactatcttcttgttataaatttgcatatccatgacacaacatatcgtaggatgtgaataaggccaacatatcatcatatgaggggt
ttggtcccacatggatcacgatctatcatattattttgtcAGCCTCGGGAGGCAGGGCCGACCTGTGCCTCTCGCGCATGGAGAG
GCTGAGTCCACGCAGAGCCTCCGAGACCGCGTCGGGGCTCTCGAGCGGAGGAGGccaggtaggcccaccctgaccgcat
cgccctcccggttgatgatgaaatcgagcttcccgaagcggatgccactctagggtgtagtcaacgcggggaccgccactggccctcatggtcgcgcc
aaggttcctcggcggtggaggctctcgaaatgaatcacggcttcggggatgcgattggtgtcacgcctggccatccagagcttgaagagctttggcac
ggaaagcgcaagtcccctacctggcgtgccaactgtcggtctttcaaacctcgctccgaaaagtaaatttattgtgcgcattccatgctctggacggttt
gcgaggtgtgcacaagatttatactagtttgggcagcacctccctacatatagtcatcggcggcttgcgctatcgacaccattgatgatcaaagctcgt
agtatgggttacaagcaaggcgagagagggaggagaggctcccaagtctcttgttcggggtggaggtggttacaaggtggagtcctagctaggtctt
ggcttggcagcagggcttcgatctccaggtcctcctctaattcgtcggtcttctgggcgtcgtcttctctagggtccgtccttgcatcgtccacccgagag
aaagaaccctcaagtgctctctctagaagggtgtcctccatgcgctcctctagaacagggtccccctttgcgcttgggggccgacctcctccctttataa
gctaaggagcggggggtcggcccgtggtggattcctttggaaggatccaccgggtgatggtaaaaccagggttcttaccatggggtaaagccacaca
tgcttgcaatcgcttggctatcctgtattctttatacgaacgtcgctggtggcatgggcctgagcgctatcatttgggctatgccgaccctcgaccttcgta
gcctggggctcggcgcggctcatcgtgttgcgccctgccagcgggcctgcgcacttaggcctagtcccgataggtcattagtgcaccgtcgtccaggga
cgtgcgggccataaatgcaccacagtccgagggcttgtagatcatgagtgctctgcaccccgagggtttgtaggccatgagtggtaggtagggtccga
ggggcttgtagttaatgtgaccgatcatttatgatgggttggtctaGGGCCAGGCGCAGGATCCACTCGAGTGGTTTCCCTTCGAC
GCGACCGCCCCCCTGTCCGGTTCCGCCAAACCCGACCTCGGGCTCGATGCCACaaggttcgtttgggggcggattcttccaga
gtagacgttatttgtctctttcgactgaccagtgggcccctgttgccagaggccctttcccagtgtgctcgctggccgatgggccccaggatctggggat
catatccccgacagagcgctgcgatgtttaaggtgtacagttgcggctatcgtaggggaggggtcacccgtggatagtctatggcggcacaaccacac
tcagcggccaacctagggatggcaacatggaattcctcgttggggaataacttccatacctgtcctggtgttgaatttttttccacggggatccccacga
atgcttgcggggtacatttcttcccctgcagggataaatcgccgacgaggatctcgtccctacttaaactacaattaggacgtacatcctttattattaat
gcaaaacattgtcatttatgcgcattgttatatgtacatcaatatgaacacatttatacaataagtaatctaacaaaataattatttatttttattattata
cggaatatcgtcatgtgcaaatttaacaaattcccgtttggaagaagatagacatcgcttcaccatttcccgtcccatttatgtgctatgtgagaaaata
ttttttcctgtaaaatggagaatcgggatagaatcccattgcctggttgccatctctagggcaaccctctttgattttatttatttttatgttgattttatttat
ttttatgtttgatgacgacgacgaagaatcaaagagggcttccttcgaacgccacccccgggcctgtggcaattaatcccctcgccttcgtaaccgcca
gtcctccgccacctctccacccaacaacaaaacaagcctaatgggtaatggcttcctcagttcctctccatctcacgccactgcgctgccctagaacctt
cgccgccgcccctctttAGGATTACCGTGTCCTCGATCGCCCCCTGGCACGCCCGCCCCGATCGCCCCCTGGCACGCCCG
CCCCGGATCCTGCACCGGAATTCCGTGTGGAATTCCGGACCCGATCGAAATCGAACCGCCTCCCTGCGTGCTCCC
CTGATCAAGAAATTCTCACGCCGCGCGAACGCACGCCTGAGGCGGTTGGATCCTCTGGTGGGGTTCCTTGGTCTC
GATTTGCTGTGGATGGGGGCGCCGAAGCAGCGCTGGACGCCGGAGGAAGAGGCCGCTCTCAAGGCCGGCGTC
GCCAAGCACGGGCCCGGCAAGTGGCGCACCATCCTCCGGGACTCGGACTTCAGCGCGCTCCTGCGCCTCCGCTCC
AATGTTGACCTCaaggtgacgctcggatcgccagggagggtgggcggtgggtttttggcgatcagctgaggggagtatggaggaaatctccagt
ccttttagtcggttagactgaatgaaagaaaatcgcggttttgcacgctccgtacatcctgacgtgttccttatttggttctagatcggtattttaggtggt
cggtactcggtagtaggtattgcatagagttttgaggtggctcgagagtaatctggtatatcagaaacttctagccgtagcatcatgatgcccaaaaag
cttccacacccctttgttttattccatgtagattttttcacctgctaggtattgatggttccttgagagagaggtcacttccatgtgttgatcttgttcaccat
aactatgatttgtttagttgggcataatatgtttttagtttccaagttcattttgaatgcctgtatcgatgagtacaaatgaatcttcggtatggtctatttgt
gcAGGACAAGTGGCGCAACTTGAGTGTCACCGCAGGAGGTTATGGGTCTAGAGAGAAGGCAAGGATGGCGTTG
AAGAAAGGTAGACGTGTGGTGCCTAAGCTTACTGCTGAGCCAATGGATGTAGATGTGAAGGATATGGACGATG
CTCATGACACAGCCATTGATGTGGAACCGTTAGCAATGGCTTTCGAGTCCTTGCCAACTGAGGAAAGTCCAGAT
AAGTCAGTTGctaggtcagtagactcggtgcaaatattttatcaaaactacttattgtctttatgagtaacctaccttttcagttcgtcatcttcctttt
tggtgaaatgtgttttatgtgggactagatagaaatcatcgatataatttgatttgacattctgattctagtcctgccttgctagacattaaagttcagaa
gttgaccttaaggaaattccttttaaaccatatctgaagttctgaactgtctcggcaacaactagtgcaggaatttgtttcgacttattgtctttatgagca
acctaccttttcagttcgccatcttcccttttggtgaatgtggtttatgtgggactagaatgaaatcatcaatatagatcgatttggcattctagtcctgcct
ggctagacattaataagttgtttggtttgaggaatgagttagttcatcatcttctcactcctcacttttttgtttgttttgtggaatggattgagttgatccat
catcacctcattccttatagttatttagttagtactaatatgaggaatgaggtcatcccaccaaattttaggaatagacccataatgcaccaccatatttt
ggatggagtgattcctcaaaccaaacaccccctaaagttcagaagttgaccttaacagaaattcctttgaaaacatattaaggtatgtgaactgtctcg
acaactagtgtaggaatttgtttacagttgtgttttgagttggggttcgtgatttttgtttacactaacagttgcagcttacaagcacattaaaggagcat
ataaacagtgatcttgtagaaaatcaaatgcttgcctattctctgccatattataggcattcatacatttcctttggtgctcagtttgacctctcttatcacg
tctttagtcattaaagtttatgtcatttttaggaaatgaaatataaattattaccccctttcttttacgttagaaggcttgttttattcaatgatggttaaata
taaaaggcacacatagttcccaatgctttcctatatatctaaaggctagttaacttcatgcacaaagcgttgaattgtttttttctcccaactcaatttgttt
gtaccaaagcaacagagtagtaatagctacggtgagcatttatgcatgcaaccagtgtgaatgactctttgttgtgttcatgtatgtacacagtagtaat
gtgtagttacttgggtcactttagtcagattagcaagtttttttttgtctttgttcccaaataaacgtgctcgctaaaactgaatggagggaccttgtctctg
agttgatggtaactgtgcatcggaaatgaagttatatggtcatcgtaattgaaacaggaggctatgcacatgttaaactcttatggctagtgcgacaat
ggtgagctatgtgcagagtcattaaaagagaacatattaataaatgttgacctgactgatactcttactaTAGAAGCTGGACAAGCTCTATG
TTTAAGCCAGTCACAAAAAATCTTATTGGAGCAGGgagggtcatcagtgtctaaaaacatgagttgaaggtggttgtatggaacgat
ggaacccaatcaatctgggttgataatttgatttggggtcgatgacatgcacaaatagttacaaaaggtcaatacaaataatacaaatccaacacgct
aaccacctaataattgtgtggtgtgatagtcatgctcgatcatgcagccttcgtcctttcttgttcctcgctttctctctactcaggcaacctgatgccgcta
gtcacgaatcaaccactgctagccaacatctcacactaggcagtagtgaaattgattgcctgcaatgttctcttctcttttgtgctcctagtttcattggcg
aagtcatgtccacaattgttggtagccatgatcaactctagttgtctctcatgtcacaaattcacaataacataaactcacatcatcttcgtgtcctctcct
ctccctgtacctcattatagccctttggatctgtggtgtactgtcgcaGTGGAGTGGTGATCCAGACAGAATCGAGTTGTTGACCTTT
GATAGGTTTTGCCGTTTTGGAATCCTACCGCGACTatgtaagttttatgtatggaagatcgatggctagagcgtgtaccccgatgctgt
ggtcgcgcataaggagtaggcgagcgagaaagagacgtggtagatggcccttgaggccaagcaagataattgtcttgcttacttagattgatacatct
ctcattatatagagatgcttactttagccctaagtaagctattcttatttttatggaaacctcctagtcaagaatcaatcatgaccctattcaagaatcaat
catgatcctaatcaagaatcaattatgatcttatttatggaaacaaatgtgaaatatctaaaggaaattattttgtattttgtttctaattgtccttatggtg
accccaacatctataggtgacaccagttataagatatctcctcgcctctctgaacaactcgctcgagctagatatcggggtagaggtccttgaattgctg
aagcagctcctaagtggtgtcagtcgccgccacaccattgaatgacatgccaagtgccctggcgtagttgggcgcaacacatggtctgctttaggcagt
agtcgaccatcctgtatgggtggcacagGCGAAGGAGTCACCTGGGGTCCCACAGAACAACTTGAGCTAAGAAGCCATGT
TCTTGGAATGGGAACACAGAATACAAAACATTATTTTTTAGAACcatggtatgaagggggtacaccttgttgtattgttttttcga
caagtgaaacgagaaccatgacagtctggaacaatatactagaacgaggaacatgttcccgagaattttgggttgacgcaactggcaacgaaacta
aaataacaagttcaaatcttacaaataatgccatgccatatacatgacctttagatttcagtctaccaaatctcgaccatccatccaaaatttggaaccc
tagtcgcgggccatcttttatttgcacaacaacacaagctgccagcgtctgtctacatcactccattggtctgtgtccatacgcctttgtactctagcagc
cgccgagcgcccgagttgtggcagctgcaagctgccattcgcaccaagccaccaactaaccagctcccgagtcaccagtccccacctgcggcagctat
tgcccattgatacgaggtggtgacaccgacgtccggttagggatacacgacgatgtccaatgccggtggctttgcgtccattgttcgctctgtcactgac
taccaaccgtgctactatggacacaagcatgctagtcggagcgtggagcgtgtgcccagcatgggatggcgtcccacgtcctaagagtcataggtgtg
atgccctaatctaatcccaagtaactaactcaatcttctaatataatatttatccactaatcctaaccctgagcctatctctatgggcctccatgccctag
gttgccgccccatgacatttagagcctacaaaaatgttttaagtgcctcaaataaatagttttttatttttggaaaaaagaaaaatactcggaaaaata
cgaaaaattatcggagacttctacggggtatgttaacatgtttcaaacaccttttgtacttaaaaacattatacagcggtatgaatgcaacacactacc
acaccaagttcacacttatgatctgttttagttgataaaaaatataaaaccataatgcgtgattctagaccactcaaatcttatcaagttaaaccaaatt
ttcaaaaaaaaatgttgattaaaaaattagggtgttacaacagatctggttaagtacatggaacgagggcccaaagagggtgtgcagctctgtgaag
tcctcggtggtgtcggtgttgcagcatgagggctttcatcaccgagttgaggcgaccggggcggtactctactgcgaaatcgaagccagacaacttgct
gatcaactagtgttggggaatagttgagtcgctggtccaagaggaacttgagggcttgaggctgtagtggtcggtgcaggcgaggaagcagcgactc
tagaggtatggccgccaatgctgcacagcctgaaccagaccaatgagttcacactcatatgcagctagcttgtgatgaagagcggtgaagggtcagct
gaagatggctaggggaccagcccttggtggagaacggtgccgaaacttgcttgcgaagcgtcatagtctacgacaaattgtcagtcaaagtttgacgt
ttcattcctaattgtctgcatggtgaccctaatgtcttagatgacaccagtcataacaatcttgcttgacccaacccttgatttgggatgatgtcctgggg
atgcagaacgtgcccgaccctaacccctgtgattatgtgtaacaaggattaagcaacaaacatatatatcttgctatttgtagtatgcaacacatgtca
gttgattacgttcagtaaagccaacttcgggacattattggatgtatccacatgtattattcatcttctgatactgcataacagttttaacatttttgcctcc
tttatggtctgtgctctattgagatgttcgggtagtgtctttatttgttcctatggtgtctgcagtacatctcgactctgtaccacagttttcaagtaaacaa
aaaaggaaactgtttgttgtgaattggaatataatgttgacacatacatcctctgagttccaatgtgtgccctttctaatgttcaatttcttacagaggttg
ccttattatttagtttatttaactatgttagcatatctcttcagcggttgccttattatttagtttatttaactatgttagcatatcctcttgatgcttggtcatc
catgtgactttatgtatgtctgatttttccaGGCTCGATGATCTCATATTAGAAGCTATAAGGAAGCTTAAGGAGCCTTCTGG
GCCCAGTAAGGCAGCCATTGCTGCATACATTGaggtttgtggatatcaatttatggtctgctcacttgttgtattttgtttaacttaagtca
gctgccctgctatttcttttctggtcactgctgcctgtaatgggaatatttgtctcatccttttgtatgaagaatctctgctgattttttgctaacttgagaca
atgctatagtgatttttcttttgccgatggatttaacttaaattgtctatatatttAGGACCAATACTGGCCACCTGCTGATTTTCAGCGC
CTGCTATCTACAAAACTGAAGGCATTGGTTAATTCCGGAAAATTAATCAaggtctctccctccctccttccctgggttgtcttttt
aggaataatagtggggacagacttcaagctgcatactatttgaatcatgattgtAGGTAAACCAAAAGTACAGAATTGCACCAAGTC
CACCCCCCTCGGGCAGAATAGGCACTAAGGTATCCTCTGCTGAAGGGATGAAGGCAGAAAATAATAATGCTAA
ACGTCTAACTAAGCATCAAGTGATTGCTGAGCTGGAGAAGATGAAAGGCATGACCAAGGAAGAGGCAGCTGCC
TTTGCTGCCAAGGCAGTTGCTGAGGCAGAAGTAGCAATAGCAGAAGCTGAGGAAGCAGCAAGAGTTGCAGAG
GCTGCAGAAAATGATGCTGAAGCTGCAAAGGCTTTTCTTGATGCTGTCACTTTATCAATGAGAAACAGGAATGC
TGCTTCCAtggtatagtgaacccagcctttgttagtatataactggtacataaactctttgactgataacaccttttgatgaatgcagATGCTTCG
AGCTTGCTGA
tcctttacccatgcggtcatatggttttggccagcagcacaccggtgtttctgctgcaaagattttctggatatatgagtttcttgtatagccttatttatag
atcacgacgcatcaactggaggggttttcaacatcttgctgttggtcctttgtcgttgcagcctttagacaatcgacaatatgataacaagaattatcgt
gaaagtagtcttcatttatttttgttttcttcccattgtgttgctttgtaattaaactctagccttttcgtttagttagttgaaatggtgtcgtctagtagaact
gtagtgaaaagaaaatgaagaaaatgttagatagccaccttagttattcaatttgctgtgatagtatgtttctagatatgattagctaatgacagcgttt
aagattacattcctgctgtttcgtttttttctatagagattacctcttttaagaaaagcatgtaagatttacaaattcggctcgggacaccaacaaaagta
caaaaaagttcagcacaaatgatgtctcgtttctcatagctatggacatgtttgggatagcttttttctaaaggggttctatgaaaatttaaatttctaaa
taatctagctgtcacaagaatatgagaaaggggcttacatgtggaggaagataaacatagcgatagtttttctcacatagtgacaacttggagtttgc
attcaaattgaattgtcaattttcatagaaaagtgctaagaagtgaagtgtttggttgtctgaaaccgatttcgatggccagaattcatccagaagtcga
aacaaaggggcttgtgtattcttatgtaacttttgtttggtttttggctaattttgctataacttctgtgaagttaatcgatcaaattgaacaactaatctta
gacagaaaaagttaaataaattatgaactaaataggtctttaggctgactccaacagaaggcgtcatatcctatgcaataatagatttagcaaaacta
aaatatgtataaatatttagaagtatagtaaatatgtatagttttatcatgcacatgtttagttctctgccattgtgaaggtgaggctgccaggcctactg
ccatatatgttttgattgtaaataaagttaggcacaatatatcttttttagagcagcacaacttgccacatgtgtggtggcctattttttttggcatcacta
aggctatccctagcagtactcggatgcgaatgtgcaaaatatgcgtgcaatggtgaatatgagcgtgaaaaagtcgcttctagtagagcgcgtaggg
gatacaatccgacaagagagatgcaatatatgccctcctcaaggggtcaagcactcgatcagaactaaaccatcccgcgacactcacaacatgagca
cacagggcaagataaagatgaggataaaccctaaagagatcgtcaacttcaccttggcatgcttccattagtgcctcaaaacatttgctattcttatctc
ttgaagggtcctattgttgtagtgtttctagtgaaagaagtcccacgctagcagcctgccagaaccattaatacatggtttgatccactgaacacttgttt
tatctatgaagagctgagttggcaccacgaggtgtcgactcagctatttgccgagaaataagcagatcctgccttctacccattatgtcgatgattaggt
agagctcttacgtgggctaggttactaggatacgaccgttccgcaatagaacatggcactaatgtatcccctaaggtcccaacacctttttgatcattac
cgaaatcataacatttctatatgataatttgagattgctataccagacctgactaccaagattgaagtctccatgctcgtcatttctagtgcttgcacaac
ctcctggtgccttatttactgacaattcccgggtgtaagccaacacttgcaaatgtcttttatttcattaaaaactacgctttgtcgacatcgtcttttccaa
acattcgggaccttagttgcaagtatgagaaatgttggttttggttaataaaaatgaaacattcatgaacaaataagccattacttaattcaagctccac
cctgaaagtgtcaatataaaagcgaacaactagaagctcccacatcagaagctagaattgacgccacagcataaggcatcgcctccatgctatctgc
acatggccccgaagtggaatggctaatttgcacaatctaggaagctcagccttctccctctcagttcctccgcaatcgacttaccagaatcgttgcggca
actccaattcaacccccactctttcccggtgtctatacttagtgaaagtcgtctagaggggggtggataggcggaaactgaaatttataaacttaaagc
acaactacaagccgttgttagcgttagaaataaaaccgagtccgaaagagagggcaaaaacaaatcaactaagaaataaagcgagtgacacggtg
ttttatcgaggtttcgggttcttgcaaacctagtccccgttgaggtggtcacaaagaccgggcctctttcaaccctttccctctctcaaacggtccgtcgg
accgagtgagctttcacttctcaaaacaaccgggagcaaaacctccccgcaaggaccaccacataattggtgtatcttgccttggttacaagtgagtat
tgatcacaagaaagaatgacatagataaagccatccgagcccaagagctcaaatgaactcgagtatcactctcactctcactagggttatgtgagga
aatggagaggatatgatatcttaggttgtcaaaaattggatgttatagttcttgtagtaattgggaatggatctatttgaatgctatgactggagggatg
gttggggtatttttatccccaaccaccaaatgcgtcgttggcagctattgttcgatggacaccgggtgtcagccgacaaccggcgcccccagcgcgaca
ctannnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnactaaaaataataaaaaaatcagaaaaataaaaaatttataaaaaaaaaaaaataagaaaataggaag
ttaaagaagttaatggatacctatggaatacaaaccctacctttggtttggtttaagtagtggatgtggggcacttccttccgccgcgccatcgcttgaa
caacagttcaattcggcgatgacgcggatctcgctcttcagcacgataaagacacggaaaagatttttgaaatgtcctccctccaaaccgcgcgcaac
ccgtgccgccgaatgcagtgtgttcatgtagaatgcaattcttccnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnaaccaccaattcaaccgttggggaggatgt
atgtcgacgagcgcaccagacagtccggtgtgccagccacgtcacccaaccg
>gene4 (reverse strand)
Best maize hits:
AY664415.1 Zea mays cultivar B73 locus 9009, complete sequence
AY664419.1 Zea mays cultivar Mo17 locus 9009, complete sequence
Best non-maize hits:
AC243120.1 Gossypium raimondii clone GR__Ba0131I15-hog, complete sequence
DQ645537.1 Zea luxurians mitochondrion, complete genome
DQ645538.1 Zea perennis mitochondrion, complete genome
TTAGGGTTCTGACGGTTTTGACCGTTGGAGCTCTGACATCTTGGGGCACCGGACAGTTCGGTGCCGCACCGGAC
ATGCACTGTTCACTGTCCAGTGCGCCTTCTGGCGCTGCTCTGATTCTGCGCGAACTGTCCGCGCACTGTAGCGCTT
TTGCAGGTGTCCGTTGGAGTGGACCATTGCGCTGGAGCCGTTGCTCCACTGGCACACCGGACAGTCCGGTGAAT
TATAGCGGAGCGGCTCTCTAGAAACCAGAAGGTGGCAAGTTTGAAGGAGTGCGGCCTTGGGCACCGGACACTG
TCCGGTGGTGCACCGGACAGCCCGGTGCGCCAGACCAGGGTTCTCTTCGGTTTCTTTTGCTCCTTTCTTTTGAACC
CTAACTTGGATCTTTTTATTGGTTTGTGTTGAACCTTTGGCacctgagaattcatagtctagagcaaactagtcaatccaattatttg
tgttgggcatttcaaccaccaaaatcatttaggaaaaggtttgaccctatttccctttcacttagctctttaaaaagtcttccagtacctccctcgtggcag
cccgttagaagcatgcctcataagaatctttgtggcgacaatcgaagtatcctcactcaattttgaaacaaaccaccttccagtttcgtactgctactac
ctcctaggtcgtcgttctttccattgcttcgggttctttcctatgcttcggagctaccaaactcatagatttgcgatacagggacagttgaccgcacgctgc
acaagaaagcctaaagtgcggcgatgtcataatgtgctcacatcagaagctagaattgatgccacgacataaggcatcgcctccatgctatctgcaca
tggccccgaagtggaatggctaatttgcacaatctgggaagctcggccttctccctctcagttcctccgtaatcgacttaccaaaaatcgtcgtggcaac
tccaattcaacccccactctttcccggtgtccatacttagctctttgaaaagtcttccagtacctcccgcgtggcagcccgtcagaagcatgcctcataag
aatctgtgtggtgacgatcgaaggctcctcactcaactttgaaacaaaccaccttctggtttcgtactgctactaccTCCTAGGTCGTCGTTCTT
TCCATTGTTTCGGGTTCCTTCCTCTGCTTCGGAGCTGCCAAAGTCCTTCGTATTTACATACACATATTTGCTCTCTG
GACCCATCTCCTGGTTCATATTTTCGATTATCCTCTTCAATTTCTTGTTGAAAGCTTCAGTGACGCCATTTTTATAT
GAAAGACCTTTTTTTATTTGTCTATTCCACTCCAT
ttaaatctaataaaggagccaatacaatgaaattccactagtatagaaaaagaggaatagatgcggatgcacatatacatcaatgggagagagaaa
aacacatagatttgcgatatagggacagtcgaccgcacactgcccaagaaagcctaaagtgtggcgacgtcataatgcgcttggtaagtttcggccg
ggacaaaatatatcctaaatttgattcgtttaagacgtgtatggctaaacttttataagtttgatcgttagtatattaataactaagataaggttatatta
aaaatcatgtttatatttgtttcaaatgaaatttacaataacacaaccttattgggttctatggcgtgcatatttttacggtgatataaatatttaaaaagt
attactcgtttgaagtgtcttttctataatagtattaataggttagacttctctataatagtacttagaagtgagtgacttctctaaaaccgtatttcactct
tatttttgcctatttctattttcttttcataatttctatcgcttaaaaagatattttacctagatttcaactccttttaaccgaaccctttcaaccccatcacta
cattcagcttaggttaggatggacattaaaagtgtttatattaaatttgttttacagattccaattaaaaacacaaaaacatttttgttaaaattttaaaa
ttgaaatagctaaaatttgtaaaaaaaaaacaggacaactattagagacctgtacacaaaaatatgggtttctctacgagcaatagaatccatatccg
atagaatataaaagttacgggttgaggaactgcaaccataactcattgagctgttgcaggggtaaccatcctccgtttatatgtataggacctgttgtgc
aatttttatttttatttttaaatttttaacaaaatgttacatgttttcaattagaatttgtaaaataaacataacataaacacttttaacatccatcctaactt
gagctgaatgaagagttggaatctaatcactacaggaaaatggttaaaaaacgtgggtgaaaaatcgtcgaacttaatgtaataggccgtcgaactt
accataagaacgtgggtttaccgacgaatatagccgacggtgatttttggatgaacttagagaagtaagttcgacagccccgacgaacttaacattaa
gaacgtgggtaccgtcgaacttaatattaagaacgtggataccgtcgaactagtatgagcaataaaggtaaaaagatatccttttaaacgatagaaa
ataagaaaagaagatagaaatagttgaaataaggatgaaattctgttttagagaagctcacttgtaagtattattttagataaacctaatctattaata
ctattatagagaagccaatccaaactagaactataatagcaaataactctataaaaacacacacatataaaataattgtattattattattatttgccc
atcttcttctaactaagtctataaatacgaaaactgtctctgttgaactttctatacaaccatcgtattattttatatcaggggcatactaggcccgtacag
attctacgactgcataggccccccaaaatcataaggccccaaaatctacagcaatatatatgcaacaaataaaaaatgttttgcataaaatattggtc
aacgtgactaagaatcacaagaacgttgaaatcaatctattcctctcatttttgatcaagttatttcttcccttataagaagatttaatttaaacagtatc
acgattataaaaaatatttccttatttatctctaatgcattgcaatcattagataaaaagatcttaaaatattgttgtcacaatcttaaggttgcacgtaa
aagagacgaacaatctgatttgatgcaagtgacatatttgatgagttgtttcctctaagattttattccacaagaaagtatgtgccctattgacattctaa
atttttttgaagcaacataattatttttctaatgcaactgatgcaccgaacaaagtttttctaaactaaagttattattgaagtcccatttttattctataac
taggtgagtgcccgtgcgttgcaacgaaaacatataataacacaaaaattatgtataaatgtgtgttatattgatacgagaaaagatagtgaagataa
gctagcagaaatttgcatccaatgttattgagaagtgtcttacttttggttctcatgagcaacgataaattctgatcaatgaaatgcttggtagttggcac
aactgatgagaacgaacaattacatgtttgttttcaacaatgattctcagctcacctttttcttttaaaatcagtagttggcacaaccactctatagcaca
actcaaagcacagaagcagacgcctaagaaacacaacacccgccggtaccagcgcagtgtgttgccgaccggtgacacaatgacacaagacattat
ggtgtcggctaccacctccttggacatcttcaggaacgggtccatgttgtggtggtaccagccattttgcctcactcctatggatgttactctccagggca
ggtgagttcatctttaac
>gene5 (reverse)
Best maize hits:
AY103582.1 Zea mays PCO147975 mRNA sequence
BT053879.1 Zea mays full-length cDNA clone ZM_BFc0001N20 mRNA, complete cds
BT043117.1 Zea mays full-length cDNA clone ZM_BFc0100C04 mRNA, complete cds
EU960258.1 Zea mays clone 222946 SNF4 mRNA, complete cds
Best non-maize hits:
XM_003579593.1 PREDICTED: Brachypodium distachyon sucrose nonfermenting 4-like protein-like (LOC100844830), mRNA
AK367711.1 Hordeum vulgare subsp. vulgare mRNA for predicted protein, complete cds, clone: NIASHv2060P11
NM_001059222.2 Oryza sativa Japonica Group Os04g0401300 (Os04g0401300) mRNA, partial cds
AJ575236.1 Oryza sativa Japonica Group partial mRNA for NF protein
TCACAGCTTGCTGCTCATCGTCGGACCAGTAGCCATCACGTCGCGCATGGCTAGGCCCTCATACTCCTCATCTGAC
TTGGTCGAGTGTTCGTCCATAGAGCCGACATCGATCTGCAAGTACCCTGTGACGCGGAAGAACATTAGCTTTGC
AGTGTTAAGAGGtgctgagaagataaggaaaacaaaaaaccaaattgttgaacactgacataaatcgcccatacctcgtgcatgtgcagcatg
tctcttttttcatcggaggtcttcatcagtttcagcatcacgaagatcgtcaacttaaaatatatatatagatatatgaaggcaccagagtccacctgcca
tgagaaccagacttgaagaggctaaatttaaagatgcacgagttaaaggcaagaagagtactgatatcattggtgaagtggttacactacctatgctt
gccaaccaactttcctatcaagctagtgtgttgccttctgcagtatcacatacagttttcagcataagagttggttcaaatgacttgtttaaaacgtgcac
tgcaggccttttttgggcaaaatgtttccctcaaaataaatgcacaaatataaaagcagtgacaatccgggaggaaacacagcttattcactagccct
gctttggcgacagagatctggtggggtaagtttgttgcccttgcttcaaacaggaagcatgccgagttgagcttctttcttggccctgagatcaatgttgc
ctgtctgtacaaactacagagtggttaaaaacatagttcctcactgaataaaccaaaattgttgatggaatggaagtccaaatgatcaaacatacaat
agaaataggtaattctgaaagttggggaggagaaacccaaacaatttctccttatttcaagacataagatgatttgtaattttttaacacaaacgacat
taagcttaatgttgtattggagaccatatctagaatttcatacaagtaatatatttatattatgaaagtgctgatcttgcaactgcttacacattgaatca
aatgaaagaatcagcaaatatatggagaagtacatggattaatggcgttctctgcatgacagcacCATCAGGCCCACCATAGAACTGTAG
CTTTGCTTCTTTCCAAGCAGAAATGGGCTCTTCGTTGCCAATAACTTGAATGTTTCTCTGCAactgcatagcaaatgataa
ttagtaagaacataaaatgcttccaagaagaaacaaagtaatttattccacccacCTTTCTCAAGATTAAAACAAAATCTGATGCAGT
GAGCATGCCTGTTATGGTTCCCTGACGGTCATCCCAAAAAGGAACCAGAGCAAGACCctatgagacacaaaaggttcga
agctattcttaaattattaaactgatgccacgtatgaagtgcagatgaaagaagtgcctgcttctaggagcgattttgttgtgtttttaaaccgcaaaac
ataataggaaaacaaattaaactgctacttaaaaaggcaaacaattgctagcacaactttaaaaattcttcaaaatgaaagtaaaaacagagcaaa
gttgaaatgagagcaaacaagcaccatagagggaaagtgaactatttgaagcaactctagcaagcacagtaacatacataccACATCATGCAT
TATTTTAAATGCTTGTTTAACAGGAAGCTGAGTGTCCAAAACTGTTAACtgaaatgttcatatgaaatccgattatacacatag
tctaaagagaagtgctggcttgcaagaaataaacctaacCTTGCTAGAAAGGGGAACAACGTCATATATGGTATTGTGTAATA
ATATTCCAGAGACCACATGGCGGATAACTGCTATTTGCATGCTTGGGTTCTGAGATGATGGCTCCAGGGGCAT
ctttgtttatttaaggccacagtaaataacagaccattagttatcaggaatatggagaacaagaaaatataatgttaaacttctaagacatactgttttc
aaaatggtacccttatccatattagttcctctgatagaaggctctggctgcacaacgggtagtacattgttttccacaagcacttcattgctgatcagtcc
atattcatcacgtacaaagggttttgtctcatcacacctccagacaccatcaaccaaaaaccggtactaaattgcacaaaaatgccatataaatgtatg
tggcatagaaaatgtcacaggagaaacataatatcttatgcaaatagctgggcagtggccatttgtacagcaaatatgcactactatagtatcatgttc
aaatagcttgctaaccattgaactgaatcaaaccatcacatgtgaacatcatgatttacagaactcgcaataacaagggtgtgaatttggtacctgtgc
gcaatgttatgtccaagaattacattgatagatgaaaaagcgacacatcaaatacaaaaatccatcatgaatatgtcaaatatatcatgtttcgatcaa
ttgctggaattgaaaaacatattgaagttcaaaatagttgtctggtatatgcaaaagtagttccatgtaaagccaacagttaaggggcattcaagattc
atttagtctaactgcaagtactaattgctggggaagaagaggtgcaataaaaactccatgtacagaaacctgagagacagaagtgtggttgtatttaa
tcatcactcactttacaatggcgttaaatggctgctttctgatgcaacaactgcaatggcattctaccttacaggcattatgtaacggccaaaacatctta
tcgaaaacgaaagagacagtgggatgtctttacaggcataaagattcatgatgcctttatgggggtgatctcgcaaatcaaattgcgaagaatcccga
gctcggttacttattttaggaataaaagcattaacaaacacaattaatttaggaataaaaggattaactaaaaatgcagtcatttgctcctggtattttc
tctgttcagtagatggacaggggctcaccattccactccttgaggcacttgcgcagcgatgtcttcctaaaatgagcgcttgaagtactcgtgataatca
tcctacataccagctccacatgaagtagcatctccagtacgatgaacaagatatagaaattttaggcaataactacatcaagaaactgaaggcaaaa
aacacaggactatgaccgcgaactgaaagcgagcagctcaaaaaacacggtattccaccagtgacccagtcatcacacgctgaaagttccacaggt
gtggtttcacatgtctacatcccagctattgacttcaatgagtggtatcaaatttggtttaaaggagagactagaccagtggaaagccagactggagga
gttctcttctcaatgaaacatctggcgctctggtaggttgtcgagttccacaaacgtgcgggaatcactacaatgtcctagtgccatatcagggagaagt
tgactgcttcttgcggaaattacagaagcacatgcactgcagaaccattccatcgaactgacactatgacacccgctgactaccaatccaatttgatgt
cacttgcaggcttgcagcgcaccgcgtaccatgataacaatccagataaggggctcaccgtgctgatggggtgacgagttgaggcgctgcctgcgttc
tggcccttccaggaggttgatgttgacaacaacattttcttaaacaccttcggcccgcccgcccttccgcaggaacaacacaacaaacaggccacata
ctgtaaataacgcccccacagagaagcagtagataatgcagagcaaaataaaaaacatgagctctgtttgtgttcctcttcaagcgagaggccctaa
cgtcaagcatcggaggccctaacgtcaagcgtgacagcgacgacgtcctccagcatcctcgcgatcctcgtcgagtgggaggaggagaatggatttg
ttgtgggagagggagcgaggcggctgcggataggggatgactacaccgcgcgacgtggatgggcgggtgggtgacgctgccaggcagggcaggcg
taaaggtttagggagaacacaaaagcgaatgaatcaagaacgcgggagagcacaaagcgaatgaagtaacctcaggaatccatatcctggattgtc
gcgccttcgcccactgccgaccaccagcagttccctcccctcctctacctccctccccctgctggatctagccttgggctccacccacgaatccaaaccct
agccccgcagttgtctcggtggaaccaggccatcgcgggggagggagcggtggatccggtgtgcgggcatgtgggagcgatggatccggtgtgcga
gcggtggattcggtggaaccagaactcgatggtgctcgttccacctccactgtgcccgttgtgcccgcggtggagcacctatttcacgtcgccctcgacg
cgcatgcgtcgctggttccaccactcggactcgacgacc
>gene6
Best maize hits:
NM_001174809.1 Zea mays uncharacterized LOC100382044 (LOC100382044), mRNA >gb|BT062978.1| Zea mays full-length cDNA clone ZM_BFc0013I21 mRNA, complete cds
NM_001196777.1 Zea mays uncharacterized LOC100502299 (LOC100502299), mRNA >gb|BT087700.2| Zea mays full-length cDNA clone ZM_BFb0162H19 mRNA, complete cds
EU975938.1 Zea mays clone 506018 hypothetical protein mRNA, complete cds
EU964704.1 Zea mays clone 280953 40S ribosomal protein S17-4 mRNA, complet
Best non-maize hits:
NM_001055203.2 Oryza sativa Japonica Group Os03g0103300 (Os03g0103300) mRNA, complete cds
AB433794.1 Oryza sativa Japonica Group qLTG3-1 gene, complete cds, haplotype: 5
AB433793.1 Oryza sativa Indica Group qLTG3-1 gene, complete cds, haplotype: 4
AB433792.1 Oryza sativa Japonica Group qLTG3-1 gene, complete cds, haplotype:
ATGGGGGAGGGGCGTGGCGGTGGCGGTGGCCGCGGAGTTGGTGAGGCGGGCGTGGAGGTAGACAGTCTTGT
CCCTGGCGTGGGTGAGGAGCGCGGTGGTGGAGGAGCGGATTGCGGATCTGCCGCAGTGGGGAGGCGATTGAC
AGCGGCTCTGGTGTGGTGGGAGCGCGGTGGGGGAGGAGCGGATTGCGGATCTGCCGCGGTGGGGAGGCGATT
GAGAGCGGCTGTGGCGTGGTGGGAGCGCATCCGTGCGATGAGGAGCGGAGGGACAGGATTGTGCGGCAGAA
CATGGTGCACTATTGTGAACGTCTACGTTGAAGGTTTATAG
agtagtagagatgacaaaaaacttatgcattgcgtgtaataccaactatatatcggtaaaattactggagaagttcgaatgtgaaaatttcattgaag
attttattttaaagaacactaaaagaataatgctttttaaaatgaagattatataatatttagtcatcgttattttatgtgcacatatatacatagcattac
ttaatatattttttgctatattttgtacggagcctcccaaaatgcgaggacctgttatatatatatatctctactactccttaagagggtaaggagggcgtc
cacaccctcaccgcggctccgccctccccaccccgttgatctcgctgccaccgcgcacgcccccgtcttcgtctcccccgcgctaatctcgccgcggcag
tgcgtcctccgccccaccttctcgaaaatgtcgtcgcccccataaccgcgcgggccggcaccgcggcgcgcagatctcgctccggctcctcgctggcac
aaaccacacgcgcaccctcgacccatggcatccccaaatcccgcccaggttcctctcccttcccattcgcggacg
>gene7
Best maize hits:
NM_001196826.1 Zea mays uncharacterized LOC100502348 (LOC100502348), mRNA >gb|BT087877.2| Zea mays full-length cDNA clone ZM_BFb0280P11 mRNA, complete cds
BT067383.1 Zea mays full-length cDNA clone ZM_BFc0037B10 mRNA, complete cds
EU975326.1 Zea mays clone 487586 mRNA sequence
DQ244682.1 Zea mays clone 10027 mRNA sequence
BT016603.1 Zea mays clone Contig436 mRNA sequence
HQ140759.1 Zea mays Mu transposon insertion mu1013340 flanking sequence
Best non-maize hits:
XM_002444661.1 Sorghum bicolor hypothetical protein, mRNA
AP006170.2 Oryza sativa Japonica Group genomic DNA, chromosome 9, BAC clone:OSJNBa0042B15
XM_002454373.1 Sorghum bicolor hypothetical protein, mRNA
AF133839.1 Sandersonia aurantiaca papain-like cysteine protease (PRT5) mRNA, complete cds
NM_001070477.1 Oryza sativa Japonica Group Os09g0564000 (Os09g0564000) mRNA, complete cds
AK071733.1 Oryza sativa Japonica Group cDNA clone:J023107H18, full insert sequ
ATGTCGTCGCGCGTCGACATTCGCGTCCCCAAATCCCCGGTTTCTCTTCCTTCCTGTTCGCGGACGGCGCTTCCGC
GCGTCGACGTTCGTGTCCCAAATCCCGCCCCGGTTCGGCGCTTCCCCACATCGAGGCGCCACACCATCGCCGCCG
GTCCCGATAGCGTCGACAGCGCGTGGCCCCATCGTTCGCGAACCAACCTGGCCAAGGGTGCCCTGCTGCTGACG
GACAAGGACCTGGAGTCGGAGGAGAGCCTGTGGAGTCTGTACGAGCGGTGGCGCAGCGTGCACACCGTGTCG
CGGGACCTCACGGAGAAGCAGAGCAGGTTCGATGCGTTCAAGgtgaactctaggcacatcggcgagttcaacaagaggaaag
acattcctacaagctcggcctcaacaagttcgtggacctaacgcagGAGGAGTTTGTCAGCAAGTACACGGGCGCCAAGGTCGTC
GACCCCGACTCTGAGGCTGCCGCTGCCAGGCTCGCCAGCGACgtgtgcgtgtcgtccagcgacgagtcgccgccgcaactggccg
cctctgccgtgaataccatcgtcaccgccttgaaggaccactgagtcgtcaccgccgtgaacgtcgttaatttattctgtattgcagACTGGAGAG
ACACGAAGCCGTCACCGCCGTGAAGGACCAGGGGCAGTGCGCCAGCTGCTGGGCCTGGGCgcggtatcatttccttgct
tgttttacaagatccattttgatggtggccatgtagtcttagctgagtgacgacttgtttttcatagatctgatctgagtttgtcatgatgtcgtgcggcgat
ctgtttgtctgagttcggagacaggttagtattctatcagatccaatggttcattggttttgtcacttggacatggattcggtggtttcaggctcaattgttc
gtctcctgttgtggtacgatcagaaacagatactgcagccgttgcctgcataccaggtgaattgtgtggagaccttcggtccattcttatgttaactgttg
atcacgcttgtggccttctggttgtttgtccttcttaaacttcagatgcaaggagggacactagccgtatgtttgagagacactagccgtgcagcacaat
cagatgcaaggagggacactagccgtccttcttaaacttcagatgctgatctcaattttttcacaaagacactagccgtgcagcacaatcttacggtgtc
tagagcatgaaagctccattttttcacagcatgtgctttgattttatggaggaggcgtatgtttttaccacttactcatgaaggctccaattttcattagtg
catacccttgttttgcagttttcatgggtggtacatgcattttcttccactaaatatcttgaagctatacattgttgtatagcttcccattgagactgcctttt
tatttgaatgtaacagggacccttttgtaatcaccttgttctttgccttgtcctttgaatgggtaaagctagatctgtcaatagaatacacagcctgtttga
gagagctctagctgatgataagttgcaaaagtctgtcatgttgtggatttatccttgcttattgtttaattactaatcactatattttcattttaggtagcac
aggaacaccagttggaaacaaaaatggagaaagcagctcagaatgaagctctgttttgtttttttcacagcaaccaagtgcaactagattaattaatc
tcacacatgagataaagcttaagaaacatggtcttttgccttcgcacaatcattttagtggatcaaagatatgtgatgatcaaaccaaatcaactccaa
attcagatatgggttctttggattatgaccttaccaacaaaagctagactacaaaagtttgttctaactcctaaacgttatgttcttttatgactcctgaaa
gtacagagtgacatatactatgtgcagatttttgtcgtttcttactctctgcttgcacttcatataaacaaattaatggttgtagttcttgtgtttcaacttac
tgcaatgttatatttgaaaaaaagatattattatattcctttccctgcagtattccatgataacctctatttcttcattaggcaccacatttttttggccata
atgctaattattttttctaaaactcaatgttggatatttacaccagcattttcatgcttaaaaatggcattcaaacagcctataaatccaataccacgcct
aaaatgaatagtaaaactataactataagaggtttcattgtgcaGATGTCCCACTCCTTTCGTCCCGTCGAGGGTGTCGAACAATG
AACGTGTTCGATTGTTGGTCGTCTTTGCTGCTGTGCAGCTCGGAGTTGTATGGGCTGCACAGATCGATGGATTCC
tggtaggatcggatcttaaggacctctatctattttgttcatctgacaaataaaaacaagttcagacagttcctaaatcactgaaggagccgcatcaaa
tagagcttgactatataaaaatgttgaaaagttataaaacaatttcaaattttcttctaaataataaatagctctccctctataaaaagaatttagaact
gacattagcttcggtttgtgccaaaattgaaaccgaataagccgtcaaggtattgtggctttggatctagcccttaagcatggtctcttagaactgacat
tagcatctaatctcaggatacacgtactcagtaatctatattttataataataatttgtacatgcacttactagatgttgaattatatcctcagcatgcata
acatcttgtgagaactgagatcctgtcatagttgctgtttgggaagccaatatctcaatgggatcttgagtagtagcatcctcagttactttgaaactgcg
gaatgattttctcctgtccattaacagtttttcaaattttctccattccaatttatcatagaaggcactgtatatgatagcgtacagtttagaaatcaaatg
ttgccgccataataggtaccacatgttgtcactggttgtatgataattacttgatcttactgaccctgcttgttcttcaGGCTGGTAAGCAAGCAC
CACCAGAGAAGTTTGTGATCCCAGCTAAAATCAAGCTGACACTGGAGCCATGGACGTCGAGCTTAAGAGGgagg
tcatcaagaaggcatacgagatggaccttgcccagttgtacaaccagtgacactaccgctacagtttctcaaggataggggtgttgcttagcttgtagc
aaccgataacaacatttatttttgtacttcgaactatgtttgaaagcttgtctttcttttcagactgtggttgactggtaaatcaattgcAGAAAAAAG
CTGAAGATTACAGAAATAGTTTGCAAAAAGAAGCACAAGCAAAAGCTCAAGCATTACACTTTGAAGATGAATT
AG
ctagaaaaagaatgtcggtattgaaaagtcgctttttatatcttcttttagctacttcctataatgtttataatgttgttgcgctgctatgaagctagcaca
caatttaaccaagcatatccattttggcacacagatcatgctccacagaggcgataggatgctgaactcgtgaagataaaagaggcatcttcccttag
gaaagaagctaggcgtgcaacacccgacatctaactattctttcaatatatactgagtactgactcgtacaatatagcttttttactgtctttttgcacac
acttgttacctttttcatatcacttacaaccatattgaatccttgatataaatctgtatcttattatatcttgacctggcccatgtgctattattttgtcagtta
tcttaactgttgcacaactatgtgtttttcgattccacatgtatttaccatttcttttactttgcagttgtatgtatagttaggacaatctggtgactatattta
ttggttcaggattggatctctagcctcacaagatcggcgacttgtatgataaattcacttacagttaagttacctttggtcaactggattctggggcatgc
acttgtagctactggcaacttttagaccatttttgtaatcttttcaattgaaattcgtggctatttgagagcttctctgcgtttcactgatatcttccttatca
ccggtgttgacgcgtggtagtggaaaagacacctaaacaattcatttttgtttggttggcaatgggagatcacacctaaacaattcatagcattaagga
atgcattcgacccttaccaacatagagctgactaacttgcgaaagttagcgaggaaccttccatgacattttgctctaggtattgcttttgacagtgtca
atattacttttgactataatagattttacttgttgtgttccagaataatgtgtataatcttgcataaagtcaggccctgtgtttcttaagagcaaaggactg
agttgtgttgctactttgaacatgaaaataaaatgtggttgtgcactatgtattgcatatctagaattatgtatgaatcatctacctcctgattagataag
cgttctcaattttttaaaatatttaccagatattctattctatgctactcttgtaatttgtttcattcaactatatagggagtacccagagtaaggagcactc
tgccagtttagatgcatggaggggtgttgatgcctttgcgacatcaagcaagtttggtaaacaaactttcagtctagcagattacagagagcactgaga
gaacctgttttatcggtacgacattgtttactcactataacttctcacacagaaaacgacatagtgatctatccatctgagtcaggctatttatttgacaa
ctgatacaaattgatgtagcaacaaagttttccagttttaaaaatatttttgtatgattggaattggagggttcaaatgatgaagggattttagaaatct
ataatatgtatgccttttctttagtctccctgcggaatgaggggctcaacttggttaagagcaggttagctaaatgctctatgagagacggagttctcca
caactttctttggggctctatgcaacagccactcatcaactgtagtaggaggagggcatttatacataagttattgtagtaatatgtttccgttgcaacgc
acgggcactcacctagtatatatatatatactagctgaatgcccgtgcgttgcaacgggaatatataataccagtacactacgataacttatatacaaa
atgtgtgttataccgttataagaaaatgtttcataatcaatttatgattctggccatacataaattttgttatttataatctatttgtttcaccactatattgc
aaccatcagtatcatgcagacttcgatatatgtcacgatttgcatggtctcatcattggagagcacgtttcacacataccggaagaaattccctttacat
cgttagtcatcagacacgtaccaccgtacacttttgcttaaacaaaaaggcaagtgtgtgtttacgaagagaattaaaggcaagccagcacaaaagc
taccccaacggtggcgaggatgacgaactggtcattgttgtcggtcctcctctgcgtcacctctggcgccaagatgacgccacagtcctcgatatagttg
tcgtcgaacgcgcgcgacataccgagtactgatgactcttggctgggctgtaaaacgagtgctcccggggctcatcagcaaggtagtacccctggctg
ttgcaccaccggatgcgctactcctctacatacatcgtgttcaaggacactcatacaacgtaagcaacgaccatcgtctcagcgcacaagaattcatgg
ccagtcagtagcgacttacgtggcaggttgggcttcaggtggacgatgagctggacgacgtgatggcgtcgtcgtcgaatgcggtgccaacaacccg
agagtcgtcgacgttggcgacgaccatgaggtccccctgcttgacgatggacagcgcggagcagccgctctgcaccgcgtccaagcggcggctgcgc
cggagcttgtcgtacatagcggcacatgcggccacgtaggactgtttctagaggtcgaactgatagtcgccaagtttttcttgtcgtcgatgagcgaccc
caacacgagtgcctcctgttaatggtgtttgttcggggttttcaagtaagaaacatgaattcatgtttggcgttggtttataaaaatgactcacaagtcag
atctatggaaaaaatattacgaagaataaatatcacgcatgcaaaaaagaaatttaagttgaaaacattattcaaacaaaagaaattgcatgcaag
gctcttctttaaatactactccctccatccaaaaatataattgaagaatctcggtgatacttatctactacacgcattgtgcaagggtagcaggtggactt
ggggagagatatagtagatgtattttctgttataaatgtaaacataaacacatatgtggtgtagtggtagctactgctattatttgtttgagaggttttgg
gttcgaatccccttgagaccatgtttatttttttaattttagctaggcgtcggctgtacgggggaatgagaatgagctttacagggaggggaatcggaac
gcggggaggagggaatcggaaaggcacacaacataggaatggatgcgcaatggggtagcgactactattacagtcttaataagtagtatatatatat
atatatatatatatatatatatatatatatacgttggtttttaaggtgatctggatatttgagaatcgtgatttaaacatatctctctattcctggataaatc
aagagggtgtttgaatgcaatagaactagtagttagtgactaaaattagttgagacattcaaacaccctagctaatagttcagctattagtttttttgta
aattagttaatagttagatagctatttgttagctagctaatttcactaataatttttagtcaactaacgattatttctagtgtattcaaacacccgggaatc
ttaatccaaataaaaacagacgatattattaatattctttggtctaatatttagacagactattattaggtaggatcagccagtcgtgctgcagcaaagt
attgagtcgaattgtttgaaatatcagcttagttttgcaagattccaaggtccaggtacatcacgtatcacttagttttgcctgacggctgatggtcttcat
acattattagaaggaaacaagcgcgcaacagaaaatcaactagaggccgtcagggacacatatttacattattagactggtccaacggaagtttaaa
aagcaaaatacaactcgggcttgacacctgatcggatggctttgagttagcacgtactacttctacttgatctgatcggatggctttgagttagcacgta
ctacttctacttattgaaggactcgagaaagaagcggttggggtctccttttctgctcaaggtcctcaaaataaataagttgctttgattcatccaacag
atatactgaatgaccaaatgtcaaagctagaatgacgaagcgatgaacgcaatagacacaacaaggtctggcaacaaaccagatcaagggtagaa
acacaactttatccttagtgtttgtattaaaacatatgtatagatattaagagcacaattgcaatttcgtctgggttgcgtcccctgctataaatagatga
acattacccacatactgttcacgtgatcattgaggaaagtaattagagaaagcccaaagtgcaatctcaattattcattctccgaaggttactcttgtat
ttcccttcgcgtgaagccgaaggtacaaatgtaatcatttattcttgtattgctcaaagtatggtcttatcaacatatatataaataaaagaaaggaata
gttaatctcttatcattttctcatcattttaacatgaataccttcgtcatattataaaccttcacttacaaacaaccttcagcgaaggattgattgaaggtg
aagttacatattcactatcattttcgtctaagaactattatcttcaagagaagataatgcttcgaaggacgaatgtccttaatgttcaatattgtgttgcct
tatttttgaatcacagcaattgaaaacaagtaaccaacattggcgctcacctccggtgaactcacttccacaacaatgaccacgatcagccatcgagct
tcgtcactagcaatgacgaagctcatactctctatcacaggcggttcaagcttcgagccagccaacaagaagcaaaaaaggaagcccagcgaagtgt
gcaacatgttggaatgcagggaccctatgtcaaatccaaatggtctcatgttccaatcactttctctcaagaggacctttaactcaaagactacccaca
caatgatgcc
>gene8
Best maize hits:
AC196774.5 Zea mays BAC clone CH201-435B12 from chromosome 5, complete sequence
AC191361.5 Zea mays BAC clone CH201-216O9 from chromosome 5, complete sequence
AC160211.1 Genomic seqeunce for Zea mays BAC clone ZMMBBb0448F23, complete sequence
AY555142.1 Zea mays BAC clone c573F08, complete sequence
AF464738.1 Zea mays cultivar B73 putative gag protein, putative gag-pol precursor, putative transposase, putative copia-type pol polyprotein, putative copia-like retrotransposon Hopscotch polyprotein, putative gag protein, putative prpol, putative prpol, putative pol protein, putative pol protein, putative gag protein, and teosinte branched1 protein genes, complete cds
AF049110.1 Zea mays retrotransposon Cinful-1, complete sequence
AF123535.1 Zea mays alcohol dehydrogenase 1 (adh1) gene, adh1-F allele, complete cds
JF791320.1 Zea mays subsp. mays cultivar Yu87-1 clone 87-1tb1to69k, complete sequence
AY530951.1 Zea mays putative growth-regulating factor 1 (Z214A02.12), putative 40S ribosomal protein S8 (Z214A02.25), and putative casein kinase I (Z214A02.27) genes, complete cds
AF049111.1 Zea mays retrotransposon Cinful-2
Best non-maize hits:
EU053446.1 Mini-chromosome MMC1 retrotransposon xilon, complete sequence; hypothetical protein gene, complete cds; retrotransposon xilon, complete sequence; satellite CentC sequence; and retrotransposons cinful and ji, complete sequence
XM_001545031.1 Botryotinia fuckeliana B05.10 hypothetical protein (BC1G_16418) partial mRNA
AC243237.1 Panicum virgatum clone PV_ABa020-K05, complete sequence
AC243260.1 Panicum virgatum clone PV_ABa103-K10, complete sequence
AC243249.1 Panicum virgatum clone PV_ABa094-D05, complete sequence
AC243250.1 Panicum virgatum clone PV_ABa094-D10, complete sequence
JN800064.1 Saccharum hybrid cultivar R570 isolate scTat_7.1 retrotransposon, co
ATGGTTATATCTTGTGTCATAAAAGGATTCTTGGTCCACAATGTTCTAGTGGATACATGCAGTGCATCGAATATC
ATCTTTGCCAAAGCTTTCAGACACATGCAAAAGCAAGAAGATAAGATACATGATACAACACATCCTCTATGTGG
CTTCAGAGAGAAACAAATAGCAGCACTTGGCAAGATAACAATGCCAATCACCTTTGGTTACATACACAACACAA
GAACTGAGCAAGTTGTATTTGACATAGTGGACATGGAATACCCTTACAATGCAATCATTGGAAGGGGAACATTG
AATGCCTTTGAAGCTGTGTTACACCCAGCATACTTGTGCATGGAAATACCATCAAATCAGGGTCCAATCTCTGTA
CATGGAAGTCAAGAGGCTGCAAGAAGAGCCGAAGGAAACTGGATAGACTCCAAGGCTATTCACAATATAGAC
GAAGTTGAAGCTCACGAGCAACAGAAGCACATAAGGGATAAGGCAGCTTCGGCAGATCAGCCAAAATCCATAC
TCTTATGTGAAGATATAGCTGATCAGAAGGTGTTGTTTGGATCTCAGTTGATCAAAGAGCAAAAAAAGAATCTA
ACAAAGTTCTTGTTCCACAACAAAGATGTATTTGCTTGGTCAGCCAATGACCTATGTGGTGTCAACAGAAACATC
ATTGAACATTCTCTTAATGTAGATCCTACCATCAGGCCAAGGAAGCAGAAGCTTCAGAAAATGTCAGAGGATAA
AGCCAAAGGAGCAAGAAACGAAGTTAAAAGACTTCTCAATACTGGAGCGATCAGATAA
gtcatttacccacaatggcttgctaacactataatggtaaaaaaggcaaatggaaatagagtatttgtattgactttacagatctcaacaaggcatgcc
caaaggatgagtttcctttgcccaggatagactcactcatagatgcagcggccactttagagctcatgagcctgctagattgttattcaggatatcacca
gatatggatgaagaaagaagatgagcataagacaagcttcataacccccaatggtacataatactacctttggatgcctgaggggctctagaatgct
ggaggaagcttcagcgggatgatttcaaaagtgttgaatacccaaattggtaggaatgttcgctatttgctctaaggttcagcacttgtttcgtcttcttc
atcatcatactacaaaatcataagtatatcagcacatgaagaagagaagaatcagaaaacattatggaaacagtttgaaatatacctcatccaaaag
ctcccaagcttcggcaccagcaagatctatgccacctttcatccaaattagtgtgataaatcggttggcaacatttctagcttcggttgacgttgttgcta
tatcatttgaagaaatatcaaaatttggcttcccaacagttttcaggtgggagcatccaaccttctcaagatcaaagatgtgacacaagagggaacca
gtgcacaaaagtcaccttagcccatcattacttcgccgaaggtatctatttctccttctacccacttcagcgctccaggtaagtcgttaggggcataattc
tcttctttagacgtcgtcccgaccgagtggaatacttcgcgtagccagttagagcatctgtagacaatctcaaaataattatcctgaagatttttttcaaa
gtttctactagctttgagttgtctcggtccttcgttttaacttcttttttcaaattgtcaaattcaagttgaaggcgacaaacttcgtcggcatgatctgtttt
aattttctccagctcagttgttaatcgtcggatgtctttatcttgattcgcatttttgtctttgaagcttgttttcctcttcctcatcttcattagccctctgaga
tttggccagatccccgtttaatgactgaattgtgacgtctttttccttatttgcttctcaaagtttttcattttcaatttccaacttagaaataacaaactcaa
ttttttgatcctctggatcttgttgaagctttaaggctttactcagcagaaaactttgcatttttacaaaggttgaaaaaatttaggtatcttactatttttat
tctaagacagaaagtatgatgcataaacgaagctttacttagattacgtaagccaagctaccggcgatatgttgtttccttatgctgcttagctcactttc
gagcttcaagaagccaatattcttcatcaaagtgttaaccactctagccctgtcgcggtcaggaatgcactctagtaaattcaccttccaaagcaaact
ccttcagttcagaaatttcttcagcacttagctccccactagttaggtgtctgaagtcgaaggtggcctctcccaggcttcagtttcttgcaaagacgaag
tggtcgtcccaactttttaagctctacagcctttgaaagtcgcctacaggggggtgaataggcaaatctgaaatttacaaactttaagcacaactacaa
gccggggttagcgttagaaatataaacgagtccgaaagagagggcaaaaaacaaatcacaagcaaataaggcggatgacatggtgatttgttttac
cgaggttcggttcttgcaaacctactccccgttgaggtggtcacaaagaccaggtctctttcaaccctttccctctctcaaacggtcacctagaccgagt
gagcttctcttctcaattaaatgggacacttagtccactacaaggaccaccacaacttggtgtctcttgccttgattataattaagttggaaacaagaaa
gaaggaagaagaaaagcaatccaagcgcaagagctcaaaagaacacaaatgtctctctctctagtcactaaattttttggagtgattccggacttggg
agaggatttgatctctttgattgtgtcttggaatgaagtctatagctcttgtatgaggtttgatggctgaaaaacttggatacaatgaatggtgggtggtn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnn
(50244-10000)
Yichun Qian
The genes predicted by FGENESH are better, but I have also attached the genes predicted by Augustus
for comparison.
Gene Prediction
FGENESH
FGENESH predicted 7 genes. The predicted genes were translated into peptides, and these peptides were
used as BLASTP queries against the Swiss-Prot database. Three of them had significant hits.
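The translation step described above can be sketched in a few lines of Python. This is only an illustration, not the tool that was actually used: it assumes the standard genetic code, and it assumes that uppercase bases in the listings mark coding exons while lowercase bases mark introns, which is how the sequences here appear to be laid out. The function names splice_exons and translate are hypothetical helpers introduced for this example.

```python
# Standard genetic code (NCBI translation table 1), laid out in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def splice_exons(seq):
    """Keep only uppercase A/C/G/T (assumed exonic); drop lowercase bases and whitespace."""
    return "".join(ch for ch in seq if ch in "ACGT")


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing anything other than A/C/G/T are reported as 'X'.
    """
    peptide = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3].upper(), "X")
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# First eight codons of the Segment 1 coding sequence.
print(translate("ATGGCGACCGACAACTCGCCCGCC"))  # MATDNSPA

# Exon 1 / exon 2 junction of Segment 1. The intron is abbreviated here to its
# first and last few bases for brevity; it is not the full intron.
junction = "GACGATGTCAGG" + "gtaaggcc" + "aaggcag" + "GGGTACCCC"
print(translate(splice_exons(junction)))  # DDVRGYP
```

Both outputs agree with the protein sequence listed for Segment 1, which begins MATDNSPA and reads DDVRGYP across the exon junction.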
Segment 1: 62665 - 64698
1 CDSf 62665 - 63676 1011bp
2 CDSl 63853 - 64698 846bp
Retrotrans_gag[pfam03732], Retrotransposon gag protein; Gag or Capsid-like proteins from LTR
retrotransposons.
>ATGGCGACCGACAACTCGCCCGCCGGCGGCGGAATCGACGACGTCTTCCCCGCGCGGTGGAAGAACAACATTCGAGC
TTGCCTCGTCCCCTCCCCCGCCGACGGAGGAGGAGGCGGGGCAACCCAAGGCCAAGCAGGAGGCGGCACCTCGTCGG
CTGTCGAGCGAGTCGACGGTGCCAGCGCCCCAATGGGGGGCACGTCGGGCATCGACCTCGCGTCTGAGACGAAGACG
AGCGCCGTCTCCCCGCAACACGTCAACCCCAAGCAAACGGACGACGCCAACACGCTCGCAAGGGACTTGCTGGGCGTC
ACCCTCGTACCTGAGACGGCGGTGCAGTCTACCCCTGACGTGACTTCGTCACCGCCCGTCGACCAAGAGGTACCGACCG
ATTCCCATCTCGCGCCTTTTGGATTCAGCCTCAACCCCCCAAGCGACTTCGCTTTGGTGGACGCTCTCATAGAGGCGAGT
CCAAACCCTCTGGGGTATCGTATGCGGTCACCATGGGACCGGCTGACGGCCGTCTCAACCTACGGGCCCTTAGGGTCCG
AGGAAGATGACGAGCCCGACTTTAGTTGGGATTTCTCTGGACTTGGTAACCCCAGTGCCATGCGGGACTTTATGACCGC
GTGCGACTACTGCCTTTCCGACTGTTCCGACGGTAGCCGCAGCCTCGGCGACAAGGACTGCGGCCCAAGTCGTGAATGT
TTTCACGTCGATCTAGGGGGTCCCGACGAAGGCAACCATCTTGGTATGCCAGAGAATGGTGACCTTCCTAGGCCTGTGC
CTCACGTTGACATCCTTCGGGAGCTAGCTGTGGTCCCCGTTCCGGCAGGGGGTCATGACCCACAACTCGAGCAAATCCG
CGAGATGCAGGCCAGGCTCGACGAGGGAGCAGGAACACTTGAGCCGTTCCGCCGGGACAATAGGCAGGAATGGGCG
GGCCAACCTCTGGCCGGAGAAGTGCGTCATCTACCCCAGGGCATCCAGCACCGCGTCGCCGACGATGTCAGGgtaaggcc
gccaccggtttccagtggggtcggccagaacctggctgcagcggcaatacttctccgcgcgatgccggagccatcaaccaccgaggggcggcgtatccaggg
agagctcaagaacctcctggaggacgccgcggtctgacgggccgaaagctccgcctcccgaaggcagGGGTACCCCTCGGAACATCGCGCCGC
GACTTCCCGATTCATGCGGGAAGCCTCGGTCCACACCGGCCGCATGCGTAACATAGCGCATGCGGCCCCGGGTCGCCTC
GGCAACGAGCACCATCACCATAACTGTTGGGCCCACCTCGACGAGAGGGTGCGCCGAGGCTACCACCCCAGGCGTGGG
GGACGCTACGACAGCGGGGAGGATCGGAGTCCCTCGCCCAAACCACCTGGTCCGCAGGCTTTCAACCGCGCCATACGA
CGGGCGCCGTTCCCGACCCGGTTCCGAACCCCGACTACTATCACAAAGTACTCGGGGGAGACGAGACCGGAACTGTGG
CTCGCAGACTACCGGCTGGCCTGCCAGCTGGGTGGAACGGACGATGACAACCTCATCATCTGCAACCTCCCCCTGTTCCT
TTCCGACACCGCTCGCGCCTGGCTGGAGCACCTGCCTCCGGGGCAGATCTCCAACTGGGACGACCTGGTCCAAGCCTTC
GCCGGTAATTTCCAGGGCACGTACGTGCGCCCTGGAAACTCCTGGGATCTCCGAAGCTGCCGCCAGCAGCCGGGGGGG
TCTCTCCGGGACTACATCCGGCGATTCTCGAAGCAGCGCACCGAGCTGCCCAACATCGCCGATTCGGATGTCATCGGCG
CGTTCCTCGCCGGCACCACCTGCCGTGACCTGGTGAGCAAGCTGGGTCGCAAGACCCCCACCAGGGCGAGCGAGCTGA
TGGACATCGCCACCAAGTTCGCCTCTGGCCAGGAGGCGGTTGAGGCCATCTTCCGGAAGGACAAGCAGCCCCAGGGCC
GCCCACCGGAAGATGTCCCCGAGGCGTCAACTTAG
Protein sequence:
MATDNSPAGGGIDDVFPARWKNNIRACLVPSPADGGGGGATQGQAGGGTSSAVERVDGASAPMGGTSGIDLASET
KTSAVSPQHVNPKQTDDANTLARDLLGVTLVPETAVQSTPDVTSSPPVDQEVPTDSHLAPFGFSLNPPSDFALVDALIE
ASPNPLGYRMRSPWDRLTAVSTYGPLGSEEDDEPDFSWDFSGLGNPSAMRDFMTACDYCLSDCSDGSRSLGDKDCG
PSRECFHVDLGGPDEGNHLGMPENGDLPRPVPHVDILRELAVVPVPAGGHDPQLEQIREMQARLDEGAGTLEPFRRD
NRQEWAGQPLAGEVRHLPQGIQHRVADDVRGYPSEHRAATSRFMREASVHTGRMRNIAHAAPGRLGNEHHHHNC
WAHLDERVRRGYHPRRGGRYDSGEDRSPSPKPPGPQAFNRAIRRAPFPTRFRTPTTITKYSGETRPELWLADYRLACQL
GGTDDDNLIICNLPLFLSDTARAWLEHLPPGQISNWDDLVQAFAGNFQGTYVRPGNSWDLRSCRQQPGGSLRDYIRR
FSKQRTELPNIADSDVIGAFLAGTTCRDLVSKLGRKTPTRASELMDIATKFASGQEAVEAIFRKDKQPQGRPPEDVPEAS
T
Segment 2: 66287 - 69085
3 exons:
1 CDSf 66287 - 67405 1119bp
2 CDSi 67439 - 67615 177bp
3 CDSl 68270 - 69085 816bp
RNase_HI_archaeal_like[cd09279], RNAse HI family that includes Archaeal RNase HI;
RT_LTR[cd01647], RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which
have long terminal repeats (LTRs) in their DNA copies but not in their RNA template.
rve[pfam00665], Integrase core domain
RVT_3[pfam13456], Reverse transcriptase-like; This domain is found in plants and appears to be part of
a retrotransposon.
RNase_HI_RT_Ty3[cd09274], Ty3/Gypsy family of RNase HI in long-term repeat retroelements;
RNase_H[cd06222], RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a
sequence non-specific manner
RNase_H[pfam00075], RNase H; RNase H digests the RNA strand of an RNA/DNA hybrid. Important
enzyme in retroviral replication cycle.
RVT_1[pfam00078], Reverse transcriptase (RNA-dependent DNA polymerase)
PRK07238[PRK07238], bifunctional RNase H/acid phosphatase
PRK07708[PRK07708], hypothetical protein; Validated
>ATGCCATTCAGTTTGAGGAATGCGGGTGCAACGTACCAACGGTGCATGAACCACATGTTCGGCGAACACATTGGCCGA
ACGGTCGAGGCCTACGTCGATGACATCGTAGTCAAGACGAGGAAAGCCTCCGACCTCCTTTCCGACCTTGAAGCGACAT
TCCGATGTCTCAAGGCGAAAGGCGTGAAGCTCAATCCCGAGAAATGTGTCTTCGGGGTTCCACGAGGCATGCTCTTGGG
GTTCATCGTCTCCGAGCGGGGCATCGAGGCCAACCCGGAGAAGATCGCGGCCAACACCAGCATGGGGCCCATCAAGGA
CTTGAAAGGCGTACAGAGAGTCACAGGATGCCTTGCGGCTCTGAGCCGTTTCATCTCGCGCCTCGGCGAAAGAGGCCTA
CCTCTGTACCGCCTCTTAAGGAAGGCCGAGTGCTTCACTTGGACCCCTGAGGCCGAGGAAGCCCTCGGGAACCTGAAGG
CGCTCCTCACGAACGCGCCCATCTTGGTGCCCCCCGCTGCCGGAGAAGCCCTCTTGATCTACGTCACCACGACCACTCAG
GTGGTTAGCGCCGCGATTGTGGTTGAGAGACGAGAAGAGGGGCATGCATTGCCCGTACAGAGGCCAGTCTACTTCATC
AGTGAGGTACTGTCCGAGACCAAGATCCGCTACCCACAAATTCAGAAGCTGCTGTACGCAGTGATCCTGACACGACGGA
AGTTGCGACACTACTTCAAGTCTCATCCGGTGACTGTGGTGTCATCCTTCCCCCTGGGGGAGATCATCCAGTGCCGAGAG
GCCTCGGCTAGAATTGCAAAGTGGGCGGTGGAAATCATGGGCGAGACGATCTCGTTCGCCCCTCGGAAGGCCATCAAG
TCCCAGGTCTTGGCGGACTTTGTGGCTGAATGGGTCGACACCCAGCTCCCAACAGCTCCGATCCAACCGGAACTCTGGA
CCATGTTTTTCGACGGGTCACTGATGAAGACAGGAGCAGGCGCAGGCCTGCTCTTGATCTCGCCCCTCAAGAAGCACCT
ACGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATGTGGCTAAGTACGAGGCTCTAGTCAACGGGTTGCGCATC
GCCATCGAGCTGGGGgtctgacgcctcgacgctcgtggtgactcgcagCTCGTCATCGACCAAGTCATGAAGAACTCCCACTGCCAC
GACCCGAAGATGGAGGCCTACTGCGATGAGGTTCGGCGCCTGGAAGACAAGTTCTACGGGCTCGAGCTCAACCACATC
GCCCGACGCCACAACGAGACTGCGGACGAGCTGGCTAAAATAGCCTCGGGGCGAACAACGgttcccccagacgtcttctcccga
gacctgcatcaaccctccgtcaagaccgacgacacgcccgagcccgagacaccctcggcttagtccgaggcaccctcggctcagtccgaggcgccatcggct
cggcccgaggcaccctcggctcaacccgaggcaccctcggcccccgagggtgaggcactgcgcatcgaggaggagcggagaggggtcatgcctaatcgaa
actggcagaccccgtacctgcaatatctccgccgaggagagctacccctcgaccaagccgaagcttggcggttggcgcggcgcgccaagtcgttcgtcttgct
gggagacgagaaggagctctaccaccgcagcccctcgggcatcctccagcgatgcatttccatcgccgaaggccaggagctcctacaagagatacactcgg
gggcttgtggccatcacgcagcacctcgagcccttgttggaaacgccttccgacaaggtttctactggccgacggcggtggccgacaccactagaattgtccg
cacctgcgaagggtgtcagttctacacaaggcagacccacctacccgcttaggccctgcagaccatacccatcacctggtcatttgttgtgtggggtctggacc
tagttggccccttgcagAAGGCACCCGGGGGCTACACGCATCTGTTGGTCGCCATCGACAAATTCTCCAAGTGGATCGAGGTC
CGACCCCTAAACAGCATCAGGTCCGAACAGGCGGTGGCGTTCTTCACCAACATCATCCATCGCTTTGGGGTCCCGAACTC
CATCATCACCGACAACGGCACCCAGTTCACCGGCAGAAAGTTCCTGGACTTCTGCGAGGATCACCACATCTGGGTGGAC
TGGGCCGCCGTGGCTCACCCCATGACGAATGGGCAAGTAGAGCGTGCCAACGGCATGATTCTACAAGGACTCAAGCCT
CGAATCTACAACGACCTCAACAAGTTCGGCAAGCGGTGGATGAAGGAACTCCCCTCGGTGGTCTGGAGTCTGAGGACG
ACGCTGAGCCGGGCCACGGGCTTCACACCGTTCTTTCTAGTCTATGGGGCCGAGACCGTCTTGCCCATAGACTTAGAATA
CGGTTCCCCGAGGACGAGGGCCTACGACGACCAAAGCAATCGAGCTAATCGAGAAGACTCACCGGACCAGCTGGAAGA
GGCTCGGGACATGGCCTTACTACACTCGGCGCGGTACCAGCAGTCCTTGCGACGCTACCACGCCCGAGGGGTTCGGTCC
CGAGACCTCCAGGTGGGCGACCTGGTGCTTCGGCTGCGACAAGACGCCCGAGGGCGGCACAAGCTCATGCCTCCCTGG
GAAGGGTCGTTCGTCATCGCCAAAGTTCTGAAGCCTGGGACGTACAAGCTGGCCAACAGTCAAGGCGAGGTCTACAGC
AACGCTTGGAACATCCGACAGCTACGTCGCTTCTACCCTTAA
Protein sequence:
MPFSLRNAGATYQRCMNHMFGEHIGRTVEAYVDDIVVKTRKASDLLSDLEATFRCLKAKGVKLNPEKCVFGVPRGMLL
GFIVSERGIEANPEKIAANTSMGPIKDLKGVQRVTGCLAALSRFISRLGERGLPLYRLLRKAECFTWTPEAEEALGNLKALL
TNAPILVPPAAGEALLIYVTTTTQVVSAAIVVERREEGHALPVQRPVYFISEVLSETKIRYPQIQKLLYAVILTRRKLRHYFKS
HPVTVVSSFPLGEIIQCREASARIAKWAVEIMGETISFAPRKAIKSQVLADFVAEWVDTQLPTAPIQPELWTMFFDGSL
MKTGAGAGLLLISPLKKHLRYVLRLHFPASNNVAKYEALVNGLRIAIELGLVIDQVMKNSHCHDPKMEAYCDEVRRLED
KFYGLELNHIARRHNETADELAKIASGRTTKAPGGYTHLLVAIDKFSKWIEVRPLNSIRSEQAVAFFTNIIHRFGVPNSIITD
NGTQFTGRKFLDFCEDHHIWVDWAAVAHPMTNGQVERANGMILQGLKPRIYNDLNKFGKRWMKELPSVVWSLRTT
LSRATGFTPFFLVYGAETVLPIDLEYGSPRTRAYDDQSNRANREDSPDQLEEARDMALLHSARYQQSLRRYHARGVRSR
DLQVGDLVLRLRQDARGRHKLMPPWEGSFVIAKVLKPGTYKLANSQGEVYSNAWNIRQLRRFYP
Segment 3: 82383 - 88664
7 exons
1 CDSf 82383 - 83722 1338bp
2 CDSi 84124 - 84298 174bp
3 CDSi 84369 - 85018 684bp
4 CDSi 85130 - 85433 303bp
5 CDSi 85920 - 86500 579bp
6 CDSi 86862 - 87035 174bp
7 CDSl 87327 - 88664 1338bp
RT_LTR[cd01647], RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses.
RNase_HI_archaeal_like[cd09279], RNAse HI family that includes Archaeal RNase HI;
rve[pfam00665], Integrase core domain;
DUF4370[pfam14290], Domain of unknown function (DUF4370);
RT_DIRS1[cd03714], RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of
retrotransposons.
RVT_1[pfam00078], Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase
gene is usually indicative of a mobile element such as a retrotransposon or retrovirus.
PRK12829[PRK12829], short chain dehydrogenase; Provisional
PHA03307[PHA03307], transcriptional regulator ICP4; Provisional
>ATGGCGGCCGACAACCCGCCCGCCGGCGGCGGAATCGATGACGTCTTCCCCACGTGGCGGAAGAACGACATTCGGGC
TTGTCCCGTCCCCTCCCCCGTCGACGGAGGAGGAGGCGGGGCAACCAAGGCCAAGCAGGAGGCGGCACCTCGTCGGCT
ATCGAGCGAGTCGACGGCGCCGGTGCCCCCAACGAGGGGCGCGATGGGCATCGACATCGCGTCTGAGACGAAGACGA
GCGCCGTCTCCCCGCAACACGCCAACTCCAAGCAAACGGACGACGCCAGCACGCTCGCAAAAGACTTGTTGGGCGTCAC
CCTCGTACCTGAGACGACGGTGCAGTCTACCCCTGACGTGACTTCGTCACCGCCCGTCGACCAAGACGTACCGACCGATT
CCCATCTCGCGCCTTTTGGATTCAGCCTCGACCCACCAAGCGACTTCGCTTTGGTGGACGCTTTCATAGAGGCGAGTCCA
AACCCTCCGGGGTATCGTGTGCGGTCACCCTGGGACCGGCTGACAGCCGTCTCGACCTACGGGCCCTCGGGTTCCGAGG
AAGATGACGAGCCCGACTTTTGTTGGGATTTCTCTGGACTTGGTAACCCCAGTGCCATGCGGGACTTCATGACCACATGC
GACTACTGCCTTTCCGACTGTTCCGACGGTAGCCGCAGCCTCGGCGACGAGGACTATGGCCCAAGTCGTGAATGTTTCC
ACGTCGACCTAGGGGGTCCCGGCGAAGGAAACCATCCTGGTATACCGGAAAATGGTGATCCCCCTAGGCCTGCGCCTC
GCGTTGACATCCTACGGGAGCTAGCTGTGGTCCCAGTCCCTGCGGGGGTCAGGACTCACAGCTCGAGCAAATCTGCGA
GATGCAGGCCAGGCTCGACGAGGGAGCAGGAACACTTGAGCCGTTCCGCCGGGACATCGGGCAGGAATGGGCAGGCC
AACCTCCGGCCGGAGAAGCGCGCCATCTACCCCAGGGCATCCAACACCGCATCGCCGACGATGTCAGGGCAAGGCCGC
CACCGGCCTCCAGTGGGGTCGGCCAGAACCTGGCTGCAGCGGCAATACTTCTCCGCGCGATGCCGGAGCCATCTACCAC
CGAGGGGCGGCGTATCCAGGGAGAGCTCAAGAATCTCCTGGAGGATGTCGCGGTCCGACGGGCCGAAAGCTCCGCCT
CCCGAAGGCAGGGGTACCCCTCGGAACATCGCGCCGCGACTTCCCAATTCATGCGGAAAGCCTCGGTCCACACCGGGC
GCACGCGCAACACAGCGCCTGCGGCCCTGGGTCGCCTCGGCAACGAACACCCTCACCGCAACCGTCGAACCCACCTCGA
CGAGAgggtgcgccgaggctaccaccccaggcgtgggggacgctacgacagcggggaggattggagtccctcgcccgaaccacccggtccgcaggcttt
cagccgggccatacgacgggcgccgttcccgacccggttccgaaccccgactactatcacaaagtactcgggggagacgagaccggaactgtggctcgcgg
actaccggctagcctgccacctgggtggaacagacgatgacaatctcatcatccggaacctccccctgttcctctccgacaccgctcgagcctggctggagca
cctgcctccggggcagatctccaactaggacgacctggtccaagccttcgccggcaacttccagggtacgtatgtgtgccctgggaactcctgggatctccaa
aGCTGCCGCCAGCAGCCGGGGGAGTCTCTCTGGGACTACATCCGGCAATTCTCGAAGCAGCGCACCGAGTTGCCCAATG
TCACCGACTCGGATGTCATCGGCGCGTTCCTCGCCGACACCACTTGCCGCGACCTGGTTAGCAAGCTGGGTCGCAAGAC
CCCCACCAGGGCGAGTGaggtgatggacatcgccaccaagttcgcctctggctaGGATGCGGTTGAGGCCATCTTCCGGAAGGACAA
GCAGCCCCAGGGCCGCCCACCGGAAGATGTCCCCGAGGCGTCAACTCAGCGCGGCATCAAGAAGAAAGGCAAGAAGA
AGTCGCAAGCAAAACGCGACGCCGCCGATGCGAACTTTGTCGCCGCCGCCGAGTACAAGAACCCTCGGAAACCTCCTG
GAGGTGCCAATCTCTTCGACAAGATGCTCAAGGAGCCGTGCCCCTGTCATCAGGGGCCCGTCAAGCACACCCTTGAGGA
GTGCGCCATGCTTCGGCGCCACTTTCACAAAGCCGGGCCACCTGCGGAGGGTGGCCGGGCCCGCGACGACGATAAGAA
GGAGGATCACAAGGCAGGAGAGTTCCCCGAGGTCCACGACTGCTTCATGATCTACGGTGGGCAAGTGGCGAACGCCTC
GGCTCGGCACCACAAGCAAGAGCGTCGGGAGGTCTGCTCGGTAAAGGTGGCGGCGCCAGTCTACCTAGACTGGTCCGA
CAAGCCCATCACCTTCGACCAGGGCGACCACCCCGACCGCGTGCCGAGCCTGGGGAAGTACCCGCTCGTTGTCGACCCC
GTCATCGGCAACGTCAGGCTCACCAAGGTCCTCATGGACGGAGGCAGCAGCCTCAACGTCATCTACGCCAAGACCCTCG
GGCTCCTGCGGATCGATCTGTCCTCggtacgggcaggagctgcgccttttcacgggatcatccctgggaagcgcgtccagcccctcggacaactcg
atctacccgtctgctttgggacaccctccaacttctgaaagGAGACCCTCACGTTCGAGGTGGTCGGGTTTCGAGGAACCTACCACGCA
GTGCTGAGGAGGCCATGCTACGCCAAGTTCATGGTCGTCCCCAACTACACCTACCACAAGCTAAAGATGCCAGGCCCCA
ACGGGGTCATCACCGTCGGCCCCACGTACCGACACGCGTACGAATGCGACGTGGAGTGCATGGAGTACGCCGAGGCCC
TCGCCAAATCCGAGGCCCTCATCGCCGACCTGGAGAGCCTCTCCAAGGAGGCGCCAGACGTGAAGCGCCACACCAGCA
ACTTCGAGCCAACGGAGATGggtaagttcgtccctctcaacaccagcaacgatacctccaagctgatccggatcgggctccgagctcgaccccaaat
aggaagcagtctcgtcgactttctccgtgcaaacaccgatgtttttgcatggaatccctcggacatgcccggcataccgagggatgtcgccgagcactcgctgg
atatccgagctagagcccgacccgtgaagcagcctctgcgccggttcgacgaagaaaagcgcagagccataggcgaggagatccacaagctaatggcggt
agggttcatcaaagaggtattccatcccgagtggcttgccaaccctgtgcttgtgagaaagaaaggagggaaatggcgtatgtgtgtagactacactggtcta
aacaaagcatgtccaaaagttccctaccctctgcctcgcatcgatcaaatcgtggattccactgctgggtgcgaaaccctgtctttcctcgatgcctactcagG
GTATCGCCAAATCAGGATGAAAGAGTCCGACCAGCTCGCGACTTCTTTCATCACACCTTTCGGCATGTACTGCTATGTTA
CCATGTCGTTTGGTTTGAGGAATGCGGGTGCGACATACCAAAGGTGCATGAACCACGTGTTCGGCGAACACATTGGTCG
AACGGTCGAGGCTTACATCGATGACATCGTAGTCAAGACGAGGAAAGCCTCTGACCTCCTTTCCGACCTTGAAACGACA
TTCTGGTGTCTCAAGGCGAAAGGTGTAAAGCTCAATCCCGAGAAGTGCGTCTTCGGGGTCCCCCAAGGCTTGCTCTTGG
GGTTTATCGTCTCCGAGCGGGGCATCGAGGCCAACCCAGAGAAAATCGTGGCCATCACCAACATGGGGCCCATCAAGG
ACTTGAAAGGCGTACAGAGGGTCACGGGGTGCCTTGCGGCTCTGAGCCGTTTCATCTCACGCCTCGGCGAAAGAGGCC
TGCCTCTGTACCGCCTCTTAAGGAAGGCCGAGTGCTTCACTTGGACCCCTGAGGCCGAGGAAGCCCTCGGGAACCTGAA
GGCGCTCCTCACGAACGCGCCCATCTtggtgcccccgcggccggagaagccctcttgatctacgtcgccgctaccactcaggtggtcagcgccgcg
atcgtggttgagagacgagaagagggacatgcattgcctgtccagaggccagtctacttcgtcagtgaggtactgtccgagaccaagatccgctacccacaa
attccgagtctcatccggtgactgtggtgtcatctttccccctgggggagatcatccagtgccgagaggcctcgggtaggattgcaaagtgggcggtggaaatc
atgggcgagacaatctcgttcgccactcgtaaggccataaagtcccaagtcttggcggactttgtggctgaatgggtcgatacccaGCTCCCGACAGCT
CCGATCCAACCGGAACTCTGGACCATGTTTTTTGACGGGTCGCTGATGAAGACAGGGGCAGGCGCGGGCCTGCTCTTCA
TCTCGCCCCTCGGGAAGCACCTACGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATGTGGCCGAGTACGAGGCT
CTggtcaacgggttgcgcgtcgccatcgagctagggatccgacgtctcgacgctcgcggtgactcgtagctcgtcattgactaagtcatgaagaactcccact
tctgcgactcgaagatggaagcctactgcgatgaggttcggcgcctggaggacaagttctatgggctcgagttcaaccacatcgcccgacgctacaacgaga
ctgcggacaagctggctaagatagcctcggggcaaacaacggttcccccggacgtcttctcctgagacctgcatcaaccctccgtcaagACCGACGACA
CGCCCGAGCCCGAGAAGGCCTCGGCCCAGCCCGAGGCACCCTCGGCCCCCGAGGATGAGGCACTGCGTGTCGAGGAG
GAGCGGAGCGGGGTCACGCCTAATCGAAACTGGCAGACCCCGAACCTGCAATATCTCCACCGAGGAGAGCTACCCCTC
GACCGAGCCGAAGCTCGGCGGTTGGCGCGGCGTGCCAAGTCGTTCGTCTTGCTGGGGGACGGGAAGGAGCTCTACCAT
CGCAGCCCCTCAGGCATCCTCCAGCAATGCATATCCATCACCGAAGGCCAGGAGCTCTTACAAGAAATACACTCGGGGG
CTTGCGGGCATCACGCGGCGCCCCGAGCCCTTGTTGGGAACGCCTTCCGACAAGGTTTCTACTGGCCAACCGCGGTGGC
CGACGCCACTAGAATTGTTCGCACCTGCCAGGGGTGTCAATTCTACGCAAGGCAGACTCACCTTCCCGCCCAGGCTCTAC
AGACCATACCCATCACCTGGTCGTTTGCTGTGTGGGGTCTGGACCTCGTCGGCACCTTGCAGAAGGCACCCGGGGGCTA
CACGCACCTGCTGGTCGCCATCGACAAATTCTCCAAGTGGATCGAGGTCCGACCCCTAAACAGCATCAGGTCTGAACAG
GCGGTGGCGTTCTTCACCAACATCATCCATCGCTTTGGGGTCCCGAACTCCATCATCACCGACAACGACACCCAGTTCAC
CGACAGAAAGTTCCTGGACTTCTGCGAGGATCACCACATCCGGGTGGACTGGGCCGCCGTGGCTCACCCCATGACGAAT
GGGCAAGTAGAGCGTGCCAACGGCATGATCCTGCAAGGACTCAAGCCGTGGATCTACAACAACCTTAACAAGTTCGGC
AAGCGATGGATGAAGGAGCTCCCCTCGGTGGTCTGGAGTCTGAGGACAACGCCGAGCCGAGCCACGGGCTTCACACCG
TTCTTTCTAGTCTATGGGGCCGAGGCCATCTTGCCCATAGACTTAGAATACGGTTCCCCAAGGACGAGGGCCTACAACG
ACCAAAGCAATCGAGCTAACCGAGAAGACTCACTGGACCAGCTGGAAGAGGCTCGGAACATGGCCTTCCTACACTCGG
CGCGGTATCAGCAGTCCCTGCGACGCTACCACGCCCGAAGGGTTCGGTCCCGAGACCTCCAGGTGGGCGACTTGGTGCT
TCGGCTGCGACAAGACGCCCGAGGGCGGCACAAGCTCACGCCTCCCTGGGAAGGGTCGTTCGTCATCGCCAAGGTTCT
GAAGCCCGGGACGTATAAGCTGGCCAACAGTCAAGGCGAGGTCTACAACAACGCTTGGAACATCCGATAG
Protein sequence:
MAADNPPAGGGIDDVFPTWRKNDIRACPVPSPVDGGGGGATKAKQEAAPRRLSSESTAPVPPTRGAMGIDIASETKT
SAVSPQHANSKQTDDASTLAKDLLGVTLVPETTVQSTPDVTSSPPVDQDVPTDSHLAPFGFSLDPPSDFALVDAFIEASP
NPPGYRVRSPWDRLTAVSTYGPSGSEEDDEPDFCWDFSGLGNPSAMRDFMTTCDYCLSDCSDGSRSLGDEDYGPSRE
CFHVDLGGPGEGNHPGIPENGDPPRPAPRVDILRELAVVPVPAGVRTHSSSKSARCRPGSTREQEHLSRSAGTSGRNG
QANLRPEKRAIYPRASNTASPTMSGQGRHRPPVGSARTWLQRQYFSARCRSHLPPRGGVSRESSRISWRMSRSDGPK
APPPEGRGTPRNIAPRLPNSCGKPRSTPGARATQRLRPWVASATNTLTATVEPTSTRGCRQQPGESLWDYIRQFSKQR
TELPNVTDSDVIGAFLADTTCRDLVSKLGRKTPTRASEDAVEAIFRKDKQPQGRPPEDVPEASTQRGIKKKGKKKSQAKR
DAADANFVAAAEYKNPRKPPGGANLFDKMLKEPCPCHQGPVKHTLEECAMLRRHFHKAGPPAEGGRARDDDKKED
HKAGEFPEVHDCFMIYGGQVANASARHHKQERREVCSVKVAAPVYLDWSDKPITFDQGDHPDRVPSLGKYPLVVDPV
IGNVRLTKVLMDGGSSLNVIYAKTLGLLRIDLSSETLTFEVVGFRGTYHAVLRRPCYAKFMVVPNYTYHKLKMPGPNGVI
TVGPTYRHAYECDVECMEYAEALAKSEALIADLESLSKEAPDVKRHTSNFEPTEMGYRQIRMKESDQLATSFITPFGMYC
YVTMSFGLRNAGATYQRCMNHVFGEHIGRTVEAYIDDIVVKTRKASDLLSDLETTFWCLKAKGVKLNPEKCVFGVPQG
LLLGFIVSERGIEANPEKIVAITNMGPIKDLKGVQRVTGCLAALSRFISRLGERGLPLYRLLRKAECFTWTPEAEEALGNLKA
LLTNAPILLPTAPIQPELWTMFFDGSLMKTGAGAGLLFISPLGKHLRYVLRLHFPASNNVAEYEALTDDTPEPEKASAQPE
APSAPEDEALRVEEERSGVTPNRNWQTPNLQYLHRGELPLDRAEARRLARRAKSFVLLGDGKELYHRSPSGILQQCISIT
EGQELLQEIHSGACGHHAAPRALVGNAFRQGFYWPTAVADATRIVRTCQGCQFYARQTHLPAQALQTIPITWSFAVW
GLDLVGTLQKAPGGYTHLLVAIDKFSKWIEVRPLNSIRSEQAVAFFTNIIHRFGVPNSIITDNDTQFTDRKFLDFCEDHHIR
VDWAAVAHPMTNGQVERANGMILQGLKPWIYNNLNKFGKRWMKELPSVVWSLRTTPSRATGFTPFFLVYGAEAILP
IDLEYGSPRTRAYNDQSNRANREDSLDQLEEARNMAFLHSARYQQSLRRYHARRVRSRDLQVGDLVLRLRQDARGRH
KLTPPWEGSFVIAKVLKPGTYKLANSQGEVYNNAWNIR
Augustus gene prediction
Augustus predicted 13 genes. The predicted genes were translated into peptides, and these peptides
were used as Blastp queries against the Swiss-Prot database. Only 2 of them had significant hits: one belongs
to the reverse transcriptase (RT) superfamily, the other to the RNase H superfamily.
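The translation step described above can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code (NCBI translation table 1) and an already-spliced coding sequence; the function name translate is hypothetical and this is not the tool that was actually used.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a spliced CDS into a peptide, stopping at the first stop codon.

    Whitespace is ignored and the input is upper-cased; codons containing
    anything other than A, C, G, T are reported as 'X'.
    """
    seq = "".join(cds.split()).upper()
    peptide = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":  # stop codon ends the peptide
            break
        peptide.append(residue)
    return "".join(peptide)


# First six codons of the Segment 1 CDS listed below.
print(translate("ATGCCCGGCATACCGAGG"))
```

Intron sequence has to be removed before calling the function, since it translates whatever it is given in frame.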
Segment 1: 65858 --- 67411
CDS 65858 --- 67411 1553bp
RT_LTR[cd01647]: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long
terminal repeats (LTRs) in their DNA copies but not in their RNA template.
RT_Rtv[cd01645]: Reverse transcriptases (RTs) from retroviruses (Rtvs).
RT_ZFREV_like[cd03715]: A subfamily of reverse transcriptases (RTs) found in sequences similar to the
intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT.
>ATGCCCGGCATACCGAGGGATGTCGCCGAGCACTCGCTGGATATCCGAGCTGGAGCCCGACCCGTGAAGCAGCCTTT
GCGCCGATTCGACGAAGAAAAGCGCAGAGCCATAGGCGAGGAGATCCACAAGCTAATGGCGGCAGGGTTCATCAAAG
AGGTATTCCACCCCGAATGGCTTGCCAACCCTGTGCTTGTGAGAAAGAAAGGAGGGAAATGGCGGATGTGTGTAGACT
ACACTGGTCTAAACAAAGCATGTCCGAAAGTTCCCTACCCTCTACCTCGCATCGATCAAATCGTGGATTCCACTGCTGGG
TGCGAAACCCTATCTTTCCTTGATGCCTACTCGGGGTATCACCAGATCAGGATGAAAGAGTCCGACCAGCTCGCGACTTC
TTTCATCACACCCTTCGGCATGTACTGTTATGTTACCATGCCATTCAGTTTGAGGAATGCGGGTGCAACGTACCAACGGT
GCATGAACCACATGTTCGGCGAACACATTGGCCGAACGGTCGAGGCCTACGTCGATGACATCGTAGTCAAGACGAGGA
AAGCCTCCGACCTCCTTTCCGACCTTGAAGCGACATTCCGATGTCTCAAGGCGAAAGGCGTGAAGCTCAATCCCGAGAA
ATGTGTCTTCGGGGTTCCACGAGGCATGCTCTTGGGGTTCATCGTCTCCGAGCGGGGCATCGAGGCCAACCCGGAGAA
GATCGCGGCCAACACCAGCATGGGGCCCATCAAGGACTTGAAAGGCGTACAGAGAGTCACAGGATGCCTTGCGGCTCT
GAGCCGTTTCATCTCGCGCCTCGGCGAAAGAGGCCTACCTCTGTACCGCCTCTTAAGGAAGGCCGAGTGCTTCACTTGG
ACCCCTGAGGCCGAGGAAGCCCTCGGGAACCTGAAGGCGCTCCTCACGAACGCGCCCATCTTGGTGCCCCCCGCTGCCG
GAGAAGCCCTCTTGATCTACGTCACCACGACCACTCAGGTGGTTAGCGCCGCGATTGTGGTTGAGAGACGAGAAGAGG
GGCATGCATTGCCCGTACAGAGGCCAGTCTACTTCATCAGTGAGGTACTGTCCGAGACCAAGATCCGCTACCCACAAAT
TCAGAAGCTGCTGTACGCAGTGATCCTGACACGACGGAAGTTGCGACACTACTTCAAGTCTCATCCGGTGACTGTGGTG
TCATCCTTCCCCCTGGGGGAGATCATCCAGTGCCGAGAGGCCTCGGCTAGAATTGCAAAGTGGGCGGTGGAAATCATG
GGCGAGACGATCTCGTTCGCCCCTCGGAAGGCCATCAAGTCCCAGGTCTTGGCGGACTTTGTGGCTGAATGGGTCGACA
CCCAGCTCCCAACAGCTCCGATCCAACCGGAACTCTGGACCATGTTTTTCGACGGGTCACTGATGAAGACAGGAGCAGG
CGCAGGCCTGCTCTTGATCTCGCCCCTCAAGAAGCACCTACGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATG
TGGCTAAGTACGAGGCTCTAGTCAACGGGTTGCGCATCGCCATCGAGCTGGGGGTCTGA
Protein sequence:
MPGIPRDVAEHSLDIRAGARPVKQPLRRFDEEKRRAIGEEIHKLMAAGFIKEVFHPEWLANPVLVRKKGGKWRMCVDYTGL
NKACPKVPYPLPRIDQIVDSTAGCETLSFLDAYSGYHQIRMKESDQLATSFITPFGMYCYVTMPFSLRNAGATYQRCMNHMF
GEHIGRTVEAYVDDIVVKTRKASDLLSDLEATFRCLKAKGVKLNPEKCVFGVPRGMLLGFIVSERGIEANPEKIAANTSMGPIKD
LKGVQRVTGCLAALSRFISRLGERGLPLYRLLRKAECFTWTPEAEEALGNLKALLTNAPILVPPAAGEALLIYVTTTTQVVSAAIV
VERREEGHALPVQRPVYFISEVLSETKIRYPQIQKLLYAVILTRRKLRHYFKSHPVTVVSSFPLGEIIQCREASARIAKWAVEIMGE
TISFAPRKAIKSQVLADFVAEWVDTQLPTAPIQPELWTMFFDGSLMKTGAGAGLLLISPLKKHLRYVLRLHFPASNNVAKYEAL
VNGLRIAIELG
Segment 2: 86898 --- 88664
2 exons
1 CDS 86898 --- 87090 192bp
2 CDS 87304 --- 88664 1360bp
RNase_HI_archaeal_like[cd09279], RNAse HI family that includes Archaeal RNase HI
RVT_3[pfam13456], Reverse transcriptase-like; This domain is found in plants and appears to be part of a
retrotransposon.
RNase_H[cd06222], RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a
sequence non-specific manner
RnhA[COG0328], Ribonuclease HI [DNA replication, recombination, and repair]
PRK07238[PRK07238], bifunctional RNase H/acid phosphatase; Provisional
PRK07708[PRK07708], hypothetical protein; Validated
>ATGTTTTTTGACGGGTCGCTGATGAAGACAGGGGCAGGCGCGGGCCTGCTCTTCATCTCGCCCCTCGGGAAGCACCTA
CGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATGTGGCCGAGTACGAGGCTCTGGTCAACGGGTTGCGCGTCG
CCATCGAGCTAGGGATCCGACGTCTCGACGCTCGCggtgactcgtagctcgtcattgactaagtcatgaagaactcccacttctgcgactcga
agatggaagcctactgcgatgaggttcggcgcctggaggacaagttctatgggctcgagttcaaccacatcgcccgacgctacaacgagactgcggacaag
ctggctaagatagcctcggggcaaacaacggttcccccggacgtcttctcctgagaCCTGCATCAACCCTCCGTCAAGACCGACGACACGCC
CGAGCCCGAGAAGGCCTCGGCCCAGCCCGAGGCACCCTCGGCCCCCGAGGATGAGGCACTGCGTGTCGAGGAGGAGC
GGAGCGGGGTCACGCCTAATCGAAACTGGCAGACCCCGAACCTGCAATATCTCCACCGAGGAGAGCTACCCCTCGACC
GAGCCGAAGCTCGGCGGTTGGCGCGGCGTGCCAAGTCGTTCGTCTTGCTGGGGGACGGGAAGGAGCTCTACCATCGCA
GCCCCTCAGGCATCCTCCAGCAATGCATATCCATCACCGAAGGCCAGGAGCTCTTACAAGAAATACACTCGGGGGCTTG
CGGGCATCACGCGGCGCCCCGAGCCCTTGTTGGGAACGCCTTCCGACAAGGTTTCTACTGGCCAACCGCGGTGGCCGAC
GCCACTAGAATTGTTCGCACCTGCCAGGGGTGTCAATTCTACGCAAGGCAGACTCACCTTCCCGCCCAGGCTCTACAGAC
CATACCCATCACCTGGTCGTTTGCTGTGTGGGGTCTGGACCTCGTCGGCACCTTGCAGAAGGCACCCGGGGGCTACACG
CACCTGCTGGTCGCCATCGACAAATTCTCCAAGTGGATCGAGGTCCGACCCCTAAACAGCATCAGGTCTGAACAGGCGG
TGGCGTTCTTCACCAACATCATCCATCGCTTTGGGGTCCCGAACTCCATCATCACCGACAACGACACCCAGTTCACCGAC
AGAAAGTTCCTGGACTTCTGCGAGGATCACCACATCCGGGTGGACTGGGCCGCCGTGGCTCACCCCATGACGAATGGG
CAAGTAGAGCGTGCCAACGGCATGATCCTGCAAGGACTCAAGCCGTGGATCTACAACAACCTTAACAAGTTCGGCAAGC
GATGGATGAAGGAGCTCCCCTCGGTGGTCTGGAGTCTGAGGACAACGCCGAGCCGAGCCACGGGCTTCACACCGTTCT
TTCTAGTCTATGGGGCCGAGGCCATCTTGCCCATAGACTTAGAATACGGTTCCCCAAGGACGAGGGCCTACAACGACCA
AAGCAATCGAGCTAACCGAGAAGACTCACTGGACCAGCTGGAAGAGGCTCGGAACATGGCCTTCCTACACTCGGCGCG
GTATCAGCAGTCCCTGCGACGCTACCACGCCCGAAGGGTTCGGTCCCGAGACCTCCAGGTGGGCGACTTGGTGCTTCGG
CTGCGACAAGACGCCCGAGGGCGGCACAAGCTCACGCCTCCCTGGGAAGGGTCGTTCGTCATCGCCAAGGTTCTGAAG
CCCGGGACGTATAAGCTGGCCAACAGTCAAGGCGAGGTCTACAACAACGCTTGGAACATCCGATAG
Protein sequence:
MFFDGSLMKTGAGAGLLFISPLGKHLRYVLRLHFPASNNVAEYEALVNGLRVAIELGIRRLDARDLHQPSVKTDDTPEPEKASA
QPEAPSAPEDEALRVEEERSGVTPNRNWQTPNLQYLHRGELPLDRAEARRLARRAKSFVLLGDGKELYHRSPSGILQQCISITE
GQELLQEIHSGACGHHAAPRALVGNAFRQGFYWPTAVADATRIVRTCQGCQFYARQTHLPAQALQTIPITWSFAVWGLDL
VGTLQKAPGGYTHLLVAIDKFSKWIEVRPLNSIRSEQAVAFFTNIIHRFGVPNSIITDNDTQFTDRKFLDFCEDHHIRVDWAAV
AHPMTNGQVERANGMILQGLKPWIYNNLNKFGKRWMKELPSVVWSLRTTPSRATGFTPFFLVYGAEAILPIDLEYGSPRTRA
YNDQSNRANREDSLDQLEEARNMAFLHSARYQQSLRRYHARRVRSRDLQVGDLVLRLRQDARGRHKLTPPWEGSFVIAKVL
KPGTYKLANSQGEVYNNAWNIR
(100001-150000) Norman Best
catgttgcggcgctggagaaagaatcaggttgtggcactgctgctgaatt
gtgatagtgctttgcatcctcactaaaacgtgagccgaatactagagaaa
gtgtatgcagcggcgccgaacgatgagaaattaaatcagaggcattatgg
ggagtttatatgatgacgaaaaatgaaaggaaaaagaaatattaaaaaaa
aagttggaatactattttggagagtaatatcacttataaatactattata
gatgaagattattttaaaaataccataaagaagaagcctattaaaacagg
tactacagtagaaaaaaaactccgaccggacgacggagagaaaggcggac
caaacagaccaaatagacggagtcatttccacacttgtgtgcagacgtca
ctatggggctgtagccgagaagagagagacacggcccaacaggcataaca
gaccggaccatttccagccctcacatgggccaagctgcagccgacaacat
accagatttaatatgggcctacaacagcagaagaccggccccagaaatag
tgaaatactactgctaggaaatcgggacatgatgattgtttgttttctaa
gctcgttcgccctctctcttctctccgtccaacgacggcgcgattcagat
ccaggcatacaggccatccttccgccgccgcggactgcacctaatcttcc
ttctccagcagccgggcgttcccgtcgtcctcgatcgtcccctggactgg
atcgcaacggagatccagctggacctgcggtggtcccttccttttctcgg
taggccgccttgaatgaattcctcgttttctgtaatgcatattctttgct
tactatgaatcctcaaatccgaatcttttatgtggtggatgtgatggatt
gtttcctttgcaagatagcataatgagctcgccagtggagaggtgattcg
tcaggctggtatcaagttcctagtccatggtgcctggatgagcagtagag
101001-102816 Forward Strand
Exon 1: 101001-101557
Exon 2: 102747-102816
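A quick consistency check on exon coordinates like these is to sum the exon lengths and confirm the total is a multiple of three. The sketch below assumes the coordinates are 1-based and inclusive and that the last exon ends with the stop codon; the helper name cds_summary is hypothetical.

```python
def cds_summary(exons):
    """Summarise a list of (start, end) exon coordinates.

    Coordinates are assumed to be 1-based and inclusive, so an exon spans
    end - start + 1 bases. Returns the per-exon lengths, the total CDS
    length, whether that total is a whole number of codons, and the
    peptide length if the final codon is a stop codon.
    """
    lengths = [end - start + 1 for start, end in exons]
    total = sum(lengths)
    in_frame = total % 3 == 0
    residues = total // 3 - 1  # one codon is assumed to be the stop
    return lengths, total, in_frame, residues


exons = [(101001, 101557), (102747, 102816)]
lengths, total, in_frame, residues = cds_summary(exons)
print(lengths, total, in_frame, residues)
```

For the two exons above this gives 557 + 70 = 627 bases, that is 209 codons including the stop, which agrees with the 208-residue protein listed for this gene.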
Identified through Augustus; FGENESH predicted the same first exon but a different second exon. A blastp
of both results found better homology for the Augustus result.
Shows high homology to Sb03g026500 and Os01g0589700 with supporting EST data from Sorghum
bicolor, Saccharum officinarum (sugar cane), and Zea mays. (Phytozome) Maizegdb.org shows
both EST and cDNA data supporting my gene structure.
A splice donor site is present after exon 1 and a splice acceptor site is found before exon 2. No
polypyrimidine tract was identified.
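The splice-site check can be sketched as follows, assuming the convention used in the sequence listings here (exons in upper case, introns in lower case) and the canonical GT...AG intron boundaries. The function names, the 20-base window and the shortened example intron are illustrative assumptions, not part of the original analysis.

```python
import re


def intron_boundaries(seq):
    """Return (donor, acceptor) dinucleotides for every intron in seq.

    An intron is taken to be a lower-case run flanked on both sides by
    upper-case exon sequence.
    """
    introns = re.findall(r"(?<=[A-Z])[a-z]+(?=[A-Z])", seq)
    return [(intron[:2], intron[-2:]) for intron in introns]


def pyrimidine_fraction(intron, window=20):
    """Fraction of C/T bases in the window just upstream of the final AG."""
    region = intron.lower()[-(window + 2):-2]
    if not region:
        return 0.0
    return sum(base in "ct" for base in region) / len(region)


# Exon 1 end, a heavily shortened intron, and exon 2 start (illustration only).
example = "AAACAAG" + "gtatgatcc" + "ttggtag" + "CGACGAG"
print(intron_boundaries(example))
```

A donor of gt and an acceptor of ag indicate canonical splice sites. The pyrimidine fraction only gives a rough indication of whether a polypyrimidine tract is present, and any cut-off applied to it would be a judgment call.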
Protein:
MLLAVEGGGFFSSSATGYSHGLALLLLGRKAEEKPVKVSPWNQYRLVDRETEQVYHLPSAKDQAPGKCAPFVCFGCT
ANGLEVASPPKATSSNALGTGTSQEEASCSANKTLTTSGSISGSERRGCLKSNSKRDSLEHRIVVSEGEEPRESLEE
VQTLRSSMERRKVQWTDTCGKDLFEIREFETSDESLSDDDPENEGFRKCECVIQ
ATGCTACTGGCCGTGGAAGGAGGAGGGTTCTTCTCGTCTTCAGCTACAGG
TTACAGCCATGGCCTCGCCCTCTTGCTGCTCGGACGCAAAGCTGAGGAGA
AGCCCGTAAAGGTCTCGCCGTGGAACCAGTACCGGCTAGTCGACCGGGAG
ACCGAGCAGGTGTACCACCTGCCCTCCGCCAAGGACCAGGCTCCTGGGAA
ATGTGCTCCCTTTGTCTGCTTCGGATGCACGGCTAATGGCCTCGAGGTGG
CGTCTCCTCCAAAAGCGACCTCCAGCAATGCGCTTGGTACCGGTACCTCA
CAGGAAGAAGCATCTTGTTCAGCAAACAAGACGTTGACCACTAGTGGTTC
CATCAGTGGCAGTGAGAGACGAGGCTGTCTTAAGAGCAACTCCAAAAGGG
ATTCTTTGGAGCACCGTATAGTGGTGAGCGAGGGTGAAGAACCGCGTGAG
TCTTTGGAAGAGGTGCAAACCCTGAGATCTAGCATGGAACGGAGGAAAGT
TCAGTGGACAGATACATGTGGAAAGGATCTTTTTGAGATAAGGGAGTTTG
AAACAAGgtatgatccaaaaatcctatctttccttgcatgccattgtggc
ctggaaatatattgtgtttgtggttactaagtctgaacaccctattgtca
attgtcatacatgaatgaatgcacatttctaacttcttctattaatgaat
ccaaaatccaaaataaatgtgcacaacattatctcaattttttgtcatgg
ttatttttgtgctattaggtaaccatcttgtgaggtaacttatcttcaag
ttttcaagtaacaaagataccacaaagttcgagccatcttaatgagtagt
ctccaaccaaaaggttaaaacttaatgccttcttcctcatagtcaaatct
cctctaaatccctatgcacatacttttagatgtaaagacaattccagata
ctaaaaatgtaacatatgcttaccaaaatgttctatattgtgaacatgcg
caagtgcatggagtttgcactagtttatagctcattgaagaagcagtgca
cctgcaccattaaaggggtgtttggtttctagggactaatgtttagtccc
tggattttattccattttagttccaaaattaccaaatatagaaactaaaa
ctttattttagtttctatatttagcgatttatacactaaaaaggaataaa
atgaagggactaaacattagtccctggaaaccaaacaccccctaaatgat
gcaaatatgtgtgtggtctggcttgctgcagaatcctcattactactaaa
ggccccgcagagcttcgggcagctctagctccagttcttccatgcagtat
agggacaagccgcttttagctctcctatcaaaatgaatagaagctggtga
agccattatttttagcttcattagctctagctcctctgcgtgctgtaggc
tgtagcaaggagccggagccatttaacggaacctaaatgttaacactatg
aagaatagtcatatctttgtttgatttgcactcattacatcgtaagccaa
gttccctacaaattaattgacactgattttccatatgcatgttgaaatgt
atctggccacgtttactactcacaaactataggtcacgaagtcagataga
cgtggtgatacatatattttgtgttcttgtgttctactattgctttgctg
tacctgtatactatattctgattgctgccttttggtgtcttggtagCGAC
GAGAGTCTGTCAGATGATGACCCAGAAAATGAAGGTTTCCGGAAATGTGA
GTGTGTGATTCAGTAG(End of Gene)atttcttgtgtgcacaacaagcctgacaggaaca
tctcctcatgggaagcccatgtcctgctgagaaggacatgccatatgttc
ttctgggaaggatatgccacacccagtgatttcattgcaactggaacgat
cgatgacagtcgaaacaaagcaaacggtgatgaatcgatattgtgctaga
ttatttgttggttctacgcttaactagtctgctcggcagggttatagttt
cgatgcccacctgtacaataccaagatcccattcctttctttttgttgtc
atgttttgtttctctgactttccatcacctttttcacgcccgctgtacca
tatgaggctgccctgttacttctgccatggatggtggatccctgacttca
tgctgttcccttttgttttgcccctaacgtgttgagtgatgtcttgttga
cattttcgtccaaattggtaaaattcattcttcgttgcgtaagagtgaca
tacgagggctatttggttgtttaactcaaagtccggcgtactgatgtgga
tctgtctttctgcccctataatgtcgatgagacattaatgtcttctgtga
gaaaatgtatatattggttgctaggtacaattaccttttgacaagttttt
atattataatacagtagtagggccggggcaggaaaatattgaggccttgt
gccaaatctaaagtaggatctatactttaaaaaatcgaatattaacaatc
taaaacttgcaatagcaaatattgtaatggaatatataaagaaacatatc
tatttaattagaataataccattcttttggtatatttttgaaataaaatc
ttctatgatatgtttatattcgatcttctttaacacttcactcttcagcg
atattatagacaaatcattaagtatttgttgtattattttagaacataaa
tttaacttcagtaacttcaatttagagaaactttgttttgaagatgcaac
agttgcaccagaatagtcagcaaaactctatatgcaatacatggattagg
gaaaaaaatcatgctgctttagaaacttcacaatttacaaccgccccata
tttttagttagcatggaattttgaagaaactttaactccatataaagttc
atgggcataaatacctgatttttatccttgtaagggcaacctctagatta
tcaaaaacagggtcttgatcatgtgactgattgctggatgacacgtgtga
ttgtttagtaataatacatatatctgtagtagtagctcttttttacactc
aattgaattctctattatgtatttcttcttatgtttttggtaacctgaat
catattttctagacgtatttatgaaaaacatgattctgatgtgtaacggt
tcactatttgcacatatataaattaaaaaatcaatctatattaatagaac
aatatataatcggtcattgttataaacatattattaaataaatacctaat
aaagctaggtagtcgatgtccctgatcaccaacacatcaataaggatgct
gctgcgtaggttataggcactatggtgttccaaaagatgcatcacgcagg
caagatatcttttaagaggctatgaaaattcattctttggtgcgacgttc
ggtagtttcgttttagcgtagtggtaggataagctaaaaagttaagatag
atctgctaattttttgtaccctcatttttaggccatgcgcaagtgcacct
tttgcaaatgcataggcccgggcctgtacagtagtaaacgtgctcatgtg
ttgtaaaggaatgctcaacttgttatctcgtcacttgaatatggtcaact
taatgaattgagctagggatttatacaataattatgacatttatttgaat
atgtctagtgttctagcttatgaacatgtattctaagtccagaggatgga
caacgcatcgagatgggaaggtgtgtggggaaaaaaatcttttttctact
atattaaagcaaaatttccacggtttcatggttctacacgccttcatctt
tttccgctttatgcaaaaacaacataaaattgtgcacaaataggtgttca
aaccttagttgttggctccacatttatatccaccaaaccaatagaacaca
catatttttatgatttattaaaacaaattctacccatatgataaattgaa
actatagctgttgggggactacatctgcataggttttgtttaagagttta
attattggcctagtttagagcacgaggaatagaagtcctaattcacccct
cttaggcatacatgttcctttcacgggatacttattgtcttcaaagctag
gaggttgctcaacattgacatcatccttgattggtccattaagaaatgtg
ctcttaacatccatttgaaatatcttaaaaccatggtcagtagcatagac
taataatatgcgaatagatttaagcctagttataagtgcaatggtctaat
caaagtccaaacatgtgaattgggcataaccttttgccataagtctagcc
ttgttccatgtcaccacaccatgttcatcttgttttttgcggaacacaca
cttggtttcaacaacattctaatttggatgtggcactaggttctatactt
caatccttttgaagttgttgaactcttcctgcatggccatcacccactct
ggatccttcaaggcatattcaatcctgaaaggctcaacaaaagacacaaa
agagtaatgttcacaaaaaattgcaattctagaacgagtggttactcgct
tgttgatgtctccttaggtgtggcacttggcttgttgttagtggtggtac
attatcttcatcctccttgtcttgtgcttcctgctctcccccttgatcac
tgccaccatcttgagcttcttgctcctcgtcctagcttggggttgtgcct
aagtggaggatggttgatcttgcttttgtggtcgcccatctccaatggac
atgtttcttagtgtagtgcacgaagcctcctcgtcatctatctcatcaag
atcaacttgctcccattgagagccattagtttcatcaaacacaacgtcac
aagcaatctcaacacatccagatgatttgttgaagactctatataccttt
gtgtttgaatcataaccgagtaaaaaccttctacaacttcaggaggaaat
ttagaatttatacctttttaccaaaatataacatttgctctcaaagactc
taaaatatgataaattaggtttgttaccggtaagaagttcatatggtgtc
ttcttgaggagtcgaagaagatagaacctgttgatggcatgacaagctgt
gttgatagcttccgcccaaactgatctgatgtcttgtactctttaagcat
ggtcctcgccatatcgattagagttctattcttcctttccattacaccat
tctgttgtgatgtataaggagaagaaaactcatgcttgattctctcctcc
ttaagaatccttgtatttgagtgttcttgaactccgttccgttgtcgctc
cttatcttcttgattttcaagttggactcattttgagctcttctcagtaa
cctcaagaatatatttaatgtcgcctcttatcttcttgtatttatatata
gtctaactgaaaataaagagggatgttttttaacctacaattgcttaaat
tagttcgcggatgccaaattcaagacctaaaagaataataagggcttgtt
tggaagcaccacaatttctaagaaactagtttctattcctagtatctcta
gtttatttatataaaaaactgagtgtgcatgtttaggatctagtttctta
gaaaacaacttctagcttttcgaaacacaaatacaagtttattcttactt
tattttaaaattaatcgattcaattatttccttcatttttgttgctaact
ttaaatatagcttgaactcatattttttaatttaaaattttcaaatcata
attaaacattcaaatcatgcgtaagggaatcacggcctatgtcctgggag
caatgggaatcaacgtaatattcatcaaaccaactaaaggtagtacataa
tttgtctagcaaaaactgatttttagctagggaaaatcagcttctagatt
tctaaattattggtttcctagaaactggaaattcagtttctaaaaactta
agtataaaatggtatgtttggattcatctcattttttataaaccaatttt
ttttaaaaaaaaactcggtgctccgggtctgcttggtgagattataatct
gaccagattatataatctaacaaattttgaactaactatataatctggac
agattataatcccaaacaaacaccctctaagtcttctataagcactaggt
tcctactatcttagcatccaaaacctgctaagacaaagtgtgacatacca
ataccagctgaataatcgtataggaaaactcaattttacaagctatcgtt
gctttgatgtagtagtaactagtaaccgtaaaagaaaagcggtactacaa
gcagcccatgcgacagggacaatcccatcagaatgtctgattgtgctgcg
cccatgttgagtccctgcaaacctcacaaactgaccgttaggagacagca
agttcctggccccacaatcctatccacaaccacaacgagagcgtaggggt
gtgtgtgtgtgtccaaaccttctccacgcgttccttccttccctccctcc
ctccctcccctcccaagaaacacaagagccaggcagcgcttgcgaggctc
aagcgccgcagca
107614-108618 Forward Strand
Exon 1: 107614-108056
Exon 2: 108504-108618
Augustus and FGENESH predicted exactly the same gene structure. Nucleotide blast (nr) showed a
plant light-harvesting domain (PLN00014).
Shows high homology to Sb03g026510 (2e-91) with supporting EST/cDNA data. Also has supporting
EST data from Saccharum officinarum (sugar cane) and Zea mays. (Phytozome) Maizegdb.org
shows both EST and cDNA data supporting my gene structure.
Protein:
MSLAPSIPSIKVKVGPVSVAPPHRACRSFAVVRSSKAEGPIRRPAAPPLSPPPKTPALSTPPTLSQPPKPAAPPTSS
ESTPPSPQPKAAVATAPAAALQRPLAGAVTLEYQRKVAKDLQDYFKKKKLDEADQGPFFGFVPKNEISNGRWAMFGF
AVGMLTEYATGSDFVQQLKILLSNFGIVDLD
ATGTCTCTGGCCCCGTCCATCCCTTCCATCAAGGTGA
AGGTGGGGCCCGTCTCGGTGGCGCCCCCGCACCGCGCATGCCGATCCTTC
GCGGTGGTCAGGAGCTCCAAGGCGGAGGGCCCCATCCGGAGACCCGCGGC
GCCCCCGCTGTCGCCACCGCCAAAGACGCCGGCTTTGTCCACTCCTCCCA
CCCTGTCGCAGCCTCCCAAGCCCGCTGCTCCACCCACGTCGTCGGAGTCG
ACACCACCGTCTCCTCAGCCGAAGGCGGCTGTGGCCACAGCCCCCGCGGC
AGCGCTGCAGAGGCCGCTGGCTGGGGCTGTGACGCTGGAGTACCAGAGGA
AGGTGGCCAAGGACCTGCAAGACTACTTCAAGAAGAAGAAGCTGGACGAG
GCCGACCAGGGCCCATTCTTTGGGTTCGTGCCCAAGAATGAGATTTCCAA
CGGAAGgtacgttcgcggtttggccccctgcaagatgcatcgaaatcatg
aactacaaagccttttcaagtagtccagtatggtttgtgatgagattttt
cgtagtactttctactatttgaaggttttttatctagcctgagcatttca
gacagcatggtccttaatagcaatcaatcgatgagtcgatcaaacattgg
gcttgagattgattaatcagttggaaatgacatttttccaagtcatctgg
aacaagaccttgtgatgttaccacagactattatgctccccttcacgcta
ctcaatctactcgttctaactaaatatgggtccacaataggcatgctgaa
aaaaatacacgatgggtagatgcaaagattcaccacgttcttgagtaggt
taaggaagctcatctctatgcgctaattcatcagctttctacaaaccttg
cagGTGGGCCATGTTTGGGTTTGCAGTAGGGATGCTAACAGAGTACGCAA
CAGGCTCTGATTTTGTTCAGCAATTGAAGATCCTTCTTTCCAATTTCGGA
ATTGTGGACTTGGATTAA(End of Gene)taatgccgggcttggtgttttcgttatgcata
agtcttttaaatgtaatgtacttaattgatatggcatatagaaaatttta
tcttgcaccttgtttagcttggtttcgtttgtgctgccctgtttggatac
acatgaataccaatttccagtttccttccgttaggtaccttgatttacac
gggagaaggacaacactgtaggattgaccgagttgcagtgtgggcaaaaa
ggattatctttaatcaagttgccaaatgccaataatagtatatgcttcta
gataaataacaacggatttagattggtattgctatattatgatgtgaata
tgtcattgttctatcctggttcatcaatctagcattgagcaacaggatca
taagacaaaaggagatacttaatttgaaagcacttaaaaagatgcagttt
caaaaggtaacaccctgcatgcggaagttatgcatgatggcgttaaccaa
gctctgcaattgctttttcctcctgttattgcacaaagaatatcaatcac
tacaagaaactgataaacgttcgtgggttttgttaagaacgtgggtaccg
acgttcttatattaattaagttcatgggtttatttaagaacgtgggtacc
cacgttcttacattaagttcgtgggctctaaccaaaaccgtcgaacttaa
ccttttaaaaacgtgggtacccacgttcttaaattaaatacgtgggccgc
gtggacaccgtcgaacttaacccaacctaaatccctaatcccgcgcatct
ctcttcctatctcctttccggcttcccaccagtgctcgcacccgccgccg
gccgccagcccgagcacactgcgcgcccggccgccgccacccagcgcgtt
gcgccgcaccaccaccccgctcgcgcgctacctcgctcgctgcgccgcgc
cgcccgctcgacagccccacgccgcctggccgcccgcgctgctcctctcc
tgagatgggtctgagaaagagaaacgccatggttggtgaggcagccatgt
tcgccacatcttggtagccctcactcggcattgacatcctctcgttcttg
gacttgaagttaactcagccatgaggaagcatgggtggcagctgccgtac
caccctctccaggtataatccacgtccactcacctgcgtggggttgtgcg
ctcgcctttccctcctcctcgcagtgttcttgccgatcctagcatttgcc
tcacgcgtcgcgctctgttatagtgttcagcgactgactggctgactggc
gggcgagctgcaggtggttgcgatagccgtgttctcggcactggggttca
ccttctcgtcttcttcgtgccgttcgtggggacgaagccgttccagattg
tggccatggctatctacactccattggtgaggttcctgcccttttttcag
gggtttaattcgctagtatgcatagatctggagttcggtatttgtttcgc
tgccagccatcccgcgctttacagttcttggtacttactacgtcaggcat
tgcaagttcaaacgctggtagagaacgtcatctgctttgcatgtgctgcc
catatctgttgtaaaggaatgatgatatgacatccgttagactatatagc
atcattgctcacttgagcgacttgacaatttaaccttgagaatgattaat
gtcttctcttgcagattacatgtgttgttgtgttatacatatggtgtgcg
gaaacaaatcctggagacccaggcattttcgattcaacgaagaatttgaa
gtaagataagaatgaaaagcactcctatgtgaattcggatcaggggatta
atcatggagaaggccattaagtgagacttttggtactgctgataacagtg
agaagctgagcagcatgcttgagaggtaggaagctacactgtccctatga
agttattcatgatgacagtgtcaaatgatttggcgagtgttatttttcgg
gtaaaaaattatttgtactgtcaaaaaaatactggcaacgcttctttgtc
aagtatattttttttttccgttgccgagtgttttttcgacactagacaaa
gagttttttgccgaatagtttttttacactaggcaaagagcttctttgtc
gaatgttttttctcgatactaggcggagagcttctttaccgagggttttt
ttcgacactagacaaagataatttttaaatcaacttttttgaaacagtaa
attaatttaaataaaaaagttttaaactacaaagttgtataactcatcac
gatgtacaatatttattttagtcatttcttcatatgacaaagttaaagta
atttttttcacaaaacttatatacctctcatgttgtttatgaaactacaa
cagaggtgtataagatttgtaaacaattttagaatcaccatgttggatga
ataaataaccaaacaaccaaaataaattttgtaaatcttaagaagttata
gagttttgtagtttgcaaccttttaatttgaggtcatcttctcgacgaaa
actatgtctgaactcaaaaaaatcgaatttgtgaaacaagctccaatgaa
aaaacaaccaaaatgatagttgtgggtctcaaaaagttatgaaactttgt
aattgacaattttttgatttgaaatcatctgatcgtgcaaaactatattt
agatttaaaattcgaatttttcaaatgacctcgatgaaaaaacatcaaaa
taaaagttgtaggtatcgaaaaagttatttaactttatagttgacaacgt
tttgatttgaaatcatcttatcatataaaaatatgtttgaatttttcaaa
tttaaaatttaaattttgtaaacgatctcaaataaaaaactactaaaatg
aaagttgtaggcctcaaagagttatgcaactttatagttaacaacttttt
tatttgaatttatttagtgtctaaaataatcaatttactcttggtttggt
agaggaggttacgcatgagggagaggttgcgagttcaaatcttactgacc
acaaaacacatgaattccattcaaaatatggtgaaagcaactaggaaaaa
acactcattaaagagtcatttgccgataatttttttaccgagtgtaacac
tcaataaaggatttgtcgagcgtaaaaagatctttgctgagtgttctgga
cactatgcaaataaggtgagtccgatagtaattctaaataacaaaagaat
caatgttttatttcagacatttattaagacatgataaagcaaacaatgtt
tttaaggtcttaccttctcatgaggtgaacgcttcgaaactttaaagggg
tgtttggtttgaggaatcacttcatccaaaatgtggtggtgcatcatgta
ttcattcctcaaatttggtgggatgacctcattcctcacattagtactaa
ctaaataactataaggaatgaggtgatgatggatcaactcaatccattcc
ataaaccaaacaaaaaagtgaagagtgagaagatgatggactagctcatt
tctcaaaccaaacaccttataagcatctgatcaagactgtagacaacaga
tgaaa
112206-115963 Reverse Strand
Exon 1: 115759-115963
Exon 2: 112455-112528
Exon 3: 112206-112358
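Because this gene lies on the reverse strand, the forward-strand sequence has to be reverse-complemented before it can be read in coding orientation. A minimal sketch, with hypothetical helper names:

```python
# Complement map for upper- and lower-case bases, so exon/intron case is kept.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


def codons(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


# First 18 forward-strand bases of the sequence listed for this gene.
forward_start = "TCAGAACTCGTATACAAC"
coding_end = reverse_complement(forward_start)
print(coding_end, codons(coding_end))
```

Read in coding orientation, these bases become the last codons of the gene and end in the stop codon TGA, which is consistent with the protein below ending in VVYEF.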
Both Augustus and FGENESH show a possible gene at this locus. Augustus shows two possible gene
structures, but based on blast and homology to Sorghum bicolor I believe that transcript 1 is more
likely correct. BlastX gives a hit to a coiled-coil domain containing protein 94 (6e-12). This is a
relatively low E-value. Reading about these domains, it appears they are more common in animals,
and transposons in rice have been shown to contain them. Nucleotide blast has only one
hit, to Sorghum bicolor, and this is a putative protein. I therefore believe there may be a
transposable element at this locus, although it still has potential as a candidate gene.
Protein:
MGGGEPAEQQKTDERRAVTRSSSGDDNDFCSWGPNNGSTDLRADAAAKRPPPMPKLVRQGRAEGRRVGDPQNSDYTV
GSGAGRNLDPWREKDEQDAVMGDAMKVLENRAMDSKQDMDILAALEEMRSMKIGSSSKPLAVVYEF
TCAGAACTCGTATACAACAGCTAGTGGCTTAGAACTAGACCCAAT
CTTCATGGACCGCATCTCTTCCAAAGCAGCAAGGATGTCCATATCCTGCT
TTGAATCCATTGCCCTATTTTCCAACACTTTCATTGCATCACCCATCACA
GCATCTTGctgcctgaataaaaaaatattcatcacggtgatcaaattgaa
gattatatgcatgggatcgaatttgtcctttgaatatgccccaaaatcac
ttacCTCGTCCTTTTCACGCCAAGGATCAAGATTGCGACCAGCCCCTGAT
CCCACTGTGTAGTCAGAATTCTGAGGATctgttttgaatgcgacctcagc
aaagccctcagcacacttgaagtaaaacttaaatactctctccgttttat
tttagttatcgctgaatagtgtaaaattgaactatccagcgacaactaaa
aagaaacggagggagtatttgtattcccaactatgtctacgcccaaaaga
agggaatagaattatcatgagttagaatacaatcagtaatcaaaacacgc
actagttgatacggatatgaatttccacagttccaagactgacaaccaaa
taaaacattatgccagaccataaaatgatgaaaaataaattcaaaccatg
gctgtaaattcaaaagcacaaagatgcctatgaaaaagatattacggcaa
gcgctgaacacaaagaactgttgtctcgaacactgcacaacatagaagtg
taggcagagcaatgtggtggctcaggaacaacattaatacattctggagt
taggctccgtttcaatctcacgggataaactttagcttcctgctaaactt
tagctatatgaattgaagtgctaaagtttagcttcaattactaccattag
ctctcctgtttagattataaatggctaaaagtagctaaaaaaaagctgct
aaagtttatctcgcgagattggaacagggccttattcatgtagtcatgac
gaacaaaagttgctaggttgagcagcccatggactgaaaagataacacct
ataaatcagtaagtcgaaataataagtcacgttgtctcaacatagtaagt
ctcatactcagataaagaaaggtaatggtagtcttgacaagccaatatta
aacatagtcattctaatagagatgaaacccaaaagaaatcagttaaggtg
taaccctcttagcgacgcaccatatcgaaacccagatatggtatcaaatg
ggcaaaggtcgggtcgacactttcttggtgacgtgtcgtgccgtgattag
gacacgatgtcaagttatcagaatcaggttgtaagattcttattggcacg
ctacatcggtacccgagtgtagtgaaaaatatgcaaggtcttcacataat
ctatggatggacggataagaaagctactcgaaccaactaggatccattta
ggtagttggaatataagattgcttactggtaagttaagagagttaattga
gacaacgactaggagacttataaatatcttataagatcagaaggcggagg
tggacgatatcgacttcaagctttggtacacaaggacagttgtaaataga
aatggagtagttttgattgataagatcctcaagaatggtgtgtcggttgt
gataatgcaaggagataagattattctagttcagcttgtcatgggtgatt
tggtcttgaacgtaattagtgtatatgcccccccaagtagttcacgacga
gaatgctgagagacttattctagaaagacttagatggcatgattagagat
gcacctattagtgagaagttttcataatagaagaactggacctgcggggg
gtggtaagacagtctccaagtgttacattaagaatatcttctcatgcagg
tcgagataacccctaaactcttgccctacctgtcggtgttaagtaccagc
aacacactacggaggtgggcgcatagtgcttttctaaggtaggtgacgag
gatcacaactcgatggcacgtgtagaacacgtcggagacacaagaaattg
ctcggacgctcgatggtacgtcctgtggtttccgtctgtgtgcggttgac
ttctctctggagtacaggggatgtctagtggaagaagatgagttctatga
gtgtccatctcatttttgtagggtcccttcagcataattgccgactgaat
tggggttacagagaaactaccatgagtccacgatctccatgttccgcttg
ttctgtctctcgttgcgctgggctacggtcaaacgagtcatccaccgatt
tctgtgcgccgcacgtacaagaagtgactggttcacgtacacgcacctgc
accttcctcggactccaatagctcccattcgaccgaggcccatctgtcgg
cgtttcgagaccggggggtccctgggccgacgagtgaatgtcgccgcgtg
ccccagcccagatgggtcgagcgcgagggcgagcgcgaaggggggagagc
gaggcggccggagaccggcgtgagagaggtgggaatcccgcggccttcgt
gttcgtcccacgcccaggtcgggtgcgcttgcagtagggggttacaagcg
tccacgcgggagagggagcgagcggctccaagcgagcgcctgtctcgtcc
tcgtccccgcgcgtccaaccctctctaagagggccctggtccttcctttt
ataggcgtaaggagaggatccaggtgtacaatggggggtatagcagggtg
ctacgtgtctagcggtggagagctagtgccctaagtacatgccgttgtgg
cagccggagagatttgggcacccagctggtgtgatgtcgtggccgtcgga
ggagcgatggagcctggcggagggacaactgttggagcggttgagtcctt
gctgacgtcctcttgcttccgtaagggggctgagagccgccgtcgtcaca
gagtacgcggggcgccatcattgcctatctggcggagctagccagatggg
acgccggtcttgttccctgcggcccgagtcagcttggaaggcgaccatgt
cttcgcgctcccttatgcatcgtgcctttccaccttccaagcccccggac
gagggatacccgccgtctttccgcctcatcgttggaggaacgcaactccg
tgggagttggtacctttcagccgtcgttcggcttcaaggattttcatcat
gcagcccggctgcacccctccgccggcggtcacccaagatggtgacctcc
agtttgatggtgggggaaagcgagccgggctgcggcctctgcccctccct
cagcctcaaggattttcatcaccagggctggggaggggagtgtgccgagt
tggggtcggcccctgcgtgggcggtggcccgctccttccctcagtgatcg
gagggatgggcggtcgtcgtctgtggcgccggcagctgcagcgtgcctgg
ctctcgggcgcgagtggcttcggagccgctcgcggcgccggtgtctgcca
cggcagctggaagaggttcttccaccgacgagatagccagggccgcccac
ggaccgacCTCCAACTCTACGGCCTTCTGCCCGTCCTTGCCTCACGAGTT
TGGGCATGGGCGGGGGCCTCTTGGCAGCAGCATCCGCCCTGAGGTCAGTG
CTGCCGTTGTTCGGCCCCCAGGAGCAGAAGTCGTTGTCGTCGCCGCTGCT
GGAGCGGGTGACGGCGCGCCGTTCGTCGGTCTTCTGTTGCTCCGCAGGTT
CCCCCCCCCCCAT(Start of gene)cgagtggggttgttcgtacctgcggaggtggaaccgg
agttccgtttgtaatggcactttgaatgccagtgtttttgttcattgtgg
ctgtcgaggcctgaacatatatgtaattttggcatggagccgtgtttttt
cctcattttcgagcactgagactcgcctgttggttgtctgaaccgcttca
ccaagcgtgagtcgccccgtgtcaaggtgacgagtgaggtatccgtatcc
cggagacgttggagtccctcggctcggtcggccttgttgtccgaggcttc
tctagcttagttaaagggaccccttggccgctcttcgatgagccgaggcc
aggggtagcggtatcagcatgaacaggggcagagttggctcgaaaaggaa
acctggttggccggagcctaaccgggtcgtccgttagcgggaccgacgtc
gaagttgaccagccgaggcctcgggtcgggctaacgtccttggaggatgg
ctggccgaggccccggggtgaccggccgagccgcctgctcgggccgggtt
cccggagaagaccctggcagcgattgcccgggcgtggcgatgacgtcgtc
ctttagagtggagatcctcggaccgcgtcgccgtccgaggctaggtcgga
cctcgccgaaggtgtcgtcgatgcggagggtgctgctgcccccttccagc
gtcaagacccgagcctgcaggatcggattgtcttgtagcgtgtgtctcct
gcggccgccgaggccagaacacaccctcgctgtgttgtaaagctgcgtct
ctttttctcttgtttcgagtatctggacttttttgtcggtaacagggatg
tttgtgcgagcgagagttgcttctcgcggaaggtgatgagtgaggtatcc
gtatcccggaggcgtaggagtcccttggctcggtcggccttgccgcttac
gcgcactcttacccgtccatggggctctgtcaccgactcagtcgagaagg
ctcgaaggatcgcttcggcagaagagcttccgatcgtgaagacttgttcg
gtccgcggaatcacttatccgaacgtgagttacttatcgcagaaggtgat
gagtgaggtatccgtatcccggaggcataggagtccctcggctcggtcag
ccttggctgcttacgtgtactccgtcgttttcaggatccacttttcgaag
tagtcaaaaagcacgaaagacattctggcagaagagatctttcttcgagg
aaaatttcgacgcagagggggtttcccccctttcagcccccgagggaggg
tcgagctttgccgaggcgaggccgacccttccttgatgactaaactttgc
gtgggtgcgaggtatatgaacaacttgaaaacatcttaagggtagaagcg
acgtagctgttggatgttccaagcgttgtcgtagacctcgccttgactgt
tggccagcttgtacgttccgggcttcagaaccttggcgatgacgaacggt
ccctcccaggggggcgtgagcttgtgcctccctcgggcgtcttgtcgtag
ccgaagcaccaggtcgcccacctggaggtctcgggaccggacccctcggg
cgtggtagcgtcgcagggactgctggtaccgcgccgagtgtagtaaggcc
ttgtcccgagcctcttccagctggtccagcgagtcttctcggctagcttg
gttgctttgatcgctgtaggccctcgtcctcggggagccatattccaggt
ctgcgggcaagacagcctcagccccgtagactaggaagaacggcgtgaag
cccgtggctcggctcggcgttgtcctcaggctccagaccaccgaggggag
ttccttcatccatcgcttgccgaacttgttgaggtcgttgtagatccgag
gcttgagcccttgtagaatcatgccgttggcacgctctacttgcccattt
gacatgggatgagccacggcggcccagtccacctggatgtggtgatcctc
gcagaagtccaagaactttctgccggtgaactgggtgccgttgtcggtga
tgatggagttcgggaccccgaagcgatggatgatgttggtgaagaacgcc
actgcctgctcggacctgatgctgttcagaggtcggacctcgatccactt
ggagaatttgtcgatggcgaccagcaggtgcgtgtagcccccgggtgcct
tctgcaaggggccgacgaggtccagacccgacacagcaaagggccaggtg
atgggtatcgtctgtagagcctgagcgggcaggtgggtctgcttcgcata
gaattgacaccctttgcaggtgcggacaattctagtggtgtcggccaccg
ccgtcggccagtagaagccttgtcggaaagcattcccgacaagggctcga
ggcgctgcgtgatggccgcaagcccccgagtgtatctctcgcaggagttc
ctgaccttcggcgacggagatgcatcgctggaggatgcctgaggggctgc
ggtggtagagctccttctcatcgcccaacaagacgaatgacttggcgcgc
cgcgccacccgccgagcctcggctcggtcgaggggtagctctcctcggtg
gagatattgcaggtacggggtctgccagttttgatcaggcgtggccccgc
ttcgctcctcctcgacgcgcagtgcctcaccctcgggggccgagggtacc
tcgggctgaaccgagggtgcctcgggctgtgccgagggtacctcgggctg
ggccgagggtacctcggactgggccgagggcgcctcaggctcgggcgtgt
cgtcgatcttgacggaggattgatgcagatcccaggagaagacgtcccgg
ggaaccgttgtttgccccgaggctattttagccagctcgtccgcagtctc
gttgtaacgccgagcgatgtggttaagctcgagcccgtagaacttgtctt
ccaagcgccgaacctcatcgcagtaggcctccatcttcgggtcgcggcag
tgggagttcttcatgacttggtcgatgacgagctgcgagtcaccgcgggc
gtcgaggcgccggacccctagctcgatggcgatccacaaccctttgacca
gagcctcatactcagccacattgttggacgccggaaaatggaggcgtagc
acatagcgtatgtgtttcccgaggggcgagatgaagagcaggcctgcgcc
ggctcccgtcttcatcaacgacccgtcgaaaaacatggtccagagctccg
gttggatcggagccgtcggtagctgggtgtcgacccattcggccacgaag
tccgccaagacctgggacttgatggccttccgaggggcgaacgagattgt
ctcgcccatgatttccaccgcccatttcacaatcctacccgaggcctctc
ggcactggatgatctcccctagggggaaggatgacaccacagttaccgga
tgagactcgaagtagtgtcgcaacttccgtcgtgtcaggatcactgcata
cagcagcttctgaacttgtgggtagcggatcttggtttcggacagtacct
cgctgacgaagtaaactggcctctgaacgggcaatgtgtgcccctcttct
tgcctctcgaccacaatcgcggcgctaaccacctgagtggtcgcggcgac
gtagaccaagagggcttctccggcagctagaggcaccaagataggcgcct
ttgtgaggagcgccttcaggtttccgagagcttcctcggcctcaggggtc
caagtgaagcgctcggccttccttaagaggcggtacagaggtagacctct
ttcgccgaggcgtgagatgaagcggctcagagccgcaaggcatcccatga
ccctctttacgcctttcaagtccttgatgggcccaatgctggtgatggct
gcgatcttctccgggttggcttcgatgccccgctcgaagacgatgaaccc
caagagcatgcctcggggcaccccgaagacacacttctcgggattgagct
tgacgcctttcgccttgagacatcggaatgtcgcttcaaggtccgaaagg
aggtcggaagctttcctcgtcttgactacgatgtcatcgacgtaggcctc
gaccgtgcggccaatgtgttcgccgaacacatggttcatgcaccgctggt
acgtcgcgcccgcattcctcaagccgaacggcatggtgacatagtagtac
atgtcgaagggtgtggtgaaagaagtcgcgagctggtcggactctttcat
cctgatttgatgataccctgagtaggcatcgaggaaagacagggtttcgc
acccagcagtggaatccacgatttgatcgatgcgaggcagagggtaggga
accttcggacatgctttgttgagaccagtgtagtctacacacatccgcca
tttcccccttttctttctcacaagcacagggttggcaagccattcgggat
ggaatacctctttgatgaaccctgtcgccattagcttgtggatctcctcg
cctatcgctctgcgcttctccttgtcgaatcggcgcagaggctacttgac
gggtcgggctccggcccgaatatccagcgagtgctcggcgacatccctcg
gtatgccgggcatgtccgagggactccacgcgaagacgtcggcgttcgcg
cggagaaagtcgacgagcactgcttcctatttgggatcgagcccggagcc
gatccggatctgcttggaggcgtcgccgctggggtcgagggggacggcct
taaccatctccactggctcgaagttgccggcatgacgtttcacgtctggc
acctccttagagaggctttccaggtcggcgatgagggcctcggactcgac
gagggcctcggcgtactccacgcactccacgtcgcattcgaacgtgtgtt
tgtacgtggggccgacggtgatgaccctgttggggcccggcatcttgagc
ttcaggtaggtgtagttggggacggccatgaatttcgcgtagcatggtct
tcccagtaccgcgtggtaggttcctcggaacccgaccacctcgaacggca
gggtctcccttcggaagttggagggtgttccgaagcagacgcgaaggtcg
agttgtccgaggggctggacgcgcttcccgggaataatcccatggaaggg
cgcagcgcctgctcggacggaggacagatcgacacgcaggagcccgaggg
tctcggcgtagatgatgttgaggctgctgcctccgtccataaggaccttg
gtgagcctgacgtcgccgatgacggggtcgacgacgagcgggtatttccc
cgggctcggcacgtggtcggggtggtcggcttggtcgaaggtgatgggct
tgtcggaccagtctaggtagactggcgccgccaccttcaccgagcagacc
tcccggtgctcttgcttgcggtgccgagtcgaggcattcgccgcttgccc
actgtagatcatgaagcagtcgcggacctcggggaactctcctgcttggt
gatcttccttcttgttgtcgtcgcgggccctgccaccctccgtgggtggc
ccggccctgtggaagtggcgccgaagcatgacgcactcctcaagggtgtg
cttgacgggcccctggtgataggggcacggctccttgagcatcttgtcga
agaggttggcacctccggggggtttccgagggttcttgtgctcggcggcg
gtgacaaggtccgcgtcggcggcgtcgcgtttcgcttgcgacttcttctt
gcctttcttcttggcaccgcgctgagtcgacgcctcgggagcatcttccg
atgggcggccctggggctgcttgtcctttcggaagatagcctcgaccgcc
tcctggccggaggcgaacttggtggcgatgtccatcagctcgctcgccct
ggtgggggtcttgcgacccaacttgctcaccaggtcgcggcaggtggtgc
cggcgaggaacgcgccgatgacgtccgagtcggtgatgttgggcagctcg
gtgcgctgcttcgagaatcgccggatgtagtcccggagagactctcccgg
ctgctgtcggcagcttcggaggtcccaggaattcccggggcgcacgtacg
tgccctggaaattgccggcgaaagcttggactaggtcgccccagttggag
atctgccccggaggcaggtgctccaaccaggcgcgagcggtgtcggagag
gaacagggggaggttgcggatgatgaggttgtcatcgtccgttccaccca
gttggcaggccagacggtagtccgcgagccacagttccggtctcgtctcc
cccgagtactttgtgatagtagtcgggggtcggaaccgggtcgggaacgg
tgcccgtcagatggcccggctgaaggcctgcggaccgggcggttcgggcg
agggactccgatcctccccgctgtcgtagcgtcccccacgcctggggtgg
tagcctcggcgcaccctctcgtcgaggtgggcctgacggtcgcggtgatg
gtgctcgttgccgaggcggcccggggccgcaggcacggtgttgcgtgtgc
gcccggtgtagaccgaggcttcccgcatgaatcgggaagtcgcggcatga
ggttccgaggggtacccctgccttcgggaggcagagctctcggcccatcg
gaccgcggtgccttccaggagattcttgagctccccctggattcgccggc
cctcggtggttgatggctccggcattgcgcagagaagcatcgctgctgca
gccaggttctggccgacccactagatgcgggtggtggcctgaccctgaca
tcgtcggcgacgcggtgctggaagccctggggcagatgacgtatttctcc
ggtcgggggttggcccgcccatgcctgcccgatgtcccagcggatcggct
caagcgctcctgctccctcgtcgagcctggcctgcaccccgcggatttgc
tcgagctgtgggtcatggcccccgcctgaacggggaccacagctagctcc
cgtgggatgtcaacgcggggcgccgacctagggagatcaccgtcctccgg
catgctgagatgattgccttcggagggaccccctagatcgacgtggaaac
attcgcggcctgggccgcagtcctcgtcgccgaggctgcggctaccgtcg
gaacagtcggagaggcagtagtcacatgcggtcatgaagtcccgcatgga
actggggttgccaagtccagagaaatctcagcagatgctgggctcgtcgt
cttcctcggacccagagggcccgtaggtcgagacgtccgttagccggtcc
caaggcgaccgcatgcgaaaccccagagggtttggactcgcctctacgag
agcgcccgccaaagcggggtcgctaggcgggttgaggctgaatccaaatg
acgtgggacgggaatcggtcggtacctcttggtcgacgagcggcgataaa
gtcacgtcggggactggctgcaccgtcgtctcaggtataagggtgacgtc
cagcaagcttttcgcaagcgcgctggcgtcgtccgcttgctcgggattgg
cgtgtcgcggggagacggcgctcgtcttcgtctcaagcgcgaagtcgata
cccggtgcgccccctgtaggggtgccggcgctgccgacttgctcgacagc
cgacgaggcgctgcctcctgcttggccttggttgccctgtcttcccctcc
gtcggcggggaagagggcgggatgagctcgaaggttgttcttgcaccacg
cgagggaagacgttgtcgatttcgccgccggcgggcgggctgtcggccgc
cattgtcgttgtcgcacggcggtggaaggagtatcatgtcgtagctgccg
tcgagggacatgagctcaagactcccgaaacggagcaccgtcccgggttg
gagaggttgctggagactgcccatctggagcaccgttccgggttcgtcaa
catgcagcaggcccctacctggcgcgccaactgtcggcgtttcgagaccg
ggaggtccctgggccgacgagtgaatgtcgtcgcgtgccccagcccatat
gggtcgagcgcgagggcgagcgcgaaggggggagagcgaggcggccggag
accggcgtgagagaggtgggaatcccgcggccttcgtgttcgttccgcgc
ccaggtcgggtgcgcttgcagtagggggttacaagcgtccacgcgggaga
gggagcgagcggctccaagcgagcgcctgtctcgtcctcgtccccgcgcg
gccaaccctctctaagagggccctggtccttccttttataggcgtaagga
gaggatccaggtgtacaatgggggtgtagcagggtgctacgtgtctagcg
gtggagagatagtgccctaagtacatgccgttgtggcagccggagagatt
tgggcacccagctggtgtgatgtcgtggccgtcggaggagcgatggagcc
tggcggagggacagctgtcgaagcggttgagtccttgctgacgtcctctt
gcttccgtaagggggctgagagccgccgtcgtcacagagtacgcggggcg
ccatcattgcctatctggcggagctagccagatgggacgccggtcttgtt
ccctgcggcccgagtcagctcggggtagggtgatgatggcgcctcctgtt
gacgtggctggtctgcgccctaggttgggtgatgtggaagctcctccgaa
gtcgaggtcgagtctgtcttccgtggccgaggtcgagtccgagcccctgg
gtcgggcgaggcgaagttcgtcgtcttctggggctgagcccgagtccgag
ccctgggtcgggcggagcggagttcgccgtcttccgggacttagcccgag
tccgagccctgggtcgggcggagcggagatcgccgtcttccaaggctgag
cccgagtccgagccctgggtcgggcggagcggagttcgccgtcttccggg
acttagcccgagtccgagccctgggtcgggcggagcggagttcaccgtct
tccgggacttagcccgagtccgagccctgggtcgggcgaagcggagcttc
ctatgatgccttcggcagggcctgactgtctgtcagttttcactctgtca
agtggcaccgcagtcggagcggtgcaggcggcgttgtccttctgtcaggt
cggtcagtggagcggcgaagtgacggcggtcacttcggctctgccgggct
ggggggcgcgcgtcaggataaaggtgtcaggccacctttgcattaaatgc
tcctgcgatttggtcggtcggtgcggcgatttagtcagggttgcttcttg
gcgaaggcagggcctcgggcgagccggaaatatgttcgccgctggagggg
ggcctcgggcgagacagaaatcctccggggtcggctgcccttgtccgagg
ctaagctcgggcgaggcgtgatcgagtccttcgaatggactgatccctga
cttaatcgcgcccatcaggcctttgcagctttatgctgatgggggttact
agctgagaattaggagccttgagggtacccctaattatggtccccgacac
catcgggcggcccatcaagggcctttcggatggcccacgaaatgtctccg
ggttgtggtgacccaggggatatctatcccccacaaccacaaagcccccg
aggctcaagccaatttctgaatataattggctcgagactctccaaagtaa
actctgactctgatggtgagttcttgttgagtaagttgcttgcttccttg
gagtgcagcttcccaaaacatgtaaatttaattcaaaaaatggtgaaaaa
tgataggtgatggatagggcaatggtagtggatgaagagttgttcccata
atttaaaataagatttagctgtccgaaaaaagtactcgacaaagaaccat
ttgtcgattggcaaactcagcaaaaaaaggcaagtccaatagtgatttaa
actaagcaagtattaaatcaaaatatttaagttggtattaatcattttaa
aaattatttataaaaagtgcatattaaaaagtataaattagtttaaatta
aaaacctatttaataatcattagttaaacaattaattaattaaaacaata
ttattactgggtagtcaaaattaataccaacctaatgccatttgaatgaa
ataaatcagacatctaacgcaaagtatatcacatatttaataaattaggt
ctgcacagaaatatgtgatatgatgcgatttgattgattaaatatgtgtt
acaaaaaaaatttacaaggtaaatatacgtaatatataaaatctaggcgt
agccatcttctacgggcatgtttgccgccatggccccgcagtctggttgt
gctcctgggctgcagctcccgtgtttgtgtcgtatggcttgctcgggagt
caggttgtgtgtgtgcaaatttagcctcttagtcttgctccatagaaacg
aggtcaaggtccgtttcctgggagcctggctgctcgcggcgtaaggattc
cgtattgaggtagttttagtcttctagcccgtctgcatatcgggtgacaa
acattaactcgtctaagtcagactcatatggtacgcaaacaaagtgagtt
acattagccgaggccggctagcctgagtatggaccagttgaacaggccac
gccctacatttcttatctatgtgttcgaactccgacggcgtcggcaaagg
attcccggcgagaacacggcaccggactctttcacgctcgccgatgctgt
accggcatagacggccagcgttcctgatgaggtcggcggcggcggggaac
aggcgacggcaaggaacaacagtgttcgtgacgaggtcggcgatggtgga
accggcggcggcatgcgacagcggcagcgggctcgcgtatagaataaagg
agcgattgtatgctgtttgcctgttgctatcgtggggaagaaaaagtggg
gtgggggctttccagaaaactgccaggaggactgtagcgactgcggaagg
gttattttgcattttgtttgtgcaccatcggatatgagatcgatggccaa
aaatttgcttgtgcatttctgcaccggtgtagccttgtgtgcacaacatt
ttatgttcggtatcttatttcaacagactatctatctattttggttagtc
aggtgtctagctaagtttgtctagtgaaggatctaatttagagagtcgga
taacttggcgattcagatagctaatatgttagagttattttaaccgttga
atagctaaaatttggtttgactagccatttagatagtttttttagatgct
cttagtatactcgactttacagctctatcaatagtaggcaaggaaaagac
acagtgcgacagcgtacgtctagcttggcgcatagcaatgcatttccctt
tgacgctcgatgcatgtacaagtacaacccaagacttggcgccttgttgg
cgcccgccgcatagggaagaaagatgcacttgcagtcatttaactttttt
tgagagagagaaaaaaaagaagggacacttgtagagtagaagatcgatgg
atgtgcgtgcaagaacggtacgagtaccgcagcatgcatcaatgatccgt
ccgggcggaccgatcattctttttttccccttcttgtatgcatgcacgca
aacgaatgactgttgtgctgcatgacacgtacacctctccgtcctttcgt
ttcggtctactctctactggtacagacgtagtacgttcacttgtgccgcc
gttgtgtgcgtacgtttttcttctaaagaaaatgaaatgcgggggcgagc
agccccgcaggccgcagagcgagctcggattgaccagactgcgtccacga
acgtgcgcgatcgcaaacctcgccacgtcacgtcaccgacgattcccgcc
aactttgacatcatgcgcgcgacgcgatcgatctgggccacggcccacgg
ccaccagccctgccggccacgcgcctccctcgtcagcacagacgcgcggc
ccgccggcatagcgccgagtctacgcgaccgaccggccggccagcctcgg
ccgtctcgtataaatacacatgcacaccgccgcgagagctatcaccagtc
ggcagtcagtctagctcggcatcggctcattgatcagtcagagccgctat
tacgatccagtcatccagatctcgctagatatagatatatacgtgagtac
gtgacctagctagctaggtagtagcctccagcaggcagtagtgtacgtgc
taagagtaatcgccgagaaaccattactgttcatc
128236-128604 Forward Strand
Exon 1: 128236-128365
Exon 2: 128531-128604
Augustus and FGENESH both predict the same gene structure. Blastp of the proposed protein gave
a top hit to Arabinogalactan peptide 22 from Arabidopsis (1e-06). This is a relatively high E-value,
but the hit did show a conserved DUF1070 domain.
DUF1070 is a family that consists of short hypothetical proteins of unknown function
(http://aranet.mpimp-golm.mpg.de/pfamnet/DUF1070). Gene ontology shows that
Arabinogalactan peptide 22 is a plasma membrane-anchored protein (multiple GO terms) that
appears to be involved in transport (GO:0006826 iron ion transport and GO:0015706 nitrate
transport). It could also be involved in brassinosteroid biosynthesis, as proposed by gene
ontology (GO:0016132). (http://www.ebi.ac.uk/QuickGO/GProtein?ac=Q9FK16)
Protein:
MSSLTARLALLAYLLLATLLHPCLCHAAAAAPAARGAGNWRMDPKAIDQGVAYVLMLLALFVTYIVH
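The protein above can be reproduced from the genomic sequence listed for this gene. The sketch below assumes that uppercase letters mark exons and lowercase letters mark the intron (an inference from how the sequence is displayed on this page), and uses the standard genetic code. The names `translate`, `GENE`, and `PROTEIN` are illustrative only.

```python
import re

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate complete codons of seq in frame 0; '*' marks a stop."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

# Genomic sequence from the ATG to the stop codon, copied from this entry.
GENE = (
    "ATGAGCTCGCTGACG"
    "GCGAGGCTCGCTCTGCTGGCGTACCTGCTGCTCGCGACACTGCTGCACCC"
    "CTGCCTCTGCCATGCCGCGGCGGCGGCTCCGGCCGCTCGTGGTGCCGGCA"
    "ACTGGAGAATGGATCgtgagttctatagtctccctgatctcctccatatc"
    "atctttagatgctgctgcttcgtcttaatttgtcttcgaaattaaagata"
    "gaatcattaatcatgcagagtgatctatacaattgctaccgtttgttttg"
    "acttgttttcatttctcgttccgcatgcagCGAAAGCGATCGACCAGGGG"
    "GTCGCCTACGTCCTCATGCTCCTCGCGCTCTTCGTCACCTACATCGTGCA"
    "CTGA"
)
PROTEIN = "MSSLTARLALLAYLLLATLLHPCLCHAAAAAPAARGAGNWRMDPKAIDQGVAYVLMLLALFVTYIVH"

# Assumption: uppercase runs are exons, lowercase runs are introns.
exons = re.findall(r"[ACGT]+", GENE)
introns = re.findall(r"[acgt]+", GENE)
peptide = translate("".join(exons))

print("exon lengths:", [len(e) for e in exons])
print("translation matches listed protein:", peptide.rstrip("*") == PROTEIN)
print("introns start with gt and end with ag:",
      all(i.startswith("gt") and i.endswith("ag") for i in introns))
```

The exon lengths printed here can also be compared with the exon coordinates given for this locus.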
ATGAGCTCGCTGACG
GCGAGGCTCGCTCTGCTGGCGTACCTGCTGCTCGCGACACTGCTGCACCC
CTGCCTCTGCCATGCCGCGGCGGCGGCTCCGGCCGCTCGTGGTGCCGGCA
ACTGGAGAATGGATCgtgagttctatagtctccctgatctcctccatatc
atctttagatgctgctgcttcgtcttaatttgtcttcgaaattaaagata
gaatcattaatcatgcagagtgatctatacaattgctaccgtttgttttg
acttgttttcatttctcgttccgcatgcagCGAAAGCGATCGACCAGGGG
GTCGCCTACGTCCTCATGCTCCTCGCGCTCTTCGTCACCTACATCGTGCA
CTGA(End of Gene)atatattattattggtcgactgtcctattggacagcattcgcaaaa
agacgttgattacattctccaatcgctcatgcatggcggtcgtgttctgc
tggcagctcaaattaaggcaagaggcaggcagtagtgtcttgattcccgt
tcgccggctatcggctatatggtgcccgacgtcgtcgacgagcagtctcg
tttgcccggccgcgcgccggactccggcccacttcctttcttctctcttt
ttcttttcaagcttgatgtgatgtgacttgaggcgagtcaaagtagtcta
gtcatcatccatcaggtgtactgtaaaatgttcattgctgtctagtagtg
attatgtacattacattaatatcagagatctcaattagaaataataacga
gaagatgagataagttctattttacaaataaattcttaataataatgtcg
gtatacttttgttgcttcgtacacaacaagatatgtatgccgtcttttgt
gacggcatttacgttggtaccgtatattttgtggaacggcttgatccaac
aaggggcacgtgcaatctatatagctctgtgttggcgccgatccagcgct
aaacactggtaagaacaacttggcagtgttctttgcgagagcgcgtacag
tctgtaaccacaacgcatgagctgcttctcctctgcgtgcttcggtgcct
ggggccggatggttcgtgatgacgcagaggatcttcttttccacaataaa
cctggatctcgtctcccgagagaaaccccgtcggagaagagagtttcagg
ggttgtcttgggatcggcaggccacccaaaacgcctctggccaacgtaga
gtcgaaaaacaacgaaaaaattaaataggtcgatgaagactaatattaag
gctagattactcctactctaaggaatggaaaaattgcaaaaaaaggtaaa
tttgaaaatagatttgattggatcgattataagggtttcaatcggtcgta
ccctttcatctatataaagtggagaggtctttaacccgttcctaggcaat
agacaacagatcccacgtgattatatggataaccacacatgagataagaa
taatcgcccgagtcaatcttatcgcgggaacgcggactgtccgagcccta
ggcctggaccgtccgccacttctggtgttcaacgcatgcccctgcctttt
ggtgaacattgacaaaccaaaagcacatgttctagcagcgcctttcgaga
aacaataaaccttgtgacttactttattcctgaagtatgaattaagtttg
atgtgagtcactgacttttcaatctagtagggtgatcatttagcacagaa
agatcgtttaggattacatctttctcggccattactacttgatcaaggga
tcaataagtataaaattgtagtgctcaagcctgagtaagcaaagagatga
tgcatataatatgggttcatcgccgcactacctcatgcttgaacaagaaa
atatgtcggcggtgaataggctggtggaaagtaccttgaacagagacact
tgttgtgtcgccttttggcccgttttattcggccgtgttgcctttgcggg
tgactggagttgcttcattggccgattgcaaggtacgaccttgtcaaaag
ccaggacgactttaactagccgaccaaaagtttcagatgaattttttgct
tcgacgtacctgtttttggttgtcgatgcttaaaggcgctgccggatcgt
ccggctgagttagccagaccgtctcgctgagtgagccggaccgtttcgct
gagttagctggactgtccacttggctccggactgtccggacacagatgtc
ataccgtctatgatgcgcaagacatggagctttgatcggctgcccgatca
cccttgcccccagcacctccattcttattagtcttcttgcctagccttct
gagtaactactcctcgtgaaccatttgatgtgcgaatatcacaaatggcg
atgtttttgtctttgcctttatcggccacttcaggtcgaactatgacctt
ttttgcttgctagttccactatattgaaatgaaatgattgtgtgtccact
tacatctcctaaaagatcaaccaaccctcatttatgctcaattgtatctg
tctacaaaaaaacattacaatcattggtggcatgtgaaaaataattatgc
catttgcaataagaacatctctttaattcatccataggaggaatagtatg
cgttaatttaatgctgtcactgaaagtcgcctagaggggggtgaataggg
cgaaactgaaatttacaaagttaatcacaactacaagcccggttagcgtt
agaaatataatcgagtctgcgagagagggtgcaaaacaaatcgcaagcaa
ataaagagtgtgacacgcggatttgttttaccgaggttcggttctcgcaa
acctactccccgttgaggcggtcacaaagaccgggtctctttcaaccctt
tccctctctcaaacggtccctcggaccgagtgagctttctcttctcaaat
caaccgggaacaaaacttccccgcaaggaccaccacacaattggtgtctc
ttgccttggttacaattgagttttgatcacaagcaaagaatgagaaaaga
agaaagcaatccaagcgcaagagctcaaaagaacacaacaaatctctctc
acttaacactaaagcttttgtggaatttgggagaggatttgatcacttgg
gtgtgtcttgtattgaatgcctagctcttgtaagtgattggaagttgaga
aacttggatgacttgaatgtggggtggttgtactacccaaaaagttcagt
ggtaaacaaggtataaagataatcaaactagggtaacctattgggtccca
tcaaaattaacctatgcagatcattaagattaattagaacatgagtgggt
aaaaagaagtgatcaagggcacaacttgcctggcacttgagattctaggt
gccaacttgctcttcagatgacacgtgacctcgctctagtcgtagcaata
caaacaaacattgtataggcaaaattaacattacaccaaacataagaata
aactgcgtaataataatttatgtgtcgctacgagatcgtaggagcgagaa
tcactaaattcggagctacgtttaagaagttatggttttccgaagtcttt
atgtgcttggtaaaaaaaaattaattgagtgatcaattctaatgtggctt
ccatgttaaaacagagttactaggttataaacaatattattataaaatta
atacaattggaatggtcaaaatgagttaaaatgaattagttatgaattaa
ccaagtttatgtaatttgtttttcattctagaaatcattttctattttta
ccttctttgtacttcctcccagcgatatatggaactgttgnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncgggcccaca
cgtcagagctccaacggtcggaacccaacgacctggtgacgtggctggcg
caccgaacagtgtctggtggcgcaccggactgtctggtgcgccatgcgac
agcagcctacaccaaactagccgtttggtgtaggctgctgtcgcatggcg
caccagacagtccggtgcgcccaccggacactgttcggtgcgccagccac
gtcaccaggtcgttgggttccgaccgttggagctctgacgtgtgggcccg
ccttgcctgttcggtggcgcaccggacagtcctgttagattgttccggtt
gtgccacccgcgcgtgctctgttccttttacgcgcgctggcgccgcattt
aatgcgttgcagacgaccgttgacgcgaagtagtcgttgctccgctggct
caccggacagtcccggttgtgcacaggacatgtccggtgaattatagcgg
agcgggaatccgaagctgggcgagttcagagtcgctctcccctggagcac
cggacatgtccggtgaattatagcgaagcgcctctgagaattcccgaagg
tgcgaagtttggcttagagtaccctggtgcaccggacaatgtccggtggc
acaccggacagtccggtgcgccagaccagggctgccttcggttatcccat
gctctctttgatgaacccaattcttggtctttttattggctaagtgtgaa
cctttggcacctgtataacttatagactagagcaaactagttagtccaat
tatttgtgttgggcaattcaaccaccaaaatcaattaggaactaggtgta
agcctaattccctttcaatctccccctttttggtgattgatgccaacaca
aaccaaagcaagtatagagatgcataattgaactaatttgcataatgtaa
gtgcaaaggttacttagaattgagccaatataaatacttaatataaatac
ttataagatatgcatggattgtttcttctatttttaacattttggaccac
gcttgcaccacttattttgtttttgcaaattcttttgtaaatccttttca
aagtccttttgcaaatagtcaaaggtaaatgaataagattttgagaagca
tttatgagatttgaaattttctccccctgtttcaaatgcttttcctttga
ctaaacaaaactcccccttaatgaaatcctcctcttagtgttcaagaggg
ttttaagatatcgattttgaaaatactactttctcccccttttgaacaca
atgagataccaatttggaaaatcataccaaatgaaaaaactcttttttaa
aattagggtggtggtgcggttcttttgctttgggctcatgctctctcccc
ctttggtataaatcgccaaaaacggaatcattagagcccttacctaacta
ctttctcccctttggcaaataaaacatgagtgaaggttataccaaagccg
gagagatgctcggagcgacggcgaaggatgagttacgtagtggaagcctt
tgtcttcaccgaagactccaattccctttcaatacacctatgacttggtt
tgaatttcactcgaacacacattagtcatagtatatgaaagagacatgat
caaaggtatatgaatgagctatgtgtgcaattcaacaaaagaagttccta
gaatcaagaatatttagctcatgcctaagtttgttaaaggtttgttcatc
aagtggcttggtaaagatatcggctaattgatctttagtgttaatatatg
aaatctcgatatctcccttttgttggtgatcccttagaaagtgataccga
atggctatgtgtttagtgcggctatgctcaacgggattatccgccatgcg
gattgcactctcattatcacatagaagaggaaccttggttaatttgtaac
cgtagtccctaagggtttgcctcatccaaagtagttgcgcgcaacaatgg
cctgcggcaatgtactcggcttcggcggtagaaagagctacaaaattttg
cttctttgaagcccaagacaccaaggaccttcccaagaactggcaagtcc
ccgatgtgctctttctattaattttacaccccgcccaatcggcatcggaa
taaccaattaaatcaaatgtggatcccctaggataccaaagcccaaactt
aggagtataaactaaatatctcaagattcgttttacggccgtaaggtgag
cttccttagggtcggcttggaatcttgcacacatgcatacggaaagcatt
atatccggtcgagatgcacataaatatagcaaagagcctatcatcgaccg
gtataccttttgatccacggacttaccttccgtgtcgaggtcgagatgcc
cattggttcccatgggtgtcttgatgggtttggcatccttcatcccaaac
ttgcttagaatatcttgagtatacttcgtttggcttaggaaggtgccctc
ttggagttgcttcacttgaaatcctaagaaatacttcaactcccccatca
tagacatctcgaatttttgtgtcatgatcctactaaattcttcacatgta
gatttgttagtagacccaaatataatatcatcaacataaatttggcatac
aaacaaatcattgtcaagagttttagtgaataaagtaggatcggcttttc
cgactttgaagccattagtgataagaaaatctctaaggcattcataccgt
gttcttggggcttgcttgagcccataaagcgccttagagagtttataaac
atggttagggtactcactatcttcaaagtcgggaggttgctcaacataga
cctcttccttgattggtccattgaggaagacacttttcacgtccatttga
taaagcttgaagccatggtaagtagcataggctaataatatacgaattga
ttcaagcctagctacgggtgcataggtttcaccgaaatctaaaccttcga
cttgggagtatcctttggcaacaagtcgggctttgttccttgtcaccaca
ccatgctcgtcttgcttgttgcggaacacacatttggttcctacaacatt
ttgattaggacgtggaactaaatgccatacctcattcctcgtgaagttgt
tgagctcctcttgcatcgccaccacccaatccgaatcttgtagtgcttcc
tctacactgtgtggctcaatagaggaaacaaaagattaatgctcacaaaa
atgtgctacacgagatctagtggttacccccttatggatgtctccgagga
tggtgtcgacggggtgatctcgttggattgcttggtggactcttgggtgt
ggcggtcttgggtcttcatcctccttgtcttgatcatttgcatctcccct
tgattattgtcgtcttcttgaggtggctcatttgcttgatcttctccttc
atcaacttgagcctcatcctcattttgagtcggtggagatgcttgcgtgg
aggaggatggttgatcttgtgcatttggaggctcttcggattccttagga
cacacatccccaatggacatgttccttagtgcgatgcatggagcctctta
aacacctatctcatcaagatcaacttgctctacttgagagccgttagtct
catcaaaacacaacgtcacaagaaacttcaacttgtccggaggacttgtt
aaagacactatatgcccttgtgtttgaatcatatcctagtaaaaagcctt
ctacagtcttaggagcaaatttagattttctacctcttttaacaagaata
aagcatttgctaccaaagactctaaaatatgaaatattgggctttttacc
ggttaggagttcatatgatgtcttcttgaggattcggtgtagatataacc
ggttgatggcgtagcaagcgatgttaaccgcctcggcccaaaaccgatcc
gaagtcttgtactcatcaagcatggttctagccatgtccaatagagttcg
attcttcctctccactacaccattttgttggggcgtgtagggagaagaga
actcatgcttgatgccctcctcctcaaggaagccttcaatttgagagttc
ttgaactccgtcccgttgtcgcttctaattttcttgatcctcaagccgaa
ctcattttgagcccgtctcaagaatccctttaaggtctcttgggtgtagg
atttttcctgcaaaaagaacacccaagtgaagcgagaataatcatccaca
ataactagacagtacttactcccgccgatgcttatgtaagcaatcgggcc
gaatagatccatgtgtaggagctccagtggcctgtcggtcgtcatgatgt
tcttgtgtggatgatgggttctaacttgcttccccgcttggcatgcgcta
caaatcctgtctttctcaaaatgaacatttgttaatcctaaaatgtgttc
tccctttagaagcttatgaagattcttcatcccaacatgggatagtcggc
ggtgctagagccaacccatgttagtcttagcaattaagcaagtgttgagt
ttagctctatcaaaatctaccaagtatagctgaccctctaacactccctt
aaatgctactgaatcgtcacttcttctaaagacagtgacacctacattag
taaataaacagttgtagcccatttgacataattgagatacagaaagcaaa
ttgtaatctaatgaatcaacaagaaaaacattggaaatggaatggtcagg
ggatataactattttacccaaacctttgaccaaaccttggtttccatccc
cgaatgtgatagctcgttggggatcctggtttttctcataggaggagaac
atcttcttctcccctgtcatgtggtttgtgcacccgctatcgatgatcca
acttgagcccccagatgcataaacctacaaaacaagtttagttcttgatt
ttaggtacccaaatggttttgggtcctttggcattagacacaagaacttt
gggtacccaaacacaagtctttgaccccttgtgcttgcccccaacatatt
tggcaactacctagccgaatttgttagttaacacatatgatgcatcaaaa
gttttaaatgaaatgctatgttcatttgatgcactaggagttttcttctt
aggcaacttagcacgggttggttgcctagaactagatgtctcacccttat
acataaaagcatggttagagccagagtgagacttcctagaatgaattttc
ctaattttgtcctcgggataaccgacagggtacaaaatgtaacactcgtt
atctcgaggcatgtgagccttgcccttaacaaagttagacaatttcttag
gagggacattaagtttgacattgcctccctgttggaagccaatgccatcc
ttaatgccagggtgtctccctctatagagcatgcttctagcaaatttaaa
tttttcgttttctaagtcatgctcgacaattttagcatctaattttgcta
tatgatcattttgttgtttaattaaggccatgtgatcatgaataacatca
atgttaacatctttacatctagtgcaaatagtagtatgttcaacggtaga
tgtagagggtttgcaagaattaagatcaacaatcttagcacgcaatatat
catttttatctttaagatcggaaattgtaacattgcaaacatctagttct
ttagccttagcaagcaatttttcattttcaaatctaaagctggcaagata
aatgtttaattcttcaatcttagcaagcaaatcatcattatcatttctaa
gattgggaattgaaacattacaaacatttgaatcaaccttagctaacaaa
ttagcattctcatttctaaggttgtctatagtctcatggcaagtgcttag
ctcactagatagtttttcacatttttctacttctagagcgtaagcatttt
taactttaacatgcttcttgttttctttaataaggaagtcctcttgggac
tccaagagatcatccttctcatggatagcactaatcaattcatttaattt
ttctttttgttgcatgtttaagttggcaaaaagagtacgcaaattatctt
cctcatcactagcattatcatcgctagaggactcatatctagtggaggat
ttggatttaaccttcttctttttgccgtcctttgccatgaggcacttgtg
gccgacgttggggaagagaaggcctttgttgatggcgatgttggcggcgt
cctcgtcggaggaggagtcggaggagctctcgtccgagtcccactcgcga
catacatgggcgtcgccacccttcttcttctcctttcttctccccttctt
gtcatcgcccctatcactgtcactagaaataggacattttgctataaagt
gaccgggcttaccacacttgtagcaaaccttcttggagcggggcttgtaa
tctttccccctcctttgcttgaggatttggcggaagctcttgatgacgag
cgccatttcctcgttgtcgagctttgaggcgtcgattggttgtctacttg
gtgtagactcctccttcttctcctctgtcgccttgaatgcgaccggttgt
gcttcggacgtggagggatcatctagctcgttgattttctttgagccttt
gatcatatactcaaagctcacaaaattcccgattacttcctcgggagtca
ttagtgtatatctaggattaccacgaattaattgaacttgagtagggtta
aggaaaataagtgatctaagaataaccttaaccatctcgtggtcatccca
ctttttgctcccgaggttgcgcacttggttcaccaaggttttgagccggt
tgtacatgtcttgtggctcttctcctttgcgaagccggaagcgaccgagc
tccccctcgatcgtctcccgcttggtgatcttggttagctcatctccctc
gtgagaggtcttgagcacgtccccaaacttccttggcgctcttcaaccct
tgcaccttgttatactcctctcgacttagagaggcgaggagtatagttgt
ggcttgtgagttgaagtgctcgatttgggccacctcgtcctcatcatagt
cctcatcccctacagatggtacctgtacaccaaactcaacgacatcccat
atacttttgtggagtgaggttagatgaaatcgcattaaatcactccacct
tgcataatcttcaccatcaaaagttggtggtttgcctaatgggacggaaa
gtaaaggcgtatgttttggaatgcgaggatagcgtaaggggatcttacta
aacttcttgcgcttatggcgcttagaagttacggagggcgcgtcggagcc
ggaggtggacggtgatgaagtatcggtatcgtagtagaccaccttcctca
tcttctttttcttatcgccactccgatgggacttgtgggaggaggctttc
ttctccttccccttctcctttttgcgggactcttccgatgaagccttctc
gtggcttgtagtgggcttgtcgccggtctccatctccttcttggcgtgtt
ctcccgacatcactccgagcggttaggctctaataaagcaccgagctctg
ataccaattgaaagtcgcctagagggggggggggtgaatagggcgaaact
gaaatttacaaagttaatcacaactacaagcccggttagcgttagaaata
taatcgagtccgcgagagagggtgcaaaataaatcgcaagcaaataaaga
gtgtgacacgcggatttgttttaccaaggttcggttctcgcaaacctact
ccccgttgaggcggtcacaaagaccgggtctctttcaaccctttccctct
ctcaaacggtccctcggaccgagtgagctttctcttctcaaatcaaccgg
gaacaaaacttccccgcaaagaccaccacataattggtgtctcttgcctt
ggttacaattgagttttgatcacaagcaaagaatgagaaagaagaaagca
atccaagcgcaagagctcaaaagaacacaacaaatcactctcacttaaca
ctaaagcttttgtggaatttgggagaggatttgatcacttgggtgtgtct
tgtattgaatgcctagctcttgtaagtggttggaagttgagaaacttgga
tgacttgaatgtggggtggttaggggaattaataaaaaaaaaattgaggg
tgatcactttggttaaacaagaccgaaaaactcctcctatgcctaacccc
ttatgttaaaacatcaaaagacaaattcatttttcctattccacgaatga
ttgtaagagagagaaatccaaaggtcggaacatcagtgacgagaactcaa
aaaaacaaaaaatatgttttctactctaccccccccccctccccccccgn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnc
ccacacgtcagagctccaacggtcggaatccaacggcctggtgacgtggc
tggcgcaccggatagtgtccggtggcgcaccggactgtccggtgtgccat
gcgacagcagcctccaccaaactagccgtttggtggaggctgctgtcgca
tggcacaccggacagtccggtgcgccaccggacactatccggtgcgccag
ccacgtcaccaggccgttggattccgaccgttggagctctgacgtgtggg
cccgcctggctgtccggtggcgcaccgaacaattcctgtagattgtccag
tgtgccacccgtgcgtgctctgctcctctgcgcgcgctggcgcgcattta
atgcgttacagacgaccgttggcgcgaagtagtcgttgctccgctggctc
accggacagtccggtgtgcaccggacatgtccggtgaattatagcggagc
gtgaatctgaagttggcgagttcagagtcgctcttccctggagcaccgga
cactgtccggtgaattatagcgaagcgcctctgagaattcccgaaggtgc
gaagtttggcttggagtaccctggtgcaccggacaatgtctggtggcaca
ccgaacagtccggtgcgccagaccagggctgccttcggttatcccatgct
ctctttgatgaacccaattcttggtctttttattggctaagtgtgaacct
ttgacacctgtataacttatagactagagcaaactagttagtccaattat
ttgtgttgggcaattcaaccaccaaaatcaattaggaactaggtgtaagc
ctaattccctttcagtcactttatactagctcgtcaaatattttatcaca
tttggcaatgttaaaagtaaatttatcttcttgctgattcttttgaaccg
acttcaaagaagaacatatagaaggtttagcctgtgttggccaaacaagt
tcgacggtatatatatccatggattcatcatcctaactatcgtgccccac
tagatgcatcttatggctagccgattttgatgcctctctctacttcgact
tttgcccgctagtgcaagtgcactaacaaaataaactatgtgccatccaa
tttctcttttaagtaagatcacaacccattaaaagtcagcccttctaaat
attttccgtgatatggatctgaaagcatcggtttatagtgtcttggaacc
tccagatatagtcattaaccggttcttcgcccccctgctggactaaagct
aaatcagttaattctaactcaggttctactgagaaaaagtgttcataaaa
tttctgctctaatttatcccaagagttaatagagttaggtgctagggttg
cataccatgcgaatgcggtaccagtcagggatagcgagaataaacaaaca
cgataggcttccctaccaaccaattctcctaggtgtcctaagaattgtct
tatatgctcatgcgtgttttttccgccttcaccggaaaatttagagaaat
ccgatagtctagttccctgtggatatggcaagacgacaaatcagtgacta
taaggtttccgatatgactgccccatgcctggcacactaacatcgagttt
atctctgaacaacctagctacttcgtccctaatcttttccaccatgtcta
gtgaccatccattaagtttgagggtgaaaatctcaggttgcctgacatta
ttttgtcgactttcctcctaggggatgttcaagatgcgtgttaggtgtcc
tattaatgccaccaccttctgccctatatctctcgggctctctggtggcc
gaagaatattcggccatgtgcatatgaggtggcatgacatgattataata
tgccgctggtggagtagcgtagtgctgctatgatggatgtgaaaaatgtg
catattatgtagctcctggttctgttctgtgtacgtatacctcgactgtc
cggctaaggggctggatggtccatgatggggctcaatgatccgaggttat
atccggatggtctggccatatatggcgcgacctagatagtctgtgcgcgg
gtgtgtgttgggtctgatggtccgacaaatatgccggacgatctatgaca
tatccgatcagtccggtgtaaggggttagacagtctgcaaccgtaatgta
gccggacggtccgagataactttcgaaactatagtaccaaatcctatagg
aatgagaatccaagtttccccgtgaaatcctttgatccaaaggagccttt
agaataatttatggcaggaatgtttacacgttgaggctacttttcacacg
cgtctacaacctttcgcgccgtttttcttggttgctgccactatgcaaca
gatgcctcgattcaaattcaaaaccacaacaaggtgattcctttattttt
tcctatggtgccaccaaagacattttcacctgaattcctatattttcaat
ttctgtagggttcaaaacatccaccatctctattcccatgtttttcctat
ttctatgtttttttatccttcattcccaagaaggccgtaacatgtgtaac
ccaaaatggcctgagcacaaagccccaaagccaggttgagcccagatggc
cctcatagaatctcccgatcccagaccagtctcccgctgacccatcctgc
tcagtgctcacttccacgcgggccccagcccgccacgagtccccgctccc
acctgctcctcgccgccgttagcgcgccccgagccccgacagcctttctc
cttctagggtttctctctgtttcgtcctcatcgaatcggagtttcgc
143398-145767 (or 146054) Forward Strand
Exon 1: 143398-143493
Exon 2: 143562-143652
Exon 3: 144047-144164
Exon 4: 144238-144318
Exon 5: 144399-144447
Exon 6: 144715-144829
Exon 7: 144945-145036
Exon 8: 145136-145283
Exon 9: 145743-145849
Exon 10: 145881-145978
Exon 11: 146048-146054
Identified through both Augustus and FGENESH. Augustus predicts two possible gene structures
(4 or 5 exons; the second has an additional internal exon). FGENESH predicts 11 exons. I believe
the FGENESH model is more accurate according to the EST/cDNA data (see below).
A blastp search with Augustus transcript 1 hit a serine/threonine-protein kinase (PKc_like
superfamily domain) with an E-value of 5e-31. Augustus transcript 2 did not match as well. The
FGENESH model gave a stronger hit, with an E-value of 2e-138 (Serine/threonine-protein kinase
AFC3), so I used the FGENESH model. Some cDNA/EST data show exons 3, 4, and 5 (or just exons 3
and 4) as a single exon; translating that version gave stop codons. The FGENESH model also
matches other cDNA/EST data in which exons 3, 4, and 5 are separate.
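The translation check described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for this annotation: it assumes the convention of the annotated sequence on this page (exons in uppercase, introns in lowercase), uses the standard genetic code, and the function names are hypothetical. The excerpt is the first four lines of the annotated gene sequence.

```python
import re
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Exon 1, intron 1, and the start of exon 2, copied from the annotated
# sequence (uppercase = exon, lowercase = intron).
EXCERPT = (
    "ATGGAGTCATCGCGGTCTCGGAAGCGGACGCGCCAGGCGCATGACTGTGCCGC"
    "CGCGCCGCCACCAGAGCGAGAAGTGGTAGGGCCCGCTTTGCTCgtgagta"
    "ctcgcagatttgcggggtttcaggctgtgctgatgcagatttggccacgc"
    "tgtgcacgcagGTAGCGAGGGGTGGCGCGTCGCCACCATGGAGGGAGGAT"
)


def splice_uppercase(annotated):
    """Join the uppercase (exon) runs, dropping introns and markers."""
    # Remove parenthesised notes such as "(End of Gene)" before matching.
    cleaned = re.sub(r"\([^)]*\)", "", annotated)
    return "".join(re.findall(r"[ACGT]+", cleaned))


def translate(cds):
    """Translate in frame 0; trailing bases short of a codon are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def has_internal_stop(protein):
    """True if a stop codon appears before the final position."""
    return "*" in protein[:-1]


protein = translate(splice_uppercase(EXCERPT))
print(protein)
print("internal stop:", has_internal_stop(protein))
```

Running the same functions on an alternative structure (for example, one that keeps an intron as exon sequence) would show whether it introduces a premature stop codon.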
Protein:
MESSRSRKRTRQAHDCAAAPPPEREVVGPALLVARGGASPPWREDDRDGHFVFDLGENLT
RRYKILSKMGEGLGFLQSVVSSSAGYLRYYSNEEGYELDFRSIRKYRDAAMIEIDVLNRL
AENEKYRSLCVQIQRWFDYRNHICIVFEKLGPSLYDFLKRNRYQPFPVELVREFGRQLLE
SVAYMHELRLIHTDLKPENILLVSSEYIKVPSTKKNSQGEMHFKCLPKSSAIKLIDFGST
AFDNRNHNSIVSTRHYRAPEIILGLGWSFPCDIWSVGCILVELCSVRSLHTLYFSSTSLG
EALFQTHENLEHLAMMERVLGPLPEDMIRKASF
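The annotated sequence that follows marks exons in uppercase and introns in lowercase. As a hedged sketch (the helper names are illustrative, and the GT...AG rule is only the most common splice-site pattern, not a requirement), the intron boundaries can be checked like this, using the first four lines of that sequence as input:

```python
import re

# Exon 1, intron 1, and the start of exon 2, copied from the annotated
# sequence (uppercase = exon, lowercase = intron).
EXCERPT = (
    "ATGGAGTCATCGCGGTCTCGGAAGCGGACGCGCCAGGCGCATGACTGTGCCGC"
    "CGCGCCGCCACCAGAGCGAGAAGTGGTAGGGCCCGCTTTGCTCgtgagta"
    "ctcgcagatttgcggggtttcaggctgtgctgatgcagatttggccacgc"
    "tgtgcacgcagGTAGCGAGGGGTGGCGCGTCGCCACCATGGAGGGAGGAT"
)


def find_introns(annotated):
    """Return lowercase runs that are flanked by uppercase on both sides."""
    return re.findall(r"(?<=[ACGT])[acgt]+(?=[ACGT])", annotated)


def is_canonical(intron):
    """True for the common GT...AG donor/acceptor pattern."""
    return intron.startswith("gt") and intron.endswith("ag")


for number, intron in enumerate(find_introns(EXCERPT), start=1):
    print(number, len(intron), "canonical" if is_canonical(intron) else "non-canonical")
```

The single intron in this excerpt is 68 nt long, which matches the gap between Exon 1 (ending at 143493) and Exon 2 (starting at 143562).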
ATGGAGTCATCGCGGTCTCGGAAGCGGACGCGCCAGGCGCATGACTGTGCCGC
CGCGCCGCCACCAGAGCGAGAAGTGGTAGGGCCCGCTTTGCTCgtgagta
ctcgcagatttgcggggtttcaggctgtgctgatgcagatttggccacgc
tgtgcacgcagGTAGCGAGGGGTGGCGCGTCGCCACCATGGAGGGAGGAT
GACCGTGATGGACACTTCGTATTCGACCTTGGCGAGAACTTGACCCGCCG
ATgtgagttcccgactctttgactctctccttcagttcctcgagcccttg
tgcatggatatatcgtttagttgtgtctaataatttgtaattagtggcaa
catctttttcttattttggttttatctgcttttttattgttataagaccc
cgtttgcaaactggcagttagtacacctttaggttttcctttgtcgtatc
aagctaataatgtgtgaagcagggttcccattctgatctttagtttttgt
gattttaatagcaataaccttttttcgcaaaaaagtaacaaactttatgc
agtatgtggttcacatagttcccagtatccttttggacttagatttagtt
tgatatatggttatgtttattgacagaatgctttttagctttacagATAA
AATCTTGAGCAAAATGGGAGAAGGTTTGGGTTTTCTGCAATCTGTCGTCT
CCTCTTCTGCTGGGTATCTCCGTTATTATTCTAATGAAGAGGGCTACGAG
TTGGATTTTCGTAGgtacatttgggcgtgttttggaatgctgggaccgtg
aaacacatgaatatgttgccataaaagttgttcgcagTATCCGCAAGTAC
CGTGATGCTGCAATGATTGAGATAGATGTGCTCAATCGCCTTGCAGAAAA
TGAAAAGTACAGATCTCTgtgagtatctcagagacctaattacttaaagt
ataaccctattctgtttctgacttgacagttcctcttcaatttgtcagTT
GTGTTCAAATTCAGAGATGGTTTGACTATCGTAATCATATATGCATTgta
agtactagcttttggtcaattatactagtttctaaaagtcagttaccaat
tttgagaatgcatttttttgcagatgtgttggttgtttataatatttatc
ttattggccctcccagtacattgtatggaatgagagacttgtctatagaa
attatgataaatcgcctgtagtatagtctttttgttcttcgctcgtgaag
gttccctcatatggtcacaaattaaaaatattctttattcaatggatatc
aatgattttgtcagGTTTTTGAGAAGCTTGGGCCAAGCTTGTATGATTTT
CTAAAGAGAAATAGATACCAACCTTTCCCTGTGGAACTTGTGCGGGAGTT
TGGACGGCAACTGTTGGAATCTGTAGCATgtgtgttagttattgcttgat
tctgcacatcaattattagctaatttggtagcattggcttcacttttcta
ttgcttcaaaggggcagtgttaaccatcttttcttttgtaacagATATGC
ATGAGTTGCGGCTTATCCACACTGATCTGAAGCCAGAAAACATACTACTT
GTCTCTTCTGAGTATATAAAAGTTCCAAGTACAAAGgtgcctattctcac
ccactcttttagtgattgtctgctatttctttactaggaatacagtattt
gcagttttcttgaatgattgcttggggttgtacagAAGAATTCGCAAGGT
GAAATGCATTTCAAGTGCTTGCCGAAGTCCAGTGCCATAAAGCTGATAGA
TTTTGGCAGTACCGCCTTTGATAATCGGAACCATAACTCGATTGTTTCTA
CGAGGCATTATAGGGCACCTGAAATAATATTAGgtaaattggttattttt
gaagatttgattcagttttatcttgtttgagatatggactatgataattc
agtgtgaagatttatgtgctcttaaatatgctcgatagagtaaaacagac
atcgtccaaaatcttagcaaagtcttggcacatcctgattcatactctta
tttactcctaaatgatgtgttttaatcaaatctgaaagaaacaggtggta
cttcttcctctttgatggtttggtgttttctgcacatatttagttttaca
gttgatggttgagcgatatcgcactgtattgttgaaattgattgaactga
gttctatgcttgtgtgcagaataagagttagatcttttatactatttcgt
tagtggccatgtcagctatcatgatagttgctaaaggtttcttctccaat
tctgtatgtcattagttctgatgttggatgtttcttttgcagGTTTAGGC
TGGAGTTTTCCATGTGATATTTGGAGTGTTGGCTGTATTCTTGTTGAGCT
ATGCTCTGTTAGATCTCTTCACACACTTTATTTTTCTAGTACCAGTCTGg
taattgttttcctaacatcacattttttagGGGGAAGCATTGTTTCAGAC
ACACGAGAATCTGGAACACCTAGCAATGATGGAGAGGGTTTTGGGACCTC
TACCAGAGGATATGATACGGAAAGCAAGgtgcgttacttgcaatcttaat
ttgcgtattattttatcttttgaatgaaatgtatgtgttcaaatcagCTT
TTGA(End of Gene)agcatatctttatttgcttgatgttcaaagttctttttttgatggg
gctaggatttttggttttactgtgctttcttgcctatttattgaacattc
atttgtttaagcattttgagagaatgatgcatttgattttgctgcattag
agtatcagagttacattttgaggaaataacactttcttctctatttattg
aacgtttctaatctgttacacttcattatgtttcatgattagcagagtat
cagtgtacataactttcatattgtttcattattctattttgtttattgcc
agtttgccacttgaattttcaaagctaatagctgccttttatttattcta
ttttgtttattgtaccttgattttcgaagttaataactgcctttatggcc
actactaatactaacttgccctccctaccccttttttttgtttattgttc
cttgtatttggtaattttgcatgtgtattgcctgcttgttatatttaatt
tgtggcattgaaatttcttgggaactcttgtaatcttttttttgtttctt
tccttcatttttaatattcattcgttgtgcttacatgtataccacctcaa
atgtatagctcttcagctcagaaatattttaggagagcaacacgattaaa
ttggcctgaaggtgctgtttcaagagaaagcatcagagctgtgaggaaac
tggatcgactaaaggttttagttctctcgtttcttctactttggccatgc
ttatctatcaacttaccacgctttgacttaagaaattttaatattttctt
gtctttgatctgattatattgtctcctagatttcaaggtagttacggatc
aaggttagaagttggggtttgcgcttgtgattagttgtagtaattagcta
tattgtttggtatacctgctagtatttttttacattaatgtagaataaca
tttattagctttcttttggaaggaataaattgatttaaataaaatgtaat
tgtgcatagacaaaaatgctatccttcatgaaagcattagaagatgccaa
actttattttgggagacaccaatttgcttaataatacattataaaacggg
tttgaagaggacaacctgcgactaactcgggcataaaataagttcggctt
gttggaggcagttaggaaagcataagagttaaatttttctgtttcattca
gtgaattgctagggctgtcttcaaaatttcttaataaataataataggtg
ctatctagtagcgctgctagtaatttcaaattttaaactaacatatgatg
tgaaacgtacctcaaaatgagtaagaagctatggaaacatgaggattgac
cttacagatcctgaaacttatgaaccacaactacaggacttggtatcgag
gaacgctgaccattcaaaggtggcactggtggacttgctatacggtctcc
tacggttcgagccatcggaacgcctgacagcagaagaagctctggaccat
ccattcttcaggaacccaacatgacttgttgagcaggcttgactcatgta
actctactgaaggtggtggtaggaagcaggcagagccgtccccattgcaa
agctgaaggcatcttgttcctctataccagagagtttgtgtaaggcctgt
caatggggcttcttgtggtgataaaactgatggctttagtcacccaatgt
catgtacactagcatttttcctagggagcttagttcgccagacggcgctt
gtatgttgttaaccgggatacactatgagctgtgccccgatgccctggtg
ctataatccaatcgcaaatgatgaaatcaagcatagcgtgcgtccgtata
gtttatgattgaattgtaccttggaatgattgccctacattcgttcattc
ctacctagcgttgttacttcgatggttctggggtgctgttgataggtgca
cacggcttgggatttgttgcatcccaactgccgtgtcctgttgctaatac
ttgtgctgctgctgaagtgctgggtgggtttgaattctgtcgtgcctaca
atgcatgtggtataggaaatagtgtagtagtacctcgctgcttggtacca
gaagcggatggcttctacttctcactctggcccactactatagaaatgga
cactgtaccgtcagaatcggatggctttaagccagccgctctgctggcgt
ttttttgagataaaatactgtgcacactgtcagccggttcggctgttagc
tgccagacggaatattcacaccctatacggctagcttcaagggaagtacc
tggttggattgtgtactcgaccacgtactagctatagtttttctgcctcg
actgctaaataattggacacatcaagttgccaaccaataggtaatgatgc
ttaacatggcctctggtacattaaagcctgcacgtatactctttcgagaa
cttgatttggcacggtcttagcacctggatgattaaataaaatcacgtac
gactacgaaagagacatctaaattataaactatcaaaacggtggaagata
actttgaacaaaccagaaatgaaataaaagagaaaggggaccacacccta
gatgttaaacactagattttgcacctggacggtccgaccgtgtaaccagg
acggtccacgggatagttcggacggctcgcaattagattaattcaagctg
aattctttaccccttgcgtgattgtcctaatgaatctcgtgggaagttgt
tggaaaccgcctaggaacgggacctccccactatatatatgaagggatac
gatcgattcgaacacacaccaatcgatttatcaatctaaatttaccttat
gtattaggagtagaaataatttatccttagctttagtcttcctcaacctc
aattttcgtctctcttcagctctatatcgtctagagacgtcttgagtggc
ctgccgatcccaagaaaacctaggatctttcctcctcgatggggtccctc
ctgagatacgagatcttggttatgcaggagattcacaaatccctctacgt
cctcacggacagtccgccgcctccacaaggacggtttgcgcacccgcaga
gaagagcagggctctctgcgcagagtcacagatggttcgatgacctaccg
cggacgttccacgctcttccagagagattcccaaggtcttgttccgcgca
agcggggtctagggattattggtgtttggtcccaaaaaaacgcgaacaca
atttttggcatctccgctggggacgactgcgcttcagatctattagatcg
accatgatggtcgatttcaaagattgtaccaacatcttcccaagtaacat
cgttaggccaactatggagagtctatcggtcgaggaggagctatggttcc
atgatctcatgaagcagtggatgacgacgcaccgcgacaacgcgcttggg
tacgcgaagaagcgaaggagaaattcttgtcacactccgcgatggatcga
cactagaagattatcaagtaatgggaaataaagctcgactccctcgtacc
tttgctcaatgatagcaatgtaagtaattccagtgatgatcaatctatta
tgcattttgtggaattacaacaagatcagttagaacaaacgcatagggag
atagaagagacactaaaaaaactcacacacacatttgaaaaatctactat
tcccagttttccatcccatgatgtcgcaatagaggcaacctcgctggatg
catcatcaacaaataggcctacacaatcccaacaattatatggtatgacg
gtaaactcatattaaggatagccgctacctccacaacacctaattgaccc
aacaactcctctcgccatggtcagactggccgagtataaccagttggccc
catatgggaccgttctatagcccaccgtggaacgtctgatgacctaagaa