(1-50343) Christopher Dugard acttcatcaacatcgacctccgactccgatgcgccgtcagtaacttctaagcgccaagagcgcaagaagtttagtaagatccccttacactatcctcat acatctagacatactccattactttccgttccattaggc >gene1 Best maize hit: AY664414.1 Zea mays cultivar B73 locus 9008 Best non-maize hits: HM596859.1 Zea diploperennis bio-material USDA:GRIN:PI 462368 bz locus, partial sequence HM596856.1 Zea luxurians bio-material USDA:GRIN:PI 1441933 bz locus, partial sequence AC243120.1 Gossypium raimondii clone GR__Ba0131I15-hog, complete sequence AAACCACCAACTTTTGATGGCGAAGATTATGCTAGGTGGAGTGATTTAATGAAATTTCATCTAACCTCACTCCAC AAAAGTATATGGAATGTTGTTGAGTTTGGAGCACAGGTACCATCTATAGGGGATGAAGACTATGATACGGACG AAGTGGCCCAAATCGAGCACTTCAACTCCCAAGCTACAACCATACTCCTCGCCTCTCTAAGCAAGGAGGAATACA ACAAGGTGCAAGGGTTGAAGAATGCAAAGGAAATTTGGGACCTTCTCAAAACCGCGCACGAGGGTGATGAACT CACCAAGATCACCATGCGGGAAACGATCGAGGGGGAGCTCGGTCGCTTTCGTCTTCGTCAAGGAGAGGAGCCA CAAGATATGTACAACCGGCTCAAAACCTTGGTGAACCAAGTGCGCAACATCGGGAGTAAGAAGTGGGACGACC ACGAGGTGGTTAAGGTTATTCTAAGATCTCTTATTTTCCTTAACCCCACTCAAGTTCAATTAATTCGTGGTAATCC TAGATACCCACTAATGACTCCCGAGGAAGTAATCGGGAATTTTGTGAGCTTTGAATGTATGATTAAAGGATCAA AGAAGATCAACGAGCTTGACGAACCCTCCACGTCCGAAGCACAACCGGTGGCATTTAAGGCGACGGAGGAGAA GAAGGAGGAGTCTACACCAAGTAGACAACCAATCGACGCCTCCAAGCTCGACAATGAGGAAATGTCGCTCGTC ATCAAGAGCTTCCGCCAAATCCTCAAACAAAGGAGAGGGAAAGACTACAAGTCCCACTCCAAGAAGGTTTGCTA CAAATGTGGTAAGCCCGGTCATTTTATTGCAAAATGTCCTATATCTAGTGATAGTGACCGGGGTGACGACAAGA AGGGTAGAAGGAAGGAGAAGAAGAAATACTACAAGAAGAGGGGCGGCGATGCCCATGTTTGTCGGGAGTGG GACTCCGACGAGAGCTCCACCAACTCCTCCGACAACGAGGACGCCGCCAACATCGCCGTCACCAAGGGACTCCT CTTCCCCAACGTCGGCCACAAGTGCCTCATGGCAAAGGACGGCAAAAAGAAGAAGGTTAAATCTAAATCCTCCA CTAGATATGAGTCCTCTAGTGATGATAATGCTAGTGATGAGGAAGATAATTTGCGTACCCTTTGTGCCAACCTTA ACATGGAACAAAAGGAAAAATTAAATGAATTAATTAGTGCTATTCATGAAAAGGATGACCTTTTGGATTCTCAA GAGGACTTCCTAATTAAGGAGAACAAGAAACATGCTAAGGTTAAAAATGCTTATGCTCTAGAAATTGAGAAAT GTGAAAAATTATCTAGTGAGCTAAGCACTTGCCATGAGACAATAGACAACCTTAGAAATGAAAATGCTAATTTG 
TTAGCTAAGGTTGATTCTCATGTTTGTAATGTTTCAATTACCAATTCTAGAAATAATGATGATGATTTACTTGCTA GAATTGAAGAATTGAACATTTCTCTTGCTAGCCTTAGGATTGAAAATGAAAAATTGCTTGCTAAGGCTAAAGAT TTTGATGTTTGCAATGCTACTATTTCCGACCTTAGAACTAAGAATGACATGTTACAAGCTAAGGTTGTAGAATTA AAATCTTGCAAACTCTCTACATCTATTGTTGAGCATGTATCTATTTGTACTAGATGTAGAGATGTTGATATTAATG TTATTCATGATCACATATCTTTAATTAAACAACAAAATGATCATATAGCAAAATTAGATGCTAAAATTGCCGAGC ATAACTTAGAAAATGAAAAATTTAAATTTGCTAGAAGTATGCTCTATAGTGGGAGACGCCCTGGCATCAAGGAT GGCATTGGCTTCCAAAGGGGAGACGATGTCAAACTTAATGCCCCTCCTAAAAGATTGTCCAACTTTGTTAAGGG CAAAGCTCCCATGCCTCAGGATAATGAGGGTTACATTTTATACCCTGCCGACTATCCCGAGGACAAAATTAGGA GAATTCATTCTAGGAAGTCTCACTCTGGCCCTAATCATGCTTTTATGTATAAGGGTGAGACATCTAGCTCTAGGC AACCAACTCGTGCTAAGTTGCCTAAGAAGAAAACTCCTAGTGCATCAAATGAACATAGCATTTCATTTAAAACTT TTGATGCATCTTATGTGTTGACTAACAAATCCGGCAAAgtagttgccaaatatgttgggggcaaacacaaggggtcaaagacttgt gtttgggtacccaaagttcttgtatctaatgccaaaggacccaaaaccatttgggtacctaaagtcaagaactaaacttgttttgtagGTTTATGCA TCCGGGGGCTCAAGTTGGATACTCGACAGCGGATGCACAAACCATATGACAGGGGAGAAAAGGATGTTCTCCT CCTACGAGAAAAATCAAGATCCCCAACGAGCGATCACATTCGGGGATGGAAACCAAggtttggtcaaaggattgggtaa aattgctatatcccctgaccattccatttccaatgtttttcttgtagattcattagattacaatttgctttctgtatctcaattatgcaaaatgggctacaact gtctctttactgatgtaGGTGTCACTGTCTTTAGAAGAAGTGATGATTCAATATCATTTAAGGGTGTGTTAGAGGGTCA GCTATACTTAGTAGATTTTGATAGAGCTGAACTCGACACATGCTTAATTGCTAAGACTAACATGGGTTGGCTCTG GCACCGCCGACTAGCCCATGTTGGGATGAAGAATCTTCATAAGCTTCTAAAGGGAGAACACATTTTAGGACTAA CAAATGTTCATTTTGAGAAAGACAGGATTTGTAGCGCATGCCAAGCAGGAAAGCAAGTTGGTGCCCATCATCCA CATAAGAAAATCATGACGACCGACAGGCCGCTTGAGCTACTCCACATGGATCTATTCGGCCCGATTGCTTACATA AGCATCGGCGGGAGtaagtattgtcttgtaatagtggatgattattctcgcttcacttgggtgttctttttgcaggAAAAATCTCAAACCCA AGAGACCTTAAAGGGATTCTTGAGACGGGCTCAAAATGA gttcggcttaaggatcaagaaaattagaagcgacaacgggacggagttcaagaactctcaaatcgaaggcttccttaaggaggagggcatcaagca tgagttctcttctgtaatactcaaaattgtgtacaaggaatatatagtgttttcctcatttgatgtgtctcatttgcatcataaaaagagcatacctgaaat 
ttagagtttaattcaaacaaatgacaaaaaatgcatcatgttggagtttatatgtttgtgcattaaataaaataataatgttaatggtaataagataatg attgctagaagttgaattaaaccctaaattaaaactagggttttcaaaaataagagaagaaaaaagggtataaaaatataaataatatactattatct ccaagatttatattctcatttcacaatatgattggaggataaaaatctatacaatgttcgaatttaaaattcaaattcaaaccaaattttgaaatgaaga aagaaaatagaaaaaaataaaaaggaaaaagaataaaagcctcatgggccgccaaaccatcatttcggcccgctccagcttcccacccgcgcagcc cacacctgaagctcgcgccgacgtgtgggtcccgcttgtcagccgcttttacttcgcgcgcgcgctgggtgactctctgtctggtgggcccggggcgcca gactcttcgtccacctccgtaacaaacgcgtgagaatcgaccgcgcctgagcaccgtaatctccggaattcggttgcgattggacctagtctcgcgggg aaagtgggggcataagaagacccggcgtcgcccatcctctgccctcgctaatcttctcaaaatcgcacgcgcacagaaaacccagagcctcgcccat gctcgagtccgccgtcgccggcttgtgcttgcaacgccacttgatgtccgggaggtagttagggagctgcgcgagcgcactaggaaggcatagcaatc ctcaatttggtggtcgggacctcggagggtggttgatttcgcgtcgtcgctggaacaccgccgcggcgccgcatcgtgccgcggacccagctctgcaca gctcaaacataggtaagaaaaccctaggccatttcgctttgatctcagtaccgtgtagcgcgtttcaatttgcgagttggggcacgggttcgccggattg gatgaccacggcggagcgccgccgtggggatcgcggcgctgcgggggtcctaggtcccgaggtgggggaagagcgtcgggaccgtccgatctgggc gaacggctgcgattagaagttagcgtaccccttcgccactttggtcgggtaccgttgatctcgaatctaacggtggtggttagatctggtttcataaactc tgggccgtcggatcccgatccggcggttggtaacgcataccggttcggggttgagtgatctaatctgagccctcagattctaatcgtgcggcccacatct acggataccccttcgcagggactttttgcttaagaaaccctggatttattagaaaccaacccgcggtccacaggaacagtccacagagtcttggaaac acttgcaccgaaacccctgagttttctgggattcgaggcctagttcaggggatattaaaaaccgagaaaatgattatagaatttagtttttaatacaaa aacaattctagaaacttgtaaaattcatagaaattccatttttgacccaaattgagccattccaatttctaaaattttgtaataatattctctaccatttag agtcactgttttgacatgaactctgatagaacattaatttaacatttaatcctaatttaatcacattaaaccttaggaaattcataacttgaaacctataa ctccaaatttagtgattccagttcctatgatctcattttaatgtatagatttttactgtatattttattcatctgtttggtgtgatgttgattttggctatactat gtatgtattgtgttgatgcgaatagacgagcaagccactgtggaaactgaggttcaacaagtagaagtagctgagcaggagctcattgaaggcaagt 
tgtgcccttgatcacttacttttcccaaccatgttcttattaattttaatgatctgcataggtaaattttgatgggagactttatgttaccccagttttgatta cctctataccttgttcacccctgaaataattttggggagggcgatcagagtgcttttgtggatgggtatggagttacactacacatgattacaatgatatt atcttaatattacactggtcatgttaagatcattaaattaatgggaacatggagcgacaaccgggtaaaacagtggtacctcaagggtataatgggac ggccttggctgggtaattaggaaagctagtggaagactaccttacccgaaaggggcaagggcagtaggggagaggtcagtgcggggaggtccctgg ttgattttgctgggatggctgtcagccaggaaccctgaactggatcttcctataaactgtagcgggttttcggaagctagtggaactttgtaaaggcctc gtagtggatccctagccattcacctcggtagtgtctaaggtccttgcaaacccaggcgacatgggatacacgacttgtgggtaaagatgcgcaacctct gcagagtgtaaaactagtatactagccgtgctcacggtcaagagcggctcggaccctcacatgattaattatggaacttaaatttaatttgacattgca tcgcatttgggattattttactattactgttctttattattattaaggtttggtatttacttacacttagtaattgctaataaaattttgaccaacttataaaa gcaatgctcagcctcaacctctatttcattgatcagccttacactacatgaactcccacctttggtgagttcatgccacattattccccacgacttgttgag ctatgaacgtatgtgagctcactcttgctgtctcacacccccccacaggagaagatcaggtggtcgaagaggagctgcctaacactgaggagttcgat ctgatctaggtggcgtgtctcggtcgacattggcgccgacgatccttagttcattttatacttattgttttcttttgtaataagacttccgctatgtaataaa tactctgatgtattatgacatttatctctatacactctgttattatatatgttgtcttcttggcgcatgtatgagatgcacccggctttgtcctttaaaaccgg gtgttacatcttctccctacacccctcaacaaaatggtgtagtggagaggaagaatagaactcttttggacatggcgagaaccatgcttgatgatacag acttcggatcggtttgggcgagcggtcacacgctgctacgcatcacgctatatctacatcgatctcagaagacatctatgactctaccggtaaaagcca tattcataatttagaagtctgnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnngatgaaattcaaagctttgaattcgtataattaggtatatgtggagtacttcagtg aaccaatggactacaaaactgagccgagctaagaattgtggacattattatcacatgaaatactccttggtttactaattactaagagaatgttgtcaa ggaataacaacaaataccacaattagtgctattgggtataatgatatatttgctcaaataaatgattcatgccccgtatcctttccagttcaaggaaag aaggaaactcaaatgcaatttagaccactgtaagatgaaccaataaagttattctgctacctagcaccatggtgtatctaatcaataaaaaacaagga tgatggttttatttataaagtggtcaactatattattttaaacaaattcagatcaggcaaggacttgactttgacacacaaaatgagtatttaccttacta 
catcatctgaggagcttaaattttttaccacgttagccaataggctttggcagaagttaaaatgtggggcagtatctaataaagcatccatgcaacgga caaccagagtatgatagtgggg >gene2 (reverse strand) Best maize hit: NM_001158676.1Zea mays LOC100285786 (si618065b11), mRNA >gb|EU973949.1| Zea mays clone 424851 nuclear ribonuclease Z mRNA, complete cds Best non-maize hits: XM_002453594.1Sorghum bicolor hypothetical protein, mRNA XM_002436496.1Sorghum bicolor hypothetical protein, mRNA NM_001052815.1Oryza sativa Japonica Group Os02g0214300 (Os02g0214300) mRNA, complete cds XM_003571604.1PREDICTED: Brachypodium distachyon nuclear ribonuclease Z-like (LOC100834210), mRNA TTACTTCTCCAACGATATCAATTTCTGTAGGTAAGCctaaaaattcacaaaagaaatccagctccaaccattgtttgattttctttataa gcatgtatgtcagcagggagaaaaccaattgaaaagtctacagtggtgcagagttgggacagagcgaggttggtttacccacCTTAAGCAGTG GGCGCAGCCACTCTTTGCACTTGTAAGAGCTGTGCAAGAACGCGATAGCTGGTTTTCTTCTAGGGACGGGGTCG TCAAGCTTCAGCAGAAGTACACGTACTCGTCCTTGTTCACCTAGctgctttgaaaggaaaaaagtagtgatagtttctggttaga aaaatagaggggaaaataaaaaccagcagtaaggcgttgactcggtagtttagtgtggggaatacatgtgagtctggatctgagaacctagaaaaa aggatgtcgactatcaagggtgcacccaagtattcagcagagggatgcaatggagtctgcatgtatgtggagtaagagtccactattatattcttgctc tagatctcgtgaggctgcttagatgcgattggctcctgtaggtcgtaggtgacacgatctggctgccggttacCTCAGTGATTAGGTTAAAAT TTTCTTCAACCAGCTTCTCCACCCAGTTTTCCACCTCTTTACTGGGGCACGACTCCATTGGATTTGCctgagattaccac agacaatcaaccctgtaggttcaaccattttagcagcacattcatttaccaagagtccaagatactgaatgtgaaccaaagccctttcagtttacacga taatttgaagaacatagaaagagatgcagacctcgacatttaggatagatgacaccaaaaagaaagggtagcaatatccaacgttaacatcaacaa ggtacaaaaagttttggcatacaattctacttaatttgtctaacaagtcattcaaattgcataacaatatcacatgaaaatgtgacgagacagaacaag aaaaaaacctggaagaaaacagtctctgcatgcccatagagattgcggcggacagtcttggttctaggaagatttgggtcagaagctacagacCTG TGTTCGCTCCCTAGCAGTGTGGTGCACCTCGTTTGGATAGACACAATAATTGTCAGAGACCTCctgaatcaaaccagg aatagtattgttaacttgtaagcaaaataaaggatgtttgagggatggcaaaataaatgatgtttgagacatgctcttgtgctttgcaactttgggttgg 
aagacatctgatccatcatggaaaatcttatggtggtgaagttgtggtagctttattctctgtaattctgagcggcctgtaagcgactttattacacagat gttttgctctatataaattgttctgtcatgtgtatccttatctatgttttttttcaaaatttgtagtggtttgaatcaagcagctaccaacttctattcattcga ccaagggcgcattgctgcttatatagactatatgagatgattagtcgttcaacttccagcaccaatcaagaagggacaaccttaccccaggtgcaagg gaatattgaatttcgaaatgtgtactttagttacctgtctcgtcctgaaatcccaatattaagtggtttcttcctcactgtaccgacgagaaaaactgtgg cacttgttggtagaaatggctcaggaaaagcagtgtaattcctctgatggagagattctatgacccaacattaggtaagagcgttgagaagcatctttt gttttccacttctatttccactcaaaagactactattatttttcaggtgaagttcttttggatggggaaaacataaaaaaattgaaagtagaatggttaag aagccaaattcgtctggtgacacaagaaccagccttattgagtttgagcatttaggaaaatattgcttatggaagatctgctacctttgatcagataga agaggcagcaaaaacagcccatgcccatgggttcatcagttcacttgaaaagggggacgaaacccaggtaaaggaggttaccatatatcttattgttt atcaaggctagactgcttaggcattgaatctctggtactcaccttttgaaattctgtaggttggtcgagctggtctaacactcactgatgaacataaaat caaaatttctattgctcgtgctgtgctttcgaatccatccattattttgcttgatgaggtcaccggaggacttgattttgaagctgagaaggctgtacaag aagcattagatgttctgatgttgggaaggtcaaccattataattgctagacgtctttgcctcattaaaaatgctgattatatagctgtaatggaggaggg ccatcttgttgaaatggggacacatgatgaacttctaaacttggatggccatactcccttcaacaagttactaaaccatgcttcaatttttgaactccaa agccaccaagccagacagacCTCCACACCTGATAGCTTGAGCCGCTTGATCTCGCTCCCTGGTAGGCCAATGAACTCCT GCTTAAGCTTCTGCTTCACCGAGTATATCACATACCCctgcatacacacaatttaattgagactccatccaaccattgcaggattata acagatggattcacaacatacacagttccaagaagtgaagtgaaccaattcatctcacCTGGCTGGGTATGGCGTGATAGGTCCTGAA AGCCCTGACCTTGAGGTCCCTCCTGAGCTCGTACTCCTCCCTGACCTCCAGGGGGACCAGGTTGTGGTTGAGCTC AGACTGGTCCAT ggcgcggcgatgcacctcaaatagccgctccacgaggtcccagaggcacgctgggacgaagatggtgggcggacgcagcctgaagagcccccgcta gccacatacatgggaaggcccccgatgtggtcgaggtgcccgtgagagacgaagaggaactcctgcgagacggcgcactgcgggcatcggccgatg tcgaaggcgaggctcagcgtcgggaagatgacgcaggtttcttgaccgccgatggagacgaggaagcgggaataggagaggagctgcaggacgca acctccatagtgtactcgtcgcagcttttcgccattccggcgccgaggagagaggaggggggcttaagggtcgcggtgggctgtgcgggcgattcccg 
cgcgcgtagggagggaatcgcggcggcgggggtcgtgcacgtgatccgcgggggagggaatcgcggcggcggggggtcgtgcgcgtgatccgcgg ggggagggcaatcgcggggggcgtgatccgcggcagggagggggagcgggggcagtctacagtgcacccttaatagttagtagagataacacaca tatgtacataagttattgtgttattatacggttccgttgcaacgcatgggcactcacctagttatatatatgaggtcagtatattgggagcctaaagcttc caatagttacaggaagcccagccaccccctggtccacacccgacaaccaggcacatgcggtcgggtgcgtcccccgccatgcaccaggtcaaacctc cccacgcgtgtaagtcaaaccccggtccaacacccaacctccgcgtgggacgcacaacataaattgtagttctgttttttgcgcaaatagagtaaatag ttcacagaaataacgtaaatattttgtagagatagcataaaaactccgcccaactactggaccaacaccacatgtgcagacacatgcatgcagatcct atttttatgatagatccaataattatggcagatcatggaagtttctattgcatgcgtgtaaattttggatgacattttcgtttccactgcatacgcgcaaaa attggattgatccataaatataggatcgctccaacagccatacagctaggcatgttagaacccaatttatgatatatactaatattaattattctacaca gtgtagatattttatgtattcaatatactacgtaaaaaattgttttctgttgcatgcgcacaaaatttaggatgacaggaatttgcgatgaaagaaaatg gattgacatgcatgggaaaacaaaaaaatcttattcaaatgttttatttcctccataaatatatgatcgctccaacacccatgcaggtagactcgttaga accagtttatgatatatattgatactaattatgttacacggtgtagatacttatgcaatacgtacactacataaatatatggcatgtggtttactggtgga cacatcttcatattcgagcgtaaaaaccagcagttatgagcgtaaatactgttccaacatgaaagctattgaaaacctcttgatgttggcatagatatgt atacaaagtgacataaaaatatgtacgcgattgtatgtcatcactccaccgcgaacatgcattcaggcgcgtcccccaacagacgcatgcatgcaatc ctatttttatggtagatctggaaaatatggcaaccactaggcacatgcatgaggttcctaatttcatgcagatccaaaaattatgacaggtcgtgagag tatccattgcatgcgcacaattttgaacgacattttcgttcctacagcatacacacaaaaactggatggctggtatcacacaaagcataaatcattata gaaaattgcataaatatatacaacgtacatcataatcataattaataaagaaggctcaaccaacagccgcatgcataaacaaaggattgcaggtcat caccggaggacgtcaccgaggccaccagatcggaagagttcgtaaacgaaggatgcccaaaccgtcgtgtagaccgccaaccctcgcagctcgcgt ctcgcgtcaccgatgtcgccgctggagccacccgtgcctagcgtggcaaccatcgctagagacggacgcgcctcatggggtgccaccgttgctcggtc gtcagggtcaccaccgctcgcccgcctcgggccccatcatcgattgccctacgtattagggagctgccggcaccacgcctccgatttgggaaccaccgc cgccgcacctccaaatcgggtgtaacaccccaaaattttattttgggcacttttcaagataatacaaacatcttgaaacaaaatatctttaataggattt 
tcctcactttgtaagccacgtctcctgaaataaatataactcatagtaaaggatcaaatgaacagatcttatataagttaaattattcttttaaataatat atttagagatattctccaaataggttaagttaaaacatctcttactgaaaattttacactatgaatctttattcataaaataattatttccttgtgaggagc tacatatctaaaagccctttcttagataggataataaattaattaattaatcaaatgatatatgcatctcatgctaggtcttattttgtggtgcatctcgaa tttgaactctgaatccaaattaaaattttgaatttgagttagtacataaaatagaaaatgaaagaaaggaaaatagaaaatgaaattagaaaagagg aaacccacatatgggccggttttctcacttctcggcccacctctcccgcacggcccgcttcccttgtccacgctgcgatgaaagaagtgggaggagaag tatcagacttgccctaggaaaaaagttgggggcactcagacaaagacaccgaccttgcttcatgacccacacaacgtgacattccaacaaccagcga tttgtcgaccaacccgtctgcgaggtctattcaaaaaggagaccgaagagaaagggggctaagagtgatcgacaagtgaacgaccatccggtgcttc taaaggagggcgaaggtgctcgactcttccccaagaagctacatgttgcatcaaaggtagccttcatcgccttatgtgaagggtatctaggcgctaag cccaacctcactttatggcaatacttcttttgtgtcgagctgttgaggaagaagaaggagcgaggaataatggaatcatggtcgatcaggtgcgccagt atccgccttcacagcggaaggtcacgtgagtatatcccaacccccttttccttatataacaagggttggcacatgtgatggttctacctgcgagacatcg atcgagagccctcgttggggttaccgcttcttagcatcgatcggaatgtcccgttcgaggctccctaaactagagccacgatgtcacgcctgaagagaa agttcgaatccagggacatctagcgtgggtgcatatcctctaagagaaaggcgtcacaagagcggacattatctataaaacctctggtgttacgagaa ctaaaactttagcatgacatcatatgcactgcattatattatgaatgacttacccaaaatgcattcactaggctaaaatttcaaaacaaaaacatgtga tgtgatgcttggttgagtatacggtctagcaaggggattcttaaccctatgtaggaatgaatctcttcaagggctagaagcatgtgaccttatgatttga gcacaaaggagtgatcttaccaagcttggtagccatgtttaagaaataatgaattggagaggttaaacacttaatgtagcttaggatactcaaaactct aatttggagccccaaaaccctaattagtgccctatagggtgatttcaaattctatgcactttttgccaaaagtgtatatatcaaagtggtagagtaacaa aaaccaagaaacttttatatttggaggtttccaagttttgtggtgaaacttggagtaatttgtaaaagttcataagcacctctagtcttgaactctgaaaa tagtgttgatgagccaactttgggctcttgtatcacttaatctataagggattttttggaagtcattatggcgaagttgtagagctactatagtagtccaa attttattaagtgacttagcccaaaacctatatgaaatttggagaaaaatgcccttgaagtcgggctgtcagtcgaagggaagtctgaacctagactga 
cagtggcatgagcctaactttggagccattttatgcttgatccattgagcaaatgaccaagatccttatgacaaagttgaagctgatatataggagaac aactttgatgcagtggggttggtagttcactgcataaaaatcgaagaaaaaggtccctaaagtcgctttgtcaggcgccctgaattgggtgtggcatgg aacaataattgaatcatatcccaggttcactgaacccgtcgtcggcgtcctcacgtcgtgatccgagcttggcttgaagattcgtgtgaaggaacaaat atccgtggtgaaaagatcatcgggcagggtttggattggctcacgggcggaatcattcgtcgtacccacttgcacggcgacgtggccgagcagccatc gtcgtggtggttgcagtctcccgctgttacgccatgccgaggtccattttaccacgtcatcgtggtcgtggcgagtagggaattattgccctgaaccgtta caccgtcttgagccaccatgaccaagttccatgaaagctcttgatgccgtcgctggcatactcattcaaaattcacgacatgtgatgctggttcctttaa attgaatgcttccctcgctccaccctcttgttacgaacccagcgcaccacccctcactcccgatcatcaaccatagcttgagaatttcgtatttccctgga atcggttggaacaccatcgtgaaacacctcgccgtggccagcccttttcaggttagttcatgtccctctagtccccattttagtacctctctgacctctaga tgattacggtttcgaccaattgaactctaccgcgttatagccctgggaacgtcgtgttgtcgctcgggaatgctgtcgtggacacccctcacgtcgatcg gctcttccgggtcttctccgtccaaattcacatgatcaccatagttctagtgagtcactgatcccttccgggtcgtttgattgaactctaccggcttgtgta gtccaaattcgccggagctcggattctgctattgtcatgggcacacatgaattagggttggattattcaaattaacttaagagaagtgctcaggtagcct ctgggacg >gene3 Best maize hits: NM_001148386.1 Zea mays uncharacterized LOC100274000 (LOC100274000), mRNA >gb|BT042177.1| Zea mays full-length cDNA clone ZM_BFb0153F14 mRNA, complete cds EU966203.1 Zea mays clone 292556 single myb histone 1 mRNA, complete cds Zea mays single myb histone 1 (Smh1) mRNA, complete cds AY271659.1 Best non-maize hits: XM_002455819.1 Sorghum bicolor hypothetical protein, mRNA XM_003569238.1 PREDICTED: Brachypodium distachyon uncharacterized LOC100830626 (LOC100830626), mRNA AK365152.1 Hordeum vulgare subsp. 
vulgare mRNA for predicted protein, complete cds, clone: NIASHv2031H21 CT830863.1 Oryza sativa (indica cultivar-group) cDNA clone:OSIGCRA110E03, full insert sequence JF951953.1 Aegilops tauschii clone TaMYB70 MYB-related protein mRNA, complete cds AK334835.1 Triticum aestivum cDNA, clone: WT011_D14, cultivar: Chinese Spring NM_001049977.1 Oryza sativa Japonica Group Os01g0589300 (Os01g0589300) mRNA, complete cds AJ495788.1 Oryza sativa Japonica Group partial mRNA for MYB19 protein ATGTCCGTGTACTTTCTCCTGCTAGGGAGGGTCACCGGCGACGAGAATGCACGGCTCTGGCCAggtgcttgaccggca catggtcggggacccggagttgtaaagaccagagaaggggaaggcttatgtgaagttgtcagtgacacattggaacagtgccacagactctttattg gttgtttaaaacatctaggatctttgcgcaaaatcgtcagcgctggcgcaggcacaagcgtgtttccctgttcgtggaccggtttgggctagattcagcc caatcctgttcattattttctcttttccttttcctggtaacttgtgaaatacataaaaaatagtagaaaaatgattaaaacatggaccaattttactagact ccaagaaatatgtagtatttaataaaaatacttctaagatttttaattcaaattataaaatgtatagcgtaaggtacttgagcatagcttcttagaatttt agaaatattttaataatcccaaaaatcctaaaactttttgggtagacttaaaatgtttgttttgaaccttgagtaaagtttgaacttatttgaacactgttt gactaggaaaccaataataagcccaaagaattaaaccctttaatacctagggagtctaggagtagttttgtaagtagaacatgaaaatcatcactttg ctataatttggtagcccaagaagattagttgaactaatcctaaaatattatctagctaaaagtcttggtagttaataggcttcatgaaggaaaattgtat tcaagacacaacttactatcttcttgttataaatttgcatatccatgacacaacatatcgtaggatgtgaataaggccaacatatcatcatatgaggggt ttggtcccacatggatcacgatctatcatattattttgtcAGCCTCGGGAGGCAGGGCCGACCTGTGCCTCTCGCGCATGGAGAG GCTGAGTCCACGCAGAGCCTCCGAGACCGCGTCGGGGCTCTCGAGCGGAGGAGGccaggtaggcccaccctgaccgcat cgccctcccggttgatgatgaaatcgagcttcccgaagcggatgccactctagggtgtagtcaacgcggggaccgccactggccctcatggtcgcgcc aaggttcctcggcggtggaggctctcgaaatgaatcacggcttcggggatgcgattggtgtcacgcctggccatccagagcttgaagagctttggcac ggaaagcgcaagtcccctacctggcgtgccaactgtcggtctttcaaacctcgctccgaaaagtaaatttattgtgcgcattccatgctctggacggttt gcgaggtgtgcacaagatttatactagtttgggcagcacctccctacatatagtcatcggcggcttgcgctatcgacaccattgatgatcaaagctcgt 
agtatgggttacaagcaaggcgagagagggaggagaggctcccaagtctcttgttcggggtggaggtggttacaaggtggagtcctagctaggtctt ggcttggcagcagggcttcgatctccaggtcctcctctaattcgtcggtcttctgggcgtcgtcttctctagggtccgtccttgcatcgtccacccgagag aaagaaccctcaagtgctctctctagaagggtgtcctccatgcgctcctctagaacagggtccccctttgcgcttgggggccgacctcctccctttataa gctaaggagcggggggtcggcccgtggtggattcctttggaaggatccaccgggtgatggtaaaaccagggttcttaccatggggtaaagccacaca tgcttgcaatcgcttggctatcctgtattctttatacgaacgtcgctggtggcatgggcctgagcgctatcatttgggctatgccgaccctcgaccttcgta gcctggggctcggcgcggctcatcgtgttgcgccctgccagcgggcctgcgcacttaggcctagtcccgataggtcattagtgcaccgtcgtccaggga cgtgcgggccataaatgcaccacagtccgagggcttgtagatcatgagtgctctgcaccccgagggtttgtaggccatgagtggtaggtagggtccga ggggcttgtagttaatgtgaccgatcatttatgatgggttggtctaGGGCCAGGCGCAGGATCCACTCGAGTGGTTTCCCTTCGAC GCGACCGCCCCCCTGTCCGGTTCCGCCAAACCCGACCTCGGGCTCGATGCCACaaggttcgtttgggggcggattcttccaga gtagacgttatttgtctctttcgactgaccagtgggcccctgttgccagaggccctttcccagtgtgctcgctggccgatgggccccaggatctggggat catatccccgacagagcgctgcgatgtttaaggtgtacagttgcggctatcgtaggggaggggtcacccgtggatagtctatggcggcacaaccacac tcagcggccaacctagggatggcaacatggaattcctcgttggggaataacttccatacctgtcctggtgttgaatttttttccacggggatccccacga atgcttgcggggtacatttcttcccctgcagggataaatcgccgacgaggatctcgtccctacttaaactacaattaggacgtacatcctttattattaat gcaaaacattgtcatttatgcgcattgttatatgtacatcaatatgaacacatttatacaataagtaatctaacaaaataattatttatttttattattata cggaatatcgtcatgtgcaaatttaacaaattcccgtttggaagaagatagacatcgcttcaccatttcccgtcccatttatgtgctatgtgagaaaata ttttttcctgtaaaatggagaatcgggatagaatcccattgcctggttgccatctctagggcaaccctctttgattttatttatttttatgttgattttatttat ttttatgtttgatgacgacgacgaagaatcaaagagggcttccttcgaacgccacccccgggcctgtggcaattaatcccctcgccttcgtaaccgcca gtcctccgccacctctccacccaacaacaaaacaagcctaatgggtaatggcttcctcagttcctctccatctcacgccactgcgctgccctagaacctt cgccgccgcccctctttAGGATTACCGTGTCCTCGATCGCCCCCTGGCACGCCCGCCCCGATCGCCCCCTGGCACGCCCG CCCCGGATCCTGCACCGGAATTCCGTGTGGAATTCCGGACCCGATCGAAATCGAACCGCCTCCCTGCGTGCTCCC 
CTGATCAAGAAATTCTCACGCCGCGCGAACGCACGCCTGAGGCGGTTGGATCCTCTGGTGGGGTTCCTTGGTCTC GATTTGCTGTGGATGGGGGCGCCGAAGCAGCGCTGGACGCCGGAGGAAGAGGCCGCTCTCAAGGCCGGCGTC GCCAAGCACGGGCCCGGCAAGTGGCGCACCATCCTCCGGGACTCGGACTTCAGCGCGCTCCTGCGCCTCCGCTCC AATGTTGACCTCaaggtgacgctcggatcgccagggagggtgggcggtgggtttttggcgatcagctgaggggagtatggaggaaatctccagt ccttttagtcggttagactgaatgaaagaaaatcgcggttttgcacgctccgtacatcctgacgtgttccttatttggttctagatcggtattttaggtggt cggtactcggtagtaggtattgcatagagttttgaggtggctcgagagtaatctggtatatcagaaacttctagccgtagcatcatgatgcccaaaaag cttccacacccctttgttttattccatgtagattttttcacctgctaggtattgatggttccttgagagagaggtcacttccatgtgttgatcttgttcaccat aactatgatttgtttagttgggcataatatgtttttagtttccaagttcattttgaatgcctgtatcgatgagtacaaatgaatcttcggtatggtctatttgt gcAGGACAAGTGGCGCAACTTGAGTGTCACCGCAGGAGGTTATGGGTCTAGAGAGAAGGCAAGGATGGCGTTG AAGAAAGGTAGACGTGTGGTGCCTAAGCTTACTGCTGAGCCAATGGATGTAGATGTGAAGGATATGGACGATG CTCATGACACAGCCATTGATGTGGAACCGTTAGCAATGGCTTTCGAGTCCTTGCCAACTGAGGAAAGTCCAGAT AAGTCAGTTGctaggtcagtagactcggtgcaaatattttatcaaaactacttattgtctttatgagtaacctaccttttcagttcgtcatcttcctttt tggtgaaatgtgttttatgtgggactagatagaaatcatcgatataatttgatttgacattctgattctagtcctgccttgctagacattaaagttcagaa gttgaccttaaggaaattccttttaaaccatatctgaagttctgaactgtctcggcaacaactagtgcaggaatttgtttcgacttattgtctttatgagca acctaccttttcagttcgccatcttcccttttggtgaatgtggtttatgtgggactagaatgaaatcatcaatatagatcgatttggcattctagtcctgcct ggctagacattaataagttgtttggtttgaggaatgagttagttcatcatcttctcactcctcacttttttgtttgttttgtggaatggattgagttgatccat catcacctcattccttatagttatttagttagtactaatatgaggaatgaggtcatcccaccaaattttaggaatagacccataatgcaccaccatatttt ggatggagtgattcctcaaaccaaacaccccctaaagttcagaagttgaccttaacagaaattcctttgaaaacatattaaggtatgtgaactgtctcg acaactagtgtaggaatttgtttacagttgtgttttgagttggggttcgtgatttttgtttacactaacagttgcagcttacaagcacattaaaggagcat ataaacagtgatcttgtagaaaatcaaatgcttgcctattctctgccatattataggcattcatacatttcctttggtgctcagtttgacctctcttatcacg tctttagtcattaaagtttatgtcatttttaggaaatgaaatataaattattaccccctttcttttacgttagaaggcttgttttattcaatgatggttaaata 
taaaaggcacacatagttcccaatgctttcctatatatctaaaggctagttaacttcatgcacaaagcgttgaattgtttttttctcccaactcaatttgttt gtaccaaagcaacagagtagtaatagctacggtgagcatttatgcatgcaaccagtgtgaatgactctttgttgtgttcatgtatgtacacagtagtaat gtgtagttacttgggtcactttagtcagattagcaagtttttttttgtctttgttcccaaataaacgtgctcgctaaaactgaatggagggaccttgtctctg agttgatggtaactgtgcatcggaaatgaagttatatggtcatcgtaattgaaacaggaggctatgcacatgttaaactcttatggctagtgcgacaat ggtgagctatgtgcagagtcattaaaagagaacatattaataaatgttgacctgactgatactcttactaTAGAAGCTGGACAAGCTCTATG TTTAAGCCAGTCACAAAAAATCTTATTGGAGCAGGgagggtcatcagtgtctaaaaacatgagttgaaggtggttgtatggaacgat ggaacccaatcaatctgggttgataatttgatttggggtcgatgacatgcacaaatagttacaaaaggtcaatacaaataatacaaatccaacacgct aaccacctaataattgtgtggtgtgatagtcatgctcgatcatgcagccttcgtcctttcttgttcctcgctttctctctactcaggcaacctgatgccgcta gtcacgaatcaaccactgctagccaacatctcacactaggcagtagtgaaattgattgcctgcaatgttctcttctcttttgtgctcctagtttcattggcg aagtcatgtccacaattgttggtagccatgatcaactctagttgtctctcatgtcacaaattcacaataacataaactcacatcatcttcgtgtcctctcct ctccctgtacctcattatagccctttggatctgtggtgtactgtcgcaGTGGAGTGGTGATCCAGACAGAATCGAGTTGTTGACCTTT GATAGGTTTTGCCGTTTTGGAATCCTACCGCGACTatgtaagttttatgtatggaagatcgatggctagagcgtgtaccccgatgctgt ggtcgcgcataaggagtaggcgagcgagaaagagacgtggtagatggcccttgaggccaagcaagataattgtcttgcttacttagattgatacatct ctcattatatagagatgcttactttagccctaagtaagctattcttatttttatggaaacctcctagtcaagaatcaatcatgaccctattcaagaatcaat catgatcctaatcaagaatcaattatgatcttatttatggaaacaaatgtgaaatatctaaaggaaattattttgtattttgtttctaattgtccttatggtg accccaacatctataggtgacaccagttataagatatctcctcgcctctctgaacaactcgctcgagctagatatcggggtagaggtccttgaattgctg aagcagctcctaagtggtgtcagtcgccgccacaccattgaatgacatgccaagtgccctggcgtagttgggcgcaacacatggtctgctttaggcagt agtcgaccatcctgtatgggtggcacagGCGAAGGAGTCACCTGGGGTCCCACAGAACAACTTGAGCTAAGAAGCCATGT TCTTGGAATGGGAACACAGAATACAAAACATTATTTTTTAGAACcatggtatgaagggggtacaccttgttgtattgttttttcga caagtgaaacgagaaccatgacagtctggaacaatatactagaacgaggaacatgttcccgagaattttgggttgacgcaactggcaacgaaacta 
aaataacaagttcaaatcttacaaataatgccatgccatatacatgacctttagatttcagtctaccaaatctcgaccatccatccaaaatttggaaccc tagtcgcgggccatcttttatttgcacaacaacacaagctgccagcgtctgtctacatcactccattggtctgtgtccatacgcctttgtactctagcagc cgccgagcgcccgagttgtggcagctgcaagctgccattcgcaccaagccaccaactaaccagctcccgagtcaccagtccccacctgcggcagctat tgcccattgatacgaggtggtgacaccgacgtccggttagggatacacgacgatgtccaatgccggtggctttgcgtccattgttcgctctgtcactgac taccaaccgtgctactatggacacaagcatgctagtcggagcgtggagcgtgtgcccagcatgggatggcgtcccacgtcctaagagtcataggtgtg atgccctaatctaatcccaagtaactaactcaatcttctaatataatatttatccactaatcctaaccctgagcctatctctatgggcctccatgccctag gttgccgccccatgacatttagagcctacaaaaatgttttaagtgcctcaaataaatagttttttatttttggaaaaaagaaaaatactcggaaaaata cgaaaaattatcggagacttctacggggtatgttaacatgtttcaaacaccttttgtacttaaaaacattatacagcggtatgaatgcaacacactacc acaccaagttcacacttatgatctgttttagttgataaaaaatataaaaccataatgcgtgattctagaccactcaaatcttatcaagttaaaccaaatt ttcaaaaaaaaatgttgattaaaaaattagggtgttacaacagatctggttaagtacatggaacgagggcccaaagagggtgtgcagctctgtgaag tcctcggtggtgtcggtgttgcagcatgagggctttcatcaccgagttgaggcgaccggggcggtactctactgcgaaatcgaagccagacaacttgct gatcaactagtgttggggaatagttgagtcgctggtccaagaggaacttgagggcttgaggctgtagtggtcggtgcaggcgaggaagcagcgactc tagaggtatggccgccaatgctgcacagcctgaaccagaccaatgagttcacactcatatgcagctagcttgtgatgaagagcggtgaagggtcagct gaagatggctaggggaccagcccttggtggagaacggtgccgaaacttgcttgcgaagcgtcatagtctacgacaaattgtcagtcaaagtttgacgt ttcattcctaattgtctgcatggtgaccctaatgtcttagatgacaccagtcataacaatcttgcttgacccaacccttgatttgggatgatgtcctgggg atgcagaacgtgcccgaccctaacccctgtgattatgtgtaacaaggattaagcaacaaacatatatatcttgctatttgtagtatgcaacacatgtca gttgattacgttcagtaaagccaacttcgggacattattggatgtatccacatgtattattcatcttctgatactgcataacagttttaacatttttgcctcc tttatggtctgtgctctattgagatgttcgggtagtgtctttatttgttcctatggtgtctgcagtacatctcgactctgtaccacagttttcaagtaaacaa aaaaggaaactgtttgttgtgaattggaatataatgttgacacatacatcctctgagttccaatgtgtgccctttctaatgttcaatttcttacagaggttg 
ccttattatttagtttatttaactatgttagcatatctcttcagcggttgccttattatttagtttatttaactatgttagcatatcctcttgatgcttggtcatc catgtgactttatgtatgtctgatttttccaGGCTCGATGATCTCATATTAGAAGCTATAAGGAAGCTTAAGGAGCCTTCTGG GCCCAGTAAGGCAGCCATTGCTGCATACATTGaggtttgtggatatcaatttatggtctgctcacttgttgtattttgtttaacttaagtca gctgccctgctatttcttttctggtcactgctgcctgtaatgggaatatttgtctcatccttttgtatgaagaatctctgctgattttttgctaacttgagaca atgctatagtgatttttcttttgccgatggatttaacttaaattgtctatatatttAGGACCAATACTGGCCACCTGCTGATTTTCAGCGC CTGCTATCTACAAAACTGAAGGCATTGGTTAATTCCGGAAAATTAATCAaggtctctccctccctccttccctgggttgtcttttt aggaataatagtggggacagacttcaagctgcatactatttgaatcatgattgtAGGTAAACCAAAAGTACAGAATTGCACCAAGTC CACCCCCCTCGGGCAGAATAGGCACTAAGGTATCCTCTGCTGAAGGGATGAAGGCAGAAAATAATAATGCTAA ACGTCTAACTAAGCATCAAGTGATTGCTGAGCTGGAGAAGATGAAAGGCATGACCAAGGAAGAGGCAGCTGCC TTTGCTGCCAAGGCAGTTGCTGAGGCAGAAGTAGCAATAGCAGAAGCTGAGGAAGCAGCAAGAGTTGCAGAG GCTGCAGAAAATGATGCTGAAGCTGCAAAGGCTTTTCTTGATGCTGTCACTTTATCAATGAGAAACAGGAATGC TGCTTCCAtggtatagtgaacccagcctttgttagtatataactggtacataaactctttgactgataacaccttttgatgaatgcagATGCTTCG AGCTTGCTGA tcctttacccatgcggtcatatggttttggccagcagcacaccggtgtttctgctgcaaagattttctggatatatgagtttcttgtatagccttatttatag atcacgacgcatcaactggaggggttttcaacatcttgctgttggtcctttgtcgttgcagcctttagacaatcgacaatatgataacaagaattatcgt gaaagtagtcttcatttatttttgttttcttcccattgtgttgctttgtaattaaactctagccttttcgtttagttagttgaaatggtgtcgtctagtagaact gtagtgaaaagaaaatgaagaaaatgttagatagccaccttagttattcaatttgctgtgatagtatgtttctagatatgattagctaatgacagcgttt aagattacattcctgctgtttcgtttttttctatagagattacctcttttaagaaaagcatgtaagatttacaaattcggctcgggacaccaacaaaagta caaaaaagttcagcacaaatgatgtctcgtttctcatagctatggacatgtttgggatagcttttttctaaaggggttctatgaaaatttaaatttctaaa taatctagctgtcacaagaatatgagaaaggggcttacatgtggaggaagataaacatagcgatagtttttctcacatagtgacaacttggagtttgc attcaaattgaattgtcaattttcatagaaaagtgctaagaagtgaagtgtttggttgtctgaaaccgatttcgatggccagaattcatccagaagtcga aacaaaggggcttgtgtattcttatgtaacttttgtttggtttttggctaattttgctataacttctgtgaagttaatcgatcaaattgaacaactaatctta 
gacagaaaaagttaaataaattatgaactaaataggtctttaggctgactccaacagaaggcgtcatatcctatgcaataatagatttagcaaaacta aaatatgtataaatatttagaagtatagtaaatatgtatagttttatcatgcacatgtttagttctctgccattgtgaaggtgaggctgccaggcctactg ccatatatgttttgattgtaaataaagttaggcacaatatatcttttttagagcagcacaacttgccacatgtgtggtggcctattttttttggcatcacta aggctatccctagcagtactcggatgcgaatgtgcaaaatatgcgtgcaatggtgaatatgagcgtgaaaaagtcgcttctagtagagcgcgtaggg gatacaatccgacaagagagatgcaatatatgccctcctcaaggggtcaagcactcgatcagaactaaaccatcccgcgacactcacaacatgagca cacagggcaagataaagatgaggataaaccctaaagagatcgtcaacttcaccttggcatgcttccattagtgcctcaaaacatttgctattcttatctc ttgaagggtcctattgttgtagtgtttctagtgaaagaagtcccacgctagcagcctgccagaaccattaatacatggtttgatccactgaacacttgttt tatctatgaagagctgagttggcaccacgaggtgtcgactcagctatttgccgagaaataagcagatcctgccttctacccattatgtcgatgattaggt agagctcttacgtgggctaggttactaggatacgaccgttccgcaatagaacatggcactaatgtatcccctaaggtcccaacacctttttgatcattac cgaaatcataacatttctatatgataatttgagattgctataccagacctgactaccaagattgaagtctccatgctcgtcatttctagtgcttgcacaac ctcctggtgccttatttactgacaattcccgggtgtaagccaacacttgcaaatgtcttttatttcattaaaaactacgctttgtcgacatcgtcttttccaa acattcgggaccttagttgcaagtatgagaaatgttggttttggttaataaaaatgaaacattcatgaacaaataagccattacttaattcaagctccac cctgaaagtgtcaatataaaagcgaacaactagaagctcccacatcagaagctagaattgacgccacagcataaggcatcgcctccatgctatctgc acatggccccgaagtggaatggctaatttgcacaatctaggaagctcagccttctccctctcagttcctccgcaatcgacttaccagaatcgttgcggca actccaattcaacccccactctttcccggtgtctatacttagtgaaagtcgtctagaggggggtggataggcggaaactgaaatttataaacttaaagc acaactacaagccgttgttagcgttagaaataaaaccgagtccgaaagagagggcaaaaacaaatcaactaagaaataaagcgagtgacacggtg ttttatcgaggtttcgggttcttgcaaacctagtccccgttgaggtggtcacaaagaccgggcctctttcaaccctttccctctctcaaacggtccgtcgg accgagtgagctttcacttctcaaaacaaccgggagcaaaacctccccgcaaggaccaccacataattggtgtatcttgccttggttacaagtgagtat tgatcacaagaaagaatgacatagataaagccatccgagcccaagagctcaaatgaactcgagtatcactctcactctcactagggttatgtgagga 
aatggagaggatatgatatcttaggttgtcaaaaattggatgttatagttcttgtagtaattgggaatggatctatttgaatgctatgactggagggatg gttggggtatttttatccccaaccaccaaatgcgtcgttggcagctattgttcgatggacaccgggtgtcagccgacaaccggcgcccccagcgcgaca ctannnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn nnnnnnnnnnnnnnnnnnnnnnactaaaaataataaaaaaatcagaaaaataaaaaatttataaaaaaaaaaaaataagaaaataggaag ttaaagaagttaatggatacctatggaatacaaaccctacctttggtttggtttaagtagtggatgtggggcacttccttccgccgcgccatcgcttgaa caacagttcaattcggcgatgacgcggatctcgctcttcagcacgataaagacacggaaaagatttttgaaatgtcctccctccaaaccgcgcgcaac ccgtgccgccgaatgcagtgtgttcatgtagaatgcaattcttccnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnaaccaccaattcaaccgttggggaggatgt atgtcgacgagcgcaccagacagtccggtgtgccagccacgtcacccaaccg >gene4 (reverse) Best maize hits: AY664415.1 Zea mays cultivar B73 locus 9009, complete sequence AY664419.1 Zea mays cultivar Mo17 locus 9009, complete sequence Best non maize hits: AC243120.1 Gossypium raimondii clone GR__Ba0131I15-hog, complete sequence DQ645537.1 Zea luxurians mitochondrion, complete genome DQ645538.1 Zea perennis mitochondrion, complete genome TTAGGGTTCTGACGGTTTTGACCGTTGGAGCTCTGACATCTTGGGGCACCGGACAGTTCGGTGCCGCACCGGAC ATGCACTGTTCACTGTCCAGTGCGCCTTCTGGCGCTGCTCTGATTCTGCGCGAACTGTCCGCGCACTGTAGCGCTT TTGCAGGTGTCCGTTGGAGTGGACCATTGCGCTGGAGCCGTTGCTCCACTGGCACACCGGACAGTCCGGTGAAT TATAGCGGAGCGGCTCTCTAGAAACCAGAAGGTGGCAAGTTTGAAGGAGTGCGGCCTTGGGCACCGGACACTG TCCGGTGGTGCACCGGACAGCCCGGTGCGCCAGACCAGGGTTCTCTTCGGTTTCTTTTGCTCCTTTCTTTTGAACC CTAACTTGGATCTTTTTATTGGTTTGTGTTGAACCTTTGGCacctgagaattcatagtctagagcaaactagtcaatccaattatttg tgttgggcatttcaaccaccaaaatcatttaggaaaaggtttgaccctatttccctttcacttagctctttaaaaagtcttccagtacctccctcgtggcag cccgttagaagcatgcctcataagaatctttgtggcgacaatcgaagtatcctcactcaattttgaaacaaaccaccttccagtttcgtactgctactac ctcctaggtcgtcgttctttccattgcttcgggttctttcctatgcttcggagctaccaaactcatagatttgcgatacagggacagttgaccgcacgctgc 
acaagaaagcctaaagtgcggcgatgtcataatgtgctcacatcagaagctagaattgatgccacgacataaggcatcgcctccatgctatctgcaca tggccccgaagtggaatggctaatttgcacaatctgggaagctcggccttctccctctcagttcctccgtaatcgacttaccaaaaatcgtcgtggcaac tccaattcaacccccactctttcccggtgtccatacttagctctttgaaaagtcttccagtacctcccgcgtggcagcccgtcagaagcatgcctcataag aatctgtgtggtgacgatcgaaggctcctcactcaactttgaaacaaaccaccttctggtttcgtactgctactaccTCCTAGGTCGTCGTTCTT TCCATTGTTTCGGGTTCCTTCCTCTGCTTCGGAGCTGCCAAAGTCCTTCGTATTTACATACACATATTTGCTCTCTG GACCCATCTCCTGGTTCATATTTTCGATTATCCTCTTCAATTTCTTGTTGAAAGCTTCAGTGACGCCATTTTTATAT GAAAGACCTTTTTTTATTTGTCTATTCCACTCCAT ttaaatctaataaaggagccaatacaatgaaattccactagtatagaaaaagaggaatagatgcggatgcacatatacatcaatgggagagagaaa aacacatagatttgcgatatagggacagtcgaccgcacactgcccaagaaagcctaaagtgtggcgacgtcataatgcgcttggtaagtttcggccg ggacaaaatatatcctaaatttgattcgtttaagacgtgtatggctaaacttttataagtttgatcgttagtatattaataactaagataaggttatatta aaaatcatgtttatatttgtttcaaatgaaatttacaataacacaaccttattgggttctatggcgtgcatatttttacggtgatataaatatttaaaaagt attactcgtttgaagtgtcttttctataatagtattaataggttagacttctctataatagtacttagaagtgagtgacttctctaaaaccgtatttcactct tatttttgcctatttctattttcttttcataatttctatcgcttaaaaagatattttacctagatttcaactccttttaaccgaaccctttcaaccccatcacta cattcagcttaggttaggatggacattaaaagtgtttatattaaatttgttttacagattccaattaaaaacacaaaaacatttttgttaaaattttaaaa ttgaaatagctaaaatttgtaaaaaaaaaacaggacaactattagagacctgtacacaaaaatatgggtttctctacgagcaatagaatccatatccg atagaatataaaagttacgggttgaggaactgcaaccataactcattgagctgttgcaggggtaaccatcctccgtttatatgtataggacctgttgtgc aatttttatttttatttttaaatttttaacaaaatgttacatgttttcaattagaatttgtaaaataaacataacataaacacttttaacatccatcctaactt gagctgaatgaagagttggaatctaatcactacaggaaaatggttaaaaaacgtgggtgaaaaatcgtcgaacttaatgtaataggccgtcgaactt accataagaacgtgggtttaccgacgaatatagccgacggtgatttttggatgaacttagagaagtaagttcgacagccccgacgaacttaacattaa gaacgtgggtaccgtcgaacttaatattaagaacgtggataccgtcgaactagtatgagcaataaaggtaaaaagatatccttttaaacgatagaaa 
ataagaaaagaagatagaaatagttgaaataaggatgaaattctgttttagagaagctcacttgtaagtattattttagataaacctaatctattaata ctattatagagaagccaatccaaactagaactataatagcaaataactctataaaaacacacacatataaaataattgtattattattattatttgccc atcttcttctaactaagtctataaatacgaaaactgtctctgttgaactttctatacaaccatcgtattattttatatcaggggcatactaggcccgtacag attctacgactgcataggccccccaaaatcataaggccccaaaatctacagcaatatatatgcaacaaataaaaaatgttttgcataaaatattggtc aacgtgactaagaatcacaagaacgttgaaatcaatctattcctctcatttttgatcaagttatttcttcccttataagaagatttaatttaaacagtatc acgattataaaaaatatttccttatttatctctaatgcattgcaatcattagataaaaagatcttaaaatattgttgtcacaatcttaaggttgcacgtaa aagagacgaacaatctgatttgatgcaagtgacatatttgatgagttgtttcctctaagattttattccacaagaaagtatgtgccctattgacattctaa atttttttgaagcaacataattatttttctaatgcaactgatgcaccgaacaaagtttttctaaactaaagttattattgaagtcccatttttattctataac taggtgagtgcccgtgcgttgcaacgaaaacatataataacacaaaaattatgtataaatgtgtgttatattgatacgagaaaagatagtgaagataa gctagcagaaatttgcatccaatgttattgagaagtgtcttacttttggttctcatgagcaacgataaattctgatcaatgaaatgcttggtagttggcac aactgatgagaacgaacaattacatgtttgttttcaacaatgattctcagctcacctttttcttttaaaatcagtagttggcacaaccactctatagcaca actcaaagcacagaagcagacgcctaagaaacacaacacccgccggtaccagcgcagtgtgttgccgaccggtgacacaatgacacaagacattat ggtgtcggctaccacctccttggacatcttcaggaacgggtccatgttgtggtggtaccagccattttgcctcactcctatggatgttactctccagggca ggtgagttcatctttaac >gene5 (reverse) Best maize hits: AY103582.1 Zea mays PCO147975 mRNA sequence BT053879.1 Zea mays full-length cDNA clone ZM_BFc0001N20 mRNA, complete cds BT043117.1 Zea mays full-length cDNA clone ZM_BFc0100C04 mRNA, complete cds EU960258.1 Zea mays clone 222946 SNF4 mRNA, complete cds Best non-maize hits: XM_003579593.1 PREDICTED: Brachypodium distachyon sucrose nonfermenting 4-like protein-like (LOC100844830), mRNA AK367711.1 Hordeum vulgare subsp. 
vulgare mRNA for predicted protein, complete cds, clone: NIASHv2060P11 NM_001059222.2 Oryza sativa Japonica Group Os04g0401300 (Os04g0401300) mRNA, partial cds AJ575236.1 Oryza sativa Japonica Group partial mRNA for NF protein TCACAGCTTGCTGCTCATCGTCGGACCAGTAGCCATCACGTCGCGCATGGCTAGGCCCTCATACTCCTCATCTGAC TTGGTCGAGTGTTCGTCCATAGAGCCGACATCGATCTGCAAGTACCCTGTGACGCGGAAGAACATTAGCTTTGC AGTGTTAAGAGGtgctgagaagataaggaaaacaaaaaaccaaattgttgaacactgacataaatcgcccatacctcgtgcatgtgcagcatg tctcttttttcatcggaggtcttcatcagtttcagcatcacgaagatcgtcaacttaaaatatatatatagatatatgaaggcaccagagtccacctgcca tgagaaccagacttgaagaggctaaatttaaagatgcacgagttaaaggcaagaagagtactgatatcattggtgaagtggttacactacctatgctt gccaaccaactttcctatcaagctagtgtgttgccttctgcagtatcacatacagttttcagcataagagttggttcaaatgacttgtttaaaacgtgcac tgcaggccttttttgggcaaaatgtttccctcaaaataaatgcacaaatataaaagcagtgacaatccgggaggaaacacagcttattcactagccct gctttggcgacagagatctggtggggtaagtttgttgcccttgcttcaaacaggaagcatgccgagttgagcttctttcttggccctgagatcaatgttgc ctgtctgtacaaactacagagtggttaaaaacatagttcctcactgaataaaccaaaattgttgatggaatggaagtccaaatgatcaaacatacaat agaaataggtaattctgaaagttggggaggagaaacccaaacaatttctccttatttcaagacataagatgatttgtaattttttaacacaaacgacat taagcttaatgttgtattggagaccatatctagaatttcatacaagtaatatatttatattatgaaagtgctgatcttgcaactgcttacacattgaatca aatgaaagaatcagcaaatatatggagaagtacatggattaatggcgttctctgcatgacagcacCATCAGGCCCACCATAGAACTGTAG CTTTGCTTCTTTCCAAGCAGAAATGGGCTCTTCGTTGCCAATAACTTGAATGTTTCTCTGCAactgcatagcaaatgataa ttagtaagaacataaaatgcttccaagaagaaacaaagtaatttattccacccacCTTTCTCAAGATTAAAACAAAATCTGATGCAGT GAGCATGCCTGTTATGGTTCCCTGACGGTCATCCCAAAAAGGAACCAGAGCAAGACCctatgagacacaaaaggttcga agctattcttaaattattaaactgatgccacgtatgaagtgcagatgaaagaagtgcctgcttctaggagcgattttgttgtgtttttaaaccgcaaaac ataataggaaaacaaattaaactgctacttaaaaaggcaaacaattgctagcacaactttaaaaattcttcaaaatgaaagtaaaaacagagcaaa gttgaaatgagagcaaacaagcaccatagagggaaagtgaactatttgaagcaactctagcaagcacagtaacatacataccACATCATGCAT TATTTTAAATGCTTGTTTAACAGGAAGCTGAGTGTCCAAAACTGTTAACtgaaatgttcatatgaaatccgattatacacatag 
tctaaagagaagtgctggcttgcaagaaataaacctaacCTTGCTAGAAAGGGGAACAACGTCATATATGGTATTGTGTAATA ATATTCCAGAGACCACATGGCGGATAACTGCTATTTGCATGCTTGGGTTCTGAGATGATGGCTCCAGGGGCAT ctttgtttatttaaggccacagtaaataacagaccattagttatcaggaatatggagaacaagaaaatataatgttaaacttctaagacatactgttttc aaaatggtacccttatccatattagttcctctgatagaaggctctggctgcacaacgggtagtacattgttttccacaagcacttcattgctgatcagtcc atattcatcacgtacaaagggttttgtctcatcacacctccagacaccatcaaccaaaaaccggtactaaattgcacaaaaatgccatataaatgtatg tggcatagaaaatgtcacaggagaaacataatatcttatgcaaatagctgggcagtggccatttgtacagcaaatatgcactactatagtatcatgttc aaatagcttgctaaccattgaactgaatcaaaccatcacatgtgaacatcatgatttacagaactcgcaataacaagggtgtgaatttggtacctgtgc gcaatgttatgtccaagaattacattgatagatgaaaaagcgacacatcaaatacaaaaatccatcatgaatatgtcaaatatatcatgtttcgatcaa ttgctggaattgaaaaacatattgaagttcaaaatagttgtctggtatatgcaaaagtagttccatgtaaagccaacagttaaggggcattcaagattc atttagtctaactgcaagtactaattgctggggaagaagaggtgcaataaaaactccatgtacagaaacctgagagacagaagtgtggttgtatttaa tcatcactcactttacaatggcgttaaatggctgctttctgatgcaacaactgcaatggcattctaccttacaggcattatgtaacggccaaaacatctta tcgaaaacgaaagagacagtgggatgtctttacaggcataaagattcatgatgcctttatgggggtgatctcgcaaatcaaattgcgaagaatcccga gctcggttacttattttaggaataaaagcattaacaaacacaattaatttaggaataaaaggattaactaaaaatgcagtcatttgctcctggtattttc tctgttcagtagatggacaggggctcaccattccactccttgaggcacttgcgcagcgatgtcttcctaaaatgagcgcttgaagtactcgtgataatca tcctacataccagctccacatgaagtagcatctccagtacgatgaacaagatatagaaattttaggcaataactacatcaagaaactgaaggcaaaa aacacaggactatgaccgcgaactgaaagcgagcagctcaaaaaacacggtattccaccagtgacccagtcatcacacgctgaaagttccacaggt gtggtttcacatgtctacatcccagctattgacttcaatgagtggtatcaaatttggtttaaaggagagactagaccagtggaaagccagactggagga gttctcttctcaatgaaacatctggcgctctggtaggttgtcgagttccacaaacgtgcgggaatcactacaatgtcctagtgccatatcagggagaagt tgactgcttcttgcggaaattacagaagcacatgcactgcagaaccattccatcgaactgacactatgacacccgctgactaccaatccaatttgatgt cacttgcaggcttgcagcgcaccgcgtaccatgataacaatccagataaggggctcaccgtgctgatggggtgacgagttgaggcgctgcctgcgttc 
tggcccttccaggaggttgatgttgacaacaacattttcttaaacaccttcggcccgcccgcccttccgcaggaacaacacaacaaacaggccacata ctgtaaataacgcccccacagagaagcagtagataatgcagagcaaaataaaaaacatgagctctgtttgtgttcctcttcaagcgagaggccctaa cgtcaagcatcggaggccctaacgtcaagcgtgacagcgacgacgtcctccagcatcctcgcgatcctcgtcgagtgggaggaggagaatggatttg ttgtgggagagggagcgaggcggctgcggataggggatgactacaccgcgcgacgtggatgggcgggtgggtgacgctgccaggcagggcaggcg taaaggtttagggagaacacaaaagcgaatgaatcaagaacgcgggagagcacaaagcgaatgaagtaacctcaggaatccatatcctggattgtc gcgccttcgcccactgccgaccaccagcagttccctcccctcctctacctccctccccctgctggatctagccttgggctccacccacgaatccaaaccct agccccgcagttgtctcggtggaaccaggccatcgcgggggagggagcggtggatccggtgtgcgggcatgtgggagcgatggatccggtgtgcga gcggtggattcggtggaaccagaactcgatggtgctcgttccacctccactgtgcccgttgtgcccgcggtggagcacctatttcacgtcgccctcgacg cgcatgcgtcgctggttccaccactcggactcgacgacc >gene6 Best maize hits: NM_001174809.1 Zea mays uncharacterized LOC100382044 (LOC100382044), mRNA >gb|BT062978.1| Zea mays full-length cDNA clone ZM_BFc0013I21 mRNA, complete cds NM_001196777.1 Zea mays uncharacterized LOC100502299 (LOC100502299), mRNA >gb|BT087700.2| Zea mays full-length cDNA clone ZM_BFb0162H19 mRNA, complete cds EU975938.1 Zea mays clone 506018 hypothetical protein mRNA, complete cds EU964704.1 Zea mays clone 280953 40S ribosomal protein S17-4 mRNA, complet Best non-maize hits: NM_001055203.2 Oryza sativa Japonica Group Os03g0103300 (Os03g0103300) mRNA, complete cds AB433794.1 Oryza sativa Japonica Group qLTG3-1 gene, complete cds, haplotype: 5 AB433793.1 Oryza sativa Indica Group qLTG3-1 gene, complete cds, haplotype: 4 AB433792.1 Oryza sativa Japonica Group qLTG3-1 gene, complete cds, haplotype: ATGGGGGAGGGGCGTGGCGGTGGCGGTGGCCGCGGAGTTGGTGAGGCGGGCGTGGAGGTAGACAGTCTTGT CCCTGGCGTGGGTGAGGAGCGCGGTGGTGGAGGAGCGGATTGCGGATCTGCCGCAGTGGGGAGGCGATTGAC AGCGGCTCTGGTGTGGTGGGAGCGCGGTGGGGGAGGAGCGGATTGCGGATCTGCCGCGGTGGGGAGGCGATT GAGAGCGGCTGTGGCGTGGTGGGAGCGCATCCGTGCGATGAGGAGCGGAGGGACAGGATTGTGCGGCAGAA CATGGTGCACTATTGTGAACGTCTACGTTGAAGGTTTATAG 
agtagtagagatgacaaaaaacttatgcattgcgtgtaataccaactatatatcggtaaaattactggagaagttcgaatgtgaaaatttcattgaag attttattttaaagaacactaaaagaataatgctttttaaaatgaagattatataatatttagtcatcgttattttatgtgcacatatatacatagcattac ttaatatattttttgctatattttgtacggagcctcccaaaatgcgaggacctgttatatatatatatctctactactccttaagagggtaaggagggcgtc cacaccctcaccgcggctccgccctccccaccccgttgatctcgctgccaccgcgcacgcccccgtcttcgtctcccccgcgctaatctcgccgcggcag tgcgtcctccgccccaccttctcgaaaatgtcgtcgcccccataaccgcgcgggccggcaccgcggcgcgcagatctcgctccggctcctcgctggcac aaaccacacgcgcaccctcgacccatggcatccccaaatcccgcccaggttcctctcccttcccattcgcggacg >gene7 Best maize hits: NM_001196826.1 Zea mays uncharacterized LOC100502348 (LOC100502348), mRNA >gb|BT087877.2| Zea mays full-length cDNA clone ZM_BFb0280P11 mRNA, complete cds BT067383.1 Zea mays full-length cDNA clone ZM_BFc0037B10 mRNA, complete cds EU975326.1 Zea mays clone 487586 mRNA sequence DQ244682.1 Zea mays clone 10027 mRNA sequence BT016603.1 Zea mays clone Contig436 mRNA sequence HQ140759.1 Zea mays Mu transposon insertion mu1013340 flanking sequence Best non maize hits: XM_002444661.1 Sorghum bicolor hypothetical protein, mRNA AP006170.2 Oryza sativa Japonica Group genomic DNA, chromosome 9, BAC clone:OSJNBa0042B15 XM_002454373.1 Sorghum bicolor hypothetical protein, mRNA AF133839.1 Sandersonia aurantiaca papain-like cysteine protease (PRT5) mRNA, complete cds NM_001070477.1 Oryza sativa Japonica Group Os09g0564000 (Os09g0564000) mRNA, complete cds AK071733.1 Oryza sativa Japonica Group cDNA clone:J023107H18, full insert sequ ATGTCGTCGCGCGTCGACATTCGCGTCCCCAAATCCCCGGTTTCTCTTCCTTCCTGTTCGCGGACGGCGCTTCCGC GCGTCGACGTTCGTGTCCCAAATCCCGCCCCGGTTCGGCGCTTCCCCACATCGAGGCGCCACACCATCGCCGCCG GTCCCGATAGCGTCGACAGCGCGTGGCCCCATCGTTCGCGAACCAACCTGGCCAAGGGTGCCCTGCTGCTGACG GACAAGGACCTGGAGTCGGAGGAGAGCCTGTGGAGTCTGTACGAGCGGTGGCGCAGCGTGCACACCGTGTCG CGGGACCTCACGGAGAAGCAGAGCAGGTTCGATGCGTTCAAGgtgaactctaggcacatcggcgagttcaacaagaggaaag 
acattcctacaagctcggcctcaacaagttcgtggacctaacgcagGAGGAGTTTGTCAGCAAGTACACGGGCGCCAAGGTCGTC GACCCCGACTCTGAGGCTGCCGCTGCCAGGCTCGCCAGCGACgtgtgcgtgtcgtccagcgacgagtcgccgccgcaactggccg cctctgccgtgaataccatcgtcaccgccttgaaggaccactgagtcgtcaccgccgtgaacgtcgttaatttattctgtattgcagACTGGAGAG ACACGAAGCCGTCACCGCCGTGAAGGACCAGGGGCAGTGCGCCAGCTGCTGGGCCTGGGCgcggtatcatttccttgct tgttttacaagatccattttgatggtggccatgtagtcttagctgagtgacgacttgtttttcatagatctgatctgagtttgtcatgatgtcgtgcggcgat ctgtttgtctgagttcggagacaggttagtattctatcagatccaatggttcattggttttgtcacttggacatggattcggtggtttcaggctcaattgttc gtctcctgttgtggtacgatcagaaacagatactgcagccgttgcctgcataccaggtgaattgtgtggagaccttcggtccattcttatgttaactgttg atcacgcttgtggccttctggttgtttgtccttcttaaacttcagatgcaaggagggacactagccgtatgtttgagagacactagccgtgcagcacaat cagatgcaaggagggacactagccgtccttcttaaacttcagatgctgatctcaattttttcacaaagacactagccgtgcagcacaatcttacggtgtc tagagcatgaaagctccattttttcacagcatgtgctttgattttatggaggaggcgtatgtttttaccacttactcatgaaggctccaattttcattagtg catacccttgttttgcagttttcatgggtggtacatgcattttcttccactaaatatcttgaagctatacattgttgtatagcttcccattgagactgcctttt tatttgaatgtaacagggacccttttgtaatcaccttgttctttgccttgtcctttgaatgggtaaagctagatctgtcaatagaatacacagcctgtttga gagagctctagctgatgataagttgcaaaagtctgtcatgttgtggatttatccttgcttattgtttaattactaatcactatattttcattttaggtagcac aggaacaccagttggaaacaaaaatggagaaagcagctcagaatgaagctctgttttgtttttttcacagcaaccaagtgcaactagattaattaatc tcacacatgagataaagcttaagaaacatggtcttttgccttcgcacaatcattttagtggatcaaagatatgtgatgatcaaaccaaatcaactccaa attcagatatgggttctttggattatgaccttaccaacaaaagctagactacaaaagtttgttctaactcctaaacgttatgttcttttatgactcctgaaa gtacagagtgacatatactatgtgcagatttttgtcgtttcttactctctgcttgcacttcatataaacaaattaatggttgtagttcttgtgtttcaacttac tgcaatgttatatttgaaaaaaagatattattatattcctttccctgcagtattccatgataacctctatttcttcattaggcaccacatttttttggccata atgctaattattttttctaaaactcaatgttggatatttacaccagcattttcatgcttaaaaatggcattcaaacagcctataaatccaataccacgcct aaaatgaatagtaaaactataactataagaggtttcattgtgcaGATGTCCCACTCCTTTCGTCCCGTCGAGGGTGTCGAACAATG 
AACGTGTTCGATTGTTGGTCGTCTTTGCTGCTGTGCAGCTCGGAGTTGTATGGGCTGCACAGATCGATGGATTCC tggtaggatcggatcttaaggacctctatctattttgttcatctgacaaataaaaacaagttcagacagttcctaaatcactgaaggagccgcatcaaa tagagcttgactatataaaaatgttgaaaagttataaaacaatttcaaattttcttctaaataataaatagctctccctctataaaaagaatttagaact gacattagcttcggtttgtgccaaaattgaaaccgaataagccgtcaaggtattgtggctttggatctagcccttaagcatggtctcttagaactgacat tagcatctaatctcaggatacacgtactcagtaatctatattttataataataatttgtacatgcacttactagatgttgaattatatcctcagcatgcata acatcttgtgagaactgagatcctgtcatagttgctgtttgggaagccaatatctcaatgggatcttgagtagtagcatcctcagttactttgaaactgcg gaatgattttctcctgtccattaacagtttttcaaattttctccattccaatttatcatagaaggcactgtatatgatagcgtacagtttagaaatcaaatg ttgccgccataataggtaccacatgttgtcactggttgtatgataattacttgatcttactgaccctgcttgttcttcaGGCTGGTAAGCAAGCAC CACCAGAGAAGTTTGTGATCCCAGCTAAAATCAAGCTGACACTGGAGCCATGGACGTCGAGCTTAAGAGGgagg tcatcaagaaggcatacgagatggaccttgcccagttgtacaaccagtgacactaccgctacagtttctcaaggataggggtgttgcttagcttgtagc aaccgataacaacatttatttttgtacttcgaactatgtttgaaagcttgtctttcttttcagactgtggttgactggtaaatcaattgcAGAAAAAAG CTGAAGATTACAGAAATAGTTTGCAAAAAGAAGCACAAGCAAAAGCTCAAGCATTACACTTTGAAGATGAATT AG ctagaaaaagaatgtcggtattgaaaagtcgctttttatatcttcttttagctacttcctataatgtttataatgttgttgcgctgctatgaagctagcaca caatttaaccaagcatatccattttggcacacagatcatgctccacagaggcgataggatgctgaactcgtgaagataaaagaggcatcttcccttag gaaagaagctaggcgtgcaacacccgacatctaactattctttcaatatatactgagtactgactcgtacaatatagcttttttactgtctttttgcacac acttgttacctttttcatatcacttacaaccatattgaatccttgatataaatctgtatcttattatatcttgacctggcccatgtgctattattttgtcagtta tcttaactgttgcacaactatgtgtttttcgattccacatgtatttaccatttcttttactttgcagttgtatgtatagttaggacaatctggtgactatattta ttggttcaggattggatctctagcctcacaagatcggcgacttgtatgataaattcacttacagttaagttacctttggtcaactggattctggggcatgc acttgtagctactggcaacttttagaccatttttgtaatcttttcaattgaaattcgtggctatttgagagcttctctgcgtttcactgatatcttccttatca ccggtgttgacgcgtggtagtggaaaagacacctaaacaattcatttttgtttggttggcaatgggagatcacacctaaacaattcatagcattaagga 
atgcattcgacccttaccaacatagagctgactaacttgcgaaagttagcgaggaaccttccatgacattttgctctaggtattgcttttgacagtgtca atattacttttgactataatagattttacttgttgtgttccagaataatgtgtataatcttgcataaagtcaggccctgtgtttcttaagagcaaaggactg agttgtgttgctactttgaacatgaaaataaaatgtggttgtgcactatgtattgcatatctagaattatgtatgaatcatctacctcctgattagataag cgttctcaattttttaaaatatttaccagatattctattctatgctactcttgtaatttgtttcattcaactatatagggagtacccagagtaaggagcactc tgccagtttagatgcatggaggggtgttgatgcctttgcgacatcaagcaagtttggtaaacaaactttcagtctagcagattacagagagcactgaga gaacctgttttatcggtacgacattgtttactcactataacttctcacacagaaaacgacatagtgatctatccatctgagtcaggctatttatttgacaa ctgatacaaattgatgtagcaacaaagttttccagttttaaaaatatttttgtatgattggaattggagggttcaaatgatgaagggattttagaaatct ataatatgtatgccttttctttagtctccctgcggaatgaggggctcaacttggttaagagcaggttagctaaatgctctatgagagacggagttctcca caactttctttggggctctatgcaacagccactcatcaactgtagtaggaggagggcatttatacataagttattgtagtaatatgtttccgttgcaacgc acgggcactcacctagtatatatatatatactagctgaatgcccgtgcgttgcaacgggaatatataataccagtacactacgataacttatatacaaa atgtgtgttataccgttataagaaaatgtttcataatcaatttatgattctggccatacataaattttgttatttataatctatttgtttcaccactatattgc aaccatcagtatcatgcagacttcgatatatgtcacgatttgcatggtctcatcattggagagcacgtttcacacataccggaagaaattccctttacat cgttagtcatcagacacgtaccaccgtacacttttgcttaaacaaaaaggcaagtgtgtgtttacgaagagaattaaaggcaagccagcacaaaagc taccccaacggtggcgaggatgacgaactggtcattgttgtcggtcctcctctgcgtcacctctggcgccaagatgacgccacagtcctcgatatagttg tcgtcgaacgcgcgcgacataccgagtactgatgactcttggctgggctgtaaaacgagtgctcccggggctcatcagcaaggtagtacccctggctg ttgcaccaccggatgcgctactcctctacatacatcgtgttcaaggacactcatacaacgtaagcaacgaccatcgtctcagcgcacaagaattcatgg ccagtcagtagcgacttacgtggcaggttgggcttcaggtggacgatgagctggacgacgtgatggcgtcgtcgtcgaatgcggtgccaacaacccg agagtcgtcgacgttggcgacgaccatgaggtccccctgcttgacgatggacagcgcggagcagccgctctgcaccgcgtccaagcggcggctgcgc cggagcttgtcgtacatagcggcacatgcggccacgtaggactgtttctagaggtcgaactgatagtcgccaagtttttcttgtcgtcgatgagcgaccc 
caacacgagtgcctcctgttaatggtgtttgttcggggttttcaagtaagaaacatgaattcatgtttggcgttggtttataaaaatgactcacaagtcag atctatggaaaaaatattacgaagaataaatatcacgcatgcaaaaaagaaatttaagttgaaaacattattcaaacaaaagaaattgcatgcaag gctcttctttaaatactactccctccatccaaaaatataattgaagaatctcggtgatacttatctactacacgcattgtgcaagggtagcaggtggactt ggggagagatatagtagatgtattttctgttataaatgtaaacataaacacatatgtggtgtagtggtagctactgctattatttgtttgagaggttttgg gttcgaatccccttgagaccatgtttatttttttaattttagctaggcgtcggctgtacgggggaatgagaatgagctttacagggaggggaatcggaac gcggggaggagggaatcggaaaggcacacaacataggaatggatgcgcaatggggtagcgactactattacagtcttaataagtagtatatatatat atatatatatatatatatatatatatatatacgttggtttttaaggtgatctggatatttgagaatcgtgatttaaacatatctctctattcctggataaatc aagagggtgtttgaatgcaatagaactagtagttagtgactaaaattagttgagacattcaaacaccctagctaatagttcagctattagtttttttgta aattagttaatagttagatagctatttgttagctagctaatttcactaataatttttagtcaactaacgattatttctagtgtattcaaacacccgggaatc ttaatccaaataaaaacagacgatattattaatattctttggtctaatatttagacagactattattaggtaggatcagccagtcgtgctgcagcaaagt attgagtcgaattgtttgaaatatcagcttagttttgcaagattccaaggtccaggtacatcacgtatcacttagttttgcctgacggctgatggtcttcat acattattagaaggaaacaagcgcgcaacagaaaatcaactagaggccgtcagggacacatatttacattattagactggtccaacggaagtttaaa aagcaaaatacaactcgggcttgacacctgatcggatggctttgagttagcacgtactacttctacttgatctgatcggatggctttgagttagcacgta ctacttctacttattgaaggactcgagaaagaagcggttggggtctccttttctgctcaaggtcctcaaaataaataagttgctttgattcatccaacag atatactgaatgaccaaatgtcaaagctagaatgacgaagcgatgaacgcaatagacacaacaaggtctggcaacaaaccagatcaagggtagaa acacaactttatccttagtgtttgtattaaaacatatgtatagatattaagagcacaattgcaatttcgtctgggttgcgtcccctgctataaatagatga acattacccacatactgttcacgtgatcattgaggaaagtaattagagaaagcccaaagtgcaatctcaattattcattctccgaaggttactcttgtat ttcccttcgcgtgaagccgaaggtacaaatgtaatcatttattcttgtattgctcaaagtatggtcttatcaacatatatataaataaaagaaaggaata gttaatctcttatcattttctcatcattttaacatgaataccttcgtcatattataaaccttcacttacaaacaaccttcagcgaaggattgattgaaggtg 
aagttacatattcactatcattttcgtctaagaactattatcttcaagagaagataatgcttcgaaggacgaatgtccttaatgttcaatattgtgttgcct tatttttgaatcacagcaattgaaaacaagtaaccaacattggcgctcacctccggtgaactcacttccacaacaatgaccacgatcagccatcgagct tcgtcactagcaatgacgaagctcatactctctatcacaggcggttcaagcttcgagccagccaacaagaagcaaaaaaggaagcccagcgaagtgt gcaacatgttggaatgcagggaccctatgtcaaatccaaatggtctcatgttccaatcactttctctcaagaggacctttaactcaaagactacccaca caatgatgcc >gene8 Best maize hits: AC196774.5 Zea mays BAC clone CH201-435B12 from chromosome 5, complete sequence AC191361.5 Zea mays BAC clone CH201-216O9 from chromosome 5, complete sequence AC160211.1 Genomic seqeunce for Zea mays BAC clone ZMMBBb0448F23, complete sequence AY555142.1 Zea mays BAC clone c573F08, complete sequence Zea mays cultivar B73 putative gag protein, putative gag-pol precursor, putative transposase, putative copia-type pol polyprotein, putative copia-like retrotransposon AF464738.1 Hopscotch polyprotein, putative gag protein, putative prpol, putative prpol, putative pol protein, putative pol protein, putative gag protein, and teosinte branched1 protein genes, complete cds AF049110.1 Zea mays retrotransposon Cinful-1, complete sequence AF123535.1 Zea mays alcohol dehydrogenase 1 (adh1) gene, adh1-F allele, complete cds JF791320.1 Zea mays subsp. 
mays cultivar Yu87-1 clone 87-1tb1to69k, complete sequence Zea mays putative growth-regulating factor 1 (Z214A02.12), putative 40S ribosomal AY530951.1 protein S8 (Z214A02.25), and putative casein kinase I (Z214A02.27) genes, complete cds AF049111.1 Zea mays retrotransposon Cinful-2 Best non maize hits: EU053446.1 Mini-chromosome MMC1 retrotransposon xilon, complete sequence; hypothetical protein gene, complete cds; retrotransposon xilon, complete sequence; satellite CentC sequence,; and retrotransposons cinful and ji, complete sequence XM_001545031.1 Botryotinia fuckeliana B05.10 hypothetical protein (BC1G_16418) partial mRNA AC243237.1 Panicum virgatum clone PV_ABa020-K05, complete sequence AC243260.1 Panicum virgatum clone PV_ABa103-K10, complete sequence AC243249.1 Panicum virgatum clone PV_ABa094-D05, complete sequence AC243250.1 Panicum virgatum clone PV_ABa094-D10, complete sequence JN800064.1 Saccharum hybrid cultivar R570 isolate scTat_7.1 retrotransposon, co ATGGTTATATCTTGTGTCATAAAAGGATTCTTGGTCCACAATGTTCTAGTGGATACATGCAGTGCATCGAATATC ATCTTTGCCAAAGCTTTCAGACACATGCAAAAGCAAGAAGATAAGATACATGATACAACACATCCTCTATGTGG CTTCAGAGAGAAACAAATAGCAGCACTTGGCAAGATAACAATGCCAATCACCTTTGGTTACATACACAACACAA GAACTGAGCAAGTTGTATTTGACATAGTGGACATGGAATACCCTTACAATGCAATCATTGGAAGGGGAACATTG AATGCCTTTGAAGCTGTGTTACACCCAGCATACTTGTGCATGGAAATACCATCAAATCAGGGTCCAATCTCTGTA CATGGAAGTCAAGAGGCTGCAAGAAGAGCCGAAGGAAACTGGATAGACTCCAAGGCTATTCACAATATAGAC GAAGTTGAAGCTCACGAGCAACAGAAGCACATAAGGGATAAGGCAGCTTCGGCAGATCAGCCAAAATCCATAC TCTTATGTGAAGATATAGCTGATCAGAAGGTGTTGTTTGGATCTCAGTTGATCAAAGAGCAAAAAAAGAATCTA ACAAAGTTCTTGTTCCACAACAAAGATGTATTTGCTTGGTCAGCCAATGACCTATGTGGTGTCAACAGAAACATC ATTGAACATTCTCTTAATGTAGATCCTACCATCAGGCCAAGGAAGCAGAAGCTTCAGAAAATGTCAGAGGATAA AGCCAAAGGAGCAAGAAACGAAGTTAAAAGACTTCTCAATACTGGAGCGATCAGATAA gtcatttacccacaatggcttgctaacactataatggtaaaaaaggcaaatggaaatagagtatttgtattgactttacagatctcaacaaggcatgcc caaaggatgagtttcctttgcccaggatagactcactcatagatgcagcggccactttagagctcatgagcctgctagattgttattcaggatatcacca 
gatatggatgaagaaagaagatgagcataagacaagcttcataacccccaatggtacataatactacctttggatgcctgaggggctctagaatgct ggaggaagcttcagcgggatgatttcaaaagtgttgaatacccaaattggtaggaatgttcgctatttgctctaaggttcagcacttgtttcgtcttcttc atcatcatactacaaaatcataagtatatcagcacatgaagaagagaagaatcagaaaacattatggaaacagtttgaaatatacctcatccaaaag ctcccaagcttcggcaccagcaagatctatgccacctttcatccaaattagtgtgataaatcggttggcaacatttctagcttcggttgacgttgttgcta tatcatttgaagaaatatcaaaatttggcttcccaacagttttcaggtgggagcatccaaccttctcaagatcaaagatgtgacacaagagggaacca gtgcacaaaagtcaccttagcccatcattacttcgccgaaggtatctatttctccttctacccacttcagcgctccaggtaagtcgttaggggcataattc tcttctttagacgtcgtcccgaccgagtggaatacttcgcgtagccagttagagcatctgtagacaatctcaaaataattatcctgaagatttttttcaaa gtttctactagctttgagttgtctcggtccttcgttttaacttcttttttcaaattgtcaaattcaagttgaaggcgacaaacttcgtcggcatgatctgtttt aattttctccagctcagttgttaatcgtcggatgtctttatcttgattcgcatttttgtctttgaagcttgttttcctcttcctcatcttcattagccctctgaga tttggccagatccccgtttaatgactgaattgtgacgtctttttccttatttgcttctcaaagtttttcattttcaatttccaacttagaaataacaaactcaa ttttttgatcctctggatcttgttgaagctttaaggctttactcagcagaaaactttgcatttttacaaaggttgaaaaaatttaggtatcttactatttttat tctaagacagaaagtatgatgcataaacgaagctttacttagattacgtaagccaagctaccggcgatatgttgtttccttatgctgcttagctcactttc gagcttcaagaagccaatattcttcatcaaagtgttaaccactctagccctgtcgcggtcaggaatgcactctagtaaattcaccttccaaagcaaact ccttcagttcagaaatttcttcagcacttagctccccactagttaggtgtctgaagtcgaaggtggcctctcccaggcttcagtttcttgcaaagacgaag tggtcgtcccaactttttaagctctacagcctttgaaagtcgcctacaggggggtgaataggcaaatctgaaatttacaaactttaagcacaactacaa gccggggttagcgttagaaatataaacgagtccgaaagagagggcaaaaaacaaatcacaagcaaataaggcggatgacatggtgatttgttttac cgaggttcggttcttgcaaacctactccccgttgaggtggtcacaaagaccaggtctctttcaaccctttccctctctcaaacggtcacctagaccgagt gagcttctcttctcaattaaatgggacacttagtccactacaaggaccaccacaacttggtgtctcttgccttgattataattaagttggaaacaagaaa gaaggaagaagaaaagcaatccaagcgcaagagctcaaaagaacacaaatgtctctctctctagtcactaaattttttggagtgattccggacttggg 
agaggatttgatctctttgattgtgtcttggaatgaagtctatagctcttgtatgaggtttgatggctgaaaaacttggatacaatgaatggtgggtggtn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn nnnnnnnnnnnnnnnnnnn
(50244-10000) Yichun Qian
The genes predicted by FGENESH are the better set, but the genes predicted by Augustus are also attached for comparison.
Gene Prediction
FGENESH
FGENESH predicted 7 genes. Each predicted gene was translated into a peptide, and the peptides were used as queries for a BLASTP search against the Swiss-Prot database. Three of them had significant hits.
Segment 1: 62665 - 64698
1 CDSf: 62665 - 63676 1011bp
2 CDSl: 63853 - 64698 846bp
Retrotrans_gag[pfam03732], Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons.
>ATGGCGACCGACAACTCGCCCGCCGGCGGCGGAATCGACGACGTCTTCCCCGCGCGGTGGAAGAACAACATTCGAGC TTGCCTCGTCCCCTCCCCCGCCGACGGAGGAGGAGGCGGGGCAACCCAAGGCCAAGCAGGAGGCGGCACCTCGTCGG CTGTCGAGCGAGTCGACGGTGCCAGCGCCCCAATGGGGGGCACGTCGGGCATCGACCTCGCGTCTGAGACGAAGACG AGCGCCGTCTCCCCGCAACACGTCAACCCCAAGCAAACGGACGACGCCAACACGCTCGCAAGGGACTTGCTGGGCGTC ACCCTCGTACCTGAGACGGCGGTGCAGTCTACCCCTGACGTGACTTCGTCACCGCCCGTCGACCAAGAGGTACCGACCG ATTCCCATCTCGCGCCTTTTGGATTCAGCCTCAACCCCCCAAGCGACTTCGCTTTGGTGGACGCTCTCATAGAGGCGAGT CCAAACCCTCTGGGGTATCGTATGCGGTCACCATGGGACCGGCTGACGGCCGTCTCAACCTACGGGCCCTTAGGGTCCG AGGAAGATGACGAGCCCGACTTTAGTTGGGATTTCTCTGGACTTGGTAACCCCAGTGCCATGCGGGACTTTATGACCGC GTGCGACTACTGCCTTTCCGACTGTTCCGACGGTAGCCGCAGCCTCGGCGACAAGGACTGCGGCCCAAGTCGTGAATGT TTTCACGTCGATCTAGGGGGTCCCGACGAAGGCAACCATCTTGGTATGCCAGAGAATGGTGACCTTCCTAGGCCTGTGC CTCACGTTGACATCCTTCGGGAGCTAGCTGTGGTCCCCGTTCCGGCAGGGGGTCATGACCCACAACTCGAGCAAATCCG CGAGATGCAGGCCAGGCTCGACGAGGGAGCAGGAACACTTGAGCCGTTCCGCCGGGACAATAGGCAGGAATGGGCG GGCCAACCTCTGGCCGGAGAAGTGCGTCATCTACCCCAGGGCATCCAGCACCGCGTCGCCGACGATGTCAGGgtaaggcc gccaccggtttccagtggggtcggccagaacctggctgcagcggcaatacttctccgcgcgatgccggagccatcaaccaccgaggggcggcgtatccaggg agagctcaagaacctcctggaggacgccgcggtctgacgggccgaaagctccgcctcccgaaggcagGGGTACCCCTCGGAACATCGCGCCGC
GACTTCCCGATTCATGCGGGAAGCCTCGGTCCACACCGGCCGCATGCGTAACATAGCGCATGCGGCCCCGGGTCGCCTC GGCAACGAGCACCATCACCATAACTGTTGGGCCCACCTCGACGAGAGGGTGCGCCGAGGCTACCACCCCAGGCGTGGG GGACGCTACGACAGCGGGGAGGATCGGAGTCCCTCGCCCAAACCACCTGGTCCGCAGGCTTTCAACCGCGCCATACGA CGGGCGCCGTTCCCGACCCGGTTCCGAACCCCGACTACTATCACAAAGTACTCGGGGGAGACGAGACCGGAACTGTGG CTCGCAGACTACCGGCTGGCCTGCCAGCTGGGTGGAACGGACGATGACAACCTCATCATCTGCAACCTCCCCCTGTTCCT TTCCGACACCGCTCGCGCCTGGCTGGAGCACCTGCCTCCGGGGCAGATCTCCAACTGGGACGACCTGGTCCAAGCCTTC GCCGGTAATTTCCAGGGCACGTACGTGCGCCCTGGAAACTCCTGGGATCTCCGAAGCTGCCGCCAGCAGCCGGGGGGG TCTCTCCGGGACTACATCCGGCGATTCTCGAAGCAGCGCACCGAGCTGCCCAACATCGCCGATTCGGATGTCATCGGCG CGTTCCTCGCCGGCACCACCTGCCGTGACCTGGTGAGCAAGCTGGGTCGCAAGACCCCCACCAGGGCGAGCGAGCTGA TGGACATCGCCACCAAGTTCGCCTCTGGCCAGGAGGCGGTTGAGGCCATCTTCCGGAAGGACAAGCAGCCCCAGGGCC GCCCACCGGAAGATGTCCCCGAGGCGTCAACTTAG
Protein sequence:
MATDNSPAGGGIDDVFPARWKNNIRACLVPSPADGGGGGATQGQAGGGTSSAVERVDGASAPMGGTSGIDLASET KTSAVSPQHVNPKQTDDANTLARDLLGVTLVPETAVQSTPDVTSSPPVDQEVPTDSHLAPFGFSLNPPSDFALVDALIE ASPNPLGYRMRSPWDRLTAVSTYGPLGSEEDDEPDFSWDFSGLGNPSAMRDFMTACDYCLSDCSDGSRSLGDKDCG PSRECFHVDLGGPDEGNHLGMPENGDLPRPVPHVDILRELAVVPVPAGGHDPQLEQIREMQARLDEGAGTLEPFRRD NRQEWAGQPLAGEVRHLPQGIQHRVADDVRGYPSEHRAATSRFMREASVHTGRMRNIAHAAPGRLGNEHHHHNC WAHLDERVRRGYHPRRGGRYDSGEDRSPSPKPPGPQAFNRAIRRAPFPTRFRTPTTITKYSGETRPELWLADYRLACQL GGTDDDNLIICNLPLFLSDTARAWLEHLPPGQISNWDDLVQAFAGNFQGTYVRPGNSWDLRSCRQQPGGSLRDYIRR FSKQRTELPNIADSDVIGAFLAGTTCRDLVSKLGRKTPTRASELMDIATKFASGQEAVEAIFRKDKQPQGRPPEDVPEAS T
Segment 2: 66287 - 69085
3 exons:
1 CDSf 66287 - 67405 1119bp
2 CDSi 67439 - 67615 177bp
3 CDSl 68270 - 69085 816bp
RNase_HI_archaeal_like[cd09279], RNAse HI family that includes Archaeal RNase HI;
RT_LTR[cd01647], RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template.
rve[pfam00665], Integrase core domain RVT_3[pfam13456], Reverse transcriptase-like; This domain is found in plants and appears to be part of a retrotransposon. RNase_HI_RT_Ty3[cd09274], Ty3/Gypsy family of RNase HI in long-term repeat retroelements; RNase_H[cd06222], RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner RNase_H[pfam00075], RNase H; RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral replication cycle. RVT_1[pfam00078], Reverse transcriptase (RNA-dependent DNA polymerase) PRK07238[PRK07238], bifunctional RNase H/acid phosphatase PRK07708[PRK07708], hypothetical protein; Validated >ATGCCATTCAGTTTGAGGAATGCGGGTGCAACGTACCAACGGTGCATGAACCACATGTTCGGCGAACACATTGGCCGA ACGGTCGAGGCCTACGTCGATGACATCGTAGTCAAGACGAGGAAAGCCTCCGACCTCCTTTCCGACCTTGAAGCGACAT TCCGATGTCTCAAGGCGAAAGGCGTGAAGCTCAATCCCGAGAAATGTGTCTTCGGGGTTCCACGAGGCATGCTCTTGGG GTTCATCGTCTCCGAGCGGGGCATCGAGGCCAACCCGGAGAAGATCGCGGCCAACACCAGCATGGGGCCCATCAAGGA CTTGAAAGGCGTACAGAGAGTCACAGGATGCCTTGCGGCTCTGAGCCGTTTCATCTCGCGCCTCGGCGAAAGAGGCCTA CCTCTGTACCGCCTCTTAAGGAAGGCCGAGTGCTTCACTTGGACCCCTGAGGCCGAGGAAGCCCTCGGGAACCTGAAGG CGCTCCTCACGAACGCGCCCATCTTGGTGCCCCCCGCTGCCGGAGAAGCCCTCTTGATCTACGTCACCACGACCACTCAG GTGGTTAGCGCCGCGATTGTGGTTGAGAGACGAGAAGAGGGGCATGCATTGCCCGTACAGAGGCCAGTCTACTTCATC AGTGAGGTACTGTCCGAGACCAAGATCCGCTACCCACAAATTCAGAAGCTGCTGTACGCAGTGATCCTGACACGACGGA AGTTGCGACACTACTTCAAGTCTCATCCGGTGACTGTGGTGTCATCCTTCCCCCTGGGGGAGATCATCCAGTGCCGAGAG GCCTCGGCTAGAATTGCAAAGTGGGCGGTGGAAATCATGGGCGAGACGATCTCGTTCGCCCCTCGGAAGGCCATCAAG TCCCAGGTCTTGGCGGACTTTGTGGCTGAATGGGTCGACACCCAGCTCCCAACAGCTCCGATCCAACCGGAACTCTGGA CCATGTTTTTCGACGGGTCACTGATGAAGACAGGAGCAGGCGCAGGCCTGCTCTTGATCTCGCCCCTCAAGAAGCACCT ACGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATGTGGCTAAGTACGAGGCTCTAGTCAACGGGTTGCGCATC GCCATCGAGCTGGGGgtctgacgcctcgacgctcgtggtgactcgcagCTCGTCATCGACCAAGTCATGAAGAACTCCCACTGCCAC GACCCGAAGATGGAGGCCTACTGCGATGAGGTTCGGCGCCTGGAAGACAAGTTCTACGGGCTCGAGCTCAACCACATC 
GCCCGACGCCACAACGAGACTGCGGACGAGCTGGCTAAAATAGCCTCGGGGCGAACAACGgttcccccagacgtcttctcccga gacctgcatcaaccctccgtcaagaccgacgacacgcccgagcccgagacaccctcggcttagtccgaggcaccctcggctcagtccgaggcgccatcggct cggcccgaggcaccctcggctcaacccgaggcaccctcggcccccgagggtgaggcactgcgcatcgaggaggagcggagaggggtcatgcctaatcgaa actggcagaccccgtacctgcaatatctccgccgaggagagctacccctcgaccaagccgaagcttggcggttggcgcggcgcgccaagtcgttcgtcttgct gggagacgagaaggagctctaccaccgcagcccctcgggcatcctccagcgatgcatttccatcgccgaaggccaggagctcctacaagagatacactcgg gggcttgtggccatcacgcagcacctcgagcccttgttggaaacgccttccgacaaggtttctactggccgacggcggtggccgacaccactagaattgtccg cacctgcgaagggtgtcagttctacacaaggcagacccacctacccgcttaggccctgcagaccatacccatcacctggtcatttgttgtgtggggtctggacc tagttggccccttgcagAAGGCACCCGGGGGCTACACGCATCTGTTGGTCGCCATCGACAAATTCTCCAAGTGGATCGAGGTC CGACCCCTAAACAGCATCAGGTCCGAACAGGCGGTGGCGTTCTTCACCAACATCATCCATCGCTTTGGGGTCCCGAACTC CATCATCACCGACAACGGCACCCAGTTCACCGGCAGAAAGTTCCTGGACTTCTGCGAGGATCACCACATCTGGGTGGAC TGGGCCGCCGTGGCTCACCCCATGACGAATGGGCAAGTAGAGCGTGCCAACGGCATGATTCTACAAGGACTCAAGCCT CGAATCTACAACGACCTCAACAAGTTCGGCAAGCGGTGGATGAAGGAACTCCCCTCGGTGGTCTGGAGTCTGAGGACG ACGCTGAGCCGGGCCACGGGCTTCACACCGTTCTTTCTAGTCTATGGGGCCGAGACCGTCTTGCCCATAGACTTAGAATA CGGTTCCCCGAGGACGAGGGCCTACGACGACCAAAGCAATCGAGCTAATCGAGAAGACTCACCGGACCAGCTGGAAGA GGCTCGGGACATGGCCTTACTACACTCGGCGCGGTACCAGCAGTCCTTGCGACGCTACCACGCCCGAGGGGTTCGGTCC CGAGACCTCCAGGTGGGCGACCTGGTGCTTCGGCTGCGACAAGACGCCCGAGGGCGGCACAAGCTCATGCCTCCCTGG GAAGGGTCGTTCGTCATCGCCAAAGTTCTGAAGCCTGGGACGTACAAGCTGGCCAACAGTCAAGGCGAGGTCTACAGC AACGCTTGGAACATCCGACAGCTACGTCGCTTCTACCCTTAA Protein sequence: MPFSLRNAGATYQRCMNHMFGEHIGRTVEAYVDDIVVKTRKASDLLSDLEATFRCLKAKGVKLNPEKCVFGVPRGMLL GFIVSERGIEANPEKIAANTSMGPIKDLKGVQRVTGCLAALSRFISRLGERGLPLYRLLRKAECFTWTPEAEEALGNLKALL TNAPILVPPAAGEALLIYVTTTTQVVSAAIVVERREEGHALPVQRPVYFISEVLSETKIRYPQIQKLLYAVILTRRKLRHYFKS HPVTVVSSFPLGEIIQCREASARIAKWAVEIMGETISFAPRKAIKSQVLADFVAEWVDTQLPTAPIQPELWTMFFDGSL MKTGAGAGLLLISPLKKHLRYVLRLHFPASNNVAKYEALVNGLRIAIELGLVIDQVMKNSHCHDPKMEAYCDEVRRLED 
KFYGLELNHIARRHNETADELAKIASGRTTKAPGGYTHLLVAIDKFSKWIEVRPLNSIRSEQAVAFFTNIIHRFGVPNSIITD NGTQFTGRKFLDFCEDHHIWVDWAAVAHPMTNGQVERANGMILQGLKPRIYNDLNKFGKRWMKELPSVVWSLRTT LSRATGFTPFFLVYGAETVLPIDLEYGSPRTRAYDDQSNRANREDSPDQLEEARDMALLHSARYQQSLRRYHARGVRSR DLQVGDLVLRLRQDARGRHKLMPPWEGSFVIAKVLKPGTYKLANSQGEVYSNAWNIRQLRRFYP Segment 3: 82383 - 88664 7 exons 1 CDSf 82383 - 83722 1338bp 2 CDSi 84124 - 84298 174bp 3 CDSi 84369 - 85018 684bp 4 CDSi 85130 - 85433 303bp 5 CDSi 85920 - 86500 579bp 6 CDSi 86862 - 87035 174bp 7 CDSl 87327 - 88664 1338bp RT_LTR[cd01647], RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses. RNase_HI_archaeal_like[cd09279], RNAse HI family that includes Archaeal RNase HI; rve[pfam00665], Integrase core domain; DUF4370[pfam14290], Domain of unknown function (DUF4370); RT_DIRS1[cd03714], RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retransposons. RVT_1[pfam00078], Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. 
PRK12829[PRK12829], short chain dehydrogenase; Provisional PHA03307[PHA03307], transcriptional regulator ICP4; Provisional >ATGGCGGCCGACAACCCGCCCGCCGGCGGCGGAATCGATGACGTCTTCCCCACGTGGCGGAAGAACGACATTCGGGC TTGTCCCGTCCCCTCCCCCGTCGACGGAGGAGGAGGCGGGGCAACCAAGGCCAAGCAGGAGGCGGCACCTCGTCGGCT ATCGAGCGAGTCGACGGCGCCGGTGCCCCCAACGAGGGGCGCGATGGGCATCGACATCGCGTCTGAGACGAAGACGA GCGCCGTCTCCCCGCAACACGCCAACTCCAAGCAAACGGACGACGCCAGCACGCTCGCAAAAGACTTGTTGGGCGTCAC CCTCGTACCTGAGACGACGGTGCAGTCTACCCCTGACGTGACTTCGTCACCGCCCGTCGACCAAGACGTACCGACCGATT CCCATCTCGCGCCTTTTGGATTCAGCCTCGACCCACCAAGCGACTTCGCTTTGGTGGACGCTTTCATAGAGGCGAGTCCA AACCCTCCGGGGTATCGTGTGCGGTCACCCTGGGACCGGCTGACAGCCGTCTCGACCTACGGGCCCTCGGGTTCCGAGG AAGATGACGAGCCCGACTTTTGTTGGGATTTCTCTGGACTTGGTAACCCCAGTGCCATGCGGGACTTCATGACCACATGC GACTACTGCCTTTCCGACTGTTCCGACGGTAGCCGCAGCCTCGGCGACGAGGACTATGGCCCAAGTCGTGAATGTTTCC ACGTCGACCTAGGGGGTCCCGGCGAAGGAAACCATCCTGGTATACCGGAAAATGGTGATCCCCCTAGGCCTGCGCCTC GCGTTGACATCCTACGGGAGCTAGCTGTGGTCCCAGTCCCTGCGGGGGTCAGGACTCACAGCTCGAGCAAATCTGCGA GATGCAGGCCAGGCTCGACGAGGGAGCAGGAACACTTGAGCCGTTCCGCCGGGACATCGGGCAGGAATGGGCAGGCC AACCTCCGGCCGGAGAAGCGCGCCATCTACCCCAGGGCATCCAACACCGCATCGCCGACGATGTCAGGGCAAGGCCGC CACCGGCCTCCAGTGGGGTCGGCCAGAACCTGGCTGCAGCGGCAATACTTCTCCGCGCGATGCCGGAGCCATCTACCAC CGAGGGGCGGCGTATCCAGGGAGAGCTCAAGAATCTCCTGGAGGATGTCGCGGTCCGACGGGCCGAAAGCTCCGCCT CCCGAAGGCAGGGGTACCCCTCGGAACATCGCGCCGCGACTTCCCAATTCATGCGGAAAGCCTCGGTCCACACCGGGC GCACGCGCAACACAGCGCCTGCGGCCCTGGGTCGCCTCGGCAACGAACACCCTCACCGCAACCGTCGAACCCACCTCGA CGAGAgggtgcgccgaggctaccaccccaggcgtgggggacgctacgacagcggggaggattggagtccctcgcccgaaccacccggtccgcaggcttt cagccgggccatacgacgggcgccgttcccgacccggttccgaaccccgactactatcacaaagtactcgggggagacgagaccggaactgtggctcgcgg actaccggctagcctgccacctgggtggaacagacgatgacaatctcatcatccggaacctccccctgttcctctccgacaccgctcgagcctggctggagca cctgcctccggggcagatctccaactaggacgacctggtccaagccttcgccggcaacttccagggtacgtatgtgtgccctgggaactcctgggatctccaa aGCTGCCGCCAGCAGCCGGGGGAGTCTCTCTGGGACTACATCCGGCAATTCTCGAAGCAGCGCACCGAGTTGCCCAATG 
TCACCGACTCGGATGTCATCGGCGCGTTCCTCGCCGACACCACTTGCCGCGACCTGGTTAGCAAGCTGGGTCGCAAGAC CCCCACCAGGGCGAGTGaggtgatggacatcgccaccaagttcgcctctggctaGGATGCGGTTGAGGCCATCTTCCGGAAGGACAA GCAGCCCCAGGGCCGCCCACCGGAAGATGTCCCCGAGGCGTCAACTCAGCGCGGCATCAAGAAGAAAGGCAAGAAGA AGTCGCAAGCAAAACGCGACGCCGCCGATGCGAACTTTGTCGCCGCCGCCGAGTACAAGAACCCTCGGAAACCTCCTG GAGGTGCCAATCTCTTCGACAAGATGCTCAAGGAGCCGTGCCCCTGTCATCAGGGGCCCGTCAAGCACACCCTTGAGGA GTGCGCCATGCTTCGGCGCCACTTTCACAAAGCCGGGCCACCTGCGGAGGGTGGCCGGGCCCGCGACGACGATAAGAA GGAGGATCACAAGGCAGGAGAGTTCCCCGAGGTCCACGACTGCTTCATGATCTACGGTGGGCAAGTGGCGAACGCCTC GGCTCGGCACCACAAGCAAGAGCGTCGGGAGGTCTGCTCGGTAAAGGTGGCGGCGCCAGTCTACCTAGACTGGTCCGA CAAGCCCATCACCTTCGACCAGGGCGACCACCCCGACCGCGTGCCGAGCCTGGGGAAGTACCCGCTCGTTGTCGACCCC GTCATCGGCAACGTCAGGCTCACCAAGGTCCTCATGGACGGAGGCAGCAGCCTCAACGTCATCTACGCCAAGACCCTCG GGCTCCTGCGGATCGATCTGTCCTCggtacgggcaggagctgcgccttttcacgggatcatccctgggaagcgcgtccagcccctcggacaactcg atctacccgtctgctttgggacaccctccaacttctgaaagGAGACCCTCACGTTCGAGGTGGTCGGGTTTCGAGGAACCTACCACGCA GTGCTGAGGAGGCCATGCTACGCCAAGTTCATGGTCGTCCCCAACTACACCTACCACAAGCTAAAGATGCCAGGCCCCA ACGGGGTCATCACCGTCGGCCCCACGTACCGACACGCGTACGAATGCGACGTGGAGTGCATGGAGTACGCCGAGGCCC TCGCCAAATCCGAGGCCCTCATCGCCGACCTGGAGAGCCTCTCCAAGGAGGCGCCAGACGTGAAGCGCCACACCAGCA ACTTCGAGCCAACGGAGATGggtaagttcgtccctctcaacaccagcaacgatacctccaagctgatccggatcgggctccgagctcgaccccaaat aggaagcagtctcgtcgactttctccgtgcaaacaccgatgtttttgcatggaatccctcggacatgcccggcataccgagggatgtcgccgagcactcgctgg atatccgagctagagcccgacccgtgaagcagcctctgcgccggttcgacgaagaaaagcgcagagccataggcgaggagatccacaagctaatggcggt agggttcatcaaagaggtattccatcccgagtggcttgccaaccctgtgcttgtgagaaagaaaggagggaaatggcgtatgtgtgtagactacactggtcta aacaaagcatgtccaaaagttccctaccctctgcctcgcatcgatcaaatcgtggattccactgctgggtgcgaaaccctgtctttcctcgatgcctactcagG GTATCGCCAAATCAGGATGAAAGAGTCCGACCAGCTCGCGACTTCTTTCATCACACCTTTCGGCATGTACTGCTATGTTA CCATGTCGTTTGGTTTGAGGAATGCGGGTGCGACATACCAAAGGTGCATGAACCACGTGTTCGGCGAACACATTGGTCG AACGGTCGAGGCTTACATCGATGACATCGTAGTCAAGACGAGGAAAGCCTCTGACCTCCTTTCCGACCTTGAAACGACA 
TTCTGGTGTCTCAAGGCGAAAGGTGTAAAGCTCAATCCCGAGAAGTGCGTCTTCGGGGTCCCCCAAGGCTTGCTCTTGG GGTTTATCGTCTCCGAGCGGGGCATCGAGGCCAACCCAGAGAAAATCGTGGCCATCACCAACATGGGGCCCATCAAGG ACTTGAAAGGCGTACAGAGGGTCACGGGGTGCCTTGCGGCTCTGAGCCGTTTCATCTCACGCCTCGGCGAAAGAGGCC TGCCTCTGTACCGCCTCTTAAGGAAGGCCGAGTGCTTCACTTGGACCCCTGAGGCCGAGGAAGCCCTCGGGAACCTGAA GGCGCTCCTCACGAACGCGCCCATCTtggtgcccccgcggccggagaagccctcttgatctacgtcgccgctaccactcaggtggtcagcgccgcg atcgtggttgagagacgagaagagggacatgcattgcctgtccagaggccagtctacttcgtcagtgaggtactgtccgagaccaagatccgctacccacaa attccgagtctcatccggtgactgtggtgtcatctttccccctgggggagatcatccagtgccgagaggcctcgggtaggattgcaaagtgggcggtggaaatc atgggcgagacaatctcgttcgccactcgtaaggccataaagtcccaagtcttggcggactttgtggctgaatgggtcgatacccaGCTCCCGACAGCT CCGATCCAACCGGAACTCTGGACCATGTTTTTTGACGGGTCGCTGATGAAGACAGGGGCAGGCGCGGGCCTGCTCTTCA TCTCGCCCCTCGGGAAGCACCTACGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATGTGGCCGAGTACGAGGCT CTggtcaacgggttgcgcgtcgccatcgagctagggatccgacgtctcgacgctcgcggtgactcgtagctcgtcattgactaagtcatgaagaactcccact tctgcgactcgaagatggaagcctactgcgatgaggttcggcgcctggaggacaagttctatgggctcgagttcaaccacatcgcccgacgctacaacgaga ctgcggacaagctggctaagatagcctcggggcaaacaacggttcccccggacgtcttctcctgagacctgcatcaaccctccgtcaagACCGACGACA CGCCCGAGCCCGAGAAGGCCTCGGCCCAGCCCGAGGCACCCTCGGCCCCCGAGGATGAGGCACTGCGTGTCGAGGAG GAGCGGAGCGGGGTCACGCCTAATCGAAACTGGCAGACCCCGAACCTGCAATATCTCCACCGAGGAGAGCTACCCCTC GACCGAGCCGAAGCTCGGCGGTTGGCGCGGCGTGCCAAGTCGTTCGTCTTGCTGGGGGACGGGAAGGAGCTCTACCAT CGCAGCCCCTCAGGCATCCTCCAGCAATGCATATCCATCACCGAAGGCCAGGAGCTCTTACAAGAAATACACTCGGGGG CTTGCGGGCATCACGCGGCGCCCCGAGCCCTTGTTGGGAACGCCTTCCGACAAGGTTTCTACTGGCCAACCGCGGTGGC CGACGCCACTAGAATTGTTCGCACCTGCCAGGGGTGTCAATTCTACGCAAGGCAGACTCACCTTCCCGCCCAGGCTCTAC AGACCATACCCATCACCTGGTCGTTTGCTGTGTGGGGTCTGGACCTCGTCGGCACCTTGCAGAAGGCACCCGGGGGCTA CACGCACCTGCTGGTCGCCATCGACAAATTCTCCAAGTGGATCGAGGTCCGACCCCTAAACAGCATCAGGTCTGAACAG GCGGTGGCGTTCTTCACCAACATCATCCATCGCTTTGGGGTCCCGAACTCCATCATCACCGACAACGACACCCAGTTCAC CGACAGAAAGTTCCTGGACTTCTGCGAGGATCACCACATCCGGGTGGACTGGGCCGCCGTGGCTCACCCCATGACGAAT 
GGGCAAGTAGAGCGTGCCAACGGCATGATCCTGCAAGGACTCAAGCCGTGGATCTACAACAACCTTAACAAGTTCGGC AAGCGATGGATGAAGGAGCTCCCCTCGGTGGTCTGGAGTCTGAGGACAACGCCGAGCCGAGCCACGGGCTTCACACCG TTCTTTCTAGTCTATGGGGCCGAGGCCATCTTGCCCATAGACTTAGAATACGGTTCCCCAAGGACGAGGGCCTACAACG ACCAAAGCAATCGAGCTAACCGAGAAGACTCACTGGACCAGCTGGAAGAGGCTCGGAACATGGCCTTCCTACACTCGG CGCGGTATCAGCAGTCCCTGCGACGCTACCACGCCCGAAGGGTTCGGTCCCGAGACCTCCAGGTGGGCGACTTGGTGCT TCGGCTGCGACAAGACGCCCGAGGGCGGCACAAGCTCACGCCTCCCTGGGAAGGGTCGTTCGTCATCGCCAAGGTTCT GAAGCCCGGGACGTATAAGCTGGCCAACAGTCAAGGCGAGGTCTACAACAACGCTTGGAACATCCGATAG Protein sequence: MAADNPPAGGGIDDVFPTWRKNDIRACPVPSPVDGGGGGATKAKQEAAPRRLSSESTAPVPPTRGAMGIDIASETKT SAVSPQHANSKQTDDASTLAKDLLGVTLVPETTVQSTPDVTSSPPVDQDVPTDSHLAPFGFSLDPPSDFALVDAFIEASP NPPGYRVRSPWDRLTAVSTYGPSGSEEDDEPDFCWDFSGLGNPSAMRDFMTTCDYCLSDCSDGSRSLGDEDYGPSRE CFHVDLGGPGEGNHPGIPENGDPPRPAPRVDILRELAVVPVPAGVRTHSSSKSARCRPGSTREQEHLSRSAGTSGRNG QANLRPEKRAIYPRASNTASPTMSGQGRHRPPVGSARTWLQRQYFSARCRSHLPPRGGVSRESSRISWRMSRSDGPK APPPEGRGTPRNIAPRLPNSCGKPRSTPGARATQRLRPWVASATNTLTATVEPTSTRGCRQQPGESLWDYIRQFSKQR TELPNVTDSDVIGAFLADTTCRDLVSKLGRKTPTRASEDAVEAIFRKDKQPQGRPPEDVPEASTQRGIKKKGKKKSQAKR DAADANFVAAAEYKNPRKPPGGANLFDKMLKEPCPCHQGPVKHTLEECAMLRRHFHKAGPPAEGGRARDDDKKED HKAGEFPEVHDCFMIYGGQVANASARHHKQERREVCSVKVAAPVYLDWSDKPITFDQGDHPDRVPSLGKYPLVVDPV IGNVRLTKVLMDGGSSLNVIYAKTLGLLRIDLSSETLTFEVVGFRGTYHAVLRRPCYAKFMVVPNYTYHKLKMPGPNGVI TVGPTYRHAYECDVECMEYAEALAKSEALIADLESLSKEAPDVKRHTSNFEPTEMGYRQIRMKESDQLATSFITPFGMYC YVTMSFGLRNAGATYQRCMNHVFGEHIGRTVEAYIDDIVVKTRKASDLLSDLETTFWCLKAKGVKLNPEKCVFGVPQG LLLGFIVSERGIEANPEKIVAITNMGPIKDLKGVQRVTGCLAALSRFISRLGERGLPLYRLLRKAECFTWTPEAEEALGNLKA LLTNAPILLPTAPIQPELWTMFFDGSLMKTGAGAGLLFISPLGKHLRYVLRLHFPASNNVAEYEALTDDTPEPEKASAQPE APSAPEDEALRVEEERSGVTPNRNWQTPNLQYLHRGELPLDRAEARRLARRAKSFVLLGDGKELYHRSPSGILQQCISIT EGQELLQEIHSGACGHHAAPRALVGNAFRQGFYWPTAVADATRIVRTCQGCQFYARQTHLPAQALQTIPITWSFAVW GLDLVGTLQKAPGGYTHLLVAIDKFSKWIEVRPLNSIRSEQAVAFFTNIIHRFGVPNSIITDNDTQFTDRKFLDFCEDHHIR VDWAAVAHPMTNGQVERANGMILQGLKPWIYNNLNKFGKRWMKELPSVVWSLRTTPSRATGFTPFFLVYGAEAILP 
IDLEYGSPRTRAYNDQSNRANREDSLDQLEEARNMAFLHSARYQQSLRRYHARRVRSRDLQVGDLVLRLRQDARGRH KLTPPWEGSFVIAKVLKPGTYKLANSQGEVYNNAWNIR Augustus gene prediction Augustus predicted 13 genes. The predicted genes were then translated into peptides. These peptides were used as queries to run Blastp in the swissprot database. Only 2 of them had significant hits. One belongs to the Reverse transcriptases (RTs) superfamily, the other belongs to the RNase H superfamily. Segment 1: 65858 --- 67411 CDS 65858 --- 67411 1553bp RT_LTR[cd01647]: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT_Rtv[cd01645]: Reverse transcriptases (RTs) from retroviruses (Rtvs). RT_ZFREV_like[cd03715]: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. >ATGCCCGGCATACCGAGGGATGTCGCCGAGCACTCGCTGGATATCCGAGCTGGAGCCCGACCCGTGAAGCAGCCTTT GCGCCGATTCGACGAAGAAAAGCGCAGAGCCATAGGCGAGGAGATCCACAAGCTAATGGCGGCAGGGTTCATCAAAG AGGTATTCCACCCCGAATGGCTTGCCAACCCTGTGCTTGTGAGAAAGAAAGGAGGGAAATGGCGGATGTGTGTAGACT ACACTGGTCTAAACAAAGCATGTCCGAAAGTTCCCTACCCTCTACCTCGCATCGATCAAATCGTGGATTCCACTGCTGGG TGCGAAACCCTATCTTTCCTTGATGCCTACTCGGGGTATCACCAGATCAGGATGAAAGAGTCCGACCAGCTCGCGACTTC TTTCATCACACCCTTCGGCATGTACTGTTATGTTACCATGCCATTCAGTTTGAGGAATGCGGGTGCAACGTACCAACGGT GCATGAACCACATGTTCGGCGAACACATTGGCCGAACGGTCGAGGCCTACGTCGATGACATCGTAGTCAAGACGAGGA AAGCCTCCGACCTCCTTTCCGACCTTGAAGCGACATTCCGATGTCTCAAGGCGAAAGGCGTGAAGCTCAATCCCGAGAA ATGTGTCTTCGGGGTTCCACGAGGCATGCTCTTGGGGTTCATCGTCTCCGAGCGGGGCATCGAGGCCAACCCGGAGAA GATCGCGGCCAACACCAGCATGGGGCCCATCAAGGACTTGAAAGGCGTACAGAGAGTCACAGGATGCCTTGCGGCTCT GAGCCGTTTCATCTCGCGCCTCGGCGAAAGAGGCCTACCTCTGTACCGCCTCTTAAGGAAGGCCGAGTGCTTCACTTGG ACCCCTGAGGCCGAGGAAGCCCTCGGGAACCTGAAGGCGCTCCTCACGAACGCGCCCATCTTGGTGCCCCCCGCTGCCG GAGAAGCCCTCTTGATCTACGTCACCACGACCACTCAGGTGGTTAGCGCCGCGATTGTGGTTGAGAGACGAGAAGAGG 
GGCATGCATTGCCCGTACAGAGGCCAGTCTACTTCATCAGTGAGGTACTGTCCGAGACCAAGATCCGCTACCCACAAAT TCAGAAGCTGCTGTACGCAGTGATCCTGACACGACGGAAGTTGCGACACTACTTCAAGTCTCATCCGGTGACTGTGGTG TCATCCTTCCCCCTGGGGGAGATCATCCAGTGCCGAGAGGCCTCGGCTAGAATTGCAAAGTGGGCGGTGGAAATCATG GGCGAGACGATCTCGTTCGCCCCTCGGAAGGCCATCAAGTCCCAGGTCTTGGCGGACTTTGTGGCTGAATGGGTCGACA CCCAGCTCCCAACAGCTCCGATCCAACCGGAACTCTGGACCATGTTTTTCGACGGGTCACTGATGAAGACAGGAGCAGG CGCAGGCCTGCTCTTGATCTCGCCCCTCAAGAAGCACCTACGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATG TGGCTAAGTACGAGGCTCTAGTCAACGGGTTGCGCATCGCCATCGAGCTGGGGGTCTGA Protein sequence: MPGIPRDVAEHSLDIRAGARPVKQPLRRFDEEKRRAIGEEIHKLMAAGFIKEVFHPEWLANPVLVRKKGGKWRMCVDYTGL NKACPKVPYPLPRIDQIVDSTAGCETLSFLDAYSGYHQIRMKESDQLATSFITPFGMYCYVTMPFSLRNAGATYQRCMNHMF GEHIGRTVEAYVDDIVVKTRKASDLLSDLEATFRCLKAKGVKLNPEKCVFGVPRGMLLGFIVSERGIEANPEKIAANTSMGPIKD LKGVQRVTGCLAALSRFISRLGERGLPLYRLLRKAECFTWTPEAEEALGNLKALLTNAPILVPPAAGEALLIYVTTTTQVVSAAIV VERREEGHALPVQRPVYFISEVLSETKIRYPQIQKLLYAVILTRRKLRHYFKSHPVTVVSSFPLGEIIQCREASARIAKWAVEIMGE TISFAPRKAIKSQVLADFVAEWVDTQLPTAPIQPELWTMFFDGSLMKTGAGAGLLLISPLKKHLRYVLRLHFPASNNVAKYEAL VNGLRIAIELG Segment 2: 86898 --- 88664 2 exons 1 CDS 86898---87090 192bp 2 CDS 87304---88664 1360bp RNase_HI_archaeal_like[cd09279], RNAse HI family that includes Archaeal RNase HI RVT_3[pfam13456], Reverse transcriptase-like; This domain is found in plants and appears to be part of a retrotransposon. 
RNase_H[cd06222], RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner RnhA[COG0328], Ribonuclease HI [DNA replication, recombination, and repair] PRK07238[PRK07238], bifunctional RNase H/acid phosphatase; Provisional PRK07708[PRK07708], hypothetical protein; Validated >ATGTTTTTTGACGGGTCGCTGATGAAGACAGGGGCAGGCGCGGGCCTGCTCTTCATCTCGCCCCTCGGGAAGCACCTA CGCTACGTGCTACGCCTCCACTTCCCGGCGTCCAACAATGTGGCCGAGTACGAGGCTCTGGTCAACGGGTTGCGCGTCG CCATCGAGCTAGGGATCCGACGTCTCGACGCTCGCggtgactcgtagctcgtcattgactaagtcatgaagaactcccacttctgcgactcga agatggaagcctactgcgatgaggttcggcgcctggaggacaagttctatgggctcgagttcaaccacatcgcccgacgctacaacgagactgcggacaag ctggctaagatagcctcggggcaaacaacggttcccccggacgtcttctcctgagaCCTGCATCAACCCTCCGTCAAGACCGACGACACGCC CGAGCCCGAGAAGGCCTCGGCCCAGCCCGAGGCACCCTCGGCCCCCGAGGATGAGGCACTGCGTGTCGAGGAGGAGC GGAGCGGGGTCACGCCTAATCGAAACTGGCAGACCCCGAACCTGCAATATCTCCACCGAGGAGAGCTACCCCTCGACC GAGCCGAAGCTCGGCGGTTGGCGCGGCGTGCCAAGTCGTTCGTCTTGCTGGGGGACGGGAAGGAGCTCTACCATCGCA GCCCCTCAGGCATCCTCCAGCAATGCATATCCATCACCGAAGGCCAGGAGCTCTTACAAGAAATACACTCGGGGGCTTG CGGGCATCACGCGGCGCCCCGAGCCCTTGTTGGGAACGCCTTCCGACAAGGTTTCTACTGGCCAACCGCGGTGGCCGAC GCCACTAGAATTGTTCGCACCTGCCAGGGGTGTCAATTCTACGCAAGGCAGACTCACCTTCCCGCCCAGGCTCTACAGAC CATACCCATCACCTGGTCGTTTGCTGTGTGGGGTCTGGACCTCGTCGGCACCTTGCAGAAGGCACCCGGGGGCTACACG CACCTGCTGGTCGCCATCGACAAATTCTCCAAGTGGATCGAGGTCCGACCCCTAAACAGCATCAGGTCTGAACAGGCGG TGGCGTTCTTCACCAACATCATCCATCGCTTTGGGGTCCCGAACTCCATCATCACCGACAACGACACCCAGTTCACCGAC AGAAAGTTCCTGGACTTCTGCGAGGATCACCACATCCGGGTGGACTGGGCCGCCGTGGCTCACCCCATGACGAATGGG CAAGTAGAGCGTGCCAACGGCATGATCCTGCAAGGACTCAAGCCGTGGATCTACAACAACCTTAACAAGTTCGGCAAGC GATGGATGAAGGAGCTCCCCTCGGTGGTCTGGAGTCTGAGGACAACGCCGAGCCGAGCCACGGGCTTCACACCGTTCT TTCTAGTCTATGGGGCCGAGGCCATCTTGCCCATAGACTTAGAATACGGTTCCCCAAGGACGAGGGCCTACAACGACCA AAGCAATCGAGCTAACCGAGAAGACTCACTGGACCAGCTGGAAGAGGCTCGGAACATGGCCTTCCTACACTCGGCGCG GTATCAGCAGTCCCTGCGACGCTACCACGCCCGAAGGGTTCGGTCCCGAGACCTCCAGGTGGGCGACTTGGTGCTTCGG 
CTGCGACAAGACGCCCGAGGGCGGCACAAGCTCACGCCTCCCTGGGAAGGGTCGTTCGTCATCGCCAAGGTTCTGAAG CCCGGGACGTATAAGCTGGCCAACAGTCAAGGCGAGGTCTACAACAACGCTTGGAACATCCGATAG protein sequence: MFFDGSLMKTGAGAGLLFISPLGKHLRYVLRLHFPASNNVAEYEALVNGLRVAIELGIRRLDARDLHQPSVKTDDTPEPEKASA QPEAPSAPEDEALRVEEERSGVTPNRNWQTPNLQYLHRGELPLDRAEARRLARRAKSFVLLGDGKELYHRSPSGILQQCISITE GQELLQEIHSGACGHHAAPRALVGNAFRQGFYWPTAVADATRIVRTCQGCQFYARQTHLPAQALQTIPITWSFAVWGLDL VGTLQKAPGGYTHLLVAIDKFSKWIEVRPLNSIRSEQAVAFFTNIIHRFGVPNSIITDNDTQFTDRKFLDFCEDHHIRVDWAAV AHPMTNGQVERANGMILQGLKPWIYNNLNKFGKRWMKELPSVVWSLRTTPSRATGFTPFFLVYGAEAILPIDLEYGSPRTRA YNDQSNRANREDSLDQLEEARNMAFLHSARYQQSLRRYHARRVRSRDLQVGDLVLRLRQDARGRHKLTPPWEGSFVIAKVL KPGTYKLANSQGEVYNNAWNIR (100001-150000) Norman Best catgttgcggcgctggagaaagaatcaggttgtggcactgctgctgaatt gtgatagtgctttgcatcctcactaaaacgtgagccgaatactagagaaa gtgtatgcagcggcgccgaacgatgagaaattaaatcagaggcattatgg ggagtttatatgatgacgaaaaatgaaaggaaaaagaaatattaaaaaaa aagttggaatactattttggagagtaatatcacttataaatactattata gatgaagattattttaaaaataccataaagaagaagcctattaaaacagg tactacagtagaaaaaaaactccgaccggacgacggagagaaaggcggac caaacagaccaaatagacggagtcatttccacacttgtgtgcagacgtca ctatggggctgtagccgagaagagagagacacggcccaacaggcataaca gaccggaccatttccagccctcacatgggccaagctgcagccgacaacat accagatttaatatgggcctacaacagcagaagaccggccccagaaatag tgaaatactactgctaggaaatcgggacatgatgattgtttgttttctaa gctcgttcgccctctctcttctctccgtccaacgacggcgcgattcagat ccaggcatacaggccatccttccgccgccgcggactgcacctaatcttcc ttctccagcagccgggcgttcccgtcgtcctcgatcgtcccctggactgg atcgcaacggagatccagctggacctgcggtggtcccttccttttctcgg taggccgccttgaatgaattcctcgttttctgtaatgcatattctttgct tactatgaatcctcaaatccgaatcttttatgtggtggatgtgatggatt gtttcctttgcaagatagcataatgagctcgccagtggagaggtgattcg tcaggctggtatcaagttcctagtccatggtgcctggatgagcagtagag 101001-102816 Forward Strand Exon 1: 101001-101557 Exon 2: 102747-102816 Identified through Augustus, FGENESH had same first exon but different second exon. Did a blastp of both results and found better homology to Augustus result. 
Shows high homology to Sb03g026500 and Os01g0589700, with supporting EST data from Sorghum bicolor, Saccharum officinarum (sugarcane), and Zea mays (Phytozome). Maizegdb.org shows both EST and cDNA data supporting my gene structure. A splice donor site is present after exon 1 and a splice acceptor site is found before exon 2. No polypyrimidine tract was identified. Protein: MLLAVEGGGFFSSSATGYSHGLALLLLGRKAEEKPVKVSPWNQYRLVDRETEQVYHLPSAKDQAPGKCAPFVCFGCT ANGLEVASPPKATSSNALGTGTSQEEASCSANKTLTTSGSISGSERRGCLKSNSKRDSLEHRIVVSEGEEPRESLEE VQTLRSSMERRKVQWTDTCGKDLFEIREFETSDESLSDDDPENEGFRKCECVIQ ATGCTACTGGCCGTGGAAGGAGGAGGGTTCTTCTCGTCTTCAGCTACAGG TTACAGCCATGGCCTCGCCCTCTTGCTGCTCGGACGCAAAGCTGAGGAGA AGCCCGTAAAGGTCTCGCCGTGGAACCAGTACCGGCTAGTCGACCGGGAG ACCGAGCAGGTGTACCACCTGCCCTCCGCCAAGGACCAGGCTCCTGGGAA ATGTGCTCCCTTTGTCTGCTTCGGATGCACGGCTAATGGCCTCGAGGTGG CGTCTCCTCCAAAAGCGACCTCCAGCAATGCGCTTGGTACCGGTACCTCA CAGGAAGAAGCATCTTGTTCAGCAAACAAGACGTTGACCACTAGTGGTTC CATCAGTGGCAGTGAGAGACGAGGCTGTCTTAAGAGCAACTCCAAAAGGG ATTCTTTGGAGCACCGTATAGTGGTGAGCGAGGGTGAAGAACCGCGTGAG TCTTTGGAAGAGGTGCAAACCCTGAGATCTAGCATGGAACGGAGGAAAGT TCAGTGGACAGATACATGTGGAAAGGATCTTTTTGAGATAAGGGAGTTTG AAACAAGgtatgatccaaaaatcctatctttccttgcatgccattgtggc ctggaaatatattgtgtttgtggttactaagtctgaacaccctattgtca attgtcatacatgaatgaatgcacatttctaacttcttctattaatgaat ccaaaatccaaaataaatgtgcacaacattatctcaattttttgtcatgg ttatttttgtgctattaggtaaccatcttgtgaggtaacttatcttcaag ttttcaagtaacaaagataccacaaagttcgagccatcttaatgagtagt ctccaaccaaaaggttaaaacttaatgccttcttcctcatagtcaaatct cctctaaatccctatgcacatacttttagatgtaaagacaattccagata ctaaaaatgtaacatatgcttaccaaaatgttctatattgtgaacatgcg caagtgcatggagtttgcactagtttatagctcattgaagaagcagtgca cctgcaccattaaaggggtgtttggtttctagggactaatgtttagtccc tggattttattccattttagttccaaaattaccaaatatagaaactaaaa ctttattttagtttctatatttagcgatttatacactaaaaaggaataaa atgaagggactaaacattagtccctggaaaccaaacaccccctaaatgat gcaaatatgtgtgtggtctggcttgctgcagaatcctcattactactaaa ggccccgcagagcttcgggcagctctagctccagttcttccatgcagtat agggacaagccgcttttagctctcctatcaaaatgaatagaagctggtga 
agccattatttttagcttcattagctctagctcctctgcgtgctgtaggc tgtagcaaggagccggagccatttaacggaacctaaatgttaacactatg aagaatagtcatatctttgtttgatttgcactcattacatcgtaagccaa gttccctacaaattaattgacactgattttccatatgcatgttgaaatgt atctggccacgtttactactcacaaactataggtcacgaagtcagataga cgtggtgatacatatattttgtgttcttgtgttctactattgctttgctg tacctgtatactatattctgattgctgccttttggtgtcttggtagCGAC GAGAGTCTGTCAGATGATGACCCAGAAAATGAAGGTTTCCGGAAATGTGA GTGTGTGATTCAGTAG(End of Gene)atttcttgtgtgcacaacaagcctgacaggaaca tctcctcatgggaagcccatgtcctgctgagaaggacatgccatatgttc ttctgggaaggatatgccacacccagtgatttcattgcaactggaacgat cgatgacagtcgaaacaaagcaaacggtgatgaatcgatattgtgctaga ttatttgttggttctacgcttaactagtctgctcggcagggttatagttt cgatgcccacctgtacaataccaagatcccattcctttctttttgttgtc atgttttgtttctctgactttccatcacctttttcacgcccgctgtacca tatgaggctgccctgttacttctgccatggatggtggatccctgacttca tgctgttcccttttgttttgcccctaacgtgttgagtgatgtcttgttga cattttcgtccaaattggtaaaattcattcttcgttgcgtaagagtgaca tacgagggctatttggttgtttaactcaaagtccggcgtactgatgtgga tctgtctttctgcccctataatgtcgatgagacattaatgtcttctgtga gaaaatgtatatattggttgctaggtacaattaccttttgacaagttttt atattataatacagtagtagggccggggcaggaaaatattgaggccttgt gccaaatctaaagtaggatctatactttaaaaaatcgaatattaacaatc taaaacttgcaatagcaaatattgtaatggaatatataaagaaacatatc tatttaattagaataataccattcttttggtatatttttgaaataaaatc ttctatgatatgtttatattcgatcttctttaacacttcactcttcagcg atattatagacaaatcattaagtatttgttgtattattttagaacataaa tttaacttcagtaacttcaatttagagaaactttgttttgaagatgcaac agttgcaccagaatagtcagcaaaactctatatgcaatacatggattagg gaaaaaaatcatgctgctttagaaacttcacaatttacaaccgccccata tttttagttagcatggaattttgaagaaactttaactccatataaagttc atgggcataaatacctgatttttatccttgtaagggcaacctctagatta tcaaaaacagggtcttgatcatgtgactgattgctggatgacacgtgtga ttgtttagtaataatacatatatctgtagtagtagctcttttttacactc aattgaattctctattatgtatttcttcttatgtttttggtaacctgaat catattttctagacgtatttatgaaaaacatgattctgatgtgtaacggt tcactatttgcacatatataaattaaaaaatcaatctatattaatagaac aatatataatcggtcattgttataaacatattattaaataaatacctaat 
aaagctaggtagtcgatgtccctgatcaccaacacatcaataaggatgct gctgcgtaggttataggcactatggtgttccaaaagatgcatcacgcagg caagatatcttttaagaggctatgaaaattcattctttggtgcgacgttc ggtagtttcgttttagcgtagtggtaggataagctaaaaagttaagatag atctgctaattttttgtaccctcatttttaggccatgcgcaagtgcacct tttgcaaatgcataggcccgggcctgtacagtagtaaacgtgctcatgtg ttgtaaaggaatgctcaacttgttatctcgtcacttgaatatggtcaact taatgaattgagctagggatttatacaataattatgacatttatttgaat atgtctagtgttctagcttatgaacatgtattctaagtccagaggatgga caacgcatcgagatgggaaggtgtgtggggaaaaaaatcttttttctact atattaaagcaaaatttccacggtttcatggttctacacgccttcatctt tttccgctttatgcaaaaacaacataaaattgtgcacaaataggtgttca aaccttagttgttggctccacatttatatccaccaaaccaatagaacaca catatttttatgatttattaaaacaaattctacccatatgataaattgaa actatagctgttgggggactacatctgcataggttttgtttaagagttta attattggcctagtttagagcacgaggaatagaagtcctaattcacccct cttaggcatacatgttcctttcacgggatacttattgtcttcaaagctag gaggttgctcaacattgacatcatccttgattggtccattaagaaatgtg ctcttaacatccatttgaaatatcttaaaaccatggtcagtagcatagac taataatatgcgaatagatttaagcctagttataagtgcaatggtctaat caaagtccaaacatgtgaattgggcataaccttttgccataagtctagcc ttgttccatgtcaccacaccatgttcatcttgttttttgcggaacacaca cttggtttcaacaacattctaatttggatgtggcactaggttctatactt caatccttttgaagttgttgaactcttcctgcatggccatcacccactct ggatccttcaaggcatattcaatcctgaaaggctcaacaaaagacacaaa agagtaatgttcacaaaaaattgcaattctagaacgagtggttactcgct tgttgatgtctccttaggtgtggcacttggcttgttgttagtggtggtac attatcttcatcctccttgtcttgtgcttcctgctctcccccttgatcac tgccaccatcttgagcttcttgctcctcgtcctagcttggggttgtgcct aagtggaggatggttgatcttgcttttgtggtcgcccatctccaatggac atgtttcttagtgtagtgcacgaagcctcctcgtcatctatctcatcaag atcaacttgctcccattgagagccattagtttcatcaaacacaacgtcac aagcaatctcaacacatccagatgatttgttgaagactctatataccttt gtgtttgaatcataaccgagtaaaaaccttctacaacttcaggaggaaat ttagaatttatacctttttaccaaaatataacatttgctctcaaagactc taaaatatgataaattaggtttgttaccggtaagaagttcatatggtgtc ttcttgaggagtcgaagaagatagaacctgttgatggcatgacaagctgt gttgatagcttccgcccaaactgatctgatgtcttgtactctttaagcat ggtcctcgccatatcgattagagttctattcttcctttccattacaccat 
tctgttgtgatgtataaggagaagaaaactcatgcttgattctctcctcc ttaagaatccttgtatttgagtgttcttgaactccgttccgttgtcgctc cttatcttcttgattttcaagttggactcattttgagctcttctcagtaa cctcaagaatatatttaatgtcgcctcttatcttcttgtatttatatata gtctaactgaaaataaagagggatgttttttaacctacaattgcttaaat tagttcgcggatgccaaattcaagacctaaaagaataataagggcttgtt tggaagcaccacaatttctaagaaactagtttctattcctagtatctcta gtttatttatataaaaaactgagtgtgcatgtttaggatctagtttctta gaaaacaacttctagcttttcgaaacacaaatacaagtttattcttactt tattttaaaattaatcgattcaattatttccttcatttttgttgctaact ttaaatatagcttgaactcatattttttaatttaaaattttcaaatcata attaaacattcaaatcatgcgtaagggaatcacggcctatgtcctgggag caatgggaatcaacgtaatattcatcaaaccaactaaaggtagtacataa tttgtctagcaaaaactgatttttagctagggaaaatcagcttctagatt tctaaattattggtttcctagaaactggaaattcagtttctaaaaactta agtataaaatggtatgtttggattcatctcattttttataaaccaatttt ttttaaaaaaaaactcggtgctccgggtctgcttggtgagattataatct gaccagattatataatctaacaaattttgaactaactatataatctggac agattataatcccaaacaaacaccctctaagtcttctataagcactaggt tcctactatcttagcatccaaaacctgctaagacaaagtgtgacatacca ataccagctgaataatcgtataggaaaactcaattttacaagctatcgtt gctttgatgtagtagtaactagtaaccgtaaaagaaaagcggtactacaa gcagcccatgcgacagggacaatcccatcagaatgtctgattgtgctgcg cccatgttgagtccctgcaaacctcacaaactgaccgttaggagacagca agttcctggccccacaatcctatccacaaccacaacgagagcgtaggggt gtgtgtgtgtgtccaaaccttctccacgcgttccttccttccctccctcc ctccctcccctcccaagaaacacaagagccaggcagcgcttgcgaggctc aagcgccgcagca 107614-108618 Forward Strand Exon 1: 107614-108056 Exon 2: 108504-108618 Augustus and FGENESH predicted exactly the same gene structure. Nucleotide BLAST (nr) showed a plant light-harvesting domain (PLN00014). Shows high homology to Sb03g026510 (2e-91) with supporting EST/cDNA data. Also has supporting EST data from Saccharum officinarum (sugarcane) and Zea mays (Phytozome). Maizegdb.org shows both EST and cDNA data supporting my gene structure. 
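A quick sanity check on the exon coordinates above can be sketched as follows. This is a minimal illustration, assuming the coordinates are 1-based and inclusive and that the last exon ends with the stop codon; the function and variable names are hypothetical and not part of any annotation tool.

```python
def exon_lengths(exons):
    """Length of each exon, assuming 1-based inclusive (start, end) pairs."""
    return [end - start + 1 for start, end in exons]


def cds_summary(exons):
    """Summarise a predicted CDS: exon lengths, total length, and frame."""
    lengths = exon_lengths(exons)
    total = sum(lengths)
    in_frame = total % 3 == 0
    return {
        "exon_lengths": lengths,
        "cds_length": total,
        "in_frame": in_frame,
        # Residue count excludes the stop codon (assumed to be in the CDS).
        "residues": total // 3 - 1 if in_frame else None,
    }


# Exon coordinates as listed for the 107614-108618 forward-strand gene.
gene_exons = [(107614, 108056), (108504, 108618)]
summary = cds_summary(gene_exons)
print(summary)
```

Under these assumptions the two exons are 443 bp and 115 bp, giving a 558 bp coding sequence that is a multiple of three and corresponds to 185 residues plus a stop codon.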
Protein: MSLAPSIPSIKVKVGPVSVAPPHRACRSFAVVRSSKAEGPIRRPAAPPLSPPPKTPALSTPPTLSQPPKPAAPPTSS ESTPPSPQPKAAVATAPAAALQRPLAGAVTLEYQRKVAKDLQDYFKKKKLDEADQGPFFGFVPKNEISNGRWAMFGF AVGMLTEYATGSDFVQQLKILLSNFGIVDLD ATGTCTCTGGCCCCGTCCATCCCTTCCATCAAGGTGA AGGTGGGGCCCGTCTCGGTGGCGCCCCCGCACCGCGCATGCCGATCCTTC GCGGTGGTCAGGAGCTCCAAGGCGGAGGGCCCCATCCGGAGACCCGCGGC GCCCCCGCTGTCGCCACCGCCAAAGACGCCGGCTTTGTCCACTCCTCCCA CCCTGTCGCAGCCTCCCAAGCCCGCTGCTCCACCCACGTCGTCGGAGTCG ACACCACCGTCTCCTCAGCCGAAGGCGGCTGTGGCCACAGCCCCCGCGGC AGCGCTGCAGAGGCCGCTGGCTGGGGCTGTGACGCTGGAGTACCAGAGGA AGGTGGCCAAGGACCTGCAAGACTACTTCAAGAAGAAGAAGCTGGACGAG GCCGACCAGGGCCCATTCTTTGGGTTCGTGCCCAAGAATGAGATTTCCAA CGGAAGgtacgttcgcggtttggccccctgcaagatgcatcgaaatcatg aactacaaagccttttcaagtagtccagtatggtttgtgatgagattttt cgtagtactttctactatttgaaggttttttatctagcctgagcatttca gacagcatggtccttaatagcaatcaatcgatgagtcgatcaaacattgg gcttgagattgattaatcagttggaaatgacatttttccaagtcatctgg aacaagaccttgtgatgttaccacagactattatgctccccttcacgcta ctcaatctactcgttctaactaaatatgggtccacaataggcatgctgaa aaaaatacacgatgggtagatgcaaagattcaccacgttcttgagtaggt taaggaagctcatctctatgcgctaattcatcagctttctacaaaccttg cagGTGGGCCATGTTTGGGTTTGCAGTAGGGATGCTAACAGAGTACGCAA CAGGCTCTGATTTTGTTCAGCAATTGAAGATCCTTCTTTCCAATTTCGGA ATTGTGGACTTGGATTAA(End of Gene)taatgccgggcttggtgttttcgttatgcata agtcttttaaatgtaatgtacttaattgatatggcatatagaaaatttta tcttgcaccttgtttagcttggtttcgtttgtgctgccctgtttggatac acatgaataccaatttccagtttccttccgttaggtaccttgatttacac gggagaaggacaacactgtaggattgaccgagttgcagtgtgggcaaaaa ggattatctttaatcaagttgccaaatgccaataatagtatatgcttcta gataaataacaacggatttagattggtattgctatattatgatgtgaata tgtcattgttctatcctggttcatcaatctagcattgagcaacaggatca taagacaaaaggagatacttaatttgaaagcacttaaaaagatgcagttt caaaaggtaacaccctgcatgcggaagttatgcatgatggcgttaaccaa gctctgcaattgctttttcctcctgttattgcacaaagaatatcaatcac tacaagaaactgataaacgttcgtgggttttgttaagaacgtgggtaccg acgttcttatattaattaagttcatgggtttatttaagaacgtgggtacc cacgttcttacattaagttcgtgggctctaaccaaaaccgtcgaacttaa ccttttaaaaacgtgggtacccacgttcttaaattaaatacgtgggccgc 
gtggacaccgtcgaacttaacccaacctaaatccctaatcccgcgcatct ctcttcctatctcctttccggcttcccaccagtgctcgcacccgccgccg gccgccagcccgagcacactgcgcgcccggccgccgccacccagcgcgtt gcgccgcaccaccaccccgctcgcgcgctacctcgctcgctgcgccgcgc cgcccgctcgacagccccacgccgcctggccgcccgcgctgctcctctcc tgagatgggtctgagaaagagaaacgccatggttggtgaggcagccatgt tcgccacatcttggtagccctcactcggcattgacatcctctcgttcttg gacttgaagttaactcagccatgaggaagcatgggtggcagctgccgtac caccctctccaggtataatccacgtccactcacctgcgtggggttgtgcg ctcgcctttccctcctcctcgcagtgttcttgccgatcctagcatttgcc tcacgcgtcgcgctctgttatagtgttcagcgactgactggctgactggc gggcgagctgcaggtggttgcgatagccgtgttctcggcactggggttca ccttctcgtcttcttcgtgccgttcgtggggacgaagccgttccagattg tggccatggctatctacactccattggtgaggttcctgcccttttttcag gggtttaattcgctagtatgcatagatctggagttcggtatttgtttcgc tgccagccatcccgcgctttacagttcttggtacttactacgtcaggcat tgcaagttcaaacgctggtagagaacgtcatctgctttgcatgtgctgcc catatctgttgtaaaggaatgatgatatgacatccgttagactatatagc atcattgctcacttgagcgacttgacaatttaaccttgagaatgattaat gtcttctcttgcagattacatgtgttgttgtgttatacatatggtgtgcg gaaacaaatcctggagacccaggcattttcgattcaacgaagaatttgaa gtaagataagaatgaaaagcactcctatgtgaattcggatcaggggatta atcatggagaaggccattaagtgagacttttggtactgctgataacagtg agaagctgagcagcatgcttgagaggtaggaagctacactgtccctatga agttattcatgatgacagtgtcaaatgatttggcgagtgttatttttcgg gtaaaaaattatttgtactgtcaaaaaaatactggcaacgcttctttgtc aagtatattttttttttccgttgccgagtgttttttcgacactagacaaa gagttttttgccgaatagtttttttacactaggcaaagagcttctttgtc gaatgttttttctcgatactaggcggagagcttctttaccgagggttttt ttcgacactagacaaagataatttttaaatcaacttttttgaaacagtaa attaatttaaataaaaaagttttaaactacaaagttgtataactcatcac gatgtacaatatttattttagtcatttcttcatatgacaaagttaaagta atttttttcacaaaacttatatacctctcatgttgtttatgaaactacaa cagaggtgtataagatttgtaaacaattttagaatcaccatgttggatga ataaataaccaaacaaccaaaataaattttgtaaatcttaagaagttata gagttttgtagtttgcaaccttttaatttgaggtcatcttctcgacgaaa actatgtctgaactcaaaaaaatcgaatttgtgaaacaagctccaatgaa aaaacaaccaaaatgatagttgtgggtctcaaaaagttatgaaactttgt aattgacaattttttgatttgaaatcatctgatcgtgcaaaactatattt 
agatttaaaattcgaatttttcaaatgacctcgatgaaaaaacatcaaaa taaaagttgtaggtatcgaaaaagttatttaactttatagttgacaacgt tttgatttgaaatcatcttatcatataaaaatatgtttgaatttttcaaa tttaaaatttaaattttgtaaacgatctcaaataaaaaactactaaaatg aaagttgtaggcctcaaagagttatgcaactttatagttaacaacttttt tatttgaatttatttagtgtctaaaataatcaatttactcttggtttggt agaggaggttacgcatgagggagaggttgcgagttcaaatcttactgacc acaaaacacatgaattccattcaaaatatggtgaaagcaactaggaaaaa acactcattaaagagtcatttgccgataatttttttaccgagtgtaacac tcaataaaggatttgtcgagcgtaaaaagatctttgctgagtgttctgga cactatgcaaataaggtgagtccgatagtaattctaaataacaaaagaat caatgttttatttcagacatttattaagacatgataaagcaaacaatgtt tttaaggtcttaccttctcatgaggtgaacgcttcgaaactttaaagggg tgtttggtttgaggaatcacttcatccaaaatgtggtggtgcatcatgta ttcattcctcaaatttggtgggatgacctcattcctcacattagtactaa ctaaataactataaggaatgaggtgatgatggatcaactcaatccattcc ataaaccaaacaaaaaagtgaagagtgagaagatgatggactagctcatt tctcaaaccaaacaccttataagcatctgatcaagactgtagacaacaga tgaaa 112206-115963 Reverse Strand Exon 1: 115759-115963 Exon 2: 112455-112528 Exon 3: 112206-112358 Both Augustus and FGENESH show a possible gene at this locus. Augustus shows two possible gene structures, but based on BLAST results and homology to Sorghum bicolor I believe that transcript 1 is more likely to be correct. BlastX gives a hit to a coiled-coil domain-containing protein 94 (6e-12). This is a relatively low E-value. Reading about these domains, it appears they are more common in animals, and transposons in rice have been shown to contain these domains. Nucleotide BLAST has only one hit, to Sorghum bicolor, and this is a putative protein. I therefore believe that there is possibly a transposable element at this locus, but it still has potential as a candidate gene. 
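Because this gene lies on the reverse strand, the exons are listed in transcript order with descending genomic coordinates, and each exon must be reverse-complemented before splicing. The sketch below illustrates that step on a toy sequence. It assumes 1-based inclusive coordinates on the forward strand; the helper names and the `offset` parameter (for example 100000 if the supplied sequence starts at coordinate 100001) are hypothetical, and the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def spliced_cds(genome, exons, strand, offset=0):
    """Join exons given in transcript order as 1-based inclusive (start, end).

    `offset` is subtracted from each coordinate before slicing `genome`.
    For strand "-", each exon slice is reverse-complemented.
    """
    parts = []
    for start, end in exons:
        piece = genome[start - offset - 1 : end - offset]
        parts.append(reverse_complement(piece) if strand == "-" else piece)
    return "".join(parts).upper()


def translate(cds):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3))


# Toy forward-strand sequence carrying a two-exon gene on the reverse strand.
toy_genome = "CC" + "TTATTCTGGACTCAT" + "GG"
toy_exons = [(14, 17), (3, 7)]  # transcript order: 5' exon first
toy_cds = spliced_cds(toy_genome, toy_exons, "-")
toy_protein = translate(toy_cds)
print(toy_cds, toy_protein)
```

For the exons listed above (205 bp, 74 bp, and 153 bp), the same approach would give a 432 bp coding sequence, which is a multiple of three.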
Protein: MGGGEPAEQQKTDERRAVTRSSSGDDNDFCSWGPNNGSTDLRADAAAKRPPPMPKLVRQGRAEGRRVGDPQNSDYTV GSGAGRNLDPWREKDEQDAVMGDAMKVLENRAMDSKQDMDILAALEEMRSMKIGSSSKPLAVVYEF TCAGAACTCGTATACAACAGCTAGTGGCTTAGAACTAGACCCAAT CTTCATGGACCGCATCTCTTCCAAAGCAGCAAGGATGTCCATATCCTGCT TTGAATCCATTGCCCTATTTTCCAACACTTTCATTGCATCACCCATCACA GCATCTTGctgcctgaataaaaaaatattcatcacggtgatcaaattgaa gattatatgcatgggatcgaatttgtcctttgaatatgccccaaaatcac ttacCTCGTCCTTTTCACGCCAAGGATCAAGATTGCGACCAGCCCCTGAT CCCACTGTGTAGTCAGAATTCTGAGGATctgttttgaatgcgacctcagc aaagccctcagcacacttgaagtaaaacttaaatactctctccgttttat tttagttatcgctgaatagtgtaaaattgaactatccagcgacaactaaa aagaaacggagggagtatttgtattcccaactatgtctacgcccaaaaga agggaatagaattatcatgagttagaatacaatcagtaatcaaaacacgc actagttgatacggatatgaatttccacagttccaagactgacaaccaaa taaaacattatgccagaccataaaatgatgaaaaataaattcaaaccatg gctgtaaattcaaaagcacaaagatgcctatgaaaaagatattacggcaa gcgctgaacacaaagaactgttgtctcgaacactgcacaacatagaagtg taggcagagcaatgtggtggctcaggaacaacattaatacattctggagt taggctccgtttcaatctcacgggataaactttagcttcctgctaaactt tagctatatgaattgaagtgctaaagtttagcttcaattactaccattag ctctcctgtttagattataaatggctaaaagtagctaaaaaaaagctgct aaagtttatctcgcgagattggaacagggccttattcatgtagtcatgac gaacaaaagttgctaggttgagcagcccatggactgaaaagataacacct ataaatcagtaagtcgaaataataagtcacgttgtctcaacatagtaagt ctcatactcagataaagaaaggtaatggtagtcttgacaagccaatatta aacatagtcattctaatagagatgaaacccaaaagaaatcagttaaggtg taaccctcttagcgacgcaccatatcgaaacccagatatggtatcaaatg ggcaaaggtcgggtcgacactttcttggtgacgtgtcgtgccgtgattag gacacgatgtcaagttatcagaatcaggttgtaagattcttattggcacg ctacatcggtacccgagtgtagtgaaaaatatgcaaggtcttcacataat ctatggatggacggataagaaagctactcgaaccaactaggatccattta ggtagttggaatataagattgcttactggtaagttaagagagttaattga gacaacgactaggagacttataaatatcttataagatcagaaggcggagg tggacgatatcgacttcaagctttggtacacaaggacagttgtaaataga aatggagtagttttgattgataagatcctcaagaatggtgtgtcggttgt gataatgcaaggagataagattattctagttcagcttgtcatgggtgatt tggtcttgaacgtaattagtgtatatgcccccccaagtagttcacgacga gaatgctgagagacttattctagaaagacttagatggcatgattagagat 
gcacctattagtgagaagttttcataatagaagaactggacctgcggggg gtggtaagacagtctccaagtgttacattaagaatatcttctcatgcagg tcgagataacccctaaactcttgccctacctgtcggtgttaagtaccagc aacacactacggaggtgggcgcatagtgcttttctaaggtaggtgacgag gatcacaactcgatggcacgtgtagaacacgtcggagacacaagaaattg ctcggacgctcgatggtacgtcctgtggtttccgtctgtgtgcggttgac ttctctctggagtacaggggatgtctagtggaagaagatgagttctatga gtgtccatctcatttttgtagggtcccttcagcataattgccgactgaat tggggttacagagaaactaccatgagtccacgatctccatgttccgcttg ttctgtctctcgttgcgctgggctacggtcaaacgagtcatccaccgatt tctgtgcgccgcacgtacaagaagtgactggttcacgtacacgcacctgc accttcctcggactccaatagctcccattcgaccgaggcccatctgtcgg cgtttcgagaccggggggtccctgggccgacgagtgaatgtcgccgcgtg ccccagcccagatgggtcgagcgcgagggcgagcgcgaaggggggagagc gaggcggccggagaccggcgtgagagaggtgggaatcccgcggccttcgt gttcgtcccacgcccaggtcgggtgcgcttgcagtagggggttacaagcg tccacgcgggagagggagcgagcggctccaagcgagcgcctgtctcgtcc tcgtccccgcgcgtccaaccctctctaagagggccctggtccttcctttt ataggcgtaaggagaggatccaggtgtacaatggggggtatagcagggtg ctacgtgtctagcggtggagagctagtgccctaagtacatgccgttgtgg cagccggagagatttgggcacccagctggtgtgatgtcgtggccgtcgga ggagcgatggagcctggcggagggacaactgttggagcggttgagtcctt gctgacgtcctcttgcttccgtaagggggctgagagccgccgtcgtcaca gagtacgcggggcgccatcattgcctatctggcggagctagccagatggg acgccggtcttgttccctgcggcccgagtcagcttggaaggcgaccatgt cttcgcgctcccttatgcatcgtgcctttccaccttccaagcccccggac gagggatacccgccgtctttccgcctcatcgttggaggaacgcaactccg tgggagttggtacctttcagccgtcgttcggcttcaaggattttcatcat gcagcccggctgcacccctccgccggcggtcacccaagatggtgacctcc agtttgatggtgggggaaagcgagccgggctgcggcctctgcccctccct cagcctcaaggattttcatcaccagggctggggaggggagtgtgccgagt tggggtcggcccctgcgtgggcggtggcccgctccttccctcagtgatcg gagggatgggcggtcgtcgtctgtggcgccggcagctgcagcgtgcctgg ctctcgggcgcgagtggcttcggagccgctcgcggcgccggtgtctgcca cggcagctggaagaggttcttccaccgacgagatagccagggccgcccac ggaccgacCTCCAACTCTACGGCCTTCTGCCCGTCCTTGCCTCACGAGTT TGGGCATGGGCGGGGGCCTCTTGGCAGCAGCATCCGCCCTGAGGTCAGTG CTGCCGTTGTTCGGCCCCCAGGAGCAGAAGTCGTTGTCGTCGCCGCTGCT GGAGCGGGTGACGGCGCGCCGTTCGTCGGTCTTCTGTTGCTCCGCAGGTT 
CCCCCCCCCCCAT(Start of gene)cgagtggggttgttcgtacctgcggaggtggaaccgg agttccgtttgtaatggcactttgaatgccagtgtttttgttcattgtgg ctgtcgaggcctgaacatatatgtaattttggcatggagccgtgtttttt cctcattttcgagcactgagactcgcctgttggttgtctgaaccgcttca ccaagcgtgagtcgccccgtgtcaaggtgacgagtgaggtatccgtatcc cggagacgttggagtccctcggctcggtcggccttgttgtccgaggcttc tctagcttagttaaagggaccccttggccgctcttcgatgagccgaggcc aggggtagcggtatcagcatgaacaggggcagagttggctcgaaaaggaa acctggttggccggagcctaaccgggtcgtccgttagcgggaccgacgtc gaagttgaccagccgaggcctcgggtcgggctaacgtccttggaggatgg ctggccgaggccccggggtgaccggccgagccgcctgctcgggccgggtt cccggagaagaccctggcagcgattgcccgggcgtggcgatgacgtcgtc ctttagagtggagatcctcggaccgcgtcgccgtccgaggctaggtcgga cctcgccgaaggtgtcgtcgatgcggagggtgctgctgcccccttccagc gtcaagacccgagcctgcaggatcggattgtcttgtagcgtgtgtctcct gcggccgccgaggccagaacacaccctcgctgtgttgtaaagctgcgtct ctttttctcttgtttcgagtatctggacttttttgtcggtaacagggatg tttgtgcgagcgagagttgcttctcgcggaaggtgatgagtgaggtatcc gtatcccggaggcgtaggagtcccttggctcggtcggccttgccgcttac gcgcactcttacccgtccatggggctctgtcaccgactcagtcgagaagg ctcgaaggatcgcttcggcagaagagcttccgatcgtgaagacttgttcg gtccgcggaatcacttatccgaacgtgagttacttatcgcagaaggtgat gagtgaggtatccgtatcccggaggcataggagtccctcggctcggtcag ccttggctgcttacgtgtactccgtcgttttcaggatccacttttcgaag tagtcaaaaagcacgaaagacattctggcagaagagatctttcttcgagg aaaatttcgacgcagagggggtttcccccctttcagcccccgagggaggg tcgagctttgccgaggcgaggccgacccttccttgatgactaaactttgc gtgggtgcgaggtatatgaacaacttgaaaacatcttaagggtagaagcg acgtagctgttggatgttccaagcgttgtcgtagacctcgccttgactgt tggccagcttgtacgttccgggcttcagaaccttggcgatgacgaacggt ccctcccaggggggcgtgagcttgtgcctccctcgggcgtcttgtcgtag ccgaagcaccaggtcgcccacctggaggtctcgggaccggacccctcggg cgtggtagcgtcgcagggactgctggtaccgcgccgagtgtagtaaggcc ttgtcccgagcctcttccagctggtccagcgagtcttctcggctagcttg gttgctttgatcgctgtaggccctcgtcctcggggagccatattccaggt ctgcgggcaagacagcctcagccccgtagactaggaagaacggcgtgaag cccgtggctcggctcggcgttgtcctcaggctccagaccaccgaggggag ttccttcatccatcgcttgccgaacttgttgaggtcgttgtagatccgag 
gcttgagcccttgtagaatcatgccgttggcacgctctacttgcccattt gacatgggatgagccacggcggcccagtccacctggatgtggtgatcctc gcagaagtccaagaactttctgccggtgaactgggtgccgttgtcggtga tgatggagttcgggaccccgaagcgatggatgatgttggtgaagaacgcc actgcctgctcggacctgatgctgttcagaggtcggacctcgatccactt ggagaatttgtcgatggcgaccagcaggtgcgtgtagcccccgggtgcct tctgcaaggggccgacgaggtccagacccgacacagcaaagggccaggtg atgggtatcgtctgtagagcctgagcgggcaggtgggtctgcttcgcata gaattgacaccctttgcaggtgcggacaattctagtggtgtcggccaccg ccgtcggccagtagaagccttgtcggaaagcattcccgacaagggctcga ggcgctgcgtgatggccgcaagcccccgagtgtatctctcgcaggagttc ctgaccttcggcgacggagatgcatcgctggaggatgcctgaggggctgc ggtggtagagctccttctcatcgcccaacaagacgaatgacttggcgcgc cgcgccacccgccgagcctcggctcggtcgaggggtagctctcctcggtg gagatattgcaggtacggggtctgccagttttgatcaggcgtggccccgc ttcgctcctcctcgacgcgcagtgcctcaccctcgggggccgagggtacc tcgggctgaaccgagggtgcctcgggctgtgccgagggtacctcgggctg ggccgagggtacctcggactgggccgagggcgcctcaggctcgggcgtgt cgtcgatcttgacggaggattgatgcagatcccaggagaagacgtcccgg ggaaccgttgtttgccccgaggctattttagccagctcgtccgcagtctc gttgtaacgccgagcgatgtggttaagctcgagcccgtagaacttgtctt ccaagcgccgaacctcatcgcagtaggcctccatcttcgggtcgcggcag tgggagttcttcatgacttggtcgatgacgagctgcgagtcaccgcgggc gtcgaggcgccggacccctagctcgatggcgatccacaaccctttgacca gagcctcatactcagccacattgttggacgccggaaaatggaggcgtagc acatagcgtatgtgtttcccgaggggcgagatgaagagcaggcctgcgcc ggctcccgtcttcatcaacgacccgtcgaaaaacatggtccagagctccg gttggatcggagccgtcggtagctgggtgtcgacccattcggccacgaag tccgccaagacctgggacttgatggccttccgaggggcgaacgagattgt ctcgcccatgatttccaccgcccatttcacaatcctacccgaggcctctc ggcactggatgatctcccctagggggaaggatgacaccacagttaccgga tgagactcgaagtagtgtcgcaacttccgtcgtgtcaggatcactgcata cagcagcttctgaacttgtgggtagcggatcttggtttcggacagtacct cgctgacgaagtaaactggcctctgaacgggcaatgtgtgcccctcttct tgcctctcgaccacaatcgcggcgctaaccacctgagtggtcgcggcgac gtagaccaagagggcttctccggcagctagaggcaccaagataggcgcct ttgtgaggagcgccttcaggtttccgagagcttcctcggcctcaggggtc caagtgaagcgctcggccttccttaagaggcggtacagaggtagacctct ttcgccgaggcgtgagatgaagcggctcagagccgcaaggcatcccatga 
ccctctttacgcctttcaagtccttgatgggcccaatgctggtgatggct gcgatcttctccgggttggcttcgatgccccgctcgaagacgatgaaccc caagagcatgcctcggggcaccccgaagacacacttctcgggattgagct tgacgcctttcgccttgagacatcggaatgtcgcttcaaggtccgaaagg aggtcggaagctttcctcgtcttgactacgatgtcatcgacgtaggcctc gaccgtgcggccaatgtgttcgccgaacacatggttcatgcaccgctggt acgtcgcgcccgcattcctcaagccgaacggcatggtgacatagtagtac atgtcgaagggtgtggtgaaagaagtcgcgagctggtcggactctttcat cctgatttgatgataccctgagtaggcatcgaggaaagacagggtttcgc acccagcagtggaatccacgatttgatcgatgcgaggcagagggtaggga accttcggacatgctttgttgagaccagtgtagtctacacacatccgcca tttcccccttttctttctcacaagcacagggttggcaagccattcgggat ggaatacctctttgatgaaccctgtcgccattagcttgtggatctcctcg cctatcgctctgcgcttctccttgtcgaatcggcgcagaggctacttgac gggtcgggctccggcccgaatatccagcgagtgctcggcgacatccctcg gtatgccgggcatgtccgagggactccacgcgaagacgtcggcgttcgcg cggagaaagtcgacgagcactgcttcctatttgggatcgagcccggagcc gatccggatctgcttggaggcgtcgccgctggggtcgagggggacggcct taaccatctccactggctcgaagttgccggcatgacgtttcacgtctggc acctccttagagaggctttccaggtcggcgatgagggcctcggactcgac gagggcctcggcgtactccacgcactccacgtcgcattcgaacgtgtgtt tgtacgtggggccgacggtgatgaccctgttggggcccggcatcttgagc ttcaggtaggtgtagttggggacggccatgaatttcgcgtagcatggtct tcccagtaccgcgtggtaggttcctcggaacccgaccacctcgaacggca gggtctcccttcggaagttggagggtgttccgaagcagacgcgaaggtcg agttgtccgaggggctggacgcgcttcccgggaataatcccatggaaggg cgcagcgcctgctcggacggaggacagatcgacacgcaggagcccgaggg tctcggcgtagatgatgttgaggctgctgcctccgtccataaggaccttg gtgagcctgacgtcgccgatgacggggtcgacgacgagcgggtatttccc cgggctcggcacgtggtcggggtggtcggcttggtcgaaggtgatgggct tgtcggaccagtctaggtagactggcgccgccaccttcaccgagcagacc tcccggtgctcttgcttgcggtgccgagtcgaggcattcgccgcttgccc actgtagatcatgaagcagtcgcggacctcggggaactctcctgcttggt gatcttccttcttgttgtcgtcgcgggccctgccaccctccgtgggtggc ccggccctgtggaagtggcgccgaagcatgacgcactcctcaagggtgtg cttgacgggcccctggtgataggggcacggctccttgagcatcttgtcga agaggttggcacctccggggggtttccgagggttcttgtgctcggcggcg gtgacaaggtccgcgtcggcggcgtcgcgtttcgcttgcgacttcttctt gcctttcttcttggcaccgcgctgagtcgacgcctcgggagcatcttccg 
atgggcggccctggggctgcttgtcctttcggaagatagcctcgaccgcc tcctggccggaggcgaacttggtggcgatgtccatcagctcgctcgccct ggtgggggtcttgcgacccaacttgctcaccaggtcgcggcaggtggtgc cggcgaggaacgcgccgatgacgtccgagtcggtgatgttgggcagctcg gtgcgctgcttcgagaatcgccggatgtagtcccggagagactctcccgg ctgctgtcggcagcttcggaggtcccaggaattcccggggcgcacgtacg tgccctggaaattgccggcgaaagcttggactaggtcgccccagttggag atctgccccggaggcaggtgctccaaccaggcgcgagcggtgtcggagag gaacagggggaggttgcggatgatgaggttgtcatcgtccgttccaccca gttggcaggccagacggtagtccgcgagccacagttccggtctcgtctcc cccgagtactttgtgatagtagtcgggggtcggaaccgggtcgggaacgg tgcccgtcagatggcccggctgaaggcctgcggaccgggcggttcgggcg agggactccgatcctccccgctgtcgtagcgtcccccacgcctggggtgg tagcctcggcgcaccctctcgtcgaggtgggcctgacggtcgcggtgatg gtgctcgttgccgaggcggcccggggccgcaggcacggtgttgcgtgtgc gcccggtgtagaccgaggcttcccgcatgaatcgggaagtcgcggcatga ggttccgaggggtacccctgccttcgggaggcagagctctcggcccatcg gaccgcggtgccttccaggagattcttgagctccccctggattcgccggc cctcggtggttgatggctccggcattgcgcagagaagcatcgctgctgca gccaggttctggccgacccactagatgcgggtggtggcctgaccctgaca tcgtcggcgacgcggtgctggaagccctggggcagatgacgtatttctcc ggtcgggggttggcccgcccatgcctgcccgatgtcccagcggatcggct caagcgctcctgctccctcgtcgagcctggcctgcaccccgcggatttgc tcgagctgtgggtcatggcccccgcctgaacggggaccacagctagctcc cgtgggatgtcaacgcggggcgccgacctagggagatcaccgtcctccgg catgctgagatgattgccttcggagggaccccctagatcgacgtggaaac attcgcggcctgggccgcagtcctcgtcgccgaggctgcggctaccgtcg gaacagtcggagaggcagtagtcacatgcggtcatgaagtcccgcatgga actggggttgccaagtccagagaaatctcagcagatgctgggctcgtcgt cttcctcggacccagagggcccgtaggtcgagacgtccgttagccggtcc caaggcgaccgcatgcgaaaccccagagggtttggactcgcctctacgag agcgcccgccaaagcggggtcgctaggcgggttgaggctgaatccaaatg acgtgggacgggaatcggtcggtacctcttggtcgacgagcggcgataaa gtcacgtcggggactggctgcaccgtcgtctcaggtataagggtgacgtc cagcaagcttttcgcaagcgcgctggcgtcgtccgcttgctcgggattgg cgtgtcgcggggagacggcgctcgtcttcgtctcaagcgcgaagtcgata cccggtgcgccccctgtaggggtgccggcgctgccgacttgctcgacagc cgacgaggcgctgcctcctgcttggccttggttgccctgtcttcccctcc gtcggcggggaagagggcgggatgagctcgaaggttgttcttgcaccacg 
cgagggaagacgttgtcgatttcgccgccggcgggcgggctgtcggccgc cattgtcgttgtcgcacggcggtggaaggagtatcatgtcgtagctgccg tcgagggacatgagctcaagactcccgaaacggagcaccgtcccgggttg gagaggttgctggagactgcccatctggagcaccgttccgggttcgtcaa catgcagcaggcccctacctggcgcgccaactgtcggcgtttcgagaccg ggaggtccctgggccgacgagtgaatgtcgtcgcgtgccccagcccatat gggtcgagcgcgagggcgagcgcgaaggggggagagcgaggcggccggag accggcgtgagagaggtgggaatcccgcggccttcgtgttcgttccgcgc ccaggtcgggtgcgcttgcagtagggggttacaagcgtccacgcgggaga gggagcgagcggctccaagcgagcgcctgtctcgtcctcgtccccgcgcg gccaaccctctctaagagggccctggtccttccttttataggcgtaagga gaggatccaggtgtacaatgggggtgtagcagggtgctacgtgtctagcg gtggagagatagtgccctaagtacatgccgttgtggcagccggagagatt tgggcacccagctggtgtgatgtcgtggccgtcggaggagcgatggagcc tggcggagggacagctgtcgaagcggttgagtccttgctgacgtcctctt gcttccgtaagggggctgagagccgccgtcgtcacagagtacgcggggcg ccatcattgcctatctggcggagctagccagatgggacgccggtcttgtt ccctgcggcccgagtcagctcggggtagggtgatgatggcgcctcctgtt gacgtggctggtctgcgccctaggttgggtgatgtggaagctcctccgaa gtcgaggtcgagtctgtcttccgtggccgaggtcgagtccgagcccctgg gtcgggcgaggcgaagttcgtcgtcttctggggctgagcccgagtccgag ccctgggtcgggcggagcggagttcgccgtcttccgggacttagcccgag tccgagccctgggtcgggcggagcggagatcgccgtcttccaaggctgag cccgagtccgagccctgggtcgggcggagcggagttcgccgtcttccggg acttagcccgagtccgagccctgggtcgggcggagcggagttcaccgtct tccgggacttagcccgagtccgagccctgggtcgggcgaagcggagcttc ctatgatgccttcggcagggcctgactgtctgtcagttttcactctgtca agtggcaccgcagtcggagcggtgcaggcggcgttgtccttctgtcaggt cggtcagtggagcggcgaagtgacggcggtcacttcggctctgccgggct ggggggcgcgcgtcaggataaaggtgtcaggccacctttgcattaaatgc tcctgcgatttggtcggtcggtgcggcgatttagtcagggttgcttcttg gcgaaggcagggcctcgggcgagccggaaatatgttcgccgctggagggg ggcctcgggcgagacagaaatcctccggggtcggctgcccttgtccgagg ctaagctcgggcgaggcgtgatcgagtccttcgaatggactgatccctga cttaatcgcgcccatcaggcctttgcagctttatgctgatgggggttact agctgagaattaggagccttgagggtacccctaattatggtccccgacac catcgggcggcccatcaagggcctttcggatggcccacgaaatgtctccg ggttgtggtgacccaggggatatctatcccccacaaccacaaagcccccg aggctcaagccaatttctgaatataattggctcgagactctccaaagtaa 
actctgactctgatggtgagttcttgttgagtaagttgcttgcttccttg gagtgcagcttcccaaaacatgtaaatttaattcaaaaaatggtgaaaaa tgataggtgatggatagggcaatggtagtggatgaagagttgttcccata atttaaaataagatttagctgtccgaaaaaagtactcgacaaagaaccat ttgtcgattggcaaactcagcaaaaaaaggcaagtccaatagtgatttaa actaagcaagtattaaatcaaaatatttaagttggtattaatcattttaa aaattatttataaaaagtgcatattaaaaagtataaattagtttaaatta aaaacctatttaataatcattagttaaacaattaattaattaaaacaata ttattactgggtagtcaaaattaataccaacctaatgccatttgaatgaa ataaatcagacatctaacgcaaagtatatcacatatttaataaattaggt ctgcacagaaatatgtgatatgatgcgatttgattgattaaatatgtgtt acaaaaaaaatttacaaggtaaatatacgtaatatataaaatctaggcgt agccatcttctacgggcatgtttgccgccatggccccgcagtctggttgt gctcctgggctgcagctcccgtgtttgtgtcgtatggcttgctcgggagt caggttgtgtgtgtgcaaatttagcctcttagtcttgctccatagaaacg aggtcaaggtccgtttcctgggagcctggctgctcgcggcgtaaggattc cgtattgaggtagttttagtcttctagcccgtctgcatatcgggtgacaa acattaactcgtctaagtcagactcatatggtacgcaaacaaagtgagtt acattagccgaggccggctagcctgagtatggaccagttgaacaggccac gccctacatttcttatctatgtgttcgaactccgacggcgtcggcaaagg attcccggcgagaacacggcaccggactctttcacgctcgccgatgctgt accggcatagacggccagcgttcctgatgaggtcggcggcggcggggaac aggcgacggcaaggaacaacagtgttcgtgacgaggtcggcgatggtgga accggcggcggcatgcgacagcggcagcgggctcgcgtatagaataaagg agcgattgtatgctgtttgcctgttgctatcgtggggaagaaaaagtggg gtgggggctttccagaaaactgccaggaggactgtagcgactgcggaagg gttattttgcattttgtttgtgcaccatcggatatgagatcgatggccaa aaatttgcttgtgcatttctgcaccggtgtagccttgtgtgcacaacatt ttatgttcggtatcttatttcaacagactatctatctattttggttagtc aggtgtctagctaagtttgtctagtgaaggatctaatttagagagtcgga taacttggcgattcagatagctaatatgttagagttattttaaccgttga atagctaaaatttggtttgactagccatttagatagtttttttagatgct cttagtatactcgactttacagctctatcaatagtaggcaaggaaaagac acagtgcgacagcgtacgtctagcttggcgcatagcaatgcatttccctt tgacgctcgatgcatgtacaagtacaacccaagacttggcgccttgttgg cgcccgccgcatagggaagaaagatgcacttgcagtcatttaactttttt tgagagagagaaaaaaaagaagggacacttgtagagtagaagatcgatgg atgtgcgtgcaagaacggtacgagtaccgcagcatgcatcaatgatccgt ccgggcggaccgatcattctttttttccccttcttgtatgcatgcacgca 
aacgaatgactgttgtgctgcatgacacgtacacctctccgtcctttcgt ttcggtctactctctactggtacagacgtagtacgttcacttgtgccgcc gttgtgtgcgtacgtttttcttctaaagaaaatgaaatgcgggggcgagc agccccgcaggccgcagagcgagctcggattgaccagactgcgtccacga acgtgcgcgatcgcaaacctcgccacgtcacgtcaccgacgattcccgcc aactttgacatcatgcgcgcgacgcgatcgatctgggccacggcccacgg ccaccagccctgccggccacgcgcctccctcgtcagcacagacgcgcggc ccgccggcatagcgccgagtctacgcgaccgaccggccggccagcctcgg ccgtctcgtataaatacacatgcacaccgccgcgagagctatcaccagtc ggcagtcagtctagctcggcatcggctcattgatcagtcagagccgctat tacgatccagtcatccagatctcgctagatatagatatatacgtgagtac gtgacctagctagctaggtagtagcctccagcaggcagtagtgtacgtgc taagagtaatcgccgagaaaccattactgttcatc 128236-128604 Forward Strand Exon 1: 128236-128365 Exon 2: 128531-128604 Augustus and FGENESH both predict the same gene structure. A blastp search with the proposed protein returned arabinogalactan peptide 22 from Arabidopsis as the top hit (E-value 1e-06). This is a relatively high E-value, but the hit did show a conserved DUF1070 domain. DUF1070 is a family of short hypothetical proteins of unknown function (http://aranet.mpimp-golm.mpg.de/pfamnet/DUF1070). Gene Ontology annotation indicates that arabinogalactan peptide 22 is a plasma-membrane-anchored protein (multiple GO terms) that appears to be involved in transport (GO:0006826 iron ion transport and GO:0015706 nitrate transport). It may also be involved in brassinosteroid biosynthesis, as proposed by Gene Ontology (GO:0016132). 
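Exon coordinates like the ones above can be sanity-checked with a few lines of Python: for a complete coding sequence, the exon lengths should sum to a multiple of three, and the codon count minus the stop codon should equal the protein length. This is only a sketch. The function and variable names are hypothetical, the coordinates are taken as 1-based and inclusive, and the two exons of this gene are assumed to share the 128xxx offset of the gene span. The 11-exon list is the FGENESH model reported later in this document for the kinase gene.

```python
# Sketch: check that predicted exon coordinates give an in-frame CDS.
# Assumption: coordinates are 1-based and inclusive (start, end).

# Two-exon model; both exons assumed to lie inside the 128236-128604 span.
GENE_2_EXONS = [(128236, 128365), (128531, 128604)]

# 11-exon FGENESH model for the kinase gene reported later in the document.
GENE_3_EXONS = [
    (143398, 143493), (143562, 143652), (144047, 144164),
    (144238, 144318), (144399, 144447), (144715, 144829),
    (144945, 145036), (145136, 145283), (145743, 145849),
    (145881, 145978), (146048, 146054),
]


def exon_lengths(exons):
    """Return the length in bp of each (start, end) exon."""
    return [end - start + 1 for start, end in exons]


def cds_summary(exons):
    """Summarise a CDS: exon lengths, total length, frame, residue count."""
    lengths = exon_lengths(exons)
    total = sum(lengths)
    in_frame = total % 3 == 0
    return {
        "lengths": lengths,
        "total_bp": total,
        "in_frame": in_frame,
        # Residues exclude the stop codon; undefined if the CDS is out of frame.
        "residues": total // 3 - 1 if in_frame else None,
    }


if __name__ == "__main__":
    for name, exons in (("gene 2", GENE_2_EXONS), ("gene 3", GENE_3_EXONS)):
        print(name, cds_summary(exons))
```

With these coordinates the two-exon model sums to 204 bp (67 residues plus a stop codon) and the 11-exon model sums to 1002 bp (333 residues plus a stop codon), which agrees with the lengths of the protein sequences listed for each gene.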
(http://www.ebi.ac.uk/QuickGO/GProtein?ac=Q9FK16) Protein: MSSLTARLALLAYLLLATLLHPCLCHAAAAAPAARGAGNWRMDPKAIDQGVAYVLMLLALFVTYIVH ATGAGCTCGCTGACG GCGAGGCTCGCTCTGCTGGCGTACCTGCTGCTCGCGACACTGCTGCACCC CTGCCTCTGCCATGCCGCGGCGGCGGCTCCGGCCGCTCGTGGTGCCGGCA ACTGGAGAATGGATCgtgagttctatagtctccctgatctcctccatatc atctttagatgctgctgcttcgtcttaatttgtcttcgaaattaaagata gaatcattaatcatgcagagtgatctatacaattgctaccgtttgttttg acttgttttcatttctcgttccgcatgcagCGAAAGCGATCGACCAGGGG GTCGCCTACGTCCTCATGCTCCTCGCGCTCTTCGTCACCTACATCGTGCA CTGA(End of Gene)atatattattattggtcgactgtcctattggacagcattcgcaaaa agacgttgattacattctccaatcgctcatgcatggcggtcgtgttctgc tggcagctcaaattaaggcaagaggcaggcagtagtgtcttgattcccgt tcgccggctatcggctatatggtgcccgacgtcgtcgacgagcagtctcg tttgcccggccgcgcgccggactccggcccacttcctttcttctctcttt ttcttttcaagcttgatgtgatgtgacttgaggcgagtcaaagtagtcta gtcatcatccatcaggtgtactgtaaaatgttcattgctgtctagtagtg attatgtacattacattaatatcagagatctcaattagaaataataacga gaagatgagataagttctattttacaaataaattcttaataataatgtcg gtatacttttgttgcttcgtacacaacaagatatgtatgccgtcttttgt gacggcatttacgttggtaccgtatattttgtggaacggcttgatccaac aaggggcacgtgcaatctatatagctctgtgttggcgccgatccagcgct aaacactggtaagaacaacttggcagtgttctttgcgagagcgcgtacag tctgtaaccacaacgcatgagctgcttctcctctgcgtgcttcggtgcct ggggccggatggttcgtgatgacgcagaggatcttcttttccacaataaa cctggatctcgtctcccgagagaaaccccgtcggagaagagagtttcagg ggttgtcttgggatcggcaggccacccaaaacgcctctggccaacgtaga gtcgaaaaacaacgaaaaaattaaataggtcgatgaagactaatattaag gctagattactcctactctaaggaatggaaaaattgcaaaaaaaggtaaa tttgaaaatagatttgattggatcgattataagggtttcaatcggtcgta ccctttcatctatataaagtggagaggtctttaacccgttcctaggcaat agacaacagatcccacgtgattatatggataaccacacatgagataagaa taatcgcccgagtcaatcttatcgcgggaacgcggactgtccgagcccta ggcctggaccgtccgccacttctggtgttcaacgcatgcccctgcctttt ggtgaacattgacaaaccaaaagcacatgttctagcagcgcctttcgaga aacaataaaccttgtgacttactttattcctgaagtatgaattaagtttg atgtgagtcactgacttttcaatctagtagggtgatcatttagcacagaa agatcgtttaggattacatctttctcggccattactacttgatcaaggga tcaataagtataaaattgtagtgctcaagcctgagtaagcaaagagatga 
tgcatataatatgggttcatcgccgcactacctcatgcttgaacaagaaa atatgtcggcggtgaataggctggtggaaagtaccttgaacagagacact tgttgtgtcgccttttggcccgttttattcggccgtgttgcctttgcggg tgactggagttgcttcattggccgattgcaaggtacgaccttgtcaaaag ccaggacgactttaactagccgaccaaaagtttcagatgaattttttgct tcgacgtacctgtttttggttgtcgatgcttaaaggcgctgccggatcgt ccggctgagttagccagaccgtctcgctgagtgagccggaccgtttcgct gagttagctggactgtccacttggctccggactgtccggacacagatgtc ataccgtctatgatgcgcaagacatggagctttgatcggctgcccgatca cccttgcccccagcacctccattcttattagtcttcttgcctagccttct gagtaactactcctcgtgaaccatttgatgtgcgaatatcacaaatggcg atgtttttgtctttgcctttatcggccacttcaggtcgaactatgacctt ttttgcttgctagttccactatattgaaatgaaatgattgtgtgtccact tacatctcctaaaagatcaaccaaccctcatttatgctcaattgtatctg tctacaaaaaaacattacaatcattggtggcatgtgaaaaataattatgc catttgcaataagaacatctctttaattcatccataggaggaatagtatg cgttaatttaatgctgtcactgaaagtcgcctagaggggggtgaataggg cgaaactgaaatttacaaagttaatcacaactacaagcccggttagcgtt agaaatataatcgagtctgcgagagagggtgcaaaacaaatcgcaagcaa ataaagagtgtgacacgcggatttgttttaccgaggttcggttctcgcaa acctactccccgttgaggcggtcacaaagaccgggtctctttcaaccctt tccctctctcaaacggtccctcggaccgagtgagctttctcttctcaaat caaccgggaacaaaacttccccgcaaggaccaccacacaattggtgtctc ttgccttggttacaattgagttttgatcacaagcaaagaatgagaaaaga agaaagcaatccaagcgcaagagctcaaaagaacacaacaaatctctctc acttaacactaaagcttttgtggaatttgggagaggatttgatcacttgg gtgtgtcttgtattgaatgcctagctcttgtaagtgattggaagttgaga aacttggatgacttgaatgtggggtggttgtactacccaaaaagttcagt ggtaaacaaggtataaagataatcaaactagggtaacctattgggtccca tcaaaattaacctatgcagatcattaagattaattagaacatgagtgggt aaaaagaagtgatcaagggcacaacttgcctggcacttgagattctaggt gccaacttgctcttcagatgacacgtgacctcgctctagtcgtagcaata caaacaaacattgtataggcaaaattaacattacaccaaacataagaata aactgcgtaataataatttatgtgtcgctacgagatcgtaggagcgagaa tcactaaattcggagctacgtttaagaagttatggttttccgaagtcttt atgtgcttggtaaaaaaaaattaattgagtgatcaattctaatgtggctt ccatgttaaaacagagttactaggttataaacaatattattataaaatta atacaattggaatggtcaaaatgagttaaaatgaattagttatgaattaa ccaagtttatgtaatttgtttttcattctagaaatcattttctattttta 
ccttctttgtacttcctcccagcgatatatggaactgttgnnnnnnnnnn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncgggcccaca cgtcagagctccaacggtcggaacccaacgacctggtgacgtggctggcg caccgaacagtgtctggtggcgcaccggactgtctggtgcgccatgcgac agcagcctacaccaaactagccgtttggtgtaggctgctgtcgcatggcg caccagacagtccggtgcgcccaccggacactgttcggtgcgccagccac gtcaccaggtcgttgggttccgaccgttggagctctgacgtgtgggcccg ccttgcctgttcggtggcgcaccggacagtcctgttagattgttccggtt gtgccacccgcgcgtgctctgttccttttacgcgcgctggcgccgcattt aatgcgttgcagacgaccgttgacgcgaagtagtcgttgctccgctggct caccggacagtcccggttgtgcacaggacatgtccggtgaattatagcgg agcgggaatccgaagctgggcgagttcagagtcgctctcccctggagcac cggacatgtccggtgaattatagcgaagcgcctctgagaattcccgaagg tgcgaagtttggcttagagtaccctggtgcaccggacaatgtccggtggc acaccggacagtccggtgcgccagaccagggctgccttcggttatcccat gctctctttgatgaacccaattcttggtctttttattggctaagtgtgaa cctttggcacctgtataacttatagactagagcaaactagttagtccaat tatttgtgttgggcaattcaaccaccaaaatcaattaggaactaggtgta agcctaattccctttcaatctccccctttttggtgattgatgccaacaca aaccaaagcaagtatagagatgcataattgaactaatttgcataatgtaa gtgcaaaggttacttagaattgagccaatataaatacttaatataaatac ttataagatatgcatggattgtttcttctatttttaacattttggaccac gcttgcaccacttattttgtttttgcaaattcttttgtaaatccttttca aagtccttttgcaaatagtcaaaggtaaatgaataagattttgagaagca tttatgagatttgaaattttctccccctgtttcaaatgcttttcctttga ctaaacaaaactcccccttaatgaaatcctcctcttagtgttcaagaggg ttttaagatatcgattttgaaaatactactttctcccccttttgaacaca atgagataccaatttggaaaatcataccaaatgaaaaaactcttttttaa aattagggtggtggtgcggttcttttgctttgggctcatgctctctcccc ctttggtataaatcgccaaaaacggaatcattagagcccttacctaacta ctttctcccctttggcaaataaaacatgagtgaaggttataccaaagccg gagagatgctcggagcgacggcgaaggatgagttacgtagtggaagcctt tgtcttcaccgaagactccaattccctttcaatacacctatgacttggtt tgaatttcactcgaacacacattagtcatagtatatgaaagagacatgat caaaggtatatgaatgagctatgtgtgcaattcaacaaaagaagttccta gaatcaagaatatttagctcatgcctaagtttgttaaaggtttgttcatc aagtggcttggtaaagatatcggctaattgatctttagtgttaatatatg aaatctcgatatctcccttttgttggtgatcccttagaaagtgataccga 
atggctatgtgtttagtgcggctatgctcaacgggattatccgccatgcg gattgcactctcattatcacatagaagaggaaccttggttaatttgtaac cgtagtccctaagggtttgcctcatccaaagtagttgcgcgcaacaatgg cctgcggcaatgtactcggcttcggcggtagaaagagctacaaaattttg cttctttgaagcccaagacaccaaggaccttcccaagaactggcaagtcc ccgatgtgctctttctattaattttacaccccgcccaatcggcatcggaa taaccaattaaatcaaatgtggatcccctaggataccaaagcccaaactt aggagtataaactaaatatctcaagattcgttttacggccgtaaggtgag cttccttagggtcggcttggaatcttgcacacatgcatacggaaagcatt atatccggtcgagatgcacataaatatagcaaagagcctatcatcgaccg gtataccttttgatccacggacttaccttccgtgtcgaggtcgagatgcc cattggttcccatgggtgtcttgatgggtttggcatccttcatcccaaac ttgcttagaatatcttgagtatacttcgtttggcttaggaaggtgccctc ttggagttgcttcacttgaaatcctaagaaatacttcaactcccccatca tagacatctcgaatttttgtgtcatgatcctactaaattcttcacatgta gatttgttagtagacccaaatataatatcatcaacataaatttggcatac aaacaaatcattgtcaagagttttagtgaataaagtaggatcggcttttc cgactttgaagccattagtgataagaaaatctctaaggcattcataccgt gttcttggggcttgcttgagcccataaagcgccttagagagtttataaac atggttagggtactcactatcttcaaagtcgggaggttgctcaacataga cctcttccttgattggtccattgaggaagacacttttcacgtccatttga taaagcttgaagccatggtaagtagcataggctaataatatacgaattga ttcaagcctagctacgggtgcataggtttcaccgaaatctaaaccttcga cttgggagtatcctttggcaacaagtcgggctttgttccttgtcaccaca ccatgctcgtcttgcttgttgcggaacacacatttggttcctacaacatt ttgattaggacgtggaactaaatgccatacctcattcctcgtgaagttgt tgagctcctcttgcatcgccaccacccaatccgaatcttgtagtgcttcc tctacactgtgtggctcaatagaggaaacaaaagattaatgctcacaaaa atgtgctacacgagatctagtggttacccccttatggatgtctccgagga tggtgtcgacggggtgatctcgttggattgcttggtggactcttgggtgt ggcggtcttgggtcttcatcctccttgtcttgatcatttgcatctcccct tgattattgtcgtcttcttgaggtggctcatttgcttgatcttctccttc atcaacttgagcctcatcctcattttgagtcggtggagatgcttgcgtgg aggaggatggttgatcttgtgcatttggaggctcttcggattccttagga cacacatccccaatggacatgttccttagtgcgatgcatggagcctctta aacacctatctcatcaagatcaacttgctctacttgagagccgttagtct catcaaaacacaacgtcacaagaaacttcaacttgtccggaggacttgtt aaagacactatatgcccttgtgtttgaatcatatcctagtaaaaagcctt ctacagtcttaggagcaaatttagattttctacctcttttaacaagaata 
aagcatttgctaccaaagactctaaaatatgaaatattgggctttttacc ggttaggagttcatatgatgtcttcttgaggattcggtgtagatataacc ggttgatggcgtagcaagcgatgttaaccgcctcggcccaaaaccgatcc gaagtcttgtactcatcaagcatggttctagccatgtccaatagagttcg attcttcctctccactacaccattttgttggggcgtgtagggagaagaga actcatgcttgatgccctcctcctcaaggaagccttcaatttgagagttc ttgaactccgtcccgttgtcgcttctaattttcttgatcctcaagccgaa ctcattttgagcccgtctcaagaatccctttaaggtctcttgggtgtagg atttttcctgcaaaaagaacacccaagtgaagcgagaataatcatccaca ataactagacagtacttactcccgccgatgcttatgtaagcaatcgggcc gaatagatccatgtgtaggagctccagtggcctgtcggtcgtcatgatgt tcttgtgtggatgatgggttctaacttgcttccccgcttggcatgcgcta caaatcctgtctttctcaaaatgaacatttgttaatcctaaaatgtgttc tccctttagaagcttatgaagattcttcatcccaacatgggatagtcggc ggtgctagagccaacccatgttagtcttagcaattaagcaagtgttgagt ttagctctatcaaaatctaccaagtatagctgaccctctaacactccctt aaatgctactgaatcgtcacttcttctaaagacagtgacacctacattag taaataaacagttgtagcccatttgacataattgagatacagaaagcaaa ttgtaatctaatgaatcaacaagaaaaacattggaaatggaatggtcagg ggatataactattttacccaaacctttgaccaaaccttggtttccatccc cgaatgtgatagctcgttggggatcctggtttttctcataggaggagaac atcttcttctcccctgtcatgtggtttgtgcacccgctatcgatgatcca acttgagcccccagatgcataaacctacaaaacaagtttagttcttgatt ttaggtacccaaatggttttgggtcctttggcattagacacaagaacttt gggtacccaaacacaagtctttgaccccttgtgcttgcccccaacatatt tggcaactacctagccgaatttgttagttaacacatatgatgcatcaaaa gttttaaatgaaatgctatgttcatttgatgcactaggagttttcttctt aggcaacttagcacgggttggttgcctagaactagatgtctcacccttat acataaaagcatggttagagccagagtgagacttcctagaatgaattttc ctaattttgtcctcgggataaccgacagggtacaaaatgtaacactcgtt atctcgaggcatgtgagccttgcccttaacaaagttagacaatttcttag gagggacattaagtttgacattgcctccctgttggaagccaatgccatcc ttaatgccagggtgtctccctctatagagcatgcttctagcaaatttaaa tttttcgttttctaagtcatgctcgacaattttagcatctaattttgcta tatgatcattttgttgtttaattaaggccatgtgatcatgaataacatca atgttaacatctttacatctagtgcaaatagtagtatgttcaacggtaga tgtagagggtttgcaagaattaagatcaacaatcttagcacgcaatatat catttttatctttaagatcggaaattgtaacattgcaaacatctagttct ttagccttagcaagcaatttttcattttcaaatctaaagctggcaagata 
aatgtttaattcttcaatcttagcaagcaaatcatcattatcatttctaa gattgggaattgaaacattacaaacatttgaatcaaccttagctaacaaa ttagcattctcatttctaaggttgtctatagtctcatggcaagtgcttag ctcactagatagtttttcacatttttctacttctagagcgtaagcatttt taactttaacatgcttcttgttttctttaataaggaagtcctcttgggac tccaagagatcatccttctcatggatagcactaatcaattcatttaattt ttctttttgttgcatgtttaagttggcaaaaagagtacgcaaattatctt cctcatcactagcattatcatcgctagaggactcatatctagtggaggat ttggatttaaccttcttctttttgccgtcctttgccatgaggcacttgtg gccgacgttggggaagagaaggcctttgttgatggcgatgttggcggcgt cctcgtcggaggaggagtcggaggagctctcgtccgagtcccactcgcga catacatgggcgtcgccacccttcttcttctcctttcttctccccttctt gtcatcgcccctatcactgtcactagaaataggacattttgctataaagt gaccgggcttaccacacttgtagcaaaccttcttggagcggggcttgtaa tctttccccctcctttgcttgaggatttggcggaagctcttgatgacgag cgccatttcctcgttgtcgagctttgaggcgtcgattggttgtctacttg gtgtagactcctccttcttctcctctgtcgccttgaatgcgaccggttgt gcttcggacgtggagggatcatctagctcgttgattttctttgagccttt gatcatatactcaaagctcacaaaattcccgattacttcctcgggagtca ttagtgtatatctaggattaccacgaattaattgaacttgagtagggtta aggaaaataagtgatctaagaataaccttaaccatctcgtggtcatccca ctttttgctcccgaggttgcgcacttggttcaccaaggttttgagccggt tgtacatgtcttgtggctcttctcctttgcgaagccggaagcgaccgagc tccccctcgatcgtctcccgcttggtgatcttggttagctcatctccctc gtgagaggtcttgagcacgtccccaaacttccttggcgctcttcaaccct tgcaccttgttatactcctctcgacttagagaggcgaggagtatagttgt ggcttgtgagttgaagtgctcgatttgggccacctcgtcctcatcatagt cctcatcccctacagatggtacctgtacaccaaactcaacgacatcccat atacttttgtggagtgaggttagatgaaatcgcattaaatcactccacct tgcataatcttcaccatcaaaagttggtggtttgcctaatgggacggaaa gtaaaggcgtatgttttggaatgcgaggatagcgtaaggggatcttacta aacttcttgcgcttatggcgcttagaagttacggagggcgcgtcggagcc ggaggtggacggtgatgaagtatcggtatcgtagtagaccaccttcctca tcttctttttcttatcgccactccgatgggacttgtgggaggaggctttc ttctccttccccttctcctttttgcgggactcttccgatgaagccttctc gtggcttgtagtgggcttgtcgccggtctccatctccttcttggcgtgtt ctcccgacatcactccgagcggttaggctctaataaagcaccgagctctg ataccaattgaaagtcgcctagagggggggggggtgaatagggcgaaact gaaatttacaaagttaatcacaactacaagcccggttagcgttagaaata 
taatcgagtccgcgagagagggtgcaaaataaatcgcaagcaaataaaga gtgtgacacgcggatttgttttaccaaggttcggttctcgcaaacctact ccccgttgaggcggtcacaaagaccgggtctctttcaaccctttccctct ctcaaacggtccctcggaccgagtgagctttctcttctcaaatcaaccgg gaacaaaacttccccgcaaagaccaccacataattggtgtctcttgcctt ggttacaattgagttttgatcacaagcaaagaatgagaaagaagaaagca atccaagcgcaagagctcaaaagaacacaacaaatcactctcacttaaca ctaaagcttttgtggaatttgggagaggatttgatcacttgggtgtgtct tgtattgaatgcctagctcttgtaagtggttggaagttgagaaacttgga tgacttgaatgtggggtggttaggggaattaataaaaaaaaaattgaggg tgatcactttggttaaacaagaccgaaaaactcctcctatgcctaacccc ttatgttaaaacatcaaaagacaaattcatttttcctattccacgaatga ttgtaagagagagaaatccaaaggtcggaacatcagtgacgagaactcaa aaaaacaaaaaatatgttttctactctaccccccccccctccccccccgn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnc ccacacgtcagagctccaacggtcggaatccaacggcctggtgacgtggc tggcgcaccggatagtgtccggtggcgcaccggactgtccggtgtgccat gcgacagcagcctccaccaaactagccgtttggtggaggctgctgtcgca tggcacaccggacagtccggtgcgccaccggacactatccggtgcgccag ccacgtcaccaggccgttggattccgaccgttggagctctgacgtgtggg cccgcctggctgtccggtggcgcaccgaacaattcctgtagattgtccag tgtgccacccgtgcgtgctctgctcctctgcgcgcgctggcgcgcattta atgcgttacagacgaccgttggcgcgaagtagtcgttgctccgctggctc accggacagtccggtgtgcaccggacatgtccggtgaattatagcggagc gtgaatctgaagttggcgagttcagagtcgctcttccctggagcaccgga cactgtccggtgaattatagcgaagcgcctctgagaattcccgaaggtgc gaagtttggcttggagtaccctggtgcaccggacaatgtctggtggcaca ccgaacagtccggtgcgccagaccagggctgccttcggttatcccatgct ctctttgatgaacccaattcttggtctttttattggctaagtgtgaacct ttgacacctgtataacttatagactagagcaaactagttagtccaattat ttgtgttgggcaattcaaccaccaaaatcaattaggaactaggtgtaagc ctaattccctttcagtcactttatactagctcgtcaaatattttatcaca tttggcaatgttaaaagtaaatttatcttcttgctgattcttttgaaccg acttcaaagaagaacatatagaaggtttagcctgtgttggccaaacaagt tcgacggtatatatatccatggattcatcatcctaactatcgtgccccac tagatgcatcttatggctagccgattttgatgcctctctctacttcgact tttgcccgctagtgcaagtgcactaacaaaataaactatgtgccatccaa tttctcttttaagtaagatcacaacccattaaaagtcagcccttctaaat 
attttccgtgatatggatctgaaagcatcggtttatagtgtcttggaacc tccagatatagtcattaaccggttcttcgcccccctgctggactaaagct aaatcagttaattctaactcaggttctactgagaaaaagtgttcataaaa tttctgctctaatttatcccaagagttaatagagttaggtgctagggttg cataccatgcgaatgcggtaccagtcagggatagcgagaataaacaaaca cgataggcttccctaccaaccaattctcctaggtgtcctaagaattgtct tatatgctcatgcgtgttttttccgccttcaccggaaaatttagagaaat ccgatagtctagttccctgtggatatggcaagacgacaaatcagtgacta taaggtttccgatatgactgccccatgcctggcacactaacatcgagttt atctctgaacaacctagctacttcgtccctaatcttttccaccatgtcta gtgaccatccattaagtttgagggtgaaaatctcaggttgcctgacatta ttttgtcgactttcctcctaggggatgttcaagatgcgtgttaggtgtcc tattaatgccaccaccttctgccctatatctctcgggctctctggtggcc gaagaatattcggccatgtgcatatgaggtggcatgacatgattataata tgccgctggtggagtagcgtagtgctgctatgatggatgtgaaaaatgtg catattatgtagctcctggttctgttctgtgtacgtatacctcgactgtc cggctaaggggctggatggtccatgatggggctcaatgatccgaggttat atccggatggtctggccatatatggcgcgacctagatagtctgtgcgcgg gtgtgtgttgggtctgatggtccgacaaatatgccggacgatctatgaca tatccgatcagtccggtgtaaggggttagacagtctgcaaccgtaatgta gccggacggtccgagataactttcgaaactatagtaccaaatcctatagg aatgagaatccaagtttccccgtgaaatcctttgatccaaaggagccttt agaataatttatggcaggaatgtttacacgttgaggctacttttcacacg cgtctacaacctttcgcgccgtttttcttggttgctgccactatgcaaca gatgcctcgattcaaattcaaaaccacaacaaggtgattcctttattttt tcctatggtgccaccaaagacattttcacctgaattcctatattttcaat ttctgtagggttcaaaacatccaccatctctattcccatgtttttcctat ttctatgtttttttatccttcattcccaagaaggccgtaacatgtgtaac ccaaaatggcctgagcacaaagccccaaagccaggttgagcccagatggc cctcatagaatctcccgatcccagaccagtctcccgctgacccatcctgc tcagtgctcacttccacgcgggccccagcccgccacgagtccccgctccc acctgctcctcgccgccgttagcgcgccccgagccccgacagcctttctc cttctagggtttctctctgtttcgtcctcatcgaatcggagtttcgc 143398-145767 (or 146054) Forward Strand Exon 1: 143398-143493 Exon 2: 143562-143652 Exon 3: 144047-144164 Exon 4: 144238-144318 Exon 5: 144399-144447 Exon 6: 144715-144829 Exon 7: 144945-145036 Exon 8: 145136-145283 Exon 9: 145743-145849 Exon 10: 145881-145978 Exon 11: 146048-146054 Identified through both Augustus 
and FGENESH. Augustus predicts two possible gene structures (4 or 5 exons; the second has an additional internal exon), whereas FGENESH predicts 11 exons. I believe the FGENESH model is more accurate according to the EST/cDNA data (see below). A blastp search with Augustus transcript 1 hit a serine/threonine-protein kinase (PKc_like superfamily domain) with an E-value of 5e-31. Augustus transcript 2 did not give as good a match. The FGENESH protein gave a much stronger hit, with an E-value of 2e-138 (serine/threonine-protein kinase AFC3), so I used the FGENESH model. Some EST/cDNA data show exons 3, 4, and 5 (or just exons 3 and 4) as a single exon; when I translated that version I got stop codons. The FGENESH model also matches other cDNA/EST data in which exons 3, 4, and 5 are separate. Protein: MESSRSRKRTRQAHDCAAAPPPEREVVGPALLVARGGASPPWREDDRDGHFVFDLGENLT RRYKILSKMGEGLGFLQSVVSSSAGYLRYYSNEEGYELDFRSIRKYRDAAMIEIDVLNRL AENEKYRSLCVQIQRWFDYRNHICIVFEKLGPSLYDFLKRNRYQPFPVELVREFGRQLLE SVAYMHELRLIHTDLKPENILLVSSEYIKVPSTKKNSQGEMHFKCLPKSSAIKLIDFGST AFDNRNHNSIVSTRHYRAPEIILGLGWSFPCDIWSVGCILVELCSVRSLHTLYFSSTSLG EALFQTHENLEHLAMMERVLGPLPEDMIRKASF ATGGAGTCATCGCGGTCTCGGAAGCGGACGCGCCAGGCGCATGACTGTGCCGC CGCGCCGCCACCAGAGCGAGAAGTGGTAGGGCCCGCTTTGCTCgtgagta ctcgcagatttgcggggtttcaggctgtgctgatgcagatttggccacgc tgtgcacgcagGTAGCGAGGGGTGGCGCGTCGCCACCATGGAGGGAGGAT GACCGTGATGGACACTTCGTATTCGACCTTGGCGAGAACTTGACCCGCCG ATgtgagttcccgactctttgactctctccttcagttcctcgagcccttg tgcatggatatatcgtttagttgtgtctaataatttgtaattagtggcaa catctttttcttattttggttttatctgcttttttattgttataagaccc cgtttgcaaactggcagttagtacacctttaggttttcctttgtcgtatc aagctaataatgtgtgaagcagggttcccattctgatctttagtttttgt gattttaatagcaataaccttttttcgcaaaaaagtaacaaactttatgc agtatgtggttcacatagttcccagtatccttttggacttagatttagtt tgatatatggttatgtttattgacagaatgctttttagctttacagATAA AATCTTGAGCAAAATGGGAGAAGGTTTGGGTTTTCTGCAATCTGTCGTCT CCTCTTCTGCTGGGTATCTCCGTTATTATTCTAATGAAGAGGGCTACGAG TTGGATTTTCGTAGgtacatttgggcgtgttttggaatgctgggaccgtg aaacacatgaatatgttgccataaaagttgttcgcagTATCCGCAAGTAC 
CGTGATGCTGCAATGATTGAGATAGATGTGCTCAATCGCCTTGCAGAAAA TGAAAAGTACAGATCTCTgtgagtatctcagagacctaattacttaaagt ataaccctattctgtttctgacttgacagttcctcttcaatttgtcagTT GTGTTCAAATTCAGAGATGGTTTGACTATCGTAATCATATATGCATTgta agtactagcttttggtcaattatactagtttctaaaagtcagttaccaat tttgagaatgcatttttttgcagatgtgttggttgtttataatatttatc ttattggccctcccagtacattgtatggaatgagagacttgtctatagaa attatgataaatcgcctgtagtatagtctttttgttcttcgctcgtgaag gttccctcatatggtcacaaattaaaaatattctttattcaatggatatc aatgattttgtcagGTTTTTGAGAAGCTTGGGCCAAGCTTGTATGATTTT CTAAAGAGAAATAGATACCAACCTTTCCCTGTGGAACTTGTGCGGGAGTT TGGACGGCAACTGTTGGAATCTGTAGCATgtgtgttagttattgcttgat tctgcacatcaattattagctaatttggtagcattggcttcacttttcta ttgcttcaaaggggcagtgttaaccatcttttcttttgtaacagATATGC ATGAGTTGCGGCTTATCCACACTGATCTGAAGCCAGAAAACATACTACTT GTCTCTTCTGAGTATATAAAAGTTCCAAGTACAAAGgtgcctattctcac ccactcttttagtgattgtctgctatttctttactaggaatacagtattt gcagttttcttgaatgattgcttggggttgtacagAAGAATTCGCAAGGT GAAATGCATTTCAAGTGCTTGCCGAAGTCCAGTGCCATAAAGCTGATAGA TTTTGGCAGTACCGCCTTTGATAATCGGAACCATAACTCGATTGTTTCTA CGAGGCATTATAGGGCACCTGAAATAATATTAGgtaaattggttattttt gaagatttgattcagttttatcttgtttgagatatggactatgataattc agtgtgaagatttatgtgctcttaaatatgctcgatagagtaaaacagac atcgtccaaaatcttagcaaagtcttggcacatcctgattcatactctta tttactcctaaatgatgtgttttaatcaaatctgaaagaaacaggtggta cttcttcctctttgatggtttggtgttttctgcacatatttagttttaca gttgatggttgagcgatatcgcactgtattgttgaaattgattgaactga gttctatgcttgtgtgcagaataagagttagatcttttatactatttcgt tagtggccatgtcagctatcatgatagttgctaaaggtttcttctccaat tctgtatgtcattagttctgatgttggatgtttcttttgcagGTTTAGGC TGGAGTTTTCCATGTGATATTTGGAGTGTTGGCTGTATTCTTGTTGAGCT ATGCTCTGTTAGATCTCTTCACACACTTTATTTTTCTAGTACCAGTCTGg taattgttttcctaacatcacattttttagGGGGAAGCATTGTTTCAGAC ACACGAGAATCTGGAACACCTAGCAATGATGGAGAGGGTTTTGGGACCTC TACCAGAGGATATGATACGGAAAGCAAGgtgcgttacttgcaatcttaat ttgcgtattattttatcttttgaatgaaatgtatgtgttcaaatcagCTT TTGA(End of Gene)agcatatctttatttgcttgatgttcaaagttctttttttgatggg gctaggatttttggttttactgtgctttcttgcctatttattgaacattc 
atttgtttaagcattttgagagaatgatgcatttgattttgctgcattag agtatcagagttacattttgaggaaataacactttcttctctatttattg aacgtttctaatctgttacacttcattatgtttcatgattagcagagtat cagtgtacataactttcatattgtttcattattctattttgtttattgcc agtttgccacttgaattttcaaagctaatagctgccttttatttattcta ttttgtttattgtaccttgattttcgaagttaataactgcctttatggcc actactaatactaacttgccctccctaccccttttttttgtttattgttc cttgtatttggtaattttgcatgtgtattgcctgcttgttatatttaatt tgtggcattgaaatttcttgggaactcttgtaatcttttttttgtttctt tccttcatttttaatattcattcgttgtgcttacatgtataccacctcaa atgtatagctcttcagctcagaaatattttaggagagcaacacgattaaa ttggcctgaaggtgctgtttcaagagaaagcatcagagctgtgaggaaac tggatcgactaaaggttttagttctctcgtttcttctactttggccatgc ttatctatcaacttaccacgctttgacttaagaaattttaatattttctt gtctttgatctgattatattgtctcctagatttcaaggtagttacggatc aaggttagaagttggggtttgcgcttgtgattagttgtagtaattagcta tattgtttggtatacctgctagtatttttttacattaatgtagaataaca tttattagctttcttttggaaggaataaattgatttaaataaaatgtaat tgtgcatagacaaaaatgctatccttcatgaaagcattagaagatgccaa actttattttgggagacaccaatttgcttaataatacattataaaacggg tttgaagaggacaacctgcgactaactcgggcataaaataagttcggctt gttggaggcagttaggaaagcataagagttaaatttttctgtttcattca gtgaattgctagggctgtcttcaaaatttcttaataaataataataggtg ctatctagtagcgctgctagtaatttcaaattttaaactaacatatgatg tgaaacgtacctcaaaatgagtaagaagctatggaaacatgaggattgac cttacagatcctgaaacttatgaaccacaactacaggacttggtatcgag gaacgctgaccattcaaaggtggcactggtggacttgctatacggtctcc tacggttcgagccatcggaacgcctgacagcagaagaagctctggaccat ccattcttcaggaacccaacatgacttgttgagcaggcttgactcatgta actctactgaaggtggtggtaggaagcaggcagagccgtccccattgcaa agctgaaggcatcttgttcctctataccagagagtttgtgtaaggcctgt caatggggcttcttgtggtgataaaactgatggctttagtcacccaatgt catgtacactagcatttttcctagggagcttagttcgccagacggcgctt gtatgttgttaaccgggatacactatgagctgtgccccgatgccctggtg ctataatccaatcgcaaatgatgaaatcaagcatagcgtgcgtccgtata gtttatgattgaattgtaccttggaatgattgccctacattcgttcattc ctacctagcgttgttacttcgatggttctggggtgctgttgataggtgca cacggcttgggatttgttgcatcccaactgccgtgtcctgttgctaatac ttgtgctgctgctgaagtgctgggtgggtttgaattctgtcgtgcctaca 
atgcatgtggtataggaaatagtgtagtagtacctcgctgcttggtacca gaagcggatggcttctacttctcactctggcccactactatagaaatgga cactgtaccgtcagaatcggatggctttaagccagccgctctgctggcgt ttttttgagataaaatactgtgcacactgtcagccggttcggctgttagc tgccagacggaatattcacaccctatacggctagcttcaagggaagtacc tggttggattgtgtactcgaccacgtactagctatagtttttctgcctcg actgctaaataattggacacatcaagttgccaaccaataggtaatgatgc ttaacatggcctctggtacattaaagcctgcacgtatactctttcgagaa cttgatttggcacggtcttagcacctggatgattaaataaaatcacgtac gactacgaaagagacatctaaattataaactatcaaaacggtggaagata actttgaacaaaccagaaatgaaataaaagagaaaggggaccacacccta gatgttaaacactagattttgcacctggacggtccgaccgtgtaaccagg acggtccacgggatagttcggacggctcgcaattagattaattcaagctg aattctttaccccttgcgtgattgtcctaatgaatctcgtgggaagttgt tggaaaccgcctaggaacgggacctccccactatatatatgaagggatac gatcgattcgaacacacaccaatcgatttatcaatctaaatttaccttat gtattaggagtagaaataatttatccttagctttagtcttcctcaacctc aattttcgtctctcttcagctctatatcgtctagagacgtcttgagtggc ctgccgatcccaagaaaacctaggatctttcctcctcgatggggtccctc ctgagatacgagatcttggttatgcaggagattcacaaatccctctacgt cctcacggacagtccgccgcctccacaaggacggtttgcgcacccgcaga gaagagcagggctctctgcgcagagtcacagatggttcgatgacctaccg cggacgttccacgctcttccagagagattcccaaggtcttgttccgcgca agcggggtctagggattattggtgtttggtcccaaaaaaacgcgaacaca atttttggcatctccgctggggacgactgcgcttcagatctattagatcg accatgatggtcgatttcaaagattgtaccaacatcttcccaagtaacat cgttaggccaactatggagagtctatcggtcgaggaggagctatggttcc atgatctcatgaagcagtggatgacgacgcaccgcgacaacgcgcttggg tacgcgaagaagcgaaggagaaattcttgtcacactccgcgatggatcga cactagaagattatcaagtaatgggaaataaagctcgactccctcgtacc tttgctcaatgatagcaatgtaagtaattccagtgatgatcaatctatta tgcattttgtggaattacaacaagatcagttagaacaaacgcatagggag atagaagagacactaaaaaaactcacacacacatttgaaaaatctactat tcccagttttccatcccatgatgtcgcaatagaggcaacctcgctggatg catcatcaacaaataggcctacacaatcccaacaattatatggtatgacg gtaaactcatattaaggatagccgctacctccacaacacctaattgaccc aacaactcctctcgccatggtcagactggccgagtataaccagttggccc catatgggaccgttctatagcccaccgtggaacgtctgatgacctaagaa