Evidence for inter-specific recombination among the mitochondrial genomes of Fusarium species in the Gibberella fujikuroi complex.

Gerda Fourie1, Nicolaas A. van der Merwe2, Brenda D. Wingfield2, Mesfin Bogale2, Bettina Tudzynski3, Michael J Wingfield1, Emma T Steenkamp1.

Departments of Microbiology and Plant Pathology1 and Genetics2, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, South Africa. University of Munster, Germany3.

*gerda1.fourie@fabi.up.ac.za

SUPPLEMENTARY MATERIAL

Table S1. Mitochondrial amino acid codon usage and tRNA anti-codon sequences for Fusarium circinatum, F. verticillioides and F. fujikuroi.
Table S2. Inverted, direct and dispersed repeats identified in the mitochondrial genomes of Fusarium circinatum, F. verticillioides and F. fujikuroi.
Table S3. Intron distribution, type, size, and endonuclease of F. circinatum, F. verticillioides, F. fujikuroi, F. oxysporum, F. graminearum and F. solani.
Table S4. Comparison of the alternative trees using the SH test.
Figure S1. Physical map and BLAST comparison of the mt genomes of F. circinatum against F. oxysporum, F. verticillioides and F. fujikuroi.
Figure S2. Midpoint rooted maximum likelihood phylogenetic tree of the amino acid LAGLIDADG endonuclease domains identified within intron regions of F. circinatum, F. verticillioides, F. fujikuroi, F. oxysporum, F. graminearum and F. solani.
Figure S3. Midpoint rooted maximum likelihood phylogenetic tree of the amino acid GIY-YIG endonuclease domains identified within intron regions of F. circinatum, F. verticillioides, F. fujikuroi, F. oxysporum, F. graminearum and F. solani.
Figure S4. Maximum likelihood phylogenies for Fusarium based on mitochondrial protein-coding nucleotide sequences.
Figure S5. Maximum likelihood phylogenies for Saccharomyces species based on mitochondrial protein-coding nucleotide sequences.

Table S1.
Mitochondrial amino acid codon usage and tRNA anti-codon sequences for Fusarium circinatum, F. verticillioides a and F. fujikuroi b.

Codon usage and anti-codons are listed in the order F. circinatum / F. verticillioides / F. fujikuroi.

| Amino acid | Codon | Codon usage c | Anti-codon d |
|---|---|---|---|
| Alanine | GCG | 14 / 15 / 12 | TGC / TGC / TGC |
| | GCA | 81 / 81 / 72 | |
| | GCT | 127 / 122 / 111 | |
| | GCC | 29 / 34 / 33 | |
| Arginine | AGG | 0 / 0 / 15 | ACG, TCG, TCT / ACG, TCG, TCT / ACG, TCG, TCT, CCT |
| | AGA | 75 / 75 / 72 | |
| | CGG | 0 / 0 / 0 | |
| | CGA | 1 / 1 / 5 | |
| | CGT | 6 / 8 / 12 | |
| | CGC | 2 / 0 / 1 | |
| Asparagine | AAT | 178 / 180 / 169 | GTC / GTC / GTC |
| | AAC | 39 / 36 / 40 | |
| Aspartic acid | GAT | 87 / 86 / 79 | GTC / GTC / GTC |
| | GAC | 16 / 19 / 19 | |
| Cysteine | TGT | 21 / 22 / 31 | GCA / GCA / GCA |
| | TGC | 6 / 6 / 15 | |
| Glutamic acid | GAG | 14 / 15 / 14 | TTC / TTC / TTC |
| | GAA | 106 / 105 / 107 | |
| Glutamine | CAG | 5 / 7 / 8 | TTG / TTG / TTG |
| | CAA | 81 / 80 / 79 | |
| Glycine | GGG | 24 / 25 / 23 | TCC / TCC / - |
| | GGA | 121 / 123 / 123 | |
| | GGT | 144 / 144 / 135 | |
| | GGC | 2 / 1 / 1 | |
| Histidine | CAT | 54 / 54 / 48 | GTG / GTG / GTG |
| | CAC | 30 / 31 / 30 | |
| Isoleucine | ATA | 365 / 365 / 360 | GAT / GAT / GAT |
| | ATT | 135 / 133 / 154 | |
| | ATC | 32 / 33 / 35 | |
| Leucine | TTG | 20 / 20 / 24 | TAA, TAG / TAA, TAG / TAG |
| | TTA | 499 / 496 / 480 | |
| | CTG | 11 / 11 / 11 | |
| | CTA | 53 / 54 / 62 | |
| | CTT | 52 / 50 / 55 | |
| | CTC | 0 / 1 / 2 | |
| Lysine | AAG | 7 / 7 / 17 | TTT / TTT / TTT |
| | AAA | 112 / 111 / 107 | |
| Methionine | ATG | 129 / 129 / 117 | CAT / CAT, GTT / CAT, GTT |
| Phenylalanine | TTT | 236 / 236 / 239 | GAA / GAA / GAA |
| | TTC | 124 / 124 / 113 | |
| Proline | CCG | 10 / 8 / 9 | TGG / TGG / TGG |
| | CCA | 40 / 40 / 34 | |
| | CCT | 86 / 91 / 83 | |
| | CCC | 12 / 9 / 12 | |
| Serine | AGT | 131 / 130 / 126 | GCT, CGA, TGA / GCT, TGA / GCT, TGA |
| | AGC | 24 / 25 / 29 | |
| | TCG | 6 / 6 / 4 | |
| | TCA | 91 / 91 / 89 | |
| | TCT | 110 / 111 / 112 | |
| | TCC | 10 / 8 / 11 | |
| Threonine | ACG | 2 / 4 / 0 | TGT / TGT / TGT |
| | ACA | 111 / 114 / 108 | |
| | ACT | 105 / 101 / 106 | |
| | ACC | 5 / 4 / 10 | |
| Tryptophan | TGG | 2 / 2 / 9 | TCA / TCA / TCA |
| | TGA | 59 / 59 / 51 | |
| Tyrosine | TAT | 170 / 172 / 172 | GTA / GTA / GTA |
| | TAC | 49 / 47 / 55 | |
| Valine | GTG | 37 / 36 / 31 | TAC / TAC / TAC |
| | GTA | 160 / 161 / 160 | |
| | GTT | 102 / 102 / 102 | |
| | GTC | 8 / 7 / 14 | |
| End | TAG | 4 / 3 / 4 | |
| | TAA | 10 / 11 / 10 | |

a In order to compare codon usage between species within GFC,
the mitochondrial genome of F. verticillioides described by Al-Reedy et al. was also included.
b Codon usage was calculated with the online tool at http://www.protocol-online.org. Missing or under-represented codons and corresponding third position G or C are indicated in italics.
c Codon usage is indicated in the order F. circinatum, F. verticillioides and F. fujikuroi.
d The tRNA anti-codons for each amino acid are indicated in the order F. circinatum, F. verticillioides and F. fujikuroi.

Table S2. Inverted, direct and dispersed repeats identified in the mitochondrial genomes of Fusarium circinatum, F. verticillioides and F. fujikuroi.

Sequence motif F. circinatum GAGCTTTAGCTTGCG ACGAAGTATGCTTGCGCCGGAATCGC GCAAGCTAGAAAAAAATGT TCGCGCAAGCTAGAA TCTCTAAAAAAAATATATTTTTTTTT ATA GGGCTGCGCAAGCTAAAGCTC GCTGCGCAAGCTAAAGCT GCGCAAGCTAAAGCT GAGGGCTGCGCAAGCTAAAGCTCC AGAGCTTTAGCTTGCGC GAGCTTCTATTTCATA CGCGCAAGCTAAAGC GCAAGCTAAAGCTCT AGCTTCTGATTCCCTACGGG CGCAAGCTAAAGCTAT GCTCTTAGCTTTTAGGAGAC CACAGTAAGGCGCTAGCTAT AAAAAAAATGTATTTTTT CCTACGAGTGACGCTGTGTGCACGTA TTATAAT Nucleotide position Position relative to other genes Type a 3084, 4500, 4538 3111, 6407 between nad2 and nad3 between nad2 and nad3 + atp9 and cox2 direct dispersed 3133, 6429, 10695 between nad2 and nad3 + atp9 and cox2 + nad5 and cob between nad2 and nad3 dispersed dispersed 4322, 4396, 6536, 8179, 8247, 11979 4389, 8172 4499, 4537 6158, 11611 6536, 8246 6539, 7819 7636, 7657 8247, 8269, 11979 between nad2 and nad3 + cox2 and nad4L + cox1 and nad1 + nad4 and atp8 between nad2 and nad3 + cox2 and nad4L + nad5 and cob between nad2 and nad3 + atp9 and cox2 + cox2 and nad4L + nad5 and cob between nad2 and nad3 + cox2 and nad4L between nad2 and nad3 between atp9 and cox2 + nad5 and cob between atp9 and cox2 + cox2 and nad4L between atp9 and cox2 + cox2 and nad4L between cox2 and nad4L between cox2 and nad4L + nad5 and cob 10872, 10895 12557, 12582 19807, 20049 34754, 39038 between nad5 and cob between nad5 and cob between cob
and cox1 between cox1 and nad1 + nad4 and atp8 4167, 4346 4317, 4391, 8174, 33982, 38967 4319, 4393, 8176, 11976 inverted direct and dispersed direct and dispersed dispersed direct dispersed dispersed dispersed direct direct and dispersed inverted inverted inverted dispersed GGCTGCGCAAGCTAAAGCTC CTTTAGCTTGCTAAATTATAGCTCGC TATAGCT GCGATTTTCTCCGAAAATAC CTCTAACAAAGTGTACGCCTAT GGAGCTTTAGCTTGCGC CTAACAACGTGTATTCATAG CTTGCGCATCGTTCT TTTGCTTTTAGCTTTTACCT AAGCTCTGTATTTTTTTTATAAA TCGTTCTGGCTAGCCAGCC TATATTTTTTTATATAGC ATAAGGAATTACAGAAAT CGCAAGCTAAAGCTCTCGGAGAAAAT CGCGCAAGCT ATCGTTCTGGCTAGCCAGCC TATAAAAAAAGATATTTTTTTTATAG CCCTCCTC Sequence motif F. verticillioides AGCTTTAGCATGCTG TGCGCAAGCTAAAGCT CGAAGCCGAAAAATATG AGCTAGAAAAAAATGTATTTTTTTCT AGCTT 38968, 40668 39420, 41139 between nad4 and atp8 between atp8 and atp6 inverted inverted 43338, 43504 43426, 46901 dispersed dispersed 48448, 55323 48984, 55097 48394, 53639 tRNA cluster upstream of rns tRNA cluster upstream of rns + between cox3 and nad6 tRNA cluster upstream of rns + nad6 tRNA cluster upstream of rns + nad6 tRNA cluster upstream of rns + rnnl between cox3 and nad6 between cox3 and nad6 between cox3 and nad6 + tRNA cluster upstream of rnnl tRNA cluster upstream of nad6 + rnnl tRNA cluster upstream of nad6 + rnnl between nad6 and rnnl dispersed dispersed dispersed inverted inverted direct and dispersed direct dispersed dispersed 54169, 55346 65164, 66479 tRNA cluster upstream of rnnl between ORF2 and nad2 direct inverted Nucleotide position Position relative to other genes Typea 1753, 1795 1997, 4363, 6863, 6894, 7679, 30276, 32013, 38724, 40439 1867, 13493 2009, 5530 between nad2 and nad3 between nad2 and nad3 + nad3 and atp9 + cox2 and nad4L + tRNA cluster upstream of rnnl between nad2 and nad3 + cob and cox1 between nad2 and nad3 + atp9 and cox2 direct direct and dispersed 43405, 48361 44064, 48479 43480, 54162 46235, 46416 46439, 48225 46981, 54170, 55347 dispersed dispersed GCTAAAGCTCGCTCTC GAGGGCTAACGAAGTA 
TTCGGAGAAAATCGC TCGGAGAAAATCGCGCA AAATCGCTGCGCAAGCTAAAGCT CTATAAAAAAATATATTTTTTTTTTA ACATAGCGAGCTCGCTATGAAATACC GCA GCTTCTATTTCGGCG GCTAGAGCTTGCGCCT GGTGAGCTGGCGCCT CGGAGAAAATCGCTGCGCAAGCTAAA GCT GGAGAAAATCGCTGCGCAAGCTAAAG CT GCAAGCTAAAGCTCT GTAGCGCGAGCTCCCCTCCTAT GATAATTTTGTTTATC TTATATATAGCTTGC GAGCTTGCGCCTTTCTA AATTCTTATCCTTATCC CACAGTAAGGCGCTAGCTAT TAAAAAAAAAAAGAAAATTTTTTTTT ATAGC GCGCTAGAGCTTGCGCCT GCTTGCTATGAAGGCT TGCTTTAGCTTTTACAGCT 3041, 7728 3262, 5748 4216, 6848 4217, 6911 4356, 6856, 6887, 7672 4382, 5572 between nad2 and nad3 + cox2 and nad4L between nad2 and nad3 + atp9 and cox2 between nad3 and atp9 + cox2 and nad4L between nad3 and atp9 + cox2 and nad4L between nad3 and atp9 + cox2 and nad4L between nad3 + atp9 dispersed dispersed dispersed dispersed dispersed inverted 4182, 5298 5349, 11217, 13412, 30228, 30409, 32084, 33310 6834, 7650 6850, 7666 between atp9 and cox2 between atp9 and cox2 + nad5 and cob + cox1 and tRNA cluster upstream of nad6 direct dispersed between atp9 and cox2 between cox2 and nad4L direct direct 6851, 6882, 7667 between cox2 and nad4L direct 6866, 6897, 7121 6965, 6990 7257, 7273 7318, 7455 7557, 13416 direct inverted direct direct dispersed 11329, 11350 11872, 11897 10235, 11175 between cox2 and nad4L between cox2 and nad4L between cox2 and nad4L between cox2 and nad4L between cox2 and nad4L + between cob and cox1 between nad5 and cob between nad5 and cob between cob and cox1 14797, 19608, 30226, 33308 13964, 20315 20011, 20031 between cob and cox1 + cox1 and nad1 + tRNA cluster upstream of rns + nad6 between cob and cox1 + within nad4 between cox1 and nad1 dispersed inverted inverted inverted dispersed inverted ACCTACGAGTGACGCTGTGTGCACGT ATTATAATTA TAGCTTTAGCTTGCTAT TTTAGCTTGCCTTTAGCTTTTAGCT AGGTGAGCTGGCGCCTTCGG TAGCTTTAGCTTGCG CTTTAGCTTGCGGAT AGGGCTGCGCTAGAGCTT ACATTCCTGCACTAGCTAG AAGGCGCAAGCTCGTG GCTATAAAAAAAAAATATATTTTTTT TTTATA TATGCTTTATAAAAAAAAATATATTT TTTATAGCC TTTATAAAAAAAAATAATTTGTTTAT AGG 
GAGCTATATAAAAAAATATATTTTTC TTATAGC Sequence motif F. fujikuroi AAGGCTTTAGCTTAC TATATTTAGGCGCAAGCTCTAGC TTTAGCTTACGCAAGC AGCTTGCACGGGACGGGTAAAAATGT A GAAAAATCAGCCAAAG TTAGTAGTAGCGCGAGC GATAATTTTGTTTATC 20595, 25748 between nad1 and nad4 + rns and cox3 dispersed 27540, 27559 29820, 31588 30246, 38694 30087, 37541, 40623 30090, 32043 inverted inverted dispersed dispersed dispersed 33302, 41409 33965, 40194 32106, 33337 39258, 40596 between atp6 and rns between rns and cox3 tRNA cluster upstream of rns + rnnl tRNA cluster upstream of rns + rnnl tRNA cluster upstream rns + cox3 and nad6 tRNA cluster upstream of rns + rnnl tRNA cluster upstream of rns and rnnl tRNA cluster upstream of nad6 tRNA cluster upstream of rnnl 41068, 41922 between rnnl and ORF2 inverted 50593, 51280 between ORF2 and nad2 inverted 52069, 52717 between ORF2 and nad2 inverted Nucleotide position Position relative to other genes Typea 2650, 5636 2357, 5490 2765, 5641 3227, 3442 between nad3 and atp9 + cox2 and nad4L between nad3 and atp9 + cox2 and nad4L between nad3 and atp9 + cox2 and nad4L between atp9 and cox2 dispersed dispersed dispersed direct 3982, 13454 5457, 5532 5791, 5807 between atp9 and cox2 + cob and cox1 between cox2 and nad4L between cox2 and nad4L dispersed inverted direct dispersed dispersed direct inverted TCAGCATGCTAAAGCT TAGAGCTTGCGTAGAGCTTGCGTAGA GCTTG GTAGAGCTTGCGTAGAGCTTGC GCTTGCGGCTAAAAAA CACAGTAAGGCGCTAGCTAT AAAAAAAAATACATTTTTTTTT TGCTTTAGCTTTTACAGCT AATTTTAAACGTATTAGATTATTCTA TGAA CTATGAAGGCGCAAGCTC CGAGCTTTAGCTAGCTAGCCGCCGAA AAA AGGCTGGCGCAAGCTATGC AAAAAAAAGTATTTTTTTTT AAAAAAAATGTTAATTTTTTT AAGCATACTTCGTTAGCCCTAAAAGA AAGATATTT GCTTGCGCCTAAAAAATATATTTTTT TTA a 6227, 6269 8703, 8725, 8747, 8769 between cox2 and nad4L between nad5 and cob direct direct 8735, 8757 8784, 13111 9952, 9977 12753, 13012 16227, 16247 16546, 20167 between nad5 and cob between nad5 and cob + cob and cox1 between nad5 and cob between cob and cox1 between cox1 and ORF1 between ORF1 and nad1 direct 
dispersed inverted inverted inverted direct 24804, 26664 26665, 27967 between cox3 and nad6 + nad6 and rnnl between cox3 and nad6 dispersed inverted 33490, 34926 35640, 36606 43563, 43639 43874, 44224 between rnnl and ORF2 between rnnl and ORF2 between ORF2 and nad2 between ORF2 and nad2 inverted inverted inverted inverted 44578, 46190 between ORF2 and nad2 inverted

Direct repeats are repeat units within the same intergenic region. Dispersed direct repeats are direct repeat units within different intergenic regions. Inverted repeats are repeat units within the same intergenic region in which the repeat sequence is in the reverse orientation. Inverted repeats for each genome were identified with einverted (EMBOSS), and intergenic exact repeat elements were identified using REPFIND (http://zlab.bu.edu/repfind). The repeat in blue is the single repeat motif shared between F. circinatum and F. verticillioides, and the repeat in red is the single repeat motif shared between F. verticillioides and F. fujikuroi.

Table S3. Intron distribution, type, size, and endonuclease of F. circinatum, F. verticillioides, F. fujikuroi, F. oxysporum, F. graminearum and F. solani.
Typea Insertion position b Size Core structurec ORF Typed Protein length Domainse cox1 f.cir.cox1.intron1 f.cir.cox1.intron2 f.cir.cox1.intron3 f.cir.cox1.intron4f group 1B group 1B group 1D group 1B 395 622 716 871 1296 1182 1434 2287 44-1183 18-1078 1214-1332 1287-2238 53-1182 859-1039 989-1309 1058-1357 1116-1194 18-1069 1158-1276 18-956 26-196 34-1040 LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGS GIY-YIG LAGLIDADGS GIY-YIG LAGLIDADGD GIY-YIG GIY-YIG LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGS GIY-YIG GIY-YIG LAGLIDADGS GIY-YIG LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD LAGLIDADGD GIY-YIG LAGLIDADGS 383 356 312 357 378 455 362 281 305 400 335 378 359 307 313 324 410 352 320 440 73 379 277 313 302 289 352 352 308 314 342 109-205; 260-350 82-173; 206-311 68-155; 187-283 96-191; 220-320 247-344 161-254 96-174 54-151 68-155; 187-283 135-228 82-163 108-204; 259-350 83-179; 204-315 62-152; 184-279 87-165; 193-280 84-165; 196-295 121-219; 278-380 92-187; 212-312 190-287 147-230 1-62 92-170 92-170 31-119; 150-271 33-126; 153-263 24-118; 173-262 82-178; 203-308 65-152;184-280 80-159; 189-285 127-210 95-170 f.cir.cox1.intron5 f.cir.cox1.intron6 f.cir.cox1.intron7 f.ver.cox1.intron1 f.ver.cox1.intron2 f.gram.cox1.intron1 f.gram.cox1.intron2 f.gram.cox1.intron3 f.gram.cox1.intron4 f.gram.cox1.intron5 f.gram.cox1.intron6 f.gram.cox1.intron7 f.gram.cox1.intron8 f.gram.cox1.intron9 f.gram.cox1.intron10f group 1B group 1B group 1B group 1D group 1B group1B group1B group 1B group1D group 1B group 1B group 1B group 1B group 1B group 1B 1067 1134 1271 716 1067 219 395 622 716 728 738 828 871 908 1067 1383 2151 1048 1499 1307 1295 1313 1200 1364 985 1114 1286 1200 1019 2383 23-193 53-1116 857-1040 1279-1397 23-193 1185-1254 216-1203 26-1098 1157-1275 19-953 18-986 48-262 52-1111 23-99 35-205 f.gram.cox1.intron11 f.gram.cox1.intron12 f.solani.cox1.intron1 f.solani.cox1.intron2 f.solani.cox1.intron3 
f.solani.cox1.intron4 f.solani.cox1.intron5 f.solani.cox1.intron6 f.solani.cox1.intron7 f.solani.cox1.intron8 group 1B group 1B group 1B group 1B group 1B group 1B group 1D group 1B group 1B group 1B 1134 1271 246 287 395 622 716 738 1067 1134 1217 1053 1337 2196 1402 1173 1378 1030 1283 1082 cox2 f.gram.cox2.intron1 f.gram.cox2.intron2 f.gram.cox2.intron3 group1C2 group 1B group1C1 112 230 653 1511 1183 1865 39-264 924-1148 44-274 LAGLIDADGD GIY-YIG GIY-YIG 298 294 185 17-115; 174-276 58-142 116-175 cox3 f.cir.cox3.intron1 f.gram.cox3.intron1 f.gram.cox3.intron2 f.gram.cox3.intron2 f.solani.cox3.intron1 f.solani.cox3.intron2 group 1B group 1B group1C2 group1C2 group 1B group 1A 221 221 336 336 221 642 1159 1510 1433 1433 1540 1532 978-1127 1083-1472 27-255 27-255 803-991 1037-1499 LAGLIDADGD LAGLIDADGD LAGLIDADGS LAGLIDADGS 304 340 352 352 68-155; 187-283 56-157; 182-318 160-259 160-259 LAGLIDADGD 426 55-153; 200-297 cob f.cir.cob.intron1 group 1B 205 1626 20-177 f.cir.cob.intron2 f.cir.cob.intron3 f.cir.cob.intron4f group 1D group 1A group 1B 396 491 507 1790 1232 973 494-647 967-1177 24-182 f.fuj.cob.intron1 f.gram.cob.intron1 f.gram.cob.intron2 group1A group1C1 group1D 491 279 396 740 2236 2271 475-685 1778-2009 1049-1168 LAGLIDADGS LAGLIDADGS no ORF LAGLIDADGD LAGLIDADGS LAGLIDADGS LAGLIDADGS LAGLIDADGD GIY-YIG 320 157 no ORF 295 140 107 131 488 292 61-171 59-144 no ORF 51-138; 189-278 45-139 6-89 26-116 98-188; 236-349 76-164 Intron f.gram.cob.intron3f group1A 491 2427 2179-2373 38-986 56-343 LAGLIDADGS LAGLIDADGS LAGLIDADGD no ORF 267 247 319 no ORF 73-162 60-229 74-170; 206-303 no ORF f.gram.cob.intron4 f.gram.cob.intron5 group 1B group1C1 507 780 1062 1960 nad1 f.cir.nad1.intron1 f.ver.nad1.intron1 f.gram.nad1.intron1 f.gram.nad1.intron2 group 1B group 1B group1A group 1B 637 637 146 637 367 1105 319 1157 24-220 20-180 64-261 33-160 no ORF GIY-YIG no ORF GIY-YIG no ORF 305 no ORF 136 no ORF 87-175 no ORF 9-48 nad2 f.cir.nad2.intron1 f.gram.nad2.intron1 
f.gram.nad2.intron2 f.gram.nad2.intron3 f.gram.nad2.intron4 f.solani.nad2.intron1 group 1C group1C2 group1C2 group1C2 group1A group II 763 379 763 1181 1624 421 1335 1434 1494 1614 1632 2362 44-275 59-302 49-292 32-269 1322-1553 LAGLIDADGD no ORF no ORF LAGLIDADGD LAGLIDADGD 421 no ORF no ORF 418 349 131-231; 291-389 no ORF no ORF 137-235; 298-398 61-157; 219-314

| Gene | Intron | Type a | Insertion position b | Size | Core structure c | ORF type d | Protein length | Domains e |
|---|---|---|---|---|---|---|---|---|
| nad3 | f.gram.nad3.intron1 | group 1C2 | 91 | 1450 | 42-285 | LAGLIDADGD | 423 | 139-238; 295-395 |
| nad4L | f.gram.nad4L.intron1 | group 1C1 | 240 | 1822 | 1341-1770 | LAGLIDADGD | 351 | 58-157; 220-319 |
| nad4 | f.solani.nad4.intron1 | group 1C2 | 505 | 1397 | 17-295 | LAGLIDADGD | 429 | 139-235; 295-396 |
| nad5 | f.oxy.nad5.intron1 | group 1B | 718 | 1010 | 22-934 | LAGLIDADGD | 304 | 68-195; 182-280 |
| | f.gram.nad5.intron1 | group 1B | 718 | 1019 | 26-941 | LAGLIDADGD | 302 | 65-156; 220-319 |
| | f.solani.nad5.intron1 | group 1C2 | 324 | 1348 | 38-272 | LAGLIDADGD | 429 | 139-240; 299-399 |
| atp6 | f.gram.atp6.intron1 | group 1B | 367 | 1450 | 1127-1414 | LAGLIDADGD | 356 | 67-165; 228-323 |
| atp9 | f.gram.atp9.intron1 | group 1A | 181 | 1088 | 32-206 | GIY-YIG | 276 | 55-138 |

a Intron type determined by RNAweasel (http://megasun.bch.umontreal.ca/RNAweasel) [57].
b Nucleotide insertion position with regard to the start of the coding sequence.
c The region that contains the core structure of the introns (P3, P4, P6, P7, P8), identified with RNAweasel.
d ORF type identified with ORF finder (genetic code 4; Mold, Mitochondria) and BLASTp comparison.
e Protein domains characterized with InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan). LAGLIDADGD contained two domains whereas LAGLIDADGS contained only one.
f Biorfic intron that encodes two ORFs.

Table S4. Comparison of the alternative trees using the SH test a.
Treeb 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 ln L Concatenated 23946.77288 24036.16059 24392.70239 23985.70301 23985.70301 24069.94940 cox2 1228.15556 1239.02323 1276.80028 1231.13107 1231.13107 1236.25901 cox3 1401.62495 1408.11666 1417.21305 1401.84707 1401.84707 1406.41134 nad2 3148.58325 3215.37298 3234.18315 3153.04252 3153.04252 3210.77348 nad5 3300.29017 3310.49859 3355.77261 3304.15995 3304.15995 3309.07299 atp6 1414.31761 1414.31761 1461.32707 1427.75361 1427.75361 1427.75361 cob 1950.24117 1950.24117 Δln L (best) 89.3877 445.92951 38.93012 38.93012 123.17652 P Valuec Significantly worse?e 0.015 0.000 0.217 0.217 0.001 best no no yes yes no (best) 10.86767 48.64472 2.97551 2.97551 8.10345 0.298 0.003 0.697 0.697 0.403 best yes no yes yes yes (best) 6.49171 15.5881 0.22212 0.22212 4.78639 0.349 0.082 0.821 0.821 0.501 best yes yes yes yes yes 0.000 0.000 0.564 0.564 0.000 best no no yes yes no 0.373 0.005 0.689 0.689 0.419 best yes no yes yes yes (best) 66.78973 85.5999 4.45927 4.45927 62.19023 (best) 10.20842 55.48244 3.86978 3.86978 8.78282 0 (best) 47.00946 13.436 13.436 13.436 (best) 0 0.647 0.007 0.159 0.159 0.159 yes best no yes yes yes 0.883 best yes 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 1999.88279 1950.85218 1950.85218 1951.4114 nad4 2426.04132 2426.04132 2476.34155 2435.05980 2435.05980 2435.80133 nad1 1945.34168 1946.36185 2070.26604 1981.95746 1981.95746 1982.77271 atp8 190.34829 190.34829 190.34829 190.34829 190.34829 190.34829 atp9 401.32802 401.32802 401.32802 398.16849 398.16849 401.32802 nad4L 448.80268 448.80268 448.80268 448.80268 448.80268 448.80268 cox1 3224.81067 3232.70819 3279.47385 3214.10898 3214.10898 3240.97086 nad3 672.47475 676.02756 49.64162 0.61101 0.61101 1.17023 0.002 0.819 0.819 0.745 no yes yes yes 0 (best) 50.30022 9.01848 9.01848 9.76001 0.66 0.008 0.336 0.336 0.302 yes best no yes yes yes (best) 1.02017 124.92435 36.61578 36.61578 
37.43103 0.702 0.000 0.013 0.013 0.008 best yes no no no no 0 (best) 0 0 0 0 0.000 3.15953 3.15953 3.15953 (best) 0 3.15953 0.164 0.164 0.164 0 0 (best) 0 0 0 0.000 0.000 10.70168 18.59921 65.36486 (best) 0 26.86188 0.38 0.18 0.000 0.78 0.032 yes yes no best yes no 0.98611 4.53892 0.643 0.264 yes yes 0.000 0.000 0.000 0.000 0.593 0.164 0.000 0.000 0.000 no best no no no no yes yes yes best yes yes no no yes no no no 3 4 5 6 1 2 3 4 5 6 680.50691 671.48864 671.48864 673.52633 nad6 1147.84359 1152.62491 1161.45593 1146.60988 1146.60988 1144.14278 9.01827 (best) 0 2.03769 0.106 3.70081 8.48214 17.31315 2.4671 2.4671 (best) 0.6 0.219 0.068 0.705 0.705 0.792 0.643 yes best yes yes yes yes yes yes yes best

a The Shimodaira-Hasegawa tests were conducted in PAUP [89].
b Tree topologies correspond to those presented in Figure 4. The Newick format for each tree is as follows:
Tree 1: (DQ364632,((F.fuj,(F.ver,Fsp34)),(AY945289,AY874423)));
Tree 2: (DQ364632,(fsp34,F.ver,f.fuj),(AY945289,AY874423));
Tree 3: (DQ364632,(fsp34,F.ver,f.fuj, AY945289,AY874423));
Tree 4: (DQ364632,(((AY874423,AY945289),(f.ver,fsp34)),F.fuj));
Tree 5: (DQ364632,(f.fuj,((f.ver,fsp34),(AY945289,AY874423))));
Tree 6: (DQ364632,(Fsp34, (F.ver,(F.fuj,(AY945289,AY874423))))).
Species and isolates are designated as follows: Fsp34 = F. circinatum, F. ver = F. verticillioides, F. fuj = F. fujikuroi, DQ364632 = F. graminearum, AY945289 = F. oxysporum isolate F11, AY874423 = F. oxysporum isolate VPRI 19292.
c P values for the SH tests.
d For each dataset, the tree receiving the best likelihood score is indicated with “Best”; those topologies that are significantly worse (P < 0.05) than the best tree are indicated with “Yes” and those that are not (P > 0.05) with “No”.

[Figure S1 legend keys: F. circinatum CDS; F. circinatum tRNA genes; F. oxysporum; F. verticillioides; F. fujikuroi]

Figure S1. Physical map and BLAST comparison of the mt genomes of F. circinatum against F. oxysporum, F. verticillioides and F. fujikuroi.
The coding sequences (CDSs) and tRNA genes predicted in the F. circinatum genome are indicated in blue and red, respectively. The GC content of the F. circinatum mt genome is indicated in black and was plotted (using a sliding window) as the deviation from the average of 31.4% that was calculated over the entire sequence. The map and BLAST comparison were constructed with CGView Server (http://stothard.afns.ualberta.ca/cgview_server/).

[Figure S2: phylogenetic tree; tips are labelled by species, gene, intron number and LAGLIDADG domain (1 or 2); scale bar 0.5]

Figure S2. Midpoint rooted maximum likelihood phylogenetic tree of the amino acid LAGLIDADG endonuclease domains identified within intron regions of F. circinatum (F. cir), F. verticillioides (F. ver), F. fujikuroi (F. fuj), F. oxysporum (F. oxy), F. solani (F. sol) as well as F. graminearum (F. gram). Bootstrap values (>85%), based on 1000 replications, are indicated at the internodes. LAGLIDADG1 and 2 = the first and second conserved domain of the double LAGLIDADG.

[Figure S3: phylogenetic tree; tips are labelled by species, gene and intron number; scale bar 1]

Figure S3. Midpoint rooted maximum likelihood phylogenetic tree of the amino acid GIY-YIG endonuclease domains identified within intron regions of F. circinatum (F. cir), F. verticillioides (F. ver), F. fujikuroi (F. fuj), F. oxysporum (F. oxy), F. solani (F. sol) as well as F. graminearum (F. gram). Bootstrap values (>85%), based on 1000 replications, are indicated at the internodes.

[Figure S4: one tree per gene; panels: atp6, atp8, atp9, cob, cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad4L, nad5, nad6]

Figure S4. Maximum likelihood phylogenies for Fusarium based on mitochondrial protein-coding nucleotide sequences. Species and isolates are designated as follows: F. circinatum (JX910419), F. verticillioides (JN041210), F. fujikuroi (JX910420), F. graminearum (DQ364632), F. oxysporum (AY945289, AY874423), F. solani (JN041209), Trichoderma reesei (NC 003388), Metarhizium anisopliae (AY884128), Lecanicillium muscarium (NC 004514).

[Figure S5: one tree per dataset; panels: concatenated dataset, atp6, atp8, cob, cox1, cox2; tip labels as printed: S. cerevisiae NC 012145, S. pastorianus NC 001224, S. castellii NC 003920, S. servazzii NC 004918, C. glabrata NC 004691]

Figure S5. Maximum likelihood phylogenies for Saccharomyces species based on mitochondrial protein-coding nucleotide sequences.
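As a cross-check on Table S4, the Δln L column can be recomputed from the listed likelihood scores. The sketch below does this for the concatenated dataset only. It rests on an assumption not stated in the table: because the scores are printed as positive numbers, they are treated as negative log-likelihoods, so the smallest value marks the best tree. The function name `delta_ln_l` is hypothetical and introduced purely for illustration.

```python
# Scores copied from the "Concatenated" rows of Table S4 (trees 1-6).
# ASSUMPTION: the table prints negative log-likelihoods as positive
# numbers, so a smaller value means a better-fitting tree.
concatenated = {
    1: 23946.77288,
    2: 24036.16059,
    3: 24392.70239,
    4: 23985.70301,
    5: 23985.70301,
    6: 24069.94940,
}


def delta_ln_l(scores):
    """Return (best_tree, {tree: difference from the best score}).

    Hypothetical helper: the best tree is the one with the smallest
    listed score; ties go to the lowest tree number.
    """
    best_tree = min(sorted(scores), key=scores.get)
    best = scores[best_tree]
    return best_tree, {tree: round(s - best, 5) for tree, s in scores.items()}


best_tree, deltas = delta_ln_l(concatenated)
for tree in sorted(deltas):
    label = "(best)" if tree == best_tree else f"{deltas[tree]:.5f}"
    print(f"Tree {tree}: {label}")
```

Under that assumption the differences agree with the Δln L entries listed for the concatenated dataset to within rounding, and the same function can be applied to the per-gene score sets.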