ELECTRONIC SUPPLEMENTARY MATERIALS

1. MATERIALS & METHODS

TAXONOMIC SAMPLING

In addition to the 54 fish species used in Azuma et al. (2008), 22 atherinomorphs (including 14 medakas) were newly added in this study, bringing the total number of species analyzed to 76 (table S1). The 14 medakas represented all three known species groups (celebensis, javanicus and latipes species groups), with the last comprising two species (Oryzias luzonensis and O. latipes) and four regional populations of O. latipes (Shanghai, South Korea, southern and northern Japanese populations).

SPECIMENS AND DNA EXTRACTION

A portion of epaxial musculature (ca. 0.25 g) was excised from fresh specimens of each species and immediately preserved in 99.5% ethanol. Total genomic DNA was extracted from the ethanol-preserved tissue using the DNeasy kit (Qiagen) or the Aquapure genomic DNA isolation kit (Bio-Rad Laboratories, Inc.) following the manufacturers' protocols.

PCR AND SEQUENCING

Whole mitogenomes of the eight medaka species were amplified in their entirety using a long PCR technique (Cheng et al. 1994). Seven fish-versatile PCR primers for the long PCR were used in the following four combinations to amplify the entire mitogenome in two reactions: L2508-16S + H12293-Leu; L2508-16S + H15149-CYB; L8343-Lys + H1065-12S; and L12321-Leu + S-LA-16S-H (for locations and sequences of these primers, see Inoue et al. 2000, 2001; Ishiguro et al. 2001; Kawaguchi et al. 2001; Miya & Nishida 2000). Long PCR reaction conditions followed Miya & Nishida (1999). Long PCR products diluted with TE buffer (1:19) were subsequently used as templates for short PCR reactions employing fish-versatile PCR primers in various combinations to amplify contiguous, overlapping segments of the entire mitogenome.
The short PCR reactions were carried out following previously described protocols (Miya & Nishida 1999). The products were purified using ExoSAP-IT (GE Healthcare Bio-Sciences Corp.) and subsequently sequenced with dye-labeled terminators (BigDye terminator ver. 1.1/3.1, Applied Biosystems) and the primers used in the short PCRs. Sequencing reactions were conducted following the manufacturer's instructions, followed by electrophoresis on ABI Prism 3100 or 3130 DNA sequencers (Applied Biosystems). A list of PCR primers used in this study is available from MM upon request.

SEQUENCE EDITING AND ALIGNMENT

Each sequence electropherogram was edited with EditView (ver. 1.01; Applied Biosystems) and the multiple sequences were concatenated using AutoAssembler (ver. 2.1; Applied Biosystems). The concatenated sequences were carefully checked and annotated using DNASIS (ver. 3.2; Hitachi Software Engineering) and a sequence file was created for each gene.

Mitogenome sequences from the 22 atherinomorphs were combined with the pre-aligned sequences used in Azuma et al. (2008) in a single FASTA file, which was subjected to multiple alignment using MAFFT ver. 6 (Katoh & Toh 2008). The aligned sequences were imported into MacClade ver. 4.08 (Maddison & Maddison 2000) and the resulting gaps in the pre-aligned sequences were manually removed to reproduce the alignment used in Azuma et al. (2008). The dataset comprises 6966 positions from the first and second codon positions of the 12 protein-coding genes (excluding the ND6 gene), 1673 positions from the two rRNA genes and 1407 positions from the 22 tRNA genes (total 10,046 positions). The third codon positions of the protein-coding genes were entirely excluded because their extremely accelerated rates of change may cause high levels of homoplasy (Miya & Nishida 2000) and overestimation of divergence times (Benton & Ayala 2003).
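As a quick consistency check on the dataset composition described above, the following Python sketch sums the reported partition sizes and illustrates what dropping third codon positions means for an in-frame coding sequence. The function name `drop_third_codon_positions` and the toy sequence are hypothetical illustrations and do not reproduce the authors' actual editing workflow.

```python
# Partition sizes as reported in the text (number of aligned positions).
PARTITION_SIZES = {
    "codon_pos_1_and_2": 6966,  # 12 protein-coding genes, ND6 excluded
    "rRNA": 1673,               # two rRNA genes
    "tRNA": 1407,               # 22 tRNA genes
}


def drop_third_codon_positions(coding_seq: str) -> str:
    """Return only the first and second codon positions of a sequence.

    Assumes (for this illustration only) that the sequence starts at codon
    position 1 and that its length is a multiple of three.
    """
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(base for i, base in enumerate(coding_seq) if i % 3 != 2)


total_positions = sum(PARTITION_SIZES.values())
# Number of codons implied by the first + second codon positions retained.
implied_codons = PARTITION_SIZES["codon_pos_1_and_2"] // 2

if __name__ == "__main__":
    print(total_positions)
    print(implied_codons)
    print(drop_third_codon_positions("ATGGCATTC"))
```

The three partition sizes add up to the reported total of 10,046 positions, and the 6966 retained coding positions correspond to 3483 codons.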
PHYLOGENETIC ANALYSIS

Unambiguously aligned sequences were divided into four partitions (first codon positions, second codon positions, rRNA genes and tRNA genes) and subjected to partitioned maximum-likelihood (ML) analysis using RAxML ver. 7.0.4 (Stamatakis 2006). A general time reversible model with sites following a discrete gamma distribution (GTR + Γ; the model recommended by the author) was used and a rapid bootstrap (BS) analysis was conducted with 1000 replications (-f a option). This option performs the BS analysis using GTRCAT, which is a GTR approximation with optimization of individual per-site substitution rates and classification of those rates into a certain number of rate categories. After the BS analysis, the program uses every fifth BS tree as a starting point to search for the ML tree under the GTR + Γ model of sequence evolution to obtain more stable likelihood values.

DIVERGENCE TIME ESTIMATION

A relaxed molecular-clock method for dating analysis developed by Thorne and Kishino (2002) was used to estimate divergence times. This method accommodates unlinked rate variation across different loci ("partitions" in this study), allows the use of time constraints on multiple divergences, and uses a Bayesian MCMC approach to approximate the posterior distribution of divergence times and rates based on a single tree topology estimated by another method (the ML tree in this study).

A series of programs in the multidistribute package (v9/25/2003) was used for these analyses. Baseml in PAML ver. 3.14 was used to estimate model parameters for each partition separately under the F84 + Γ model of sequence evolution (the most parameter-rich model implemented in multidistribute). Based on the outputs from baseml, branch lengths and the variance-covariance matrix were estimated for each partition using estbranches in multidistribute.
Finally, multidivtime in multidistribute was used to perform Bayesian MCMC analyses to approximate the posterior distribution of substitution rates, divergence times, and 95% credible intervals. In this step, multidivtime uses the estimated branch lengths and the variance-covariance matrices from all partitions without information from the aligned sequences. After a burn-in period of 100,000 cycles, the Markov chain was sampled every 100 cycles until a total of 10,000 samples was obtained. To diagnose possible failure of the Markov chains to converge to their stationary distribution, at least two replicate MCMC runs were performed with different random seeds for each analysis.

Application of multidivtime requires values for the mean of the prior distribution for the time separating the ingroup root from the present (rttm) and its standard deviation (rttmsd); we set conservative estimates of 4.45 (= 445 Mya) and 4.45, respectively. The tip-root branch lengths were calculated using TreeStat v. 1.1 for all terminals and their average was divided by rttm (4.45) to estimate the rate at the root node (rtrate); rtrate and its standard deviation (rtratesd) were set to 0.074 and 0.074, respectively. The prior mean of the Brownian motion constant (brownmean) and its standard deviation (brownsd) were both set to 0.5, specifying a relatively flexible prior.

The multidivtime program allows for both minimum (lower) and maximum (upper) time constraints, and it has been argued that multiple calibration points provide more realistic divergence time estimates overall. We therefore sought to obtain optimal phylogenetic coverage of calibration points across our tree, although we could set maximum constraints based on fossil records only for the three basal splits between Sarcopterygii and Actinopterygii, Polypteriformes and Actinopteri, and Acipenseriformes and Neopterygii (A–C in figure 1; table S2).
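The MCMC sampling schedule and the root-rate prior described earlier in this section reduce to simple arithmetic, sketched here in Python. The helper names and the example root-to-tip lengths are hypothetical; only the numerical settings (burn-in, sampling interval, number of samples, rttm) are taken from the text.

```python
from statistics import mean
from typing import Sequence

# Settings reported in the text.
BURN_IN_CYCLES = 100_000
SAMPLE_INTERVAL = 100
NUM_SAMPLES = 10_000
RTTM = 4.45  # prior mean root age, in units of 100 Myr (= 445 Mya)


def total_mcmc_cycles(burn_in: int, interval: int, n_samples: int) -> int:
    """Total cycles run: burn-in plus one sampling interval per sample."""
    return burn_in + interval * n_samples


def root_rate_prior(root_to_tip_lengths: Sequence[float], rttm: float) -> float:
    """Mean root-to-tip branch length divided by the prior root age."""
    return mean(root_to_tip_lengths) / rttm


if __name__ == "__main__":
    print(total_mcmc_cycles(BURN_IN_CYCLES, SAMPLE_INTERVAL, NUM_SAMPLES))
    # Hypothetical root-to-tip lengths (substitutions per site), chosen only
    # to illustrate the calculation; the real values came from TreeStat.
    example_lengths = [0.30, 0.33, 0.36]
    print(round(root_rate_prior(example_lengths, RTTM), 3))
```

Under these settings each run covers 1,100,000 cycles in total, and a mean root-to-tip length of roughly 0.33 substitutions per site would be consistent with the reported rtrate of 0.074.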
We also set lower and upper time constraints for three nodes in the cichlid divergence, which shows excellent congruence with Gondwanan continental fragmentation, assuming that cichlids have never dispersed across oceans. Accordingly, we set a total of 27 time constraints based on both the fossil record and biogeographic events, as shown in figure 1 and table S2.

2. RESULTS

GENOME ORGANIZATION

The whole mitogenome sequences from the eight medaka species reported here for the first time were registered in DDBJ/EMBL/GenBank (table S1 in ESM). The genome contents (including 13 protein-coding, two rRNA and 22 tRNA genes and the control region) and gene orders were identical to those of typical vertebrates.

3. ACKNOWLEDGMENTS

We thank Y. Azuma, Y. Yamanoue and other members of the Marine Molecular Biology Laboratory, Ocean Research Institute, The University of Tokyo, for their invaluable advice and discussions. Sincere thanks also go to J.L. Thorne for his advice on performing the multidivtime analysis.

4. REFERENCES

Azuma, Y. et al. 2008 Mitogenomic evaluation of the historical biogeography of cichlids toward reliable dating of teleostean divergences. BMC Evol. Biol. 8, 215. (doi:10.1186/1471-2148-8-215)
Benton, M.J. & Ayala, F. 2003 Dating the tree of life. Science 300, 1698–1700. (doi:10.1126/science.1077795)
Benton, M.J. & Donoghue, P.C. 2007 Paleontological evidence to date the tree of life. Mol. Biol. Evol. 24, 26–53. (doi:10.1093/molbev/msl150)
Cheng, S., Higuchi, R. & Stoneking, M. 1994 Complete mitochondrial genome amplification. Nat. Genet. 7, 350–351. (doi:10.1038/369684a0)
Hurley, I.A. et al. 2007 A new time-scale for ray-finned fish evolution. Proc. R. Soc. B 274, 489–498. (doi:10.1098/rspb.2006.3749)
Inoue, J.G., Miya, M., Tsukamoto, K. & Nishida, M. 2000 Complete mitochondrial DNA sequence of the Japanese sardine Sardinops melanostictus. Fish. Sci. 66, 924–932. (doi:10.1111/j.1444-2906.2000.00148.x)
Inoue, J.G., Miya, M., Tsukamoto, K. & Nishida, M.
2001 Complete mitochondrial DNA sequence of the Japanese anchovy Engraulis japonicus. Fish. Sci. 67, 828–835. (doi:10.1046/j.1444-2906.2001.00329.x)
Inoue, J.G. et al. 2009 The historical biogeography of the freshwater knifefishes using mitogenomic approaches: a Mesozoic origin and out-of-India migration of the Asian notopterids (Actinopterygii: Osteoglossomorpha). Mol. Phylogenet. Evol. (doi:10.1016/j.ympev.2009.01.020)
Ishiguro, N.B., Miya, M. & Nishida, M. 2001 Complete mitochondrial DNA sequence of ayu, Plecoglossus altivelis. Fish. Sci. 67, 474–481. (doi:10.1046/j.1444-2906.2001.00283.x)
Janvier, P. 1996 Early vertebrates. Oxford: Oxford University Press.
Katoh, K. & Toh, H. 2008 Recent developments in the MAFFT multiple sequence alignment program. Briefings Bioinformat. 9, 286–298. (doi:10.1093/bib/bbn013)
Kawaguchi, A., Miya, M. & Nishida, M. 2001 Complete mitochondrial DNA sequence of Aulopus japonicus (Teleostei: Aulopiformes), a basal Eurypterygii: longer DNA sequences and higher-level relationships. Ichthyol. Res. 48, 213–223. (doi:10.1007/s10228-001-8139-0)
Maddison, W.P. & Maddison, D.R. 2000 MacClade version 4. Sunderland, Massachusetts: Sinauer Associates.
Masters, J.C., de Wit, M.J. & Asher, R.J. 2006 Reconciling the origins of Africa, India and Madagascar with vertebrate dispersal scenarios. Folia Primatol. 77, 399–418. (doi:10.1159/000095388)
Miya, M. & Nishida, M. 1999 Organization of the mitochondrial genome of a deep-sea fish Gonostoma gracile (Teleostei: Stomiiformes): first example of transfer RNA gene rearrangements in bony fishes. Mar. Biotechnol. 1, 416–426. (doi:10.1007/PL00011798)
Miya, M. & Nishida, M. 2000 Use of mitogenomic information in teleostean molecular phylogenetics: a tree-based exploration under the maximum-parsimony optimality criterion. Mol. Phylogenet. Evol. 17, 437–455. (doi:10.1006/mpev.2000.0839)
Parenti, L.R.
2008 A phylogenetic analysis and taxonomic revision of ricefishes, Oryzias and relatives (Beloniformes, Adrianichthyidae). Zool. J. Linn. Soc. 154, 494–610.
Patterson, C. 1993 Osteichthyes: Teleostei. In The fossil record 2 (ed. M.J. Benton), pp. 621–656. London: Chapman & Hall.
Smith, A.G., Smith, D.G. & Funnell, B.M. 1994 Atlas of Mesozoic and Cenozoic coastlines. Cambridge: Cambridge University Press.
Storey, B.C. 1995 The role of mantle plumes in continental breakup: case histories from Gondwanaland. Nature 377, 301–308. (doi:10.1038/377301a)
Thorne, J.L. & Kishino, H. 2002 Divergence time and evolutionary rate estimation with multilocus data. Syst. Biol. 51, 689–702. (doi:10.1080/10635150290102456)
Tyler, J.C. & Sorbini, L. 1996 New superfamily and three new families of tetraodontiform fishes from the Upper Cretaceous: the earliest and most morphologically primitive plectognaths. Smithson. Contrib. Paleobiol. 82, 1–59.
Wilson, M.V.H., Brinkman, D.B. & Neuman, A.G. 1992 Cretaceous Esocoidei (Teleostei): early radiation of the pikes in North American fresh waters. J. Paleontol. 66, 839–846.
Yamanoue, Y., Miya, M., Inoue, J.G., Matsuura, K. & Nishida, M. 2006 The mitochondrial genome of spotted green pufferfish Tetraodon nigroviridis (Teleostei: Tetraodontiformes) and divergence time estimation among model organisms in fishes. Genes Genet. Syst. 81, 29–39.
Yang, Z. 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556.
Zhu, M. et al. 2006 A primitive fish provides key characters bearing on deep osteichthyan phylogeny. Nature 441, 77–80. (doi:10.1038/nature04563)

Supplementary Table S1. List of species used in this study with DDBJ/GenBank/EMBL accession numbers. Taxonomic treatment of species of the family Adrianichthyidae follows Parenti (2008).
—————————————————————————————————————
Order               Family              Species                             Accession No.
—————————————————————————————————————
Outgroups (sharks)
Carcharhiniformes   Scyliorhinidae      Scyliorhinus canicula               Y16067
                    Triakidae           Mustelus manazo                     AB015962
Ingroups (lobe-finned fishes)
Coelacanthiformes   Latimeriidae        Latimeria menadoensis               AP006858
Ceratodontiformes   Ceratodontidae      Neoceratodus forsteri               AJ584642
Ingroups (ray-finned fishes)
Polypteriformes     Polypteridae        Polypterus ornatipinnis             AP004351
                                        Polypterus senegalus senegalus      AP004352
                                        Erpetoichthys calabaricus           AP004350
Acipenseriformes    Acipenseridae       Acipenser transmontanus             AB042837
                                        Scaphirhynchus cf. albus            AP004354
                    Polyodontidae       Polyodon spathula                   AP004353
Lepisosteiformes    Lepisosteidae       Lepisosteus oculatus                AB042861
                                        Atractosteus spatula                AP004355
Amiiformes          Amiidae             Amia calva                          AB042952
Hiodontiformes      Hiodontidae         Hiodon alosoides                    AP004356
Osteoglossiformes   Osteoglossidae      Osteoglossum bicirrhosum            AB043025
                                        Pantodon buchholzi                  AB043068
Albuliformes        Notacanthidae       Notacanthus chemnitzi               AP002975
Anguilliformes      Anguillidae         Anguilla japonica                   AB038556
                    Muraenidae          Gymnothorax kidako                  AP002976
                    Congridae           Conger myriaster                    AB038381
Clupeiformes        Engraulidae         Engraulis japonicus                 AB040676
                    Clupeidae           Sardinops melanostictus             AB032554
Cypriniformes       Cyprinidae          Cyprinus carpio                     X61010
                                        Danio rerio                         AC024175
                    Balitoridae         Crossostoma lacustre                M91245
Salmoniformes       Salmonidae          Coregonus lavaretus                 AB034824
                                        Salmo salar                         U12143
                                        Oncorhynchus mykiss                 L29771
Esociformes         Esocidae            Esox lucius                         AP004103
Aulopiformes        Chlorophthalmidae   Chlorophthalmus agassizi            AP002918
Polymixiiformes     Polymixiidae        Polymixia japonica                  AB034826
Gadiformes          Gadidae             Gadus morhua                        X99772
Atheriniformes      Atherinopsidae      Menidia menidia                     AB370893
                    Melanotaeniidae     Melanotaenia lacustris              AP004419
                    Notocheilidae       Iso hawaiiensis                     AB373006
Cyprinodontiformes  Aplocheilidae       Aplocheilus panchax                 AB373005
                    Goodeidae           Xenotoca eiseni                     AP006777
                    Cyprinodontidae     Jordanella floridae                 AP006778
Beloniformes        Scomberesocidae     Cololabis saira                     AP002932
                    Exocoetidae         Exocoetus volitans                  AP002933
                    Hemiramphidae       Hyporhamphus sajori                 AB370892
                    Adrianichthyidae    Oryzias latipes group
                                        Oryzias luzonensis                  AB498064
                                        Southern Japanese populations
                                        Oryzias latipes (Hd-rR)             AB498065
                                        Oryzias latipes (Nago)              AP008946
                                        Oryzias latipes                     AP004421
                                        Northern Japanese populations
                                        Oryzias latipes (HNI)               AB498066
                                        Oryzias latipes (Hirosaki)          AP008941
                                        China and West Korean populations
                                        Oryzias latipes (SOK)               AP008947
                                        Oryzias latipes (Shanghai)          AP008948
                                        Oryzias javanicus group
                                        Oryzias javanicus                   AB498067
                                        Oryzias minutillus                  AB498068
                                        Oryzias dancena                     AB498069
                                        Oryzias celebensis group
                                        Oryzias celebensis                  AB498070
                                        Oryzias marmoratus                  AP005981
                                        Oryzias sarasinorum                 AB370891
Beryciformes        Berycidae           Beryx splendens                     AP002939
                    Holocentridae       Sargocentron rubrum                 AP004432
Gasterosteiformes   Gasterosteidae      Gasterosteus aculeatus              AP002944
Scorpaeniformes     Scorpaenidae        Helicolenus hilgendorfi             AP002948
Perciformes         Cichlidae           Oreochromis sp.                     AP009126
                                        Neolamprologus brichardi            AP006014
                                        Tropheus duboisi                    AP006015
                                        Astronotus ocellatus                AP009127
                                        Paretroplus maculatus               AP009504
                                        Etroplus maculatus                  AP009505
                                        Hypselecara temporalis              AP009506
                                        Ptychochromoides katria             AP009507
                                        Paratilapia polleni                 AP009508
                                        Tylochromis polylepis               AP009509
                    Pomacentridae       Abudefduf vaigiensis                AP006016
                                        Amphiprion ocellaris                AP006017
                    Labridae            Pseudolabrus sieboldi               AP006019
                                        Halichoeres melanurus               AP006018
Pleuronectiformes   Paralichthyidae     Paralichthys olivaceus              AB028664
Tetraodontiformes   Tetraodontidae      Takifugu rubripes                   AJ421455
                                        Tetraodon nigroviridis              AP006046
—————————————————————————————————————

Supplementary Table S2. Maximum (U) and minimum (L) time constraints (Ma) used for dating at nodes in figure S2
—————————————————————————————————————
Node  Constraints  Calibration information
—————————————————————————————————————
A     U 472        The minimum age for the basal split of bony fish based on the earliest known acanthodian remains from Late Ordovician (Janvier 1996)
      L 419        The †Psarolepis fossil (sarcopterygian; Zhu et al. 2006) from Ludlow (Silurian) (Hurley et al.
2007)
B     U 419        The minimum age for the Sarcopterygii/Actinopterygii split
      L 392        The †Moythomasia fossil (actinopteran) from the Givetian/Eifelian boundary (Hurley et al. 2007)
C     U 392        The minimum age for the Polypteriformes/Actinopteri split
      L 345        The †Cosmoptychius fossil (neopterygian or actinopteran) from Tournaisian (Hurley et al. 2007)
D     L 130        The †Protopsephurus fossil (Polyodontidae) from Hauterivian (Cretaceous) (Hurley et al. 2007)
E     L 284        The †Brachydegma fossil (stem amiids) from Artinskian (Permian) (Hurley et al. 2007)
F     L 136        The †Yanbiania fossil (Hiodontidae) from the Lower Cretaceous (Hurley et al. 2007)
G     L 112        The †Laeliichthys fossil (Osteoglossidae) from the Aptian (Cretaceous) (Patterson 1993)
H     L 151        The †Anaethalion, †Elopsomolos, and †Eoprotelops fossils (Elopomorpha) from Kimmeridgian (Jurassic) (Hurley et al. 2007)
I     L 94         The †Lebonichthys (Albulidae) fossil from the Cenomanian (Cretaceous) (Patterson 1993)
J     L 49         The Conger (Congridae) and Anguilla (Anguillidae) fossils from the Ypresian (Tertiary) (Patterson 1993)
K     L 146        The †Tischlingerichthys fossil (Ostariophysi) from Tithonian (Jurassic) (Hurley et al. 2007)
L     L 56         The †Knightia fossil (Clupeidae) from the Thanetian (Tertiary) (Patterson 1993)
M     L 49         The †Parabarbus fossil (Cyprinidae) from the Ypresian (Tertiary) (Patterson 1993)
N     L 74         The †Estesesox foxi fossil (Esociformes) from the Campanian (Cretaceous) (Wilson et al. 1992)
O     L 94         The †Berycopsis fossil (Polymixiidae) from the Cenomanian (Cretaceous) (Patterson 1993)
P     L 50         The pleuronectiform fossil from the Ypresian (Tertiary) (Patterson 1993)
Q     L 98         The tetraodontiform fossil from the Cenomanian (Tyler & Sorbini 1996)
R     L 32         The estimated divergence time between Takifugu and Tetraodon (Benton & Donoghue 2007)
S     U 95, L 85   The upper and lower bounds of separation between Madagascar and India (Smith et al.
1994; Storey 1995)
T     U 145, L 112 The upper and lower bounds of separation between the Indo-Madagascar landmass and Gondwanaland (Smith et al. 1994; Storey 1995; Masters et al. 2006)
U     U 120, L 100 The upper and lower bounds of separation between the African and South American landmasses (Smith et al. 1994; Storey 1995)
—————————————————————————————————————

Supplementary Table S3. Comparisons of divergence time estimates between the present study and previous studies
—————————————————————————————————————
Node                                      This study      Azuma et al.    Yamanoue et al.
                                                          (2008)*         (2006)
—————————————————————————————————————
Sarcopterygii vs. Actinopterygii          428 (419–442)   429 (417–449)   470 (415–524)
Teleostei vs. Neopterygii                 364 (346–378)   365 (348–378)   390 (340–442)
Euteleostei vs. Otocephala                289 (269–310)   288 (268–307)   315 (270–363)
Cyprinus vs. Danio                        153 (125–183)   147 (120–174)   167 (131–208)
Acanthopterygii vs. Paracanthopterygii    209 (191–225)   207 (190–224)   223 (191–264)
Percomorpha vs. Berycomorpha              200 (185–217)   198 (183–215)   206 (174–245)
Oryzias vs. Tetraodontiformes             180 (166–195)   176 (163–191)   184 (154–221)
Oryzias vs. Cichlidae                     150 (139–161)   152 (141–165)   ——
Gasterosteus vs. Tetraodontidae           173 (159–189)   170 (156–185)   192 (153–235)
Takifugu vs. Tetraodon                    78 (63–93)      78 (65–93)      73 (57–94)
—————————————————————————————————————
* Estimated with biogeography-based time constraints on cichlid divergence

FIGURE LEGEND

Figure S1. Maximum likelihood tree from analysis of whole mitogenome sequences (10,046 positions excluding third codon positions) from 76 fish species using RAxML ver. 7.0.4. Numerals beside internal branches indicate bootstrap probabilities based on 1000 replicates. Scale indicates expected number of substitutions per site.