Ancestral and More Recently Acquired Syntenic Relationships of MADS-box Genes Uncovered by the Physcomitrella patens Pseudochromosomal Genome Assembly

Plant Cell Reports

E. I. Barker, N. W. Ashton

Department of Biology, University of Regina, Regina, SK, S4S 0A2, Canada

Author for correspondence: Neil W. Ashton
e-mail: physcomitrella@gmail.com

OR3 New phylogenetic analysis of Physcomitrella patens MADS-box genes and pseudogenes

Supplementary Introduction

We previously constructed separate trees of type I, MIKCC and MIKC* MADS-box genes in Physcomitrella and a poorly resolved composite tree of MADS-box genes from several plant taxa (Barker and Ashton 2013). For our current analysis, we included the pseudogenes, PPTIM6 and PPMA5. Here, we present the major results that are referred to in our current paper.

Supplementary Results

Weighted Maximum Parsimony (WMP) cladograms of Physcomitrella Type I, Type II (MIKCC) and Type II (MIKC*) MADS-box genes are illustrated below.

The WMP tree of type I genes contained a clade of Mβ genes that was sister to a clade comprising two subclades, one containing Mα genes and another containing three type I genes that cannot be classified as belonging to any of the three major clades of angiosperm type I genes, Mα, Mβ and Mγ (Pařenicová et al. 2003). The pseudogene, PPTIM6, grouped with the Mβ genes. In the Bayesian and Maximum Likelihood (ML) trees, the branch leading to the clades of Mα genes and unclassified genes was collapsed into a polytomy.

The WMP and Bayesian trees of type II MIKCC genes both contained two major clades of MIKCC genes, one comprising PPM2 and its sister clade containing PPM1 and PpMADS1. The second clade contained PpMADS-S as sister to a clade containing PPMC5 and PPMC6. In the ML tree, only the PPM2 clade was resolved.

The WMP and Bayesian trees of type II MIKC* genes contained two major clades, which we designate as the PpMADS2 clade and the PPMA9 clade. PPMA5, the pseudogene, grouped with PPM7.
In the ML tree, the branch leading to the PpMADS2-PPMA12 and PpMADS3-PPMA8 subclades was collapsed into a polytomy.

WMP cladograms of Physcomitrella Type I, Type II (MIKCC) and Type II (MIKC*) MADS-box genes. The main clades within each of these major groups are shaded differentially. Pseudogenes are designated by Ψ. Bootstrap support values (or Bayesian posterior probabilities) are shown above (WMP) or below (Bayesian/ML) branches.

Supplementary Material and Methods

For the phylogenetic tree of type I genes, we used 50 amino acid residues from the most conserved middle portion of the encoded MADS domain of the Physcomitrella type I genes (except PPTIM6, which possesses a truncated MADS-box) and the single MADS domain protein from each of two algal species, Ostreococcus lucimarinus and O. tauri, that were used to root the tree.

For the MIKCC tree rooted with the MIKC* genes, PpMADS2 and PPM6, the complete protein sequences encoded by the MIKCC genes, along with the MADS domain, a portion of the I domain and the extended K domain (Krogan and Ashton 2000) encoded by the MIKC* genes, were aligned in Clustal Omega (Sievers et al. 2011) and adjusted manually in Mesquite (Maddison and Maddison 2015). For the tree of MIKC* genes rooted with the MIKCC genes, PPM1 and PPMC5, the complete protein sequences encoded by the MIKC* genes (except for the polyglutamine tract of the C-terminal domain) and the N-terminal domain, the MADS domain, a portion of the I domain and the extended K domain encoded by the MIKCC genes were aligned using the same procedure. (All alignments are available upon request.) The DNA sequences were then aligned according to the protein alignments.

Model testing for the Bayesian and ML trees was performed using jModelTest 2.1.7 (Guindon and Gascuel 2003; Darriba et al. 2012) with the default settings except Base Tree Search=Best.
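The step in which DNA sequences are aligned according to the protein alignments can be sketched as follows. This is a minimal illustration only, not the procedure or software actually used for this analysis: the function name `thread_dna_onto_protein`, the sequence names and the toy sequences are all hypothetical, and the sketch assumes that each coding sequence is in frame, starts at the first aligned residue and contains no stop codon or intron.

```python
def thread_dna_onto_protein(aligned_protein: str, dna: str, gap: str = "-") -> str:
    """Build a codon alignment from one aligned protein row and its coding DNA.

    Each residue in the aligned protein is replaced by its codon (three
    nucleotides taken in order from `dna`), and each gap is replaced by
    three gap characters, so column homology is inherited from the protein
    alignment.
    """
    n_residues = sum(1 for ch in aligned_protein if ch != gap)
    if len(dna) < 3 * n_residues:
        raise ValueError("DNA sequence is too short for the aligned protein")

    codons = []
    pos = 0  # index of the next unused nucleotide in `dna`
    for ch in aligned_protein:
        if ch == gap:
            codons.append(gap * 3)
        else:
            codons.append(dna[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Hypothetical toy data (not taken from the Physcomitrella alignments).
protein_alignment = {"seqA": "MG-RK", "seqB": "MGKRK"}
coding_dna = {"seqA": "ATGGGTCGTAAA", "seqB": "ATGGGTAAACGTAAA"}

codon_alignment = {
    name: thread_dna_onto_protein(protein_alignment[name], coding_dna[name])
    for name in protein_alignment
}

for name, row in codon_alignment.items():
    print(name, row)
# seqA ATGGGT---CGTAAA
# seqB ATGGGTAAACGTAAA
```

Because gaps are inserted only in multiples of three, the reading frame is preserved in every row, which is the reason for aligning the protein sequences first and threading the nucleotides onto that alignment afterwards.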
The Corrected Akaike Information Criterion (Akaike 1974) was used to choose the best model (TPM2+I for the type I genes, TIM3+G for the MIKCC genes and TrN+I+G for the MIKC* genes).

WMP and ML trees were constructed in PAUP 4.0 beta (Swofford 2002) using the default settings except that the initial trees were obtained by random stepwise addition. WMP and ML trees were bootstrapped with 1000 replicates, with the exception that the ML tree of MIKC* genes was bootstrapped with 100 replicates. Bayesian trees were constructed using MrBayes (Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003). For the type I, MIKCC and MIKC* trees, 50,000 generations, 300,000 generations and 1,000,000 generations respectively were run. Trees were visualised in TreeGraph2 (Stöver and Müller 2010).

Supplementary References

Akaike H (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19(6):716-723

Barker EI, Ashton NW (2013) A parsimonious model of lineage-specific expansion of MADS-box genes in Physcomitrella patens. Plant Cell Rep 32(8):1161-1177

Darriba D, Taboada GL, Doallo R, Posada D (2012) jModelTest2: more models, new heuristics and parallel computing. Nat Methods 9(8):772

Guindon S, Gascuel O (2003) A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696-704

Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17:754-755

Krogan NT, Ashton NW (2000) Ancestry of plant MADS-box genes revealed by bryophyte (Physcomitrella patens) homologues. New Phytol 147(3):505-517

Maddison WP, Maddison DR (2015) Mesquite: a modular system for evolutionary analysis.
Version 3.04 http://mesquiteproject.org

Pařenicová L, de Folter S, Kieffer M, Horner DS, Favalli C, Busscher J, Cook HE, Ingram RM, Kater MM, Davies B, Angenent GC, Colombo L (2003) Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world. Plant Cell 15:1538-1551

Ronquist F, Huelsenbeck JP (2003) MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572-1574

Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539

Stöver BC, Müller KF (2010) TreeGraph 2: combining and visualizing evidence from different phylogenetic analyses. BMC Bioinformatics 11:7

Swofford DL (2002) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts