Table S1. Summary of the Ca. S. dinenymphae draft genomes.

Sample name                   B4-10h      C5-4h       C9-6h       D9-6h
Number of contigs (>200bp)    1,606       599         1,092       1,255
Total length (bp)             3,151,146   1,173,072   1,878,065   1,777,750
N50 (bp)                      5,671       8,649       4,864       3,017
Largest contig (bp)           52,787      36,477      56,258      31,391

Table S2. Genome-to-Genome distance between the draft genomes

          B4-10h       C5-4h        C9-6h
C5-4h     85.4 ± 2.5
C9-6h     86.2 ± 2.5   78.7 ± 2.8
D9-6h     87.7 ± 2.3   81.9 ± 2.7   84.1 ± 2.6

Table S3. Stepwise removal of contaminating contigs during the B4-10h genome assembly.

          Total length (bp)   Number of contigs   N50 (bp)   Largest contig (bp)
          3,790,463           633 (> 500bp)       44,956     105,658
Step1     3,566,269           384                 45,584     105,658
Step2     3,540,430           364                 46,500     105,658
Step3     3,519,982           346                 46,500     105,658

The removal was done based on the sequence similarity to Bacteroidetes genomes in database (step 1), the GC content (step 2), and the tetranucleotide frequencies (step 3). See Materials and methods for detail.

Table S4. Genes and regions related to transposable elements in the B4-10h genome.

                                  B4-10h
Phage integrase-like genes        4
Transposases                      19
CRISPRs                           3
CRISPR-associated (cas) genes     2

[Figure S1: bar chart. y-axis: Genes / Mb (0 to 120). x-axis: functional categories grouped as Information storage and processing (J, K, L), Cellular processes and signaling (D, V, T, M, N, U, O), and Metabolism (C, G, E, F, H, I, P, Q). Legend: Ca. S. dinenymphae, Ca. A. pseudotrichonymphae, D. gadei, D. capnocytophagoides, B. thetaiotaomicron, B. fragilis, P. ruminicola.]

Figure S1. Comparison of the Ca. S. dinenymphae B4-10h genome with other Bacteroidales genomes based on Cluster of Bacteroidetes non-supervised orthologous groups.
Abbreviations of the functional categories are: J, Translation, ribosomal structure and biogenesis; K, Transcription; L, Replication, recombination and repair; D, Cell cycle control, cell division, chromosome partitioning; V, Defense mechanisms; T, Signal transduction mechanisms; M, Cell wall/membrane/envelope biogenesis; N, Cell motility; U, Intracellular trafficking, secretion, and vesicular transport; O, Posttranslational modification, protein turnover, chaperones; C, Energy production and conversion; G, Carbohydrate transport and metabolism; E, Amino acid transport and metabolism; F, Nucleotide transport and metabolism; H, Coenzyme transport and metabolism; I, Lipid transport and metabolism; P, Inorganic ion transport and metabolism; Q, Secondary metabolites biosynthesis, transport and catabolism.

[Figure S2: plot. y-axis: GHs / Mb (0 to 50). Groups: Bacteroidales with cellulase/xylanase (n=59); Bacteroidales without cellulase/xylanase (n=21).]

Figure S2. Frequency of glycoside hydrolase (GH) genes in the genomes of Ca. S. dinenymphae and other Bacteroidales bacteria. The frequency is shown as the number of GH genes per megabase of genome sequence. The values are presented separately for the Bacteroidales members with cellulase and/or xylanase genes and for the members without these genes. The GH genes were counted in the NCBI database and checked by BLAST searches against the CAZy database. Open circles indicate outliers and a closed circle indicates the frequency in the Ca. S. dinenymphae B4-10h genome.

Figure S3. Phylogenetic tree of genes belonging to the GH5 family and subfamilies. The approximately-maximum-likelihood tree was constructed with a bootstrap analysis of 10,000 resamplings using FastTree with default JTT and CAT substitution models. The numbers at branch ends indicate the subfamily of GH5. The red circles indicate GH5 family genes of Ca. S. dinenymphae. Nodes supported by bootstrap values of > 80% are indicated by filled circles.
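The per-megabase normalization used for the y-axes of Figures S1 and S2 can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `genes_per_mb` and the example numbers are hypothetical and are not taken from the genomes analyzed here.

```python
def genes_per_mb(gene_count: int, genome_length_bp: int) -> float:
    """Return gene density as genes per megabase (1 Mb = 1,000,000 bp)."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return gene_count / (genome_length_bp / 1_000_000)


# Hypothetical illustration only: 100 GH genes in a 2,500,000 bp genome.
density = genes_per_mb(100, 2_500_000)
print(f"{density:.1f} GHs / Mb")
```

Normalizing by genome length makes draft genomes of different sizes comparable, which matters here because the assemblies range from roughly 1.2 to 3.5 Mb.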
[Figure S4: schematic of the Sus system. Compartments: outer membrane, periplasm, cytoplasmic membrane, cytoplasm. Components: SusC, SusD, SusR, glycoside hydrolases, TonB-ExBD complex, XylE, sugar transporter, arabinose transporter. Substrates: sugar, arabinose, xylose. Annotation: activation of sus genes. Gene counts shown in parentheses in the diagram: (21), (39), (1), (9), (1), (1), (1).]

Figure S4. Overview of the predicted polysaccharide-utilizing Sus system in Ca. S. dinenymphae. The numbers in parentheses indicate the number of genes found in the Ca. S. dinenymphae B4-10h genome. Modified from Martens et al. (2009). TonB-ExBD is a receptor complex that facilitates active transport.

[Figure S5: pathway map. Nodes: the amino acids His, Trp, Phe, Tyr, Ser, Gly, Cys, Ala, Val, Ile, Leu, Met, Thr, Asp, Asn, Lys, Arg, Glu, Gln, and Pro, and the intermediates PRPP, fructose-6P, ribulose-5P, erythrose-4P, anthranilate, shikimate, chorismate, glycerate-3P, phosphoenolpyruvate, phosphoserine, pyruvate, O-acetyl-L-serine, homocysteine, O-succinylhomoserine, homoserine, oxaloacetate, succinate, and succinyl-CoA. Reactions are labeled with EC numbers; HisF and HisH are labeled by name.]

Figure S5. Predicted biosynthetic pathways for amino acids in the Ca. S. dinenymphae genome. The colors indicate pathways that are complete in B4-10h (blue), pathways that are complete in the other Ca. S. dinenymphae genomes (green), and pathways that are only partial in either B4-10h or the other genomes (red).
[Figure S6: presence/absence matrix.
Rows (Ca. S. dinenymphae genes, by category):
Fibronectin: SAMD00024442_1_92, SAMD00024442_7_63, SAMD00024442_7_66, SAMD00024442_63_11
Ankyrin-like repeat: SAMD00024442_5_16, SAMD00024442_14_29
Tetratricopeptide repeat: SAMD00024442_3_44, SAMD00024442_40_7
Bacterial Ig: SAMD00024442_4_13, SAMD00024442_12_11, SAMD00024442_14_18, SAMD00024442_18_14, SAMD00024442_18_15, SAMD00024442_18_16, SAMD00024442_18_17, SAMD00024442_19_19, SAMD00024442_54_2, SAMD00024442_55_11, SAMD00024442_55_12, SAMD00024442_59_3
Columns (genomes): Prevotella melaninogenica, Prevotella intermedia, Prevotella ruminicola, Alistipes finegoldii, O. splanchnicus, Bacteroides thetaiotaomicron, Bacteroides reticulotermitis, Bacteroides fragilis, Porphyromonas gingivalis, Porphyromonas asaccharolytica, P. propionicigenes, B. viscericola, Parabacteroides distasonis, D. gadei, D. mossii, D. capnocytophagoides, P. acetatigenes, Ca. A. pseudotrichonymphae.]

Figure S6. Orthologs of Ca. S. dinenymphae genes related to cell adhesion in the order Bacteroidales. BLASTP searches were performed using the 2,302 proteins encoded in the Ca. S. dinenymphae B4-10h genome as query against the individual protein databases of 18 Bacteroidales genomes (e-value < 1E-10). Magenta and cyan indicate presence and absence of the orthologs, respectively.
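A presence/absence matrix of the kind shown in Figure S6 could be derived from BLASTP results roughly as sketched below. This is an illustration under stated assumptions, not the authors' pipeline: it assumes the searches were saved in BLAST tabular format (`-outfmt 6`, e-value in the 11th column), it treats any hit with an e-value below 1E-10 as "present", and the function names, gene IDs, genome names, and hit rows are all hypothetical.

```python
EVALUE_CUTOFF = 1e-10  # threshold stated in the Figure S6 caption


def queries_with_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return the query IDs that have at least one hit below the cutoff.

    Assumes BLAST tabular output (-outfmt 6): 12 tab-separated columns,
    query ID in column 1 and e-value in column 11.
    """
    hits = set()
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 12 or line.startswith("#"):
            continue  # skip blank, malformed, or comment lines
        if float(fields[10]) < cutoff:
            hits.add(fields[0])
    return hits


def presence_matrix(query_ids, results_by_genome, cutoff=EVALUE_CUTOFF):
    """Map each query ID to {genome name: True/False} for presence of a hit."""
    hit_sets = {
        genome: queries_with_hits(text, cutoff)
        for genome, text in results_by_genome.items()
    }
    return {
        query: {genome: query in hit_sets[genome] for genome in hit_sets}
        for query in query_ids
    }


# Hypothetical BLASTP output for two made-up genomes (not real results).
example_results = {
    "genome1": (
        "geneA\tsubj1\t45.0\t200\t100\t2\t1\t200\t5\t204\t1e-50\t180\n"
        "geneB\tsubj2\t30.0\t80\t50\t3\t1\t80\t10\t89\t0.001\t35\n"
    ),
    "genome2": "geneB\tsubj9\t60.0\t150\t55\t1\t1\t150\t1\t150\t2e-40\t150\n",
}

matrix = presence_matrix(["geneA", "geneB"], example_results)
for gene, row in matrix.items():
    print(gene, row)
```

A single e-value cutoff is a permissive criterion for orthology; stricter definitions such as reciprocal best hits would need the reverse searches as well and are outside the scope of this sketch.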