emi12945-sup-0001

Table S1. Summary of the Ca. S. dinenymphae draft genomes.
Sample name                    B4-10h       C5-4h        C9-6h        D9-6h
Number of contigs (>200 bp)    1,606        599          1,092        1,255
Total length (bp)              3,151,146    1,173,072    1,878,065    1,777,750
N50 (bp)                       5,671        8,649        4,864        3,017
Largest contig (bp)            52,787       36,477       56,258       31,391
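For readers unfamiliar with the N50 statistic reported in Table S1, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the draft genomes or from the authors' assembly pipeline.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


# Hypothetical toy lengths (not from the draft genomes in Table S1).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # total is 300; 100 + 80 reaches half, so this prints 80
```

A larger N50 with the same total length indicates a less fragmented assembly, which is why the statistic is listed alongside the contig count.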
Table S2. Genome-to-Genome distance between the draft genomes
           B4-10h        C5-4h         C9-6h
C5-4h      85.4 ± 2.5
C9-6h      86.2 ± 2.5    78.7 ± 2.8
D9-6h      87.7 ± 2.3    81.9 ± 2.7    84.1 ± 2.6
Table S3. Stepwise removal of contaminating contigs during the B4-10h genome assembly.
                   Total length (bp)    Number of contigs    N50 (bp)    Largest contig (bp)
Before removal     3,790,463            633 (>500 bp)        44,956      105,658
Step 1             3,566,269            384                  45,584      105,658
Step 2             3,540,430            364                  46,500      105,658
Step 3             3,519,982            346                  46,500      105,658
Contigs were removed based on sequence similarity to Bacteroidetes genomes in the database (step 1), GC content (step 2), and
tetranucleotide frequencies (step 3). See Materials and methods for details.
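The two composition measures named in steps 2 and 3 can be sketched as follows. This is only an illustration of how GC content and tetranucleotide frequencies are typically computed per contig; the function names are hypothetical, and the thresholds and clustering used by the authors are described in Materials and methods, not here.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / acgt


def tetranucleotide_frequencies(seq):
    """Relative frequency of each of the 256 tetranucleotides, counted
    with a sliding window; windows containing ambiguous bases are skipped."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


# Hypothetical toy sequence, not a real contig.
toy = "ACGTACGTNN"
print(gc_content(toy))
print(tetranucleotide_frequencies(toy)["ACGT"])
```

Contigs whose GC content or tetranucleotide profile deviates strongly from the bulk of the assembly are candidates for contamination; how "strongly" is defined is a choice this sketch leaves open.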
Table S4. Genes and regions related to transposable elements in the B4-10h genome.
                                  B4-10h
Phage integrase-like genes        4
Transposases                      19
CRISPRs                           3
CRISPR-associated (cas) genes     2
[Figure S1 plot. Y-axis: Genes / Mb (0 to 120). X-axis: functional categories J K L D V T M N U O C G E F H I P Q, grouped as Information storage and processing, Cellular processes and signaling, and Metabolism. Genomes shown: Ca. S. dinenymphae, Ca. A. pseudotrichonymphae, D. gadei, D. capnocytophagoides, B. thetaiotaomicron, B. fragilis, P. ruminicola.]
Figure S1. Comparison of the Ca. S. dinenymphae B4-10h genome with other
Bacteroidales genomes based on Cluster of Bacteroidetes non-supervised orthologous
groups. Abbreviations of the functional categories are: J, Translation, ribosomal structure and
biogenesis; K, Transcription; L, Replication, recombination and repair; D, Cell cycle control,
cell division, chromosome partitioning; V, Defense mechanisms; T, Signal transduction
mechanisms; M, Cell wall/membrane/envelope biogenesis; N, Cell motility; U, Intracellular
trafficking, secretion, and vesicular transport; O, Posttranslational modification, protein
turnover, chaperones; C, Energy production and conversion; G, Carbohydrate transport and
metabolism; E, Amino acid transport and metabolism; F, Nucleotide transport and
metabolism; H, Coenzyme transport and metabolism; I, Lipid transport and metabolism; P,
Inorganic ion transport and metabolism; Q, Secondary metabolites biosynthesis, transport and
catabolism.
[Figure S2 plot. Y-axis: GHs / Mb (0 to 50). Groups: Bacteroidales with cellulase/xylanase (n=59); Bacteroidales without cellulase/xylanase (n=21).]
Figure S2. Frequency of glycoside hydrolase (GH) genes in the genomes of Ca. S.
dinenymphae and other Bacteroidales bacteria. The frequency is shown as the number of GH genes
per megabase. Values are presented separately for the Bacteroidales members with cellulase
and/or xylanase genes and for the members without these genes. GH genes were counted in
the NCBI database and checked by BLAST searches against the CAZy database. Open
circles indicate outliers, and the closed circle indicates the frequency in
the Ca. S. dinenymphae B4-10h genome.
Figure S3. Phylogenetic tree of genes belonging to the GH5 family and subfamilies. The
approximately-maximum-likelihood tree was constructed with a bootstrap analysis of 10,000
resamplings using FastTree with default JTT and CAT substitution models. The numbers at
branch ends indicate the subfamily of GH5. The red circles indicate GH5 family genes of Ca.
S. dinenymphae. Nodes supported by bootstrap values > 80% are indicated by filled
circles.
[Figure S4 diagram. Compartments: outer membrane, periplasm, cytoplasmic membrane, cytoplasm. Components: SusC, SusD, SusR, glycoside hydrolases, TonB-ExBD complex, sugar transporter, arabinose transporter, XylE (xylose). Annotation: activation of sus genes. Gene counts shown in parentheses in the diagram, in extraction order: 21, 39, 1, 9, 1, 1, 1.]
Figure S4. Overview of the predicted polysaccharide-utilizing Sus system in Ca. S.
dinenymphae. The parentheses indicate the number of genes found in the Ca. S.
dinenymphae B4-10h genome. Modified from Martens et al. (2009). TonB-ExBD is a
receptor complex that facilitates active transport.
[Figure S5 diagram. Pathway map connecting precursors (PRPP, fructose-6P, ribulose-5P, erythrose-4P, phosphoenolpyruvate, glycerate-3P, pyruvate, oxaloacetate, succinate, succinyl-CoA) through intermediates (shikimate, chorismate, anthranilate, phosphoserine, O-acetyl-L-serine, homoserine, O-succinylhomoserine, homocysteine) to the amino acids His, Trp, Tyr, Phe, Ser, Cys, Gly, Ala, Ile, Val, Leu, Met, Thr, Asp, Asn, Lys, Arg, Glu, Gln, and Pro. Enzymatic steps are labeled with EC numbers; HisF and HisH are labeled by name.]
Figure S5. Predicted biosynthetic pathways for amino acids in the Ca. S. dinenymphae
genome. The colors indicate the presence of a complete pathway in B4-10h (blue) or in the other Ca.
S. dinenymphae genomes (green), and partial pathways in either B4-10h or the other genomes (red).
[Figure S6 presence/absence matrix.]
Genes, grouped by domain:
Fibronectin: SAMD00024442_1_92, SAMD00024442_7_63, SAMD00024442_7_66, SAMD00024442_63_11
Ankyrin-like repeat: SAMD00024442_5_16, SAMD00024442_14_29
Tetratricopeptide repeat: SAMD00024442_3_44, SAMD00024442_40_7
Bacterial Ig: SAMD00024442_4_13, SAMD00024442_12_11, SAMD00024442_14_18, SAMD00024442_18_14, SAMD00024442_18_15, SAMD00024442_18_16, SAMD00024442_18_17, SAMD00024442_19_19, SAMD00024442_54_2, SAMD00024442_55_11, SAMD00024442_55_12, SAMD00024442_59_3
Genomes compared: Prevotella melaninogenica, Prevotella intermedia, Prevotella ruminicola, Alistipes finegoldii, O. splanchnicus, Bacteroides thetaiotaomicron, Bacteroides reticulotermitis, Bacteroides fragilis, Porphyromonas gingivalis, Porphyromonas asaccharolytica, P. propionicigenes, B. viscericola, Parabacteroides distasonis, D. gadei, D. mossii, D. capnocytophagoides, P. acetatigenes, Ca. A. pseudotrichonymphae
Figure S6. Orthologs of Ca. S. dinenymphae genes related to cell adhesion in the order
Bacteroidales. BLASTP searches were performed using the 2,302 proteins encoded in the Ca.
S. dinenymphae B4-10h genome as query against the individual protein databases of 18
Bacteroidales genomes (e-value < 1E-10). Magenta and cyan indicate presence and absence
of the orthologs, respectively.
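The presence/absence call described in this caption can be illustrated with a short sketch that filters BLASTP tabular output by the stated e-value cutoff. It assumes the standard 12-column BLAST+ tabular format (`-outfmt 6`), in which the query ID is column 1 and the e-value is column 11; the function name and the toy records are hypothetical, and the sketch treats any one-directional hit below the cutoff as "present", which may be simpler than the criteria the authors applied.

```python
def queries_with_hits(blast_tab_lines, evalue_cutoff=1e-10):
    """Return the set of query IDs with at least one hit whose e-value
    is below the cutoff, from BLAST tabular (outfmt 6) lines."""
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])  # column 11 in outfmt 6
        if evalue < evalue_cutoff:
            hits.add(query_id)
    return hits


# Hypothetical toy records (qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore); not real search results.
toy_lines = [
    "geneA\tsubj1\t45.0\t200\t100\t5\t1\t200\t1\t198\t1e-50\t180",
    "geneB\tsubj2\t30.0\t80\t50\t3\t1\t80\t10\t88\t2e-05\t45",
]
print(queries_with_hits(toy_lines))  # only geneA passes the 1E-10 cutoff
```

Running this once per target genome and recording which query genes appear in the returned set gives one column of a presence/absence matrix like the one shown in the figure.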