Supplementary Table 1 Genomes of the 144 bacteria

Supplementary Fig. 1
Classification of the 144 bacteria based on the normalized
amino acid compositions at the N-terminal region and on the amino acid compositions.
The dendrograms represent the results of the hierarchical clustering analysis of 144
bacteria based on the normalized amino acid compositions at the N-terminal region (a),
and on the amino acid compositions (b). The normalized amino acid compositions
represent the bias of each amino acid residue at each position across all amino acid
sequences of a bacterium. In the dendrograms, colored shapes represent the taxonomic
classes (purple squares: Alphaproteobacteria; light-blue squares: Betaproteobacteria;
blue squares: Gammaproteobacteria; green squares: Deltaproteobacteria; yellow squares:
Epsilonproteobacteria; gray circles: Actinobacteria; pink circles: Bacilli; orange circles:
Clostridia). The values in parentheses are the G+C contents of the DNA sequences of each
bacterium.
Supplementary Fig. 2
Means of the normalized amino acid compositions in each taxonomic class.
The means of the normalized amino acid compositions ($N_{bap}$) were calculated in each
taxonomic class. The means at the N-terminal region (2≤p≤41) are shown in 1-1 to 1-17
(1-1: Lys; 1-2: Asn; 1-3: Ile; 1-4: Arg; 1-5: Gln; 1-6: Met; 1-7: Asp; 1-8: Glu; 1-9: Ala;
1-10: Phe; 1-11: His; 1-12: Leu; 1-13: Val; 1-14: Tyr; 1-15: Gly; 1-16: Trp; 1-17: Cys).
The means at the C-terminal region (n−39≤p≤n) are shown in 2-1 to 2-18 (2-1: Lys; 2-2:
Asn; 2-3: Thr; 2-4: Ser; 2-5: Ile; 2-6: Arg; 2-7: Gln; 2-8: Met; 2-9: Asp; 2-10: Glu; 2-11:
Phe; 2-12: His; 2-13: Leu; 2-14: Val; 2-15: Tyr; 2-16: Gly; 2-17: Trp; 2-18: Cys). The
means of the normalized amino acid compositions in each taxonomic class are
represented by colored lines (purple: Alphaproteobacteria; light-blue:
Betaproteobacteria; blue: Gammaproteobacteria; green: Deltaproteobacteria;
yellow-green: Epsilonproteobacteria; gray: Actinobacteria; pink: Bacilli; orange:
Clostridia).
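The two position ranges used in this caption, 2≤p≤41 at the N-terminus and n−39≤p≤n at the C-terminus, each cover 40 residues. The following Python sketch shows one way such windows could be extracted and a raw per-position residue fraction computed. The function names are hypothetical, and the fraction shown is a plain frequency, not the normalized composition itself, whose normalization is not defined in this section.

```python
from collections import Counter

WINDOW = 40  # both regions span 40 positions: 2..41 and n-39..n


def n_terminal(seq):
    """Residues at 1-based positions 2..41 (position 1 is excluded)."""
    return seq[1:1 + WINDOW]


def c_terminal(seq):
    """Residues at 1-based positions n-39..n, i.e. the last 40 residues."""
    return seq[-WINDOW:]


def positional_fraction(seqs, residue, region):
    """Fraction of sequences carrying `residue` at each window position.

    Assumption: sequences shorter than 41 residues are skipped so that
    every window has exactly 40 positions.
    """
    windows = [region(s) for s in seqs if len(s) > WINDOW]
    if not windows:
        raise ValueError("no sequence is long enough for a 40-residue window")
    counts = Counter()
    for window in windows:
        for i, aa in enumerate(window):
            if aa == residue:
                counts[i] += 1
    return [counts[i] / len(windows) for i in range(WINDOW)]


# Toy sequences for illustration only; not real proteins.
toy = ["M" + "K" * 40 + "A" * 10, "M" + "A" * 40 + "K" * 10]
lys_n_term = positional_fraction(toy, "K", n_terminal)
```

Averaging such per-position values over the bacteria of one taxonomic class would give one line per class, analogous to the colored lines described above.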
Supplementary Fig. 3
$S_{apl}$ for each type of amino acid residue at the terminal regions.
The $S_{apl}$ scores for each type of amino acid residue at the N-terminal region (2≤p≤41)
are shown in 1-1 to 1-17 (1-1: Lys; 1-2: Asn; 1-3: Ile; 1-4: Arg; 1-5: Gln; 1-6: Met; 1-7:
Asp; 1-8: Glu; 1-9: Ala; 1-10: Phe; 1-11: His; 1-12: Leu; 1-13: Val; 1-14: Tyr; 1-15: Gly;
1-16: Trp; 1-17: Cys). The $S_{apl}$ scores for each type of amino acid residue at the
C-terminal region (n−39≤p≤n) are shown in 2-1 to 2-18 (2-1: Lys; 2-2: Asn; 2-3: Thr; 2-4:
Ser; 2-5: Ile; 2-6: Arg; 2-7: Gln; 2-8: Met; 2-9: Asp; 2-10: Glu; 2-11: Phe; 2-12: His;
2-13: Leu; 2-14: Val; 2-15: Tyr; 2-16: Gly; 2-17: Trp; 2-18: Cys). The scores, $S_{apl}$,
represent the correspondence between the result of a hierarchical clustering analysis and
the classification according to the taxonomic classes. The variables a, p, l, and n denote
the amino acid residue, the position from the terminus, the distance from position p, and
the length of the amino acid sequence, respectively.
Supplementary Table 1
Genomes of the 144 bacteria
The chromosomal sequences of 144 bacteria were obtained from a public database
available at the National Center for Biotechnology Information (NCBI). The taxonomic
classes of these 144 bacteria were Alphaproteobacteria, Betaproteobacteria,
Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Actinobacteria,
Bacilli, and Clostridia.
Supplementary Table 2
Prediction of the subcellular localization for amino acid
sequences of the 144 bacteria
Subcellular localization was predicted with PSORTb v.2.1.0 for all annotated amino acid
sequences deduced from the genomic DNA of the 144 bacteria. The percentages of amino acid
sequences assigned to each predicted localization were averaged within each taxonomic class.
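The per-class averaging described above can be sketched in Python as follows. This is a minimal illustration that assumes an unweighted mean over genomes. The function name, the record layout, and the example percentages are all hypothetical, not taken from the table.

```python
from collections import defaultdict


def class_means(records):
    """Average per-genome percentages within each taxonomic class.

    `records` is a list of (taxonomic_class, {localization: percentage})
    pairs, one per bacterium. A localization missing from a genome's
    dictionary contributes 0 to that class average (an assumption).
    """
    sums = defaultdict(lambda: defaultdict(float))
    sizes = defaultdict(int)
    for tax_class, percentages in records:
        sizes[tax_class] += 1
        for localization, pct in percentages.items():
            sums[tax_class][localization] += pct
    return {
        tax_class: {loc: total / sizes[tax_class] for loc, total in locs.items()}
        for tax_class, locs in sums.items()
    }


# Made-up percentages for illustration only; not data from the table.
example = [
    ("Bacilli", {"Cytoplasmic": 40.0, "Unknown": 60.0}),
    ("Bacilli", {"Cytoplasmic": 50.0, "Unknown": 50.0}),
    ("Clostridia", {"Cytoplasmic": 30.0, "Unknown": 70.0}),
]
means = class_means(example)
print(means)
```

Each genome counts equally here regardless of how many sequences it encodes. A mean weighted by sequence count would be a different design choice.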