pmic12228-sup-0005-text

SUPPLEMENTAL DATA
SUPPLEMENTARY TABLE LEGENDS
Table S1. PEAKS DBSEARCH_psm. Peptide sequences assigned to spectra after
analyses by PEAKS or an NCBInr database search. Column A, sequence of the
peptide; B, log10 of the p-value; C, mass of the spectrum; D, parts per million (ppm); E,
m/z; F, retention time (RT); G, reference name of the assigned spectrum; H, accession
number of the assigned protein; I, description of the protein function; J, organism in
which the protein is found; K, standard modifications associated with sample
preparation considered in the searches.
Table S2. PEAKS DBSEARCH_denovo. Spectra associated with peptides after de novo
identification using PEAKS. These spectra were not previously assigned to any protein
after the initial NCBInr database search. Column A, spectrum identifier; B, amino acid
sequence assigned to the spectrum; C, peptide length; D, average local confidence
(ALC); E, m/z; F, retention time (RT); G, mass assigned to the peptide; H, ppm; I,
standard modifications associated with sample preparation considered in the searches; J,
confidence percentage assigned to each amino acid of the peptide sequence.
Table S3. PEAKSPTM_psm. Search to identify peptides with PTMs. Column A,
sequence of the peptide; B, log10 of the p-value; C, mass of the spectrum; D, ppm; E,
m/z; F, retention time (RT); G, reference name of the assigned spectrum; H, accession
number of the assigned protein; I, description of the protein function; J, organism in
which the protein is found; K, PTMs.
Table S4. PEAKSPTM_denovo. Search to identify post-translational modifications of
peptide sequences identified by a de novo search. Column A, spectrum identifier; B,
amino acid sequence assigned to the spectrum; C, peptide length; D, ALC; E, m/z; F,
retention time (RT); G, mass assigned to the peptide; H, ppm; I, PTMs; J, confidence
percentage assigned to each amino acid of the peptide sequence.
Table S5. Experimental PTMs. This table shows the acetylated proteins detected in the
current study and corroborated by a previous study [1].
Table S6. Mutations found by analyzing RNA-seq data. Column A, genome position
where the RNA mutation is found; B, M. pneumoniae gene identifier; C, DNA strand of
the coding gene, “+” or “–”; D, WT/mutant ratio, the proportion of
wild-type to mutated reads; E, quality value for the mutation assignment by the
SAMtools software; F, wild-type amino acid; G, mutated amino acid; H, codon in the
wild-type DNA sequence; I, codon in the mutated DNA sequence; J, “yes” indicates that the
peptide has been found by MS; K, a cross indicates that the assigned protein was found
in MS experiments; L, protein name; M, protein function.
Table S7. SPIDER_psm. Search for mutations using SPIDER software. List of the
different peptides assigned to the different spectra. Column A, peptide sequence; B,
log10 of the p-value; C, mass of the spectrum; D, ppm; E, m/z; F, retention time (RT);
G, reference name of the assigned spectrum; H, accession number of the assigned
protein; I, description of the protein function; J, organism in which the protein is found;
K, PTMs, the nomenclature “mutation” is used to indicate that the peptide corresponds
to a mutated version of the protein determined by SPIDER; L, whether or not the
identified peptide is unique for a protein, whereby “Y” indicates that the peptide is
unique and “N” that it is not unique.
Table S8. SPIDER_de novo. Search for mutations in the peptides identified by de novo
sequencing, using SPIDER. List of the different peptides assigned to the different spectra. Column
A, spectrum identifier; B, amino acid sequence assigned to the spectrum; C, peptide
length; D, ALC; E, m/z; F, retention time (RT); G, mass assigned to the peptide; H,
ppm; I, PTMs; J, confidence percentage assigned to each amino acid of the peptide
sequence.
Table S9. COGs of the mutated proteins. Functional enrichment in COG categories
was determined using Fisher’s test. Column A, letter associated with the functional
category; B, functional category; C, number of wild-type peptides associated with the
functional category; D, number of mutated peptides associated with the functional
category; E and F, odds ratios and p-values, respectively, calculated with Fisher’s test.
Statistically significant cases are marked in grey.
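As an illustration of how an odds ratio and p-value of this kind can be obtained, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table. The arrangement of the table (peptides inside versus outside a category, mutated versus wild-type), the two-sided alternative, and the function name fisher_exact_2x2 are assumptions made for this example; the legend does not specify how the test was set up.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (not stated in the legend):
        a = mutated peptides in the category, b = wild-type peptides in the category,
        c = mutated peptides elsewhere,      d = wild-type peptides elsewhere.
    Returns (sample odds ratio, two-sided p-value).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)
```

For example, fisher_exact_2x2(3, 1, 1, 3) uses made-up counts, not values from Table S9, and returns an odds ratio of 9.0 with a p-value of 34/70.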
Table S10. Codon bias. Enrichment of codons depending on the type of mutation, as
determined by Fisher’s test. Columns A and B, wild-type and mutated amino acids,
respectively; C, mutated codon; D, number of occurrences of each codon for each mutation; E,
number of times the codon appears in the genome; F and G, odds ratios and p-values,
respectively, as calculated using Fisher’s test. The statistically significant cases are
marked in grey.
Table S11. Mutation rates in the M. pneumoniae proteome. Column A, M. pneumoniae
gene ID; B, DNA strand of the coding gene, “+” or “–”; C, genome position where the
mutation is found; D, codon that encodes the mutation; E, wild-type amino acid; F,
mutated amino acid; G and H, sequences of the wild-type and mutated peptides,
respectively; I, function of the protein encoded by the ORF; J, COG category associated
with the corresponding ORF (for description, see Supporting Information Table S9); K
and L, areas of the wild-type (WT) peptides in the two biological replicates (R1 and R2);
M and N, areas of the aa variant peptides (V) in the two biological replicates (R1 and R2),
respectively; O to R, deltaRT values for the WT and V peptides in the two biological
replicates (R1 and R2); S, area of the WT peptide, considering only the replicates for
which deltaRT < 0.5. If the deltaRT values of both replicates are < 0.5, the average of the
areas of the two replicates is given. Column T shows the same values for the V
peptide. Column U shows the error rates, calculated by dividing the area of the aa variant
peptide by the area of the WT peptide. These error rates were calculated only for areas
with deltaRT values < 0.5.
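The calculation described for columns S to U can be sketched as follows. This is an illustrative reading of the legend rather than the authors' script: the function names are hypothetical, the deltaRT comparison is assumed to apply to the absolute value, and missing measurements are assumed to be represented as None.

```python
from statistics import mean

DELTA_RT_CUTOFF = 0.5  # threshold quoted in the legend


def filtered_area(areas, delta_rts, cutoff=DELTA_RT_CUTOFF):
    """Area of one peptide across replicates (columns S or T).

    Keeps only replicates whose deltaRT is below the cutoff and averages
    them if more than one passes. Returns None if no replicate passes.
    Assumption: the cutoff is applied to the absolute deltaRT value.
    """
    kept = [
        area
        for area, delta in zip(areas, delta_rts)
        if area is not None and delta is not None and abs(delta) < cutoff
    ]
    return mean(kept) if kept else None


def error_rate(wt_areas, wt_delta_rts, v_areas, v_delta_rts):
    """Ratio of variant peptide area to WT peptide area (column U).

    Returns None when either area is unavailable after filtering.
    """
    wt = filtered_area(wt_areas, wt_delta_rts)
    variant = filtered_area(v_areas, v_delta_rts)
    if wt is None or variant is None or wt == 0:
        return None
    return variant / wt
```

With made-up inputs, error_rate([1000.0, 3000.0], [0.1, 0.2], [2.0, 6.0], [0.3, 0.7]) averages both WT replicates, keeps only the first variant replicate, and returns 2.0 / 2000.0.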
Table S12. BLAST and new proteins. (A) BLAST search results using the
HomoConTrans19 database, which includes horse, human, and E. coli contaminants.
The results were filtered to retain hits with an identity higher than 87.5% and no gaps. (B) List
of proteins that have not been previously identified in an M. pneumoniae genome
annotation.
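A filter of the kind described in (A) could look like the sketch below. It assumes the BLAST results are available in the standard 12-column tabular output (BLAST+ "-outfmt 6"); the legend does not state the format actually used, and the function name filter_hits is hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, min_identity=87.5):
    """Keep hits with identity strictly above min_identity and no gap openings."""
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t"
    )
    kept = []
    for row in reader:
        if float(row["pident"]) > min_identity and int(row["gapopen"]) == 0:
            kept.append(row)
    return kept
```

Because the legend says "higher than 87.5%", the comparison is strict, so a hit with exactly 87.5% identity is discarded.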
Table S13. BLAST Repetitions. BLAST alignment results after discarding the
repetitive three amino acids at the N-termini of the peptides. Column A, the repetitive
amino acid; B, the target protein identified in the BLAST search; C, the percentage of
identity; D, the alignment length of the peptide sequence after elimination of the three
repetitive residues; E, number of mismatches; F, number of gaps; G, start position of the
alignment in the peptide sequence; H, end position of the alignment in the peptide
sequence; I, start position of the alignment in the amino acid sequence of
the target protein; J, end position of the alignment in the amino acid
sequence of the target protein; K and L, the E-value
and bit score, respectively, of the BLAST alignment; M, amino acid sequence used in
the BLAST search after removing the repetitive amino acids.
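The trimming step that yields columns A and M can be illustrated with the short sketch below. It assumes that a peptide qualifies when its first three residues are identical and that exactly those three residues are removed; the function name and the example peptide are invented for illustration.

```python
def strip_n_terminal_repeat(peptide, repeat_len=3):
    """Split a peptide into its repeated N-terminal residue and the remainder.

    Returns (residue, trimmed_sequence) when the first repeat_len residues
    are identical, otherwise (None, peptide) unchanged.
    """
    head = peptide[:repeat_len]
    if len(head) == repeat_len and len(set(head)) == 1:
        return head[0], peptide[repeat_len:]
    return None, peptide
```

For instance, strip_n_terminal_repeat("GGGLSKVDE") returns ("G", "LSKVDE"), where "G" would correspond to column A and "LSKVDE" to the sequence submitted to BLAST in column M.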
SUPPLEMENTARY FIGURE LEGENDS
Figure S1. Percentages of spectra assigned in the different steps of the analysis. (A)
Percentage of the total number of spectra assigned to an amino acid sequence in the
different stages of the analysis. (B) Percentage of identified peptides that were assigned
to proteins from the NCBInr database search (including human, horse, E. coli, and
Mycoplasma). (C) Percentage of peptides assigned to M. pneumoniae proteins.
Different bar colors indicate the three different software analyses used in the study: de
novo (black bars), de novo plus PTMs (grey bars) and de novo/PTMs/mutations by
SPIDER (red bars).
Figure S2. (A) Correlation of coverage with protein amounts. The coverage of a protein
(number of detected peptides divided by its molecular weight) depends on the total
amount of protein (area of the three best peptides). Dots represent all the proteins
detected and quantified by MS. Protein coverage correlates with protein
abundance (R = 0.6). (B) Percentage of WT and aa variant peptides
versus coverage. As expected, the number of detected aa variant peptides is higher in
proteins with a high coverage value.
Figure S3. (A) Representation of the domains in the Ala-tRNA synthetase. (B) Clustal
alignment of the Ala-tRNA synthetases of E. coli, B. subtilis and M. pneumoniae. The Ala-tRNA
synthetase of M. pneumoniae has a degenerate C-Ala domain at its C-terminus.
Figure S4. Frequency of the amino acid repetitions at the N-termini of discarded
peptides. Histogram of repetitive amino acid sequences at the N-termini of the peptides
that were not assigned to a protein in a database search after SPIDER analysis.
1. van Noort, V., et al., Cross-talk between phosphorylation and lysine acetylation in a
genome-reduced bacterium. Mol Syst Biol, 2012. 8: p. 571.