pmic12228-sup-0005-text

SUPPLEMENTAL DATA

SUPPLEMENTARY TABLE LEGENDS

Table S1. PEAKS DBSEARCH_psm. Peptide sequences assigned to spectra after analysis by PEAKS or an NCBInr database search. Column A, sequence of the peptide; B, log10 of the p-value; C, mass of the spectrum; D, mass error in parts per million (ppm); E, m/z; F, retention time (RT); G, reference name of the assigned spectrum; H, accession number of the assigned protein; I, description of the protein function; J, organism in which the protein is found; K, standard modifications associated with sample preparation considered in the searches.

Table S2. PEAKS DBSEARCH_denovo. Spectra associated with peptides after de novo identification using PEAKS. These spectra had not been assigned to any protein in the initial NCBInr database search. Column A, spectrum identifier; B, amino acid sequence assigned to the spectrum; C, peptide length; D, average local confidence (ALC); E, m/z; F, retention time (RT); G, mass assigned to the peptide; H, ppm; I, standard modifications associated with sample preparation considered in the searches; J, confidence percentage assigned to each amino acid of the peptide sequence.

Table S3. PEAKSPTM_psm. Search to identify peptides with post-translational modifications (PTMs). Column A, sequence of the peptide; B, log10 of the p-value; C, mass of the spectrum; D, ppm; E, m/z; F, retention time (RT); G, reference name of the assigned spectrum; H, accession number of the assigned protein; I, description of the protein function; J, organism in which the protein is found; K, PTMs.

Table S4. PEAKSPTM_denovo. Search to identify PTMs in peptide sequences identified by a de novo search. Column A, spectrum identifier; B, amino acid sequence assigned to the spectrum; C, peptide length; D, ALC; E, m/z; F, retention time (RT); G, mass assigned to the peptide; H, ppm; I, PTMs; J, confidence percentage assigned to each amino acid of the peptide sequence.

Table S5. Experimental PTMs.
This table shows the acetylated proteins detected in the current study and corroborated by a previous study [1].

Table S6. Mutations found by analyzing RNA-seq data. Column A, genome position where the RNA mutation is found; B, M. pneumoniae gene identifier; C, DNA strand of the coding gene (“+” or “–”); D, WT/mutant ratio, i.e., the proportion of wild-type to mutated reads; E, quality value for the mutation assignment by SAMtools; F, wild-type amino acid; G, mutated amino acid; H, codon in the wild-type DNA sequence; I, codon in the mutated DNA sequence; J, “yes” indicates that the peptide was found by MS; K, a cross indicates that the assigned protein was found in MS experiments; L, protein name; M, protein function.

Table S7. SPIDER_psm. Search for mutations using SPIDER software. List of the peptides assigned to the different spectra. Column A, peptide sequence; B, log10 of the p-value; C, mass of the spectrum; D, ppm; E, m/z; F, retention time (RT); G, reference name of the assigned spectrum; H, accession number of the assigned protein; I, description of the protein function; J, organism in which the protein is found; K, PTMs (the term “mutation” indicates that the peptide corresponds to a mutated version of the protein as determined by SPIDER); L, whether the identified peptide is unique to a protein (“Y”, unique; “N”, not unique).

Table S8. SPIDER_de novo. Search for mutations in the peptides identified de novo using SPIDER. List of the peptides assigned to the different spectra. Column A, spectrum identifier; B, amino acid sequence assigned to the spectrum; C, peptide length; D, ALC; E, m/z; F, retention time (RT); G, mass assigned to the peptide; H, ppm; I, PTMs; J, confidence percentage assigned to each amino acid of the peptide sequence.

Table S9. COGs of the mutated proteins.
Functional enrichment in COG categories was determined using Fisher’s test. Column A, letter associated with the functional category; B, functional category; C, number of wild-type peptides associated with the functional category; D, number of mutated peptides associated with the functional category; E and F, odds ratios and p-values, respectively, calculated with Fisher’s test. Statistically significant cases are marked in grey.

Table S10. Codon bias. Enrichment of codons by type of mutation, as determined by Fisher’s test. Columns A and B, wild-type and mutated amino acids, respectively; C, mutated codon; D, number of occurrences of each codon for each mutation; E, number of times the codon appears in the genome; F and G, odds ratios and p-values, respectively, calculated with Fisher’s test. Statistically significant cases are marked in grey.

Table S11. Mutation rates in the M. pneumoniae proteome. Column A, M. pneumoniae gene ID; B, DNA strand of the coding gene (“+” or “–”); C, genome position where the mutation is found; D, codon that encodes the mutation; E, wild-type amino acid; F, mutated amino acid; G and H, sequences of the wild-type and mutated peptides, respectively; I, function of the protein encoded by the ORF; J, COG category associated with the corresponding ORF (for description, see Supporting Information Table S9); K and L, areas of the wild-type (WT) peptides in the two biological replicates (R1 and R2); M and N, areas of the aa variant (V) peptides in the two biological replicates (R1 and R2); O to R, deltaRT values for the WT and V peptides in the two biological replicates (R1 and R2); S, area for the WT peptide, considering only replicates with deltaRT < 0.5 (if both deltaRT values are < 0.5, the average of the areas of the two replicates is given); T, the same values for the V peptide; U, error rates, calculated by dividing the area of the aa variant peptide by the area of the WT peptide.
These error rates were calculated only for areas with deltaRT values < 0.5.

Table S12. BLAST and new proteins. (A) BLAST search results using the HomoConTrans19 database, which includes horse, human, and E. coli contaminants. The results were filtered to retain hits with identity higher than 87.5% and no gaps. (B) List of proteins that have not been previously identified in an M. pneumoniae genome annotation.

Table S13. BLAST Repetitions. BLAST alignment results after discarding the three repetitive amino acids at the N-termini of the peptides. Column A, the repetitive amino acid; B, the target protein identified in the BLAST search; C, the percentage of identity; D, the alignment length of the peptide sequence after elimination of the three repetitive residues; E, number of mismatches; F, number of gaps; G, start position of the alignment in the peptide sequence; H, end position of the alignment in the peptide sequence; I, start position of the alignment in the amino acid sequence of the target protein; J, end position of the alignment in the amino acid sequence of the target protein; K and L, the E-value and bit score, respectively, of the BLAST alignment; M, amino acid sequence used in the BLAST search after removing the repetitive amino acids.

SUPPLEMENTARY FIGURE LEGENDS

Figure S1. Percentages of spectra assigned in the different steps of the analysis. (A) Percentage of the total number of spectra assigned to an amino acid sequence in the different stages of the analysis. (B) Percentage of identified peptides that were assigned to proteins from the NCBInr database search (including human, horse, E. coli, and Mycoplasma). (C) Percentage of peptides assigned to M. pneumoniae proteins. Bar colors indicate the three software analyses used in the study: de novo (black bars), de novo plus PTMs (grey bars), and de novo/PTMs/mutations by SPIDER (red bars).

Figure S2.
(A) Correlation of coverage with protein amounts. The coverage of a protein (number of detected peptides divided by its molecular weight) depends on the total amount of the protein (area of the three best peptides). Dots represent all the proteins detected and quantified by MS. Protein coverage correlates with protein abundance (R = 0.6). (B) Percentage of WT and aa variant peptides versus coverage. As expected, the number of detected aa variant peptides is higher in proteins with a high coverage value.

Figure S3. (A) Representation of the domains in the Ala-tRNA synthetase. (B) Clustal alignment of the Ala-tRNA synthetases of E. coli, B. subtilis, and M. pneumoniae. The Ala-tRNA synthetase of M. pneumoniae has a degenerate C-Ala domain at its C-terminus.

Figure S4. Frequency of the amino acid repetitions at the N-termini of discarded peptides. Histogram of repetitive amino acid sequences at the N-termini of the peptides that were not assigned to a protein in a database search after SPIDER analysis.

[1] van Noort, V., et al., Cross-talk between phosphorylation and lysine acetylation in a genome-reduced bacterium. Mol Syst Biol, 2012. 8: p. 571.
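
The error-rate columns described in the legend of Table S11 (columns S to U) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of the absolute deltaRT value, and the handling of peptides for which no replicate passes the cutoff are assumptions made for illustration only.

```python
from statistics import mean
from typing import Optional, Sequence

# Cutoff stated in the Table S11 legend (deltaRT < 0.5).
DELTA_RT_CUTOFF = 0.5


def filtered_area(areas: Sequence[float],
                  delta_rts: Sequence[float],
                  cutoff: float = DELTA_RT_CUTOFF) -> Optional[float]:
    """Area of a peptide across replicates (columns S and T).

    Keeps only replicates whose deltaRT is below the cutoff and averages
    them; with a single passing replicate, that area is returned as is.
    Assumption: deltaRT is compared by absolute value.
    Assumption: None is returned when no replicate passes.
    """
    kept = [a for a, d in zip(areas, delta_rts) if abs(d) < cutoff]
    return mean(kept) if kept else None


def error_rate(wt_area: Optional[float],
               variant_area: Optional[float]) -> Optional[float]:
    """Error rate (column U): variant peptide area divided by WT area.

    Assumption: None is returned when either area is missing or the WT
    area is zero.
    """
    if wt_area is None or variant_area is None or wt_area == 0:
        return None
    return variant_area / wt_area


# Hypothetical values for one peptide pair in replicates R1 and R2.
wt = filtered_area([1.0e8, 1.2e8], [0.1, 0.3])   # both replicates pass
var = filtered_area([2.0e4, 5.0e4], [0.2, 0.9])  # only R1 passes
rate = error_rate(wt, var)
```

In this sketch the WT area is the mean of both replicates, whereas the variant area comes from R1 alone because its R2 deltaRT exceeds the cutoff.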
