Supplement: Data-independent Microbial Metabolomics with Ambient Ionization Mass Spectrometry

Christopher M. Rath,1 Jane Y. Yang,2 Theodore Alexandrov,1,3 Pieter C. Dorrestein,1,2*

1 Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, CA, United States of America
2 Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA, United States of America
3 Center for Industrial Mathematics, University of Bremen, Bremen, Germany

E-mail: pdorrestein@ucsd.edu

Discussion

FTICR-MS results from ISP2 agar sampled over 4 days reveal spectra with near-identical peak distributions and intensities (Figure S1A), whereas some differences are observable at the MS/MS level (Figure S1B). 85% of MS/MS nodes represented the “ideal case” with contributions from all four days (Figure S1C). That is, for 85% of nodes the replicate MS/MS spectra from all four days were scored as identical by our data processing pipeline. Of the remaining 15%, 8% were present on 3/4 days, 4% on 2/4 days, and 5% on 1 day only. However, when the 56 nodes representing this 15% are examined in more detail, a slightly more complicated picture emerges (Figure S1E). This examination was done by deleting the “ideal case” nodes in Cytoscape and then rearranging the remaining nodes. In 16/56 cases a “split-node” was observed, where the four spectra at the same precursor were split over two groups but still connected by an edge based upon spectral similarity. In 7/56 cases an “unmatched-pair” was observed, where split nodes at the same precursor mass were not linked by the MS/MS networking algorithm. Finally, in the remaining 33 cases, singletons were observed, in which spectra were removed from analysis prior to processing (due to the spectral quality filters that were employed), meaning that a full set of four spectra at a given precursor isolation window is not present in the data.
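The node categories described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the data layout (a dictionary mapping each node to a precursor bin and the set of days contributing spectra, plus a list of similarity edges) and the function names `tally_by_day_count` and `classify_non_ideal` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations

ALL_DAYS = frozenset({1, 2, 3, 4})


def tally_by_day_count(nodes):
    """Count nodes by how many of the four days contributed a spectrum."""
    return Counter(len(days) for _, days in nodes.values())


def classify_non_ideal(nodes, edges):
    """Label each precursor whose spectra did not all fall into one node.

    nodes: dict of node_id -> (precursor_bin, frozenset of days)  [hypothetical layout]
    edges: iterable of (node_id, node_id) pairs linked by spectral similarity
    """
    linked = {frozenset(edge) for edge in edges}

    # Drop the "ideal case" nodes, then group what remains by precursor.
    by_precursor = defaultdict(list)
    for node_id, (precursor, days) in nodes.items():
        if days != ALL_DAYS:
            by_precursor[precursor].append(node_id)

    labels = {}
    for precursor, ids in by_precursor.items():
        if len(ids) == 1:
            # Only one node remains: some spectra are missing from the data.
            labels[precursor] = "singleton"
        elif any(frozenset(pair) in linked for pair in combinations(ids, 2)):
            # Spectra split over several nodes that are still connected.
            labels[precursor] = "split-node"
        else:
            # Several nodes at the same precursor with no connecting edge.
            labels[precursor] = "unmatched-pair"
    return labels


# Toy data, invented for illustration (not taken from the experiment).
nodes = {
    "a": (301, frozenset({1, 2, 3, 4})),  # ideal case
    "b": (415, frozenset({1, 2, 4})),
    "c": (415, frozenset({3})),
    "d": (522, frozenset({1, 3})),
    "e": (522, frozenset({4})),
    "f": (640, frozenset({1, 2, 3})),
}
edges = [("b", "c")]

print(dict(tally_by_day_count(nodes)))
print(classify_non_ideal(nodes, edges))
```

In this toy input, precursor 415 is a split-node (two linked nodes), 522 is an unmatched pair (two unlinked nodes), and 640 is a singleton (one node, one day missing).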
Four representative m/z values (I-IV), one for each of the different possible outcomes in our experiment, were selected and examined in greater detail (Figure S2). In the ideal case (I), corresponding to the 85% of nodes in this experiment scored as such by our spectral networking, 4/4 spectra are present in a single node. This result seems valid upon manual inspection, where the spectra appear identical based on the presence/absence of peaks and their intensities.

For split nodes (II), very similar MS and MS/MS spectra are present; however, days 1, 2, and 4 are present in one node, whereas day 3 is present in a separate but linked node. This behavior, observed in 6% of the data, often occurs when only a single major peak is present in an MS/MS spectrum, since our computational algorithm looks for shifts in multiple peaks. Thus, different small or variable noise peaks can easily be scored as nonidentical, resulting in a similar (all four spectra are split over linked nodes) but non-identical assignment (all four spectra are not present in a single node). This problem could be addressed by computational strategies such as a noise windowing function, which would include only the N most intense peaks that are greater than the noise, thus limiting the effect of small noisy peaks. Fundamentally, however, MS/MS data interpretation approaches will always be problematic if little fragmentation is observed. Experimental approaches to addressing this could include taking advantage of the complementary spectra produced by nonergodic fragmentation techniques such as ETD for multiply charged ions [1], or UVPD [2], or EID for singly charged ions [3].

In the unmatched pair case (III), it can be seen that the signal intensity by MS is low, with only noise present.
While interpretable MS/MS spectra are present, they are quite noisy, presenting a possible explanation for the exclusion of the day 2 spectrum and the non-matching behavior of the day 1, 3 node versus the day 4 node. An alternative explanation is that we limited the edges displayed in Cytoscape to the top 10 matches. Thus, if a node has more than 10 matches, only the top 10 are displayed. Given that abundant fragmentation was detected, it seems possible that empirical optimization of our algorithms may improve results in low signal-to-noise examples or capture a larger number of matched spectra. Finally, in the singleton case (IV), 3/4 nodes cluster together whereas one is excluded from analysis, likely due to noise or to exceeding the number of matches we set.

A time course of Bacillus subtilis 3610 was examined by nanoDESI data-independent MS/MS to evaluate changes in metabolic output (Figure S4). Although nucleic acid based approaches or abundant protein based fingerprinting techniques can be very powerful, detecting subtle changes such as colony age could be problematic. Can these differences be captured by nanoDESI with data-independent MS/MS? B. subtilis 3610 exhibits observable changes in metabolic profiles over a 4-day time course. For example, based upon FTICR-MS analysis at the surface of the colony, surfactin signals (5) decrease over time, SKF (6) increases over time, and plipastatin (7) remains constant and low (Figure S4A), in agreement with previous observations [4]. In MS/MS mode, metabolites (5-7) can be consistently identified over time, with increasing or decreasing signal correlating with that observed by FTICR-MS (Figure S4B). We chose to further evaluate changes in plipastatin A (7) over time to illustrate the ability of nanoDESI with data-independent MS/MS to elucidate fine changes in molecular output.

Supplemental Figures

Figure S1. Data-independent nanoDESI of agar Petri dishes over four days by MS, MS/MS, and spectral networking.
Day 1 is red, day 2 is orange, day 3 is green, day 4 is blue, and overlap is in black. (I-IV) represent m/z values across diverse views of the dataset that are examined further (Figure S2). A. nanoDESI FTICR-MS from agar Petri dishes over four days. Relative intensity is displayed on the Y-axis from 0-100%, and the X-axis is m/z 100-2,000. B. nanoDESI data-independent MS/MS of agar Petri dishes over four days. MS/MS absolute intensity is displayed as the color intensity of a peak, MS/MS spectra from m/z 100-2,000 are displayed on the Y-axis, and the precursor isolation window is displayed on the X-axis for m/z 100-2,000. C. Bar graph comparing molecular networking results across four days with numerical percentages. The ideal case where all four nodes cluster together is represented as black circles; nodes representing spectra from a single day are represented as red, orange, green, and blue diamonds. Nodes consisting of spectra from two days are represented as cyan squares. Nodes consisting of spectra from 3/4 days are represented as purple triangles. D. Molecular networking of agar repeatability. This network represents the interconnectivity of spectral/chemical space in the dataset. E. Filtered spectral networking of the MS/MS dataset. All “ideal nodes” representing matches across all four replicates have been removed manually from the network in Cytoscape. Nodes can be linked, indicating identified spectral similarity; matched but unlinked, indicating presence of both nodes in the dataset but insufficient algorithmic similarity; or unpaired, indicating that one node was removed from the dataset during processing.

Figure S2. Data-independent nanoDESI of agar over four days by MS and MS/MS: Spectral networking case studies. Day one is in red, day two in orange, day three in green, and day four in blue. A. MS isolation windows. Data are presented as intensity versus m/z.
The isolation window for (I-IV) is illustrated from the FTICR-MS detector for days 1-4 with the percentage of nodes in each class. Spectra with matching grey backgrounds clustered together. Spectra not highlighted (white background) were excluded from analysis and are not represented in MS/MS networks. B. MS/MS spectra. Data are presented as intensity versus MS/MS m/z. Spectra with matching grey backgrounds clustered together. Spectra not highlighted (white background) were excluded from analysis and are not represented in MS/MS networks.

Figure S3. Data-independent nanoDESI MS/MS allows for differentiation of bacterial strains. Pseudomonas aeruginosa PA01 is orange, Pseudomonas aeruginosa PA14 is blue, PA01-PA14 overlap is in cyan, Bacillus subtilis 3610 is red, Bacillus subtilis PY79 is green, 3610-PY79 overlap is in magenta, and interclass overlap is in black. A. NanoDESI FTICR-MS of P. aeruginosa PA01 and PA14. Absolute intensity is displayed on the Y-axis, and the X-axis is for m/z 100-2,000. B. NanoDESI FTICR-MS of B. subtilis 3610 and PY79. Absolute intensity is displayed on the Y-axis, and the X-axis is for m/z 100-2,000. C. NanoDESI data-independent MS/MS of P. aeruginosa PA01 and PA14. MS/MS absolute intensity is displayed as color. MS/MS spectra from m/z 50-2,000 are displayed on the Y-axis, and the MS isolation window is displayed on the X-axis for m/z 100-2,000. D. NanoDESI data-independent MS/MS of B. subtilis 3610 and PY79. MS/MS absolute intensity is displayed as color. MS/MS spectra from m/z 50-2,000 are displayed on the Y-axis, and the MS isolation window is displayed on the X-axis for m/z 100-2,000. E. Spectral networking of MS/MS spectra of all four datasets considered jointly, organized by MS/MS spectral similarity. Compounds (2-6) are noted. Nodes present in the agar control or co-clustered between agar and microbial samples were removed from the network.

Figure S4. Data-independent nanoDESI MS/MS time-course.
Bacillus subtilis 3610 is displayed with day 1 in red, day 2 is orange, day 2 in yellow, day 3 in green, day 4 in blue. A. NanoDESI FTICR-MS spectra of B. subtilis 3610 days 1-4. Absolute intensity is displayed on the Y-axis, and the X-axis is for m/z 100-2,000. B. NanoDESI data-independent MS/MS of B. subtilis 3610 days 1-4, with overlap in black. MS/MS absolute intensity is displayed as color saturation, MS/MS spectra from m/z 50-2,000 are displayed on the Y-axis, and MS isolation windows are displayed on the X-axis for m/z 100-2,000.

Figure S5. Imaging mass spectrometry time course of Bacillus subtilis 3610. Day 1 is in red, day 2 is in orange, day 3 is in green, and day 4 is in blue. Ion spatial distributions for given m/z values are presented at 450 µm spatial resolution. The approximate location of nanoDESI sampling is illustrated by an X. Images were generated with TIC spectra normalization, and brightness was set to automatic. Samples were run on four separate days.

References

1. Syka, J. E. P., Coon, J. J., Schroeder, M. J., Shabanowitz, J., Hunt, D. F.: Peptide and Protein Sequence Analysis by Electron Transfer Dissociation Mass Spectrometry. Proc Natl Acad Sci USA 101, 9528-9533 (2004).
2. Ly, T., Julian, R. R.: Ultraviolet Photodissociation: Developments Towards Applications for Mass-Spectrometry-Based Proteomics. Angew Chem Int Ed 48, 7130-7137 (2009).
3. Yoo, H. J., Liu, H., Hakansson, K.: Infrared Multiphoton Dissociation and Electron-Induced Dissociation as Alternative MS/MS Strategies for Metabolite Identification. Anal Chem 79, 7858-7866 (2007).
4. Yang, Y.-L., Xu, Y., Straight, P., Dorrestein, P. C.: Translating Metabolic Exchange with Imaging Mass Spectrometry. Nat Chem Biol 5, 885-887 (2009).