SIMPLIFIED TWO-DIMENSIONAL CAPILLARY ELECTROPHORESIS-MASS SPECTROMETRY MAPPING: ANALYSIS OF PROTEOLYTIC DIGESTS

Guillaume L. Erny and Alejandro Cifuentes*

Institute of Industrial Fermentations (CSIC), Juan de la Cierva 3, 28006 Madrid, Spain

*Corresponding author: Dr. Alejandro Cifuentes, Fax#: 34-91-5644853, e-mail: acifuentes@ifi.csic.es

Abbreviations: EIE, extracted ions electropherogram; TIE, total ions electropherogram; CYC-B, cytochrome c from bovine; CYC-R, cytochrome c from rabbit; CYC-H, cytochrome c from horse.

Keywords: 2D, mapping, CE-MS, proteomics, peptides, proteins, chemometric.

Abstract

Capillary electrophoresis-mass spectrometry (CE-MS) has proven to be a very useful hyphenated technique for proteomic studies. However, the huge amount of data stored in a single CE-MS run makes it necessary to have procedures able to extract all the relevant information made available by CE-MS. In this work, we present a new and easy approach that generates a simplified two-dimensional map from CE-MS raw data. This new approach provides the automatic detection and characterization of the most abundant ions from the CE-MS data, including their m/z values, ion intensities and analysis times. It is demonstrated that visualization of CE-MS data in this simplified 2D format allows (i) an easy and simultaneous visual inspection of large datasets, (ii) an immediate perception of relevant differences in closely related samples, (iii) a rapid monitoring of data quality levels in different samples and (iv) a fast discrimination between comigrating polypeptides and ESI-MS fragmentation ions. The strategy proposed in this work does not rely on excellent mass accuracy for peak detection and filtering, since MS values obtained from an ion trap analyzer are used. Moreover, the methodology developed works directly with the CE-MS raw data, without intervention by the user, giving simultaneously a simplified 2D map and a much easier and more complete data evaluation.
In addition, this procedure can easily be implemented in any CE-MS laboratory. The usefulness of this approach is validated by studying the very similar trypsin digests of bovine, rabbit and horse cytochrome c. It is demonstrated that this simplified 2D approach allows specific markers for each species to be obtained quickly and simply.

1. Introduction

1.1 General aspects

Proteins are fundamental components of all living beings and include many substances, such as enzymes, hormones and antibodies, necessary for the proper functioning of any organism. In recent years, the separation and identification of proteins has become of great importance, driven by the development of the proteomics field and the search for a better understanding of some biological functions. However, it is well known that obtaining complete knowledge of the protein content of any organism at a given moment is an extremely complex task, since organisms usually contain thousands of proteins of very different concentration, size, hydrophobicity and charge. Moreover, enzymatic digestion of these proteins is usually required, which enormously increases the number of compounds to analyze and illustrates the difficulty of this task [1]. Some recent developments have focused on the separation of proteins without aiming to identify all of them, intending instead to provide protein profiles or fingerprints under specific conditions. This fingerprinting (usually displayed in the form of a 2D map using color coding for intensity) can be used to establish proteomic patterns for diagnostic purposes or to easily obtain a biomarker specific for a particular disease. Moreover, these profiling techniques can be useful not only for clinical applications but also for food analysis, including, e.g., food adulteration and the detection of genetically modified organisms [2]. Alternatively, proteolysis patterns are sometimes favored, as the peptide fragments are more soluble, more stable, and usually easier to separate [3].
However, as each protein gives rise to numerous fragments, the pattern complexity is significantly increased in this case.

Protein or peptide maps are usually obtained using 2D separation techniques, 2D-PAGE being the most common procedure. Other 2D techniques have also been used, such as HPLC/HPLC, HPLC/CZE and CZE/MEKC [4-6], as well as hyphenated techniques with MS as the second dimension (e.g., CE-MS or HPLC-MS) [7,8]. The great advantage of MS is that it allows identification of a given compound based on its relative molecular mass (Mr). In this latter case, visualization of the MS data in the format of a map (fraction number or retention time as y-axis and m/z as x-axis, with color coding for signal intensity) seems to be suitable [9]. However, one of the main problems when using MS as the second dimension is the increased complexity due to fragmentation of parent ions. Although the use of soft ionization procedures such as electrospray for HPLC-MS [10] and CE-MS [11] significantly reduces the internal energy during ionization, and thus limits the fragmentation process, the ionization will never be "soft" enough to ensure that a particular detected ion does not result from the fragmentation of a parent ion.

Apart from the above-mentioned limitations, the huge amount of data in different formats produced by proteomic techniques has created an urgent need for procedures able to extract the relevant information from the MS spectra [12]. Specific tools have already been developed to display m/z ratios in conjunction with data from a separation step [13-15]. These procedures can be used not only for MS mapping, but also for visual analysis and comparison between various datasets through adequate normalization. The increasing activity in this field underlines the need for flexible data visualization tools that can easily be applied to a wide variety of experimental setups. Evidently, all these technologies are still burdened with certain limitations.
The most severe limitation, however, might not be the technical aspect of MS and/or separation (data accumulation) but rather the subsequent data evaluation.

The aim of this paper is to demonstrate the possibilities of a new and easy approach, developed in our laboratory, that provides a simplified 2D mapping of CE-MS data. The approach is based on the automatic detection and characterization of the main peaks and m/z values from the raw data. The simplified 2D map is obtained by performing a classical peak analysis for every m/z containing important information. To our knowledge, this is the first time that such an approach has been proposed. The strategy proposed in this work does not rely on excellent mass accuracy (like that provided by more expensive MS analyzers, e.g., TOF-MS or FT-ICR-MS) as an attribute for peak detection and filtering, since data obtained from an ion trap analyzer are used. Moreover, the methodology works directly with the CE-MS raw data, without intervention by the user, providing simultaneously a simplified 2D map and a much easier and more complete data evaluation. The usefulness of this approach is validated by analyzing the trypsin digests of cytochrome c from three different species, namely, bovine (CYC-B), rabbit (CYC-R) and horse (CYC-H). It is shown that this simplified 2D procedure permits the detection of specific markers for each species even from very similar proteolytic digests.

1.2 Theoretical section

The chemometric tool developed in this work carries out the following three steps automatically (see Figure 1). First, raw data from a given CE-MS run are automatically converted into a two-dimensional matrix, namely, time × m/z (step 1 in Figure 1). In the second step of Figure 1, the m/z values containing useful information are detected (vide infra), and a series of extracted ions electropherograms (EIEs) are reconstructed based on those principal m/z ions.
In the last step, each individual EIE is automatically analyzed to obtain the main m/z values together with their mass uncertainty, peak area and analysis time (see table in Figure 1, step 3). These three steps provide an automatic and drastic reduction in the data size, which makes the subsequent simplified 2D representation easier and allows a better study and visualization of the CE-MS results.

In order to automatically detect the m/z values of interest, the standard deviation (SD) of the ionic intensity values obtained for each m/z is calculated along the time scale. Logically, the most interesting m/z values will have the highest SD as a result of the large variation in ionic intensity observed over time. Therefore, m/z values of interest are selected as those with an SD higher than a certain threshold. EIEs are then reconstructed by summing, for each time, the ionic intensities obtained inside an m/z interval centered at the detected main m/z value and with a width of twice the mass uncertainty (0.5 m/z in our case). However, when two detected m/z ions are close to each other (less than 0.4 m/z apart), they are processed in a single EIE (see Figure 1, step 2). The resulting data are a series of arrays, each of them representing an EIE and indexed by the average m/z and mass uncertainty.

Peaks are then detected in each EIE as a succession of data points whose signal is 10 times higher than the average noise, calculated using the first 20 points of every EIE. For each detected peak, two electrophoretic parameters were measured, the peak area, A, and the peak migration time, tm, by

$$A = \Delta t \sum_i I_i \qquad (1)$$

and

$$t_m = \frac{1}{A} \sum_i t_i \, I_i \, \Delta t \qquad (2)$$

where the summation over i runs over every point that defines a particular peak for a given m/z, Ii and ti being the intensity and time of point i, respectively, and Δt the time interval between consecutive points. As can be seen, the migration time has been calculated using the first statistical moment [16] and will differ slightly from the peak maximum for asymmetrical peaks.
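The selection, reconstruction and peak-measurement steps described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the routines used in this work: the function names and the synthetic test matrix are hypothetical, the EIE window is read as the central m/z ± 0.5, the "average noise" is taken as the mean absolute intensity of the first 20 points, the SD threshold is expressed as a fraction of the largest SD, and the sampling interval is assumed constant. Merging of neighbouring m/z values (less than 0.4 m/z apart) and baseline correction are omitted for brevity.

```python
import numpy as np


def select_mz_channels(matrix, rel_threshold=0.01):
    """Indices of m/z channels whose SD along time exceeds a fraction of the largest SD.

    matrix has shape (n_times, n_mz); rel_threshold is an assumed tuning parameter.
    """
    sd = matrix.std(axis=0)
    return np.flatnonzero(sd > rel_threshold * sd.max())


def build_eie(matrix, mz_axis, center, half_width=0.5):
    """Sum, at each time, the intensities of all channels within center +/- half_width."""
    mask = np.abs(mz_axis - center) <= half_width
    return matrix[:, mask].sum(axis=1)


def measure_peaks(time, eie, n_noise=20, factor=10.0):
    """Return (area, migration_time) for each run of points above factor * noise."""
    noise = np.abs(eie[:n_noise]).mean()  # assumed definition of "average noise"
    above = eie > factor * noise
    # Locate the start (inclusive) and stop (exclusive) index of each run of True values.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    dt = time[1] - time[0]  # assumes a constant sampling interval
    peaks = []
    for start, stop in zip(edges[::2], edges[1::2]):
        t, i = time[start:stop], eie[start:stop]
        area = dt * i.sum()               # equation 1
        t_m = dt * (t * i).sum() / area   # equation 2 (first statistical moment)
        peaks.append((area, t_m))
    return peaks


# Synthetic example: flat background plus one Gaussian peak at 20 min in a single channel.
time = np.linspace(10.0, 30.0, 401)
mz_axis = 200.0 + 0.1 * np.arange(100)
matrix = np.ones((time.size, mz_axis.size))
matrix[:, 50] += 1000.0 * np.exp(-0.5 * ((time - 20.0) / 0.2) ** 2)

selected = select_mz_channels(matrix)
eie = build_eie(matrix, mz_axis, mz_axis[selected[0]])
peaks = measure_peaks(time, eie)
print(selected, peaks)
```

In this sketch each detected peak is reduced to an area and a first-moment migration time, which, together with the m/z of its EIE, is the kind of record collected in the table of Figure 1, step 3.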
However, the first-moment migration time allows a higher precision than the peak maximum, whose precision can be limited by the sampling rate [17]. For each detected peak, the area and migration time, as well as the m/z and its uncertainty, are recorded in a table as indicated in step 3 of Figure 1. These data are then used to obtain the simplified 2D CE-MS representation.

2. Experimental section

2.1 Chemicals

Ammonia (30%) was from Panreac (Barcelona, Spain), methanol (HPLC grade) from Scharlau (Barcelona, Spain) and formic acid from Merck (Darmstadt, Germany). Trypsin and cytochrome c from bovine heart (CYC-B), horse heart (CYC-H) and rabbit heart (CYC-R) were from Sigma (St. Louis, MO, USA). Water was deionized with a Milli-Q system (Millipore, Bedford, MA, USA).

2.2 Protein hydrolysis

Cytochrome c from each species was dissolved at a concentration of 2 mg/ml in a buffer solution containing 200 mM sodium acetate, 20 mM Tris and 0.2 mM calcium chloride. Trypsin was dissolved in water at a concentration of 2 mg/ml. CYC and trypsin were mixed at a ratio of 10 to 1, and the digestion was allowed to proceed for 16 h at 37°C. The enzymatic digestion was stopped by increasing the temperature to 80°C for 10 min. Protein digests were stored at -4°C.

2.3 Capillary Electrophoresis-Electrospray-Mass Spectrometry (CE-ESI-MS)

CE-ESI-MS analyses were carried out in a PACE/5500 CE instrument (Beckman, Fullerton, CA, USA) coupled to a Bruker Daltonic Esquire 2000 ion-trap mass spectrometer (Bruker Daltonik, Bremen, Germany) using a commercial coaxial sheath-flow interface. The separation method was adapted from Simo et al. [18,19]. Briefly, the MS was operated in the positive ion mode and scanned from m/z 200 to 1100 at 13000 u/s. ESI parameters were: nebulizer pressure, 27579 Pa; dry gas flow, 8 L/h; dry gas temperature, 120°C; and a sheath liquid made of methanol-water (50/50) at a flow rate of 4 μL min⁻¹.
Separation was performed in a 90 cm long capillary (50 μm i.d., from Composite Metal Services, Worcester, England) using a buffer made of 0.9 M formic acid adjusted to pH 2 with ammonium hydroxide. Between runs the capillary was rinsed for 3 min with water and 1 min with buffer. CYC hydrolysates were injected without any dilution or purification step for 20 s at 3447 Pa.

2.4 Data analysis and programming

Different computer tools were used in this work. The software integrated with the instrument (DataAnalysis version 3.0, Bruker Daltonic, Bremen, Germany) was used to obtain the extracted ion electropherograms (EIEs) as well as to convert the raw data into ASCII format. Visual Basic (Visual Basic 6.0, Microsoft) was used to program the different filtering routines and the computation of the electrophoretic figures of merit. Results were recorded in an Excel spreadsheet (Excel 2000, Microsoft) for further analysis.

3. Results and discussion

As mentioned above, the usefulness of this new approach was validated by comparing the 2D maps obtained after trypsin digestion of cytochrome c from three species, namely, bovine (CYC-B), horse (CYC-H) and rabbit (CYC-R). An additional aim was to find a CE-MS marker for each species, which could be used in quality control to detect, e.g., adulteration of minced meat [20-22]. Logically, this approach to 2D CE-MS mapping can be useful in many other applications, including the search for biomarkers, the identification of therapeutic polypeptide targets and the establishment of patterns for diagnostic purposes [9].

The total-ion electropherograms (TIEs) obtained by CE-MS of the three cytochromes digested with trypsin are shown in Figure 2. As can be seen, few differences can be detected directly from these CE-MS electropherograms. Indeed, although peak 1 could be used as a marker for CYC-R, no unique feature can be observed for CYC-B and CYC-H.
For example, peak 2 is not present in CYC-R but is present in CYC-B and CYC-H; similarly, peak 3 is not present in CYC-H but is present in CYC-B and CYC-R, and peak 4 is not present in CYC-R but is present in CYC-H and CYC-B. The same applies to the group of peaks labeled as 5 in Figure 2. In order to obtain more information (including specific markers for each species), the classical procedure would be to analyze the full MS spectra for every peak and to compare these results among the different species. However, this procedure is labor intensive and time consuming. Alternatively, straight 2D maps of the samples could be compared. An example of such a representation for the hydrolysis of CYC-B is shown in Figure 3. In our case, this 2D map was obtained from the original two-dimensional matrix (step 1 in Figure 1), which was pasted into an Excel spreadsheet. For size and speed considerations, the m/z values were compressed by a factor of 40. As can be seen, much more information is obtained in this case. However, as is evident from the wealth of data, it was impossible to evaluate the raw data using commercially available software. For example, with our MS set-up (mass scan from m/z 200 to 1100), a 2D matrix such as the one shown in Figure 3 will easily represent more than 10 Mbyte. More importantly, such a representation can provide an overload of information that can hide important differences [9].

Therefore, the usefulness of the new approach described in the Theoretical section for achieving a simplified 2D CE-MS mapping was tested. The original TIE of a given trypsin digest analyzed by CE-MS is shown in Figure 4A, and the corresponding graph of the measured standard deviation (SD) for each m/z is shown in Figure 4B. As can be observed, the m/z values with high SD values agree with the most intense spots shown in Figure 3. For example, it can be seen in Figure 4B that the EIEs corresponding to m/z = 584.9 and 589.2 will contain important information (highest SD in Figure 4B).
Those two m/z values correspond to two of the most intense spots in Figure 3. Moreover, some ions that contribute to a large extent to the noise (e.g., m/z = 282.2 in Figure 3) do not give a high SD. The highest standard deviations are found between m/z 500 and m/z 700, in good agreement with the results of Figure 3. The inset in Figure 4B corresponds to a zoom of the m/z values (x-axis) between 500 and 510, and standard deviations (y-axis) between 0 and 5000. As can be seen, each peak is extremely sharp, with a peak width at half height well below 0.5 m/z. This allows relevant extracted ion electropherograms to be obtained automatically with a suitable mass accuracy. Indeed, taking in Figure 4B a threshold of 2000, corresponding to 1% of the maximum SD, 735 different EIEs were automatically generated out of the 9000 possible m/z values. Figure 4C shows the TIE obtained by summation of the 735 selected EIEs. As can be seen, no difference can be visually observed between the electropherograms of Figures 4A and 4C, which is further corroborated by the electropherogram of Figure 4D, which shows the difference between Figures 4A and 4C. Namely, Figure 4D shows that the residual between the two never represents more than 10% of the full peak, and is usually below 5%. Figure 4E shows the total ion electropherogram after step 3 of our approach (see Figure 1). To obtain this figure, all points that were not detected as part of a peak were set to zero in the original matrix, and points that were part of a peak were baseline corrected. As can be seen, the elimination of the data that do not add information provides a significant increase in sensitivity, as also demonstrated by the inset of Figure 4E. In Figure 4F the differences between Figure 4E and Figure 4A are plotted.
As can be seen, although a certain amount of information can be lost, this is basically due to a multitude of small peaks, resulting from the fragmentation of the main ions, that are not included. As an example, the software detected and measured 1628 peaks for the hydrolysis of bovine cytochrome c, spanning four orders of magnitude (area between 3000 and 15000000). After applying the procedure proposed in this work, the resulting file contains less than 150 Kbytes, a reduction by a factor of more than 50 from the original data set recorded by the MS instrument (7.5 Mbytes).

The 2D map obtained using this new set of data is shown in Figure 5. Comparing this map with the original one displayed in Figure 3, it can be seen that all important information has indeed been conserved. Moreover, a much higher resolution in the time scale is observed in Figure 5 than in Figure 3. This is striking when comparing, in Figure 5, the alignment of the spots from peaks 7, 9, 11 and 12 with that of the spots from peaks 1, 2 and 8. This result comes from the use of electrophoretic parameters. Indeed, peaks in the m/z dimension resulting from the ESI-MS fragmentation of the same parent compound will have the same peak shape, whereas peaks in the m/z dimension resulting from different parent compounds will have different peak shapes. The accurate measurement of the electrophoretic parameters (migration time, but also peak variance, peak asymmetry, etc.) allows small differences in peak shape to be highlighted. This is illustrated in Figure 6, where the full MS spectrum of peak 10 (Figure 6A) and the EIEs obtained using the five most abundant ions from Figure 6A (Figure 6B) are compared with the full MS spectrum of peak 8 (Figure 6C) and the EIEs obtained from the five most abundant ions in Figure 6C (Figure 6D), considering only m/z values higher than 300 (typically, m/z values lower than 300 have a high contribution to the noise signal).
Using the ten most intense spots for peak 10, an average migration time of 26.648 min was obtained with a standard deviation of 0.008 min (i.e., a relative standard deviation of less than 0.05%), showing the very high precision of the procedure proposed to determine the peak center. Moreover, this result shows that our simplified 2D mapping can be of great help in differentiating CE comigrating polypeptides from ions produced by ESI-MS fragmentation.

Furthermore, the easy manipulation of the data obtained also makes it possible to achieve other interesting 2D representations. For example, in capillary electrophoresis, migration time can be related to mobility, a parameter that depends on the analyte and the separation buffer. Although different equations have been proposed to calculate mobilities from experimental data [23], an adequate procedure for CE-MS is the use of two reference peaks for standardization. The equation originally proposed by Ikuta and colleagues uses one analyte and the electroosmotic flow (eof) marker as reference peaks [24]. However, as no eof marker is usually observed in CE-MS with a very low pH buffer, two analytes are used as references in this work. The original equation has to be modified accordingly to

$$\mu = \frac{t_{ref2}\,(t_{ref1} - t)}{t\,(t_{ref1} - t_{ref2})}\,(\mu_{ref2} - \mu_{ref1}) + \mu_{ref1} \qquad (3)$$

where t and μ are the migration time and mobility of the peak of interest, tref1 and tref2 are the migration times of the two reference peaks, and μref1 and μref2 are their corresponding theoretical mobilities. In order to choose two adequate reference peaks, data were sorted by ionic intensity, showing that two out of the three most intense spots are common to the three species (corresponding to the peaks marked with an asterisk in Figure 2): for CYC-B, tref1 = 27.552 min (m/z = 589.3) and tref2 = 18.002 min (m/z = 585.2); for CYC-H, tref1 = 27.111 min (m/z = 589.3) and tref2 = 17.778 min (m/z = 585.3); for CYC-R, tref1 = 27.134 min (m/z = 589.3) and tref2 = 17.748 min (m/z = 585.2).
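As a hedged illustration of equation 3, the two-reference standardization can be written as a short Python function. The function name and the two theoretical mobilities used below are hypothetical placeholders (the values of μref1 and μref2 are not reported in the text); only the two reference migration times are those given above for CYC-B. The sketch also assumes, as equation 3 implicitly does, that the reciprocal of the migration time varies linearly with mobility within a run.

```python
def mobility_two_refs(t, t_ref1, mu_ref1, t_ref2, mu_ref2):
    """Mobility of a peak migrating at time t, standardized with two reference peaks."""
    time_factor = t_ref2 * (t_ref1 - t) / (t * (t_ref1 - t_ref2))
    return mu_ref1 + (mu_ref2 - mu_ref1) * time_factor


# Reference migration times (min) reported above for CYC-B.
T_REF1, T_REF2 = 27.552, 18.002
# HYPOTHETICAL theoretical mobilities (m^2 V^-1 s^-1), for illustration only.
MU_REF1, MU_REF2 = 1.0e-8, 1.8e-8

# Mobility of a hypothetical peak migrating at 22.0 min.
mu = mobility_two_refs(22.0, T_REF1, MU_REF1, T_REF2, MU_REF2)

# Same peak when every migration time is 10% longer (e.g., a lower applied voltage).
mu_slow = mobility_two_refs(1.1 * 22.0, 1.1 * T_REF1, MU_REF1, 1.1 * T_REF2, MU_REF2)
print(mu, mu_slow)
```

Because only ratios of migration times enter the expression, a common stretch of all migration times leaves the computed mobility unchanged, and each reference peak is mapped back onto its own theoretical mobility.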
Therefore, these two peaks were used as references for the three samples, and their analysis times, together with μref1 and μref2, were used in equation 3. Using this approach, electrophoretic mobility values were automatically calculated for the main ions, for all the extracted data recorded in an Excel spreadsheet (Figure 1, step 3). The electrophoretic mobility values provided by equation 3 are logically independent of the applied voltage and capillary dimensions [24]. This point needs to be highlighted, since in CE-MS a large part of the capillary is usually not thermostated. Therefore, by using equation 3 the negative thermal effect on reproducibility is practically eliminated [25]. Although reference points could easily be determined in this study, this may not be so easy in more complex samples. However, the use of internal standards will provide the same results in these more complex matrices.

Once the electrophoretic mobility values were obtained, the simplified 2D map shown in Figure 7 was obtained, representing mobility × m/z for the three proteolytic digests (bovine, horse and rabbit). Only the most intense spots have been plotted; the three different symbols correspond to the three species studied ( bovine; x rabbit; + horse) and the different colors to the ion intensity (black, high intensity; red, intermediate intensity; blue, low intensity). The strikingly reduced amount of information shown in Figure 7, the close matching of the spots, and the high precision in the mobility and m/z parameters now allow an easy and accurate comparison of the CE-MS results obtained for the proteolytic digests from these three species. Spots labeled as "ref" are the ones used for mobility calibration. Thus, as can be seen in Figure 7, numerous spots are common to the three species, with a very high level of confidence.
For example, the measured mobility and m/z values for the spots labeled as 1 in Figure 7 are μ(CYC-B) = 2.036 × 10⁻⁸ m² V⁻¹ s⁻¹, m/z(CYC-B) = 617.3 ± 0.7; μ(CYC-R) = 2.032 × 10⁻⁸ m² V⁻¹ s⁻¹, m/z(CYC-R) = 617.4 ± 0.8; μ(CYC-H) = 2.032 × 10⁻⁸ m² V⁻¹ s⁻¹, m/z(CYC-H) = 617.4 ± 0.5. More interestingly, specific differences between the three species can now be easily recognized. For example, the spots labeled from 2 to 5 could be selected to distinguish the three species. Thus, spot 2 or 5 can be selected as a marker for CYC-H, spot 3 as a marker for CYC-B and spot 4 as a marker for CYC-R. Using this simplified map, not only can a marker easily be identified, but so can its intensity, which makes it possible to choose the most favorable among the different markers that could be selected. For instance, the spots labeled as A or B in Figure 7 could also be selected as markers; however, since their intensity is lower than that of the markers chosen above, their use to differentiate species would be less favorable.

An additional proof of the usefulness of this 2D representation is that, in some cases, only the combination of the information from both dimensions makes it possible to understand the proteolytic profile. A representative example can be seen by considering the two fragments marked as 6 in Figure 7 ( bovine and x rabbit). At first sight of the 2D map, it is observed that the match between these two fragments is not as good as the match observed for other fragments (see, for instance, in Figure 7 the exact match for spots with a mobility between 15 and 18 × 10⁻⁹ m² V⁻¹ s⁻¹), indicating that they are different compounds. Thus, as can be seen in the 2D map, these fragments have the same m/z value (m/z(CYC-B) = 600.4 ± 0.7, m/z(CYC-R) = 600.4 ± 0.7) but different mobility (μ(CYC-B) = 0.888 × 10⁻⁸ m² V⁻¹ s⁻¹, μ(CYC-R) = 0.877 × 10⁻⁸ m² V⁻¹ s⁻¹). Interestingly, the corresponding full MS spectra of these spots were also very similar.
This made it even more difficult to detect any significant difference directly from the CE-MS electropherogram (the only difference was the presence of a secondary ion at m/z 1005.9 for CYC-B and at m/z 999.0 for CYC-R). Based on the visual information provided by the 2D map, these two spectra were studied in more detail, confirming that they correspond to a fragment from bovine cytochrome c identified [19] as GITWGEETLMEYLENPK (Mr = 2010.2) and, assuming a similar cleavage in the rabbit sequence, to the fragment GITWGEDTLMEYLENPK (Mr = 1996.2); the aforementioned ion of m/z = 600.4 corresponds to the common fragment LENPK. Interestingly, this very small difference between the two sequences (glutamic acid for aspartic acid) can also explain their different mobility. Indeed, using a theoretical model to predict the CE migration of peptides [25], the mobility of GITWGEETLMEYLENPK has been estimated to be 0.827 × 10⁻⁸ m² V⁻¹ s⁻¹ and the mobility of GITWGEDTLMEYLENPK to be 0.824 × 10⁻⁸ m² V⁻¹ s⁻¹. The fact that such small differences can easily be detected through the simplified 2D map confirms the accuracy and usefulness of the proposed approach.

Two different samples containing (i) 95% CYC-B + 5% CYC-H and (ii) 95% CYC-B + 5% CYC-R were prepared with the aim of demonstrating the usefulness of the selected markers to selectively identify the species. Samples were digested with trypsin and then analyzed by CE-MS. The m/z scan range was narrowed to 540-760 (target mass 650 m/z), as the selected markers are in this range. This allowed an increase in the signal/noise ratio by a factor of ca. 1.5 (data not shown). The EIEs obtained for the three m/z values used as markers (i.e., m/z 728.9 ± 0.2 for CYC-B, m/z 736.0 ± 0.2 for CYC-H and m/z 665.4 ± 0.2 for CYC-R) are shown in Figure 8.
Namely, sample (i), composed of 95% CYC-B + 5% CYC-H, is shown in Figure 8-IA to IC, while sample (ii), composed of 95% CYC-B + 5% CYC-R, is shown in Figure 8-IIA to IIC. As expected, the bovine marker is present in all samples, while the horse marker is only present in the 5% horse sample (Figure 8-IB) and the rabbit marker is only present in the 5% rabbit sample (Figure 8-IIC). This result corroborates the usefulness of our procedure to provide selective species markers. Moreover, this procedure also allows the migration time shifts observed in CE to be overcome. As an example, the electrophoretic mobility of the bovine marker (m/z = 728.9) determined according to equation 3 was 1.040 × 10⁻⁸ m² V⁻¹ s⁻¹ in Figure 7 and 1.046 × 10⁻⁸ m² V⁻¹ s⁻¹ in Figure 8, showing again the advantages offered by this 2D methodology.

4. Concluding remarks

The procedure presented in this work to obtain simplified 2D maps allows an automatic representation of raw CE-MS data. It is demonstrated that this new approach provides simplified 2D maps and a reduction of the initial amount of data by a factor of 50 without any major loss of information. This tool has been tested by studying trypsin digests of cytochrome c from three different species (bovine, horse and rabbit). For the three species more than 1500 spots were generated, each of them indexed by three parameters: migration time, ionic intensity and m/z value. It has been shown that the developed 2D procedure also helps to differentiate between CE comigrating compounds and ESI-MS fragmentation ions. Moreover, spots can be easily distinguished based on very subtle differences in their mobilities or m/z values using the generated 2D maps. As an example, two very similar fragments from CYC-B and CYC-R were visually distinguished in the 2D map. These fragments could not be differentiated based on their standard CE-MS electropherograms.
Since this approach makes full use of the advantages derived from a CE separation prior to MS analysis, it can be foreseen that, working under the same CE conditions and with the same ESI-MS settings, reproducible and specific fingerprints could ideally be obtained. Moreover, the very good results obtained with this simple approach suggest that it could routinely be used to simplify CE-MS data as well as LC-MS or GC-MS data.

Acknowledgements

G.L.E. thanks the Spanish MEC for a postdoctoral grant. The authors are grateful to the AGL2005-05320-C02-01 Project (Ministerio de Educacion y Ciencia) and the S-505/AGR0153 Project (Comunidad Autonoma de Madrid, CAM) for financial support of this work.

5. References

[1] Anderson, N.L., Anderson, N.G. Mol. Cell. Proteomics 2002, 1, 845-867.
[2] Xiao, Z., Prieto, D., Conrads, T.P., Veenstra, T.D., et al. Mol. Cell. Endocrinol. 2005, 230, 95-106.
[3] Issaq, H.J. Electrophoresis 2001, 22, 3629-3638.
[4] Issaq, H.J., Chan, K.C., Janini, G.M., Conrads, T.P., Veenstra, T.D. J. Chromatogr. A 2005, 817, 35-47.
[5] Dolnik, V. Electrophoresis 2006, 27, 126-141.
[6] Kasicka, V. Electrophoresis 2006, 27, 142-175.
[7] Monton, M.R.N., Terabe, S. Anal. Sci. 2005, 21, 5-13.
[8] Stutz, H. Electrophoresis 2005, 26, 1254-1290.
[9] Roesli, C., Elia, G., Neri, D. Curr. Opin. Chem. Biol. 2006, 10, 35-41.
[10] Wilson, I.D., Plumb, R., Granger, J., Major, H., et al. J. Chromatogr. B 2005, 817, 67-76.
[11] Wittke, S., Fliser, D., Haubitz, M., et al. J. Chromatogr. A 2003, 1013, 173-181.
[12] Pietrogrande, M.C., Marchetti, N., Dondi, F., Righetti, P.G. J. Chromatogr. B 2006, 833, 51-62.
[13] Palagi, P.M., Walther, D., Quadroni, M., et al. Proteomics 2005, 5, 2381-2384.
[14] Li, X.J., Pedrioli, P.G., Eng, J., et al. Anal. Chem. 2004, 76, 3856-3860.
[15] Katajamaa, M., Oresic, M. BMC Bioinformatics 2005, 6, Art. No. 179.
[16] Dyson, N. Chromatographic Integration Methods, The Royal Society of Chemistry, Cambridge, UK, 1990.
[17] Dyson, N. J. Chromatogr. A 1999, 842, 321-340.
[18] Simo, C., Gonzalez, R., Barbas, C., Cifuentes, A. Anal. Chem. 2005, 77, 7709-7716.
[19] Simo, C., Cifuentes, A. Electrophoresis 2003, 24, 834-842.
[20] Girish, P.S., Anjaneyulu, A.S.R., Viswas, K.N., Shivakumar, B.M., et al. Meat Sci. 2005, 70, 107-112.
[21] Vallejo-Cordoba, B., Gonzalez-Cordova, A.F., Mazorra-Manzano, M.A., Rodriguez-Ramirez, R. J. Sep. Sci. 2005, 28, 826-836.
[22] Cifuentes, A. Electrophoresis 2006, 27, 283-303.
[23] Survay, M.A., Goodall, D.M., Wren, S.A.C., Rowe, R.C. J. Chromatogr. A 1996, 741, 99-113.
[24] Ikuta, N., Yamada, Y., Yoshiyama, T., Hirokawa, T. J. Chromatogr. A 2000, 894, 11-17.
[25] Cifuentes, A., Poppe, H. J. Chromatogr. A 1994, 680, 321-340.

FIGURE LEGENDS

Figure 1. Schematic representation of the steps performed before obtaining the simplified 2D CE-MS map.

Figure 2. CE-ESI-MS total ion electropherograms of the trypsin hydrolysates from bovine, rabbit and horse cytochrome c. Running buffer, 0.9 M formic acid adjusted to pH 2 with ammonium hydroxide; injection at 3447 Pa for 20 s; running voltage, 20 kV; capillary length, 90 cm (total and detection length). MS conditions: sheath liquid, methanol/water (50/50) at 4 μL/min; nebulizer gas, 27579 Pa and 8 L/min at a temperature of 120 °C; MS scan range, m/z 200-1100 (target mass: 650 m/z). The two peaks marked with an asterisk are used for mobility normalization (see text for more details); peaks marked with a number are used in the text to compare the three electropherograms.

Figure 3. CE-ESI-MS full two-dimensional map of bovine cytochrome c digested with trypsin. Color code: black, intensity > 180 000; brown, intensity > 100 000; red, intensity > 56 000; orange, intensity > 32 000; yellow, intensity > 18 000; blue, intensity > 10 000. All experimental conditions as in Figure 2.

Figure 4. (A) CE-ESI-MS total ion electropherogram of. (B) Measured standard deviation (SD) for each m/z value of A.
(C) Reconstructed total ion electropherogram using only the m/z values with SD > 2000. (D) Residual between (A) and (C). (E) Reconstructed total ion electropherogram using only the part of every m/z trace where a peak has been detected (Step 3 in Figure 1). (F) Residual between (E) and (A). All experimental conditions as in Figure 2.

Figure 5. Simplified 2D CE-MS map of bovine cytochrome c digested with trypsin. Color code: black, area > 570 000; brown, area > 320 000; red, area > 180 000; orange, area > 100 000; yellow, area > 56 000; blue, area > 5 000. All experimental conditions as in Figure 2.

Figure 6. (A) Full MS spectra and (B) EIEs of the most intense fragments of peak 10 in Figure 5; (C) full MS spectra and (D) EIEs of the most intense fragments of peak 8 in Figure 5.

Figure 7. Comparison of the simplified 2D CE-MS maps of hydrolysed cytochrome c from bovine heart (), rabbit heart (x) and horse heart (+). Color code: black, area greater than or equal to 50% of the highest area; red, area greater than or equal to 20% of the highest area; blue, area greater than or equal to 10% of the highest area.

Figure 8. Extracted ion electropherograms from digested samples containing (I) 95% bovine + 5% horse cytochrome c and (II) 95% bovine + 5% rabbit cytochrome c. EIEs corresponding to (A) m/z 728.9 ± 0.2, (B) m/z 736.0 ± 0.2 and (C) m/z 665.4 ± 0.2. The MS scan range was set at m/z 540-760 (target mass: 650 m/z). Other conditions as in Figure 2.

[Figure 1: schematic. Step 1: original CE-MS data (mass vs. time). Step 2: m1 and m2 traces vs. time. Step 3: peak table giving, for each detected peak (#1, #2, #3, ...), the m/z (Da), its uncertainty, the peak area (A1, A2, A3, ...) and the migration time (tm1, tm2, tm3, ...). Output: simplified 2D CE-MS map (time vs. m/z).]

[Figure 2: three panels (CYC Bovine, CYC Rabbit, CYC Horse). Axes: Intens. /10^7 (0-1.5) vs. Time /min (10-30). Peaks labelled 1-5; reference peaks marked with an asterisk.]

[Figure 3: full two-dimensional map; the only recoverable annotation is m/z 282.2.]
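The filtering summarized in panels (B) to (D) of the Figure 4 legend can be illustrated with a minimal Python sketch. It is not the implementation used in this work. It assumes that the CE-MS run is available as a matrix with one row per scan and one column per m/z channel, and that the residual is computed as (A) minus (C). The function name `sd_filter` and the synthetic data are hypothetical; only the threshold SD > 2000 is taken from the legend.

```python
import numpy as np


def sd_filter(data, sd_threshold=2000.0):
    """Keep only the m/z channels whose SD along the time axis exceeds a threshold.

    data: 2D array of shape (n_scans, n_mz); rows are scans (time points),
          columns are m/z channels. This layout is an assumption.
    """
    data = np.asarray(data, dtype=float)
    tie = data.sum(axis=1)                # total ion electropherogram, as in panel A
    sd = data.std(axis=0)                 # one SD per m/z channel, as in panel B
    keep = sd > sd_threshold              # channels retained for the reconstruction
    tie_kept = data[:, keep].sum(axis=1)  # reconstructed TIE, as in panel C
    residual = tie - tie_kept             # signal left in discarded channels (panel D)
    return sd, keep, tie_kept, residual


# Synthetic demo (made-up numbers, not data from this work):
# 200 scans x 5 m/z channels of baseline noise, plus one peak in channel 2.
rng = np.random.default_rng(0)
demo = rng.normal(0.0, 100.0, size=(200, 5))
scans = np.arange(200)
demo[:, 2] += 5.0e4 * np.exp(-0.5 * ((scans - 100) / 5.0) ** 2)

sd, keep, tie_kept, residual = sd_filter(demo)
print(keep)
```

In this synthetic example only the channel containing the peak passes the threshold, so the residual contains baseline noise only. With real data the threshold would have to be chosen according to the noise level of the instrument.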
[Figure 4: six panels. (A), (C) and (E): Intens. /10^7 vs. Time /min (10-30). (B): SD /10^5 vs. m/z (200-1000). (D) and (F): residuals, Intens. /10^7 vs. Time /min (10-30).]

[Figure 5: simplified 2D map. Axes: m/z (200-1000) vs. Time /min (10-30). Peaks labelled 1-12.]

[Figure 6: four panels. (A) MS spectrum, all ions, 26.3-26.9 min (#650-#666) and (C) MS spectrum, all ions, 21.7-22.5 min (#537-#558); axes: Intens. x10^4 vs. m/z (200-1000). (B) and (D): EIEs, Intens. x10^5 vs. Time [min], for m/z 487.2, 494.2, 512.1, 600.2, 607.3, 656.9, 713.7, 763.3, 841.4 and 1005.9 (each ± 0.5).]

[Figure 7: simplified 2D maps of CYC Bovine, CYC Rabbit and CYC Horse. Axes: m/z (0-900) vs. mobility /10^-9 m^2 V^-1 s^-1 (6-22). Annotations: Ref (reference peaks), peaks 1-6, regions A and B.]

[Figure 8: six panels. I-A and II-A: bovine marker, m/z = 728.9. I-B and II-B: horse marker, m/z = 736.0. I-C and II-C: rabbit marker, m/z = 665.4. Axes: Intens. /10^6 (0-1.5) vs. Time /min (30-45).]
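The extracted ion electropherograms of Figure 8 are obtained by summing, scan by scan, the signal falling within ± 0.2 of a marker m/z. The following Python sketch shows one way this could be done. It is an illustration under stated assumptions, not the software used in this work: the data are assumed to sit on a common m/z axis, and the function name `extract_eie` and the small data set are hypothetical. The marker values and the ± 0.2 window are those given for Figure 8.

```python
import numpy as np


def extract_eie(mz_axis, intensities, target_mz, tol=0.2):
    """Return the extracted ion electropherogram for target_mz +/- tol.

    mz_axis:     1D sequence of m/z values shared by all scans (assumption).
    intensities: 2D array-like of shape (n_scans, n_mz).
    """
    mz_axis = np.asarray(mz_axis, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    in_window = np.abs(mz_axis - target_mz) <= tol  # m/z points inside the window
    return intensities[:, in_window].sum(axis=1)    # one summed intensity per scan


# Marker m/z values given for Figure 8.
MARKERS = {"bovine": 728.9, "horse": 736.0, "rabbit": 665.4}

# Tiny made-up data set: 2 scans x 6 m/z points (not data from this work).
mz_axis = [665.4, 700.0, 728.8, 729.0, 736.0, 736.5]
intensities = [[1, 2, 3, 4, 5, 6],
               [10, 20, 30, 40, 50, 60]]

eies = {name: extract_eie(mz_axis, intensities, mz) for name, mz in MARKERS.items()}
for name, trace in eies.items():
    print(name, trace)
```

A narrow window such as ± 0.2 keeps the three marker traces separate even though the bovine and horse markers differ by only about 7 m/z units. The window should not be made narrower than the mass accuracy of the analyzer allows.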