Improved protein separation and identification by use of 2D liquid protein fractionation and ion mobility mass spectrometry
Susan E. Slade (1); Konstantinos Thalassinos (1); Sarah J. Nicholson (1); Jonathan P. Williams (1); James H. Scrivens (1); Kevin Giles (2) and Robert H. Bateman (2)
(1) Biological Mass Spectrometry and Proteomics Facility, Dept. of Biological Sciences, University of Warwick, Coventry, United Kingdom
(2) Waters Micromass MS Technologies, Floats Road, Wythenshawe, Manchester, United Kingdom
ABSTRACT
MATERIALS AND METHODS
Purpose
To utilise and evaluate the combination of a commercial two-dimensional liquid protein
separation system with mass spectrometry-based protein identification for proteomics
applications.
Methods
E. coli cell lysate proteins were resolved by chromatofocusing followed by
hydrophobicity chromatographic separations. A number of protein fractions were
concentrated and tryptically digested prior to analysis by means of LC-ESI-MS/MS
and protein identification.
Bioinformatic approaches were developed to extract biologically relevant information
from the generated dataset.
Results
Preliminary results indicate that we have demonstrated a significant improvement in
our protein identification confidence due to increased sequence coverage for proteins
from a wide range of molecular weights and isoelectric points, compared to samples from
gel-based sources.
Interesting observations were noted including one protein that eluted at a number of pI
intervals, with differing peptide sequences observed in each fraction.
In addition, the solution-based protein separation system allows characterisation of
both tryptically digested proteins and the intact species thus enabling further protein
characterisation.
INTRODUCTION
Gel-based proteomics experiments have proved highly successful in analysing
relatively complex biological systems but have a number of limitations. These include a
limited sample loading capacity, which reduces the quantity of protein available for
downstream mass spectrometry-based identification. When the capacity is exceeded,
poor resolution of protein species is evident. Other limitations include gel-to-gel
variation, a narrow dynamic range, difficulties in generating a homogeneous sample for
analysis due to problems with protein solubility, and the low efficiency of peptide
extraction after tryptic digestion, all of which compound the problems encountered in
protein identification. Prefractionation of complex proteomes can be required prior to
analysis particularly when the sample source contains a number of highly abundant
proteins. The characterisation of low abundance species continues to be a challenge
in the field of proteomics.
Two-dimensional liquid chromatography protein separation and mass spectrometry-based identification
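[Figure 1 labels: Sample; Chromatofocusing Column (CF); UV Detector 280 nm; Reverse Phase Column (RP); UV Detector 214 nm; Intact Mass; Digest; MALDI-PMF; LC-ESI-MS/MS; Database Search]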
Figure 1. Schematic representation of the separation mechanism employed by the
PF2D protein separation system (Beckman Coulter) and subsequent potential MS
analysis strategies.
Typically a single 2D-LC separation combining the collection of protein fractions from
the CF and RP columns could generate approximately 800 samples for digestion and
MS analysis. Clearly a more focused approach to the analysis of biological systems is
required with the necessity to derive biologically relevant information from the vast
quantity of mass spectrometry data generated.
We have developed extensive in-house bioinformatics resources which have
facilitated the interpretation of the 2D-LC data, elements of which are presented on
poster TuP-253 (Thalassinos, Slade et al.).
Due to the highly complex nature of many biological systems, it may not be possible to
resolve each of the proteins in a sample into individual fractions using the 2D-LC
separation. Therefore we have explored the potential of ion mobility to resolve
peptides of identical m/z, which may result from a tryptic digest of a 2D-LC fraction
containing a number of proteins, see Figure 2.
An Escherichia coli (strain W3110) cell lysate containing 3 mg of protein was applied to
a chromatofocusing HPCF column equilibrated in CF start buffer, using the PF2D
protein fractionation system (Beckman Coulter, U.S.A.). Proteins were resolved over
the pH range 8.5 to 4.0 using the proprietary elution buffer followed by a 1 M sodium
chloride eluate. Fractions were collected either by pH interval or volume.
Each CF fraction was then applied to a non-porous silica RP column (Beckman
Coulter) and further resolved by an organic solvent gradient using acetonitrile with
trifluoroacetic acid as ion pairing agent. Fixed volume fractions were collected during
each separation and aliquotted into smaller volumes prior to storage at -70 °C.
A number of fractions were selected for MS-based protein identification across pH
intervals and protein concentrations. A 175 µL aliquot of each fraction was dried,
reconstituted in 10 µL of ammonium bicarbonate buffer and processed on the
MassPrep robotic protein handling system (Waters Micromass MS Technologies)
using the manufacturer’s in-solution digest protocol. In brief, protein samples were
reduced, alkylated with iodoacetamide, digested with trypsin and the peptides
acidified.
• We have identified 183 proteins with a minimum of one observed peptide, of which
92 were identified with two peptides and 51 with three peptides.
• In addition, many CF fractions contain multiple proteins, identified with a significant
number of peptides, see Figure 4.
The combination of 2D-LC chromatofocusing and hydrophobicity fractionation steps
prior to MS-based protein identification has proved highly successful in the analysis of
our model system.
The ability to load milligram quantities results in protein identifications with substantially
improved sequence coverage and thus greater confidence in protein assignments. Where
protein sequences are not available for interrogation, the high quality of the tandem MS
data, in combination with the higher m/z peptides observed, allows longer regions of
peptides to be sequenced de novo.
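Sequence coverage, as used above, is the percentage of a protein's residues that fall within at least one identified peptide. The following is a minimal sketch of that calculation. The function name, the 20-residue sequence and the two peptides are hypothetical and are not taken from the dataset.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the percentage of protein residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical example: two six-residue peptides matched to a 20-residue protein.
protein = "MKTAYIAKQRQISFVKSHFS"
peptides = ["TAYIAK", "QISFVK"]
coverage = sequence_coverage(protein, peptides)
print(f"{coverage:.0f} % sequence coverage")
```

Residues matched by more than one peptide are counted once, so overlapping peptides do not inflate the figure.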
ProteinLynx Global Server 2.1 (Waters Micromass MS Technologies) was used to
interrogate the data obtained from the LC-ESI-MS/MS experiments against an in-house
database containing sequences from E. coli W3110, trypsin and keratin
contaminants.
We have demonstrated that proteins identified with one peptide in one RP fraction may
subsequently be observed in other fractions from the 2D-LC separation. The biological
implications of this project will only be evident when the complete dataset of RP
fractions has been analysed and the proteins identified.
Relational database model
The vast quantity of data generated for each CF and RP fraction makes bioinformatic
handling of the biological information essential.
The database chosen to store the experimental and protein identification results was
MySQL 5.0 (http://www.mysql.com/). A program, written in the Java programming
language (v 1.5.0_04), allows the user to enter a variety of experimental 2D-LC and
MS parameters used, including (but not restricted to) methods, locations of CF and RP
fractions (tray and well numbers), processing parameters, datafile locations,
processed spectra etc. The program also parses the Global Server protein identification results
and links identifications to each fraction. The database can then be queried, for
example listing all fractions from CF and RP separations in which a protein, by
accession number, was identified.
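The kind of query described above can be sketched as follows. This is an illustration only: the poster's schema is not given, so the table names, column names, accession strings and values are all hypothetical, and Python's built-in sqlite3 module stands in for the MySQL 5.0 database and Java program actually used.

```python
import sqlite3

# In-memory stand-in for the results database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fraction (
    id          INTEGER PRIMARY KEY,
    cf_fraction INTEGER NOT NULL,   -- first-dimension (CF) fraction number
    rp_fraction INTEGER NOT NULL,   -- second-dimension (RP) fraction number
    tray        TEXT,
    well        TEXT
);
CREATE TABLE identification (
    fraction_id INTEGER NOT NULL REFERENCES fraction(id),
    accession   TEXT NOT NULL,
    n_peptides  INTEGER NOT NULL
);
""")

# Made-up rows: one protein seen in two adjacent RP fractions.
conn.executemany(
    "INSERT INTO fraction VALUES (?, ?, ?, ?, ?)",
    [(1, 12, 7, "A", "B3"), (2, 12, 8, "A", "B4"), (3, 13, 8, "A", "C4")],
)
conn.executemany(
    "INSERT INTO identification VALUES (?, ?, ?)",
    [(1, "ACC0001", 1), (2, "ACC0001", 6), (2, "ACC0002", 3), (3, "ACC0002", 2)],
)


def fractions_for_protein(connection, accession):
    """List (CF fraction, RP fraction, peptide count) for one accession number."""
    return connection.execute(
        "SELECT f.cf_fraction, f.rp_fraction, i.n_peptides "
        "FROM identification AS i "
        "JOIN fraction AS f ON f.id = i.fraction_id "
        "WHERE i.accession = ? "
        "ORDER BY f.cf_fraction, f.rp_fraction",
        (accession,),
    ).fetchall()


hits = fractions_for_protein(conn, "ACC0001")
for cf, rp, n in hits:
    print(f"CF {cf}, RP {rp}: {n} peptide(s)")
```

Keeping fractions and identifications in separate tables means a protein that elutes across several fractions is stored once per fraction, which is what makes the per-accession listing a single join.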
Doubly charged precursor ions at m/z 246 were selected from two peptides having the
sequences GRGDS and SDGRG, for ion mobility study using the Synapt HDMS
(Waters Micromass MS Technologies), a hybrid MS-IMS-MS instrument with a
Quadrupole / IMS / Orthogonal-TOF configuration. The ion mobility separation device
(IMS) comprises three consecutive travelling wave RF ion guides (Triwave)
incorporating a repeating sequence of transient DC pulses to propel ions through the
guide in the presence of a background gas. Ions are accumulated in the Trap T-Wave
and periodically released into the IMS T-Wave where they separate according to their
mobility.
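The reason the two peptides cannot be distinguished by m/z alone can be shown with a short calculation, sketched here using standard monoisotopic residue masses. GRGDS and SDGRG contain the same residues in a different order, so their doubly protonated ions have the same m/z of about 246. The function and variable names are illustrative.

```python
from collections import Counter

# Monoisotopic residue masses (Da) for the amino acids in these two peptides.
RESIDUE_MASS = {"G": 57.02146, "R": 156.10111, "D": 115.02694, "S": 87.03203}
WATER = 18.01056    # added once per peptide for the free termini
PROTON = 1.00728    # mass of each charge-carrying proton


def peptide_mz(sequence: str, charge: int) -> float:
    """m/z of the [M + zH]z+ ion of an unmodified peptide."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


mz_a = peptide_mz("GRGDS", 2)
mz_b = peptide_mz("SDGRG", 2)
same_composition = Counter("GRGDS") == Counter("SDGRG")

print(f"GRGDS 2+: m/z {mz_a:.3f}")
print(f"SDGRG 2+: m/z {mz_b:.3f}")
print("identical composition:", same_composition)
```

Because the masses are identical, any separation of the two species has to rely on a property other than m/z, here their mobility in the background gas.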
Protein resolution is achieved in the first dimension using chromatofocusing (CF) by
generating a pH gradient on an ion exchange resin. The column is equilibrated at
basic pH and the solubilised proteins are applied to the column. Proteins that have an
isoelectric point (pI) equivalent to the column pH have a net zero charge, thus do not
bind and elute immediately from the column and are collected. Over a period of a few
hours, the pH of the column is reduced and sequentially the proteins elute from the
column according to their pI and are collected for further analysis. Any proteins still
bound to the column at acidic pH are eluted using a high ionic strength buffer and the
fractions collected, see Figure 1.
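The elution logic described above can be sketched as a simple binning of proteins by pI. The pH range of 8.5 to 4.0 is taken from the method. The 0.3 pH unit collection interval, the function name, and the treatment of every protein with a pI above the starting pH as unbound (which assumes an anion exchange resin) are assumptions made for illustration. Real elution also depends on factors this sketch ignores.

```python
def cf_fraction(pi: float, start_ph: float = 8.5, end_ph: float = 4.0,
                step: float = 0.3) -> str:
    """Predict the chromatofocusing fraction in which a protein of given pI elutes.

    step is an assumed pH collection interval, not a value from the poster.
    """
    if pi >= start_ph:
        # No net negative charge at the starting pH: does not bind.
        return "flow-through"
    if pi < end_ph:
        # Still bound when the gradient ends: released by the high-salt eluate.
        return "salt wash"
    index = int((start_ph - pi) / step)      # number of whole intervals already passed
    upper = start_ph - index * step
    lower = max(upper - step, end_ph)
    return f"pH {upper:.1f}-{lower:.1f}"


for pi in (9.2, 5.0, 3.6):
    print(f"pI {pi}: {cf_fraction(pi)}")
```

Comparing such a predicted fraction with the fraction in which a protein is actually identified is one way to flag proteins that elute away from their expected pI.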
RESULTS
Further resolution of the proteins is achieved by a second dimension of protein
separation according to hydrophobicity. Each fraction from the chromatofocusing
column is sequentially applied to a reverse phase (RP) column under aqueous
conditions. An organic solvent gradient is used to elute the proteins from the column
with the hydrophilic proteins emerging initially from the column followed by those
having a greater hydrophobic nature.
To date, we have obtained significant numbers of protein identifications from 7 of the
possible 32 reverse phase-separated CF fractions, spanning the pH region 6.4 to 4.0.
Protein identification
Figure 2. Overlay of the arrival time mobility distributions for the doubly charged ions
of two isomeric peptides using the Synapt HDMS (Waters Micromass MS
Technologies, U.K.). Collisional cross-section measurements indicate a 5 %
difference in physical size and shape.
CONCLUSIONS AND FUTURE WORK
The tryptic extracts were analysed by means of nano-LC-ESI-MS/MS on a Q-Tof
Ultima Global with in-line CapLC system (Waters Micromass MS Technologies). The
tryptic extract was desalted using an in-line C18 precolumn cartridge (Dionex, U.S.A.)
and the peptides further resolved on a 75 µm C18 PepMap column (Dionex, U.S.A.)
using an increasing acetonitrile concentration gradient.
Ion mobility study of isomeric peptides
We have combined a two-dimensional liquid chromatography (2D-LC) protein
separation system with mass spectrometry-based identification using a
well-characterised, commercially available cell lysate.
Figure 3. ProteoView visualisation of E. coli fractionated proteins using a 2D-LC
approach. Vertical columns represent CF fractions and horizontal bands protein
absorbance. Shown on the left is the absorbance measured at 214 nm during the
reverse phase elution of one chromatofocusing fraction.
Figure 5. Interpreted product ion spectrum of a doubly charged tryptic peptide with
amino acid residue assignments.
• Each of the CF fractions contains multiple proteins as depicted by a one-dimensional
representation generated by the ProteoView software (Beckman
Coulter), see Figure 3. Each vertical “track” depicts a single CF fraction, with the
horizontal bands indicating the presence of protein detected by absorbance at 214 nm.
Proteins that elute from the CF column at pH values that differ from their predicted pI
have been selected for further characterisation.
Figure 4. Global Server protein identifications from a single reverse phase fraction
from the 2D-LC separation. The upper left pane indicates that multiple protein species
• The RMMs of proteins identified to date range from 8 kDa to 100 kDa.
• Abundant peptides (RMM > 3.5 kDa) have been observed exhibiting multiple
charge states.
• The product ion spectra (MS/MS) of higher m/z peptides can be interpreted to yield
good candidate amino acid sequences (de novo), see Figure 5.
• The majority of proteins elute near to their expected pI with reproducible RP
retention times, but some proteins exhibit properties that differ from those expected (see
poster TuP-253 for further details).
• Abundant proteins may elute across adjacent fractions from both the CF and RP
columns. Frequently a protein may be identified in one fraction with only one peptide,
to be subsequently identified in the next fraction by significantly more peptides.
• Sequence coverage can approach 50 % on proteins having RMMs from 9 to 35 kDa.
• Using the database, we have identified two doubly charged isobaric peptides of
m/z 965 with identical protein elution profiles from the RP column, but differing in
sequence (AFTSEEFTHFLEELTK and LVDKVIGITNEEAISTAR), as candidates for ion mobility
separation and MS/MS identification. The MS/MS step would be performed in the first
and/or third T-Wave ion guides of the Synapt HDMS system, generating first- and
second-generation fragment ion spectra.
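As a check on the last point, the m/z values of the two doubly charged peptides can be estimated from standard monoisotopic residue masses. The sketch below assumes unmodified peptides. The function names are illustrative and the calculation is not part of the original analysis.

```python
# Monoisotopic residue masses (Da) for the amino acids in the two sequences.
RESIDUE_MASS = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "F": 147.06841,
    "G": 57.02146, "H": 137.05891, "I": 113.08406, "K": 128.09496,
    "L": 113.08406, "N": 114.04293, "R": 156.10111, "S": 87.03203,
    "T": 101.04768, "V": 99.06841,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mz(sequence: str, charge: int) -> float:
    """m/z of the [M + zH]z+ ion of an unmodified peptide."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


def ppm_difference(mz_1: float, mz_2: float) -> float:
    """Absolute difference between two m/z values, in parts per million of their mean."""
    return abs(mz_1 - mz_2) / ((mz_1 + mz_2) / 2) * 1e6


mz_1 = peptide_mz("AFTSEEFTHFLEELTK", 2)
mz_2 = peptide_mz("LVDKVIGITNEEAISTAR", 2)
delta_ppm = ppm_difference(mz_1, mz_2)

print(f"AFTSEEFTHFLEELTK   2+: m/z {mz_1:.3f}")
print(f"LVDKVIGITNEEAISTAR 2+: m/z {mz_2:.3f}")
print(f"difference: {delta_ppm:.0f} ppm")
```

Both ions fall at a nominal m/z of 965 and differ only by some tens of ppm, so a precursor selection window of typical width would pass both together.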
The ability to produce intact proteins after the RP separation ensures that
post-translational modification mapping can be undertaken on both the digested and
full-length protein. The latter approach is currently being explored in a high throughput
manner.
We propose to introduce real-time database searching and exclusion of identified
proteins on-the-fly to encourage the observation of the lower abundance protein
species.
We will incorporate more directed mass spectrometry approaches to the identification
of peptides e.g. neutral loss trigger for data directed acquisition on phosphorylated
peptides.
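A neutral loss trigger of this kind could, in outline, look like the following sketch. A phosphopeptide that loses phosphoric acid (H3PO4, 97.9769 Da monoisotopic) on fragmentation gives a product ion at the precursor m/z minus 97.9769/z. The function name, the tolerance and the example values are hypothetical and do not describe any instrument's acquisition software.

```python
H3PO4 = 97.9769  # monoisotopic mass of phosphoric acid (Da)


def shows_phospho_neutral_loss(precursor_mz: float, charge: int,
                               fragment_mzs: list[float],
                               tol: float = 0.05) -> bool:
    """True if any fragment lies within tol of precursor m/z minus H3PO4/z."""
    target = precursor_mz - H3PO4 / charge
    return any(abs(mz - target) <= tol for mz in fragment_mzs)


# Hypothetical doubly charged precursor at m/z 700.30: the loss appears near m/z 651.31.
with_loss = shows_phospho_neutral_loss(700.30, 2, [350.20, 651.31, 890.40])
without_loss = shows_phospho_neutral_loss(700.30, 2, [350.20, 612.75, 890.40])
print(with_loss, without_loss)
```

In a data directed experiment a positive result would be the cue to acquire further fragmentation data on that precursor.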
We plan to extend our studies to include biological systems that are incompatible with
gel-based separation methods.
Due to the complexity of the fractions analysed, protein quantitation may be an issue.
We plan to assess the suitability of iTRAQ technology (Applied Biosystems, U.S.A.)
for the quantitation of protein expression levels in proteomic studies in combination
with 2D-LC protein fractionation.