Experimental Results

Characterisation and quantitation expression analysis of recombinant proteins in
plant complex mixtures using nanoUPLC mass spectrometry
André M. Murad1, Gustavo H. M. F. Souza2, Jerusa S. Garcia3, Elíbio L. Rech1*
1 Embrapa Genetic Resources and Biotechnology, Laboratory of Gene Transfer, Parque
Estação Biológica, PqEB, Av. W5 Norte, Brasília, DF, 70770-917, Brazil
2 Waters Corporation, MS Applications Research and Development Laboratory,
Alameda Tocantins, 125, 27th floor, West Side, Alphaville, São Paulo, SP, 06455-020,
Brazil
3 Alfenas Federal University, Institute of Exact Sciences, Alfenas, MG, 37170-000,
Brazil
*corresponding author
Keywords: Soybean, IdentityE, ExpressionE, MSE, ProteinLynx Global SERVER
ABSTRACT
Identification of recombinant protein expressed in a total soluble protein (TSP) plant
extract by mass spectrometry is desirable and necessary to accelerate further processing
steps. The protocol consists of an initial TSP sample preparation and trypsin
digestion, followed by preliminary characterization of recombinant proteins expressed in TSP
samples of transgenic soybean seeds using nanoUPLC-MSE. As little as 50 µg of TSP
can be effectively analyzed. Experimental data for the TSP extraction and
sample preparation are discussed. The development of the process takes up to 3 days.
INTRODUCTION
The production of recombinant protein is an important step in several academic,
industrial and pharmaceutical processes. Several heterologous protein expression
systems are available, including bacterial1, mammalian cell-culture2 and plant3, 4
systems. Although these comprise the main production systems, the search for novel
methods to increase protein yield, facilitate manipulation and reduce cost continues.
Seeds are a vital alternative for recombinant protein production for several reasons: they
can undergo long-term storage at ambient temperatures5, 6, they can provide an
appropriate biochemical environment for protein stability through the creation of
specialised storage compartments6, 7, they are not contaminated by human or animal
pathogens8, they do not undergo non-enzymatic hydrolysis or protease degradation
owing to their desiccation characteristics5, 8 and they do not carry the phenolic
substances that are present in tobacco leaves, which is important for downstream
processing3, 8. We recently produced several soybean transgenic plants expressing
important pharmaceutical molecules, such as proinsulin6, human growth hormone
(hGH)9 and human coagulation factor IX (hFIX)10, showing the viability of this system.
On the other hand, producing these transgenic lines is extremely time consuming11 and
requires at least 150 days to obtain the first seeds and another 3 years for a homozygote
line. At the early stage, we have little material for recombinant protein purification; as a
result, the detection, quantification and characterisation of recombinant molecules rely
mainly on the manipulation of total soluble protein (TSP), which contains a complex
mixture with a low abundance of the protein of interest. Thus, we need a method that
detects, qualifies and quantifies recombinant proteins in TSP using less than ¼ of a
single seed mass (50 mg).
Typically, a recombinant protein is identified by western blot
analysis12 and quantified by enzyme-linked immunosorbent assays (ELISAs)13. These
methods are widely used because they are simple and relatively fast, but they lack
sensitivity when only small amounts of antigen are available, cannot be used when no
antibody is available, can yield false positives, and offer no
way to verify the quality, amino acid sequence or post-translational modification of the
recombinant protein. Two dimensional electrophoresis (2-DE) has been developed for
proteomics14, 15, and because of its association with mass spectrometry, it has become a
primary tool for the identification and characterisation of plant complex mixtures15, 16.
2-DE can also be used for quantification and protein mapping of tissues17, comparative
proteomics18, 19 and post-translational identification20, but it requires a minimum sample
amount, cannot detect molecules in low abundance, needs spot manipulations for good
identification15, is mainly performed by peptide mass fingerprinting (PMF)21, 22, and has
difficulty in analysing proteins with similar mass and pI because they appear as a single
spot. The combination of gel and liquid chromatography mass spectrometry (LC-MS)
methods may result in better identification of proteins in complex samples23, 24,
overcoming the problems of 2-DE. Liquid chromatography (LC) increases the low
detection/resolution of complex mixtures on mass spectrometers (MS)25. Furthermore,
the analysis of peptides or complex samples commonly known as “system samples” that
are digested by trypsin is key in the detection of low abundance proteins, but this
technique has limitations in terms of the analyte dilution and the minimum amounts of
complex protein mixtures needed to guarantee a good dynamic range and detection of
low abundance proteins15, 25-28.
Nano-scale liquid chromatography with 2D separations as a strong cation exchange
(SCX) followed by reverse-phase (RP) chromatography or 2D RPxRP using two pH
and acetonitrile pulses combined with mass spectrometry with data independent
acquisitions (nanoLC-MSE) has several benefits for proteome analysis. Among these
benefits are detection and linear sequence structural information at the femtomole
level29, 30, small surface areas and minimal dead-volumes, which minimises analyte
losses due to surface adsorption, and low flow rates that reduce analyte dilution.
Thereby, analytes of low abundance can be separated with a high recovery rate when
associated with a high dynamic range and a prevailing MS detection system 31. Recently,
the nanoLC-MS method was used for the detection of differences in expression of
soybean plasma membrane proteins under osmotic stress32, the regulation of stress
identification on tomatoes induced by iron deficiency33 and the detection of
neuropeptides secreted in Cancer borealis34, demonstrating the capability and potential
of this method. Moreover, nanoLC-MSE is an important tool in post-translational
characterisation of proteins, such as the identification of N-terminal peptide
modifications in the chloroplast proteome35, the analysis of human protein oxidations
leading to functional reduction/annulation36, and the characterisation of the
phosphorylation pattern of several phosphatase splice variants expressed in a human cell
line37, 38. Finally, quantification is also possible with the nanoLC-MS technique using
labelling methods such as (18O) labelling peptides39 and the iTRAQ™ method40, based
on relative quantification methods, such as the use of stochastic measurements between
mass and intensity deviations for each ion detected41 or the absolute quantification
based on a constant ion current acquired with low (MS) and high energies (MS/MS)
into the mass spectrometer, called MSE 42-45.
We describe herein (Fig. 1) an easy-to-handle, label-free nanoUPLC-MSE method with
absolute quantification and small sample usage for the detection, quantification and
characterisation of low abundance recombinant proteins expressed in soybean seeds,
specifically the immunogenic tumour NY-ESO-1 antigen (cancer testis antigen 1,
CTAG)46. CTAG is a protein product of the human X chromosome with 180 amino acid
residues (Fig. 2), mass 18 kDa, a glycine-rich N-terminal region and an extremely
hydrophobic C-terminal region that is so insoluble it can be confused with a
transmembrane domain46, 47 and is therefore a challenge in the identification and
characterisation of TSP extracts, as in our case. The expression pattern analysis by RT-PCR
for CTAG has confirmed that expression is restricted to testis and is not present in
other normal tissue, but is found in several types of cancer, including bladder, breast
and lung cancer48. The recombinant CTAG produced in Escherichia coli (E. coli) was
the first to be evaluated in the clinical setting and ranks among the most promising trials
published so far with CTAG because of the broad immunological and favourable
clinical results46, 49; thus, the use of CTAG as a vaccine is viable only if coupled with a
low cost, scalable recombinant protein production system. Additionally, the nanoUPLC-MSE
used in this procedure has particularities that enhance recombinant protein
characterisation with high selectivity and specificity. The nanoUPLC-MSE is composed
of a non-split, direct pump infusion, nanoscale liquid chromatography system
(nanoACQUITY® UPLC, Waters, Milford, MA) and related columns and accessories.
These include the use of columns packed with smaller particle sizes (<2 μm)50 and the
use of columns with a smaller internal diameter (I.D. <100 μm)51. A further development
is 2D chromatography, which couples RP with a different separation mechanism.
This can be accomplished using the ion exchange properties
between the peptides or proteins and the stationary and mobile phases, e.g., by
increasing or decreasing chaotropic “salting plugs” or the pH. For the last 10 years, this
technique has typically been implemented with a strong cation exchange (SCX) column and
“salting pulses” of, e.g., ammonium formate at different concentrations.
Advances in this technology may allow the exploration of new frontiers in separation
science to avoid ion suppression from orthogonal separation and to increase peak
capacity52. These chromatography systems coupled with a high-end mass spectrometry
instrument allow minimal amounts of system samples to be injected and detected with
high selectivity and specificity. To achieve such high standards, every stage of this
experimental workflow, from sample preparation to acquisition and processing,
must be controlled to avoid contamination and other sources of variability, as
described in detail in this protocol.
MATERIALS
Reagents
Chemicals and solvents
• Sterile deionised water with a conductivity of less than 1.3 µS/cm, total organic
carbon (TOC) less than 2 ppb, and a semiconductor equivalent specification of
0.055 µS/cm (18.2 MΩ.cm) at point-of-use at 25 °C
• Petroleum Ether, 30-75 °C, BAKER ANALYZED Reagent (J.T. Baker, cat. no.
9274-03)
• Tris base (2-Amino-2-(hydroxymethyl)-1,3-propanediol) (Fisher Scientific Ltd,
cat. no. BP152-5)
• KCl (Sigma-Aldrich Chemical Co. Ltd, cat. no. P9541)
• DL-Dithiothreitol (threo-1,4-dimercapto-2,3-butanediol) for molecular biology,
≥98% (DTT, Sigma-Aldrich, cat. no. D9779)
• Phenylmethanesulfonyl fluoride ≥98.5% (PMSF, Sigma-Aldrich, cat. no. P7626)
• Sodium dodecyl sulphate for molecular biology, ≥98.5% (SDS, Sigma-Aldrich,
cat. no. L4390)
• Acetone CHROMASOLV® Plus, for HPLC, ≥99.9% (Sigma-Aldrich, cat. no.
650501)
• NH4HCO3 ReagentPlus®, ≥99.0% (Sigma-Aldrich, cat. no. A6141)
• RapiGest™ SF (Waters, cat. no. 186001861)53
• Iodoacetamide BioUltra (Sigma-Aldrich, cat. no. I1149)
• Trifluoroacetic acid spectrophotometric grade, ≥99% (TFA, Sigma-Aldrich, cat.
no. 302031)
• Acetonitrile LC-MS CHROMASOLV®, ≥99.9% (Fluka, cat. no. 34967)
• Formic acid puriss. p.a., for mass spectroscopy, ~98% (T) (FA, Fluka, cat. no. 94318)
• nanoACQUITY™ UPLC™ Symmetry C18 trap column, 5 μm, 180 µm x 20 mm
(Waters, cat. no. 186003514)
• nanoACQUITY™ UPLC™ analytical column of 100 μm x 100 mm, 1.7 μm
BEH130 C18 (Waters, cat. no. 186003546)
Enzyme and standards
• Trypsin (Promega, cat. no. V511A)
• MassPREP Protein Digestion Standard Alcohol Dehydrogenase (MPDS ADH, Waters, cat. no. 186002328)
• [Glu1]-Fibrinopeptide B human (GFP, Sigma-Aldrich, cat. no. F3261)
Kits
• Quant-iT™ Protein Assay Kit, 500 Assays, 0.25-5 µg for use with the Qubit™
fluorometer (Invitrogen, cat. no. Q33212)
Buffers and Solutions
• Extraction buffer (see REAGENT SETUP)
• 50 mM NH4HCO3 (see REAGENT SETUP)
• Digestion solution (see REAGENT SETUP)
• Alkylation solution (see REAGENT SETUP)
• Reduction solution (see REAGENT SETUP)
• Hydrolysation solution (see REAGENT SETUP)
• Sample solution for nanoUPLC-MSE analysis (see REAGENT SETUP)
• MPDS ADH solution (see REAGENT SETUP)
• Surfactant solution (see REAGENT SETUP)
• Mobile phase A (see REAGENT SETUP)
• Mobile phase B (see REAGENT SETUP)
• GFP solution (see REAGENT SETUP)
• Cold acetone (store acetone at -20 °C)
EQUIPMENT
• Coffee grinder (Krups, model no. F203)
• Refrigerated centrifuge (Eppendorf, model 5810R)
• Analytical balance (Mettler Toledo, cat. no. XP105D)
• 2 mL microtubes (Axygen, cat. no. MCT-200-C)
• 1.5 mL microtubes (Axygen, cat. no. MCT-150-C)
• Vortex (Scientific Industries, model G560E)
• Dry bath (Fisher Scientific, cat. no. 11-718-2)
• Waters Total Recovery vial (Waters, cat. no. 186000384c)
• nanoACQUITY™ UPLC™ system (Waters, Milford, MA, USA)
• NanoLockSpray™ - nanoESI source (Waters, Manchester, UK)
• Synapt HDMS™ mass spectrometer (Waters, Manchester, UK)
REAGENT SETUP
Extraction buffer (20 mM Tris-HCl, pH 8.3, 1.5 mM KCl, 10 mM DTT, 1 mM
PMSF, 0.1 % w/v SDS) For 1 litre, dissolve 2.42 g of Tris base, 0.1 g of KCl, 1.54 g of
DTT, 0.174 g of PMSF and 1 g of SDS in 800 mL of deionised water. Adjust the pH to
8.3 with HCl and add water to make up a final volume of 1 litre. Store at -20 °C for up
to 6 months.
50 mM NH4HCO3 For 1 litre, dissolve 3.95 g of NH4HCO3 in 800 mL of deionised
water and add water to make up a final volume of 1 litre. Filter through a 0.22 µm filter
and store at room temperature (20–24 °C) for up to 6 months.
Digestion solution Add 400 μL of 50 mM NH4HCO3 to one 20 μg vial of Promega
Trypsin. Make aliquots of 10 µL and store at -80 °C for up to 6 months.
Alkylation solution (300 mM Iodoacetamide) For 1 mL, dissolve 55 mg in 500 µL of
deionised water. Add water to 1 mL. Store at -80 °C for up to 6 months.
Reduction solution (100 mM DTT) For 1 mL, dissolve 15 mg in 500 µL of deionised
water. Add water to 1 mL. Store at -80 °C for up to 6 months.
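The weighed amounts in the recipes above can be checked against their target molarities. The following is a minimal sketch, assuming standard molar masses (these values are not given in the protocol and are added here only for the calculation); the function and table names are illustrative.

```python
# Molar masses in g/mol. These are standard values added for this sketch;
# the protocol itself lists only the weighed amounts.
MOLAR_MASS_G_PER_MOL = {
    "Tris base": 121.14,
    "KCl": 74.55,
    "DTT": 154.25,
    "PMSF": 174.19,
    "NH4HCO3": 79.06,
    "Iodoacetamide": 184.96,
}


def mass_needed_g(compound: str, conc_mM: float, volume_mL: float) -> float:
    """Grams of solute for a target concentration (mM) in a final volume (mL)."""
    moles = (conc_mM / 1000.0) * (volume_mL / 1000.0)
    return moles * MOLAR_MASS_G_PER_MOL[compound]


# (solution, compound, target mM, final volume in mL), as stated in REAGENT SETUP.
RECIPES = [
    ("Extraction buffer", "Tris base", 20, 1000),
    ("Extraction buffer", "KCl", 1.5, 1000),
    ("Extraction buffer", "DTT", 10, 1000),
    ("Extraction buffer", "PMSF", 1, 1000),
    ("50 mM NH4HCO3", "NH4HCO3", 50, 1000),
    ("Alkylation solution", "Iodoacetamide", 300, 1),
    ("Reduction solution", "DTT", 100, 1),
]

for solution, compound, conc_mM, volume_mL in RECIPES:
    grams = mass_needed_g(compound, conc_mM, volume_mL)
    print(f"{solution}: {compound} {grams:.4f} g")
```

Under these assumptions the Tris, DTT, PMSF, NH4HCO3 and iodoacetamide amounts agree with the recipes to within rounding. KCl computes to about 0.11 g for 1.5 mM, against the 0.1 g stated, so that figure appears to be rounded.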
Hydrolysation solution (5 % V/V TFA) For 10 mL, add 0.5 mL of TFA to 9.5 mL of
deionised water. Store at room temperature (20–24 °C) for up to 6 months.
Sample solution for nanoUPLC-MSE analysis (3 % V/V acetonitrile, 0.1% V/V FA)
For 10 mL, add 0.3 mL of acetonitrile and 0.01 mL of FA to 9.69 mL of deionised water.
Store at room temperature (20–24 °C) for up to 6 months.
MPDS ADH solution Add 1 mL of the sample solution for nanoUPLC-MSE analysis to one vial
of MPDS ADH. Make aliquots of 10 µL and store at -80 °C for up to 6 months.
Surfactant solution (0.2 % w/v) Add 0.5 mL of water to one vial of 1 mg of
RapiGest™ SF. Store at 4 °C for up to 3 months.
Mobile phase A (0.1% V/V FA) For 1 litre, add 1 mL of FA to 999 mL of deionised
water. Store at room temperature (20–24 °C) for up to 3 months.
Mobile phase B (0.1% V/V FA in acetonitrile) For 1 litre, add 1 mL of FA to 999 mL
of acetonitrile. Store at room temperature (20–24 °C) for up to 1 year.
GFP solution (200 fmol.µL-1) Stock solution: add 2000 µL of acetonitrile/water
2.5/7.5 with 0.1% FA to the GFP vial to give a solution of 32 pmol.µL-1. Store in the freezer. Take 625 µL
of the stock solution and fill to 100 mL with acetonitrile/water 2.5/7.5 with 0.1% FA,
giving a solution of 200 fmol.µL-1. Use within 3 months.
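The GFP working concentration follows from the usual dilution relation C1·V1 = C2·V2. The sketch below assumes the stock figure of 32 pmol is per microlitre; on that reading, 625 µL made up to 100 mL gives the 200 fmol.µL-1 named in the solution heading. The function name is illustrative.

```python
def diluted_concentration(stock_conc: float, aliquot_uL: float, final_uL: float) -> float:
    """C2 = C1 * V1 / V2, with both volumes in microlitres."""
    return stock_conc * aliquot_uL / final_uL


STOCK_FMOL_PER_UL = 32_000.0  # 32 pmol per µL, expressed in fmol per µL (assumed unit)
ALIQUOT_UL = 625.0            # volume of stock taken
FINAL_UL = 100_000.0          # 100 mL final volume

working_fmol_per_uL = diluted_concentration(STOCK_FMOL_PER_UL, ALIQUOT_UL, FINAL_UL)
print(f"GFP working solution: {working_fmol_per_uL:.0f} fmol/µL")
```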
PROCEDURE
Total soluble protein extraction from recombinant CTAG soybean seeds. TIMING
1-2 h for one sample
1| Using a coffee grinder, grind the soybean seeds into a fine powder. Using an
analytical balance, weigh out 100 mg of powder and store the remaining powder in a
vacuum bag at -80 °C for up to 1 year.
2| Place the weighed sample into a 2 mL capped centrifuge tube. Add 1 mL of
petroleum ether and slowly vortex the sample for 15 min. Discard the supernatant and
repeat the step twice (2X). Troubleshooting: Pour off the supernatant gently to avoid
powder losses.
3| Allow the petroleum ether to evaporate for 10 min. Add 1 mL of the extraction buffer
and slowly vortex the sample at room temperature for 10 min.
4| Centrifuge the sample for 5 min at 5000 r.min-1 at 4 °C. Transfer the
supernatant to a new centrifuge tube. Pause point: at this step, the supernatant can be
stored at -20 °C for one week.
Protein concentration TIMING 1-2 h
5| For each 200 µL of sample, add 800 µL of cold acetone to the centrifuge tube. Vortex
thoroughly and keep at -20 °C for 1 h, vortexing every 15 min.
6| Centrifuge the sample for 10 min at 13000 rpm. Discard the supernatant and allow the
pellet to dry at room temperature for 30 min. Critical Step Do not overdry the pellet or it
may become unstable and partially insoluble.
7| Carefully dissolve the pellet with 500 μL of 50 mM NH4HCO3. Quantify it using the
Quant-iT™ Protein Assay Kit (Invitrogen) and dilute it with 50 mM NH4HCO3 to a
concentration of 1 µg.µL-1. At this point, the sample can be stored at -20 °C for one week.
Critical Step For quantification purposes, the fluorometer must be calibrated for the
correct protein dosage.
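The two volume calculations in steps 5 and 7 can be sketched as follows. This is a minimal illustration: the 4:1 acetone ratio and the 1 µg.µL-1 target come from the steps above, whereas the measured concentration in the example is a made-up reading, and the small volume consumed by the assay itself is ignored.

```python
ACETONE_TO_SAMPLE_RATIO = 4.0  # 800 µL of cold acetone per 200 µL of sample (step 5)


def acetone_volume_uL(sample_uL: float) -> float:
    """Volume of cold acetone to add for a given sample volume."""
    return ACETONE_TO_SAMPLE_RATIO * sample_uL


def buffer_to_add_uL(measured_ug_per_uL: float,
                     current_volume_uL: float,
                     target_ug_per_uL: float = 1.0) -> float:
    """Volume of 50 mM NH4HCO3 needed to dilute the sample to the target concentration."""
    if measured_ug_per_uL < target_ug_per_uL:
        raise ValueError("sample is already more dilute than the target")
    final_volume_uL = measured_ug_per_uL * current_volume_uL / target_ug_per_uL
    return final_volume_uL - current_volume_uL


# Hypothetical fluorometer reading of 2.4 µg/µL for a pellet dissolved in 500 µL.
print(f"acetone for 200 µL of extract: {acetone_volume_uL(200.0):.0f} µL")
print(f"buffer to add: {buffer_to_add_uL(2.4, 500.0):.0f} µL")
```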
Sample preparation for nanoUPLC-MSE acquisition TIMING 2 d
8| Place 50 μL of the 1 µg.µL-1 sample in a capped microcentrifuge tube.
9| Add 10 μL of 50 mM NH4HCO3.
10| Add 25 μL of the surfactant solution and vortex. Critical step The surfactant solution
must be applied only if the sample is placed in the ammonium bicarbonate buffer at an
alkaline pH. At an acidic pH, the surfactant will be degraded and the digestion
kinetics will be reduced, resulting in more missed cleavages
and larger peptide fragments. ?Troubleshooting
11| Place the tube in a dry bath set at 80 °C. Heat for 15 min. Critical step: Ensure the
dry bath is set to the correct temperature before heating the sample.
12| Remove the tube from the dry bath. Perform a short spin; then add 2.5 μL of the
reduction solution and vortex slightly.
13| Place the tube in a dry bath set at 60 °C and heat for 30 minutes. Critical step:
Ensure the dry bath is set to the correct temperature before heating the sample.
14| Remove from the dry bath, allow the tube to cool to room temperature and then
centrifuge it. Add 2.5 μL of the alkylation solution and vortex slightly.
15| Place the sample in the dark at room temperature and allow 30 minutes of reaction
time.
16| Add 10 μL of the digestion solution and vortex slightly. Digest the sample at 37°C
in a dry bath overnight. This produces a 1:100 wt:wt ratio of enzyme:protein.
17| Following digestion, to precipitate the surfactant, add 10 μL of hydrolysation
solution and vortex. Then centrifuge the samples at 14000 rpm at 6 °C for 30 minutes.
Transfer the supernatant to a Waters Total Recovery vial. Critical step The surfactant
must be fully precipitated to ensure proper dissolution of the protein prior to injection in
the chromatograph and to avoid contamination during MSE acquisition. Ensure the
centrifugation step is well controlled to avoid the injection of precipitation residues into
the nanoUPLC system. Troubleshooting.
18| Add 5 μL of the ADH solution and then add 85 μL of the sample solution for
nanoUPLC-MSE analysis. The final
concentration of the protein is 250 ng.μL-1 and that of ADH is 25 fmol.μL-1. The final
volume is 200 μL. Store at -80 °C for up to 6 months. Critical step: Accurate pipetting
of these solutions is crucial for good protein quantification by PLGS, because the
counts/fmol ratio between the summed ion intensity
and the concentration of the standard protein (the manual response factor) must be
preserved. It is desirable to
use a manual response factor instead of the nominal concentration of the internal
standard protein for the best quantification analysis.
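The volumes added in steps 8 to 18 can be tallied to confirm the stated final volume and concentrations. The sketch below is a bookkeeping check only: it assumes no volume is lost during heating, precipitation or transfer, and the ADH stock concentration is back-calculated from the stated final value rather than taken from a supplier specification.

```python
# Volumes added in steps 8-18, in microlitres.
ADDITIONS_UL = {
    "TSP sample (1 µg/µL)": 50.0,
    "50 mM NH4HCO3": 10.0,
    "surfactant solution": 25.0,
    "reduction solution": 2.5,
    "alkylation solution": 2.5,
    "digestion solution": 10.0,
    "hydrolysation solution": 10.0,
    "ADH solution": 5.0,
    "sample solution": 85.0,
}

total_uL = sum(ADDITIONS_UL.values())

# Protein: 50 µL at 1 µg/µL gives 50 µg, expressed as ng per µL of final volume.
protein_ug = ADDITIONS_UL["TSP sample (1 µg/µL)"] * 1.0
protein_ng_per_uL = protein_ug * 1000.0 / total_uL

# Trypsin: one 20 µg vial in 400 µL, of which 10 µL is added (step 16).
trypsin_ug = 20.0 / 400.0 * ADDITIONS_UL["digestion solution"]
protein_to_trypsin = protein_ug / trypsin_ug

# ADH stock concentration implied by the stated final 25 fmol/µL.
adh_stock_fmol_per_uL = 25.0 * total_uL / ADDITIONS_UL["ADH solution"]

print(f"final volume: {total_uL:.0f} µL")
print(f"protein: {protein_ng_per_uL:.0f} ng/µL")
print(f"protein:trypsin = {protein_to_trypsin:.0f}:1")
print(f"implied ADH stock: {adh_stock_fmol_per_uL:.0f} fmol/µL")
```

The tally reproduces the 200 µL final volume, the 250 ng.µL-1 protein concentration and the 1:100 enzyme:protein ratio given in the procedure.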
NanoUPLC-MSE acquisition TIMING 1 d
19| The nanoACQUITY™ UPLC™ system was configured as follows: the samples
were initially transferred with an aqueous 0.1% formic acid solution to the trap column
at a flow rate of 15 μL.min-1 for 1 min with a 5 μL loop.
CRITICAL STEP: To acquire data with the system, some considerations must be made
upon installation and engineering the setup. The initial instrument setup is critical. For
this purpose and for system qualification, 1 μg of the E. coli digestion standard was
acquired during installation. The E. coli sample was spiked with rabbit phosphorylase B
for a final concentration of 40 fmol.μL-1 on the column. The expected dynamic range
was measured and the specifications were applied to reach a minimum of 2-3 orders of
magnitude for the Synapt HDMS first generation mass spectrometer. After system
qualification completion, the samples were left running in the MSE positive mode with a
nano-electrospray source.
20| The peptides were separated with a gradient of 5–40 % mobile phase B over 90 min
at a flow rate of 600 nL.min-1, followed by a 10 min rinse with 85% of mobile phase B.
21| The column was re-equilibrated at the initial conditions for 10 min. The column
temperature was maintained at 35 °C. The lock mass was delivered from the auxiliary
pump of the nanoACQUITY system with a constant flow rate of 150 nL.min-1 at a
concentration of 200 fmol.µL-1 of GFP solution (Sigma-Aldrich, USA) to the reference
sprayer of the mass spectrometer NanoLockSpray™ source. ?Troubleshooting: The
column diameter is critical to achieve the best resolving power and increase the peak
capacity. For optimum loading for 75 μm inner diameter columns, consider using 250 to
500 ng of protein digest and 200 to 400 nL.min-1; for 100 μm columns, use 440 to 880
ng of digest and 400 to 600 nL.min-1; for 150 μm columns, use 1 to 2 μg of digest and
800 nL.min-1 to 1.2 µL.min-1; and for 300 μm columns, use 4 to 8 µg and 4 to 5 µL.min-1
with an analytical ESI source. If the analysis is with a common 2D SCX or 2D with
dilution, the amount of sample injected can be multiplied by the fraction number to keep
the column capacity at a maximum.
22| All samples were analysed in triplicate using a Synapt HDMS™ first generation
mass spectrometer. For all measurements, the mass spectrometer operated in the “V-mode”
of analysis with a typical resolving power of at least 10000 full-width half-maximum
(FWHM) and a sampling rate of 10 to 20 points across the chromatography
peak to provide good quantification and peak representation in the chromatogram.
23| All analyses were performed using the positive nano-electrospray ion mode
(nanoESI+).
24| The time-of-flight analyser of the mass spectrometer was externally calibrated with
GFP b+ and y+ ions from m/z 50 to 1990 with the data post acquisition lock mass
corrected using the GFP monoisotopic precursor ion of [M + 2H]2+ = 785.8426.
25| The reference sprayer was sampled every 30 s.
26| The nanoUPLC-MSE data were collected in an alternating low energy and elevated
energy mode of acquisition. The continuum spectra acquisition time in each mode was
1.5 s of scan time with at least 10 points per peak on the chromatogram.
27| In the low energy MS mode, the data were collected at a constant collision energy of
3 eV.
28| In the elevated energy MS mode, the collision energy was increased from 12 to 45
eV during each 1.5 s spectrum.
29| The radiofrequency applied to the quadrupole mass analyser was adjusted such that
ions from m/z 50 to 2000 were efficiently transmitted.
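The digest loads quoted in the troubleshooting note of step 21 are consistent with scaling by the column cross-sectional area, i.e. by the square of the inner diameter. The sketch below takes the 75 μm column as the reference; the scaling rule is an interpretation of the quoted figures, not a rule stated in the protocol, and the quoted flow rates follow the same trend only loosely.

```python
REFERENCE_ID_UM = 75.0
REFERENCE_LOAD_NG = (250.0, 500.0)  # digest load range quoted for a 75 µm column


def scale_by_cross_section(value: float, ref_id_um: float, new_id_um: float) -> float:
    """Scale a column loading value by the ratio of cross-sectional areas."""
    return value * (new_id_um / ref_id_um) ** 2


for inner_diameter_um in (100.0, 150.0, 300.0):
    low_ng, high_ng = (
        scale_by_cross_section(v, REFERENCE_ID_UM, inner_diameter_um)
        for v in REFERENCE_LOAD_NG
    )
    print(f"{inner_diameter_um:.0f} µm column: {low_ng:.0f} to {high_ng:.0f} ng of digest")
```

For the 100, 150 and 300 μm columns this gives roughly 444 to 889 ng, 1 to 2 μg and 4 to 8 μg, close to the loads quoted in step 21.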
Data Processing and Protein Identification TIMING 1 d
30| The MS data obtained from the nanoUPLC-MSE were processed and searched using
the ProteinLynx Global SERVER (PLGS) version 2.4, configured as follows. Sequences
from Glycine max were downloaded from UniProt54. In PLGS, a new databank named
“GLYCINE” was created, and the file containing amino acid sequences was appended.
The protein identifications were obtained with the embedded ion accounting algorithm
of the software and by searching the database with the MassPREP™ Protein Digestion
Standards (MPDS) included as UniProtKB/Swiss-Prot sequences (Phosphorylase - P00489 - PHS2_RABIT,
Bovine Hemoglobin - P02070 - HBB_BOVIN, ADH - P00330 - ADH1_YEAST,
BSA - P02769 - ALBU_BOVIN) and the CTAG protein (P78358)
appended to the database. CRITICAL STEP: The database must be correctly loaded into
the PLGS. The identifications and quantitative data packaging were generated using
dedicated algorithms42, 55 and searching against a species-specific database56. Refer to
the software manual on how to proceed with the input method into the databank
administration tool. ?Troubleshooting.
31| In PLGS, a new workflow was created for Electrospray-MSE analysis by setting the
data bank to “GLYCINE” and setting the peptide and fragment tolerance to automatic.
The minimum fragment ion matches per peptide was set to 3. The minimum fragment
ion matches per protein was set to 7. The minimum peptide matches per protein was set
to 1. The maximum protein mass was set to 600 kDa. Trypsin was chosen as the
primary digest reagent, allowing 1 missed cleavage. Carbamidomethyl-C and the
oxidation of M were set to fixed and variable modification, respectively. N-linked and
O-linked options were set as variable glycosylation modification, the calibration protein
was set to P00330 (corresponding to ADH sequence in database) and the calibration
protein concentration was set to 25 fmol.µL-1. CRITICAL STEP: These configurations
will determine the protein identification processes and may vary from sample to sample.
Specificity and selectivity depend on these thresholds: the minimum fragment ion
matches per peptide was set to 3 but can be as low as 1; the minimum fragment ion
matches per protein was set to 7 but can be as low as 5; and the minimum peptide
matches per protein was set to 1. The maximum protein mass was set to 600 kDa; if an
EST database is used, this can be increased to at least 1000 kDa. For standard
concentration assignments, it is preferable to use the manual response factor to keep the
counts/fmol ratio within a minimum coefficient of variation (CV).
32| In PLGS, a new data preparation was created for Electrospray-MSE analysis by
setting the chromatographic peak width and MS TOF resolution in automatic mode. The
lock mass for charge 2 was set to m/z 785.8426 (corresponding to GFP mass), and the
lock mass windows were set to ±0.25 Da. The low and elevated energy thresholds were
set to 250.0 and 100.0 counts, respectively. The retention time windows were set to
automatic, and 1500 counts were applied to the intensity threshold. CRITICAL STEP:
Ensure the m/z value of GFP and the charge state set are correctly assigned to avoid
error in the PLGS processing. Check the instrument calibration prior to analysis. If the
interval window is more than 0.4 Da for GFP, calibrate the instrument.
?Troubleshooting
33| In PLGS, open a new project. Add 3 new original samples, named SOYCTAG L3,
SOYCTAG L37, and SoyCN, which correspond to lineages 3 and 37 of the recombinant
CTAG soybean and to the non-transgenic soybean sample to be analysed and compared,
respectively. If more samples need to be compared, add more original sample tags.
34| In PLGS, add a new microtitre plate named CTAG. For each sample, add the
original raw data from the acquisition, the data preparation file and the workflow file to
a vial position. After the files are combined, raw data processing is possible. Tables 2
and 3 indicate a typical result. CRITICAL STEP: Ion detection, clustering, and
normalisation were performed in PLGS with ExpressionE software license installed
(Waters, Manchester, UK). The intensity measurements are typically adjusted, i.e.,
deisotoped and charge state-reduced EMRTs that replicate throughout the complete
experiment for analysis at the EMRT cluster level. The components are typically
clustered together with a 10 ppm mass precision and a 0.25-min time tolerance or
sufficient value to achieve at least 15 points per peak. The alignment of elevated energy
ions with low energy precursor peptide ions is conducted with an approximate precision
of 0.05 min. To analyse the protein identification and quantification level, the observed
intensity measurements are normalised to the intensity measurement of the identified
peptides of the digested internal standard, as described elsewhere56.
35| For expression analysis, add a new “expression analysis” in PLGS, placing the
samples created in step 33 into separate groups. In the quantification analysis, use the
normalisation in proteins, selecting ADH protein in the table. The results are shown in
Fig. 4.
Troubleshooting advice can be found in Table 3.
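The effect of the identification thresholds in step 31 can be illustrated with a small filter. This is a hypothetical sketch of one plausible reading of those settings; it is not PLGS code or its API, and the class names, the acceptance logic and the fragment counts in the example are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkflowThresholds:
    """Threshold values from step 31 (names are illustrative, not PLGS fields)."""
    min_fragments_per_peptide: int = 3
    min_fragments_per_protein: int = 7
    min_peptides_per_protein: int = 1
    max_protein_mass_kda: float = 600.0


@dataclass
class ProteinHit:
    accession: str
    mass_kda: float
    fragments_per_peptide: List[int]  # matched fragment ions for each candidate peptide


def passes(hit: ProteinHit, t: WorkflowThresholds = WorkflowThresholds()) -> bool:
    """Keep a hit only if enough peptides and fragment ions support it."""
    supported = [n for n in hit.fragments_per_peptide if n >= t.min_fragments_per_peptide]
    return (
        len(supported) >= t.min_peptides_per_protein
        and sum(supported) >= t.min_fragments_per_protein
        and hit.mass_kda <= t.max_protein_mass_kda
    )


# Made-up fragment counts, for illustration only.
ctag_like = ProteinHit("P78358", 18.0, [4, 3, 5, 2, 6])
weak_hit = ProteinHit("EXAMPLE_WEAK", 52.0, [3, 2])

print(passes(ctag_like), passes(weak_hit))
```

Relaxing the thresholds, for example to 1 fragment per peptide and 5 per protein as mentioned in the critical step, would admit more hits at the cost of specificity.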
ANTICIPATED RESULTS
This is an easy-to-follow protocol to determine if a target recombinant protein has been
expressed in any expression system, especially in a situation where a small sample must
be used or no antibody is available to run blotting detection methods. We successfully
detected the human growth hormone and coagulation factor IX proteins expressed in
transgenic soybean lines9, 10 and present the preliminary results on the CTAG
recombinant molecule expressed in the same system. Two lineages, SOYCTAG L3 and
SOYCTAG L37, and a SOYBEAN Negative from the BR-16 cultivar were used as
samples in this protocol. The amino acid sequence of CTAG can be observed in Fig. 2.
Fig. 1 shows a diagram of the workflow. The sample preparation from TSP to the
nanoUPLC procedure is critical for a successful identification. The use of high purity
water and reagents is recommended due to the sensitivity of the technique. The low
peptide dilution provided by nanoUPLC permits each compound to enter the mass
spectrometer almost individually, allowing the production of MS and MS/MS spectra
from almost every peptide in the sample. When nanoACQUITY is associated with MSE
acquisitions43, as the ion current is continuous and both MS and MS/MS are acquired in
parallel, the chromatography peaks are sharpened as more points per peak are obtained,
and there is high reproducibility between different injections, usually in the full loop
method with 2 μL or 5 μL sample injection loading. Fig. 3 shows the resulting
nanoUPLC chromatogram, MSE spectra from the [M + 2H]2+ = 857.87 CTAG fragment and
the respective processed spectra by PLGS. Orthogonal separations57 with SCX
columns58, 59, or more recent approaches that use a first-dimension linear gradient with
fractions at different pH levels and high-resolution separations in both the first and
second dimensions52, are warranted by the complexity of the chromatogram in this
particular sample (Fig. 2A). To improve separation, this nanoUPLC system can be used
with 2D RPxRP nanocolumns with small particle sizes (1.7 μm for BEH or 1.8 μm
for HSS T3 capillary column technologies). These allow a high-resolution first-dimension
separation using organic mobile phase pulse fractions with 20 mM ammonium
formate at pH 10 on a 300 μm x 50 mm XBridge™ BEH 130 Å C18 5 μm column
(Waters, Milford, MA), and a second-dimension separation with a trap column followed
by an analytical column of 75 μm x 100 mm at a low pH of 2.6. Even so, five peptides
from CTAG (Table 2, Fig. 2) were detected with high selectivity and specificity. These
peptides showed no trace of post translational modification, but the possibility cannot be
discarded because another 6 CTAG peptides were not detected (Fig. 2). Additionally, a
proteomic profile can be processed with absolute quantitative values for each protein
(Table 1). In this example, the CTAG recombinant protein was detected and quantified
in nanograms based on the stoichiometric ion intensity values of the minimum three
prototypic peptides of ADH and the identified protein. A relation between the total
detected protein and the specific protein concentration can be applied, allowing
calculation of the percentage of the expressed protein in relation to TSP. The percentage
of each detected protein can be observed in Table 1. CTAG has an expression value of
0.1%, which is low compared to that of the other transgenic soybean seeds expressing
hGH9 (2.9%), but it has a similar value compared to factor IX expression (0.2%)10.
Other soybean proteins, such as β-conglycinin and glycinin, have expected values
mainly for storage proteins from soybean seeds60. Through this protocol, it is also
possible to track changes in protein expression by comparing two or more samples.
Fig. 4 shows pairwise comparisons among the protein expression lists of SOYCTAG L3,
SOYCTAG L37 and the SOYBEAN Negative sample. The expression levels of the two
transgenic lines can be compared to select the line with the higher recombinant
protein production, in this case SOYCTAG L37. Combined with the IdentityE and
ExpressionE software in PLGS (Waters, UK), this technique can also be used to detect
up- and down-regulation of native proteins, providing information on the side effects
of the introduction of transgenes at the proteomic level.
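The quantification and percentage steps described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the PLGS implementation: the function names, the top-three intensity rule and the inputs to the first helper are assumptions made for the example, whereas the CTAG and β-conglycinin values are taken from Table 1.

```python
def _top3(intensities):
    """Sum of the three most intense peptide signals."""
    return sum(sorted(intensities, reverse=True)[:3])


def hi3_amount_ng(protein_intensities, standard_intensities, standard_fmol, protein_mw_da):
    """Estimate a protein amount in ng from peptide intensities.

    Assumption: the molar amount scales with the ratio of the summed
    top-three intensities of the protein to those of the spiked standard (ADH).
    """
    fmol = standard_fmol * _top3(protein_intensities) / _top3(standard_intensities)
    return fmol * protein_mw_da * 1e-6  # fmol x g/mol gives fg; 1e-6 converts fg to ng


def percent_of_tsp(amount_ng, total_detected_ng):
    """Share of one protein in the total detected protein, in percent."""
    return 100.0 * amount_ng / total_detected_ng


# Table 1: beta-conglycinin (65.47 ng, 28.43671 %) implies the total detected amount.
total_ng = 65.47 / 28.43671 * 100.0
ctag_percent = percent_of_tsp(0.2635, total_ng)
print(f"total detected: {total_ng:.1f} ng, CTAG: {ctag_percent:.3f} % of TSP")
```

With the Table 1 values, the implied total is about 230 ng and the CTAG share is about 0.11%, in line with the tabulated percentage.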
References
1. Swartz, J.R. Advances in Escherichia coli production of therapeutic proteins. Curr. Opin. Biotechnol. 12, 195-201 (2001).
2. Chu, L. & Robinson, D.K. Industrial choices for protein production by large-scale cell culture. Curr. Opin. Biotechnol. 12, 180-187 (2001).
3. Tremblay, R., Wang, D., Jevnikar, A.M. & Ma, S. Tobacco, a highly efficient green bioreactor for production of therapeutic proteins. Biotechnol. Adv. 28, 214-221 (2010).
4. Daniell, H., Singh, N.D., Mason, H. & Streatfield, S.J. Plant-made vaccine antigens and biopharmaceuticals. Trends Plant Sci. 14, 669-679 (2009).
5. Boothe, J. et al. Seed-based expression systems for plant molecular farming. Plant Biotechnol. J. 8, 588-606 (2010).
6. Cunha, N.B.d. et al. Correct targeting of proinsulin in protein storage vacuoles of transgenic soybean seeds. Genet. Mol. Res. 9, 1163-1170 (2010).
7. Jolliffe, N.A., Craddock, C.P. & Frigerio, L. Pathways for protein transport to seed storage vacuoles. Biochem. Soc. Trans. 33, 1016-1018 (2005).
8. Ma, J.K.-C., Drake, P.M.W. & Christou, P. The production of recombinant pharmaceutical proteins in plants. Nat. Rev. Genet. 4, 794-805 (2003).
9. Cunha, N.B. et al. Expression of functional recombinant human growth hormone in transgenic soybean seeds. Transgenic Res. (2010).
10. Cunha, N.B. et al. Accumulation of functional recombinant human coagulation factor IX in transgenic soybean seeds. Transgenic Res. (2010).
11. Rech, E.L., Vianna, G.R. & Aragão, F.J.L. High-efficiency transformation by biolistics of soybean, common bean and cotton transgenic plants. Nat. Protoc. 3, 410-418 (2008).
12. Blas, A.L.D. & Cherwinski, H.M. Detection of antigens on nitrocellulose paper immunoblots with monoclonal antibodies. Anal. Biochem. 133, 214-219 (1983).
13. Perlmann, P. & Engvall, E. Enzyme-linked immunosorbent assay (ELISA). Quantitative assay of immunoglobulin G. Immunochemistry 8, 871-874 (1971).
14. O'Farrell, P.H. High resolution two-dimensional electrophoresis of proteins. J. Biol. Chem. 250, 4007-4021 (1975).
15. Shevchenko, A., Tomas, H., Havlis, J., Olsen, J.V. & Mann, M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, 2856-2860 (2006).
16. Weiss, W. & Görg, A. Two-dimensional electrophoresis for plant proteomics. Methods Mol. Biol. 355, 121-143 (2007).
17. Blackstock, W.P. & Weir, M.P. Proteomics: quantitative and physical mapping of cellular proteins. Trends Biotechnol. 17, 121-127 (1999).
18. Murad, A.M. et al. Screening of entomopathogenic Metarhizium anisopliae isolates and proteomic analysis of secretion synthesized in response to cowpea weevil (Callosobruchus maculatus) exoskeleton. Comp. Biochem. Physiol., C 142, 365-370 (2006).
19. Murad, A.M. et al. Proteomic analysis of Metarhizium anisopliae secretion in the presence of the insect pest Callosobruchus maculatus. Microbiology 154, 3766-3774 (2008).
20. Halligan, B.D. ProMoST: A tool for calculating the pI and molecular mass of phosphorylated and modified proteins on 2 dimensional gels. Methods Mol. Biol. 527, 283-298 (2009).
21. Henzel, W.J. et al. Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proc. Natl. Acad. Sci. U. S. A. 90, 5011-5015 (1993).
22. Wilson, N., Simpson, R. & Cooper-Liddell, C. Introductory glycosylation analysis using SDS-PAGE and peptide mass fingerprinting. Methods Mol. Biol. 534, 205-212 (2009).
23. Gevaert, K. et al. Exploring proteomes and analyzing protein processing by mass spectrometric identification of sorted N-terminal peptides. Nat. Biotechnol. 21, 566-569 (2003).
24. Hunter, A.P. & Games, D.E. Chromatographic and mass spectrometric methods for the identification of phosphorylation sites in phosphoproteins. Rapid Commun. Mass Spectrom. 8, 559-570 (1994).
25. Wilkins, J.A., Xiang, R. & Horváth, C. Selective enrichment of low-abundance peptides in complex mixtures by elution-modified displacement chromatography and their identification by electrospray ionization mass spectrometry. Anal. Chem. 74, 3933-3941 (2002).
26. Husson, S.J. et al. Comparative peptidomics of Caenorhabditis elegans versus C. briggsae by LC–MALDI-TOF MS. Peptides 30, 449-457 (2009).
27. Guerrier, L. & Boschetti, E. Protocol for the purification of proteins from biological extracts for identification by mass spectrometry. Nat. Protoc. 2, 832-837 (2007).
28. Guerrier, L., Righetti, P.G. & Boschetti, E. Reduction of dynamic protein concentration range of biological extracts for the discovery of low-abundance proteins by means of hexapeptide ligand library. Nat. Protoc. 3, 883-890 (2008).
29. Deterding, L.J., Moseley, M.A., Tomer, K.B. & Jorgenson, J.W. Nanoscale separations combined with tandem mass spectrometry. J. Chromatogr. A 554, 73-82 (1991).
30. Shen, Y. et al. High-efficiency nanoscale liquid chromatography coupled on-line with mass spectrometry using nanoelectrospray ionization for proteomics. Anal. Chem. 74, 4235-4249 (2002).
31. Mirgorodskaya, E., Braeuer, C., Fucini, P., Lehrach, H. & Gobom, J. Nanoflow liquid chromatography coupled to matrix-assisted laser desorption/ionization mass spectrometry: Sample preparation, data analysis, and application to the analysis of complex peptide mixtures. Proteomics 5, 399-408 (2005).
32. Nouri, M.-Z. & Komatsu, S. Comparative analysis of soybean plasma membrane proteins under osmotic stress using gel-based and LC MS/MS-based proteomics approaches. Proteomics 10, 1930-1945 (2010).
33. Brumbarova, T., Matros, A., Mock, H.-P. & Bauer, P. A proteomic study showing differential regulation of stress, redox regulation and peroxidase proteins by iron supply and the transcription factor FER. Plant J. 54, 321-334 (2008).
34. Behrens, H.L., Chen, R. & Li, L. Combining microdialysis, NanoLC-MS, and MALDI-TOF/TOF to detect neuropeptides secreted in the crab, Cancer borealis. Anal. Chem. 80, 6949-6958 (2008).
35. Zybailov, B. et al. Sorting signals, N-terminal modifications and abundance of the chloroplast proteome. PLoS ONE 3, e1994 (2008).
36. Barnes, S. et al. High-resolution mass spectrometry analysis of protein oxidations and resultant loss of function. Biochem. Soc. Trans. 36, 1037-1044 (2008).
37. Bouché, J.-P. et al. NanoLC-MS/MS analysis provides new insights into the phosphorylation pattern of Cdc25B in vivo: full overlap with sites of phosphorylation by Chk1 and Cdk1/cycB kinases in vitro. J. Proteome Res. 7, 1264-1273 (2008).
38. Unwin, R.D., Griffiths, J.R. & Whetton, A.D. A sensitive mass spectrometric method for hypothesis-driven detection of peptide post-translational modifications: multiple reaction monitoring-initiated detection and sequencing (MIDAS). Nat. Protoc. 4, 870-877 (2009).
39. Mori, M. et al. Production of 18O-single labeled peptide fragments during trypsin digestion of proteins for quantitative proteomics using nanoLC-ESI-MS/MS. J. Proteome Res. 9, 3741-3749 (2010).
40. Yang, Y. et al. A comparison of nLC-ESI-MS/MS and nLC-MALDI-MS/MS for GeLC-based protein identification and iTRAQ-based shotgun quantitative proteomics. J. Biomol. Tech. 18, 226-237 (2007).
41. Levin, Y. et al. Real-time evaluation of experimental variation in large-scale LC–MS/MS-based quantitative proteomics of complex samples. J. Chromatogr. B 877, 1299-1305 (2009).
42. Li, G.-Z. et al. Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures. Proteomics 9, 1696-1719 (2009).
43. Geromanos, S.J. et al. The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics 9, 1683-1695 (2009).
44. Xu, D. et al. Novel MMP-9 Substrates in Cancer Cells Revealed by a Label-free Quantitative Proteomics Approach. Mol. Cell Proteomics 7, 2215-2228 (2008).
45. Cheng, F.-y., Blackburn, K., Lin, Y.-m., Goshe, M.B. & Williamson, J.D. Absolute protein quantification by LC/MSE for global analysis of salicylic acid-induced plant protein secretion responses. J. Proteome Res. 8, 82-93 (2009).
46. Gnjatic, S. et al. NY-ESO-1: Review of an Immunogenic Tumor Antigen. Adv. Cancer Res. 95, 1-30 (2006).
47. Chen, Y. et al. A testicular antigen aberrantly expressed in human cancers detected by autologous antibody screening. Proc. Natl. Acad. Sci. U. S. A. 94, 1914-1918 (1997).
48. Kurashige, T. et al. NY-ESO-1 expression and immunogenicity associated with transitional cell carcinoma: correlation with tumor grade. Cancer Res. 61, 4671-4674 (2001).
49. Murphy, R. et al. Recombinant NY-ESO-1 cancer antigen: production and purification under cGMP conditions. Prep. Biochem. Biotechnol. 35, 119-134 (2005).
50. Liu, H. et al. Effects of column length, particle size, gradient length and flow rate on peak capacity of nano-scale liquid chromatography for peptide separations. J. Chromatogr. A 1147, 30-36 (2007).
51. Liu, H., Finch, J.W., Luongo, J.A., Li, G.-Z. & Gebler, J.C. Development of an online two-dimensional nano-scale liquid chromatography/mass spectrometry method for improved chromatographic performance and hydrophobic peptide recovery. J. Chromatogr. A 1135, 43-51 (2006).
52. Gilar, M., Olivova, P., Daly, A.E. & Gebler, J.C. Two-dimensional separation of peptides using RP-RP-HPLC system with different pH in first and second separation dimensions. J. Sep. Sci. 28, 1694-1703 (2005).
53. Yu, Y.-Q., Gilar, M., Lee, P.J., Bouvier, E.S.P. & Gebler, J.C. Enzyme-friendly, mass spectrometry-compatible surfactant for in-solution enzymatic digestion of proteins. Anal. Chem. 75, 6023-6028 (2003).
54. Consortium, T.U. The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res. 38, D142-D148 (2010).
55. Silva, J.C. et al. Quantitative proteomic analysis by accurate mass retention time pairs. Anal. Chem. 77, 2187-2200 (2005).
56. Silva, J.C., Gorenstein, M.V., Li, G.-Z., Vissers, J.P.C. & Geromanos, S.J. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol. Cell Proteomics 5, 144-156 (2005).
57. Gilar, M., Olivova, P., Daly, A.E. & Gebler, J.C. Orthogonality of separation in two-dimensional liquid chromatography. Anal. Chem. 77, 6426-6434 (2005).
58. Millea, K.M. et al. Evaluation of multidimensional (ion-exchange/reversed-phase) protein separations using linear and step gradients in the first dimension. J. Chromatogr. A 1079, 287-298 (2005).
59. Gilar, M. et al. Comparison of 1-D and 2-D LC MS/MS methods for proteomic analysis of human serum. Electrophoresis 30, 1157-1167 (2009).
60. Li, C. & Zhang, Y.-M. Molecular evolution of glycinin and β-conglycinin gene families in soybean (Glycine max L. Merr.). Heredity doi 10.1038/hdy.2010.97 (2010).
Acknowledgements
We are grateful to G. Ritter at Ludwig Cancer Research Institute (New York Branch)
for providing genes and antibodies. We acknowledge support from C. Bloch at the Mass
Spectrometry Laboratory-EMBRAPA. We acknowledge discussions with G. Ritter and
C. Bloch and thank J. Taquita for technical help. This work was supported by Brazilian
Agricultural Research Corporation, National Council for Scientific and Technological
Development and Fundacao de Apoio a Pesquisa-DF.
Table 1 | List of identified proteins by PLGS in the CTAG soybean transgenic line.
Entry | Description | MW (Da) | pI (pH) | PLGS Score | Amount (ng) | % of TSP
P78358 | Cancer testis antigen 1 | 17981 | 8.4739 | 2886.386 | 0.2635 | 0.11445
O22120 | α-subunit of β-conglycinin Fragment OS Glycine max PE 2 SV 2 | 63126 | 4.7254 | 51090.21 | 65.47 | 28.43671
C6T488 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 24103 | 5.1341 | 49332.05 | 0 | 0
P04776 | Glycinin G1 OS Glycine max GN GY1 PE 1 SV 2 | 55671 | 5.8257 | 34638.61 | 28.6832 | 12.45847
P19594 | 2S albumin OS Glycine max PE 1 SV 2 | 18447 | 5.0153 | 26866.75 | 6.9832 | 3.033133
Q549Z4 | Proglycinin A2B1 OS Glycine max PE 2 SV 1 | 54356 | 5.2983 | 26163.09 | 9.1288 | 3.965068
P04405 | Glycinin G2 OS Glycine max GN Gy2 PE 1 SV 2 | 54356 | 5.2983 | 26155.1 | 16.8381 | 7.31358
C6TKH0 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 31640 | 6.4124 | 25943.07 | 4.4002 | 1.911214
B3TDK4 | Lipoxygenase OS Glycine max PE 3 SV 1 | 94352 | 5.8755 | 23234.38 | 11.2248 | 4.87546
P08170 | Seed lipoxygenase 1 OS Glycine max GN LOX1 1 PE 1 SV 2 | 94310 | 5.9301 | 22866.77 | 0 | 0
P01063 | Bowman Birk type proteinase inhibitor C II OS Glycine max PE 1 SV 2 | 9194 | 4.3797 | 19673.12 | 2.7779 | 1.206573
P01064 | Bowman Birk type proteinase inhibitor D II OS Glycine max PE 1 SV 2 | 9460 | 4.6657 | 18789.71 | 1.434 | 0.622854
P24337 | Hydrophobic seed protein OS Glycine max PE 1 SV 1 | 8353 | 6.0467 | 17254.79 | 0.5404 | 0.234721
Q39805 | Dehydrin-like protein OS Glycine max PE 2 SV 1 | 23703 | 6.084 | 16428.98 | 5.4437 | 2.364455
Q7GC77 | Glycinin A3B4 subunit OS Glycine max PE 1 SV 1 | 58151 | 5.4199 | 14016.68 | 0.3179 | 0.138079
Q852U4 | Glycinin A1bB2 784 OS Glycine max PE 2 SV 1 | 54264 | 5.9489 | 12395.54 | 0.6519 | 0.283151
Q852U5 | Glycinin A1bB2 445 OS Glycine max PE 2 SV 1 | 54183 | 5.7768 | 12393.59 | 0.4914 | 0.213438
P05046 | Lectin OS Glycine max GN LE1 PE 1 SV 1 | 30908 | 5.5955 | 11981.52 | 8.6499 | 3.757059
C6T9Z5 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 42796 | 6.2935 | 11699.31 | 1.0956 | 0.475871
C6TDF5 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 41854 | 6.9985 | 11311.33 | 0.5088 | 0.220996
Q9SEK9 | Seed maturation protein PM25 OS Glycine max GN PM25 PE 2 SV 1 | 25713 | 4.7899 | 9964.625 | 1.0382 | 0.450939
Q9SEK8 | Seed maturation protein PM26 OS Glycine max GN PM26 PE 2 SV 1 | 26087 | 4.63 | 9770.589 | 0.9765 | 0.42414
Q9XET1 | Seed maturation protein PM31 OS Glycine max GN PM31 PE 2 SV 1 | 17735 | 6.104 | 9168.433 | 1.5019 | 0.652346
Q9SEL0 | Seed maturation protein PM24 OS Glycine max GN PM24 PE 2 SV 1 | 26824 | 4.9752 | 8024.353 | 0.8781 | 0.3814
Q9XER5 | Seed maturation protein PM22 OS Glycine max GN PM22 PE 2 SV 1 | 16677 | 4.9629 | 7963.376 | 0.6137 | 0.266559
Q9LLQ6 | Seed maturation protein PM34 OS Glycine max GN PM34 PE 2 SV 1 | 31746 | 6.6812 | 7863.564 | 0.43 | 0.186769
C6T1Q7 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17812 | 5.9577 | 7729.998 | 1.3127 | 0.570167
C6T588 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 16750 | 4.5317 | 7076.108 | 0.6071 | 0.263692
Q9AVK8 | Allergen Gly m Bd 28K Fragment OS Glycine max PE 2 SV 1 | 52608 | 5.6576 | 5503.088 | 2.8615 | 1.242884
Q2I0H4 | Glyceraldehyde 3 phosphate dehydrogenase OS Glycine max GN GAPC1 PE 2 SV 1 | 36741 | 6.8421 | 5307.646 | 1.4628 | 0.635363
Q9XET0 | Putative uncharacterised protein OS Glycine max GN PM30 PE 2 SV 1 | 15088 | 9.4202 | 5145.053 | 0.761 | 0.330538
C6TBB3 | Putative uncharacterised protein OS Glycine max PE 4 SV 1 | 12337 | 5.3837 | 4899.041 | 0.1225 | 0.053208
P93165 | Em protein OS Glycine max PE 4 SV 1 | 11484 | 5.3518 | 4895.396 | 0.0814 | 0.035356
Q04672 | Sucrose binding protein OS Glycine max GN SBP PE 1 SV 1 | 60484 | 6.4228 | 4608.476 | 3.3696 | 1.463576
C6SVM2 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17367 | 9.468 | 4510.81 | 0.5253 | 0.228163
Q07CZ3 | Glyceraldehyde 3 dehydrogenase C subunit OS Glycine max PE 2 SV 1 | 36701 | 6.8421 | 4077.103 | 0 | 0
C6SWV3 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 27618 | 5.6695 | 3894.106 | 0.7139 | 0.31008
C6TB70 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 24404 | 6.5098 | 3485.424 | 0.7356 | 0.319506
Q9XES8 | Seed maturation protein PM28 OS Glycine max GN PM28 PE 4 SV 1 | 9506 | 4.4641 | 3193.133 | 0.1892 | 0.082178
C6T0L2 | Putative uncharacterised protein OS Glycine max PE 4 SV 1 | 11134 | 6.3754 | 3078.308 | 1.1507 | 0.499803
Q38IW8 | Triosephosphate isomerase OS Glycine max PE 2 SV 1 | 27187 | 5.8176 | 2909.143 | 0.2015 | 0.087521
Q9SWB2 | Seed maturation protein PM41 OS Glycine max GN PM41 PE 4 SV 1 | 8172 | 4.6642 | 2896.554 | 0.2108 | 0.09156
Q42795 | β-amylase OS Glycine max PE 1 SV 1 | 56036 | 5.1887 | 2892.254 | 2.0148 | 0.875123
Q39871 | Late embryogenesis abundant protein OS Glycine max GN MP2 PE 2 SV 1 | 50613 | 6.2924 | 2760.776 | 4.4162 | 1.918164
C6T0B5 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 13998 | 5.697 | 2535.701 | 0.4396 | 0.190939
C6SVR5 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 23888 | 5.636 | 2489.419 | 0.4216 | 0.183121
P00330 | ALCOHOL DEHYDROGENASE I EC 1 1 1 1 | 36668 | 6.2734 | 2351.86 | 0.9173 | 0.398427
C6SZ11 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 27031 | 6.4528 | 2284.277 | 0.5469 | 0.237544
O64458 | Gly m Bd 30K allergen OS Glycine max GN P34 PE 2 SV 1 | 42730 | 5.5616 | 2154.713 | 3.4426 | 1.495283
C6TD82 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 31058 | 7.5082 | 2149.048 | 0.0461 | 0.020023
C6TCF1 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 27781 | 5.0962 | 2055.682 | 0.3383 | 0.14694
C6TB67 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 22971 | 7.7157 | 1790.652 | 0.3387 | 0.147113
C6EVF9 | Elongation factor 1-α OS Glycine max GN EF 1A PE 2 SV 1 | 49365 | 9.2369 | 1653.604 | 1.3382 | 0.581243
C6T072 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17442 | 5.2698 | 1651.261 | 0.3113 | 0.135212
C6SWE0 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17355 | 5.2698 | 1615.874 | 0.1385 | 0.060157
P26413 | Heat shock 70 kDa protein OS Glycine max GN HSP70 PE 3 SV 1 | 70835 | 5.1815 | 1481.25 | 1.7718 | 0.769576
Q6RIB6 | Malate dehydrogenase OS Glycine max PE 2 SV 1 | 35504 | 6.3424 | 1466.792 | 0.4538 | 0.197107
C6TK76 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 41507 | 5.1583 | 1249.91 | 0.1395 | 0.060591
C6TGM9 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 22317 | 5.9214 | 1194.818 | 0.1194 | 0.051861
A1KR24 | Dehydrin OS Glycine max GN LEA 2 D11 PE 3 SV 1 | 25369 | 6.1198 | 1148.557 | 0.8074 | 0.350692
C6T920 | Phosphoglycerate kinase Fragment OS Glycine max PE 2 SV 1 | 25296 | 9.7126 | 1114.197 | 0.2452 | 0.106502
Q84V19 | Sucrose binding protein 2 OS Glycine max GN SBP2 PE 2 SV 1 | 55740 | 6.1009 | 1069.194 | 0.3031 | 0.131651
Q9SP11 | Sucrose binding protein homolog S 64 OS Glycine max GN SBP PE 2 SV 1 | 55799 | 6.316 | 1005.86 | 0.2359 | 0.102462
Q96450 | 14 3 3-like protein A OS Glycine max GN GF14A PE 2 SV 1 | 29030 | 4.4978 | 910.0024 | 0.4026 | 0.174868
C6T9C2 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 34556 | 5.8074 | 842.0455 | 0.8502 | 0.369282
Q71EW8 | Methionine synthase OS Glycine max PE 2 SV 1 | 84229 | 5.8874 | 833.6119 | 1.3438 | 0.583676
C6K8D1 | Seed biotinylated protein 68 kDa isoform OS Glycine max PE 2 SV 1 | 67906 | 6.1461 | 781.9993 | 6.5867 | 2.860914
C6SZX7 | Glutathione peroxidase OS Glycine max PE 2 SV 1 | 18491 | 6.9435 | 737.6782 | 0.1674 | 0.07271
C6T1V2 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17729 | 6.3926 | 727.8019 | 0.189 | 0.082092
C6TNU2 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 47497 | 5.6664 | 684.3593 | 0.0516 | 0.022412
P27066 | Ribulose bisphosphate carboxylase large chain OS Glycine max GN rbcL PE 1 SV 3 | 52576 | 5.976 | 675.6335 | 0.5484 | 0.238196
C6TB98 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 33906 | 5.5009 | 664.4331 | 0.3868 | 0.168005
C6T8D8 | Fructose bisphosphate aldolase Fragment OS Glycine max PE 2 SV 1 | 28937 | 7.1175 | 633.6357 | 0.3172 | 0.137775
C6T4R9 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17656 | 10.1526 | 584.8538 | 0.3291 | 0.142944
C6SZN7 | Putative uncharacterised protein OS Glycine max PE 4 SV 1 | 12980 | 5.1436 | 576.5699 | 0.1605 | 0.069713
C6TLT3 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 29708 | 10.2354 | 557.4932 | 0.0924 | 0.040134
C6TMG1 | Fructose bisphosphate aldolase OS Glycine max PE 2 SV 1 | 38315 | 7.3405 | 534.5306 | 0.3023 | 0.131303
C6T049 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17988 | 5.3859 | 529.731 | 0.1186 | 0.051514
Q6RIB7 | Enolase OS Glycine max PE 2 SV 1 | 47689 | 5.1445 | 518.4756 | 0.5378 | 0.233592
C6T4Z6 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 15883 | 10.5421 | 502.4344 | 0.0449 | 0.019502
C6SVT0 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 18011 | 6.9304 | 460.7158 | 0.0918 | 0.039873
Q39839 | Nucleoside diphosphate kinase 1 OS Glycine max PE 2 SV 1 | 16432 | 5.8828 | 458.5279 | 0 | 0
C6SYU0 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 18273 | 8.6237 | 422.0796 | 0.1511 | 0.06563
C6SZN6 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 17935 | 5.5091 | 386.4056 | 0.0485 | 0.021066
C6TG05 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 40355 | 6.5206 | 381.6409 | 0.322 | 0.13986
C6K8D0 | Trypsin inhibitor 26 kDa isoform OS Glycine max PE 2 SV 1 | 25930 | 8.4036 | 377.5146 | 0.2736 | 0.118837
C6T1R3 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 25133 | 5.9641 | 362.5658 | 0.1876 | 0.081484
Q9SPB8 | Malate dehydrogenase OS Glycine max GN Mdh1 PE 3 SV 1 | 36119 | 8.2277 | 330.7828 | 0.2051 | 0.089085
Q0GJJ9 | ACP thioesterase protein Fragment OS Glycine max GN FATB1b PE 4 SV 1 | 22492 | 5.4897 | 318.6968 | 0.8157 | 0.354297
Q9SWB4 | Poly ADP ribose polymerase 3 OS Glycine max GN PARP3 PE 2 SV 1 | 91630 | 5.2626 | 316.8825 | 0.8352 | 0.362767
C6TNI8 | Putative uncharacterised protein Fragment OS Glycine max PE 2 SV 1 | 20580 | 5.5631 | 313.7788 | 0.3245 | 0.140946
P29530 | P24 oleosin isoform A OS Glycine max PE 2 SV 2 | 23487 | 9.0505 | 298.6236 | 0.3017 | 0.131043
C6SXU0 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 27721 | 6.8196 | 295.3087 | 0.2351 | 0.102115
C6SW79 | 40S ribosomal protein S12 OS Glycine max PE 2 SV 1 | 14788 | 5.176 | 249.5643 | 0.0961 | 0.041741
P28551 | Tubulin β-chain Fragment OS Glycine max GN TUBB PE 2 SV 2 | 45721 | 5.553 | 246.5856 | 0.0953 | 0.041393
C6T7U2 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 51462 | 5.2559 | 242.4084 | 0.5776 | 0.250879
Q39801 | 51 kDa seed maturation protein OS Glycine max PE 2 SV 1 | 50951 | 6.7427 | 239.2772 | 0.2423 | 0.105242
C6SVF1 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 15968 | 6.1593 | 237.9553 | 0.0405 | 0.017591
C6TGA6 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 34167 | 4.8087 | 227.2209 | 0.0914 | 0.039699
C6SXS9 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 23473 | 5.7777 | 219.1115 | 0.1407 | 0.061113
Q8RVH5 | Basic 7S globulin 2 OS Glycine max PE 1 SV 1 | 47174 | 8.174 | 205.5323 | 0 | 0
C6TBB8 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 13300 | 8.9643 | 190.4985 | 0.1962 | 0.085219
B1Q2X4 | Protein disulfide isomerase OS Glycine max GN PDIL 1 PE 3 SV 1 | 58554 | 4.9532 | 190.1399 | 0.5267 | 0.228771
B0M1A9 | Peroxisomal 3 ketoacyl CoA thiolase OS Glycine max PE 2 SV 1 | 48585 | 7.7 | 189.2588 | 0.2631 | 0.114277
B1ACD5 | Kunitz trypsin protease inhibitor OS Glycine max PE 2 SV 1 | 22661 | 5.0623 | 177.8031 | 0 | 0
C6TNU3 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 37948 | 5.8645 | 174.8052 | 0.1576 | 0.068453
P54774 | Cell division cycle protein 48 homolog OS Glycine max GN CDC48 PE 2 SV 1 | 89713 | 5.0054 | 169.9282 | 0.8228 | 0.357381
C6T9X5 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 43553 | 5.7779 | 169.9056 | 0.226 | 0.098162
C6SXR4 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 14832 | 11.3069 | 166.6849 | 0.0521 | 0.022629
C6TCR6 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 36175 | 4.6644 | 163.951 | 0.1824 | 0.079225
C6T262 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 22373 | 7.2902 | 163.3037 | 0.317 | 0.137688
C6TN03 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 29982 | 10.6386 | 144.9159 | 0.6681 | 0.290187
C6TL46 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 29582 | 5.7539 | 142.5109 | 0.7757 | 0.336923
A4ZGT5 | Transcription factor bZIP129 Fragment OS Glycine max GN bZIP129 PE 2 SV 1 | 20603 | 9.6048 | 142.3783 | 0.784 | 0.340528
Q9SPJ6 | Maturation protein pPM32 OS Glycine max GN PM32 PE 2 SV 1 | 18871 | 5.3156 | 135.9585 | 0.2404 | 0.104417
C6TG88 | Putative uncharacterised protein Fragment OS Glycine max PE 2 SV 1 | 17907 | 4.584 | 131.2757 | 1.3555 | 0.588758
C6ZRP9 | Pti1 kinase-like protein OS Glycine max PE 2 SV 1 | 34932 | 8.9379 | 130.005 | 0.1044 | 0.045346
Q7XAC5 | Embryo specific urease OS Glycine max PE 2 SV 1 | 90099 | 5.6169 | 128.19 | 0.4533 | 0.19689
C6TJD3 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 35726 | 7.6161 | 122.8045 | 0.1528 | 0.066368
O22518 | 40S ribosomal protein SA OS Glycine max PE 2 SV 1 | 33885 | 4.9052 | 122.2737 | 0.2313 | 0.100464
C6TGJ9 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 28237 | 11.1379 | 117.5363 | 0.042 | 0.018243
Q0PJB9 | MYB transcription factor MYB131 Fragment OS Glycine max GN MYB131 PE 2 SV 1 | 36138 | 9.0851 | 111.3419 | 0.7036 | 0.305607
Q8L7J4 | Pyruvate kinase OS Glycine max PE 2 SV 1 | 55280 | 7.0847 | 109.9843 | 0.2255 | 0.097945
C6T520 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 18088 | 5.8 | 109.3591 | 0.206 | 0.089476
C6T6B2 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 19918 | 5.1687 | 104.2455 | 0.1522 | 0.066108
C6SY64 | Proteasome subunit β-type OS Glycine max PE 2 SV 1 | 24533 | 7.0461 | 103.5601 | 0 | 0
C6T470 | Putative uncharacterised protein OS Glycine max PE 2 SV 1 | 27220 | 4.5276 | 101.0626 | 0.0752 | 0.032663
B0M1A8 | Peroxisomal aminotransferase Fragment OS Glycine max PE 2 SV 1 | 31458 | 6.087 | 100.999 | 0.3781 | 0.164227
Table 2 | List of peptide sequences found by PLGS for CTAG protein.
Precursor MH+ (Da) | Charge state | MH+ Error (Da) | Score | Start | End | Sequence | Modifications | Retention Time (min) | Intensity
1715.0154 | 2.04 | 0.0032 | 7.8859 | 108 | 124 | (R)SLAQDAPPLPVPGVLLK(E) | | 110.3133 | 19656
1349.7391 | 2 | 0.0019 | 7.2048 | 125 | 136 | (K)EFTVSGNILTIR(L) | | 94.5072 | 7191
1871.1141 | 3 | -0.0038 | 7.0925 | 107 | 124 | (R)RSLAQDAPPLPVPGVLLK(E) | | 102.1798 | 2493
2485.308 | 2.71 | 0.0177 | 6.1839 | 87 | 107 | (R)LLEFYLAMPFATPMEAELARR(S) | Oxidation M (8) | 141.8096 | 2994
2855.4158 | 4 | 0.0052 | 6.0799 | 82 | 106 | (R)GPESRLLEFYLAMPFATPMEAELAR(R) | Oxidation M (13) | 63.4806 | 2744
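As a cross-check on the precursor masses in Table 2, the singly protonated monoisotopic mass (MH+) of a peptide can be computed from its sequence. The sketch below is illustrative only and is not part of the PLGS workflow: the function name and the optional `oxidized_met` parameter are assumptions, and the residue masses are standard monoisotopic values. For the three unmodified peptides, the computed values fall within about 0.01 Da of the tabulated precursor masses.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # H2O added for the free N- and C-termini
PROTON = 1.00728      # charge carrier
OXIDATION = 15.99491  # one oxygen per oxidised methionine


def mh_plus(sequence, oxidized_met=0):
    """Monoisotopic MH+ (Da) of a peptide; oxidized_met counts oxidised Met residues."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + PROTON + oxidized_met * OXIDATION


# Unmodified CTAG peptides and their observed precursor MH+ from Table 2.
observed = {
    "SLAQDAPPLPVPGVLLK": 1715.0154,
    "EFTVSGNILTIR": 1349.7391,
    "RSLAQDAPPLPVPGVLLK": 1871.1141,
}
for seq, obs in observed.items():
    print(f"{seq}: calculated {mh_plus(seq):.4f} Da, observed {obs:.4f} Da")
```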
Table 3 | Troubleshooting table.
Problem | Recommendations
A contaminant with a repetitive cluster of singly charged ions is encountered during chromatography. | Use only high-quality pipette tips and tubing. Poor-quality plastics release compounds into the sample that affect chromatography and MS analysis.
Poor peptide profile. | Digest a new sample with recently prepared, high-quality trypsin. Check the pH of the sample before adding surfactant; it must be alkaline.
After MSE acquisition, PLGS processing stops with the message “failed to process raw data” or returns insufficient data. | This indicates a problem in the MS acquisition. Check the ionisation source, changing or cleaning the probe and cone if necessary; also check that the GFP solution is adequately delivered by the lock mass. Inspect the raw data.
High pressure during chromatography stops the acquisition. | The column or capillary has clogged. Replace the column and capillary and ensure that the sample is digested and correctly centrifuged.
PLGS does not quantify the sample. | Ensure that ADH was added to the sample and that this information was entered in the PLGS workflow.
Contamination appears during chromatography. | Check all solutions. Use only MS- and HPLC-grade reagents and deionised water with total organic compounds below 4 ppb to avoid contamination.
PLGS does not process the database. | Introducing the database into PLGS requires that the sequences are in FASTA format with the same strings and character patterns.
Low reproducibility due to column saturation. | Match the total protein mass loaded to the column diameter: 250 ng to 500 ng for 75 μm, 440 ng to 880 ng for 100 μm, 1 μg to 2 μg for 150 μm and 4 μg to 8 μg for 300 μm.
PLGS does not show results. | If no result is displayed, check the log files and the LockMass m/z window for errors. Check the data preparation file for errors in the LockMass values.
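The column loads recommended for avoiding saturation follow a simple pattern that can be sketched as below. Two assumptions are made here and are not stated in the protocol: the loads for the 75 μm and 100 μm columns are read as nanograms, and the capacity is taken to scale with the square of the column inner diameter (that is, with its cross-sectional area). The function name and the 75 μm reference point are chosen for illustration.

```python
def scaled_load_ng(diameter_um, ref_diameter_um=75.0, ref_load_ng=250.0):
    """Lower-bound protein load (ng), scaled by column cross-sectional area.

    Assumption: load capacity is proportional to the square of the inner diameter,
    anchored at 250 ng for a 75 um column.
    """
    return ref_load_ng * (diameter_um / ref_diameter_um) ** 2


for diameter in (75, 100, 150, 300):
    low = scaled_load_ng(diameter)
    # The recommended ranges span a factor of two above the lower bound.
    print(f"{diameter} um column: about {low:.0f} ng to {2 * low:.0f} ng")
```

Under these assumptions the sketch returns roughly 444 ng, 1 μg and 4 μg as lower bounds for the 100 μm, 150 μm and 300 μm columns, close to the tabulated recommendations.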
Fig 1 | Protocol workflow. Total soluble protein from the soybean transgenic line
expressing the CTAG molecule was digested with trypsin and submitted to
nanoUPLC-Q-TOF MSE analysis.
Fig 2 | CTAG amino acid sequence. Coloured boxes indicate the peptides found by
PLGS (Table 2), and the overlapping regions are indicated by changes in colour.
Fig 3 | Experimental spectra results. (A) Chromatogram of the nanoUPLC of soybean
CTAG lineage 3. The arrows indicate the eluted peptides corresponding to the CTAG
digested protein (Table 2). (B) MS spectrum at 110.31 min containing the [M + 2H]2+
= 857.87 ion from the CTAG protein fragment. (C) MS/MS spectrum of the [M + 2H]2+ =
857.87 precursor ion. (D) Deconvoluted MS/MS spectrum processed by PLGS and the de
novo sequence corresponding to the ion [M + H]+ = 1715.01 from the trypsin digestion of
CTAG.
Fig 4 | Expression analysis between samples. CTAG L37 and CTAG L3 correspond
to the transgenic soybean lines and SOY CN corresponds to the negative soybean seed.
Red numbers indicate the down-regulation ratio of the selected protein; green numbers
indicate up-regulation; and gray indicates no change in the expression level. The log
of the ratio and its standard deviation are shown in parentheses. The P value, which
ranges from 0 to 1, is shown in brackets; values of 0-0.05 are considered
down-regulated and values of 0.95-1.00 up-regulated.